From: Dietmar Eggemann <dietmar.eggemann@arm.com> To: 王擎 <wangqing@vivo.com>, "Sudeep Holla" <sudeep.holla@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, "linux-arm-kernel@lists.infradead.org" <linux-arm-kernel@lists.infradead.org>, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, "vincent.guittot@linaro.org" <vincent.guittot@linaro.org>, "peterz@infradead.org" <peterz@infradead.org> Subject: Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI Date: Wed, 27 Apr 2022 17:47:22 +0200 [thread overview] Message-ID: <9f0589d5-ab0a-9203-b961-984c61cc7283@arm.com> (raw) In-Reply-To: <SL2PR06MB3082EA14946F2E207B3043E2BDFA9@SL2PR06MB3082.apcprd06.prod.outlook.com> On 27/04/2022 04:18, 王擎 wrote: > >>> On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote: >>>> >>>>>>>>>> From: Wang Qing <wangqing@vivo.com> [...] >> .---------------. >> CPU |0 1 2 3 4 5 6 7| >> +---------------+ >> uarch |l l l l m m m b| (so called tri-gear: little, medium, big) >> +---------------+ >> L2 | | | | | | | >> +---------------+ >> L3 |<-- -->| >> +---------------+ >> |<-- cluster -->| >> +---------------+ >> |<-- DSU -->| >> '---------------' >> >> (and I'm not saying here it does!) then the existing SCHED_CLUSTER >> should be used in DT as well. Since it provides exactly the same >> functionality for the task scheduler no matter whether it's setup from >> ACPI or DT. >> >> parse_cluster() -> parse_core() should be changed to be able to decode >> both id's (package_id and cluster_id) in this case. > > Totally agree, but not implemented yet. Because now cluster_id is used > to describe the package/socket, the modification will involve all DTS. You're talking about the fact that 1. level clusterX (1) in cpu_map is used to set `cpu_topology[cpu].package_id`, right? That's the information for the DIE layer. The first level cluster[0,1,2] spawn all 8 CPUs and form 3 groups of CPUS (0-3, 4-6, 7), the later sched groups of the DIE sched domain. We don't have any socketN since it is a single socket system. Think about a system with 2 DSUs where you would have socket[0,1]. Sub-chapter 4 in `Documentation/devicetree/bindings/cpu/cpu-topology.txt`: cpu-map { socket0 { cluster0 { <--- (1) Sub-chapter 3 says: - cluster node ... A system can contain several layers of clustering within a single physical socket and cluster nodes can be contained in parent cluster nodes. A cluster node's child nodes must be: one or more cluster nodes This multi-level cluster thing hasn't been coded yet. parse_cluster() ... /* * First check for child clusters; we currently ignore any * information about the nesting of clusters and present the * scheduler with a flat list of them. */ ... [...] >> I pimped my Hikey 960 to look like one of those Armv9 4-3-1 systems with >> L2-complexes on the LITTLES and I get: > > This system is exactly what I mentioned, but I have a question, > How did you parse out the cluster_id based on foo0/foo2? > Because if ACPI is not used, cluster_id is always -1. I haven't put in the extension to decode a 2-level clusterX cpu_map in parse_cluster() -> parse_core(). I just did a mock-up for illustration purpose inside parse_core() for my H960 with a 4-3-1 cpu-map: cpu-map cluster0 core0 core1 core2 core3 cluster1 core0 core1 core2 cluster2 core0 diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 1d6636ebaac5..8af40f13fdb5 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -529,6 +529,11 @@ static int __init parse_core(struct device_node *core, int package_id, cpu_topology[cpu].package_id = package_id; cpu_topology[cpu].core_id = core_id; + + /* mock-up CLS SD on 4-3-1 Armv9 DSU cluster w/ L2-complexes */ + if (cpu <= 3) + cpu_topology[cpu].cluster_id = cpu / 2; + } else if (leaf && cpu != -ENODEV) { pr_err("%pOF: Can't get CPU for leaf core\n", core); return -EINVAL; And on the scheduler-side I only had to enable CONFIG_SCHED_CLUSTER and everything worked just fine, no need for any arm64-specific topo table and alike. IMHO, this is what you have to do. Make a 2 level cluster cpumap: cpu-map cluster0 cluster0 core0 core1 cluster1 core2 core3 cluster1 core0 core1 core2 cluster2 core0 parse-able and set `cpu_topology[cpu].cluster_id` in parse_core(). > What I want to do is to change the foo0/foo2 to complex0/complex2 here, > then parse it like parse_cluster() -> parse_complex() -> parse_core(). You should read `Documentation/devicetree/bindings/cpu/cpu-topology.txt` and implement the multi-level cluster approach instead. Big advantage would be that there won't be any DT related changes/extensions needed. [...] > Yes, that's what I want, but still a little confused, why we use MC to > describe "cluster" and use CLS describe "complex", can you show some details? The DT entity `cluster` has nothing to do with the task scheduler domain name `SCHED_CLUSTER`. The name is actually meaningless and just there for debug purposes.
WARNING: multiple messages have this Message-ID (diff)
From: Dietmar Eggemann <dietmar.eggemann@arm.com> To: 王擎 <wangqing@vivo.com>, "Sudeep Holla" <sudeep.holla@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, "linux-arm-kernel@lists.infradead.org" <linux-arm-kernel@lists.infradead.org>, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, "vincent.guittot@linaro.org" <vincent.guittot@linaro.org>, "peterz@infradead.org" <peterz@infradead.org> Subject: Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI Date: Wed, 27 Apr 2022 17:47:22 +0200 [thread overview] Message-ID: <9f0589d5-ab0a-9203-b961-984c61cc7283@arm.com> (raw) In-Reply-To: <SL2PR06MB3082EA14946F2E207B3043E2BDFA9@SL2PR06MB3082.apcprd06.prod.outlook.com> On 27/04/2022 04:18, 王擎 wrote: > >>> On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote: >>>> >>>>>>>>>> From: Wang Qing <wangqing@vivo.com> [...] >> .---------------. >> CPU |0 1 2 3 4 5 6 7| >> +---------------+ >> uarch |l l l l m m m b| (so called tri-gear: little, medium, big) >> +---------------+ >> L2 | | | | | | | >> +---------------+ >> L3 |<-- -->| >> +---------------+ >> |<-- cluster -->| >> +---------------+ >> |<-- DSU -->| >> '---------------' >> >> (and I'm not saying here it does!) then the existing SCHED_CLUSTER >> should be used in DT as well. Since it provides exactly the same >> functionality for the task scheduler no matter whether it's setup from >> ACPI or DT. >> >> parse_cluster() -> parse_core() should be changed to be able to decode >> both id's (package_id and cluster_id) in this case. > > Totally agree, but not implemented yet. Because now cluster_id is used > to describe the package/socket, the modification will involve all DTS. You're talking about the fact that 1. level clusterX (1) in cpu_map is used to set `cpu_topology[cpu].package_id`, right? That's the information for the DIE layer. The first level cluster[0,1,2] spawn all 8 CPUs and form 3 groups of CPUS (0-3, 4-6, 7), the later sched groups of the DIE sched domain. We don't have any socketN since it is a single socket system. Think about a system with 2 DSUs where you would have socket[0,1]. Sub-chapter 4 in `Documentation/devicetree/bindings/cpu/cpu-topology.txt`: cpu-map { socket0 { cluster0 { <--- (1) Sub-chapter 3 says: - cluster node ... A system can contain several layers of clustering within a single physical socket and cluster nodes can be contained in parent cluster nodes. A cluster node's child nodes must be: one or more cluster nodes This multi-level cluster thing hasn't been coded yet. parse_cluster() ... /* * First check for child clusters; we currently ignore any * information about the nesting of clusters and present the * scheduler with a flat list of them. */ ... [...] >> I pimped my Hikey 960 to look like one of those Armv9 4-3-1 systems with >> L2-complexes on the LITTLES and I get: > > This system is exactly what I mentioned, but I have a question, > How did you parse out the cluster_id based on foo0/foo2? > Because if ACPI is not used, cluster_id is always -1. I haven't put in the extension to decode a 2-level clusterX cpu_map in parse_cluster() -> parse_core(). I just did a mock-up for illustration purpose inside parse_core() for my H960 with a 4-3-1 cpu-map: cpu-map cluster0 core0 core1 core2 core3 cluster1 core0 core1 core2 cluster2 core0 diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 1d6636ebaac5..8af40f13fdb5 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -529,6 +529,11 @@ static int __init parse_core(struct device_node *core, int package_id, cpu_topology[cpu].package_id = package_id; cpu_topology[cpu].core_id = core_id; + + /* mock-up CLS SD on 4-3-1 Armv9 DSU cluster w/ L2-complexes */ + if (cpu <= 3) + cpu_topology[cpu].cluster_id = cpu / 2; + } else if (leaf && cpu != -ENODEV) { pr_err("%pOF: Can't get CPU for leaf core\n", core); return -EINVAL; And on the scheduler-side I only had to enable CONFIG_SCHED_CLUSTER and everything worked just fine, no need for any arm64-specific topo table and alike. IMHO, this is what you have to do. Make a 2 level cluster cpumap: cpu-map cluster0 cluster0 core0 core1 cluster1 core2 core3 cluster1 core0 core1 core2 cluster2 core0 parse-able and set `cpu_topology[cpu].cluster_id` in parse_core(). > What I want to do is to change the foo0/foo2 to complex0/complex2 here, > then parse it like parse_cluster() -> parse_complex() -> parse_core(). You should read `Documentation/devicetree/bindings/cpu/cpu-topology.txt` and implement the multi-level cluster approach instead. Big advantage would be that there won't be any DT related changes/extensions needed. [...] > Yes, that's what I want, but still a little confused, why we use MC to > describe "cluster" and use CLS describe "complex", can you show some details? The DT entity `cluster` has nothing to do with the task scheduler domain name `SCHED_CLUSTER`. The name is actually meaningless and just there for debug purposes. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2022-04-27 15:47 UTC|newest] Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-04-25 2:55 [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI Qing Wang 2022-04-25 2:55 ` Qing Wang 2022-04-25 10:06 ` Sudeep Holla 2022-04-25 10:06 ` Sudeep Holla 2022-04-25 11:18 ` 王擎 2022-04-25 11:18 ` 王擎 2022-04-25 16:59 ` Sudeep Holla 2022-04-25 16:59 ` Sudeep Holla 2022-04-26 2:23 ` 王擎 2022-04-26 2:23 ` 王擎 2022-04-26 6:40 ` Sudeep Holla 2022-04-26 6:40 ` Sudeep Holla 2022-04-26 6:52 ` 王擎 2022-04-26 6:52 ` 王擎 2022-04-26 13:25 ` Sudeep Holla 2022-04-26 13:25 ` Sudeep Holla 2022-04-26 19:14 ` Dietmar Eggemann 2022-04-26 19:14 ` Dietmar Eggemann 2022-04-27 2:18 ` 王擎 2022-04-27 2:18 ` 王擎 2022-04-27 15:47 ` Dietmar Eggemann [this message] 2022-04-27 15:47 ` Dietmar Eggemann 2022-04-28 0:04 ` 王擎 2022-04-28 0:04 ` 王擎
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=9f0589d5-ab0a-9203-b961-984c61cc7283@arm.com \ --to=dietmar.eggemann@arm.com \ --cc=catalin.marinas@arm.com \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=peterz@infradead.org \ --cc=sudeep.holla@arm.com \ --cc=vincent.guittot@linaro.org \ --cc=wangqing@vivo.com \ --cc=will@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.