All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Ionela Voinescu <ionela.voinescu@arm.com>,
	Sudeep Holla <sudeep.holla@arm.com>
Cc: Atish Patra <atishp@atishpatra.org>,
	linux-kernel@vger.kernel.org, Atish Patra <atishp@rivosinc.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Qing Wang <wangqing@vivo.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-riscv@lists.infradead.org, Rob Herring <robh+dt@kernel.org>
Subject: Re: [PATCH v2 3/8] arch_topology: Set cluster identifier in each core/thread from /cpu-map
Date: Fri, 20 May 2022 14:33:19 +0200	[thread overview]
Message-ID: <26f39a9d-1a02-b77d-5c89-88a1fb0e4eac@arm.com> (raw)
In-Reply-To: <YoZ2gjjS3rbRaJZm@arm.com>

On 19/05/2022 18:55, Ionela Voinescu wrote:
> Hi,
> 
> As said before, this creates trouble for CONFIG_SCHED_CLUSTER=y.
> The output below is obtained from Juno.
> 
> When cluster_id is populated, a new CLS level is created by the scheduler
> topology code. In this case the clusters in DT determine that the cluster
> siblings and llc siblings are the same so the MC scheduler domain will
> be removed and, for Juno, only CLS and DIE will be kept.

[...]

> To be noted that we also get a new flag SD_PREFER_SIBLING for the CLS
> level that is not appropriate. We usually remove it for the child of a
> SD_ASYM_CPUCAPACITY domain, but we don't currently redo this after some
> levels are degenerated. This is a fixable issue.
> 
> But looking at the bigger picture, a good question is what is the best
> thing to do when cluster domains and llc domains span the same CPUs?
> 
> Possibly it would be best to restrict clusters (which are almost an
> arbitrary concept) to always span a subset of CPUs of the llc domain,
> if llc siblings can be obtained? If those clusters are not properly set
> up in DT to respect this condition, cluster_siblings would need to be
> cleared (or set to the current CPU) so the CLS domain is not created at
> all.
> 
> Additionally, should we use cluster information from DT (cluster_id) to
> create an MC level if we don't have llc information, even if
> CONFIG_SCHED_CLUSTER=n?
> 
> I currently don't have a very clear picture of how cluster domains and
> llc domains would "live" together in a variety of topologies. I'll try
> other DT topologies to see if there are others that can lead to trouble.

This would be an issue. Depending on CONFIG_SCHED_CLUSTER we would get
two different systems from the viewpoint of the scheduler.

To me `cluster_id/_sibling` don't describe a certain level of CPU
grouping (e.g. one level above core or one level below package).

They were introduced to describe one level below LLC (e.g. Kunpeng920 L3
(24 CPUs LLC) -> L3 tag (4 CPUs) or x86 Jacobsville L3 -> L2), (Commit
                 ^^^^^^                                   ^^
c5e22feffdd7 ("topology: Represent clusters of CPUs within a die")).

The Ampere Altra issue already gave us a taste of the possible issues of
this definition, commit db1e59483dfd ("topology: make core_mask include
at least cluster_siblings").

If we link `cluster_id/_sibling` against (1. level) cpu-map cluster
nodes plus using llc and `cluster_sibling >= llc_sibling` we will run
into these issues.

WARNING: multiple messages have this Message-ID (diff)
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Ionela Voinescu <ionela.voinescu@arm.com>,
	Sudeep Holla <sudeep.holla@arm.com>
Cc: Atish Patra <atishp@atishpatra.org>,
	linux-kernel@vger.kernel.org, Atish Patra <atishp@rivosinc.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Qing Wang <wangqing@vivo.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-riscv@lists.infradead.org, Rob Herring <robh+dt@kernel.org>
Subject: Re: [PATCH v2 3/8] arch_topology: Set cluster identifier in each core/thread from /cpu-map
Date: Fri, 20 May 2022 14:33:19 +0200	[thread overview]
Message-ID: <26f39a9d-1a02-b77d-5c89-88a1fb0e4eac@arm.com> (raw)
In-Reply-To: <YoZ2gjjS3rbRaJZm@arm.com>

On 19/05/2022 18:55, Ionela Voinescu wrote:
> Hi,
> 
> As said before, this creates trouble for CONFIG_SCHED_CLUSTER=y.
> The output below is obtained from Juno.
> 
> When cluster_id is populated, a new CLS level is created by the scheduler
> topology code. In this case the clusters in DT determine that the cluster
> siblings and llc siblings are the same so the MC scheduler domain will
> be removed and, for Juno, only CLS and DIE will be kept.

[...]

> To be noted that we also get a new flag SD_PREFER_SIBLING for the CLS
> level that is not appropriate. We usually remove it for the child of a
> SD_ASYM_CPUCAPACITY domain, but we don't currently redo this after some
> levels are degenerated. This is a fixable issue.
> 
> But looking at the bigger picture, a good question is what is the best
> thing to do when cluster domains and llc domains span the same CPUs?
> 
> Possibly it would be best to restrict clusters (which are almost an
> arbitrary concept) to always span a subset of CPUs of the llc domain,
> if llc siblings can be obtained? If those clusters are not properly set
> up in DT to respect this condition, cluster_siblings would need to be
> cleared (or set to the current CPU) so the CLS domain is not created at
> all.
> 
> Additionally, should we use cluster information from DT (cluster_id) to
> create an MC level if we don't have llc information, even if
> CONFIG_SCHED_CLUSTER=n?
> 
> I currently don't have a very clear picture of how cluster domains and
> llc domains would "live" together in a variety of topologies. I'll try
> other DT topologies to see if there are others that can lead to trouble.

This would be an issue. Depending on CONFIG_SCHED_CLUSTER we would get
two different systems from the viewpoint of the scheduler.

To me `cluster_id/_sibling` don't describe a certain level of CPU
grouping (e.g. one level above core or one level below package).

They were introduced to describe one level below LLC (e.g. Kunpeng920 L3
(24 CPUs LLC) -> L3 tag (4 CPUs) or x86 Jacobsville L3 -> L2), (Commit
                 ^^^^^^                                   ^^
c5e22feffdd7 ("topology: Represent clusters of CPUs within a die")).

The Ampere Altra issue already gave us a taste of the possible issues of
this definition, commit db1e59483dfd ("topology: make core_mask include
at least cluster_siblings").

If we link `cluster_id/_sibling` against (1. level) cpu-map cluster
nodes plus using llc and `cluster_sibling >= llc_sibling` we will run
into these issues.

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

WARNING: multiple messages have this Message-ID (diff)
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Ionela Voinescu <ionela.voinescu@arm.com>,
	Sudeep Holla <sudeep.holla@arm.com>
Cc: Atish Patra <atishp@atishpatra.org>,
	linux-kernel@vger.kernel.org, Atish Patra <atishp@rivosinc.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Qing Wang <wangqing@vivo.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-riscv@lists.infradead.org, Rob Herring <robh+dt@kernel.org>
Subject: Re: [PATCH v2 3/8] arch_topology: Set cluster identifier in each core/thread from /cpu-map
Date: Fri, 20 May 2022 14:33:19 +0200	[thread overview]
Message-ID: <26f39a9d-1a02-b77d-5c89-88a1fb0e4eac@arm.com> (raw)
In-Reply-To: <YoZ2gjjS3rbRaJZm@arm.com>

On 19/05/2022 18:55, Ionela Voinescu wrote:
> Hi,
> 
> As said before, this creates trouble for CONFIG_SCHED_CLUSTER=y.
> The output below is obtained from Juno.
> 
> When cluster_id is populated, a new CLS level is created by the scheduler
> topology code. In this case the clusters in DT determine that the cluster
> siblings and llc siblings are the same so the MC scheduler domain will
> be removed and, for Juno, only CLS and DIE will be kept.

[...]

> To be noted that we also get a new flag SD_PREFER_SIBLING for the CLS
> level that is not appropriate. We usually remove it for the child of a
> SD_ASYM_CPUCAPACITY domain, but we don't currently redo this after some
> levels are degenerated. This is a fixable issue.
> 
> But looking at the bigger picture, a good question is what is the best
> thing to do when cluster domains and llc domains span the same CPUs?
> 
> Possibly it would be best to restrict clusters (which are almost an
> arbitrary concept) to always span a subset of CPUs of the llc domain,
> if llc siblings can be obtained? If those clusters are not properly set
> up in DT to respect this condition, cluster_siblings would need to be
> cleared (or set to the current CPU) so the CLS domain is not created at
> all.
> 
> Additionally, should we use cluster information from DT (cluster_id) to
> create an MC level if we don't have llc information, even if
> CONFIG_SCHED_CLUSTER=n?
> 
> I currently don't have a very clear picture of how cluster domains and
> llc domains would "live" together in a variety of topologies. I'll try
> other DT topologies to see if there are others that can lead to trouble.

This would be an issue. Depending on CONFIG_SCHED_CLUSTER we would get
two different systems from the viewpoint of the scheduler.

To me `cluster_id/_sibling` don't describe a certain level of CPU
grouping (e.g. one level above core or one level below package).

They were introduced to describe one level below LLC (e.g. Kunpeng920 L3
(24 CPUs LLC) -> L3 tag (4 CPUs) or x86 Jacobsville L3 -> L2), (Commit
                 ^^^^^^                                   ^^
c5e22feffdd7 ("topology: Represent clusters of CPUs within a die")).

The Ampere Altra issue already gave us a taste of the possible issues of
this definition, commit db1e59483dfd ("topology: make core_mask include
at least cluster_siblings").

If we link `cluster_id/_sibling` against (1. level) cpu-map cluster
nodes plus using llc and `cluster_sibling >= llc_sibling` we will run
into these issues.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-05-20 12:34 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-18  9:33 [PATCH v2 0/8] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
2022-05-18  9:33 ` Sudeep Holla
2022-05-18  9:33 ` Sudeep Holla
2022-05-18  9:33 ` [PATCH v2 1/8] arch_topology: Don't set cluster identifier as physical package identifier Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-20 12:31   ` Dietmar Eggemann
2022-05-20 12:31     ` Dietmar Eggemann
2022-05-20 12:31     ` Dietmar Eggemann
2022-05-20 13:13     ` Sudeep Holla
2022-05-20 13:13       ` Sudeep Holla
2022-05-20 13:13       ` Sudeep Holla
2022-05-18  9:33 ` [PATCH v2 2/8] arch_topology: Set thread sibling cpumask only within the cluster Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-20 12:32   ` Dietmar Eggemann
2022-05-20 12:32     ` Dietmar Eggemann
2022-05-20 12:32     ` Dietmar Eggemann
2022-05-20 13:20     ` Sudeep Holla
2022-05-20 13:20       ` Sudeep Holla
2022-05-20 13:20       ` Sudeep Holla
2022-05-18  9:33 ` [PATCH v2 3/8] arch_topology: Set cluster identifier in each core/thread from /cpu-map Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-19 16:55   ` Ionela Voinescu
2022-05-19 16:55     ` Ionela Voinescu
2022-05-19 16:55     ` Ionela Voinescu
2022-05-20 12:33     ` Dietmar Eggemann [this message]
2022-05-20 12:33       ` Dietmar Eggemann
2022-05-20 12:33       ` Dietmar Eggemann
2022-05-20 13:54       ` Sudeep Holla
2022-05-20 13:54         ` Sudeep Holla
2022-05-20 13:54         ` Sudeep Holla
2022-05-20 15:27     ` Sudeep Holla
2022-05-20 15:27       ` Sudeep Holla
2022-05-20 15:27       ` Sudeep Holla
2022-05-18  9:33 ` [PATCH v2 4/8] arch_topology: Add support for parsing sockets in /cpu-map Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33 ` [PATCH v2 5/8] arch_topology: Check for non-negative value rather than -1 for IDs validity Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33 ` [PATCH v2 6/8] arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33 ` [PATCH v2 7/8] of: base: add support to get the device node for the CPU's last level cache Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33 ` [PATCH v2 8/8] arch_topology: Add support to build llc_sibling on DT platforms Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-18  9:33   ` Sudeep Holla
2022-05-19 18:10   ` Rob Herring
2022-05-19 18:10     ` Rob Herring
2022-05-19 18:10     ` Rob Herring
2022-05-20 12:59     ` Sudeep Holla
2022-05-20 12:59       ` Sudeep Holla
2022-05-20 12:59       ` Sudeep Holla
2022-05-20 14:36       ` Rob Herring
2022-05-20 14:36         ` Rob Herring
2022-05-20 14:36         ` Rob Herring
2022-05-20 15:06         ` Sudeep Holla
2022-05-20 15:06           ` Sudeep Holla
2022-05-20 15:06           ` Sudeep Holla
2022-05-20 12:33   ` Dietmar Eggemann
2022-05-20 12:33     ` Dietmar Eggemann
2022-05-20 12:33     ` Dietmar Eggemann
2022-05-20 14:56     ` Sudeep Holla
2022-05-20 14:56       ` Sudeep Holla
2022-05-20 14:56       ` Sudeep Holla
2022-05-19 16:32 ` [PATCH v2 0/8] arch_topology: Updates to add socket support and fix cluster ids Ionela Voinescu
2022-05-19 16:32   ` Ionela Voinescu
2022-05-19 16:32   ` Ionela Voinescu
2022-05-20 15:33   ` Sudeep Holla
2022-05-20 15:33     ` Sudeep Holla
2022-05-20 15:33     ` Sudeep Holla

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=26f39a9d-1a02-b77d-5c89-88a1fb0e4eac@arm.com \
    --to=dietmar.eggemann@arm.com \
    --cc=atishp@atishpatra.org \
    --cc=atishp@rivosinc.com \
    --cc=ionela.voinescu@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=morten.rasmussen@arm.com \
    --cc=robh+dt@kernel.org \
    --cc=sudeep.holla@arm.com \
    --cc=vincent.guittot@linaro.org \
    --cc=wangqing@vivo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.