From: Gautham R Shenoy <email@example.com>
To: "Michal Suchánek" <firstname.lastname@example.org>
Cc: "Gautham R. Shenoy" <email@example.com>,
	Michael Neuling <firstname.lastname@example.org>,
	Vincent Guittot <email@example.com>,
	Srikar Dronamraju <firstname.lastname@example.org>,
	Rik van Riel <email@example.com>,
	firstname.lastname@example.org,
	LKML <email@example.com>,
	Nicholas Piggin <firstname.lastname@example.org>,
	Valentin Schneider <email@example.com>,
	Parth Shah <firstname.lastname@example.org>,
	Vaidyanathan Srinivasan <email@example.com>,
	Mel Gorman <firstname.lastname@example.org>,
	Dietmar Eggemann <email@example.com>
Subject: Re: [RFC/PATCH] powerpc/smp: Add SD_SHARE_PKG_RESOURCES flag to MC sched-domain
Date: Wed, 14 Apr 2021 12:32:46 +0530
Message-ID: <20210414070246.GB13782@in.ibm.com> (raw)
In-Reply-To: <20210412163355.GV6564@kitsune.suse.cz>

On Mon, Apr 12, 2021 at 06:33:55PM +0200, Michal Suchánek wrote:
> On Mon, Apr 12, 2021 at 04:24:44PM +0100, Mel Gorman wrote:
> > On Mon, Apr 12, 2021 at 02:21:47PM +0200, Vincent Guittot wrote:
> > > > > Peter, Valentin, Vincent, Mel, et al,
> > > > >
> > > > > On architectures where we have multiple levels of cache access latencies
> > > > > within a DIE (for example: one within the current LLC or SMT core, another
> > > > > at MC or Hemisphere, and finally across hemispheres), do you have any
> > > > > suggestions on how we could handle the same in the core scheduler?
> > >
> > > I would say that SD_SHARE_PKG_RESOURCES is there for that and doesn't
> > > only rely on cache.
> > >
> > From topology.c:
> >
> >   SD_SHARE_PKG_RESOURCES - describes shared caches
> >
> > I'm guessing here because I am not familiar with power10, but the central
> > problem appears to be when to prefer selecting a CPU sharing L2 or L3
> > cache, and the core assumes the last-level cache is the only relevant one.
>
> It does not seem to be the case according to the original description:
>
> >>>> When the scheduler tries to wakeup a task, it chooses between the
> >>>> waker-CPU and the wakee's previous-CPU. Suppose this choice is called
> >>>> the "target", then in the target's LLC domain, the scheduler
> >>>>
> >>>> a) tries to find an idle core in the LLC. This helps exploit the
>
> This is the same as (b).
> Should this be SMT^^^ ?

On POWER10, without this patch, the LLC is at the SMT sched-domain. The
difference between a) and b) is that a) searches for an idle core, while
b) searches for an idle CPU.

> >>>> SMT folding that the wakee task can benefit from. If an idle
> >>>> core is found, the wakee is woken up on it.
> >>>>
> >>>> b) Failing to find an idle core, the scheduler tries to find an idle
> >>>> CPU in the LLC. This helps minimise the wakeup latency for the
> >>>> wakee since it gets to run on the CPU immediately.
> >>>>
> >>>> c) Failing this, it will wake it up on the target CPU.
> >>>>
> >>>> Thus, with the P9-sched topology, since the CACHE domain comprises two
> >>>> SMT4 cores, there is a decent chance that we get an idle core, failing
> >>>> which there is a relatively higher probability of finding an idle CPU
> >>>> among the 8 threads in the domain.
> >>>>
> >>>> However, in the P10-sched topology, since the SMT domain is the LLC and it
> >>>> contains only a single SMT4 core, the probability that we find that
> >>>> core to be idle is lower. Furthermore, since there are only 4 CPUs to
> >>>> search for an idle CPU, there is a lower probability that we can get an
> >>>> idle CPU to wake the task up on.
> >
> > For this patch, I wondered if setting SD_SHARE_PKG_RESOURCES would have
> > unintended consequences for load balancing, because load within a die may
> > not be spread between SMT4 domains if SD_SHARE_PKG_RESOURCES was set at
> > the MC level.
>
> Not spreading load between SMT4 domains within the MC is exactly what setting
> the LLC at the MC level would address, wouldn't it?
>
> As in, on P10 we have two relevant levels, but the topology as-is describes
> only one, and moving the LLC level lower gives the scheduler two levels to
> look at again. Or am I missing something?

This is my current understanding as well: with this patch we would then be
able to move tasks quickly between the SMT4 cores, perhaps at the expense of
losing out on cache affinity. Which is why it would be good to verify this
with a test/benchmark.

> Thanks
>
> Michal
> --

--
Thanks and Regards
gautham.
Thread overview: 13+ messages

2021-04-02  5:37 Gautham R. Shenoy
2021-04-02  7:36 ` Gautham R Shenoy
2021-04-12  6:24 ` Srikar Dronamraju
2021-04-12  9:37 ` Mel Gorman
2021-04-12 10:06 ` Valentin Schneider
2021-04-12 10:48 ` Mel Gorman
2021-04-19  6:14 ` Gautham R Shenoy
2021-04-12 12:21 ` Vincent Guittot
2021-04-12 15:24 ` Mel Gorman
2021-04-12 16:33 ` Michal Suchánek
2021-04-14  7:02 ` Gautham R Shenoy [this message]
2021-04-13  7:10 ` Vincent Guittot
2021-04-14  7:00 ` Gautham R Shenoy