All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Gautham R. Shenoy" <gautham.shenoy@amd.com>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, linux-tip-commits@vger.kernel.org,
	Tejun Heo <tj@kernel.org>,
	x86@kernel.org
Subject: Re: [tip: sched/core] sched/fair: Multi-LLC select_idle_sibling()
Date: Fri, 2 Jun 2023 14:36:37 +0530	[thread overview]
Message-ID: <ZHmxHWbkQWvcq+bZ@BLR-5CG11610CF.amd.com> (raw)
In-Reply-To: <a78b5df0-2374-29bf-f948-3054f1e7e46c@amd.com>

Hello Peter,

On Fri, Jun 02, 2023 at 10:47:07AM +0530, K Prateek Nayak wrote:
> Hello Peter,
> 
> On 6/1/2023 8:21 PM, Peter Zijlstra wrote:
> > On Thu, Jun 01, 2023 at 02:00:01PM +0200, Peter Zijlstra wrote:
> >> On Thu, Jun 01, 2023 at 01:56:43PM +0200, Peter Zijlstra wrote:
> >>> On Thu, Jun 01, 2023 at 01:13:26PM +0200, Peter Zijlstra wrote:
> >>>>
> >>>> This DeathStarBench thing seems to suggest that scanning up to 4 CCDs
> >>>> isn't too much of a bother; so perhaps something like so?
> >>>>
> >>>> (on top of tip/sched/core from just a few hours ago, as I had to 'fix'
> >>>> this patch and force pushed the thing)
> >>>>
> >>>> And yeah, random hacks and heuristics here :/ Does there happen to be
> >>>> additional topology that could aid us here? Does the CCD fabric itself
> >>>> have a distance metric we can use?
> >>>
> >>>   https://www.anandtech.com/show/16529/amd-epyc-milan-review/4
> >>>
> >>> Specifically:
> >>>
> >>>   https://images.anandtech.com/doci/16529/Bounce-7763.png
> >>>
> >>> That seems to suggest there are some very minor distance effects in the
> >>> CCD fabric. I didn't read the article too closely, but you'll note that
> >>> the first 4 CCDs have inter-CCD latency < 100 while the rest has > 100.
> >>>
> >>> Could you also test on a Zen2 Epyc, does that require nr=8 instead of 4?
> >>> Should we perhaps write it like: 32 / llc_size ?
> >>>
> >>> The Zen2 picture:
> >>>
> >>>   https://images.anandtech.com/doci/16315/Bounce-7742.png
> >>>
> >>> Shows a more pronounced CCD fabric topology, you can really see the 2
> >>> CCX inside the CCD but also there's two ligher green squares around the
> >>> CCDs themselves.
> >>
> >> I can't seem to find pretty pictures for Zen4 Epyc; what does that want?
> >> That's even bigger at 96/8=12 LLCs afaict.
> > 
> > Going by random pictures on the interweb again, it looks like this Zen4
> > thing wants either 2 groups of 6 each, or 4 groups of 3.
>

Yes, this is what the topology looks like

|---------------------------------------------------------------------------------| 
|                                                                                 |
|   ----------- ----------- -----------     ----------- ----------- -----------   |
|   |(0-7)    | |(8-15)   | |(16-23)  |     |(48-55)  | |(56-63)  | |(64-71)  |   |
|   | LLC0    | | LLC1    | | LLC2    |     | LLC6    | | LLC7    | | LLC8    |   |
|   |(96-103) | |(104-111)| |(112-119)|     |(144-151)| |(152-159)| |(160-167)|   |
|   ----------- ----------- -----------     ----------- ----------- -----------   |
|                                                                                 |
|                                                                                 |
|   ----------- ----------- -----------     ----------- ----------- -----------   |
|   |(24-31)  | |(32-39)  | |(40-47)  |     |(72-79)  | |(80-87)  | |(88-95)  |   |
|   | LLC3    | | LLC4    | | LLC5    |     | LLC9    | | LLC10   | | LLC11   |   |
|   |(120-127)| |(128-135)| |(136-143)|     |(168-175)| |(176-183)| |(184-191)|   |
|   ----------- ----------- -----------     ----------- ----------- -----------   |
|                                                                                 |
|---------------------------------------------------------------------------------|


> I would think it is the latter since NPS4 does that but let me go verify.

2 groups of 6 each is the vertical split which is NPS2.

4 groups of 3 each is the vertical and horizontal split, which is
NPS4.

In both these cases, currently the domain hierarchy

SMT --> MC --> NODE --> NUMA

where the NODE will be the parent of MC and be the 2nd level wakeup domain.

If we define CLS to be the group with 3 LLCs, which becomes the parent
of the MC domain, then, the hierarchy would be

NPS1 : SMT --> MC --> CLS --> DIE
NPS2 : SMT --> MC --> CLS --> NODE --> NUMA
NPS4 : SMT --> MC --> CLS --> NUMA

NPS2 will have 5 domains within a single socket. Oh well!

--
Thanks and Regards
gautham.

  reply	other threads:[~2023-06-02  9:09 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-31 12:04 [tip: sched/core] sched/fair: Multi-LLC select_idle_sibling() tip-bot2 for Peter Zijlstra
2023-06-01  3:41 ` Abel Wu
2023-06-01  8:09   ` Peter Zijlstra
2023-06-01  9:33 ` K Prateek Nayak
2023-06-01 11:13   ` Peter Zijlstra
2023-06-01 11:56     ` Peter Zijlstra
2023-06-01 12:00       ` Peter Zijlstra
2023-06-01 14:47         ` Peter Zijlstra
2023-06-01 15:35           ` Peter Zijlstra
2023-06-02  5:13           ` K Prateek Nayak
2023-06-02  6:54             ` Peter Zijlstra
2023-06-02  9:19               ` K Prateek Nayak
2023-06-07 18:32               ` K Prateek Nayak
2023-06-13  8:25                 ` Peter Zijlstra
2023-06-13 10:30                   ` K Prateek Nayak
2023-06-14  8:17                     ` Peter Zijlstra
2023-06-14 14:58                       ` Chen Yu
2023-06-14 15:13                         ` Peter Zijlstra
2023-06-21  7:16                           ` Chen Yu
2023-06-16  6:34                   ` K Prateek Nayak
2023-07-05 11:57                     ` Peter Zijlstra
2023-07-08 13:17                       ` Chen Yu
2023-07-12 17:19                         ` Chen Yu
2023-07-13  3:43                           ` K Prateek Nayak
2023-07-17  1:09                             ` Chen Yu
2023-06-02  7:00             ` Peter Zijlstra
2023-06-01 14:51         ` Peter Zijlstra
2023-06-02  5:17           ` K Prateek Nayak
2023-06-02  9:06             ` Gautham R. Shenoy [this message]
2023-06-02 11:23               ` Peter Zijlstra
2023-06-01 16:44     ` Chen Yu
2023-06-02  3:12     ` K Prateek Nayak
     [not found] ` <CGME20230605152531eucas1p2a10401ec2180696cc9a5f2e94a67adca@eucas1p2.samsung.com>
2023-06-05 15:25   ` Marek Szyprowski
2023-06-05 17:56     ` Peter Zijlstra
2023-06-05 19:07     ` Peter Zijlstra
2023-06-05 22:20       ` Marek Szyprowski
2023-06-06  7:58       ` Chen Yu
2023-06-01  8:43 tip-bot2 for Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZHmxHWbkQWvcq+bZ@BLR-5CG11610CF.amd.com \
    --to=gautham.shenoy@amd.com \
    --cc=kprateek.nayak@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tj@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.