linux-kernel.vger.kernel.org archive mirror
From: "Suthikulpanit, Suravee" <Suravee.Suthikulpanit@amd.com>
To: Mel Gorman <mgorman@techsingularity.net>,
	Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"Lendacky, Thomas" <Thomas.Lendacky@amd.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH] sched/topology: Improve load balancing on AMD EPYC
Date: Wed, 26 Jun 2019 21:18:01 +0000
Message-ID: <989944bc-6c3a-43b5-4f95-0bdfcc6d6c29@amd.com>
In-Reply-To: <20190624142420.GC2978@techsingularity.net>

On 6/24/19 9:24 AM, Mel Gorman wrote:
> On Wed, Jun 19, 2019 at 10:34:37PM +0100, Matt Fleming wrote:
>> On Tue, 18 Jun, at 02:33:18PM, Peter Zijlstra wrote:
>>> On Tue, Jun 18, 2019 at 11:43:19AM +0100, Matt Fleming wrote:
>>>> This works for me under all my tests. Thoughts?
>>>>
>>>> --->8---
>>>>
>>>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>>>> index 80a405c2048a..4db4e9e7654b 100644
>>>> --- a/arch/x86/kernel/cpu/amd.c
>>>> +++ b/arch/x86/kernel/cpu/amd.c
>>>> @@ -8,6 +8,7 @@
>>>>   #include <linux/sched.h>
>>>>   #include <linux/sched/clock.h>
>>>>   #include <linux/random.h>
>>>> +#include <linux/topology.h>
>>>>   #include <asm/processor.h>
>>>>   #include <asm/apic.h>
>>>>   #include <asm/cacheinfo.h>
>>>> @@ -824,6 +825,8 @@ static void init_amd_zn(struct cpuinfo_x86 *c)
>>>>   {
>>>>   	set_cpu_cap(c, X86_FEATURE_ZEN);
>>>>   
>>>
>>> I'm thinking this deserves a comment. Traditionally the SLIT table held
>>> relative memory latency. So where the identity is 10, 16 would indicate
>>> 1.6 times local latency and 32 would be 3.2 times local.
>>>
>>> Now, even very early on BIOS monkeys went about their business and put
>>> in random values in an attempt to 'tune' the system based on how
>>> $random-os behaved, which is all sorts of fu^Wwrong.
>>>
>>> Now, I suppose my question is; is that 32 Zen puts in an actual relative
>>> memory latency metric, or a random value we somehow have to deal with.
>>> And can we pretty please describe the whole sordid story behind this
>>> 'tunable' somewhere?
>>
>> This is one for the AMD folks. I don't know if the memory latency
>> really is 3.2 times or not, only that that's the value in all the Zen
>> machines I have access to. Even this 2-socket one:
>>
>> node distances:
>> node   0   1
>>    0:  10  32
>>    1:  32  10
>>
>> Tom, Suravee?
> 
> Do not consider this an authoritative response but based on what I know
> of the physical topology, it is not unreasonable to use 32 in the SLIT
> table. There is a small latency when accessing another die on the same
> socket (details are generation specific). It's not quite a local access
> but it's not as much as a traditional remote access either (hence 16 being
> the base unit for another die to hint that it's not quite local but not
> quite remote either). 32 is based on accessing a die on a remote socket
> based on the expected performance and latency of the interconnect.
> 
> To the best of my knowledge, the magic numbers are reflective of the real
> topology and not just a gamification of the numbers for a random OS. If
> anything, the fact that there is a load balancing issue on Linux would
> indicate that they were not picking random numbers for Linux at least :P
> 

We use 16 to designate 1-hop latency (a different node within the same
socket). For cross-socket access, the latency is greater, so we set the
distance to 32 (twice the 1-hop value). We were not aware of
RECLAIM_DISTANCE at the time.

At this point, it might not be possible to change the SLIT values on
existing platforms already out in the field. So introducing the AMD
family 17h quirk, as Matt suggested, would be a more feasible approach.

Going forward, we will make sure that the SLIT distances do not exceed the
standard RECLAIM_DISTANCE (30).

Thanks,
Suravee


Thread overview: 11+ messages
2019-06-05 15:59 [PATCH] sched/topology: Improve load balancing on AMD EPYC Matt Fleming
2019-06-05 18:00 ` Peter Zijlstra
2019-06-10 21:26   ` Matt Fleming
2019-06-11 17:22     ` Lendacky, Thomas
2019-06-18 10:43       ` Matt Fleming
2019-06-18 12:33         ` Peter Zijlstra
2019-06-19 21:34           ` Matt Fleming
2019-06-24 14:24             ` Mel Gorman
2019-06-26 21:18               ` Suthikulpanit, Suravee [this message]
2019-06-28 15:15                 ` Matt Fleming
2019-07-22 14:11                   ` Suthikulpanit, Suravee
