From: Anshuman Khandual <anshuman.khandual@arm.com>
To: "Ramakrishnan, Krupa" <Krupa.Ramakrishnan@amd.com>,
"Rao, Bharata Bhasker" <bharata@amd.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"kamezawa.hiroyu@jp.fujitsu.com" <kamezawa.hiroyu@jp.fujitsu.com>,
"lee.schermerhorn@hp.com" <lee.schermerhorn@hp.com>,
"mgorman@suse.de" <mgorman@suse.de>,
"Srinivasan, Sadagopan" <Sadagopan.Srinivasan@amd.com>
Subject: Re: [FIX PATCH 2/2] mm/page_alloc: Use accumulated load when building node fallback list
Date: Fri, 3 Sep 2021 09:31:39 +0530
Message-ID: <a051f54f-7bec-ab7b-cfac-d427b2e0e4bb@arm.com>
In-Reply-To: <SN6PR12MB2765859076BFE5B667A0C4719BCC9@SN6PR12MB2765.namprd12.prod.outlook.com>
On 8/31/21 8:56 PM, Ramakrishnan, Krupa wrote:
> The bandwidth is limited by underutilization of the cross-socket links, not by latency. With our routing protocol, hotspotting on one node does not engage all hardware resources, which results in lower bandwidth. Distributing allocations equally across nodes 0 and 1 yields the best results because it exercises the full system capabilities.
Makes sense. Nonetheless this patch clearly solves a problem.
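
For a sense of scale, the gap between the two cases in the STREAM table quoted below can be worked out directly. This is plain arithmetic on the posted numbers; the variable names are only illustrative and nothing here comes from the patch itself.

```python
# Bandwidth figures (MB/s) copied from the STREAM table quoted below:
# (Case 1, Case 2) per test. The test names follow the table as posted.
bandwidth = {
    "COPY": (57479.6, 110791.8),
    "SCALE": (55372.9, 105685.9),
    "ADD": (50460.6, 96734.2),
    "TRIADD": (50397.6, 97119.1),
}

# Ratio of Case 1 (memory on nodes 0 & 1) to Case 2 (memory on nodes 2 & 3).
ratios = {test: case1 / case2 for test, (case1, case2) in bandwidth.items()}

for test, ratio in ratios.items():
    print(f"{test:7s} Case 1 / Case 2 = {ratio:.2f}")
```

Every test comes out at about 0.52, so Case 1 reaches roughly half the bandwidth of Case 2.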
>
> Thanks
> Krupa Ramakrishnan
>
> -----Original Message-----
> From: Anshuman Khandual <anshuman.khandual@arm.com>
> Sent: 31 August, 2021 4:58
> To: Rao, Bharata Bhasker <bharata@amd.com>; linux-mm@kvack.org; linux-kernel@vger.kernel.org
> Cc: akpm@linux-foundation.org; kamezawa.hiroyu@jp.fujitsu.com; lee.schermerhorn@hp.com; mgorman@suse.de; Ramakrishnan, Krupa <Krupa.Ramakrishnan@amd.com>; Srinivasan, Sadagopan <Sadagopan.Srinivasan@amd.com>
> Subject: Re: [FIX PATCH 2/2] mm/page_alloc: Use accumulated load when building node fallback list
>
> On 8/30/21 5:46 PM, Bharata B Rao wrote:
>> As an example, consider a 4 node system with the following distance
>> matrix.
>>
>> Node  0   1   2   3
>> -------------------
>> 0     10  12  32  32
>> 1     12  10  32  32
>> 2     32  32  10  12
>> 3     32  32  12  10
>>
>> For this case, the node fallback list gets built like this:
>>
>> Node  Fallback list
>> ---------------------
>> 0     0 1 2 3
>> 1     1 0 3 2
>> 2     2 3 0 1
>> 3     3 2 0 1 <-- Unexpected fallback order
>>
>> In the fallback list for nodes 2 and 3, the nodes 0 and 1 appear in
>> the same order which results in more allocations getting satisfied
>> from node 0 compared to node 1.
>>
>> The effect of this on remote memory bandwidth as seen by stream
>> benchmark is shown below:
>>
>> Case 1: Bandwidth from cores on nodes 2 & 3 to memory on nodes 0 & 1
>> (numactl -m 0,1 ./stream_lowOverhead ... --cores <from 2, 3>)
>> Case 2: Bandwidth from cores on nodes 0 & 1 to memory on nodes 2 & 3
>> (numactl -m 2,3 ./stream_lowOverhead ... --cores <from 0, 1>)
>>
>> ----------------------------------------
>>               BANDWIDTH (MB/s)
>> TEST          Case 1        Case 2
>> ----------------------------------------
>> COPY          57479.6       110791.8
>> SCALE         55372.9       105685.9
>> ADD           50460.6        96734.2
>> TRIADD        50397.6        97119.1
>> ----------------------------------------
>>
>> The bandwidth drop in Case 1 occurs because most of the allocations
>> get satisfied by node 0 as it appears first in the fallback order for
>> both nodes 2 and 3.
>
> I am wondering what causes this performance drop. Would the memory access latency not be similar between {2, 3} ---> { 0 } and {2, 3} ---> { 1 }, given that both nodes {0, 1} are at the same distance from {2, 3}, i.e. 32 in the above distance matrix? Even if the preferred node changes from { 0 } to { 1 } for the accessing node { 3 }, that should not change the latency as such.
>
> Or is the performance drop caused instead by page allocation latency, resulting from excessive allocation on node { 0 }?
>
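
The fallback ordering quoted above can be reproduced with a small model. What follows is a minimal Python sketch, not the kernel implementation. It assumes that the next node is picked by distance, with a one-point penalty for lower-numbered nodes and a per-node load as tie-breaker, and that the first node of each new distance group is charged a decreasing load. It also assumes, going by the patch subject, that the fix accumulates that load instead of overwriting it. SCALE, next_best_node and build_fallback_lists are illustrative names.

```python
# Simplified model of NUMA fallback-list construction; NOT the kernel code.
DISTANCE = [
    [10, 12, 32, 32],
    [12, 10, 32, 32],
    [32, 32, 10, 12],
    [32, 32, 12, 10],
]
NR_NODES = len(DISTANCE)
SCALE = 1000  # assumption: large enough that load only breaks distance ties


def next_best_node(node, used, node_load):
    """Return the unused node with the lowest (distance, load) score."""
    if node not in used:
        return node  # the local node always comes first
    best, best_val = None, None
    for n in range(NR_NODES):
        if n in used:
            continue
        # Distance, plus a small penalty for nodes numbered below us.
        val = DISTANCE[node][n] + (1 if n < node else 0)
        # Load is a tie-breaker only: it never outweighs distance.
        val = val * SCALE + node_load[n]
        if best_val is None or val < best_val:
            best, best_val = n, val
    return best


def build_fallback_lists(accumulate):
    """Build one fallback list per node; node_load persists across nodes."""
    node_load = [0] * NR_NODES
    lists = []
    for local in range(NR_NODES):
        used, order = set(), []
        load, prev = NR_NODES, local
        while len(order) < NR_NODES:
            node = next_best_node(local, used, node_load)
            used.add(node)
            # Charge the first node of each new distance group.
            if DISTANCE[local][node] != DISTANCE[local][prev]:
                if accumulate:
                    node_load[node] += load  # assumed behaviour of the fix
                else:
                    node_load[node] = load   # overwrites earlier charges
            order.append(node)
            prev = node
            load -= 1
        lists.append(order)
    return lists
```

With accumulate=False the model yields 3 2 0 1 for node 3, matching the unexpected order in the quoted table. With accumulate=True node 3 gets 3 2 1 0, while the lists for nodes 0, 1 and 2 stay the same, so nodes 2 and 3 no longer both fall back to node 0 first.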
Thread overview: 13+ messages
2021-08-30 12:16 [FIX PATCH 0/2] Fix NUMA nodes fallback list ordering Bharata B Rao
2021-08-30 12:16 ` [FIX PATCH 1/2] mm/page_alloc: Print node fallback order Bharata B Rao
2021-08-30 12:26 ` Mel Gorman
2021-09-03 4:15 ` Anshuman Khandual
2021-09-03 4:17 ` Bharata B Rao
2021-09-03 4:31 ` Anshuman Khandual
2021-08-30 12:16 ` [FIX PATCH 2/2] mm/page_alloc: Use accumulated load when building node fallback list Bharata B Rao
2021-08-30 12:29 ` Mel Gorman
2021-08-31 9:58 ` Anshuman Khandual
2021-08-31 15:26 ` Ramakrishnan, Krupa
2021-09-03 4:01 ` Anshuman Khandual [this message]
2021-09-03 4:20 ` Anshuman Khandual
2021-09-03 4:43 ` Bharata B Rao