From: "Huang, Ying" <ying.huang@intel.com>
To: Alistair Popple <apopple@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,  <linux-mm@kvack.org>,
	<linux-kernel@vger.kernel.org>,  <linux-cxl@vger.kernel.org>,
	<nvdimm@lists.linux.dev>,  <linux-acpi@vger.kernel.org>,
	 "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
	 Wei Xu <weixugc@google.com>,
	 Dan Williams <dan.j.williams@intel.com>,
	 Dave Hansen <dave.hansen@intel.com>,
	"Davidlohr Bueso" <dave@stgolabs.net>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	 "Jonathan Cameron" <Jonathan.Cameron@huawei.com>,
	Michal Hocko <mhocko@kernel.org>,  Yang Shi <shy828301@gmail.com>,
	Rafael J Wysocki <rafael.j.wysocki@intel.com>
Subject: Re: [PATCH RESEND 3/4] acpi, hmat: calculate abstract distance with HMAT
Date: Tue, 22 Aug 2023 07:28:49 +0800
Message-ID: <87edjwc6vi.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <878ra4wqz0.fsf@nvdebian.thelocal> (Alistair Popple's message of "Mon, 21 Aug 2023 21:53:13 +1000")

Alistair Popple <apopple@nvidia.com> writes:

> "Huang, Ying" <ying.huang@intel.com> writes:
>
>> Alistair Popple <apopple@nvidia.com> writes:
>>
>>> Huang Ying <ying.huang@intel.com> writes:
>>>
>>>> A memory tiering abstract distance calculation algorithm based on ACPI
>>>> HMAT is implemented.  The basic idea is as follows.
>>>>
>>>> The performance attributes of system default DRAM nodes are recorded
>>>> as the baseline, whose abstract distance is MEMTIER_ADISTANCE_DRAM.
>>>> Then, the ratio of the abstract distance of a memory node (target) to
>>>> MEMTIER_ADISTANCE_DRAM is scaled based on the ratio of the performance
>>>> attributes of the node to that of the default DRAM nodes.
>>>
>>> The problem I encountered here with the calculations is that HBM memory
>>> ended up in a lower-tiered node which isn't what I wanted (at least when
>>> that HBM is attached to a GPU say).
>>
>> I have tested the series on a server machine with HBM (pure HBM, not
>> attached to a GPU).  There, HBM is placed in a higher tier than DRAM.
>
> Good to know.
>
>>> I suspect this is because the calculations are based on the CPU
>>> point-of-view (access1) which still sees lower bandwidth to remote HBM
>>> than local DRAM, even though the remote GPU has higher bandwidth access
>>> to that memory. Perhaps we need to be considering access0 as well?
>>> Ie. HBM directly attached to a generic initiator should be in a higher
>>> tier regardless of CPU access characteristics?
>>
>> What are your requirements for memory tiers on the machine?  I guess you
>> want to put GPU-attached HBM in a higher tier and put DRAM in a lower
>> tier, so that cold HBM pages can be demoted to DRAM when there is memory
>> pressure on HBM?  This sounds reasonable from the GPU point of view.
>
> Yes, that is what I would like to implement.
>
>> The above requirements may be satisfied via calculating abstract
>> distance based on access0 (or combined with access1).  But I suspect
>> this will be a general solution.  I guess that any memory devices that
>> are used mainly by the memory initiators other than CPUs want to put
>> themselves in a higher memory tier than DRAM, regardless of its
>> access0.
>
> Right. I'm still figuring out how ACPI HMAT fits together but that
> sounds reasonable.
>
>> One solution is to always put GPU HBM in the highest memory tier (with
>> the smallest abstract distance) in the GPU device driver, regardless of
>> its HMAT performance attributes.  Is it possible?
>
> It's certainly possible and easy enough to do, although I think it would
> be good to provide upper and lower bounds for HMAT derived adistances to
> make that easier. It does make me wonder what the point of HMAT is if we
> have to ignore it in some scenarios though. But perhaps I need to dig
> deeper into the GPU values to figure out how it can be applied correctly
> there.

In the original design (page 11 of [1]),

[1] https://lpc.events/event/16/contributions/1209/attachments/1042/1995/Live%20In%20a%20World%20With%20Multiple%20Memory%20Types.pdf

the default memory tier hierarchy is based on performance from the CPU's
point of view.  Then the abstract distance of a memory type (e.g., GPU
HBM) can be adjusted via a sysfs knob
(<memory_type>/abstract_distance_offset) based on the requirements of
the GPU.

That's another possible solution.
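
To make the two ideas above concrete, here is a minimal user-space sketch
in C.  It is not the patch itself.  It assumes the abstract distance
scales with the latency ratio and inversely with the bandwidth ratio
relative to the default DRAM baseline, and that an offset is simply added
to the result.  All sketch_* names, the value 576 used for the DRAM
abstract distance, and the clamp to a minimum of 1 are assumptions made
for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Assumed stand-in for MEMTIER_ADISTANCE_DRAM; the value is illustrative. */
#define SKETCH_ADISTANCE_DRAM 576

/* Hypothetical container for the performance attributes of one node. */
struct sketch_perf {
	unsigned int read_latency;	/* ns */
	unsigned int write_latency;	/* ns */
	unsigned int read_bandwidth;	/* MB/s */
	unsigned int write_bandwidth;	/* MB/s */
};

/*
 * Scale the DRAM abstract distance by how the node compares to the
 * default DRAM baseline: higher latency increases the distance, higher
 * bandwidth decreases it.
 */
static int sketch_perf_to_adistance(const struct sketch_perf *dram,
				    const struct sketch_perf *node)
{
	unsigned long long lat_dram, lat_node, bw_dram, bw_node, adist;

	lat_dram = (unsigned long long)dram->read_latency + dram->write_latency;
	lat_node = (unsigned long long)node->read_latency + node->write_latency;
	bw_dram = (unsigned long long)dram->read_bandwidth + dram->write_bandwidth;
	bw_node = (unsigned long long)node->read_bandwidth + node->write_bandwidth;

	/* Both divisors must be non-zero for the ratios to be defined. */
	assert(lat_dram > 0 && bw_node > 0);

	adist = SKETCH_ADISTANCE_DRAM * lat_node / lat_dram * bw_dram / bw_node;
	return adist > INT_MAX ? INT_MAX : (int)adist;
}

/*
 * Apply a per-memory-type offset, in the spirit of the
 * abstract_distance_offset knob.  Clamping to 1 is an assumption.
 */
static int sketch_apply_offset(int adist, int offset)
{
	long long shifted = (long long)adist + offset;

	if (shifted < 1)
		return 1;
	return shifted > INT_MAX ? INT_MAX : (int)shifted;
}
```

With a DRAM baseline of 100 ns and 100000 MB/s for both reads and writes,
a node that the CPU sees at 150 ns and 50000 MB/s gets an abstract
distance of 1728 in this sketch, i.e. a lower tier than DRAM.  Applying
an offset of -1500 brings it to 228, which is below 576 and so places it
in a higher tier than DRAM.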

--
Best Regards,
Huang, Ying

