linux-kernel.vger.kernel.org archive mirror
From: "Huang, Ying" <ying.huang@intel.com>
To: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>,
	Wei Xu <weixugc@google.com>, Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	Yang Shi <shy828301@gmail.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Tim C Chen <tim.c.chen@intel.com>,
	Michal Hocko <mhocko@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Hesham Almatary <hesham.almatary@huawei.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Alistair Popple <apopple@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	jvgediya.oss@gmail.com, Jagdish Gediya <jvgediya@linux.ibm.com>
Subject: Re: [PATCH v10 1/8] mm/demotion: Add support for explicit memory tiers
Date: Wed, 27 Jul 2022 09:16:08 +0800
Message-ID: <87lesfuzhj.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <9e9ba2e4-3a87-3a79-e336-8849dad4856a@linux.ibm.com> (Aneesh Kumar K. V.'s message of "Tue, 26 Jul 2022 17:29:56 +0530")

Aneesh Kumar K V <aneesh.kumar@linux.ibm.com> writes:

>>> diff --git a/include/linux/node.h b/include/linux/node.h
>>> index 40d641a8bfb0..a2a16d4104fd 100644
>>> --- a/include/linux/node.h
>>> +++ b/include/linux/node.h
>>> @@ -92,6 +92,12 @@ struct node {
>>>  	struct list_head cache_attrs;
>>>  	struct device *cache_dev;
>>>  #endif
>>> +	/*
>>> +	 * For memory devices, perf_level describes
>>> +	 * the device performance and how it should be used
>>> +	 * while building a memory hierarchy.
>>> +	 */
>>> +	int perf_level;
>> 
>> On second thought, "perf_level" may not be the best abstraction of the
>> performance of memory devices.  Conceptually, it's an abstraction of the
>> memory bandwidth, but it does not reflect the memory latency.
>> 
>> Instead, the previously proposed "abstract_distance" is an abstraction
>> of the memory latency.  Per my understanding, the memory latency has a
>> more direct influence on system performance.  And because the latency
>> of a memory device increases as the memory accessing throughput nears
>> its max bandwidth, the memory bandwidth can be reflected in the
>> "abstract distance" too.  That is, the "abstract distance" is an
>> abstraction of the memory latency under the expected memory accessing
>> throughput.  The "offset" to the default "abstract distance" reflects a
>> different expected memory accessing throughput.
>> 
>> So, I think we need some kind of abstraction of the memory latency
>> instead of memory bandwidth, e.g., "abstract distance".
>> 
>
> I am reworking other parts of the patch set based on your feedback.

Thanks!

> On this part, I guess we need to reach some consensus.

Yes.  Let's do that.

> IMHO perf_level (performance level) can indicate a combination of both latency
> and bandwidth.

"abstract distance" is based on latency, and bandwidth is reflected via
"latency under the expected memory accessing throughput".
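
To illustrate what "latency under the expected memory accessing
throughput" could mean, here is a minimal C sketch.  It assumes a simple
queueing-style model in which latency grows as the throughput approaches
the max bandwidth.  The model, the function name loaded_latency_ns() and
its parameters are assumptions made for illustration only; they are not
part of the patch set and are not derived from measured hardware.

```c
/*
 * Toy model (an assumption, not from the patch set): latency rises as
 * the achieved throughput approaches the device's max bandwidth.
 *
 *   loaded = idle / (1 - throughput / max_bandwidth)
 *
 * max_bandwidth and throughput may use any unit, as long as both use
 * the same one.  Returns a negative value if no finite latency exists.
 */
static double loaded_latency_ns(double idle_latency_ns,
				double max_bandwidth,
				double throughput)
{
	double utilization;

	if (max_bandwidth <= 0.0 || throughput < 0.0)
		return -1.0;	/* invalid input */

	utilization = throughput / max_bandwidth;
	if (utilization >= 1.0)
		return -1.0;	/* saturated: the model gives no finite latency */

	return idle_latency_ns / (1.0 - utilization);
}
```

Under this model, a device with low idle latency but little bandwidth
headroom can end up with a larger value than a device with higher idle
latency and more headroom, which is the sense in which bandwidth is
reflected in a latency-based number.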

How does perf_level indicate the combination?  Per my understanding,
it's bandwidth based.

> It is an abstract concept that indicates the performance of the
> device. As we learn more about which device attribute has more impact in
> defining the hierarchy, performance level will give more weight to that
> specific attribute. It could be write latency or bandwidth. For me, distance
> has a direct linkage to latency because that is how we define numa distance
> now. Adding abstract to the name is not making it more abstract than
> perf_level.
>
> I am open to suggestions from others.  Wei Xu has also suggested the perf_level name.
> I can rename this to abstract_distance if that indicates the goal better.

I'm open on the naming.  But I think it's good to define it to some
degree instead of leaving it completely opaque.  If it's latency based,
then a low value corresponds to high performance.  If it's bandwidth
based, then a low value corresponds to low performance.
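
To make the point about direction concrete, a small C sketch follows.
All names in it (struct mem_dev, enum metric_kind, order_fast_to_slow())
are hypothetical and exist only for this illustration; the patch set
does not define them.  It only shows that code consuming the value must
know which convention applies before it can order devices into tiers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names for illustration; none of these exist in the patch set. */
enum metric_kind {
	METRIC_LATENCY_BASED,	/* low value corresponds to high performance */
	METRIC_BANDWIDTH_BASED,	/* low value corresponds to low performance */
};

struct mem_dev {
	int nid;	/* NUMA node id */
	int value;	/* the single performance number attached to the node */
};

static int cmp_ascending(const void *a, const void *b)
{
	const struct mem_dev *x = a, *y = b;

	return (x->value > y->value) - (x->value < y->value);
}

static int cmp_descending(const void *a, const void *b)
{
	return cmp_ascending(b, a);
}

/* Sort devices so that index 0 is the fastest (top tier) device. */
static void order_fast_to_slow(struct mem_dev *devs, size_t n,
			       enum metric_kind kind)
{
	qsort(devs, n, sizeof(*devs),
	      kind == METRIC_LATENCY_BASED ? cmp_ascending : cmp_descending);
}
```

With a completely opaque value, the caller cannot choose between the two
comparators, so the same numbers could produce an inverted hierarchy.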

Hi, Wei and Johannes,

What do you think about this?

Best Regards,
Huang, Ying
