From: Gregory Price <gregory.price@memverge.com>
To: Yuanchu Xie <yuanchu@google.com>
Cc: David Hildenbrand <david@redhat.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Khalid Aziz <khalid.aziz@oracle.com>,
	Henry Huang <henry.hj@antgroup.com>, Yu Zhao <yuzhao@google.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Huang Ying <ying.huang@intel.com>, Wei Xu <weixugc@google.com>,
	David Rientjes <rientjes@google.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Muchun Song <muchun.song@linux.dev>,
	Shuah Khan <shuah@kernel.org>,
	Yosry Ahmed <yosryahmed@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>,
	Kairui Song <kasong@tencent.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Vasily Averin <vasily.averin@linux.dev>,
	Nhat Pham <nphamcs@gmail.com>, Miaohe Lin <linmiaohe@huawei.com>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	Abel Wu <wuyun.abel@bytedance.com>,
	"Vishal Moola (Oracle)" <vishal.moola@gmail.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [RFC PATCH v3 0/8] mm: workingset reporting
Date: Fri, 29 Mar 2024 13:28:13 -0400
Message-ID: <Zgb6LQndjoFVu4pv@memverge.com>
In-Reply-To: <CAJj2-QEg3+Ztg3rK6FpVVCxSG4DaDPWsO_bha5v5GrJazc5DVQ@mail.gmail.com>

On Wed, Mar 27, 2024 at 03:53:39PM -0700, Yuanchu Xie wrote:
> On Wed, Mar 27, 2024 at 2:44 PM Gregory Price
> <gregory.price@memverge.com> wrote:
> >
> > Please note that this proposed interface (move_phys_pages) is very
> > unlikely to be received upstream due to side channel concerns. Instead,
> > it's more likely that the tiering component will expose a "promote X
> > pages from tier A to tier B", and the kernel component would then
> > use/consume hotness information to determine which pages to promote.
> 
> I see that mm/memory-tiers.c only has support for demotion. What kind
> of hotness information do devices typically provide? The OCP proposal
> is not very specific about this.
> A list of hot pages with configurable threshold?
> Access frequency for all pages at configured granularity?
> Is there a way to tell which NUMA node is accessing them, for page promotion?

(caveat: I'm not a memory-tiers maintainer, so you may want to ask them
directly for more information; this is simply spitballing an idea)

I don't know of any public proposals of explicit hotness information
provided by hardware yet, just the general proposal.

For the sake of simplicity, I would assume that you have the least
information possible - a simple list of "hot addresses" in Host Physical
Address format.

I.e. there's some driver function that amounts to:

uint32_t device_get_hot_addresses(uint64_t *addresses, uint32_t buf_max);

Where the return value is the number of addresses the device returned,
and buf_max is the maximum number of addresses the buffer can hold.

Drivers providing this functionality would then register this as a
callback when their memory becomes a member of some NUMA node.
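
To make the registration idea concrete, here is a rough userspace sketch.
Only the device_get_hot_addresses() signature comes from the text above;
MAX_NODES, register_hot_addr_source(), collect_hot_addresses(), and the
mock device body with its three addresses are names and values I made up
for illustration. None of this is an existing kernel interface.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 8 /* hypothetical limit, for this sketch only */

/* Callback type matching the signature proposed above. */
typedef uint32_t (*hot_addr_fn)(uint64_t *addresses, uint32_t buf_max);

/* One optional hot-address source per NUMA node (hypothetical table). */
static hot_addr_fn hot_addr_cb[MAX_NODES];

/* Hypothetical: a driver calls this when its memory joins node nid. */
int register_hot_addr_source(int nid, hot_addr_fn fn)
{
	if (nid < 0 || nid >= MAX_NODES || fn == NULL)
		return -1;
	hot_addr_cb[nid] = fn;
	return 0;
}

/* Hypothetical: the tiering side asks node nid for hot addresses. */
uint32_t collect_hot_addresses(int nid, uint64_t *buf, uint32_t buf_max)
{
	uint32_t n;

	if (nid < 0 || nid >= MAX_NODES || hot_addr_cb[nid] == NULL)
		return 0;
	n = hot_addr_cb[nid](buf, buf_max);
	/* Do not trust the device to respect buf_max. */
	return n > buf_max ? buf_max : n;
}

/* Mock device: reports a fixed, made-up list of host physical addresses. */
uint32_t device_get_hot_addresses(uint64_t *addresses, uint32_t buf_max)
{
	static const uint64_t hot[] = {
		0x100000000ULL, 0x100001000ULL, 0x100200000ULL,
	};
	uint32_t n = sizeof(hot) / sizeof(hot[0]);
	uint32_t i;

	if (n > buf_max)
		n = buf_max;
	for (i = 0; i < n; i++)
		addresses[i] = hot[i];
	return n;
}
```

A consumer would call register_hot_addr_source(nid, device_get_hot_addresses)
once, then poll collect_hot_addresses(nid, buf, buf_max) and feed the result
to whatever makes the promotion decision. The clamp on the return value is
there because the caller owns the buffer and should not rely on the device
honouring buf_max.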


Re: source node -
Devices have no real way of determining upstream source information.

> >
> > (Just as one example, there are many more realistic designs)
> >
> > So if there is a way to expose workingset data to the mm/memory_tiers.c
> > component instead of via sysfs/cgroup - that is preferable.
> 
> Appreciate the feedback. The data in its current form might be useful
> to inform demotion decisions, but for promotion, are you aware of any
> recent developments? I would like to encode hotness as workingset data
> as well.

There were some recent patches to DAMON about promotion/demotion.  You
might look there.

~Gregory
