From: SeongJae Park <sj@kernel.org>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>,
akpm@linux-foundation.org, ying.huang@intel.com,
dave.hansen@linux.intel.com, ziy@nvidia.com, shy828301@gmail.com,
zhongjiang-ali@linux.alibaba.com, xlpang@linux.alibaba.com,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] mm/damon: Add a new scheme to support demotion on tiered memory system
Date: Thu, 23 Dec 2021 08:53:12 +0000 [thread overview]
Message-ID: <20211223085312.19748-1-sj@kernel.org> (raw)
In-Reply-To: <9db4baa1-2efe-8c01-e2e0-f275cc192fec@linux.alibaba.com>
On Wed, 22 Dec 2021 17:15:18 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>
>
> On 12/22/2021 4:43 PM, SeongJae Park wrote:
> > On Tue, 21 Dec 2021 22:18:06 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> >
> >> Hi SeongJae,
> >>
> >> On 12/21/2021 9:24 PM, SeongJae Park Wrote:
> >>> Hi Baolin,
> >>>
> >>>
> >>> Thank you for this great patch! I left some comments below.
> >>>
> >>> On Tue, 21 Dec 2021 17:18:04 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> >>>
> >>>> On tiered memory system, the reclaim path in shrink_page_list() already
> >>>> support demoting pages to slow memory node instead of discarding the
> >>>> pages. However, at that time the fast memory node memory wartermark is
> >>>> already tense, which will increase the memory allocation latency during
> >>>> page demotion.
> >>>>
> >>>> We can rely on the DAMON in user space to help to monitor the cold
> >>>> memory on fast memory node, and demote the cold pages to slow memory
> >>>> node proactively to keep the fast memory node in a healthy state.
> >>>> Thus this patch introduces a new scheme named DAMOS_DEMOTE to support
> >>>> this feature.
> >>>>
> >>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> >>>> ---
>
> [snip]
>
> >>
> >>>
> >>>> + list_add(&page->lru, &pagelist);
> >>>> + target_nid = next_demotion_node(page_to_nid(page));
> >>>> + nr_pages = thp_nr_pages(page);
> >>>> +
> >>>> + ret = migrate_pages(&pagelist, alloc_demote_page, NULL,
> >>>> + target_nid, MIGRATE_SYNC, MR_DEMOTION,
> >>>> + &nr_succeeded);
> >>>> + if (ret) {
> >>>> + if (!list_empty(&pagelist)) {
> >>>> + list_del(&page->lru);
> >>>> + mod_node_page_state(page_pgdat(page),
> >>>> + NR_ISOLATED_ANON + page_is_file_lru(page),
> >>>> + -nr_pages);
> >>>> + putback_lru_page(page);
> >>>> + }
> >>>> + } else {
> >>>> + __count_vm_events(PGDEMOTE_DIRECT, nr_succeeded);
> >>>> + demoted_pages += nr_succeeded;
> >>>> + }
> >>>
> >>> Why should we do above work for each page on our own instead of exporting and
> >>> calling 'demote_page_list()'?
> >>
> >> Cuase demote_page_list() is used for demote cold pages which are from
> >> one same fast memory node, they can have one same target demotion node.
> >>
> >> But for the regions for the process, we can get some cold pages from
> >> different fast memory nodes (suppose we have mulptiple dram node on the
> >> system), so its target demotion node may be different. Thus we should
> >> migrate each cold pages with getting the correct target demotion node.
> >
> > Thank you for the kind explanation. But, I still want to reuse the code rather
> > than copying if possible. How about below dumb ideas off the top of my head?
> >
> > 1. Call demote_page_list() for each page from here
>
> Sounds reasonable.
>
> > 2. Call demote_page_list() for each page from damos_migrate_pmd_entry()
>
> We are under mmap lock in damos_migrate_pmd_entry(), it is not suitable
> to do page migration.
>
> > 3. Make damos_migrate_pages_walk_ops to configure multiple demote_pages lists,
> > each per fast memory node, and calling demote_page_list() here for each
> > per-fast-memory-node demote_pages list.
>
> Which will bring more complexity I think, and we should avoid doing
> migration under mmap lock.
>
> > 4. Make demote_page_list() handles this case and use it. e.g., NULL pgdat
> > parameter means the pages are not from same node.
>
> Thanks for your suggestion, actually after more thinking, I think we can
> reuse the demote_page_list() and it can be easy to change. Somethink
> like below on top of my current patch, how do you think? Thanks.
So, you selected the option 1. I personally think option 3 or 4 would be more
optimal, but also agree that it could increase the complexity. As we already
have the time/space quota feature for schemes overhead control, I think
starting with this simple approach makes sense to me.
Thanks,
SJ
[...]
next prev parent reply other threads:[~2021-12-23 8:53 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-12-21 9:18 [PATCH 0/2] Add a new scheme to support demotion on tiered memory system Baolin Wang
2021-12-21 9:18 ` [PATCH 1/2] mm: Export the alloc_demote_page() function Baolin Wang
2021-12-21 10:13 ` SeongJae Park
2021-12-21 9:18 ` [PATCH 2/2] mm/damon: Add a new scheme to support demotion on tiered memory system Baolin Wang
2021-12-21 13:24 ` SeongJae Park
2021-12-21 14:18 ` Baolin Wang
2021-12-22 8:43 ` SeongJae Park
2021-12-22 9:15 ` Baolin Wang
2021-12-23 8:53 ` SeongJae Park [this message]
2021-12-21 13:26 ` [PATCH 0/2] " SeongJae Park
2021-12-21 14:32 ` Baolin Wang
2021-12-22 8:54 ` SeongJae Park
2021-12-22 9:57 ` Baolin Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211223085312.19748-1-sj@kernel.org \
--to=sj@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=dave.hansen@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=shy828301@gmail.com \
--cc=xlpang@linux.alibaba.com \
--cc=ying.huang@intel.com \
--cc=zhongjiang-ali@linux.alibaba.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).