All of lore.kernel.org
 help / color / mirror / Atom feed
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: SeongJae Park <sj@kernel.org>
Cc: akpm@linux-foundation.org, ying.huang@intel.com,
	dave.hansen@linux.intel.com, ziy@nvidia.com, shy828301@gmail.com,
	zhongjiang-ali@linux.alibaba.com, xlpang@linux.alibaba.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] mm/damon: Add a new scheme to support demotion on tiered memory system
Date: Wed, 22 Dec 2021 17:15:18 +0800	[thread overview]
Message-ID: <9db4baa1-2efe-8c01-e2e0-f275cc192fec@linux.alibaba.com> (raw)
In-Reply-To: <20211222084322.15902-1-sj@kernel.org>



On 12/22/2021 4:43 PM, SeongJae Park wrote:
> On Tue, 21 Dec 2021 22:18:06 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> 
>> Hi SeongJae,
>>
>> On 12/21/2021 9:24 PM, SeongJae Park Wrote:
>>> Hi Baolin,
>>>
>>>
>>> Thank you for this great patch!  I left some comments below.
>>>
>>> On Tue, 21 Dec 2021 17:18:04 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>
>>>> On tiered memory system, the reclaim path in shrink_page_list() already
>>>> support demoting pages to slow memory node instead of discarding the
>>>> pages. However, at that time the fast memory node memory wartermark is
>>>> already tense, which will increase the memory allocation latency during
>>>> page demotion.
>>>>
>>>> We can rely on the DAMON in user space to help to monitor the cold
>>>> memory on fast memory node, and demote the cold pages to slow memory
>>>> node proactively to keep the fast memory node in a healthy state.
>>>> Thus this patch introduces a new scheme named DAMOS_DEMOTE to support
>>>> this feature.
>>>>
>>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>>> ---

[snip]

>>
>>>
>>>> +		list_add(&page->lru, &pagelist);
>>>> +		target_nid = next_demotion_node(page_to_nid(page));
>>>> +		nr_pages = thp_nr_pages(page);
>>>> +
>>>> +		ret = migrate_pages(&pagelist, alloc_demote_page, NULL,
>>>> +				    target_nid, MIGRATE_SYNC, MR_DEMOTION,
>>>> +				    &nr_succeeded);
>>>> +		if (ret) {
>>>> +			if (!list_empty(&pagelist)) {
>>>> +				list_del(&page->lru);
>>>> +				mod_node_page_state(page_pgdat(page),
>>>> +						    NR_ISOLATED_ANON + page_is_file_lru(page),
>>>> +						    -nr_pages);
>>>> +				putback_lru_page(page);
>>>> +			}
>>>> +		} else {
>>>> +			__count_vm_events(PGDEMOTE_DIRECT, nr_succeeded);
>>>> +			demoted_pages += nr_succeeded;
>>>> +		}
>>>
>>> Why should we do above work for each page on our own instead of exporting and
>>> calling 'demote_page_list()'?
>>
>> Cuase demote_page_list() is used for demote cold pages which are from
>> one same fast memory node, they can have one same target demotion node.
>>
>> But for the regions for the process, we can get some cold pages from
>> different fast memory nodes (suppose we have mulptiple dram node on the
>> system), so its target demotion node may be different. Thus we should
>> migrate each cold pages with getting the correct target demotion node.
> 
> Thank you for the kind explanation.  But, I still want to reuse the code rather
> than copying if possible.  How about below dumb ideas off the top of my head?
> 
> 1. Call demote_page_list() for each page from here

Sounds reasonable.

> 2. Call demote_page_list() for each page from damos_migrate_pmd_entry()

We are under mmap lock in damos_migrate_pmd_entry(), it is not suitable 
to do page migration.

> 3. Make damos_migrate_pages_walk_ops to configure multiple demote_pages lists,
>     each per fast memory node, and calling demote_page_list() here for each
>     per-fast-memory-node demote_pages list.

Which will bring more complexity I think, and we should avoid doing 
migration under mmap lock.

> 4. Make demote_page_list() handles this case and use it.  e.g., NULL pgdat
>     parameter means the pages are not from same node.

Thanks for your suggestion, actually after more thinking, I think we can 
reuse the demote_page_list() and it can be easy to change. Somethink 
like below on top of my current patch, how do you think? Thanks.

--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -796,7 +796,7 @@ static unsigned long damos_demote(struct 
damon_target *target,
         struct mm_struct *mm;
         LIST_HEAD(demote_pages);
         LIST_HEAD(pagelist);
-       int target_nid, nr_pages, ret = 0;
+       int nr_pages;
         unsigned int nr_succeeded, demoted_pages = 0;
         struct page *page, *next;

@@ -818,14 +818,11 @@ static unsigned long damos_demote(struct 
damon_target *target,
                 return 0;

         list_for_each_entry_safe(page, next, &demote_pages, lru) {
-               list_add(&page->lru, &pagelist);
-               target_nid = next_demotion_node(page_to_nid(page));
                 nr_pages = thp_nr_pages(page);
+               list_add(&page->lru, &pagelist);

-               ret = migrate_pages(&pagelist, alloc_demote_page, NULL,
-                                   target_nid, MIGRATE_SYNC, MR_DEMOTION,
-                                   &nr_succeeded);
-               if (ret) {
+               nr_succeeded = demote_page_list(pagelist, page_pgdat(page));
+               if (!nr_succeeded) {
                         if (!list_empty(&pagelist)) {
                                 list_del(&page->lru);
                                 mod_node_page_state(page_pgdat(page),
@@ -834,11 +831,9 @@ static unsigned long damos_demote(struct 
damon_target *target,
                                 putback_lru_page(page);
                         }
                 } else {
-                       __count_vm_events(PGDEMOTE_DIRECT, nr_succeeded);
                         demoted_pages += nr_succeeded;
                 }

-               INIT_LIST_HEAD(&pagelist);
                 cond_resched();
         }

  reply	other threads:[~2021-12-22  9:14 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-21  9:18 [PATCH 0/2] Add a new scheme to support demotion on tiered memory system Baolin Wang
2021-12-21  9:18 ` [PATCH 1/2] mm: Export the alloc_demote_page() function Baolin Wang
2021-12-21 10:13   ` SeongJae Park
2021-12-21  9:18 ` [PATCH 2/2] mm/damon: Add a new scheme to support demotion on tiered memory system Baolin Wang
2021-12-21 13:24   ` SeongJae Park
2021-12-21 14:18     ` Baolin Wang
2021-12-22  8:43       ` SeongJae Park
2021-12-22  9:15         ` Baolin Wang [this message]
2021-12-23  8:53           ` SeongJae Park
2021-12-22 11:17   ` kernel test robot
2021-12-22 11:17     ` kernel test robot
2021-12-21 13:26 ` [PATCH 0/2] " SeongJae Park
2021-12-21 14:32   ` Baolin Wang
2021-12-22  8:54     ` SeongJae Park
2021-12-22  9:57       ` Baolin Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9db4baa1-2efe-8c01-e2e0-f275cc192fec@linux.alibaba.com \
    --to=baolin.wang@linux.alibaba.com \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=shy828301@gmail.com \
    --cc=sj@kernel.org \
    --cc=xlpang@linux.alibaba.com \
    --cc=ying.huang@intel.com \
    --cc=zhongjiang-ali@linux.alibaba.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.