From: "Huang, Ying" <ying.huang@intel.com>
To: Yang Shi <shy828301@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Yang Shi <yang.shi@linux.alibaba.com>,
"David Rientjes" <rientjes@google.com>,
Dan Williams <dan.j.williams@intel.com>,
Linux-MM <linux-mm@kvack.org>
Subject: Re: [RFC][PATCH 5/9] mm/migrate: demote pages during reclaim
Date: Fri, 21 Aug 2020 08:57:50 +0800
Message-ID: <87v9hcvmr5.fsf@yhuang-dev.intel.com>
In-Reply-To: <CAHbLzkrjxm38VV+ibQxoQkC4nW7F13aJcL5RypUchX30rqUstA@mail.gmail.com> (Yang Shi's message of "Thu, 20 Aug 2020 09:26:57 -0700")
Yang Shi <shy828301@gmail.com> writes:
> On Thu, Aug 20, 2020 at 8:22 AM Dave Hansen <dave.hansen@intel.com> wrote:
>>
>> On 8/20/20 1:06 AM, Huang, Ying wrote:
>> >> + /* Migrate pages selected for demotion */
>> >> + nr_reclaimed += demote_page_list(&ret_pages, &demote_pages, pgdat, sc);
>> >> +
>> >> pgactivate = stat->nr_activate[0] + stat->nr_activate[1];
>> >>
>> >> mem_cgroup_uncharge_list(&free_pages);
>> >> _
>> > Generally, it's good to batch the page migration. But one side effect
>> > is that pages that fail to migrate are placed back on the LRU list
>> > instead of falling back to real reclaim. This may cause problems in
>> > some situations. For example, if there is not enough space on the
>> > PMEM (slow) node and the page migration fails, OOM may be triggered,
>> > because direct reclaim on the DRAM (fast) node may make no progress,
>> > whereas before it could actually reclaim some pages.
>>
>> Yes, agreed.
>
> Kind of. But I think that should be transient and very rare. The
> kswapd on pmem nodes will be woken up to drop pages when we try to
> allocate migration target pages. It should be very rare that there are
> no reclaimable pages on pmem nodes.
>
>>
>> There are a couple of ways we could fix this. Instead of splicing
>> 'demote_pages' back into 'ret_pages', we could try to get them back on
>> 'page_list' and go to the beginning of shrink_page_list(). This will
>> probably yield the best behavior, but might be a bit ugly.
>>
>> We could also add a field to 'struct scan_control' and just stop trying
>> to migrate after it has failed one or more times. The trick will be
>> picking a threshold that doesn't mess with either the normal reclaim
>> rate or the migration rate.
>
> In my patchset I implemented a fallback mechanism by adding a new
> PGDAT_CONTENDED node flag. Please check this out:
> https://patchwork.kernel.org/patch/10993839/.
>
> Basically the PGDAT_CONTENDED flag will be set once migrate_pages()
> returns -ENOMEM, which indicates the target pmem node is under memory
> pressure; reclaim would then fall back to the regular reclaim path. The
> flag would be cleared by clear_pgdat_congested() once the memory
> pressure on the pmem node is gone.

There may be some races between setting and clearing the flag. For example:
- try to migrate some pages from the DRAM node to the PMEM node
- not enough free pages on the PMEM node, so kswapd is woken up
- kswapd on the PMEM node reclaims some pages and tries to clear
  PGDAT_CONTENDED on the DRAM node
- PGDAT_CONTENDED is set on the DRAM node
This may be resolvable. But I still prefer to fall back to real page
reclaim directly for the pages that failed to migrate. That looks
more robust.
Best Regards,
Huang, Ying
> We already use node flags to indicate the state of a node in reclaim
> code, e.g. PGDAT_WRITEBACK, PGDAT_DIRTY, etc. So, adding a new flag
> sounds more straightforward to me IMHO.
>
>>
>> This is on my list to fix up next.
>>