From: "Huang, Ying" <ying.huang@intel.com>
To: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
<kbusch@kernel.org>, <yang.shi@linux.alibaba.com>,
<dan.j.williams@intel.com>
Subject: Re: [RFC][PATCH 3/8] mm/vmscan: Attempt to migrate page in lieu of discard
Date: Thu, 02 Jul 2020 13:02:03 +0800
Message-ID: <87mu4ijyr8.fsf@yhuang-dev.intel.com>
In-Reply-To: <alpine.DEB.2.23.453.2007011203500.1908531@chino.kir.corp.google.com> (David Rientjes's message of "Wed, 1 Jul 2020 12:25:08 -0700")
David Rientjes <rientjes@google.com> writes:
> On Wed, 1 Jul 2020, Dave Hansen wrote:
>
>> > Could this cause us to break a user's mbind() or allow a user to
>> > circumvent their cpuset.mems?
>>
>> In its current form, yes.
>>
>> My current rationale for this is that while it's not as deferential as
>> it can be to the user/kernel ABI contract, it's good *overall* behavior.
>> The auto-migration only kicks in when the data is about to go away. So
>> while the user's data might be slower than they like, it is *WAY* faster
>> than they deserve because it should be off on the disk.
>>
>
> It's outside the scope of this patchset, but eventually there will be a
> promotion path that I think requires a strict 1:1 relationship between
> DRAM and PMEM nodes because otherwise mbind(), set_mempolicy(), and
> cpuset.mems become ineffective for nodes facing memory pressure.
I have posted a patchset for AutoNUMA-based promotion support,

https://lore.kernel.org/lkml/20200218082634.1596727-1-ying.huang@intel.com/

There, a page is promoted upon a NUMA hint page fault, so the process
context is available and all memory policies (mbind(), set_mempolicy(),
and cpuset.mems) can be honored.  We can refuse to promote a page to
any DRAM node that is not allowed by the applicable memory policy.  So
a 1:1 relationship isn't necessary for promotion.
> For the purposes of this patchset, agreed that DRAM -> PMEM -> swap makes
> perfect sense. Theoretically, I think you could have DRAM N0 and N1 and
> then a single PMEM N2 and this N2 can be the terminal node for both N0 and
> N1. On promotion, I think we need to rely on something stronger than
> autonuma to decide which DRAM node to promote to: specifically any user
> policy put into effect (memory tiering or autonuma shouldn't be allowed to
> subvert these user policies).
>
> As others have mentioned, we lose the allocation or process context at the
> time of demotion or promotion
As above, we do have the process context at the time of promotion.
> and any workaround for that requires some
> hacks, such as mapping the page to cpuset (what is the right solution for
> shared pages?) or adding NUMA locality handling to memcg.
It sounds natural to me to add a NUMA node restriction to memcg.
Best Regards,
Huang, Ying