From: "Huang, Ying"
To: David Rientjes
Cc: Dave Hansen
Subject: Re: [RFC][PATCH 3/8] mm/vmscan: Attempt to migrate page in lieu of discard
References: <20200629234503.749E5340@viggo.jf.intel.com> <20200629234509.8F89C4EF@viggo.jf.intel.com>
Date: Thu, 02 Jul 2020 13:02:03 +0800
In-Reply-To: (David Rientjes's message of "Wed, 1 Jul 2020 12:25:08 -0700")
Message-ID: <87mu4ijyr8.fsf@yhuang-dev.intel.com>

David Rientjes writes:

> On Wed, 1 Jul 2020, Dave Hansen wrote:
>
>> > Could this cause us to break a user's mbind() or allow a user to
>> > circumvent their cpuset.mems?
>>
>> In its current form, yes.
>>
>> My current rationale for this is that while it's not as deferential as
>> it can be to the user/kernel ABI contract, it's good *overall* behavior.
>> The auto-migration only kicks in when the data is about to go away. So
>> while the user's data might be slower than they like, it is *WAY* faster
>> than they deserve because it should be off on the disk.
>>
>
> It's outside the scope of this patchset, but eventually there will be a
> promotion path that I think requires a strict 1:1 relationship between
> DRAM and PMEM nodes because otherwise mbind(), set_mempolicy(), and
> cpuset.mems become ineffective for nodes facing memory pressure.

I have posted a patchset for AutoNUMA-based promotion support,

https://lore.kernel.org/lkml/20200218082634.1596727-1-ying.huang@intel.com/

where the page is promoted upon a NUMA hint page fault. So all memory
policies (mbind(), set_mempolicy(), and cpuset.mems) are available. We
can refuse to promote the page to DRAM nodes that are not allowed by
any memory policy. So a 1:1 relationship isn't necessary for promotion.

> For the purposes of this patchset, agreed that DRAM -> PMEM -> swap makes
> perfect sense. Theoretically, I think you could have DRAM N0 and N1 and
> then a single PMEM N2 and this N2 can be the terminal node for both N0 and
> N1. On promotion, I think we need to rely on something stronger than
> autonuma to decide which DRAM node to promote to: specifically any user
> policy put into effect (memory tiering or autonuma shouldn't be allowed to
> subvert these user policies).
>
> As others have mentioned, we lose the allocation or process context at the
> time of demotion or promotion

As above, we have the process context at the time of promotion.

> and any workaround for that requires some
> hacks, such as mapping the page to cpuset (what is the right solution for
> shared pages?) or adding NUMA locality handling to memcg.

It sounds natural to me to add a NUMA node restriction to memcg.

Best Regards,
Huang, Ying
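
The promotion check discussed in this message (refusing to promote a page to a DRAM node that no memory policy allows) can be sketched as a small user-space model. This is only an illustration under stated assumptions, not kernel code and not the code from the posted patchset: the function names, the use of plain sets for node masks, and the rule that a per-VMA policy takes precedence over the task policy are all simplifications chosen here.

```python
from typing import Optional, Set


def allowed_nodes(vma_nodes: Optional[Set[int]],
                  task_nodes: Optional[Set[int]],
                  cpuset_mems: Set[int]) -> Set[int]:
    """Return the nodes the faulting task may place this page on.

    Simplified model (an assumption, not kernel logic): a per-VMA policy
    (as set by mbind()) takes precedence over the task policy (as set by
    set_mempolicy()), and the result is always limited to cpuset.mems.
    """
    policy = vma_nodes if vma_nodes is not None else task_nodes
    if policy is None:
        return set(cpuset_mems)
    return policy & cpuset_mems


def should_promote(page_node: int, fault_node: int,
                   dram_nodes: Set[int], allowed: Set[int]) -> bool:
    """Decide whether a hint fault taken on fault_node may promote the page."""
    if page_node in dram_nodes:
        return False  # already in DRAM, nothing to promote
    if fault_node not in dram_nodes:
        return False  # only DRAM nodes are promotion targets in this model
    return fault_node in allowed  # refuse nodes the policy does not allow


# DRAM nodes 0 and 1 share a single PMEM node 2; the task is bound to node 0.
allowed = allowed_nodes(vma_nodes={0}, task_nodes=None, cpuset_mems={0, 1, 2})
print(should_promote(page_node=2, fault_node=0, dram_nodes={0, 1}, allowed=allowed))
print(should_promote(page_node=2, fault_node=1, dram_nodes={0, 1}, allowed=allowed))
```

In this model the page on PMEM node 2 may be promoted to DRAM node 0 but not to DRAM node 1, even though both DRAM nodes demote to the same PMEM node. The decision depends only on the policy of the task that takes the hint fault, which is why a 1:1 DRAM-to-PMEM pairing is not needed here.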