From: Huang Ying
To: Peter Zijlstra
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying,
    Mel Gorman, Andrew Morton, Ingo Molnar, Rik van Riel,
    Johannes Weiner, "Matthew Wilcox (Oracle)", Dave Hansen,
    Andi Kleen, Michal Hocko, David Rientjes
Subject: [PATCH -V3 2/2] autonuma: Migrate on fault among multiple bound nodes
Date: Tue, 10 Nov 2020 13:59:51 +0800
Message-Id: <20201110055951.85085-3-ying.huang@intel.com>
In-Reply-To: <20201110055951.85085-1-ying.huang@intel.com>
References: <20201110055951.85085-1-ying.huang@intel.com>

Now, AutoNUMA can only optimize the page placement among the NUMA nodes
if the default memory policy is used.
This is because an explicitly specified memory policy should take
precedence. But this seems too strict in some situations. For example,
on a system with 4 NUMA nodes, if the memory of an application is bound
to nodes 0 and 1, AutoNUMA can potentially migrate pages between nodes 0
and 1 to reduce cross-node accesses without breaking the explicit memory
binding policy.

So in this patch, if mbind(.mode=MPOL_BIND, .flags=MPOL_MF_LAZY) is used
to bind the memory of the application to multiple nodes, and in the hint
page fault handler both the faulting page node and the accessing node
are in the policy nodemask, the kernel tries to migrate the page to the
accessing node to reduce cross-node accesses.

[Peter Zijlstra: provided the simplified implementation method.]

Signed-off-by: "Huang, Ying"
Acked-by: Mel Gorman
Cc: Andrew Morton
Cc: Ingo Molnar
Cc: Rik van Riel
Cc: Johannes Weiner
Cc: "Matthew Wilcox (Oracle)"
Cc: Dave Hansen
Cc: Andi Kleen
Cc: Michal Hocko
Cc: David Rientjes
---
 mm/mempolicy.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 63d91fbd3ce6..40f2ff2607b3 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2490,15 +2490,19 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
 		break;
 
 	case MPOL_BIND:
-
 		/*
-		 * allows binding to multiple nodes.
-		 * use current page if in policy nodemask,
-		 * else select nearest allowed node, if any.
-		 * If no allowed nodes, use current [!misplaced].
+		 * Allows binding to multiple nodes.  If both current and
+		 * accessing nodes are in policy nodemask, migrate to
+		 * accessing node to optimize page placement. Otherwise,
+		 * use current page if in policy nodemask, else select
+		 * nearest allowed node, if any.  If no allowed nodes, use
+		 * current [!misplaced].
 		 */
-		if (node_isset(curnid, pol->v.nodes))
+		if (node_isset(curnid, pol->v.nodes)) {
+			if (node_isset(thisnid, pol->v.nodes))
+				goto mopron;
 			goto out;
+		}
 		z = first_zones_zonelist(
 				node_zonelist(numa_node_id(), GFP_HIGHUSER),
 				gfp_zone(GFP_HIGHUSER),
@@ -2512,6 +2516,7 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
 
 	/* Migrate the page towards the node whose CPU is referencing it */
 	if (pol->flags & MPOL_F_MOPRON) {
+mopron:
 		polnid = thisnid;
 
 		if (!should_numa_migrate_memory(current, page, curnid, thiscpu))
-- 
2.28.0
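
For readers following the MPOL_BIND hunk, the branch order it introduces
can be sketched as a small userspace C model. This is an illustration
under stated assumptions, not kernel code: nodemask_model_t, enum
bind_path, node_in_mask() and bind_path_for() are hypothetical names,
the nodemask is reduced to a 64-bit bitmap, and the later
should_numa_migrate_memory() check on the mopron path is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's nodemask_t: one bit per node. */
typedef uint64_t nodemask_model_t;

/* Which path the MPOL_BIND case takes after the patch (names are illustrative). */
enum bind_path {
	BIND_OUT,	/* "goto out": page is treated as not misplaced */
	BIND_MOPRON,	/* "goto mopron": consider moving to the accessing node */
	BIND_NEAREST	/* fall through: look up the nearest allowed node */
};

/* Model of node_isset() for node ids 0..63. */
static bool node_in_mask(int nid, nodemask_model_t mask)
{
	assert(nid >= 0 && nid < 64);
	return (mask >> nid) & 1u;
}

/*
 * curnid:  node that currently holds the page
 * thisnid: node of the CPU that took the hint page fault
 */
static enum bind_path bind_path_for(nodemask_model_t mask, int curnid, int thisnid)
{
	if (node_in_mask(curnid, mask)) {
		/* Both nodes are allowed by the policy: take the mopron path. */
		if (node_in_mask(thisnid, mask))
			return BIND_MOPRON;
		/* Page is on an allowed node, accessor is not: leave it alone. */
		return BIND_OUT;
	}
	/* Page sits outside the policy nodemask. */
	return BIND_NEAREST;
}
```

With the 4-node example from the changelog (mask 0x3 for nodes 0 and 1),
bind_path_for(0x3, 0, 1) takes the mopron path, bind_path_for(0x3, 0, 2)
leaves the page in place, and bind_path_for(0x3, 2, 0) falls through to
the nearest-allowed-node lookup.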