Date: Thu, 3 Dec 2020 09:37:39 +0000
From: Mel Gorman
To: "Huang, Ying"
Cc: Peter Zijlstra, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    "Matthew Wilcox (Oracle)", Rafael Aquini, Andrew Morton, Ingo Molnar,
    Rik van Riel, Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
    David Rientjes, linux-api@vger.kernel.org
Subject: Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING
Message-ID: <20201203093739.GB3306@suse.de>
References: <20201202084234.15797-1-ying.huang@intel.com> <20201202084234.15797-3-ying.huang@intel.com> <20201202114357.GW3306@suse.de> <87ft4npskx.fsf@yhuang-dev.intel.com>
In-Reply-To: <87ft4npskx.fsf@yhuang-dev.intel.com>

On Thu, Dec 03, 2020 at 09:49:02AM +0800, Huang, Ying wrote:
> >> diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
> >> index 68011eecb..3754b3e12 100644
> >> --- a/man2/set_mempolicy.2
> >> +++ b/man2/set_mempolicy.2
> >> @@ -113,6 +113,12 @@ A nonempty
> >> .I nodemask
> >> specifies node IDs that are relative to the set of
> >> node IDs allowed by the process's current cpuset.
> >> +.TP
> >> +.BR MPOL_F_NUMA_BALANCING " (since Linux 5.11)"
> >> +Enable the Linux kernel NUMA balancing for the task if it is supported
> >> +by kernel.
> >> +If the flag isn't supported by Linux kernel, return -1 and errno is
> >> +set to EINVAL.
> >> .PP
> >> .I nodemask
> >> points to a bit mask of node IDs that contains up to
> >> @@ -293,6 +299,9 @@ argument specified both
> >
> > Should this be expanded more to clarify it applies to MPOL_BIND
> > specifically?
> >
> > Maybe the first patch should be expanded more and explicitly fail if
> > MPOL_F_NUMA_BALANCING is used with anything other than MPOL_BIND?
>
> For MPOL_PREFERRED, why could we not use NUMA balancing to migrate pages
> to the accessing local node if it is same as the preferred node?

You could, but the kernel patch does not do that by making preferred_nid
stick to the preferred node when hinting faults are trapped on that VMA.
It would have to be a separate patch coupled with a man page update. If
you wanted to go in this direction in the future, then the patch should
explicitly return an error *now* if MPOL_PREFERRED is or'd with
MPOL_F_NUMA_BALANCING so that once an application becomes aware of
MPOL_F_NUMA_BALANCING, it can detect whether support exists in the
currently running kernel.

> Even for MPOL_INTERLEAVE, if the target node is the same as the
> accessing local node, can we use NUMA balancing to migrate pages?
>

The intent of MPOL_INTERLEAVE is to average the cost of memory accesses
so that the cost is roughly similar across the entire range of the VMA.
This may be particularly important if the VMA is shared between multiple
threads that are spread out on multiple nodes. A change in semantics
there should be clearly documented. Similarly, if you want to go in this
direction, MPOL_F_NUMA_BALANCING should be checked against
MPOL_INTERLEAVE and explicitly fail now so support can be detected at
runtime.

> So, I prefer to make MPOL_F_NUMA_BALANCING to be
>
> Optimizing with NUMA balancing if possible, and we may add more
> optimization in the future.
>

Maybe, but I think it's best that the actual behaviour of the kernel is
documented instead of desired behaviour or future planning.

-- 
Mel Gorman
SUSE Labs
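
For illustration only, the runtime detection discussed in this thread might look roughly like the sketch below from userspace. It assumes the flag is rejected with EINVAL on kernels that do not support it, as the proposed man page text states. The names probe_numa_balancing(), classify_probe() and enum probe_result are hypothetical helpers invented for this sketch, and the constants are copied from include/uapi/linux/mempolicy.h in case the installed headers predate the flag. It also assumes node 0 exists and is allowed for the calling task.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values as in include/uapi/linux/mempolicy.h; defined locally so the
 * sketch does not depend on libnuma or on recent kernel headers. */
#ifndef MPOL_DEFAULT
#define MPOL_DEFAULT 0
#endif
#ifndef MPOL_BIND
#define MPOL_BIND 2
#endif
#ifndef MPOL_F_NUMA_BALANCING
#define MPOL_F_NUMA_BALANCING (1 << 13)
#endif

/* Hypothetical result type for this sketch. */
enum probe_result { PROBE_SUPPORTED, PROBE_UNSUPPORTED, PROBE_ERROR };

/* Map the raw syscall result to a probe result. EINVAL is treated as
 * "flag not supported", but note that EINVAL can also be caused by an
 * invalid mode/nodemask combination, so this is only a heuristic. */
static enum probe_result classify_probe(long ret, int err)
{
    assert(ret == 0 || ret == -1);
    if (ret == 0)
        return PROBE_SUPPORTED;
    if (err == EINVAL)
        return PROBE_UNSUPPORTED;
    return PROBE_ERROR; /* e.g. EPERM or ENOSYS in restricted sandboxes */
}

/* Try MPOL_BIND | MPOL_F_NUMA_BALANCING on node 0, then restore the
 * default policy. Assumption: the caller had the default policy before;
 * any previously set task policy is not preserved by this sketch. */
static enum probe_result probe_numa_balancing(void)
{
    unsigned long nodemask = 1UL; /* bit 0 set: node 0 only */
    long ret = syscall(SYS_set_mempolicy,
                       MPOL_BIND | MPOL_F_NUMA_BALANCING,
                       &nodemask, sizeof(nodemask) * 8);
    int err = errno;

    if (ret == 0)
        syscall(SYS_set_mempolicy, MPOL_DEFAULT, NULL, 0);
    return classify_probe(ret, err);
}
```

An application could call probe_numa_balancing() once at startup and fall back to plain MPOL_BIND when the result is PROBE_UNSUPPORTED. If the kernel also rejected the flag for MPOL_PREFERRED and MPOL_INTERLEAVE, as suggested above, the same pattern would let a later kernel add support for those modes without ambiguity.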