Date: Fri, 28 May 2021 12:39:54 +0800
From: Feng Tang <feng.tang@intel.com>
To: Michal Hocko
Cc: linux-mm@kvack.org, Andrew Morton, David Rientjes, Dave Hansen,
	Ben Widawsky, linux-kernel@vger.kernel.org, Andrea Arcangeli,
	Mel Gorman, Mike Kravetz, Randy Dunlap, Vlastimil Babka,
	Andi Kleen, Dan Williams, ying.huang@intel.com
Subject: Re: [PATCH v1 4/4] mm/mempolicy: kill MPOL_F_LOCAL bit
Message-ID: <20210528043954.GA32292@shbuild999.sh.intel.com>
References: <1622005302-23027-1-git-send-email-feng.tang@intel.com>
	<1622005302-23027-5-git-send-email-feng.tang@intel.com>
	<20210527121041.GA7743@shbuild999.sh.intel.com>
	<20210527133436.GD7743@shbuild999.sh.intel.com>
On Thu, May 27, 2021 at 05:34:56PM +0200, Michal Hocko wrote:
> On Thu 27-05-21 21:34:36, Feng Tang wrote:
> > On Thu, May 27, 2021 at 02:26:24PM +0200, Michal Hocko wrote:
> > > On Thu 27-05-21 20:10:41, Feng Tang wrote:
> > > > On Thu, May 27, 2021 at 10:20:08AM +0200, Michal Hocko wrote:
> > > > > On Wed 26-05-21 13:01:42, Feng Tang wrote:
> > > > > > Now the only remaining case of a real 'local' policy faked by a
> > > > > > 'prefer' policy plus the MPOL_F_LOCAL bit is:
> > > > > >
> > > > > > A valid 'prefer' policy with a valid 'preferred' node is rebound
> > > > > > to a nodemask which doesn't contain the 'preferred' node; it will
> > > > > > then handle allocation with the 'local' policy.
> > > > > >
> > > > > > Add a new 'MPOL_F_LOCAL_TEMP' bit for this case, and kill the
> > > > > > MPOL_F_LOCAL bit, which simplifies the code considerably.
> > > > >
> > > > > As I've pointed out in the reply to the previous patch, it would have
> > > > > been much better if most of the MPOL_F_LOCAL usage were gone by this
> > > > > patch.
> > > > >
> > > > > I also dislike a new MPOL_F_LOCAL_TEMP. This smells like sneaking the
> > > > > hack back in after you have painstakingly removed it, so this looks
> > > > > like a step backwards to me. I also do not understand why we need the
> > > > > rebind callback for the local policy at all. There is no nodemask for
> > > > > local, so what is going on here?
> > > >
> > > > This is the special case 4 for the 'prefer' policy with the
> > > > MPOL_F_STATIC_NODES flag set. Say it prefers node 1; when it is later
> > > > rebound to a new nodemask of nodes 2-3, the current code will set the
> > > > MPOL_F_LOCAL bit and actually perform the 'local' policy. And if it
> > > > is rebound again with a nodemask of nodes 1-2, it will be restored
> > > > back to the 'prefer' policy with preferred node 1.
> > >
> > > Honestly I still do not follow the actual problem.
> >
> > I was confused too, and don't know the original thought behind it. This
> > case 4 was just imagined by reading the code.
> >
> > > A preferred node is a _hint_. If you rebind the task to a different
> > > cpuset then why should we actually care? The allocator will fall back
> > > to the closest node according to the distance metric. Maybe the
> > > original code was trying to handle that in some way, but I really do
> > > fail to understand that code and I strongly suspect it is more likely
> > > overengineered rather than backed by a real usecase. I might be wrong
> > > here, but then this is an excellent opportunity to clarify all those
> > > subtleties.
> >
> > From the code, the original special handling may be needed in 3 cases:
> >     get_policy_nodemask()
> >     policy_node()
> >     mempolicy_slab_node()
> > to not return the preset preferred node.
>
> I am sorry but I do not follow. What is actually wrong if the preferred
> node is outside of the cpuset nodemask?

Sorry, I didn't make it clear. With the current code logic, it will perform
as the 'local' policy, but its mode is kept as 'prefer', so the code still
has this tricky bit checking when these APIs are called for this policy.

I agree with you that this ping-pong rebind() may be over-engineering, so
for this case can we just change the policy from 'prefer' to 'local' and
drop the tricky bit manipulation? As 'prefer' is just a hint, if a rebind
misses the target node, there is no need to stick with the 'prefer' policy.

Thanks,
Feng