linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: Ben Widawsky <ben.widawsky@intel.com>
Cc: linux-mm <linux-mm@kvack.org>, Andi Kleen <ak@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	David Hildenbrand <david@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Kuppuswamy Sathyanarayanan
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	Lee Schermerhorn <lee.schermerhorn@hp.com>,
	Li Xinhai <lixinhai.lxh@gmail.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Mina Almasry <almasrymina@google.com>, Tejun Heo <tj@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-api@vger.kernel.org
Subject: Re: [PATCH 00/18] multiple preferred nodes
Date: Wed, 24 Jun 2020 22:42:32 +0200
Message-ID: <20200624204232.GZ1320@dhcp22.suse.cz>
In-Reply-To: <20200624202344.woogq4n3bqkuejty@intel.com>

On Wed 24-06-20 13:23:44, Ben Widawsky wrote:
> On 20-06-24 22:07:50, Michal Hocko wrote:
> > On Wed 24-06-20 13:01:40, Ben Widawsky wrote:
> > > On 20-06-24 21:51:58, Michal Hocko wrote:
> > > > On Wed 24-06-20 12:37:33, Ben Widawsky wrote:
> > > > > On 20-06-24 20:39:17, Michal Hocko wrote:
> > > > > > On Wed 24-06-20 09:16:43, Ben Widawsky wrote:
> > [...]
> > > > > > > > Or do I miss something that really requires more involved approach like
> > > > > > > > building custom zonelists and other larger changes to the allocator?
> > > > > > > 
> > > > > > > I think I'm missing how this allows selecting from multiple preferred nodes. In
> > > > > > > this case when you try to get the page from the freelist, you'll get the
> > > > > > > zonelist of the preferred node, and when you actually scan through on page
> > > > > > > allocation, you have no way to filter out the non-preferred nodes. I think the
> > > > > > > plumbing of multiple nodes has to go all the way through
> > > > > > > __alloc_pages_nodemask(). But it's possible I've missed the point.
> > > > > > 
> > > > > > policy_nodemask() will provide the nodemask which will be used as a
> > > > > > filter on the policy_node.
> > > > > 
> > > > > Ah, gotcha. Enabling independent masks seemed useful. Some bad decisions got me
> > > > > to that point. UAPI cannot get independent masks, and callers of these functions
> > > > > don't yet use them.
> > > > > 
> > > > > So let me ask before I actually type it up and find it's much much simpler, is
> > > > > there not some perceived benefit to having both masks being independent?
> > > > 
> > > > I am not sure I follow. Which two masks do you have in mind? zonelist
> > > > and user provided nodemask?
> > > 
> > > Internally, a nodemask_t for preferred node, and a nodemask_t for bound nodes.
> > 
> > Each mask is a local to its policy object.
> 
> I mean for __alloc_pages_nodemask as an internal API. That is irrespective of
> policy. Policy decisions are all made beforehand. The question from a few mails
> ago was whether there is any use in keeping that change to
> __alloc_pages_nodemask accepting two nodemasks.

It is probably too late for me because I am still not following what you
mean. Maybe it would be better to provide pseudo code for what you have in
mind. Anyway, all that I am saying is that for the functionality that you
propose, and _if_ the fallback strategy is fixed, then all you should need
is to use the preferred nodemask for __alloc_pages_nodemask and a
fallback allocation to the full (NULL) nodemask. So you first try what
the userspace prefers - __GFP_RETRY_MAYFAIL will give you "try hard but
do not OOM if the memory is depleted" semantics - and the fallback
allocation goes all the way to OOM on complete memory depletion.
So I do not see much point in a custom zonelist for the policy. Maybe as
a micro-optimization to save some branches here and there.
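
To make the two-pass idea concrete, here is a rough userspace model of it.
This is only a sketch under stated assumptions: nodemask_model_t,
free_pages[], model_alloc_page() and model_alloc_preferred_many() are
hypothetical names invented for illustration, not kernel interfaces, and
the GFP flag behaviour is described in a comment rather than modelled.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4

/* Hypothetical stand-in for nodemask_t: bit n set means node n is allowed. */
typedef unsigned int nodemask_model_t;

/* Free pages per node in this toy model. */
static unsigned long free_pages[MAX_NODES];

/*
 * Models an allocation filtered by a nodemask. A NULL mask means
 * "any node", mirroring the NULL nodemask convention.
 * Returns the node id the page came from, or -1 on failure.
 */
static int model_alloc_page(const nodemask_model_t *mask)
{
	for (int nid = 0; nid < MAX_NODES; nid++) {
		if (mask && !(*mask & (1u << nid)))
			continue;
		if (free_pages[nid] > 0) {
			free_pages[nid]--;
			return nid;
		}
	}
	return -1;
}

/*
 * Two-pass strategy: first honour the preferred mask, then fall back
 * to all nodes. In the kernel the first pass would be the one carrying
 * __GFP_RETRY_MAYFAIL and the second the one that may reach OOM; this
 * model only captures the ordering of the two attempts.
 */
static int model_alloc_preferred_many(nodemask_model_t preferred)
{
	int nid;

	/* The mask must not name nodes outside the model. */
	assert(preferred < (1u << MAX_NODES));

	nid = model_alloc_page(&preferred);
	if (nid >= 0)
		return nid;
	return model_alloc_page(NULL);
}
```

With nodes 1 and 2 preferred, the model hands out pages from those nodes
while they have any, and only then takes pages from the remaining nodes.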

If you envision use cases which might want to control the fallback
allocation strategy, then this would get more complex because you
would need a sorted list of zones to try. But this would really require
a solid use case, and it should build on top of a trivial
implementation, which really is BIND with the fallback.
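
For comparison, a controllable fallback could look roughly like the
following userspace model. It is a sketch only: node_free[] and
alloc_in_order() are hypothetical names, and it orders whole nodes
rather than zones to keep the example short.

```c
/* Userspace model only: hypothetical names, not kernel APIs. */
#include <assert.h>
#include <stddef.h>

#define NR_NODES 4

/* Free pages per node in this toy model. */
static unsigned long node_free[NR_NODES];

/*
 * Walks a caller-supplied, already sorted list of node ids and takes a
 * page from the first node that has one. Returns that node id or -1.
 */
static int alloc_in_order(const int *order, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		int nid = order[i];

		assert(nid >= 0 && nid < NR_NODES);
		if (node_free[nid] > 0) {
			node_free[nid]--;
			return nid;
		}
	}
	return -1;
}
```

The extra cost compared to the two-pass variant is that somebody has to
build and maintain the ordered list.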

-- 
Michal Hocko
SUSE Labs


Thread overview: 45+ messages
2020-06-19 16:24 [PATCH 00/18] multiple preferred nodes Ben Widawsky
2020-06-19 16:24 ` [PATCH 01/18] mm/mempolicy: Add comment for missing LOCAL Ben Widawsky
2020-06-24  7:55   ` Michal Hocko
2020-06-19 16:24 ` [PATCH 02/18] mm/mempolicy: Use node_mem_id() instead of node_id() Ben Widawsky
2020-06-24  8:25   ` Michal Hocko
2020-06-24 16:48     ` Ben Widawsky
2020-06-26 12:30       ` Michal Hocko
2020-06-19 16:24 ` [PATCH 03/18] mm/page_alloc: start plumbing multi preferred node Ben Widawsky
2020-06-19 16:24 ` [PATCH 04/18] mm/page_alloc: add preferred pass to page allocation Ben Widawsky
2020-06-19 16:24 ` [PATCH 05/18] mm/mempolicy: convert single preferred_node to full nodemask Ben Widawsky
2020-06-19 16:24 ` [PATCH 06/18] mm/mempolicy: Add MPOL_PREFERRED_MANY for multiple preferred nodes Ben Widawsky
2020-06-19 16:24 ` [PATCH 07/18] mm/mempolicy: allow preferred code to take a nodemask Ben Widawsky
2020-06-19 16:24 ` [PATCH 08/18] mm/mempolicy: refactor rebind code for PREFERRED_MANY Ben Widawsky
2020-06-19 16:24 ` [PATCH 09/18] mm: Finish handling MPOL_PREFERRED_MANY Ben Widawsky
2020-06-19 16:24 ` [PATCH 10/18] mm: clean up alloc_pages_vma (thp) Ben Widawsky
2020-06-19 16:24 ` [PATCH 11/18] mm: Extract THP hugepage allocation Ben Widawsky
2020-06-19 16:24 ` [PATCH 12/18] mm/mempolicy: Use __alloc_page_node for interleaved Ben Widawsky
2020-06-19 16:24 ` [PATCH 13/18] mm: kill __alloc_pages Ben Widawsky
2020-06-19 16:24 ` [PATCH 14/18] mm/mempolicy: Introduce policy_preferred_nodes() Ben Widawsky
2020-06-19 16:24 ` [PATCH 15/18] mm: convert callers of __alloc_pages_nodemask to pmask Ben Widawsky
2020-06-19 16:24 ` [PATCH 16/18] alloc_pages_nodemask: turn preferred nid into a nodemask Ben Widawsky
2020-06-19 16:24 ` [PATCH 17/18] mm: Use less stack for page allocations Ben Widawsky
2020-06-19 16:24 ` [PATCH 18/18] mm/mempolicy: Advertise new MPOL_PREFERRED_MANY Ben Widawsky
2020-06-22  7:09 ` [PATCH 00/18] multiple preferred nodes Michal Hocko
2020-06-23 11:20   ` Michal Hocko
2020-06-23 16:12     ` Ben Widawsky
2020-06-24  7:52       ` Michal Hocko
2020-06-24 16:16         ` Ben Widawsky
2020-06-24 18:39           ` Michal Hocko
2020-06-24 19:37             ` Ben Widawsky
2020-06-24 19:51               ` Michal Hocko
2020-06-24 20:01                 ` Ben Widawsky
2020-06-24 20:07                   ` Michal Hocko
2020-06-24 20:23                     ` Ben Widawsky
2020-06-24 20:42                       ` Michal Hocko [this message]
2020-06-24 20:55                         ` Ben Widawsky
2020-06-25  6:28                           ` Michal Hocko
2020-06-26 21:39         ` Ben Widawsky
2020-06-29 10:16           ` Michal Hocko
2020-06-22 20:54 ` Andi Kleen
2020-06-22 21:02   ` Ben Widawsky
2020-06-22 21:07   ` Dave Hansen
2020-06-22 22:02     ` Andi Kleen
  -- strict thread matches above, loose matches on Subject: below --
2020-06-19 16:23 Ben Widawsky
2020-06-19 16:25 ` Ben Widawsky
