From: Ben Widawsky <ben.widawsky@intel.com>
To: linux-mm <linux-mm@kvack.org>, linux-kernel@vger.kernel.org
Cc: Michal Hocko <mhocko@kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	Ben Widawsky <ben.widawsky@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>
Subject: [PATCH 08/12] mm/mempolicy: Create a page allocator for policy
Date: Tue, 30 Jun 2020 14:25:13 -0700
Message-ID: <20200630212517.308045-9-ben.widawsky@intel.com>
In-Reply-To: <20200630212517.308045-1-ben.widawsky@intel.com>

This patch adds a helper function which takes care of handling multiple
preferred nodes. It will be called by later patches that need this
handling, specifically VMA-based and task-based page allocation. Huge
pages don't quite fit the same pattern because they use different
underlying page allocation functions. This consumes the previous
interleave-policy-specific allocation function to provide a single
entry point for policy-based allocation.
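
As a rough illustration (simplified; this reflects how later patches in
the series are expected to call the helper, not code in this diff):

	if (pol->mode == MPOL_INTERLEAVE)
		/* interleave resolves its target node up front */
		page = alloc_pages_policy(pol, gfp, order,
					  interleave_nodes(pol));
	else
		/* other policies resolve the node inside the helper */
		page = alloc_pages_policy(pol, gfp, order, NUMA_NO_NODE);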

For now, only the interleave policy uses the new helper, so there
should be no functional change yet. However, if bisection points to
issues in the next few commits, this patch is the likely culprit.

Similar functionality is offered via policy_node() and
policy_nodemask(); by themselves, however, neither can achieve this
fallback across a set of preferred nodes.
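
For MPOL_PREFERRED_MANY the new helper builds that fallback on top of
them with a two-pass allocation; a sketch of the logic, matching the
hunk below:

	/* pass 1: stay within the preferred nodes, but allow failure */
	page = __alloc_pages_nodemask(gfp | __GFP_RETRY_MAYFAIL, order,
				      policy_node(gfp, pol, numa_node_id()),
				      policy_nodemask(gfp, pol));
	/* pass 2: fall back to any node, preferring the local one */
	if (!page)
		page = __alloc_pages_nodemask(gfp, order, numa_node_id(),
					      NULL);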

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 mm/mempolicy.c | 60 +++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 47 insertions(+), 13 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3b38c9c4e580..1009cf90ad37 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2199,22 +2199,56 @@ bool mempolicy_nodemask_intersects(struct task_struct *tsk,
 	return ret;
 }
 
-/* Allocate a page in interleaved policy.
-   Own path because it needs to do special accounting. */
-static struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
-					unsigned nid)
+/* Handle page allocation for all policies, including interleave */
+static struct page *alloc_pages_policy(struct mempolicy *pol, gfp_t gfp,
+				       unsigned int order, int preferred_nid)
 {
 	struct page *page;
+	gfp_t gfp_mask = gfp;
 
-	page = __alloc_pages(gfp, order, nid);
-	/* skip NUMA_INTERLEAVE_HIT counter update if numa stats is disabled */
-	if (!static_branch_likely(&vm_numa_stat_key))
+	if (pol->mode == MPOL_INTERLEAVE) {
+		page = __alloc_pages(gfp, order, preferred_nid);
+		/* skip NUMA_INTERLEAVE_HIT counter update if numa stats is disabled */
+		if (!static_branch_likely(&vm_numa_stat_key))
+			return page;
+		if (page && page_to_nid(page) == preferred_nid) {
+			preempt_disable();
+			__inc_numa_state(page_zone(page), NUMA_INTERLEAVE_HIT);
+			preempt_enable();
+		}
 		return page;
-	if (page && page_to_nid(page) == nid) {
-		preempt_disable();
-		__inc_numa_state(page_zone(page), NUMA_INTERLEAVE_HIT);
-		preempt_enable();
 	}
+
+	VM_BUG_ON(preferred_nid != NUMA_NO_NODE);
+
+	preferred_nid = numa_node_id();
+
+	/*
+	 * There is a two pass approach implemented here for
+	 * MPOL_PREFERRED_MANY. In the first pass we pretend the preferred nodes
+	 * are bound, but allow the allocation to fail. The below table explains
+	 * how this is achieved.
+	 *
+	 * | Policy                        | preferred nid | nodemask   |
+	 * |-------------------------------|---------------|------------|
+	 * | MPOL_DEFAULT                  | local         | NULL       |
+	 * | MPOL_PREFERRED                | best          | NULL       |
+	 * | MPOL_INTERLEAVE               | ERR           | ERR        |
+	 * | MPOL_BIND                     | local         | pol->nodes |
+	 * | MPOL_PREFERRED_MANY           | best          | pol->nodes |
+	 * | MPOL_PREFERRED_MANY (round 2) | local         | NULL       |
+	 * +-------------------------------+---------------+------------+
+	 */
+	if (pol->mode == MPOL_PREFERRED_MANY)
+		gfp_mask |= __GFP_RETRY_MAYFAIL;
+
+	page = __alloc_pages_nodemask(gfp_mask, order,
+				      policy_node(gfp, pol, preferred_nid),
+				      policy_nodemask(gfp, pol));
+
+	if (unlikely(!page && pol->mode == MPOL_PREFERRED_MANY))
+		page = __alloc_pages_nodemask(gfp, order, preferred_nid, NULL);
+
 	return page;
 }
 
@@ -2256,8 +2290,8 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
 		unsigned nid;
 
 		nid = interleave_nid(pol, vma, addr, PAGE_SHIFT + order);
+		page = alloc_pages_policy(pol, gfp, order, nid);
 		mpol_cond_put(pol);
-		page = alloc_page_interleave(gfp, order, nid);
 		goto out;
 	}
 
@@ -2341,7 +2375,7 @@ struct page *alloc_pages_current(gfp_t gfp, unsigned order)
 	 * nor system default_policy
 	 */
 	if (pol->mode == MPOL_INTERLEAVE)
-		page = alloc_page_interleave(gfp, order, interleave_nodes(pol));
+		page = alloc_pages_policy(pol, gfp, order, interleave_nodes(pol));
 	else
 		page = __alloc_pages_nodemask(gfp, order,
 				policy_node(gfp, pol, numa_node_id()),
-- 
2.27.0



Thread overview: 15+ messages
2020-06-30 21:25 [PATCH v2 00/12] Introduced multi-preference mempolicy Ben Widawsky
2020-06-30 21:25 ` [PATCH 01/12] mm/mempolicy: Add comment for missing LOCAL Ben Widawsky
2020-06-30 21:25 ` [PATCH 02/12] mm/mempolicy: convert single preferred_node to full nodemask Ben Widawsky
2020-06-30 21:25 ` [PATCH 03/12] mm/mempolicy: Add MPOL_PREFERRED_MANY for multiple preferred nodes Ben Widawsky
2020-06-30 21:25 ` [PATCH 04/12] mm/mempolicy: allow preferred code to take a nodemask Ben Widawsky
2020-07-02  9:15   ` [mm/mempolicy] 9586f666c8: Kernel_panic-not_syncing:stack-protector:Kernel_stack_is_corrupted_in:mpol_new_preferred kernel test robot
2020-06-30 21:25 ` [PATCH 05/12] mm/mempolicy: refactor rebind code for PREFERRED_MANY Ben Widawsky
2020-06-30 21:25 ` [PATCH 06/12] mm/mempolicy: kill v.preferred_nodes Ben Widawsky
2020-06-30 21:25 ` [PATCH 07/12] mm/mempolicy: handle MPOL_PREFERRED_MANY like BIND Ben Widawsky
2020-06-30 21:25 ` Ben Widawsky [this message]
2020-06-30 21:25 ` [PATCH 09/12] mm/mempolicy: Thread allocation for many preferred Ben Widawsky
2020-06-30 21:25 ` [PATCH 10/12] mm/mempolicy: VMA " Ben Widawsky
2020-06-30 21:25 ` [PATCH 11/12] mm/mempolicy: huge-page " Ben Widawsky
2020-06-30 21:25 ` [PATCH 12/12] mm/mempolicy: Advertise new MPOL_PREFERRED_MANY Ben Widawsky
2020-10-30 19:02 [PATCH v2 RESEND 00/12] Introduced multi-preference mempolicy Ben Widawsky
2020-10-30 19:02 ` [PATCH 08/12] mm/mempolicy: Create a page allocator for policy Ben Widawsky
