linux-mm.kvack.org archive mirror
* [PATCH v2 0/2] fix numa spreading for large hash tables
@ 2021-10-18 12:37 Chen Wandun
  2021-10-18 12:37 ` [PATCH v2 1/2] mm/vmalloc: " Chen Wandun
  2021-10-18 12:37 ` [PATCH v2 2/2] mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation Chen Wandun
  0 siblings, 2 replies; 9+ messages in thread
From: Chen Wandun @ 2021-10-18 12:37 UTC (permalink / raw)
  To: akpm, npiggin, linux-mm, linux-kernel, edumazet, wangkefeng.wang,
	guohanjun, shakeelb, urezki
  Cc: chenwandun

[PATCH v2 1/2] fix numa spreading problem
[PATCH v2 2/2] performance optimization

v1 ==> v2:
1. do a minimal fix in [PATCH v2 1/2]
2. add some comments

Chen Wandun (2):
  mm/vmalloc: fix numa spreading for large hash tables
  mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate
    memory allocation

 include/linux/gfp.h |  4 +++
 mm/mempolicy.c      | 82 +++++++++++++++++++++++++++++++++++++++++++++
 mm/vmalloc.c        | 28 ++++++++++++----
 3 files changed, 108 insertions(+), 6 deletions(-)

-- 
2.25.1




* [PATCH v2 1/2] mm/vmalloc: fix numa spreading for large hash tables
  2021-10-18 12:37 [PATCH v2 0/2] fix numa spreading for large hash tables Chen Wandun
@ 2021-10-18 12:37 ` Chen Wandun
  2021-10-18 12:39   ` Matthew Wilcox
                     ` (2 more replies)
  2021-10-18 12:37 ` [PATCH v2 2/2] mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation Chen Wandun
  1 sibling, 3 replies; 9+ messages in thread
From: Chen Wandun @ 2021-10-18 12:37 UTC (permalink / raw)
  To: akpm, npiggin, linux-mm, linux-kernel, edumazet, wangkefeng.wang,
	guohanjun, shakeelb, urezki
  Cc: chenwandun

Eric Dumazet reported strange NUMA spreading behaviour in [1], and found
that commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
introduced this issue [2].

Digging into the difference before and after this patch, the page
allocation paths differ:

before:
alloc_large_system_hash
    __vmalloc
        __vmalloc_node(..., NUMA_NO_NODE, ...)
            __vmalloc_node_range
                __vmalloc_area_node
                    alloc_page /* NUMA_NO_NODE, so the alloc_page branch is chosen */
                        alloc_pages_current
                            alloc_page_interleave /* can be verified by printing the policy mode */

after:
alloc_large_system_hash
    __vmalloc
        __vmalloc_node(..., NUMA_NO_NODE, ...)
            __vmalloc_node_range
                __vmalloc_area_node
                    alloc_pages_node /* choose nid by numa_mem_id() */
                        __alloc_pages_node(nid, ....)

So after commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings"),
memory is allocated on the current node instead of being interleaved
across nodes.
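
The difference boils down to the two calls below; this is only a
simplified sketch of the call chains above, not a copy of the kernel
code:

/* before 121e6f3258fe: NUMA_NO_NODE ends up in alloc_page(), which
 * honours the task mempolicy (e.g. MPOL_INTERLEAVE), so the hash
 * table pages are spread across nodes
 */
page = alloc_page(gfp_mask);

/* after 121e6f3258fe: the nid is resolved via numa_mem_id(), so every
 * page comes from the local node and the mempolicy is bypassed
 */
page = alloc_pages_node(nid, gfp_mask, order);	/* nid == NUMA_NO_NODE */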

[1]
https://lore.kernel.org/linux-mm/CANn89iL6AAyWhfxdHO+jaT075iOa3XcYn9k6JJc7JR2XYn6k_Q@mail.gmail.com/

[2]
https://lore.kernel.org/linux-mm/CANn89iLofTR=AK-QOZY87RdUZENCZUT4O6a0hvhu3_EwRMerOg@mail.gmail.com/

Fixes: 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
---
 mm/vmalloc.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d77830ff604c..87552a4018aa 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2816,6 +2816,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 		unsigned int order, unsigned int nr_pages, struct page **pages)
 {
 	unsigned int nr_allocated = 0;
+	struct page *page;
+	int i;
 
 	/*
 	 * For order-0 pages we make use of bulk allocator, if
@@ -2823,7 +2825,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	 * to fails, fallback to a single page allocator that is
 	 * more permissive.
 	 */
-	if (!order) {
+	if (!order && nid != NUMA_NO_NODE) {
 		while (nr_allocated < nr_pages) {
 			unsigned int nr, nr_pages_request;
 
@@ -2848,7 +2850,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			if (nr != nr_pages_request)
 				break;
 		}
-	} else
+	} else if (order)
 		/*
 		 * Compound pages required for remap_vmalloc_page if
 		 * high-order pages.
@@ -2856,11 +2858,13 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 		gfp |= __GFP_COMP;
 
 	/* High-order pages or fallback path if "bulk" fails. */
-	while (nr_allocated < nr_pages) {
-		struct page *page;
-		int i;
 
-		page = alloc_pages_node(nid, gfp, order);
+	page = NULL;
+	while (nr_allocated < nr_pages) {
+		if (nid == NUMA_NO_NODE)
+			page = alloc_pages(gfp, order);
+		else
+			page = alloc_pages_node(nid, gfp, order);
 		if (unlikely(!page))
 			break;
 
-- 
2.25.1




* [PATCH v2 2/2] mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation
  2021-10-18 12:37 [PATCH v2 0/2] fix numa spreading for large hash tables Chen Wandun
  2021-10-18 12:37 ` [PATCH v2 1/2] mm/vmalloc: " Chen Wandun
@ 2021-10-18 12:37 ` Chen Wandun
  1 sibling, 0 replies; 9+ messages in thread
From: Chen Wandun @ 2021-10-18 12:37 UTC (permalink / raw)
  To: akpm, npiggin, linux-mm, linux-kernel, edumazet, wangkefeng.wang,
	guohanjun, shakeelb, urezki
  Cc: chenwandun

The fix in the previous patch can cause significant performance
regressions in some situations, as Andrew mentioned in [1]. The main
case is vmalloc: vmalloc allocates pages with NUMA_NO_NODE by default,
which now results in allocating pages one by one.

In order to solve this, __alloc_pages_bulk and the mempolicy should be
considered at the same time.

1) If a node is specified in the memory allocation request, all pages
are allocated by __alloc_pages_bulk.

2) If memory is allocated with an interleave policy, calculate how many
pages should be allocated on each node, and use __alloc_pages_bulk to
allocate the pages on each node (see the sketch after the reference
below).

[1]: https://lore.kernel.org/lkml/CALvZod4G3SzP3kWxQYn0fj+VgG-G3yWXz=gz17+3N57ru1iajw@mail.gmail.com/t/#m750c8e3231206134293b089feaa090590afa0f60
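
For illustration, here is a small stand-alone user-space sketch of the
split arithmetic that the new alloc_pages_bulk_array_interleave() below
performs; the page and node counts are made up:

#include <stdio.h>

int main(void)
{
	unsigned long nr_pages = 100;	/* pages requested (made-up number) */
	int nodes = 3;			/* nodes_weight(pol->nodes) (made-up number) */
	unsigned long nr_pages_per_node = nr_pages / nodes;
	int delta = nr_pages - nodes * nr_pages_per_node;
	int i;

	for (i = 0; i < nodes; i++) {
		unsigned long request = nr_pages_per_node + (delta > 0 ? 1 : 0);

		if (delta > 0)
			delta--;
		printf("node %d: bulk request %lu pages\n", i, request);
	}
	return 0;
}

With these numbers it requests 34, 33 and 33 pages from the three
interleave nodes, matching the delta handling in the patch.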

Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
---
 include/linux/gfp.h |  4 +++
 mm/mempolicy.c      | 82 +++++++++++++++++++++++++++++++++++++++++++++
 mm/vmalloc.c        | 20 ++++++++---
 3 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 55b2ec1f965a..cd98c858fc74 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -535,6 +535,10 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 				struct list_head *page_list,
 				struct page **page_array);
 
+unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+				unsigned long nr_pages,
+				struct page **page_array);
+
 /* Bulk allocate order-0 pages */
 static inline unsigned long
 alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 1592b081c58e..56bb1fe4d179 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2202,6 +2202,88 @@ struct page *alloc_pages(gfp_t gfp, unsigned order)
 }
 EXPORT_SYMBOL(alloc_pages);
 
+unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
+		struct mempolicy *pol, unsigned long nr_pages,
+		struct page **page_array)
+{
+	int nodes;
+	unsigned long nr_pages_per_node;
+	int delta;
+	int i;
+	unsigned long nr_allocated;
+	unsigned long total_allocated = 0;
+
+	nodes = nodes_weight(pol->nodes);
+	nr_pages_per_node = nr_pages / nodes;
+	delta = nr_pages - nodes * nr_pages_per_node;
+
+	for (i = 0; i < nodes; i++) {
+		if (delta) {
+			nr_allocated = __alloc_pages_bulk(gfp,
+					interleave_nodes(pol), NULL,
+					nr_pages_per_node + 1, NULL,
+					page_array);
+			delta--;
+		} else {
+			nr_allocated = __alloc_pages_bulk(gfp,
+					interleave_nodes(pol), NULL,
+					nr_pages_per_node, NULL, page_array);
+		}
+
+		page_array += nr_allocated;
+		total_allocated += nr_allocated;
+	}
+
+	return total_allocated;
+}
+
+unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
+		struct mempolicy *pol, unsigned long nr_pages,
+		struct page **page_array)
+{
+	gfp_t preferred_gfp;
+	unsigned long nr_allocated = 0;
+
+	preferred_gfp = gfp | __GFP_NOWARN;
+	preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
+
+	nr_allocated  = __alloc_pages_bulk(preferred_gfp, nid, &pol->nodes,
+					   nr_pages, NULL, page_array);
+
+	if (nr_allocated < nr_pages)
+		nr_allocated += __alloc_pages_bulk(gfp, numa_node_id(), NULL,
+				nr_pages - nr_allocated, NULL,
+				page_array + nr_allocated);
+	return nr_allocated;
+}
+
+/* The bulk allocator and the mempolicy should be considered at the
+ * same time in some situations such as vmalloc.
+ *
+ * This can accelerate memory allocation, especially for interleaved
+ * allocations.
+ */
+unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+		unsigned long nr_pages, struct page **page_array)
+{
+	struct mempolicy *pol = &default_policy;
+
+	if (!in_interrupt() && !(gfp & __GFP_THISNODE))
+		pol = get_task_policy(current);
+
+	if (pol->mode == MPOL_INTERLEAVE)
+		return alloc_pages_bulk_array_interleave(gfp, pol,
+							 nr_pages, page_array);
+
+	if (pol->mode == MPOL_PREFERRED_MANY)
+		return alloc_pages_bulk_array_preferred_many(gfp,
+				numa_node_id(), pol, nr_pages, page_array);
+
+	return __alloc_pages_bulk(gfp, policy_node(gfp, pol, numa_node_id()),
+				  policy_nodemask(gfp, pol), nr_pages, NULL,
+				  page_array);
+}
+
 int vma_dup_policy(struct vm_area_struct *src, struct vm_area_struct *dst)
 {
 	struct mempolicy *pol = mpol_dup(vma_policy(src));
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 87552a4018aa..19b34c266fac 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2825,7 +2825,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	 * to fails, fallback to a single page allocator that is
 	 * more permissive.
 	 */
-	if (!order && nid != NUMA_NO_NODE) {
+	if (!order) {
 		while (nr_allocated < nr_pages) {
 			unsigned int nr, nr_pages_request;
 
@@ -2837,8 +2837,20 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			 */
 			nr_pages_request = min(100U, nr_pages - nr_allocated);
 
-			nr = alloc_pages_bulk_array_node(gfp, nid,
-				nr_pages_request, pages + nr_allocated);
+			/* Memory allocation should consider the mempolicy; we
+			 * can't simply use the nearest node when nid == NUMA_NO_NODE,
+			 * otherwise memory may be allocated on only one node
+			 * while the mempolicy wants interleaved allocation.
+			 */
+			if (nid == NUMA_NO_NODE)
+				nr = alloc_pages_bulk_array_mempolicy(gfp,
+							nr_pages_request,
+							pages + nr_allocated);
+
+			else
+				nr = alloc_pages_bulk_array_node(gfp, nid,
+							nr_pages_request,
+							pages + nr_allocated);
 
 			nr_allocated += nr;
 			cond_resched();
@@ -2850,7 +2862,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			if (nr != nr_pages_request)
 				break;
 		}
-	} else if (order)
+	} else
 		/*
 		 * Compound pages required for remap_vmalloc_page if
 		 * high-order pages.
-- 
2.25.1




* Re: [PATCH v2 1/2] mm/vmalloc: fix numa spreading for large hash tables
  2021-10-18 12:37 ` [PATCH v2 1/2] mm/vmalloc: " Chen Wandun
@ 2021-10-18 12:39   ` Matthew Wilcox
  2021-10-18 13:01     ` Chen Wandun
  2021-10-19 15:53     ` Shakeel Butt
  2021-10-18 14:03   ` Uladzislau Rezki
  2021-10-19 15:56   ` Shakeel Butt
  2 siblings, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2021-10-18 12:39 UTC (permalink / raw)
  To: Chen Wandun
  Cc: akpm, npiggin, linux-mm, linux-kernel, edumazet, wangkefeng.wang,
	guohanjun, shakeelb, urezki

On Mon, Oct 18, 2021 at 08:37:09PM +0800, Chen Wandun wrote:
> Eric Dumazet reported strange NUMA spreading behaviour in [1], and found
> that commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
> introduced this issue [2].

I think the root problem here is that we have two meanings for
NUMA_NO_NODE.  I tend to read it as "The memory can be allocated from
any node", but here it's used to mean "The memory should be spread over
every node".  Should we split those out as -1 and -2?



* Re: [PATCH v2 1/2] mm/vmalloc: fix numa spreading for large hash tables
  2021-10-18 12:39   ` Matthew Wilcox
@ 2021-10-18 13:01     ` Chen Wandun
  2021-10-19 15:53     ` Shakeel Butt
  1 sibling, 0 replies; 9+ messages in thread
From: Chen Wandun @ 2021-10-18 13:01 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: akpm, npiggin, linux-mm, linux-kernel, edumazet, wangkefeng.wang,
	guohanjun, shakeelb, urezki



On 2021/10/18 20:39, Matthew Wilcox wrote:
> On Mon, Oct 18, 2021 at 08:37:09PM +0800, Chen Wandun wrote:
>> Eric Dumazet reported strange NUMA spreading behaviour in [1], and found
>> that commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
>> introduced this issue [2].
> 
> I think the root problem here is that we have two meanings for
> NUMA_NO_NODE.  I tend to read it as "The memory can be allocated from
> any node", but here it's used to mean "The memory should be spread over
> every node".  Should we split those out as -1 and -2?
Yes, the intent of NUMA_NO_NODE is sometimes confusing.

Besides, I think NUMA_NO_NODE should take the mempolicy into account in
most cases in the kernel, unless it is stated explicitly that memory
can be allocated without considering the mempolicy.

> .
> 



* Re: [PATCH v2 1/2] mm/vmalloc: fix numa spreading for large hash tables
  2021-10-18 12:37 ` [PATCH v2 1/2] mm/vmalloc: " Chen Wandun
  2021-10-18 12:39   ` Matthew Wilcox
@ 2021-10-18 14:03   ` Uladzislau Rezki
  2021-10-19 15:56   ` Shakeel Butt
  2 siblings, 0 replies; 9+ messages in thread
From: Uladzislau Rezki @ 2021-10-18 14:03 UTC (permalink / raw)
  To: Chen Wandun
  Cc: Andrew Morton, Nicholas Piggin, Linux Memory Management List,
	LKML, Eric Dumazet, Kefeng Wang, guohanjun, Shakeel Butt

> Eric Dumazet reported strange NUMA spreading behaviour in [1], and found
> that commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
> introduced this issue [2].
>
> Digging into the difference before and after this patch, the page
> allocation paths differ:
>
> before:
> alloc_large_system_hash
>     __vmalloc
>         __vmalloc_node(..., NUMA_NO_NODE, ...)
>             __vmalloc_node_range
>                 __vmalloc_area_node
>                     alloc_page /* NUMA_NO_NODE, so the alloc_page branch is chosen */
>                         alloc_pages_current
>                             alloc_page_interleave /* can be verified by printing the policy mode */
>
> after:
> alloc_large_system_hash
>     __vmalloc
>         __vmalloc_node(..., NUMA_NO_NODE, ...)
>             __vmalloc_node_range
>                 __vmalloc_area_node
>                     alloc_pages_node /* choose nid by numa_mem_id() */
>                         __alloc_pages_node(nid, ....)
>
> So after commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings"),
> memory is allocated on the current node instead of being interleaved
> across nodes.
>
> [1]
> https://lore.kernel.org/linux-mm/CANn89iL6AAyWhfxdHO+jaT075iOa3XcYn9k6JJc7JR2XYn6k_Q@mail.gmail.com/
>
> [2]
> https://lore.kernel.org/linux-mm/CANn89iLofTR=AK-QOZY87RdUZENCZUT4O6a0hvhu3_EwRMerOg@mail.gmail.com/
>
> Fixes: 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
> Reported-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Chen Wandun <chenwandun@huawei.com>
> ---
>  mm/vmalloc.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index d77830ff604c..87552a4018aa 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2816,6 +2816,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>                 unsigned int order, unsigned int nr_pages, struct page **pages)
>  {
>         unsigned int nr_allocated = 0;
> +       struct page *page;
> +       int i;
>
>         /*
>          * For order-0 pages we make use of bulk allocator, if
> @@ -2823,7 +2825,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>          * to fails, fallback to a single page allocator that is
>          * more permissive.
>          */
> -       if (!order) {
> +       if (!order && nid != NUMA_NO_NODE) {
>                 while (nr_allocated < nr_pages) {
>                         unsigned int nr, nr_pages_request;
>
> @@ -2848,7 +2850,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>                         if (nr != nr_pages_request)
>                                 break;
>                 }
> -       } else
> +       } else if (order)
>                 /*
>                  * Compound pages required for remap_vmalloc_page if
>                  * high-order pages.
> @@ -2856,11 +2858,13 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>                 gfp |= __GFP_COMP;
>
>         /* High-order pages or fallback path if "bulk" fails. */
> -       while (nr_allocated < nr_pages) {
> -               struct page *page;
> -               int i;
>
> -               page = alloc_pages_node(nid, gfp, order);
> +       page = NULL;
>
Why do you need to set page to NULL here?

-- 
Vlad Rezki



* Re: [PATCH v2 1/2] mm/vmalloc: fix numa spreading for large hash tables
  2021-10-18 12:39   ` Matthew Wilcox
  2021-10-18 13:01     ` Chen Wandun
@ 2021-10-19 15:53     ` Shakeel Butt
  2021-10-19 15:58       ` Eric Dumazet
  1 sibling, 1 reply; 9+ messages in thread
From: Shakeel Butt @ 2021-10-19 15:53 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Chen Wandun, Andrew Morton, Nicholas Piggin, Linux MM, LKML,
	Eric Dumazet, Kefeng Wang, guohanjun, Uladzislau Rezki (Sony)

On Mon, Oct 18, 2021 at 5:41 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Oct 18, 2021 at 08:37:09PM +0800, Chen Wandun wrote:
> > Eric Dumazet reported strange NUMA spreading behaviour in [1], and found
> > that commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
> > introduced this issue [2].
>
> I think the root problem here is that we have two meanings for
> NUMA_NO_NODE.  I tend to read it as "The memory can be allocated from
> any node", but here it's used to mean "The memory should be spread over
> every node".  Should we split those out as -1 and -2?

I agree with Willy's suggestion to make it more explicit, but as
follow-up work. This patch needs a backport, so let's keep it simple.



* Re: [PATCH v2 1/2] mm/vmalloc: fix numa spreading for large hash tables
  2021-10-18 12:37 ` [PATCH v2 1/2] mm/vmalloc: " Chen Wandun
  2021-10-18 12:39   ` Matthew Wilcox
  2021-10-18 14:03   ` Uladzislau Rezki
@ 2021-10-19 15:56   ` Shakeel Butt
  2 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2021-10-19 15:56 UTC (permalink / raw)
  To: Chen Wandun
  Cc: Andrew Morton, Nicholas Piggin, Linux MM, LKML, Eric Dumazet,
	Kefeng Wang, guohanjun, Uladzislau Rezki (Sony)

On Mon, Oct 18, 2021 at 5:23 AM Chen Wandun <chenwandun@huawei.com> wrote:
>
[...]
>
>         /* High-order pages or fallback path if "bulk" fails. */
> -       while (nr_allocated < nr_pages) {
> -               struct page *page;
> -               int i;
>
> -               page = alloc_pages_node(nid, gfp, order);
> +       page = NULL;

No need for the above NULL assignment.

After removing this, you can add:

Reviewed-by: Shakeel Butt <shakeelb@google.com>

> +       while (nr_allocated < nr_pages) {
> +               if (nid == NUMA_NO_NODE)
> +                       page = alloc_pages(gfp, order);
> +               else
> +                       page = alloc_pages_node(nid, gfp, order);
>                 if (unlikely(!page))
>                         break;
>
> --
> 2.25.1
>



* Re: [PATCH v2 1/2] mm/vmalloc: fix numa spreading for large hash tables
  2021-10-19 15:53     ` Shakeel Butt
@ 2021-10-19 15:58       ` Eric Dumazet
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2021-10-19 15:58 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Matthew Wilcox, Chen Wandun, Andrew Morton, Nicholas Piggin,
	Linux MM, LKML, Kefeng Wang, guohanjun, Uladzislau Rezki (Sony)

On Tue, Oct 19, 2021 at 8:54 AM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Mon, Oct 18, 2021 at 5:41 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Mon, Oct 18, 2021 at 08:37:09PM +0800, Chen Wandun wrote:
> > > Eric Dumazet reported strange NUMA spreading behaviour in [1], and found
> > > that commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
> > > introduced this issue [2].
> >
> > I think the root problem here is that we have two meanings for
> > NUMA_NO_NODE.  I tend to read it as "The memory can be allocated from
> > any node", but here it's used to mean "The memory should be spread over
> > every node".  Should we split those out as -1 and -2?
>
> I agree with Willy's suggestion to make it more explicit, but as
> follow-up work. This patch needs a backport, so let's keep it simple.

NUMA_NO_NODE in process context also means:
please follow the current thread's NUMA policies.

One could hope, for instance, that whenever large BPF maps are allocated,
the current thread could set non-default NUMA policies.
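
As a rough user-space illustration of that idea (a sketch only; the
node mask and the place where the policy would be applied are
assumptions, not something the BPF code does today), a thread can
install an interleave policy around a large allocation:

/* Sketch: install MPOL_INTERLEAVE for the current thread so that
 * NUMA_NO_NODE allocations done by the kernel on its behalf get
 * spread, then restore the default policy.  Build with -lnuma.
 */
#include <numaif.h>

int spread_large_alloc(void)
{
	unsigned long nodemask = 0x3;	/* nodes 0 and 1 (assumption) */

	if (set_mempolicy(MPOL_INTERLEAVE, &nodemask, sizeof(nodemask) * 8))
		return -1;

	/* ... create the large BPF map / do the big allocation here ... */

	set_mempolicy(MPOL_DEFAULT, NULL, 0);
	return 0;
}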

