* [PATCH v2 0/4] Some cleanups for the KVA/vmalloc
@ 2019-05-26 21:22 Uladzislau Rezki (Sony)
  2019-05-26 21:22 ` [PATCH v2 1/4] mm/vmap: remove "node" argument Uladzislau Rezki (Sony)
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Uladzislau Rezki (Sony) @ 2019-05-26 21:22 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Roman Gushchin, Uladzislau Rezki, Hillf Danton, Michal Hocko,
	Matthew Wilcox, LKML, Thomas Garnier, Oleksiy Avramchenko,
	Steven Rostedt, Joel Fernandes, Thomas Gleixner, Ingo Molnar,
	Tejun Heo

Patch [1] removes the unused "node" argument from the
__alloc_vmap_area() function; nothing more.

Patch [2] is not driven by any particular failing workload; it
is simply a better approach to handling one specific split case.

Patch [3] contains some cleanups in the merging path. On the
first merge step the mergeable node is already detached, so
there is no reason to "unlink" it. The same applies to the
second step, unless the node has been merged on the first one.

Patch [4] moves the BUG_ON()/RB_EMPTY_NODE() check under the
"unlink" logic. After [3] the merging path only "unlinks" nodes
that are actually linked, therefore removing a detached object
is a bug in all cases.

v1->v2:
    - update the commit message of patch [2];
    - fix typos in comments of patch [2];
    - do the "preload" with NUMA awareness in patch [2];

Uladzislau Rezki (Sony) (4):
  mm/vmap: remove "node" argument
  mm/vmap: preload a CPU with one object for split purpose
  mm/vmap: get rid of one single unlink_va() when merge
  mm/vmap: move BUG_ON() check to the unlink_va()

 mm/vmalloc.c | 116 +++++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 90 insertions(+), 26 deletions(-)

-- 
2.11.0



* [PATCH v2 1/4] mm/vmap: remove "node" argument
  2019-05-26 21:22 [PATCH v2 0/4] Some cleanups for the KVA/vmalloc Uladzislau Rezki (Sony)
@ 2019-05-26 21:22 ` Uladzislau Rezki (Sony)
  2019-05-26 21:22 ` [PATCH v2 2/4] mm/vmap: preload a CPU with one object for split purpose Uladzislau Rezki (Sony)
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Uladzislau Rezki (Sony) @ 2019-05-26 21:22 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Roman Gushchin, Uladzislau Rezki, Hillf Danton, Michal Hocko,
	Matthew Wilcox, LKML, Thomas Garnier, Oleksiy Avramchenko,
	Steven Rostedt, Joel Fernandes, Thomas Gleixner, Ingo Molnar,
	Tejun Heo

Remove the unused "node" argument from the __alloc_vmap_area()
function.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c42872ed82ac..ea1b65fac599 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -985,7 +985,7 @@ adjust_va_to_fit_type(struct vmap_area *va,
  */
 static __always_inline unsigned long
 __alloc_vmap_area(unsigned long size, unsigned long align,
-	unsigned long vstart, unsigned long vend, int node)
+	unsigned long vstart, unsigned long vend)
 {
 	unsigned long nva_start_addr;
 	struct vmap_area *va;
@@ -1062,7 +1062,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	 * If an allocation fails, the "vend" address is
 	 * returned. Therefore trigger the overflow path.
 	 */
-	addr = __alloc_vmap_area(size, align, vstart, vend, node);
+	addr = __alloc_vmap_area(size, align, vstart, vend);
 	if (unlikely(addr == vend))
 		goto overflow;
 
-- 
2.11.0



* [PATCH v2 2/4] mm/vmap: preload a CPU with one object for split purpose
  2019-05-26 21:22 [PATCH v2 0/4] Some cleanups for the KVA/vmalloc Uladzislau Rezki (Sony)
  2019-05-26 21:22 ` [PATCH v2 1/4] mm/vmap: remove "node" argument Uladzislau Rezki (Sony)
@ 2019-05-26 21:22 ` Uladzislau Rezki (Sony)
  2019-05-26 21:22 ` [PATCH v2 3/4] mm/vmap: get rid of one single unlink_va() when merge Uladzislau Rezki (Sony)
  2019-05-26 21:22 ` [PATCH v2 4/4] mm/vmap: move BUG_ON() check to the unlink_va() Uladzislau Rezki (Sony)
  3 siblings, 0 replies; 6+ messages in thread
From: Uladzislau Rezki (Sony) @ 2019-05-26 21:22 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Roman Gushchin, Uladzislau Rezki, Hillf Danton, Michal Hocko,
	Matthew Wilcox, LKML, Thomas Garnier, Oleksiy Avramchenko,
	Steven Rostedt, Joel Fernandes, Thomas Gleixner, Ingo Molnar,
	Tejun Heo

Refactor the NE_FIT_TYPE split case when it comes to allocating
one extra object: we need it in order to build the remaining
free space after the split.

Introduce the ne_fit_preload()/ne_fit_preload_end() helpers for
preloading one extra vmap_area object per CPU, to ensure that it
is available when the fit type is NE_FIT_TYPE.

The preload is done per CPU in non-atomic context, thus with the
GFP_KERNEL allocation mask. More permissive parameters can be
beneficial for systems which suffer from high memory pressure or
low-memory conditions.
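
For context, a condensed sketch of the intended call pattern in
alloc_vmap_area(), taken from the hunk further below (the retry
and overflow handling is omitted here):

    /*
     * Preload one extra vmap_area object with GFP_KERNEL while we
     * are still allowed to sleep, then take the spinlock for the
     * actual search. ne_fit_preload_end() re-enables preemption
     * only when ne_fit_preload() returned 1, i.e. when it left
     * preemption disabled.
     */
    preloaded = ne_fit_preload(node);
    spin_lock(&vmap_area_lock);

    addr = __alloc_vmap_area(size, align, vstart, vend);
    ne_fit_preload_end(preloaded);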

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 76 insertions(+), 3 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ea1b65fac599..b553047aa05b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -364,6 +364,13 @@ static LIST_HEAD(free_vmap_area_list);
  */
 static struct rb_root free_vmap_area_root = RB_ROOT;
 
+/*
+ * Preload a CPU with one object for "no edge" split case. The
+ * aim is to get rid of allocations from the atomic context, thus
+ * to use more permissive allocation masks.
+ */
+static DEFINE_PER_CPU(struct vmap_area *, ne_fit_preload_node);
+
 static __always_inline unsigned long
 va_size(struct vmap_area *va)
 {
@@ -950,9 +957,24 @@ adjust_va_to_fit_type(struct vmap_area *va,
 		 *   L V  NVA  V R
 		 * |---|-------|---|
 		 */
-		lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
-		if (unlikely(!lva))
-			return -1;
+		lva = __this_cpu_xchg(ne_fit_preload_node, NULL);
+		if (unlikely(!lva)) {
+			/*
+			 * For percpu allocator we do not do any pre-allocation
+			 * and leave it as it is. The reason is it most likely
+			 * never ends up with NE_FIT_TYPE splitting. In case of
+			 * percpu allocations offsets and sizes are aligned to
+			 * fixed align request, i.e. RE_FIT_TYPE and FL_FIT_TYPE
+			 * are its main fitting cases.
+			 *
+			 * There are a few exceptions though, as an example it is
+			 * a first allocation (early boot up) when we have "one"
+			 * big free space that has to be split.
+			 */
+			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
+			if (!lva)
+				return -1;
+		}
 
 		/*
 		 * Build the remainder.
@@ -1023,6 +1045,48 @@ __alloc_vmap_area(unsigned long size, unsigned long align,
 }
 
 /*
+ * Preload this CPU with one extra vmap_area object to ensure
+ * that we have it available when fit type of free area is
+ * NE_FIT_TYPE.
+ *
+ * The preload is done in non-atomic context, thus it allows us
+ * to use more permissive allocation masks to be more stable under
+ * low memory condition and high memory pressure.
+ *
+ * If success it returns 1 with preemption disabled. In case
+ * of error 0 is returned with preemption not disabled. Note it
+ * has to be paired with ne_fit_preload_end().
+ */
+static int
+ne_fit_preload(int nid)
+{
+	preempt_disable();
+
+	if (!__this_cpu_read(ne_fit_preload_node)) {
+		struct vmap_area *node;
+
+		preempt_enable();
+		node = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, nid);
+		if (node == NULL)
+			return 0;
+
+		preempt_disable();
+
+		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, node))
+			kmem_cache_free(vmap_area_cachep, node);
+	}
+
+	return 1;
+}
+
+static void
+ne_fit_preload_end(int preloaded)
+{
+	if (preloaded)
+		preempt_enable();
+}
+
+/*
  * Allocate a region of KVA of the specified size and alignment, within the
  * vstart and vend.
  */
@@ -1034,6 +1098,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	struct vmap_area *va;
 	unsigned long addr;
 	int purged = 0;
+	int preloaded;
 
 	BUG_ON(!size);
 	BUG_ON(offset_in_page(size));
@@ -1056,6 +1121,12 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK);
 
 retry:
+	/*
+	 * Even if it fails we do not really care about that.
+	 * Just proceed as it is. "overflow" path will refill
+	 * the cache we allocate from.
+	 */
+	preloaded = ne_fit_preload(node);
 	spin_lock(&vmap_area_lock);
 
 	/*
@@ -1063,6 +1134,8 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	 * returned. Therefore trigger the overflow path.
 	 */
 	addr = __alloc_vmap_area(size, align, vstart, vend);
+	ne_fit_preload_end(preloaded);
+
 	if (unlikely(addr == vend))
 		goto overflow;
 
-- 
2.11.0



* [PATCH v2 3/4] mm/vmap: get rid of one single unlink_va() when merge
  2019-05-26 21:22 [PATCH v2 0/4] Some cleanups for the KVA/vmalloc Uladzislau Rezki (Sony)
  2019-05-26 21:22 ` [PATCH v2 1/4] mm/vmap: remove "node" argument Uladzislau Rezki (Sony)
  2019-05-26 21:22 ` [PATCH v2 2/4] mm/vmap: preload a CPU with one object for split purpose Uladzislau Rezki (Sony)
@ 2019-05-26 21:22 ` Uladzislau Rezki (Sony)
  2019-05-26 21:22 ` [PATCH v2 4/4] mm/vmap: move BUG_ON() check to the unlink_va() Uladzislau Rezki (Sony)
  3 siblings, 0 replies; 6+ messages in thread
From: Uladzislau Rezki (Sony) @ 2019-05-26 21:22 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Roman Gushchin, Uladzislau Rezki, Hillf Danton, Michal Hocko,
	Matthew Wilcox, LKML, Thomas Garnier, Oleksiy Avramchenko,
	Steven Rostedt, Joel Fernandes, Thomas Gleixner, Ingo Molnar,
	Tejun Heo

It does not make sense to try to "unlink" a node that is
definitely not linked into the list nor the tree. On the first
merge step the VA just points to the previously disconnected
busy area.

On the second step, check whether the node has been merged and
do the "unlink" only if so, because now it points to an object
that must be linked.
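
To make the two steps concrete, below is a heavily simplified
sketch of merge_or_add_vmap_area() after this change. The sibling
lookups and boundary checks are paraphrased; it is not the real
code:

    /* Step 1: merge with the next area if they are adjacent. */
    if (sibling->va_start == va->va_end) {
            sibling->va_start = va->va_start;
            /* "va" was never linked here, so no unlink_va(). */
            kmem_cache_free(vmap_area_cachep, va);
            /* Continue with the merged (and linked) node. */
            va = sibling;
            merged = true;
    }

    /* Step 2: merge with the previous area if they are adjacent. */
    if (sibling->va_end == va->va_start) {
            sibling->va_end = va->va_end;
            /* Only a node merged in step 1 is actually linked. */
            if (merged)
                    unlink_va(va, root);
            kmem_cache_free(vmap_area_cachep, va);
            return;
    }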

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b553047aa05b..6f91136f2cc8 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -718,9 +718,6 @@ merge_or_add_vmap_area(struct vmap_area *va,
 			/* Check and update the tree if needed. */
 			augment_tree_propagate_from(sibling);
 
-			/* Remove this VA, it has been merged. */
-			unlink_va(va, root);
-
 			/* Free vmap_area object. */
 			kmem_cache_free(vmap_area_cachep, va);
 
@@ -745,12 +742,12 @@ merge_or_add_vmap_area(struct vmap_area *va,
 			/* Check and update the tree if needed. */
 			augment_tree_propagate_from(sibling);
 
-			/* Remove this VA, it has been merged. */
-			unlink_va(va, root);
+			/* Remove this VA, if it has been merged. */
+			if (merged)
+				unlink_va(va, root);
 
 			/* Free vmap_area object. */
 			kmem_cache_free(vmap_area_cachep, va);
-
 			return;
 		}
 	}
-- 
2.11.0



* [PATCH v2 4/4] mm/vmap: move BUG_ON() check to the unlink_va()
  2019-05-26 21:22 [PATCH v2 0/4] Some cleanups for the KVA/vmalloc Uladzislau Rezki (Sony)
                   ` (2 preceding siblings ...)
  2019-05-26 21:22 ` [PATCH v2 3/4] mm/vmap: get rid of one single unlink_va() when merge Uladzislau Rezki (Sony)
@ 2019-05-26 21:22 ` Uladzislau Rezki (Sony)
  3 siblings, 0 replies; 6+ messages in thread
From: Uladzislau Rezki (Sony) @ 2019-05-26 21:22 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Roman Gushchin, Uladzislau Rezki, Hillf Danton, Michal Hocko,
	Matthew Wilcox, LKML, Thomas Garnier, Oleksiy Avramchenko,
	Steven Rostedt, Joel Fernandes, Thomas Gleixner, Ingo Molnar,
	Tejun Heo

Move the BUG_ON()/RB_EMPTY_NODE() check into the unlink_va()
function: if an empty (already detached) node is passed to it,
that is a BUG and is treated as faulty behaviour.
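
As a purely hypothetical illustration of the new invariant (this
caller does not exist in the patch), a node may only be passed to
unlink_va() while it is still linked:

    unlink_va(va, root);    /* detaches va, RB_CLEAR_NODE() marks it empty */
    unlink_va(va, root);    /* a second call now trips the BUG_ON() */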

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6f91136f2cc8..0cd2a152826e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -533,20 +533,16 @@ link_va(struct vmap_area *va, struct rb_root *root,
 static __always_inline void
 unlink_va(struct vmap_area *va, struct rb_root *root)
 {
-	/*
-	 * During merging a VA node can be empty, therefore
-	 * not linked with the tree nor list. Just check it.
-	 */
-	if (!RB_EMPTY_NODE(&va->rb_node)) {
-		if (root == &free_vmap_area_root)
-			rb_erase_augmented(&va->rb_node,
-				root, &free_vmap_area_rb_augment_cb);
-		else
-			rb_erase(&va->rb_node, root);
+	BUG_ON(RB_EMPTY_NODE(&va->rb_node));
 
-		list_del(&va->list);
-		RB_CLEAR_NODE(&va->rb_node);
-	}
+	if (root == &free_vmap_area_root)
+		rb_erase_augmented(&va->rb_node,
+			root, &free_vmap_area_rb_augment_cb);
+	else
+		rb_erase(&va->rb_node, root);
+
+	list_del(&va->list);
+	RB_CLEAR_NODE(&va->rb_node);
 }
 
 #if DEBUG_AUGMENT_PROPAGATE_CHECK
@@ -1188,8 +1184,6 @@ EXPORT_SYMBOL_GPL(unregister_vmap_purge_notifier);
 
 static void __free_vmap_area(struct vmap_area *va)
 {
-	BUG_ON(RB_EMPTY_NODE(&va->rb_node));
-
 	/*
 	 * Remove from the busy tree/list.
 	 */
-- 
2.11.0



* Re: [PATCH v2 3/4] mm/vmap: get rid of one single unlink_va() when merge
       [not found] <20190527030712.15472-1-hdanton@sina.com>
@ 2019-05-27  9:37 ` Uladzislau Rezki
  0 siblings, 0 replies; 6+ messages in thread
From: Uladzislau Rezki @ 2019-05-27  9:37 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Uladzislau Rezki (Sony),
	Andrew Morton, linux-mm, Roman Gushchin, Michal Hocko,
	Matthew Wilcox, LKML, Thomas Garnier, Oleksiy Avramchenko,
	Steven Rostedt, Joel Fernandes, Thomas Gleixner, Ingo Molnar,
	Tejun Heo

On Mon, May 27, 2019 at 11:07:12AM +0800, Hillf Danton wrote:
> 
> On Mon, 27 May 2019 05:22:28 +0800 Uladzislau Rezki (Sony) wrote:
> > It does not make sense to try to "unlink" the node that is
> > definitely not linked with a list nor tree. On the first
> > merge step VA just points to the previously disconnected
> > busy area.
> > 
> > On the second step, check if the node has been merged and do
> > "unlink" if so, because now it points to an object that must
> > be linked.
> > 
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > ---
> 
> Acked-by: Hillf Danton <hdanton@sina.com>
> 
Thanks!

> >  mm/vmalloc.c | 9 +++------
> >  1 file changed, 3 insertions(+), 6 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index b553047aa05b..6f91136f2cc8 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -718,9 +718,6 @@ merge_or_add_vmap_area(struct vmap_area *va,
> >  			/* Check and update the tree if needed. */
> >  			augment_tree_propagate_from(sibling);
> > 
> > -			/* Remove this VA, it has been merged. */
> > -			unlink_va(va, root);
> > -
> >  			/* Free vmap_area object. */
> >  			kmem_cache_free(vmap_area_cachep, va);
> > 
> > @@ -745,12 +742,12 @@ merge_or_add_vmap_area(struct vmap_area *va,
> >  			/* Check and update the tree if needed. */
> >  			augment_tree_propagate_from(sibling);
> >
> > -			/* Remove this VA, it has been merged. */
> > -			unlink_va(va, root);
> > +			/* Remove this VA, if it has been merged. */
> > +			if (merged)
> > +				unlink_va(va, root);
> >
> The change makes the code much easier to read, thanks.
> What is more, checking merged makes the polished comment unnecessary, imo.
> And it can be applied, I think, to the above hunk.
> 
That is odd. Will remove it.

--
Vlad Rezki
