linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/4] Some cleanups for the KVA/vmalloc
@ 2019-05-27 15:18 Uladzislau Rezki (Sony)
  2019-05-27 15:18 ` [PATCH v4 1/4] mm/vmap: remove "node" argument Uladzislau Rezki (Sony)
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Uladzislau Rezki (Sony) @ 2019-05-27 15:18 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Roman Gushchin, Uladzislau Rezki, Hillf Danton, Michal Hocko,
	Matthew Wilcox, LKML, Thomas Garnier, Oleksiy Avramchenko,
	Steven Rostedt, Joel Fernandes, Thomas Gleixner, Ingo Molnar,
	Tejun Heo

Patch [1] removes an unused argument "node" from the __alloc_vmap_area()
function and that is it.

Patch [2] is not driven by any particular workload that fails or so,
it is just better approach to handle one specific split case.

Patch [3] some cleanups in merging path. Basically on a first step
the mergeable node is detached and there is no reason to "unlink" it.
The same concerns the second step unless it has been merged on first
one.

Patch [4] replaces BUG_ON() by WARN_ON() and moves it under "unlink" logic.
After [3] merging path "unlink" only linked nodes. Therefore we can say
that removing detached object is a bug in all cases.

v3->v4:
    - Replace BUG_ON by WARN_ON() in [4];
    - Update the commit message of the [4].

v2->v3:
    - remove the odd comment from the [3];

v1->v2:
    - update the commit message. [2] patch;
    - fix typos in comments. [2] patch;
    - do the "preload" for NUMA awareness. [2] patch;

Uladzislau Rezki (Sony) (4):
  mm/vmap: remove "node" argument
  mm/vmap: preload a CPU with one object for split purpose
  mm/vmap: get rid of one single unlink_va() when merge
  mm/vmap: switch to WARN_ON() and move it under unlink_va()

 mm/vmalloc.c | 99 +++++++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 81 insertions(+), 18 deletions(-)

-- 
2.11.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v4 1/4] mm/vmap: remove "node" argument
  2019-05-27 15:18 [PATCH v4 0/4] Some cleanups for the KVA/vmalloc Uladzislau Rezki (Sony)
@ 2019-05-27 15:18 ` Uladzislau Rezki (Sony)
  2019-05-27 15:18 ` [PATCH v4 2/4] mm/vmap: preload a CPU with one object for split purpose Uladzislau Rezki (Sony)
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Uladzislau Rezki (Sony) @ 2019-05-27 15:18 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Roman Gushchin, Uladzislau Rezki, Hillf Danton, Michal Hocko,
	Matthew Wilcox, LKML, Thomas Garnier, Oleksiy Avramchenko,
	Steven Rostedt, Joel Fernandes, Thomas Gleixner, Ingo Molnar,
	Tejun Heo

Remove unused argument from the __alloc_vmap_area() function.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c42872ed82ac..ea1b65fac599 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -985,7 +985,7 @@ adjust_va_to_fit_type(struct vmap_area *va,
  */
 static __always_inline unsigned long
 __alloc_vmap_area(unsigned long size, unsigned long align,
-	unsigned long vstart, unsigned long vend, int node)
+	unsigned long vstart, unsigned long vend)
 {
 	unsigned long nva_start_addr;
 	struct vmap_area *va;
@@ -1062,7 +1062,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	 * If an allocation fails, the "vend" address is
 	 * returned. Therefore trigger the overflow path.
 	 */
-	addr = __alloc_vmap_area(size, align, vstart, vend, node);
+	addr = __alloc_vmap_area(size, align, vstart, vend);
 	if (unlikely(addr == vend))
 		goto overflow;
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v4 2/4] mm/vmap: preload a CPU with one object for split purpose
  2019-05-27 15:18 [PATCH v4 0/4] Some cleanups for the KVA/vmalloc Uladzislau Rezki (Sony)
  2019-05-27 15:18 ` [PATCH v4 1/4] mm/vmap: remove "node" argument Uladzislau Rezki (Sony)
@ 2019-05-27 15:18 ` Uladzislau Rezki (Sony)
  2019-05-27 15:18 ` [PATCH v4 3/4] mm/vmap: get rid of one single unlink_va() when merge Uladzislau Rezki (Sony)
  2019-05-27 15:18 ` [PATCH v4 4/4] mm/vmap: switch to WARN_ON() and move it under unlink_va() Uladzislau Rezki (Sony)
  3 siblings, 0 replies; 5+ messages in thread
From: Uladzislau Rezki (Sony) @ 2019-05-27 15:18 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Roman Gushchin, Uladzislau Rezki, Hillf Danton, Michal Hocko,
	Matthew Wilcox, LKML, Thomas Garnier, Oleksiy Avramchenko,
	Steven Rostedt, Joel Fernandes, Thomas Gleixner, Ingo Molnar,
	Tejun Heo

Refactor the NE_FIT_TYPE split case when it comes to an
allocation of one extra object. We need it in order to
build a remaining space.

Introduce ne_fit_preload()/ne_fit_preload_end() functions
for preloading one extra vmap_area object to ensure that
we have it available when fit type is NE_FIT_TYPE.

The preload is done per CPU in non-atomic context thus with
GFP_KERNEL allocation masks. More permissive parameters can
be beneficial for systems which are suffer from high memory
pressure or low memory condition.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 76 insertions(+), 3 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ea1b65fac599..b553047aa05b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -364,6 +364,13 @@ static LIST_HEAD(free_vmap_area_list);
  */
 static struct rb_root free_vmap_area_root = RB_ROOT;
 
+/*
+ * Preload a CPU with one object for "no edge" split case. The
+ * aim is to get rid of allocations from the atomic context, thus
+ * to use more permissive allocation masks.
+ */
+static DEFINE_PER_CPU(struct vmap_area *, ne_fit_preload_node);
+
 static __always_inline unsigned long
 va_size(struct vmap_area *va)
 {
@@ -950,9 +957,24 @@ adjust_va_to_fit_type(struct vmap_area *va,
 		 *   L V  NVA  V R
 		 * |---|-------|---|
 		 */
-		lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
-		if (unlikely(!lva))
-			return -1;
+		lva = __this_cpu_xchg(ne_fit_preload_node, NULL);
+		if (unlikely(!lva)) {
+			/*
+			 * For percpu allocator we do not do any pre-allocation
+			 * and leave it as it is. The reason is it most likely
+			 * never ends up with NE_FIT_TYPE splitting. In case of
+			 * percpu allocations offsets and sizes are aligned to
+			 * fixed align request, i.e. RE_FIT_TYPE and FL_FIT_TYPE
+			 * are its main fitting cases.
+			 *
+			 * There are a few exceptions though, as an example it is
+			 * a first allocation (early boot up) when we have "one"
+			 * big free space that has to be split.
+			 */
+			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
+			if (!lva)
+				return -1;
+		}
 
 		/*
 		 * Build the remainder.
@@ -1023,6 +1045,48 @@ __alloc_vmap_area(unsigned long size, unsigned long align,
 }
 
 /*
+ * Preload this CPU with one extra vmap_area object to ensure
+ * that we have it available when fit type of free area is
+ * NE_FIT_TYPE.
+ *
+ * The preload is done in non-atomic context, thus it allows us
+ * to use more permissive allocation masks to be more stable under
+ * low memory condition and high memory pressure.
+ *
+ * If success it returns 1 with preemption disabled. In case
+ * of error 0 is returned with preemption not disabled. Note it
+ * has to be paired with ne_fit_preload_end().
+ */
+static int
+ne_fit_preload(int nid)
+{
+	preempt_disable();
+
+	if (!__this_cpu_read(ne_fit_preload_node)) {
+		struct vmap_area *node;
+
+		preempt_enable();
+		node = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, nid);
+		if (node == NULL)
+			return 0;
+
+		preempt_disable();
+
+		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, node))
+			kmem_cache_free(vmap_area_cachep, node);
+	}
+
+	return 1;
+}
+
+static void
+ne_fit_preload_end(int preloaded)
+{
+	if (preloaded)
+		preempt_enable();
+}
+
+/*
  * Allocate a region of KVA of the specified size and alignment, within the
  * vstart and vend.
  */
@@ -1034,6 +1098,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	struct vmap_area *va;
 	unsigned long addr;
 	int purged = 0;
+	int preloaded;
 
 	BUG_ON(!size);
 	BUG_ON(offset_in_page(size));
@@ -1056,6 +1121,12 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK);
 
 retry:
+	/*
+	 * Even if it fails we do not really care about that.
+	 * Just proceed as it is. "overflow" path will refill
+	 * the cache we allocate from.
+	 */
+	preloaded = ne_fit_preload(node);
 	spin_lock(&vmap_area_lock);
 
 	/*
@@ -1063,6 +1134,8 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	 * returned. Therefore trigger the overflow path.
 	 */
 	addr = __alloc_vmap_area(size, align, vstart, vend);
+	ne_fit_preload_end(preloaded);
+
 	if (unlikely(addr == vend))
 		goto overflow;
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v4 3/4] mm/vmap: get rid of one single unlink_va() when merge
  2019-05-27 15:18 [PATCH v4 0/4] Some cleanups for the KVA/vmalloc Uladzislau Rezki (Sony)
  2019-05-27 15:18 ` [PATCH v4 1/4] mm/vmap: remove "node" argument Uladzislau Rezki (Sony)
  2019-05-27 15:18 ` [PATCH v4 2/4] mm/vmap: preload a CPU with one object for split purpose Uladzislau Rezki (Sony)
@ 2019-05-27 15:18 ` Uladzislau Rezki (Sony)
  2019-05-27 15:18 ` [PATCH v4 4/4] mm/vmap: switch to WARN_ON() and move it under unlink_va() Uladzislau Rezki (Sony)
  3 siblings, 0 replies; 5+ messages in thread
From: Uladzislau Rezki (Sony) @ 2019-05-27 15:18 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Roman Gushchin, Uladzislau Rezki, Hillf Danton, Michal Hocko,
	Matthew Wilcox, LKML, Thomas Garnier, Oleksiy Avramchenko,
	Steven Rostedt, Joel Fernandes, Thomas Gleixner, Ingo Molnar,
	Tejun Heo

It does not make sense to try to "unlink" the node that is
definitely not linked with a list nor tree. On the first
merge step VA just points to the previously disconnected
busy area.

On the second step, check if the node has been merged and do
"unlink" if so, because now it points to an object that must
be linked.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Hillf Danton <hdanton@sina.com>
---
 mm/vmalloc.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b553047aa05b..371aba9a4bf1 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -718,9 +718,6 @@ merge_or_add_vmap_area(struct vmap_area *va,
 			/* Check and update the tree if needed. */
 			augment_tree_propagate_from(sibling);
 
-			/* Remove this VA, it has been merged. */
-			unlink_va(va, root);
-
 			/* Free vmap_area object. */
 			kmem_cache_free(vmap_area_cachep, va);
 
@@ -745,12 +742,11 @@ merge_or_add_vmap_area(struct vmap_area *va,
 			/* Check and update the tree if needed. */
 			augment_tree_propagate_from(sibling);
 
-			/* Remove this VA, it has been merged. */
-			unlink_va(va, root);
+			if (merged)
+				unlink_va(va, root);
 
 			/* Free vmap_area object. */
 			kmem_cache_free(vmap_area_cachep, va);
-
 			return;
 		}
 	}
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v4 4/4] mm/vmap: switch to WARN_ON() and move it under unlink_va()
  2019-05-27 15:18 [PATCH v4 0/4] Some cleanups for the KVA/vmalloc Uladzislau Rezki (Sony)
                   ` (2 preceding siblings ...)
  2019-05-27 15:18 ` [PATCH v4 3/4] mm/vmap: get rid of one single unlink_va() when merge Uladzislau Rezki (Sony)
@ 2019-05-27 15:18 ` Uladzislau Rezki (Sony)
  3 siblings, 0 replies; 5+ messages in thread
From: Uladzislau Rezki (Sony) @ 2019-05-27 15:18 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Roman Gushchin, Uladzislau Rezki, Hillf Danton, Michal Hocko,
	Matthew Wilcox, LKML, Thomas Garnier, Oleksiy Avramchenko,
	Steven Rostedt, Joel Fernandes, Thomas Gleixner, Ingo Molnar,
	Tejun Heo

Trigger a warning if an object that is about to be freed is
detached. We used to have a BUG_ON(), but even though it is
considered as faulty behaviour that is not a good reason to
break a system.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 371aba9a4bf1..1dd459d0220a 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -533,11 +533,7 @@ link_va(struct vmap_area *va, struct rb_root *root,
 static __always_inline void
 unlink_va(struct vmap_area *va, struct rb_root *root)
 {
-	/*
-	 * During merging a VA node can be empty, therefore
-	 * not linked with the tree nor list. Just check it.
-	 */
-	if (!RB_EMPTY_NODE(&va->rb_node)) {
+	if (!WARN_ON(RB_EMPTY_NODE(&va->rb_node))) {
 		if (root == &free_vmap_area_root)
 			rb_erase_augmented(&va->rb_node,
 				root, &free_vmap_area_rb_augment_cb);
@@ -1187,8 +1183,6 @@ EXPORT_SYMBOL_GPL(unregister_vmap_purge_notifier);
 
 static void __free_vmap_area(struct vmap_area *va)
 {
-	BUG_ON(RB_EMPTY_NODE(&va->rb_node));
-
 	/*
 	 * Remove from the busy tree/list.
 	 */
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-05-27 15:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-27 15:18 [PATCH v4 0/4] Some cleanups for the KVA/vmalloc Uladzislau Rezki (Sony)
2019-05-27 15:18 ` [PATCH v4 1/4] mm/vmap: remove "node" argument Uladzislau Rezki (Sony)
2019-05-27 15:18 ` [PATCH v4 2/4] mm/vmap: preload a CPU with one object for split purpose Uladzislau Rezki (Sony)
2019-05-27 15:18 ` [PATCH v4 3/4] mm/vmap: get rid of one single unlink_va() when merge Uladzislau Rezki (Sony)
2019-05-27 15:18 ` [PATCH v4 4/4] mm/vmap: switch to WARN_ON() and move it under unlink_va() Uladzislau Rezki (Sony)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).