All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev()
@ 2023-04-14 14:57 Liam R. Howlett
  2023-04-14 14:57 ` [PATCH 2/3] maple_tree: Fix mas_empty_area() search Liam R. Howlett
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Liam R. Howlett @ 2023-04-14 14:57 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel, Rick Edgecombe, Liam R. Howlett

Stop using maple state min/max for the range by passing through pointers
for those values.  This will allow the maple state to be reused without
resetting.

Also add some logic to fail out early on searching with invalid
arguments.

Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 lib/maple_tree.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 4df6a0ce1c1b..ed350aa293b2 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -4938,7 +4938,8 @@ static inline void *mas_prev_entry(struct ma_state *mas, unsigned long min)
  * Return: True if found in a leaf, false otherwise.
  *
  */
-static bool mas_rev_awalk(struct ma_state *mas, unsigned long size)
+static bool mas_rev_awalk(struct ma_state *mas, unsigned long size,
+		unsigned long *gap_min, unsigned long *gap_max)
 {
 	enum maple_type type = mte_node_type(mas->node);
 	struct maple_node *node = mas_mn(mas);
@@ -5003,8 +5004,8 @@ static bool mas_rev_awalk(struct ma_state *mas, unsigned long size)
 
 	if (unlikely(ma_is_leaf(type))) {
 		mas->offset = offset;
-		mas->min = min;
-		mas->max = min + gap - 1;
+		*gap_min = min;
+		*gap_max = min + gap - 1;
 		return true;
 	}
 
@@ -5280,6 +5281,9 @@ int mas_empty_area(struct ma_state *mas, unsigned long min,
 	unsigned long *pivots;
 	enum maple_type mt;
 
+	if (min >= max)
+		return -EINVAL;
+
 	if (mas_is_start(mas))
 		mas_start(mas);
 	else if (mas->offset >= 2)
@@ -5334,6 +5338,9 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
 {
 	struct maple_enode *last = mas->node;
 
+	if (min >= max)
+		return -EINVAL;
+
 	if (mas_is_start(mas)) {
 		mas_start(mas);
 		mas->offset = mas_data_end(mas);
@@ -5353,7 +5360,7 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
 	mas->index = min;
 	mas->last = max;
 
-	while (!mas_rev_awalk(mas, size)) {
+	while (!mas_rev_awalk(mas, size, &min, &max)) {
 		if (last == mas->node) {
 			if (!mas_rewind_node(mas))
 				return -EBUSY;
@@ -5368,17 +5375,9 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
 	if (unlikely(mas->offset == MAPLE_NODE_SLOTS))
 		return -EBUSY;
 
-	/*
-	 * mas_rev_awalk() has set mas->min and mas->max to the gap values.  If
-	 * the maximum is outside the window we are searching, then use the last
-	 * location in the search.
-	 * mas->max and mas->min is the range of the gap.
-	 * mas->index and mas->last are currently set to the search range.
-	 */
-
 	/* Trim the upper limit to the max. */
-	if (mas->max <= mas->last)
-		mas->last = mas->max;
+	if (max <= mas->last)
+		mas->last = max;
 
 	mas->index = mas->last - size + 1;
 	return 0;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/3] maple_tree: Fix mas_empty_area() search
  2023-04-14 14:57 [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev() Liam R. Howlett
@ 2023-04-14 14:57 ` Liam R. Howlett
  2023-04-14 14:57 ` [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown} Liam R. Howlett
  2023-04-19  9:02 ` [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev() Peng Zhang
  2 siblings, 0 replies; 20+ messages in thread
From: Liam R. Howlett @ 2023-04-14 14:57 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel, Rick Edgecombe, Liam R. Howlett

The internal function of mas_awalk() was incorrectly skipping the last
entry in a node, which could potentially be NULL.  This is only a
problem for the left-most node in the tree - otherwise that NULL would
not exist.

Fix mas_awalk() by using the metadata to obtain the end of the node for
the loop and the logical pivot as apposed to the raw pivot value.

Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 lib/maple_tree.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index ed350aa293b2..ecf4c1c0f4b3 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5029,10 +5029,10 @@ static inline bool mas_anode_descend(struct ma_state *mas, unsigned long size)
 {
 	enum maple_type type = mte_node_type(mas->node);
 	unsigned long pivot, min, gap = 0;
-	unsigned char offset;
-	unsigned long *gaps;
-	unsigned long *pivots = ma_pivots(mas_mn(mas), type);
-	void __rcu **slots = ma_slots(mas_mn(mas), type);
+	unsigned char offset, data_end;
+	unsigned long *gaps, *pivots;
+	void __rcu **slots;
+	struct maple_node *node;
 	bool found = false;
 
 	if (ma_is_dense(type)) {
@@ -5040,13 +5040,15 @@ static inline bool mas_anode_descend(struct ma_state *mas, unsigned long size)
 		return true;
 	}
 
-	gaps = ma_gaps(mte_to_node(mas->node), type);
+	node = mas_mn(mas);
+	pivots = ma_pivots(node, type);
+	slots = ma_slots(node, type);
+	gaps = ma_gaps(node, type);
 	offset = mas->offset;
 	min = mas_safe_min(mas, pivots, offset);
-	for (; offset < mt_slots[type]; offset++) {
-		pivot = mas_safe_pivot(mas, pivots, offset, type);
-		if (offset && !pivot)
-			break;
+	data_end = ma_data_end(node, type, pivots, mas->max);
+	for (; offset <= data_end; offset++) {
+		pivot = mas_logical_pivot(mas, pivots, offset, type);
 
 		/* Not within lower bounds */
 		if (mas->index > pivot)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-14 14:57 [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev() Liam R. Howlett
  2023-04-14 14:57 ` [PATCH 2/3] maple_tree: Fix mas_empty_area() search Liam R. Howlett
@ 2023-04-14 14:57 ` Liam R. Howlett
  2023-04-14 16:27   ` Edgecombe, Rick P
  2023-04-14 18:59   ` [PATCH v2] " Liam R. Howlett
  2023-04-19  9:02 ` [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev() Peng Zhang
  2 siblings, 2 replies; 20+ messages in thread
From: Liam R. Howlett @ 2023-04-14 14:57 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel, Rick Edgecombe, Liam R. Howlett

The maple tree limits the gap returned to a window that specifically
fits what was asked.  This may not be optimal in the case of switching
search directions or a gap that does not satisfy the requested space for
other reasons.  Fix the search by retrying the operation and limiting
the search window in the rare occasion that a conflict occurs.

Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Fixes: 3499a13168da ("mm/mmap: use maple tree for unmapped_area{_topdown}")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 mm/mmap.c | 50 +++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 055fbd5ed762..c3b269260138 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1548,7 +1548,8 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
  */
 static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
 {
-	unsigned long length, gap;
+	unsigned long length, gap, low_limit;
+	struct vm_area_struct *tmp;
 
 	MA_STATE(mas, &current->mm->mm_mt, 0, 0);
 
@@ -1557,12 +1558,30 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
 	if (length < info->length)
 		return -ENOMEM;
 
-	if (mas_empty_area(&mas, info->low_limit, info->high_limit - 1,
-				  length))
+	low_limit = info->low_limit;
+retry:
+	if (mas_empty_area(&mas, low_limit, info->high_limit - 1, length))
 		return -ENOMEM;
 
 	gap = mas.index;
 	gap += (info->align_offset - gap) & info->align_mask;
+	tmp = mas_next(&mas, ULONG_MAX);
+	if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) {
+		if (vm_start_gap(tmp) < gap + length - 1) {
+			low_limit = tmp->vm_end;
+			mas_reset(&mas);
+			goto retry;
+		}
+	} else {
+		tmp = mas_prev(&mas, 0);
+		if (tmp && (tmp->vm_flags & VM_GROWSUP) &&
+		    vm_end_gap(tmp) > gap) {
+			low_limit = vm_end_gap(tmp);
+			mas_reset(&mas);
+			goto retry;
+		}
+	}
+
 	return gap;
 }
 
@@ -1578,7 +1597,8 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
  */
 static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info)
 {
-	unsigned long length, gap;
+	unsigned long length, gap, high_limit, gap_end;
+	struct vm_area_struct *tmp;
 
 	MA_STATE(mas, &current->mm->mm_mt, 0, 0);
 	/* Adjust search length to account for worst case alignment overhead */
@@ -1586,12 +1606,32 @@ static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info)
 	if (length < info->length)
 		return -ENOMEM;
 
-	if (mas_empty_area_rev(&mas, info->low_limit, info->high_limit - 1,
+	high_limit = info->high_limit;
+retry:
+	if (mas_empty_area_rev(&mas, info->low_limit, high_limit - 1,
 				length))
 		return -ENOMEM;
 
 	gap = mas.last + 1 - info->length;
 	gap -= (gap - info->align_offset) & info->align_mask;
+	gap_end = mas.last;
+	tmp = mas_next(&mas, ULONG_MAX);
+	if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) {
+		if (vm_start_gap(tmp) <= gap_end) {
+			high_limit = vm_start_gap(tmp);
+			mas_reset(&mas);
+			goto retry;
+		}
+	} else {
+		tmp = mas_prev(&mas, 0);
+		if (tmp && (tmp->vm_flags & VM_GROWSUP) &&
+		    vm_end_gap(tmp) > gap) {
+			high_limit = tmp->vm_start;
+			mas_reset(&mas);
+			goto retry;
+		}
+	}
+
 	return gap;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-14 14:57 ` [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown} Liam R. Howlett
@ 2023-04-14 16:27   ` Edgecombe, Rick P
  2023-04-14 17:26     ` Liam R. Howlett
  2023-04-14 18:59   ` [PATCH v2] " Liam R. Howlett
  1 sibling, 1 reply; 20+ messages in thread
From: Edgecombe, Rick P @ 2023-04-14 16:27 UTC (permalink / raw)
  To: Liam.Howlett, akpm; +Cc: linux-mm, linux-kernel

On Fri, 2023-04-14 at 10:57 -0400, Liam R. Howlett wrote:<br>
> +       tmp = mas_next(&mas, ULONG_MAX);
> +       if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) {

Why also check VM_GROWSDOWN here (and VM_GROWSUP below)?
vm_start/end_gap() already have checks inside.

> +               if (vm_start_gap(tmp) < gap + length - 1) {
> +                       low_limit = tmp->vm_end;
> +                       mas_reset(&mas);
> +                       goto retry;
> +               }
> +       } else {
> +               tmp = mas_prev(&mas, 0);
> +               if (tmp && (tmp->vm_flags & VM_GROWSUP) &&
> +                   vm_end_gap(tmp) > gap) {
> +                       low_limit = vm_end_gap(tmp); 
> +                       mas_reset(&mas);
> +                       goto retry; 
> +               }
> +       } 
> +

Could it be like this?

tmp = mas_next(&mas, ULONG_MAX);
if (tmp && vm_start_gap(tmp) < gap + length - 1) {
		low_limit = tmp->vm_end;
		mas_reset(&mas);
		goto retry;
	}
} else {
	tmp = mas_prev(&mas, 0);
	if (tmp && vm_end_gap(tmp) > gap) {
		low_limit = vm_end_gap(tmp);
		mas_reset(&mas);
		goto retry;
	}
}


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-14 16:27   ` Edgecombe, Rick P
@ 2023-04-14 17:26     ` Liam R. Howlett
  2023-04-14 17:29       ` Liam R. Howlett
  0 siblings, 1 reply; 20+ messages in thread
From: Liam R. Howlett @ 2023-04-14 17:26 UTC (permalink / raw)
  To: Edgecombe, Rick P; +Cc: akpm, linux-mm, linux-kernel

* Edgecombe, Rick P <rick.p.edgecombe@intel.com> [230414 12:27]:
> On Fri, 2023-04-14 at 10:57 -0400, Liam R. Howlett wrote:<br>
> > +       tmp = mas_next(&mas, ULONG_MAX);
> > +       if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) {
> 
> Why also check VM_GROWSDOWN here (and VM_GROWSUP below)?
> vm_start/end_gap() already have checks inside.

An artifact of a plan that was later abandoned.

> 
> > +               if (vm_start_gap(tmp) < gap + length - 1) {
> > +                       low_limit = tmp->vm_end;
> > +                       mas_reset(&mas);
> > +                       goto retry;
> > +               }
> > +       } else {
> > +               tmp = mas_prev(&mas, 0);
> > +               if (tmp && (tmp->vm_flags & VM_GROWSUP) &&
> > +                   vm_end_gap(tmp) > gap) {
> > +                       low_limit = vm_end_gap(tmp); 
> > +                       mas_reset(&mas);
> > +                       goto retry; 
> > +               }
> > +       } 
> > +
> 
> Could it be like this?

Yes, I'll make this change.  Thanks for the suggestion.

> 
> tmp = mas_next(&mas, ULONG_MAX);
> if (tmp && vm_start_gap(tmp) < gap + length - 1) {
> 		low_limit = tmp->vm_end;
> 		mas_reset(&mas);
> 		goto retry;
> 	}
> } else {
> 	tmp = mas_prev(&mas, 0);
> 	if (tmp && vm_end_gap(tmp) > gap) {
> 		low_limit = vm_end_gap(tmp);
> 		mas_reset(&mas);
> 		goto retry;
> 	}
> }
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-14 17:26     ` Liam R. Howlett
@ 2023-04-14 17:29       ` Liam R. Howlett
  2023-04-14 17:53         ` Edgecombe, Rick P
  0 siblings, 1 reply; 20+ messages in thread
From: Liam R. Howlett @ 2023-04-14 17:29 UTC (permalink / raw)
  To: Edgecombe, Rick P, akpm, linux-mm, linux-kernel

* Liam R. Howlett <Liam.Howlett@Oracle.com> [230414 13:26]:
> * Edgecombe, Rick P <rick.p.edgecombe@intel.com> [230414 12:27]:
> > On Fri, 2023-04-14 at 10:57 -0400, Liam R. Howlett wrote:<br>
> > > +       tmp = mas_next(&mas, ULONG_MAX);
> > > +       if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) {
> > 
> > Why also check VM_GROWSDOWN here (and VM_GROWSUP below)?
> > vm_start/end_gap() already have checks inside.
> 
> An artifact of a plan that was later abandoned.
> 
> > 
> > > +               if (vm_start_gap(tmp) < gap + length - 1) {
> > > +                       low_limit = tmp->vm_end;
> > > +                       mas_reset(&mas);
> > > +                       goto retry;
> > > +               }
> > > +       } else {
> > > +               tmp = mas_prev(&mas, 0);
> > > +               if (tmp && (tmp->vm_flags & VM_GROWSUP) &&
> > > +                   vm_end_gap(tmp) > gap) {
> > > +                       low_limit = vm_end_gap(tmp); 
> > > +                       mas_reset(&mas);
> > > +                       goto retry; 
> > > +               }
> > > +       } 
> > > +
> > 
> > Could it be like this?
> 
> Yes, I'll make this change.  Thanks for the suggestion.


Wait, I like how it is.

In my version, if there is a stack that is VM_GROWSDOWN there, but does
not intercept the gap, then I won't check the prev.. in yours, we will
never avoid checking prev.

> 
> > 
> > tmp = mas_next(&mas, ULONG_MAX);
> > if (tmp && vm_start_gap(tmp) < gap + length - 1) {
> > 		low_limit = tmp->vm_end;
> > 		mas_reset(&mas);
> > 		goto retry;
> > 	}
> > } else {
> > 	tmp = mas_prev(&mas, 0);
> > 	if (tmp && vm_end_gap(tmp) > gap) {
> > 		low_limit = vm_end_gap(tmp);
> > 		mas_reset(&mas);
> > 		goto retry;
> > 	}
> > }
> > 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-14 17:29       ` Liam R. Howlett
@ 2023-04-14 17:53         ` Edgecombe, Rick P
  2023-04-14 18:07           ` Liam R. Howlett
  0 siblings, 1 reply; 20+ messages in thread
From: Edgecombe, Rick P @ 2023-04-14 17:53 UTC (permalink / raw)
  To: Liam.Howlett, linux-mm, linux-kernel, akpm

On Fri, 2023-04-14 at 13:29 -0400, Liam R. Howlett wrote:
> * Liam R. Howlett <Liam.Howlett@Oracle.com> [230414 13:26]:
> > * Edgecombe, Rick P <rick.p.edgecombe@intel.com> [230414 12:27]:
> > > On Fri, 2023-04-14 at 10:57 -0400, Liam R. Howlett wrote:<br>
> > > > +       tmp = mas_next(&mas, ULONG_MAX);
> > > > +       if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) {
> > > 
> > > Why also check VM_GROWSDOWN here (and VM_GROWSUP below)?
> > > vm_start/end_gap() already have checks inside.
> > 
> > An artifact of a plan that was later abandoned.
> > 
> > > 
> > > > +               if (vm_start_gap(tmp) < gap + length - 1) {
> > > > +                       low_limit = tmp->vm_end;
> > > > +                       mas_reset(&mas);
> > > > +                       goto retry;
> > > > +               }
> > > > +       } else {
> > > > +               tmp = mas_prev(&mas, 0);
> > > > +               if (tmp && (tmp->vm_flags & VM_GROWSUP) &&
> > > > +                   vm_end_gap(tmp) > gap) {
> > > > +                       low_limit = vm_end_gap(tmp); 
> > > > +                       mas_reset(&mas);
> > > > +                       goto retry; 
> > > > +               }
> > > > +       } 
> > > > +
> > > 
> > > Could it be like this?
> > 
> > Yes, I'll make this change.  Thanks for the suggestion.
> 
> 
> Wait, I like how it is.
> 
> In my version, if there is a stack that is VM_GROWSDOWN there, but
> does
> not intercept the gap, then I won't check the prev.. in yours, we
> will
> never avoid checking prev.

Hmm, I see. I guess I'm thinking ahead a bit to adding the shadow stack
guard gap, but I can always add to these vm_flags checks.

But are you sure this optimization is even possible? The old
vma_compute_gap() had this comment:
/*
 * Note: in the rare case of a VM_GROWSDOWN above a VM_GROWSUP, we
 * allow two stack_guard_gaps between them here, and when choosing
 * an unmapped area; whereas when expanding we only require one.
 * That's a little inconsistent, but keeps the code here simpler.
 */

Assuming this is a real scenario, if you have VM_GROWSDOWN above and
VM_GROWSUP below, don't you need to check the gaps for above and below?
Again thinking about adding shadow stack guard pages, something like
that could be a more common scenario. Not that you need to fix my out
of tree issues, but I would probably need to adjust it to check both
directions.

I guess there is no way to embed this inside maple tree search so we
don't need to retry? (sorry if this is a dumb question, it's an opaque
box to me).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-14 17:53         ` Edgecombe, Rick P
@ 2023-04-14 18:07           ` Liam R. Howlett
  2023-04-14 18:21             ` Edgecombe, Rick P
  0 siblings, 1 reply; 20+ messages in thread
From: Liam R. Howlett @ 2023-04-14 18:07 UTC (permalink / raw)
  To: Edgecombe, Rick P; +Cc: linux-mm, linux-kernel, akpm

* Edgecombe, Rick P <rick.p.edgecombe@intel.com> [230414 13:53]:
> On Fri, 2023-04-14 at 13:29 -0400, Liam R. Howlett wrote:
> > * Liam R. Howlett <Liam.Howlett@Oracle.com> [230414 13:26]:
> > > * Edgecombe, Rick P <rick.p.edgecombe@intel.com> [230414 12:27]:
> > > > On Fri, 2023-04-14 at 10:57 -0400, Liam R. Howlett wrote:<br>
> > > > > +       tmp = mas_next(&mas, ULONG_MAX);
> > > > > +       if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) {
> > > > 
> > > > Why also check VM_GROWSDOWN here (and VM_GROWSUP below)?
> > > > vm_start/end_gap() already have checks inside.
> > > 
> > > An artifact of a plan that was later abandoned.
> > > 
> > > > 
> > > > > +               if (vm_start_gap(tmp) < gap + length - 1) {
> > > > > +                       low_limit = tmp->vm_end;
> > > > > +                       mas_reset(&mas);
> > > > > +                       goto retry;
> > > > > +               }
> > > > > +       } else {
> > > > > +               tmp = mas_prev(&mas, 0);
> > > > > +               if (tmp && (tmp->vm_flags & VM_GROWSUP) &&
> > > > > +                   vm_end_gap(tmp) > gap) {
> > > > > +                       low_limit = vm_end_gap(tmp); 
> > > > > +                       mas_reset(&mas);
> > > > > +                       goto retry; 
> > > > > +               }
> > > > > +       } 
> > > > > +
> > > > 
> > > > Could it be like this?
> > > 
> > > Yes, I'll make this change.  Thanks for the suggestion.
> > 
> > 
> > Wait, I like how it is.
> > 
> > In my version, if there is a stack that is VM_GROWSDOWN there, but
> > does
> > not intercept the gap, then I won't check the prev.. in yours, we
> > will
> > never avoid checking prev.
> 
> Hmm, I see. I guess I'm thinking ahead a bit to adding the shadow stack
> guard gap, but I can always add to these vm_flags checks.
> 
> But are you sure this optimization is even possible? The old
> vma_compute_gap() had this comment:
> /*
>  * Note: in the rare case of a VM_GROWSDOWN above a VM_GROWSUP, we
>  * allow two stack_guard_gaps between them here, and when choosing
>  * an unmapped area; whereas when expanding we only require one.
>  * That's a little inconsistent, but keeps the code here simpler.
>  */

I didn't think this was possible.  ia64 (orphaned in 96ec72a3425d) did
this.

> 
> Assuming this is a real scenario, if you have VM_GROWSDOWN above and
> VM_GROWSUP below, don't you need to check the gaps for above and below?
> Again thinking about adding shadow stack guard pages, something like
> that could be a more common scenario. Not that you need to fix my out
> of tree issues, but I would probably need to adjust it to check both
> directions.
> 
> I guess there is no way to embed this inside maple tree search so we
> don't need to retry? (sorry if this is a dumb question, it's an opaque
> box to me).

Absolutely, and I'm working on this as well, but right now I'm trying
to fix my regression for this and past releases.  Handling this in the
maple tree is more involved and so there's more risk.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-14 18:07           ` Liam R. Howlett
@ 2023-04-14 18:21             ` Edgecombe, Rick P
  0 siblings, 0 replies; 20+ messages in thread
From: Edgecombe, Rick P @ 2023-04-14 18:21 UTC (permalink / raw)
  To: Liam.Howlett; +Cc: linux-mm, linux-kernel, akpm

On Fri, 2023-04-14 at 14:07 -0400, Liam R. Howlett wrote:
> * Edgecombe, Rick P <rick.p.edgecombe@intel.com> [230414 13:53]:
> > On Fri, 2023-04-14 at 13:29 -0400, Liam R. Howlett wrote:
> > > * Liam R. Howlett <Liam.Howlett@Oracle.com> [230414 13:26]:
> > > > * Edgecombe, Rick P <rick.p.edgecombe@intel.com> [230414
> > > > 12:27]:
> > > > > On Fri, 2023-04-14 at 10:57 -0400, Liam R. Howlett wrote:<br>
> > > > > > +       tmp = mas_next(&mas, ULONG_MAX);
> > > > > > +       if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) {
> > > > > 
> > > > > Why also check VM_GROWSDOWN here (and VM_GROWSUP below)?
> > > > > vm_start/end_gap() already have checks inside.
> > > > 
> > > > An artifact of a plan that was later abandoned.
> > > > 
> > > > > 
> > > > > > +               if (vm_start_gap(tmp) < gap + length - 1) {
> > > > > > +                       low_limit = tmp->vm_end;
> > > > > > +                       mas_reset(&mas);
> > > > > > +                       goto retry;
> > > > > > +               }
> > > > > > +       } else {
> > > > > > +               tmp = mas_prev(&mas, 0);
> > > > > > +               if (tmp && (tmp->vm_flags & VM_GROWSUP) &&
> > > > > > +                   vm_end_gap(tmp) > gap) {
> > > > > > +                       low_limit = vm_end_gap(tmp); 
> > > > > > +                       mas_reset(&mas);
> > > > > > +                       goto retry; 
> > > > > > +               }
> > > > > > +       } 
> > > > > > +
> > > > > 
> > > > > Could it be like this?
> > > > 
> > > > Yes, I'll make this change.  Thanks for the suggestion.
> > > 
> > > 
> > > Wait, I like how it is.
> > > 
> > > In my version, if there is a stack that is VM_GROWSDOWN there,
> > > but
> > > does
> > > not intercept the gap, then I won't check the prev.. in yours, we
> > > will
> > > never avoid checking prev.
> > 
> > Hmm, I see. I guess I'm thinking ahead a bit to adding the shadow
> > stack
> > guard gap, but I can always add to these vm_flags checks.
> > 
> > But are you sure this optimization is even possible? The old
> > vma_compute_gap() had this comment:
> > /*
> >   * Note: in the rare case of a VM_GROWSDOWN above a VM_GROWSUP, we
> >   * allow two stack_guard_gaps between them here, and when choosing
> >   * an unmapped area; whereas when expanding we only require one.
> >   * That's a little inconsistent, but keeps the code here simpler.
> >   */
> 
> I didn't think this was possible.  ia64 (orphaned in 96ec72a3425d)
> did
> this.

Ah, ok.

> 
> > 
> > Assuming this is a real scenario, if you have VM_GROWSDOWN above
> > and
> > VM_GROWSUP below, don't you need to check the gaps for above and
> > below?
> > Again thinking about adding shadow stack guard pages, something
> > like
> > that could be a more common scenario. Not that you need to fix my
> > out
> > of tree issues, but I would probably need to adjust it to check
> > both
> > directions.
> > 
> > I guess there is no way to embed this inside maple tree search so
> > we
> > don't need to retry? (sorry if this is a dumb question, it's an
> > opaque
> > box to me).
> 
> Absolutely, and I'm working on this as well, but right now I'm trying
> to fix my regression for this and past releases.  Handling this in
> the
> maple tree is more involved and so there's more risk.

Ok, thanks. It looks good to me in that respect.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-14 14:57 ` [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown} Liam R. Howlett
  2023-04-14 16:27   ` Edgecombe, Rick P
@ 2023-04-14 18:59   ` Liam R. Howlett
  2023-04-14 19:09     ` Andrew Morton
  2023-04-29 14:32     ` Tad
  1 sibling, 2 replies; 20+ messages in thread
From: Liam R. Howlett @ 2023-04-14 18:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel, Rick Edgecombe, Liam R. Howlett

The maple tree limits the gap returned to a window that specifically
fits what was asked.  This may not be optimal in the case of switching
search directions or a gap that does not satisfy the requested space for
other reasons.  Fix the search by retrying the operation and limiting
the search window in the rare occasion that a conflict occurs.

Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Fixes: 3499a13168da ("mm/mmap: use maple tree for unmapped_area{_topdown}")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---

v1 changes:
 - Add comment about avoiding prev check
 - Remove check for VM_GROWSUP on prev since vm_end_gap() does this

 mm/mmap.c | 48 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 43 insertions(+), 5 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 055fbd5ed762..6b9116f1c304 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1548,7 +1548,8 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
  */
 static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
 {
-	unsigned long length, gap;
+	unsigned long length, gap, low_limit;
+	struct vm_area_struct *tmp;
 
 	MA_STATE(mas, &current->mm->mm_mt, 0, 0);
 
@@ -1557,12 +1558,29 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
 	if (length < info->length)
 		return -ENOMEM;
 
-	if (mas_empty_area(&mas, info->low_limit, info->high_limit - 1,
-				  length))
+	low_limit = info->low_limit;
+retry:
+	if (mas_empty_area(&mas, low_limit, info->high_limit - 1, length))
 		return -ENOMEM;
 
 	gap = mas.index;
 	gap += (info->align_offset - gap) & info->align_mask;
+	tmp = mas_next(&mas, ULONG_MAX);
+	if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) { /* Avoid prev check if possible */
+		if (vm_start_gap(tmp) < gap + length - 1) {
+			low_limit = tmp->vm_end;
+			mas_reset(&mas);
+			goto retry;
+		}
+	} else {
+		tmp = mas_prev(&mas, 0);
+		if (tmp && vm_end_gap(tmp) > gap) {
+			low_limit = vm_end_gap(tmp);
+			mas_reset(&mas);
+			goto retry;
+		}
+	}
+
 	return gap;
 }
 
@@ -1578,7 +1596,8 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
  */
 static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info)
 {
-	unsigned long length, gap;
+	unsigned long length, gap, high_limit, gap_end;
+	struct vm_area_struct *tmp;
 
 	MA_STATE(mas, &current->mm->mm_mt, 0, 0);
 	/* Adjust search length to account for worst case alignment overhead */
@@ -1586,12 +1605,31 @@ static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info)
 	if (length < info->length)
 		return -ENOMEM;
 
-	if (mas_empty_area_rev(&mas, info->low_limit, info->high_limit - 1,
+	high_limit = info->high_limit;
+retry:
+	if (mas_empty_area_rev(&mas, info->low_limit, high_limit - 1,
 				length))
 		return -ENOMEM;
 
 	gap = mas.last + 1 - info->length;
 	gap -= (gap - info->align_offset) & info->align_mask;
+	gap_end = mas.last;
+	tmp = mas_next(&mas, ULONG_MAX);
+	if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) { /* Avoid prev check if possible */
+		if (vm_start_gap(tmp) <= gap_end) {
+			high_limit = vm_start_gap(tmp);
+			mas_reset(&mas);
+			goto retry;
+		}
+	} else {
+		tmp = mas_prev(&mas, 0);
+		if (tmp && vm_end_gap(tmp) > gap) {
+			high_limit = tmp->vm_start;
+			mas_reset(&mas);
+			goto retry;
+		}
+	}
+
 	return gap;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-14 18:59   ` [PATCH v2] " Liam R. Howlett
@ 2023-04-14 19:09     ` Andrew Morton
  2023-04-29 14:32     ` Tad
  1 sibling, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2023-04-14 19:09 UTC (permalink / raw)
  To: Liam R. Howlett; +Cc: linux-mm, linux-kernel, Rick Edgecombe

On Fri, 14 Apr 2023 14:59:19 -0400 "Liam R. Howlett" <Liam.Howlett@oracle.com> wrote:

> The maple tree limits the gap returned to a window that specifically
> fits what was asked.  This may not be optimal in the case of switching
> search directions or a gap that does not satisfy the requested space for
> other reasons.  Fix the search by retrying the operation and limiting
> the search window in the rare occasion that a conflict occurs.

This is a performance regression, yes?  Of what magnitude?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev()
  2023-04-14 14:57 [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev() Liam R. Howlett
  2023-04-14 14:57 ` [PATCH 2/3] maple_tree: Fix mas_empty_area() search Liam R. Howlett
  2023-04-14 14:57 ` [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown} Liam R. Howlett
@ 2023-04-19  9:02 ` Peng Zhang
  2023-04-19 22:54   ` Liam R. Howlett
  2 siblings, 1 reply; 20+ messages in thread
From: Peng Zhang @ 2023-04-19  9:02 UTC (permalink / raw)
  To: Liam R. Howlett; +Cc: linux-mm, linux-kernel, Rick Edgecombe, Andrew Morton


在 2023/4/14 22:57, Liam R. Howlett 写道:
> Stop using maple state min/max for the range by passing through pointers
> for those values.  This will allow the maple state to be reused without
> resetting.
>
> Also add some logic to fail out early on searching with invalid
> arguments.
>
> Fixes: 54a611b60590 ("Maple Tree: add new data structure")
> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> ---
>   lib/maple_tree.c | 27 +++++++++++++--------------
>   1 file changed, 13 insertions(+), 14 deletions(-)
>
> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> index 4df6a0ce1c1b..ed350aa293b2 100644
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -4938,7 +4938,8 @@ static inline void *mas_prev_entry(struct ma_state *mas, unsigned long min)
>    * Return: True if found in a leaf, false otherwise.
>    *
>    */
> -static bool mas_rev_awalk(struct ma_state *mas, unsigned long size)
> +static bool mas_rev_awalk(struct ma_state *mas, unsigned long size,
> +		unsigned long *gap_min, unsigned long *gap_max)
>   {
>   	enum maple_type type = mte_node_type(mas->node);
>   	struct maple_node *node = mas_mn(mas);
> @@ -5003,8 +5004,8 @@ static bool mas_rev_awalk(struct ma_state *mas, unsigned long size)
>   
>   	if (unlikely(ma_is_leaf(type))) {
>   		mas->offset = offset;
> -		mas->min = min;
> -		mas->max = min + gap - 1;
> +		*gap_min = min;
> +		*gap_max = min + gap - 1;
>   		return true;
>   	}
>   
> @@ -5280,6 +5281,9 @@ int mas_empty_area(struct ma_state *mas, unsigned long min,
>   	unsigned long *pivots;
>   	enum maple_type mt;
>   
> +	if (min >= max)
This can lead to errors, min == max is valid.
I think it's better to change it to this:
if (min > max || size == 0 || max - min < size - 1)
> +		return -EINVAL;
> +
>   	if (mas_is_start(mas))
>   		mas_start(mas);
>   	else if (mas->offset >= 2)
> @@ -5334,6 +5338,9 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
>   {
>   	struct maple_enode *last = mas->node;
>   
> +	if (min >= max)
ditto.
> +		return -EINVAL;
> +
>   	if (mas_is_start(mas)) {
>   		mas_start(mas);
>   		mas->offset = mas_data_end(mas);
> @@ -5353,7 +5360,7 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
>   	mas->index = min;
>   	mas->last = max;
>   
> -	while (!mas_rev_awalk(mas, size)) {
> +	while (!mas_rev_awalk(mas, size, &min, &max)) {
>   		if (last == mas->node) {
>   			if (!mas_rewind_node(mas))
>   				return -EBUSY;
> @@ -5368,17 +5375,9 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
>   	if (unlikely(mas->offset == MAPLE_NODE_SLOTS))
>   		return -EBUSY;
>   
> -	/*
> -	 * mas_rev_awalk() has set mas->min and mas->max to the gap values.  If
> -	 * the maximum is outside the window we are searching, then use the last
> -	 * location in the search.
> -	 * mas->max and mas->min is the range of the gap.
> -	 * mas->index and mas->last are currently set to the search range.
> -	 */
> -
>   	/* Trim the upper limit to the max. */
> -	if (mas->max <= mas->last)
> -		mas->last = mas->max;
> +	if (max <= mas->last)
> +		mas->last = max;

We can get max as follows, without using pointers to track min, max in 
mas_rev_awalk().

mt = mte_node_type(mas->node); pivots = ma_pivots(mas_mn(mas), mt); max 
= mas_logical_pivot(mas, pivots, mas->offset, mt);
	if (max < mas->last) /* The equal sign here can be removed */
		mas->last = max;

>   
>   	mas->index = mas->last - size + 1;
>   	return 0;

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev()
  2023-04-19  9:02 ` [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev() Peng Zhang
@ 2023-04-19 22:54   ` Liam R. Howlett
  2023-04-20  4:21     ` Peng Zhang
  0 siblings, 1 reply; 20+ messages in thread
From: Liam R. Howlett @ 2023-04-19 22:54 UTC (permalink / raw)
  To: Peng Zhang; +Cc: linux-mm, linux-kernel, Rick Edgecombe, Andrew Morton

* Peng Zhang <zhangpeng.00@bytedance.com> [230419 05:02]:
> 
> 在 2023/4/14 22:57, Liam R. Howlett 写道:
> > Stop using maple state min/max for the range by passing through pointers
> > for those values.  This will allow the maple state to be reused without
> > resetting.
> > 
> > Also add some logic to fail out early on searching with invalid
> > arguments.
> > 
> > Fixes: 54a611b60590 ("Maple Tree: add new data structure")
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> > ---
> >   lib/maple_tree.c | 27 +++++++++++++--------------
> >   1 file changed, 13 insertions(+), 14 deletions(-)
> > 
> > diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> > index 4df6a0ce1c1b..ed350aa293b2 100644
> > --- a/lib/maple_tree.c
> > +++ b/lib/maple_tree.c
> > @@ -4938,7 +4938,8 @@ static inline void *mas_prev_entry(struct ma_state *mas, unsigned long min)
> >    * Return: True if found in a leaf, false otherwise.
> >    *
> >    */
> > -static bool mas_rev_awalk(struct ma_state *mas, unsigned long size)
> > +static bool mas_rev_awalk(struct ma_state *mas, unsigned long size,
> > +		unsigned long *gap_min, unsigned long *gap_max)
> >   {
> >   	enum maple_type type = mte_node_type(mas->node);
> >   	struct maple_node *node = mas_mn(mas);
> > @@ -5003,8 +5004,8 @@ static bool mas_rev_awalk(struct ma_state *mas, unsigned long size)
> >   	if (unlikely(ma_is_leaf(type))) {
> >   		mas->offset = offset;
> > -		mas->min = min;
> > -		mas->max = min + gap - 1;
> > +		*gap_min = min;
> > +		*gap_max = min + gap - 1;
> >   		return true;
> >   	}
> > @@ -5280,6 +5281,9 @@ int mas_empty_area(struct ma_state *mas, unsigned long min,
> >   	unsigned long *pivots;
> >   	enum maple_type mt;
> > +	if (min >= max)
> This can lead to errors, min == max is valid.
> I think it's better to change it to this:
> if (min > max || size == 0 || max - min < size - 1)

I am not sure what it means to search within a range of one.  I guess
you would expect it to just return that value if it's empty?

In any case, since we are dealing with pages of data for the VMAs,
having min == max really makes no sense, even with the subtraction of
one in the caller to reduce the max, the min and max should be at least
PAGE_SIZE - 1 apart here.

I think you are right, and I think this needs to be looked at for the
tree on its own, but I don't think it's a problem for the VMA user. I'll
write a testcase and ensure a search for a single entry in a single
entry window works separately.  Thanks for pointing this out.

> > +		return -EINVAL;
> > +
> >   	if (mas_is_start(mas))
> >   		mas_start(mas);
> >   	else if (mas->offset >= 2)
> > @@ -5334,6 +5338,9 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
> >   {
> >   	struct maple_enode *last = mas->node;
> > +	if (min >= max)
> ditto.

I'll do the search in both directions.

> > +		return -EINVAL;
> > +
> >   	if (mas_is_start(mas)) {
> >   		mas_start(mas);
> >   		mas->offset = mas_data_end(mas);
> > @@ -5353,7 +5360,7 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
> >   	mas->index = min;
> >   	mas->last = max;
> > -	while (!mas_rev_awalk(mas, size)) {
> > +	while (!mas_rev_awalk(mas, size, &min, &max)) {
> >   		if (last == mas->node) {
> >   			if (!mas_rewind_node(mas))
> >   				return -EBUSY;
> > @@ -5368,17 +5375,9 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
> >   	if (unlikely(mas->offset == MAPLE_NODE_SLOTS))
> >   		return -EBUSY;
> > -	/*
> > -	 * mas_rev_awalk() has set mas->min and mas->max to the gap values.  If
> > -	 * the maximum is outside the window we are searching, then use the last
> > -	 * location in the search.
> > -	 * mas->max and mas->min is the range of the gap.
> > -	 * mas->index and mas->last are currently set to the search range.
> > -	 */
> > -
> >   	/* Trim the upper limit to the max. */
> > -	if (mas->max <= mas->last)
> > -		mas->last = mas->max;
> > +	if (max <= mas->last)
> > +		mas->last = max;
> 
> We can get max as follows, without using pointers to track min, max in
> mas_rev_awalk().
> 
> mt = mte_node_type(mas->node); pivots = ma_pivots(mas_mn(mas), mt); max =
> mas_logical_pivot(mas, pivots, mas->offset, mt);

Yes, but why would we do this?  We have done all this work already in
mas_rev_awalk(), and we have to do it there to get the offset in the
first place.

> 	if (max < mas->last) /* The equal sign here can be removed */

Thanks.  I'll keep this in mind when I revisit the function.  I don't
want to re-spin the patch for this alone.

> 		mas->last = max;
> 
> >   	mas->index = mas->last - size + 1;
> >   	return 0;

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev()
  2023-04-19 22:54   ` Liam R. Howlett
@ 2023-04-20  4:21     ` Peng Zhang
  0 siblings, 0 replies; 20+ messages in thread
From: Peng Zhang @ 2023-04-20  4:21 UTC (permalink / raw)
  To: Liam R. Howlett, Peng Zhang, linux-mm, linux-kernel,
	Rick Edgecombe, Andrew Morton


在 2023/4/20 06:54, Liam R. Howlett 写道:
> * Peng Zhang <zhangpeng.00@bytedance.com> [230419 05:02]:
>> 在 2023/4/14 22:57, Liam R. Howlett 写道:
>>> Stop using maple state min/max for the range by passing through pointers
>>> for those values.  This will allow the maple state to be reused without
>>> resetting.
>>>
>>> Also add some logic to fail out early on searching with invalid
>>> arguments.
>>>
>>> Fixes: 54a611b60590 ("Maple Tree: add new data structure")
>>> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
>>> ---
>>>    lib/maple_tree.c | 27 +++++++++++++--------------
>>>    1 file changed, 13 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
>>> index 4df6a0ce1c1b..ed350aa293b2 100644
>>> --- a/lib/maple_tree.c
>>> +++ b/lib/maple_tree.c
>>> @@ -4938,7 +4938,8 @@ static inline void *mas_prev_entry(struct ma_state *mas, unsigned long min)
>>>     * Return: True if found in a leaf, false otherwise.
>>>     *
>>>     */
>>> -static bool mas_rev_awalk(struct ma_state *mas, unsigned long size)
>>> +static bool mas_rev_awalk(struct ma_state *mas, unsigned long size,
>>> +		unsigned long *gap_min, unsigned long *gap_max)
>>>    {
>>>    	enum maple_type type = mte_node_type(mas->node);
>>>    	struct maple_node *node = mas_mn(mas);
>>> @@ -5003,8 +5004,8 @@ static bool mas_rev_awalk(struct ma_state *mas, unsigned long size)
>>>    	if (unlikely(ma_is_leaf(type))) {
>>>    		mas->offset = offset;
>>> -		mas->min = min;
>>> -		mas->max = min + gap - 1;
>>> +		*gap_min = min;
>>> +		*gap_max = min + gap - 1;
>>>    		return true;
>>>    	}
>>> @@ -5280,6 +5281,9 @@ int mas_empty_area(struct ma_state *mas, unsigned long min,
>>>    	unsigned long *pivots;
>>>    	enum maple_type mt;
>>> +	if (min >= max)
>> This can lead to errors, min == max is valid.
>> I think it's better to change it to this:
>> if (min > max || size == 0 || max - min < size - 1)
> I am not sure what it means to search within a range of one.  I guess
> you would expect it to just return that value if it's empty?
Yes, if min==max, and the value pointed to by this index is empty,
we should return it.
>
> In any case, since we are dealing with pages of data for the VMAs,
> having min == max really makes no sense, even with the subtraction of
> one in the caller to reduce the max, the min and max should be at least
> PAGE_SIZE - 1 apart here.
>
> I think you are right, and I think this needs to be looked at for the
> tree on its own, but I don't think it's a problem for the VMA user. I'll
> write a testcase and ensure a search for a single entry in a single
> entry window works separately.  Thanks for pointing this out.
Yes, it's a problem with the tree itself.
>
>>> +		return -EINVAL;
>>> +
>>>    	if (mas_is_start(mas))
>>>    		mas_start(mas);
>>>    	else if (mas->offset >= 2)
>>> @@ -5334,6 +5338,9 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
>>>    {
>>>    	struct maple_enode *last = mas->node;
>>> +	if (min >= max)
>> ditto.
> I'll do the search in both directions.
Here is also the same problem with the tree itself as above.
Maybe I didn't understand what you mean. I think the case of
min==max should still be considered in mas_empty_area_rev(),
because mas_empty_area_rev() is a separate interface.
>
>>> +		return -EINVAL;
>>> +
>>>    	if (mas_is_start(mas)) {
>>>    		mas_start(mas);
>>>    		mas->offset = mas_data_end(mas);
>>> @@ -5353,7 +5360,7 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
>>>    	mas->index = min;
>>>    	mas->last = max;
>>> -	while (!mas_rev_awalk(mas, size)) {
>>> +	while (!mas_rev_awalk(mas, size, &min, &max)) {
>>>    		if (last == mas->node) {
>>>    			if (!mas_rewind_node(mas))
>>>    				return -EBUSY;
>>> @@ -5368,17 +5375,9 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
>>>    	if (unlikely(mas->offset == MAPLE_NODE_SLOTS))
>>>    		return -EBUSY;
>>> -	/*
>>> -	 * mas_rev_awalk() has set mas->min and mas->max to the gap values.  If
>>> -	 * the maximum is outside the window we are searching, then use the last
>>> -	 * location in the search.
>>> -	 * mas->max and mas->min is the range of the gap.
>>> -	 * mas->index and mas->last are currently set to the search range.
>>> -	 */
>>> -
>>>    	/* Trim the upper limit to the max. */
>>> -	if (mas->max <= mas->last)
>>> -		mas->last = mas->max;
>>> +	if (max <= mas->last)
>>> +		mas->last = max;
>> We can get max as follows, without using pointers to track min, max in
>> mas_rev_awalk().
>>
>> mt = mte_node_type(mas->node); pivots = ma_pivots(mas_mn(mas), mt); max =
>> mas_logical_pivot(mas, pivots, mas->offset, mt);
> Yes, but why would we do this?  We have done all this work already in
> mas_rev_awalk(), and we have to do it there to get the offset in the
> first place.
Yes, both methods will work.
>
>> 	if (max < mas->last) /* The equal sign here can be removed */
> Thanks.  I'll keep this in mind when I revisit the function.  I don't
> want to re-spin the patch for this alone.
>
>> 		mas->last = max;
>>
>>>    	mas->index = mas->last - size + 1;
>>>    	return 0;

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-14 18:59   ` [PATCH v2] " Liam R. Howlett
  2023-04-14 19:09     ` Andrew Morton
@ 2023-04-29 14:32     ` Tad
  2023-04-30 22:41       ` Michael Keyes
  1 sibling, 1 reply; 20+ messages in thread
From: Tad @ 2023-04-29 14:32 UTC (permalink / raw)
  To: liam.howlett; +Cc: akpm, linux-kernel, linux-mm, rick.p.edgecombe

This reintroduces the issue described in
https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/

Linux 6.2.13 can no longer successfully run the mmap-test reproducer linked
there.

Linux 6.2.12 passes.

Regards,
Tad.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-29 14:32     ` Tad
@ 2023-04-30 22:41       ` Michael Keyes
  2023-05-02 14:08         ` Liam R. Howlett
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Keyes @ 2023-04-30 22:41 UTC (permalink / raw)
  To: Tad, liam.howlett; +Cc: akpm, linux-kernel, linux-mm, rick.p.edgecombe

On 29.04.23 15:32, Tad wrote:
> This reintroduces the issue described in
> https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/
Yes, I also ran into this (even though I'd somehow missed it the
previous time).

Apparently the issue arises at mm/mmap.c:1582, where low_limit is set to
vm_end_gap(tmp). Occasionally, this returns a 64-bit address (e.g.
0x7fedea581000), which is obviously greater than high_limit for a 32-bit
mmap, and causes the next call to mas_empty_area() to fail.

I'm not sure why vm_end_gap(tmp) occasionally returns a 64-bit address,
or if the best solution is to just check for this and skip the retry if
it occurs…

-- 
Michael



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-04-30 22:41       ` Michael Keyes
@ 2023-05-02 14:08         ` Liam R. Howlett
  2023-05-02 14:09           ` Liam R. Howlett
  0 siblings, 1 reply; 20+ messages in thread
From: Liam R. Howlett @ 2023-05-02 14:08 UTC (permalink / raw)
  To: Michael Keyes; +Cc: Tad, akpm, linux-kernel, linux-mm, rick.p.edgecombe

* Michael Keyes <mgkeyes@vigovproductions.net> [230430 18:41]:
> On 29.04.23 15:32, Tad wrote:
> > This reintroduces the issue described in
> > https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/
> Yes, I also ran into this (even though I'd somehow missed it the
> previous time).

Rick Edgecombe reported something similar [1].

This is probably to do with my stack guard checks I recently added.

> 
> Apparently the issue arises at mm/mmap.c:1582, where low_limit is set to
> vm_end_gap(tmp). Occasionally, this returns a 64-bit address (e.g.
> 0x7fedea581000), which is obviously greater than high_limit for a 32-bit
> mmap, and causes the next call to mas_empty_area() to fail.
> 
> I'm not sure why vm_end_gap(tmp) occasionally returns a 64-bit address,
> or if the best solution is to just check for this and skip the retry if
> it occurs…
> 

Thanks for the debugging.  I will look into it.

I am currently trying to revise how the iterators, prev/next deal with
shifting outside the requested limits.  I suspect it's something to do
with hitting the limit and what someone would assume the next operation
means.

[1] https://lore.kernel.org/linux-mm/32f156ba80010fd97dbaf0a0cdfc84366608624d.camel@intel.com/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-05-02 14:08         ` Liam R. Howlett
@ 2023-05-02 14:09           ` Liam R. Howlett
  2023-05-15  8:57             ` Juhyung Park
  0 siblings, 1 reply; 20+ messages in thread
From: Liam R. Howlett @ 2023-05-02 14:09 UTC (permalink / raw)
  To: Michael Keyes, Tad, akpm, linux-kernel, linux-mm, rick.p.edgecombe
  Cc: Edgecombe, Rick P

...Adding Rick to the Cc this time.

* Liam R. Howlett <Liam.Howlett@Oracle.com> [230502 10:08]:
> * Michael Keyes <mgkeyes@vigovproductions.net> [230430 18:41]:
> > On 29.04.23 15:32, Tad wrote:
> > > This reintroduces the issue described in
> > > https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/
> > Yes, I also ran into this (even though I'd somehow missed it the
> > previous time).
> 
> Rick Edgecombe reported something similar [1].
> 
> This is probably to do with my stack guard checks I recently added.
> 
> > 
> > Apparently the issue arises at mm/mmap.c:1582, where low_limit is set to
> > vm_end_gap(tmp). Occasionally, this returns a 64-bit address (e.g.
> > 0x7fedea581000), which is obviously greater than high_limit for a 32-bit
> > mmap, and causes the next call to mas_empty_area() to fail.
> > 
> > I'm not sure why vm_end_gap(tmp) occasionally returns a 64-bit address,
> > or if the best solution is to just check for this and skip the retry if
> > it occurs…
> > 
> 
> Thanks for the debugging.  I will look into it.
> 
> I am currently trying to revise how the iterators, prev/next deal with
> shifting outside the requested limits.  I suspect it's something to do
> with hitting the limit and what someone would assume the next operation
> means.
> 
> [1] https://lore.kernel.org/linux-mm/32f156ba80010fd97dbaf0a0cdfc84366608624d.camel@intel.com/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-05-02 14:09           ` Liam R. Howlett
@ 2023-05-15  8:57             ` Juhyung Park
  2023-05-15 14:36               ` Liam R. Howlett
  0 siblings, 1 reply; 20+ messages in thread
From: Juhyung Park @ 2023-05-15  8:57 UTC (permalink / raw)
  To: Liam R. Howlett, Michael Keyes, Tad, akpm, linux-kernel,
	linux-mm, rick.p.edgecombe

Hi Liam,

It's a bit hard to follow this particular issue on v6.1 as there are 
many email threads related to this.

I just wanted to ask if whether this is fixed on mainline and v6.1 
stable yet.

If there's a new thread tackling this issue, I'd appreciate it if you 
can link it here.

Thanks,
regards

On 5/2/23 23:09, Liam R. Howlett wrote:
> ...Adding Rick to the Cc this time.
> 
> * Liam R. Howlett <Liam.Howlett@Oracle.com> [230502 10:08]:
>> * Michael Keyes <mgkeyes@vigovproductions.net> [230430 18:41]:
>>> On 29.04.23 15:32, Tad wrote:
>>>> This reintroduces the issue described in
>>>> https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/
>>> Yes, I also ran into this (even though I'd somehow missed it the
>>> previous time).
>>
>> Rick Edgecombe reported something similar [1].
>>
>> This is probably to do with my stack guard checks I recently added.
>>
>>>
>>> Apparently the issue arises at mm/mmap.c:1582, where low_limit is set to
>>> vm_end_gap(tmp). Occasionally, this returns a 64-bit address (e.g.
>>> 0x7fedea581000), which is obviously greater than high_limit for a 32-bit
>>> mmap, and causes the next call to mas_empty_area() to fail.
>>>
>>> I'm not sure why vm_end_gap(tmp) occasionally returns a 64-bit address,
>>> or if the best solution is to just check for this and skip the retry if
>>> it occurs…
>>>
>>
>> Thanks for the debugging.  I will look into it.
>>
>> I am currently trying to revise how the iterators, prev/next deal with
>> shifting outside the requested limits.  I suspect it's something to do
>> with hitting the limit and what someone would assume the next operation
>> means.
>>
>> [1] https://lore.kernel.org/linux-mm/32f156ba80010fd97dbaf0a0cdfc84366608624d.camel@intel.com/
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] mm/mmap: Regression fix for unmapped_area{_topdown}
  2023-05-15  8:57             ` Juhyung Park
@ 2023-05-15 14:36               ` Liam R. Howlett
  0 siblings, 0 replies; 20+ messages in thread
From: Liam R. Howlett @ 2023-05-15 14:36 UTC (permalink / raw)
  To: Juhyung Park
  Cc: Michael Keyes, Tad, akpm, linux-kernel, linux-mm, rick.p.edgecombe

* Juhyung Park <qkrwngud825@gmail.com> [230515 04:57]:
> Hi Liam,
> 
> It's a bit hard to follow this particular issue on v6.1 as there are many
> email threads related to this.
> 
> I just wanted to ask if whether this is fixed on mainline and v6.1 stable
> yet.

Not yet.  It is in mm-unstable at this time.

> 
> If there's a new thread tackling this issue, I'd appreciate it if you can
> link it here.

https://lore.kernel.org/linux-mm/20230505145829.74574-1-zhangpeng.00@bytedance.com/


> 
> Thanks,
> regards
> 
> On 5/2/23 23:09, Liam R. Howlett wrote:
> > ...Adding Rick to the Cc this time.
> > 
> > * Liam R. Howlett <Liam.Howlett@Oracle.com> [230502 10:08]:
> > > * Michael Keyes <mgkeyes@vigovproductions.net> [230430 18:41]:
> > > > On 29.04.23 15:32, Tad wrote:
> > > > > This reintroduces the issue described in
> > > > > https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/
> > > > Yes, I also ran into this (even though I'd somehow missed it the
> > > > previous time).
> > > 
> > > Rick Edgecombe reported something similar [1].
> > > 
> > > This is probably to do with my stack guard checks I recently added.
> > > 
> > > > 
> > > > Apparently the issue arises at mm/mmap.c:1582, where low_limit is set to
> > > > vm_end_gap(tmp). Occasionally, this returns a 64-bit address (e.g.
> > > > 0x7fedea581000), which is obviously greater than high_limit for a 32-bit
> > > > mmap, and causes the next call to mas_empty_area() to fail.
> > > > 
> > > > I'm not sure why vm_end_gap(tmp) occasionally returns a 64-bit address,
> > > > or if the best solution is to just check for this and skip the retry if
> > > > it occurs…
> > > > 
> > > 
> > > Thanks for the debugging.  I will look into it.
> > > 
> > > I am currently trying to revise how the iterators, prev/next deal with
> > > shifting outside the requested limits.  I suspect it's something to do
> > > with hitting the limit and what someone would assume the next operation
> > > means.
> > > 
> > > [1] https://lore.kernel.org/linux-mm/32f156ba80010fd97dbaf0a0cdfc84366608624d.camel@intel.com/
> > 
> > 

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-05-15 14:36 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-14 14:57 [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev() Liam R. Howlett
2023-04-14 14:57 ` [PATCH 2/3] maple_tree: Fix mas_empty_area() search Liam R. Howlett
2023-04-14 14:57 ` [PATCH 3/3] mm/mmap: Regression fix for unmapped_area{_topdown} Liam R. Howlett
2023-04-14 16:27   ` Edgecombe, Rick P
2023-04-14 17:26     ` Liam R. Howlett
2023-04-14 17:29       ` Liam R. Howlett
2023-04-14 17:53         ` Edgecombe, Rick P
2023-04-14 18:07           ` Liam R. Howlett
2023-04-14 18:21             ` Edgecombe, Rick P
2023-04-14 18:59   ` [PATCH v2] " Liam R. Howlett
2023-04-14 19:09     ` Andrew Morton
2023-04-29 14:32     ` Tad
2023-04-30 22:41       ` Michael Keyes
2023-05-02 14:08         ` Liam R. Howlett
2023-05-02 14:09           ` Liam R. Howlett
2023-05-15  8:57             ` Juhyung Park
2023-05-15 14:36               ` Liam R. Howlett
2023-04-19  9:02 ` [PATCH 1/3] maple_tree: Make maple state reusable after mas_empty_area_rev() Peng Zhang
2023-04-19 22:54   ` Liam R. Howlett
2023-04-20  4:21     ` Peng Zhang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.