* [PATCH v2 0/4] further cleanup of vma_merge()
@ 2023-03-21 20:45 Lorenzo Stoakes
  2023-03-21 20:45 ` [PATCH v2 1/4] mm/mmap/vma_merge: further improve prev/next VMA naming Lorenzo Stoakes
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Lorenzo Stoakes @ 2023-03-21 20:45 UTC (permalink / raw)
  To: linux-mm, linux-kernel, Andrew Morton
  Cc: David Hildenbrand, Matthew Wilcox, Vlastimil Babka,
	Liam R . Howlett, maple-tree, Vernon Yang, Lorenzo Stoakes

Following on from Vlastimil Babka's patch series "cleanup vma_merge() and
improve mergeability tests" which was in turn based on Liam's prior
cleanups, this patch series introduces changes discussed in review of
Vlastimil's series and goes further in attempting to make the logic as
clear as possible.

Nearly all of this should have absolutely no functional impact; however, it
does add a single VM_WARN_ON() case.

With many thanks to Vernon for helping kick start the discussion around
simplification - abstract use of vma did indeed turn out not to be
necessary - and to Liam for his excellent suggestions which greatly
simplified things.

v2:
- Put the patch series on a serious diet, cut comments down to avoid
  bloat.
- Added clever use of find_vma_intersection() and vma_lookup() as suggested
  by Liam, which improved clarity and brevity significantly (a short sketch
  follows this list).
- Eliminated the use of a temporary vma local as suggested by Vernon; it
  does seem this was ultimately adding confusion, and Liam's suggestions
  removed the need for it.
- Moved around initial variables to be more sensible and to initialise each
  variable in one place where possible.
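
For reference, this is roughly the shape of the new lookup - a sketch only,
see patch 2 for the actual diff; next is shown defaulting to NULL here:

	struct vm_area_struct *curr, *next = NULL;

	/* Does the input range span an existing VMA? (cases 5 - 8) */
	curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);

	if (curr && end == curr->vm_end) {
		/* Look for a VMA immediately adjacent to curr (cases 6 - 8). */
		next = vma_lookup(mm, curr->vm_end);
	} else if (!curr) {
		/* No spanning VMA - look for one starting at end (cases 1 - 4). */
		next = vma_lookup(mm, end);
	}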

v1:
https://lore.kernel.org/all/cover.1679137163.git.lstoakes@gmail.com

Lorenzo Stoakes (4):
  mm/mmap/vma_merge: further improve prev/next VMA naming
  mm/mmap/vma_merge: set next to NULL if not applicable
  mm/mmap/vma_merge: extend invariants, avoid invalid res, vma
  mm/mmap/vma_merge: init cleanup, be explicit about the non-mergeable
    case

 mm/mmap.c | 155 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 80 insertions(+), 75 deletions(-)

--
2.39.2


* [PATCH v2 1/4] mm/mmap/vma_merge: further improve prev/next VMA naming
  2023-03-21 20:45 [PATCH v2 0/4] further cleanup of vma_merge() Lorenzo Stoakes
@ 2023-03-21 20:45 ` Lorenzo Stoakes
  2023-03-21 20:45 ` [PATCH v2 2/4] mm/mmap/vma_merge: set next to NULL if not applicable Lorenzo Stoakes
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Lorenzo Stoakes @ 2023-03-21 20:45 UTC (permalink / raw)
  To: linux-mm, linux-kernel, Andrew Morton
  Cc: David Hildenbrand, Matthew Wilcox, Vlastimil Babka,
	Liam R . Howlett, maple-tree, Vernon Yang, Lorenzo Stoakes

Previously, the ASCII diagram above vma_merge() and the accompanying
variable naming were rather confusing; however, recent efforts by Liam
Howlett and Vlastimil Babka have significantly improved matters.

This patch goes a little further - replacing 'X' with 'N' which feels a lot
more natural and replacing what was 'N' with 'C' which stands for
'concurrent' VMA.

No word quite describes a VMA whose start coincides with that of the input
span; however, 'concurrent', abbreviated to 'curr' (and which can also be
thought of as 'current'), fits intuition well alongside prev and next.

This has no functional impact.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
 mm/mmap.c | 86 +++++++++++++++++++++++++++----------------------------
 1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 042d22e63528..c9834364ac98 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -861,44 +861,44 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
  * this area are about to be changed to vm_flags - and the no-change
  * case has already been eliminated.
  *
- * The following mprotect cases have to be considered, where AAAA is
+ * The following mprotect cases have to be considered, where **** is
  * the area passed down from mprotect_fixup, never extending beyond one
- * vma, PPPP is the previous vma, NNNN is a vma that starts at the same
- * address as AAAA and is of the same or larger span, and XXXX the next
- * vma after AAAA:
+ * vma, PPPP is the previous vma, CCCC is a concurrent vma that starts
+ * at the same address as **** and is of the same or larger span, and
+ * NNNN the next vma after ****:
  *
- *     AAAA             AAAA                   AAAA
- *    PPPPPPXXXXXX    PPPPPPXXXXXX       PPPPPPNNNNNN
+ *     ****             ****                   ****
+ *    PPPPPPNNNNNN    PPPPPPNNNNNN       PPPPPPCCCCCC
  *    cannot merge    might become       might become
- *                    PPXXXXXXXXXX       PPPPPPPPPPNN
+ *                    PPNNNNNNNNNN       PPPPPPPPPPCC
  *    mmap, brk or    case 4 below       case 5 below
  *    mremap move:
- *                        AAAA               AAAA
- *                    PPPP    XXXX       PPPPNNNNXXXX
+ *                        ****               ****
+ *                    PPPP    NNNN       PPPPCCCCNNNN
  *                    might become       might become
  *                    PPPPPPPPPPPP 1 or  PPPPPPPPPPPP 6 or
- *                    PPPPPPPPXXXX 2 or  PPPPPPPPXXXX 7 or
- *                    PPPPXXXXXXXX 3     PPPPXXXXXXXX 8
+ *                    PPPPPPPPNNNN 2 or  PPPPPPPPNNNN 7 or
+ *                    PPPPNNNNNNNN 3     PPPPNNNNNNNN 8
  *
- * It is important for case 8 that the vma NNNN overlapping the
- * region AAAA is never going to extended over XXXX. Instead XXXX must
- * be extended in region AAAA and NNNN must be removed. This way in
+ * It is important for case 8 that the vma CCCC overlapping the
+ * region **** is never going to extended over NNNN. Instead NNNN must
+ * be extended in region **** and CCCC must be removed. This way in
  * all cases where vma_merge succeeds, the moment vma_merge drops the
  * rmap_locks, the properties of the merged vma will be already
  * correct for the whole merged range. Some of those properties like
  * vm_page_prot/vm_flags may be accessed by rmap_walks and they must
  * be correct for the whole merged range immediately after the
- * rmap_locks are released. Otherwise if XXXX would be removed and
- * NNNN would be extended over the XXXX range, remove_migration_ptes
+ * rmap_locks are released. Otherwise if NNNN would be removed and
+ * CCCC would be extended over the NNNN range, remove_migration_ptes
  * or other rmap walkers (if working on addresses beyond the "end"
- * parameter) may establish ptes with the wrong permissions of NNNN
- * instead of the right permissions of XXXX.
+ * parameter) may establish ptes with the wrong permissions of CCCC
+ * instead of the right permissions of NNNN.
  *
  * In the code below:
  * PPPP is represented by *prev
- * NNNN is represented by *mid or not represented at all (NULL)
- * XXXX is represented by *next or not represented at all (NULL)
- * AAAA is not represented - it will be merged and the vma containing the
+ * CCCC is represented by *curr or not represented at all (NULL)
+ * NNNN is represented by *next or not represented at all (NULL)
+ * **** is not represented - it will be merged and the vma containing the
  *      area is returned, or the function will return NULL
  */
 struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
@@ -911,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 {
 	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
 	pgoff_t vma_pgoff;
-	struct vm_area_struct *mid, *next, *res = NULL;
+	struct vm_area_struct *curr, *next, *res = NULL;
 	struct vm_area_struct *vma, *adjust, *remove, *remove2;
 	int err = -1;
 	bool merge_prev = false;
@@ -930,19 +930,19 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	if (vm_flags & VM_SPECIAL)
 		return NULL;
 
-	mid = find_vma(mm, prev ? prev->vm_end : 0);
-	if (mid && mid->vm_end == end)			/* cases 6, 7, 8 */
-		next = find_vma(mm, mid->vm_end);
+	curr = find_vma(mm, prev ? prev->vm_end : 0);
+	if (curr && curr->vm_end == end)		/* cases 6, 7, 8 */
+		next = find_vma(mm, curr->vm_end);
 	else
-		next = mid;
+		next = curr;
 
-	/* In cases 1 - 4 there's no NNNN vma */
-	if (mid && end <= mid->vm_start)
-		mid = NULL;
+	/* In cases 1 - 4 there's no CCCC vma */
+	if (curr && end <= curr->vm_start)
+		curr = NULL;
 
 	/* verify some invariant that must be enforced by the caller */
 	VM_WARN_ON(prev && addr <= prev->vm_start);
-	VM_WARN_ON(mid && end > mid->vm_end);
+	VM_WARN_ON(curr && end > curr->vm_end);
 	VM_WARN_ON(addr >= end);
 
 	if (prev) {
@@ -974,21 +974,21 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 		remove = next;				/* case 1 */
 		vma_end = next->vm_end;
 		err = dup_anon_vma(prev, next);
-		if (mid) {				/* case 6 */
-			remove = mid;
+		if (curr) {				/* case 6 */
+			remove = curr;
 			remove2 = next;
 			if (!next->anon_vma)
-				err = dup_anon_vma(prev, mid);
+				err = dup_anon_vma(prev, curr);
 		}
 	} else if (merge_prev) {
 		err = 0;				/* case 2 */
-		if (mid) {
-			err = dup_anon_vma(prev, mid);
-			if (end == mid->vm_end) {	/* case 7 */
-				remove = mid;
+		if (curr) {
+			err = dup_anon_vma(prev, curr);
+			if (end == curr->vm_end) {	/* case 7 */
+				remove = curr;
 			} else {			/* case 5 */
-				adjust = mid;
-				adj_start = (end - mid->vm_start);
+				adjust = curr;
+				adj_start = (end - curr->vm_start);
 			}
 		}
 	} else if (merge_next) {
@@ -1004,10 +1004,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 			vma_end = next->vm_end;
 			vma_pgoff = next->vm_pgoff;
 			err = 0;
-			if (mid) {			/* case 8 */
-				vma_pgoff = mid->vm_pgoff;
-				remove = mid;
-				err = dup_anon_vma(next, mid);
+			if (curr) {			/* case 8 */
+				vma_pgoff = curr->vm_pgoff;
+				remove = curr;
+				err = dup_anon_vma(next, curr);
 			}
 		}
 	}
-- 
2.39.2



* [PATCH v2 2/4] mm/mmap/vma_merge: set next to NULL if not applicable
  2023-03-21 20:45 [PATCH v2 0/4] further cleanup of vma_merge() Lorenzo Stoakes
  2023-03-21 20:45 ` [PATCH v2 1/4] mm/mmap/vma_merge: further improve prev/next VMA naming Lorenzo Stoakes
@ 2023-03-21 20:45 ` Lorenzo Stoakes
  2023-03-22  1:42   ` Liam R. Howlett
  2023-03-21 20:45 ` [PATCH v2 3/4] mm/mmap/vma_merge: extend invariants, avoid invalid res, vma Lorenzo Stoakes
  2023-03-21 20:45 ` [PATCH v2 4/4] mm/mmap/vma_merge: init cleanup, be explicit about the non-mergeable case Lorenzo Stoakes
  3 siblings, 1 reply; 13+ messages in thread
From: Lorenzo Stoakes @ 2023-03-21 20:45 UTC (permalink / raw)
  To: linux-mm, linux-kernel, Andrew Morton
  Cc: David Hildenbrand, Matthew Wilcox, Vlastimil Babka,
	Liam R . Howlett, maple-tree, Vernon Yang, Lorenzo Stoakes

We are only interested in next if end == next->vm_start (in which case we
check to see if we can set merge_next), so perform this check alongside
checking whether curr should be set.

This groups all of the simple range checks together and establishes the
invariant that, if prev, curr or next are non-NULL then their positions are
as expected.

This has no functional impact.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
 mm/mmap.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index c9834364ac98..6361baf75601 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -930,15 +930,15 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	if (vm_flags & VM_SPECIAL)
 		return NULL;
 
-	curr = find_vma(mm, prev ? prev->vm_end : 0);
-	if (curr && curr->vm_end == end)		/* cases 6, 7, 8 */
-		next = find_vma(mm, curr->vm_end);
-	else
-		next = curr;
+	/* Does the input range span an existing VMA? (cases 5 - 8) */
+	curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);
 
-	/* In cases 1 - 4 there's no CCCC vma */
-	if (curr && end <= curr->vm_start)
-		curr = NULL;
+	if (curr && end == curr->vm_end)
+		/* Is there is a VMA immediately adjacent (cases 6 - 8)? */
+		next = vma_lookup(mm, curr->vm_end);
+	else if (!curr)
+		/* Is there a VMA next to a hole (case 1 - 3) or prev (4)? */
+		next = vma_lookup(mm, end);
 
 	/* verify some invariant that must be enforced by the caller */
 	VM_WARN_ON(prev && addr <= prev->vm_start);
@@ -959,11 +959,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 		}
 	}
 	/* Can we merge the successor? */
-	if (next && end == next->vm_start &&
-			mpol_equal(policy, vma_policy(next)) &&
-			can_vma_merge_before(next, vm_flags,
-					     anon_vma, file, pgoff+pglen,
-					     vm_userfaultfd_ctx, anon_name)) {
+	if (next && mpol_equal(policy, vma_policy(next)) &&
+	    can_vma_merge_before(next, vm_flags,
+				 anon_vma, file, pgoff+pglen,
+				 vm_userfaultfd_ctx, anon_name)) {
 		merge_next = true;
 	}
 
-- 
2.39.2



* [PATCH v2 3/4] mm/mmap/vma_merge: extend invariants, avoid invalid res, vma
  2023-03-21 20:45 [PATCH v2 0/4] further cleanup of vma_merge() Lorenzo Stoakes
  2023-03-21 20:45 ` [PATCH v2 1/4] mm/mmap/vma_merge: further improve prev/next VMA naming Lorenzo Stoakes
  2023-03-21 20:45 ` [PATCH v2 2/4] mm/mmap/vma_merge: set next to NULL if not applicable Lorenzo Stoakes
@ 2023-03-21 20:45 ` Lorenzo Stoakes
  2023-03-22  9:16   ` Vlastimil Babka
       [not found]   ` <CGME20230323170046eucas1p2483ab0fcc3d6bc56d4b6d09143bbadda@eucas1p2.samsung.com>
  2023-03-21 20:45 ` [PATCH v2 4/4] mm/mmap/vma_merge: init cleanup, be explicit about the non-mergeable case Lorenzo Stoakes
  3 siblings, 2 replies; 13+ messages in thread
From: Lorenzo Stoakes @ 2023-03-21 20:45 UTC (permalink / raw)
  To: linux-mm, linux-kernel, Andrew Morton
  Cc: David Hildenbrand, Matthew Wilcox, Vlastimil Babka,
	Liam R . Howlett, maple-tree, Vernon Yang, Lorenzo Stoakes

Previously, vma was an uninitialised variable which was only definitely
assigned as a result of the logic covering all possible input cases - for
it to have remained uninitialised, prev would have to be NULL, and next
would _have_ to be mergeable.

We now reuse vma to assign curr and next, so to be absolutely explicit,
ensure this variable is _always_ assigned, and while we're at it remove the
redundant assignment of both res and vma (if prev is NULL then we simply
assign to NULL).

In addition, we absolutely do rely on addr == curr->vm_start should curr
exist, so assert as much.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
 mm/mmap.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 6361baf75601..7aec49c3bc74 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -911,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 {
 	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
 	pgoff_t vma_pgoff;
-	struct vm_area_struct *curr, *next, *res = NULL;
+	struct vm_area_struct *curr, *next, *res;
 	struct vm_area_struct *vma, *adjust, *remove, *remove2;
 	int err = -1;
 	bool merge_prev = false;
@@ -940,14 +940,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 		/* Is there a VMA next to a hole (case 1 - 3) or prev (4)? */
 		next = vma_lookup(mm, end);
 
-	/* verify some invariant that must be enforced by the caller */
+	/*
+	 * By default, we return prev. Cases 3, 4, 8 will instead return next
+	 * and cases 3, 8 will also update vma to point at next.
+	 */
+	res = vma = prev;
+
+	/* Verify some invariant that must be enforced by the caller. */
 	VM_WARN_ON(prev && addr <= prev->vm_start);
-	VM_WARN_ON(curr && end > curr->vm_end);
+	VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
 	VM_WARN_ON(addr >= end);
 
 	if (prev) {
-		res = prev;
-		vma = prev;
 		vma_start = prev->vm_start;
 		vma_pgoff = prev->vm_pgoff;
 		/* Can we merge the predecessor? */
@@ -958,6 +962,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 			vma_prev(vmi);
 		}
 	}
+
 	/* Can we merge the successor? */
 	if (next && mpol_equal(policy, vma_policy(next)) &&
 	    can_vma_merge_before(next, vm_flags,
@@ -998,6 +1003,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 			adj_start = -(prev->vm_end - addr);
 			err = dup_anon_vma(next, prev);
 		} else {
+			/*
+			 * Note that cases 3 and 8 are the ONLY ones where prev
+			 * is permitted to be (but is not necessarily) NULL.
+			 */
 			vma = next;			/* case 3 */
 			vma_start = addr;
 			vma_end = next->vm_end;
-- 
2.39.2



* [PATCH v2 4/4] mm/mmap/vma_merge: init cleanup, be explicit about the non-mergeable case
  2023-03-21 20:45 [PATCH v2 0/4] further cleanup of vma_merge() Lorenzo Stoakes
                   ` (2 preceding siblings ...)
  2023-03-21 20:45 ` [PATCH v2 3/4] mm/mmap/vma_merge: extend invariants, avoid invalid res, vma Lorenzo Stoakes
@ 2023-03-21 20:45 ` Lorenzo Stoakes
  2023-03-22  2:08   ` Liam R. Howlett
  3 siblings, 1 reply; 13+ messages in thread
From: Lorenzo Stoakes @ 2023-03-21 20:45 UTC (permalink / raw)
  To: linux-mm, linux-kernel, Andrew Morton
  Cc: David Hildenbrand, Matthew Wilcox, Vlastimil Babka,
	Liam R . Howlett, maple-tree, Vernon Yang, Lorenzo Stoakes

Reorder the initial variables sensibly and set vma_start and vm_pgoff there
rather than later so all initial values are set at the same time meaning we
don't have to set these later.

Rather than setting err = -1 and only resetting if we hit merge cases,
explicitly check the non-mergeable case to make it abundantly clear that we
only proceed with the rest if something is mergeable, default err to 0 and
only update if an error might occur.

Move the merge_prev, merge_next cases closer to the logic determining curr,
next.

This has no functional impact.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
 mm/mmap.c | 55 ++++++++++++++++++++++++++-----------------------------
 1 file changed, 26 insertions(+), 29 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 7aec49c3bc74..d60cb0b7ae15 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -909,18 +909,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 			struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
 			struct anon_vma_name *anon_name)
 {
-	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
-	pgoff_t vma_pgoff;
 	struct vm_area_struct *curr, *next, *res;
 	struct vm_area_struct *vma, *adjust, *remove, *remove2;
-	int err = -1;
+	struct vma_prepare vp;
+	int err = 0;
 	bool merge_prev = false;
 	bool merge_next = false;
 	bool vma_expanded = false;
-	struct vma_prepare vp;
+	unsigned long vma_start = prev ? prev->vm_start : addr;
 	unsigned long vma_end = end;
+	pgoff_t vma_pgoff = prev ? prev->vm_pgoff : 0;
+	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
 	long adj_start = 0;
-	unsigned long vma_start = addr;
 
 	validate_mm(mm);
 	/*
@@ -940,6 +940,23 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 		/* Is there a VMA next to a hole (case 1 - 3) or prev (4)? */
 		next = vma_lookup(mm, end);
 
+	/* Can we merge the predecessor? */
+	if (prev && addr == prev->vm_end && mpol_equal(vma_policy(prev), policy)
+	    && can_vma_merge_after(prev, vm_flags, anon_vma, file,
+				   pgoff, vm_userfaultfd_ctx, anon_name)) {
+		merge_prev = true;
+		vma_prev(vmi);
+	}
+
+	/* Can we merge the successor? */
+	merge_next = next && mpol_equal(policy, vma_policy(next)) &&
+		can_vma_merge_before(next, vm_flags,
+				     anon_vma, file, pgoff+pglen,
+				     vm_userfaultfd_ctx, anon_name);
+
+	if (!merge_prev && !merge_next)
+		return NULL; /* Not mergeable. */
+
 	/*
 	 * By default, we return prev. Cases 3, 4, 8 will instead return next
 	 * and cases 3, 8 will also update vma to point at next.
@@ -951,26 +968,6 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
 	VM_WARN_ON(addr >= end);
 
-	if (prev) {
-		vma_start = prev->vm_start;
-		vma_pgoff = prev->vm_pgoff;
-		/* Can we merge the predecessor? */
-		if (prev->vm_end == addr && mpol_equal(vma_policy(prev), policy)
-		    && can_vma_merge_after(prev, vm_flags, anon_vma, file,
-				   pgoff, vm_userfaultfd_ctx, anon_name)) {
-			merge_prev = true;
-			vma_prev(vmi);
-		}
-	}
-
-	/* Can we merge the successor? */
-	if (next && mpol_equal(policy, vma_policy(next)) &&
-	    can_vma_merge_before(next, vm_flags,
-				 anon_vma, file, pgoff+pglen,
-				 vm_userfaultfd_ctx, anon_name)) {
-		merge_next = true;
-	}
-
 	remove = remove2 = adjust = NULL;
 	/* Can we merge both the predecessor and the successor? */
 	if (merge_prev && merge_next &&
@@ -985,7 +982,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 				err = dup_anon_vma(prev, curr);
 		}
 	} else if (merge_prev) {
-		err = 0;				/* case 2 */
+							/* case 2 */
 		if (curr) {
 			err = dup_anon_vma(prev, curr);
 			if (end == curr->vm_end) {	/* case 7 */
@@ -995,7 +992,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 				adj_start = (end - curr->vm_start);
 			}
 		}
-	} else if (merge_next) {
+	} else { /* merge_next */
 		res = next;
 		if (prev && addr < prev->vm_end) {	/* case 4 */
 			vma_end = addr;
@@ -1011,7 +1008,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 			vma_start = addr;
 			vma_end = next->vm_end;
 			vma_pgoff = next->vm_pgoff;
-			err = 0;
+
 			if (curr) {			/* case 8 */
 				vma_pgoff = curr->vm_pgoff;
 				remove = curr;
@@ -1020,7 +1017,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 		}
 	}
 
-	/* Cannot merge or error in anon_vma clone */
+	/* Error in anon_vma clone. */
 	if (err)
 		return NULL;
 
-- 
2.39.2



* Re: [PATCH v2 2/4] mm/mmap/vma_merge: set next to NULL if not applicable
  2023-03-21 20:45 ` [PATCH v2 2/4] mm/mmap/vma_merge: set next to NULL if not applicable Lorenzo Stoakes
@ 2023-03-22  1:42   ` Liam R. Howlett
  2023-03-22  6:23     ` Lorenzo Stoakes
  0 siblings, 1 reply; 13+ messages in thread
From: Liam R. Howlett @ 2023-03-22  1:42 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-mm, linux-kernel, Andrew Morton, David Hildenbrand,
	Matthew Wilcox, Vlastimil Babka, maple-tree, Vernon Yang

* Lorenzo Stoakes <lstoakes@gmail.com> [230321 16:51]:
> We are only interested in next if end == next->vm_start (in which case we
> check to see if we can set merge_next), so perform this check alongside
> checking whether curr should be set.
> 
> This groups all of the simple range checks together and establishes the
> invariant that, if prev, curr or next are non-NULL then their positions are
> as expected.
> 
> This has no functional impact.
> 
> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> ---
>  mm/mmap.c | 25 ++++++++++++-------------
>  1 file changed, 12 insertions(+), 13 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index c9834364ac98..6361baf75601 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -930,15 +930,15 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  	if (vm_flags & VM_SPECIAL)
>  		return NULL;
>  
> -	curr = find_vma(mm, prev ? prev->vm_end : 0);
> -	if (curr && curr->vm_end == end)		/* cases 6, 7, 8 */
> -		next = find_vma(mm, curr->vm_end);
> -	else
> -		next = curr;
> +	/* Does the input range span an existing VMA? (cases 5 - 8) */
> +	curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);
>  
> -	/* In cases 1 - 4 there's no CCCC vma */
> -	if (curr && end <= curr->vm_start)
> -		curr = NULL;
> +	if (curr && end == curr->vm_end)
> +		/* Is there is a VMA immediately adjacent (cases 6 - 8)? */
> +		next = vma_lookup(mm, curr->vm_end);

Since end == curr->vm_end, this lookup is the same as below so these two
statements can be combined.

I also think you may need to initialize next to null since it may not be
set for the 'cannot merge' case.

Something like:
	if ((!curr) ||		    /* case 1-4 */
	     (end == curr->vm_end)) /* Case 6-8, adjacent vma */
		next = vma_lookup(mm, end);
	else
		next = NULL


> +	else if (!curr)
> +		/* Is there a VMA next to a hole (case 1 - 3) or prev (4)? */
> +		next = vma_lookup(mm, end);

Nit, can we have braces for comments that make the if/else look like
it's unguarded?

>  
>  	/* verify some invariant that must be enforced by the caller */
>  	VM_WARN_ON(prev && addr <= prev->vm_start);
> @@ -959,11 +959,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  		}
>  	}
>  	/* Can we merge the successor? */
> -	if (next && end == next->vm_start &&
> -			mpol_equal(policy, vma_policy(next)) &&
> -			can_vma_merge_before(next, vm_flags,
> -					     anon_vma, file, pgoff+pglen,
> -					     vm_userfaultfd_ctx, anon_name)) {
> +	if (next && mpol_equal(policy, vma_policy(next)) &&
> +	    can_vma_merge_before(next, vm_flags,
> +				 anon_vma, file, pgoff+pglen,
> +				 vm_userfaultfd_ctx, anon_name)) {
>  		merge_next = true;
>  	}
>  
> -- 
> 2.39.2
> 


* Re: [PATCH v2 4/4] mm/mmap/vma_merge: init cleanup, be explicit about the non-mergeable case
  2023-03-21 20:45 ` [PATCH v2 4/4] mm/mmap/vma_merge: init cleanup, be explicit about the non-mergeable case Lorenzo Stoakes
@ 2023-03-22  2:08   ` Liam R. Howlett
  2023-03-22  6:07     ` Lorenzo Stoakes
  0 siblings, 1 reply; 13+ messages in thread
From: Liam R. Howlett @ 2023-03-22  2:08 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-mm, linux-kernel, Andrew Morton, David Hildenbrand,
	Matthew Wilcox, Vlastimil Babka, maple-tree, Vernon Yang

* Lorenzo Stoakes <lstoakes@gmail.com> [230321 16:46]:
> Reorder the initial variables sensibly and set vma_start and vm_pgoff there
								^vma_pgoff
	Indicating it is used for the vm_area_struct *vma

> rather than later so all initial values are set at the same time meaning we
> don't have to set these later.

I did these later to reduce the number of times we were checking prev.
With this patch, we check prev 3 times, but before we were checking it
once.  The compiler might do something clever here to reduce the
checking?

I'm also not sure adding the conditional operator in the init helps your
goal of cleaning it up.
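
For comparison, the two forms side by side (both lifted from the diff below):

	/* with this patch - prev is tested in each initialiser and again here */
	unsigned long vma_start = prev ? prev->vm_start : addr;
	pgoff_t vma_pgoff = prev ? prev->vm_pgoff : 0;

	if (prev && addr == prev->vm_end && mpol_equal(vma_policy(prev), policy)
	    && can_vma_merge_after(prev, vm_flags, anon_vma, file,
				   pgoff, vm_userfaultfd_ctx, anon_name)) {
		merge_prev = true;
		vma_prev(vmi);
	}

	/* previously - prev is tested once, the initialisers sit inside the branch */
	if (prev) {
		vma_start = prev->vm_start;
		vma_pgoff = prev->vm_pgoff;
		if (prev->vm_end == addr && mpol_equal(vma_policy(prev), policy)
		    && can_vma_merge_after(prev, vm_flags, anon_vma, file,
				   pgoff, vm_userfaultfd_ctx, anon_name)) {
			merge_prev = true;
			vma_prev(vmi);
		}
	}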

> 
> Rather than setting err = -1 and only resetting if we hit merge cases,
> explicitly check the non-mergeable case to make it abundantly clear that we
> only proceed with the rest if something is mergeable, default err to 0 and
> only update if an error might occur.
> 
> Move the merge_prev, merge_next cases closer to the logic determining curr,
> next.
> 
> This has no functional impact.
> 
> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> ---
>  mm/mmap.c | 55 ++++++++++++++++++++++++++-----------------------------
>  1 file changed, 26 insertions(+), 29 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 7aec49c3bc74..d60cb0b7ae15 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -909,18 +909,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  			struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
>  			struct anon_vma_name *anon_name)
>  {
> -	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
> -	pgoff_t vma_pgoff;
>  	struct vm_area_struct *curr, *next, *res;
>  	struct vm_area_struct *vma, *adjust, *remove, *remove2;
> -	int err = -1;
> +	struct vma_prepare vp;
> +	int err = 0;
>  	bool merge_prev = false;
>  	bool merge_next = false;
>  	bool vma_expanded = false;
> -	struct vma_prepare vp;
> +	unsigned long vma_start = prev ? prev->vm_start : addr;
>  	unsigned long vma_end = end;
> +	pgoff_t vma_pgoff = prev ? prev->vm_pgoff : 0;
> +	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
>  	long adj_start = 0;
> -	unsigned long vma_start = addr;
>  
>  	validate_mm(mm);
>  	/*
> @@ -940,6 +940,23 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  		/* Is there a VMA next to a hole (case 1 - 3) or prev (4)? */
>  		next = vma_lookup(mm, end);
>  
> +	/* Can we merge the predecessor? */
> +	if (prev && addr == prev->vm_end && mpol_equal(vma_policy(prev), policy)
> +	    && can_vma_merge_after(prev, vm_flags, anon_vma, file,
> +				   pgoff, vm_userfaultfd_ctx, anon_name)) {
> +		merge_prev = true;
> +		vma_prev(vmi);
> +	}
> +
> +	/* Can we merge the successor? */
> +	merge_next = next && mpol_equal(policy, vma_policy(next)) &&
> +		can_vma_merge_before(next, vm_flags,
> +				     anon_vma, file, pgoff+pglen,
> +				     vm_userfaultfd_ctx, anon_name);
> +
> +	if (!merge_prev && !merge_next)
> +		return NULL; /* Not mergeable. */
> +
>  	/*
>  	 * By default, we return prev. Cases 3, 4, 8 will instead return next
>  	 * and cases 3, 8 will also update vma to point at next.
> @@ -951,26 +968,6 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  	VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
>  	VM_WARN_ON(addr >= end);
>  
> -	if (prev) {
> -		vma_start = prev->vm_start;
> -		vma_pgoff = prev->vm_pgoff;
> -		/* Can we merge the predecessor? */
> -		if (prev->vm_end == addr && mpol_equal(vma_policy(prev), policy)
> -		    && can_vma_merge_after(prev, vm_flags, anon_vma, file,
> -				   pgoff, vm_userfaultfd_ctx, anon_name)) {
> -			merge_prev = true;
> -			vma_prev(vmi);
> -		}
> -	}
> -
> -	/* Can we merge the successor? */
> -	if (next && mpol_equal(policy, vma_policy(next)) &&
> -	    can_vma_merge_before(next, vm_flags,
> -				 anon_vma, file, pgoff+pglen,
> -				 vm_userfaultfd_ctx, anon_name)) {
> -		merge_next = true;
> -	}
> -
>  	remove = remove2 = adjust = NULL;
>  	/* Can we merge both the predecessor and the successor? */
>  	if (merge_prev && merge_next &&
> @@ -985,7 +982,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  				err = dup_anon_vma(prev, curr);
>  		}
>  	} else if (merge_prev) {
> -		err = 0;				/* case 2 */
> +							/* case 2 */
>  		if (curr) {
>  			err = dup_anon_vma(prev, curr);
>  			if (end == curr->vm_end) {	/* case 7 */
> @@ -995,7 +992,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  				adj_start = (end - curr->vm_start);
>  			}
>  		}
> -	} else if (merge_next) {
> +	} else { /* merge_next */
>  		res = next;
>  		if (prev && addr < prev->vm_end) {	/* case 4 */
>  			vma_end = addr;
> @@ -1011,7 +1008,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  			vma_start = addr;
>  			vma_end = next->vm_end;
>  			vma_pgoff = next->vm_pgoff;
> -			err = 0;
> +

Was this blank line intentional?  I assume so, to give a gap for the
comment below?  It's probably worth having.

>  			if (curr) {			/* case 8 */
>  				vma_pgoff = curr->vm_pgoff;
>  				remove = curr;
> @@ -1020,7 +1017,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  		}
>  	}
>  
> -	/* Cannot merge or error in anon_vma clone */
> +	/* Error in anon_vma clone. */
>  	if (err)
>  		return NULL;
>  
> -- 
> 2.39.2
> 
> 


* Re: [PATCH v2 4/4] mm/mmap/vma_merge: init cleanup, be explicit about the non-mergeable case
  2023-03-22  2:08   ` Liam R. Howlett
@ 2023-03-22  6:07     ` Lorenzo Stoakes
  0 siblings, 0 replies; 13+ messages in thread
From: Lorenzo Stoakes @ 2023-03-22  6:07 UTC (permalink / raw)
  To: Liam R. Howlett, linux-mm, linux-kernel, Andrew Morton,
	David Hildenbrand, Matthew Wilcox, Vlastimil Babka, maple-tree,
	Vernon Yang

On Tue, Mar 21, 2023 at 10:08:58PM -0400, Liam R. Howlett wrote:
> * Lorenzo Stoakes <lstoakes@gmail.com> [230321 16:46]:
> > Reorder the initial variables sensibly and set vma_start and vm_pgoff there
> 								^vma_pgoff
> 	Indicating it is used for the vm_area_struct *vma
>
> > rather than later so all initial values are set at the same time meaning we
> > don't have to set these later.
>
> I did these later to reduce the number of times we were checking prev.
> With this patch, we check prev 3 times, but before we were checking it
> once.  The compiler might do something clever here to reduce the
> checking?
>

Apologies for undoing your work! Which obviously wasn't my intention :)

I suspect the compiler would, but we probably shouldn't rely on it.

> I'm also not sure adding the conditional operator in the init helps your
> goal of cleaning it up.

The purpose was to group everything together, reduce indentation in the
prev case, have it resemble the next case more closely, reduce LoC and to
have the prev if block be solely concerned with merging the predecessor
rather than both setting these values and then checking to see if we can
merge the predecessor.

However, on second thoughts I think avoiding repeatedly checking it trumps
that so I will revert to the previous approach.

>
> >
> > Rather than setting err = -1 and only resetting if we hit merge cases,
> > explicitly check the non-mergeable case to make it abundantly clear that we
> > only proceed with the rest if something is mergeable, default err to 0 and
> > only update if an error might occur.
> >
> > Move the merge_prev, merge_next cases closer to the logic determining curr,
> > next.
> >
> > This has no functional impact.
> >
> > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> > ---
> >  mm/mmap.c | 55 ++++++++++++++++++++++++++-----------------------------
> >  1 file changed, 26 insertions(+), 29 deletions(-)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 7aec49c3bc74..d60cb0b7ae15 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -909,18 +909,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  			struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
> >  			struct anon_vma_name *anon_name)
> >  {
> > -	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
> > -	pgoff_t vma_pgoff;
> >  	struct vm_area_struct *curr, *next, *res;
> >  	struct vm_area_struct *vma, *adjust, *remove, *remove2;
> > -	int err = -1;
> > +	struct vma_prepare vp;
> > +	int err = 0;
> >  	bool merge_prev = false;
> >  	bool merge_next = false;
> >  	bool vma_expanded = false;
> > -	struct vma_prepare vp;
> > +	unsigned long vma_start = prev ? prev->vm_start : addr;
> >  	unsigned long vma_end = end;
> > +	pgoff_t vma_pgoff = prev ? prev->vm_pgoff : 0;
> > +	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
> >  	long adj_start = 0;
> > -	unsigned long vma_start = addr;
> >
> >  	validate_mm(mm);
> >  	/*
> > @@ -940,6 +940,23 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  		/* Is there a VMA next to a hole (case 1 - 3) or prev (4)? */
> >  		next = vma_lookup(mm, end);
> >
> > +	/* Can we merge the predecessor? */
> > +	if (prev && addr == prev->vm_end && mpol_equal(vma_policy(prev), policy)
> > +	    && can_vma_merge_after(prev, vm_flags, anon_vma, file,
> > +				   pgoff, vm_userfaultfd_ctx, anon_name)) {
> > +		merge_prev = true;
> > +		vma_prev(vmi);
> > +	}
> > +
> > +	/* Can we merge the successor? */
> > +	merge_next = next && mpol_equal(policy, vma_policy(next)) &&
> > +		can_vma_merge_before(next, vm_flags,
> > +				     anon_vma, file, pgoff+pglen,
> > +				     vm_userfaultfd_ctx, anon_name);
> > +
> > +	if (!merge_prev && !merge_next)
> > +		return NULL; /* Not mergeable. */
> > +
> >  	/*
> >  	 * By default, we return prev. Cases 3, 4, 8 will instead return next
> >  	 * and cases 3, 8 will also update vma to point at next.
> > @@ -951,26 +968,6 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  	VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
> >  	VM_WARN_ON(addr >= end);
> >
> > -	if (prev) {
> > -		vma_start = prev->vm_start;
> > -		vma_pgoff = prev->vm_pgoff;
> > -		/* Can we merge the predecessor? */
> > -		if (prev->vm_end == addr && mpol_equal(vma_policy(prev), policy)
> > -		    && can_vma_merge_after(prev, vm_flags, anon_vma, file,
> > -				   pgoff, vm_userfaultfd_ctx, anon_name)) {
> > -			merge_prev = true;
> > -			vma_prev(vmi);
> > -		}
> > -	}
> > -
> > -	/* Can we merge the successor? */
> > -	if (next && mpol_equal(policy, vma_policy(next)) &&
> > -	    can_vma_merge_before(next, vm_flags,
> > -				 anon_vma, file, pgoff+pglen,
> > -				 vm_userfaultfd_ctx, anon_name)) {
> > -		merge_next = true;
> > -	}
> > -
> >  	remove = remove2 = adjust = NULL;
> >  	/* Can we merge both the predecessor and the successor? */
> >  	if (merge_prev && merge_next &&
> > @@ -985,7 +982,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  				err = dup_anon_vma(prev, curr);
> >  		}
> >  	} else if (merge_prev) {
> > -		err = 0;				/* case 2 */
> > +							/* case 2 */
> >  		if (curr) {
> >  			err = dup_anon_vma(prev, curr);
> >  			if (end == curr->vm_end) {	/* case 7 */
> > @@ -995,7 +992,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  				adj_start = (end - curr->vm_start);
> >  			}
> >  		}
> > -	} else if (merge_next) {
> > +	} else { /* merge_next */
> >  		res = next;
> >  		if (prev && addr < prev->vm_end) {	/* case 4 */
> >  			vma_end = addr;
> > @@ -1011,7 +1008,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  			vma_start = addr;
> >  			vma_end = next->vm_end;
> >  			vma_pgoff = next->vm_pgoff;
> > -			err = 0;
> > +
>
> Was this blank line intentional?  I assume so, to give a gap for the
> comment below?  It's probably worth having.
>
> >  			if (curr) {			/* case 8 */
> >  				vma_pgoff = curr->vm_pgoff;
> >  				remove = curr;
> > @@ -1020,7 +1017,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  		}
> >  	}
> >
> > -	/* Cannot merge or error in anon_vma clone */
> > +	/* Error in anon_vma clone. */
> >  	if (err)
> >  		return NULL;
> >
> > --
> > 2.39.2
> >
> >


* Re: [PATCH v2 2/4] mm/mmap/vma_merge: set next to NULL if not applicable
  2023-03-22  1:42   ` Liam R. Howlett
@ 2023-03-22  6:23     ` Lorenzo Stoakes
  0 siblings, 0 replies; 13+ messages in thread
From: Lorenzo Stoakes @ 2023-03-22  6:23 UTC (permalink / raw)
  To: Liam R. Howlett, linux-mm, linux-kernel, Andrew Morton,
	David Hildenbrand, Matthew Wilcox, Vlastimil Babka, maple-tree,
	Vernon Yang

On Tue, Mar 21, 2023 at 09:42:32PM -0400, Liam R. Howlett wrote:
> * Lorenzo Stoakes <lstoakes@gmail.com> [230321 16:51]:
> > We are only interested in next if end == next->vm_start (in which case we
> > check to see if we can set merge_next), so perform this check alongside
> > checking whether curr should be set.
> >
> > This groups all of the simple range checks together and establishes the
> > invariant that, if prev, curr or next are non-NULL then their positions are
> > as expected.
> >
> > This has no functional impact.
> >
> > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> > ---
> >  mm/mmap.c | 25 ++++++++++++-------------
> >  1 file changed, 12 insertions(+), 13 deletions(-)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index c9834364ac98..6361baf75601 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -930,15 +930,15 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  	if (vm_flags & VM_SPECIAL)
> >  		return NULL;
> >
> > -	curr = find_vma(mm, prev ? prev->vm_end : 0);
> > -	if (curr && curr->vm_end == end)		/* cases 6, 7, 8 */
> > -		next = find_vma(mm, curr->vm_end);
> > -	else
> > -		next = curr;
> > +	/* Does the input range span an existing VMA? (cases 5 - 8) */
> > +	curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);
> >
> > -	/* In cases 1 - 4 there's no CCCC vma */
> > -	if (curr && end <= curr->vm_start)
> > -		curr = NULL;
> > +	if (curr && end == curr->vm_end)
> > +		/* Is there is a VMA immediately adjacent (cases 6 - 8)? */
> > +		next = vma_lookup(mm, curr->vm_end);
>
> Since end == curr->vm_end, this lookup is the same as below so these two
> statements can be combined.

I can't believe I didn't see this :) wood for the trees + a great example of
the benefit of code review.

>
> I also think you may need to initialize next to null since it may not be
> set for the 'cannot merge' case.

You're right, will fix.

>
> Something like:
> 	if ((!curr) ||		    /* case 1-4 */
> 	     (end == curr->vm_end)) /* Case 6-8, adjacent vma */
> 		next = vma_lookup(mm, end);
> 	else
> 		next = NULL
>

This is really turning out to be quite nicely succinct now I think! Will do.

>
> > +	else if (!curr)
> > +		/* Is there a VMA next to a hole (case 1 - 3) or prev (4)? */
> > +		next = vma_lookup(mm, end);
>
> Nit, can we have braces for comments that make the if/else look like
> it's unguarded?
>

Ack will fix in respin.

> >
> >  	/* verify some invariant that must be enforced by the caller */
> >  	VM_WARN_ON(prev && addr <= prev->vm_start);
> > @@ -959,11 +959,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  		}
> >  	}
> >  	/* Can we merge the successor? */
> > -	if (next && end == next->vm_start &&
> > -			mpol_equal(policy, vma_policy(next)) &&
> > -			can_vma_merge_before(next, vm_flags,
> > -					     anon_vma, file, pgoff+pglen,
> > -					     vm_userfaultfd_ctx, anon_name)) {
> > +	if (next && mpol_equal(policy, vma_policy(next)) &&
> > +	    can_vma_merge_before(next, vm_flags,
> > +				 anon_vma, file, pgoff+pglen,
> > +				 vm_userfaultfd_ctx, anon_name)) {
> >  		merge_next = true;
> >  	}
> >
> > --
> > 2.39.2
> >


* Re: [PATCH v2 3/4] mm/mmap/vma_merge: extend invariants, avoid invalid res, vma
  2023-03-21 20:45 ` [PATCH v2 3/4] mm/mmap/vma_merge: extend invariants, avoid invalid res, vma Lorenzo Stoakes
@ 2023-03-22  9:16   ` Vlastimil Babka
       [not found]   ` <CGME20230323170046eucas1p2483ab0fcc3d6bc56d4b6d09143bbadda@eucas1p2.samsung.com>
  1 sibling, 0 replies; 13+ messages in thread
From: Vlastimil Babka @ 2023-03-22  9:16 UTC (permalink / raw)
  To: Lorenzo Stoakes, linux-mm, linux-kernel, Andrew Morton
  Cc: David Hildenbrand, Matthew Wilcox, Liam R . Howlett, maple-tree,
	Vernon Yang

On 3/21/23 21:45, Lorenzo Stoakes wrote:
> Previously, vma was an uninitialised variable which was only definitely
> assigned as a result of the logic covering all possible input cases - for
> it to have remained uninitialised, prev would have to be NULL, and next
> would _have_ to be mergeable.
> 
> We now reuse vma to assign curr and next, so to be absolutely explicit,
> ensure this variable is _always_ assigned, and while we're at it remove the
> redundant assignment of both res and vma (if prev is NULL then we simply
> assign to NULL).
> 
> In addition, we absolutely do rely on addr == curr->vm_start should curr
> exist, so assert as much.
> 
> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

Nit suggestion below.

> ---
>  mm/mmap.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 6361baf75601..7aec49c3bc74 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -911,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  {
>  	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
>  	pgoff_t vma_pgoff;
> -	struct vm_area_struct *curr, *next, *res = NULL;
> +	struct vm_area_struct *curr, *next, *res;
>  	struct vm_area_struct *vma, *adjust, *remove, *remove2;
>  	int err = -1;
>  	bool merge_prev = false;
> @@ -940,14 +940,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  		/* Is there a VMA next to a hole (case 1 - 3) or prev (4)? */
>  		next = vma_lookup(mm, end);
>  
> -	/* verify some invariant that must be enforced by the caller */
> +	/*
> +	 * By default, we return prev. Cases 3, 4, 8 will instead return next
> +	 * and cases 3, 8 will also update vma to point at next.
> +	 */
> +	res = vma = prev;

Later in the function there's a line:

	remove = remove2 = adjust = NULL;

Now it would make sense to move it up here?

> +
> +	/* Verify some invariant that must be enforced by the caller. */
>  	VM_WARN_ON(prev && addr <= prev->vm_start);
> -	VM_WARN_ON(curr && end > curr->vm_end);
> +	VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
>  	VM_WARN_ON(addr >= end);
>  
>  	if (prev) {
> -		res = prev;
> -		vma = prev;
>  		vma_start = prev->vm_start;
>  		vma_pgoff = prev->vm_pgoff;
>  		/* Can we merge the predecessor? */
> @@ -958,6 +962,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  			vma_prev(vmi);
>  		}
>  	}
> +
>  	/* Can we merge the successor? */
>  	if (next && mpol_equal(policy, vma_policy(next)) &&
>  	    can_vma_merge_before(next, vm_flags,
> @@ -998,6 +1003,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  			adj_start = -(prev->vm_end - addr);
>  			err = dup_anon_vma(next, prev);
>  		} else {
> +			/*
> +			 * Note that cases 3 and 8 are the ONLY ones where prev
> +			 * is permitted to be (but is not necessarily) NULL.
> +			 */
>  			vma = next;			/* case 3 */
>  			vma_start = addr;
>  			vma_end = next->vm_end;



* Re: [PATCH v2 3/4] mm/mmap/vma_merge: extend invariants, avoid invalid res, vma
       [not found]   ` <CGME20230323170046eucas1p2483ab0fcc3d6bc56d4b6d09143bbadda@eucas1p2.samsung.com>
@ 2023-03-23 17:00     ` Marek Szyprowski
  2023-03-23 17:08       ` Lorenzo Stoakes
  0 siblings, 1 reply; 13+ messages in thread
From: Marek Szyprowski @ 2023-03-23 17:00 UTC (permalink / raw)
  To: Lorenzo Stoakes, linux-mm, linux-kernel, Andrew Morton
  Cc: David Hildenbrand, Matthew Wilcox, Vlastimil Babka,
	Liam R . Howlett, maple-tree, Vernon Yang

On 21.03.2023 21:45, Lorenzo Stoakes wrote:
> Previously, vma was an uninitialised variable which was only definitely
> assigned as a result of the logic covering all possible input cases - for
> it to have remained uninitialised, prev would have to be NULL, and next
> would _have_ to be mergeable.
>
> We now reuse vma to assign curr and next, so to be absolutely explicit,
> ensure this variable is _always_ assigned, and while we're at it remove the
> redundant assignment of both res and vma (if prev is NULL then we simply
> assign to NULL).
>
> In addition, we absolutely do rely on addr == curr->vm_start should curr
> exist, so assert as much.
>
> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>

This patch has been merged into today's linux next-20230323 as commit 
6426bbcc76be ("mm/mmap/vma_merge: extend invariants, avoid invalid res, 
vma").

Unfortunately it breaks booting of some ARM 32-bit machines, like the Samsung
Exynos5422-based Odroid XU3 board. This shortened log shows the issue:

Run /sbin/init as init process
   with arguments:
     /sbin/init
   with environment:
     HOME=/
     TERM=linux
8<--- cut here ---
Unhandled fault: page domain fault (0x01b) at 0xb6f03010
[b6f03010] *pgd=b5e84835
Internal error: : 1b [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 2 PID: 1 Comm: init Not tainted 6.3.0-rc1-00296-g6426bbcc76be #6445
Hardware name: Samsung Exynos (Flattened Device Tree)
PC is at vma_merge+0xa0/0x728
LR is at vma_merge+0x294/0x728
pc : [<c02b08a8>]    lr : [<c02b0a9c>]    psr: a0000013
...
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 4f11406a  DAC: 00000051
...
Process init (pid: 1, stack limit = 0x5219a5c7)
Stack: (0xf0835e30 to 0xf0836000)
...
  vma_merge from mprotect_fixup+0xc8/0x290
  mprotect_fixup from do_mprotect_pkey.constprop.0+0x210/0x338
  do_mprotect_pkey.constprop.0 from ret_fast_syscall+0x0/0x1c
Exception stack(0xf0835fa8 to 0xf0835ff0)
...
---[ end trace 0000000000000000 ]---
note: init[1] exited with irqs disabled
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Reverting it on top of linux-next, together with 183b2bced4c9 
("mm/mmap/vma_merge: init cleanup, be explicit about the non-mergeable 
case"), which depends on this patch, fixes the boot issue.


> ---
>   mm/mmap.c | 19 ++++++++++++++-----
>   1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 6361baf75601..7aec49c3bc74 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -911,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>   {
>   	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
>   	pgoff_t vma_pgoff;
> -	struct vm_area_struct *curr, *next, *res = NULL;
> +	struct vm_area_struct *curr, *next, *res;
>   	struct vm_area_struct *vma, *adjust, *remove, *remove2;
>   	int err = -1;
>   	bool merge_prev = false;
> @@ -940,14 +940,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>   		/* Is there a VMA next to a hole (case 1 - 3) or prev (4)? */
>   		next = vma_lookup(mm, end);
>   
> -	/* verify some invariant that must be enforced by the caller */
> +	/*
> +	 * By default, we return prev. Cases 3, 4, 8 will instead return next
> +	 * and cases 3, 8 will also update vma to point at next.
> +	 */
> +	res = vma = prev;
> +
> +	/* Verify some invariant that must be enforced by the caller. */
>   	VM_WARN_ON(prev && addr <= prev->vm_start);
> -	VM_WARN_ON(curr && end > curr->vm_end);
> +	VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
>   	VM_WARN_ON(addr >= end);
>   
>   	if (prev) {
> -		res = prev;
> -		vma = prev;
>   		vma_start = prev->vm_start;
>   		vma_pgoff = prev->vm_pgoff;
>   		/* Can we merge the predecessor? */
> @@ -958,6 +962,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>   			vma_prev(vmi);
>   		}
>   	}
> +
>   	/* Can we merge the successor? */
>   	if (next && mpol_equal(policy, vma_policy(next)) &&
>   	    can_vma_merge_before(next, vm_flags,
> @@ -998,6 +1003,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>   			adj_start = -(prev->vm_end - addr);
>   			err = dup_anon_vma(next, prev);
>   		} else {
> +			/*
> +			 * Note that cases 3 and 8 are the ONLY ones where prev
> +			 * is permitted to be (but is not necessarily) NULL.
> +			 */
>   			vma = next;			/* case 3 */
>   			vma_start = addr;
>   			vma_end = next->vm_end;

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



* Re: [PATCH v2 3/4] mm/mmap/vma_merge: extend invariants, avoid invalid res, vma
  2023-03-23 17:00     ` Marek Szyprowski
@ 2023-03-23 17:08       ` Lorenzo Stoakes
  2023-03-23 17:10         ` Lorenzo Stoakes
  0 siblings, 1 reply; 13+ messages in thread
From: Lorenzo Stoakes @ 2023-03-23 17:08 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-mm, linux-kernel, Andrew Morton, David Hildenbrand,
	Matthew Wilcox, Vlastimil Babka, Liam R . Howlett, maple-tree,
	Vernon Yang

On Thu, Mar 23, 2023 at 06:00:45PM +0100, Marek Szyprowski wrote:
> On 21.03.2023 21:45, Lorenzo Stoakes wrote:
> > Previously, vma was an uninitialised variable which was only definitely
> > assigned as a result of the logic covering all possible input cases - for
> > it to have remained uninitialised, prev would have to be NULL, and next
> > would _have_ to be mergeable.
> >
> > We now reuse vma to assign curr and next, so to be absolutely explicit,
> > ensure this variable is _always_ assigned, and while we're at it remove the
> > redundant assignment of both res and vma (if prev is NULL then we simply
> > assign to NULL).
> >
> > In addition, we absolutely do rely on addr == curr->vm_start should curr
> > exist, so assert as much.
> >
> > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
>
> This patch has been merged into today's linux next-20230323 as commit
> 6426bbcc76be ("mm/mmap/vma_merge: extend invariants, avoid invalid res,
> vma").
>
> Unfortunately it breaks booting of some ARM 32bit machines, like Samsung
> Exynos5422-based Odroid XU3 board. This shortened log shows the issue:

I'm very sorry about this. This was due to a bug with uninitialised
state. This was fixed in v3 of this series, which is now at v5 [1]. This
has already been taken to mm-unstable in Andrew's repo; hopefully it will
be taken to -next soon!
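
For the curious, a sketch of the suspected hazard - assuming it is the unset
next that Liam flagged in review of patch 2, presumably fixed by giving next
a NULL default as he suggested:

	struct vm_area_struct *curr, *next;	/* next has no default */

	curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);

	if (curr && end == curr->vm_end)
		next = vma_lookup(mm, curr->vm_end);
	else if (!curr)
		next = vma_lookup(mm, end);

	/*
	 * If curr exists but end != curr->vm_end, neither branch assigns
	 * next, so the later "can we merge the successor?" check can
	 * dereference an uninitialised pointer.
	 */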

I have made sure to reply quickly whenever this is raised to ensure people
are aware, and I will also buy anybody affected a pint if I meet them in
person :)

[1]:https://lore.kernel.org/all/over.1679516210.git.lstoakes@gmail.com/

>
> Run /sbin/init as init process
>    with arguments:
>      /sbin/init
>    with environment:
>      HOME=/
>      TERM=linux
> 8<--- cut here ---
> Unhandled fault: page domain fault (0x01b) at 0xb6f03010
> [b6f03010] *pgd=b5e84835
> Internal error: : 1b [#1] PREEMPT SMP ARM
> Modules linked in:
> CPU: 2 PID: 1 Comm: init Not tainted 6.3.0-rc1-00296-g6426bbcc76be #6445
> Hardware name: Samsung Exynos (Flattened Device Tree)
> PC is at vma_merge+0xa0/0x728
> LR is at vma_merge+0x294/0x728
> pc : [<c02b08a8>]    lr : [<c02b0a9c>]    psr: a0000013
> ...
> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> Control: 10c5387d  Table: 4f11406a  DAC: 00000051
> ...
> Process init (pid: 1, stack limit = 0x5219a5c7)
> Stack: (0xf0835e30 to 0xf0836000)
> ...
>   vma_merge from mprotect_fixup+0xc8/0x290
>   mprotect_fixup from do_mprotect_pkey.constprop.0+0x210/0x338
>   do_mprotect_pkey.constprop.0 from ret_fast_syscall+0x0/0x1c
> Exception stack(0xf0835fa8 to 0xf0835ff0)
> ...
> ---[ end trace 0000000000000000 ]---
> note: init[1] exited with irqs disabled
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> Reverting it on top of linux-next, together with 183b2bced4c9
> ("mm/mmap/vma_merge: init cleanup, be explicit about the non-mergeable
> case"), which depends on this patch, fixes the boot issue.
>
>
> > ---
> >   mm/mmap.c | 19 ++++++++++++++-----
> >   1 file changed, 14 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 6361baf75601..7aec49c3bc74 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -911,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >   {
> >   	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
> >   	pgoff_t vma_pgoff;
> > -	struct vm_area_struct *curr, *next, *res = NULL;
> > +	struct vm_area_struct *curr, *next, *res;
> >   	struct vm_area_struct *vma, *adjust, *remove, *remove2;
> >   	int err = -1;
> >   	bool merge_prev = false;
> > @@ -940,14 +940,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >   		/* Is there a VMA next to a hole (case 1 - 3) or prev (4)? */
> >   		next = vma_lookup(mm, end);
> >
> > -	/* verify some invariant that must be enforced by the caller */
> > +	/*
> > +	 * By default, we return prev. Cases 3, 4, 8 will instead return next
> > +	 * and cases 3, 8 will also update vma to point at next.
> > +	 */
> > +	res = vma = prev;
> > +
> > +	/* Verify some invariant that must be enforced by the caller. */
> >   	VM_WARN_ON(prev && addr <= prev->vm_start);
> > -	VM_WARN_ON(curr && end > curr->vm_end);
> > +	VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
> >   	VM_WARN_ON(addr >= end);
> >
> >   	if (prev) {
> > -		res = prev;
> > -		vma = prev;
> >   		vma_start = prev->vm_start;
> >   		vma_pgoff = prev->vm_pgoff;
> >   		/* Can we merge the predecessor? */
> > @@ -958,6 +962,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >   			vma_prev(vmi);
> >   		}
> >   	}
> > +
> >   	/* Can we merge the successor? */
> >   	if (next && mpol_equal(policy, vma_policy(next)) &&
> >   	    can_vma_merge_before(next, vm_flags,
> > @@ -998,6 +1003,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >   			adj_start = -(prev->vm_end - addr);
> >   			err = dup_anon_vma(next, prev);
> >   		} else {
> > +			/*
> > +			 * Note that cases 3 and 8 are the ONLY ones where prev
> > +			 * is permitted to be (but is not necessarily) NULL.
> > +			 */
> >   			vma = next;			/* case 3 */
> >   			vma_start = addr;
> >   			vma_end = next->vm_end;
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>


* Re: [PATCH v2 3/4] mm/mmap/vma_merge: extend invariants, avoid invalid res, vma
  2023-03-23 17:08       ` Lorenzo Stoakes
@ 2023-03-23 17:10         ` Lorenzo Stoakes
  0 siblings, 0 replies; 13+ messages in thread
From: Lorenzo Stoakes @ 2023-03-23 17:10 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-mm, linux-kernel, Andrew Morton, David Hildenbrand,
	Matthew Wilcox, Vlastimil Babka, Liam R . Howlett, maple-tree,
	Vernon Yang

On Thu, Mar 23, 2023 at 05:08:19PM +0000, Lorenzo Stoakes wrote:
> [1]:https://lore.kernel.org/all/over.1679516210.git.lstoakes@gmail.com/

Whoops! Typo - https://lore.kernel.org/all/cover.1679516210.git.lstoakes@gmail.com/

:)

