* [PATCH 0/3] Fixes for vma_merge() error path
@ 2023-09-27 16:07 Liam R. Howlett
  2023-09-27 16:07 ` [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge() Liam R. Howlett
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Liam R. Howlett @ 2023-09-27 16:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: maple-tree, linux-mm, linux-kernel, Jann Horn, Lorenzo Stoakes,
	Vlastimil Babka, Suren Baghdasaryan, Matthew Wilcox,
	Liam R. Howlett

Jann Horn reported a potential vma iterator issue in the failure path of
the vma_merge() code.  After examining the interface, it seemed that
the best course of action was to simply add an undo path in the
unlikely case of an error.

While examining the vma iterator issue, a second issue was discovered
that would increase memory usage during failure scenarios; this is
addressed in patch 2.

Since it is not obvious from the code, another patch adds a comment to
the vma_merge() function explaining why dup_anon_vma() is safe in
'case 6'.

Liam R. Howlett (3):
  mmap: Fix vma_iterator in error path of vma_merge()
  mmap: Fix error paths with dup_anon_vma()
  mmap: Add clarifying comment to vma_merge() code

 mm/mmap.c | 37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

-- 
2.40.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge()
  2023-09-27 16:07 [PATCH 0/3] Fixes for vma_merge() error path Liam R. Howlett
@ 2023-09-27 16:07 ` Liam R. Howlett
  2023-09-27 17:14   ` Andrew Morton
                     ` (2 more replies)
  2023-09-27 16:07 ` [PATCH 2/3] mmap: Fix error paths with dup_anon_vma() Liam R. Howlett
  2023-09-27 16:07 ` [PATCH 3/3] mmap: Add clarifying comment to vma_merge() code Liam R. Howlett
  2 siblings, 3 replies; 13+ messages in thread
From: Liam R. Howlett @ 2023-09-27 16:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: maple-tree, linux-mm, linux-kernel, Jann Horn, Lorenzo Stoakes,
	Vlastimil Babka, Suren Baghdasaryan, Matthew Wilcox,
	Liam R. Howlett, stable

When merging of the previous VMA fails after the vma iterator has been
moved to the previous entry, the vma iterator must be advanced to ensure
the caller takes the correct action on the next vma iterator event.  Fix
this by adding a vma_next() call to the error path.

Users may experience higher CPU usage, most likely in very low memory
situations.

Link: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejtiHpgJAbdQ@mail.gmail.com/
Closes: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejtiHpgJAbdQ@mail.gmail.com/
Fixes: 18b098af2890 ("vma_merge: set vma iterator to correct position.")
Cc: stable@vger.kernel.org
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 mm/mmap.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index b56a7f0c9f85..b5bc4ca9bdc4 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -968,14 +968,14 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 				vma_pgoff = curr->vm_pgoff;
 				vma_start_write(curr);
 				remove = curr;
-				err = dup_anon_vma(next, curr);
+				err = dup_anon_vma(next, curr, &anon_dup);
 			}
 		}
 	}
 
 	/* Error in anon_vma clone. */
 	if (err)
-		return NULL;
+		goto anon_vma_fail;
 
 	if (vma_start < vma->vm_start || vma_end > vma->vm_end)
 		vma_expanded = true;
@@ -988,7 +988,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	}
 
 	if (vma_iter_prealloc(vmi, vma))
-		return NULL;
+		goto prealloc_fail;
 
 	init_multi_vma_prep(&vp, vma, adjust, remove, remove2);
 	VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
@@ -1016,6 +1016,12 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	vma_complete(&vp, vmi, mm);
 	khugepaged_enter_vma(res, vm_flags);
 	return res;
+
+prealloc_fail:
+anon_vma_fail:
+	if (merge_prev)
+		vma_next(vmi);
+	return NULL;
 }
 
 /*
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] mmap: Fix error paths with dup_anon_vma()
  2023-09-27 16:07 [PATCH 0/3] Fixes for vma_merge() error path Liam R. Howlett
  2023-09-27 16:07 ` [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge() Liam R. Howlett
@ 2023-09-27 16:07 ` Liam R. Howlett
  2023-09-29 10:07   ` Vlastimil Babka
  2023-09-27 16:07 ` [PATCH 3/3] mmap: Add clarifying comment to vma_merge() code Liam R. Howlett
  2 siblings, 1 reply; 13+ messages in thread
From: Liam R. Howlett @ 2023-09-27 16:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: maple-tree, linux-mm, linux-kernel, Jann Horn, Lorenzo Stoakes,
	Vlastimil Babka, Suren Baghdasaryan, Matthew Wilcox,
	Liam R. Howlett, stable

When the calling function fails after the dup_anon_vma(), the
duplication of the anon_vma is not being undone.  Add the necessary
unlink_anon_vmas() calls to the error paths that are missing them.

This issue showed up during inspection of the error path in vma_merge()
for an unrelated vma iterator issue.

Users may experience increased memory usage, which may be problematic as
the failure would likely be caused by a low memory situation.

Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree")
Cc: stable@vger.kernel.org
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 mm/mmap.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index b5bc4ca9bdc4..2f0ee489db8a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -587,7 +587,7 @@ static inline void vma_complete(struct vma_prepare *vp,
  * Returns: 0 on success.
  */
 static inline int dup_anon_vma(struct vm_area_struct *dst,
-			       struct vm_area_struct *src)
+		struct vm_area_struct *src, struct vm_area_struct **dup)
 {
 	/*
 	 * Easily overlooked: when mprotect shifts the boundary, make sure the
@@ -597,6 +597,7 @@ static inline int dup_anon_vma(struct vm_area_struct *dst,
 	if (src->anon_vma && !dst->anon_vma) {
 		vma_assert_write_locked(dst);
 		dst->anon_vma = src->anon_vma;
+		*dup = dst;
 		return anon_vma_clone(dst, src);
 	}
 
@@ -624,6 +625,7 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	       unsigned long start, unsigned long end, pgoff_t pgoff,
 	       struct vm_area_struct *next)
 {
+	struct vm_area_struct *anon_dup = NULL;
 	bool remove_next = false;
 	struct vma_prepare vp;
 
@@ -633,7 +635,7 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
 
 		remove_next = true;
 		vma_start_write(next);
-		ret = dup_anon_vma(vma, next);
+		ret = dup_anon_vma(vma, next, &anon_dup);
 		if (ret)
 			return ret;
 	}
@@ -661,6 +663,8 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	return 0;
 
 nomem:
+	if (anon_dup)
+		unlink_anon_vmas(anon_dup);
 	return -ENOMEM;
 }
 
@@ -860,6 +864,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 {
 	struct vm_area_struct *curr, *next, *res;
 	struct vm_area_struct *vma, *adjust, *remove, *remove2;
+	struct vm_area_struct *anon_dup = NULL;
 	struct vma_prepare vp;
 	pgoff_t vma_pgoff;
 	int err = 0;
@@ -927,18 +932,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 		vma_start_write(next);
 		remove = next;				/* case 1 */
 		vma_end = next->vm_end;
-		err = dup_anon_vma(prev, next);
+		err = dup_anon_vma(prev, next, &anon_dup);
 		if (curr) {				/* case 6 */
 			vma_start_write(curr);
 			remove = curr;
 			remove2 = next;
 			if (!next->anon_vma)
-				err = dup_anon_vma(prev, curr);
+				err = dup_anon_vma(prev, curr, &anon_dup);
 		}
 	} else if (merge_prev) {			/* case 2 */
 		if (curr) {
 			vma_start_write(curr);
-			err = dup_anon_vma(prev, curr);
+			err = dup_anon_vma(prev, curr, &anon_dup);
 			if (end == curr->vm_end) {	/* case 7 */
 				remove = curr;
 			} else {			/* case 5 */
@@ -954,7 +959,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 			vma_end = addr;
 			adjust = next;
 			adj_start = -(prev->vm_end - addr);
-			err = dup_anon_vma(next, prev);
+			err = dup_anon_vma(next, prev, &anon_dup);
 		} else {
 			/*
 			 * Note that cases 3 and 8 are the ONLY ones where prev
@@ -1018,6 +1023,9 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	return res;
 
 prealloc_fail:
+	if (anon_dup)
+		unlink_anon_vmas(anon_dup);
+
 anon_vma_fail:
 	if (merge_prev)
 		vma_next(vmi);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] mmap: Add clarifying comment to vma_merge() code
  2023-09-27 16:07 [PATCH 0/3] Fixes for vma_merge() error path Liam R. Howlett
  2023-09-27 16:07 ` [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge() Liam R. Howlett
  2023-09-27 16:07 ` [PATCH 2/3] mmap: Fix error paths with dup_anon_vma() Liam R. Howlett
@ 2023-09-27 16:07 ` Liam R. Howlett
  2 siblings, 0 replies; 13+ messages in thread
From: Liam R. Howlett @ 2023-09-27 16:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: maple-tree, linux-mm, linux-kernel, Jann Horn, Lorenzo Stoakes,
	Vlastimil Babka, Suren Baghdasaryan, Matthew Wilcox,
	Liam R. Howlett

When tracing through the code in vma_merge(), it was not completely
clear why the error return to a dup_anon_vma() call would not overwrite
a previous attempt to the same function.  This commit adds a comment
specifying why it is safe.

Suggested-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/linux-mm/CAG48ez3iDwFPR=Ed1BfrNuyUJPMK_=StjxhUsCkL6po1s7bONg@mail.gmail.com/
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 mm/mmap.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 2f0ee489db8a..3c78afb707cf 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -937,6 +937,11 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 			vma_start_write(curr);
 			remove = curr;
 			remove2 = next;
+			/*
+			 * Note that the dup_anon_vma below cannot overwrite err
+			 * since the first caller would do nothing unless next
+			 * has an anon_vma.
+			 */
 			if (!next->anon_vma)
 				err = dup_anon_vma(prev, curr, &anon_dup);
 		}
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge()
  2023-09-27 16:07 ` [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge() Liam R. Howlett
@ 2023-09-27 17:14   ` Andrew Morton
  2023-09-27 17:26     ` Liam R. Howlett
  2023-09-27 20:06   ` Matthew Wilcox
  2023-09-29  9:52   ` Vlastimil Babka
  2 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2023-09-27 17:14 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: maple-tree, linux-mm, linux-kernel, Jann Horn, Lorenzo Stoakes,
	Vlastimil Babka, Suren Baghdasaryan, Matthew Wilcox, stable

On Wed, 27 Sep 2023 12:07:44 -0400 "Liam R. Howlett" <Liam.Howlett@oracle.com> wrote:

> When merging of the previous VMA fails after the vma iterator has been
> moved to the previous entry, the vma iterator must be advanced to ensure
> the caller takes the correct action on the next vma iterator event.  Fix
> this by adding a vma_next() call to the error path.
> 
> Users may experience higher CPU usage, most likely in very low memory
> situations.

Looking through this thread:

> Closes: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejtiHpgJAbdQ@mail.gmail.com/

I'm seeing no indication that the effect is CPU consumption?  Jann is
expecting bogus oom-killing?


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge()
  2023-09-27 17:14   ` Andrew Morton
@ 2023-09-27 17:26     ` Liam R. Howlett
  2023-09-28 17:06       ` Liam R. Howlett
  0 siblings, 1 reply; 13+ messages in thread
From: Liam R. Howlett @ 2023-09-27 17:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: maple-tree, linux-mm, linux-kernel, Jann Horn, Lorenzo Stoakes,
	Vlastimil Babka, Suren Baghdasaryan, Matthew Wilcox, stable

* Andrew Morton <akpm@linux-foundation.org> [230927 13:14]:
> On Wed, 27 Sep 2023 12:07:44 -0400 "Liam R. Howlett" <Liam.Howlett@oracle.com> wrote:
> 
> > When merging of the previous VMA fails after the vma iterator has been
> > moved to the previous entry, the vma iterator must be advanced to ensure
> > the caller takes the correct action on the next vma iterator event.  Fix
> > this by adding a vma_next() call to the error path.
> > 
> > Users may experience higher CPU usage, most likely in very low memory
> > situations.
> 
> Looking through this thread:
> 
> > Closes: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejtiHpgJAbdQ@mail.gmail.com/
> 
> I'm seeing no indication that the effect is CPU consumption?  Jann is
> expecting bogus oom-killing?

His testing injected a bogus OOM, but since the vma iterator may get
stuck in an "I can merge! oh, I'm out of memory" loop, because
vma_merge() is called with the same VMA each time around, I would
expect the result to be increased CPU usage when almost out of memory,
until a task is killed.  I don't think he expected a bogus OOM since we
are using GFP_KERNEL during the mm/internal.h:vma_iter_prealloc()
calls.
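
For reference, that helper is roughly the following (paraphrased from
mm/internal.h as of this series - treat it as a sketch, not a verbatim
copy):

	static inline int vma_iter_prealloc(struct vma_iterator *vmi,
					    struct vm_area_struct *vma)
	{
		/* GFP_KERNEL: may reclaim/sleep rather than fail outright */
		return mas_preallocate(&vmi->mas, vma, GFP_KERNEL);
	}

Since the maple tree node preallocation uses GFP_KERNEL from a
sleepable context, it will reclaim (and at worst go through the normal
OOM path) rather than fail spuriously, so I would not expect a bogus
OOM to originate here.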


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge()
  2023-09-27 16:07 ` [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge() Liam R. Howlett
  2023-09-27 17:14   ` Andrew Morton
@ 2023-09-27 20:06   ` Matthew Wilcox
  2023-09-27 22:42     ` Liam R. Howlett
  2023-09-29  9:52   ` Vlastimil Babka
  2 siblings, 1 reply; 13+ messages in thread
From: Matthew Wilcox @ 2023-09-27 20:06 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: Andrew Morton, maple-tree, linux-mm, linux-kernel, Jann Horn,
	Lorenzo Stoakes, Vlastimil Babka, Suren Baghdasaryan, stable

On Wed, Sep 27, 2023 at 12:07:44PM -0400, Liam R. Howlett wrote:
> +++ b/mm/mmap.c
> @@ -968,14 +968,14 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  				vma_pgoff = curr->vm_pgoff;
>  				vma_start_write(curr);
>  				remove = curr;
> -				err = dup_anon_vma(next, curr);
> +				err = dup_anon_vma(next, curr, &anon_dup);

This isn't bisectable.  dup_anon_vma() doesn't gain a third argument
until patch 2.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge()
  2023-09-27 20:06   ` Matthew Wilcox
@ 2023-09-27 22:42     ` Liam R. Howlett
  0 siblings, 0 replies; 13+ messages in thread
From: Liam R. Howlett @ 2023-09-27 22:42 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andrew Morton, maple-tree, linux-mm, linux-kernel, Jann Horn,
	Lorenzo Stoakes, Vlastimil Babka, Suren Baghdasaryan, stable

* Matthew Wilcox <willy@infradead.org> [230927 16:06]:
> On Wed, Sep 27, 2023 at 12:07:44PM -0400, Liam R. Howlett wrote:
> > +++ b/mm/mmap.c
> > @@ -968,14 +968,14 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  				vma_pgoff = curr->vm_pgoff;
> >  				vma_start_write(curr);
> >  				remove = curr;
> > -				err = dup_anon_vma(next, curr);
> > +				err = dup_anon_vma(next, curr, &anon_dup);
> 
> This isn't bisectable.  dup_anon_vma() doesn't gain a third argument
> until patch 2.

Ah, I'll respin and retest.

Sorry about that,
Liam

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge()
  2023-09-27 17:26     ` Liam R. Howlett
@ 2023-09-28 17:06       ` Liam R. Howlett
  0 siblings, 0 replies; 13+ messages in thread
From: Liam R. Howlett @ 2023-09-28 17:06 UTC (permalink / raw)
  To: Andrew Morton, maple-tree, linux-mm, linux-kernel, Jann Horn,
	Lorenzo Stoakes, Vlastimil Babka, Suren Baghdasaryan,
	Matthew Wilcox, stable

* Liam R. Howlett <Liam.Howlett@Oracle.com> [230927 13:26]:
> * Andrew Morton <akpm@linux-foundation.org> [230927 13:14]:
> > On Wed, 27 Sep 2023 12:07:44 -0400 "Liam R. Howlett" <Liam.Howlett@oracle.com> wrote:
> > 
> > > When merging of the previous VMA fails after the vma iterator has been
> > > moved to the previous entry, the vma iterator must be advanced to ensure
> > > the caller takes the correct action on the next vma iterator event.  Fix
> > > this by adding a vma_next() call to the error path.
> > > 
> > > Users may experience higher CPU usage, most likely in very low memory
> > > situations.
> > 
> > Looking through this thread:
> > 
> > > Closes: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejtiHpgJAbdQ@mail.gmail.com/
> > 
> > I'm seeing no indication that the effect is CPU consumption?  Jann is
> > expecting bogus oom-killing?
> 
> His testing injected a bogus OOM, but since the vma iterator may get
> stuck in an "I can merge! oh, I'm out of memory" loop, because
> vma_merge() is called with the same VMA each time around, I would
> expect the result to be increased CPU usage when almost out of memory,
> until a task is killed.  I don't think he expected a bogus OOM since we
> are using GFP_KERNEL during the mm/internal.h:vma_iter_prealloc()
> calls.

The initial call to vma_merge() is correct, but on the second call vma
is the same as prev so it won't attempt to merge prev again.  I think it
would only cause one extra call to vma_merge().

So I think you are correct, CPU usage will not increase very much.
But there also will not be a bogus OOM.  There will just be two calls
to vma_merge() for the same VMA when there is an OOM event and we could
have merged prev.

I doubt the user would notice anything, and they have bigger memory
issues at that time anyway.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge()
  2023-09-27 16:07 ` [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge() Liam R. Howlett
  2023-09-27 17:14   ` Andrew Morton
  2023-09-27 20:06   ` Matthew Wilcox
@ 2023-09-29  9:52   ` Vlastimil Babka
  2023-09-29 14:41     ` Liam R. Howlett
  2 siblings, 1 reply; 13+ messages in thread
From: Vlastimil Babka @ 2023-09-29  9:52 UTC (permalink / raw)
  To: Liam R. Howlett, Andrew Morton
  Cc: maple-tree, linux-mm, linux-kernel, Jann Horn, Lorenzo Stoakes,
	Suren Baghdasaryan, Matthew Wilcox, stable

On 9/27/23 18:07, Liam R. Howlett wrote:
> When merging of the previous VMA fails after the vma iterator has been
> moved to the previous entry, the vma iterator must be advanced to ensure
> the caller takes the correct action on the next vma iterator event.  Fix
> this by adding a vma_next() call to the error path.
> 
> Users may experience higher CPU usage, most likely in very low memory
> situations.

Maybe we could say explicitly that before this fix, vma_merge will be called
twice on the same vma, which to the best of our knowledge does not cause
anything worse than some wasted cycles because vma == prev, but it's fragile?

> Link: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejtiHpgJAbdQ@mail.gmail.com/
> Closes: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejtiHpgJAbdQ@mail.gmail.com/
> Fixes: 18b098af2890 ("vma_merge: set vma iterator to correct position.")
> Cc: stable@vger.kernel.org
> Cc: Jann Horn <jannh@google.com>
> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> ---
>  mm/mmap.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index b56a7f0c9f85..b5bc4ca9bdc4 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -968,14 +968,14 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  				vma_pgoff = curr->vm_pgoff;
>  				vma_start_write(curr);
>  				remove = curr;
> -				err = dup_anon_vma(next, curr);
> +				err = dup_anon_vma(next, curr, &anon_dup);
>  			}
>  		}
>  	}
>  
>  	/* Error in anon_vma clone. */
>  	if (err)
> -		return NULL;
> +		goto anon_vma_fail;
>  
>  	if (vma_start < vma->vm_start || vma_end > vma->vm_end)
>  		vma_expanded = true;

The vma_iter_config() actions done in this part are something we don't need
to undo?

> @@ -988,7 +988,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  	}
>  
>  	if (vma_iter_prealloc(vmi, vma))
> -		return NULL;
> +		goto prealloc_fail;



>  	init_multi_vma_prep(&vp, vma, adjust, remove, remove2);
>  	VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
> @@ -1016,6 +1016,12 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  	vma_complete(&vp, vmi, mm);
>  	khugepaged_enter_vma(res, vm_flags);
>  	return res;
> +
> +prealloc_fail:
> +anon_vma_fail:
> +	if (merge_prev)
> +		vma_next(vmi);
> +	return NULL;
>  }
>  
>  /*


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] mmap: Fix error paths with dup_anon_vma()
  2023-09-27 16:07 ` [PATCH 2/3] mmap: Fix error paths with dup_anon_vma() Liam R. Howlett
@ 2023-09-29 10:07   ` Vlastimil Babka
  2023-09-29 14:52     ` Liam R. Howlett
  0 siblings, 1 reply; 13+ messages in thread
From: Vlastimil Babka @ 2023-09-29 10:07 UTC (permalink / raw)
  To: Liam R. Howlett, Andrew Morton
  Cc: maple-tree, linux-mm, linux-kernel, Jann Horn, Lorenzo Stoakes,
	Suren Baghdasaryan, Matthew Wilcox, stable

On 9/27/23 18:07, Liam R. Howlett wrote:
> When the calling function fails after the dup_anon_vma(), the
> duplication of the anon_vma is not being undone.  Add the necessary
> unlink_anon_vmas() calls to the error paths that are missing them.
> 
> This issue showed up during inspection of the error path in vma_merge()
> for an unrelated vma iterator issue.
> 
> Users may experience increased memory usage, which may be problematic as
> the failure would likely be caused by a low memory situation.
> 
> Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree")
> Cc: stable@vger.kernel.org
> Cc: Jann Horn <jannh@google.com>
> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> ---
>  mm/mmap.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index b5bc4ca9bdc4..2f0ee489db8a 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -587,7 +587,7 @@ static inline void vma_complete(struct vma_prepare *vp,
>   * Returns: 0 on success.
>   */
>  static inline int dup_anon_vma(struct vm_area_struct *dst,
> -			       struct vm_area_struct *src)
> +		struct vm_area_struct *src, struct vm_area_struct **dup)
>  {
>  	/*
>  	 * Easily overlooked: when mprotect shifts the boundary, make sure the
> @@ -597,6 +597,7 @@ static inline int dup_anon_vma(struct vm_area_struct *dst,
>  	if (src->anon_vma && !dst->anon_vma) {
>  		vma_assert_write_locked(dst);
>  		dst->anon_vma = src->anon_vma;
> +		*dup = dst;
>  		return anon_vma_clone(dst, src);

So the code is simpler that way, and it seems current callers are fine,
but shouldn't we rather only assign *dup if the clone succeeds?

>  	}
>  
> @@ -624,6 +625,7 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  	       unsigned long start, unsigned long end, pgoff_t pgoff,
>  	       struct vm_area_struct *next)
>  {
> +	struct vm_area_struct *anon_dup = NULL;
>  	bool remove_next = false;
>  	struct vma_prepare vp;
>  
> @@ -633,7 +635,7 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  
>  		remove_next = true;
>  		vma_start_write(next);
> -		ret = dup_anon_vma(vma, next);
> +		ret = dup_anon_vma(vma, next, &anon_dup);
>  		if (ret)
>  			return ret;
>  	}
> @@ -661,6 +663,8 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  	return 0;
>  
>  nomem:
> +	if (anon_dup)
> +		unlink_anon_vmas(anon_dup);
>  	return -ENOMEM;
>  }
>  
> @@ -860,6 +864,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  {
>  	struct vm_area_struct *curr, *next, *res;
>  	struct vm_area_struct *vma, *adjust, *remove, *remove2;
> +	struct vm_area_struct *anon_dup = NULL;
>  	struct vma_prepare vp;
>  	pgoff_t vma_pgoff;
>  	int err = 0;
> @@ -927,18 +932,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  		vma_start_write(next);
>  		remove = next;				/* case 1 */
>  		vma_end = next->vm_end;
> -		err = dup_anon_vma(prev, next);
> +		err = dup_anon_vma(prev, next, &anon_dup);
>  		if (curr) {				/* case 6 */
>  			vma_start_write(curr);
>  			remove = curr;
>  			remove2 = next;
>  			if (!next->anon_vma)
> -				err = dup_anon_vma(prev, curr);
> +				err = dup_anon_vma(prev, curr, &anon_dup);
>  		}
>  	} else if (merge_prev) {			/* case 2 */
>  		if (curr) {
>  			vma_start_write(curr);
> -			err = dup_anon_vma(prev, curr);
> +			err = dup_anon_vma(prev, curr, &anon_dup);
>  			if (end == curr->vm_end) {	/* case 7 */
>  				remove = curr;
>  			} else {			/* case 5 */
> @@ -954,7 +959,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  			vma_end = addr;
>  			adjust = next;
>  			adj_start = -(prev->vm_end - addr);
> -			err = dup_anon_vma(next, prev);
> +			err = dup_anon_vma(next, prev, &anon_dup);
>  		} else {
>  			/*
>  			 * Note that cases 3 and 8 are the ONLY ones where prev
> @@ -1018,6 +1023,9 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>  	return res;
>  
>  prealloc_fail:
> +	if (anon_dup)
> +		unlink_anon_vmas(anon_dup);
> +
>  anon_vma_fail:
>  	if (merge_prev)
>  		vma_next(vmi);


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge()
  2023-09-29  9:52   ` Vlastimil Babka
@ 2023-09-29 14:41     ` Liam R. Howlett
  0 siblings, 0 replies; 13+ messages in thread
From: Liam R. Howlett @ 2023-09-29 14:41 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, maple-tree, linux-mm, linux-kernel, Jann Horn,
	Lorenzo Stoakes, Suren Baghdasaryan, Matthew Wilcox, stable

* Vlastimil Babka <vbabka@suse.cz> [230929 05:52]:
> On 9/27/23 18:07, Liam R. Howlett wrote:
> > When merging of the previous VMA fails after the vma iterator has been
> > moved to the previous entry, the vma iterator must be advanced to ensure
> > the caller takes the correct action on the next vma iterator event.  Fix
> > this by adding a vma_next() call to the error path.
> > 
> > Users may experience higher CPU usage, most likely in very low memory
> > situations.
> 
> Maybe we could say explicitly that before this fix, vma_merge will be called
> twice on the same vma, which to the best of our knowledge does not cause
> anything worse than some wasted cycles because vma == prev, but it's fragile?

I will modify this statement again in v3.

> 
> > Link: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejtiHpgJAbdQ@mail.gmail.com/
> > Closes: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejtiHpgJAbdQ@mail.gmail.com/
> > Fixes: 18b098af2890 ("vma_merge: set vma iterator to correct position.")
> > Cc: stable@vger.kernel.org
> > Cc: Jann Horn <jannh@google.com>
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> > ---
> >  mm/mmap.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index b56a7f0c9f85..b5bc4ca9bdc4 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -968,14 +968,14 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  				vma_pgoff = curr->vm_pgoff;
> >  				vma_start_write(curr);
> >  				remove = curr;
> > -				err = dup_anon_vma(next, curr);
> > +				err = dup_anon_vma(next, curr, &anon_dup);
> >  			}
> >  		}
> >  	}
> >  
> >  	/* Error in anon_vma clone. */
> >  	if (err)
> > -		return NULL;
> > +		goto anon_vma_fail;
> >  
> >  	if (vma_start < vma->vm_start || vma_end > vma->vm_end)
> >  		vma_expanded = true;
> 
> The vma_iter_config() actions done in this part are something we don't need
> to undo?

Oh, right.  Thanks.

This made me realise that my prealloc_fail path assumes there is a
'curr' vma that will cause the vma_next() to set the correct range.  We
may actually be in case 1 or 4, which means we are looking to add a VMA
to the gap; in this case, vma_next() would go beyond where the iterator
was at the start of the function, and may cause an issue since we do
not return an error.

So the current patch fixes the problem Jann discovered (and with any
iterators), but the issue may exist in other error scenarios today or in
the future. There also may be an issue with next merging failing and
skipping a vma...

Looking through the code, there are functions that do not match the
entire vma iterator location due to trimming (mbind_range, mlock_fixup),
so using the start/end that was passed in may not accurately represent
the range passed in through the vma iterator.  Those ranges do, however,
point to the correct location in the tree - they just may be smaller.

All the callers have `addr` within the range of the vma iterator, so it
would be safe to do vma_iter_set(vmi, addr); and vma_iter_load(vmi) to
restore the location and range.  Safest would be to save the vma
iterator start and end location and restore those after a re-walk, but
this seems unnecessary and would complicate backporting.
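
Concretely, I am thinking of something like this for the failure paths
(untested sketch on top of this series, using the labels from patch 2;
v3 may end up different):

prealloc_fail:
	if (anon_dup)
		unlink_anon_vmas(anon_dup);

anon_vma_fail:
	/* Re-walk to addr rather than guessing based on merge_prev. */
	vma_iter_set(vmi, addr);
	vma_iter_load(vmi);
	return NULL;

That keeps the iterator pointing at addr on any failure, instead of
only handling the merge_prev case.
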

> 
> > @@ -988,7 +988,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  	}
> >  
> >  	if (vma_iter_prealloc(vmi, vma))
> > -		return NULL;
> > +		goto prealloc_fail;
> 
> 
> 
> >  	init_multi_vma_prep(&vp, vma, adjust, remove, remove2);
> >  	VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
> > @@ -1016,6 +1016,12 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  	vma_complete(&vp, vmi, mm);
> >  	khugepaged_enter_vma(res, vm_flags);
> >  	return res;
> > +
> > +prealloc_fail:
> > +anon_vma_fail:
> > +	if (merge_prev)
> > +		vma_next(vmi);
> > +	return NULL;
> >  }
> >  
> >  /*
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] mmap: Fix error paths with dup_anon_vma()
  2023-09-29 10:07   ` Vlastimil Babka
@ 2023-09-29 14:52     ` Liam R. Howlett
  0 siblings, 0 replies; 13+ messages in thread
From: Liam R. Howlett @ 2023-09-29 14:52 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, maple-tree, linux-mm, linux-kernel, Jann Horn,
	Lorenzo Stoakes, Suren Baghdasaryan, Matthew Wilcox, stable

* Vlastimil Babka <vbabka@suse.cz> [230929 06:08]:
> On 9/27/23 18:07, Liam R. Howlett wrote:
> > When the calling function fails after the dup_anon_vma(), the
> > duplication of the anon_vma is not being undone.  Add the necessary
> > unlink_anon_vmas() calls to the error paths that are missing them.
> > 
> > This issue showed up during inspection of the error path in vma_merge()
> > for an unrelated vma iterator issue.
> > 
> > Users may experience increased memory usage, which may be problematic as
> > the failure would likely be caused by a low memory situation.
> > 
> > Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree")
> > Cc: stable@vger.kernel.org
> > Cc: Jann Horn <jannh@google.com>
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> > ---
> >  mm/mmap.c | 20 ++++++++++++++------
> >  1 file changed, 14 insertions(+), 6 deletions(-)
> > 
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index b5bc4ca9bdc4..2f0ee489db8a 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -587,7 +587,7 @@ static inline void vma_complete(struct vma_prepare *vp,
> >   * Returns: 0 on success.
> >   */
> >  static inline int dup_anon_vma(struct vm_area_struct *dst,
> > -			       struct vm_area_struct *src)
> > +		struct vm_area_struct *src, struct vm_area_struct **dup)
> >  {
> >  	/*
> >  	 * Easily overlooked: when mprotect shifts the boundary, make sure the
> > @@ -597,6 +597,7 @@ static inline int dup_anon_vma(struct vm_area_struct *dst,
> >  	if (src->anon_vma && !dst->anon_vma) {
> >  		vma_assert_write_locked(dst);
> >  		dst->anon_vma = src->anon_vma;
> > +		*dup = dst;
> >  		return anon_vma_clone(dst, src);
> 
> So the code is simpler that way, and it seems current callers are fine,
> but shouldn't we rather only assign *dup if the clone succeeds?

Fair point.  I'll address this in v3.
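
Something like this is what I have in mind (untested sketch; only the
ordering changes, the rest follows this patch):

	static inline int dup_anon_vma(struct vm_area_struct *dst,
			struct vm_area_struct *src, struct vm_area_struct **dup)
	{
		/* (existing comment about mprotect shifting the boundary) */
		if (src->anon_vma && !dst->anon_vma) {
			int ret;

			vma_assert_write_locked(dst);
			dst->anon_vma = src->anon_vma;
			ret = anon_vma_clone(dst, src);
			if (ret)
				return ret;

			/* Only hand back the duplicate once the clone succeeded. */
			*dup = dst;
		}

		return 0;
	}
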

> 
> >  	}
> >  
> > @@ -624,6 +625,7 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  	       unsigned long start, unsigned long end, pgoff_t pgoff,
> >  	       struct vm_area_struct *next)
> >  {
> > +	struct vm_area_struct *anon_dup = NULL;
> >  	bool remove_next = false;
> >  	struct vma_prepare vp;
> >  
> > @@ -633,7 +635,7 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  
> >  		remove_next = true;
> >  		vma_start_write(next);
> > -		ret = dup_anon_vma(vma, next);
> > +		ret = dup_anon_vma(vma, next, &anon_dup);
> >  		if (ret)
> >  			return ret;
> >  	}
> > @@ -661,6 +663,8 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  	return 0;
> >  
> >  nomem:
> > +	if (anon_dup)
> > +		unlink_anon_vmas(anon_dup);
> >  	return -ENOMEM;
> >  }
> >  
> > @@ -860,6 +864,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  {
> >  	struct vm_area_struct *curr, *next, *res;
> >  	struct vm_area_struct *vma, *adjust, *remove, *remove2;
> > +	struct vm_area_struct *anon_dup = NULL;
> >  	struct vma_prepare vp;
> >  	pgoff_t vma_pgoff;
> >  	int err = 0;
> > @@ -927,18 +932,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  		vma_start_write(next);
> >  		remove = next;				/* case 1 */
> >  		vma_end = next->vm_end;
> > -		err = dup_anon_vma(prev, next);
> > +		err = dup_anon_vma(prev, next, &anon_dup);
> >  		if (curr) {				/* case 6 */
> >  			vma_start_write(curr);
> >  			remove = curr;
> >  			remove2 = next;
> >  			if (!next->anon_vma)
> > -				err = dup_anon_vma(prev, curr);
> > +				err = dup_anon_vma(prev, curr, &anon_dup);
> >  		}
> >  	} else if (merge_prev) {			/* case 2 */
> >  		if (curr) {
> >  			vma_start_write(curr);
> > -			err = dup_anon_vma(prev, curr);
> > +			err = dup_anon_vma(prev, curr, &anon_dup);
> >  			if (end == curr->vm_end) {	/* case 7 */
> >  				remove = curr;
> >  			} else {			/* case 5 */
> > @@ -954,7 +959,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  			vma_end = addr;
> >  			adjust = next;
> >  			adj_start = -(prev->vm_end - addr);
> > -			err = dup_anon_vma(next, prev);
> > +			err = dup_anon_vma(next, prev, &anon_dup);
> >  		} else {
> >  			/*
> >  			 * Note that cases 3 and 8 are the ONLY ones where prev
> > @@ -1018,6 +1023,9 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> >  	return res;
> >  
> >  prealloc_fail:
> > +	if (anon_dup)
> > +		unlink_anon_vmas(anon_dup);
> > +
> >  anon_vma_fail:
> >  	if (merge_prev)
> >  		vma_next(vmi);
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-09-29 14:53 UTC | newest]

Thread overview: 13+ messages
2023-09-27 16:07 [PATCH 0/3] Fixes for vma_merge() error path Liam R. Howlett
2023-09-27 16:07 ` [PATCH 1/3] mmap: Fix vma_iterator in error path of vma_merge() Liam R. Howlett
2023-09-27 17:14   ` Andrew Morton
2023-09-27 17:26     ` Liam R. Howlett
2023-09-28 17:06       ` Liam R. Howlett
2023-09-27 20:06   ` Matthew Wilcox
2023-09-27 22:42     ` Liam R. Howlett
2023-09-29  9:52   ` Vlastimil Babka
2023-09-29 14:41     ` Liam R. Howlett
2023-09-27 16:07 ` [PATCH 2/3] mmap: Fix error paths with dup_anon_vma() Liam R. Howlett
2023-09-29 10:07   ` Vlastimil Babka
2023-09-29 14:52     ` Liam R. Howlett
2023-09-27 16:07 ` [PATCH 3/3] mmap: Add clarifying comment to vma_merge() code Liam R. Howlett
