[38/94] mm/gup: Add mm_populate_vma() for use when the vma is known
diff mbox series

Message ID 20210428153542.2814175-39-Liam.Howlett@Oracle.com
State New, archived
Headers show
Series
  • Introducing the Maple Tree
Related show

Commit Message

Liam Howlett April 28, 2021, 3:36 p.m. UTC
When a vma is known, avoid calling mm_populate to search for the vma to
populate.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
---
 mm/gup.c      | 20 ++++++++++++++++++++
 mm/internal.h |  4 ++++
 2 files changed, 24 insertions(+)

Comments

Michel Lespinasse May 1, 2021, 5:13 a.m. UTC | #1
On Wed, Apr 28, 2021 at 03:36:08PM +0000, Liam Howlett wrote:
> When a vma is known, avoid calling mm_populate to search for the vma to
> populate.
> 
> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> ---
>  mm/gup.c      | 20 ++++++++++++++++++++
>  mm/internal.h |  4 ++++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index c3a17b189064..48fe98ab0729 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1468,6 +1468,26 @@ long populate_vma_page_range(struct vm_area_struct *vma,
>  				NULL, NULL, locked);
>  }
>  
> +/*
> + * mm_populate_vma() - Populate a single range in a single vma.
> + * @vma: The vma to populate.
> + * @start: The start address to populate
> + * @end: The end address to stop populating
> + *
> + * Note: Ignores errors.
> + */
> +void mm_populate_vma(struct vm_area_struct *vma, unsigned long start,
> +		unsigned long end)
> +{
> +	struct mm_struct *mm = current->mm;
> +	int locked = 1;
> +
> +	mmap_read_lock(mm);
> +	populate_vma_page_range(vma, start, end, &locked);
> +	if (locked)
> +		mmap_read_unlock(mm);
> +}
> +

This seems like a nonsensical API at first glance - VMAs that are found
in the vma tree might be modified, merged, split, or freed at any time
if the mmap lock is not held, so the API can not be safely used. I think
this applies to maple tree vmas just as much as it did for rbtree vmas ?

--
Michel "walken" Lespinasse
Liam Howlett May 3, 2021, 3:53 p.m. UTC | #2
* Michel Lespinasse <michel@lespinasse.org> [210501 01:13]:
> On Wed, Apr 28, 2021 at 03:36:08PM +0000, Liam Howlett wrote:
> > When a vma is known, avoid calling mm_populate to search for the vma to
> > populate.
> > 
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > ---
> >  mm/gup.c      | 20 ++++++++++++++++++++
> >  mm/internal.h |  4 ++++
> >  2 files changed, 24 insertions(+)
> > 
> > diff --git a/mm/gup.c b/mm/gup.c
> > index c3a17b189064..48fe98ab0729 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -1468,6 +1468,26 @@ long populate_vma_page_range(struct vm_area_struct *vma,
> >  				NULL, NULL, locked);
> >  }
> >  
> > +/*
> > + * mm_populate_vma() - Populate a single range in a single vma.
> > + * @vma: The vma to populate.
> > + * @start: The start address to populate
> > + * @end: The end address to stop populating
> > + *
> > + * Note: Ignores errors.
> > + */
> > +void mm_populate_vma(struct vm_area_struct *vma, unsigned long start,
> > +		unsigned long end)
> > +{
> > +	struct mm_struct *mm = current->mm;
> > +	int locked = 1;
> > +
> > +	mmap_read_lock(mm);
> > +	populate_vma_page_range(vma, start, end, &locked);
> > +	if (locked)
> > +		mmap_read_unlock(mm);
> > +}
> > +
> 
> This seems like a nonsensical API at first glance - VMAs that are found
> in the vma tree might be modified, merged, split, or freed at any time
> if the mmap lock is not held, so the API can not be safely used. I think
> this applies to maple tree vmas just as much as it did for rbtree vmas ?

This is correct - it cannot be used without having the mmap_sem lock.
This is a new internal mm code API and is used to avoid callers that use
mm_populate() on a range that is known to be in a single VMA and already
have that VMA.  So instead of re-walking the tree to re-find the VMAs,
this function can be used with the known VMA and range.

It is used as described in patch 39 and 40 of this series.

Thanks,
Liam
Matthew Wilcox (Oracle) May 3, 2021, 4 p.m. UTC | #3
On Mon, May 03, 2021 at 03:53:58PM +0000, Liam Howlett wrote:
> * Michel Lespinasse <michel@lespinasse.org> [210501 01:13]:
> > On Wed, Apr 28, 2021 at 03:36:08PM +0000, Liam Howlett wrote:
> > > When a vma is known, avoid calling mm_populate to search for the vma to
> > > populate.
> > > 
> > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > > ---
> > >  mm/gup.c      | 20 ++++++++++++++++++++
> > >  mm/internal.h |  4 ++++
> > >  2 files changed, 24 insertions(+)
> > > 
> > > diff --git a/mm/gup.c b/mm/gup.c
> > > index c3a17b189064..48fe98ab0729 100644
> > > --- a/mm/gup.c
> > > +++ b/mm/gup.c
> > > @@ -1468,6 +1468,26 @@ long populate_vma_page_range(struct vm_area_struct *vma,
> > >  				NULL, NULL, locked);
> > >  }
> > >  
> > > +/*
> > > + * mm_populate_vma() - Populate a single range in a single vma.
> > > + * @vma: The vma to populate.
> > > + * @start: The start address to populate
> > > + * @end: The end address to stop populating
> > > + *
> > > + * Note: Ignores errors.
> > > + */
> > > +void mm_populate_vma(struct vm_area_struct *vma, unsigned long start,
> > > +		unsigned long end)
> > > +{
> > > +	struct mm_struct *mm = current->mm;
> > > +	int locked = 1;
> > > +
> > > +	mmap_read_lock(mm);
> > > +	populate_vma_page_range(vma, start, end, &locked);
> > > +	if (locked)
> > > +		mmap_read_unlock(mm);
> > > +}
> > > +
> > 
> > This seems like a nonsensical API at first glance - VMAs that are found
> > in the vma tree might be modified, merged, split, or freed at any time
> > if the mmap lock is not held, so the API can not be safely used. I think
> > this applies to maple tree vmas just as much as it did for rbtree vmas ?
> 
> This is correct - it cannot be used without having the mmap_sem lock.
> This is a new internal mm code API and is used to avoid callers that use
> mm_populate() on a range that is known to be in a single VMA and already
> have that VMA.  So instead of re-walking the tree to re-find the VMAs,
> this function can be used with the known VMA and range.
> 
> It is used as described in patch 39 and 40 of this series.

In patch 39, what you do is:

1 Take the mmap_sem for write
2 do stuff
3 Drop the mmap_sem
4 Call mm_populate_vm() with the vma, which takes the mmap_sem
   for read

The problem is that between 3 & 4, a racing thread might cause us to free
the vma and so we've now passed a bogus pointer into mm_populate_vm().

What we need instead is to downgrade the mmap_sem from write to read at
step 3, so the vma is guaranteed to still be good.
Liam Howlett May 3, 2021, 11:01 p.m. UTC | #4
* Matthew Wilcox <willy@infradead.org> [210503 12:02]:
> On Mon, May 03, 2021 at 03:53:58PM +0000, Liam Howlett wrote:
> > * Michel Lespinasse <michel@lespinasse.org> [210501 01:13]:
> > > On Wed, Apr 28, 2021 at 03:36:08PM +0000, Liam Howlett wrote:
> > > > When a vma is known, avoid calling mm_populate to search for the vma to
> > > > populate.
> > > > 
> > > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > > > ---
> > > >  mm/gup.c      | 20 ++++++++++++++++++++
> > > >  mm/internal.h |  4 ++++
> > > >  2 files changed, 24 insertions(+)
> > > > 
> > > > diff --git a/mm/gup.c b/mm/gup.c
> > > > index c3a17b189064..48fe98ab0729 100644
> > > > --- a/mm/gup.c
> > > > +++ b/mm/gup.c
> > > > @@ -1468,6 +1468,26 @@ long populate_vma_page_range(struct vm_area_struct *vma,
> > > >  				NULL, NULL, locked);
> > > >  }
> > > >  
> > > > +/*
> > > > + * mm_populate_vma() - Populate a single range in a single vma.
> > > > + * @vma: The vma to populate.
> > > > + * @start: The start address to populate
> > > > + * @end: The end address to stop populating
> > > > + *
> > > > + * Note: Ignores errors.
> > > > + */
> > > > +void mm_populate_vma(struct vm_area_struct *vma, unsigned long start,
> > > > +		unsigned long end)
> > > > +{
> > > > +	struct mm_struct *mm = current->mm;
> > > > +	int locked = 1;
> > > > +
> > > > +	mmap_read_lock(mm);
> > > > +	populate_vma_page_range(vma, start, end, &locked);
> > > > +	if (locked)
> > > > +		mmap_read_unlock(mm);
> > > > +}
> > > > +
> > > 
> > > This seems like a nonsensical API at first glance - VMAs that are found
> > > in the vma tree might be modified, merged, split, or freed at any time
> > > if the mmap lock is not held, so the API can not be safely used. I think
> > > this applies to maple tree vmas just as much as it did for rbtree vmas ?
> > 
> > This is correct - it cannot be used without having the mmap_sem lock.
> > This is a new internal mm code API and is used to avoid callers that use
> > mm_populate() on a range that is known to be in a single VMA and already
> > have that VMA.  So instead of re-walking the tree to re-find the VMAs,
> > this function can be used with the known VMA and range.
> > 
> > It is used as described in patch 39 and 40 of this series.
> 
> In patch 39, what you do is:
> 
> 1 Take the mmap_sem for write
> 2 do stuff
> 3 Drop the mmap_sem
> 4 Call mm_populate_vm() with the vma, which takes the mmap_sem
>    for read
> 
> The problem is that between 3 & 4, a racing thread might cause us to free
> the vma and so we've now passed a bogus pointer into mm_populate_vm().
> 
> What we need instead is to downgrade the mmap_sem from write to read at
> step 3, so the vma is guaranteed to still be good.

Thank you.  I will remove these patches from the series and work on this
idea.

Regards,
Liam

Patch
diff mbox series

diff --git a/mm/gup.c b/mm/gup.c
index c3a17b189064..48fe98ab0729 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1468,6 +1468,26 @@  long populate_vma_page_range(struct vm_area_struct *vma,
 				NULL, NULL, locked);
 }
 
+/*
+ * mm_populate_vma() - Populate a single range in a single vma.
+ * @vma: The vma to populate.
+ * @start: The start address to populate
+ * @end: The end address to stop populating
+ *
+ * Note: Ignores errors.
+ */
+void mm_populate_vma(struct vm_area_struct *vma, unsigned long start,
+		unsigned long end)
+{
+	struct mm_struct *mm = current->mm;
+	int locked = 1;
+
+	mmap_read_lock(mm);
+	populate_vma_page_range(vma, start, end, &locked);
+	if (locked)
+		mmap_read_unlock(mm);
+}
+
 /*
  * __mm_populate - populate and/or mlock pages within a range of address space.
  *
diff --git a/mm/internal.h b/mm/internal.h
index 7ad55938d391..583f2f1e6ff8 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -346,6 +346,10 @@  static inline bool is_data_mapping(vm_flags_t flags)
 	return (flags & (VM_WRITE | VM_SHARED | VM_STACK)) == VM_WRITE;
 }
 
+/* mm/gup.c */
+extern void mm_populate_vma(struct vm_area_struct *vma, unsigned long start,
+			    unsigned long end);
+
 /* Maple tree operations using VMAs */
 /*
  * vma_mas_store() - Store a VMA in the maple tree.