linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] vmalloc: Free pages as a batch
@ 2020-09-21 22:46 Matthew Wilcox (Oracle)
  2020-09-21 22:46 ` [PATCH 2/2] vfree: Update documentation Matthew Wilcox (Oracle)
  2020-09-23  8:35 ` [PATCH 1/2] vmalloc: Free pages as a batch Michal Hocko
  0 siblings, 2 replies; 9+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-09-21 22:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, linux-kernel, hch, Uladzislau Rezki

Use release_pages() to free the pages allocated by vmalloc().  This is
slightly more efficient in terms of disabling and enabling IRQs once
per batch instead of once per page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/vmalloc.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index be4724b916b3..3893fc8915c4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2263,16 +2263,8 @@ static void __vunmap(const void *addr, int deallocate_pages)
 	vm_remove_mappings(area, deallocate_pages);
 
 	if (deallocate_pages) {
-		int i;
-
-		for (i = 0; i < area->nr_pages; i++) {
-			struct page *page = area->pages[i];
-
-			BUG_ON(!page);
-			__free_pages(page, 0);
-		}
+		release_pages(area->pages, area->nr_pages);
 		atomic_long_sub(area->nr_pages, &nr_vmalloc_pages);
-
 		kvfree(area->pages);
 	}
 
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] vfree: Update documentation
  2020-09-21 22:46 [PATCH 1/2] vmalloc: Free pages as a batch Matthew Wilcox (Oracle)
@ 2020-09-21 22:46 ` Matthew Wilcox (Oracle)
  2020-09-22 14:35   ` Christoph Hellwig
  2020-09-23  8:35 ` [PATCH 1/2] vmalloc: Free pages as a batch Michal Hocko
  1 sibling, 1 reply; 9+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-09-21 22:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, linux-kernel, hch, Uladzislau Rezki

 * Document that you can call vfree() on an address returned from vmap()
 * Remove the note about the minimum size -- the minimum size of a vmalloc
   allocation is one page
 * Add a Context: section
 * Fix capitalisation
 * Reword the prohibition on calling from NMI context to avoid a double
   negative

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/vmalloc.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 3893fc8915c4..942a44bdeec6 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2313,20 +2313,20 @@ static void __vfree(const void *addr)
 }
 
 /**
- * vfree - release memory allocated by vmalloc()
- * @addr:  memory base address
+ * vfree - Release memory allocated by vmalloc()
+ * @addr:  Memory base address
  *
  * Free the virtually continuous memory area starting at @addr, as
- * obtained from vmalloc(), vmalloc_32() or __vmalloc(). If @addr is
- * NULL, no operation is performed.
- *
- * Must not be called in NMI context (strictly speaking, only if we don't
- * have CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG, but making the calling
- * conventions for vfree() arch-depenedent would be a really bad idea)
+ * obtained from vmalloc(), vmalloc_32() or __vmalloc().  If called
+ * on an @addr obtained from vmap(), it will put one refcount on each
+ * mapped page, which will free the page if this is the last refcount
+ * on the page.  If @addr is NULL, no operation is performed.
  *
+ * Context:
  * May sleep if called *not* from interrupt context.
- *
- * NOTE: assumes that the object at @addr has a size >= sizeof(llist_node)
+ * Must not be called in NMI context (strictly speaking, it could be
+ * if we have CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG, but making the calling
+ * conventions for vfree() arch-depenedent would be a really bad idea).
  */
 void vfree(const void *addr)
 {
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] vfree: Update documentation
  2020-09-21 22:46 ` [PATCH 2/2] vfree: Update documentation Matthew Wilcox (Oracle)
@ 2020-09-22 14:35   ` Christoph Hellwig
  2020-09-22 15:06     ` Matthew Wilcox
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2020-09-22 14:35 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Andrew Morton, linux-mm, linux-kernel, hch, Uladzislau Rezki

On Mon, Sep 21, 2020 at 11:46:28PM +0100, Matthew Wilcox (Oracle) wrote:
>  * Document that you can call vfree() on an address returned from vmap()
>  * Remove the note about the minimum size -- the minimum size of a vmalloc
>    allocation is one page
>  * Add a Context: section
>  * Fix capitalisation
>  * Reword the prohibition on calling from NMI context to avoid a double
>    negative
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/vmalloc.c | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 3893fc8915c4..942a44bdeec6 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2313,20 +2313,20 @@ static void __vfree(const void *addr)
>  }
>  
>  /**
> - * vfree - release memory allocated by vmalloc()
> - * @addr:  memory base address
> + * vfree - Release memory allocated by vmalloc()
> + * @addr:  Memory base address
>   *
>   * Free the virtually continuous memory area starting at @addr, as
> + * obtained from vmalloc(), vmalloc_32() or __vmalloc().  If called
> + * on an @addr obtained from vmap(), it will put one refcount on each
> + * mapped page, which will free the page if this is the last refcount
> + * on the page.  If @addr is NULL, no operation is performed.

This reads a little confusing.  First it only allows vmalloc* and
then it mentions vmap in the next sentence.  And what about
vmalloc_32_user, vzalloc_node, vmalloc_node, vmalloc_user, vzalloc and
__vmalloc_node?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] vfree: Update documentation
  2020-09-22 14:35   ` Christoph Hellwig
@ 2020-09-22 15:06     ` Matthew Wilcox
  2020-09-22 15:09       ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2020-09-22 15:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Andrew Morton, linux-mm, linux-kernel, Uladzislau Rezki

On Tue, Sep 22, 2020 at 04:35:06PM +0200, Christoph Hellwig wrote:
> >  /**
> > - * vfree - release memory allocated by vmalloc()
> > - * @addr:  memory base address
> > + * vfree - Release memory allocated by vmalloc()
> > + * @addr:  Memory base address
> >   *
> >   * Free the virtually continuous memory area starting at @addr, as
> > + * obtained from vmalloc(), vmalloc_32() or __vmalloc().  If called
> > + * on an @addr obtained from vmap(), it will put one refcount on each
> > + * mapped page, which will free the page if this is the last refcount
> > + * on the page.  If @addr is NULL, no operation is performed.
> 
> This reads a little confusing.  First it only allows vmalloc* and
> then it mentions vmap in the next sentence.  And what about
> vmalloc_32_user, vzalloc_node, vmalloc_node, vmalloc_user, vzalloc and
> __vmalloc_node?

In my defence, I didn't write that sentence; you snipped:

- * obtained from vmalloc(), vmalloc_32() or __vmalloc(). If @addr is
- * NULL, no operation is performed.

I don't think it makes sense to list all vmalloc-style allocators here.
It won't be updated by people who add new variations.  How about this?

 * Free the virtually continuous memory area starting at @addr, as
 * obtained from one of the vmalloc() family of APIs.  This will
 * usually also free the physical memory underlying the virtual
 * allocation, but that memory is reference counted, so it will not
 * be freed until the last user goes away.
 *
 * If @addr is NULL, no operation is performed.

I'm trying to strike a balance between being accurate and not requiring
device driver authors to learn all about struct page.  I may be too
close to the implementation to write good documentation for it.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] vfree: Update documentation
  2020-09-22 15:06     ` Matthew Wilcox
@ 2020-09-22 15:09       ` Christoph Hellwig
  2020-09-22 15:22         ` Matthew Wilcox
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2020-09-22 15:09 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, Andrew Morton, linux-mm, linux-kernel,
	Uladzislau Rezki

On Tue, Sep 22, 2020 at 04:06:03PM +0100, Matthew Wilcox wrote:
> I don't think it makes sense to list all vmalloc-style allocators here.
> It won't be updated by people who add new variations.  How about this?
> 
>  * Free the virtually continuous memory area starting at @addr, as
>  * obtained from one of the vmalloc() family of APIs.  This will
>  * usually also free the physical memory underlying the virtual
>  * allocation, but that memory is reference counted, so it will not
>  * be freed until the last user goes away.
>  *
>  * If @addr is NULL, no operation is performed.
> 
> I'm trying to strike a balance between being accurate and not requiring
> device driver authors to learn all about struct page.  I may be too
> close to the implementation to write good documentation for it.

I think the above is sensible, but not enough.  vmap really needs to
be treated special, as by default area->pages for vmap is NULL.  So
for vfree to be useful on a vmap mapping, the callers needs to
manually set it up by poking into the internals.  Actually, I think
we really want another API rather than vmap for that.  Let me respin
my series to include that.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] vfree: Update documentation
  2020-09-22 15:09       ` Christoph Hellwig
@ 2020-09-22 15:22         ` Matthew Wilcox
  2020-09-22 15:31           ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2020-09-22 15:22 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Andrew Morton, linux-mm, linux-kernel, Uladzislau Rezki

On Tue, Sep 22, 2020 at 05:09:10PM +0200, Christoph Hellwig wrote:
> On Tue, Sep 22, 2020 at 04:06:03PM +0100, Matthew Wilcox wrote:
> > I don't think it makes sense to list all vmalloc-style allocators here.
> > It won't be updated by people who add new variations.  How about this?
> > 
> >  * Free the virtually continuous memory area starting at @addr, as
> >  * obtained from one of the vmalloc() family of APIs.  This will
> >  * usually also free the physical memory underlying the virtual
> >  * allocation, but that memory is reference counted, so it will not
> >  * be freed until the last user goes away.
> >  *
> >  * If @addr is NULL, no operation is performed.
> > 
> > I'm trying to strike a balance between being accurate and not requiring
> > device driver authors to learn all about struct page.  I may be too
> > close to the implementation to write good documentation for it.
> 
> I think the above is sensible, but not enough.  vmap really needs to
> be treated special, as by default area->pages for vmap is NULL.  So
> for vfree to be useful on a vmap mapping, the callers needs to
> manually set it up by poking into the internals.  Actually, I think
> we really want another API rather than vmap for that.  Let me respin
> my series to include that.

I've been thinking about somethng like:

void *vmap_mapping(struct address_space *mapping, pgoff_t start,
		unsigned long len);

but it doesn't quite work for the shmem cases because they need to use
shmem_read_mapping_page_gfp() instead of ->readpage.  I'd also want it
to work for DAX, but I don't have a user for that yet so it's hard to
justify adding it.

I have ideas for making shmem/tmpfs work more like other filesystems,
so we don't need shmem_read_mapping_page_gfp() but there are only so
many hours in the day.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] vfree: Update documentation
  2020-09-22 15:22         ` Matthew Wilcox
@ 2020-09-22 15:31           ` Christoph Hellwig
  2020-09-22 16:42             ` Matthew Wilcox
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2020-09-22 15:31 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, Andrew Morton, linux-mm, linux-kernel,
	Uladzislau Rezki

On Tue, Sep 22, 2020 at 04:22:40PM +0100, Matthew Wilcox wrote:
> On Tue, Sep 22, 2020 at 05:09:10PM +0200, Christoph Hellwig wrote:
> > On Tue, Sep 22, 2020 at 04:06:03PM +0100, Matthew Wilcox wrote:
> > > I don't think it makes sense to list all vmalloc-style allocators here.
> > > It won't be updated by people who add new variations.  How about this?
> > > 
> > >  * Free the virtually continuous memory area starting at @addr, as
> > >  * obtained from one of the vmalloc() family of APIs.  This will
> > >  * usually also free the physical memory underlying the virtual
> > >  * allocation, but that memory is reference counted, so it will not
> > >  * be freed until the last user goes away.
> > >  *
> > >  * If @addr is NULL, no operation is performed.
> > > 
> > > I'm trying to strike a balance between being accurate and not requiring
> > > device driver authors to learn all about struct page.  I may be too
> > > close to the implementation to write good documentation for it.
> > 
> > I think the above is sensible, but not enough.  vmap really needs to
> > be treated special, as by default area->pages for vmap is NULL.  So
> > for vfree to be useful on a vmap mapping, the callers needs to
> > manually set it up by poking into the internals.  Actually, I think
> > we really want another API rather than vmap for that.  Let me respin
> > my series to include that.
> 
> I've been thinking about somethng like:
> 
> void *vmap_mapping(struct address_space *mapping, pgoff_t start,
> 		unsigned long len);
> 
> but it doesn't quite work for the shmem cases because they need to use
> shmem_read_mapping_page_gfp() instead of ->readpage.  I'd also want it
> to work for DAX, but I don't have a user for that yet so it's hard to
> justify adding it.

That seems a little too special cased to me.

FYI, if you are fine with it I'll pick your two patches with the
updates from this thread up for my series.  It has be now morphed into
a general vmalloc.c cleanup series.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] vfree: Update documentation
  2020-09-22 15:31           ` Christoph Hellwig
@ 2020-09-22 16:42             ` Matthew Wilcox
  0 siblings, 0 replies; 9+ messages in thread
From: Matthew Wilcox @ 2020-09-22 16:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Andrew Morton, linux-mm, linux-kernel, Uladzislau Rezki

On Tue, Sep 22, 2020 at 05:31:36PM +0200, Christoph Hellwig wrote:
> On Tue, Sep 22, 2020 at 04:22:40PM +0100, Matthew Wilcox wrote:
> > On Tue, Sep 22, 2020 at 05:09:10PM +0200, Christoph Hellwig wrote:
> > > On Tue, Sep 22, 2020 at 04:06:03PM +0100, Matthew Wilcox wrote:
> > > > I don't think it makes sense to list all vmalloc-style allocators here.
> > > > It won't be updated by people who add new variations.  How about this?
> > > > 
> > > >  * Free the virtually continuous memory area starting at @addr, as
> > > >  * obtained from one of the vmalloc() family of APIs.  This will
> > > >  * usually also free the physical memory underlying the virtual
> > > >  * allocation, but that memory is reference counted, so it will not
> > > >  * be freed until the last user goes away.
> > > >  *
> > > >  * If @addr is NULL, no operation is performed.
> > > > 
> > > > I'm trying to strike a balance between being accurate and not requiring
> > > > device driver authors to learn all about struct page.  I may be too
> > > > close to the implementation to write good documentation for it.
> > > 
> > > I think the above is sensible, but not enough.  vmap really needs to
> > > be treated special, as by default area->pages for vmap is NULL.  So
> > > for vfree to be useful on a vmap mapping, the callers needs to
> > > manually set it up by poking into the internals.  Actually, I think
> > > we really want another API rather than vmap for that.  Let me respin
> > > my series to include that.
> > 
> > I've been thinking about somethng like:
> > 
> > void *vmap_mapping(struct address_space *mapping, pgoff_t start,
> > 		unsigned long len);
> > 
> > but it doesn't quite work for the shmem cases because they need to use
> > shmem_read_mapping_page_gfp() instead of ->readpage.  I'd also want it
> > to work for DAX, but I don't have a user for that yet so it's hard to
> > justify adding it.
> 
> That seems a little too special cased to me.

It's for filesystems that currently use map_bh or kmap_atomic() to
get at their metadata (directories, superblocks, etc).

> FYI, if you are fine with it I'll pick your two patches with the
> updates from this thread up for my series.  It has be now morphed into
> a general vmalloc.c cleanup series.

Yes, that's absolutely fine.  Thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] vmalloc: Free pages as a batch
  2020-09-21 22:46 [PATCH 1/2] vmalloc: Free pages as a batch Matthew Wilcox (Oracle)
  2020-09-21 22:46 ` [PATCH 2/2] vfree: Update documentation Matthew Wilcox (Oracle)
@ 2020-09-23  8:35 ` Michal Hocko
  1 sibling, 0 replies; 9+ messages in thread
From: Michal Hocko @ 2020-09-23  8:35 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Andrew Morton, linux-mm, linux-kernel, hch, Uladzislau Rezki

On Mon 21-09-20 23:46:27, Matthew Wilcox wrote:
> Use release_pages() to free the pages allocated by vmalloc().  This is
> slightly more efficient in terms of disabling and enabling IRQs once
> per batch instead of once per page.

Hmm, does this really lead to runtime improvements? Batching IRQ is
certainly nice but release_pages is much more heavy weight and all
additional checks are simply always false for vmalloc pages so all those
checks are pointless.

Maybe storing those pages into the linked list and use
free_unref_page_list instead would achieve what you want?

> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/vmalloc.c | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index be4724b916b3..3893fc8915c4 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2263,16 +2263,8 @@ static void __vunmap(const void *addr, int deallocate_pages)
>  	vm_remove_mappings(area, deallocate_pages);
>  
>  	if (deallocate_pages) {
> -		int i;
> -
> -		for (i = 0; i < area->nr_pages; i++) {
> -			struct page *page = area->pages[i];
> -
> -			BUG_ON(!page);
> -			__free_pages(page, 0);
> -		}
> +		release_pages(area->pages, area->nr_pages);
>  		atomic_long_sub(area->nr_pages, &nr_vmalloc_pages);
> -
>  		kvfree(area->pages);
>  	}
>  
> -- 
> 2.28.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-09-23  8:35 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-21 22:46 [PATCH 1/2] vmalloc: Free pages as a batch Matthew Wilcox (Oracle)
2020-09-21 22:46 ` [PATCH 2/2] vfree: Update documentation Matthew Wilcox (Oracle)
2020-09-22 14:35   ` Christoph Hellwig
2020-09-22 15:06     ` Matthew Wilcox
2020-09-22 15:09       ` Christoph Hellwig
2020-09-22 15:22         ` Matthew Wilcox
2020-09-22 15:31           ` Christoph Hellwig
2020-09-22 16:42             ` Matthew Wilcox
2020-09-23  8:35 ` [PATCH 1/2] vmalloc: Free pages as a batch Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).