linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] page_alloc: Fix freeing non-compound pages
@ 2020-09-26 21:39 Matthew Wilcox (Oracle)
  2020-09-29  1:03 ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-09-26 21:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle),
	Nick Piggin, Hugh Dickins, Peter Zijlstra, linux-mm,
	linux-kernel

Here is a very rare race which leaks memory:

Page P0 is allocated to the page cache.  Page P1 is free.

Thread A                Thread B                Thread C
find_get_entry():
xas_load() returns P0
						Removes P0 from page cache
						P0 finds its buddy P1
			alloc_pages(GFP_KERNEL, 1) returns P0
			P0 has refcount 1
page_cache_get_speculative(P0)
P0 has refcount 2
			__free_pages(P0)
			P0 has refcount 1
put_page(P0)
P1 is not freed

Fix this by freeing all the pages in __free_pages() that won't be freed
by the call to put_page().  It's usually not a good idea to split a page,
but this is a very unlikely scenario.

Fixes: e286781d5f2e ("mm: speculative page references")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
v2: Add a test module.  Verified it works by:
 (1) loading the test module on an unmodified kernel and watching it OOM
 (2) modifying __free_pages() with my v1 patch that neglected to add the
     PageHead test and hitting VM_BUG_ON_PAGE(PageTail(page))
 (3) applying all of this patch and seeing it survive

 lib/Kconfig.debug     |  9 +++++++++
 lib/Makefile          |  1 +
 lib/test_free_pages.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 mm/page_alloc.c       |  3 +++
 4 files changed, 55 insertions(+)
 create mode 100644 lib/test_free_pages.c

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e068c3c7189a..eed59af0e907 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2369,6 +2369,15 @@ config TEST_HMM
 
 	  If unsure, say N.
 
+config TEST_FREE_PAGES
+	tristate "Test freeing pages"
+	help
+	  Test that a memory leak does not occur due to a race between
+	  freeing a block of pages and a speculative page reference.
+	  Loading this module is safe if your kernel has the bug fixed.
+	  If the bug is not fixed, it will leak gigabytes of memory and
+	  probably OOM your system.
+
 config TEST_FPU
 	tristate "Test floating point operations in kernel space"
 	depends on X86 && !KCOV_INSTRUMENT_ALL
diff --git a/lib/Makefile b/lib/Makefile
index a4a4c6864f51..071b687b7363 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -99,6 +99,7 @@ obj-$(CONFIG_TEST_BLACKHOLE_DEV) += test_blackhole_dev.o
 obj-$(CONFIG_TEST_MEMINIT) += test_meminit.o
 obj-$(CONFIG_TEST_LOCKUP) += test_lockup.o
 obj-$(CONFIG_TEST_HMM) += test_hmm.o
+obj-$(CONFIG_TEST_FREE_PAGES) += test_free_pages.o
 
 #
 # CFLAGS for compiling floating point code inside the kernel. x86/Makefile turns
diff --git a/lib/test_free_pages.c b/lib/test_free_pages.c
new file mode 100644
index 000000000000..074e76bd76b2
--- /dev/null
+++ b/lib/test_free_pages.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * test_free_pages.c: Check that free_pages() doesn't leak memory
+ * Copyright (c) 2020 Oracle
+ * Author: Matthew Wilcox <willy@infradead.org>
+ */
+
+#include <linux/gfp.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+
+static void test_free_pages(gfp_t gfp)
+{
+	unsigned int i;
+
+	for (i = 0; i < 1000 * 1000; i++) {
+		unsigned long addr = __get_free_pages(gfp, 3);
+		struct page *page = virt_to_page(addr);
+
+		/* Simulate page cache getting a speculative reference */
+		get_page(page);
+		free_pages(addr, 3);
+		put_page(page);
+	}
+}
+
+static int m_in(void)
+{
+	test_free_pages(GFP_KERNEL);
+	test_free_pages(GFP_KERNEL | __GFP_COMP);
+
+	return 0;
+}
+
+static void m_ex(void)
+{
+}
+
+module_init(m_in);
+module_exit(m_ex);
+MODULE_AUTHOR("Matthew Wilcox <willy@infradead.org>");
+MODULE_LICENSE("GPL");
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fab5e97dc9ca..9b259c76e285 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4947,6 +4947,9 @@ void __free_pages(struct page *page, unsigned int order)
 {
 	if (put_page_testzero(page))
 		free_the_page(page, order);
+	else if (!PageHead(page))
+		while (order-- > 0)
+			free_the_page(page + (1 << order), order);
 }
 EXPORT_SYMBOL(__free_pages);
 
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] page_alloc: Fix freeing non-compound pages
  2020-09-26 21:39 [PATCH v2] page_alloc: Fix freeing non-compound pages Matthew Wilcox (Oracle)
@ 2020-09-29  1:03 ` Andrew Morton
  2020-09-29  1:17   ` Matthew Wilcox
  2020-09-29  3:40   ` Matthew Wilcox
  0 siblings, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2020-09-29  1:03 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Nick Piggin, Hugh Dickins, Peter Zijlstra, linux-mm, linux-kernel

On Sat, 26 Sep 2020 22:39:19 +0100 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote:

> Here is a very rare race which leaks memory:

Not worth a cc:stable?

> Page P0 is allocated to the page cache.  Page P1 is free.
> 
> Thread A                Thread B                Thread C
> find_get_entry():
> xas_load() returns P0
> 						Removes P0 from page cache
> 						P0 finds its buddy P1
> 			alloc_pages(GFP_KERNEL, 1) returns P0
> 			P0 has refcount 1
> page_cache_get_speculative(P0)
> P0 has refcount 2
> 			__free_pages(P0)

			__free_pages(P0, 1), I assume.

> 			P0 has refcount 1
> put_page(P0)

but this is implicitly order 0

> P1 is not freed

huh.

> Fix this by freeing all the pages in __free_pages() that won't be freed
> by the call to put_page().  It's usually not a good idea to split a page,
> but this is a very unlikely scenario.
> 
> ...
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4947,6 +4947,9 @@ void __free_pages(struct page *page, unsigned int order)
>  {
>  	if (put_page_testzero(page))
>  		free_the_page(page, order);
> +	else if (!PageHead(page))
> +		while (order-- > 0)
> +			free_the_page(page + (1 << order), order);

Well that's weird and scary looking.  `page' has non-zero refcount yet
we go and free random followon pages.  Methinks it merits an
explanatory comment?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] page_alloc: Fix freeing non-compound pages
  2020-09-29  1:03 ` Andrew Morton
@ 2020-09-29  1:17   ` Matthew Wilcox
  2020-09-29  4:46     ` Andrew Morton
  2020-09-29  3:40   ` Matthew Wilcox
  1 sibling, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2020-09-29  1:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Nick Piggin, Hugh Dickins, Peter Zijlstra, linux-mm, linux-kernel

On Mon, Sep 28, 2020 at 06:03:07PM -0700, Andrew Morton wrote:
> On Sat, 26 Sep 2020 22:39:19 +0100 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote:
> 
> > Here is a very rare race which leaks memory:
> 
> Not worth a cc:stable?

Yes, it probably should have been.  I just assume the stablebot will
pick up anything that has a Fixes: tag.

> > Page P0 is allocated to the page cache.  Page P1 is free.
> > 
> > Thread A                Thread B                Thread C
> > find_get_entry():
> > xas_load() returns P0
> > 						Removes P0 from page cache
> > 						P0 finds its buddy P1
> > 			alloc_pages(GFP_KERNEL, 1) returns P0
> > 			P0 has refcount 1
> > page_cache_get_speculative(P0)
> > P0 has refcount 2
> > 			__free_pages(P0)
> 
> 			__free_pages(P0, 1), I assume.

Good catch.  That was what I meant to type.

> > 			P0 has refcount 1
> > put_page(P0)
> 
> but this is implicitly order 0

Right, because it's not a compound page.

> > P1 is not freed
> 
> huh.

Yeah.  Nasty, and we'll never know how often it was hit.

> > Fix this by freeing all the pages in __free_pages() that won't be freed
> > by the call to put_page().  It's usually not a good idea to split a page,
> > but this is a very unlikely scenario.
> > 
> > ...
> >
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -4947,6 +4947,9 @@ void __free_pages(struct page *page, unsigned int order)
> >  {
> >  	if (put_page_testzero(page))
> >  		free_the_page(page, order);
> > +	else if (!PageHead(page))
> > +		while (order-- > 0)
> > +			free_the_page(page + (1 << order), order);
> 
> Well that's weird and scary looking.  `page' has non-zero refcount yet
> we go and free random followon pages.  Methinks it merits an
> explanatory comment?

Well, poot.  I lost that comment in the shuffling of patches.  In a
different tree, I have:

@@ -4943,10 +4943,19 @@ static inline void free_the_page(struct page *page, unsi
gned int order)
                __free_pages_ok(page, order);
 }
 
+/*
+ * If we free a non-compound allocation, another thread may have a
+ * speculative reference to the first page.  It has no way of knowing
+ * about the rest of the allocation, so we have to free all but the
+ * first page here.
+ */
 void __free_pages(struct page *page, unsigned int order)
 {
        if (put_page_testzero(page))
                free_the_page(page, order);
+       else if (!PageHead(page))
+               while (order-- > 0)
+                       free_the_page(page + (1 << order), order);
 }
 EXPORT_SYMBOL(__free_pages);
 

Although I'm now thinking of making that comment into kernel-doc and
turning it into advice to the caller rather than an internal note to
other mm developers.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] page_alloc: Fix freeing non-compound pages
  2020-09-29  1:03 ` Andrew Morton
  2020-09-29  1:17   ` Matthew Wilcox
@ 2020-09-29  3:40   ` Matthew Wilcox
  2020-09-29  7:26     ` Mike Rapoport
  1 sibling, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2020-09-29  3:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Nick Piggin, Hugh Dickins, Peter Zijlstra, linux-mm,
	linux-kernel, hch, rppt, rdunlap

On Mon, Sep 28, 2020 at 06:03:07PM -0700, Andrew Morton wrote:
> Well that's weird and scary looking.  `page' has non-zero refcount yet
> we go and free random followon pages.  Methinks it merits an
> explanatory comment?

Here's some kernel-doc.  Opinions?

/**
 * __free_pages - Free pages allocated with alloc_pages().
 * @page: The page pointer returned from alloc_pages().
 * @order: The order of the allocation.
 *
 * This function differs from put_page() in that it can free multi-page
 * allocations that were not allocated with %__GFP_COMP.  This function
 * does not check that the @order passed in matches that of the
 * allocation, so it is possible to leak memory.  Freeing more memory than
 * was allocated will probably be warned about by other debugging checks.
 *
 * It is only safe to use the page reference count to determine when
 * to free an allocation if you use %__GFP_COMP (in which case, you may
 * as well use put_page() to free the page).  Another thread may have a
 * speculative reference to the first page, but it has no way of knowing
 * about the rest of the allocation, so we have to free all but the
 * first page here.
 *
 * Context: May be called in interrupt context but not NMI context.
 */


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] page_alloc: Fix freeing non-compound pages
  2020-09-29  1:17   ` Matthew Wilcox
@ 2020-09-29  4:46     ` Andrew Morton
  2020-09-29 22:53       ` Matthew Wilcox
  2020-10-19  1:00       ` Nicholas Piggin
  0 siblings, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2020-09-29  4:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Nick Piggin, Hugh Dickins, Peter Zijlstra, linux-mm, linux-kernel

On Tue, 29 Sep 2020 02:17:19 +0100 Matthew Wilcox <willy@infradead.org> wrote:

> On Mon, Sep 28, 2020 at 06:03:07PM -0700, Andrew Morton wrote:
> > On Sat, 26 Sep 2020 22:39:19 +0100 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote:
> > 
> > > Here is a very rare race which leaks memory:
> > 
> > Not worth a cc:stable?
> 
> Yes, it probably should have been.

Have you a feeling for how often this occurs?

>  I just assume the stablebot will
> pick up anything that has a Fixes: tag.

We asked them not to do that for mm/ patches.  Crazy stuff was getting
backported.

> > >
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -4947,6 +4947,9 @@ void __free_pages(struct page *page, unsigned int order)
> > >  {
> > >  	if (put_page_testzero(page))
> > >  		free_the_page(page, order);
> > > +	else if (!PageHead(page))
> > > +		while (order-- > 0)
> > > +			free_the_page(page + (1 << order), order);
> > 
> > Well that's weird and scary looking.  `page' has non-zero refcount yet
> > we go and free random followon pages.  Methinks it merits an
> > explanatory comment?
> 
> Well, poot.  I lost that comment in the shuffling of patches.  In a
> different tree, I have:
> 
> @@ -4943,10 +4943,19 @@ static inline void free_the_page(struct page *page, unsi
> gned int order)
>                 __free_pages_ok(page, order);
>  }
>  
> +/*
> + * If we free a non-compound allocation, another thread may have a

"non-compound, higher-order", I suggest?

> + * speculative reference to the first page.  It has no way of knowing
> + * about the rest of the allocation, so we have to free all but the
> + * first page here.
> + */
>  void __free_pages(struct page *page, unsigned int order)
>  {
>         if (put_page_testzero(page))
>                 free_the_page(page, order);
> +       else if (!PageHead(page))
> +               while (order-- > 0)
> +                       free_the_page(page + (1 << order), order);
>  }
>  EXPORT_SYMBOL(__free_pages);
>  
> 
> Although I'm now thinking of making that comment into kernel-doc and
> turning it into advice to the caller rather than an internal note to
> other mm developers.

hm.  But what action could the caller take?  The explanatory comment
seems OK to me.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] page_alloc: Fix freeing non-compound pages
  2020-09-29  3:40   ` Matthew Wilcox
@ 2020-09-29  7:26     ` Mike Rapoport
  2020-09-29 14:06       ` Matthew Wilcox
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Rapoport @ 2020-09-29  7:26 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andrew Morton, Nick Piggin, Hugh Dickins, Peter Zijlstra,
	linux-mm, linux-kernel, hch, rdunlap

On Tue, Sep 29, 2020 at 04:40:26AM +0100, Matthew Wilcox wrote:
> On Mon, Sep 28, 2020 at 06:03:07PM -0700, Andrew Morton wrote:
> > Well that's weird and scary looking.  `page' has non-zero refcount yet
> > we go and free random followon pages.  Methinks it merits an
> > explanatory comment?
> 
> Here's some kernel-doc.  Opinions?
> 
> /**
>  * __free_pages - Free pages allocated with alloc_pages().
>  * @page: The page pointer returned from alloc_pages().
>  * @order: The order of the allocation.
>  *
>  * This function differs from put_page() in that it can free multi-page

This sentence presumes existing description/prior knowledge about
put_page().

Maybe

  This function can free multi-page allocations that were not allocated
  with %__GFP_COMP, unlike put_page() that would free only the first page
  in such case. __free_pages() does not ...

>  * allocations that were not allocated with %__GFP_COMP.  This function
>  * does not check that the @order passed in matches that of the
>  * allocation, so it is possible to leak memory.  Freeing more memory than
>  * was allocated will probably be warned about by other debugging checks.
>  *
>  * It is only safe to use the page reference count to determine when
>  * to free an allocation if you use %__GFP_COMP (in which case, you may
>  * as well use put_page() to free the page).  Another thread may have a
>  * speculative reference to the first page, but it has no way of knowing
>  * about the rest of the allocation, so we have to free all but the
>  * first page here.
>  *
>  * Context: May be called in interrupt context but not NMI context.
>  */
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] page_alloc: Fix freeing non-compound pages
  2020-09-29  7:26     ` Mike Rapoport
@ 2020-09-29 14:06       ` Matthew Wilcox
  2020-09-30  9:17         ` Mike Rapoport
  0 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2020-09-29 14:06 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, Nick Piggin, Hugh Dickins, Peter Zijlstra,
	linux-mm, linux-kernel, hch, rdunlap

On Tue, Sep 29, 2020 at 10:26:22AM +0300, Mike Rapoport wrote:
> This sentence presumes existing description/prior knowledge about
> put_page().
> 
> Maybe
> 
>   This function can free multi-page allocations that were not allocated
>   with %__GFP_COMP, unlike put_page() that would free only the first page
>   in such case. __free_pages() does not ...

Thanks.  After waking up this morning I did a more extensive rewrite:

/**
 * __free_pages - Free pages allocated with alloc_pages().
 * @page: The page pointer returned from alloc_pages().
 * @order: The order of the allocation.
 *
 * This function can free multi-page allocations that are not compound
 * pages.  It does not check that the @order passed in matches that of
 * the allocation, so it is easy to leak memory.  Freeing more memory
 * than was allocated will probably emit a warning.
 *
 * If the last reference to this page is speculative, it will be released
 * by put_page() which only frees the first page of a non-compound
 * allocation.  To prevent the remaining pages from being leaked, we free
 * the subsequent pages here.  If you want to use the page's reference
 * count to decide when to free the allocation, you should allocate a
 * compound page, and use put_page() instead of __free_pages().
 *
 * Context: May be called in interrupt context or holding a normal
 * spinlock, but not in NMI context or while holding a raw spinlock.
 */


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] page_alloc: Fix freeing non-compound pages
  2020-09-29  4:46     ` Andrew Morton
@ 2020-09-29 22:53       ` Matthew Wilcox
  2020-10-19  1:00       ` Nicholas Piggin
  1 sibling, 0 replies; 10+ messages in thread
From: Matthew Wilcox @ 2020-09-29 22:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Nick Piggin, Hugh Dickins, Peter Zijlstra, linux-mm, linux-kernel

On Mon, Sep 28, 2020 at 09:46:56PM -0700, Andrew Morton wrote:
> On Tue, 29 Sep 2020 02:17:19 +0100 Matthew Wilcox <willy@infradead.org> wrote:
> 
> > On Mon, Sep 28, 2020 at 06:03:07PM -0700, Andrew Morton wrote:
> > > On Sat, 26 Sep 2020 22:39:19 +0100 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote:
> > > 
> > > > Here is a very rare race which leaks memory:
> > > 
> > > Not worth a cc:stable?
> > 
> > Yes, it probably should have been.
> 
> Have you a feeling for how often this occurs?

I doubt it happens often.  I don't think I could construct a workload to
make it happen frequently.  Maybe more often with a virtualised workload
where a thread can be preempted between instructions.

> >  I just assume the stablebot will
> > pick up anything that has a Fixes: tag.
> 
> We asked them not to do that for mm/ patches.  Crazy stuff was getting
> backported.

That's a shame.  I'll try to remember to cc them explicitly in the future.

> > Although I'm now thinking of making that comment into kernel-doc and
> > turning it into advice to the caller rather than an internal note to
> > other mm developers.
> 
> hm.  But what action could the caller take?  The explanatory comment
> seems OK to me.

Use compound pages instead of non-compound pages.  Although Linus has
asked that people stop using __get_free_pages(), so maybe that will be
the direction we go in.

https://lore.kernel.org/lkml/CA+55aFwyxJ+TOpaJZnC5MPJ-25xbLAEu8iJP8zTYhmA3LXFF8Q@mail.gmail.com/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] page_alloc: Fix freeing non-compound pages
  2020-09-29 14:06       ` Matthew Wilcox
@ 2020-09-30  9:17         ` Mike Rapoport
  0 siblings, 0 replies; 10+ messages in thread
From: Mike Rapoport @ 2020-09-30  9:17 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andrew Morton, Nick Piggin, Hugh Dickins, Peter Zijlstra,
	linux-mm, linux-kernel, hch, rdunlap

On Tue, Sep 29, 2020 at 03:06:22PM +0100, Matthew Wilcox wrote:
> On Tue, Sep 29, 2020 at 10:26:22AM +0300, Mike Rapoport wrote:
> > This sentence presumes existing description/prior knowledge about
> > put_page().
> > 
> > Maybe
> > 
> >   This function can free multi-page allocations that were not allocated
> >   with %__GFP_COMP, unlike put_page() that would free only the first page
> >   in such case. __free_pages() does not ...
> 
> Thanks.  After waking up this morning I did a more extensive rewrite:

I like this one

Acked-by: Mike Rapoport <rppt@linux.ibm.com>

> /**
>  * __free_pages - Free pages allocated with alloc_pages().
>  * @page: The page pointer returned from alloc_pages().
>  * @order: The order of the allocation.
>  *
>  * This function can free multi-page allocations that are not compound
>  * pages.  It does not check that the @order passed in matches that of
>  * the allocation, so it is easy to leak memory.  Freeing more memory
>  * than was allocated will probably emit a warning.
>  *
>  * If the last reference to this page is speculative, it will be released
>  * by put_page() which only frees the first page of a non-compound
>  * allocation.  To prevent the remaining pages from being leaked, we free
>  * the subsequent pages here.  If you want to use the page's reference
>  * count to decide when to free the allocation, you should allocate a
>  * compound page, and use put_page() instead of __free_pages().
>  *
>  * Context: May be called in interrupt context or holding a normal
>  * spinlock, but not in NMI context or while holding a raw spinlock.
>  */
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] page_alloc: Fix freeing non-compound pages
  2020-09-29  4:46     ` Andrew Morton
  2020-09-29 22:53       ` Matthew Wilcox
@ 2020-10-19  1:00       ` Nicholas Piggin
  1 sibling, 0 replies; 10+ messages in thread
From: Nicholas Piggin @ 2020-10-19  1:00 UTC (permalink / raw)
  To: Andrew Morton, Matthew Wilcox
  Cc: Hugh Dickins, linux-kernel, linux-mm, Peter
 Zijlstra

Excerpts from Andrew Morton's message of September 29, 2020 2:46 pm:
> On Tue, 29 Sep 2020 02:17:19 +0100 Matthew Wilcox <willy@infradead.org> wrote:
> 
>> On Mon, Sep 28, 2020 at 06:03:07PM -0700, Andrew Morton wrote:
>> > On Sat, 26 Sep 2020 22:39:19 +0100 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote:
>> > 
>> > > Here is a very rare race which leaks memory:

Great catch! [sorry, a bit behind with emails]

>> > 
>> > Not worth a cc:stable?
>> 
>> Yes, it probably should have been.
> 
> Have you a feeling for how often this occurs?
> 
>>  I just assume the stablebot will
>> pick up anything that has a Fixes: tag.
> 
> We asked them not to do that for mm/ patches.  Crazy stuff was getting
> backported.
> 
>> > >
>> > > --- a/mm/page_alloc.c
>> > > +++ b/mm/page_alloc.c
>> > > @@ -4947,6 +4947,9 @@ void __free_pages(struct page *page, unsigned int order)
>> > >  {
>> > >  	if (put_page_testzero(page))
>> > >  		free_the_page(page, order);
>> > > +	else if (!PageHead(page))
>> > > +		while (order-- > 0)
>> > > +			free_the_page(page + (1 << order), order);
>> > 
>> > Well that's weird and scary looking.  `page' has non-zero refcount yet
>> > we go and free random followon pages.  Methinks it merits an
>> > explanatory comment?
>> 
>> Well, poot.  I lost that comment in the shuffling of patches.  In a
>> different tree, I have:
>> 
>> @@ -4943,10 +4943,19 @@ static inline void free_the_page(struct page *page, unsi
>> gned int order)
>>                 __free_pages_ok(page, order);
>>  }
>>  
>> +/*
>> + * If we free a non-compound allocation, another thread may have a
> 
> "non-compound, higher-order", I suggest?
> 
>> + * speculative reference to the first page.  It has no way of knowing
>> + * about the rest of the allocation, so we have to free all but the
>> + * first page here.
>> + */
>>  void __free_pages(struct page *page, unsigned int order)
>>  {
>>         if (put_page_testzero(page))
>>                 free_the_page(page, order);
>> +       else if (!PageHead(page))
>> +               while (order-- > 0)
>> +                       free_the_page(page + (1 << order), order);
>>  }
>>  EXPORT_SYMBOL(__free_pages);
>>  
>> 
>> Although I'm now thinking of making that comment into kernel-doc and
>> turning it into advice to the caller rather than an internal note to
>> other mm developers.
> 
> hm.  But what action could the caller take?  The explanatory comment
> seems OK to me.

The version of this without the comment got merged. I didn't mind the
comment...

Thanks,
Nick

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-10-19  1:00 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-26 21:39 [PATCH v2] page_alloc: Fix freeing non-compound pages Matthew Wilcox (Oracle)
2020-09-29  1:03 ` Andrew Morton
2020-09-29  1:17   ` Matthew Wilcox
2020-09-29  4:46     ` Andrew Morton
2020-09-29 22:53       ` Matthew Wilcox
2020-10-19  1:00       ` Nicholas Piggin
2020-09-29  3:40   ` Matthew Wilcox
2020-09-29  7:26     ` Mike Rapoport
2020-09-29 14:06       ` Matthew Wilcox
2020-09-30  9:17         ` Mike Rapoport

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).