All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] mm: fix missing wake-up event for FSDAX pages
@ 2022-07-05 12:35 Muchun Song
  2022-07-05 21:18 ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Muchun Song @ 2022-07-05 12:35 UTC (permalink / raw)
  To: akpm, willy, jgg, jhubbard, william.kucharski, dan.j.williams, jack
  Cc: linux-mm, linux-kernel, linux-fsdevel, nvdimm, Muchun Song, stable

FSDAX page refcounts are 1-based, rather than 0-based: if refcount is
1, then the page is freed.  The FSDAX pages can be pinned through GUP,
then they will be unpinned via unpin_user_page() using a folio variant
to put the page, however, folio variants did not consider this special
case, the result will be to miss a wakeup event (like the user of
__fuse_dax_break_layouts()).  Since FSDAX pages are only possible get
by GUP users, so fix GUP instead of folio_put() to lower overhead.

Fixes: d8ddc099c6b3 ("mm/gup: Add gup_put_folio()")
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
---
v2:
 - Fix GUP instead of folio_put() suggested by Matthew.

 include/linux/mm.h | 14 +++++++++-----
 mm/gup.c           |  6 ++++--
 mm/memremap.c      |  6 +++---
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 517f9deba56f..b324c9fa2940 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1157,23 +1157,27 @@ static inline bool is_zone_movable_page(const struct page *page)
 #if defined(CONFIG_ZONE_DEVICE) && defined(CONFIG_FS_DAX)
 DECLARE_STATIC_KEY_FALSE(devmap_managed_key);
 
-bool __put_devmap_managed_page(struct page *page);
-static inline bool put_devmap_managed_page(struct page *page)
+bool __put_devmap_managed_page_refs(struct page *page, int refs);
+static inline bool put_devmap_managed_page_refs(struct page *page, int refs)
 {
 	if (!static_branch_unlikely(&devmap_managed_key))
 		return false;
 	if (!is_zone_device_page(page))
 		return false;
-	return __put_devmap_managed_page(page);
+	return __put_devmap_managed_page_refs(page, refs);
 }
-
 #else /* CONFIG_ZONE_DEVICE && CONFIG_FS_DAX */
-static inline bool put_devmap_managed_page(struct page *page)
+static inline bool put_devmap_managed_page_refs(struct page *page, int refs)
 {
 	return false;
 }
 #endif /* CONFIG_ZONE_DEVICE && CONFIG_FS_DAX */
 
+static inline bool put_devmap_managed_page(struct page *page)
+{
+	return put_devmap_managed_page_refs(page, 1);
+}
+
 /* 127: arbitrary random number, small enough to assemble well */
 #define folio_ref_zero_or_close_to_overflow(folio) \
 	((unsigned int) folio_ref_count(folio) + 127u <= 127u)
diff --git a/mm/gup.c b/mm/gup.c
index 4e1999402e71..965ba755023f 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -87,7 +87,8 @@ static inline struct folio *try_get_folio(struct page *page, int refs)
 	 * belongs to this folio.
 	 */
 	if (unlikely(page_folio(page) != folio)) {
-		folio_put_refs(folio, refs);
+		if (!put_devmap_managed_page_refs(&folio->page, refs))
+			folio_put_refs(folio, refs);
 		goto retry;
 	}
 
@@ -176,7 +177,8 @@ static void gup_put_folio(struct folio *folio, int refs, unsigned int flags)
 			refs *= GUP_PIN_COUNTING_BIAS;
 	}
 
-	folio_put_refs(folio, refs);
+	if (!put_devmap_managed_page_refs(&folio->page, refs))
+		folio_put_refs(folio, refs);
 }
 
 /**
diff --git a/mm/memremap.c b/mm/memremap.c
index f0955785150f..58b20c3c300b 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -509,7 +509,7 @@ void free_zone_device_page(struct page *page)
 }
 
 #ifdef CONFIG_FS_DAX
-bool __put_devmap_managed_page(struct page *page)
+bool __put_devmap_managed_page_refs(struct page *page, int refs)
 {
 	if (page->pgmap->type != MEMORY_DEVICE_FS_DAX)
 		return false;
@@ -519,9 +519,9 @@ bool __put_devmap_managed_page(struct page *page)
 	 * refcount is 1, then the page is free and the refcount is
 	 * stable because nobody holds a reference on the page.
 	 */
-	if (page_ref_dec_return(page) == 1)
+	if (page_ref_sub_return(page, refs) == 1)
 		wake_up_var(&page->_refcount);
 	return true;
 }
-EXPORT_SYMBOL(__put_devmap_managed_page);
+EXPORT_SYMBOL(__put_devmap_managed_page_refs);
 #endif /* CONFIG_FS_DAX */
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] mm: fix missing wake-up event for FSDAX pages
  2022-07-05 12:35 [PATCH v2] mm: fix missing wake-up event for FSDAX pages Muchun Song
@ 2022-07-05 21:18 ` Andrew Morton
  2022-07-05 23:38   ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2022-07-05 21:18 UTC (permalink / raw)
  To: Muchun Song
  Cc: willy, jgg, jhubbard, william.kucharski, dan.j.williams, jack,
	linux-mm, linux-kernel, linux-fsdevel, nvdimm, stable

On Tue,  5 Jul 2022 20:35:32 +0800 Muchun Song <songmuchun@bytedance.com> wrote:

> FSDAX page refcounts are 1-based, rather than 0-based: if refcount is
> 1, then the page is freed.  The FSDAX pages can be pinned through GUP,
> then they will be unpinned via unpin_user_page() using a folio variant
> to put the page, however, folio variants did not consider this special
> case, the result will be to miss a wakeup event (like the user of
> __fuse_dax_break_layouts()).  Since FSDAX pages are only possible get
> by GUP users, so fix GUP instead of folio_put() to lower overhead.
> 

What are the user visible runtime effects of this bug?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] mm: fix missing wake-up event for FSDAX pages
  2022-07-05 21:18 ` Andrew Morton
@ 2022-07-05 23:38   ` Matthew Wilcox
  2022-07-05 23:47     ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2022-07-05 23:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Muchun Song, jgg, jhubbard, william.kucharski, dan.j.williams,
	jack, linux-mm, linux-kernel, linux-fsdevel, nvdimm, stable

On Tue, Jul 05, 2022 at 02:18:19PM -0700, Andrew Morton wrote:
> On Tue,  5 Jul 2022 20:35:32 +0800 Muchun Song <songmuchun@bytedance.com> wrote:
> 
> > FSDAX page refcounts are 1-based, rather than 0-based: if refcount is
> > 1, then the page is freed.  The FSDAX pages can be pinned through GUP,
> > then they will be unpinned via unpin_user_page() using a folio variant
> > to put the page, however, folio variants did not consider this special
> > case, the result will be to miss a wakeup event (like the user of
> > __fuse_dax_break_layouts()).  Since FSDAX pages are only possible get
> > by GUP users, so fix GUP instead of folio_put() to lower overhead.
> > 
> 
> What are the user visible runtime effects of this bug?

"missing wake up event" seems pretty obvious to me?  Something goes to
sleep waiting for a page to become unused, and is never woken.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] mm: fix missing wake-up event for FSDAX pages
  2022-07-05 23:38   ` Matthew Wilcox
@ 2022-07-05 23:47     ` Andrew Morton
  2022-07-06  2:47       ` Muchun Song
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2022-07-05 23:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Muchun Song, jgg, jhubbard, william.kucharski, dan.j.williams,
	jack, linux-mm, linux-kernel, linux-fsdevel, nvdimm, stable

On Wed, 6 Jul 2022 00:38:41 +0100 Matthew Wilcox <willy@infradead.org> wrote:

> On Tue, Jul 05, 2022 at 02:18:19PM -0700, Andrew Morton wrote:
> > On Tue,  5 Jul 2022 20:35:32 +0800 Muchun Song <songmuchun@bytedance.com> wrote:
> > 
> > > FSDAX page refcounts are 1-based, rather than 0-based: if refcount is
> > > 1, then the page is freed.  The FSDAX pages can be pinned through GUP,
> > > then they will be unpinned via unpin_user_page() using a folio variant
> > > to put the page, however, folio variants did not consider this special
> > > case, the result will be to miss a wakeup event (like the user of
> > > __fuse_dax_break_layouts()).  Since FSDAX pages are only possible get
> > > by GUP users, so fix GUP instead of folio_put() to lower overhead.
> > > 
> > 
> > What are the user visible runtime effects of this bug?
> 
> "missing wake up event" seems pretty obvious to me?  Something goes to
> sleep waiting for a page to become unused, and is never woken.

No, missed wakeups are often obscured by another wakeup coming in
shortly afterwards.

If this wakeup is not one of these, then are there reports from the
softlockup detector?

Do we have reports of processes permanently stuck in D state?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] mm: fix missing wake-up event for FSDAX pages
  2022-07-05 23:47     ` Andrew Morton
@ 2022-07-06  2:47       ` Muchun Song
  2022-07-06  3:00         ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Muchun Song @ 2022-07-06  2:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox, jgg, jhubbard, william.kucharski, dan.j.williams,
	jack, linux-mm, linux-kernel, linux-fsdevel, nvdimm, stable

On Tue, Jul 05, 2022 at 04:47:10PM -0700, Andrew Morton wrote:
> On Wed, 6 Jul 2022 00:38:41 +0100 Matthew Wilcox <willy@infradead.org> wrote:
> 
> > On Tue, Jul 05, 2022 at 02:18:19PM -0700, Andrew Morton wrote:
> > > On Tue,  5 Jul 2022 20:35:32 +0800 Muchun Song <songmuchun@bytedance.com> wrote:
> > > 
> > > > FSDAX page refcounts are 1-based, rather than 0-based: if refcount is
> > > > 1, then the page is freed.  The FSDAX pages can be pinned through GUP,
> > > > then they will be unpinned via unpin_user_page() using a folio variant
> > > > to put the page, however, folio variants did not consider this special
> > > > case, the result will be to miss a wakeup event (like the user of
> > > > __fuse_dax_break_layouts()).  Since FSDAX pages are only possible get
> > > > by GUP users, so fix GUP instead of folio_put() to lower overhead.
> > > > 
> > > 
> > > What are the user visible runtime effects of this bug?
> > 
> > "missing wake up event" seems pretty obvious to me?  Something goes to
> > sleep waiting for a page to become unused, and is never woken.
> 
> No, missed wakeups are often obscured by another wakeup coming in
> shortly afterwards.
> 

I need to clarify the task will never be woken up.

> If this wakeup is not one of these, then are there reports from the
> softlockup detector?
> 
> Do we have reports of processes permanently stuck in D state?
>

No. The task is in an TASK_INTERRUPTIBLE state (see __fuse_dax_break_layouts). 
The hung task reporter only reports D task (TASK_UNINTERRUPTIBLE).

Thanks.
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] mm: fix missing wake-up event for FSDAX pages
  2022-07-06  2:47       ` Muchun Song
@ 2022-07-06  3:00         ` Andrew Morton
  2022-07-06  3:11           ` Muchun Song
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2022-07-06  3:00 UTC (permalink / raw)
  To: Muchun Song
  Cc: Matthew Wilcox, jgg, jhubbard, william.kucharski, dan.j.williams,
	jack, linux-mm, linux-kernel, linux-fsdevel, nvdimm, stable

On Wed, 6 Jul 2022 10:47:32 +0800 Muchun Song <songmuchun@bytedance.com> wrote:

> > If this wakeup is not one of these, then are there reports from the
> > softlockup detector?
> > 
> > Do we have reports of processes permanently stuck in D state?
> >
> 
> No. The task is in an TASK_INTERRUPTIBLE state (see __fuse_dax_break_layouts). 
> The hung task reporter only reports D task (TASK_UNINTERRUPTIBLE).

Thanks, I updated the changelog a bit.

: FSDAX page refcounts are 1-based, rather than 0-based: if refcount is
: 1, then the page is freed.  The FSDAX pages can be pinned through GUP,
: then they will be unpinned via unpin_user_page() using a folio variant
: to put the page, however, folio variants did not consider this special
: case, the result will be to miss a wakeup event (like the user of
: __fuse_dax_break_layouts()).  This results in a task being permanently
: stuck in TASK_INTERRUPTIBLE state.
: 
: Since FSDAX pages are only possibly obtained by GUP users, so fix GUP
: instead of folio_put() to lower overhead.

I believe these details are helpful for -stable maintainers who are
wondering why they were sent stuff.  Also for maintainers of
downstreeam older kernels who are scratching heads over some user bug
report, trying to find a patch which might fix it - for this they want
to see a description of the user-visible effects, for matching with
that bug report.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] mm: fix missing wake-up event for FSDAX pages
  2022-07-06  3:00         ` Andrew Morton
@ 2022-07-06  3:11           ` Muchun Song
  0 siblings, 0 replies; 7+ messages in thread
From: Muchun Song @ 2022-07-06  3:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox, jgg, jhubbard, william.kucharski, dan.j.williams,
	jack, linux-mm, linux-kernel, linux-fsdevel, nvdimm, stable

On Tue, Jul 05, 2022 at 08:00:42PM -0700, Andrew Morton wrote:
> On Wed, 6 Jul 2022 10:47:32 +0800 Muchun Song <songmuchun@bytedance.com> wrote:
> 
> > > If this wakeup is not one of these, then are there reports from the
> > > softlockup detector?
> > > 
> > > Do we have reports of processes permanently stuck in D state?
> > >
> > 
> > No. The task is in an TASK_INTERRUPTIBLE state (see __fuse_dax_break_layouts). 
> > The hung task reporter only reports D task (TASK_UNINTERRUPTIBLE).
> 
> Thanks, I updated the changelog a bit.
> 
> : FSDAX page refcounts are 1-based, rather than 0-based: if refcount is
> : 1, then the page is freed.  The FSDAX pages can be pinned through GUP,
> : then they will be unpinned via unpin_user_page() using a folio variant
> : to put the page, however, folio variants did not consider this special
> : case, the result will be to miss a wakeup event (like the user of
> : __fuse_dax_break_layouts()).  This results in a task being permanently
> : stuck in TASK_INTERRUPTIBLE state.
> : 
> : Since FSDAX pages are only possibly obtained by GUP users, so fix GUP
> : instead of folio_put() to lower overhead.
> 
> I believe these details are helpful for -stable maintainers who are
> wondering why they were sent stuff.  Also for maintainers of
> downstreeam older kernels who are scratching heads over some user bug
> report, trying to find a patch which might fix it - for this they want
> to see a description of the user-visible effects, for matching with
> that bug report.
>

Thanks Andrew, it's really helpful.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-07-06  3:11 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-05 12:35 [PATCH v2] mm: fix missing wake-up event for FSDAX pages Muchun Song
2022-07-05 21:18 ` Andrew Morton
2022-07-05 23:38   ` Matthew Wilcox
2022-07-05 23:47     ` Andrew Morton
2022-07-06  2:47       ` Muchun Song
2022-07-06  3:00         ` Andrew Morton
2022-07-06  3:11           ` Muchun Song

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.