All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg KH <gregkh@linuxfoundation.org>
To: Partha Sarathi Satapathy <partha.satapathy@oracle.com>
Cc: linux_uek_grp@oracle.com, stable@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback)
Date: Wed, 27 Apr 2022 07:37:11 +0200	[thread overview]
Message-ID: <YmjWh44pNkpg+PUV@kroah.com> (raw)
In-Reply-To: <1651034789-32295-1-git-send-email-partha.satapathy@oracle.com>

On Wed, Apr 27, 2022 at 04:46:28AM +0000, Partha Sarathi Satapathy wrote:
> From: Hugh Dickins <hughd@google.com>
> 
> Twice now, when exercising ext4 looped on shmem huge pages, I have crashed
> on the PF_ONLY_HEAD check inside PageWaiters(): ext4_finish_bio() calling
> end_page_writeback() calling wake_up_page() on tail of a shmem huge page,
> no longer an ext4 page at all.
> 
> The problem is that PageWriteback is not accompanied by a page reference
> (as the NOTE at the end of test_clear_page_writeback() acknowledges): as
> soon as TestClearPageWriteback has been done, that page could be removed
> from page cache, freed, and reused for something else by the time that
> wake_up_page() is reached.
> 
> https://lore.kernel.org/linux-mm/20200827122019.GC14765@casper.infradead.org/
> Matthew Wilcox suggested avoiding or weakening the PageWaiters() tail
> check; but I'm paranoid about even looking at an unreferenced struct page,
> lest its memory might itself have already been reused or hotremoved (and
> wake_up_page_bit() may modify that memory with its ClearPageWaiters()).
> 
> Then on crashing a second time, realized there's a stronger reason against
> that approach.  If my testing just occasionally crashes on that check,
> when the page is reused for part of a compound page, wouldn't it be much
> more common for the page to get reused as an order-0 page before reaching
> wake_up_page()?  And on rare occasions, might that reused page already be
> marked PageWriteback by its new user, and already be waited upon?  What
> would that look like?
> 
> It would look like BUG_ON(PageWriteback) after wait_on_page_writeback()
> in write_cache_pages() (though I have never seen that crash myself).
> 
> Matthew Wilcox explaining this to himself:
>  "page is allocated, added to page cache, dirtied, writeback starts,
> 
>   --- thread A ---
>   filesystem calls end_page_writeback()
>         test_clear_page_writeback()
>   --- context switch to thread B ---
>   truncate_inode_pages_range() finds the page, it doesn't have writeback set,
>   we delete it from the page cache.  Page gets reallocated, dirtied, writeback
>   starts again.  Then we call write_cache_pages(), see
>   PageWriteback() set, call wait_on_page_writeback()
>   --- context switch back to thread A ---
>   wake_up_page(page, PG_writeback);
>   ... thread B is woken, but because the wakeup was for the old use of
>   the page, PageWriteback is still set.
> 
>   Devious"
> 
> And prior to 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() logic")
> this would have been much less likely: before that, wake_page_function()'s
> non-exclusive case would stop walking and not wake if it found Writeback
> already set again; whereas now the non-exclusive case proceeds to wake.
> 
> I have not thought of a fix that does not add a little overhead: the
> simplest fix is for end_page_writeback() to get_page() before calling
> test_clear_page_writeback(), then put_page() after wake_up_page().
> 
> Was there a chance of missed wakeups before, since a page freed before
> reaching wake_up_page() would have PageWaiters cleared?  I think not,
> because each waiter does hold a reference on the page.  This bug comes
> when the old use of the page, the one we do TestClearPageWriteback on,
> had *no* waiters, so no additional page reference beyond the page cache
> (and whoever racily freed it).  The reuse of the page has a waiter
> holding a reference, and its own PageWriteback set; but the belated
> wake_up_page() has woken the reuse to hit that BUG_ON(PageWriteback).
> 
> Orabug: 33634162
> 
> Reported-by: syzbot+3622cea378100f45d59f@syzkaller.appspotmail.com
> Reported-by: Qian Cai <cai@lca.pw>
> Fixes: 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() logic")
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Cc: stable@vger.kernel.org # v5.8+
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> (cherry picked from commit 073861ed77b6b957c3c8d54a11dc503f7d986ceb)
> Signed-off-by: Partha Sarathi Satapathy <partha.satapathy@oracle.com>

What stable kernel(s) are you wanting this backported to?  It's already
in the 5.10 release and shouldn't go further back than 5.9.

confused,

greg k-h

      reply	other threads:[~2022-05-01 11:25 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-27  4:46 [PATCH 1/2] mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback) Partha Sarathi Satapathy
2022-04-27  5:37 ` Greg KH [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YmjWh44pNkpg+PUV@kroah.com \
    --to=gregkh@linuxfoundation.org \
    --cc=linux_uek_grp@oracle.com \
    --cc=partha.satapathy@oracle.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.