From: riteshh <riteshh@linux.ibm.com>
To: Qu Wenruo <wqu@suse.com>
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Ritesh Harjani <ritesh.list@gmail.com>,
	Neal Gompa <ngompa13@gmail.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH v3 00/13] btrfs: support read-write for subpage metadata
Date: Wed, 21 Apr 2021 16:43:10 +0530	[thread overview]
Message-ID: <20210421111310.q2g2fhzhlcoaykff@riteshh-domain> (raw)
In-Reply-To: <ca952f24-d0cd-fda7-c9c5-85eba3e7d04a@suse.com>

On 21/04/21 04:26PM, Qu Wenruo wrote:
>
>
> On 2021/4/21 3:30 PM, riteshh wrote:
> > On 21/04/21 12:33PM, riteshh wrote:
> > > On 21/04/19 09:24PM, Qu Wenruo wrote:
> > > > [...]
> > > > > >
> > > > > > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > > > > > index 45ec3f5ef839..49f78d643392 100644
> > > > > > --- a/fs/btrfs/file.c
> > > > > > +++ b/fs/btrfs/file.c
> > > > > > @@ -1341,7 +1341,17 @@ static int prepare_uptodate_page(struct inode *inode,
> > > > > >                          unlock_page(page);
> > > > > >                          return -EIO;
> > > > > >                  }
> > > > > > -               if (page->mapping != inode->i_mapping) {
> > > > > > +
> > > > > > +               /*
> > > > > > +                * Since btrfs_readpage() will get the page unlocked, we have
> > > > > > +                * a window where fadvise() can try to release the page.
> > > > > > +                * Here we check both inode mapping and PagePrivate() to
> > > > > > +                * make sure the page is not released.
> > > > > > +                *
> > > > > > +                * The private flag check is essential for subpage as we need
> > > > > > +                * to store extra bitmap using page->private.
> > > > > > +                */
> > > > > > +               if (page->mapping != inode->i_mapping || PagePrivate(page)) {
> > > > >    ^ Obviously it should be !PagePrivate(page).
> > > >
> > > > Hi Ritesh,
> > > >
> > > > Mind to have another try on generic/095?
> > > >
> > > > This time the branch is updated with the following commit at top:
> > > >
> > > > commit d700b16dced6f2e2b47e1ca5588a92216ce84dfb (HEAD -> subpage, github/subpage)
> > > > Author: Qu Wenruo <wqu@suse.com>
> > > > Date:   Mon Apr 19 13:41:31 2021 +0800
> > > >
> > > >      btrfs: fix a crash caused by race between prepare_pages() and btrfs_releasepage()
> > > >
> > > > The fix uses the PagePrivate() check to avoid the problem, and passes
> > > > several generic/auto loops without any sign of crash.
> > > >
> > > > But considering I always have difficulty reproducing the bug with the
> > > > previous improper fix, your verification would be very helpful.
> > > >
> > >
> > > Hi Qu,
> > >
> > > Thanks for the patch. I did try the above patch, but even with it I could
> > > still reproduce the issue.
> > >
> > > 1. I think the original problem could be related to the logs below.
> > > 	[   79.079641] run fstests generic/095 at 2021-04-21 06:46:23
> > > 	<...>
> > > 	[   83.634710] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
> > >
> > > Meaning, there might be a race here between DIO and buffered IO.
> > > From the DIO path we call invalidate_inode_pages2_range(). Somehow this may
> > > be causing a call to btrfs_releasepage().
> > >
> > > Now, from the code, invalidate_inode_pages2_range() can be called from both
> > > __iomap_dio_rw() and iomap_dio_complete(), so it is not clear which of the
> > > two call sites might be triggering this bug.
> >
> > I think I found one of the problems.
> > 1. We use the page->private pointer to hold the btrfs_subpage struct, which
> >    also contains a spinlock.
> >
> >    Now in btrfs_subpage_clear_writeback()
> >    -> we take this spinlock: spin_lock_irqsave(&subpage->lock, flags);
> >    -> we call end_page_writeback(page);
> >       -> this may end up waking invalidate_inode_pages2_range(),
> >          which is waiting for writeback to complete.
> >          -> this may then call btrfs_releasepage() on the same page
> >             and free the subpage structure while we still hold its lock.
>
> This indeed looks like a problem.
>
> This really means we need to hit a small race window like the one below:
> (btrfs_invalidatepage() doesn't seem able to race, considering how much
>  work needs to be done in that function)
>
>  Thread 1                       | Thread 2
> --------------------------------+------------------------------------
>  end_bio_extent_writepage()     | btrfs_releasepage()
>  |- spin_lock_irqsave()         | |
>  |- end_page_writeback()        | |
>  |                              | |- if (PageWriteback() ||...)
>  |                              | |- clear_page_extent_mapped()
>  |- spin_unlock_irqrestore()    |
>
> It looks like my arm boards are not fast enough to trigger the race.
>
> It can be fixed the same way as the dirty bit, though: check the bitmap
> first, then call end_page_writeback() with the spinlock released.
>
> Would you please try the following fix? (based on the latest branch, which
> already has the previous fixes included).
>
> I'm also running the tests on all my arm boards to make sure it doesn't
> cause extra problems. So far so good, but my boards are far from fast, so
> it is not yet 100% tested.
>
> Thanks,
> Qu
>
> diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c
> index 696485ab68a2..c5abf9745c10 100644
> --- a/fs/btrfs/subpage.c
> +++ b/fs/btrfs/subpage.c
> @@ -420,13 +420,16 @@ void btrfs_subpage_clear_writeback(const struct btrfs_fs_info *fs_info,
>  {
>         struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private;
>         u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len);
> +       bool finished = false;
>         unsigned long flags;
>
>         spin_lock_irqsave(&subpage->lock, flags);
>         subpage->writeback_bitmap &= ~tmp;
>         if (subpage->writeback_bitmap == 0)
> -               end_page_writeback(page);
> +               finished = true;
>         spin_unlock_irqrestore(&subpage->lock, flags);
> +       if (finished)
> +               end_page_writeback(page);
>  }
>
>  void btrfs_subpage_set_ordered(const struct btrfs_fs_info *fs_info,

Thanks for this patch. I have re-tested generic/095 with 100 iterations and -g
quick (with both of your patches). I don't see this issue anymore.
So with the two patches (including above one) the race with
btrfs_releasepage() is now fixed.
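
For anyone following the thread, the pattern in the second patch (take the
decision under the lock, call end_page_writeback() only after the lock is
dropped) can be sketched in user space roughly as below. This is only a model
under stated assumptions, not kernel code: struct subpage_model, struct
page_model and the model_*() functions are made-up names for illustration, and
a pthread mutex stands in for the spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct btrfs_subpage. */
struct subpage_model {
	pthread_mutex_t lock;       /* stands in for the kernel spinlock */
	uint16_t writeback_bitmap;  /* one bit per sector still under writeback */
};

/* Hypothetical stand-in for struct page; it only counts completions. */
struct page_model {
	struct subpage_model *private;
	int writeback_ended;
};

/* Stand-in for end_page_writeback(): after this, a waiter may free ->private. */
static void model_end_page_writeback(struct page_model *page)
{
	page->writeback_ended++;
}

/*
 * Clear the bits in @mask. The decision is taken under the lock, but the
 * completion call is made only after the lock is dropped, so this function
 * no longer touches the subpage structure once waiters may have been woken.
 */
static void model_clear_writeback(struct page_model *page, uint16_t mask)
{
	struct subpage_model *subpage = page->private;
	bool finished = false;

	pthread_mutex_lock(&subpage->lock);
	subpage->writeback_bitmap &= (uint16_t)~mask;
	if (subpage->writeback_bitmap == 0)
		finished = true;
	pthread_mutex_unlock(&subpage->lock);

	if (finished)
		model_end_page_writeback(page);
}

/* Clears a 4-bit bitmap in two halves; returns the number of completions. */
static int model_demo(void)
{
	struct subpage_model sp = { .writeback_bitmap = 0x000f };
	struct page_model page = { .private = &sp, .writeback_ended = 0 };

	pthread_mutex_init(&sp.lock, NULL);
	model_clear_writeback(&page, 0x0003);  /* bits 2-3 are still set */
	assert(page.writeback_ended == 0);
	model_clear_writeback(&page, 0x000c);  /* bitmap is now empty */
	pthread_mutex_destroy(&sp.lock);
	return page.writeback_ended;
}
```

Calling model_demo() clears the bitmap in two steps and reports how many times
the completion stand-in ran. The sketch is single-threaded, so it shows the
ordering of the calls, not the race itself.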


For both of these patches, please feel free to add:

Reported-by: Ritesh Harjani <riteshh@linux.ibm.com>
Tested-by: Ritesh Harjani <riteshh@linux.ibm.com>

-ritesh

