From: David Sterba <dsterba@suse.cz>
To: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Cc: dsterba@suse.cz, clm@fb.com, josef@toxicpanda.com,
dsterba@suse.com, anand.jain@oracle.com,
linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
skhan@linuxfoundation.org, gregkh@linuxfoundation.org,
linux-kernel-mentees@lists.linuxfoundation.org,
syzbot+a70e2ad0879f160b9217@syzkaller.appspotmail.com
Subject: Re: [PATCH v2] btrfs: fix rw device counting in __btrfs_free_extra_devids
Date: Fri, 13 Aug 2021 10:51:38 +0200 [thread overview]
Message-ID: <20210813085137.GQ5047@twin.jikos.cz> (raw)
In-Reply-To: <1e0aafb2-9e55-5f64-d347-1765de0560c5@gmail.com>
On Fri, Aug 13, 2021 at 01:31:25AM +0800, Desmond Cheong Zhi Xi wrote:
> On 12/8/21 11:50 pm, David Sterba wrote:
> > On Thu, Aug 12, 2021 at 11:43:16PM +0800, Desmond Cheong Zhi Xi wrote:
> >> On 12/8/21 6:38 pm, David Sterba wrote:
> >>> On Tue, Jul 27, 2021 at 03:13:03PM +0800, Desmond Cheong Zhi Xi wrote:
> >>>> --- a/fs/btrfs/volumes.c
> >>>> +++ b/fs/btrfs/volumes.c
> >>>> @@ -1078,6 +1078,7 @@ static void __btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices,
> >>>> if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
> >>>> list_del_init(&device->dev_alloc_list);
> >>>> clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
> >>>> + fs_devices->rw_devices--;
> >>>> }
> >>>> list_del_init(&device->dev_list);
> >>>> fs_devices->num_devices--;
> >>>
> >>> I've hit a crash on master branch with stacktrace very similar to one
> >>> this bug was supposed to fix. It's a failed assertion on device close.
> >>> This patch was the last one to touch it and it matches some of the
> >>> keywords, namely the BTRFS_DEV_STATE_REPLACE_TGT bit that used to be in
> >>> the original patch but was not reinstated in your fix.
> >>>
> >>> I'm not sure how reproducible it is, right now I have only one instance
> >>> and am hunting another strange problem. They could be related.
> >>>
> >>> assertion failed: !test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state), in fs/btrfs/volumes.c:1150
> >>>
> >>> https://susepaste.org/view/raw/18223056 full log with other stacktraces,
> >>> possibly relatedg
> >>>
> >>
> >> Looking at the logs, it seems that a dev_replace was started, then
> >> suspended. But it wasn't canceled or resumed before the fs devices were
> >> closed.
> >>
> >> I'll investigate further, just throwing some observations out there.
> >
> > Thanks. I'm testing the patch revert, no crash after first loop, I'll
> > run a few more to be sure as it's not entirely reliable.
> >
> > Sending the revert is option of last resort as we're approaching end of
> > 5.14 dev cycle and the crash prevents testing (unlike the fuzzer
> > warning).
> >
>
> I might be missing something, so any thoughts would be appreciated. But
> I don't think the assertion in btrfs_close_one_device is correct.
>
> From what I see, this crash happens when close_ctree is called while a
> dev_replace hasn't completed. In close_ctree, we suspend the
> dev_replace, but keep the replace target around so that we can resume
> the dev_replace procedure when we mount the root again. This is the call
> trace:
>
> close_ctree():
> btrfs_dev_replace_suspend_for_unmount();
> btrfs_close_devices():
> btrfs_close_fs_devices():
> btrfs_close_one_device():
> ASSERT(!test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
> &device->dev_state));
>
> However, since the replace target sticks around, there is a device with
> BTRFS_DEV_STATE_REPLACE_TGT set, and we fail the assertion in
> btrfs_close_one_device.
>
> Two options I can think of:
>
> - We could remove the assertion.
>
> - Or we could clear the BTRFS_DEV_STATE_REPLACE_TGT bit in
> btrfs_dev_replace_suspend_for_unmount. This is fine since the bit is set
> again in btrfs_init_dev_replace if the dev_replace->replace_state is
> BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED. But this approach strikes me as
> a little odd because the device is still the replace target when
> mounting in the future.
The option #2 does not sound safe because the TGT bit is checked in
several places where device list is queried for various reasons, even
without a mounted filesystem.
Removing the assertion makes more sense but I'm still not convinced that
the this is expected/allowed state of a closed device.
next prev parent reply other threads:[~2021-08-13 8:54 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-27 7:13 [PATCH v2] btrfs: fix rw device counting in __btrfs_free_extra_devids Desmond Cheong Zhi Xi
2021-07-28 12:58 ` David Sterba
2021-08-12 10:38 ` David Sterba
2021-08-12 15:43 ` Desmond Cheong Zhi Xi
2021-08-12 15:50 ` David Sterba
2021-08-12 17:31 ` Desmond Cheong Zhi Xi
2021-08-13 8:51 ` David Sterba [this message]
2021-08-13 9:57 ` Desmond Cheong Zhi Xi
2021-08-13 10:30 ` David Sterba
2021-08-19 17:11 ` Desmond Cheong Zhi Xi
2021-08-19 17:34 ` David Sterba
2021-08-20 3:09 ` Desmond Cheong Zhi Xi
2021-08-20 10:58 ` David Sterba
2021-08-20 17:53 ` Desmond Cheong Zhi Xi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210813085137.GQ5047@twin.jikos.cz \
--to=dsterba@suse.cz \
--cc=anand.jain@oracle.com \
--cc=clm@fb.com \
--cc=desmondcheongzx@gmail.com \
--cc=dsterba@suse.com \
--cc=gregkh@linuxfoundation.org \
--cc=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-kernel-mentees@lists.linuxfoundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=skhan@linuxfoundation.org \
--cc=syzbot+a70e2ad0879f160b9217@syzkaller.appspotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).