From: Anand Jain <anand.jain@oracle.com>
To: Josef Bacik <josef@toxicpanda.com>,
	dsterba@suse.cz, linux-btrfs@vger.kernel.org,
	syzbot+4cfe71a4da060be47502@syzkaller.appspotmail.com
Subject: Re: [PATCH add reported by] btrfs: fix rw_devices count in __btrfs_free_extra_devids
Date: Fri, 25 Sep 2020 18:11:15 +0800
Message-ID: <b93a6de0-96f7-11f1-e4ac-59de97d60cc0@oracle.com>
In-Reply-To: <a6766b76-a1fd-4011-5290-11406bc2923e@toxicpanda.com>

On 24/9/20 10:02 pm, Josef Bacik wrote:
> On 9/24/20 7:25 AM, David Sterba wrote:
>> On Wed, Sep 23, 2020 at 09:42:17AM -0400, Josef Bacik wrote:
>>> On 9/23/20 12:42 AM, Anand Jain wrote:
>>>> On 22/9/20 9:08 pm, Josef Bacik wrote:
>>>>> On 9/22/20 8:33 AM, Anand Jain wrote:
>>
>>> Yeah I mean we do something in btrfs_init_dev_replace(), like when we
>>> search for the key, we double check to make sure we don't have a
>>> devid == BTRFS_DEV_REPLACE_DEVID in our devices if we don't find a key.
>>> If we do we return -EIO and bail out of the mount.  Thanks,


I read fast and missed the bail-out part before.

If we bail out of the mount, it means a btrfs rootfs can fail to boot.

To recover from it, the user has to remove the trespassing/extra device
manually and reboot.
For a non-rootfs, the user would have to remove the device manually and run
'btrfs dev scan --forget' to free up the extra devices.
What we are doing now is removing the extra/trespassing device
internally.

IMO, the case of a trespassing/extra device trying to sabotage the setup
is a bit different from a corrupted device; in the former case,
isn't resilience preferred?
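
To make the two options concrete, here is a rough userspace sketch of the
check being discussed. It is not the kernel code: struct model_device,
enum model_policy and model_check_replace_devid() are hypothetical names
made up for illustration, and only BTRFS_DEV_REPLACE_DEVID (0 in the
kernel) and the -EIO return come from the thread. It assumes the caller
already knows whether the replace item was found.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* devid reserved for the target of a device replace (0 in the kernel). */
#define BTRFS_DEV_REPLACE_DEVID 0ULL

/* Hypothetical, heavily simplified stand-in for struct btrfs_device. */
struct model_device {
	uint64_t devid;
	bool present;	/* still part of the device list */
};

/* The two behaviours discussed in this thread. */
enum model_policy {
	POLICY_FAIL_MOUNT,	/* return -EIO and bail out of the mount */
	POLICY_DROP_DEVICE,	/* drop the extra device internally and carry on */
};

/*
 * Model of the proposed check: if no replace item was found, no device
 * should carry BTRFS_DEV_REPLACE_DEVID. Returns 0 on success, -EIO if
 * the policy is to fail the mount and such a device exists.
 */
int model_check_replace_devid(struct model_device *devs, size_t ndevs,
			      bool replace_item_found,
			      enum model_policy policy)
{
	size_t i;

	/* With a replace item present, devid 0 is a legitimate target. */
	if (replace_item_found)
		return 0;

	for (i = 0; i < ndevs; i++) {
		if (!devs[i].present ||
		    devs[i].devid != BTRFS_DEV_REPLACE_DEVID)
			continue;

		if (policy == POLICY_FAIL_MOUNT)
			return -EIO;

		/* Resilient path: forget the trespassing device. */
		devs[i].present = false;
	}

	return 0;
}
```

With POLICY_FAIL_MOUNT the user has to clean up by hand before the next
mount attempt; with POLICY_DROP_DEVICE the mount proceeds with the
remaining devices.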

Thanks, Anand


>>
>> From user perspective, then do what? Or do we treat this with minimal
>> efforts to provide a sane fallback and error handling just to pass
>> fuzzers (like in many other cases)?
>>
> 
> That's a question for fsck.  I don't want to spend a lot of time chasing 
> imaginary cases that fuzzers come up with, I just want them to fail as 
> quickly as possible so we can move on with our lives.
> 
> If this happened in the real world then it would be because we either
> 
> 1) Lost the replace item somehow?
> 2) Got a random corruption that changed the devid to 0
> 
> I think for #1 it's impossible to detect really, unless you can tell 
> which device was being replaced somehow?  I'm not sure how you would do 
> that, I'm not familiar enough with the replace code to see if we could 
> figure that out.
> 
> For #2 it should be straightforward, as long as we can determine that we 
> really weren't doing a device replace, then we just change the devid to 
> 1 or something and carry on with life?  Thanks,
> 
> Josef

