All of lore.kernel.org
 help / color / mirror / Atom feed
From: Boris Burkov <boris@bur.io>
To: Anand Jain <anand.jain@oracle.com>
Cc: linux-btrfs@vger.kernel.org, dsterba@suse.cz
Subject: Re: [PATCH] btrfs: validate device maj:min during open
Date: Fri, 1 Mar 2024 11:47:49 -0800	[thread overview]
Message-ID: <20240301194749.GA1841098@zen.localdomain> (raw)
In-Reply-To: <3edcf934-ba56-4ac3-8edb-1663f327ab3a@oracle.com>

On Fri, Mar 01, 2024 at 10:17:41PM +0530, Anand Jain wrote:
> 
> 
> On 3/1/24 21:04, Boris Burkov wrote:
> > On Fri, Mar 01, 2024 at 07:21:32AM +0530, Anand Jain wrote:
> > > Boris managed to create a device capable of changing its maj:min without
> > > altering its device path.
> > > 
> > > Only multi-devices can be scanned. A device that gets scanned and remains
> > > in the Btrfs kernel cache might end up with an incorrect maj:min.
> > > 
> > > Despite the tempfsid feature patch did not introduce this bug, it could
> > > lead to issues if the above multi-device is converted to a single device
> > > with a stale maj:min. Subsequently, attempting to mount the same device
> > > with the correct maj:min might mistake it for another device with the same
> > > fsid, potentially resulting in wrongly auto-enabling the tempfsid feature.
> > > 
> > > To address this, this patch validates the device's maj:min at the time of
> > > device open and updates it if it has changed since the last scan.
> > 
> 
> > I do believe this patch is correct for fixing my test case,
> 
> Right. It fixes the bug reported in the testcase only.
> 
> > but I don't
> > believe that it is the proper fix for this issue.
> 
> Indeed, I anticipated that it might be confusing. As I've clarified
> elsewhere, this isn't the solution for the already known multi-nodes
> single-device issue. The resolution is currently being discussed.

I'm not familiar with the the multi-nodes single-device issue.

My point was that while I agree that this fix works for the issue I do
know about, an invalid devt in the btrfs device cache, I do not like it
as a matter of taste.

We have an invalid cache and instead of invalidating it in a general
way at the point of invalidation, you are proposing that we just fix it
up when we happen to notice. I believe that is a fragile solution that
will simply leave the door open for more bugs in the future. (if not
present).

If we do go this way, at the very least I think we should do something a
bit more general. Spitballing here, but perhaps we can get rid of all
redundant caching of random values like devt in btrfs_device and just
use the contents of the bdev struct as it is currently in the block
layer. Or we need a way to treat a btrfs_device as invalid unless
open_one_device has run on it.

> 
> I think I've come up with a better idea now. I'll send out the
> proposal soon for discussions.
> 
> However, my first challenge is to simulate a multi-nodes
> single-device environment for testing.
> 
> Thx, Anand
> 
> > This is just plugging
> > one more hole in a leaky dam.
> 
> > > Fixes: a5b8a5f9f835 ("btrfs: support cloned-device mount capability")
> > > Reported-by: Boris Burkov <boris@bur.io>
> > > Co-developed-by: Boris Burkov <boris@bur.io>
> > > Signed-off-by: Anand Jain <anand.jain@oracle.com>
> > > ---
> > >   fs/btrfs/volumes.c | 19 +++++++++++++++++++
> > >   1 file changed, 19 insertions(+)
> > > 
> > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > > index deb4f191730d..4c498f088302 100644
> > > --- a/fs/btrfs/volumes.c
> > > +++ b/fs/btrfs/volumes.c
> > > @@ -644,6 +644,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
> > >   	struct bdev_handle *bdev_handle;
> > >   	struct btrfs_super_block *disk_super;
> > >   	u64 devid;
> > > +	dev_t devt;
> > >   	int ret;
> > >   	if (device->bdev)
> > > @@ -692,6 +693,24 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
> > >   	device->bdev = bdev_handle->bdev;
> > >   	clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
> > > +	ret = lookup_bdev(device->name->str, &devt);
> > 
> > It should be fine to just use the dev_t in bdev_handle->bdev->bd_dev.
> > 
> > > +	if (ret) {
> > > +		btrfs_err(NULL,
> > > +	"failed to validate (%d:%d) maj:min for device %s %d resetting to 0",
> > > +			  MAJOR(device->devt), MINOR(device->devt),
> > > +			  device->name->str, ret);
> > > +		device->devt = 0;
> > > +	} else {
> > > +		if (device->devt != devt) {
> > > +			btrfs_warn(NULL,
> > > +				"device %s maj:min changed from %d:%d to %d:%d",
> > > +				   device->name->str, MAJOR(device->devt),
> > > +				   MINOR(device->devt), MAJOR(devt),
> > > +				   MINOR(devt));
> > > +			device->devt = devt;
> > > +		}
> > > +	}
> > > +
> > >   	fs_devices->open_devices++;
> > >   	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
> > >   	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
> > > -- 
> > > 2.38.1
> > > 

  reply	other threads:[~2024-03-01 19:46 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-01  1:51 [PATCH] btrfs: validate device maj:min during open Anand Jain
2024-03-01 15:34 ` Boris Burkov
2024-03-01 16:47   ` Anand Jain
2024-03-01 19:47     ` Boris Burkov [this message]
2024-03-01 16:59   ` Anand Jain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240301194749.GA1841098@zen.localdomain \
    --to=boris@bur.io \
    --cc=anand.jain@oracle.com \
    --cc=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.