All of lore.kernel.org
 help / color / mirror / Atom feed
From: Anand Jain <anand.jain@oracle.com>
To: dsterba@suse.cz
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH RFC] btrfs: pick device with lowest devt for show_devname
Date: Sat, 25 Nov 2023 09:09:19 +0800	[thread overview]
Message-ID: <36171811-ed49-4427-a647-e052ec70faa0@oracle.com> (raw)
In-Reply-To: <20231124161906.GE18929@twin.jikos.cz>



On 11/25/23 00:19, David Sterba wrote:
> On Thu, Nov 02, 2023 at 07:10:48PM +0800, Anand Jain wrote:
>> In a non-single-device Btrfs filesystem, if Btrfs is already mounted and
>> if you run the command 'mount -a,' it will fail and the command
>> 'umount <device>' also fails. As below:
>>
>> ----------------
>> $ cat /etc/fstab | grep btrfs
>> UUID=12345678-1234-1234-1234-123456789abc /btrfs btrfs defaults,nofail 0 0
>>
>> $ mkfs.btrfs -qf --uuid 12345678-1234-1234-1234-123456789abc /dev/sda2 /dev/sda1
>> $ mount --verbose -a
>> /                        : ignored
>> /btrfs                   : successfully mounted
>>
>> $ ls -l /dev/disk/by-uuid | grep 12345678-1234-1234-1234-123456789abc
>> lrwxrwxrwx 1 root root 10 Nov  2 17:43 12345678-1234-1234-1234-123456789abc -> ../../sda1
>>
>> $ cat /proc/self/mounts | grep btrfs
>> /dev/sda2 /btrfs btrfs rw,relatime,space_cache=v2,subvolid=5,subvol=/ 0 0
>>
>> $ findmnt --df /btrfs
>> SOURCE    FSTYPE SIZE  USED AVAIL USE% TARGET
>> /dev/sda2 btrfs    2G  5.8M  1.8G   0% /btrfs
>>
>> $ mount --verbose -a
>> /                        : ignored
>> mount: /btrfs: /dev/sda1 already mounted or mount point busy.
>> $echo $?
>> 32
>>
>> $ umount /dev/sda1
>> umount: /dev/sda1: not mounted.
>> $ echo $?
>> 32
>> ----------------
>>
>> I assume (RFC) this is because '/dev/disk/by-uuid,' '/proc/self/mounts,'
>> and 'findmnt' do not all reference the same device, resulting in the
>> 'mount -a' and 'umount' failures. However, an empirically found solution
>> is to align them using a rule, such as the disk with the lowest 'devt,'
>> for a multi-device Btrfs filesystem.
>>
>> I'm not yet sure (RFC) how to create a udev rule to point to the disk with
>> the lowest 'devt,' as this kernel patch does, and I believe it is
>> possible.
>>
>> And this would ensure that '/proc/self/mounts,' 'findmnt,' and
>> '/dev/disk/by-uuid' all reference the same device.
>>
>> After applying this patch, the above test passes. Unfortunately,
>> /dev/disk/by-uuid also points to the lowest 'devt' by chance, even though
>> no rule has been set as of now. As shown below.
>>
>> ----------------
>> $ mkfs.btrfs -qf --uuid 12345678-1234-1234-1234-123456789abc /dev/sda2 /dev/sda1
>>
>> $ mount --verbose -a
>> /                        : ignored
>> /btrfs                   : successfully mounted
>>
>> $ ls -l /dev/disk/by-uuid | grep 12345678-1234-1234-1234-123456789abc
>> lrwxrwxrwx 1 root root 10 Nov  2 17:53 12345678-1234-1234-1234-123456789abc -> ../../sda1
>>
>> $ cat /proc/self/mounts | grep btrfs
>> /dev/sda1 /btrfs btrfs rw,relatime,space_cache=v2,subvolid=5,subvol=/ 0 0
>>
>> $ findmnt --df /btrfs
>> SOURCE    FSTYPE SIZE  USED AVAIL USE% TARGET
>> /dev/sda1 btrfs    2G  5.8M  1.8G   0% /btrfs
>>
>> $ mount --verbose -a
>> /                        : ignored
>> /btrfs                   : already mounted
>> $echo $?
>> 0
>>
>> $ umount /dev/sda1
>> $echo $?
>> 0
>> ----------------
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   fs/btrfs/super.c | 15 +++++++++++++--
>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
>> index 66bdb6fd83bd..d768917cc5cc 100644
>> --- a/fs/btrfs/super.c
>> +++ b/fs/btrfs/super.c
>> @@ -2301,7 +2301,18 @@ static int btrfs_unfreeze(struct super_block *sb)
>>   
>>   static int btrfs_show_devname(struct seq_file *m, struct dentry *root)
>>   {
>> -	struct btrfs_fs_info *fs_info = btrfs_sb(root->d_sb);
>> +	struct btrfs_fs_devices *fs_devices = btrfs_sb(root->d_sb)->fs_devices;
>> +	struct btrfs_device *device;
>> +	struct btrfs_device *first_device = NULL;
>> +
>> +	list_for_each_entry(device, &fs_devices->devices, dev_list) {
>> +		if (first_device == NULL) {
>> +			first_device = device;
>> +			continue;
>> +		}
>> +		if (first_device->devt > device->devt)
>> +			first_device = device;
>> +	}
> 
> I think we agree in principle that the devt can be used to determine the
> device to print in show_devname, however I'd like to avoid iterating the
> device list here. We used to have it, first with mutex protection, then
> RCU. That we can simply print the latest_dev is quite convenient.
> 
> I suggest to either add a separate fs_info variable to set the desired
> device and only print it here, or reuse latest_dev for that.
> 

I got it. Thx. However, I am still thinking about the fix. more below.

> Adding a separate variable for that should be safer as latest_dev is
> used in various way and I can't say it's always clear.


Libmount and libblkid use the device path from the mount table and udev,
respectively. These output gets string-matched during 'umount <dev>',
'findmnt', and 'mount -a'. However, in the case of Btrfs, there may be
more than one device per FSID, and there is neither a master device nor
a consolidating pseudo device for public relations, similar to LVM.

A rule, such as selecting the device with the lowest maj:min, works if
it is implemented in both systemd and btrfs.ko. But, this approach does
not resolve the 'umount' problem, such as
'umount <device-which-mount-table-did-not-show>'.

I am skeptical about whether we have a strong case to create a single
pseudo device per multi-device Btrfs filesystem, such as, for example
'/dev/btrfs/<fsid>-<random>/rootid=5' and which means pseudo device
will carry the btrfs-magic and the actual blk devices something else.

OR for now, regarding the umount issue mentioned above, we just can
document it for the users to be aware of.

Any feedback is greatly appreciated.

Thanks.


  reply	other threads:[~2023-11-25  1:09 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-02 11:10 [PATCH RFC] btrfs: pick device with lowest devt for show_devname Anand Jain
2023-11-02 20:26 ` David Sterba
2023-11-02 22:55   ` Anand Jain
2023-11-20 14:42     ` David Sterba
2023-11-24 16:19 ` David Sterba
2023-11-25  1:09   ` Anand Jain [this message]
2023-11-27 11:48     ` Anand Jain
2023-11-28  8:00       ` Goffredo Baroncelli
2023-11-28 23:28         ` Anand Jain
2023-11-29 13:38           ` Andrei Borzenkov
2023-11-29 20:54           ` Goffredo Baroncelli
2023-12-05 17:44             ` David Sterba
2023-12-06 19:52               ` Goffredo Baroncelli
2023-12-05 17:43         ` David Sterba
2023-11-29 21:20 ` Goffredo Baroncelli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=36171811-ed49-4427-a647-e052ec70faa0@oracle.com \
    --to=anand.jain@oracle.com \
    --cc=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.