* Assumption on fixed device numbers in Plasma's desktop search Baloo
@ 2021-06-25 19:06 Martin Steigerwald
2021-06-26 0:27 ` Qu Wenruo
2021-06-26 0:54 ` NeilBrown
0 siblings, 2 replies; 11+ messages in thread
From: Martin Steigerwald @ 2021-06-25 19:06 UTC (permalink / raw)
To: linux-block; +Cc: linux-btrfs
Hi!
I found repeatedly that Baloo indexes the same files twice or even more
often after a while.
I reported this upstream in:
Bug 438434 - Baloo appears to be indexing twice the number of files than
are actually in my home directory
https://bugs.kde.org/show_bug.cgi?id=438434
And got back that if the device number changes, Baloo will think it has
new files even though the path is still the same. And I found over time
that the device number for the single BTRFS filesystem on an NVMe SSD in
a ThinkPad T14 Gen1 AMD can change. It is not (maybe yet) RAID 1. I do
have BTRFS RAID 1 in another laptop and there I also had this issue
already.
I argued that a desktop application has no business relying on a device
number and got back that search/indexing sits in the middle between an
application and system software. And that Baloo needs an "invariant" for
a file. See comment #11 of that bug report:
https://bugs.kde.org/show_bug.cgi?id=438434#c11
I got the suggestion to try to find a way to tell the kernel to use a
fixed device number.
I still think that an application or an infrastructure service for a
desktop environment or even anything else in user space should not rely
on a device number to be fixed and never change upon reboots.
But maybe you have a different idea about that and it is okay for a
userspace component to do that. I would like to hear your idea about
that.
Another question would be whether I could somehow make sure that the
device number does not change, even if just as a work-around. I know for
NFS there is a fsid= mount option, but it does not appear to be
something generic, at least the mount man page seems to have nothing
related to fsid.
Best,
--
Martin
* Re: Assumption on fixed device numbers in Plasma's desktop search Baloo
2021-06-25 19:06 Assumption on fixed device numbers in Plasma's desktop search Baloo Martin Steigerwald
@ 2021-06-26 0:27 ` Qu Wenruo
2021-06-26 8:49 ` Martin Steigerwald
2021-06-26 0:54 ` NeilBrown
1 sibling, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2021-06-26 0:27 UTC (permalink / raw)
To: Martin Steigerwald, linux-block; +Cc: linux-btrfs
On 2021/6/26 3:06 AM, Martin Steigerwald wrote:
> Hi!
>
> I found repeatedly that Baloo indexes the same files twice or even more
> often after a while.
>
> I reported this upstream in:
>
> Bug 438434 - Baloo appears to be indexing twice the number of files than
> are actually in my home directory
>
> https://bugs.kde.org/show_bug.cgi?id=438434
>
> And got back that if the device number changes, Baloo will think it has
> new files even tough the path is still the same. And found over time that
> the device number for the single BTRFS filesystem on a NVMe SSD in a
> ThinkPad T14 Gen1 AMD can change. It is not (maybe yet) RAID 1. I do
> have BTRFS RAID 1 in another laptop and there I also had this issue
> already.
Since btrfs has multi-device support by default, it reports an anonymous
device number, just as if you use a filesystem over LVM.
The question is why the anonymous device number changes.
If the fs is always mounted in a fixed sequence with fixed
snapshot/subvolume mounts, it should not get a new anonymous device
number.
But if snapshots or new subvolumes are involved, or subvolumes are simply
mounted/read in a different order, then the device number for each
subvolume will change.
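To see which device number a path currently reports, something like the
following can be used. This is only a minimal sketch, assuming Python 3 on
Linux; the helper names are made up for illustration. On Linux, anonymous
(unnamed) devices are allocated with major number 0, so a major of 0 is a
hint that the number comes from the anonymous pool.

```python
import os


def device_number(path):
    """Return (major, minor) of the st_dev value that stat() reports for path."""
    st = os.stat(path)
    return os.major(st.st_dev), os.minor(st.st_dev)


def looks_anonymous(path):
    """True if path reports major 0, the range Linux uses for anonymous devices."""
    major, _minor = device_number(path)
    return major == 0


if __name__ == "__main__":
    major, minor = device_number(".")
    print(f"st_dev = {major}:{minor}, anonymous: {looks_anonymous('.')}")
```

Running this on the same subvolume after a few reboots should show whether
the minor number moves around.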
>
> I argued that a desktop application has no business to rely on a device
> number and got back that search/indexing is in the middle between an
> application and system software. And that Baloo needs an "invariant" for
> a file. See comment #11 of that bug report:
>
> https://bugs.kde.org/show_bug.cgi?id=438434#c11
Well, a lot of tools rely on the device number to distinguish filesystem
boundaries, like find.
Thus it's a little hard to argue.
But on the other hand, it also means baloo can't handle a regular fs over
LVM well either.
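For comparison, this is roughly how a tool can use the device number to stay
within one filesystem, in the spirit of "find -xdev". It is a sketch only,
assuming Python 3; walk_one_filesystem is a hypothetical helper, not code
from find or Baloo. The comparison happens within a single run, so it never
needs the number to survive a reboot.

```python
import os


def walk_one_filesystem(top):
    """Yield file paths under top without descending onto another st_dev."""
    top_dev = os.stat(top).st_dev
    for dirpath, dirnames, filenames in os.walk(top):
        # Prune subdirectories that report a different device number.
        # This skips mount points and, on btrfs, other subvolumes.
        dirnames[:] = [
            d for d in dirnames
            if os.lstat(os.path.join(dirpath, d)).st_dev == top_dev
        ]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

The difference to an indexer is that nothing here is stored: st_dev is only
compared against another st_dev obtained during the same mount.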
>
> I got the suggestion to try to find a way to tell the kernel to use a
> fixed device number.
I don't think it's possible for btrfs, as each subvolume gets its
anonymous device number assigned when it is first read.
Thus it's really hard to make it fixed, as the reason for anonymous
device numbers is to avoid conflicts.
>
> I still think, an application or an infrastructure service for a desktop
> environment or even anything else in user space should not rely on a
> device number to be fixed and never change upon reboots.
Well, LVM/device mapper is doing the same thing, and a big behavior
change is never a good idea for the kernel.
Thus for use cases where we really need a proper mapping, we use hashes,
not just the device number, like what we did in duperemove.
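As an illustration of the hash idea, a content digest identifies a file by
what is in it rather than by where it lives. This is a minimal sketch,
assuming Python 3 and SHA-256; the function name and the choice of hash are
assumptions made here and say nothing about which hash any existing tool
uses.

```python
import hashlib


def content_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file contents at path."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

The obvious cost is that every file has to be read once, and two identical
files get the same digest, so an indexer would still need the path next to
it.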
>
> But maybe you have a different idea about that and it is okay for an
> userspace component to do that. I would like to hear your idea about
> that.
>
> Another question would be whether I could somehow make sure that the
> device number does not change, even if just as a work-around.
If you really just want a fixed device number, you can ensure that by:
- Making sure all users of anonymous devices get a fixed sequence.
  Things like device mapper/LVM and btrfs should get loaded/initialized
  in a fixed order.
- Making sure the subvolume you care about always gets mounted/read
  before any other subvolume, so that the target subvolume always gets
  the first device number in the pool.
But this also means that all later subvolumes not in the fixed mount/read
sequence cannot get a fixed number.
Thanks,
Qu
> I know for
> NFS there is a fsid= mount option, but it does not appear to be
> something generic, at least the mount man page seems to have nothing
> related to fsid.
>
>
> Best,
>
* Re: Assumption on fixed device numbers in Plasma's desktop search Baloo
2021-06-25 19:06 Assumption on fixed device numbers in Plasma's desktop search Baloo Martin Steigerwald
2021-06-26 0:27 ` Qu Wenruo
@ 2021-06-26 0:54 ` NeilBrown
2021-06-26 3:38 ` Bart Van Assche
2021-06-26 8:51 ` Martin Steigerwald
1 sibling, 2 replies; 11+ messages in thread
From: NeilBrown @ 2021-06-26 0:54 UTC (permalink / raw)
To: Martin Steigerwald; +Cc: linux-block, linux-btrfs
On Sat, 26 Jun 2021, Martin Steigerwald wrote:
> Hi!
>
> I found repeatedly that Baloo indexes the same files twice or even more
> often after a while.
>
> I reported this upstream in:
>
> Bug 438434 - Baloo appears to be indexing twice the number of files than
> are actually in my home directory
>
> https://bugs.kde.org/show_bug.cgi?id=438434
>
> And got back that if the device number changes, Baloo will think it has
> new files even tough the path is still the same. And found over time that
> the device number for the single BTRFS filesystem on a NVMe SSD in a
> ThinkPad T14 Gen1 AMD can change. It is not (maybe yet) RAID 1. I do
> have BTRFS RAID 1 in another laptop and there I also had this issue
> already.
>
> I argued that a desktop application has no business to rely on a device
> number and got back that search/indexing is in the middle between an
> application and system software.
NO SOFTWARE can rely on device numbers being stable in Linux. Not
desktop, not system, not anything. They are stable while the device is
in use (e.g. while the filesystem is mounted) but can definitely change
on reboot. This has been the case since about Linux 2.4.
> And that Baloo needs an "invariant" for
> a file. See comment #11 of that bug report:
That is really hard to provide in general. Possibly the best approach
is to use the statfs() system call to get the "f_fsid" field. This is
64 bits. It is not supported uniformly well by all filesystems, but I
think it is at least not worse than using the device number. For a lot
of older filesystems it is just an encoding of the device number.
For btrfs, xfs, ext4 it is much, much better.
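A minimal sketch of a file key built this way, assuming Python 3.7 or later
on Linux. Python's standard library does not expose statfs() directly, so
the sketch goes through os.statvfs(), whose f_fsid field is, as far as I
know, derived from the same value by the C library. The helper name
file_key is hypothetical.

```python
import os


def file_key(path):
    """Return (f_fsid, st_ino) as a candidate identifier for path."""
    fsid = os.statvfs(path).f_fsid  # filesystem ID instead of st_dev
    ino = os.stat(path).st_ino      # inode number within that filesystem
    return fsid, ino


if __name__ == "__main__":
    fsid, ino = file_key(".")
    print(f"f_fsid = {fsid:#x}, inode = {ino}")
```

How stable the key is depends on the filesystem, as said above: where f_fsid
is only an encoding of the device number, nothing is gained over st_dev.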
NeilBrown
>
> https://bugs.kde.org/show_bug.cgi?id=438434#c11
>
> I got the suggestion to try to find a way to tell the kernel to use a
> fixed device number.
>
> I still think, an application or an infrastructure service for a desktop
> environment or even anything else in user space should not rely on a
> device number to be fixed and never change upon reboots.
>
> But maybe you have a different idea about that and it is okay for an
> userspace component to do that. I would like to hear your idea about
> that.
>
> Another question would be whether I could somehow make sure that the
> device number does not change, even if just as a work-around. I know for
> NFS there is a fsid= mount option, but it does not appear to be
> something generic, at least the mount man page seems to have nothing
> related to fsid.
>
>
> Best,
> --
> Martin
>
>
>
* Re: Assumption on fixed device numbers in Plasma's desktop search Baloo
2021-06-26 0:54 ` NeilBrown
@ 2021-06-26 3:38 ` Bart Van Assche
2021-06-26 5:17 ` NeilBrown
2021-06-26 8:51 ` Martin Steigerwald
1 sibling, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2021-06-26 3:38 UTC (permalink / raw)
To: NeilBrown, Martin Steigerwald; +Cc: linux-block, linux-btrfs
On 6/25/21 5:54 PM, NeilBrown wrote:
> On Sat, 26 Jun 2021, Martin Steigerwald wrote:
>> And that Baloo needs an "invariant" for
>> a file. See comment #11 of that bug report:
>
> That is really hard to provide in general. Possibly the best approach
> is to use the statfs() systemcall to get the "f_fsid" field. This is
> 64bits. It is not supported uniformly well by all filesystems, but I
> think it is at least not worse than using the device number. For a lot
> of older filesystems it is just an encoding of the device number.
>
> For btrfs, xfs, ext4 it is much much better.
How about combining the UUID of the partition with the file path? An
example from one of the VMs on my workstation:
$ df .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vda1 25670972 12730276 11613648 53% /
$ lsblk -O | grep vda1
└─vda1 vda1 /dev/vda1 252:1 11.1G 24.5G ext4 12.1G 50% 1.0
/ 84cebea8-7e6f-4c2a-8a1b-8bc0c9744751 ae2151de
dos 0x83 Linux ae2151de-01
0x80 128 0 0 0
25G root disk brw-rw---- 0 512
0 512 512 1 mq-deadline 256 part 0 512B
2G 0 0B 0 vda block:virtio:pci
none 0
In other words, UUID 84cebea8-7e6f-4c2a-8a1b-8bc0c9744751 has been
associated with the block device under the filesystem that owns the
directory from which the 'df' command has been run.
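A sketch of what such a combined key could look like, assuming Python 3. The
UUID and the mount point are passed in as plain strings; how they are
obtained (for example from lsblk output as above) is left out, and
uuid_path_key is a hypothetical helper.

```python
import os


def uuid_path_key(fs_uuid, mount_point, path):
    """Combine a UUID with the path of a file relative to its mount point."""
    rel = os.path.relpath(os.path.abspath(path), os.path.abspath(mount_point))
    # A relative path that starts with ".." means path is outside mount_point.
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError(f"{path!r} is not below {mount_point!r}")
    return f"{fs_uuid}:{rel}"
```

Using the path relative to the mount point keeps the key the same if the
filesystem is later mounted somewhere else, though it still changes when a
file is renamed or moved.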
Bart.
* Re: Assumption on fixed device numbers in Plasma's desktop search Baloo
2021-06-26 3:38 ` Bart Van Assche
@ 2021-06-26 5:17 ` NeilBrown
2021-06-26 6:14 ` Andrei Borzenkov
0 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2021-06-26 5:17 UTC (permalink / raw)
To: Bart Van Assche; +Cc: Martin Steigerwald, linux-block, linux-btrfs
On Sat, 26 Jun 2021, Bart Van Assche wrote:
> On 6/25/21 5:54 PM, NeilBrown wrote:
> > On Sat, 26 Jun 2021, Martin Steigerwald wrote:
> >> And that Baloo needs an "invariant" for
> >> a file. See comment #11 of that bug report:
> >
> > That is really hard to provide in general. Possibly the best approach
> > is to use the statfs() systemcall to get the "f_fsid" field. This is
> > 64bits. It is not supported uniformly well by all filesystems, but I
> > think it is at least not worse than using the device number. For a lot
> > of older filesystems it is just an encoding of the device number.
> >
> > For btrfs, xfs, ext4 it is much much better.
>
> How about combining the UUID of the partition with the file path? An
> example from one of the VMs on my workstation:
A btrfs filesystem can span multiple partitions, and those partitions
can be added and removed dynamically. So a filesystem could migrate from
one partition to another.
f_fsid really is best for any modern filesystem.
NeilBrown
* Re: Assumption on fixed device numbers in Plasma's desktop search Baloo
2021-06-26 5:17 ` NeilBrown
@ 2021-06-26 6:14 ` Andrei Borzenkov
2021-06-26 6:24 ` Qu Wenruo
0 siblings, 1 reply; 11+ messages in thread
From: Andrei Borzenkov @ 2021-06-26 6:14 UTC (permalink / raw)
To: NeilBrown, Bart Van Assche; +Cc: Martin Steigerwald, linux-block, linux-btrfs
On 26.06.2021 08:17, NeilBrown wrote:
> On Sat, 26 Jun 2021, Bart Van Assche wrote:
>> On 6/25/21 5:54 PM, NeilBrown wrote:
>>> On Sat, 26 Jun 2021, Martin Steigerwald wrote:
>>>> And that Baloo needs an "invariant" for
>>>> a file. See comment #11 of that bug report:
>>>
>>> That is really hard to provide in general. Possibly the best approach
>>> is to use the statfs() systemcall to get the "f_fsid" field. This is
>>> 64bits. It is not supported uniformly well by all filesystems, but I
>>> think it is at least not worse than using the device number. For a lot
>>> of older filesystems it is just an encoding of the device number.
>>>
>>> For btrfs, xfs, ext4 it is much much better.
>>
>> How about combining the UUID of the partition with the file path? An
>> example from one of the VMs on my workstation:
>
> A btrfs filesystem can span multiple partitions, and those partitions
> can be added and removed dynamically. So you could migrated from one to
> another.
>
I suspect it was intended to be "filesystem UUID". At least that is the
field in the lsblk output that was referenced.
> f_fsid really is best for any modern filesystem.
>
> NeilBrown
>
* Re: Assumption on fixed device numbers in Plasma's desktop search Baloo
2021-06-26 6:14 ` Andrei Borzenkov
@ 2021-06-26 6:24 ` Qu Wenruo
0 siblings, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2021-06-26 6:24 UTC (permalink / raw)
To: Andrei Borzenkov, NeilBrown, Bart Van Assche
Cc: Martin Steigerwald, linux-block, linux-btrfs
On 2021/6/26 2:14 PM, Andrei Borzenkov wrote:
> On 26.06.2021 08:17, NeilBrown wrote:
>> On Sat, 26 Jun 2021, Bart Van Assche wrote:
>>> On 6/25/21 5:54 PM, NeilBrown wrote:
>>>> On Sat, 26 Jun 2021, Martin Steigerwald wrote:
>>>>> And that Baloo needs an "invariant" for
>>>>> a file. See comment #11 of that bug report:
>>>>
>>>> That is really hard to provide in general. Possibly the best approach
>>>> is to use the statfs() systemcall to get the "f_fsid" field. This is
>>>> 64bits. It is not supported uniformly well by all filesystems, but I
>>>> think it is at least not worse than using the device number. For a lot
>>>> of older filesystems it is just an encoding of the device number.
>>>>
>>>> For btrfs, xfs, ext4 it is much much better.
>>>
>>> How about combining the UUID of the partition with the file path? An
>>> example from one of the VMs on my workstation:
>>
>> A btrfs filesystem can span multiple partitions, and those partitions
>> can be added and removed dynamically. So you could migrated from one to
>> another.
>>
>
> I suspect it was intended to be "filesytemm UUID". At least that is the
> field in lsblk output that was referenced.
Filesystem UUID is not enough.
In btrfs, all subvolumes share the same fsid.
For the statfs() call, however, we additionally XOR in the subvolume id
to generate a unique f_fsid for each subvolume.
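The effect of that XOR can be shown with a small model. This is an
illustration only, assuming Python 3: the way the 128-bit UUID is folded
into two 32-bit words here is an assumption made for the sketch, not a copy
of the kernel code, and model_f_fsid is a made-up name. The point is simply
that the same filesystem UUID with different subvolume ids yields different
results.

```python
import uuid


def model_f_fsid(fs_uuid, subvol_id):
    """Fold a filesystem UUID to two 32-bit words and XOR in a subvolume id."""
    raw = fs_uuid.bytes  # 16 bytes
    words = [int.from_bytes(raw[i:i + 4], "big") for i in range(0, 16, 4)]
    val0 = words[0] ^ words[2]
    val1 = words[1] ^ words[3]
    # Mix in the 64-bit subvolume id: high half into val0, low half into val1.
    val0 ^= (subvol_id >> 32) & 0xFFFFFFFF
    val1 ^= subvol_id & 0xFFFFFFFF
    return val0, val1


if __name__ == "__main__":
    fs = uuid.uuid4()  # stand-in for a real filesystem UUID
    for subvol in (5, 256, 257):
        print(subvol, [hex(v) for v in model_f_fsid(fs, subvol)])
```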
Thanks,
Qu
>
>> f_fsid really is best for any modern filesystem.
>>
>> NeilBrown
>>
>
* Re: Assumption on fixed device numbers in Plasma's desktop search Baloo
2021-06-26 0:27 ` Qu Wenruo
@ 2021-06-26 8:49 ` Martin Steigerwald
2021-06-26 9:33 ` Qu Wenruo
0 siblings, 1 reply; 11+ messages in thread
From: Martin Steigerwald @ 2021-06-26 8:49 UTC (permalink / raw)
To: linux-block, Qu Wenruo; +Cc: linux-btrfs
Qu Wenruo - 26.06.21, 02:27:54 CEST:
> On 2021/6/26 3:06 AM, Martin Steigerwald wrote:
> > Hi!
> >
> > I found repeatedly that Baloo indexes the same files twice or even
> > more often after a while.
> >
> > I reported this upstream in:
> >
> > Bug 438434 - Baloo appears to be indexing twice the number of files
> > than are actually in my home directory
> >
> > https://bugs.kde.org/show_bug.cgi?id=438434
> >
> > And got back that if the device number changes, Baloo will think it
> > has new files even tough the path is still the same. And found over
> > time that the device number for the single BTRFS filesystem on a
> > NVMe SSD in a ThinkPad T14 Gen1 AMD can change. It is not (maybe
> > yet) RAID 1. I do have BTRFS RAID 1 in another laptop and there I
> > also had this issue already.
>
> Since btrfs has multi-device support by default, it reports anonymous
> device number, just as if you use a filesystem over LVM.
Ah, this!
I forgot to mention that: I use BTRFS on top of LVM on top of LUKS-based
dm-crypt on a partition on the NVMe SSD. Sorry, somehow I forgot to
mention that here. I mentioned it in the bug report. I'd use a different
approach if there were one that gave me full disk encryption. I am
not willing to use ecryptfs on top of BTRFS and as far as I know BTRFS
cannot yet encrypt by itself.
I still think this could give a fixed order of loading:
1. Unlock LUKS.
2. Activate LVM logical volumes. No idea whether that happens in a fixed
order though or whether it can have a different order on each boot.
3. Mount BTRFS. /home is always on the same subvolume. So that should
not change.
> The problem is why the anonymous device number change.
Good question. Maybe I have an idea about that. See below.
> > I argued that a desktop application has no business to rely on a
> > device number and got back that search/indexing is in the middle
> > between an application and system software. And that Baloo needs an
> > "invariant" for a file. See comment #11 of that bug report:
> >
> > https://bugs.kde.org/show_bug.cgi?id=438434#c11
>
> Well, a lot of tools relies on device number to distinguish filesystem
> boundary, like find.
> Thus it's a little hard to argue.
>
> But on the other hand, it also means baloo can't handle regular fs
> over LVM cases well neither.
Yes. Also it could not handle the case of a driver loading race
condition with two or more different controllers in a desktop machine.
> > I got the suggestion to try to find a way to tell the kernel to use
> > a fixed device number.
>
> I don't think it's possible for btrfs, as each subvolume get its
> anonymous device number assigned when it gets first read.
>
> Thus it's really hard to make it fixed, as the reason for anonymous
> device number is to avoid conflicts.
Fair enough.
> > I still think, an application or an infrastructure service for a
> > desktop environment or even anything else in user space should not
> > rely on a device number to be fixed and never change upon reboots.
>
> Well, LVM/device mapper is doing the same thing, a lot of behavior
> change is never a good idea for the kernel.
>
> Thus for use cases where we really need a proper mapping, we use
> hashes, not just device number, like what we did in dupremover.
I think I suggested that some time ago.
> > Another question would be whether I could somehow make sure that the
> > device number does not change, even if just as a work-around.
>
> If you really just want a fixed device number, you can ensure that by:
>
> - Make sure all users of anonymous devices get fixed sequence
> Things like device mapper/LVM, btrfs should get loaded/initialized
> in a fixed order.
Ah, I see.
> - Make sure the subvolume you care always get mounted/read before any
> other subvolumes
> So that the target subvolume always get the first device number in
> the pool.
Hmm, that may be a pointer. This is what I currently have in fstab:
/dev/nvme/home /home btrfs lazytime,compress=zstd 0 0
/dev/nvme/home /zeit/home btrfs subvol=zeit 0 0
In the first line the default subvolume is used which I changed
accordingly after creating this BTRFS. I use the approach to keep
(temporary) snapshots separated from the directory tree in /home.
Could it be that this order between these two mounts is not the same on
every boot? I use Devuan with Runit, so the mounting would happen by
some init scripts (instead of Systemd).
I am not aware of an option for fstab to mount this one first and then
the other second, but I could set the second mount to noauto and mount
it when I need it.
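To check whether the numbers for the two mounts really differ between boots,
the device numbers can be recorded once per boot and compared. A minimal
sketch, assuming Python 3; the helper names and the sample values in the
usage note are made up for illustration.

```python
import os


def snapshot_devices(paths):
    """Map each path to the "major:minor" string that stat() reports for it."""
    snap = {}
    for p in paths:
        st = os.stat(p)
        snap[p] = f"{os.major(st.st_dev)}:{os.minor(st.st_dev)}"
    return snap


def changed_devices(old, new):
    """Return {path: (old, new)} for paths whose device number differs."""
    return {
        p: (old[p], new[p])
        for p in old
        if p in new and old[p] != new[p]
    }
```

For example, saving snapshot_devices(["/home", "/zeit/home"]) after each
boot and feeding two of the results to changed_devices() would return
something like {"/home": ("0:26", "0:31")} if the number moved, and an empty
dict otherwise.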
> But this also means, all later subvolumes not in the fixed
> mount/read sequence can not get a fixed number.
I somehow thought this would get complicated.
Best,
--
Martin
* Re: Assumption on fixed device numbers in Plasma's desktop search Baloo
2021-06-26 0:54 ` NeilBrown
2021-06-26 3:38 ` Bart Van Assche
@ 2021-06-26 8:51 ` Martin Steigerwald
1 sibling, 0 replies; 11+ messages in thread
From: Martin Steigerwald @ 2021-06-26 8:51 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-block, linux-btrfs
NeilBrown - 26.06.21, 02:54:09 CEST:
> > And that Baloo needs an "invariant" for
>
> > a file. See comment #11 of that bug report:
> That is really hard to provide in general. Possibly the best approach
> is to use the statfs() systemcall to get the "f_fsid" field. This is
> 64bits. It is not supported uniformly well by all filesystems, but I
> think it is at least not worse than using the device number. For a
> lot of older filesystems it is just an encoding of the device number.
>
> For btrfs, xfs, ext4 it is much much better.
Thank you for the clear statement and for your alternative suggestion. I
will forward this to Baloo upstream.
I think the main focus of Baloo would be to work on the Linux filesystems
most commonly in use today, which should be BTRFS, XFS, EXT4 and probably
F2FS.
--
Martin
* Re: Assumption on fixed device numbers in Plasma's desktop search Baloo
2021-06-26 8:49 ` Martin Steigerwald
@ 2021-06-26 9:33 ` Qu Wenruo
2021-06-26 10:18 ` Martin Steigerwald
0 siblings, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2021-06-26 9:33 UTC (permalink / raw)
To: Martin Steigerwald, linux-block; +Cc: linux-btrfs
On 2021/6/26 4:49 PM, Martin Steigerwald wrote:
> Qu Wenruo - 26.06.21, 02:27:54 CEST:
>> On 2021/6/26 3:06 AM, Martin Steigerwald wrote:
>>> Hi!
>>>
>>> I found repeatedly that Baloo indexes the same files twice or even
>>> more often after a while.
>>>
>>> I reported this upstream in:
>>>
>>> Bug 438434 - Baloo appears to be indexing twice the number of files
>>> than are actually in my home directory
>>>
>>> https://bugs.kde.org/show_bug.cgi?id=438434
>>>
>>> And got back that if the device number changes, Baloo will think it
>>> has new files even tough the path is still the same. And found over
>>> time that the device number for the single BTRFS filesystem on a
>>> NVMe SSD in a ThinkPad T14 Gen1 AMD can change. It is not (maybe
>>> yet) RAID 1. I do have BTRFS RAID 1 in another laptop and there I
>>> also had this issue already.
>>
>> Since btrfs has multi-device support by default, it reports anonymous
>> device number, just as if you use a filesystem over LVM.
>
> Ah, this!
>
> I forgot to mention that: I use BTRFS on top of LVM on top of LUKS based
> dm-crypt on a partition on the NVMe SSD. Sorry, somehow I forgot to
> mention that here. I mentioned it in the bug report. I'd use a different
> approach if there would be one that give me full disk encryption. I am
> not willing to use ecryptfs on top of BTRFS and as far as I know BTRFS
> cannot yet encrypt by itself.
>
> I still think this could give a fixed order of loading:
>
> 1. Unlock LUKS.
>
> 2. Activate LVM logical volumes. No idea whether that happens in a fixed
> order though or whether it can have a different order on each boot.
LVM/LUKS normally isn't a big deal, as they are mostly initialized
before btrfs and have a pretty fixed initialization sequence.
Unless you change the LVM setup, at least all your LVs should have
a fixed device number.
(But there are still cases where a kernel update may change them.)
>
> 3. Mount BTRFS. /home is always on the same subvolume. So that should
> not change.
Normally it won't change.
But it depends more on the btrfs behavior.
Thus I'm not that confident it will never change.
But by now I guess you already get the point: under normal circumstances,
with no config change, the device number won't change.
However, any change in the kernel/storage stack/config can lead to a
different device number.
>
>> The problem is why the anonymous device number change.
>
> Good question. Maybe I have an idea about that. See below.
>
>>> I argued that a desktop application has no business to rely on a
>>> device number and got back that search/indexing is in the middle
>>> between an application and system software. And that Baloo needs an
>>> "invariant" for a file. See comment #11 of that bug report:
>>>
>>> https://bugs.kde.org/show_bug.cgi?id=438434#c11
>>
>> Well, a lot of tools relies on device number to distinguish filesystem
>> boundary, like find.
>> Thus it's a little hard to argue.
>>
>> But on the other hand, it also means baloo can't handle regular fs
>> over LVM cases well neither.
>
> Yes. Also it could not handle the case of a driver loading race
> condition with two or more different controllers in a desktop machine.
Thus the idea from Neil should help: instead of using the device number,
using f_fsid from statfs() should provide a far more stable result.
And f_fsid can also handle btrfs subvolumes pretty well.
But this also means that if one day you change your default/mounted
subvolume, baloo will again rebuild the cache using the new f_fsid.
>
>>> I got the suggestion to try to find a way to tell the kernel to use
>>> a fixed device number.
>>
>> I don't think it's possible for btrfs, as each subvolume get its
>> anonymous device number assigned when it gets first read.
>>
>> Thus it's really hard to make it fixed, as the reason for anonymous
>> device number is to avoid conflicts.
>
> Fair enough.
>
>>> I still think, an application or an infrastructure service for a
>>> desktop environment or even anything else in user space should not
>>> rely on a device number to be fixed and never change upon reboots.
>>
>> Well, LVM/device mapper is doing the same thing, a lot of behavior
>> change is never a good idea for the kernel.
>>
>> Thus for use cases where we really need a proper mapping, we use
>> hashes, not just device number, like what we did in dupremover.
>
> I think I suggested that some time ago.
>
>>> Another question would be whether I could somehow make sure that the
>>> device number does not change, even if just as a work-around.
>>
>> If you really just want a fixed device number, you can ensure that by:
>>
>> - Make sure all users of anonymous devices get fixed sequence
>> Things like device mapper/LVM, btrfs should get loaded/initialized
>> in a fixed order.
>
> Ah, I see.
>
>> - Make sure the subvolume you care always get mounted/read before any
>> other subvolumes
>> So that the target subvolume always get the first device number in
>> the pool.
>
> Hmm, that may be a pointer. This is what I currently have in fstab:
>
> /dev/nvme/home /home btrfs lazytime,compress=zstd 0 0
> /dev/nvme/home /zeit/home btrfs subvol=zeit 0 0
>
> In the first line the default subvolume is used which I changed
> accordingly after creating this BTRFS. I use the approach to keep
> (temporary) snapshots separated from the directory tree in /home.
>
> Could it be that this order between these two mounts is not the same on
> every boot?
> I use Devuan with Runit, so the mounting would happen by
> some init scripts (instead of Systemd).
Then it's out of the scope of btrfs.
I was just wondering if systemd is involved, but you just ruled it out.
But still, if the init tool chooses to shuffle the mount sequence to do
more parallel mounts, then the device number will be even more unreliable.
>
> I am not aware of an option for fstab to mount this one first and then
> the other second, but I could set the second mount to noauto and mount
> it when I need it.
>
>> But this also means, all later subvolumes not in the fixed
>> mount/read sequence can not get a fixed number.
>
> I somehow thought this would get complicated.
It's already complicated.
So this just proves Neil is right: the device number is only reliable for
the lifespan of the mounted fs, nothing else.
Thanks,
Qu
>
> Best,
>
* Re: Assumption on fixed device numbers in Plasma's desktop search Baloo
2021-06-26 9:33 ` Qu Wenruo
@ 2021-06-26 10:18 ` Martin Steigerwald
0 siblings, 0 replies; 11+ messages in thread
From: Martin Steigerwald @ 2021-06-26 10:18 UTC (permalink / raw)
To: linux-block, Qu Wenruo; +Cc: linux-btrfs
Qu Wenruo - 26.06.21, 11:33:17 CEST:
> > I am not aware of an option for fstab to mount this one first and
> > then the other second, but I could set the second mount to noauto
> > and mount it when I need it.
> >
> >> But this also means, all later subvolumes not in the fixed
> >> mount/read sequence can not get a fixed number.
> >
> > I somehow thought this would get complicated.
>
> It's already complicated.
>
> So this just proves Neil is right, device number is only reliable at
> the lifespan of the fs, nothing else.
Thank you again.
I informed upstream about the conclusions from this thread.
Let's see what they come up with.
They have an energy efficiency goal; for that, it would be desirable to
stop indexing files twice or thrice or even more times. :)
Best,
--
Martin