From: Martin Steigerwald <martin@lichtvoll.de>
To: linux-block@vger.kernel.org, Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Assumption on fixed device numbers in Plasma's desktop search Baloo
Date: Sat, 26 Jun 2021 10:49:30 +0200
Message-ID: <2009039.b04VgvrTqe@ananda>
In-Reply-To: <fe83dadc-bbcf-2f85-6664-bad3fcd83553@gmx.com>
Qu Wenruo - 26.06.21, 02:27:54 CEST:
> On 2021/6/26 3:06 AM, Martin Steigerwald wrote:
> > Hi!
> >
> > I found repeatedly that Baloo indexes the same files twice or even
> > more often after a while.
> >
> > I reported this upstream in:
> >
> > Bug 438434 - Baloo appears to be indexing twice the number of files
> > than are actually in my home directory
> >
> > https://bugs.kde.org/show_bug.cgi?id=438434
> >
> > And got back that if the device number changes, Baloo will think it
> > has new files even though the path is still the same. I have found
> > over time that the device number of the single BTRFS filesystem on
> > an NVMe SSD in a ThinkPad T14 Gen1 AMD can change. It is not (maybe
> > not yet) RAID 1. I do have BTRFS RAID 1 in another laptop, and I
> > have already had this issue there as well.
>
> Since btrfs has multi-device support by default, it reports an
> anonymous device number, just as if you use a filesystem over LVM.

Ah, this!

I forgot to mention it here, sorry, although I did mention it in the bug
report: I use BTRFS on top of LVM on top of LUKS-based dm-crypt on a
partition on the NVMe SSD. I would use a different approach if there
were one that gave me full disk encryption. I am not willing to use
ecryptfs on top of BTRFS, and as far as I know BTRFS cannot yet encrypt
by itself.

I still think this could give a fixed order of loading:

1. Unlock LUKS.
2. Activate the LVM logical volumes. I have no idea whether that happens
   in a fixed order or whether the order can differ on each boot.
3. Mount BTRFS. /home is always on the same subvolume, so that should
   not change.

> The problem is why the anonymous device number changes.

Good question. Maybe I have an idea about that. See below.

> > I argued that a desktop application has no business relying on a
> > device number and was told that search/indexing sits in the middle
> > between an application and system software, and that Baloo needs an
> > "invariant" for a file. See comment #11 of that bug report:
> >
> > https://bugs.kde.org/show_bug.cgi?id=438434#c11
>
> Well, a lot of tools rely on the device number to distinguish
> filesystem boundaries, like find.
> Thus it's a little hard to argue.
>
> But on the other hand, it also means Baloo can't handle a regular fs
> over LVM well either.

Yes. It also could not handle a driver-loading race between two or more
different controllers in a desktop machine.
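
To make the dependence on device numbers concrete, here is a minimal
Python sketch. It is not taken from Baloo or find; the helper names
device_number and same_filesystem are made up for illustration. It reads
st_dev via stat(2), which is the value a tool compares to detect a
filesystem boundary, and the value that can differ between boots in the
setups discussed here.

```python
import os
from typing import Tuple


def device_number(path: str) -> Tuple[int, int]:
    """Return (major, minor) of the device that `path` lives on."""
    dev = os.stat(path).st_dev
    return os.major(dev), os.minor(dev)


def same_filesystem(a: str, b: str) -> bool:
    """Compare st_dev of two paths, the usual filesystem-boundary test."""
    return os.stat(a).st_dev == os.stat(b).st_dev


if __name__ == "__main__":
    # Print the device number of the current directory.
    major, minor = device_number(".")
    print(f"{major}:{minor}")
```

Running device_number on the same path before and after a reboot would
show whether the number moved.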
> > I got the suggestion to try to find a way to tell the kernel to use
> > a fixed device number.
>
> I don't think it's possible for btrfs, as each subvolume gets its
> anonymous device number assigned when it is first read.
>
> Thus it's really hard to make it fixed, as the reason for anonymous
> device numbers is to avoid conflicts.

Fair enough.
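
To see which anonymous device numbers are currently handed out, one can
read /proc/self/mountinfo, where the third field is major:minor and
anonymous devices have major 0. Below is a small Python sketch, assuming
the mountinfo layout documented in proc(5). MountEntry, parse_mountinfo
and anonymous_mounts are hypothetical names, not an existing API. Note
that on btrfs the number stat() reports for files inside a subvolume may
differ from the one listed for the mount.

```python
import os
from typing import Iterable, Iterator, List, NamedTuple


class MountEntry(NamedTuple):
    major: int
    minor: int
    mount_point: str
    fs_type: str
    source: str


def parse_mountinfo(lines: Iterable[str]) -> Iterator[MountEntry]:
    """Yield one MountEntry per line in /proc/self/mountinfo format."""
    for line in lines:
        # Left of " - ": id, parent id, major:minor, root, mount point, ...
        # Right of " - ": filesystem type, mount source, super options.
        left, sep, right = line.strip().partition(" - ")
        pre, post = left.split(), right.split()
        if not sep or len(pre) < 5 or len(post) < 2:
            continue  # skip empty or malformed lines
        major, minor = (int(part) for part in pre[2].split(":"))
        yield MountEntry(major, minor, pre[4], post[0], post[1])


def anonymous_mounts(lines: Iterable[str]) -> List[MountEntry]:
    """Return the entries whose major number is 0 (anonymous devices)."""
    return [entry for entry in parse_mountinfo(lines) if entry.major == 0]


if __name__ == "__main__" and os.path.exists("/proc/self/mountinfo"):
    with open("/proc/self/mountinfo") as f:
        for e in anonymous_mounts(f):
            print(f"{e.major}:{e.minor}\t{e.fs_type}\t{e.mount_point}")
```

Comparing the output across a few boots would show whether the minor
number assigned to a given mount point is stable.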
> > I still think, an application or an infrastructure service for a
> > desktop environment or even anything else in user space should not
> > rely on a device number to be fixed and never change upon reboots.
>
> Well, LVM/device mapper is doing the same thing, and a big behavior
> change is never a good idea for the kernel.
>
> Thus for use cases where we really need a proper mapping, we use
> hashes, not just the device number, like what we did in duperemove.

I think I suggested that some time ago.
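
As an illustration of keying on content rather than on the device
number, a minimal Python sketch follows. It is not taken from any of the
tools mentioned in this thread; the function names are hypothetical and
SHA-256 is just an assumed choice of hash.

```python
import hashlib
from typing import Iterable, Iterator


def content_key(chunks: Iterable[bytes]) -> str:
    """Return a hex digest over the data; independent of st_dev and path."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def file_chunks(path: str, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Read a file in fixed-size chunks so large files are not loaded whole."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return
            yield chunk


def file_key(path: str) -> str:
    """Key for a file on disk, stable across reboots while content is unchanged."""
    return content_key(file_chunks(path))
```

Hashing every file is of course much more expensive than a stat() call,
so whether this is acceptable for an indexer is a separate question.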
> > Another question would be whether I could somehow make sure that the
> > device number does not change, even if just as a work-around.
>
> If you really just want a fixed device number, you can ensure that by:
>
> - Make sure all users of anonymous devices get a fixed sequence.
>   Things like device mapper/LVM and btrfs should get loaded/initialized
>   in a fixed order.

Ah, I see.

> - Make sure the subvolume you care about always gets mounted/read
>   before any other subvolume,
>   so that the target subvolume always gets the first device number in
>   the pool.

Hmm, that may be a pointer. This is what I currently have in fstab:

/dev/nvme/home /home btrfs lazytime,compress=zstd 0 0
/dev/nvme/home /zeit/home btrfs subvol=zeit 0 0

The first line uses the default subvolume, which I changed accordingly
after creating this BTRFS. My approach is to keep (temporary) snapshots
separate from the directory tree in /home.

Could it be that the order of these two mounts is not the same on every
boot? I use Devuan with Runit, so the mounting is done by some init
scripts (instead of systemd).

I am not aware of an fstab option to mount this one first and the other
second, but I could set the second mount to noauto and mount it when I
need it.

> But this also means that all later subvolumes not in the fixed
> mount/read sequence cannot get a fixed number.

I somehow thought this would get complicated.

Best,
--
Martin