From: David Sterba <email@example.com>
To: Chris Murphy <firstname.lastname@example.org>
Cc: Neal Gompa <email@example.com>, Btrfs BTRFS <firstname.lastname@example.org>, Josef Bacik <email@example.com>, David Sterba <firstname.lastname@example.org>
Subject: Re: APFS improvements (e.g. firm links, volume w/ subvols replication) as ideas for Btrfs?
Date: Wed, 12 Jun 2019 11:58:51 +0200
Message-ID: <20190612095851.GG3563@twin.jikos.cz>
In-Reply-To: <CAJCQCtSPZwcg5y-d+mOhmyCdvq1dpzLUg05kPUg7CYhZp6Oz_Q@mail.gmail.com>

On Tue, Jun 11, 2019 at 10:03:51PM -0600, Chris Murphy wrote:
> On Tue, Jun 11, 2019 at 12:31 PM Neal Gompa <email@example.com> wrote:
> >
> > Hey,
> >
> > So Apple held its WWDC event last week, and among other things, they
> > talked about improvements they've made to filesystems in macOS.
> >
> > Among other things, one of the things introduced was a concept of
> > "firm links", which is something like NTFS' directory junctions,
> > except they can cross (sub)volumes.
>
> My understanding is it's a work around for the lack of APFS supporting
> directory hardlinks. Btrfs does support directory hardlinks but a

Directory hardlinks are not supported in general on linux and prohibited
on the VFS level. (check fs/namei.c vfs_link, explicitly returns -EPERM
for a directory).

> hardlink points to a particular inode within a particular subvolume
> (files tree) so it's not possible to have a hard link that crosses
> subvolumes. A reflink can already do this, but it's really just an
> efficient copy, the resulting directory is independent. A directory
> symlink can mirror a directory across subvolumes, but like any symlink
> it must have a fixed path available to always find the real deal.
>
> I think a firm link like thing on Btrfs would require a format change,
> but I'm not certain. My best guess of what it'd be, is a dir/file
> object that gets its own inode but contains a hard reference (not
> independent object) to a subvolid+inode.
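
The VFS restriction on directory hardlinks mentioned above can be observed from user space. The following is a minimal sketch, not taken from the thread: it calls link(2) on a freshly created directory through Python's os.link and reports the errno. The helper name try_directory_hardlink and the directory names are made up for illustration. On Linux the expected result is EPERM, matching the check in vfs_link; other systems may report a different error.

```python
import errno
import os
import tempfile


def try_directory_hardlink():
    """Attempt link(2) on a directory; return the errno, or None on success.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "real_dir")
        alias = os.path.join(tmp, "alias_dir")
        os.mkdir(target)
        try:
            # os.link maps to link(2)/linkat(2); the kernel refuses
            # directories before the filesystem is even consulted.
            os.link(target, alias)
        except OSError as exc:
            return exc.errno
        return None


if __name__ == "__main__":
    err = try_directory_hardlink()
    # Print the symbolic errno name if the call failed.
    print(errno.errorcode.get(err, err))
```

Because the refusal happens in the VFS, the result should not depend on whether the temporary directory lives on Btrfs or on another filesystem.
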
> >
> >This concept makes it easier to
> > handle uglier layouts. While bind mounts work kind of okay for this
> > with simpler configurations, it requires operating system awareness,
> > rather than being setup automatically as the volume is mounted. This
> > is less brittle and works better for recovery environments, and help
> > make easier to do read-only system volumes while supported read-write
> > sections in a more flexible way.
>
> There are a couple of things going on. One is something between VFS
> and Btrfs does this goofy assumption that bind mounts are subvolumes,
> which is definitely not true. I bring this up here:
> https://lore.kernel.org/linux-btrfs/CAJCQCtT=-YoFJgEo=BFqfiPdtMoJCYR3dJPSekf+HQ22GYGztw@mail.gmail.com/

The subvolumes build on top of the bind mount API internally but it is
or should be a different kind of object.

> Near as I can tell, Btrfs kernel code just needs to be smarter about
> distinguishing between bind mounts of directories versus the behind
> the scene bind mount used for subvolumes mounted using -o subvol= or
> -o subvolid= ; I don't think that's difficult. It's just someone needs
> to work through the logic and set aside the resources to do it.

I tried to fix that and got half way through, then hit the difficult
problems, mainly with nested subvolumes. For leaf subvolumes, the
difference between subvolume/dir/dir/dir (bind mounted) and subvolume
(mounted with -o) is to traverse back the path until the subvolume is
hit, which in both cases would be 'subvolume'.

However, with nested subvolumes it's not easy to see where to stop:

  subvol1/dir/dir/subvol2/dir/dir/subvol3/dir/dir

and take 3 cases:

  mount -o subvol=subvol1
  mount -o subvol=subvol2
  mount -o subvol=subvol3

the backward path traversal will always say it's subvol3 (that's wrong
from the user's POV).

Keeping track of the exact subvolume that was mounted is not trivial
because it partially has to duplicate the internal VFS information,
which makes it hard to keep consistent after moves. There was a concept
proposal called 'fs view' that would add a proper subvolume abstraction
to VFS, but I don't know how far this got.
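
The backward traversal problem can be shown with a toy model. The sketch below is not kernel code and does not reflect how Btrfs or the VFS store this information; it only assumes that the subvolume roots are known as paths relative to the top level of the filesystem, and walks a path upwards until it hits one. The names SUBVOLUME_ROOTS and nearest_subvolume are hypothetical.

```python
from pathlib import PurePosixPath

# Layout from the message, relative to the filesystem top level.
SUBVOLUME_ROOTS = {
    PurePosixPath("subvol1"),
    PurePosixPath("subvol1/dir/dir/subvol2"),
    PurePosixPath("subvol1/dir/dir/subvol2/dir/dir/subvol3"),
}


def nearest_subvolume(path, roots=SUBVOLUME_ROOTS):
    """Walk from path towards the top level; return the first subvolume root hit."""
    current = PurePosixPath(path)
    while True:
        if current in roots:
            return current
        if current == current.parent:
            # Reached the top without crossing a subvolume root.
            return None
        current = current.parent


if __name__ == "__main__":
    # Leaf case: the bind-mounted directory and the subvolume itself agree.
    leaf = {PurePosixPath("subvolume")}
    print(nearest_subvolume("subvolume/dir/dir/dir", leaf))
    print(nearest_subvolume("subvolume", leaf))

    # Nested case: the traversal has no input that says which subvolume
    # was passed to -o subvol=, so the answer is the same for all three.
    deep = "subvol1/dir/dir/subvol2/dir/dir/subvol3/dir/dir"
    for mounted in ("subvol1", "subvol2", "subvol3"):
        print(mounted, "->", nearest_subvolume(deep).name)
```

The function never sees which subvolume was actually mounted, so it can only return the innermost enclosing one. Recovering the mounted subvolume would need extra state recorded at mount time, which is the consistency problem described above.
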