All of lore.kernel.org
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Jan Kara <jack@suse.cz>
Cc: Matthew Bobrowski <mbobrowski@mbobrowski.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Mark Fasheh <mark@fasheh.com>, Chris Mason <clm@fb.com>,
	Josef Bacik <jbacik@fb.com>,
	Linux Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Btrfs and fanotify filesystem watches
Date: Fri, 23 Nov 2018 15:34:25 +0200	[thread overview]
Message-ID: <CAOQ4uxjt2+BECu60aisZs_D6pgHK3VbBjju6hzmvG=Ls88UqUA@mail.gmail.com> (raw)
In-Reply-To: <20181123125206.GC31877@quack2.suse.cz>

On Fri, Nov 23, 2018 at 2:52 PM Jan Kara <jack@suse.cz> wrote:
>
> Changed subject to better match what we discuss and added btrfs list to CC.
>
> On Thu 22-11-18 17:18:25, Amir Goldstein wrote:
> > On Thu, Nov 22, 2018 at 3:26 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Thu 22-11-18 14:36:35, Amir Goldstein wrote:
> > > > > > Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with
> > > > > > the single super block struct, so all dentries in all subvolumes will
> > > > > > return the same fsid: btrfs_sb(dentry->d_sb)->fsid.
> > > > >
> > > > > This is not true AFAICT. Looking at btrfs_statfs() I can see:
> > > > >
> > > > >         buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
> > > > >         buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
> > > > >         /* Mask in the root object ID too, to disambiguate subvols */
> > > > >         buf->f_fsid.val[0] ^=
> > > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid >> 32;
> > > > >         buf->f_fsid.val[1] ^=
> > > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid;
> > > > >
> > > > > So subvolume root is xored into the FSID. Thus dentries from different
> > > > > subvolumes are going to return different fsids...
> > > > >
> > > >
> > > > Right... how could I have missed that :-/
> > > >
> > > > I do not want to bring back FSNOTIFY_EVENT_DENTRY just for that
> > > > and I saw how many flaws you pointed to in $SUBJECT patch.
> > > > Instead, I will use:
> > > > statfs_by_dentry(d_find_any_alias(inode) ?: inode->i_sb->sb_root,...
> > >
> > > So what about my proposal to store fsid in the notification mark when it is
> > > created and then use it when that mark results in event being generated?
> > > When mark is created, we have full path available, so getting proper fsid
> > > is trivial. Furthermore if the behavior is documented like that, it is
> > > fairly easy for userspace to track fsids it should care about and translate
> > > them to proper file descriptors for open_by_handle().
> > >
> > > This would get rid of statfs() on every event creation (which I don't like
> > > much) and also avoids these problems "how to get fsid for inode". What do
> > > you think?
> > >
> >
> > That's interesting. I like the simplicity.
> > What happens when there are 2 btrfs subvols /subvol1 /subvol2
> > and the application obviously doesn't know about this and does:
> > fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol1);
> > statfs("/subvol1",...);
> > fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol2);
> > statfs("/subvol2",...);
> >
> > Now the second fanotify_mark() call just updates the existing super block
> > mark mask, but does not change the fsid on the mark, so all events will have
> > fsid of subvol1 that was stored when first creating the mark.
>
> Yes.
>
> > fanotify-watch application (for example) hashes the watches (paths) under
> > /subvol2 by fid with fsid of subvol2, so events cannot get matched back to
> > "watch" (i.e. path).
>
> I agree this can be confusing... but with btrfs fanotify-watch will be
> confused even with your current code, won't it? Because FAN_MARK_FILESYSTEM
> on /subvol1 (with fsid A) is going to return also events on inodes from
> /subvol2 (with fsid B). So your current code will return event with fsid B
> which fanotify-watch has no way to match back and can get confused.
>
> So currently application can get events with fsid it has never seen, with
> the code as I suggest it can get "wrong" fsid. That is confusing but still
> somewhat better?

It's not better IMO because the purpose of the fid in event is a unique key
that application can use to match a path it already indexed.
wrong fsid => no match.
Using open_by_handle_at() to resolve unknown fid is a limited solution
and I don't think it is going to work in this case (see below).

>
> The core of the problem is that btrfs does not have "the superblock
> identifier" that would correspond to FAN_MARK_FILESYSTEM scope of events
> that we could use.
>
> > And when trying to open_by_handle fid with fhandle from /subvol2
> > using mount_fd of /subvol1, I suppose we can either get ESTALE
> > or a disconnected dentry, because object from /subvol2 cannot
> > have a path inside /subvol1....
>
> So open_by_handle() should work fine even if we get mount_fd of /subvol1
> and handle for inode on /subvol2. mount_fd in open_by_handle() is really
> only used to get the superblock and that is the same.

I don't think it will work fine.
do_handle_to_path() will compose a path from /subvol1 mnt and dentry from
/subvol2. This may resolve to a full path that does not really exist,
so application
cannot match it to anything that is can use name_to_handle_at() to identify.

The whole thing just sounds too messy. At the very least we need more time
to communicate with btrfs developers and figure this out, so I am going to
go with -EXDEV for any attempt to set *any* mark on a group with
FAN_REPORT_FID if fsid of fanotify_mark() path argument is different
from fsid of path->dentry->d_sb->s_root.

We can relax that later if we figure out a better way.

BTW, I am also going to go with -ENODEV for zero fsid (e.g. tmpfs).
tmpfs can be easily fixed to have non zero fsid if filesystem watch on tmpfs is
important.

Thanks,
Amir.

  reply	other threads:[~2018-11-23 13:34 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-14 17:43 [PATCH v2 0/5] fsnotify prep work for fanotify dentry events Amir Goldstein
2018-11-14 17:43 ` [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available Amir Goldstein
2018-11-20 11:32   ` Jan Kara
2018-11-20 14:36     ` Amir Goldstein
2018-11-21 12:51       ` Jan Kara
2018-11-21 16:13         ` Amir Goldstein
2018-11-22  9:52           ` Jan Kara
2018-11-22 12:36             ` Amir Goldstein
2018-11-22 13:26               ` Jan Kara
2018-11-22 15:18                 ` Amir Goldstein
2018-11-22 19:42                   ` Amir Goldstein
2018-11-23 12:56                     ` Jan Kara
2018-11-23 13:42                       ` Amir Goldstein
2018-11-23 12:52                   ` Btrfs and fanotify filesystem watches Jan Kara
2018-11-23 13:34                     ` Amir Goldstein [this message]
2018-11-23 17:53                       ` Amir Goldstein
2018-11-27  8:35                         ` Jan Kara
2018-11-14 17:43 ` [PATCH v2 2/5] fsnotify: annotate filename events Amir Goldstein
2018-11-20 11:59   ` Jan Kara
2018-11-20 14:58     ` Amir Goldstein
2018-11-21 13:18       ` Jan Kara
2018-11-21 15:40         ` Amir Goldstein
2018-11-22  7:45           ` Amir Goldstein
2018-11-22  9:33             ` Jan Kara
2018-11-22  8:40     ` Amir Goldstein
2018-11-14 17:43 ` [PATCH v2 3/5] fsnotify: simplify API for " Amir Goldstein
2018-11-14 17:43 ` [PATCH v2 4/5] fsnotify: make MOVED_FROM a dentry event Amir Goldstein
2018-11-20 12:45   ` Jan Kara
2018-11-14 17:43 ` [PATCH v2 5/5] fsnotify: send event type FSNOTIFY_EVENT_DENTRY to super block mark Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxjt2+BECu60aisZs_D6pgHK3VbBjju6hzmvG=Ls88UqUA@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=clm@fb.com \
    --cc=jack@suse.cz \
    --cc=jbacik@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=mark@fasheh.com \
    --cc=mbobrowski@mbobrowski.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.