All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Bobrowski <repnop@google.com>
To: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>,
	Linux API <linux-api@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: Fanotify API - Tracking File Movement
Date: Fri, 6 May 2022 09:44:22 +1000	[thread overview]
Message-ID: <YnRhVgu6JKNinarh@google.com> (raw)
In-Reply-To: <20220505133057.zm5t6vumc4xdcnsg@quack3.lan>

On Thu, May 05, 2022 at 03:30:57PM +0200, Jan Kara wrote:
> On Thu 05-05-22 15:56:16, Amir Goldstein wrote:
> > On Thu, May 5, 2022 at 2:22 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > Hello Matthew!
> > >
> > > On Thu 05-05-22 20:25:31, Matthew Bobrowski wrote:
> > > > I was having a brief chat with Amir the other day about an idea/use
> > > > case that I have which at present don't believe is robustly supported
> > > > by the fanotify API. I was wondering whether you could share some
> > > > thoughts on supporting the following idea.
> > > >
> > > > I have a need to track file movement across a filesystem without
> > > > necessarily burdening the system by having to watch the entire
> > > > filesystem for such movements. That is, knowing when file /dir1/a had
> > > > been moved from /dir1/a to /dir2/a and then from /dir2/a to /dir3/a
> > > > and so on. Or more simply, knowing the destination/new path of the
> > > > file once it has moved.
> > >
> > > OK, and the places the file moves to can be arbitrary? That seems like a
> > > bit narrow usecase :)
> > >
> > > > Initially, I was thinking of using FAN_MOVE_SELF, but it doesn't quite
> > > > cut it. For such events, you only know the target location or path of
> > > > a file had been modified once it has subsequently been moved
> > > > elsewhere. Not to mention that path resolution using the file
> > > > identifier from such an event may not always work. Then there's
> > > > FAN_RENAME which could arguably work. This would include setting up a
> > > > watch on the parent directory of the file of interest and then using
> > > > the information record of type FAN_EVENT_INFO_TYPE_NEW_DFID_NAME to
> > > > figure out the new target location of the file once it has been moved
> > > > and then resetting the mark on the next parent directory once the new
> > > > target location is known. But, as Amir rightfully mentioned, this
> > > > rinse and repeat mark approach is suboptimal as it can lead to certain
> > > > race conditions.
> > >
> > > It seems to me you really want FAN_MOVE_SELF but you'd need more
> > > information coming with it like the new parent dir, wouldn't you? It would
> > > be relatively easy to add that info but it would kind of suck that it would
> > > be difficult to discover in advance whether the directory info will arrive
> > > with the event or not. But that actually would seem to be the case for
> > > FAN_RENAME as well because we didn't seem to bother to refuse FAN_RENAME on
> > > a file. Amir?
> > >
> > 
> > No, we did not, but it is also not refused for all the other dirent events and
> > it was never refused by inotify too, so that behavior is at least consistent.
> > But if we do want to change the behavior of FAN_RENAME on file, my preference
> > would be to start with a Fixes commit that forbis that, backport it to stable
> > and then allow the new behavior upstream.
> > I can post the fix patch.
> 
> Yeah, I think we should do that. Thanks for looking into this!
> 
> > > > Having briefly mentioned all this, what is your stance on maybe
> > > > extending out FAN_RENAME to also cover files? Or, maybe you have
> > > > another approach/idea in mind to cover such cases i.e. introducing a
> > > > new flag FAN_{TRACK,TRACE}.
> > >
> > > So extending FAN_MOVE_SELF or FAN_RENAME looks OK to me, not much thoughts
> > > beyond that :).
> > 
> > Both FAN_RENAME and FAN_REPORT_TARGET_FID are from v5.17
> > which is rather new and it is highly unlikely that anyone has ever used them,
> > so I think we can get away with fixing the API either way.
> > Not to mention that the man pages have not been updated.
> > 
> > This is from the man page that is pending review:
> > 
> >        FAN_REPORT_TARGET_FID (since Linux 5.17)
> >               Events for fanotify groups initialized with this flag
> > will contain additional information
> >               about the child correlated with directory entry
> > modification events...
> >               For the directory entry modification events
> >               FAN_CREATE,  FAN_DELETE,  FAN_RENAME,  and  FAN_MOVE,
> > an  additional...
> > 
> >        FAN_MOVED_TO (since Linux 5.1)
> >               Create an event when a file or directory has been moved
> > to a marked parent directory...
> > 
> >        FAN_RENAME (since Linux 5.17)
> >               This  event contains the same information provided by
> > events FAN_MOVED_FROM
> >               and FAN_MOVED_TO, ...
> > 
> >        FAN_MOVE_SELF (since Linux 5.1)
> >               Create an event when a marked file or directory itself
> > has been moved...
> > 
> > I think it will be easier to retrofit this functionality of FAN_RENAME
> > (i.e. ...provided
> > by events FAN_MOVED_FROM, FAN_MOVED_TO, and FAN_MOVE_SELF).
> > Looking at the code, I think it will also be much easier to implement
> > for FAN_RENAME
> > because it is special-cased for reporting.
> > 
> > HOWEVER! look at the way we implemented reporting of FAN_RENAME
> > (i.e. match_mask). We report_new location only if watching sb or watching
> > new dir. We did that for a reason because watcher may not have permissions
> > to read new dir. We could revisit this decision for a privileged group, but will
> > need to go back reading all the discussions we had about this point to see
> > if there were other reasons(?).
> 
> Yeah, this is a good point. We are able to safely report the new parent
> only if the watching process is able to prove it is able to watch it.
> Adding even more special cases there would be ugly and error prone I'm
> afraid. We could certainly make this available only to priviledged
> notification groups but still it is one more odd corner case and the
> usecase does not seem to be that big.

Sorry, I'm confused about the conclusion we've drawn here. Are we hard
up against not extending FAN_RENAME for the sole reason that the
implementation might be ugly and error prone?

Can we not expose this case exclusively to privileged notification
groups/watchers? This case seems far simpler than what has already
been implemented in the FAN_RENAME series, that is as you mentioned,
trying to safely report the new parent only if the watching process is
able to prove it is able to watch it. If anything, I would've expected
the privileged case to be implemented prior to attempting to cover
whether the super block or target directory is being watched.

> So maybe watching on sb for FAN_RENAME and just quickly filtering
> based on child FID would be better solution than fiddling with new
> event for files?

Ah, I really wanted to stay away from watching the super block for all
FAN_RENAME events. I feel like userspace wearing the pain for such
cases is suboptimal, as this is something that can effectively be done
in-kernel.

/M

  reply	other threads:[~2022-05-05 23:44 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-05 10:25 Fanotify API - Tracking File Movement Matthew Bobrowski
2022-05-05 11:22 ` Jan Kara
2022-05-05 12:56   ` Amir Goldstein
2022-05-05 13:30     ` Jan Kara
2022-05-05 23:44       ` Matthew Bobrowski [this message]
2022-05-06  0:00         ` Amir Goldstein
2022-05-06 10:06           ` Jan Kara
2022-05-07 16:03             ` Amir Goldstein
2022-05-11 17:52               ` Jan Kara
2022-05-11 18:54                 ` Amir Goldstein
2022-05-13 13:18               ` Matthew Bobrowski
2022-05-13 14:14                 ` Amir Goldstein
2022-05-13 15:54                   ` Matthew Bobrowski
2022-05-13 18:39                     ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YnRhVgu6JKNinarh@google.com \
    --to=repnop@google.com \
    --cc=amir73il@gmail.com \
    --cc=jack@suse.cz \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.