All of lore.kernel.org
 help / color / mirror / Atom feed
* FAN_REPORT_CHILD_FID
@ 2021-07-11 17:02 Amir Goldstein
  2021-07-12 11:10 ` FAN_REPORT_CHILD_FID Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2021-07-11 17:02 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Matthew Bobrowski

Jan,

I am struggling with an attempt to extend the fanotify API and
I wanted to ask your opinion before I go too far in the wrong direction.

I am working with an application that used to use inotify rename
cookies to match MOVED_FROM/MOVED_TO events.
The application was converted to use fanotify name events, but
the rename cookie functionality was missing, so I am carrying
a small patch for FAN_REPORT_COOKIE.

I do not want to propose this patch for upstream, because I do
not like this API.

What I thought was that instead of a "cookie" I would like to
use the child fid as a way to pair up move events.
This requires that the move events will never be merged and
therefore not re-ordered (as is the case with inotify move events).

My thinking was to generalize this concept and introduce
FAN_REPORT_CHILD_FID flag. With that flag, dirent events
will report additional FID records, like events on a non-dir child
(but also for dirent events on subdirs).

Either FAN_REPORT_CHILD_FID would also prevent dirent events
from being merged or we could use another flag for that purpose,
but I wasn't able to come up with an idea for a name for this flag :-/

I sketched this patch [1] to implement the flag and to document
the desired semantics. It's only build tested and I did not even
implement the merge rules listed in the commit message.

[1] https://github.com/amir73il/linux/commits/fanotify_child_fid

There are other benefits from FAN_REPORT_CHILD_FID which are
not related to matching move event pairs, such as the case described
in this discussion [2], where I believe you suggested something along
the lines of FAN_REPORT_CHILD_FID.

[2] https://lore.kernel.org/linux-fsdevel/CAOQ4uxhEsbfA5+sW4XPnUKgCkXtwoDA-BR3iRO34Nx5c4y7Nug@mail.gmail.com/

Thoughts?

Amir.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: FAN_REPORT_CHILD_FID
  2021-07-11 17:02 FAN_REPORT_CHILD_FID Amir Goldstein
@ 2021-07-12 11:10 ` Jan Kara
  2021-07-12 13:00   ` FAN_REPORT_CHILD_FID Amir Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2021-07-12 11:10 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, linux-fsdevel, Matthew Bobrowski

Hi Amir!

On Sun 11-07-21 20:02:29, Amir Goldstein wrote:
> I am struggling with an attempt to extend the fanotify API and
> I wanted to ask your opinion before I go too far in the wrong direction.
> 
> I am working with an application that used to use inotify rename
> cookies to match MOVED_FROM/MOVED_TO events.
> The application was converted to use fanotify name events, but
> the rename cookie functionality was missing, so I am carrying
> a small patch for FAN_REPORT_COOKIE.
> 
> I do not want to propose this patch for upstream, because I do
> not like this API.
> 
> What I thought was that instead of a "cookie" I would like to
> use the child fid as a way to pair up move events.
> This requires that the move events will never be merged and
> therefore not re-ordered (as is the case with inotify move events).
> 
> My thinking was to generalize this concept and introduce
> FAN_REPORT_CHILD_FID flag. With that flag, dirent events
> will report additional FID records, like events on a non-dir child
> (but also for dirent events on subdirs).

I'm starting to get lost in what reports what so let me draw a table here:

Non-directories
				DFID	FID	CHILD_FID
ACCESS/MODIFY/OPEN/CLOSE/ATTRIB parent	self	self
CREATE/DELETE/MOVE		-	-	-
DELETE_SELF/MOVE_SELF		x	self	self
('-' means cannot happen, 'x' means not generated)

Directories
				DFID	FID	CHILD_FID
ACCESS/MODIFY/OPEN/CLOSE/ATTRIB self	self	self
CREATE/DELETE/MOVE		self	self	target
DELETE_SELF/MOVE_SELF		x	self	self

Did I get this right?

I guess "CHILD_FID" seems somewhat confusing as it isn't immediately clear
from the name what it would report e.g. for open of a non-directory. Maybe
we could call it "TARGET_FID"? Also I'm not sure it needs to be exclusive
with FID. Sure it doesn't make much sense to report both FID and CHILD_FID
but does the exclusivity buy us anything? I guess I don't have strong
opinion either way, I'm just curious.
 
> Either FAN_REPORT_CHILD_FID would also prevent dirent events
> from being merged or we could use another flag for that purpose,
> but I wasn't able to come up with an idea for a name for this flag :-/
> 
> I sketched this patch [1] to implement the flag and to document
> the desired semantics. It's only build tested and I did not even
> implement the merge rules listed in the commit message.
> 
> [1] https://github.com/amir73il/linux/commits/fanotify_child_fid

WRT changes to merging: Whenever some application wants to depend on the
ordering of events I'm starting to get suspicious. What is it using these
events for? How is renaming different from linking a file into a new dir
and unlinking it from the previous one which is a series of events that
could be merged? Also fanotify could still be merging events happening
after rename to events before rename. Can the application tolerate that?
Inotify didn't do this because it is always merging only to the last event
in the queue.

When we were talking about FID events in the past (in the context of
directory events) we always talked about application just maintaining a set
of dirs (plus names) to look into. And that is safe and sound. But what you
talk about now seems rather fragile at least from the limited information I
have about your usecase...

> There are other benefits from FAN_REPORT_CHILD_FID which are
> not related to matching move event pairs, such as the case described
> in this discussion [2], where I believe you suggested something along
> the lines of FAN_REPORT_CHILD_FID.
> 
> [2] https://lore.kernel.org/linux-fsdevel/CAOQ4uxhEsbfA5+sW4XPnUKgCkXtwoDA-BR3iRO34Nx5c4y7Nug@mail.gmail.com/

Yes, I can see FAN_REPORT_CHILD_FID (or however we call it) can be useful
at times (in fact I think we made a mistake that we didn't make reported
FID to always be what you now suggest as CHILD_FID, but we found that out
only when DFID+NAME implementation settled so that train was long gone).
So I have no problem with that functionality as such.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: FAN_REPORT_CHILD_FID
  2021-07-12 11:10 ` FAN_REPORT_CHILD_FID Jan Kara
@ 2021-07-12 13:00   ` Amir Goldstein
  2021-07-12 16:26     ` FAN_REPORT_CHILD_FID Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2021-07-12 13:00 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Matthew Bobrowski

On Mon, Jul 12, 2021 at 2:10 PM Jan Kara <jack@suse.cz> wrote:
>
> Hi Amir!
>
> On Sun 11-07-21 20:02:29, Amir Goldstein wrote:
> > I am struggling with an attempt to extend the fanotify API and
> > I wanted to ask your opinion before I go too far in the wrong direction.
> >
> > I am working with an application that used to use inotify rename
> > cookies to match MOVED_FROM/MOVED_TO events.
> > The application was converted to use fanotify name events, but
> > the rename cookie functionality was missing, so I am carrying
> > a small patch for FAN_REPORT_COOKIE.
> >
> > I do not want to propose this patch for upstream, because I do
> > not like this API.
> >
> > What I thought was that instead of a "cookie" I would like to
> > use the child fid as a way to pair up move events.
> > This requires that the move events will never be merged and
> > therefore not re-ordered (as is the case with inotify move events).
> >
> > My thinking was to generalize this concept and introduce
> > FAN_REPORT_CHILD_FID flag. With that flag, dirent events
> > will report additional FID records, like events on a non-dir child
> > (but also for dirent events on subdirs).
>
> I'm starting to get lost in what reports what so let me draw a table here:
>
> Non-directories
>                                 DFID    FID     CHILD_FID
> ACCESS/MODIFY/OPEN/CLOSE/ATTRIB parent  self    self
> CREATE/DELETE/MOVE              -       -       -
> DELETE_SELF/MOVE_SELF           x       self    self
> ('-' means cannot happen, 'x' means not generated)
>
> Directories
>                                 DFID    FID     CHILD_FID
> ACCESS/MODIFY/OPEN/CLOSE/ATTRIB self    self    self
> CREATE/DELETE/MOVE              self    self    target
> DELETE_SELF/MOVE_SELF           x       self    self
>
> Did I get this right?

I am not sure if the columns in your table refer to group flags
or to info records types? or a little bit of both, but I did not
mean for CHILD_FID to be a different record type.

Anyway, the only complexity missing from the table is that
for events reporting a single record with fid of a directory,
(i.e. self event on dir or dirent event) the record type depends
on the group flags.

FAN_REPORT_FID => FAN_EVENT_INFO_TYPE_FID
FAN_REPORT_DIR_FID => FAN_EVENT_INFO_TYPE_DFID

>
> I guess "CHILD_FID" seems somewhat confusing as it isn't immediately clear
> from the name what it would report e.g. for open of a non-directory.

I agree it is a bit confusing. FWIW for events on a non-dir child (not dirent)
FAN_REPORT_FID and FAN_REPORT_CHILD_FID flags yield the exact
same event info.

> Maybe
> we could call it "TARGET_FID"? Also I'm not sure it needs to be exclusive
> with FID. Sure it doesn't make much sense to report both FID and CHILD_FID
> but does the exclusivity buy us anything? I guess I don't have strong
> opinion either way, I'm just curious.
>

FAN_REPORT_TARGET_FID sounds good to me.
You are right. I don't think that exclusivity buys us anything.

> > There are other benefits from FAN_REPORT_CHILD_FID which are
> > not related to matching move event pairs, such as the case described
> > in this discussion [2], where I believe you suggested something along
> > the lines of FAN_REPORT_CHILD_FID.
> >
> > [2] https://lore.kernel.org/linux-fsdevel/CAOQ4uxhEsbfA5+sW4XPnUKgCkXtwoDA-BR3iRO34Nx5c4y7Nug@mail.gmail.com/
>
> Yes, I can see FAN_REPORT_CHILD_FID (or however we call it) can be useful
> at times (in fact I think we made a mistake that we didn't make reported
> FID to always be what you now suggest as CHILD_FID, but we found that out
> only when DFID+NAME implementation settled so that train was long gone).

Yes, we did. FAN_REPORT_TARGET_FID is also about trying to make amends.
We could have just as well called it FAN_REPORT_FID_V2, but no ;-)

> So I have no problem with that functionality as such.
>

Good, so I will try to see that I can come up with sane semantics that
also result in a sane man page.

> > Either FAN_REPORT_CHILD_FID would also prevent dirent events
> > from being merged or we could use another flag for that purpose,
> > but I wasn't able to come up with an idea for a name for this flag :-/
> >
> > I sketched this patch [1] to implement the flag and to document
> > the desired semantics. It's only build tested and I did not even
> > implement the merge rules listed in the commit message.
> >
> > [1] https://github.com/amir73il/linux/commits/fanotify_child_fid
>
> WRT changes to merging: Whenever some application wants to depend on the
> ordering of events I'm starting to get suspicious.

I completely agree with that sentiment.

But note that the application does NOT require event ordering.

I was proposing the strict ordering of MOVE_ events as a method
to allow for matching of MOVE_ pairs of the same target as
a *replacement* for the inotify rename cookie method.

> What is it using these events for?

The application is trying to match MOVE_ event pairs.
It's a best effort situation - when local file has been renamed,
a remote rename can also be attempted while verifying that the
recorded fid of the source (in remote file) matches the fid of the
local target.

> How is renaming different from linking a file into a new dir
> and unlinking it from the previous one which is a series of events that
> could be merged?

It is different because renames are common operations that actual people
often do and link+unlink are less common so we do not care to optimize
them. Anyway, as many other applications, our application does not
support syncing hardlinks to remote location, so link+unlink would be
handled as plain copy+delete and dedup of copied file is handled
is handled by the remote sync protocol.

As a matter of fact, a rename could also be (and sometimes is) handled
as copy+delete. In that case, the remote content would be fine but the
logs and change history would be inaccurate.

BTW, in my sketch commit message I offer to prevent merge of all dirent
events, not only MOVE_, because I claim that there is not much to be
gained from merging the CREATE/DELETE of a specific TARGET_FID
event with other events as there are normally very few of those events.

However, while I can argue why it is useful to avoid merge of dirent events,
it's not as easy for me to come up with a name for that flag not to
easily explain the semantics in man page :-/
so any help with coming up with simplified semantics would be appreciated.

> Also fanotify could still be merging events happening
> after rename to events before rename. Can the application tolerate that?

Yes. The application treats the name of the file as a property that
can be synced regardless of the file's data and metadata and it doesn't
need to be synced to remote in the same order that changes happened.
The destination is "eventually consistent" with the source.

> Inotify didn't do this because it is always merging only to the last event
> in the queue.
>
> When we were talking about FID events in the past (in the context of
> directory events) we always talked about application just maintaining a set
> of dirs (plus names) to look into. And that is safe and sound. But what you
> talk about now seems rather fragile at least from the limited information I
> have about your usecase...
>

It's true. The application uses the DFID as the main key for tracking changes -
i.e. which directories need to be synced.
Rename between directories is a case where syncing individual directories
looses information. It does not loose data, only the accuracy of reported
change history - IOW, its a minor functionality gap, but one that product
people will not be willing to waiver.

P.S. unlike rename of non-dir, rename of directories and large directory
trees specifically must be identified and handled as a remote rename,
but that is easy to achieve with MOVE_SELF, because as I wrote, the
application uses DFID as the key for tracking changes, so MOVE_SELF
of directory will carry a DFID whose "remote path" is stored in a db.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: FAN_REPORT_CHILD_FID
  2021-07-12 13:00   ` FAN_REPORT_CHILD_FID Amir Goldstein
@ 2021-07-12 16:26     ` Jan Kara
  2021-07-12 18:08       ` FAN_REPORT_CHILD_FID Amir Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2021-07-12 16:26 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, linux-fsdevel, Matthew Bobrowski

On Mon 12-07-21 16:00:54, Amir Goldstein wrote:
> On Mon, Jul 12, 2021 at 2:10 PM Jan Kara <jack@suse.cz> wrote:
> > On Sun 11-07-21 20:02:29, Amir Goldstein wrote:
> > > I am struggling with an attempt to extend the fanotify API and
> > > I wanted to ask your opinion before I go too far in the wrong direction.
> > >
> > > I am working with an application that used to use inotify rename
> > > cookies to match MOVED_FROM/MOVED_TO events.
> > > The application was converted to use fanotify name events, but
> > > the rename cookie functionality was missing, so I am carrying
> > > a small patch for FAN_REPORT_COOKIE.
> > >
> > > I do not want to propose this patch for upstream, because I do
> > > not like this API.
> > >
> > > What I thought was that instead of a "cookie" I would like to
> > > use the child fid as a way to pair up move events.
> > > This requires that the move events will never be merged and
> > > therefore not re-ordered (as is the case with inotify move events).
> > >
> > > My thinking was to generalize this concept and introduce
> > > FAN_REPORT_CHILD_FID flag. With that flag, dirent events
> > > will report additional FID records, like events on a non-dir child
> > > (but also for dirent events on subdirs).
> >
> > I'm starting to get lost in what reports what so let me draw a table here:
> >
> > Non-directories
> >                                 DFID    FID     CHILD_FID
> > ACCESS/MODIFY/OPEN/CLOSE/ATTRIB parent  self    self
> > CREATE/DELETE/MOVE              -       -       -
> > DELETE_SELF/MOVE_SELF           x       self    self
> > ('-' means cannot happen, 'x' means not generated)
> >
> > Directories
> >                                 DFID    FID     CHILD_FID
> > ACCESS/MODIFY/OPEN/CLOSE/ATTRIB self    self    self
> > CREATE/DELETE/MOVE              self    self    target
> > DELETE_SELF/MOVE_SELF           x       self    self
> >
> > Did I get this right?
> 
> I am not sure if the columns in your table refer to group flags
> or to info records types? or a little bit of both, but I did not
> mean for CHILD_FID to be a different record type.

Yeah, a bit of both.

> Anyway, the only complexity missing from the table is that
> for events reporting a single record with fid of a directory,
> (i.e. self event on dir or dirent event) the record type depends
> on the group flags.
> 
> FAN_REPORT_FID => FAN_EVENT_INFO_TYPE_FID
> FAN_REPORT_DIR_FID => FAN_EVENT_INFO_TYPE_DFID

Right, I didn't realize this.

> > I guess "CHILD_FID" seems somewhat confusing as it isn't immediately clear
> > from the name what it would report e.g. for open of a non-directory.
> 
> I agree it is a bit confusing. FWIW for events on a non-dir child (not dirent)
> FAN_REPORT_FID and FAN_REPORT_CHILD_FID flags yield the exact
> same event info.
> 
> > Maybe
> > we could call it "TARGET_FID"? Also I'm not sure it needs to be exclusive
> > with FID. Sure it doesn't make much sense to report both FID and CHILD_FID
> > but does the exclusivity buy us anything? I guess I don't have strong
> > opinion either way, I'm just curious.
> >
> 
> FAN_REPORT_TARGET_FID sounds good to me.
> You are right. I don't think that exclusivity buys us anything.

OK. I've realized that the exclusivity is needed if we want to report info
enabled by FAN_REPORT_TARGET_FID as FAN_EVENT_INFO_TYPE_FID. Because
otherwise it would not be well defined what information is contained in
FAN_EVENT_INFO_TYPE_FID. So either we have to go for exclusivity or for new
type of event information.

> > > There are other benefits from FAN_REPORT_CHILD_FID which are
> > > not related to matching move event pairs, such as the case described
> > > in this discussion [2], where I believe you suggested something along
> > > the lines of FAN_REPORT_CHILD_FID.
> > >
> > > [2] https://lore.kernel.org/linux-fsdevel/CAOQ4uxhEsbfA5+sW4XPnUKgCkXtwoDA-BR3iRO34Nx5c4y7Nug@mail.gmail.com/
> >
> > Yes, I can see FAN_REPORT_CHILD_FID (or however we call it) can be useful
> > at times (in fact I think we made a mistake that we didn't make reported
> > FID to always be what you now suggest as CHILD_FID, but we found that out
> > only when DFID+NAME implementation settled so that train was long gone).
> 
> Yes, we did. FAN_REPORT_TARGET_FID is also about trying to make amends.
> We could have just as well called it FAN_REPORT_FID_V2, but no ;-)

OK, I was suspecting that but wasn't sure :). I guess that's another reason
why exclusivity makes more sense.

> > > Either FAN_REPORT_CHILD_FID would also prevent dirent events
> > > from being merged or we could use another flag for that purpose,
> > > but I wasn't able to come up with an idea for a name for this flag :-/
> > >
> > > I sketched this patch [1] to implement the flag and to document
> > > the desired semantics. It's only build tested and I did not even
> > > implement the merge rules listed in the commit message.
> > >
> > > [1] https://github.com/amir73il/linux/commits/fanotify_child_fid
> >
> > WRT changes to merging: Whenever some application wants to depend on the
> > ordering of events I'm starting to get suspicious.
> 
> I completely agree with that sentiment.
> 
> But note that the application does NOT require event ordering.
> 
> I was proposing the strict ordering of MOVE_ events as a method
> to allow for matching of MOVE_ pairs of the same target as
> a *replacement* for the inotify rename cookie method.

Aha, I see I got confused a bit. Sorry about that.

> > What is it using these events for?
> 
> The application is trying to match MOVE_ event pairs.
> It's a best effort situation - when local file has been renamed,
> a remote rename can also be attempted while verifying that the
> recorded fid of the source (in remote file) matches the fid of the
> local target.
> 
> > How is renaming different from linking a file into a new dir
> > and unlinking it from the previous one which is a series of events that
> > could be merged?
> 
> It is different because renames are common operations that actual people
> often do and link+unlink are less common so we do not care to optimize
> them. Anyway, as many other applications, our application does not
> support syncing hardlinks to remote location, so link+unlink would be
> handled as plain copy+delete and dedup of copied file is handled
> is handled by the remote sync protocol.
> 
> As a matter of fact, a rename could also be (and sometimes is) handled
> as copy+delete. In that case, the remote content would be fine but the
> logs and change history would be inaccurate.
> 
> BTW, in my sketch commit message I offer to prevent merge of all dirent
> events, not only MOVE_, because I claim that there is not much to be
> gained from merging the CREATE/DELETE of a specific TARGET_FID
> event with other events as there are normally very few of those events.
> 
> However, while I can argue why it is useful to avoid merge of dirent events,
> it's not as easy for me to come up with a name for that flag not to
> easily explain the semantics in man page :-/
> so any help with coming up with simplified semantics would be appreciated.

Just a brainstorming idea: How about creating new event FAN_RENAME that
would report two DFIDs (if it is cross directory rename)? On the uAPI side
it is very straightforward I think (unlike inotify where this could not be
easily done because of fixed sized event + name). On the kernel API side we
need to somehow feed the second directory into fsnotify() but with the
changes Gabriel is doing it should not be that painful anymore... And then
we can just avoid any problems with event matching, event merging etc.
Thoughts?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: FAN_REPORT_CHILD_FID
  2021-07-12 16:26     ` FAN_REPORT_CHILD_FID Jan Kara
@ 2021-07-12 18:08       ` Amir Goldstein
  2021-07-14  1:16         ` FAN_REPORT_CHILD_FID Matthew Bobrowski
  0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2021-07-12 18:08 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Matthew Bobrowski

On Mon, Jul 12, 2021 at 7:26 PM Jan Kara <jack@suse.cz> wrote:
>
> On Mon 12-07-21 16:00:54, Amir Goldstein wrote:
> > On Mon, Jul 12, 2021 at 2:10 PM Jan Kara <jack@suse.cz> wrote:
> > > On Sun 11-07-21 20:02:29, Amir Goldstein wrote:
> > > > I am struggling with an attempt to extend the fanotify API and
> > > > I wanted to ask your opinion before I go too far in the wrong direction.
> > > >
> > > > I am working with an application that used to use inotify rename
> > > > cookies to match MOVED_FROM/MOVED_TO events.
> > > > The application was converted to use fanotify name events, but
> > > > the rename cookie functionality was missing, so I am carrying
> > > > a small patch for FAN_REPORT_COOKIE.
> > > >
> > > > I do not want to propose this patch for upstream, because I do
> > > > not like this API.
> > > >
> > > > What I thought was that instead of a "cookie" I would like to
> > > > use the child fid as a way to pair up move events.
> > > > This requires that the move events will never be merged and
> > > > therefore not re-ordered (as is the case with inotify move events).
> > > >
> > > > My thinking was to generalize this concept and introduce
> > > > FAN_REPORT_CHILD_FID flag. With that flag, dirent events
> > > > will report additional FID records, like events on a non-dir child
> > > > (but also for dirent events on subdirs).
> > >
> > > I'm starting to get lost in what reports what so let me draw a table here:
> > >
> > > Non-directories
> > >                                 DFID    FID     CHILD_FID
> > > ACCESS/MODIFY/OPEN/CLOSE/ATTRIB parent  self    self
> > > CREATE/DELETE/MOVE              -       -       -
> > > DELETE_SELF/MOVE_SELF           x       self    self
> > > ('-' means cannot happen, 'x' means not generated)
> > >
> > > Directories
> > >                                 DFID    FID     CHILD_FID
> > > ACCESS/MODIFY/OPEN/CLOSE/ATTRIB self    self    self
> > > CREATE/DELETE/MOVE              self    self    target
> > > DELETE_SELF/MOVE_SELF           x       self    self
> > >
> > > Did I get this right?
> >
> > I am not sure if the columns in your table refer to group flags
> > or to info records types? or a little bit of both, but I did not
> > mean for CHILD_FID to be a different record type.
>
> Yeah, a bit of both.
>
> > Anyway, the only complexity missing from the table is that
> > for events reporting a single record with fid of a directory,
> > (i.e. self event on dir or dirent event) the record type depends
> > on the group flags.
> >
> > FAN_REPORT_FID => FAN_EVENT_INFO_TYPE_FID
> > FAN_REPORT_DIR_FID => FAN_EVENT_INFO_TYPE_DFID
>
> Right, I didn't realize this.
>
> > > I guess "CHILD_FID" seems somewhat confusing as it isn't immediately clear
> > > from the name what it would report e.g. for open of a non-directory.
> >
> > I agree it is a bit confusing. FWIW for events on a non-dir child (not dirent)
> > FAN_REPORT_FID and FAN_REPORT_CHILD_FID flags yield the exact
> > same event info.
> >
> > > Maybe
> > > we could call it "TARGET_FID"? Also I'm not sure it needs to be exclusive
> > > with FID. Sure it doesn't make much sense to report both FID and CHILD_FID
> > > but does the exclusivity buy us anything? I guess I don't have strong
> > > opinion either way, I'm just curious.
> > >
> >
> > FAN_REPORT_TARGET_FID sounds good to me.
> > You are right. I don't think that exclusivity buys us anything.
>
> OK. I've realized that the exclusivity is needed if we want to report info
> enabled by FAN_REPORT_TARGET_FID as FAN_EVENT_INFO_TYPE_FID. Because
> otherwise it would not be well defined what information is contained in
> FAN_EVENT_INFO_TYPE_FID. So either we have to go for exclusivity or for new
> type of event information.
>
> > > > There are other benefits from FAN_REPORT_CHILD_FID which are
> > > > not related to matching move event pairs, such as the case described
> > > > in this discussion [2], where I believe you suggested something along
> > > > the lines of FAN_REPORT_CHILD_FID.
> > > >
> > > > [2] https://lore.kernel.org/linux-fsdevel/CAOQ4uxhEsbfA5+sW4XPnUKgCkXtwoDA-BR3iRO34Nx5c4y7Nug@mail.gmail.com/
> > >
> > > Yes, I can see FAN_REPORT_CHILD_FID (or however we call it) can be useful
> > > at times (in fact I think we made a mistake that we didn't make reported
> > > FID to always be what you now suggest as CHILD_FID, but we found that out
> > > only when DFID+NAME implementation settled so that train was long gone).
> >
> > Yes, we did. FAN_REPORT_TARGET_FID is also about trying to make amends.
> > We could have just as well called it FAN_REPORT_FID_V2, but no ;-)
>
> OK, I was suspecting that but wasn't sure :). I guess that's another reason
> why exclusivity makes more sense.
>
> > > > Either FAN_REPORT_CHILD_FID would also prevent dirent events
> > > > from being merged or we could use another flag for that purpose,
> > > > but I wasn't able to come up with an idea for a name for this flag :-/
> > > >
> > > > I sketched this patch [1] to implement the flag and to document
> > > > the desired semantics. It's only build tested and I did not even
> > > > implement the merge rules listed in the commit message.
> > > >
> > > > [1] https://github.com/amir73il/linux/commits/fanotify_child_fid
> > >
> > > WRT changes to merging: Whenever some application wants to depend on the
> > > ordering of events I'm starting to get suspicious.
> >
> > I completely agree with that sentiment.
> >
> > But note that the application does NOT require event ordering.
> >
> > I was proposing the strict ordering of MOVE_ events as a method
> > to allow for matching of MOVE_ pairs of the same target as
> > a *replacement* for the inotify rename cookie method.
>
> Aha, I see I got confused a bit. Sorry about that.
>
> > > What is it using these events for?
> >
> > The application is trying to match MOVE_ event pairs.
> > It's a best effort situation - when local file has been renamed,
> > a remote rename can also be attempted while verifying that the
> > recorded fid of the source (in remote file) matches the fid of the
> > local target.
> >
> > > How is renaming different from linking a file into a new dir
> > > and unlinking it from the previous one which is a series of events that
> > > could be merged?
> >
> > It is different because renames are common operations that actual people
> > often do and link+unlink are less common so we do not care to optimize
> > them. Anyway, as many other applications, our application does not
> > support syncing hardlinks to remote location, so link+unlink would be
> > handled as plain copy+delete and dedup of copied file is handled
> > is handled by the remote sync protocol.
> >
> > As a matter of fact, a rename could also be (and sometimes is) handled
> > as copy+delete. In that case, the remote content would be fine but the
> > logs and change history would be inaccurate.
> >
> > BTW, in my sketch commit message I offer to prevent merge of all dirent
> > events, not only MOVE_, because I claim that there is not much to be
> > gained from merging the CREATE/DELETE of a specific TARGET_FID
> > event with other events as there are normally very few of those events.
> >
> > However, while I can argue why it is useful to avoid merge of dirent events,
> > it's not as easy for me to come up with a name for that flag not to
> > easily explain the semantics in man page :-/
> > so any help with coming up with simplified semantics would be appreciated.
>
> Just a brainstorming idea: How about creating new event FAN_RENAME that
> would report two DFIDs (if it is cross directory rename)?

I like the idea, but it would have to be two DFID_NAME is case of
FAN_REPORT_DFID_NAME and also for same parent rename
to be consistent.

> On the uAPI side
> it is very straightforward I think (unlike inotify where this could not be
> easily done because of fixed sized event + name).

Right.

> On the kernel API side we
> need to somehow feed the second directory into fsnotify() but with the
> changes Gabriel is doing it should not be that painful anymore...

Right, but we also need to parcel the fanotify_name_event
for storing two DFID_NAME and one FID.

> And then
> we can just avoid any problems with event matching, event merging etc.
> Thoughts?

It certainly qualifies as "simplifying semantics" :)
I think it's worth a shot, so I'll take a swing at it.
However, that means that "FAN_REPORT_FID_V2" will have
to wait for another time or never...

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: FAN_REPORT_CHILD_FID
  2021-07-12 18:08       ` FAN_REPORT_CHILD_FID Amir Goldstein
@ 2021-07-14  1:16         ` Matthew Bobrowski
  2021-07-14 12:09           ` FAN_REPORT_CHILD_FID Amir Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Matthew Bobrowski @ 2021-07-14  1:16 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, linux-fsdevel

On Mon, Jul 12, 2021 at 09:08:18PM +0300, Amir Goldstein wrote:
> On Mon, Jul 12, 2021 at 7:26 PM Jan Kara <jack@suse.cz> wrote:
> > On Mon 12-07-21 16:00:54, Amir Goldstein wrote:
> > Just a brainstorming idea: How about creating new event FAN_RENAME that
> > would report two DFIDs (if it is cross directory rename)?
> 
> I like the idea, but it would have to be two DFID_NAME is case of
> FAN_REPORT_DFID_NAME and also for same parent rename
> to be consistent.

I don't have much to add to this conversation, but I'm just curious here.

If we do require two separate DFID_NAME record objects in the case of cross
directory rename operations, how does an event listener distinguish the
difference between which is which i.e. moved_{from/to}?  To me, this
implies that the event listener is expected to rely on specific
supplemental information object ordering, which to my knowledge is a
contract that we had always wanted to avoid drawing.

/M

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: FAN_REPORT_CHILD_FID
  2021-07-14  1:16         ` FAN_REPORT_CHILD_FID Matthew Bobrowski
@ 2021-07-14 12:09           ` Amir Goldstein
  2021-07-15 23:53             ` FAN_REPORT_CHILD_FID Matthew Bobrowski
  0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2021-07-14 12:09 UTC (permalink / raw)
  To: Matthew Bobrowski; +Cc: Jan Kara, linux-fsdevel

On Wed, Jul 14, 2021 at 4:16 AM Matthew Bobrowski <repnop@google.com> wrote:
>
> On Mon, Jul 12, 2021 at 09:08:18PM +0300, Amir Goldstein wrote:
> > On Mon, Jul 12, 2021 at 7:26 PM Jan Kara <jack@suse.cz> wrote:
> > > On Mon 12-07-21 16:00:54, Amir Goldstein wrote:
> > > Just a brainstorming idea: How about creating new event FAN_RENAME that
> > > would report two DFIDs (if it is cross directory rename)?
> >
> > I like the idea, but it would have to be two DFID_NAME is case of
> > FAN_REPORT_DFID_NAME and also for same parent rename
> > to be consistent.
>
> I don't have much to add to this conversation, but I'm just curious here.
>
> If we do require two separate DFID_NAME record objects in the case of cross
> directory rename operations, how does an event listener distinguish the
> difference between which is which i.e. moved_{from/to}?  To me, this
> implies that the event listener is expected to rely on specific
> supplemental information object ordering, which to my knowledge is a
> contract that we had always wanted to avoid drawing.
>

I think the records should not rely on ordering, but on self describing types,
such as FAN_EVENT_INFO_TYPE_DFID_NAME_{FROM,TO}
but I am trying to think of better names.

I am still debating with myself between adding a new event type
(FAN_RENAME), adding a new report flag (FAN_REPORT_TARGET_FID)
that adds info records to existing MOVE_ events or some combination.

My goal is to minimize the man page size and complexity.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: FAN_REPORT_CHILD_FID
  2021-07-14 12:09           ` FAN_REPORT_CHILD_FID Amir Goldstein
@ 2021-07-15 23:53             ` Matthew Bobrowski
  2021-07-16  9:47               ` FAN_REPORT_CHILD_FID Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Matthew Bobrowski @ 2021-07-15 23:53 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, linux-fsdevel

On Wed, Jul 14, 2021 at 03:09:56PM +0300, Amir Goldstein wrote:
> On Wed, Jul 14, 2021 at 4:16 AM Matthew Bobrowski <repnop@google.com> wrote:
> > On Mon, Jul 12, 2021 at 09:08:18PM +0300, Amir Goldstein wrote:
> > > On Mon, Jul 12, 2021 at 7:26 PM Jan Kara <jack@suse.cz> wrote:
> > > > On Mon 12-07-21 16:00:54, Amir Goldstein wrote:
> > > > Just a brainstorming idea: How about creating new event FAN_RENAME that
> > > > would report two DFIDs (if it is cross directory rename)?
> > >
> > > I like the idea, but it would have to be two DFID_NAME is case of
> > > FAN_REPORT_DFID_NAME and also for same parent rename
> > > to be consistent.
> >
> > I don't have much to add to this conversation, but I'm just curious here.
> >
> > If we do require two separate DFID_NAME record objects in the case of cross
> > directory rename operations, how does an event listener distinguish the
> > difference between which is which i.e. moved_{from/to}?  To me, this
> > implies that the event listener is expected to rely on specific
> > supplemental information object ordering, which to my knowledge is a
> > contract that we had always wanted to avoid drawing.
> >
> 
> I think the records should not rely on ordering, but on self describing types,
> such as FAN_EVENT_INFO_TYPE_DFID_NAME_{FROM,TO}
> but I am trying to think of better names.

Right, having such information types would work nicely IMO. I had something
like that in mind, but I just wanted to make sure that we weren't building
some reliance on the ordering of these information records as we recently
discussed not doing exactly that.

As far as the naming goes, FAN_EVENT_INFO_TYPE_DFID_NAME_{FROM,TO} is a
little bit of a mouth full IMO, but at this stage I haven't thought of any
better alternatives.

> I am still debating with myself between adding a new event type
> (FAN_RENAME), adding a new report flag (FAN_REPORT_TARGET_FID)
> that adds info records to existing MOVE_ events or some combination.

Well, if we went with adding a new event FAN_RENAME and specifying that
resulted in the generation of additional
FAN_EVENT_INFO_TYPE_DFID_NAME_{FROM,TO} information record types for an
event, wouldn't it be weird as it doesn't follow the conventional mechanism
of a listener asking for additional information records? As in,
traditionally we'd intialize the notification group with a flag and then
that flag controls whether or not one is permitted to receive events of a
particular type that may or may not include information records?

Maybe a combination approach is needed in this instance, but this doesn't
necessarily simplify things when attempting to document the API semantics
IMO.

> My goal is to minimize the man page size and complexity.

Either approach, I think they'll be significant documentation changes that
are needed. :)

/M

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: FAN_REPORT_CHILD_FID
  2021-07-15 23:53             ` FAN_REPORT_CHILD_FID Matthew Bobrowski
@ 2021-07-16  9:47               ` Jan Kara
  2021-07-16 11:24                 ` FAN_REPORT_CHILD_FID Amir Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2021-07-16  9:47 UTC (permalink / raw)
  To: Matthew Bobrowski; +Cc: Amir Goldstein, Jan Kara, linux-fsdevel

On Fri 16-07-21 09:53:15, Matthew Bobrowski wrote:
> On Wed, Jul 14, 2021 at 03:09:56PM +0300, Amir Goldstein wrote:
> > I am still debating with myself between adding a new event type
> > (FAN_RENAME), adding a new report flag (FAN_REPORT_TARGET_FID)
> > that adds info records to existing MOVE_ events or some combination.
> 
> Well, if we went with adding a new event FAN_RENAME and specifying that
> resulted in the generation of additional
> FAN_EVENT_INFO_TYPE_DFID_NAME_{FROM,TO} information record types for an
> event, wouldn't it be weird as it doesn't follow the conventional mechanism
> of a listener asking for additional information records? As in,
> traditionally we'd intialize the notification group with a flag and then
> that flag controls whether or not one is permitted to receive events of a
> particular type that may or may not include information records?
> 
> Maybe a combination approach is needed in this instance, but this doesn't
> necessarily simplify things when attempting to document the API semantics
> IMO.

So there are couple of ways how to approach this I guess. One is that we
add a flag like FAN_REPORT_SECONDARY_DFID and FAN_REPORT_SECONDARY_NAME
which would add another DFID(+NAME) record of new type to rename event. In
principle these could be added to MOVED_FROM and/or MOVED_TO events
(probably both are useful). But I'd find the naming somewhat confusing and
difficult to sensibly describe.

That's why I think it may be clearer to go with new FAN_RENAME event that
will be triggered when the directory is on either end of rename(2) (source
or target). If DFID(+NAME) is enabled for the group, the event would report
both source and target DFIDs (and names if enabled) instead of one. I don't
think special FAN_REPORT_? flag to enable the second DFID would be useful
in this case (either for clarity or enabling some functionality).

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: FAN_REPORT_CHILD_FID
  2021-07-16  9:47               ` FAN_REPORT_CHILD_FID Jan Kara
@ 2021-07-16 11:24                 ` Amir Goldstein
  2021-07-27 10:44                   ` FAN_REPORT_CHILD_FID Amir Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2021-07-16 11:24 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Fri, Jul 16, 2021 at 12:47 PM Jan Kara <jack@suse.cz> wrote:
>
> On Fri 16-07-21 09:53:15, Matthew Bobrowski wrote:
> > On Wed, Jul 14, 2021 at 03:09:56PM +0300, Amir Goldstein wrote:
> > > I am still debating with myself between adding a new event type
> > > (FAN_RENAME), adding a new report flag (FAN_REPORT_TARGET_FID)
> > > that adds info records to existing MOVE_ events or some combination.
> >
> > Well, if we went with adding a new event FAN_RENAME and specifying that
> > resulted in the generation of additional
> > FAN_EVENT_INFO_TYPE_DFID_NAME_{FROM,TO} information record types for an
> > event, wouldn't it be weird as it doesn't follow the conventional mechanism
> > of a listener asking for additional information records? As in,
> > traditionally we'd intialize the notification group with a flag and then
> > that flag controls whether or not one is permitted to receive events of a
> > particular type that may or may not include information records?
> >
> > Maybe a combination approach is needed in this instance, but this doesn't
> > necessarily simplify things when attempting to document the API semantics
> > IMO.
>
> So there are couple of ways how to approach this I guess. One is that we
> add a flag like FAN_REPORT_SECONDARY_DFID and FAN_REPORT_SECONDARY_NAME
> which would add another DFID(+NAME) record of new type to rename event. In
> principle these could be added to MOVED_FROM and/or MOVED_TO events
> (probably both are useful). But I'd find the naming somewhat confusing and
> difficult to sensibly describe.
>
> That's why I think it may be clearer to go with new FAN_RENAME event that
> will be triggered when the directory is on either end of rename(2) (source
> or target). If DFID(+NAME) is enabled for the group, the event would report
> both source and target DFIDs (and names if enabled) instead of one. I don't
> think special FAN_REPORT_? flag to enable the second DFID would be useful
> in this case (either for clarity or enabling some functionality).
>

I agree that FAN_RENAME without any new REPORT flag is possible.
Still, I would like to at least try to come up with some UAPI that is more
compatible with existing semantics and simlifies them rather than creating
more special cases.

For example, if FAN_REPORT_ALL_FIDS would start reporting all the
relevant fid records for all events, then nothing would change for
FAN_OPEN etc, but FAN_CREATE etc would start reporting the
target fid and FAN_MOVED_* would start reporting source, target and self
fids or dfid/name records and then FAN_MOVED_* pair can be "merged"
because they carry the exact same information.
(I suppose FAN_MOVE_SELF could be "merged" with them as well)

When I write "merged" I mean queued as a single recorded event
depending on backend flags, like we do with event ON_CHILD and
event on self for inotify vs. fanotify.

With this scheme, listeners that only set FAN_MOVED_FROM in
mask would get all relevant information and users that only set
FAN_MOVED_TO would get all relevant information and we avoid
the spam of different event formats for the same event in case
users set FAN_MOVED|FAN_RENAME|FAN_MOVE_SELF in the mask.

That just leaves the question of HOW to describe the info records
in a consistent way.

I was thinking about:

#define FAN_EVENT_INFO_OF_SELF           1
#define FAN_EVENT_INFO_OF_SOURCE    2
#define FAN_EVENT_INFO_OF_TARGET     3

struct fanotify_event_info_header {
        __u8 info_type; /* The type of object of the info record */
        __u8 info_of;     /* The subject of the info record */
        __u16 len;
};

The existing info_type values determine HOW that info can be used.
For example, FAN_EVENT_INFO_TYPE_DFID can always be resolved
to a linked path if directory is not dead and reachable, while
FAN_EVENT_INFO_TYPE_FID is similar, but it could resolve to an
unknown path even if the inode is linked.

Most events will only report records OF_SELF or OF_TARGET or both.
FAN_MOVED_* will also report records OF_SOURCE.

One way to think about it is that we are encoding all the information
found in man page about what the info record is describing, depending
on the event type and making it available in the event itself, so application
does not need to code the semantics of the man page w.r.t info records
meaning for different event types.

w.r.t backward compatibility of event parsers, naturally, the REPORT_ALL
flag protects us from regressions, but also adapting to the new scheme is
pretty easy. In particular, if application ignored the info_of field the only
event where things could go wrong is FAN_MOVED_TO because the
first info record is not what it used to be (assuming that we report the
info of source dirent first).

Thoughts?

Amir.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: FAN_REPORT_CHILD_FID
  2021-07-16 11:24                 ` FAN_REPORT_CHILD_FID Amir Goldstein
@ 2021-07-27 10:44                   ` Amir Goldstein
  2021-07-28  9:51                     ` FAN_REPORT_CHILD_FID Amir Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2021-07-27 10:44 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Fri, Jul 16, 2021 at 2:24 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Jul 16, 2021 at 12:47 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Fri 16-07-21 09:53:15, Matthew Bobrowski wrote:
> > > On Wed, Jul 14, 2021 at 03:09:56PM +0300, Amir Goldstein wrote:
> > > > I am still debating with myself between adding a new event type
> > > > (FAN_RENAME), adding a new report flag (FAN_REPORT_TARGET_FID)
> > > > that adds info records to existing MOVE_ events or some combination.
> > >
> > > Well, if we went with adding a new event FAN_RENAME and specifying that
> > > resulted in the generation of additional
> > > FAN_EVENT_INFO_TYPE_DFID_NAME_{FROM,TO} information record types for an
> > > event, wouldn't it be weird as it doesn't follow the conventional mechanism
> > > of a listener asking for additional information records? As in,
> > > traditionally we'd intialize the notification group with a flag and then
> > > that flag controls whether or not one is permitted to receive events of a
> > > particular type that may or may not include information records?
> > >
> > > Maybe a combination approach is needed in this instance, but this doesn't
> > > necessarily simplify things when attempting to document the API semantics
> > > IMO.
> >
> > So there are couple of ways how to approach this I guess. One is that we
> > add a flag like FAN_REPORT_SECONDARY_DFID and FAN_REPORT_SECONDARY_NAME
> > which would add another DFID(+NAME) record of new type to rename event. In
> > principle these could be added to MOVED_FROM and/or MOVED_TO events
> > (probably both are useful). But I'd find the naming somewhat confusing and
> > difficult to sensibly describe.
> >
> > That's why I think it may be clearer to go with new FAN_RENAME event that
> > will be triggered when the directory is on either end of rename(2) (source
> > or target). If DFID(+NAME) is enabled for the group, the event would report
> > both source and target DFIDs (and names if enabled) instead of one. I don't
> > think special FAN_REPORT_? flag to enable the second DFID would be useful
> > in this case (either for clarity or enabling some functionality).
> >
>
> I agree that FAN_RENAME without any new REPORT flag is possible.
> Still, I would like to at least try to come up with some UAPI that is more
> compatible with existing semantics and simlifies them rather than creating
> more special cases.
>
> For example, if FAN_REPORT_ALL_FIDS would start reporting all the
> relevant fid records for all events, then nothing would change for
> FAN_OPEN etc, but FAN_CREATE etc would start reporting the
> target fid and FAN_MOVED_* would start reporting source, target and self
> fids or dfid/name records and then FAN_MOVED_* pair can be "merged"
> because they carry the exact same information.
> (I suppose FAN_MOVE_SELF could be "merged" with them as well)
>
> When I write "merged" I mean queued as a single recorded event
> depending on backend flags, like we do with event ON_CHILD and
> event on self for inotify vs. fanotify.
>
> With this scheme, listeners that only set FAN_MOVED_FROM in
> mask would get all relevant information and users that only set
> FAN_MOVED_TO would get all relevant information and we avoid
> the spam of different event formats for the same event in case
> users set FAN_MOVED|FAN_RENAME|FAN_MOVE_SELF in the mask.
>
> That just leaves the question of HOW to describe the info records
> in a consistent way.
>
> I was thinking about:
>
> #define FAN_EVENT_INFO_OF_SELF           1
> #define FAN_EVENT_INFO_OF_SOURCE    2
> #define FAN_EVENT_INFO_OF_TARGET     3
>
> struct fanotify_event_info_header {
>         __u8 info_type; /* The type of object of the info record */
>         __u8 info_of;     /* The subject of the info record */
>         __u16 len;
> };
>
> The existing info_type values determine HOW that info can be used.
> For example, FAN_EVENT_INFO_TYPE_DFID can always be resolved
> to a linked path if directory is not dead and reachable, while
> FAN_EVENT_INFO_TYPE_FID is similar, but it could resolve to an
> unknown path even if the inode is linked.
>
> Most events will only report records OF_SELF or OF_TARGET or both.
> FAN_MOVED_* will also report records OF_SOURCE.
>
> One way to think about it is that we are encoding all the information
> found in man page about what the info record is describing, depending
> on the event type and making it available in the event itself, so application
> does not need to code the semantics of the man page w.r.t info records
> meaning for different event types.
>
> w.r.t backward compatibility of event parsers, naturally, the REPORT_ALL
> flag protects us from regressions, but also adapting to the new scheme is
> pretty easy. In particular, if application ignored the info_of field the only
> event where things could go wrong is FAN_MOVED_TO because the
> first info record is not what it used to be (assuming that we report the
> info of source dirent first).
>

> Thoughts?

Jan,

I know you are catching up on emails and this is a very low priority.
Whenever you get to it, here is a POC branch I prepared for
FAN_REPORT_FID_OF:

https://github.com/amir73il/linux/commits/fanotify_fid_of

This does not include the unified move event with 3 info records
only the basic extension of the UAPI with the sub_type info
concept, which is used to report child fid records for dirent events.

Here is an update of LTP test to support my claim about
ease of adaptation for existing applications to new UAPI:

https://github.com/amir73il/ltp/commits/fanotify_fid_of

I haven't bothered to create a man page draft, but I did go over the
existing documentation to estimate how much of it would need to change
and I think it would require surprisingly little changes as the man page
sections about expected info records were written in quite a broad language.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: FAN_REPORT_CHILD_FID
  2021-07-27 10:44                   ` FAN_REPORT_CHILD_FID Amir Goldstein
@ 2021-07-28  9:51                     ` Amir Goldstein
  0 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2021-07-28  9:51 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, Linux API

+linux-api

On Tue, Jul 27, 2021 at 1:44 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Jul 16, 2021 at 2:24 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Fri, Jul 16, 2021 at 12:47 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Fri 16-07-21 09:53:15, Matthew Bobrowski wrote:
> > > > On Wed, Jul 14, 2021 at 03:09:56PM +0300, Amir Goldstein wrote:
> > > > > I am still debating with myself between adding a new event type
> > > > > (FAN_RENAME), adding a new report flag (FAN_REPORT_TARGET_FID)
> > > > > that adds info records to existing MOVE_ events or some combination.
> > > >
> > > > Well, if we went with adding a new event FAN_RENAME and specifying that
> > > > resulted in the generation of additional
> > > > FAN_EVENT_INFO_TYPE_DFID_NAME_{FROM,TO} information record types for an
> > > > event, wouldn't it be weird as it doesn't follow the conventional mechanism
> > > > of a listener asking for additional information records? As in,
> > > > traditionally we'd intialize the notification group with a flag and then
> > > > that flag controls whether or not one is permitted to receive events of a
> > > > particular type that may or may not include information records?
> > > >
> > > > Maybe a combination approach is needed in this instance, but this doesn't
> > > > necessarily simplify things when attempting to document the API semantics
> > > > IMO.
> > >
> > > So there are couple of ways how to approach this I guess. One is that we
> > > add a flag like FAN_REPORT_SECONDARY_DFID and FAN_REPORT_SECONDARY_NAME
> > > which would add another DFID(+NAME) record of new type to rename event. In
> > > principle these could be added to MOVED_FROM and/or MOVED_TO events
> > > (probably both are useful). But I'd find the naming somewhat confusing and
> > > difficult to sensibly describe.
> > >
> > > That's why I think it may be clearer to go with new FAN_RENAME event that
> > > will be triggered when the directory is on either end of rename(2) (source
> > > or target). If DFID(+NAME) is enabled for the group, the event would report
> > > both source and target DFIDs (and names if enabled) instead of one. I don't
> > > think special FAN_REPORT_? flag to enable the second DFID would be useful
> > > in this case (either for clarity or enabling some functionality).
> > >
> >
> > I agree that FAN_RENAME without any new REPORT flag is possible.
> > Still, I would like to at least try to come up with some UAPI that is more
> > compatible with existing semantics and simlifies them rather than creating
> > more special cases.
> >
> > For example, if FAN_REPORT_ALL_FIDS would start reporting all the
> > relevant fid records for all events, then nothing would change for
> > FAN_OPEN etc, but FAN_CREATE etc would start reporting the
> > target fid and FAN_MOVED_* would start reporting source, target and self
> > fids or dfid/name records and then FAN_MOVED_* pair can be "merged"
> > because they carry the exact same information.
> > (I suppose FAN_MOVE_SELF could be "merged" with them as well)
> >
> > When I write "merged" I mean queued as a single recorded event
> > depending on backend flags, like we do with event ON_CHILD and
> > event on self for inotify vs. fanotify.
> >
> > With this scheme, listeners that only set FAN_MOVED_FROM in
> > mask would get all relevant information and users that only set
> > FAN_MOVED_TO would get all relevant information and we avoid
> > the spam of different event formats for the same event in case
> > users set FAN_MOVED|FAN_RENAME|FAN_MOVE_SELF in the mask.
> >
> > That just leaves the question of HOW to describe the info records
> > in a consistent way.
> >
> > I was thinking about:
> >
> > #define FAN_EVENT_INFO_OF_SELF           1
> > #define FAN_EVENT_INFO_OF_SOURCE    2
> > #define FAN_EVENT_INFO_OF_TARGET     3
> >
> > struct fanotify_event_info_header {
> >         __u8 info_type; /* The type of object of the info record */
> >         __u8 info_of;     /* The subject of the info record */
> >         __u16 len;
> > };
> >
> > The existing info_type values determine HOW that info can be used.
> > For example, FAN_EVENT_INFO_TYPE_DFID can always be resolved
> > to a linked path if directory is not dead and reachable, while
> > FAN_EVENT_INFO_TYPE_FID is similar, but it could resolve to an
> > unknown path even if the inode is linked.
> >
> > Most events will only report records OF_SELF or OF_TARGET or both.
> > FAN_MOVED_* will also report records OF_SOURCE.
> >
> > One way to think about it is that we are encoding all the information
> > found in man page about what the info record is describing, depending
> > on the event type and making it available in the event itself, so application
> > does not need to code the semantics of the man page w.r.t info records
> > meaning for different event types.
> >
> > w.r.t backward compatibility of event parsers, naturally, the REPORT_ALL
> > flag protects us from regressions, but also adapting to the new scheme is
> > pretty easy. In particular, if application ignored the info_of field the only
> > event where things could go wrong is FAN_MOVED_TO because the
> > first info record is not what it used to be (assuming that we report the
> > info of source dirent first).
> >
>
> > Thoughts?
>
> Jan,
>
> I know you are catching up on emails and this is a very low priority.
> Whenever you get to it, here is a POC branch I prepared for
> FAN_REPORT_FID_OF:
>
> https://github.com/amir73il/linux/commits/fanotify_fid_of
>

Slight change of plans. I pushed new branches:
https://github.com/amir73il/linux/commits/fanotify_target_fid
https://github.com/amir73il/ltp/commits/fanotify_target_fid

With the following changes:
1. Uses FAN_REPORT_TARGET_FID as you suggested
2. Requires FAN_REPORT_NAME (to reduce test matrix)
3. Implements 3rd record only for MOVED_FROM event

> This does not include the unified move event with 3 info records
> only the basic extension of the UAPI with the sub_type info
> concept, which is used to report child fid records for dirent events.
>

So I dropped the plan for a "unified rename event".
Adding one extra record only to MOVED_FROM was really easy
because we already have that information available in the "moved"
dentry passed to the backend.

I kept the info record subtype concept and second info record
I created with type DFID_NAME and subtype FID_OF_PARENT2,
but now this UAPI detail is up for debate.

We can easily throw away the subtype and use info type DFID2_NAME
instead.

The thing is, in an earlier version I did not require FAN_REPORT_NAME
and it was useful to reuse the subtype without having to create two new
types (i.e.  DFID2 and DFID2_NAME).

All-in-all it's a very minor difference in the UAPI and I could go either way.
I am leaning towards dropping the subtype, but I am not sure, so I kept
it to hear what other people have to say (CC: linux-api).

There is also the possibility to use the same info record type for the
second info record (DFID_NAME) because this record can only be
reported after a first valid DFID_NAME record, but I do not like this option.

Thoughts?

Amir.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-07-28  9:52 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-11 17:02 FAN_REPORT_CHILD_FID Amir Goldstein
2021-07-12 11:10 ` FAN_REPORT_CHILD_FID Jan Kara
2021-07-12 13:00   ` FAN_REPORT_CHILD_FID Amir Goldstein
2021-07-12 16:26     ` FAN_REPORT_CHILD_FID Jan Kara
2021-07-12 18:08       ` FAN_REPORT_CHILD_FID Amir Goldstein
2021-07-14  1:16         ` FAN_REPORT_CHILD_FID Matthew Bobrowski
2021-07-14 12:09           ` FAN_REPORT_CHILD_FID Amir Goldstein
2021-07-15 23:53             ` FAN_REPORT_CHILD_FID Matthew Bobrowski
2021-07-16  9:47               ` FAN_REPORT_CHILD_FID Jan Kara
2021-07-16 11:24                 ` FAN_REPORT_CHILD_FID Amir Goldstein
2021-07-27 10:44                   ` FAN_REPORT_CHILD_FID Amir Goldstein
2021-07-28  9:51                     ` FAN_REPORT_CHILD_FID Amir Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.