linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 00/10] fanotify events with name info
@ 2020-07-02 12:57 Amir Goldstein
  2020-07-02 12:57 ` [PATCH v4 01/10] inotify: report both events on parent and child with single callback Amir Goldstein
                   ` (9 more replies)
  0 siblings, 10 replies; 23+ messages in thread
From: Amir Goldstein @ 2020-07-02 12:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

Hi Jan,

This patch set implements the FAN_REPORT_NAME and FAN_REPORT_DIR_FID
group flags.  It depends on the prep patch series I posted earlier [1].

Previously on this patch series...

In v2 [2] I implemented both the event FAN_DIR_MODIFY and the group flag
FAN_REPORT_NAME.  We decided to merge only the FAN_DIR_MODIFY part for
v5.7-rc1 and leave the rest for later after we complete the discussion
or user API for FAN_REPORT_NAME.

As the discussions continued on v3 [3], we came to a conclusion that the
FAN_DIR_MODIFY event is redundant and we disabled it before the release
of v5.7.

This time around, the patches are presented after the man page for the
user API has been written [4] and after we covered all the open API
questions in long discussions among us without leaving any open ends.

The patches are available on github branch fanotify_name_fid [5] based
on v5.7-rc3 + Mel's revert patch.  LTP tests [6] and a demo [7] are also
available.

Thanks,
Amir.

Changes since v3:
- All user API aspects clearly defined in man page
- Support flag FAN_REPORT_DIR_FID in addition to FAN_REPORT_NAME
- Support most combinations of new flags with FAN_REPORT_FID
- Supersede the functionality of removed FAN_DIR_MODIFY event
- Report name for more event types including FAN_MOVE_SELF
- Stronger integration with fsnotify_parent() to eliminate duplicate
  events with and without name

[1] https://lore.kernel.org/linux-fsdevel/20200612093343.5669-1-amir73il@gmail.com/
[2] https://lore.kernel.org/linux-fsdevel/20200217131455.31107-1-amir73il@gmail.com/
[3] https://lore.kernel.org/linux-fsdevel/20200505162014.10352-1-amir73il@gmail.com/
[4] https://github.com/amir73il/man-pages/commits/fanotify_name_fid
[5] https://github.com/amir73il/linux/commits/fanotify_name_fid
[6] https://github.com/amir73il/ltp/commits/fanotify_name_fid
[7] https://github.com/amir73il/inotify-tools/commits/fanotify_name

Amir Goldstein (10):
  inotify: report both events on parent and child with single callback
  fanotify: report both events on parent and child with single callback
  fsnotify: send event to parent and child with single callback
  fsnotify: send event with parent/name info to sb/mount/non-dir marks
  fsnotify: send MOVE_SELF event with parent/name info
  fanotify: add basic support for FAN_REPORT_DIR_FID
  fanotify: report events with parent dir fid to sb/mount/non-dir marks
  fanotify: add support for FAN_REPORT_NAME
  fanotify: report parent fid + name + child fid
  fanotify: report parent fid + child fid

 fs/kernfs/file.c                     |  10 ++-
 fs/notify/fanotify/fanotify.c        |  96 +++++++++++++++++++++--
 fs/notify/fanotify/fanotify.h        |   2 +
 fs/notify/fanotify/fanotify_user.c   | 111 +++++++++++++++++++++++----
 fs/notify/fsnotify.c                 | 103 ++++++++++++++++++-------
 fs/notify/inotify/inotify_fsnotify.c |  37 ++++++---
 include/linux/fanotify.h             |   2 +-
 include/linux/fsnotify.h             |  15 ++--
 include/linux/fsnotify_backend.h     |  32 +++++++-
 include/uapi/linux/fanotify.h        |  15 +++-
 10 files changed, 345 insertions(+), 78 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v4 01/10] inotify: report both events on parent and child with single callback
  2020-07-02 12:57 [PATCH v4 00/10] fanotify events with name info Amir Goldstein
@ 2020-07-02 12:57 ` Amir Goldstein
  2020-07-02 12:57 ` [PATCH v4 02/10] fanotify: " Amir Goldstein
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2020-07-02 12:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

fsnotify usually calls inotify_handle_event() once for watching parent
to report event with child's name and once for watching child to report
event without child's name.

Do the same thing with a single callback instead of two callbacks when
marks iterator contains both inode and child entries.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/inotify/inotify_fsnotify.c | 37 +++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 9 deletions(-)

diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index dfd455798a1b..6fb8ae34edd1 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -55,13 +55,11 @@ static int inotify_merge(struct list_head *list,
 	return event_compare(last_event, event);
 }
 
-int inotify_handle_event(struct fsnotify_group *group, u32 mask,
-			 const void *data, int data_type, struct inode *dir,
-			 const struct qstr *file_name, u32 cookie,
-			 struct fsnotify_iter_info *iter_info)
+static int inotify_one_event(struct fsnotify_group *group, u32 mask,
+			     struct fsnotify_mark *inode_mark,
+			     const struct path *path,
+			     const struct qstr *file_name, u32 cookie)
 {
-	const struct path *path = fsnotify_data_path(data, data_type);
-	struct fsnotify_mark *inode_mark = fsnotify_iter_inode_mark(iter_info);
 	struct inotify_inode_mark *i_mark;
 	struct inotify_event_info *event;
 	struct fsnotify_event *fsn_event;
@@ -69,9 +67,6 @@ int inotify_handle_event(struct fsnotify_group *group, u32 mask,
 	int len = 0;
 	int alloc_len = sizeof(struct inotify_event_info);
 
-	if (WARN_ON(fsnotify_iter_vfsmount_mark(iter_info)))
-		return 0;
-
 	if ((inode_mark->mask & FS_EXCL_UNLINK) &&
 	    path && d_unlinked(path->dentry))
 		return 0;
@@ -135,6 +130,30 @@ int inotify_handle_event(struct fsnotify_group *group, u32 mask,
 	return 0;
 }
 
+int inotify_handle_event(struct fsnotify_group *group, u32 mask,
+			 const void *data, int data_type, struct inode *dir,
+			 const struct qstr *file_name, u32 cookie,
+			 struct fsnotify_iter_info *iter_info)
+{
+	const struct path *path = fsnotify_data_path(data, data_type);
+	struct fsnotify_mark *inode_mark = fsnotify_iter_inode_mark(iter_info);
+	struct fsnotify_mark *child_mark = fsnotify_iter_child_mark(iter_info);
+	int ret = 0;
+
+	if (WARN_ON(fsnotify_iter_vfsmount_mark(iter_info)))
+		return 0;
+
+	/* If parent is watching, report the event with child's name */
+	if (inode_mark)
+		ret = inotify_one_event(group, mask, inode_mark, path,
+					file_name, cookie);
+	if (ret || !child_mark)
+		return ret;
+
+	/* If child is watching, report another event without child's name */
+	return inotify_one_event(group, mask, child_mark, path, NULL, 0);
+}
+
 static void inotify_freeing_mark(struct fsnotify_mark *fsn_mark, struct fsnotify_group *group)
 {
 	inotify_ignored_and_remove_idr(fsn_mark, group);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 02/10] fanotify: report both events on parent and child with single callback
  2020-07-02 12:57 [PATCH v4 00/10] fanotify events with name info Amir Goldstein
  2020-07-02 12:57 ` [PATCH v4 01/10] inotify: report both events on parent and child with single callback Amir Goldstein
@ 2020-07-02 12:57 ` Amir Goldstein
  2020-07-02 12:57 ` [PATCH v4 03/10] fsnotify: send event to " Amir Goldstein
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2020-07-02 12:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

fsnotify usually calls fanotify_handle_event() once for watching parent
and once for watching child, even though both events are exactly the
same and will most likely get merged before user reads them.

Add support for handling both event flavors with a single callback
instead of two callbacks when marks iterator contains both inode and
child entries.

fanotify will queue a single event in that case and the unneeded merge
will be avoided.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index c3986fbb6801..7f40b8f57934 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -246,8 +246,11 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
 		/*
 		 * If the event is for a child and this mark doesn't care about
 		 * events on a child, don't send it!
+		 * The special object type "child" always cares about events on
+		 * a child, because it refers to the child inode itself.
 		 */
 		if (event_mask & FS_EVENT_ON_CHILD &&
+		    type != FSNOTIFY_OBJ_TYPE_CHILD &&
 		    (type != FSNOTIFY_OBJ_TYPE_INODE ||
 		     !(mark->mask & FS_EVENT_ON_CHILD)))
 			continue;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 03/10] fsnotify: send event to parent and child with single callback
  2020-07-02 12:57 [PATCH v4 00/10] fanotify events with name info Amir Goldstein
  2020-07-02 12:57 ` [PATCH v4 01/10] inotify: report both events on parent and child with single callback Amir Goldstein
  2020-07-02 12:57 ` [PATCH v4 02/10] fanotify: " Amir Goldstein
@ 2020-07-02 12:57 ` Amir Goldstein
  2020-07-14 10:34   ` Jan Kara
  2020-07-02 12:57 ` [PATCH v4 04/10] fsnotify: send event with parent/name info to sb/mount/non-dir marks Amir Goldstein
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2020-07-02 12:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

Instead of calling fsnotify() twice, once with parent inode and once
with child inode, if event should be sent to parent inode, send it
with both parent and child inodes marks in object type iterator and call
the backend handle_event() callback only once.

The parent inode is assigned to the standard "inode" iterator type and
the child inode is assigned to the special "child" iterator type.

In that case, the bit FS_EVENT_ON_CHILD will be set in the event mask,
the dir argment to handle_event will be the parent inode, the file_name
argument to handle_event is non NULL and refers to the name of the child
and the child inode can be accessed with fsnotify_data_inode().

This will allow fanotify to make decisions based on child or parent's
ignored mask.  For example, when a parent is interested in a specific
event on its children, but a specific child wishes to ignore this event,
the event will not be reported.  This is not what happens with current
code, but according to man page, it is the expected behavior.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/kernfs/file.c     | 10 ++++++----
 fs/notify/fsnotify.c | 40 ++++++++++++++++++++++++++--------------
 2 files changed, 32 insertions(+), 18 deletions(-)

diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index e23b3f62483c..5b1468bc509e 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -883,6 +883,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
 
 	list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
 		struct kernfs_node *parent;
+		struct inode *p_inode = NULL;
 		struct inode *inode;
 		struct qstr name;
 
@@ -899,8 +900,6 @@ static void kernfs_notify_workfn(struct work_struct *work)
 		name = (struct qstr)QSTR_INIT(kn->name, strlen(kn->name));
 		parent = kernfs_get_parent(kn);
 		if (parent) {
-			struct inode *p_inode;
-
 			p_inode = ilookup(info->sb, kernfs_ino(parent));
 			if (p_inode) {
 				fsnotify(p_inode, FS_MODIFY | FS_EVENT_ON_CHILD,
@@ -911,8 +910,11 @@ static void kernfs_notify_workfn(struct work_struct *work)
 			kernfs_put(parent);
 		}
 
-		fsnotify(inode, FS_MODIFY, inode, FSNOTIFY_EVENT_INODE,
-			 NULL, 0);
+		if (!p_inode) {
+			fsnotify(inode, FS_MODIFY, inode, FSNOTIFY_EVENT_INODE,
+				 NULL, 0);
+		}
+
 		iput(inode);
 	}
 
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 51ada3cfd2ff..7c6e624b24c9 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -145,15 +145,17 @@ void __fsnotify_update_child_dentry_flags(struct inode *inode)
 /*
  * Notify this dentry's parent about a child's events with child name info
  * if parent is watching.
- * Notify also the child without name info if child inode is watching.
+ * Notify only the child without name info if parent is not watching.
  */
 int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
 		      int data_type)
 {
+	struct inode *inode = d_inode(dentry);
 	struct dentry *parent;
 	struct inode *p_inode;
 	int ret = 0;
 
+	parent = NULL;
 	if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
 		goto notify_child;
 
@@ -165,23 +167,23 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
 	} else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
 		struct name_snapshot name;
 
-		/*
-		 * We are notifying a parent, so set a flag in mask to inform
-		 * backend that event has information about a child entry.
-		 */
+		/* When notifying parent, child should be passed as data */
+		WARN_ON_ONCE(inode != fsnotify_data_inode(data, data_type));
+
+		/* Notify both parent and child with child name info */
 		take_dentry_name_snapshot(&name, dentry);
 		ret = fsnotify(p_inode, mask | FS_EVENT_ON_CHILD, data,
 			       data_type, &name.name, 0);
 		release_dentry_name_snapshot(&name);
+	} else {
+notify_child:
+		/* Notify child without child name info */
+		ret = fsnotify(inode, mask, data, data_type, NULL, 0);
 	}
 
 	dput(parent);
 
-	if (ret)
-		return ret;
-
-notify_child:
-	return fsnotify(d_inode(dentry), mask, data, data_type, NULL, 0);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(__fsnotify_parent);
 
@@ -322,12 +324,16 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
 	struct super_block *sb = to_tell->i_sb;
 	struct inode *dir = S_ISDIR(to_tell->i_mode) ? to_tell : NULL;
 	struct mount *mnt = NULL;
+	struct inode *child = NULL;
 	int ret = 0;
 	__u32 test_mask, marks_mask;
 
 	if (path)
 		mnt = real_mount(path->mnt);
 
+	if (mask & FS_EVENT_ON_CHILD)
+		child = fsnotify_data_inode(data, data_type);
+
 	/*
 	 * Optimization: srcu_read_lock() has a memory barrier which can
 	 * be expensive.  It protects walking the *_fsnotify_marks lists.
@@ -336,21 +342,23 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
 	 * need SRCU to keep them "alive".
 	 */
 	if (!to_tell->i_fsnotify_marks && !sb->s_fsnotify_marks &&
-	    (!mnt || !mnt->mnt_fsnotify_marks))
+	    (!mnt || !mnt->mnt_fsnotify_marks) &&
+	    (!child || !child->i_fsnotify_marks))
 		return 0;
 
 	/* An event "on child" is not intended for a mount/sb mark */
 	marks_mask = to_tell->i_fsnotify_mask;
-	if (!(mask & FS_EVENT_ON_CHILD)) {
+	if (!child) {
 		marks_mask |= sb->s_fsnotify_mask;
 		if (mnt)
 			marks_mask |= mnt->mnt_fsnotify_mask;
+	} else {
+		marks_mask |= child->i_fsnotify_mask;
 	}
 
 	/*
 	 * if this is a modify event we may need to clear the ignored masks
-	 * otherwise return if neither the inode nor the vfsmount/sb care about
-	 * this type of event.
+	 * otherwise return if none of the marks care about this type of event.
 	 */
 	test_mask = (mask & ALL_FSNOTIFY_EVENTS);
 	if (!(mask & FS_MODIFY) && !(test_mask & marks_mask))
@@ -366,6 +374,10 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
 		iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
 			fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
 	}
+	if (child) {
+		iter_info.marks[FSNOTIFY_OBJ_TYPE_CHILD] =
+			fsnotify_first_mark(&child->i_fsnotify_marks);
+	}
 
 	/*
 	 * We need to merge inode/vfsmount/sb mark lists so that e.g. inode mark
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 04/10] fsnotify: send event with parent/name info to sb/mount/non-dir marks
  2020-07-02 12:57 [PATCH v4 00/10] fanotify events with name info Amir Goldstein
                   ` (2 preceding siblings ...)
  2020-07-02 12:57 ` [PATCH v4 03/10] fsnotify: send event to " Amir Goldstein
@ 2020-07-02 12:57 ` Amir Goldstein
  2020-07-14 11:54   ` Jan Kara
  2020-07-02 12:57 ` [PATCH v4 05/10] fsnotify: send MOVE_SELF event with parent/name info Amir Goldstein
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2020-07-02 12:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

Similar to events "on child" to watching directory, send event "on child"
with parent/name info if sb/mount/non-dir marks are interested in
parent/name info.

The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify
interest in parent/name info for events on non-directory inodes.

Events on "orphan" children (disconnected dentries) are sent without
parent/name info.

Events on direcories are send with parent/name info only if the parent
directory is watching.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fsnotify.c             | 50 +++++++++++++++++++++++---------
 include/linux/fsnotify.h         | 10 +++++--
 include/linux/fsnotify_backend.h | 32 +++++++++++++++++---
 3 files changed, 73 insertions(+), 19 deletions(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 7c6e624b24c9..6683c77a5b13 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -144,27 +144,55 @@ void __fsnotify_update_child_dentry_flags(struct inode *inode)
 
 /*
  * Notify this dentry's parent about a child's events with child name info
- * if parent is watching.
- * Notify only the child without name info if parent is not watching.
+ * if parent is watching or if inode/sb/mount are interested in events with
+ * parent and name info.
+ *
+ * Notify only the child without name info if parent is not watching and
+ * inode/sb/mount are not interested in events with parent and name info.
  */
 int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
 		      int data_type)
 {
+	const struct path *path = fsnotify_data_path(data, data_type);
+	struct mount *mnt = path ? real_mount(path->mnt) : NULL;
 	struct inode *inode = d_inode(dentry);
 	struct dentry *parent;
+	bool parent_watched = dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED;
+	__u32 p_mask, test_mask, marks_mask = 0;
 	struct inode *p_inode;
 	int ret = 0;
 
+	/*
+	 * Do inode/sb/mount care about parent and name info on non-dir?
+	 * Do they care about any event at all?
+	 */
+	if (!inode->i_fsnotify_marks && !inode->i_sb->s_fsnotify_marks &&
+	    (!mnt || !mnt->mnt_fsnotify_marks)) {
+		if (!parent_watched)
+			return 0;
+	} else if (!(mask & FS_ISDIR) && !IS_ROOT(dentry)) {
+		marks_mask |= fsnotify_want_parent(inode->i_fsnotify_mask);
+		marks_mask |= fsnotify_want_parent(inode->i_sb->s_fsnotify_mask);
+		if (mnt)
+			marks_mask |= fsnotify_want_parent(mnt->mnt_fsnotify_mask);
+	}
+
 	parent = NULL;
-	if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
+	test_mask = mask & FS_EVENTS_POSS_TO_PARENT;
+	if (!(marks_mask & test_mask) && !parent_watched)
 		goto notify_child;
 
+	/* Does parent inode care about events on children? */
 	parent = dget_parent(dentry);
 	p_inode = parent->d_inode;
+	p_mask = fsnotify_inode_watches_children(p_inode);
 
-	if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
+	if (p_mask)
+		marks_mask |= p_mask;
+	else if (unlikely(parent_watched))
 		__fsnotify_update_child_dentry_flags(p_inode);
-	} else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
+
+	if ((marks_mask & test_mask) && p_inode != inode) {
 		struct name_snapshot name;
 
 		/* When notifying parent, child should be passed as data */
@@ -346,15 +374,11 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
 	    (!child || !child->i_fsnotify_marks))
 		return 0;
 
-	/* An event "on child" is not intended for a mount/sb mark */
-	marks_mask = to_tell->i_fsnotify_mask;
-	if (!child) {
-		marks_mask |= sb->s_fsnotify_mask;
-		if (mnt)
-			marks_mask |= mnt->mnt_fsnotify_mask;
-	} else {
+	marks_mask = to_tell->i_fsnotify_mask | sb->s_fsnotify_mask;
+	if (mnt)
+		marks_mask |= mnt->mnt_fsnotify_mask;
+	if (child)
 		marks_mask |= child->i_fsnotify_mask;
-	}
 
 	/*
 	 * if this is a modify event we may need to clear the ignored masks
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 9b2566d273a9..044cae3a0628 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -44,10 +44,16 @@ static inline int fsnotify_parent(struct dentry *dentry, __u32 mask,
 {
 	struct inode *inode = d_inode(dentry);
 
-	if (S_ISDIR(inode->i_mode))
+	if (S_ISDIR(inode->i_mode)) {
 		mask |= FS_ISDIR;
 
-	if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
+		/* sb/mount marks are not interested in name of directory */
+		if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
+			goto notify_child;
+	}
+
+	/* disconnected dentry cannot notify parent */
+	if (IS_ROOT(dentry))
 		goto notify_child;
 
 	return __fsnotify_parent(dentry, mask, data, data_type);
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 2c62628566c5..a7363c33211e 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -49,8 +49,11 @@
 #define FS_OPEN_EXEC_PERM	0x00040000	/* open/exec event in a permission hook */
 
 #define FS_EXCL_UNLINK		0x04000000	/* do not send events if object is unlinked */
-/* This inode cares about things that happen to its children.  Always set for
- * dnotify and inotify. */
+/*
+ * Set on inode mark that cares about things that happen to its children.
+ * Always set for dnotify and inotify.
+ * Set on inode/sb/mount marks that care about parenet/name info.
+ */
 #define FS_EVENT_ON_CHILD	0x08000000
 
 #define FS_DN_RENAME		0x10000000	/* file renamed */
@@ -72,14 +75,22 @@
 				  FS_OPEN_EXEC_PERM)
 
 /*
- * This is a list of all events that may get sent to a parent based on fs event
- * happening to inodes inside that directory.
+ * This is a list of all events that may get sent to a parent that is watching
+ * with flag FS_EVENT_ON_CHILD based on fs event on a child of that directory.
  */
 #define FS_EVENTS_POSS_ON_CHILD   (ALL_FSNOTIFY_PERM_EVENTS | \
 				   FS_ACCESS | FS_MODIFY | FS_ATTRIB | \
 				   FS_CLOSE_WRITE | FS_CLOSE_NOWRITE | \
 				   FS_OPEN | FS_OPEN_EXEC)
 
+/*
+ * This is a list of all events that may get sent with the parent inode as the
+ * @to_tell argument of fsnotify().
+ * It may include events that can be sent to an inode/sb/mount mark, but cannot
+ * be sent to a parent watching children.
+ */
+#define FS_EVENTS_POSS_TO_PARENT (FS_EVENTS_POSS_ON_CHILD)
+
 /* Events that can be reported to backends */
 #define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS | \
 			     FS_EVENTS_POSS_ON_CHILD | \
@@ -398,6 +409,19 @@ extern void __fsnotify_vfsmount_delete(struct vfsmount *mnt);
 extern void fsnotify_sb_delete(struct super_block *sb);
 extern u32 fsnotify_get_cookie(void);
 
+static inline __u32 fsnotify_want_parent(__u32 mask)
+{
+	/* FS_EVENT_ON_CHILD is set on marks that want parent/name info */
+	if (!(mask & FS_EVENT_ON_CHILD))
+		return 0;
+	/*
+	 * This object might be watched by a mark that cares about parent/name
+	 * info, does it care about the specific set of events that can be
+	 * reported with parent/name info?
+	 */
+	return mask & FS_EVENTS_POSS_TO_PARENT;
+}
+
 static inline int fsnotify_inode_watches_children(struct inode *inode)
 {
 	/* FS_EVENT_ON_CHILD is set if the inode may care */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 05/10] fsnotify: send MOVE_SELF event with parent/name info
  2020-07-02 12:57 [PATCH v4 00/10] fanotify events with name info Amir Goldstein
                   ` (3 preceding siblings ...)
  2020-07-02 12:57 ` [PATCH v4 04/10] fsnotify: send event with parent/name info to sb/mount/non-dir marks Amir Goldstein
@ 2020-07-02 12:57 ` Amir Goldstein
  2020-07-14 12:13   ` Jan Kara
  2020-07-02 12:57 ` [PATCH v4 06/10] fanotify: add basic support for FAN_REPORT_DIR_FID Amir Goldstein
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2020-07-02 12:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

MOVE_SELF event does not get reported to a parent watching children
when a child is moved, but it can be reported to sb/mount mark with
parent/name info if group is interested in parent/name info.

Use the fsnotify_parent() helper to send a MOVE_SELF event and adjust
fsnotify() to handle the case of an event "on child" that should not
be sent to the watching parent's inode mark.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fsnotify.c             | 21 +++++++++++++++++----
 include/linux/fsnotify.h         |  5 +----
 include/linux/fsnotify_backend.h |  2 +-
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 6683c77a5b13..0faf5b09a73e 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -352,6 +352,7 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
 	struct super_block *sb = to_tell->i_sb;
 	struct inode *dir = S_ISDIR(to_tell->i_mode) ? to_tell : NULL;
 	struct mount *mnt = NULL;
+	struct inode *inode = NULL;
 	struct inode *child = NULL;
 	int ret = 0;
 	__u32 test_mask, marks_mask;
@@ -362,6 +363,14 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
 	if (mask & FS_EVENT_ON_CHILD)
 		child = fsnotify_data_inode(data, data_type);
 
+	/*
+	 * If event is "on child" then to_tell is a watching parent.
+	 * An event "on child" may be sent to mount/sb mark with parent/name
+	 * info, but not appropriate for watching parent (e.g. FS_MOVE_SELF).
+	 */
+	if (!child || (mask & FS_EVENTS_POSS_ON_CHILD))
+		inode = to_tell;
+
 	/*
 	 * Optimization: srcu_read_lock() has a memory barrier which can
 	 * be expensive.  It protects walking the *_fsnotify_marks lists.
@@ -369,14 +378,17 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
 	 * SRCU because we have no references to any objects and do not
 	 * need SRCU to keep them "alive".
 	 */
-	if (!to_tell->i_fsnotify_marks && !sb->s_fsnotify_marks &&
+	if (!sb->s_fsnotify_marks &&
 	    (!mnt || !mnt->mnt_fsnotify_marks) &&
+	    (!inode || !inode->i_fsnotify_marks) &&
 	    (!child || !child->i_fsnotify_marks))
 		return 0;
 
-	marks_mask = to_tell->i_fsnotify_mask | sb->s_fsnotify_mask;
+	marks_mask = sb->s_fsnotify_mask;
 	if (mnt)
 		marks_mask |= mnt->mnt_fsnotify_mask;
+	if (inode)
+		marks_mask |= inode->i_fsnotify_mask;
 	if (child)
 		marks_mask |= child->i_fsnotify_mask;
 
@@ -390,14 +402,15 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
 
 	iter_info.srcu_idx = srcu_read_lock(&fsnotify_mark_srcu);
 
-	iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
-		fsnotify_first_mark(&to_tell->i_fsnotify_marks);
 	iter_info.marks[FSNOTIFY_OBJ_TYPE_SB] =
 		fsnotify_first_mark(&sb->s_fsnotify_marks);
 	if (mnt) {
 		iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
 			fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
 	}
+	if (inode)
+		iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
+			fsnotify_first_mark(&inode->i_fsnotify_marks);
 	if (child) {
 		iter_info.marks[FSNOTIFY_OBJ_TYPE_CHILD] =
 			fsnotify_first_mark(&child->i_fsnotify_marks);
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 044cae3a0628..61dccaf21e7b 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -131,7 +131,6 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
 	u32 fs_cookie = fsnotify_get_cookie();
 	__u32 old_dir_mask = FS_MOVED_FROM;
 	__u32 new_dir_mask = FS_MOVED_TO;
-	__u32 mask = FS_MOVE_SELF;
 	const struct qstr *new_name = &moved->d_name;
 
 	if (old_dir == new_dir)
@@ -140,7 +139,6 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
 	if (isdir) {
 		old_dir_mask |= FS_ISDIR;
 		new_dir_mask |= FS_ISDIR;
-		mask |= FS_ISDIR;
 	}
 
 	fsnotify_name(old_dir, old_dir_mask, source, old_name, fs_cookie);
@@ -149,8 +147,7 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
 	if (target)
 		fsnotify_link_count(target);
 
-	if (source)
-		fsnotify(source, mask, source, FSNOTIFY_EVENT_INODE, NULL, 0);
+	fsnotify_dentry(moved, FS_MOVE_SELF);
 	audit_inode_child(new_dir, moved, AUDIT_TYPE_CHILD_CREATE);
 }
 
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index a7363c33211e..28f6cf704875 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -89,7 +89,7 @@
  * It may include events that can be sent to an inode/sb/mount mark, but cannot
  * be sent to a parent watching children.
  */
-#define FS_EVENTS_POSS_TO_PARENT (FS_EVENTS_POSS_ON_CHILD)
+#define FS_EVENTS_POSS_TO_PARENT (FS_EVENTS_POSS_ON_CHILD | FS_MOVE_SELF)
 
 /* Events that can be reported to backends */
 #define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS | \
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 06/10] fanotify: add basic support for FAN_REPORT_DIR_FID
  2020-07-02 12:57 [PATCH v4 00/10] fanotify events with name info Amir Goldstein
                   ` (4 preceding siblings ...)
  2020-07-02 12:57 ` [PATCH v4 05/10] fsnotify: send MOVE_SELF event with parent/name info Amir Goldstein
@ 2020-07-02 12:57 ` Amir Goldstein
  2020-07-02 12:57 ` [PATCH v4 07/10] fanotify: report events with parent dir fid to sb/mount/non-dir marks Amir Goldstein
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2020-07-02 12:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

For now, the flag is mutually exclusive with FAN_REPORT_FID.
Events include a single info record of type FAN_EVENT_INFO_TYPE_DFID
with a directory file handle.

For now, events are only reported for:
- Directory modification events
- Events on children of a watching directory
- Events on directory objects

Soon, we will add support for reporting the parent directory fid
for events on non-directories with filesystem/mount mark and
support for reporting both parent directory fid and child fid.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      | 33 +++++++++++++-
 fs/notify/fanotify/fanotify_user.c | 69 ++++++++++++++++++++++++++----
 include/linux/fanotify.h           |  2 +-
 include/uapi/linux/fanotify.h      | 11 +++--
 4 files changed, 99 insertions(+), 16 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 7f40b8f57934..23fb1bfb6945 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -206,7 +206,7 @@ static int fanotify_get_response(struct fsnotify_group *group,
 static u32 fanotify_group_event_mask(struct fsnotify_group *group,
 				     struct fsnotify_iter_info *iter_info,
 				     u32 event_mask, const void *data,
-				     int data_type)
+				     int data_type, struct inode *dir)
 {
 	__u32 marks_mask = 0, marks_ignored_mask = 0;
 	__u32 test_mask, user_mask = FANOTIFY_OUTGOING_EVENTS |
@@ -226,6 +226,10 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
 		/* Path type events are only relevant for files and dirs */
 		if (!d_is_reg(path->dentry) && !d_can_lookup(path->dentry))
 			return 0;
+	} else if (!(fid_mode & FAN_REPORT_FID)) {
+		/* Do we have a directory inode to report? */
+		if (!dir)
+			return 0;
 	}
 
 	fsnotify_foreach_obj_type(type) {
@@ -375,6 +379,28 @@ static struct inode *fanotify_fid_inode(u32 event_mask, const void *data,
 	return fsnotify_data_inode(data, data_type);
 }
 
+/*
+ * The inode to use as identifier when reporting dir fid depends on the event.
+ * Report the modified directory inode on dirent modification events.
+ * Report the "victim" inode if "victim" is a directory.
+ * Report the parent inode if "victim" is not a directory and event is
+ * reported to parent.
+ * Otherwise, do not report dir fid.
+ */
+static struct inode *fanotify_dfid_inode(u32 event_mask, const void *data,
+					 int data_type, struct inode *dir)
+{
+	struct inode *inode = fsnotify_data_inode(data, data_type);
+
+	if (event_mask & ALL_FSNOTIFY_DIRENT_EVENTS)
+		return dir;
+
+	if (S_ISDIR(inode->i_mode))
+		return inode;
+
+	return dir;
+}
+
 struct fanotify_event *fanotify_alloc_path_event(const struct path *path,
 						 gfp_t gfp)
 {
@@ -470,6 +496,9 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
 	bool name_event = false;
 
+	if ((fid_mode & FAN_REPORT_DIR_FID) && dir)
+		id = fanotify_dfid_inode(mask, data, data_type, dir);
+
 	/*
 	 * For queues with unlimited length lost events are not expected and
 	 * can possibly have security implications. Avoid losing events when
@@ -580,7 +609,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
 	BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);
 
 	mask = fanotify_group_event_mask(group, iter_info, mask, data,
-					 data_type);
+					 data_type, dir);
 	if (!mask)
 		return 0;
 
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index af8268b44c68..fe2c25e26753 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -215,7 +215,7 @@ static int process_access_response(struct fsnotify_group *group,
 }
 
 static int copy_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
-			     const char *name, size_t name_len,
+			     int info_type, const char *name, size_t name_len,
 			     char __user *buf, size_t count)
 {
 	struct fanotify_event_info_fid info = { };
@@ -238,8 +238,21 @@ static int copy_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
 	 * Copy event info fid header followed by variable sized file handle
 	 * and optionally followed by variable sized filename.
 	 */
-	info.hdr.info_type = name_len ? FAN_EVENT_INFO_TYPE_DFID_NAME :
-					FAN_EVENT_INFO_TYPE_FID;
+	switch (info_type) {
+	case FAN_EVENT_INFO_TYPE_FID:
+	case FAN_EVENT_INFO_TYPE_DFID:
+		if (WARN_ON_ONCE(name_len))
+			return -EFAULT;
+		break;
+	case FAN_EVENT_INFO_TYPE_DFID_NAME:
+		if (WARN_ON_ONCE(!name_len))
+			return -EFAULT;
+		break;
+	default:
+		return -EFAULT;
+	}
+
+	info.hdr.info_type = info_type;
 	info.hdr.len = len;
 	info.fsid = *fsid;
 	if (copy_to_user(buf, &info, sizeof(info)))
@@ -303,8 +316,10 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	struct fanotify_event_metadata metadata;
 	struct path *path = fanotify_event_path(event);
 	struct fanotify_fh *dfh = fanotify_event_dir_fh(event);
+	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
 	struct file *f = NULL;
 	int ret, fd = FAN_NOFD;
+	int info_type = 0;
 
 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
 
@@ -345,8 +360,9 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 
 	/* Event info records order is: dir fid + name, child fid */
 	if (dfh) {
+		info_type = FAN_EVENT_INFO_TYPE_DFID_NAME;
 		ret = copy_info_to_user(fanotify_event_fsid(event), dfh,
-					fanotify_fh_name(dfh),
+					info_type, fanotify_fh_name(dfh),
 					dfh->name_len, buf, count);
 		if (ret < 0)
 			return ret;
@@ -356,9 +372,33 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	}
 
 	if (fanotify_event_object_fh_len(event)) {
+		if (fid_mode == FAN_REPORT_FID || info_type) {
+			/*
+			 * With only group flag FAN_REPORT_FID only type FID is
+			 * reported. Second info record type is always FID.
+			 */
+			info_type = FAN_EVENT_INFO_TYPE_FID;
+		} else if ((event->mask & ALL_FSNOTIFY_DIRENT_EVENTS) ||
+			   (event->mask & FAN_ONDIR)) {
+			/*
+			 * With group flag FAN_REPORT_DIR_FID, a single info
+			 * record has type DFID for directory entry modification
+			 * event and for event on a directory.
+			 */
+			info_type = FAN_EVENT_INFO_TYPE_DFID;
+		} else {
+			/*
+			 * With group flags FAN_REPORT_DIR_FID|FAN_REPORT_FID,
+			 * a single info record has type FID for event on a
+			 * non-directory, when there is no directory to report.
+			 * For example, on FAN_DELETE_SELF event.
+			 */
+			info_type = FAN_EVENT_INFO_TYPE_FID;
+		}
+
 		ret = copy_info_to_user(fanotify_event_fsid(event),
 					fanotify_event_object_fh(event),
-					NULL, 0, buf, count);
+					info_type, NULL, 0, buf, count);
 		if (ret < 0)
 			return ret;
 
@@ -854,6 +894,8 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	struct fsnotify_group *group;
 	int f_flags, fd;
 	struct user_struct *user;
+	unsigned int fid_mode = flags & FANOTIFY_FID_BITS;
+	unsigned int class = flags & FANOTIFY_CLASS_BITS;
 
 	pr_debug("%s: flags=%x event_f_flags=%x\n",
 		 __func__, flags, event_f_flags);
@@ -880,10 +922,19 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 		return -EINVAL;
 	}
 
-	if ((flags & FANOTIFY_FID_BITS) &&
-	    (flags & FANOTIFY_CLASS_BITS) != FAN_CLASS_NOTIF)
+	if (fid_mode && class != FAN_CLASS_NOTIF)
 		return -EINVAL;
 
+	/* Reporting either object fid or dir fid */
+	switch (fid_mode) {
+	case 0:
+	case FAN_REPORT_FID:
+	case FAN_REPORT_DIR_FID:
+		break;
+	default:
+		return -EINVAL;
+	}
+
 	user = get_current_user();
 	if (atomic_read(&user->fanotify_listeners) > FANOTIFY_DEFAULT_MAX_LISTENERS) {
 		free_uid(user);
@@ -919,7 +970,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	group->fanotify_data.f_flags = event_f_flags;
 	init_waitqueue_head(&group->fanotify_data.access_waitq);
 	INIT_LIST_HEAD(&group->fanotify_data.access_list);
-	switch (flags & FANOTIFY_CLASS_BITS) {
+	switch (class) {
 	case FAN_CLASS_NOTIF:
 		group->priority = FS_PRIO_0;
 		break;
@@ -1229,7 +1280,7 @@ COMPAT_SYSCALL_DEFINE6(fanotify_mark,
  */
 static int __init fanotify_user_setup(void)
 {
-	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 8);
+	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 9);
 	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9);
 
 	fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index bbbee11d2521..4ddac97b2bf7 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -18,7 +18,7 @@
 #define FANOTIFY_CLASS_BITS	(FAN_CLASS_NOTIF | FAN_CLASS_CONTENT | \
 				 FAN_CLASS_PRE_CONTENT)
 
-#define FANOTIFY_FID_BITS	(FAN_REPORT_FID)
+#define FANOTIFY_FID_BITS	(FAN_REPORT_FID | FAN_REPORT_DIR_FID)
 
 #define FANOTIFY_INIT_FLAGS	(FANOTIFY_CLASS_BITS | FANOTIFY_FID_BITS | \
 				 FAN_REPORT_TID | \
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index 7f2f17eacbf9..21afebf77fd7 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -53,6 +53,7 @@
 /* Flags to determine fanotify event format */
 #define FAN_REPORT_TID		0x00000100	/* event->pid is thread id */
 #define FAN_REPORT_FID		0x00000200	/* Report unique file id */
+#define FAN_REPORT_DIR_FID	0x00000400	/* Report unique directory id */
 
 /* Deprecated - do not use this in programs and do not add new flags here! */
 #define FAN_ALL_INIT_FLAGS	(FAN_CLOEXEC | FAN_NONBLOCK | \
@@ -117,6 +118,7 @@ struct fanotify_event_metadata {
 
 #define FAN_EVENT_INFO_TYPE_FID		1
 #define FAN_EVENT_INFO_TYPE_DFID_NAME	2
+#define FAN_EVENT_INFO_TYPE_DFID	3
 
 /* Variable length info record following event metadata */
 struct fanotify_event_info_header {
@@ -126,10 +128,11 @@ struct fanotify_event_info_header {
 };
 
 /*
- * Unique file identifier info record. This is used both for
- * FAN_EVENT_INFO_TYPE_FID records and for FAN_EVENT_INFO_TYPE_DFID_NAME
- * records. For FAN_EVENT_INFO_TYPE_DFID_NAME there is additionally a null
- * terminated name immediately after the file handle.
+ * Unique file identifier info record.
+ * This structure is used for records of types FAN_EVENT_INFO_TYPE_FID,
+ * FAN_EVENT_INFO_TYPE_DFID and FAN_EVENT_INFO_TYPE_DFID_NAME.
+ * For FAN_EVENT_INFO_TYPE_DFID_NAME there is additionally a null terminated
+ * name immediately after the file handle.
  */
 struct fanotify_event_info_fid {
 	struct fanotify_event_info_header hdr;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 07/10] fanotify: report events with parent dir fid to sb/mount/non-dir marks
  2020-07-02 12:57 [PATCH v4 00/10] fanotify events with name info Amir Goldstein
                   ` (5 preceding siblings ...)
  2020-07-02 12:57 ` [PATCH v4 06/10] fanotify: add basic support for FAN_REPORT_DIR_FID Amir Goldstein
@ 2020-07-02 12:57 ` Amir Goldstein
  2020-07-02 12:57 ` [PATCH v4 08/10] fanotify: add support for FAN_REPORT_NAME Amir Goldstein
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2020-07-02 12:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

In a group with flag FAN_REPORT_DIR_FID, when adding an inode mark with
FAN_EVENT_ON_CHILD, events on non-directory children are reported with
the fid of the parent.

When adding a filesystem or mount mark or mark on a non-dir inode, we
want to report events that are "possible on child" (e.g. open/close)
also with fid of the parent, as if the victim inode's parent is
interested in events "on child".

Some events, currently only FAN_MOVE_SELF, should be reported to a
sb/mount/non-dir mark with parent fid even though they are not
reported to a watching parent.

To get the desired behavior we set the flag FAN_EVENT_ON_CHILD on
all the sb/mount/non-dir mark masks in a group with FAN_REPORT_DIR_FID.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      | 3 +--
 fs/notify/fanotify/fanotify_user.c | 7 +++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 23fb1bfb6945..c45d65fa8d1d 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -255,8 +255,7 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
 		 */
 		if (event_mask & FS_EVENT_ON_CHILD &&
 		    type != FSNOTIFY_OBJ_TYPE_CHILD &&
-		    (type != FSNOTIFY_OBJ_TYPE_INODE ||
-		     !(mark->mask & FS_EVENT_ON_CHILD)))
+		    !(mark->mask & FS_EVENT_ON_CHILD))
 			continue;
 
 		marks_mask |= mark->mask;
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index fe2c25e26753..20bb1c52b1cd 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1213,6 +1213,13 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 	if (mnt || !S_ISDIR(inode->i_mode)) {
 		mask &= ~FAN_EVENT_ON_CHILD;
 		umask = FAN_EVENT_ON_CHILD;
+		/*
+		 * If group needs to report parent fid, register for getting
+		 * events with parent/name info for non-directory.
+		 */
+		if ((fid_mode & FAN_REPORT_DIR_FID) &&
+		    (flags & FAN_MARK_ADD) && !ignored)
+			mask |= FAN_EVENT_ON_CHILD;
 	}
 
 	/* create/update an inode mark */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 08/10] fanotify: add support for FAN_REPORT_NAME
  2020-07-02 12:57 [PATCH v4 00/10] fanotify events with name info Amir Goldstein
                   ` (6 preceding siblings ...)
  2020-07-02 12:57 ` [PATCH v4 07/10] fanotify: report events with parent dir fid to sb/mount/non-dir marks Amir Goldstein
@ 2020-07-02 12:57 ` Amir Goldstein
  2020-07-02 12:57 ` [PATCH v4 09/10] fanotify: report parent fid + name + child fid Amir Goldstein
  2020-07-02 12:57 ` [PATCH v4 10/10] fanotify: report parent fid " Amir Goldstein
  9 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2020-07-02 12:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

Introduce a new fanotify_init() flag FAN_REPORT_NAME.  It requires the
flag FAN_REPORT_DIR_FID and there is a constant for setting both flags
named FAN_REPORT_DFID_NAME.

For a group with flag FAN_REPORT_NAME, the parent fid and name are
reported for directory entry modification events (create/detete/move)
and for events on non-directory objects.

Events on directories themselves are reported with their own fid and
"." as the name.

The parent fid and name are reported with an info record of type
FAN_EVENT_INFO_TYPE_DFID_NAME, similar to the way that parent fid is
reported with into type FAN_EVENT_INFO_TYPE_DFID, but with an appended
null terminated name string.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      | 18 +++++++++++-
 fs/notify/fanotify/fanotify_user.c | 47 +++++++++++++++++++++++-------
 include/linux/fanotify.h           |  2 +-
 include/uapi/linux/fanotify.h      |  4 +++
 4 files changed, 59 insertions(+), 12 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index c45d65fa8d1d..b22ab6630eba 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -495,9 +495,25 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
 	bool name_event = false;
 
-	if ((fid_mode & FAN_REPORT_DIR_FID) && dir)
+	if ((fid_mode & FAN_REPORT_DIR_FID) && dir) {
 		id = fanotify_dfid_inode(mask, data, data_type, dir);
 
+		/*
+		 * We record file name only in a group with FAN_REPORT_NAME
+		 * and when we have a directory inode to report.
+		 *
+		 * For directory entry modification event, we record the fid of
+		 * the directory and the name of the modified entry.
+		 *
+		 * For event on non-directory that is reported to parent, we
+		 * record the fid of the parent and the name of the child.
+		 */
+		if ((fid_mode & FAN_REPORT_NAME) &&
+		    ((mask & ALL_FSNOTIFY_DIRENT_EVENTS) ||
+		     !(mask & FAN_ONDIR)))
+			name_event = true;
+	}
+
 	/*
 	 * For queues with unlimited length lost events are not expected and
 	 * can possibly have security implications. Avoid losing events when
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 20bb1c52b1cd..27116cfead66 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -64,17 +64,26 @@ static int fanotify_fid_info_len(int fh_len, int name_len)
 	return roundup(FANOTIFY_INFO_HDR_LEN + info_len, FANOTIFY_EVENT_ALIGN);
 }
 
-static int fanotify_event_info_len(struct fanotify_event *event)
+static int fanotify_event_info_len(unsigned int fid_mode,
+				   struct fanotify_event *event)
 {
-	int info_len = 0;
 	int fh_len = fanotify_event_object_fh_len(event);
 	struct fanotify_fh *dfh = fanotify_event_dir_fh(event);
+	int info_len = 0;
+	int dot_len = 0;
 
-	if (dfh)
+	if (dfh) {
 		info_len += fanotify_fid_info_len(dfh->len, dfh->name_len);
+	} else if ((fid_mode & FAN_REPORT_NAME) && (event->mask & FAN_ONDIR)) {
+		/*
+		 * With group flag FAN_REPORT_NAME, if name was not recorded in
+		 * event on a directory, we will report the name ".".
+		 */
+		dot_len = 1;
+	}
 
 	if (fh_len)
-		info_len += fanotify_fid_info_len(fh_len, 0);
+		info_len += fanotify_fid_info_len(fh_len, dot_len);
 
 	return info_len;
 }
@@ -90,6 +99,7 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,
 {
 	size_t event_size = FAN_EVENT_METADATA_LEN;
 	struct fanotify_event *event = NULL;
+	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
 
 	pr_debug("%s: group=%p count=%zd\n", __func__, group, count);
 
@@ -97,8 +107,8 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,
 	if (fsnotify_notify_queue_is_empty(group))
 		goto out;
 
-	if (FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS)) {
-		event_size += fanotify_event_info_len(
+	if (fid_mode) {
+		event_size += fanotify_event_info_len(fid_mode,
 			FANOTIFY_E(fsnotify_peek_first_event(group)));
 	}
 
@@ -324,7 +334,7 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
 
 	metadata.event_len = FAN_EVENT_METADATA_LEN +
-					fanotify_event_info_len(event);
+				fanotify_event_info_len(fid_mode, event);
 	metadata.metadata_len = FAN_EVENT_METADATA_LEN;
 	metadata.vers = FANOTIFY_METADATA_VERSION;
 	metadata.reserved = 0;
@@ -372,12 +382,25 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	}
 
 	if (fanotify_event_object_fh_len(event)) {
+		const char *dot = NULL;
+		int dot_len = 0;
+
 		if (fid_mode == FAN_REPORT_FID || info_type) {
 			/*
 			 * With only group flag FAN_REPORT_FID only type FID is
 			 * reported. Second info record type is always FID.
 			 */
 			info_type = FAN_EVENT_INFO_TYPE_FID;
+		} else if ((fid_mode & FAN_REPORT_NAME) &&
+			   (event->mask & FAN_ONDIR)) {
+			/*
+			 * With group flag FAN_REPORT_NAME, if name was not
+			 * recorded in an event on a directory, report the
+			 * name "." with info type DFID_NAME.
+			 */
+			info_type = FAN_EVENT_INFO_TYPE_DFID_NAME;
+			dot = ".";
+			dot_len = 1;
 		} else if ((event->mask & ALL_FSNOTIFY_DIRENT_EVENTS) ||
 			   (event->mask & FAN_ONDIR)) {
 			/*
@@ -398,7 +421,7 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 
 		ret = copy_info_to_user(fanotify_event_fsid(event),
 					fanotify_event_object_fh(event),
-					info_type, NULL, 0, buf, count);
+					info_type, dot, dot_len, buf, count);
 		if (ret < 0)
 			return ret;
 
@@ -925,11 +948,15 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	if (fid_mode && class != FAN_CLASS_NOTIF)
 		return -EINVAL;
 
-	/* Reporting either object fid or dir fid */
+	/*
+	 * Reporting either object fid or dir fid.
+	 * Child name is reported with parent fid so requires dir fid.
+	 */
 	switch (fid_mode) {
 	case 0:
 	case FAN_REPORT_FID:
 	case FAN_REPORT_DIR_FID:
+	case FAN_REPORT_DFID_NAME:
 		break;
 	default:
 		return -EINVAL;
@@ -1287,7 +1314,7 @@ COMPAT_SYSCALL_DEFINE6(fanotify_mark,
  */
 static int __init fanotify_user_setup(void)
 {
-	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 9);
+	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 10);
 	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9);
 
 	fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index 4ddac97b2bf7..3e9c56ee651f 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -18,7 +18,7 @@
 #define FANOTIFY_CLASS_BITS	(FAN_CLASS_NOTIF | FAN_CLASS_CONTENT | \
 				 FAN_CLASS_PRE_CONTENT)
 
-#define FANOTIFY_FID_BITS	(FAN_REPORT_FID | FAN_REPORT_DIR_FID)
+#define FANOTIFY_FID_BITS	(FAN_REPORT_FID | FAN_REPORT_DFID_NAME)
 
 #define FANOTIFY_INIT_FLAGS	(FANOTIFY_CLASS_BITS | FANOTIFY_FID_BITS | \
 				 FAN_REPORT_TID | \
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index 21afebf77fd7..fbf9c5c7dd59 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -54,6 +54,10 @@
 #define FAN_REPORT_TID		0x00000100	/* event->pid is thread id */
 #define FAN_REPORT_FID		0x00000200	/* Report unique file id */
 #define FAN_REPORT_DIR_FID	0x00000400	/* Report unique directory id */
+#define FAN_REPORT_NAME		0x00000800	/* Report events with name */
+
+/* Convenience macro - FAN_REPORT_NAME requires FAN_REPORT_DIR_FID */
+#define FAN_REPORT_DFID_NAME	(FAN_REPORT_DIR_FID | FAN_REPORT_NAME)
 
 /* Deprecated - do not use this in programs and do not add new flags here! */
 #define FAN_ALL_INIT_FLAGS	(FAN_CLOEXEC | FAN_NONBLOCK | \
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 09/10] fanotify: report parent fid + name + child fid
  2020-07-02 12:57 [PATCH v4 00/10] fanotify events with name info Amir Goldstein
                   ` (7 preceding siblings ...)
  2020-07-02 12:57 ` [PATCH v4 08/10] fanotify: add support for FAN_REPORT_NAME Amir Goldstein
@ 2020-07-02 12:57 ` Amir Goldstein
  2020-07-02 12:57 ` [PATCH v4 10/10] fanotify: report parent fid " Amir Goldstein
  9 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2020-07-02 12:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

For a group with fanotify_init() flag FAN_REPORT_DFID_NAME, the parent
fid and name are reported for events on non-directory objects with an
info record of type FAN_EVENT_INFO_TYPE_DFID_NAME.

If the group also has the init flag FAN_REPORT_FID, the child fid
is also reported with another info record that follows the first info
record. The second info record is the same info record that would have
been reported to a group with only FAN_REPORT_FID flag.

When the child fid needs to be recorded, the variable size struct
fanotify_name_event is preallocated with enough space to store the
child fh between the dir fh and the name.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      | 33 ++++++++++++++++++++++++++++--
 fs/notify/fanotify/fanotify.h      |  2 ++
 fs/notify/fanotify/fanotify_user.c |  3 ++-
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index b22ab6630eba..3e8e20c19d97 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -64,6 +64,16 @@ static bool fanotify_name_event_equal(struct fanotify_name_event *fne1,
 	    !fanotify_fh_equal(dfh1, dfh2))
 		return false;
 
+	/*
+	 * There could be a child fid before the name.
+	 * If only one dfh had a blob, we would have failed the name_offset
+	 * comparison above.
+	 */
+	if (fanotify_fh_blob_len(dfh1) &&
+	    memcmp(fanotify_fh_blob(dfh1), fanotify_fh_blob(dfh2),
+		   fanotify_fh_blob_len(dfh1)))
+		return false;
+
 	return !memcmp(fanotify_fh_name(dfh1), fanotify_fh_name(dfh2),
 		       dfh1->name_len);
 }
@@ -454,13 +464,19 @@ struct fanotify_event *fanotify_alloc_fid_event(struct inode *id,
 struct fanotify_event *fanotify_alloc_name_event(struct inode *id,
 						 __kernel_fsid_t *fsid,
 						 const struct qstr *file_name,
-						 gfp_t gfp)
+						 struct inode *child, gfp_t gfp)
 {
 	struct fanotify_name_event *fne;
 	struct fanotify_fh *dfh;
 	unsigned int prealloc_fh_len = fanotify_encode_fh_len(id);
+	unsigned int child_fh_len = fanotify_encode_fh_len(child);
 	unsigned int size;
 
+	if (WARN_ON_ONCE(prealloc_fh_len % FANOTIFY_FH_HDR_LEN))
+		child_fh_len = 0;
+	else if (child_fh_len)
+		prealloc_fh_len += FANOTIFY_FH_HDR_LEN + child_fh_len;
+
 	size = sizeof(*fne) - FANOTIFY_INLINE_FH_LEN + prealloc_fh_len;
 	if (file_name)
 		size += file_name->len + 1;
@@ -472,6 +488,8 @@ struct fanotify_event *fanotify_alloc_name_event(struct inode *id,
 	fne->fsid = *fsid;
 	dfh = &fne->dir_fh;
 	fanotify_encode_fh(dfh, id, prealloc_fh_len, 0);
+	if (child_fh_len)
+		fanotify_encode_fh(fanotify_fh_blob(dfh), child, child_fh_len, 0);
 	if (file_name)
 		fanotify_fh_copy_name(dfh, file_name);
 
@@ -493,9 +511,19 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	struct inode *id = fanotify_fid_inode(mask, data, data_type, dir);
 	const struct path *path = fsnotify_data_path(data, data_type);
 	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
+	struct inode *child = NULL;
 	bool name_event = false;
 
 	if ((fid_mode & FAN_REPORT_DIR_FID) && dir) {
+		/*
+		 * With both flags FAN_REPORT_DIR_FID and FAN_REPORT_FID, we
+		 * report the child fid for events reported on a non-dir child
+		 * in addition to reporting the parent fid and child name.
+		 */
+		if ((fid_mode & FAN_REPORT_FID) &&
+		    (mask & FAN_EVENT_ON_CHILD) && !(mask & FAN_ONDIR))
+			child = id;
+
 		id = fanotify_dfid_inode(mask, data, data_type, dir);
 
 		/*
@@ -531,7 +559,8 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	if (fanotify_is_perm_event(mask)) {
 		event = fanotify_alloc_perm_event(path, gfp);
 	} else if (name_event && file_name) {
-		event = fanotify_alloc_name_event(id, fsid, file_name, gfp);
+		event = fanotify_alloc_name_event(id, fsid, file_name, child,
+						  gfp);
 	} else if (fid_mode) {
 		event = fanotify_alloc_fid_event(id, fsid, gfp);
 	} else {
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 7cbdac4be42f..f1aaa9fa5ca8 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -171,6 +171,8 @@ static inline struct fanotify_fh *fanotify_event_object_fh(
 {
 	if (event->type == FANOTIFY_EVENT_TYPE_FID)
 		return &FANOTIFY_FE(event)->object_fh;
+	else if (event->type == FANOTIFY_EVENT_TYPE_FID_NAME)
+		return fanotify_fh_blob(&FANOTIFY_NE(event)->dir_fh);
 	else
 		return NULL;
 }
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 27116cfead66..577ad74f71ec 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -949,14 +949,15 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 		return -EINVAL;
 
 	/*
-	 * Reporting either object fid or dir fid.
 	 * Child name is reported with parent fid so requires dir fid.
+	 * If reporting child name, we can report both child fid and dir fid.
 	 */
 	switch (fid_mode) {
 	case 0:
 	case FAN_REPORT_FID:
 	case FAN_REPORT_DIR_FID:
 	case FAN_REPORT_DFID_NAME:
+	case FAN_REPORT_DFID_NAME | FAN_REPORT_FID:
 		break;
 	default:
 		return -EINVAL;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 10/10] fanotify: report parent fid + child fid
  2020-07-02 12:57 [PATCH v4 00/10] fanotify events with name info Amir Goldstein
                   ` (8 preceding siblings ...)
  2020-07-02 12:57 ` [PATCH v4 09/10] fanotify: report parent fid + name + child fid Amir Goldstein
@ 2020-07-02 12:57 ` Amir Goldstein
  9 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2020-07-02 12:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

Add support for FAN_REPORT_FID | FAN_REPORT_DIR_FID.
Internally, it is implemented as a private case of reporting both
parent and child fids and name, the parent and child fids are recorded
in a variable length fanotify_name_event, but there is no name.

It should be noted that directory modification events are recorded
in fixed size fanotify_fid_event when not reporting name, just like
with group flags FAN_REPORT_FID.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      | 16 +++++++++++-----
 fs/notify/fanotify/fanotify_user.c | 15 ++++-----------
 2 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 3e8e20c19d97..1122d7a8ba07 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -518,7 +518,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 		/*
 		 * With both flags FAN_REPORT_DIR_FID and FAN_REPORT_FID, we
 		 * report the child fid for events reported on a non-dir child
-		 * in addition to reporting the parent fid and child name.
+		 * in addition to reporting the parent fid and maybe child name.
 		 */
 		if ((fid_mode & FAN_REPORT_FID) &&
 		    (mask & FAN_EVENT_ON_CHILD) && !(mask & FAN_ONDIR))
@@ -535,11 +535,17 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 		 *
 		 * For event on non-directory that is reported to parent, we
 		 * record the fid of the parent and the name of the child.
+		 *
+		 * Even if not reporting name, we need a variable length
+		 * fanotify_name_event if reporting both parent and child fids.
 		 */
-		if ((fid_mode & FAN_REPORT_NAME) &&
-		    ((mask & ALL_FSNOTIFY_DIRENT_EVENTS) ||
-		     !(mask & FAN_ONDIR)))
+		if (!(fid_mode & FAN_REPORT_NAME)) {
+			name_event = !!child;
+			file_name = NULL;
+		} else if ((mask & ALL_FSNOTIFY_DIRENT_EVENTS) ||
+			   !(mask & FAN_ONDIR)) {
 			name_event = true;
+		}
 	}
 
 	/*
@@ -558,7 +564,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 
 	if (fanotify_is_perm_event(mask)) {
 		event = fanotify_alloc_perm_event(path, gfp);
-	} else if (name_event && file_name) {
+	} else if (name_event && (file_name || child)) {
 		event = fanotify_alloc_name_event(id, fsid, file_name, child,
 						  gfp);
 	} else if (fid_mode) {
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 577ad74f71ec..24426146f41e 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -370,7 +370,8 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 
 	/* Event info records order is: dir fid + name, child fid */
 	if (dfh) {
-		info_type = FAN_EVENT_INFO_TYPE_DFID_NAME;
+		info_type = dfh->name_len ? FAN_EVENT_INFO_TYPE_DFID_NAME :
+					    FAN_EVENT_INFO_TYPE_DFID;
 		ret = copy_info_to_user(fanotify_event_fsid(event), dfh,
 					info_type, fanotify_fh_name(dfh),
 					dfh->name_len, buf, count);
@@ -950,18 +951,10 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 
 	/*
 	 * Child name is reported with parent fid so requires dir fid.
-	 * If reporting child name, we can report both child fid and dir fid.
+	 * We can report both child fid and dir fid with or without name.
 	 */
-	switch (fid_mode) {
-	case 0:
-	case FAN_REPORT_FID:
-	case FAN_REPORT_DIR_FID:
-	case FAN_REPORT_DFID_NAME:
-	case FAN_REPORT_DFID_NAME | FAN_REPORT_FID:
-		break;
-	default:
+	if ((fid_mode & FAN_REPORT_NAME) && !(fid_mode & FAN_REPORT_DIR_FID))
 		return -EINVAL;
-	}
 
 	user = get_current_user();
 	if (atomic_read(&user->fanotify_listeners) > FANOTIFY_DEFAULT_MAX_LISTENERS) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 03/10] fsnotify: send event to parent and child with single callback
  2020-07-02 12:57 ` [PATCH v4 03/10] fsnotify: send event to " Amir Goldstein
@ 2020-07-14 10:34   ` Jan Kara
  2020-07-14 11:54     ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Kara @ 2020-07-14 10:34 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Thu 02-07-20 15:57:37, Amir Goldstein wrote:
> Instead of calling fsnotify() twice, once with parent inode and once
> with child inode, if event should be sent to parent inode, send it
> with both parent and child inodes marks in object type iterator and call
> the backend handle_event() callback only once.
> 
> The parent inode is assigned to the standard "inode" iterator type and
> the child inode is assigned to the special "child" iterator type.
> 
> In that case, the bit FS_EVENT_ON_CHILD will be set in the event mask,
> the dir argment to handle_event will be the parent inode, the file_name
> argument to handle_event is non NULL and refers to the name of the child
> and the child inode can be accessed with fsnotify_data_inode().
> 
> This will allow fanotify to make decisions based on child or parent's
> ignored mask.  For example, when a parent is interested in a specific
> event on its children, but a specific child wishes to ignore this event,
> the event will not be reported.  This is not what happens with current
> code, but according to man page, it is the expected behavior.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

I like the direction where this is going. But can't we push it even a bit
further? I like the fact that we now have "one fs event" -> "one fsnotify()
call". Ideally I'd like to get rid of FS_EVENT_ON_CHILD in the event mask
because it's purpose seems very weak now and it complicates code (and now
it became even a bit of a misnomer) - intuitively, ->handle_event is now
passed sb, mnt, parent, child so it should have all the info to decide
where the event should be reported and I don't see a need for
FS_EVENT_ON_CHILD flag. With fsnotify() call itself we still use
FS_EVENT_ON_CHILD to determine what the arguments mean but can't we just
mandate that 'data' always points to the child, 'to_tell' is always the
parent dir if watching or NULL (and I'd rename that argument to 'dir' and
maybe move it after 'data_type' argument). What do you think?

Some further comments about the current implementation are below...

> ---
>  fs/kernfs/file.c     | 10 ++++++----
>  fs/notify/fsnotify.c | 40 ++++++++++++++++++++++++++--------------
>  2 files changed, 32 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
> index e23b3f62483c..5b1468bc509e 100644
> --- a/fs/kernfs/file.c
> +++ b/fs/kernfs/file.c
> @@ -883,6 +883,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
>  
>  	list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
>  		struct kernfs_node *parent;
> +		struct inode *p_inode = NULL;
>  		struct inode *inode;
>  		struct qstr name;
>  
> @@ -899,8 +900,6 @@ static void kernfs_notify_workfn(struct work_struct *work)
>  		name = (struct qstr)QSTR_INIT(kn->name, strlen(kn->name));
>  		parent = kernfs_get_parent(kn);
>  		if (parent) {
> -			struct inode *p_inode;
> -
>  			p_inode = ilookup(info->sb, kernfs_ino(parent));
>  			if (p_inode) {
>  				fsnotify(p_inode, FS_MODIFY | FS_EVENT_ON_CHILD,
> @@ -911,8 +910,11 @@ static void kernfs_notify_workfn(struct work_struct *work)
>  			kernfs_put(parent);
>  		}
>  
> -		fsnotify(inode, FS_MODIFY, inode, FSNOTIFY_EVENT_INODE,
> -			 NULL, 0);
> +		if (!p_inode) {
> +			fsnotify(inode, FS_MODIFY, inode, FSNOTIFY_EVENT_INODE,
> +				 NULL, 0);
> +		}
> +
>  		iput(inode);
>  	}
>  
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index 51ada3cfd2ff..7c6e624b24c9 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -145,15 +145,17 @@ void __fsnotify_update_child_dentry_flags(struct inode *inode)
>  /*
>   * Notify this dentry's parent about a child's events with child name info
>   * if parent is watching.
> - * Notify also the child without name info if child inode is watching.
> + * Notify only the child without name info if parent is not watching.
>   */
>  int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
>  		      int data_type)
>  {
> +	struct inode *inode = d_inode(dentry);
>  	struct dentry *parent;
>  	struct inode *p_inode;
>  	int ret = 0;
>  
> +	parent = NULL;
>  	if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
>  		goto notify_child;
>  
> @@ -165,23 +167,23 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
>  	} else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
>  		struct name_snapshot name;
>  
> -		/*
> -		 * We are notifying a parent, so set a flag in mask to inform
> -		 * backend that event has information about a child entry.
> -		 */
> +		/* When notifying parent, child should be passed as data */
> +		WARN_ON_ONCE(inode != fsnotify_data_inode(data, data_type));
> +
> +		/* Notify both parent and child with child name info */
>  		take_dentry_name_snapshot(&name, dentry);
>  		ret = fsnotify(p_inode, mask | FS_EVENT_ON_CHILD, data,
>  			       data_type, &name.name, 0);
>  		release_dentry_name_snapshot(&name);
> +	} else {
> +notify_child:
> +		/* Notify child without child name info */
> +		ret = fsnotify(inode, mask, data, data_type, NULL, 0);
>  	}

AFAICT this will miss notifying the child if the condition
	!fsnotify_inode_watches_children(p_inode)
above is true... And I've noticed this because jumping into a branch in an
if block is usually a bad idea and so I gave it a closer look. Exactly
because of problems like this. Usually it's better to restructure
conditions instead.

In this case I think we could structure the code like:
	struct name_snapshot name
	struct qstr *namestr = NULL;

	if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
		goto notify;
	parent = dget_parent(dentry);
	p_inode = parent->d_inode;

	if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
		__fsnotify_update_child_dentry_flags(p_inode);
	} else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
		take_dentry_name_snapshot(&name, dentry);
		namestr = &name.name;
		mask |= FS_EVENT_ON_CHILD;
	}
notify:
	ret = fsnotify(p_inode, mask, data, data_type, namestr, 0);
	if (namestr)
		release_dentry_name_snapshot(&name);
	dput(parent);
	return ret;
>  
>  	dput(parent);
>  
> -	if (ret)
> -		return ret;
> -
> -notify_child:
> -	return fsnotify(d_inode(dentry), mask, data, data_type, NULL, 0);
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(__fsnotify_parent);
>  
> @@ -322,12 +324,16 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
>  	struct super_block *sb = to_tell->i_sb;
>  	struct inode *dir = S_ISDIR(to_tell->i_mode) ? to_tell : NULL;
>  	struct mount *mnt = NULL;
> +	struct inode *child = NULL;
>  	int ret = 0;
>  	__u32 test_mask, marks_mask;
>  
>  	if (path)
>  		mnt = real_mount(path->mnt);
>  
> +	if (mask & FS_EVENT_ON_CHILD)
> +		child = fsnotify_data_inode(data, data_type);
> +
>  	/*
>  	 * Optimization: srcu_read_lock() has a memory barrier which can
>  	 * be expensive.  It protects walking the *_fsnotify_marks lists.
> @@ -336,21 +342,23 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
>  	 * need SRCU to keep them "alive".
>  	 */
>  	if (!to_tell->i_fsnotify_marks && !sb->s_fsnotify_marks &&
> -	    (!mnt || !mnt->mnt_fsnotify_marks))
> +	    (!mnt || !mnt->mnt_fsnotify_marks) &&
> +	    (!child || !child->i_fsnotify_marks))
>  		return 0;
>  
>  	/* An event "on child" is not intended for a mount/sb mark */
>  	marks_mask = to_tell->i_fsnotify_mask;
> -	if (!(mask & FS_EVENT_ON_CHILD)) {
> +	if (!child) {
>  		marks_mask |= sb->s_fsnotify_mask;
>  		if (mnt)
>  			marks_mask |= mnt->mnt_fsnotify_mask;
> +	} else {
> +		marks_mask |= child->i_fsnotify_mask;
>  	}

I don't think this is correct. The FS_EVENT_ON_CHILD events should
also be reported to sb & mnt marks because we now generate only
FS_EVENT_ON_CHILD if parent is watching but that doesn't mean e.g. sb
shouldn't receive the event. Am I missing something? If I'm indeed right,
maybe we should extend our LTP coverage a bit to catch breakage like
this...

>  	/*
>  	 * if this is a modify event we may need to clear the ignored masks
> -	 * otherwise return if neither the inode nor the vfsmount/sb care about
> -	 * this type of event.
> +	 * otherwise return if none of the marks care about this type of event.
>  	 */
>  	test_mask = (mask & ALL_FSNOTIFY_EVENTS);
>  	if (!(mask & FS_MODIFY) && !(test_mask & marks_mask))
> @@ -366,6 +374,10 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
>  		iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
>  			fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
>  	}
> +	if (child) {
> +		iter_info.marks[FSNOTIFY_OBJ_TYPE_CHILD] =
> +			fsnotify_first_mark(&child->i_fsnotify_marks);
> +	}
>  
>  	/*
>  	 * We need to merge inode/vfsmount/sb mark lists so that e.g. inode mark
> -- 
> 2.17.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 03/10] fsnotify: send event to parent and child with single callback
  2020-07-14 10:34   ` Jan Kara
@ 2020-07-14 11:54     ` Amir Goldstein
  2020-07-15 17:09       ` Jan Kara
  0 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2020-07-14 11:54 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Tue, Jul 14, 2020 at 1:34 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 02-07-20 15:57:37, Amir Goldstein wrote:
> > Instead of calling fsnotify() twice, once with parent inode and once
> > with child inode, if event should be sent to parent inode, send it
> > with both parent and child inodes marks in object type iterator and call
> > the backend handle_event() callback only once.
> >
> > The parent inode is assigned to the standard "inode" iterator type and
> > the child inode is assigned to the special "child" iterator type.
> >
> > In that case, the bit FS_EVENT_ON_CHILD will be set in the event mask,
> > the dir argment to handle_event will be the parent inode, the file_name
> > argument to handle_event is non NULL and refers to the name of the child
> > and the child inode can be accessed with fsnotify_data_inode().
> >
> > This will allow fanotify to make decisions based on child or parent's
> > ignored mask.  For example, when a parent is interested in a specific
> > event on its children, but a specific child wishes to ignore this event,
> > the event will not be reported.  This is not what happens with current
> > code, but according to man page, it is the expected behavior.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> I like the direction where this is going. But can't we push it even a bit
> further? I like the fact that we now have "one fs event" -> "one fsnotify()
> call". Ideally I'd like to get rid of FS_EVENT_ON_CHILD in the event mask
> because it's purpose seems very weak now and it complicates code (and now

Can you give an example where it complicates the code?
Don't confuse this with the code in fanotify_user.c that subscribes for events
on child/with name.

In one before the last patch of the series I am testing FS_EVENT_ON_CHILD
in mask to know how to report the event inside fanotify_alloc_event().
I may be able to carry this information not in mask, but the flag space is
already taken anyway by FAN_EVENT_ON_CHILD input arg, so not sure
what is there to gain from not setting FS_EVENT_ON_CHILD.

> it became even a bit of a misnomer) - intuitively, ->handle_event is now

I thought of changing the name to FS_EVENT_WITH_NAME, but that was
confusing because create/delete are also events with a name.
Maybe FS_EVENT_WITH_CHILD_NAME :-/

> passed sb, mnt, parent, child so it should have all the info to decide
> where the event should be reported and I don't see a need for
> FS_EVENT_ON_CHILD flag.

Do you mean something like this?

        const struct path *inode = fsnotify_data_inode(data, data_type);
        bool event_on_child = !!file_name && dir != inode;

It may be true that all information is there, but I think the above is
a bit ugly and quite not trivial to explain, whereas the flag is quite
intuitive (to me) and adds no extra complexity (IMO).

> With fsnotify() call itself we still use
> FS_EVENT_ON_CHILD to determine what the arguments mean but can't we just
> mandate that 'data' always points to the child, 'to_tell' is always the
> parent dir if watching or NULL (and I'd rename that argument to 'dir' and
> maybe move it after 'data_type' argument). What do you think?
>

I think what you are missing is the calls from fsnotify_name().
For those calls, data is the dir and so is to_tell and name is the entry name.

> Some further comments about the current implementation are below...
>
> > ---
> >  fs/kernfs/file.c     | 10 ++++++----
> >  fs/notify/fsnotify.c | 40 ++++++++++++++++++++++++++--------------
> >  2 files changed, 32 insertions(+), 18 deletions(-)
> >
> > diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
> > index e23b3f62483c..5b1468bc509e 100644
> > --- a/fs/kernfs/file.c
> > +++ b/fs/kernfs/file.c
> > @@ -883,6 +883,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
> >
> >       list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
> >               struct kernfs_node *parent;
> > +             struct inode *p_inode = NULL;
> >               struct inode *inode;
> >               struct qstr name;
> >
> > @@ -899,8 +900,6 @@ static void kernfs_notify_workfn(struct work_struct *work)
> >               name = (struct qstr)QSTR_INIT(kn->name, strlen(kn->name));
> >               parent = kernfs_get_parent(kn);
> >               if (parent) {
> > -                     struct inode *p_inode;
> > -
> >                       p_inode = ilookup(info->sb, kernfs_ino(parent));
> >                       if (p_inode) {
> >                               fsnotify(p_inode, FS_MODIFY | FS_EVENT_ON_CHILD,
> > @@ -911,8 +910,11 @@ static void kernfs_notify_workfn(struct work_struct *work)
> >                       kernfs_put(parent);
> >               }
> >
> > -             fsnotify(inode, FS_MODIFY, inode, FSNOTIFY_EVENT_INODE,
> > -                      NULL, 0);
> > +             if (!p_inode) {
> > +                     fsnotify(inode, FS_MODIFY, inode, FSNOTIFY_EVENT_INODE,
> > +                              NULL, 0);
> > +             }
> > +
> >               iput(inode);
> >       }
> >
> > diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> > index 51ada3cfd2ff..7c6e624b24c9 100644
> > --- a/fs/notify/fsnotify.c
> > +++ b/fs/notify/fsnotify.c
> > @@ -145,15 +145,17 @@ void __fsnotify_update_child_dentry_flags(struct inode *inode)
> >  /*
> >   * Notify this dentry's parent about a child's events with child name info
> >   * if parent is watching.
> > - * Notify also the child without name info if child inode is watching.
> > + * Notify only the child without name info if parent is not watching.
> >   */
> >  int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> >                     int data_type)
> >  {
> > +     struct inode *inode = d_inode(dentry);
> >       struct dentry *parent;
> >       struct inode *p_inode;
> >       int ret = 0;
> >
> > +     parent = NULL;
> >       if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
> >               goto notify_child;
> >
> > @@ -165,23 +167,23 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> >       } else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
> >               struct name_snapshot name;
> >
> > -             /*
> > -              * We are notifying a parent, so set a flag in mask to inform
> > -              * backend that event has information about a child entry.
> > -              */
> > +             /* When notifying parent, child should be passed as data */
> > +             WARN_ON_ONCE(inode != fsnotify_data_inode(data, data_type));
> > +
> > +             /* Notify both parent and child with child name info */
> >               take_dentry_name_snapshot(&name, dentry);
> >               ret = fsnotify(p_inode, mask | FS_EVENT_ON_CHILD, data,
> >                              data_type, &name.name, 0);
> >               release_dentry_name_snapshot(&name);
> > +     } else {
> > +notify_child:
> > +             /* Notify child without child name info */
> > +             ret = fsnotify(inode, mask, data, data_type, NULL, 0);
> >       }
>
> AFAICT this will miss notifying the child if the condition
>         !fsnotify_inode_watches_children(p_inode)
> above is true... And I've noticed this because jumping into a branch in an
> if block is usually a bad idea and so I gave it a closer look. Exactly
> because of problems like this. Usually it's better to restructure
> conditions instead.
>
> In this case I think we could structure the code like:
>         struct name_snapshot name
>         struct qstr *namestr = NULL;
>
>         if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
>                 goto notify;
>         parent = dget_parent(dentry);
>         p_inode = parent->d_inode;
>
>         if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
>                 __fsnotify_update_child_dentry_flags(p_inode);
>         } else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
>                 take_dentry_name_snapshot(&name, dentry);
>                 namestr = &name.name;
>                 mask |= FS_EVENT_ON_CHILD;
>         }
> notify:
>         ret = fsnotify(p_inode, mask, data, data_type, namestr, 0);
>         if (namestr)
>                 release_dentry_name_snapshot(&name);
>         dput(parent);
>         return ret;

I will look into this proposal, but please be aware that this function
completely changes
in the next patch "send event with parent/name info to sb/mount/non-dir marks",
so some of the things that look weird here or possibly even bugs might go away.
That is not to say that I won't fix them, but please review with the
next patch in mind
when considering reconstruct.

> >
> >       dput(parent);
> >
> > -     if (ret)
> > -             return ret;
> > -
> > -notify_child:
> > -     return fsnotify(d_inode(dentry), mask, data, data_type, NULL, 0);
> > +     return ret;
> >  }
> >  EXPORT_SYMBOL_GPL(__fsnotify_parent);
> >
> > @@ -322,12 +324,16 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
> >       struct super_block *sb = to_tell->i_sb;
> >       struct inode *dir = S_ISDIR(to_tell->i_mode) ? to_tell : NULL;
> >       struct mount *mnt = NULL;
> > +     struct inode *child = NULL;
> >       int ret = 0;
> >       __u32 test_mask, marks_mask;
> >
> >       if (path)
> >               mnt = real_mount(path->mnt);
> >
> > +     if (mask & FS_EVENT_ON_CHILD)
> > +             child = fsnotify_data_inode(data, data_type);
> > +
> >       /*
> >        * Optimization: srcu_read_lock() has a memory barrier which can
> >        * be expensive.  It protects walking the *_fsnotify_marks lists.
> > @@ -336,21 +342,23 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
> >        * need SRCU to keep them "alive".
> >        */
> >       if (!to_tell->i_fsnotify_marks && !sb->s_fsnotify_marks &&
> > -         (!mnt || !mnt->mnt_fsnotify_marks))
> > +         (!mnt || !mnt->mnt_fsnotify_marks) &&
> > +         (!child || !child->i_fsnotify_marks))
> >               return 0;
> >
> >       /* An event "on child" is not intended for a mount/sb mark */
> >       marks_mask = to_tell->i_fsnotify_mask;
> > -     if (!(mask & FS_EVENT_ON_CHILD)) {
> > +     if (!child) {
> >               marks_mask |= sb->s_fsnotify_mask;
> >               if (mnt)
> >                       marks_mask |= mnt->mnt_fsnotify_mask;
> > +     } else {
> > +             marks_mask |= child->i_fsnotify_mask;
> >       }
>
> I don't think this is correct. The FS_EVENT_ON_CHILD events should
> also be reported to sb & mnt marks because we now generate only
> FS_EVENT_ON_CHILD if parent is watching but that doesn't mean e.g. sb
> shouldn't receive the event. Am I missing something? If I'm indeed right,
> maybe we should extend our LTP coverage a bit to catch breakage like
> this...
>

You are right.
I will see if the bug is interim or stays with the next patch and will check
if LTP coverage covers this mid series breakage.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 04/10] fsnotify: send event with parent/name info to sb/mount/non-dir marks
  2020-07-02 12:57 ` [PATCH v4 04/10] fsnotify: send event with parent/name info to sb/mount/non-dir marks Amir Goldstein
@ 2020-07-14 11:54   ` Jan Kara
  2020-07-14 12:17     ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Kara @ 2020-07-14 11:54 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Thu 02-07-20 15:57:38, Amir Goldstein wrote:
> Similar to events "on child" to watching directory, send event "on child"
> with parent/name info if sb/mount/non-dir marks are interested in
> parent/name info.
> 
> The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify
> interest in parent/name info for events on non-directory inodes.
> 
> Events on "orphan" children (disconnected dentries) are sent without
> parent/name info.
> 
> Events on direcories are send with parent/name info only if the parent
> directory is watching.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/notify/fsnotify.c             | 50 +++++++++++++++++++++++---------
>  include/linux/fsnotify.h         | 10 +++++--
>  include/linux/fsnotify_backend.h | 32 +++++++++++++++++---
>  3 files changed, 73 insertions(+), 19 deletions(-)
> 
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index 7c6e624b24c9..6683c77a5b13 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -144,27 +144,55 @@ void __fsnotify_update_child_dentry_flags(struct inode *inode)
>  
>  /*
>   * Notify this dentry's parent about a child's events with child name info
> - * if parent is watching.
> - * Notify only the child without name info if parent is not watching.
> + * if parent is watching or if inode/sb/mount are interested in events with
> + * parent and name info.
> + *
> + * Notify only the child without name info if parent is not watching and
> + * inode/sb/mount are not interested in events with parent and name info.
>   */
>  int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
>  		      int data_type)
>  {

Ugh, I have to say this function is now pretty difficult to digest. I was
staring at it for an hour before I could make some sence of it...

> +	const struct path *path = fsnotify_data_path(data, data_type);
> +	struct mount *mnt = path ? real_mount(path->mnt) : NULL;
>  	struct inode *inode = d_inode(dentry);
>  	struct dentry *parent;
> +	bool parent_watched = dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED;
> +	__u32 p_mask, test_mask, marks_mask = 0;
>  	struct inode *p_inode;
>  	int ret = 0;
>  
> +	/*
> +	 * Do inode/sb/mount care about parent and name info on non-dir?
> +	 * Do they care about any event at all?
> +	 */
> +	if (!inode->i_fsnotify_marks && !inode->i_sb->s_fsnotify_marks &&
> +	    (!mnt || !mnt->mnt_fsnotify_marks)) {
> +		if (!parent_watched)
> +			return 0;
> +	} else if (!(mask & FS_ISDIR) && !IS_ROOT(dentry)) {
> +		marks_mask |= fsnotify_want_parent(inode->i_fsnotify_mask);
> +		marks_mask |= fsnotify_want_parent(inode->i_sb->s_fsnotify_mask);
> +		if (mnt)
> +			marks_mask |= fsnotify_want_parent(mnt->mnt_fsnotify_mask);
> +	}

OK, so AFAIU at this point (mask & marks_mask) tells us whether we need to
grab parent because some mark needs parent into reported. Correct?

Maybe I'd rename fsnotify_want_parent() (which seems like it returns bool)
to fsnotify_parent_needed_mask() or something like that. Also I'd hide all
those checks in a helper function like:

	fsnotify_event_needs_parent(mnt, inode, mask)

So we'd then have just something like:
	if (!inode->i_fsnotify_marks && !inode->i_sb->s_fsnotify_marks &&
	    (!mnt || !mnt->mnt_fsnotify_marks) && !parent_watched)
		return 0;
	if (!parent_watched && !fsnotify_event_needs_parent(mnt, inode, mask))
		goto notify_child;
	...

> +
>  	parent = NULL;
> -	if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
> +	test_mask = mask & FS_EVENTS_POSS_TO_PARENT;
> +	if (!(marks_mask & test_mask) && !parent_watched)
>  		goto notify_child;
>  
> +	/* Does parent inode care about events on children? */
>  	parent = dget_parent(dentry);
>  	p_inode = parent->d_inode;
> +	p_mask = fsnotify_inode_watches_children(p_inode);
>  
> -	if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
> +	if (p_mask)
> +		marks_mask |= p_mask;
> +	else if (unlikely(parent_watched))
>  		__fsnotify_update_child_dentry_flags(p_inode);
> -	} else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
> +
> +	if ((marks_mask & test_mask) && p_inode != inode) {
					^^ this is effectively
!IS_ROOT(dentry), isn't it? But since you've checked that above can it ever
be that p_inode == inode?

Also if marks_mask & test_mask == 0, why do you go to a branch that
notifies child? There shouldn't be anything to report there either. Am I
missing something? ... Oh, I see, the FS_EVENTS_POSS_TO_PARENT masking
above could cause that child is still interested. Because mask & marks_mask
can still contain something. Maybe it would be more comprehensible if we
restructure the above like:

	p_mask = fsnotify_inode_watches_children(p_inode);
	if (unlikely(parent_watched && !p_mask))
		__fsnotify_update_child_dentry_flags(p_inode);
	/*
	 * Include parent in notification either if some notification
	 * groups require parent info (!parent_watched case) or the parent is
	 * interested in this event.
	 */
	if (!parent_watched || (mask & p_mask & ALL_FSNOTIFY_EVENTS)) {
		...
	}

>  		struct name_snapshot name;
>  
>  		/* When notifying parent, child should be passed as data */
> @@ -346,15 +374,11 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
>  	    (!child || !child->i_fsnotify_marks))
>  		return 0;
>  
> -	/* An event "on child" is not intended for a mount/sb mark */
> -	marks_mask = to_tell->i_fsnotify_mask;
> -	if (!child) {
> -		marks_mask |= sb->s_fsnotify_mask;
> -		if (mnt)
> -			marks_mask |= mnt->mnt_fsnotify_mask;
> -	} else {
> +	marks_mask = to_tell->i_fsnotify_mask | sb->s_fsnotify_mask;
> +	if (mnt)
> +		marks_mask |= mnt->mnt_fsnotify_mask;
> +	if (child)
>  		marks_mask |= child->i_fsnotify_mask;
> -	}

This hunk seems to belong to the previous patch... It fixes the problem
I've spotted there.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 05/10] fsnotify: send MOVE_SELF event with parent/name info
  2020-07-02 12:57 ` [PATCH v4 05/10] fsnotify: send MOVE_SELF event with parent/name info Amir Goldstein
@ 2020-07-14 12:13   ` Jan Kara
  2020-07-14 12:44     ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Kara @ 2020-07-14 12:13 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Thu 02-07-20 15:57:39, Amir Goldstein wrote:
> MOVE_SELF event does not get reported to a parent watching children
> when a child is moved, but it can be reported to sb/mount mark with
> parent/name info if group is interested in parent/name info.
> 
> Use the fsnotify_parent() helper to send a MOVE_SELF event and adjust
> fsnotify() to handle the case of an event "on child" that should not
> be sent to the watching parent's inode mark.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/notify/fsnotify.c             | 21 +++++++++++++++++----
>  include/linux/fsnotify.h         |  5 +----
>  include/linux/fsnotify_backend.h |  2 +-
>  3 files changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index 6683c77a5b13..0faf5b09a73e 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -352,6 +352,7 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
>  	struct super_block *sb = to_tell->i_sb;
>  	struct inode *dir = S_ISDIR(to_tell->i_mode) ? to_tell : NULL;
>  	struct mount *mnt = NULL;
> +	struct inode *inode = NULL;
>  	struct inode *child = NULL;
>  	int ret = 0;
>  	__u32 test_mask, marks_mask;
> @@ -362,6 +363,14 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
>  	if (mask & FS_EVENT_ON_CHILD)
>  		child = fsnotify_data_inode(data, data_type);
>  
> +	/*
> +	 * If event is "on child" then to_tell is a watching parent.
> +	 * An event "on child" may be sent to mount/sb mark with parent/name
> +	 * info, but not appropriate for watching parent (e.g. FS_MOVE_SELF).
> +	 */
> +	if (!child || (mask & FS_EVENTS_POSS_ON_CHILD))
> +		inode = to_tell;

I'm now confused. Don't you want to fill in FSNOTIFY_OBJ_TYPE_INODE below
for FS_MOVE_SELF event? But this condition is false for it so you won't do
it?

> +
>  	/*
>  	 * Optimization: srcu_read_lock() has a memory barrier which can
>  	 * be expensive.  It protects walking the *_fsnotify_marks lists.
> @@ -369,14 +378,17 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
>  	 * SRCU because we have no references to any objects and do not
>  	 * need SRCU to keep them "alive".
>  	 */
> -	if (!to_tell->i_fsnotify_marks && !sb->s_fsnotify_marks &&
> +	if (!sb->s_fsnotify_marks &&
>  	    (!mnt || !mnt->mnt_fsnotify_marks) &&
> +	    (!inode || !inode->i_fsnotify_marks) &&
>  	    (!child || !child->i_fsnotify_marks))
>  		return 0;
>  
> -	marks_mask = to_tell->i_fsnotify_mask | sb->s_fsnotify_mask;
> +	marks_mask = sb->s_fsnotify_mask;
>  	if (mnt)
>  		marks_mask |= mnt->mnt_fsnotify_mask;
> +	if (inode)
> +		marks_mask |= inode->i_fsnotify_mask;
>  	if (child)
>  		marks_mask |= child->i_fsnotify_mask;
>  
> @@ -390,14 +402,15 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
>  
>  	iter_info.srcu_idx = srcu_read_lock(&fsnotify_mark_srcu);
>  
> -	iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
> -		fsnotify_first_mark(&to_tell->i_fsnotify_marks);
>  	iter_info.marks[FSNOTIFY_OBJ_TYPE_SB] =
>  		fsnotify_first_mark(&sb->s_fsnotify_marks);
>  	if (mnt) {
>  		iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
>  			fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
>  	}
> +	if (inode)
> +		iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
> +			fsnotify_first_mark(&inode->i_fsnotify_marks);
>  	if (child) {
>  		iter_info.marks[FSNOTIFY_OBJ_TYPE_CHILD] =
>  			fsnotify_first_mark(&child->i_fsnotify_marks);
> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index 044cae3a0628..61dccaf21e7b 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -131,7 +131,6 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
>  	u32 fs_cookie = fsnotify_get_cookie();
>  	__u32 old_dir_mask = FS_MOVED_FROM;
>  	__u32 new_dir_mask = FS_MOVED_TO;
> -	__u32 mask = FS_MOVE_SELF;
>  	const struct qstr *new_name = &moved->d_name;
>  
>  	if (old_dir == new_dir)
> @@ -140,7 +139,6 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
>  	if (isdir) {
>  		old_dir_mask |= FS_ISDIR;
>  		new_dir_mask |= FS_ISDIR;
> -		mask |= FS_ISDIR;
>  	}
>  
>  	fsnotify_name(old_dir, old_dir_mask, source, old_name, fs_cookie);
> @@ -149,8 +147,7 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
>  	if (target)
>  		fsnotify_link_count(target);
>  
> -	if (source)
> -		fsnotify(source, mask, source, FSNOTIFY_EVENT_INODE, NULL, 0);
> +	fsnotify_dentry(moved, FS_MOVE_SELF);

I'm somewhat unsure about this. Does this mean that 'moved' is guaranteed
to be positive or that you've made sure that all the code below
fsnotify_dentry() is actually fine with a negative dentry? I don't find
either trivial to verify so some note in a changelog or maybe even a
separate patch for this would be useful.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 04/10] fsnotify: send event with parent/name info to sb/mount/non-dir marks
  2020-07-14 11:54   ` Jan Kara
@ 2020-07-14 12:17     ` Amir Goldstein
  2020-07-14 15:31       ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2020-07-14 12:17 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Tue, Jul 14, 2020 at 2:54 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 02-07-20 15:57:38, Amir Goldstein wrote:
> > Similar to events "on child" to watching directory, send event "on child"
> > with parent/name info if sb/mount/non-dir marks are interested in
> > parent/name info.
> >
> > The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify
> > interest in parent/name info for events on non-directory inodes.
> >
> > Events on "orphan" children (disconnected dentries) are sent without
> > parent/name info.
> >
> > Events on direcories are send with parent/name info only if the parent
> > directory is watching.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  fs/notify/fsnotify.c             | 50 +++++++++++++++++++++++---------
> >  include/linux/fsnotify.h         | 10 +++++--
> >  include/linux/fsnotify_backend.h | 32 +++++++++++++++++---
> >  3 files changed, 73 insertions(+), 19 deletions(-)
> >
> > diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> > index 7c6e624b24c9..6683c77a5b13 100644
> > --- a/fs/notify/fsnotify.c
> > +++ b/fs/notify/fsnotify.c
> > @@ -144,27 +144,55 @@ void __fsnotify_update_child_dentry_flags(struct inode *inode)
> >
> >  /*
> >   * Notify this dentry's parent about a child's events with child name info
> > - * if parent is watching.
> > - * Notify only the child without name info if parent is not watching.
> > + * if parent is watching or if inode/sb/mount are interested in events with
> > + * parent and name info.
> > + *
> > + * Notify only the child without name info if parent is not watching and
> > + * inode/sb/mount are not interested in events with parent and name info.
> >   */
> >  int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> >                     int data_type)
> >  {
>
> Ugh, I have to say this function is now pretty difficult to digest. I was
> staring at it for an hour before I could make some sence of it...
>

Yeh, it took many trials to get to something that does the job,
but it could use some cosmetic changes ;-)

> > +     const struct path *path = fsnotify_data_path(data, data_type);
> > +     struct mount *mnt = path ? real_mount(path->mnt) : NULL;
> >       struct inode *inode = d_inode(dentry);
> >       struct dentry *parent;
> > +     bool parent_watched = dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED;
> > +     __u32 p_mask, test_mask, marks_mask = 0;
> >       struct inode *p_inode;
> >       int ret = 0;
> >
> > +     /*
> > +      * Do inode/sb/mount care about parent and name info on non-dir?
> > +      * Do they care about any event at all?
> > +      */
> > +     if (!inode->i_fsnotify_marks && !inode->i_sb->s_fsnotify_marks &&
> > +         (!mnt || !mnt->mnt_fsnotify_marks)) {
> > +             if (!parent_watched)
> > +                     return 0;
> > +     } else if (!(mask & FS_ISDIR) && !IS_ROOT(dentry)) {
> > +             marks_mask |= fsnotify_want_parent(inode->i_fsnotify_mask);
> > +             marks_mask |= fsnotify_want_parent(inode->i_sb->s_fsnotify_mask);
> > +             if (mnt)
> > +                     marks_mask |= fsnotify_want_parent(mnt->mnt_fsnotify_mask);
> > +     }
>
> OK, so AFAIU at this point (mask & marks_mask) tells us whether we need to
> grab parent because some mark needs parent into reported. Correct?
>
> Maybe I'd rename fsnotify_want_parent() (which seems like it returns bool)
> to fsnotify_parent_needed_mask() or something like that. Also I'd hide all
> those checks in a helper function like:
>
>         fsnotify_event_needs_parent(mnt, inode, mask)
>
> So we'd then have just something like:
>         if (!inode->i_fsnotify_marks && !inode->i_sb->s_fsnotify_marks &&
>             (!mnt || !mnt->mnt_fsnotify_marks) && !parent_watched)
>                 return 0;
>         if (!parent_watched && !fsnotify_event_needs_parent(mnt, inode, mask))
>                 goto notify_child;

OK, but we'd still need to store marks_mask for later in case event
needs parent.


>         ...
>
> > +
> >       parent = NULL;
> > -     if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
> > +     test_mask = mask & FS_EVENTS_POSS_TO_PARENT;
> > +     if (!(marks_mask & test_mask) && !parent_watched)
> >               goto notify_child;
> >
> > +     /* Does parent inode care about events on children? */
> >       parent = dget_parent(dentry);
> >       p_inode = parent->d_inode;
> > +     p_mask = fsnotify_inode_watches_children(p_inode);
> >
> > -     if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
> > +     if (p_mask)
> > +             marks_mask |= p_mask;
> > +     else if (unlikely(parent_watched))
> >               __fsnotify_update_child_dentry_flags(p_inode);
> > -     } else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
> > +
> > +     if ((marks_mask & test_mask) && p_inode != inode) {
>                                         ^^ this is effectively
> !IS_ROOT(dentry), isn't it? But since you've checked that above can it ever
> be that p_inode == inode?
>

True, but only if you add the hidden assumption (which should be true) that
DCACHE_FSNOTIFY_PARENT_WATCHED cannot be set on an IS_ROOT
dentry. Otherwise you can have parent_watched and not check IS_ROOT()
so prefered defensive here, but I don't mind dropping it.

> Also if marks_mask & test_mask == 0, why do you go to a branch that
> notifies child? There shouldn't be anything to report there either. Am I
> missing something? ... Oh, I see, the FS_EVENTS_POSS_TO_PARENT masking
> above could cause that child is still interested. Because mask & marks_mask
> can still contain something. Maybe it would be more comprehensible if we
> restructure the above like:
>
>         p_mask = fsnotify_inode_watches_children(p_inode);
>         if (unlikely(parent_watched && !p_mask))
>                 __fsnotify_update_child_dentry_flags(p_inode);
>         /*
>          * Include parent in notification either if some notification
>          * groups require parent info (!parent_watched case) or the parent is
>          * interested in this event.
>          */
>         if (!parent_watched || (mask & p_mask & ALL_FSNOTIFY_EVENTS)) {
>                 ...
>         }
>

Sure!

> >               struct name_snapshot name;
> >
> >               /* When notifying parent, child should be passed as data */
> > @@ -346,15 +374,11 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
> >           (!child || !child->i_fsnotify_marks))
> >               return 0;
> >
> > -     /* An event "on child" is not intended for a mount/sb mark */
> > -     marks_mask = to_tell->i_fsnotify_mask;
> > -     if (!child) {
> > -             marks_mask |= sb->s_fsnotify_mask;
> > -             if (mnt)
> > -                     marks_mask |= mnt->mnt_fsnotify_mask;
> > -     } else {
> > +     marks_mask = to_tell->i_fsnotify_mask | sb->s_fsnotify_mask;
> > +     if (mnt)
> > +             marks_mask |= mnt->mnt_fsnotify_mask;
> > +     if (child)
> >               marks_mask |= child->i_fsnotify_mask;
> > -     }
>
> This hunk seems to belong to the previous patch... It fixes the problem
> I've spotted there.

I figured that we would find it here :-)
I will fix the previous patch.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 05/10] fsnotify: send MOVE_SELF event with parent/name info
  2020-07-14 12:13   ` Jan Kara
@ 2020-07-14 12:44     ` Amir Goldstein
  0 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2020-07-14 12:44 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Tue, Jul 14, 2020 at 3:13 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 02-07-20 15:57:39, Amir Goldstein wrote:
> > MOVE_SELF event does not get reported to a parent watching children
> > when a child is moved, but it can be reported to sb/mount mark with
> > parent/name info if group is interested in parent/name info.
> >
> > Use the fsnotify_parent() helper to send a MOVE_SELF event and adjust
> > fsnotify() to handle the case of an event "on child" that should not
> > be sent to the watching parent's inode mark.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  fs/notify/fsnotify.c             | 21 +++++++++++++++++----
> >  include/linux/fsnotify.h         |  5 +----
> >  include/linux/fsnotify_backend.h |  2 +-
> >  3 files changed, 19 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> > index 6683c77a5b13..0faf5b09a73e 100644
> > --- a/fs/notify/fsnotify.c
> > +++ b/fs/notify/fsnotify.c
> > @@ -352,6 +352,7 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
> >       struct super_block *sb = to_tell->i_sb;
> >       struct inode *dir = S_ISDIR(to_tell->i_mode) ? to_tell : NULL;
> >       struct mount *mnt = NULL;
> > +     struct inode *inode = NULL;
> >       struct inode *child = NULL;
> >       int ret = 0;
> >       __u32 test_mask, marks_mask;
> > @@ -362,6 +363,14 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
> >       if (mask & FS_EVENT_ON_CHILD)
> >               child = fsnotify_data_inode(data, data_type);
> >
> > +     /*
> > +      * If event is "on child" then to_tell is a watching parent.
> > +      * An event "on child" may be sent to mount/sb mark with parent/name
> > +      * info, but not appropriate for watching parent (e.g. FS_MOVE_SELF).
> > +      */
> > +     if (!child || (mask & FS_EVENTS_POSS_ON_CHILD))
> > +             inode = to_tell;
>
> I'm now confused. Don't you want to fill in FSNOTIFY_OBJ_TYPE_INODE below
> for FS_MOVE_SELF event? But this condition is false for it so you won't do
> it?
>

I do not.
For events with the flag FS_EVENT_ON_CHILD, the inode in
FSNOTIFY_OBJ_TYPE_INODE is always the parent and the inode in
FSNOTIFY_OBJ_TYPE_CHILD is always the child.
So FS_MOVE_SELF will be reported if sb/mount are watching
or if child inode is watching, but NOT if only parent inode is watching.

I realize I may have been able to make other choices, but seemed like
the most consistent choice to me.
If you see a better option, let me know.

> > +
> >       /*
> >        * Optimization: srcu_read_lock() has a memory barrier which can
> >        * be expensive.  It protects walking the *_fsnotify_marks lists.
> > @@ -369,14 +378,17 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
> >        * SRCU because we have no references to any objects and do not
> >        * need SRCU to keep them "alive".
> >        */
> > -     if (!to_tell->i_fsnotify_marks && !sb->s_fsnotify_marks &&
> > +     if (!sb->s_fsnotify_marks &&
> >           (!mnt || !mnt->mnt_fsnotify_marks) &&
> > +         (!inode || !inode->i_fsnotify_marks) &&
> >           (!child || !child->i_fsnotify_marks))
> >               return 0;
> >
> > -     marks_mask = to_tell->i_fsnotify_mask | sb->s_fsnotify_mask;
> > +     marks_mask = sb->s_fsnotify_mask;
> >       if (mnt)
> >               marks_mask |= mnt->mnt_fsnotify_mask;
> > +     if (inode)
> > +             marks_mask |= inode->i_fsnotify_mask;
> >       if (child)
> >               marks_mask |= child->i_fsnotify_mask;
> >
> > @@ -390,14 +402,15 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_type,
> >
> >       iter_info.srcu_idx = srcu_read_lock(&fsnotify_mark_srcu);
> >
> > -     iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
> > -             fsnotify_first_mark(&to_tell->i_fsnotify_marks);
> >       iter_info.marks[FSNOTIFY_OBJ_TYPE_SB] =
> >               fsnotify_first_mark(&sb->s_fsnotify_marks);
> >       if (mnt) {
> >               iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
> >                       fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
> >       }
> > +     if (inode)
> > +             iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
> > +                     fsnotify_first_mark(&inode->i_fsnotify_marks);
> >       if (child) {
> >               iter_info.marks[FSNOTIFY_OBJ_TYPE_CHILD] =
> >                       fsnotify_first_mark(&child->i_fsnotify_marks);
> > diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> > index 044cae3a0628..61dccaf21e7b 100644
> > --- a/include/linux/fsnotify.h
> > +++ b/include/linux/fsnotify.h
> > @@ -131,7 +131,6 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
> >       u32 fs_cookie = fsnotify_get_cookie();
> >       __u32 old_dir_mask = FS_MOVED_FROM;
> >       __u32 new_dir_mask = FS_MOVED_TO;
> > -     __u32 mask = FS_MOVE_SELF;
> >       const struct qstr *new_name = &moved->d_name;
> >
> >       if (old_dir == new_dir)
> > @@ -140,7 +139,6 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
> >       if (isdir) {
> >               old_dir_mask |= FS_ISDIR;
> >               new_dir_mask |= FS_ISDIR;
> > -             mask |= FS_ISDIR;
> >       }
> >
> >       fsnotify_name(old_dir, old_dir_mask, source, old_name, fs_cookie);
> > @@ -149,8 +147,7 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
> >       if (target)
> >               fsnotify_link_count(target);
> >
> > -     if (source)
> > -             fsnotify(source, mask, source, FSNOTIFY_EVENT_INODE, NULL, 0);
> > +     fsnotify_dentry(moved, FS_MOVE_SELF);
>
> I'm somewhat unsure about this. Does this mean that 'moved' is guaranteed
> to be positive or that you've made sure that all the code below
> fsnotify_dentry() is actually fine with a negative dentry? I don't find
> either trivial to verify so some note in a changelog or maybe even a
> separate patch for this would be useful.
>

Oh, it's true. I should've mentioned it or separate this change.
I guess my reaction was the opposite of yours - it seemed obvious to
me that moved
dentry is positive - vfs_rename() is called under lock_rename() and it seemed
obvious to me that callers verified positive source, but in any case,
vfs_rename()
starts with may_delete() that verifies positive rename victim and
debugfs_rename(), the other caller of fsnotify_move() also verifies
positive victim.

I will add this information to the commit message.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 04/10] fsnotify: send event with parent/name info to sb/mount/non-dir marks
  2020-07-14 12:17     ` Amir Goldstein
@ 2020-07-14 15:31       ` Amir Goldstein
  0 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2020-07-14 15:31 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

> > > +     const struct path *path = fsnotify_data_path(data, data_type);
> > > +     struct mount *mnt = path ? real_mount(path->mnt) : NULL;
> > >       struct inode *inode = d_inode(dentry);
> > >       struct dentry *parent;
> > > +     bool parent_watched = dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED;
> > > +     __u32 p_mask, test_mask, marks_mask = 0;
> > >       struct inode *p_inode;
> > >       int ret = 0;
> > >
> > > +     /*
> > > +      * Do inode/sb/mount care about parent and name info on non-dir?
> > > +      * Do they care about any event at all?
> > > +      */
> > > +     if (!inode->i_fsnotify_marks && !inode->i_sb->s_fsnotify_marks &&
> > > +         (!mnt || !mnt->mnt_fsnotify_marks)) {
> > > +             if (!parent_watched)
> > > +                     return 0;
> > > +     } else if (!(mask & FS_ISDIR) && !IS_ROOT(dentry)) {
> > > +             marks_mask |= fsnotify_want_parent(inode->i_fsnotify_mask);
> > > +             marks_mask |= fsnotify_want_parent(inode->i_sb->s_fsnotify_mask);
> > > +             if (mnt)
> > > +                     marks_mask |= fsnotify_want_parent(mnt->mnt_fsnotify_mask);
> > > +     }
> >
> > OK, so AFAIU at this point (mask & marks_mask) tells us whether we need to
> > grab parent because some mark needs parent into reported. Correct?
> >
> > Maybe I'd rename fsnotify_want_parent() (which seems like it returns bool)
> > to fsnotify_parent_needed_mask() or something like that. Also I'd hide all
> > those checks in a helper function like:
> >
> >         fsnotify_event_needs_parent(mnt, inode, mask)
> >
> > So we'd then have just something like:
> >         if (!inode->i_fsnotify_marks && !inode->i_sb->s_fsnotify_marks &&
> >             (!mnt || !mnt->mnt_fsnotify_marks) && !parent_watched)
> >                 return 0;
> >         if (!parent_watched && !fsnotify_event_needs_parent(mnt, inode, mask))
> >                 goto notify_child;
>
> OK, but we'd still need to store marks_mask for later in case event
> needs parent.
>

Nevermind, I understand now what you meant.
Changed the function according to your guidance and looks much better.
Fits now in one terminal screen :-)

>
> >         ...
> >
> > > +
> > >       parent = NULL;
> > > -     if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
> > > +     test_mask = mask & FS_EVENTS_POSS_TO_PARENT;
> > > +     if (!(marks_mask & test_mask) && !parent_watched)
> > >               goto notify_child;
> > >
> > > +     /* Does parent inode care about events on children? */
> > >       parent = dget_parent(dentry);
> > >       p_inode = parent->d_inode;
> > > +     p_mask = fsnotify_inode_watches_children(p_inode);
> > >
> > > -     if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
> > > +     if (p_mask)
> > > +             marks_mask |= p_mask;
> > > +     else if (unlikely(parent_watched))
> > >               __fsnotify_update_child_dentry_flags(p_inode);
> > > -     } else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
> > > +
> > > +     if ((marks_mask & test_mask) && p_inode != inode) {
> >                                         ^^ this is effectively
> > !IS_ROOT(dentry), isn't it? But since you've checked that above can it ever
> > be that p_inode == inode?
> >
>
> True, but only if you add the hidden assumption (which should be true) that
> DCACHE_FSNOTIFY_PARENT_WATCHED cannot be set on an IS_ROOT
> dentry. Otherwise you can have parent_watched and not check IS_ROOT()
> so prefered defensive here, but I don't mind dropping it.
>

I dropped this and also dropped the IS_ROOT(dentry) check because
there is already a check in the inline fsnotify_parent().

FYI, pushed these changes to branch fsnotify_name.
Rebased/tested/pushed branch fanotify_name_fid on top.
There are no changes to following patches as they are all
fanotify patches.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 03/10] fsnotify: send event to parent and child with single callback
  2020-07-14 11:54     ` Amir Goldstein
@ 2020-07-15 17:09       ` Jan Kara
  2020-07-15 17:42         ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Kara @ 2020-07-15 17:09 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Tue 14-07-20 14:54:44, Amir Goldstein wrote:
> On Tue, Jul 14, 2020 at 1:34 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Thu 02-07-20 15:57:37, Amir Goldstein wrote:
> > > Instead of calling fsnotify() twice, once with parent inode and once
> > > with child inode, if event should be sent to parent inode, send it
> > > with both parent and child inodes marks in object type iterator and call
> > > the backend handle_event() callback only once.
> > >
> > > The parent inode is assigned to the standard "inode" iterator type and
> > > the child inode is assigned to the special "child" iterator type.
> > >
> > > In that case, the bit FS_EVENT_ON_CHILD will be set in the event mask,
> > > the dir argment to handle_event will be the parent inode, the file_name
> > > argument to handle_event is non NULL and refers to the name of the child
> > > and the child inode can be accessed with fsnotify_data_inode().
> > >
> > > This will allow fanotify to make decisions based on child or parent's
> > > ignored mask.  For example, when a parent is interested in a specific
> > > event on its children, but a specific child wishes to ignore this event,
> > > the event will not be reported.  This is not what happens with current
> > > code, but according to man page, it is the expected behavior.
> > >
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> >
> > I like the direction where this is going. But can't we push it even a bit
> > further? I like the fact that we now have "one fs event" -> "one fsnotify()
> > call". Ideally I'd like to get rid of FS_EVENT_ON_CHILD in the event mask
> > because it's purpose seems very weak now and it complicates code (and now
> 
> Can you give an example where it complicates the code?
> Don't confuse this with the code in fanotify_user.c that subscribes for
> events on child/with name.

I refer mostly to the stuff like:

        /* An event "on child" is not intended for a mount/sb mark */
        if (mask & FS_EVENT_ON_CHILD)
 		...

They are not big complications. But it would be nice to get rid of special
cases like this. Basically my thinking was like: Now that we generate each
event exactly once (i.e., no event duplication once with FS_EVENT_ON_CHILD
and once without it), we should just be able to deliver all events to sb,
mnt, parent, child and they'll just ignore it if they don't care. No
special cases needed. But I understand I'm omitting a lot of detail in this
highlevel "feeling" and these details may make this impractical.

> In one before the last patch of the series I am testing FS_EVENT_ON_CHILD
> in mask to know how to report the event inside fanotify_alloc_event().
> I may be able to carry this information not in mask, but the flag space is
> already taken anyway by FAN_EVENT_ON_CHILD input arg, so not sure
> what is there to gain from not setting FS_EVENT_ON_CHILD.
> 
> > it became even a bit of a misnomer) - intuitively, ->handle_event is now
> 
> I thought of changing the name to FS_EVENT_WITH_NAME, but that was
> confusing because create/delete are also events with a name.
> Maybe FS_EVENT_WITH_CHILD_NAME :-/

Yeah, FS_EVENT_WITH_CHILD_NAME would describe the use better now but then
the aliasing with FAN_EVENT_ON_CHILD will be confusing as well. So if we
keep passing the flag, I guess keeping the name is the least confusing.

> > passed sb, mnt, parent, child so it should have all the info to decide
> > where the event should be reported and I don't see a need for
> > FS_EVENT_ON_CHILD flag.
> 
> Do you mean something like this?
> 
>         const struct path *inode = fsnotify_data_inode(data, data_type);
>         bool event_on_child = !!file_name && dir != inode;

Not quite. E.g. in fanotify_group_event_mask() we could replace the
FS_EVENT_ON_CHILD usage with something like:

	/* If parent isn't interested in events on child, skip adding its mask */
	if (type == FSNOTIFY_OBJ_TYPE_INODE &&
	    !(mark->mask & FS_EVENT_ON_CHILD))
		continue;

And AFAIU this should do just what we need if we always fill in the
TYPE_CHILD field and TYPE_INODE only if we need the parent information
(either for reporting to parent or for parent info in the event).

> It may be true that all information is there, but I think the above is
> a bit ugly and quite not trivial to explain, whereas the flag is quite
> intuitive (to me) and adds no extra complexity (IMO).
> 
> > With fsnotify() call itself we still use
> > FS_EVENT_ON_CHILD to determine what the arguments mean but can't we just
> > mandate that 'data' always points to the child, 'to_tell' is always the
> > parent dir if watching or NULL (and I'd rename that argument to 'dir' and
> > maybe move it after 'data_type' argument). What do you think?
> >
> 
> I think what you are missing is the calls from fsnotify_name().
> For those calls, data is the dir and so is to_tell and name is the entry name.

Really? I can see in the final version:

static inline void fsnotify_name(struct inode *dir, __u32 mask,
				 struct inode *child,
				 const struct qstr *name, u32 cookie)
{
	fsnotify(dir, mask, child, FSNOTIFY_EVENT_INODE, name, cookie);
}

so it appears 'to_tell' is dir and data is the child inode...

> > > diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> > > index 51ada3cfd2ff..7c6e624b24c9 100644
> > > --- a/fs/notify/fsnotify.c
> > > +++ b/fs/notify/fsnotify.c
> > > @@ -145,15 +145,17 @@ void __fsnotify_update_child_dentry_flags(struct inode *inode)
> > >  /*
> > >   * Notify this dentry's parent about a child's events with child name info
> > >   * if parent is watching.
> > > - * Notify also the child without name info if child inode is watching.
> > > + * Notify only the child without name info if parent is not watching.
> > >   */
> > >  int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> > >                     int data_type)
> > >  {
> > > +     struct inode *inode = d_inode(dentry);
> > >       struct dentry *parent;
> > >       struct inode *p_inode;
> > >       int ret = 0;
> > >
> > > +     parent = NULL;
> > >       if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
> > >               goto notify_child;
> > >
> > > @@ -165,23 +167,23 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> > >       } else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
> > >               struct name_snapshot name;
> > >
> > > -             /*
> > > -              * We are notifying a parent, so set a flag in mask to inform
> > > -              * backend that event has information about a child entry.
> > > -              */
> > > +             /* When notifying parent, child should be passed as data */
> > > +             WARN_ON_ONCE(inode != fsnotify_data_inode(data, data_type));
> > > +
> > > +             /* Notify both parent and child with child name info */
> > >               take_dentry_name_snapshot(&name, dentry);
> > >               ret = fsnotify(p_inode, mask | FS_EVENT_ON_CHILD, data,
> > >                              data_type, &name.name, 0);
> > >               release_dentry_name_snapshot(&name);
> > > +     } else {
> > > +notify_child:
> > > +             /* Notify child without child name info */
> > > +             ret = fsnotify(inode, mask, data, data_type, NULL, 0);
> > >       }
> >
> > AFAICT this will miss notifying the child if the condition
> >         !fsnotify_inode_watches_children(p_inode)
> > above is true... And I've noticed this because jumping into a branch in an
> > if block is usually a bad idea and so I gave it a closer look. Exactly
> > because of problems like this. Usually it's better to restructure
> > conditions instead.
> >
> > In this case I think we could structure the code like:
> >         struct name_snapshot name
> >         struct qstr *namestr = NULL;
> >
> >         if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
> >                 goto notify;
> >         parent = dget_parent(dentry);
> >         p_inode = parent->d_inode;
> >
> >         if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
> >                 __fsnotify_update_child_dentry_flags(p_inode);
> >         } else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
> >                 take_dentry_name_snapshot(&name, dentry);
> >                 namestr = &name.name;
> >                 mask |= FS_EVENT_ON_CHILD;
> >         }
> > notify:
> >         ret = fsnotify(p_inode, mask, data, data_type, namestr, 0);
> >         if (namestr)
> >                 release_dentry_name_snapshot(&name);
> >         dput(parent);
> >         return ret;
> 
> I will look into this proposal, but please be aware that this function
> completely changes in the next patch "send event with parent/name info to
> sb/mount/non-dir marks", so some of the things that look weird here or
> possibly even bugs might go away.  That is not to say that I won't fix
> them, but please review with the next patch in mind when considering
> reconstruct.

Yes, I've then noticed the function changes significantly later and the bug
actually gets silently fixed. So maybe what I proposed here isn't ideal and
the fix should look differently. But my main dislike was the jump into the
if branch which stays till the end AFAICT.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 03/10] fsnotify: send event to parent and child with single callback
  2020-07-15 17:09       ` Jan Kara
@ 2020-07-15 17:42         ` Amir Goldstein
  2020-07-16  6:38           ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2020-07-15 17:42 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Wed, Jul 15, 2020 at 8:09 PM Jan Kara <jack@suse.cz> wrote:
>
> On Tue 14-07-20 14:54:44, Amir Goldstein wrote:
> > On Tue, Jul 14, 2020 at 1:34 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Thu 02-07-20 15:57:37, Amir Goldstein wrote:
> > > > Instead of calling fsnotify() twice, once with parent inode and once
> > > > with child inode, if event should be sent to parent inode, send it
> > > > with both parent and child inodes marks in object type iterator and call
> > > > the backend handle_event() callback only once.
> > > >
> > > > The parent inode is assigned to the standard "inode" iterator type and
> > > > the child inode is assigned to the special "child" iterator type.
> > > >
> > > > In that case, the bit FS_EVENT_ON_CHILD will be set in the event mask,
> > > > the dir argment to handle_event will be the parent inode, the file_name
> > > > argument to handle_event is non NULL and refers to the name of the child
> > > > and the child inode can be accessed with fsnotify_data_inode().
> > > >
> > > > This will allow fanotify to make decisions based on child or parent's
> > > > ignored mask.  For example, when a parent is interested in a specific
> > > > event on its children, but a specific child wishes to ignore this event,
> > > > the event will not be reported.  This is not what happens with current
> > > > code, but according to man page, it is the expected behavior.
> > > >
> > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > >
> > > I like the direction where this is going. But can't we push it even a bit
> > > further? I like the fact that we now have "one fs event" -> "one fsnotify()
> > > call". Ideally I'd like to get rid of FS_EVENT_ON_CHILD in the event mask
> > > because it's purpose seems very weak now and it complicates code (and now
> >
> > Can you give an example where it complicates the code?
> > Don't confuse this with the code in fanotify_user.c that subscribes for
> > events on child/with name.
>
> I refer mostly to the stuff like:
>
>         /* An event "on child" is not intended for a mount/sb mark */
>         if (mask & FS_EVENT_ON_CHILD)
>                 ...
>
> They are not big complications. But it would be nice to get rid of special
> cases like this. Basically my thinking was like: Now that we generate each
> event exactly once (i.e., no event duplication once with FS_EVENT_ON_CHILD
> and once without it), we should just be able to deliver all events to sb,
> mnt, parent, child and they'll just ignore it if they don't care. No
> special cases needed. But I understand I'm omitting a lot of detail in this
> highlevel "feeling" and these details may make this impractical.
>

Well, you may be right.
I have the opposite "feeling" but you often prove me wrong.
I will try to look closer at your proposal, but I must say, even
if the flag is redundant information, so what?
You may say the same thing about FS_ISDIR now.
I agree that we should simplify code that doesn't need to use the
flag, but if in the end the need to report the child FID is determined by
this flag, why should we recalculate it if we can conveniently set the
flag in the right case in fsnotify_parent().

If I am not mistaken, the alternative would mean special casing the
DIRENT_EVENTS in the event handler.
Yes, we already do special case them, but in some places special
casing them can be avoided by consulting the FS_EVENT_ON_CHILD flag.

> > In one before the last patch of the series I am testing FS_EVENT_ON_CHILD
> > in mask to know how to report the event inside fanotify_alloc_event().
> > I may be able to carry this information not in mask, but the flag space is
> > already taken anyway by FAN_EVENT_ON_CHILD input arg, so not sure
> > what is there to gain from not setting FS_EVENT_ON_CHILD.
> >
> > > it became even a bit of a misnomer) - intuitively, ->handle_event is now
> >
> > I thought of changing the name to FS_EVENT_WITH_NAME, but that was
> > confusing because create/delete are also events with a name.
> > Maybe FS_EVENT_WITH_CHILD_NAME :-/
>
> Yeah, FS_EVENT_WITH_CHILD_NAME would describe the use better now but then
> the aliasing with FAN_EVENT_ON_CHILD will be confusing as well. So if we
> keep passing the flag, I guess keeping the name is the least confusing.
>
> > > passed sb, mnt, parent, child so it should have all the info to decide
> > > where the event should be reported and I don't see a need for
> > > FS_EVENT_ON_CHILD flag.
> >
> > Do you mean something like this?
> >
> >         const struct path *inode = fsnotify_data_inode(data, data_type);
> >         bool event_on_child = !!file_name && dir != inode;
>
> Not quite. E.g. in fanotify_group_event_mask() we could replace the
> FS_EVENT_ON_CHILD usage with something like:
>
>         /* If parent isn't interested in events on child, skip adding its mask */
>         if (type == FSNOTIFY_OBJ_TYPE_INODE &&
>             !(mark->mask & FS_EVENT_ON_CHILD))
>                 continue;
>
> And AFAIU this should do just what we need if we always fill in the
> TYPE_CHILD field and TYPE_INODE only if we need the parent information
> (either for reporting to parent or for parent info in the event).
>

Yes, this specific condition could be simplified like you wrote.
It is a relic from before I unified the two event flavors and I forgot to
change it.

> > It may be true that all information is there, but I think the above is
> > a bit ugly and quite not trivial to explain, whereas the flag is quite
> > intuitive (to me) and adds no extra complexity (IMO).
> >
> > > With fsnotify() call itself we still use
> > > FS_EVENT_ON_CHILD to determine what the arguments mean but can't we just
> > > mandate that 'data' always points to the child, 'to_tell' is always the
> > > parent dir if watching or NULL (and I'd rename that argument to 'dir' and
> > > maybe move it after 'data_type' argument). What do you think?
> > >
> >
> > I think what you are missing is the calls from fsnotify_name().
> > For those calls, data is the dir and so is to_tell and name is the entry name.
>
> Really? I can see in the final version:
>
> static inline void fsnotify_name(struct inode *dir, __u32 mask,
>                                  struct inode *child,
>                                  const struct qstr *name, u32 cookie)
> {
>         fsnotify(dir, mask, child, FSNOTIFY_EVENT_INODE, name, cookie);
> }
>
> so it appears 'to_tell' is dir and data is the child inode...
>

Yes. right again.

> > > > diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> > > > index 51ada3cfd2ff..7c6e624b24c9 100644
> > > > --- a/fs/notify/fsnotify.c
> > > > +++ b/fs/notify/fsnotify.c
> > > > @@ -145,15 +145,17 @@ void __fsnotify_update_child_dentry_flags(struct inode *inode)
> > > >  /*
> > > >   * Notify this dentry's parent about a child's events with child name info
> > > >   * if parent is watching.
> > > > - * Notify also the child without name info if child inode is watching.
> > > > + * Notify only the child without name info if parent is not watching.
> > > >   */
> > > >  int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> > > >                     int data_type)
> > > >  {
> > > > +     struct inode *inode = d_inode(dentry);
> > > >       struct dentry *parent;
> > > >       struct inode *p_inode;
> > > >       int ret = 0;
> > > >
> > > > +     parent = NULL;
> > > >       if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
> > > >               goto notify_child;
> > > >
> > > > @@ -165,23 +167,23 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> > > >       } else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
> > > >               struct name_snapshot name;
> > > >
> > > > -             /*
> > > > -              * We are notifying a parent, so set a flag in mask to inform
> > > > -              * backend that event has information about a child entry.
> > > > -              */
> > > > +             /* When notifying parent, child should be passed as data */
> > > > +             WARN_ON_ONCE(inode != fsnotify_data_inode(data, data_type));
> > > > +
> > > > +             /* Notify both parent and child with child name info */
> > > >               take_dentry_name_snapshot(&name, dentry);
> > > >               ret = fsnotify(p_inode, mask | FS_EVENT_ON_CHILD, data,
> > > >                              data_type, &name.name, 0);
> > > >               release_dentry_name_snapshot(&name);
> > > > +     } else {
> > > > +notify_child:
> > > > +             /* Notify child without child name info */
> > > > +             ret = fsnotify(inode, mask, data, data_type, NULL, 0);
> > > >       }
> > >
> > > AFAICT this will miss notifying the child if the condition
> > >         !fsnotify_inode_watches_children(p_inode)
> > > above is true... And I've noticed this because jumping into a branch in an
> > > if block is usually a bad idea and so I gave it a closer look. Exactly
> > > because of problems like this. Usually it's better to restructure
> > > conditions instead.
> > >
> > > In this case I think we could structure the code like:
> > >         struct name_snapshot name
> > >         struct qstr *namestr = NULL;
> > >
> > >         if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
> > >                 goto notify;
> > >         parent = dget_parent(dentry);
> > >         p_inode = parent->d_inode;
> > >
> > >         if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
> > >                 __fsnotify_update_child_dentry_flags(p_inode);
> > >         } else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
> > >                 take_dentry_name_snapshot(&name, dentry);
> > >                 namestr = &name.name;
> > >                 mask |= FS_EVENT_ON_CHILD;
> > >         }
> > > notify:
> > >         ret = fsnotify(p_inode, mask, data, data_type, namestr, 0);
> > >         if (namestr)
> > >                 release_dentry_name_snapshot(&name);
> > >         dput(parent);
> > >         return ret;
> >
> > I will look into this proposal, but please be aware that this function
> > completely changes in the next patch "send event with parent/name info to
> > sb/mount/non-dir marks", so some of the things that look weird here or
> > possibly even bugs might go away.  That is not to say that I won't fix
> > them, but please review with the next patch in mind when considering
> > reconstruct.
>
> Yes, I've then noticed the function changes significantly later and the bug
> actually gets silently fixed. So maybe what I proposed here isn't ideal and
> the fix should look differently. But my main dislike was the jump into the
> if branch which stays till the end AFAICT.
>

And you were right.
I implemented all re-structure suggestions and pushed to fsnotify_name branch.
Will try to look at reducing the use of the FS_EVENT_ON_CHILD flag.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 03/10] fsnotify: send event to parent and child with single callback
  2020-07-15 17:42         ` Amir Goldstein
@ 2020-07-16  6:38           ` Amir Goldstein
  2020-07-16  7:39             ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2020-07-16  6:38 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Wed, Jul 15, 2020 at 8:42 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Wed, Jul 15, 2020 at 8:09 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Tue 14-07-20 14:54:44, Amir Goldstein wrote:
> > > On Tue, Jul 14, 2020 at 1:34 PM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Thu 02-07-20 15:57:37, Amir Goldstein wrote:
> > > > > Instead of calling fsnotify() twice, once with parent inode and once
> > > > > with child inode, if event should be sent to parent inode, send it
> > > > > with both parent and child inodes marks in object type iterator and call
> > > > > the backend handle_event() callback only once.
> > > > >
> > > > > The parent inode is assigned to the standard "inode" iterator type and
> > > > > the child inode is assigned to the special "child" iterator type.
> > > > >
> > > > > In that case, the bit FS_EVENT_ON_CHILD will be set in the event mask,
> > > > > the dir argment to handle_event will be the parent inode, the file_name
> > > > > argument to handle_event is non NULL and refers to the name of the child
> > > > > and the child inode can be accessed with fsnotify_data_inode().
> > > > >
> > > > > This will allow fanotify to make decisions based on child or parent's
> > > > > ignored mask.  For example, when a parent is interested in a specific
> > > > > event on its children, but a specific child wishes to ignore this event,
> > > > > the event will not be reported.  This is not what happens with current
> > > > > code, but according to man page, it is the expected behavior.
> > > > >
> > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > >
> > > > I like the direction where this is going. But can't we push it even a bit
> > > > further? I like the fact that we now have "one fs event" -> "one fsnotify()
> > > > call". Ideally I'd like to get rid of FS_EVENT_ON_CHILD in the event mask
> > > > because it's purpose seems very weak now and it complicates code (and now
> > >
> > > Can you give an example where it complicates the code?
> > > Don't confuse this with the code in fanotify_user.c that subscribes for
> > > events on child/with name.
> >
> > I refer mostly to the stuff like:
> >
> >         /* An event "on child" is not intended for a mount/sb mark */
> >         if (mask & FS_EVENT_ON_CHILD)
> >                 ...
> >

I need to explain something that was not an obvious decision for me.

When sending the same event on two inodes marks I considered a few options:

1. TYPE_INODE is the mark on the object referred to in data
    TYPE_PARENT is the mark on the parent if event is sent to a watching
                               parent or to sb/mnt/child with parent/name info
2. TYPE_CHILD is the mark on the object referred to in data
    TYPE_INODE is the mark on the fsnotify to_tell inode if not same as data
3. TYPE_INODE is the mark on the fsnotify to_tell inode
    TYPE_CHILD is the mark on the object referred to in data if it is
not to_tell

The first option with TYPE_PARENT  would require changing audit
and dnotify to look at TYPE_PARENT mark in addition to TYPE_INODE
mark, so it adds more friction and I ruled it out.

I think you had option #2 in mind when reading the code, but I went
for option #3.
There is a minor difference between them related to how we deal with the case
that the parent is watching and the case that only the child is watching.

If the parent is not watching (and child/sb/mnt not interested in name) we do
not snapshot the name and do not set the ON_CHILD flag in the mask.
In that case, should we add the child mark as TYPE_INODE or TYPE_CHILD?

I chose TYPE_INODE because this meant I did not have to change audit/dnotify
for that case. I didn't even care to look if they needed to be changed or not,
just wanted to keep things as they were.

Looking now, I see that dnotify would have needed to check TYPE_CHILD to
get FS_ATTRIB event on self.

It looks like audit would not have needed to change because although they set
FS_EVENT_ON_CHILD in mask, none of the events they care about are
"possible on child":
 #define AUDIT_FS_WATCH (FS_MOVE | FS_CREATE | FS_DELETE | FS_DELETE_SELF |\
                        FS_MOVE_SELF | FS_EVENT_ON_CHILD | FS_UNMOUNT)
#define AUDIT_FS_EVENTS (FS_MOVE | FS_CREATE | FS_DELETE | FS_DELETE_SELF |\
                         FS_MOVE_SELF | FS_EVENT_ON_CHILD)

Having written that decision process down made me realize there is a bug in
my unified inotify event handler implementation - it does not clear
FS_EVENT_ON_CHILD when reporting without name.

It is interesting to note that the result of sending FS_OPEN only to a watching
child to inotify_handle_event() is the same for design choices #2 and #3 above.
But the bug fix of clearing FS_EVENT_ON_CHILD when reporting without name
would look different depending on said choice.

Since I had to change inotify handler anyway, I prefer to stick with my choice
and fix inotify handler using goto notify_child which is a bit uglier,
instead of
having to adapt dnotify to choice #2.

> > They are not big complications. But it would be nice to get rid of special
> > cases like this. Basically my thinking was like: Now that we generate each
> > event exactly once (i.e., no event duplication once with FS_EVENT_ON_CHILD
> > and once without it), we should just be able to deliver all events to sb,
> > mnt, parent, child and they'll just ignore it if they don't care. No
> > special cases needed. But I understand I'm omitting a lot of detail in this
> > highlevel "feeling" and these details may make this impractical.
> >
>
[...]
> > > > passed sb, mnt, parent, child so it should have all the info to decide
> > > > where the event should be reported and I don't see a need for
> > > > FS_EVENT_ON_CHILD flag.
> > >
> > > Do you mean something like this?
> > >
> > >         const struct path *inode = fsnotify_data_inode(data, data_type);
> > >         bool event_on_child = !!file_name && dir != inode;
> >
> > Not quite. E.g. in fanotify_group_event_mask() we could replace the
> > FS_EVENT_ON_CHILD usage with something like:
> >
> >         /* If parent isn't interested in events on child, skip adding its mask */
> >         if (type == FSNOTIFY_OBJ_TYPE_INODE &&
> >             !(mark->mask & FS_EVENT_ON_CHILD))
> >                 continue;
> >

That looks wrong. FAN_CREATE does not have the ON_CHILD flag and
should very well be reported to inode mark.
Trying to special case as little as possible the different types of events
(on child/dirent/self) is what drove my choices here, but if we can find
ways to further simplify the code all the better.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 03/10] fsnotify: send event to parent and child with single callback
  2020-07-16  6:38           ` Amir Goldstein
@ 2020-07-16  7:39             ` Amir Goldstein
  2020-07-16  9:55               ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2020-07-16  7:39 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Thu, Jul 16, 2020 at 9:38 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Wed, Jul 15, 2020 at 8:42 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Wed, Jul 15, 2020 at 8:09 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Tue 14-07-20 14:54:44, Amir Goldstein wrote:
> > > > On Tue, Jul 14, 2020 at 1:34 PM Jan Kara <jack@suse.cz> wrote:
> > > > >
> > > > > On Thu 02-07-20 15:57:37, Amir Goldstein wrote:
> > > > > > Instead of calling fsnotify() twice, once with parent inode and once
> > > > > > with child inode, if event should be sent to parent inode, send it
> > > > > > with both parent and child inodes marks in object type iterator and call
> > > > > > the backend handle_event() callback only once.
> > > > > >
> > > > > > The parent inode is assigned to the standard "inode" iterator type and
> > > > > > the child inode is assigned to the special "child" iterator type.
> > > > > >
> > > > > > In that case, the bit FS_EVENT_ON_CHILD will be set in the event mask,
> > > > > > the dir argment to handle_event will be the parent inode, the file_name
> > > > > > argument to handle_event is non NULL and refers to the name of the child
> > > > > > and the child inode can be accessed with fsnotify_data_inode().
> > > > > >
> > > > > > This will allow fanotify to make decisions based on child or parent's
> > > > > > ignored mask.  For example, when a parent is interested in a specific
> > > > > > event on its children, but a specific child wishes to ignore this event,
> > > > > > the event will not be reported.  This is not what happens with current
> > > > > > code, but according to man page, it is the expected behavior.
> > > > > >
> > > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > > >
> > > > > I like the direction where this is going. But can't we push it even a bit
> > > > > further? I like the fact that we now have "one fs event" -> "one fsnotify()
> > > > > call". Ideally I'd like to get rid of FS_EVENT_ON_CHILD in the event mask
> > > > > because it's purpose seems very weak now and it complicates code (and now
> > > >
> > > > Can you give an example where it complicates the code?
> > > > Don't confuse this with the code in fanotify_user.c that subscribes for
> > > > events on child/with name.
> > >
> > > I refer mostly to the stuff like:
> > >
> > >         /* An event "on child" is not intended for a mount/sb mark */
> > >         if (mask & FS_EVENT_ON_CHILD)
> > >                 ...
> > >
>
> I need to explain something that was not an obvious decision for me.
>
> When sending the same event on two inodes marks I considered a few options:
>
> 1. TYPE_INODE is the mark on the object referred to in data
>     TYPE_PARENT is the mark on the parent if event is sent to a watching
>                                parent or to sb/mnt/child with parent/name info
> 2. TYPE_CHILD is the mark on the object referred to in data
>     TYPE_INODE is the mark on the fsnotify to_tell inode if not same as data
> 3. TYPE_INODE is the mark on the fsnotify to_tell inode
>     TYPE_CHILD is the mark on the object referred to in data if it is
> not to_tell
>
> The first option with TYPE_PARENT  would require changing audit
> and dnotify to look at TYPE_PARENT mark in addition to TYPE_INODE
> mark, so it adds more friction and I ruled it out.
>
> I think you had option #2 in mind when reading the code, but I went
> for option #3.
> There is a minor difference between them related to how we deal with the case
> that the parent is watching and the case that only the child is watching.
>
> If the parent is not watching (and child/sb/mnt not interested in name) we do
> not snapshot the name and do not set the ON_CHILD flag in the mask.
> In that case, should we add the child mark as TYPE_INODE or TYPE_CHILD?
>
> I chose TYPE_INODE because this meant I did not have to change audit/dnotify
> for that case. I didn't even care to look if they needed to be changed or not,
> just wanted to keep things as they were.
>
> Looking now, I see that dnotify would have needed to check TYPE_CHILD to
> get FS_ATTRIB event on self.
>
> It looks like audit would not have needed to change because although they set
> FS_EVENT_ON_CHILD in mask, none of the events they care about are
> "possible on child":
>  #define AUDIT_FS_WATCH (FS_MOVE | FS_CREATE | FS_DELETE | FS_DELETE_SELF |\
>                         FS_MOVE_SELF | FS_EVENT_ON_CHILD | FS_UNMOUNT)
> #define AUDIT_FS_EVENTS (FS_MOVE | FS_CREATE | FS_DELETE | FS_DELETE_SELF |\
>                          FS_MOVE_SELF | FS_EVENT_ON_CHILD)
>
> Having written that decision process down made me realize there is a bug in
> my unified inotify event handler implementation - it does not clear
> FS_EVENT_ON_CHILD when reporting without name.
>
> It is interesting to note that the result of sending FS_OPEN only to a watching
> child to inotify_handle_event() is the same for design choices #2 and #3 above.
> But the bug fix of clearing FS_EVENT_ON_CHILD when reporting without name
> would look different depending on said choice.
>
> Since I had to change inotify handler anyway, I prefer to stick with my choice
> and fix inotify handler using goto notify_child which is a bit uglier,
> instead of
> having to adapt dnotify to choice #2.
>

It turns out it's the other way around.
inotify handler has no bug (FS_EVENT_ON_CHILD is not exposed to the user)
just a confusing comment, so I will fix that.
But dnotify does have a bug - it also needs to be taught about the unified event
so that DN_ATTRIB event can be reported twice on both parent dir and child
subdir if both are watching.
Alas, we have no test coverage for dnotify in LTP...

This means that we could also go with choice #2.
But we can also make that internal change later on, because it does not
impact the logic.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 03/10] fsnotify: send event to parent and child with single callback
  2020-07-16  7:39             ` Amir Goldstein
@ 2020-07-16  9:55               ` Amir Goldstein
  0 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2020-07-16  9:55 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Thu, Jul 16, 2020 at 10:39 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Thu, Jul 16, 2020 at 9:38 AM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Wed, Jul 15, 2020 at 8:42 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Wed, Jul 15, 2020 at 8:09 PM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Tue 14-07-20 14:54:44, Amir Goldstein wrote:
> > > > > On Tue, Jul 14, 2020 at 1:34 PM Jan Kara <jack@suse.cz> wrote:
> > > > > >
> > > > > > On Thu 02-07-20 15:57:37, Amir Goldstein wrote:
> > > > > > > Instead of calling fsnotify() twice, once with parent inode and once
> > > > > > > with child inode, if event should be sent to parent inode, send it
> > > > > > > with both parent and child inodes marks in object type iterator and call
> > > > > > > the backend handle_event() callback only once.
> > > > > > >
> > > > > > > The parent inode is assigned to the standard "inode" iterator type and
> > > > > > > the child inode is assigned to the special "child" iterator type.
> > > > > > >
> > > > > > > In that case, the bit FS_EVENT_ON_CHILD will be set in the event mask,
> > > > > > > the dir argment to handle_event will be the parent inode, the file_name
> > > > > > > argument to handle_event is non NULL and refers to the name of the child
> > > > > > > and the child inode can be accessed with fsnotify_data_inode().
> > > > > > >
> > > > > > > This will allow fanotify to make decisions based on child or parent's
> > > > > > > ignored mask.  For example, when a parent is interested in a specific
> > > > > > > event on its children, but a specific child wishes to ignore this event,
> > > > > > > the event will not be reported.  This is not what happens with current
> > > > > > > code, but according to man page, it is the expected behavior.
> > > > > > >
> > > > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > > > >
> > > > > > I like the direction where this is going. But can't we push it even a bit
> > > > > > further? I like the fact that we now have "one fs event" -> "one fsnotify()
> > > > > > call". Ideally I'd like to get rid of FS_EVENT_ON_CHILD in the event mask
> > > > > > because it's purpose seems very weak now and it complicates code (and now
> > > > >
> > > > > Can you give an example where it complicates the code?
> > > > > Don't confuse this with the code in fanotify_user.c that subscribes for
> > > > > events on child/with name.
> > > >
> > > > I refer mostly to the stuff like:
> > > >
> > > >         /* An event "on child" is not intended for a mount/sb mark */
> > > >         if (mask & FS_EVENT_ON_CHILD)
> > > >                 ...
> > > >
> >
> > I need to explain something that was not an obvious decision for me.
> >
> > When sending the same event on two inodes marks I considered a few options:
> >
> > 1. TYPE_INODE is the mark on the object referred to in data
> >     TYPE_PARENT is the mark on the parent if event is sent to a watching
> >                                parent or to sb/mnt/child with parent/name info
> > 2. TYPE_CHILD is the mark on the object referred to in data
> >     TYPE_INODE is the mark on the fsnotify to_tell inode if not same as data
> > 3. TYPE_INODE is the mark on the fsnotify to_tell inode
> >     TYPE_CHILD is the mark on the object referred to in data if it is
> > not to_tell
> >
> > The first option with TYPE_PARENT  would require changing audit
> > and dnotify to look at TYPE_PARENT mark in addition to TYPE_INODE
> > mark, so it adds more friction and I ruled it out.
> >
> > I think you had option #2 in mind when reading the code, but I went
> > for option #3.
> > There is a minor difference between them related to how we deal with the case
> > that the parent is watching and the case that only the child is watching.
> >
> > If the parent is not watching (and child/sb/mnt not interested in name) we do
> > not snapshot the name and do not set the ON_CHILD flag in the mask.
> > In that case, should we add the child mark as TYPE_INODE or TYPE_CHILD?
> >
> > I chose TYPE_INODE because this meant I did not have to change audit/dnotify
> > for that case. I didn't even care to look if they needed to be changed or not,
> > just wanted to keep things as they were.
> >
> > Looking now, I see that dnotify would have needed to check TYPE_CHILD to
> > get FS_ATTRIB event on self.
> >
> > It looks like audit would not have needed to change because although they set
> > FS_EVENT_ON_CHILD in mask, none of the events they care about are
> > "possible on child":
> >  #define AUDIT_FS_WATCH (FS_MOVE | FS_CREATE | FS_DELETE | FS_DELETE_SELF |\
> >                         FS_MOVE_SELF | FS_EVENT_ON_CHILD | FS_UNMOUNT)
> > #define AUDIT_FS_EVENTS (FS_MOVE | FS_CREATE | FS_DELETE | FS_DELETE_SELF |\
> >                          FS_MOVE_SELF | FS_EVENT_ON_CHILD)
> >
> > Having written that decision process down made me realize there is a bug in
> > my unified inotify event handler implementation - it does not clear
> > FS_EVENT_ON_CHILD when reporting without name.
> >
> > It is interesting to note that the result of sending FS_OPEN only to a watching
> > child to inotify_handle_event() is the same for design choices #2 and #3 above.
> > But the bug fix of clearing FS_EVENT_ON_CHILD when reporting without name
> > would look different depending on said choice.
> >
> > Since I had to change inotify handler anyway, I prefer to stick with my choice
> > and fix inotify handler using goto notify_child which is a bit uglier,
> > instead of
> > having to adapt dnotify to choice #2.
> >
>
> It turns out it's the other way around.
> inotify handler has no bug (FS_EVENT_ON_CHILD is not exposed to the user)
> just a confusing comment, so I will fix that.
> But dnotify does have a bug - it also needs to be taught about the unified event
> so that DN_ATTRIB event can be reported twice on both parent dir and child
> subdir if both are watching.
> Alas, we have no test coverage for dnotify in LTP...

FYI I verified this dnotify regression and fix manually after adding
DN_ATTRIB to
tools/testing/selftests/filesystems/dnotify_test.c:

$ cd dir/
$ dnotify_test &
$ cd subdir/
$ dnotify_test &
$ chmod 777 .
Got event on fd=3
Got event on fd=3

I will write an LTP test to cover this and see if we have similar
tests for inotify
and fanotify.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-07-16  9:55 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-02 12:57 [PATCH v4 00/10] fanotify events with name info Amir Goldstein
2020-07-02 12:57 ` [PATCH v4 01/10] inotify: report both events on parent and child with single callback Amir Goldstein
2020-07-02 12:57 ` [PATCH v4 02/10] fanotify: " Amir Goldstein
2020-07-02 12:57 ` [PATCH v4 03/10] fsnotify: send event to " Amir Goldstein
2020-07-14 10:34   ` Jan Kara
2020-07-14 11:54     ` Amir Goldstein
2020-07-15 17:09       ` Jan Kara
2020-07-15 17:42         ` Amir Goldstein
2020-07-16  6:38           ` Amir Goldstein
2020-07-16  7:39             ` Amir Goldstein
2020-07-16  9:55               ` Amir Goldstein
2020-07-02 12:57 ` [PATCH v4 04/10] fsnotify: send event with parent/name info to sb/mount/non-dir marks Amir Goldstein
2020-07-14 11:54   ` Jan Kara
2020-07-14 12:17     ` Amir Goldstein
2020-07-14 15:31       ` Amir Goldstein
2020-07-02 12:57 ` [PATCH v4 05/10] fsnotify: send MOVE_SELF event with parent/name info Amir Goldstein
2020-07-14 12:13   ` Jan Kara
2020-07-14 12:44     ` Amir Goldstein
2020-07-02 12:57 ` [PATCH v4 06/10] fanotify: add basic support for FAN_REPORT_DIR_FID Amir Goldstein
2020-07-02 12:57 ` [PATCH v4 07/10] fanotify: report events with parent dir fid to sb/mount/non-dir marks Amir Goldstein
2020-07-02 12:57 ` [PATCH v4 08/10] fanotify: add support for FAN_REPORT_NAME Amir Goldstein
2020-07-02 12:57 ` [PATCH v4 09/10] fanotify: report parent fid + name + child fid Amir Goldstein
2020-07-02 12:57 ` [PATCH v4 10/10] fanotify: report parent fid " Amir Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).