linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/13] fanotify: add support for more event types
@ 2018-11-25 13:43 Amir Goldstein
  2018-11-25 13:43 ` [PATCH v3 01/13] fsnotify: annotate directory entry modification events Amir Goldstein
                   ` (12 more replies)
  0 siblings, 13 replies; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

Jan,

This is the 3rd revision of patch series to add support for filesystem
change monitoring to fanotify.
It incorporates the changes you requested in review of v2 patches.
The complete work is available on fanotify_dirent branch [1] on my tree.

The end game is to use:
  fd = fanotify_init(FAN_CLASS_NOTIF|FAN_REPORT_FID, ...);
  rc = fanotify_mark(fd, FAN_MARK_FILESYSTEM, FAN_CREATE|FAN_DELETE...);
to monitor changes to a large scale namespace.

This functionality was not available with inotify API, which does not
scale well with recursive directory watches and was not available
with fanotify API, which did not support directory modification events.

I have tested this work with some preliminary LTP tests [2] and with
a prototype of global filesystem monitor based on inotify-tools [3].
Please see below a demo output [4] from filesystem monitor prototype.
Note that the "watches" in the prototype are userland entries to
map fid to path. The kernel has but one mark per super block.

Matthew Bobrowski has agreed to help me with writing more tests and
man pages (thanks Matthew!).

Thanks,
Amir.

Changes since v2:
- Discard FSNOTIFY_EVENT_DENTRY data type changes
- Cache fsid in connector instead of calling vfs_statfs() on every event
- Deny setting fid watch on filesystem with no fsid (tmpfs)
- Deny setting fid watch on filesystem with non root fsid (btrfs subvol)
- Report FAN_ONDIR for all event types with FAN_REPORT_FID

[1] https://github.com/amir73il/linux/commits/fanotify_dirent
[2] https://github.com/amir73il/ltp/commits/fanotify_dirent
[3] https://github.com/amir73il/inotify-tools/commits/fanotify_dirent
[4] Demo run of inotifywait monitor on 2 filesystems and 1 subvolume:
=================
...
/dev/vdf on /vdf type xfs (rw,attr2,inode64,noquota)
/dev/vde on /mnt type btrfs (rw,space_cache,subvolid=257,subvol=/subvol)
/dev/vde on /vde type btrfs (rw,space_cache,subvolid=5,subvol=/)

root@kvm-xfstests:~# inotifywait -m -g /vde /vdf
Setting up global filesystem watches.
Watches established.

/vde/ OPEN,ISDIR
/vdf/ OPEN,ISDIR

root@kvm-xfstests:~# mkdir -p /mnt/a/b/c/d/e/ && touch /mnt/a/b/c/d/e/x
...Start watching /vde/subvol (fid=f74e7a26.a635b2c5.100...)
/vde/subvol CREATE,ISDIR
...Start watching /vde/subvol/a (fid=f74e7a26.a635b2c5.101...)
/vde/subvol/a CLOSE_NOWRITE,OPEN,CREATE,CLOSE,ISDIR
/vde/subvol CLOSE_NOWRITE,OPEN,CLOSE,ISDIR
...Start watching /vde/subvol/a/b (fid=f74e7a26.a635b2c5.102...)
/vde/subvol/a/b CLOSE_NOWRITE,OPEN,CLOSE,ISDIR
/vde/subvol/a CLOSE_NOWRITE,OPEN,CLOSE,ISDIR
/vde/subvol/a/b CLOSE_NOWRITE,OPEN,CLOSE,ISDIR
/vde/subvol/a/b CREATE,ISDIR
...Start watching /vde/subvol/a/b/c (fid=f74e7a26.a635b2c5.103...)
/vde/subvol/a/b/c CLOSE_NOWRITE,OPEN,CLOSE,ISDIR
/vde/subvol/a/b/c CLOSE_NOWRITE,OPEN,CLOSE,ISDIR
/vde/subvol/a/b/c CREATE,ISDIR
...Start watching /vde/subvol/a/b/c/d (fid=f74e7a26.a635b2c5.104...)
/vde/subvol/a/b/c/d CLOSE_NOWRITE,OPEN,CREATE,CLOSE,ISDIR
/vde/subvol/a/b/c/d CLOSE_NOWRITE,OPEN,CLOSE,ISDIR
...Start watching /vde/subvol/a/b/c/d/e (fid=f74e7a26.a635b2c5.105...)
/vde/subvol/a/b/c/d/e CREATE
...Start watching /vde/subvol/a/b/c/d/e/x (fid=f74e7a26.a635b2c5.106...)
/vde/subvol/a/b/c/d/e/x OPEN
/vde/subvol/a/b/c/d/e CLOSE_NOWRITE,OPEN,CLOSE,ISDIR
/vde/subvol/a/b/c/d/e/x CLOSE_NOWRITE,OPEN,CLOSE
/vde/subvol/a/b/c/d/e/x ATTRIB
/vde/subvol/a/b/c/d/e/x CLOSE_WRITE,CLOSE

root@kvm-xfstests:~# touch /vde/a /vdf/a /mnt/a
/vde/ CREATE
...Start watching /vde/a (fid=f74e7a26.a635b2c5.101...)
/vde/a OPEN
/vde/a ATTRIB
/vde/a CLOSE_NOWRITE,OPEN,CLOSE
/vde/a CLOSE_WRITE,CLOSE
/vdf/ CREATE
...Start watching /vdf/a (fid=fd50.0.105...)
/vdf/a OPEN
/vdf/a CLOSE_NOWRITE,OPEN,CLOSE
/vdf/a ATTRIB,CLOSE_WRITE,CLOSE
/vde/subvol/a ATTRIB,ISDIR

root@kvm-xfstests:~# rm -rf /vde/a /vdf/a /mnt/a
/vde/a ATTRIB,DELETE_SELF
/vde/ DELETE
/vdf/a ATTRIB,DELETE_SELF
/vdf/ DELETE
/vde/subvol/a OPEN,ISDIR
/vde/subvol/a ACCESS,ISDIR
/vde/subvol/a CLOSE_NOWRITE,CLOSE,ISDIR
/vde/subvol/a OPEN,ISDIR
/vde/subvol/a ACCESS,ISDIR
/vde/subvol/a ACCESS,ISDIR
/vde/subvol/a/b OPEN,ISDIR
/vde/subvol/a/b ACCESS,ISDIR
/vde/subvol/a/b CLOSE_NOWRITE,CLOSE,ISDIR
/vde/subvol/a/b OPEN,ISDIR
/vde/subvol/a/b ACCESS,ISDIR
/vde/subvol/a/b ACCESS,ISDIR
/vde/subvol/a/b/c ACCESS,OPEN,ISDIR
/vde/subvol/a/b/c CLOSE_NOWRITE,CLOSE,ISDIR
/vde/subvol/a/b/c OPEN,ISDIR
/vde/subvol/a/b/c ACCESS,ISDIR
/vde/subvol/a/b/c/d OPEN,ISDIR
/vde/subvol/a/b/c/d ACCESS,ISDIR
/vde/subvol/a/b/c/d CLOSE_NOWRITE,CLOSE,ISDIR
/vde/subvol/a/b/c/d OPEN,ISDIR
/vde/subvol/a/b/c/d ACCESS,ISDIR
/vde/subvol/a/b/c/d ACCESS,ISDIR
/vde/subvol/a/b/c/d/e OPEN,ISDIR
/vde/subvol/a/b/c/d/e ACCESS,ISDIR
/vde/subvol/a/b/c/d/e CLOSE_NOWRITE,CLOSE,ISDIR
/vde/subvol/a/b/c/d/e OPEN,ISDIR
/vde/subvol/a/b/c/d/e ACCESS,ISDIR
/vde/subvol/a/b/c/d/e ACCESS,ISDIR
/vde/subvol/a/b/c/d/e/x ATTRIB,DELETE_SELF
/vde/subvol/a/b/c/d/e DELETE
/vde/subvol/a/b/c/d/e CLOSE_NOWRITE,CLOSE,ISDIR
/vde/subvol/a/b/c/d/e DELETE_SELF
/vde/subvol/a/b/c/d DELETE,ISDIR
/vde/subvol/a/b/c/d CLOSE_NOWRITE,CLOSE,ISDIR
/vde/subvol/a/b/c/d DELETE_SELF
/vde/subvol/a/b/c DELETE,ISDIR
/vde/subvol/a/b/c CLOSE_NOWRITE,CLOSE,ISDIR
/vde/subvol/a/b/c DELETE_SELF
/vde/subvol/a/b DELETE,ISDIR
/vde/subvol/a/b CLOSE_NOWRITE,CLOSE,ISDIR
/vde/subvol/a/b DELETE_SELF
/vde/subvol/a DELETE,ISDIR
/vde/subvol/a CLOSE_NOWRITE,CLOSE,ISDIR
/vde/subvol/a DELETE_SELF
/vde/subvol DELETE,ISDIR
=====================

Amir Goldstein (13):
  fsnotify: annotate directory entry modification events
  fsnotify: send all event types to super block marks
  fanotify: rename struct fanotify_{,perm_}event_info
  fanotify: define the structures to report a unique file identifier
  fanotify: classify events that hold a file identifier
  fanotify: encode file identifier for FAN_REPORT_FID
  fanotify: copy event fid info to user
  fanotify: enable FAN_REPORT_FID init flag
  fanotify: cache fsid in fsnotify_mark_connector
  fanotify: check FS_ISDIR flag instead of d_is_dir()
  fanotify: support events with data type FSNOTIFY_EVENT_INODE
  fanotify: add support for create/attrib/move/delete events
  fanotify: report FAN_ONDIR to listener with FAN_REPORT_FID

 fs/notify/fanotify/fanotify.c      | 214 ++++++++++++++++++++++++-----
 fs/notify/fanotify/fanotify.h      |  77 +++++++++--
 fs/notify/fanotify/fanotify_user.c | 213 +++++++++++++++++++++++-----
 fs/notify/fsnotify.c               |  15 +-
 fs/notify/mark.c                   |  77 +++++++++--
 fs/statfs.c                        |   4 +-
 include/linux/fanotify.h           |  33 ++++-
 include/linux/fsnotify.h           |  46 +++++--
 include/linux/fsnotify_backend.h   |  60 +++++---
 include/linux/statfs.h             |   3 +
 include/uapi/linux/fanotify.h      |  46 ++++++-
 11 files changed, 642 insertions(+), 146 deletions(-)

--
2.17.1

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 01/13] fsnotify: annotate directory entry modification events
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-28 12:59   ` Jan Kara
  2018-11-25 13:43 ` [PATCH v3 02/13] fsnotify: send all event types to super block marks Amir Goldstein
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

"dirent" events are referring to events that modify directory entries,
such as create,delete,rename. Those events should always be reported
on a watched directory, regardless if FS_EVENT_ON_CHILD is set
on the watch mask.

fsnotify_nameremove() and fsnotify_move() were modified to no longer
set the FS_EVENT_ON_CHILD event bit. This is a semantic change to
align with the "dirent" event definition. It has no effect on any
existing backend, because dnotify, inotify and audit always requets the
child events and fanotify does not get the delete,rename events.

The fsnotify_dirent() helper is used instead of fsnotify_parent() to
report a dirent event to dentry->d_parent without FS_EVENT_ON_CHILD
and regardless if parent has the FS_EVENT_ON_CHILD bit set.

ALL_FSNOTIFY_DIRENT_EVENTS defines all the dirent event types and
those event types have been extracted out of FS_EVENTS_POSS_ON_CHILD.

That means for a directory with an inotify watch and only dirent
events in the mask (i.e. create,delete,move), all children dentries
will no longer have the DCACHE_FSNOTIFY_PARENT_WATCHED flag set.
This will allow all events that happen on children to be optimized
away in __fsnotify_parent() without the need to dereference
child->d_parent->d_inode->i_fsnotify_mask.

Since the dirent events are never repoted via __fsnotify_parent(),
this results in no change of logic, but only an optimization.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 include/linux/fsnotify.h         | 43 +++++++++++++++++++++++++-------
 include/linux/fsnotify_backend.h | 36 +++++++++++++++-----------
 2 files changed, 55 insertions(+), 24 deletions(-)

diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 2ccb08cb5d6a..098322ec3689 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -17,8 +17,36 @@
 #include <linux/slab.h>
 #include <linux/bug.h>
 
+/*
+ * Notify this @dir inode about a change in the directory entry @dentry.
+ *
+ * Unlike fsnotify_parent(), the event will be reported regardless of the
+ * FS_EVENT_ON_CHILD mask on the parent inode.
+ *
+ * When called with NULL @dir (from fsnotify_nameremove()), the dentry parent
+ * inode is used as the inode to report the event to.
+ */
+static inline int fsnotify_dirent(struct inode *dir, struct dentry *dentry,
+				  __u32 mask)
+{
+	struct dentry *parent = NULL;
+	int ret;
+
+	if (!dir) {
+		parent = dget_parent(dentry);
+		dir = d_inode(parent);
+	}
+
+	ret = fsnotify(dir, mask, d_inode(dentry), FSNOTIFY_EVENT_INODE,
+			dentry->d_name.name, 0);
+
+	dput(parent);
+	return ret;
+}
+
 /* Notify this dentry's parent about a child's events. */
-static inline int fsnotify_parent(const struct path *path, struct dentry *dentry, __u32 mask)
+static inline int fsnotify_parent(const struct path *path,
+				  struct dentry *dentry, __u32 mask)
 {
 	if (!dentry)
 		dentry = path->dentry;
@@ -85,8 +113,8 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
 {
 	struct inode *source = moved->d_inode;
 	u32 fs_cookie = fsnotify_get_cookie();
-	__u32 old_dir_mask = (FS_EVENT_ON_CHILD | FS_MOVED_FROM);
-	__u32 new_dir_mask = (FS_EVENT_ON_CHILD | FS_MOVED_TO);
+	__u32 old_dir_mask = FS_MOVED_FROM;
+	__u32 new_dir_mask = FS_MOVED_TO;
 	const unsigned char *new_name = moved->d_name.name;
 
 	if (old_dir == new_dir)
@@ -136,7 +164,7 @@ static inline void fsnotify_nameremove(struct dentry *dentry, int isdir)
 	if (isdir)
 		mask |= FS_ISDIR;
 
-	fsnotify_parent(NULL, dentry, mask);
+	fsnotify_dirent(NULL, dentry, mask);
 }
 
 /*
@@ -155,7 +183,7 @@ static inline void fsnotify_create(struct inode *inode, struct dentry *dentry)
 {
 	audit_inode_child(inode, dentry, AUDIT_TYPE_CHILD_CREATE);
 
-	fsnotify(inode, FS_CREATE, dentry->d_inode, FSNOTIFY_EVENT_INODE, dentry->d_name.name, 0);
+	fsnotify_dirent(inode, dentry, FS_CREATE);
 }
 
 /*
@@ -176,12 +204,9 @@ static inline void fsnotify_link(struct inode *dir, struct inode *inode, struct
  */
 static inline void fsnotify_mkdir(struct inode *inode, struct dentry *dentry)
 {
-	__u32 mask = (FS_CREATE | FS_ISDIR);
-	struct inode *d_inode = dentry->d_inode;
-
 	audit_inode_child(inode, dentry, AUDIT_TYPE_CHILD_CREATE);
 
-	fsnotify(inode, mask, d_inode, FSNOTIFY_EVENT_INODE, dentry->d_name.name, 0);
+	fsnotify_dirent(inode, dentry, FS_CREATE | FS_ISDIR);
 }
 
 /*
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 7639774e7475..c3cb692c48a0 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -59,27 +59,33 @@
  * dnotify and inotify. */
 #define FS_EVENT_ON_CHILD	0x08000000
 
-/* This is a list of all events that may get sent to a parernt based on fs event
- * happening to inodes inside that directory */
-#define FS_EVENTS_POSS_ON_CHILD   (FS_ACCESS | FS_MODIFY | FS_ATTRIB |\
-				   FS_CLOSE_WRITE | FS_CLOSE_NOWRITE | FS_OPEN |\
-				   FS_MOVED_FROM | FS_MOVED_TO | FS_CREATE |\
-				   FS_DELETE | FS_OPEN_PERM | FS_ACCESS_PERM | \
-				   FS_OPEN_EXEC | FS_OPEN_EXEC_PERM)
-
 #define FS_MOVE			(FS_MOVED_FROM | FS_MOVED_TO)
 
+/*
+ * Directory entry modification events - reported only to directory
+ * where entry is modified and not to a watching parent.
+ * The watching parent may get an FS_ATTRIB|FS_EVENT_ON_CHILD event
+ * when a directory entry inside a child subdir changes.
+ */
+#define ALL_FSNOTIFY_DIRENT_EVENTS	(FS_CREATE | FS_DELETE | FS_MOVE)
+
 #define ALL_FSNOTIFY_PERM_EVENTS (FS_OPEN_PERM | FS_ACCESS_PERM | \
 				  FS_OPEN_EXEC_PERM)
 
+/*
+ * This is a list of all events that may get sent to a parent based on fs event
+ * happening to inodes inside that directory.
+ */
+#define FS_EVENTS_POSS_ON_CHILD   (ALL_FSNOTIFY_PERM_EVENTS | \
+				   FS_ACCESS | FS_MODIFY | FS_ATTRIB | \
+				   FS_CLOSE_WRITE | FS_CLOSE_NOWRITE | \
+				   FS_OPEN | FS_OPEN_EXEC)
+
 /* Events that can be reported to backends */
-#define ALL_FSNOTIFY_EVENTS (FS_ACCESS | FS_MODIFY | FS_ATTRIB | \
-			     FS_CLOSE_WRITE | FS_CLOSE_NOWRITE | FS_OPEN | \
-			     FS_MOVED_FROM | FS_MOVED_TO | FS_CREATE | \
-			     FS_DELETE | FS_DELETE_SELF | FS_MOVE_SELF | \
-			     FS_UNMOUNT | FS_Q_OVERFLOW | FS_IN_IGNORED | \
-			     FS_OPEN_PERM | FS_ACCESS_PERM | FS_DN_RENAME | \
-			     FS_OPEN_EXEC | FS_OPEN_EXEC_PERM)
+#define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS| \
+			     FS_EVENTS_POSS_ON_CHILD | \
+			     FS_DELETE_SELF | FS_MOVE_SELF | FS_DN_RENAME | \
+			     FS_UNMOUNT | FS_Q_OVERFLOW | FS_IN_IGNORED)
 
 /* Extra flags that may be reported with event or control handling of events */
 #define ALL_FSNOTIFY_FLAGS  (FS_EXCL_UNLINK | FS_ISDIR | FS_IN_ONESHOT | \
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 02/13] fsnotify: send all event types to super block marks
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
  2018-11-25 13:43 ` [PATCH v3 01/13] fsnotify: annotate directory entry modification events Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-28 14:26   ` Jan Kara
  2018-11-25 13:43 ` [PATCH v3 03/13] fanotify: rename struct fanotify_{,perm_}event_info Amir Goldstein
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

So far, existence of super block marks was checked only on events with
data type FSNOTIFY_EVENT_PATH. Use the super block of the "to_tell" inode
to report the events of all event types to super block marks.

This change has no effect on current backends. Soon, this will allow
fanotify backend to receive all event types on a super block mark.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fsnotify.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index ecf09b6243d9..df06f3da166c 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -328,16 +328,15 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	     const unsigned char *file_name, u32 cookie)
 {
 	struct fsnotify_iter_info iter_info = {};
-	struct super_block *sb = NULL;
+	struct super_block *sb = to_tell->i_sb;
 	struct mount *mnt = NULL;
-	__u32 mnt_or_sb_mask = 0;
+	__u32 mnt_or_sb_mask = sb->s_fsnotify_mask;
 	int ret = 0;
 	__u32 test_mask = (mask & ALL_FSNOTIFY_EVENTS);
 
 	if (data_is == FSNOTIFY_EVENT_PATH) {
 		mnt = real_mount(((const struct path *)data)->mnt);
-		sb = mnt->mnt.mnt_sb;
-		mnt_or_sb_mask = mnt->mnt_fsnotify_mask | sb->s_fsnotify_mask;
+		mnt_or_sb_mask |= mnt->mnt_fsnotify_mask;
 	}
 	/* An event "on child" is not intended for a mount/sb mark */
 	if (mask & FS_EVENT_ON_CHILD)
@@ -350,8 +349,8 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	 * SRCU because we have no references to any objects and do not
 	 * need SRCU to keep them "alive".
 	 */
-	if (!to_tell->i_fsnotify_marks &&
-	    (!mnt || (!mnt->mnt_fsnotify_marks && !sb->s_fsnotify_marks)))
+	if (!to_tell->i_fsnotify_marks && !sb->s_fsnotify_marks &&
+	    (!mnt || !mnt->mnt_fsnotify_marks))
 		return 0;
 	/*
 	 * if this is a modify event we may need to clear the ignored masks
@@ -366,11 +365,11 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 
 	iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
 		fsnotify_first_mark(&to_tell->i_fsnotify_marks);
+	iter_info.marks[FSNOTIFY_OBJ_TYPE_SB] =
+		fsnotify_first_mark(&sb->s_fsnotify_marks);
 	if (mnt) {
 		iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
 			fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
-		iter_info.marks[FSNOTIFY_OBJ_TYPE_SB] =
-			fsnotify_first_mark(&sb->s_fsnotify_marks);
 	}
 
 	/*
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 03/13] fanotify: rename struct fanotify_{,perm_}event_info
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
  2018-11-25 13:43 ` [PATCH v3 01/13] fsnotify: annotate directory entry modification events Amir Goldstein
  2018-11-25 13:43 ` [PATCH v3 02/13] fsnotify: send all event types to super block marks Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-28 14:29   ` Jan Kara
  2018-11-25 13:43 ` [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier Amir Goldstein
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

struct fanotify_event_info "inherits" from struct fsnotify_event and
therefore a more appropriate (and short) name for it is fanotify_event.
Same for struct fanotify_perm_event_info, which now "inherits" from
struct fanotify_event.

We plan to reuse the name struct fanotify_event_info for user visible
event info record format.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      | 14 +++++++-------
 fs/notify/fanotify/fanotify.h      | 16 ++++++++--------
 fs/notify/fanotify/fanotify_user.c | 20 ++++++++++----------
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 3723f3d18d20..ecd5f4aec624 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -19,7 +19,7 @@
 static bool should_merge(struct fsnotify_event *old_fsn,
 			 struct fsnotify_event *new_fsn)
 {
-	struct fanotify_event_info *old, *new;
+	struct fanotify_event *old, *new;
 
 	pr_debug("%s: old=%p new=%p\n", __func__, old_fsn, new_fsn);
 	old = FANOTIFY_E(old_fsn);
@@ -58,7 +58,7 @@ static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
 }
 
 static int fanotify_get_response(struct fsnotify_group *group,
-				 struct fanotify_perm_event_info *event,
+				 struct fanotify_perm_event *event,
 				 struct fsnotify_iter_info *iter_info)
 {
 	int ret;
@@ -141,11 +141,11 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
 		~marks_ignored_mask;
 }
 
-struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group,
+struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 						 struct inode *inode, u32 mask,
 						 const struct path *path)
 {
-	struct fanotify_event_info *event = NULL;
+	struct fanotify_event *event = NULL;
 	gfp_t gfp = GFP_KERNEL_ACCOUNT;
 
 	/*
@@ -160,7 +160,7 @@ struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group,
 	memalloc_use_memcg(group->memcg);
 
 	if (fanotify_is_perm_event(mask)) {
-		struct fanotify_perm_event_info *pevent;
+		struct fanotify_perm_event *pevent;
 
 		pevent = kmem_cache_alloc(fanotify_perm_event_cachep, gfp);
 		if (!pevent)
@@ -197,7 +197,7 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 				 struct fsnotify_iter_info *iter_info)
 {
 	int ret = 0;
-	struct fanotify_event_info *event;
+	struct fanotify_event *event;
 	struct fsnotify_event *fsn_event;
 
 	BUILD_BUG_ON(FAN_ACCESS != FS_ACCESS);
@@ -275,7 +275,7 @@ static void fanotify_free_group_priv(struct fsnotify_group *group)
 
 static void fanotify_free_event(struct fsnotify_event *fsn_event)
 {
-	struct fanotify_event_info *event;
+	struct fanotify_event *event;
 
 	event = FANOTIFY_E(fsn_event);
 	path_put(&event->path);
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index ea05b8a401e7..fb84dd3289f8 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -12,7 +12,7 @@ extern struct kmem_cache *fanotify_perm_event_cachep;
  * fanotify_handle_event() and freed when the information is retrieved by
  * userspace
  */
-struct fanotify_event_info {
+struct fanotify_event {
 	struct fsnotify_event fse;
 	/*
 	 * We hold ref to this path so it may be dereferenced at any point
@@ -29,16 +29,16 @@ struct fanotify_event_info {
  * group->notification_list to group->fanotify_data.access_list to wait for
  * user response.
  */
-struct fanotify_perm_event_info {
-	struct fanotify_event_info fae;
+struct fanotify_perm_event {
+	struct fanotify_event fae;
 	int response;	/* userspace answer to question */
 	int fd;		/* fd we passed to userspace for this event */
 };
 
-static inline struct fanotify_perm_event_info *
+static inline struct fanotify_perm_event *
 FANOTIFY_PE(struct fsnotify_event *fse)
 {
-	return container_of(fse, struct fanotify_perm_event_info, fae.fse);
+	return container_of(fse, struct fanotify_perm_event, fae.fse);
 }
 
 static inline bool fanotify_is_perm_event(u32 mask)
@@ -47,11 +47,11 @@ static inline bool fanotify_is_perm_event(u32 mask)
 		mask & FANOTIFY_PERM_EVENTS;
 }
 
-static inline struct fanotify_event_info *FANOTIFY_E(struct fsnotify_event *fse)
+static inline struct fanotify_event *FANOTIFY_E(struct fsnotify_event *fse)
 {
-	return container_of(fse, struct fanotify_event_info, fse);
+	return container_of(fse, struct fanotify_event, fse);
 }
 
-struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group,
+struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 						 struct inode *inode, u32 mask,
 						 const struct path *path);
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index e03be5071362..2dbb2662a92f 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -73,7 +73,7 @@ static struct fsnotify_event *get_one_event(struct fsnotify_group *group,
 }
 
 static int create_fd(struct fsnotify_group *group,
-		     struct fanotify_event_info *event,
+		     struct fanotify_event *event,
 		     struct file **file)
 {
 	int client_fd;
@@ -120,13 +120,13 @@ static int fill_event_metadata(struct fsnotify_group *group,
 			       struct file **file)
 {
 	int ret = 0;
-	struct fanotify_event_info *event;
+	struct fanotify_event *event;
 
 	pr_debug("%s: group=%p metadata=%p event=%p\n", __func__,
 		 group, metadata, fsn_event);
 
 	*file = NULL;
-	event = container_of(fsn_event, struct fanotify_event_info, fse);
+	event = container_of(fsn_event, struct fanotify_event, fse);
 	metadata->event_len = FAN_EVENT_METADATA_LEN;
 	metadata->metadata_len = FAN_EVENT_METADATA_LEN;
 	metadata->vers = FANOTIFY_METADATA_VERSION;
@@ -144,10 +144,10 @@ static int fill_event_metadata(struct fsnotify_group *group,
 	return ret;
 }
 
-static struct fanotify_perm_event_info *dequeue_event(
+static struct fanotify_perm_event *dequeue_event(
 				struct fsnotify_group *group, int fd)
 {
-	struct fanotify_perm_event_info *event, *return_e = NULL;
+	struct fanotify_perm_event *event, *return_e = NULL;
 
 	spin_lock(&group->notification_lock);
 	list_for_each_entry(event, &group->fanotify_data.access_list,
@@ -169,7 +169,7 @@ static struct fanotify_perm_event_info *dequeue_event(
 static int process_access_response(struct fsnotify_group *group,
 				   struct fanotify_response *response_struct)
 {
-	struct fanotify_perm_event_info *event;
+	struct fanotify_perm_event *event;
 	int fd = response_struct->fd;
 	int response = response_struct->response;
 
@@ -364,7 +364,7 @@ static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t
 static int fanotify_release(struct inode *ignored, struct file *file)
 {
 	struct fsnotify_group *group = file->private_data;
-	struct fanotify_perm_event_info *event, *next;
+	struct fanotify_perm_event *event, *next;
 	struct fsnotify_event *fsn_event;
 
 	/*
@@ -682,7 +682,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	struct fsnotify_group *group;
 	int f_flags, fd;
 	struct user_struct *user;
-	struct fanotify_event_info *oevent;
+	struct fanotify_event *oevent;
 
 	pr_debug("%s: flags=%x event_f_flags=%x\n",
 		 __func__, flags, event_f_flags);
@@ -949,10 +949,10 @@ static int __init fanotify_user_setup(void)
 
 	fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
 					 SLAB_PANIC|SLAB_ACCOUNT);
-	fanotify_event_cachep = KMEM_CACHE(fanotify_event_info, SLAB_PANIC);
+	fanotify_event_cachep = KMEM_CACHE(fanotify_event, SLAB_PANIC);
 	if (IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS)) {
 		fanotify_perm_event_cachep =
-			KMEM_CACHE(fanotify_perm_event_info, SLAB_PANIC);
+			KMEM_CACHE(fanotify_perm_event, SLAB_PANIC);
 	}
 
 	return 0;
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
                   ` (2 preceding siblings ...)
  2018-11-25 13:43 ` [PATCH v3 03/13] fanotify: rename struct fanotify_{,perm_}event_info Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-28 15:27   ` Jan Kara
  2018-11-25 13:43 ` [PATCH v3 05/13] fanotify: classify events that hold a " Amir Goldstein
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

When user requests the flag FAN_REPORT_FID in fanotify_init(),
a unique file indetifier of the event target object will be reported
with the event.

This commit only defines the internal and user visible structures used
to store and report the unique file identifier.

The file identifier includes the filesystem's fsid (i.e. from statfs(2))
and an NFS file handle of the file (i.e. from name_to_handle_at(2)).

The file identifier makes holding the path reference and passing a file
descriptor to user redundant, so those are disabled in a group with
FAN_REPORT_FID.

Cc: <linux-api@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      |  2 +-
 fs/notify/fanotify/fanotify.h      | 26 ++++++++++++++++----
 fs/notify/fanotify/fanotify_user.c |  5 ++--
 include/uapi/linux/fanotify.h      | 38 +++++++++++++++++++++++++-----
 4 files changed, 57 insertions(+), 14 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index ecd5f4aec624..59d093923c97 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -178,7 +178,7 @@ init: __maybe_unused
 		event->pid = get_pid(task_pid(current));
 	else
 		event->pid = get_pid(task_tgid(current));
-	if (path) {
+	if (path && !FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
 		event->path = *path;
 		path_get(&event->path);
 	} else {
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index fb84dd3289f8..2e4fca30afda 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -7,6 +7,14 @@ extern struct kmem_cache *fanotify_mark_cache;
 extern struct kmem_cache *fanotify_event_cachep;
 extern struct kmem_cache *fanotify_perm_event_cachep;
 
+/* The size of the variable length buffer storing fsid and file handle */
+#define FANOTIFY_FID_LEN(handle_bytes)	\
+	(sizeof(struct fanotify_event_fid) + (handle_bytes))
+
+struct fanotify_info {
+	struct fanotify_event_fid *fid;
+};
+
 /*
  * Structure for normal fanotify events. It gets allocated in
  * fanotify_handle_event() and freed when the information is retrieved by
@@ -14,11 +22,19 @@ extern struct kmem_cache *fanotify_perm_event_cachep;
  */
 struct fanotify_event {
 	struct fsnotify_event fse;
-	/*
-	 * We hold ref to this path so it may be dereferenced at any point
-	 * during this object's lifetime
-	 */
-	struct path path;
+	union {
+		/*
+		 * We hold ref to this path so it may be dereferenced at any
+		 * point during this object's lifetime
+		 */
+		struct path path;
+		/*
+		 * With FAN_REPORT_FID, we do not hold any reference on the
+		 * victim object. Instead we store its NFS file handle and its
+		 * filesystem's fsid as a unique identifier.
+		 */
+		struct fanotify_info info;
+	};
 	struct pid *pid;
 };
 
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 2dbb2662a92f..93e1aa2a389f 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -133,9 +133,10 @@ static int fill_event_metadata(struct fsnotify_group *group,
 	metadata->reserved = 0;
 	metadata->mask = fsn_event->mask & FANOTIFY_OUTGOING_EVENTS;
 	metadata->pid = pid_vnr(event->pid);
-	if (unlikely(fsn_event->mask & FAN_Q_OVERFLOW))
+	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID) ||
+	    unlikely(fsn_event->mask & FAN_Q_OVERFLOW)) {
 		metadata->fd = FAN_NOFD;
-	else {
+	} else {
 		metadata->fd = create_fd(group, event, file);
 		if (metadata->fd < 0)
 			ret = metadata->fd;
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index 909c98fcace2..0fd8736269c4 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -44,6 +44,7 @@
 
 /* Flags to determine fanotify event format */
 #define FAN_REPORT_TID		0x00000100	/* event->pid is thread id */
+#define FAN_REPORT_FID		0x00000200	/* Report unique file id */
 
 /* Deprecated - do not use this in programs and do not add new flags here! */
 #define FAN_ALL_INIT_FLAGS	(FAN_CLOEXEC | FAN_NONBLOCK | \
@@ -106,6 +107,24 @@ struct fanotify_event_metadata {
 	__s32 pid;
 };
 
+#define FAN_EVENT_INFO_TYPE_FID		1
+
+/* Variable length info record header following event metadata */
+struct fanotify_event_info {
+	__u8 info_type;
+	__u8 reserved;
+	__u16 info_len;
+	unsigned char info[0];
+};
+
+/* Unique file identifier info record */
+struct fanotify_event_fid {
+	__kernel_fsid_t fsid;
+	__u32 handle_bytes;
+	__s32 handle_type;
+	unsigned char f_handle[0];
+};
+
 struct fanotify_response {
 	__s32 fd;
 	__u32 response;
@@ -122,12 +141,19 @@ struct fanotify_response {
 /* Helper functions to deal with fanotify_event_metadata buffers */
 #define FAN_EVENT_METADATA_LEN (sizeof(struct fanotify_event_metadata))
 
-#define FAN_EVENT_NEXT(meta, len) ((len) -= (meta)->event_len, \
-				   (struct fanotify_event_metadata*)(((char *)(meta)) + \
-				   (meta)->event_len))
+#define FAN_EVENT_NEXT(meta, len) \
+	((len) -= (meta)->event_len, \
+	 (struct fanotify_event_metadata *)(((char *)(meta)) + \
+					   (meta)->event_len))
+
+#define FAN_EVENT_OK(meta, len)	\
+	((long)(len) >= (long)FAN_EVENT_METADATA_LEN && \
+	 (long)(meta)->event_len >= (long)FAN_EVENT_METADATA_LEN && \
+	 (long)(meta)->event_len <= (long)(len))
 
-#define FAN_EVENT_OK(meta, len)	((long)(len) >= (long)FAN_EVENT_METADATA_LEN && \
-				(long)(meta)->event_len >= (long)FAN_EVENT_METADATA_LEN && \
-				(long)(meta)->event_len <= (long)(len))
+/* Get the first event info record if one exists */
+#define FAN_EVENT_INFO(meta)	\
+	((long)(meta)->event_len > (long)FAN_EVENT_METADATA_LEN ? \
+	 (struct fanotify_event_info *)((meta) + 1) : NULL)
 
 #endif /* _UAPI_LINUX_FANOTIFY_H */
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 05/13] fanotify: classify events that hold a file identifier
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
                   ` (3 preceding siblings ...)
  2018-11-25 13:43 ` [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-28 15:33   ` Jan Kara
  2018-11-25 13:43 ` [PATCH v3 06/13] fanotify: encode file identifier for FAN_REPORT_FID Amir Goldstein
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

With group flag FAN_REPORT_FID, we do not store event->path.
Instead, we will store the file handle and fsid in event->info.fid.

Because event->info and event->path share the same union space, set
bit 0 of event->info.flags, so we know how to free and compare the
event objects.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      | 35 ++++++++++++++++++++++++------
 fs/notify/fanotify/fanotify.h      | 27 +++++++++++++++++++++++
 fs/notify/fanotify/fanotify_user.c |  3 +++
 3 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 59d093923c97..8192d4b1db21 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -25,11 +25,16 @@ static bool should_merge(struct fsnotify_event *old_fsn,
 	old = FANOTIFY_E(old_fsn);
 	new = FANOTIFY_E(new_fsn);
 
-	if (old_fsn->inode == new_fsn->inode && old->pid == new->pid &&
-	    old->path.mnt == new->path.mnt &&
-	    old->path.dentry == new->path.dentry)
-		return true;
-	return false;
+	if (old_fsn->inode != new_fsn->inode || old->pid != new->pid)
+		return false;
+
+	if (FANOTIFY_HAS_FID(new)) {
+		return FANOTIFY_HAS_FID(old) &&
+			fanotify_fid_equal(old->info.fid, new->info.fid);
+	} else {
+		return old->path.mnt == new->path.mnt &&
+			old->path.dentry == new->path.dentry;
+	}
 }
 
 /* and the list better be locked by something too! */
@@ -178,7 +183,10 @@ init: __maybe_unused
 		event->pid = get_pid(task_pid(current));
 	else
 		event->pid = get_pid(task_tgid(current));
-	if (path && !FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
+	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
+		/* TODO: allocate buffer and encode file handle */
+		fanotify_set_fid(event, NULL);
+	} else if (path) {
 		event->path = *path;
 		path_get(&event->path);
 	} else {
@@ -277,8 +285,21 @@ static void fanotify_free_event(struct fsnotify_event *fsn_event)
 {
 	struct fanotify_event *event;
 
+	/*
+	 * event->path and event->info are a union. Make sure that
+	 * event->info.flags overlaps with path.dentry pointer, so it is safe
+	 * to use bit 0 to classify the event info type (FANOTIFY_HAS_FID).
+	 */
+	BUILD_BUG_ON(offsetof(struct fanotify_event, path.dentry) !=
+		     offsetof(struct fanotify_event, info.flags));
+	BUILD_BUG_ON(!__builtin_types_compatible_p(typeof(event->info.flags),
+						   unsigned long));
+
 	event = FANOTIFY_E(fsn_event);
-	path_put(&event->path);
+	if (FANOTIFY_HAS_FID(event))
+		kfree(event->info.fid);
+	else
+		path_put(&event->path);
 	put_pid(event->pid);
 	if (fanotify_is_perm_event(fsn_event->mask)) {
 		kmem_cache_free(fanotify_perm_event_cachep,
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 2e4fca30afda..a79dcbd41702 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -13,8 +13,20 @@ extern struct kmem_cache *fanotify_perm_event_cachep;
 
 struct fanotify_info {
 	struct fanotify_event_fid *fid;
+	unsigned long flags;
 };
 
+static inline bool fanotify_fid_equal(const struct fanotify_event_fid *fid1,
+				      const struct fanotify_event_fid *fid2)
+{
+	if (fid1 == fid2)
+		return true;
+
+	return fid1 && fid2 && fid1->handle_bytes == fid2->handle_bytes &&
+		!memcmp((void *)fid1, (void *)fid2,
+			FANOTIFY_FID_LEN(fid1->handle_bytes));
+}
+
 /*
  * Structure for normal fanotify events. It gets allocated in
  * fanotify_handle_event() and freed when the information is retrieved by
@@ -38,6 +50,21 @@ struct fanotify_event {
 	struct pid *pid;
 };
 
+/*
+ * Bit 0 of info.flags is set when event has fid information.
+ * event->info shares the same union address with event->path, so this helps
+ * us tell if event has fid or path.
+ */
+#define __FANOTIFY_FID		1UL
+#define FANOTIFY_HAS_FID(event)	((event)->info.flags & __FANOTIFY_FID)
+
+static inline void fanotify_set_fid(struct fanotify_event *event,
+				    struct fanotify_event_fid *fid)
+{
+	event->info.fid = fid;
+	event->info.flags = fid ? __FANOTIFY_FID : 0;
+}
+
 /*
  * Structure for permission fanotify events. It gets allocated and freed in
  * fanotify_handle_event() since we wait there for user response. When the
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 93e1aa2a389f..47e8bf3bcd28 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -81,6 +81,9 @@ static int create_fd(struct fsnotify_group *group,
 
 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
 
+	if (WARN_ON_ONCE(FANOTIFY_HAS_FID(event)))
+		return -EBADF;
+
 	client_fd = get_unused_fd_flags(group->fanotify_data.f_flags);
 	if (client_fd < 0)
 		return client_fd;
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 06/13] fanotify: encode file identifier for FAN_REPORT_FID
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
                   ` (4 preceding siblings ...)
  2018-11-25 13:43 ` [PATCH v3 05/13] fanotify: classify events that hold a " Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-25 13:43 ` [PATCH v3 07/13] fanotify: copy event fid info to user Amir Goldstein
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

Allocate and encode event->info.fid for a group with FAN_REPORT_FID.
Treat failure to allocate fid buffer as failure to allocate event.
Treat failure to encode fid by printing a warning but queueing
the event without the fid information.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c | 59 +++++++++++++++++++++++++++++++++--
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 8192d4b1db21..844b748f0b74 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -13,6 +13,8 @@
 #include <linux/wait.h>
 #include <linux/audit.h>
 #include <linux/sched/mm.h>
+#include <linux/statfs.h>
+#include <linux/exportfs.h>
 
 #include "fanotify.h"
 
@@ -146,11 +148,55 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
 		~marks_ignored_mask;
 }
 
+static struct fanotify_event_fid *fanotify_alloc_fid(const struct path *path,
+						     gfp_t gfp)
+{
+	struct fanotify_event_fid *fid = NULL;
+	int dwords, bytes = 0;
+	struct kstatfs stat;
+	int err, type;
+
+	dwords = 0;
+	err = -ENOENT;
+	type = exportfs_encode_fh(path->dentry, NULL, &dwords,  0);
+	if (!dwords)
+		goto out_err;
+
+	err = vfs_statfs(path, &stat);
+	if (err)
+		goto out_err;
+
+	/* Treat failure to allocate fid as failure to allocate event */
+	bytes = dwords << 2;
+	fid = kmalloc(FANOTIFY_FID_LEN(bytes), gfp);
+	if (!fid)
+		return NULL;
+
+	type = exportfs_encode_fh(path->dentry, (struct fid *)fid->f_handle,
+				  &dwords,  0);
+	err = -EINVAL;
+	if (type == FILEID_INVALID || bytes != dwords << 2)
+		goto out_err;
+
+	fid->handle_bytes = bytes;
+	fid->handle_type = type;
+	fid->fsid = stat.f_fsid;
+
+	return fid;
+
+out_err:
+	pr_warn_ratelimited("fanotify: failed to encode fid of %pd2 (bytes=%d, err=%i)\n",
+			    path->dentry, bytes, err);
+	kfree(fid);
+	return ERR_PTR(err);
+}
+
 struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 						 struct inode *inode, u32 mask,
 						 const struct path *path)
 {
 	struct fanotify_event *event = NULL;
+	struct fanotify_event_fid *fid = NULL;
 	gfp_t gfp = GFP_KERNEL_ACCOUNT;
 
 	/*
@@ -164,6 +210,16 @@ struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	/* Whoever is interested in the event, pays for the allocation. */
 	memalloc_use_memcg(group->memcg);
 
+	if (path && FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
+		fid = fanotify_alloc_fid(path, gfp);
+		/* Treat failure to allocate fid as failure to allocate event */
+		if (!fid)
+			goto out;
+		/* Report the event without a file identifier on encode error */
+		if (IS_ERR(fid))
+			fid = NULL;
+	}
+
 	if (fanotify_is_perm_event(mask)) {
 		struct fanotify_perm_event *pevent;
 
@@ -184,8 +240,7 @@ init: __maybe_unused
 	else
 		event->pid = get_pid(task_tgid(current));
 	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
-		/* TODO: allocate buffer and encode file handle */
-		fanotify_set_fid(event, NULL);
+		fanotify_set_fid(event, fid);
 	} else if (path) {
 		event->path = *path;
 		path_get(&event->path);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 07/13] fanotify: copy event fid info to user
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
                   ` (5 preceding siblings ...)
  2018-11-25 13:43 ` [PATCH v3 06/13] fanotify: encode file identifier for FAN_REPORT_FID Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-29  9:00   ` Jan Kara
  2018-11-25 13:43 ` [PATCH v3 08/13] fanotify: enable FAN_REPORT_FID init flag Amir Goldstein
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

If group requested FAN_REPORT_FID and event has file identifier
copy that information to user reading the event after reading
event metadata.
metadata->event_len includes the length of the fid information.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.h      |  3 ++
 fs/notify/fanotify/fanotify_user.c | 72 +++++++++++++++++++++++++++---
 2 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index a79dcbd41702..0e57fa0674d7 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -10,6 +10,9 @@ extern struct kmem_cache *fanotify_perm_event_cachep;
 /* The size of the variable length buffer storing fsid and file handle */
 #define FANOTIFY_FID_LEN(handle_bytes)	\
 	(sizeof(struct fanotify_event_fid) + (handle_bytes))
+#define FANOTIFY_FID_INFO_LEN(event)	\
+	(sizeof(struct fanotify_event_info) + \
+	FANOTIFY_FID_LEN((event)->info.fid->handle_bytes))
 
 struct fanotify_info {
 	struct fanotify_event_fid *fid;
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 47e8bf3bcd28..ea8e81a3e80b 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -47,6 +47,16 @@ struct kmem_cache *fanotify_mark_cache __read_mostly;
 struct kmem_cache *fanotify_event_cachep __read_mostly;
 struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
 
+static int round_event_fid_len(struct fsnotify_event *fsn_event)
+{
+	struct fanotify_event *event = FANOTIFY_E(fsn_event);
+
+	if (!FANOTIFY_HAS_FID(event))
+		return 0;
+
+	return roundup(FANOTIFY_FID_INFO_LEN(event), FAN_EVENT_METADATA_LEN);
+}
+
 /*
  * Get an fsnotify notification event if one exists and is small
  * enough to fit in "count". Return an error pointer if the count
@@ -57,6 +67,9 @@ struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
 static struct fsnotify_event *get_one_event(struct fsnotify_group *group,
 					    size_t count)
 {
+	size_t event_size = FAN_EVENT_METADATA_LEN;
+	struct fsnotify_event *event;
+
 	assert_spin_locked(&group->notification_lock);
 
 	pr_debug("%s: group=%p count=%zd\n", __func__, group, count);
@@ -64,11 +77,18 @@ static struct fsnotify_event *get_one_event(struct fsnotify_group *group,
 	if (fsnotify_notify_queue_is_empty(group))
 		return NULL;
 
-	if (FAN_EVENT_METADATA_LEN > count)
+	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
+		event = fsnotify_peek_first_event(group);
+		event_size += round_event_fid_len(event);
+	}
+
+	if (event_size > count)
 		return ERR_PTR(-EINVAL);
 
-	/* held the notification_lock the whole time, so this is the
-	 * same event we peeked above */
+	/*
+	 * Held the notification_lock the whole time, so this is the
+	 * same event we peeked above
+	 */
 	return fsnotify_remove_first_event(group);
 }
 
@@ -129,7 +149,7 @@ static int fill_event_metadata(struct fsnotify_group *group,
 		 group, metadata, fsn_event);
 
 	*file = NULL;
-	event = container_of(fsn_event, struct fanotify_event, fse);
+	event = FANOTIFY_E(fsn_event);
 	metadata->event_len = FAN_EVENT_METADATA_LEN;
 	metadata->metadata_len = FAN_EVENT_METADATA_LEN;
 	metadata->vers = FANOTIFY_METADATA_VERSION;
@@ -139,6 +159,7 @@ static int fill_event_metadata(struct fsnotify_group *group,
 	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID) ||
 	    unlikely(fsn_event->mask & FAN_Q_OVERFLOW)) {
 		metadata->fd = FAN_NOFD;
+		metadata->event_len += round_event_fid_len(fsn_event);
 	} else {
 		metadata->fd = create_fd(group, event, file);
 		if (metadata->fd < 0)
@@ -208,6 +229,38 @@ static int process_access_response(struct fsnotify_group *group,
 	return 0;
 }
 
+static int copy_fid_to_user(struct fsnotify_event *fsn_event, char __user *buf)
+{
+	struct fanotify_event *event = FANOTIFY_E(fsn_event);
+	struct fanotify_event_info ei;
+	size_t fid_len;
+	size_t pad_len = round_event_fid_len(fsn_event);
+
+	if (!pad_len)
+		return 0;
+
+	/* Copy event info header followed by fid buffer */
+	fid_len = FANOTIFY_FID_INFO_LEN(event);
+	pad_len -= fid_len;
+	ei.info_type = FAN_EVENT_INFO_TYPE_FID;
+	ei.reserved = 0;
+	ei.info_len = fid_len;
+	if (copy_to_user(buf, &ei, sizeof(ei)))
+		return -EFAULT;
+
+	buf += sizeof(ei);
+	fid_len -= sizeof(ei);
+	if (copy_to_user(buf, event->info.fid, fid_len))
+		return -EFAULT;
+
+	/* Pad with 0's */
+	buf += fid_len;
+	if (pad_len && clear_user(buf, pad_len))
+		return -EFAULT;
+
+	return 0;
+}
+
 static ssize_t copy_event_to_user(struct fsnotify_group *group,
 				  struct fsnotify_event *event,
 				  char __user *buf)
@@ -224,15 +277,20 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 
 	fd = fanotify_event_metadata.fd;
 	ret = -EFAULT;
-	if (copy_to_user(buf, &fanotify_event_metadata,
-			 fanotify_event_metadata.event_len))
+	if (copy_to_user(buf, &fanotify_event_metadata, FAN_EVENT_METADATA_LEN))
 		goto out_close_fd;
 
 	if (fanotify_is_perm_event(event->mask))
 		FANOTIFY_PE(event)->fd = fd;
 
-	if (fd != FAN_NOFD)
+	if (fd != FAN_NOFD) {
 		fd_install(fd, f);
+	} else if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
+		ret = copy_fid_to_user(event, buf + FAN_EVENT_METADATA_LEN);
+		if (ret < 0)
+			return ret;
+	}
+
 	return fanotify_event_metadata.event_len;
 
 out_close_fd:
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 08/13] fanotify: enable FAN_REPORT_FID init flag
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
                   ` (6 preceding siblings ...)
  2018-11-25 13:43 ` [PATCH v3 07/13] fanotify: copy event fid info to user Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-29  9:46   ` Jan Kara
  2018-11-25 13:43 ` [PATCH v3 09/13] fanotify: cache fsid in fsnotify_mark_connector Amir Goldstein
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

When setting up an fanotify listener, user may request to get fid
information in event instead of an open file descriptor.

The fid obtained with event on a watched object contains the file
handle returned by name_to_handle_at(2) and fsid returned by statfs(2).

When setting a mark, we need to make sure that the filesystem
supports encoding file handles with name_to_handle_at(2) and that
statfs(2) encodes a non-zero fsid.

Cc: <linux-api@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify_user.c | 54 +++++++++++++++++++++++++++++-
 fs/statfs.c                        |  4 ++-
 include/linux/fanotify.h           |  2 +-
 include/linux/statfs.h             |  3 ++
 4 files changed, 60 insertions(+), 3 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index ea8e81a3e80b..d7aa2f392a64 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -17,6 +17,8 @@
 #include <linux/compat.h>
 #include <linux/sched/signal.h>
 #include <linux/memcontrol.h>
+#include <linux/statfs.h>
+#include <linux/exportfs.h>
 
 #include <asm/ioctls.h>
 
@@ -857,6 +859,49 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	return fd;
 }
 
+/* Check if filesystem can encode a unique fid */
+static int fanotify_test_fid(struct path *path)
+{
+	struct kstatfs stat, root_stat;
+	int err;
+
+	/*
+	 * Make sure path is not in filesystem with zero fsid (e.g. tmpfs).
+	 * TODO: cache fsid in the mark connector.
+	 */
+	err = vfs_statfs(path, &stat);
+	if (err)
+		return err;
+
+	if (!stat.f_fsid.val[0] && !stat.f_fsid.val[1])
+		return -ENODEV;
+
+	/*
+	 * Make sure path is not inside a filesystem subvolume (e.g. btrfs)
+	 * which uses a different fsid than sb root.
+	 */
+	err = statfs_by_dentry(path->dentry->d_sb->s_root, &root_stat);
+	if (err)
+		return err;
+
+	if (root_stat.f_fsid.val[0] != stat.f_fsid.val[0] ||
+	    root_stat.f_fsid.val[1] != stat.f_fsid.val[1])
+		return -EXDEV;
+
+	/*
+	 * We need to make sure that the file system supports at least
+	 * encoding a file handle so user can use name_to_handle_at() to
+	 * compare fid returned with event to the file handle of watched
+	 * objects. However, name_to_handle_at() requires that the
+	 * filesystem also supports decoding file handles.
+	 */
+	if (!path->dentry->d_sb->s_export_op ||
+	    !path->dentry->d_sb->s_export_op->fh_to_dentry)
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
 static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 			    int dfd, const char  __user *pathname)
 {
@@ -942,6 +987,12 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 	if (ret)
 		goto fput_and_out;
 
+	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
+		ret = fanotify_test_fid(&path);
+		if (ret)
+			goto path_put_and_out;
+	}
+
 	/* inode held in place by reference to path; group by fget on fd */
 	if (mark_type == FAN_MARK_INODE)
 		inode = path.dentry->d_inode;
@@ -970,6 +1021,7 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 		ret = -EINVAL;
 	}
 
+path_put_and_out:
 	path_put(&path);
 fput_and_out:
 	fdput(f);
@@ -1006,7 +1058,7 @@ COMPAT_SYSCALL_DEFINE6(fanotify_mark,
  */
 static int __init fanotify_user_setup(void)
 {
-	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 7);
+	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 8);
 	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9);
 
 	fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
diff --git a/fs/statfs.c b/fs/statfs.c
index f0216629621d..6a5d840a2d8d 100644
--- a/fs/statfs.c
+++ b/fs/statfs.c
@@ -50,7 +50,8 @@ static int calculate_f_flags(struct vfsmount *mnt)
 		flags_by_sb(mnt->mnt_sb->s_flags);
 }
 
-static int statfs_by_dentry(struct dentry *dentry, struct kstatfs *buf)
+/* Does not set buf->f_flags */
+int statfs_by_dentry(struct dentry *dentry, struct kstatfs *buf)
 {
 	int retval;
 
@@ -66,6 +67,7 @@ static int statfs_by_dentry(struct dentry *dentry, struct kstatfs *buf)
 		buf->f_frsize = buf->f_bsize;
 	return retval;
 }
+EXPORT_SYMBOL(statfs_by_dentry);
 
 int vfs_statfs(const struct path *path, struct kstatfs *buf)
 {
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index 9e2142795335..f59be967f72b 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -19,7 +19,7 @@
 				 FAN_CLASS_PRE_CONTENT)
 
 #define FANOTIFY_INIT_FLAGS	(FANOTIFY_CLASS_BITS | \
-				 FAN_REPORT_TID | \
+				 FAN_REPORT_TID | FAN_REPORT_FID | \
 				 FAN_CLOEXEC | FAN_NONBLOCK | \
 				 FAN_UNLIMITED_QUEUE | FAN_UNLIMITED_MARKS)
 
diff --git a/include/linux/statfs.h b/include/linux/statfs.h
index 3142e98546ac..2c3ca7cb8c98 100644
--- a/include/linux/statfs.h
+++ b/include/linux/statfs.h
@@ -41,4 +41,7 @@ struct kstatfs {
 #define ST_NODIRATIME	0x0800	/* do not update directory access times */
 #define ST_RELATIME	0x1000	/* update atime relative to mtime/ctime */
 
+struct dentry;
+extern int statfs_by_dentry(struct dentry *dentry, struct kstatfs *stat);
+
 #endif
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 09/13] fanotify: cache fsid in fsnotify_mark_connector
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
                   ` (7 preceding siblings ...)
  2018-11-25 13:43 ` [PATCH v3 08/13] fanotify: enable FAN_REPORT_FID init flag Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-29 10:48   ` Jan Kara
  2018-11-25 13:43 ` [PATCH v3 10/13] fanotify: check FS_ISDIR flag instead of d_is_dir() Amir Goldstein
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

For FAN_REPORT_FID, we need to encode fid with fsid of the filesystem on
every event. To avoid having to call vfs_statfs() on every event to get
fsid, we store the fsid in fsnotify_mark_connector on the first time we
add a mark and on handle event we use the cached fsid.

Subsequent calls to add mark on the same object are expected to pass the
same fsid, so the call will fail on cached fsid mismatch.

Similarly, if an event is reported on several mark types (inode, mount,
filesystem), all connectors should already have the same fsid and we
warn if we detect a mismatch.

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      | 42 ++++++++++------
 fs/notify/fanotify/fanotify.h      |  5 +-
 fs/notify/fanotify/fanotify_user.c | 62 ++++++++++++++----------
 fs/notify/mark.c                   | 77 ++++++++++++++++++++++++++----
 include/linux/fsnotify_backend.h   | 24 ++++++++--
 5 files changed, 154 insertions(+), 56 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 844b748f0b74..90bac98dd8c7 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -92,7 +92,7 @@ static int fanotify_get_response(struct fsnotify_group *group,
 
 	pr_debug("%s: group=%p event=%p about to return ret=%d\n", __func__,
 		 group, event, ret);
-	
+
 	return ret;
 }
 
@@ -103,13 +103,13 @@ static int fanotify_get_response(struct fsnotify_group *group,
  * requested by the user, will not be present in the returned mask.
  */
 static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
-				       u32 event_mask, const void *data,
-				       int data_type)
+				     u32 event_mask, const void *data,
+				     int data_type, __kernel_fsid_t *fsid)
 {
 	__u32 marks_mask = 0, marks_ignored_mask = 0;
 	const struct path *path = data;
 	struct fsnotify_mark *mark;
-	int type;
+	int type, err;
 
 	pr_debug("%s: report_mask=%x mask=%x data=%p data_type=%d\n",
 		 __func__, iter_info->report_mask, event_mask, data, data_type);
@@ -136,6 +136,13 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
 		     !(mark->mask & FS_EVENT_ON_CHILD)))
 			continue;
 
+		if (fsid) {
+			err = fsnotify_get_conn_fsid(mark->connector, fsid);
+			/* Should we skip mark with null or conflicting fsid? */
+			pr_debug("%s: fsid=%x.%x type=%u err=%d\n", __func__,
+				 fsid->val[0], fsid->val[1], type, err);
+		}
+
 		marks_mask |= mark->mask;
 		marks_ignored_mask |= mark->ignored_mask;
 	}
@@ -153,7 +160,6 @@ static struct fanotify_event_fid *fanotify_alloc_fid(const struct path *path,
 {
 	struct fanotify_event_fid *fid = NULL;
 	int dwords, bytes = 0;
-	struct kstatfs stat;
 	int err, type;
 
 	dwords = 0;
@@ -162,10 +168,6 @@ static struct fanotify_event_fid *fanotify_alloc_fid(const struct path *path,
 	if (!dwords)
 		goto out_err;
 
-	err = vfs_statfs(path, &stat);
-	if (err)
-		goto out_err;
-
 	/* Treat failure to allocate fid as failure to allocate event */
 	bytes = dwords << 2;
 	fid = kmalloc(FANOTIFY_FID_LEN(bytes), gfp);
@@ -180,7 +182,6 @@ static struct fanotify_event_fid *fanotify_alloc_fid(const struct path *path,
 
 	fid->handle_bytes = bytes;
 	fid->handle_type = type;
-	fid->fsid = stat.f_fsid;
 
 	return fid;
 
@@ -192,8 +193,9 @@ static struct fanotify_event_fid *fanotify_alloc_fid(const struct path *path,
 }
 
 struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
-						 struct inode *inode, u32 mask,
-						 const struct path *path)
+					    struct inode *inode, u32 mask,
+					    const struct path *path,
+					    __kernel_fsid_t *fsid)
 {
 	struct fanotify_event *event = NULL;
 	struct fanotify_event_fid *fid = NULL;
@@ -210,7 +212,7 @@ struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	/* Whoever is interested in the event, pays for the allocation. */
 	memalloc_use_memcg(group->memcg);
 
-	if (path && FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
+	if (path && fsid && FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
 		fid = fanotify_alloc_fid(path, gfp);
 		/* Treat failure to allocate fid as failure to allocate event */
 		if (!fid)
@@ -218,6 +220,8 @@ struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 		/* Report the event without a file identifier on encode error */
 		if (IS_ERR(fid))
 			fid = NULL;
+		else
+			fid->fsid = *fsid;
 	}
 
 	if (fanotify_is_perm_event(mask)) {
@@ -262,6 +266,7 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 	int ret = 0;
 	struct fanotify_event *event;
 	struct fsnotify_event *fsn_event;
+	__kernel_fsid_t __fsid, *fsid = NULL;
 
 	BUILD_BUG_ON(FAN_ACCESS != FS_ACCESS);
 	BUILD_BUG_ON(FAN_MODIFY != FS_MODIFY);
@@ -278,7 +283,14 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 
 	BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 12);
 
-	mask = fanotify_group_event_mask(iter_info, mask, data, data_type);
+	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
+		__fsid.val[0] = 0;
+		__fsid.val[1] = 0;
+		fsid = &__fsid;
+	}
+
+	mask = fanotify_group_event_mask(iter_info, mask, data, data_type,
+					 fsid);
 	if (!mask)
 		return 0;
 
@@ -294,7 +306,7 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 			return 0;
 	}
 
-	event = fanotify_alloc_event(group, inode, mask, data);
+	event = fanotify_alloc_event(group, inode, mask, data, fsid);
 	ret = -ENOMEM;
 	if (unlikely(!event)) {
 		/*
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 0e57fa0674d7..b9938626b2a3 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -99,5 +99,6 @@ static inline struct fanotify_event *FANOTIFY_E(struct fsnotify_event *fse)
 }
 
 struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
-						 struct inode *inode, u32 mask,
-						 const struct path *path);
+					    struct inode *inode, u32 mask,
+					    const struct path *path,
+					    __kernel_fsid_t *fsid);
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index d7aa2f392a64..92afb297fdea 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -656,7 +656,8 @@ static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
 
 static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
 						   fsnotify_connp_t *connp,
-						   unsigned int type)
+						   unsigned int type,
+						   __kernel_fsid_t *fsid)
 {
 	struct fsnotify_mark *mark;
 	int ret;
@@ -669,7 +670,7 @@ static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
 		return ERR_PTR(-ENOMEM);
 
 	fsnotify_init_mark(mark, group);
-	ret = fsnotify_add_mark_locked(mark, connp, type, 0);
+	ret = fsnotify_add_mark_locked_fsid(mark, connp, type, 0, fsid);
 	if (ret) {
 		fsnotify_put_mark(mark);
 		return ERR_PTR(ret);
@@ -681,7 +682,8 @@ static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
 
 static int fanotify_add_mark(struct fsnotify_group *group,
 			     fsnotify_connp_t *connp, unsigned int type,
-			     __u32 mask, unsigned int flags)
+			     __u32 mask, unsigned int flags,
+			     __kernel_fsid_t *fsid)
 {
 	struct fsnotify_mark *fsn_mark;
 	__u32 added;
@@ -689,7 +691,7 @@ static int fanotify_add_mark(struct fsnotify_group *group,
 	mutex_lock(&group->mark_mutex);
 	fsn_mark = fsnotify_find_mark(connp, group);
 	if (!fsn_mark) {
-		fsn_mark = fanotify_add_new_mark(group, connp, type);
+		fsn_mark = fanotify_add_new_mark(group, connp, type, fsid);
 		if (IS_ERR(fsn_mark)) {
 			mutex_unlock(&group->mark_mutex);
 			return PTR_ERR(fsn_mark);
@@ -706,23 +708,23 @@ static int fanotify_add_mark(struct fsnotify_group *group,
 
 static int fanotify_add_vfsmount_mark(struct fsnotify_group *group,
 				      struct vfsmount *mnt, __u32 mask,
-				      unsigned int flags)
+				      unsigned int flags, __kernel_fsid_t *fsid)
 {
 	return fanotify_add_mark(group, &real_mount(mnt)->mnt_fsnotify_marks,
-				 FSNOTIFY_OBJ_TYPE_VFSMOUNT, mask, flags);
+				 FSNOTIFY_OBJ_TYPE_VFSMOUNT, mask, flags, fsid);
 }
 
 static int fanotify_add_sb_mark(struct fsnotify_group *group,
-				      struct super_block *sb, __u32 mask,
-				      unsigned int flags)
+				struct super_block *sb, __u32 mask,
+				unsigned int flags, __kernel_fsid_t *fsid)
 {
 	return fanotify_add_mark(group, &sb->s_fsnotify_marks,
-				 FSNOTIFY_OBJ_TYPE_SB, mask, flags);
+				 FSNOTIFY_OBJ_TYPE_SB, mask, flags, fsid);
 }
 
 static int fanotify_add_inode_mark(struct fsnotify_group *group,
 				   struct inode *inode, __u32 mask,
-				   unsigned int flags)
+				   unsigned int flags, __kernel_fsid_t *fsid)
 {
 	pr_debug("%s: group=%p inode=%p\n", __func__, group, inode);
 
@@ -737,7 +739,7 @@ static int fanotify_add_inode_mark(struct fsnotify_group *group,
 		return 0;
 
 	return fanotify_add_mark(group, &inode->i_fsnotify_marks,
-				 FSNOTIFY_OBJ_TYPE_INODE, mask, flags);
+				 FSNOTIFY_OBJ_TYPE_INODE, mask, flags, fsid);
 }
 
 /* fanotify syscalls */
@@ -797,7 +799,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	atomic_inc(&user->fanotify_listeners);
 	group->memcg = get_mem_cgroup_from_mm(current->mm);
 
-	oevent = fanotify_alloc_event(group, NULL, FS_Q_OVERFLOW, NULL);
+	oevent = fanotify_alloc_event(group, NULL, FS_Q_OVERFLOW, NULL, NULL);
 	if (unlikely(!oevent)) {
 		fd = -ENOMEM;
 		goto out_destroy_group;
@@ -860,20 +862,20 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 }
 
 /* Check if filesystem can encode a unique fid */
-static int fanotify_test_fid(struct path *path)
+static int fanotify_test_fid(struct path *path, struct kstatfs *stat)
 {
-	struct kstatfs stat, root_stat;
+	struct kstatfs root_stat;
 	int err;
 
 	/*
 	 * Make sure path is not in filesystem with zero fsid (e.g. tmpfs).
 	 * TODO: cache fsid in the mark connector.
 	 */
-	err = vfs_statfs(path, &stat);
+	err = vfs_statfs(path, stat);
 	if (err)
 		return err;
 
-	if (!stat.f_fsid.val[0] && !stat.f_fsid.val[1])
+	if (!stat->f_fsid.val[0] && !stat->f_fsid.val[1])
 		return -ENODEV;
 
 	/*
@@ -884,8 +886,8 @@ static int fanotify_test_fid(struct path *path)
 	if (err)
 		return err;
 
-	if (root_stat.f_fsid.val[0] != stat.f_fsid.val[0] ||
-	    root_stat.f_fsid.val[1] != stat.f_fsid.val[1])
+	if (root_stat.f_fsid.val[0] != stat->f_fsid.val[0] ||
+	    root_stat.f_fsid.val[1] != stat->f_fsid.val[1])
 		return -EXDEV;
 
 	/*
@@ -910,6 +912,8 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 	struct fsnotify_group *group;
 	struct fd f;
 	struct path path;
+	struct kstatfs stat;
+	__kernel_fsid_t *fsid = NULL;
 	u32 valid_mask = FANOTIFY_EVENTS | FANOTIFY_EVENT_FLAGS;
 	unsigned int mark_type = flags & FANOTIFY_MARK_TYPE_BITS;
 	int ret;
@@ -988,9 +992,11 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 		goto fput_and_out;
 
 	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
-		ret = fanotify_test_fid(&path);
+		ret = fanotify_test_fid(&path, &stat);
 		if (ret)
 			goto path_put_and_out;
+
+		fsid = &stat.f_fsid;
 	}
 
 	/* inode held in place by reference to path; group by fget on fd */
@@ -1003,19 +1009,25 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 	switch (flags & (FAN_MARK_ADD | FAN_MARK_REMOVE)) {
 	case FAN_MARK_ADD:
 		if (mark_type == FAN_MARK_MOUNT)
-			ret = fanotify_add_vfsmount_mark(group, mnt, mask, flags);
+			ret = fanotify_add_vfsmount_mark(group, mnt, mask,
+							 flags, fsid);
 		else if (mark_type == FAN_MARK_FILESYSTEM)
-			ret = fanotify_add_sb_mark(group, mnt->mnt_sb, mask, flags);
+			ret = fanotify_add_sb_mark(group, mnt->mnt_sb, mask,
+						   flags, fsid);
 		else
-			ret = fanotify_add_inode_mark(group, inode, mask, flags);
+			ret = fanotify_add_inode_mark(group, inode, mask,
+						      flags, fsid);
 		break;
 	case FAN_MARK_REMOVE:
 		if (mark_type == FAN_MARK_MOUNT)
-			ret = fanotify_remove_vfsmount_mark(group, mnt, mask, flags);
+			ret = fanotify_remove_vfsmount_mark(group, mnt, mask,
+							    flags);
 		else if (mark_type == FAN_MARK_FILESYSTEM)
-			ret = fanotify_remove_sb_mark(group, mnt->mnt_sb, mask, flags);
+			ret = fanotify_remove_sb_mark(group, mnt->mnt_sb, mask,
+						      flags);
 		else
-			ret = fanotify_remove_inode_mark(group, inode, mask, flags);
+			ret = fanotify_remove_inode_mark(group, inode, mask,
+							 flags);
 		break;
 	default:
 		ret = -EINVAL;
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index d2dd16cb5989..c7d8e487c348 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -82,6 +82,7 @@
 #include <linux/slab.h>
 #include <linux/spinlock.h>
 #include <linux/srcu.h>
+#include <linux/ratelimit.h>
 
 #include <linux/atomic.h>
 
@@ -480,8 +481,54 @@ int fsnotify_compare_groups(struct fsnotify_group *a, struct fsnotify_group *b)
 	return -1;
 }
 
+/*
+ * Check consistent fsid for all marks on the same object.
+ * It is expected to always get the same fsid (or no fsid) for all
+ * marks added to object.
+ */
+static int fsnotify_test_conn_fsid(const struct fsnotify_mark_connector *conn,
+				   const __kernel_fsid_t *fsid,
+				   const char *where)
+
+{
+	/*
+	 * Backend is expected to check for non uniform fsid (e.g. btrfs),
+	 * but maybe we missed something?
+	 */
+	if (fsid->val[0] != conn->fsid.val[0] ||
+	    fsid->val[1] != conn->fsid.val[1]) {
+		pr_warn_ratelimited("%s: fsid mismatch on object of type %u: %x.%x != %x.%x\n",
+				    where, conn->type,
+				    fsid->val[0], fsid->val[1],
+				    conn->fsid.val[0], conn->fsid.val[1]);
+		return -EXDEV;
+	}
+
+	return 0;
+}
+
+/*
+ * Get cached fsid of filesystem containing object from connector.
+ */
+int fsnotify_get_conn_fsid(const struct fsnotify_mark_connector *conn,
+			   __kernel_fsid_t *fsid)
+{
+	if (WARN_ON(!conn->fsid.val[0] && !conn->fsid.val[1]))
+		return -ENODEV;
+
+	if (!fsid->val[0] && !fsid->val[1]) {
+		*fsid = conn->fsid;
+		return 0;
+	}
+
+	return fsnotify_test_conn_fsid(conn, fsid, __func__);
+}
+
+static const __kernel_fsid_t null_fsid;
+
 static int fsnotify_attach_connector_to_object(fsnotify_connp_t *connp,
-					       unsigned int type)
+					       unsigned int type,
+					       __kernel_fsid_t *fsid)
 {
 	struct inode *inode = NULL;
 	struct fsnotify_mark_connector *conn;
@@ -493,6 +540,8 @@ static int fsnotify_attach_connector_to_object(fsnotify_connp_t *connp,
 	INIT_HLIST_HEAD(&conn->list);
 	conn->type = type;
 	conn->obj = connp;
+	/* Cache fsid of filesystem containing the object */
+	conn->fsid = fsid ? *fsid : null_fsid;
 	if (conn->type == FSNOTIFY_OBJ_TYPE_INODE)
 		inode = igrab(fsnotify_conn_inode(conn));
 	/*
@@ -544,7 +593,7 @@ static struct fsnotify_mark_connector *fsnotify_grab_connector(
  */
 static int fsnotify_add_mark_list(struct fsnotify_mark *mark,
 				  fsnotify_connp_t *connp, unsigned int type,
-				  int allow_dups)
+				  int allow_dups, __kernel_fsid_t *fsid)
 {
 	struct fsnotify_mark *lmark, *last = NULL;
 	struct fsnotify_mark_connector *conn;
@@ -553,15 +602,24 @@ static int fsnotify_add_mark_list(struct fsnotify_mark *mark,
 
 	if (WARN_ON(!fsnotify_valid_obj_type(type)))
 		return -EINVAL;
+
+	/* Backend is expected to check for zero fsid (e.g. tmpfs) */
+	if (fsid && WARN_ON(!fsid->val[0] && !fsid->val[1]))
+		return -ENODEV;
+
 restart:
 	spin_lock(&mark->lock);
 	conn = fsnotify_grab_connector(connp);
 	if (!conn) {
 		spin_unlock(&mark->lock);
-		err = fsnotify_attach_connector_to_object(connp, type);
+		err = fsnotify_attach_connector_to_object(connp, type, fsid);
 		if (err)
 			return err;
 		goto restart;
+	} else if (fsid) {
+		err = fsnotify_test_conn_fsid(conn, fsid, __func__);
+		if (err)
+			goto out_err;
 	}
 
 	/* is mark the first mark? */
@@ -604,9 +662,9 @@ static int fsnotify_add_mark_list(struct fsnotify_mark *mark,
  * These marks may be used for the fsnotify backend to determine which
  * event types should be delivered to which group.
  */
-int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
-			     fsnotify_connp_t *connp, unsigned int type,
-			     int allow_dups)
+int fsnotify_add_mark_locked_fsid(struct fsnotify_mark *mark,
+				  fsnotify_connp_t *connp, unsigned int type,
+				  int allow_dups, __kernel_fsid_t *fsid)
 {
 	struct fsnotify_group *group = mark->group;
 	int ret = 0;
@@ -627,7 +685,7 @@ int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
 	fsnotify_get_mark(mark); /* for g_list */
 	spin_unlock(&mark->lock);
 
-	ret = fsnotify_add_mark_list(mark, connp, type, allow_dups);
+	ret = fsnotify_add_mark_list(mark, connp, type, allow_dups, fsid);
 	if (ret)
 		goto err;
 
@@ -648,13 +706,14 @@ int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
 }
 
 int fsnotify_add_mark(struct fsnotify_mark *mark, fsnotify_connp_t *connp,
-		      unsigned int type, int allow_dups)
+		      unsigned int type, int allow_dups, __kernel_fsid_t *fsid)
 {
 	int ret;
 	struct fsnotify_group *group = mark->group;
 
 	mutex_lock(&group->mark_mutex);
-	ret = fsnotify_add_mark_locked(mark, connp, type, allow_dups);
+	ret = fsnotify_add_mark_locked_fsid(mark, connp, type, allow_dups,
+					    fsid);
 	mutex_unlock(&group->mark_mutex);
 	return ret;
 }
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index c3cb692c48a0..d27106efd9b0 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -294,6 +294,7 @@ typedef struct fsnotify_mark_connector __rcu *fsnotify_connp_t;
 struct fsnotify_mark_connector {
 	spinlock_t lock;
 	unsigned int type;	/* Type of object [lock] */
+	__kernel_fsid_t fsid;	/* fsid of filesystem containing object */
 	union {
 		/* Object pointer [lock] */
 		fsnotify_connp_t *obj;
@@ -434,20 +435,32 @@ extern void fsnotify_init_mark(struct fsnotify_mark *mark,
 /* Find mark belonging to given group in the list of marks */
 extern struct fsnotify_mark *fsnotify_find_mark(fsnotify_connp_t *connp,
 						struct fsnotify_group *group);
+/* Get cached fsid of filesystem containing object */
+extern int fsnotify_get_conn_fsid(const struct fsnotify_mark_connector *conn,
+				  __kernel_fsid_t *fsid);
 /* attach the mark to the object */
 extern int fsnotify_add_mark(struct fsnotify_mark *mark,
 			     fsnotify_connp_t *connp, unsigned int type,
-			     int allow_dups);
-extern int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
-				    fsnotify_connp_t *connp, unsigned int type,
-				    int allow_dups);
+			     int allow_dups, __kernel_fsid_t *fsid);
+extern int fsnotify_add_mark_locked_fsid(struct fsnotify_mark *mark,
+					 fsnotify_connp_t *connp,
+					 unsigned int type, int allow_dups,
+					 __kernel_fsid_t *fsid);
+static inline int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
+					   fsnotify_connp_t *connp,
+					   unsigned int type, int allow_dups)
+{
+	return fsnotify_add_mark_locked_fsid(mark, connp, type, allow_dups,
+					     NULL);
+}
+
 /* attach the mark to the inode */
 static inline int fsnotify_add_inode_mark(struct fsnotify_mark *mark,
 					  struct inode *inode,
 					  int allow_dups)
 {
 	return fsnotify_add_mark(mark, &inode->i_fsnotify_marks,
-				 FSNOTIFY_OBJ_TYPE_INODE, allow_dups);
+				 FSNOTIFY_OBJ_TYPE_INODE, allow_dups, NULL);
 }
 static inline int fsnotify_add_inode_mark_locked(struct fsnotify_mark *mark,
 						 struct inode *inode,
@@ -456,6 +469,7 @@ static inline int fsnotify_add_inode_mark_locked(struct fsnotify_mark *mark,
 	return fsnotify_add_mark_locked(mark, &inode->i_fsnotify_marks,
 					FSNOTIFY_OBJ_TYPE_INODE, allow_dups);
 }
+
 /* given a group and a mark, flag mark to be freed when all references are dropped */
 extern void fsnotify_destroy_mark(struct fsnotify_mark *mark,
 				  struct fsnotify_group *group);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 10/13] fanotify: check FS_ISDIR flag instead of d_is_dir()
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
                   ` (8 preceding siblings ...)
  2018-11-25 13:43 ` [PATCH v3 09/13] fanotify: cache fsid in fsnotify_mark_connector Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-25 13:43 ` [PATCH v3 11/13] fanotify: support events with data type FSNOTIFY_EVENT_INODE Amir Goldstein
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

All fsnotify hooks set the FS_ISDIR flag for events that happen
on directory victim inodes except for fsnotify_perm().

Add the missing FS_ISDIR flag in fsnotify_perm() hook and let
fanotify_group_event_mask() check the FS_ISDIR flag instead of
checking if path argument is a directory.

This is needed for fanotify support for event types that do not
carry path information.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c | 2 +-
 include/linux/fsnotify.h      | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 90bac98dd8c7..ef567d07f5a6 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -147,7 +147,7 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
 		marks_ignored_mask |= mark->ignored_mask;
 	}
 
-	if (d_is_dir(path->dentry) &&
+	if (event_mask & FS_ISDIR &&
 	    !(marks_mask & FS_ISDIR & ~marks_ignored_mask))
 		return 0;
 
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 098322ec3689..c1596ca881aa 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -93,6 +93,9 @@ static inline int fsnotify_perm(struct file *file, int mask)
 		fsnotify_mask = FS_ACCESS_PERM;
 	}
 
+	if (S_ISDIR(inode->i_mode))
+		fsnotify_mask |= FS_ISDIR;
+
 	return fsnotify_path(inode, path, fsnotify_mask);
 }
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 11/13] fanotify: support events with data type FSNOTIFY_EVENT_INODE
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
                   ` (9 preceding siblings ...)
  2018-11-25 13:43 ` [PATCH v3 10/13] fanotify: check FS_ISDIR flag instead of d_is_dir() Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-25 13:43 ` [PATCH v3 12/13] fanotify: add support for create/attrib/move/delete events Amir Goldstein
  2018-11-25 13:43 ` [PATCH v3 13/13] fanotify: report FAN_ONDIR to listener with FAN_REPORT_FID Amir Goldstein
  12 siblings, 0 replies; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

When event data type is FSNOTIFY_EVENT_INODE, we don't have a refernece
to the mount, so we will not be able to open a file descriptor when user
reads the event. However, if the listener has enabled reporting file
identifier with the FAN_REPORT_FID init flag, we allow repoting those
events and we use an indentifier inode to encode fid.

The inode to use as indetifier when reporting fid depedns on the event.
For dirent modification events, we report the modified directory inode
and we report the "victim" inode otherwise.
For example:
FS_ATTRIB reports the child inode even if reported on a watched parent.
FS_CREATE reports the modified dir inode and not the created inode.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      | 62 ++++++++++++++++++++----------
 fs/notify/fanotify/fanotify.h      |  2 +-
 fs/notify/fanotify/fanotify_user.c |  3 +-
 3 files changed, 45 insertions(+), 22 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index ef567d07f5a6..3e8b3abe4952 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -114,14 +114,11 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
 	pr_debug("%s: report_mask=%x mask=%x data=%p data_type=%d\n",
 		 __func__, iter_info->report_mask, event_mask, data, data_type);
 
-	/* If we don't have enough info to send an event to userspace say no */
-	if (data_type != FSNOTIFY_EVENT_PATH)
-		return 0;
-
-	/* Sorry, fanotify only gives a damn about files and dirs */
-	if (!d_is_reg(path->dentry) &&
-	    !d_can_lookup(path->dentry))
-		return 0;
+	if (data_type == FSNOTIFY_EVENT_PATH) {
+		/* Path type events are only relevant for files and dirs */
+		if (!d_is_reg(path->dentry) && !d_can_lookup(path->dentry))
+			return 0;
+	}
 
 	fsnotify_foreach_obj_type(type) {
 		if (!fsnotify_iter_should_report_type(iter_info, type))
@@ -155,7 +152,7 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
 		~marks_ignored_mask;
 }
 
-static struct fanotify_event_fid *fanotify_alloc_fid(const struct path *path,
+static struct fanotify_event_fid *fanotify_alloc_fid(struct inode *inode,
 						     gfp_t gfp)
 {
 	struct fanotify_event_fid *fid = NULL;
@@ -164,7 +161,7 @@ static struct fanotify_event_fid *fanotify_alloc_fid(const struct path *path,
 
 	dwords = 0;
 	err = -ENOENT;
-	type = exportfs_encode_fh(path->dentry, NULL, &dwords,  0);
+	type = exportfs_encode_inode_fh(inode, NULL, &dwords,  NULL);
 	if (!dwords)
 		goto out_err;
 
@@ -174,8 +171,8 @@ static struct fanotify_event_fid *fanotify_alloc_fid(const struct path *path,
 	if (!fid)
 		return NULL;
 
-	type = exportfs_encode_fh(path->dentry, (struct fid *)fid->f_handle,
-				  &dwords,  0);
+	type = exportfs_encode_inode_fh(inode, (struct fid *)fid->f_handle,
+					&dwords,  NULL);
 	err = -EINVAL;
 	if (type == FILEID_INVALID || bytes != dwords << 2)
 		goto out_err;
@@ -186,20 +183,42 @@ static struct fanotify_event_fid *fanotify_alloc_fid(const struct path *path,
 	return fid;
 
 out_err:
-	pr_warn_ratelimited("fanotify: failed to encode fid of %pd2 (bytes=%d, err=%i)\n",
-			    path->dentry, bytes, err);
+	pr_warn_ratelimited("fanotify: failed to encode fid (ino=%lu, type=%d, bytes=%d, err=%i)\n",
+			    inode->i_ino, type, bytes, err);
 	kfree(fid);
 	return ERR_PTR(err);
 }
 
+/*
+ * The inode to use as indetifier when reporting fid depedns on the event.
+ * Report the modified directory inode on dirent modification events.
+ * Report the "victim" inode otherwise.
+ * For example:
+ * FS_ATTRIB reports the child inode even if reported on a watched parent.
+ * FS_CREATE reports the modified dir inode and not the created inode.
+ */
+static struct inode *fanotify_report_id(struct inode *to_tell, u32 event_mask,
+					const void *data, int data_type)
+{
+	if (event_mask & ALL_FSNOTIFY_DIRENT_EVENTS)
+		return to_tell;
+	else if (data_type == FSNOTIFY_EVENT_INODE)
+		return (struct inode *)data;
+	else if (data_type == FSNOTIFY_EVENT_PATH)
+		return d_inode(((struct path *)data)->dentry);
+	else
+		return NULL;
+}
+
 struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 					    struct inode *inode, u32 mask,
-					    const struct path *path,
+					    const void *data, int data_type,
 					    __kernel_fsid_t *fsid)
 {
 	struct fanotify_event *event = NULL;
 	struct fanotify_event_fid *fid = NULL;
 	gfp_t gfp = GFP_KERNEL_ACCOUNT;
+	struct inode *id = fanotify_report_id(inode, mask, data, data_type);
 
 	/*
 	 * For queues with unlimited length lost events are not expected and
@@ -212,8 +231,8 @@ struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	/* Whoever is interested in the event, pays for the allocation. */
 	memalloc_use_memcg(group->memcg);
 
-	if (path && fsid && FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
-		fid = fanotify_alloc_fid(path, gfp);
+	if (id && fsid && FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
+		fid = fanotify_alloc_fid(id, gfp);
 		/* Treat failure to allocate fid as failure to allocate event */
 		if (!fid)
 			goto out;
@@ -245,8 +264,8 @@ init: __maybe_unused
 		event->pid = get_pid(task_tgid(current));
 	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
 		fanotify_set_fid(event, fid);
-	} else if (path) {
-		event->path = *path;
+	} else if (data_type == FSNOTIFY_EVENT_PATH) {
+		event->path = *((struct path *)data);
 		path_get(&event->path);
 	} else {
 		event->path.mnt = NULL;
@@ -283,10 +302,13 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 
 	BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 12);
 
+	/* Events without path data require FAN_REPORT_FID */
 	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
 		__fsid.val[0] = 0;
 		__fsid.val[1] = 0;
 		fsid = &__fsid;
+	} else if (data_type != FSNOTIFY_EVENT_PATH) {
+		return 0;
 	}
 
 	mask = fanotify_group_event_mask(iter_info, mask, data, data_type,
@@ -306,7 +328,7 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 			return 0;
 	}
 
-	event = fanotify_alloc_event(group, inode, mask, data, fsid);
+	event = fanotify_alloc_event(group, inode, mask, data, data_type, fsid);
 	ret = -ENOMEM;
 	if (unlikely(!event)) {
 		/*
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index b9938626b2a3..3c21c9a1278d 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -100,5 +100,5 @@ static inline struct fanotify_event *FANOTIFY_E(struct fsnotify_event *fse)
 
 struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 					    struct inode *inode, u32 mask,
-					    const struct path *path,
+					    const void *data, int data_type,
 					    __kernel_fsid_t *fsid);
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 92afb297fdea..9479bd418c31 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -799,7 +799,8 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	atomic_inc(&user->fanotify_listeners);
 	group->memcg = get_mem_cgroup_from_mm(current->mm);
 
-	oevent = fanotify_alloc_event(group, NULL, FS_Q_OVERFLOW, NULL, NULL);
+	oevent = fanotify_alloc_event(group, NULL, FS_Q_OVERFLOW, NULL,
+				      FSNOTIFY_EVENT_NONE, NULL);
 	if (unlikely(!oevent)) {
 		fd = -ENOMEM;
 		goto out_destroy_group;
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 12/13] fanotify: add support for create/attrib/move/delete events
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
                   ` (10 preceding siblings ...)
  2018-11-25 13:43 ` [PATCH v3 11/13] fanotify: support events with data type FSNOTIFY_EVENT_INODE Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  2018-11-25 13:43 ` [PATCH v3 13/13] fanotify: report FAN_ONDIR to listener with FAN_REPORT_FID Amir Goldstein
  12 siblings, 0 replies; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

Add support for events with data type FSNOTIFY_EVENT_INODE
(e.g. create/attrib/move/delete) for inode and filesystem mark types.

The "inode" events do not carry enough inormation (i.e. path) to
report event->fd, so we do not allow setting a mask for those events
unless group supports reporting fid.

The "inode" events are not supported on a mount mark, because they do
not carry enough inormation (i.e. path) to be filtered by mount point.

The "dirent" events (create/move/delete) report the fid of the parent
directry where events took place without specifying the filename of the
child. In the future, fanotify may get support for reporting filename
information for those events.

Cc: <linux-api@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      |  9 ++++++++-
 fs/notify/fanotify/fanotify_user.c | 12 ++++++++++++
 include/linux/fanotify.h           | 22 ++++++++++++++++++++--
 include/uapi/linux/fanotify.h      |  8 ++++++++
 4 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 3e8b3abe4952..883db359c508 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -289,9 +289,16 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 
 	BUILD_BUG_ON(FAN_ACCESS != FS_ACCESS);
 	BUILD_BUG_ON(FAN_MODIFY != FS_MODIFY);
+	BUILD_BUG_ON(FAN_ATTRIB != FS_ATTRIB);
 	BUILD_BUG_ON(FAN_CLOSE_NOWRITE != FS_CLOSE_NOWRITE);
 	BUILD_BUG_ON(FAN_CLOSE_WRITE != FS_CLOSE_WRITE);
 	BUILD_BUG_ON(FAN_OPEN != FS_OPEN);
+	BUILD_BUG_ON(FAN_MOVED_TO != FS_MOVED_TO);
+	BUILD_BUG_ON(FAN_MOVED_FROM != FS_MOVED_FROM);
+	BUILD_BUG_ON(FAN_CREATE != FS_CREATE);
+	BUILD_BUG_ON(FAN_DELETE != FS_DELETE);
+	BUILD_BUG_ON(FAN_DELETE_SELF != FS_DELETE_SELF);
+	BUILD_BUG_ON(FAN_MOVE_SELF != FS_MOVE_SELF);
 	BUILD_BUG_ON(FAN_EVENT_ON_CHILD != FS_EVENT_ON_CHILD);
 	BUILD_BUG_ON(FAN_Q_OVERFLOW != FS_Q_OVERFLOW);
 	BUILD_BUG_ON(FAN_OPEN_PERM != FS_OPEN_PERM);
@@ -300,7 +307,7 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 	BUILD_BUG_ON(FAN_OPEN_EXEC != FS_OPEN_EXEC);
 	BUILD_BUG_ON(FAN_OPEN_EXEC_PERM != FS_OPEN_EXEC_PERM);
 
-	BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 12);
+	BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);
 
 	/* Events without path data require FAN_REPORT_FID */
 	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 9479bd418c31..5de9c0fbb7ae 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -977,6 +977,18 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 	    group->priority == FS_PRIO_0)
 		goto fput_and_out;
 
+	/*
+	 * Events with data type inode do not carry enough inormation to report
+	 * event->fd, so we do not allow setting a mask for inode events unless
+	 * group supports reporting fid.
+	 * inode events are not supported on a mount mark, because they do not
+	 * carry enough inormation (i.e. path) to be filtered by mount point.
+	 */
+	if (mask & FANOTIFY_INODE_EVENTS &&
+	    (!FAN_GROUP_FLAG(group, FAN_REPORT_FID) ||
+	     mark_type == FAN_MARK_MOUNT))
+		goto fput_and_out;
+
 	if (flags & FAN_MARK_FLUSH) {
 		ret = 0;
 		if (mark_type == FAN_MARK_MOUNT)
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index f59be967f72b..e9d45387089f 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -35,10 +35,28 @@
 				 FAN_MARK_IGNORED_SURV_MODIFY | \
 				 FAN_MARK_FLUSH)
 
-/* Events that user can request to be notified on */
-#define FANOTIFY_EVENTS		(FAN_ACCESS | FAN_MODIFY | \
+/*
+ * Events that can be reported with data type FSNOTIFY_EVENT_PATH.
+ * Note that FAN_MODIFY can also be reported with data type
+ * FSNOTIFY_EVENT_INODE.
+ */
+#define FANOTIFY_PATH_EVENTS	(FAN_ACCESS | FAN_MODIFY | \
 				 FAN_CLOSE | FAN_OPEN | FAN_OPEN_EXEC)
 
+/*
+ * Directory entry modification events - reported only to directory
+ * where entry is modified and not to a watching parent.
+ */
+#define FANOTIFY_DIRENT_EVENTS	(FAN_MOVE | FAN_CREATE | FAN_DELETE)
+
+/* Events that can only be reported with data type FSNOTIFY_EVENT_INODE */
+#define FANOTIFY_INODE_EVENTS	(FANOTIFY_DIRENT_EVENTS | \
+				 FAN_ATTRIB | FAN_MOVE_SELF | FAN_DELETE_SELF)
+
+/* Events that user can request to be notified on */
+#define FANOTIFY_EVENTS		(FANOTIFY_PATH_EVENTS | \
+				 FANOTIFY_INODE_EVENTS)
+
 /* Events that require a permission response from user */
 #define FANOTIFY_PERM_EVENTS	(FAN_OPEN_PERM | FAN_ACCESS_PERM | \
 				 FAN_OPEN_EXEC_PERM)
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index 0fd8736269c4..2ac5ba5c56f7 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -7,9 +7,16 @@
 /* the following events that user-space can register for */
 #define FAN_ACCESS		0x00000001	/* File was accessed */
 #define FAN_MODIFY		0x00000002	/* File was modified */
+#define FAN_ATTRIB		0x00000004	/* Metadata changed */
 #define FAN_CLOSE_WRITE		0x00000008	/* Writtable file closed */
 #define FAN_CLOSE_NOWRITE	0x00000010	/* Unwrittable file closed */
 #define FAN_OPEN		0x00000020	/* File was opened */
+#define FAN_MOVED_FROM		0x00000040	/* File was moved from X */
+#define FAN_MOVED_TO		0x00000080	/* File was moved to Y */
+#define FAN_CREATE		0x00000100	/* Subfile was created */
+#define FAN_DELETE		0x00000200	/* Subfile was deleted */
+#define FAN_DELETE_SELF		0x00000400	/* Self was deleted */
+#define FAN_MOVE_SELF		0x00000800	/* Self was moved */
 #define FAN_OPEN_EXEC		0x00001000	/* File was opened for exec */
 
 #define FAN_Q_OVERFLOW		0x00004000	/* Event queued overflowed */
@@ -24,6 +31,7 @@
 
 /* helper events */
 #define FAN_CLOSE		(FAN_CLOSE_WRITE | FAN_CLOSE_NOWRITE) /* close */
+#define FAN_MOVE		(FAN_MOVED_FROM | FAN_MOVED_TO) /* moves */
 
 /* flags used for fanotify_init() */
 #define FAN_CLOEXEC		0x00000001
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 13/13] fanotify: report FAN_ONDIR to listener with FAN_REPORT_FID
  2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
                   ` (11 preceding siblings ...)
  2018-11-25 13:43 ` [PATCH v3 12/13] fanotify: add support for create/attrib/move/delete events Amir Goldstein
@ 2018-11-25 13:43 ` Amir Goldstein
  12 siblings, 0 replies; 42+ messages in thread
From: Amir Goldstein @ 2018-11-25 13:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

dirent modification events (create/delete/move) do not carry the
child entry name/inode information. Instead, we report FAN_ONDIR
for mkdir/rmdir so user can differentiate them from creat/unlink.

For backward compatibility and consistency, do not report FAN_ONDIR
to user in legacy fanotify mode (reporting fd) and report FAN_ONDIR
to user in FAN_REPORT_FID mode for all event types.

Unlike legacy fanotify events (open/access/close), dirent events
for subdir entries (mkdir/rmdir) will be reported regardless if
user requested FAN_ONDIR, but the FAN_ONDIR flag itself will only
be reported if the user asked for it.

Cc: <linux-api@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c | 31 +++++++++++++++++++++++++++++--
 include/linux/fanotify.h      |  9 ++++++---
 2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 883db359c508..610a7f8981b5 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -107,6 +107,7 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
 				     int data_type, __kernel_fsid_t *fsid)
 {
 	__u32 marks_mask = 0, marks_ignored_mask = 0;
+	__u32 test_mask, user_mask = FANOTIFY_EVENT_TYPES;
 	const struct path *path = data;
 	struct fsnotify_mark *mark;
 	int type, err;
@@ -144,12 +145,38 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
 		marks_ignored_mask |= mark->ignored_mask;
 	}
 
+	test_mask = event_mask & marks_mask & ~marks_ignored_mask;
+
+	/*
+	 * dirent modification events (create/delete/move) do not carry the
+	 * child entry name/inode information. Instead, we report FAN_ONDIR
+	 * for mkdir/rmdir so user can differentiate them from creat/unlink.
+	 *
+	 * For backward compatibility and consistency, do not report FAN_ONDIR
+	 * to user in legacy fanotify mode (reporting fd) and report FAN_ONDIR
+	 * to user in FAN_REPORT_FID mode for all event types.
+	 */
+	if (fsid) {
+		/* Do not report FAN_ONDIR without an event type */
+		BUILD_BUG_ON(FANOTIFY_EVENT_TYPES & FANOTIFY_EVENT_FLAGS);
+		if (!(test_mask & FANOTIFY_EVENT_TYPES))
+			return 0;
+
+		user_mask |= FAN_ONDIR;
+	}
+
+	/*
+	 * Unlike legacy fanotify events (open/access/close), dirent events
+	 * for subdir entries (mkdir/rmdir) will be reported regardless if
+	 * user requested FAN_ONDIR, but the FAN_ONDIR flag itself will only
+	 * be reported if the user asked for it.
+	 */
 	if (event_mask & FS_ISDIR &&
+	    !(event_mask & ALL_FSNOTIFY_DIRENT_EVENTS) &&
 	    !(marks_mask & FS_ISDIR & ~marks_ignored_mask))
 		return 0;
 
-	return event_mask & FANOTIFY_OUTGOING_EVENTS & marks_mask &
-		~marks_ignored_mask;
+	return test_mask & user_mask;
 }
 
 static struct fanotify_event_fid *fanotify_alloc_fid(struct inode *inode,
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index e9d45387089f..f5f86566c277 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -61,13 +61,16 @@
 #define FANOTIFY_PERM_EVENTS	(FAN_OPEN_PERM | FAN_ACCESS_PERM | \
 				 FAN_OPEN_EXEC_PERM)
 
+/* Events types that may be reported from vfs */
+#define FANOTIFY_EVENT_TYPES	(FANOTIFY_EVENTS | \
+				 FANOTIFY_PERM_EVENTS)
+
 /* Extra flags that may be reported with event or control handling of events */
 #define FANOTIFY_EVENT_FLAGS	(FAN_EVENT_ON_CHILD | FAN_ONDIR)
 
 /* Events that may be reported to user */
-#define FANOTIFY_OUTGOING_EVENTS	(FANOTIFY_EVENTS | \
-					 FANOTIFY_PERM_EVENTS | \
-					 FAN_Q_OVERFLOW)
+#define FANOTIFY_OUTGOING_EVENTS	(FANOTIFY_EVENT_TYPES | \
+					 FAN_Q_OVERFLOW | FAN_ONDIR)
 
 #define ALL_FANOTIFY_EVENT_BITS		(FANOTIFY_OUTGOING_EVENTS | \
 					 FANOTIFY_EVENT_FLAGS)
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 01/13] fsnotify: annotate directory entry modification events
  2018-11-25 13:43 ` [PATCH v3 01/13] fsnotify: annotate directory entry modification events Amir Goldstein
@ 2018-11-28 12:59   ` Jan Kara
  2018-11-28 14:39     ` Amir Goldstein
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kara @ 2018-11-28 12:59 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Sun 25-11-18 15:43:40, Amir Goldstein wrote:
> "dirent" events are referring to events that modify directory entries,
> such as create,delete,rename. Those events should always be reported
> on a watched directory, regardless if FS_EVENT_ON_CHILD is set
> on the watch mask.
> 
> fsnotify_nameremove() and fsnotify_move() were modified to no longer
> set the FS_EVENT_ON_CHILD event bit. This is a semantic change to
> align with the "dirent" event definition. It has no effect on any
> existing backend, because dnotify, inotify and audit always requets the
> child events and fanotify does not get the delete,rename events.
> 
> The fsnotify_dirent() helper is used instead of fsnotify_parent() to
> report a dirent event to dentry->d_parent without FS_EVENT_ON_CHILD
> and regardless if parent has the FS_EVENT_ON_CHILD bit set.
> 
> ALL_FSNOTIFY_DIRENT_EVENTS defines all the dirent event types and
> those event types have been extracted out of FS_EVENTS_POSS_ON_CHILD.
> 
> That means for a directory with an inotify watch and only dirent
> events in the mask (i.e. create,delete,move), all children dentries
> will no longer have the DCACHE_FSNOTIFY_PARENT_WATCHED flag set.
> This will allow all events that happen on children to be optimized
> away in __fsnotify_parent() without the need to dereference
> child->d_parent->d_inode->i_fsnotify_mask.
> 
> Since the dirent events are never repoted via __fsnotify_parent(),
> this results in no change of logic, but only an optimization.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  include/linux/fsnotify.h         | 43 +++++++++++++++++++++++++-------
>  include/linux/fsnotify_backend.h | 36 +++++++++++++++-----------
>  2 files changed, 55 insertions(+), 24 deletions(-)

The patch looks good. Just one question and two nits below.

> +static inline int fsnotify_dirent(struct inode *dir, struct dentry *dentry,
> +				  __u32 mask)
> +{
> +	struct dentry *parent = NULL;
> +	int ret;
> +
> +	if (!dir) {
> +		parent = dget_parent(dentry);
> +		dir = d_inode(parent);
> +	}
> +
> +	ret = fsnotify(dir, mask, d_inode(dentry), FSNOTIFY_EVENT_INODE,
> +			dentry->d_name.name, 0);
> +
> +	dput(parent);
> +	return ret;

There's one more feature fsnotify_parent() provides - it calls
take_dentry_name_snapshot() before calling into fsnotify and uses
snapshotted name. I think you need to do the same here to provide the same
protection for fsnotify_nameremove, don't you? Or maybe you don't since
generally directory i_rwsem is held when unlinking and so dentry name cannot
change but then it would be good to assert that? Who knows what some weird
fs is doing...

> +/*
> + * This is a list of all events that may get sent to a parent based on fs event

^^^ Line too long.

> +#define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS| \
							^^^ space here

									Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 02/13] fsnotify: send all event types to super block marks
  2018-11-25 13:43 ` [PATCH v3 02/13] fsnotify: send all event types to super block marks Amir Goldstein
@ 2018-11-28 14:26   ` Jan Kara
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Kara @ 2018-11-28 14:26 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Sun 25-11-18 15:43:41, Amir Goldstein wrote:
> So far, existence of super block marks was checked only on events with
> data type FSNOTIFY_EVENT_PATH. Use the super block of the "to_tell" inode
> to report the events of all event types to super block marks.
> 
> This change has no effect on current backends. Soon, this will allow
> fanotify backend to receive all event types on a super block mark.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

Looks good.

								Honza

> ---
>  fs/notify/fsnotify.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index ecf09b6243d9..df06f3da166c 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -328,16 +328,15 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
>  	     const unsigned char *file_name, u32 cookie)
>  {
>  	struct fsnotify_iter_info iter_info = {};
> -	struct super_block *sb = NULL;
> +	struct super_block *sb = to_tell->i_sb;
>  	struct mount *mnt = NULL;
> -	__u32 mnt_or_sb_mask = 0;
> +	__u32 mnt_or_sb_mask = sb->s_fsnotify_mask;
>  	int ret = 0;
>  	__u32 test_mask = (mask & ALL_FSNOTIFY_EVENTS);
>  
>  	if (data_is == FSNOTIFY_EVENT_PATH) {
>  		mnt = real_mount(((const struct path *)data)->mnt);
> -		sb = mnt->mnt.mnt_sb;
> -		mnt_or_sb_mask = mnt->mnt_fsnotify_mask | sb->s_fsnotify_mask;
> +		mnt_or_sb_mask |= mnt->mnt_fsnotify_mask;
>  	}
>  	/* An event "on child" is not intended for a mount/sb mark */
>  	if (mask & FS_EVENT_ON_CHILD)
> @@ -350,8 +349,8 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
>  	 * SRCU because we have no references to any objects and do not
>  	 * need SRCU to keep them "alive".
>  	 */
> -	if (!to_tell->i_fsnotify_marks &&
> -	    (!mnt || (!mnt->mnt_fsnotify_marks && !sb->s_fsnotify_marks)))
> +	if (!to_tell->i_fsnotify_marks && !sb->s_fsnotify_marks &&
> +	    (!mnt || !mnt->mnt_fsnotify_marks))
>  		return 0;
>  	/*
>  	 * if this is a modify event we may need to clear the ignored masks
> @@ -366,11 +365,11 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
>  
>  	iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
>  		fsnotify_first_mark(&to_tell->i_fsnotify_marks);
> +	iter_info.marks[FSNOTIFY_OBJ_TYPE_SB] =
> +		fsnotify_first_mark(&sb->s_fsnotify_marks);
>  	if (mnt) {
>  		iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
>  			fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
> -		iter_info.marks[FSNOTIFY_OBJ_TYPE_SB] =
> -			fsnotify_first_mark(&sb->s_fsnotify_marks);
>  	}
>  
>  	/*
> -- 
> 2.17.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 03/13] fanotify: rename struct fanotify_{,perm_}event_info
  2018-11-25 13:43 ` [PATCH v3 03/13] fanotify: rename struct fanotify_{,perm_}event_info Amir Goldstein
@ 2018-11-28 14:29   ` Jan Kara
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Kara @ 2018-11-28 14:29 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Sun 25-11-18 15:43:42, Amir Goldstein wrote:
> struct fanotify_event_info "inherits" from struct fsnotify_event and
> therefore a more appropriate (and short) name for it is fanotify_event.
> Same for struct fanotify_perm_event_info, which now "inherits" from
> struct fanotify_event.
> 
> We plan to reuse the name struct fanotify_event_info for user visible
> event info record format.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

Why not and I like that the name is shorter :)

								Honza

> ---
>  fs/notify/fanotify/fanotify.c      | 14 +++++++-------
>  fs/notify/fanotify/fanotify.h      | 16 ++++++++--------
>  fs/notify/fanotify/fanotify_user.c | 20 ++++++++++----------
>  3 files changed, 25 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 3723f3d18d20..ecd5f4aec624 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -19,7 +19,7 @@
>  static bool should_merge(struct fsnotify_event *old_fsn,
>  			 struct fsnotify_event *new_fsn)
>  {
> -	struct fanotify_event_info *old, *new;
> +	struct fanotify_event *old, *new;
>  
>  	pr_debug("%s: old=%p new=%p\n", __func__, old_fsn, new_fsn);
>  	old = FANOTIFY_E(old_fsn);
> @@ -58,7 +58,7 @@ static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
>  }
>  
>  static int fanotify_get_response(struct fsnotify_group *group,
> -				 struct fanotify_perm_event_info *event,
> +				 struct fanotify_perm_event *event,
>  				 struct fsnotify_iter_info *iter_info)
>  {
>  	int ret;
> @@ -141,11 +141,11 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
>  		~marks_ignored_mask;
>  }
>  
> -struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group,
> +struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
>  						 struct inode *inode, u32 mask,
>  						 const struct path *path)
>  {
> -	struct fanotify_event_info *event = NULL;
> +	struct fanotify_event *event = NULL;
>  	gfp_t gfp = GFP_KERNEL_ACCOUNT;
>  
>  	/*
> @@ -160,7 +160,7 @@ struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group,
>  	memalloc_use_memcg(group->memcg);
>  
>  	if (fanotify_is_perm_event(mask)) {
> -		struct fanotify_perm_event_info *pevent;
> +		struct fanotify_perm_event *pevent;
>  
>  		pevent = kmem_cache_alloc(fanotify_perm_event_cachep, gfp);
>  		if (!pevent)
> @@ -197,7 +197,7 @@ static int fanotify_handle_event(struct fsnotify_group *group,
>  				 struct fsnotify_iter_info *iter_info)
>  {
>  	int ret = 0;
> -	struct fanotify_event_info *event;
> +	struct fanotify_event *event;
>  	struct fsnotify_event *fsn_event;
>  
>  	BUILD_BUG_ON(FAN_ACCESS != FS_ACCESS);
> @@ -275,7 +275,7 @@ static void fanotify_free_group_priv(struct fsnotify_group *group)
>  
>  static void fanotify_free_event(struct fsnotify_event *fsn_event)
>  {
> -	struct fanotify_event_info *event;
> +	struct fanotify_event *event;
>  
>  	event = FANOTIFY_E(fsn_event);
>  	path_put(&event->path);
> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index ea05b8a401e7..fb84dd3289f8 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -12,7 +12,7 @@ extern struct kmem_cache *fanotify_perm_event_cachep;
>   * fanotify_handle_event() and freed when the information is retrieved by
>   * userspace
>   */
> -struct fanotify_event_info {
> +struct fanotify_event {
>  	struct fsnotify_event fse;
>  	/*
>  	 * We hold ref to this path so it may be dereferenced at any point
> @@ -29,16 +29,16 @@ struct fanotify_event_info {
>   * group->notification_list to group->fanotify_data.access_list to wait for
>   * user response.
>   */
> -struct fanotify_perm_event_info {
> -	struct fanotify_event_info fae;
> +struct fanotify_perm_event {
> +	struct fanotify_event fae;
>  	int response;	/* userspace answer to question */
>  	int fd;		/* fd we passed to userspace for this event */
>  };
>  
> -static inline struct fanotify_perm_event_info *
> +static inline struct fanotify_perm_event *
>  FANOTIFY_PE(struct fsnotify_event *fse)
>  {
> -	return container_of(fse, struct fanotify_perm_event_info, fae.fse);
> +	return container_of(fse, struct fanotify_perm_event, fae.fse);
>  }
>  
>  static inline bool fanotify_is_perm_event(u32 mask)
> @@ -47,11 +47,11 @@ static inline bool fanotify_is_perm_event(u32 mask)
>  		mask & FANOTIFY_PERM_EVENTS;
>  }
>  
> -static inline struct fanotify_event_info *FANOTIFY_E(struct fsnotify_event *fse)
> +static inline struct fanotify_event *FANOTIFY_E(struct fsnotify_event *fse)
>  {
> -	return container_of(fse, struct fanotify_event_info, fse);
> +	return container_of(fse, struct fanotify_event, fse);
>  }
>  
> -struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group,
> +struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
>  						 struct inode *inode, u32 mask,
>  						 const struct path *path);
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index e03be5071362..2dbb2662a92f 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -73,7 +73,7 @@ static struct fsnotify_event *get_one_event(struct fsnotify_group *group,
>  }
>  
>  static int create_fd(struct fsnotify_group *group,
> -		     struct fanotify_event_info *event,
> +		     struct fanotify_event *event,
>  		     struct file **file)
>  {
>  	int client_fd;
> @@ -120,13 +120,13 @@ static int fill_event_metadata(struct fsnotify_group *group,
>  			       struct file **file)
>  {
>  	int ret = 0;
> -	struct fanotify_event_info *event;
> +	struct fanotify_event *event;
>  
>  	pr_debug("%s: group=%p metadata=%p event=%p\n", __func__,
>  		 group, metadata, fsn_event);
>  
>  	*file = NULL;
> -	event = container_of(fsn_event, struct fanotify_event_info, fse);
> +	event = container_of(fsn_event, struct fanotify_event, fse);
>  	metadata->event_len = FAN_EVENT_METADATA_LEN;
>  	metadata->metadata_len = FAN_EVENT_METADATA_LEN;
>  	metadata->vers = FANOTIFY_METADATA_VERSION;
> @@ -144,10 +144,10 @@ static int fill_event_metadata(struct fsnotify_group *group,
>  	return ret;
>  }
>  
> -static struct fanotify_perm_event_info *dequeue_event(
> +static struct fanotify_perm_event *dequeue_event(
>  				struct fsnotify_group *group, int fd)
>  {
> -	struct fanotify_perm_event_info *event, *return_e = NULL;
> +	struct fanotify_perm_event *event, *return_e = NULL;
>  
>  	spin_lock(&group->notification_lock);
>  	list_for_each_entry(event, &group->fanotify_data.access_list,
> @@ -169,7 +169,7 @@ static struct fanotify_perm_event_info *dequeue_event(
>  static int process_access_response(struct fsnotify_group *group,
>  				   struct fanotify_response *response_struct)
>  {
> -	struct fanotify_perm_event_info *event;
> +	struct fanotify_perm_event *event;
>  	int fd = response_struct->fd;
>  	int response = response_struct->response;
>  
> @@ -364,7 +364,7 @@ static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t
>  static int fanotify_release(struct inode *ignored, struct file *file)
>  {
>  	struct fsnotify_group *group = file->private_data;
> -	struct fanotify_perm_event_info *event, *next;
> +	struct fanotify_perm_event *event, *next;
>  	struct fsnotify_event *fsn_event;
>  
>  	/*
> @@ -682,7 +682,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>  	struct fsnotify_group *group;
>  	int f_flags, fd;
>  	struct user_struct *user;
> -	struct fanotify_event_info *oevent;
> +	struct fanotify_event *oevent;
>  
>  	pr_debug("%s: flags=%x event_f_flags=%x\n",
>  		 __func__, flags, event_f_flags);
> @@ -949,10 +949,10 @@ static int __init fanotify_user_setup(void)
>  
>  	fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
>  					 SLAB_PANIC|SLAB_ACCOUNT);
> -	fanotify_event_cachep = KMEM_CACHE(fanotify_event_info, SLAB_PANIC);
> +	fanotify_event_cachep = KMEM_CACHE(fanotify_event, SLAB_PANIC);
>  	if (IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS)) {
>  		fanotify_perm_event_cachep =
> -			KMEM_CACHE(fanotify_perm_event_info, SLAB_PANIC);
> +			KMEM_CACHE(fanotify_perm_event, SLAB_PANIC);
>  	}
>  
>  	return 0;
> -- 
> 2.17.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 01/13] fsnotify: annotate directory entry modification events
  2018-11-28 12:59   ` Jan Kara
@ 2018-11-28 14:39     ` Amir Goldstein
  2018-11-28 14:43       ` Jan Kara
  0 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-28 14:39 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Wed, Nov 28, 2018 at 2:59 PM Jan Kara <jack@suse.cz> wrote:
>
> On Sun 25-11-18 15:43:40, Amir Goldstein wrote:
> > "dirent" events are referring to events that modify directory entries,
> > such as create,delete,rename. Those events should always be reported
> > on a watched directory, regardless if FS_EVENT_ON_CHILD is set
> > on the watch mask.
> >
> > fsnotify_nameremove() and fsnotify_move() were modified to no longer
> > set the FS_EVENT_ON_CHILD event bit. This is a semantic change to
> > align with the "dirent" event definition. It has no effect on any
> > existing backend, because dnotify, inotify and audit always requets the
> > child events and fanotify does not get the delete,rename events.
> >
> > The fsnotify_dirent() helper is used instead of fsnotify_parent() to
> > report a dirent event to dentry->d_parent without FS_EVENT_ON_CHILD
> > and regardless if parent has the FS_EVENT_ON_CHILD bit set.
> >
> > ALL_FSNOTIFY_DIRENT_EVENTS defines all the dirent event types and
> > those event types have been extracted out of FS_EVENTS_POSS_ON_CHILD.
> >
> > That means for a directory with an inotify watch and only dirent
> > events in the mask (i.e. create,delete,move), all children dentries
> > will no longer have the DCACHE_FSNOTIFY_PARENT_WATCHED flag set.
> > This will allow all events that happen on children to be optimized
> > away in __fsnotify_parent() without the need to dereference
> > child->d_parent->d_inode->i_fsnotify_mask.
> >
> > Since the dirent events are never repoted via __fsnotify_parent(),
> > this results in no change of logic, but only an optimization.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  include/linux/fsnotify.h         | 43 +++++++++++++++++++++++++-------
> >  include/linux/fsnotify_backend.h | 36 +++++++++++++++-----------
> >  2 files changed, 55 insertions(+), 24 deletions(-)
>
> The patch looks good. Just one question and two nits below.
>
> > +static inline int fsnotify_dirent(struct inode *dir, struct dentry *dentry,
> > +                               __u32 mask)
> > +{
> > +     struct dentry *parent = NULL;
> > +     int ret;
> > +
> > +     if (!dir) {
> > +             parent = dget_parent(dentry);
> > +             dir = d_inode(parent);
> > +     }
> > +
> > +     ret = fsnotify(dir, mask, d_inode(dentry), FSNOTIFY_EVENT_INODE,
> > +                     dentry->d_name.name, 0);
> > +
> > +     dput(parent);
> > +     return ret;
>
> There's one more feature fsnotify_parent() provides - it calls
> take_dentry_name_snapshot() before calling into fsnotify and uses
> snapshotted name. I think you need to do the same here to provide the same
> protection for fsnotify_nameremove, don't you? Or maybe you don't since
> generally directory i_rwsem is held when unlinking and so dentry name cannot
> change but then it would be good to assert that? Who knows what some weird
> fs is doing...
>

Right.. I will add that assert and at the same time, for the same reason, this
helper doesn't need to dget_parent(), because d_parent should also be stable.

> > +/*
> > + * This is a list of all events that may get sent to a parent based on fs event
>
> ^^^ Line too long.

Heh?? checkpatch did not complain. It's 79 chars.

>
> > +#define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS| \
>                                                         ^^^ space here
>

My patch did not change indentation AFAIK.
The new definition added is with tab (ALL_FSNOTIFY_DIRENT_EVENTS)
but some of the existing definitions, even ones moved around are with space(s).
Do you want me to change that?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 01/13] fsnotify: annotate directory entry modification events
  2018-11-28 14:39     ` Amir Goldstein
@ 2018-11-28 14:43       ` Jan Kara
  2018-11-28 15:01         ` Amir Goldstein
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kara @ 2018-11-28 14:43 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Wed 28-11-18 16:39:31, Amir Goldstein wrote:
> On Wed, Nov 28, 2018 at 2:59 PM Jan Kara <jack@suse.cz> wrote:
> > > +/*
> > > + * This is a list of all events that may get sent to a parent based on fs event
> >
> > ^^^ Line too long.
> 
> Heh?? checkpatch did not complain. It's 79 chars.

My bad, just the quoting in the reply pushed the line over 80 chars and I
didn't realize that.

> >
> > > +#define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS| \
> >                                                         ^^^ space here
> >
> 
> My patch did not change indentation AFAIK.
> The new definition added is with tab (ALL_FSNOTIFY_DIRENT_EVENTS)
> but some of the existing definitions, even ones moved around are with space(s).
> Do you want me to change that?

This is not about indentation but space before '|' operator...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 01/13] fsnotify: annotate directory entry modification events
  2018-11-28 14:43       ` Jan Kara
@ 2018-11-28 15:01         ` Amir Goldstein
  0 siblings, 0 replies; 42+ messages in thread
From: Amir Goldstein @ 2018-11-28 15:01 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Wed, Nov 28, 2018 at 4:43 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 28-11-18 16:39:31, Amir Goldstein wrote:
> > On Wed, Nov 28, 2018 at 2:59 PM Jan Kara <jack@suse.cz> wrote:
> > > > +/*
> > > > + * This is a list of all events that may get sent to a parent based on fs event
> > >
> > > ^^^ Line too long.
> >
> > Heh?? checkpatch did not complain. It's 79 chars.
>
> My bad, just the quoting in the reply pushed the line over 80 chars and I
> didn't realize that.
>
> > >
> > > > +#define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS| \
> > >                                                         ^^^ space here
> > >
> >
> > My patch did not change indentation AFAIK.
> > The new definition added is with tab (ALL_FSNOTIFY_DIRENT_EVENTS)
> > but some of the existing definitions, even ones moved around are with space(s).
> > Do you want me to change that?
>
> This is not about indentation but space before '|' operator...
>
Ah. ok. fixed.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier
  2018-11-25 13:43 ` [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier Amir Goldstein
@ 2018-11-28 15:27   ` Jan Kara
  2018-11-28 16:24     ` Amir Goldstein
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kara @ 2018-11-28 15:27 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, linux-api

On Sun 25-11-18 15:43:43, Amir Goldstein wrote:
> When user requests the flag FAN_REPORT_FID in fanotify_init(),
> a unique file indetifier of the event target object will be reported
> with the event.
> 
> This commit only defines the internal and user visible structures used
> to store and report the unique file identifier.
> 
> The file identifier includes the filesystem's fsid (i.e. from statfs(2))
> and an NFS file handle of the file (i.e. from name_to_handle_at(2)).
> 
> The file identifier makes holding the path reference and passing a file
> descriptor to user redundant, so those are disabled in a group with
> FAN_REPORT_FID.
> 
> Cc: <linux-api@vger.kernel.org>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

On a general note I'd fold patches 4-6 into one patch as separating them
looks weird to me and does not really simplify the review (I had to jump
among these three patches frequently to understand what's going on).

> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index fb84dd3289f8..2e4fca30afda 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -7,6 +7,14 @@ extern struct kmem_cache *fanotify_mark_cache;
>  extern struct kmem_cache *fanotify_event_cachep;
>  extern struct kmem_cache *fanotify_perm_event_cachep;
>  
> +/* The size of the variable length buffer storing fsid and file handle */
> +#define FANOTIFY_FID_LEN(handle_bytes)	\
> +	(sizeof(struct fanotify_event_fid) + (handle_bytes))
> +
> +struct fanotify_info {
> +	struct fanotify_event_fid *fid;
> +};
> +

Hum, lot of file handles actually fit in 24 bytes and separate allocation
for such small things is really an overkill. And the rate at which we
produce events can be relatively large. So I'd prefer if could embed such
small handles inside the struct fanotify_event. Probably as a separate
optimization patch.

> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 2dbb2662a92f..93e1aa2a389f 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -133,9 +133,10 @@ static int fill_event_metadata(struct fsnotify_group *group,
>  	metadata->reserved = 0;
>  	metadata->mask = fsn_event->mask & FANOTIFY_OUTGOING_EVENTS;
>  	metadata->pid = pid_vnr(event->pid);
> -	if (unlikely(fsn_event->mask & FAN_Q_OVERFLOW))
> +	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID) ||
> +	    unlikely(fsn_event->mask & FAN_Q_OVERFLOW)) {
>  		metadata->fd = FAN_NOFD;
> -	else {
> +	} else {

Use FANOTIFY_HAS_FID() helper here?

> @@ -106,6 +107,24 @@ struct fanotify_event_metadata {
>  	__s32 pid;
>  };
>  
> +#define FAN_EVENT_INFO_TYPE_FID		1
> +
> +/* Variable length info record header following event metadata */
> +struct fanotify_event_info {
> +	__u8 info_type;
> +	__u8 reserved;
> +	__u16 info_len;
> +	unsigned char info[0];
> +};
> +
> +/* Unique file identifier info record */
> +struct fanotify_event_fid {
> +	__kernel_fsid_t fsid;
> +	__u32 handle_bytes;
> +	__s32 handle_type;
> +	unsigned char f_handle[0];
> +};
> +

Hum, I find another generic embedded fanotify_event_info an
overengineering. fanotify_event_metadata with embedded versioning and
length was supposed to be extensible enough without further generic headers
being embedded... I know you had ideas for further extension of reported
information so probably that is the reason but at least from my POV why not
just bump the 'vers' field to 4 for groups with FAN_REPORT_FID and create
struct fanotify_event_metadata_v4 which would have all necessary fields for
passing the filehandle (and probably it does not have to have 'fd' field)?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 05/13] fanotify: classify events that hold a file identifier
  2018-11-25 13:43 ` [PATCH v3 05/13] fanotify: classify events that hold a " Amir Goldstein
@ 2018-11-28 15:33   ` Jan Kara
  2018-11-28 15:44     ` Jan Kara
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kara @ 2018-11-28 15:33 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Sun 25-11-18 15:43:44, Amir Goldstein wrote:
> With group flag FAN_REPORT_FID, we do not store event->path.
> Instead, we will store the file handle and fsid in event->info.fid.
> 
> Because event->info and event->path share the same union space, set
> bit 0 of event->info.flags, so we know how to free and compare the
> event objects.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/notify/fanotify/fanotify.c      | 35 ++++++++++++++++++++++++------
>  fs/notify/fanotify/fanotify.h      | 27 +++++++++++++++++++++++
>  fs/notify/fanotify/fanotify_user.c |  3 +++
>  3 files changed, 58 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 59d093923c97..8192d4b1db21 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -25,11 +25,16 @@ static bool should_merge(struct fsnotify_event *old_fsn,
>  	old = FANOTIFY_E(old_fsn);
>  	new = FANOTIFY_E(new_fsn);
>  
> -	if (old_fsn->inode == new_fsn->inode && old->pid == new->pid &&
> -	    old->path.mnt == new->path.mnt &&
> -	    old->path.dentry == new->path.dentry)
> -		return true;
> -	return false;
> +	if (old_fsn->inode != new_fsn->inode || old->pid != new->pid)
> +		return false;
> +
> +	if (FANOTIFY_HAS_FID(new)) {
> +		return FANOTIFY_HAS_FID(old) &&
> +			fanotify_fid_equal(old->info.fid, new->info.fid);
> +	} else {
> +		return old->path.mnt == new->path.mnt &&
> +			old->path.dentry == new->path.dentry;
> +	}
>  }
>  
>  /* and the list better be locked by something too! */
> @@ -178,7 +183,10 @@ init: __maybe_unused
>  		event->pid = get_pid(task_pid(current));
>  	else
>  		event->pid = get_pid(task_tgid(current));
> -	if (path && !FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
> +	if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
> +		/* TODO: allocate buffer and encode file handle */
> +		fanotify_set_fid(event, NULL);
> +	} else if (path) {
>  		event->path = *path;
>  		path_get(&event->path);
>  	} else {
> @@ -277,8 +285,21 @@ static void fanotify_free_event(struct fsnotify_event *fsn_event)
>  {
>  	struct fanotify_event *event;
>  
> +	/*
> +	 * event->path and event->info are a union. Make sure that
> +	 * event->info.flags overlaps with path.dentry pointer, so it is safe
> +	 * to use bit 0 to classify the event info type (FANOTIFY_HAS_FID).
> +	 */
> +	BUILD_BUG_ON(offsetof(struct fanotify_event, path.dentry) !=
> +		     offsetof(struct fanotify_event, info.flags));
> +	BUILD_BUG_ON(!__builtin_types_compatible_p(typeof(event->info.flags),
> +						   unsigned long));
> +
>  	event = FANOTIFY_E(fsn_event);
> -	path_put(&event->path);
> +	if (FANOTIFY_HAS_FID(event))
> +		kfree(event->info.fid);
> +	else
> +		path_put(&event->path);
>  	put_pid(event->pid);
>  	if (fanotify_is_perm_event(fsn_event->mask)) {
>  		kmem_cache_free(fanotify_perm_event_cachep,
> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index 2e4fca30afda..a79dcbd41702 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -13,8 +13,20 @@ extern struct kmem_cache *fanotify_perm_event_cachep;
>  
>  struct fanotify_info {
>  	struct fanotify_event_fid *fid;
> +	unsigned long flags;
>  };
>  
> +static inline bool fanotify_fid_equal(const struct fanotify_event_fid *fid1,
> +				      const struct fanotify_event_fid *fid2)
> +{
> +	if (fid1 == fid2)
> +		return true;
> +
> +	return fid1 && fid2 && fid1->handle_bytes == fid2->handle_bytes &&
> +		!memcmp((void *)fid1, (void *)fid2,
> +			FANOTIFY_FID_LEN(fid1->handle_bytes));
> +}
> +
>  /*
>   * Structure for normal fanotify events. It gets allocated in
>   * fanotify_handle_event() and freed when the information is retrieved by
> @@ -38,6 +50,21 @@ struct fanotify_event {
>  	struct pid *pid;
>  };
>  
> +/*
> + * Bit 0 of info.flags is set when event has fid information.
> + * event->info shares the same union address with event->path, so this helps
> + * us tell if event has fid or path.
> + */
> +#define __FANOTIFY_FID		1UL
> +#define FANOTIFY_HAS_FID(event)	((event)->info.flags & __FANOTIFY_FID)

Why isn't this an inline function? And then the name wouldn't have to be
all caps... Also

> +static inline void fanotify_set_fid(struct fanotify_event *event,
> +				    struct fanotify_event_fid *fid)
> +{
> +	event->info.fid = fid;
> +	event->info.flags = fid ? __FANOTIFY_FID : 0;
> +}
> +
>  /*
>   * Structure for permission fanotify events. It gets allocated and freed in
>   * fanotify_handle_event() since we wait there for user response. When the
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 93e1aa2a389f..47e8bf3bcd28 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -81,6 +81,9 @@ static int create_fd(struct fsnotify_group *group,
>  
>  	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
>  
> +	if (WARN_ON_ONCE(FANOTIFY_HAS_FID(event)))
> +		return -EBADF;
> +
>  	client_fd = get_unused_fd_flags(group->fanotify_data.f_flags);
>  	if (client_fd < 0)
>  		return client_fd;
> -- 
> 2.17.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 05/13] fanotify: classify events that hold a file identifier
  2018-11-28 15:33   ` Jan Kara
@ 2018-11-28 15:44     ` Jan Kara
  2018-11-28 15:52       ` Amir Goldstein
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kara @ 2018-11-28 15:44 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

Sorry, sent previous reply too early.

On Wed 28-11-18 16:33:12, Jan Kara wrote:
> @@ -38,6 +50,21 @@ struct fanotify_event {
>  	struct pid *pid;
>  };
>  
> +/*
> + * Bit 0 of info.flags is set when event has fid information.
> + * event->info shares the same union address with event->path, so this helps
> + * us tell if event has fid or path.
> + */
> +#define __FANOTIFY_FID		1UL
> +#define FANOTIFY_HAS_FID(event)	((event)->info.flags & __FANOTIFY_FID)

Why isn't this an inline function? And then the name wouldn't have to be
all caps...

Also I'd prefer if we just enlarged struct fanotify_event by 8 bytes to
allow simple 'flags' field and info about possible embedded fid, rather
than playing these bit tricks. They save space but make code more subtle
and I don't file struct fanotify_event size so critical.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 05/13] fanotify: classify events that hold a file identifier
  2018-11-28 15:44     ` Jan Kara
@ 2018-11-28 15:52       ` Amir Goldstein
  0 siblings, 0 replies; 42+ messages in thread
From: Amir Goldstein @ 2018-11-28 15:52 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Wed, Nov 28, 2018 at 5:44 PM Jan Kara <jack@suse.cz> wrote:
>
> Sorry, sent previous reply too early.
>
> On Wed 28-11-18 16:33:12, Jan Kara wrote:
> > @@ -38,6 +50,21 @@ struct fanotify_event {
> >       struct pid *pid;
> >  };
> >
> > +/*
> > + * Bit 0 of info.flags is set when event has fid information.
> > + * event->info shares the same union address with event->path, so this helps
> > + * us tell if event has fid or path.
> > + */
> > +#define __FANOTIFY_FID               1UL
> > +#define FANOTIFY_HAS_FID(event)      ((event)->info.flags & __FANOTIFY_FID)
>
> Why isn't this an inline function? And then the name wouldn't have to be
> all caps...

Sure.

>
> Also I'd prefer if we just enlarged struct fanotify_event by 8 bytes to
> allow simple 'flags' field and info about possible embedded fid, rather
> than playing these bit tricks. They save space but make code more subtle
> and I don't file struct fanotify_event size so critical.
>

Ok.

Freeing the space of path.dentry and path.mnt will give us 16 bytes of space
for embedded fid, which should be sufficient for many file handles (ext4, xfs).
and flags could indicate if fid is embedded or allocated.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier
  2018-11-28 15:27   ` Jan Kara
@ 2018-11-28 16:24     ` Amir Goldstein
  2018-11-28 17:43       ` Jan Kara
  0 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-28 16:24 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

On Wed, Nov 28, 2018 at 5:27 PM Jan Kara <jack@suse.cz> wrote:
>
> On Sun 25-11-18 15:43:43, Amir Goldstein wrote:
> > When user requests the flag FAN_REPORT_FID in fanotify_init(),
> > a unique file indetifier of the event target object will be reported
> > with the event.
> >
> > This commit only defines the internal and user visible structures used
> > to store and report the unique file identifier.
> >
> > The file identifier includes the filesystem's fsid (i.e. from statfs(2))
> > and an NFS file handle of the file (i.e. from name_to_handle_at(2)).
> >
> > The file identifier makes holding the path reference and passing a file
> > descriptor to user redundant, so those are disabled in a group with
> > FAN_REPORT_FID.
> >
> > Cc: <linux-api@vger.kernel.org>
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> On a general note I'd fold patches 4-6 into one patch as separating them
> looks weird to me and does not really simplify the review (I had to jump
> among these three patches frequently to understand what's going on).
>
> > diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> > index fb84dd3289f8..2e4fca30afda 100644
> > --- a/fs/notify/fanotify/fanotify.h
> > +++ b/fs/notify/fanotify/fanotify.h
> > @@ -7,6 +7,14 @@ extern struct kmem_cache *fanotify_mark_cache;
> >  extern struct kmem_cache *fanotify_event_cachep;
> >  extern struct kmem_cache *fanotify_perm_event_cachep;
> >
> > +/* The size of the variable length buffer storing fsid and file handle */
> > +#define FANOTIFY_FID_LEN(handle_bytes)       \
> > +     (sizeof(struct fanotify_event_fid) + (handle_bytes))
> > +
> > +struct fanotify_info {
> > +     struct fanotify_event_fid *fid;
> > +};
> > +
>
> Hum, lot of file handles actually fit in 24 bytes and separate allocation
> for such small things is really an overkill. And the rate at which we
> produce events can be relatively large. So I'd prefer if could embed such
> small handles inside the struct fanotify_event. Probably as a separate
> optimization patch.
>
> > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > index 2dbb2662a92f..93e1aa2a389f 100644
> > --- a/fs/notify/fanotify/fanotify_user.c
> > +++ b/fs/notify/fanotify/fanotify_user.c
> > @@ -133,9 +133,10 @@ static int fill_event_metadata(struct fsnotify_group *group,
> >       metadata->reserved = 0;
> >       metadata->mask = fsn_event->mask & FANOTIFY_OUTGOING_EVENTS;
> >       metadata->pid = pid_vnr(event->pid);
> > -     if (unlikely(fsn_event->mask & FAN_Q_OVERFLOW))
> > +     if (FAN_GROUP_FLAG(group, FAN_REPORT_FID) ||
> > +         unlikely(fsn_event->mask & FAN_Q_OVERFLOW)) {
> >               metadata->fd = FAN_NOFD;
> > -     else {
> > +     } else {
>
> Use FANOTIFY_HAS_FID() helper here?

Not here. FAN_REPORT_FID implies that event->fd will never be used.
But I need to check FANOTIFY_HAS_FID() before copy_fid_to_user(),
because we could fail to decode fid and still report the event without fid.

>
> > @@ -106,6 +107,24 @@ struct fanotify_event_metadata {
> >       __s32 pid;
> >  };
> >
> > +#define FAN_EVENT_INFO_TYPE_FID              1
> > +
> > +/* Variable length info record header following event metadata */
> > +struct fanotify_event_info {
> > +     __u8 info_type;
> > +     __u8 reserved;
> > +     __u16 info_len;
> > +     unsigned char info[0];
> > +};
> > +
> > +/* Unique file identifier info record */
> > +struct fanotify_event_fid {
> > +     __kernel_fsid_t fsid;
> > +     __u32 handle_bytes;
> > +     __s32 handle_type;
> > +     unsigned char f_handle[0];
> > +};
> > +
>
> Hum, I find another generic embedded fanotify_event_info an
> overengineering. fanotify_event_metadata with embedded versioning and
> length was supposed to be extensible enough without further generic headers
> being embedded... I know you had ideas for further extension of reported
> information so probably that is the reason but at least from my POV why not
> just bump the 'vers' field to 4 for groups with FAN_REPORT_FID and create
> struct fanotify_event_metadata_v4 which would have all necessary fields for
> passing the filehandle (and probably it does not have to have 'fd' field)?
>

Two reasons mainly:
1. Considering further extensions, this design looks like a better fit to me.
2. Related to #1, I don't like the way uapi gets bloated with all definition of
past format versions, so IMO format bump should be last resort

I'm perfectly fine with getting rid of fanotify_event_info record header.
I agree that it is overengineering.

How about:
struct fanotify_event_info_fid {
          struct fanotify_event_metadata hdr;
          struct fanotify_event_fid fid;
};

Then fanotify_example program from man fanotify(7) is legit even when adding
FAN_REPORT_FID to fanotify_init().

Programs that want to access fid can cast to (struct fanotify_event_info_fid *)
and access fid info.

This will make the process of adapting existing programs to FAN_REPORT_FID
smoother IMO. And the only cost we need to pay is carry event->fd = FAN_NOFD
in wasted event space.

Thoughts?

Amir.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier
  2018-11-28 16:24     ` Amir Goldstein
@ 2018-11-28 17:43       ` Jan Kara
  2018-11-28 18:34         ` Amir Goldstein
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kara @ 2018-11-28 17:43 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, linux-api

On Wed 28-11-18 18:24:11, Amir Goldstein wrote:
> On Wed, Nov 28, 2018 at 5:27 PM Jan Kara <jack@suse.cz> wrote:
> > > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > > index 2dbb2662a92f..93e1aa2a389f 100644
> > > --- a/fs/notify/fanotify/fanotify_user.c
> > > +++ b/fs/notify/fanotify/fanotify_user.c
> > > @@ -133,9 +133,10 @@ static int fill_event_metadata(struct fsnotify_group *group,
> > >       metadata->reserved = 0;
> > >       metadata->mask = fsn_event->mask & FANOTIFY_OUTGOING_EVENTS;
> > >       metadata->pid = pid_vnr(event->pid);
> > > -     if (unlikely(fsn_event->mask & FAN_Q_OVERFLOW))
> > > +     if (FAN_GROUP_FLAG(group, FAN_REPORT_FID) ||
> > > +         unlikely(fsn_event->mask & FAN_Q_OVERFLOW)) {
> > >               metadata->fd = FAN_NOFD;
> > > -     else {
> > > +     } else {
> >
> > Use FANOTIFY_HAS_FID() helper here?
> 
> Not here. FAN_REPORT_FID implies that event->fd will never be used.
> But I need to check FANOTIFY_HAS_FID() before copy_fid_to_user(),
> because we could fail to decode fid and still report the event without fid.

OK. So maybe something like would be more obvious?

	if (fanotify_event_has_path(event)) {
		create and store fd
	} else if (fanotify_event_has_fid(event))
		store fid
	} else {
		clear fd & fid
	}

> > > @@ -106,6 +107,24 @@ struct fanotify_event_metadata {
> > >       __s32 pid;
> > >  };
> > >
> > > +#define FAN_EVENT_INFO_TYPE_FID              1
> > > +
> > > +/* Variable length info record header following event metadata */
> > > +struct fanotify_event_info {
> > > +     __u8 info_type;
> > > +     __u8 reserved;
> > > +     __u16 info_len;
> > > +     unsigned char info[0];
> > > +};
> > > +
> > > +/* Unique file identifier info record */
> > > +struct fanotify_event_fid {
> > > +     __kernel_fsid_t fsid;
> > > +     __u32 handle_bytes;
> > > +     __s32 handle_type;
> > > +     unsigned char f_handle[0];
> > > +};
> > > +
> >
> > Hum, I find another generic embedded fanotify_event_info an
> > overengineering. fanotify_event_metadata with embedded versioning and
> > length was supposed to be extensible enough without further generic headers
> > being embedded... I know you had ideas for further extension of reported
> > information so probably that is the reason but at least from my POV why not
> > just bump the 'vers' field to 4 for groups with FAN_REPORT_FID and create
> > struct fanotify_event_metadata_v4 which would have all necessary fields for
> > passing the filehandle (and probably it does not have to have 'fd' field)?
> >
> 
> Two reasons mainly:
> 1. Considering further extensions, this design looks like a better fit to me.
> 2. Related to #1, I don't like the way uapi gets bloated with all
> definition of past format versions, so IMO format bump should be last
> resort
> 
> I'm perfectly fine with getting rid of fanotify_event_info record header.
> I agree that it is overengineering.
> 
> How about:
> struct fanotify_event_info_fid {
>           struct fanotify_event_metadata hdr;
>           struct fanotify_event_fid fid;
> };
> 
> Then fanotify_example program from man fanotify(7) is legit even when
> adding FAN_REPORT_FID to fanotify_init().
> 
> Programs that want to access fid can cast to (struct
> fanotify_event_info_fid *) and access fid info.

OK, what I'm uneasy about is the fact that the event format is defined by
group flags and not determined within the event itself. To demonstrate
what I mean: Group with FAN_REPORT_FID has fanotify_event_fid appended at
the end. If we have flag FAN_REPORT_ELSE, then group with FAN_REPORT_ELSE
would have fanotify_event_else appended at the end. Now if you have group
with FAN_REPORT_FID | FAN_REPORT_ELSE, then what happens?

So when thinking about this more I actually think your idea with some
header is not bad. I'd just implement it like:

struct fanotify_event_info_header {
	__u8 info_type;
	__u8 pad;
	__u16 len;
}

and then

struct fanotify_event_fid {
	struct fanotify_event_info_header hdr;
	__kernel_fsid_t fsid;
	...
}

We have to be careful with padding but it should work here since everything
is at 32-bit granularity. Thoughts?

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier
  2018-11-28 17:43       ` Jan Kara
@ 2018-11-28 18:34         ` Amir Goldstein
  2018-11-29  7:51           ` Jan Kara
  0 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-28 18:34 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

On Wed, Nov 28, 2018 at 7:43 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 28-11-18 18:24:11, Amir Goldstein wrote:
> > On Wed, Nov 28, 2018 at 5:27 PM Jan Kara <jack@suse.cz> wrote:
> > > > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > > > index 2dbb2662a92f..93e1aa2a389f 100644
> > > > --- a/fs/notify/fanotify/fanotify_user.c
> > > > +++ b/fs/notify/fanotify/fanotify_user.c
> > > > @@ -133,9 +133,10 @@ static int fill_event_metadata(struct fsnotify_group *group,
> > > >       metadata->reserved = 0;
> > > >       metadata->mask = fsn_event->mask & FANOTIFY_OUTGOING_EVENTS;
> > > >       metadata->pid = pid_vnr(event->pid);
> > > > -     if (unlikely(fsn_event->mask & FAN_Q_OVERFLOW))
> > > > +     if (FAN_GROUP_FLAG(group, FAN_REPORT_FID) ||
> > > > +         unlikely(fsn_event->mask & FAN_Q_OVERFLOW)) {
> > > >               metadata->fd = FAN_NOFD;
> > > > -     else {
> > > > +     } else {
> > >
> > > Use FANOTIFY_HAS_FID() helper here?
> >
> > Not here. FAN_REPORT_FID implies that event->fd will never be used.
> > But I need to check FANOTIFY_HAS_FID() before copy_fid_to_user(),
> > because we could fail to decode fid and still report the event without fid.
>
> OK. So maybe something like would be more obvious?
>
>         if (fanotify_event_has_path(event)) {
>                 create and store fd
>         } else if (fanotify_event_has_fid(event))
>                 store fid
>         } else {
>                 clear fd & fid
>         }

The issue is that fill_event_metadata() function fills a fixed size header
and later copy_event_to_user() copies that header to user and then
copies the variable sized fid to user, so this is not the place to "store"
fid, but I will work on readability of this code.

>
> > > > @@ -106,6 +107,24 @@ struct fanotify_event_metadata {
> > > >       __s32 pid;
> > > >  };
> > > >
> > > > +#define FAN_EVENT_INFO_TYPE_FID              1
> > > > +
> > > > +/* Variable length info record header following event metadata */
> > > > +struct fanotify_event_info {
> > > > +     __u8 info_type;
> > > > +     __u8 reserved;
> > > > +     __u16 info_len;
> > > > +     unsigned char info[0];
> > > > +};
> > > > +
> > > > +/* Unique file identifier info record */
> > > > +struct fanotify_event_fid {
> > > > +     __kernel_fsid_t fsid;
> > > > +     __u32 handle_bytes;
> > > > +     __s32 handle_type;
> > > > +     unsigned char f_handle[0];
> > > > +};
> > > > +
> > >
> > > Hum, I find another generic embedded fanotify_event_info an
> > > overengineering. fanotify_event_metadata with embedded versioning and
> > > length was supposed to be extensible enough without further generic headers
> > > being embedded... I know you had ideas for further extension of reported
> > > information so probably that is the reason but at least from my POV why not
> > > just bump the 'vers' field to 4 for groups with FAN_REPORT_FID and create
> > > struct fanotify_event_metadata_v4 which would have all necessary fields for
> > > passing the filehandle (and probably it does not have to have 'fd' field)?
> > >
> >
> > Two reasons mainly:
> > 1. Considering further extensions, this design looks like a better fit to me.
> > 2. Related to #1, I don't like the way uapi gets bloated with all
> > definition of past format versions, so IMO format bump should be last
> > resort
> >
> > I'm perfectly fine with getting rid of fanotify_event_info record header.
> > I agree that it is overengineering.
> >
> > How about:
> > struct fanotify_event_info_fid {
> >           struct fanotify_event_metadata hdr;
> >           struct fanotify_event_fid fid;
> > };
> >
> > Then fanotify_example program from man fanotify(7) is legit even when
> > adding FAN_REPORT_FID to fanotify_init().
> >
> > Programs that want to access fid can cast to (struct
> > fanotify_event_info_fid *) and access fid info.
>
> OK, what I'm uneasy about is the fact that the event format is defined by
> group flags and not determined within the event itself. To demonstrate
> what I mean: Group with FAN_REPORT_FID has fanotify_event_fid appended at
> the end. If we have flag FAN_REPORT_ELSE, then group with FAN_REPORT_ELSE
> would have fanotify_event_else appended at the end. Now if you have group
> with FAN_REPORT_FID | FAN_REPORT_ELSE, then what happens?
>
> So when thinking about this more I actually think your idea with some
> header is not bad. I'd just implement it like:
>
> struct fanotify_event_info_header {
>         __u8 info_type;
>         __u8 pad;
>         __u16 len;
> }
>
> and then
>
> struct fanotify_event_fid {
>         struct fanotify_event_info_header hdr;
>         __kernel_fsid_t fsid;
>         ...
> }
>
> We have to be careful with padding but it should work here since everything
> is at 32-bit granularity. Thoughts?
>

Sure. That's the easiest for me. Not that different from what I have.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier
  2018-11-28 18:34         ` Amir Goldstein
@ 2018-11-29  7:51           ` Jan Kara
  2018-11-29  8:16             ` Amir Goldstein
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kara @ 2018-11-29  7:51 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, linux-api

On Wed 28-11-18 20:34:47, Amir Goldstein wrote:
> On Wed, Nov 28, 2018 at 7:43 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 28-11-18 18:24:11, Amir Goldstein wrote:
> > > On Wed, Nov 28, 2018 at 5:27 PM Jan Kara <jack@suse.cz> wrote:
> > > > > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > > > > index 2dbb2662a92f..93e1aa2a389f 100644
> > > > > --- a/fs/notify/fanotify/fanotify_user.c
> > > > > +++ b/fs/notify/fanotify/fanotify_user.c
> > > > > @@ -133,9 +133,10 @@ static int fill_event_metadata(struct fsnotify_group *group,
> > > > >       metadata->reserved = 0;
> > > > >       metadata->mask = fsn_event->mask & FANOTIFY_OUTGOING_EVENTS;
> > > > >       metadata->pid = pid_vnr(event->pid);
> > > > > -     if (unlikely(fsn_event->mask & FAN_Q_OVERFLOW))
> > > > > +     if (FAN_GROUP_FLAG(group, FAN_REPORT_FID) ||
> > > > > +         unlikely(fsn_event->mask & FAN_Q_OVERFLOW)) {
> > > > >               metadata->fd = FAN_NOFD;
> > > > > -     else {
> > > > > +     } else {
> > > >
> > > > Use FANOTIFY_HAS_FID() helper here?
> > >
> > > Not here. FAN_REPORT_FID implies that event->fd will never be used.
> > > But I need to check FANOTIFY_HAS_FID() before copy_fid_to_user(),
> > > because we could fail to decode fid and still report the event without fid.
> >
> > OK. So maybe something like would be more obvious?
> >
> >         if (fanotify_event_has_path(event)) {
> >                 create and store fd
> >         } else if (fanotify_event_has_fid(event))
> >                 store fid
> >         } else {
> >                 clear fd & fid
> >         }
> 
> The issue is that fill_event_metadata() function fills a fixed size header
> and later copy_event_to_user() copies that header to user and then
> copies the variable sized fid to user, so this is not the place to "store"
> fid, but I will work on readability of this code.

Aha, I see. Thanks for your patience with me :). So then how about just
folding fill_event_metadata() into copy_event_to_user() (as a preparatory
patch). It is relatively small function, has a single call site and with
FID changes the distinction isn't so clear anymore...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier
  2018-11-29  7:51           ` Jan Kara
@ 2018-11-29  8:16             ` Amir Goldstein
  2018-11-29 10:16               ` Jan Kara
  0 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-29  8:16 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

> > The issue is that fill_event_metadata() function fills a fixed size header
> > and later copy_event_to_user() copies that header to user and then
> > copies the variable sized fid to user, so this is not the place to "store"
> > fid, but I will work on readability of this code.
>
> Aha, I see. Thanks for your patience with me :). So then how about just
> folding fill_event_metadata() into copy_event_to_user() (as a preparatory
> patch). It is relatively small function, has a single call site and with
> FID changes the distinction isn't so clear anymore...
>

Sure. While you are here, before I start reworking, wanted to run by you
a few minor suggestions:

struct fanotify_event_info_fid {
        struct fanotify_event_info_header hdr;
        ....

Seems more appropriate name than the shorter fanotify_event_fid name
that you suggested.

It bothers me a bit not to use struct file_handle in the definition,
but I don't with to start moving struct file_handle into uapi.
I am a bit lost trying to understand which uapi include file I need
to include in fanotify.h if I want to use it. fcntl.h?

I am going to add a separate internal struct for storing fid in event
(either embedded of allocated) because I am going to make it more compact
(similar to struct ovl_fh)

struct fanotify_fid {
        __kernel_fsid_t fsid;
        u8 handle_bytes;
        u8 handle_type;
        u16 pad;
        unsigned char f_handle[0];
};

Will let you know when I have something ready.

AFAICT, this rework should not affect the rest of the patches in the
series (i.e. cached fsid
and dirent events), so if you have time, you don't need to wait on my
rework to continue
review of v3.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 07/13] fanotify: copy event fid info to user
  2018-11-25 13:43 ` [PATCH v3 07/13] fanotify: copy event fid info to user Amir Goldstein
@ 2018-11-29  9:00   ` Jan Kara
       [not found]     ` <CAOQ4uxjcb=UqQiw0XcpDfetK28bM4tOYdvgxPwhkjgE2mxpt=g@mail.gmail.com>
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kara @ 2018-11-29  9:00 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Sun 25-11-18 15:43:46, Amir Goldstein wrote:
> If group requested FAN_REPORT_FID and event has file identifier
> copy that information to user reading the event after reading
> event metadata.
> metadata->event_len includes the length of the fid information.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
...
> +static int round_event_fid_len(struct fsnotify_event *fsn_event)
> +{
> +	struct fanotify_event *event = FANOTIFY_E(fsn_event);
> +
> +	if (!FANOTIFY_HAS_FID(event))
> +		return 0;
> +
> +	return roundup(FANOTIFY_FID_INFO_LEN(event), FAN_EVENT_METADATA_LEN);
> +}
> +

Why do you round up to FAN_EVENT_METADATA_LEN? I think rounding up to
multiple of 8 bytes should be more than enough... Otherwise the patch looks
good to me.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 08/13] fanotify: enable FAN_REPORT_FID init flag
  2018-11-25 13:43 ` [PATCH v3 08/13] fanotify: enable FAN_REPORT_FID init flag Amir Goldstein
@ 2018-11-29  9:46   ` Jan Kara
  2018-11-29 10:52     ` Jan Kara
  2018-11-29 11:03     ` Amir Goldstein
  0 siblings, 2 replies; 42+ messages in thread
From: Jan Kara @ 2018-11-29  9:46 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, linux-api

On Sun 25-11-18 15:43:47, Amir Goldstein wrote:
> When setting up an fanotify listener, user may request to get fid
> information in event instead of an open file descriptor.
> 
> The fid obtained with event on a watched object contains the file
> handle returned by name_to_handle_at(2) and fsid returned by statfs(2).
> 
> When setting a mark, we need to make sure that the filesystem
> supports encoding file handles with name_to_handle_at(2) and that
> statfs(2) encodes a non-zero fsid.
...
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index ea8e81a3e80b..d7aa2f392a64 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -857,6 +859,49 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>  	return fd;
>  }
>  
> +/* Check if filesystem can encode a unique fid */
> +static int fanotify_test_fid(struct path *path)
> +{
> +	struct kstatfs stat, root_stat;
> +	int err;
> +
> +	/*
> +	 * Make sure path is not in filesystem with zero fsid (e.g. tmpfs).
> +	 * TODO: cache fsid in the mark connector.
> +	 */

TODO in a submitted patch looks strange (looks like the patch isn't done
yet ;)) and in this particular case the code is perfectly justified even
without talking about future functionality...

> +	err = vfs_statfs(path, &stat);
> +	if (err)
> +		return err;
> +
> +	if (!stat.f_fsid.val[0] && !stat.f_fsid.val[1])
> +		return -ENODEV;
> +
> +	/*
> +	 * Make sure path is not inside a filesystem subvolume (e.g. btrfs)
> +	 * which uses a different fsid than sb root.
> +	 */
> +	err = statfs_by_dentry(path->dentry->d_sb->s_root, &root_stat);
> +	if (err)
> +		return err;
> +
> +	if (root_stat.f_fsid.val[0] != stat.f_fsid.val[0] ||
> +	    root_stat.f_fsid.val[1] != stat.f_fsid.val[1])
> +		return -EXDEV;

I think inode watches in subvolumes are actually fine? The same fs object
is going to get different struct inode for different subvolumes if I
remember right. So there won't be any surprises with unexpected fsid being
reported.

Also mount watches are actually fine for subvolume as different subvolumes
appear as different mountpoints, don't they? And I think implementation
that would have different fsid for inodes in the same mountpoint would be
indeed very weird. So again no problem with fsid mismatch.

So we need this check only for superblock watches.

> diff --git a/fs/statfs.c b/fs/statfs.c
> index f0216629621d..6a5d840a2d8d 100644
> --- a/fs/statfs.c
> +++ b/fs/statfs.c
> @@ -50,7 +50,8 @@ static int calculate_f_flags(struct vfsmount *mnt)
>  		flags_by_sb(mnt->mnt_sb->s_flags);
>  }
>  
> -static int statfs_by_dentry(struct dentry *dentry, struct kstatfs *buf)
> +/* Does not set buf->f_flags */
> +int statfs_by_dentry(struct dentry *dentry, struct kstatfs *buf)
>  {
>  	int retval;
>  
> @@ -66,6 +67,7 @@ static int statfs_by_dentry(struct dentry *dentry, struct kstatfs *buf)
>  		buf->f_frsize = buf->f_bsize;
>  	return retval;
>  }
> +EXPORT_SYMBOL(statfs_by_dentry);
>  
>  int vfs_statfs(const struct path *path, struct kstatfs *buf)
>  {

Make this export a separate patch, CC Al Viro on it. Honestly I'm not very
happy about statfs_by_dentry() interface as it may return different result
than vfs_statfs() so it just waits for someone careless to use
statfs_by_dentry() and get burnt. How about implementing vfs_get_fsid()
that will get dentry and return fsid, that will be just internally
implemented using statfs_by_dentry()? We can then use that everywhere in
fsnotify and the interface is going to be much cleaner like that. The
comment regarding CC to Al Viro and separate patch still applies though.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 07/13] fanotify: copy event fid info to user
       [not found]     ` <CAOQ4uxjcb=UqQiw0XcpDfetK28bM4tOYdvgxPwhkjgE2mxpt=g@mail.gmail.com>
@ 2018-11-29  9:49       ` Jan Kara
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Kara @ 2018-11-29  9:49 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Thu 29-11-18 11:27:08, Amir Goldstein wrote:
> On Thu, Nov 29, 2018, 11:00 AM Jan Kara <jack@suse.cz wrote:
> 
> > On Sun 25-11-18 15:43:46, Amir Goldstein wrote:
> > > If group requested FAN_REPORT_FID and event has file identifier
> > > copy that information to user reading the event after reading
> > > event metadata.
> > > metadata->event_len includes the length of the fid information.
> > >
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ...
> > > +static int round_event_fid_len(struct fsnotify_event *fsn_event)
> > > +{
> > > +     struct fanotify_event *event = FANOTIFY_E(fsn_event);
> > > +
> > > +     if (!FANOTIFY_HAS_FID(event))
> > > +             return 0;
> > > +
> > > +     return roundup(FANOTIFY_FID_INFO_LEN(event),
> > FAN_EVENT_METADATA_LEN);
> > > +}
> > > +
> >
> > Why do you round up to FAN_EVENT_METADATA_LEN? I think rounding up to
> > multiple of 8 bytes should be more than enough... Otherwise the patch looks
> > good to me.
> >
> 
> Sorry for html reply . You may reply to list.
> 
> I was following the practice in inotify with variable size events.
> 
> I recon the reasoning is that for simplicity application define an array of
> struct metadata
> And rely on events being metadata size alligned.
> 
> Not sure if that justifies the inotify practice?

Yeah, I don't think following inotify practice is really warranted here.
But thanks for explanation.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier
  2018-11-29  8:16             ` Amir Goldstein
@ 2018-11-29 10:16               ` Jan Kara
  2018-11-29 11:10                 ` Amir Goldstein
  2018-11-30 15:32                 ` Amir Goldstein
  0 siblings, 2 replies; 42+ messages in thread
From: Jan Kara @ 2018-11-29 10:16 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, linux-api

On Thu 29-11-18 10:16:27, Amir Goldstein wrote:
> > > The issue is that fill_event_metadata() function fills a fixed size header
> > > and later copy_event_to_user() copies that header to user and then
> > > copies the variable sized fid to user, so this is not the place to "store"
> > > fid, but I will work on readability of this code.
> >
> > Aha, I see. Thanks for your patience with me :). So then how about just
> > folding fill_event_metadata() into copy_event_to_user() (as a preparatory
> > patch). It is relatively small function, has a single call site and with
> > FID changes the distinction isn't so clear anymore...
> >
> 
> Sure. While you are here, before I start reworking, wanted to run by you
> a few minor suggestions:
> 
> struct fanotify_event_info_fid {
>         struct fanotify_event_info_header hdr;
>         ....
> 
> Seems more appropriate name than the shorter fanotify_event_fid name
> that you suggested.

Fine by me.

> It bothers me a bit not to use struct file_handle in the definition,
> but I don't with to start moving struct file_handle into uapi.
> I am a bit lost trying to understand which uapi include file I need
> to include in fanotify.h if I want to use it. fcntl.h?

Hum, so far we got away with not defining file_handle to userspace and
userspace headers don't define it either (it's just anonymouns pointer).
The manpage for name_to_handle_at() and open_by_handle_at() does show it's
internal structure but I'm not sure that really counts. But userspace is
actually expected to fill in handle_bytes when passing handle to
name_to_handle_at() so people using this syscall must have the definition
somehow. Probably their private one...

So I think moving the struct file_handle definition into uapi is the right
thing (with the justification above). That's a cleanup on its own. I would
probably move the definition into include/uapi/linux/fs.h as that's where
other similar structures are defined and from kernel POV it gets included
as a part of include/linux/fs.h as previously.

> I am going to add a separate internal struct for storing fid in event
> (either embedded of allocated) because I am going to make it more compact
> (similar to struct ovl_fh)
> 
> struct fanotify_fid {
>         __kernel_fsid_t fsid;
>         u8 handle_bytes;
>         u8 handle_type;
>         u16 pad;
>         unsigned char f_handle[0];
> };
> 
> Will let you know when I have something ready.

OK.

> AFAICT, this rework should not affect the rest of the patches in the
> series (i.e. cached fsid
> and dirent events), so if you have time, you don't need to wait on my
> rework to continue
> review of v3.

OK, thanks for info. I'm slowly crunching through the patches but it
takes time and I have also other things to do...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 09/13] fanotify: cache fsid in fsnotify_mark_connector
  2018-11-25 13:43 ` [PATCH v3 09/13] fanotify: cache fsid in fsnotify_mark_connector Amir Goldstein
@ 2018-11-29 10:48   ` Jan Kara
  2018-11-29 11:42     ` Amir Goldstein
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kara @ 2018-11-29 10:48 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Sun 25-11-18 15:43:48, Amir Goldstein wrote:
> For FAN_REPORT_FID, we need to encode fid with fsid of the filesystem on
> every event. To avoid having to call vfs_statfs() on every event to get
> fsid, we store the fsid in fsnotify_mark_connector on the first time we
> add a mark and on handle event we use the cached fsid.
> 
> Subsequent calls to add mark on the same object are expected to pass the
> same fsid, so the call will fail on cached fsid mismatch.
> 
> Similarly, if an event is reported on several mark types (inode, mount,
> filesystem), all connectors should already have the same fsid and we
> warn if we detect a mismatch.

Aha, that's what I was missing when I commented on inode & mount marks
working fine with fid. Couldn't we define inode->mount->sb ordering of
marks and just fetch fsid from the most specific one?

> Suggested-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/notify/fanotify/fanotify.c      | 42 ++++++++++------
>  fs/notify/fanotify/fanotify.h      |  5 +-
>  fs/notify/fanotify/fanotify_user.c | 62 ++++++++++++++----------
>  fs/notify/mark.c                   | 77 ++++++++++++++++++++++++++----
>  include/linux/fsnotify_backend.h   | 24 ++++++++--
>  5 files changed, 154 insertions(+), 56 deletions(-)
> 
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 844b748f0b74..90bac98dd8c7 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -103,13 +103,13 @@ static int fanotify_get_response(struct fsnotify_group *group,
>   * requested by the user, will not be present in the returned mask.
>   */
>  static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
> -				       u32 event_mask, const void *data,
> -				       int data_type)
> +				     u32 event_mask, const void *data,
> +				     int data_type, __kernel_fsid_t *fsid)
>  {
>  	__u32 marks_mask = 0, marks_ignored_mask = 0;
>  	const struct path *path = data;
>  	struct fsnotify_mark *mark;
> -	int type;
> +	int type, err;
>  
>  	pr_debug("%s: report_mask=%x mask=%x data=%p data_type=%d\n",
>  		 __func__, iter_info->report_mask, event_mask, data, data_type);
> @@ -136,6 +136,13 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
>  		     !(mark->mask & FS_EVENT_ON_CHILD)))
>  			continue;
>  
> +		if (fsid) {
> +			err = fsnotify_get_conn_fsid(mark->connector, fsid);
> +			/* Should we skip mark with null or conflicting fsid? */
> +			pr_debug("%s: fsid=%x.%x type=%u err=%d\n", __func__,
> +				 fsid->val[0], fsid->val[1], type, err);
> +		}
> +
>  		marks_mask |= mark->mask;
>  		marks_ignored_mask |= mark->ignored_mask;
>  	}

I would just move getting of fsid into a separate function. I does not seem
to fit well into fanotify_group_event_mask() except for the fact that we
iterate through marks there... Also is the pr_debug() long-term useful
there?

> @@ -480,8 +481,54 @@ int fsnotify_compare_groups(struct fsnotify_group *a, struct fsnotify_group *b)
>  	return -1;
>  }
>  
> +/*
> + * Check consistent fsid for all marks on the same object.
> + * It is expected to always get the same fsid (or no fsid) for all
> + * marks added to object.
> + */
> +static int fsnotify_test_conn_fsid(const struct fsnotify_mark_connector *conn,
> +				   const __kernel_fsid_t *fsid,
> +				   const char *where)
> +

Call this fsnotify_conn_fsid_consistent()?

> +{
> +	/*
> +	 * Backend is expected to check for non uniform fsid (e.g. btrfs),
> +	 * but maybe we missed something?
> +	 */
> +	if (fsid->val[0] != conn->fsid.val[0] ||
> +	    fsid->val[1] != conn->fsid.val[1]) {
> +		pr_warn_ratelimited("%s: fsid mismatch on object of type %u: %x.%x != %x.%x\n",
> +				    where, conn->type,
> +				    fsid->val[0], fsid->val[1],
> +				    conn->fsid.val[0], conn->fsid.val[1]);
> +		return -EXDEV;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * Get cached fsid of filesystem containing object from connector.
> + */
> +int fsnotify_get_conn_fsid(const struct fsnotify_mark_connector *conn,
> +			   __kernel_fsid_t *fsid)
> +{
> +	if (WARN_ON(!conn->fsid.val[0] && !conn->fsid.val[1]))

WARN_ON_ONCE so that we don't spam logs please.

> +		return -ENODEV;
> +
> +	if (!fsid->val[0] && !fsid->val[1]) {
> +		*fsid = conn->fsid;
> +		return 0;
> +	}
> +
> +	return fsnotify_test_conn_fsid(conn, fsid, __func__);
> +}

The validation in fsnotify_get_conn_fsid() is weird. This function should
just return the fsid whatever it is. Caller can check if he wants. I know
this just does the right thing for the place where you use
fsnotify_get_conn_fsid() but it is just a surprising behavior.

> +
> +static const __kernel_fsid_t null_fsid;
> +
>  static int fsnotify_attach_connector_to_object(fsnotify_connp_t *connp,
> -					       unsigned int type)
> +					       unsigned int type,
> +					       __kernel_fsid_t *fsid)
>  {
>  	struct inode *inode = NULL;
>  	struct fsnotify_mark_connector *conn;
> @@ -493,6 +540,8 @@ static int fsnotify_attach_connector_to_object(fsnotify_connp_t *connp,
>  	INIT_HLIST_HEAD(&conn->list);
>  	conn->type = type;
>  	conn->obj = connp;
> +	/* Cache fsid of filesystem containing the object */
> +	conn->fsid = fsid ? *fsid : null_fsid;

This is unusual. I'd just do:

	if (fsid)
		conn->fsid = *fsid;
	else
		conn->fsid->val[0] = conn->fsid->val[1] = 0;

									Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 08/13] fanotify: enable FAN_REPORT_FID init flag
  2018-11-29  9:46   ` Jan Kara
@ 2018-11-29 10:52     ` Jan Kara
  2018-11-29 11:03     ` Amir Goldstein
  1 sibling, 0 replies; 42+ messages in thread
From: Jan Kara @ 2018-11-29 10:52 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, linux-api

On Thu 29-11-18 10:46:36, Jan Kara wrote:
> On Sun 25-11-18 15:43:47, Amir Goldstein wrote:
> > When setting up an fanotify listener, user may request to get fid
> > information in event instead of an open file descriptor.
> > 
> > The fid obtained with event on a watched object contains the file
> > handle returned by name_to_handle_at(2) and fsid returned by statfs(2).
> > 
> > When setting a mark, we need to make sure that the filesystem
> > supports encoding file handles with name_to_handle_at(2) and that
> > statfs(2) encodes a non-zero fsid.
> ...
> > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > index ea8e81a3e80b..d7aa2f392a64 100644
> > --- a/fs/notify/fanotify/fanotify_user.c
> > +++ b/fs/notify/fanotify/fanotify_user.c
> > @@ -857,6 +859,49 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
> >  	return fd;
> >  }
> >  
> > +/* Check if filesystem can encode a unique fid */
> > +static int fanotify_test_fid(struct path *path)
> > +{
> > +	struct kstatfs stat, root_stat;
> > +	int err;
> > +
> > +	/*
> > +	 * Make sure path is not in filesystem with zero fsid (e.g. tmpfs).
> > +	 * TODO: cache fsid in the mark connector.
> > +	 */
> 
> TODO in a submitted patch looks strange (looks like the patch isn't done
> yet ;)) and in this particular case the code is perfectly justified even
> without talking about future functionality...
> 
> > +	err = vfs_statfs(path, &stat);
> > +	if (err)
> > +		return err;
> > +
> > +	if (!stat.f_fsid.val[0] && !stat.f_fsid.val[1])
> > +		return -ENODEV;
> > +
> > +	/*
> > +	 * Make sure path is not inside a filesystem subvolume (e.g. btrfs)
> > +	 * which uses a different fsid than sb root.
> > +	 */
> > +	err = statfs_by_dentry(path->dentry->d_sb->s_root, &root_stat);
> > +	if (err)
> > +		return err;
> > +
> > +	if (root_stat.f_fsid.val[0] != stat.f_fsid.val[0] ||
> > +	    root_stat.f_fsid.val[1] != stat.f_fsid.val[1])
> > +		return -EXDEV;
> 
> I think inode watches in subvolumes are actually fine? The same fs object
> is going to get different struct inode for different subvolumes if I
> remember right. So there won't be any surprises with unexpected fsid being
> reported.
> 
> Also mount watches are actually fine for subvolume as different subvolumes
> appear as different mountpoints, don't they? And I think implementation
> that would have different fsid for inodes in the same mountpoint would be
> indeed very weird. So again no problem with fsid mismatch.
> 
> So we need this check only for superblock watches.

See my reply to the next patch for more on this. Also I was thinking about
the "no zero fsid" restriction. I understand where you are coming from but
IMO FAN_REPORT_FID can be useful even if the filesystem doesn't support
fsid - e.g. if you create a notification group and put there marks  only for
one filesystem, then you don't need the fsid part at all. I don't have a
strong opinion on this (we can always enable this functionality later if we
want) but wanted to run this by you.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 08/13] fanotify: enable FAN_REPORT_FID init flag
  2018-11-29  9:46   ` Jan Kara
  2018-11-29 10:52     ` Jan Kara
@ 2018-11-29 11:03     ` Amir Goldstein
  2018-11-29 13:08       ` Jan Kara
  1 sibling, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-29 11:03 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

On Thu, Nov 29, 2018 at 11:46 AM Jan Kara <jack@suse.cz> wrote:
>
> On Sun 25-11-18 15:43:47, Amir Goldstein wrote:
> > When setting up an fanotify listener, user may request to get fid
> > information in event instead of an open file descriptor.
> >
> > The fid obtained with event on a watched object contains the file
> > handle returned by name_to_handle_at(2) and fsid returned by statfs(2).
> >
> > When setting a mark, we need to make sure that the filesystem
> > supports encoding file handles with name_to_handle_at(2) and that
> > statfs(2) encodes a non-zero fsid.
> ...
> > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > index ea8e81a3e80b..d7aa2f392a64 100644
> > --- a/fs/notify/fanotify/fanotify_user.c
> > +++ b/fs/notify/fanotify/fanotify_user.c
> > @@ -857,6 +859,49 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
> >       return fd;
> >  }
> >
> > +/* Check if filesystem can encode a unique fid */
> > +static int fanotify_test_fid(struct path *path)
> > +{
> > +     struct kstatfs stat, root_stat;
> > +     int err;
> > +
> > +     /*
> > +      * Make sure path is not in filesystem with zero fsid (e.g. tmpfs).
> > +      * TODO: cache fsid in the mark connector.
> > +      */
>
> TODO in a submitted patch looks strange (looks like the patch isn't done
> yet ;)) and in this particular case the code is perfectly justified even
> without talking about future functionality...
>

Not to mention that I forgot to remove the TODO in the patch that adds
cached fsid ;-)

> > +     err = vfs_statfs(path, &stat);
> > +     if (err)
> > +             return err;
> > +
> > +     if (!stat.f_fsid.val[0] && !stat.f_fsid.val[1])
> > +             return -ENODEV;
> > +
> > +     /*
> > +      * Make sure path is not inside a filesystem subvolume (e.g. btrfs)
> > +      * which uses a different fsid than sb root.
> > +      */
> > +     err = statfs_by_dentry(path->dentry->d_sb->s_root, &root_stat);
> > +     if (err)
> > +             return err;
> > +
> > +     if (root_stat.f_fsid.val[0] != stat.f_fsid.val[0] ||
> > +         root_stat.f_fsid.val[1] != stat.f_fsid.val[1])
> > +             return -EXDEV;
>
> I think inode watches in subvolumes are actually fine? The same fs object
> is going to get different struct inode for different subvolumes if I
> remember right. So there won't be any surprises with unexpected fsid being
> reported.
>
> Also mount watches are actually fine for subvolume as different subvolumes
> appear as different mountpoints, don't they? And I think implementation
> that would have different fsid for inodes in the same mountpoint would be
> indeed very weird. So again no problem with fsid mismatch.
>
> So we need this check only for superblock watches.
>

Not so simple (or is it?).
If a group has inode, mount and filesystem marks, not all added at the
same time.
When event on an object that is associated with all the above marks,
which cached fsid should be used in the report?
Naturally, it makes sense to prefer to more accurate fsid of mount/inode
over the broader fsid of filesystem. Right?
But what happens when mount/inode marks are removed?
Or if filesystem has events in the mask that inode/mount do not?
Then the same object reports events with different fsid depending
on the type of event and time it took place (which marks existed).

Not a good situation to get ourselves into.

The simple way out of this is: we do not support FAN_REPORT_FID
on marks using path that is not relative to main volume. period.

Considering the fact that FAN_REPORT_FID is mainly indented to
enable reporting directory modification events and mount marks
are not supported with reporting directory modification events, we
only loose the ability to watch modification on selective directories
inside btrfs subvolume.

I also don't like the fact that I disabled filesystem watch over tmpfs,
because for the case of watching a single filesystem or single
directory, which is quite a common case, we don't need fsid
to be non-zero and we don't care if it mismatches with s_root fsid.

A solution I was contemplating was to allow zero fsid and non
root fsid as long as it is the only sb watched by the group, so
for non unique fsid:
- store group->sb and group->fsid
- return -EXDEV for an attempt to add mark from a different sb
(no matter if it is inode/mount/sb mark)
- when trying to add mark with zero or non root fsid (common case)
set group->sb to a special value so no fs will match it and then
attempt to add any mark with zero/non-root fsid will fail

This is something that is quite easy for me to implement and less
easy to document the expected behavior.
I donno, maybe:
EXDEV    watching several filesystems and either new mark or existing marks
                 are on filesystems with non unique fsid

The easy way out of it for me was: no support for FAN_REPORT_FID
on btrfs subvolumes at the moment - it could be added with restrictions
in the future.

Do you have a different view of the problem than mine?

> > diff --git a/fs/statfs.c b/fs/statfs.c
> > index f0216629621d..6a5d840a2d8d 100644
> > --- a/fs/statfs.c
> > +++ b/fs/statfs.c
> > @@ -50,7 +50,8 @@ static int calculate_f_flags(struct vfsmount *mnt)
> >               flags_by_sb(mnt->mnt_sb->s_flags);
> >  }
> >
> > -static int statfs_by_dentry(struct dentry *dentry, struct kstatfs *buf)
> > +/* Does not set buf->f_flags */
> > +int statfs_by_dentry(struct dentry *dentry, struct kstatfs *buf)
> >  {
> >       int retval;
> >
> > @@ -66,6 +67,7 @@ static int statfs_by_dentry(struct dentry *dentry, struct kstatfs *buf)
> >               buf->f_frsize = buf->f_bsize;
> >       return retval;
> >  }
> > +EXPORT_SYMBOL(statfs_by_dentry);
> >
> >  int vfs_statfs(const struct path *path, struct kstatfs *buf)
> >  {
>
> Make this export a separate patch, CC Al Viro on it. Honestly I'm not very
> happy about statfs_by_dentry() interface as it may return different result
> than vfs_statfs() so it just waits for someone careless to use
> statfs_by_dentry() and get burnt. How about implementing vfs_get_fsid()
> that will get dentry and return fsid, that will be just internally
> implemented using statfs_by_dentry()? We can then use that everywhere in
> fsnotify and the interface is going to be much cleaner like that. The
> comment regarding CC to Al Viro and separate patch still applies though.
>

OK. sounds good.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier
  2018-11-29 10:16               ` Jan Kara
@ 2018-11-29 11:10                 ` Amir Goldstein
  2018-11-30 15:32                 ` Amir Goldstein
  1 sibling, 0 replies; 42+ messages in thread
From: Amir Goldstein @ 2018-11-29 11:10 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

On Thu, Nov 29, 2018 at 12:16 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 29-11-18 10:16:27, Amir Goldstein wrote:
> > > > The issue is that fill_event_metadata() function fills a fixed size header
> > > > and later copy_event_to_user() copies that header to user and then
> > > > copies the variable sized fid to user, so this is not the place to "store"
> > > > fid, but I will work on readability of this code.
> > >
> > > Aha, I see. Thanks for your patience with me :). So then how about just
> > > folding fill_event_metadata() into copy_event_to_user() (as a preparatory
> > > patch). It is relatively small function, has a single call site and with
> > > FID changes the distinction isn't so clear anymore...
> > >
> >
> > Sure. While you are here, before I start reworking, wanted to run by you
> > a few minor suggestions:
> >
> > struct fanotify_event_info_fid {
> >         struct fanotify_event_info_header hdr;
> >         ....
> >
> > Seems more appropriate name than the shorter fanotify_event_fid name
> > that you suggested.
>
> Fine by me.
>
> > It bothers me a bit not to use struct file_handle in the definition,
> > but I don't with to start moving struct file_handle into uapi.
> > I am a bit lost trying to understand which uapi include file I need
> > to include in fanotify.h if I want to use it. fcntl.h?
>
> Hum, so far we got away with not defining file_handle to userspace and
> userspace headers don't define it either (it's just anonymouns pointer).
> The manpage for name_to_handle_at() and open_by_handle_at() does show it's
> internal structure but I'm not sure that really counts. But userspace is
> actually expected to fill in handle_bytes when passing handle to
> name_to_handle_at() so people using this syscall must have the definition
> somehow. Probably their private one...
>
> So I think moving the struct file_handle definition into uapi is the right
> thing (with the justification above). That's a cleanup on its own. I would
> probably move the definition into include/uapi/linux/fs.h as that's where
> other similar structures are defined and from kernel POV it gets included
> as a part of include/linux/fs.h as previously.
>

That makes perfectly sense.
I couldn't figure out what it means for uapi headers that struct file_handle
is defined in /usr/include/x86_64-linux-gnu/bits/fcntl-linux.h
under ifdef __USE_GNU, but I see that SYNC_FILE_RANGE_* are also
defined at the same place and they are already in include/uapi/linux/fs.h
so that should be safe for struct file_handle as well.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 09/13] fanotify: cache fsid in fsnotify_mark_connector
  2018-11-29 10:48   ` Jan Kara
@ 2018-11-29 11:42     ` Amir Goldstein
  2018-11-29 13:11       ` Jan Kara
  0 siblings, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-29 11:42 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Thu, Nov 29, 2018 at 1:12 PM Jan Kara <jack@suse.cz> wrote:
>
> On Sun 25-11-18 15:43:48, Amir Goldstein wrote:
> > For FAN_REPORT_FID, we need to encode fid with fsid of the filesystem on
> > every event. To avoid having to call vfs_statfs() on every event to get
> > fsid, we store the fsid in fsnotify_mark_connector on the first time we
> > add a mark and on handle event we use the cached fsid.
> >
> > Subsequent calls to add mark on the same object are expected to pass the
> > same fsid, so the call will fail on cached fsid mismatch.
> >
> > Similarly, if an event is reported on several mark types (inode, mount,
> > filesystem), all connectors should already have the same fsid and we
> > warn if we detect a mismatch.
>
> Aha, that's what I was missing when I commented on inode & mount marks
> working fine with fid. Couldn't we define inode->mount->sb ordering of
> marks and just fetch fsid from the most specific one?
>

You probably already realized that we are racing on two threads ;-)
so see my reply to this in question on thread of patch 8.
Short: I don't want to change fsid reported on same object via same
path depending on event type and marks existing at that time.


> > Suggested-by: Jan Kara <jack@suse.cz>
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  fs/notify/fanotify/fanotify.c      | 42 ++++++++++------
> >  fs/notify/fanotify/fanotify.h      |  5 +-
> >  fs/notify/fanotify/fanotify_user.c | 62 ++++++++++++++----------
> >  fs/notify/mark.c                   | 77 ++++++++++++++++++++++++++----
> >  include/linux/fsnotify_backend.h   | 24 ++++++++--
> >  5 files changed, 154 insertions(+), 56 deletions(-)
> >
> > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> > index 844b748f0b74..90bac98dd8c7 100644
> > --- a/fs/notify/fanotify/fanotify.c
> > +++ b/fs/notify/fanotify/fanotify.c
> > @@ -103,13 +103,13 @@ static int fanotify_get_response(struct fsnotify_group *group,
> >   * requested by the user, will not be present in the returned mask.
> >   */
> >  static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
> > -                                    u32 event_mask, const void *data,
> > -                                    int data_type)
> > +                                  u32 event_mask, const void *data,
> > +                                  int data_type, __kernel_fsid_t *fsid)
> >  {
> >       __u32 marks_mask = 0, marks_ignored_mask = 0;
> >       const struct path *path = data;
> >       struct fsnotify_mark *mark;
> > -     int type;
> > +     int type, err;
> >
> >       pr_debug("%s: report_mask=%x mask=%x data=%p data_type=%d\n",
> >                __func__, iter_info->report_mask, event_mask, data, data_type);
> > @@ -136,6 +136,13 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
> >                    !(mark->mask & FS_EVENT_ON_CHILD)))
> >                       continue;
> >
> > +             if (fsid) {
> > +                     err = fsnotify_get_conn_fsid(mark->connector, fsid);
> > +                     /* Should we skip mark with null or conflicting fsid? */
> > +                     pr_debug("%s: fsid=%x.%x type=%u err=%d\n", __func__,
> > +                              fsid->val[0], fsid->val[1], type, err);
> > +             }
> > +
> >               marks_mask |= mark->mask;
> >               marks_ignored_mask |= mark->ignored_mask;
> >       }
>
> I would just move getting of fsid into a separate function. I does not seem
> to fit well into fanotify_group_event_mask() except for the fact that we
> iterate through marks there... Also is the pr_debug() long-term useful
> there?
>

Don't mind to get rid of it.
How about fsnotify_get_event_fsid(iter_info) will just get the
first fsid it finds from any connector or via prioirty and not enforce conflicts
here at all?
Conflicts are supposed to be enforces when adding the mark anyway.

> > @@ -480,8 +481,54 @@ int fsnotify_compare_groups(struct fsnotify_group *a, struct fsnotify_group *b)
> >       return -1;
> >  }
> >
> > +/*
> > + * Check consistent fsid for all marks on the same object.
> > + * It is expected to always get the same fsid (or no fsid) for all
> > + * marks added to object.
> > + */
> > +static int fsnotify_test_conn_fsid(const struct fsnotify_mark_connector *conn,
> > +                                const __kernel_fsid_t *fsid,
> > +                                const char *where)
> > +
>
> Call this fsnotify_conn_fsid_consistent()?
>

As written above, I am considering dropping the call to this function from
get_fsid path and use it only from set_fsid, so I might as well just open code
this test in set_fsid.

> > +{
> > +     /*
> > +      * Backend is expected to check for non uniform fsid (e.g. btrfs),
> > +      * but maybe we missed something?
> > +      */
> > +     if (fsid->val[0] != conn->fsid.val[0] ||
> > +         fsid->val[1] != conn->fsid.val[1]) {
> > +             pr_warn_ratelimited("%s: fsid mismatch on object of type %u: %x.%x != %x.%x\n",
> > +                                 where, conn->type,
> > +                                 fsid->val[0], fsid->val[1],
> > +                                 conn->fsid.val[0], conn->fsid.val[1]);
> > +             return -EXDEV;
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +/*
> > + * Get cached fsid of filesystem containing object from connector.
> > + */
> > +int fsnotify_get_conn_fsid(const struct fsnotify_mark_connector *conn,
> > +                        __kernel_fsid_t *fsid)
> > +{
> > +     if (WARN_ON(!conn->fsid.val[0] && !conn->fsid.val[1]))
>
> WARN_ON_ONCE so that we don't spam logs please.
>

OK.

> > +             return -ENODEV;
> > +
> > +     if (!fsid->val[0] && !fsid->val[1]) {
> > +             *fsid = conn->fsid;
> > +             return 0;
> > +     }
> > +
> > +     return fsnotify_test_conn_fsid(conn, fsid, __func__);
> > +}
>
> The validation in fsnotify_get_conn_fsid() is weird. This function should
> just return the fsid whatever it is. Caller can check if he wants. I know
> this just does the right thing for the place where you use
> fsnotify_get_conn_fsid() but it is just a surprising behavior.
>

OK.

> > +
> > +static const __kernel_fsid_t null_fsid;
> > +
> >  static int fsnotify_attach_connector_to_object(fsnotify_connp_t *connp,
> > -                                            unsigned int type)
> > +                                            unsigned int type,
> > +                                            __kernel_fsid_t *fsid)
> >  {
> >       struct inode *inode = NULL;
> >       struct fsnotify_mark_connector *conn;
> > @@ -493,6 +540,8 @@ static int fsnotify_attach_connector_to_object(fsnotify_connp_t *connp,
> >       INIT_HLIST_HEAD(&conn->list);
> >       conn->type = type;
> >       conn->obj = connp;
> > +     /* Cache fsid of filesystem containing the object */
> > +     conn->fsid = fsid ? *fsid : null_fsid;
>
> This is unusual. I'd just do:
>
>         if (fsid)
>                 conn->fsid = *fsid;
>         else
>                 conn->fsid->val[0] = conn->fsid->val[1] = 0;
>

Sure.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 08/13] fanotify: enable FAN_REPORT_FID init flag
  2018-11-29 11:03     ` Amir Goldstein
@ 2018-11-29 13:08       ` Jan Kara
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Kara @ 2018-11-29 13:08 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, linux-api

On Thu 29-11-18 13:03:03, Amir Goldstein wrote:
> On Thu, Nov 29, 2018 at 11:46 AM Jan Kara <jack@suse.cz> wrote:
> > > +     err = vfs_statfs(path, &stat);
> > > +     if (err)
> > > +             return err;
> > > +
> > > +     if (!stat.f_fsid.val[0] && !stat.f_fsid.val[1])
> > > +             return -ENODEV;
> > > +
> > > +     /*
> > > +      * Make sure path is not inside a filesystem subvolume (e.g. btrfs)
> > > +      * which uses a different fsid than sb root.
> > > +      */
> > > +     err = statfs_by_dentry(path->dentry->d_sb->s_root, &root_stat);
> > > +     if (err)
> > > +             return err;
> > > +
> > > +     if (root_stat.f_fsid.val[0] != stat.f_fsid.val[0] ||
> > > +         root_stat.f_fsid.val[1] != stat.f_fsid.val[1])
> > > +             return -EXDEV;
> >
> > I think inode watches in subvolumes are actually fine? The same fs object
> > is going to get different struct inode for different subvolumes if I
> > remember right. So there won't be any surprises with unexpected fsid being
> > reported.
> >
> > Also mount watches are actually fine for subvolume as different subvolumes
> > appear as different mountpoints, don't they? And I think implementation
> > that would have different fsid for inodes in the same mountpoint would be
> > indeed very weird. So again no problem with fsid mismatch.
> >
> > So we need this check only for superblock watches.
> >
> 
> Not so simple (or is it?).
> If a group has inode, mount and filesystem marks, not all added at the
> same time.
> When event on an object that is associated with all the above marks,
> which cached fsid should be used in the report?
> Naturally, it makes sense to prefer to more accurate fsid of mount/inode
> over the broader fsid of filesystem. Right?
> But what happens when mount/inode marks are removed?
> Or if filesystem has events in the mask that inode/mount do not?
> Then the same object reports events with different fsid depending
> on the type of event and time it took place (which marks existed).
> 
> Not a good situation to get ourselves into.
> 
> The simple way out of this is: we do not support FAN_REPORT_FID
> on marks using path that is not relative to main volume. period.
> 
> Considering the fact that FAN_REPORT_FID is mainly indented to
> enable reporting directory modification events and mount marks
> are not supported with reporting directory modification events, we
> only loose the ability to watch modification on selective directories
> inside btrfs subvolume.
> 
> I also don't like the fact that I disabled filesystem watch over tmpfs,
> because for the case of watching a single filesystem or single
> directory, which is quite a common case, we don't need fsid
> to be non-zero and we don't care if it mismatches with s_root fsid.
> 
> A solution I was contemplating was to allow zero fsid and non
> root fsid as long as it is the only sb watched by the group, so
> for non unique fsid:
> - store group->sb and group->fsid
> - return -EXDEV for an attempt to add mark from a different sb
> (no matter if it is inode/mount/sb mark)
> - when trying to add mark with zero or non root fsid (common case)
> set group->sb to a special value so no fs will match it and then
> attempt to add any mark with zero/non-root fsid will fail
> 
> This is something that is quite easy for me to implement and less
> easy to document the expected behavior.
> I donno, maybe:
> EXDEV    watching several filesystems and either new mark or existing marks
>                  are on filesystems with non unique fsid
> 
> The easy way out of it for me was: no support for FAN_REPORT_FID
> on btrfs subvolumes at the moment - it could be added with restrictions
> in the future.
> 
> Do you have a different view of the problem than mine?

Yeah, OK, you're right the semantics isn't really obvious. So I'm fine with
going for EXDEV now and we can open that can of worms later.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 09/13] fanotify: cache fsid in fsnotify_mark_connector
  2018-11-29 11:42     ` Amir Goldstein
@ 2018-11-29 13:11       ` Jan Kara
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Kara @ 2018-11-29 13:11 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Thu 29-11-18 13:42:00, Amir Goldstein wrote:
> On Thu, Nov 29, 2018 at 1:12 PM Jan Kara <jack@suse.cz> wrote:
> > > Suggested-by: Jan Kara <jack@suse.cz>
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > ---
> > >  fs/notify/fanotify/fanotify.c      | 42 ++++++++++------
> > >  fs/notify/fanotify/fanotify.h      |  5 +-
> > >  fs/notify/fanotify/fanotify_user.c | 62 ++++++++++++++----------
> > >  fs/notify/mark.c                   | 77 ++++++++++++++++++++++++++----
> > >  include/linux/fsnotify_backend.h   | 24 ++++++++--
> > >  5 files changed, 154 insertions(+), 56 deletions(-)
> > >
> > > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> > > index 844b748f0b74..90bac98dd8c7 100644
> > > --- a/fs/notify/fanotify/fanotify.c
> > > +++ b/fs/notify/fanotify/fanotify.c
> > > @@ -103,13 +103,13 @@ static int fanotify_get_response(struct fsnotify_group *group,
> > >   * requested by the user, will not be present in the returned mask.
> > >   */
> > >  static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
> > > -                                    u32 event_mask, const void *data,
> > > -                                    int data_type)
> > > +                                  u32 event_mask, const void *data,
> > > +                                  int data_type, __kernel_fsid_t *fsid)
> > >  {
> > >       __u32 marks_mask = 0, marks_ignored_mask = 0;
> > >       const struct path *path = data;
> > >       struct fsnotify_mark *mark;
> > > -     int type;
> > > +     int type, err;
> > >
> > >       pr_debug("%s: report_mask=%x mask=%x data=%p data_type=%d\n",
> > >                __func__, iter_info->report_mask, event_mask, data, data_type);
> > > @@ -136,6 +136,13 @@ static u32 fanotify_group_event_mask(struct fsnotify_iter_info *iter_info,
> > >                    !(mark->mask & FS_EVENT_ON_CHILD)))
> > >                       continue;
> > >
> > > +             if (fsid) {
> > > +                     err = fsnotify_get_conn_fsid(mark->connector, fsid);
> > > +                     /* Should we skip mark with null or conflicting fsid? */
> > > +                     pr_debug("%s: fsid=%x.%x type=%u err=%d\n", __func__,
> > > +                              fsid->val[0], fsid->val[1], type, err);
> > > +             }
> > > +
> > >               marks_mask |= mark->mask;
> > >               marks_ignored_mask |= mark->ignored_mask;
> > >       }
> >
> > I would just move getting of fsid into a separate function. I does not seem
> > to fit well into fanotify_group_event_mask() except for the fact that we
> > iterate through marks there... Also is the pr_debug() long-term useful
> > there?
> 
> Don't mind to get rid of it.  How about
> fsnotify_get_event_fsid(iter_info) will just get the first fsid it finds
> from any connector or via prioirty and not enforce conflicts here at all?
> Conflicts are supposed to be enforces when adding the mark anyway.

Sounds good to me.

> > > +/*
> > > + * Check consistent fsid for all marks on the same object.
> > > + * It is expected to always get the same fsid (or no fsid) for all
> > > + * marks added to object.
> > > + */
> > > +static int fsnotify_test_conn_fsid(const struct fsnotify_mark_connector *conn,
> > > +                                const __kernel_fsid_t *fsid,
> > > +                                const char *where)
> > > +
> >
> > Call this fsnotify_conn_fsid_consistent()?
> >
> 
> As written above, I am considering dropping the call to this function from
> get_fsid path and use it only from set_fsid, so I might as well just open code
> this test in set_fsid.

OK.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier
  2018-11-29 10:16               ` Jan Kara
  2018-11-29 11:10                 ` Amir Goldstein
@ 2018-11-30 15:32                 ` Amir Goldstein
  2018-12-01 16:43                   ` Amir Goldstein
  1 sibling, 1 reply; 42+ messages in thread
From: Amir Goldstein @ 2018-11-30 15:32 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

> > struct fanotify_event_info_fid {
> >         struct fanotify_event_info_header hdr;
> >         ....
> >
> > Seems more appropriate name than the shorter fanotify_event_fid name
> > that you suggested.
>
> Fine by me.

FYI, at the moment the uapi struct looks like this:

struct fanotify_event_info_header {
        __u8 info_type;
        __u8 pad;
        __u16 len;
};

struct fanotify_event_info_fid {
        struct fanotify_event_info_header hdr;
        __kernel_fsid_t fsid;
        struct file_handle fh;
};

But for my global watch prototype [1], I also defined this struct internally:

struct fanotify_event_fid {
       __kernel_fsid_t fsid;
       struct file_handle fh;
};

To be used as key to hash the object in userspace.
This key is often generated by open_by_handle_at() and statfs() from
application and not received from fanotify.

I wonder if it would be useful to include this definition in fanotify uapi
headers and then:

struct fanotify_event_info_fid {
        struct fanotify_event_info_header hdr;
        struct fanotify_event_fid fid;
};

[1] https://github.com/amir73il/inotify-tools/commits/fanotify_fid-v4

>
> > It bothers me a bit not to use struct file_handle in the definition,
> > but I don't with to start moving struct file_handle into uapi.
> > I am a bit lost trying to understand which uapi include file I need
> > to include in fanotify.h if I want to use it. fcntl.h?
>
...

> > struct fanotify_fid {
> >         __kernel_fsid_t fsid;
> >         u8 handle_bytes;
> >         u8 handle_type;
> >         u16 pad;
> >         unsigned char f_handle[0];
> > };
> >
> > Will let you know when I have something ready.

I ended up putting fh_type;fh_len in a 64bit alignment gap in found in
fanotify_event struct and now I use fh_type to determine if the event
info is path (0), fid (>0) or none (FILEID_INVALID).

I pushed the reworked fanotify_dirent [2] branch and I am quite content
with the result.
For comparison, also pushed fanotify_fid-v3 [3] and fanotify_fid-v4 [4],
which correspond to the last patch you posted review on (cached fsid).

[2] https://github.com/amir73il/inotify-tools/commits/fanotify_dirent
[3] https://github.com/amir73il/inotify-tools/commits/fanotify_fid-v3
[4] https://github.com/amir73il/inotify-tools/commits/fanotify_fid-v4

>
> > AFAICT, this rework should not affect the rest of the patches in the
> > series (i.e. cached fsid
> > and dirent events), so if you have time, you don't need to wait on my
> > rework to continue
> > review of v3.
>
> OK, thanks for info. I'm slowly crunching through the patches but it
> takes time and I have also other things to do...
>

I'm well aware of that and thanks for taking the time to crunch this far.
I hope you will like the changes in v4.

The remaining 4 fanotify_dirent patches, did have some rebase conflicts
over fanotify_fid-v4, but the logic remained mostly unchanged.

I will post the entire v4 patch set next week, so you may continue the
review when you have time.

Thanks!
Amir.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier
  2018-11-30 15:32                 ` Amir Goldstein
@ 2018-12-01 16:43                   ` Amir Goldstein
  0 siblings, 0 replies; 42+ messages in thread
From: Amir Goldstein @ 2018-12-01 16:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel, linux-api

On Fri, Nov 30, 2018 at 5:32 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> > > struct fanotify_event_info_fid {
> > >         struct fanotify_event_info_header hdr;
> > >         ....
> > >
> > > Seems more appropriate name than the shorter fanotify_event_fid name
> > > that you suggested.
> >
> > Fine by me.
>
> FYI, at the moment the uapi struct looks like this:
>
> struct fanotify_event_info_header {
>         __u8 info_type;
>         __u8 pad;
>         __u16 len;
> };
>
> struct fanotify_event_info_fid {
>         struct fanotify_event_info_header hdr;
>         __kernel_fsid_t fsid;
>         struct file_handle fh;
> };
>

I changed my mind about using struct file_handle.

The current v4 struct is defined:

struct fanotify_event_info_fid {
        struct fanotify_event_info_header hdr;
        __kernel_fsid_t fsid;
        /*
         * Following is an opaque struct file_handle that can be passed as
         * an argument to open_by_handle_at(2).
         */
        unsigned char handle[0];
};

It's true that name_to_handle_at(2) users need to know about the
layout of struct file_handle to pass the buffer size.

But the users of fanotify FAN_REPORT_FID and the users of
open_by_handle_at(2) can treat the handle as an opaque blob.


> >
> > > It bothers me a bit not to use struct file_handle in the definition,
> > > but I don't with to start moving struct file_handle into uapi.
> > > I am a bit lost trying to understand which uapi include file I need
> > > to include in fanotify.h if I want to use it. fcntl.h?
> >


I gave up trying to move struct file_handle to uapi headers, because
it is already defined in glibc fcntl-linux.h and I couldn't find a way
out of this mess, nor did I find a good reason to pursue this for the
current patch set.

...

>
> I pushed the reworked fanotify_dirent [2] branch and I am quite content
> with the result.
> For comparison, also pushed fanotify_fid-v3 [3] and fanotify_fid-v4 [4],
> which correspond to the last patch you posted review on (cached fsid).
>
> [2] https://github.com/amir73il/inotify-tools/commits/fanotify_dirent
> [3] https://github.com/amir73il/inotify-tools/commits/fanotify_fid-v3
> [4] https://github.com/amir73il/inotify-tools/commits/fanotify_fid-v4
>

Re-pushed those branches without changes to uapi/linux/fs.h.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2018-12-02  3:57 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-25 13:43 [PATCH v3 00/13] fanotify: add support for more event types Amir Goldstein
2018-11-25 13:43 ` [PATCH v3 01/13] fsnotify: annotate directory entry modification events Amir Goldstein
2018-11-28 12:59   ` Jan Kara
2018-11-28 14:39     ` Amir Goldstein
2018-11-28 14:43       ` Jan Kara
2018-11-28 15:01         ` Amir Goldstein
2018-11-25 13:43 ` [PATCH v3 02/13] fsnotify: send all event types to super block marks Amir Goldstein
2018-11-28 14:26   ` Jan Kara
2018-11-25 13:43 ` [PATCH v3 03/13] fanotify: rename struct fanotify_{,perm_}event_info Amir Goldstein
2018-11-28 14:29   ` Jan Kara
2018-11-25 13:43 ` [PATCH v3 04/13] fanotify: define the structures to report a unique file identifier Amir Goldstein
2018-11-28 15:27   ` Jan Kara
2018-11-28 16:24     ` Amir Goldstein
2018-11-28 17:43       ` Jan Kara
2018-11-28 18:34         ` Amir Goldstein
2018-11-29  7:51           ` Jan Kara
2018-11-29  8:16             ` Amir Goldstein
2018-11-29 10:16               ` Jan Kara
2018-11-29 11:10                 ` Amir Goldstein
2018-11-30 15:32                 ` Amir Goldstein
2018-12-01 16:43                   ` Amir Goldstein
2018-11-25 13:43 ` [PATCH v3 05/13] fanotify: classify events that hold a " Amir Goldstein
2018-11-28 15:33   ` Jan Kara
2018-11-28 15:44     ` Jan Kara
2018-11-28 15:52       ` Amir Goldstein
2018-11-25 13:43 ` [PATCH v3 06/13] fanotify: encode file identifier for FAN_REPORT_FID Amir Goldstein
2018-11-25 13:43 ` [PATCH v3 07/13] fanotify: copy event fid info to user Amir Goldstein
2018-11-29  9:00   ` Jan Kara
     [not found]     ` <CAOQ4uxjcb=UqQiw0XcpDfetK28bM4tOYdvgxPwhkjgE2mxpt=g@mail.gmail.com>
2018-11-29  9:49       ` Jan Kara
2018-11-25 13:43 ` [PATCH v3 08/13] fanotify: enable FAN_REPORT_FID init flag Amir Goldstein
2018-11-29  9:46   ` Jan Kara
2018-11-29 10:52     ` Jan Kara
2018-11-29 11:03     ` Amir Goldstein
2018-11-29 13:08       ` Jan Kara
2018-11-25 13:43 ` [PATCH v3 09/13] fanotify: cache fsid in fsnotify_mark_connector Amir Goldstein
2018-11-29 10:48   ` Jan Kara
2018-11-29 11:42     ` Amir Goldstein
2018-11-29 13:11       ` Jan Kara
2018-11-25 13:43 ` [PATCH v3 10/13] fanotify: check FS_ISDIR flag instead of d_is_dir() Amir Goldstein
2018-11-25 13:43 ` [PATCH v3 11/13] fanotify: support events with data type FSNOTIFY_EVENT_INODE Amir Goldstein
2018-11-25 13:43 ` [PATCH v3 12/13] fanotify: add support for create/attrib/move/delete events Amir Goldstein
2018-11-25 13:43 ` [PATCH v3 13/13] fanotify: report FAN_ONDIR to listener with FAN_REPORT_FID Amir Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).