linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] fsnotify prep work for fanotify dentry events
@ 2018-11-14 17:43 Amir Goldstein
  2018-11-14 17:43 ` [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available Amir Goldstein
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Amir Goldstein @ 2018-11-14 17:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

Jan,

Not much has changed in these cleanup patches since they were last
posted two years ago [1].

Except they were posted as the 1st part of a 4 parts patch series
and the 2nd and 4th part of that original series have already been
merged to v4.20-rc1 (FAN_MARK_FILESYSTEM).

This series does not change any logic and it passed the existing
LTP inotify/fanotify tests.

The remaining work (3rd part of original series) will be posted
in two parts - adding FAN_REPORT_FID and adding support for the
dentry events to fanotify.

Thanks,
Amir.

[1] https://marc.info/?l=linux-fsdevel&m=148198446127338&w=2
[2] https://lkml.org/lkml/2016/12/20/312

Amir Goldstein (5):
  fsnotify: pass dentry instead of inode when available
  fsnotify: annotate filename events
  fsnotify: simplify API for filename events
  fsnotify: make MOVED_FROM a dentry event
  fsnotify: send event type FSNOTIFY_EVENT_DENTRY to super block mark

 arch/powerpc/platforms/cell/spufs/inode.c |  2 +-
 fs/btrfs/ioctl.c                          |  2 +-
 fs/debugfs/inode.c                        | 11 ++-
 fs/devpts/inode.c                         |  2 +-
 fs/namei.c                                | 28 ++++----
 fs/notify/fsnotify.c                      | 17 +++--
 fs/ocfs2/refcounttree.c                   |  2 +-
 fs/tracefs/inode.c                        |  4 +-
 include/linux/fsnotify.h                  | 81 +++++++++++++++--------
 include/linux/fsnotify_backend.h          |  3 +-
 net/sunrpc/rpc_pipe.c                     |  6 +-
 11 files changed, 97 insertions(+), 61 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-14 17:43 [PATCH v2 0/5] fsnotify prep work for fanotify dentry events Amir Goldstein
@ 2018-11-14 17:43 ` Amir Goldstein
  2018-11-20 11:32   ` Jan Kara
  2018-11-14 17:43 ` [PATCH v2 2/5] fsnotify: annotate filename events Amir Goldstein
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-14 17:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

Define a new data type to pass for event FSNOTIFY_EVENT_DENTRY
and use it whenever a dentry is available instead of passing
it's ->d_inode as data type FSNOTIFY_EVENT_INODE.

None of the current fsnotify backends make use of the inode data
with data type FSNOTIFY_EVENT_INODE - only the data of type
FSNOTIFY_EVENT_PATH is ever used, so this change has no immediate
consequences.

Soon, we are going to use the dentry data type to support more
events with fanotify backend.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fsnotify.c             |  8 ++++----
 include/linux/fsnotify.h         | 15 +++++++--------
 include/linux/fsnotify_backend.h |  3 ++-
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index ecf09b6243d9..6a120b7f8b94 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -178,11 +178,11 @@ int __fsnotify_parent(const struct path *path, struct dentry *dentry, __u32 mask
 
 		take_dentry_name_snapshot(&name, dentry);
 		if (path)
-			ret = fsnotify(p_inode, mask, path, FSNOTIFY_EVENT_PATH,
-				       name.name, 0);
+			ret = fsnotify(p_inode, mask, path,
+				       FSNOTIFY_EVENT_PATH, name.name, 0);
 		else
-			ret = fsnotify(p_inode, mask, dentry->d_inode, FSNOTIFY_EVENT_INODE,
-				       name.name, 0);
+			ret = fsnotify(p_inode, mask, dentry,
+				       FSNOTIFY_EVENT_DENTRY, name.name, 0);
 		release_dentry_name_snapshot(&name);
 	}
 
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 2ccb08cb5d6a..9dadc0bcd7a9 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -99,14 +99,14 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
 
 	fsnotify(old_dir, old_dir_mask, source, FSNOTIFY_EVENT_INODE, old_name,
 		 fs_cookie);
-	fsnotify(new_dir, new_dir_mask, source, FSNOTIFY_EVENT_INODE, new_name,
+	fsnotify(new_dir, new_dir_mask, moved, FSNOTIFY_EVENT_DENTRY, new_name,
 		 fs_cookie);
 
 	if (target)
 		fsnotify_link_count(target);
 
 	if (source)
-		fsnotify(source, FS_MOVE_SELF, moved->d_inode, FSNOTIFY_EVENT_INODE, NULL, 0);
+		fsnotify(source, FS_MOVE_SELF, moved, FSNOTIFY_EVENT_DENTRY, NULL, 0);
 	audit_inode_child(new_dir, moved, AUDIT_TYPE_CHILD_CREATE);
 }
 
@@ -155,7 +155,7 @@ static inline void fsnotify_create(struct inode *inode, struct dentry *dentry)
 {
 	audit_inode_child(inode, dentry, AUDIT_TYPE_CHILD_CREATE);
 
-	fsnotify(inode, FS_CREATE, dentry->d_inode, FSNOTIFY_EVENT_INODE, dentry->d_name.name, 0);
+	fsnotify(inode, FS_CREATE, dentry, FSNOTIFY_EVENT_DENTRY, dentry->d_name.name, 0);
 }
 
 /*
@@ -168,7 +168,7 @@ static inline void fsnotify_link(struct inode *dir, struct inode *inode, struct
 	fsnotify_link_count(inode);
 	audit_inode_child(dir, new_dentry, AUDIT_TYPE_CHILD_CREATE);
 
-	fsnotify(dir, FS_CREATE, inode, FSNOTIFY_EVENT_INODE, new_dentry->d_name.name, 0);
+	fsnotify(dir, FS_CREATE, new_dentry, FSNOTIFY_EVENT_DENTRY, new_dentry->d_name.name, 0);
 }
 
 /*
@@ -177,11 +177,10 @@ static inline void fsnotify_link(struct inode *dir, struct inode *inode, struct
 static inline void fsnotify_mkdir(struct inode *inode, struct dentry *dentry)
 {
 	__u32 mask = (FS_CREATE | FS_ISDIR);
-	struct inode *d_inode = dentry->d_inode;
 
 	audit_inode_child(inode, dentry, AUDIT_TYPE_CHILD_CREATE);
 
-	fsnotify(inode, mask, d_inode, FSNOTIFY_EVENT_INODE, dentry->d_name.name, 0);
+	fsnotify(inode, mask, dentry, FSNOTIFY_EVENT_DENTRY, dentry->d_name.name, 0);
 }
 
 /*
@@ -262,7 +261,7 @@ static inline void fsnotify_xattr(struct dentry *dentry)
 		mask |= FS_ISDIR;
 
 	fsnotify_parent(NULL, dentry, mask);
-	fsnotify(inode, mask, inode, FSNOTIFY_EVENT_INODE, NULL, 0);
+	fsnotify(inode, mask, dentry, FSNOTIFY_EVENT_DENTRY, NULL, 0);
 }
 
 /*
@@ -297,7 +296,7 @@ static inline void fsnotify_change(struct dentry *dentry, unsigned int ia_valid)
 			mask |= FS_ISDIR;
 
 		fsnotify_parent(NULL, dentry, mask);
-		fsnotify(inode, mask, inode, FSNOTIFY_EVENT_INODE, NULL, 0);
+		fsnotify(inode, mask, dentry, FSNOTIFY_EVENT_DENTRY, NULL, 0);
 	}
 }
 
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 7639774e7475..24d92455be03 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -208,10 +208,11 @@ struct fsnotify_group {
 	};
 };
 
-/* when calling fsnotify tell it if the data is a path or inode */
+/* when calling fsnotify tell it if the data is a path or inode or dentry */
 #define FSNOTIFY_EVENT_NONE	0
 #define FSNOTIFY_EVENT_PATH	1
 #define FSNOTIFY_EVENT_INODE	2
+#define FSNOTIFY_EVENT_DENTRY	3
 
 enum fsnotify_obj_type {
 	FSNOTIFY_OBJ_TYPE_INODE,
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 2/5] fsnotify: annotate filename events
  2018-11-14 17:43 [PATCH v2 0/5] fsnotify prep work for fanotify dentry events Amir Goldstein
  2018-11-14 17:43 ` [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available Amir Goldstein
@ 2018-11-14 17:43 ` Amir Goldstein
  2018-11-20 11:59   ` Jan Kara
  2018-11-14 17:43 ` [PATCH v2 3/5] fsnotify: simplify API for " Amir Goldstein
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-14 17:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

Filename events are referring to events that modify directory entries,
such as create,delete,rename. Those events should always be reported
on a watched directory, regardless if FS_EVENT_ON_CHILD is set
on the watch mask.

fsnotify_nameremove() and fsnotify_move() were modified to no longer
set the FS_EVENT_ON_CHILD event bit. This is a semantic change to
align with the filename event definition. It has no effect on any
existing backend, because dnotify and inotify always requets the
child events and fanotify does not get the delete,rename events.

The fsnotify_filename() helper is used to report all the filename
events. It gets a reference on parent dentry and passes it as the
data for the event along with the filename.

fsnotify_filename() is different from fsnotify_parent().
fsnotify_parent() is intended to report any events that happened on
child inodes when FS_EVENT_ON_CHILD is requested.
fsnotify_filename() is intended to report only filename events,
such as create,mkdir,link. Those events must always be reported
on a watched directory, regardless if FS_EVENT_ON_CHILD was requested.

fsnotify_d_name() is a helper for the common case where the
filename to pass is dentry->d_name.name.

It is safe to use these helpers with negative or not instantiated
dentries, such as the case with fsnotify_link() and
fsnotify_nameremove().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 include/linux/fsnotify.h | 40 +++++++++++++++++++++++++++++++---------
 1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 9dadc0bcd7a9..d00ec5838d6e 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -17,8 +17,31 @@
 #include <linux/slab.h>
 #include <linux/bug.h>
 
+/*
+ * Notify this parent about a filename event (create,delete,rename).
+ * Unlike fsnotify_parent(), the event will be reported regardless of the
+ * FS_EVENT_ON_CHILD mask on the parent inode
+ */
+static inline int fsnotify_filename(struct dentry *parent, __u32 mask,
+				    const unsigned char *file_name, u32 cookie)
+{
+	return fsnotify(d_inode(parent), mask, parent, FSNOTIFY_EVENT_DENTRY,
+			file_name, cookie);
+}
+
+/*
+ * Call fsnotify_filename() with parent and d_name of this dentry.
+ * Safe to call with negative dentry, e.g. from fsnotify_nameremove()
+ */
+static inline int fsnotify_d_name(struct dentry *dentry, __u32 mask)
+{
+	return fsnotify_filename(dentry->d_parent, mask,
+				 dentry->d_name.name, 0);
+}
+
 /* Notify this dentry's parent about a child's events. */
-static inline int fsnotify_parent(const struct path *path, struct dentry *dentry, __u32 mask)
+static inline int fsnotify_parent(const struct path *path,
+				  struct dentry *dentry, __u32 mask)
 {
 	if (!dentry)
 		dentry = path->dentry;
@@ -85,8 +108,8 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
 {
 	struct inode *source = moved->d_inode;
 	u32 fs_cookie = fsnotify_get_cookie();
-	__u32 old_dir_mask = (FS_EVENT_ON_CHILD | FS_MOVED_FROM);
-	__u32 new_dir_mask = (FS_EVENT_ON_CHILD | FS_MOVED_TO);
+	__u32 old_dir_mask = FS_MOVED_FROM;
+	__u32 new_dir_mask = FS_MOVED_TO;
 	const unsigned char *new_name = moved->d_name.name;
 
 	if (old_dir == new_dir)
@@ -99,8 +122,7 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
 
 	fsnotify(old_dir, old_dir_mask, source, FSNOTIFY_EVENT_INODE, old_name,
 		 fs_cookie);
-	fsnotify(new_dir, new_dir_mask, moved, FSNOTIFY_EVENT_DENTRY, new_name,
-		 fs_cookie);
+	fsnotify_filename(moved->d_parent, new_dir_mask, new_name, fs_cookie);
 
 	if (target)
 		fsnotify_link_count(target);
@@ -136,7 +158,7 @@ static inline void fsnotify_nameremove(struct dentry *dentry, int isdir)
 	if (isdir)
 		mask |= FS_ISDIR;
 
-	fsnotify_parent(NULL, dentry, mask);
+	fsnotify_d_name(dentry, mask);
 }
 
 /*
@@ -155,7 +177,7 @@ static inline void fsnotify_create(struct inode *inode, struct dentry *dentry)
 {
 	audit_inode_child(inode, dentry, AUDIT_TYPE_CHILD_CREATE);
 
-	fsnotify(inode, FS_CREATE, dentry, FSNOTIFY_EVENT_DENTRY, dentry->d_name.name, 0);
+	fsnotify_d_name(dentry, FS_CREATE);
 }
 
 /*
@@ -168,7 +190,7 @@ static inline void fsnotify_link(struct inode *dir, struct inode *inode, struct
 	fsnotify_link_count(inode);
 	audit_inode_child(dir, new_dentry, AUDIT_TYPE_CHILD_CREATE);
 
-	fsnotify(dir, FS_CREATE, new_dentry, FSNOTIFY_EVENT_DENTRY, new_dentry->d_name.name, 0);
+	fsnotify_d_name(new_dentry, FS_CREATE);
 }
 
 /*
@@ -180,7 +202,7 @@ static inline void fsnotify_mkdir(struct inode *inode, struct dentry *dentry)
 
 	audit_inode_child(inode, dentry, AUDIT_TYPE_CHILD_CREATE);
 
-	fsnotify(inode, mask, dentry, FSNOTIFY_EVENT_DENTRY, dentry->d_name.name, 0);
+	fsnotify_d_name(dentry, mask);
 }
 
 /*
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 3/5] fsnotify: simplify API for filename events
  2018-11-14 17:43 [PATCH v2 0/5] fsnotify prep work for fanotify dentry events Amir Goldstein
  2018-11-14 17:43 ` [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available Amir Goldstein
  2018-11-14 17:43 ` [PATCH v2 2/5] fsnotify: annotate filename events Amir Goldstein
@ 2018-11-14 17:43 ` Amir Goldstein
  2018-11-14 17:43 ` [PATCH v2 4/5] fsnotify: make MOVED_FROM a dentry event Amir Goldstein
  2018-11-14 17:43 ` [PATCH v2 5/5] fsnotify: send event type FSNOTIFY_EVENT_DENTRY to super block mark Amir Goldstein
  4 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2018-11-14 17:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

Do not pass redundant parent inode arg.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 arch/powerpc/platforms/cell/spufs/inode.c |  2 +-
 fs/btrfs/ioctl.c                          |  2 +-
 fs/debugfs/inode.c                        |  6 +++---
 fs/devpts/inode.c                         |  2 +-
 fs/namei.c                                | 18 +++++++++---------
 fs/ocfs2/refcounttree.c                   |  2 +-
 fs/tracefs/inode.c                        |  4 ++--
 include/linux/fsnotify.h                  | 20 +++++++++++++-------
 net/sunrpc/rpc_pipe.c                     |  6 +++---
 9 files changed, 34 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index db329d4bf1c3..a40de703f586 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -587,7 +587,7 @@ long spufs_create(struct path *path, struct dentry *dentry,
 		ret = spufs_create_context(dir, dentry, path->mnt, flags, mode,
 					    filp);
 	if (ret >= 0)
-		fsnotify_mkdir(dir, dentry);
+		fsnotify_mkdir(dentry);
 
 	return ret;
 }
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 802a628e9f7d..8af6258cb0a2 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -979,7 +979,7 @@ static noinline int btrfs_mksubvol(const struct path *parent,
 				      async_transid, inherit);
 	}
 	if (!error)
-		fsnotify_mkdir(dir, dentry);
+		fsnotify_mkdir(dentry);
 out_up_read:
 	up_read(&fs_info->subvol_sem);
 out_dput:
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 13b01351dd1c..ca9945f7db59 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -361,7 +361,7 @@ static struct dentry *__debugfs_create_file(const char *name, umode_t mode,
 				DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
 
 	d_instantiate(dentry, inode);
-	fsnotify_create(d_inode(dentry->d_parent), dentry);
+	fsnotify_create(dentry);
 	return end_creating(dentry);
 }
 
@@ -520,7 +520,7 @@ struct dentry *debugfs_create_dir(const char *name, struct dentry *parent)
 	inc_nlink(inode);
 	d_instantiate(dentry, inode);
 	inc_nlink(d_inode(dentry->d_parent));
-	fsnotify_mkdir(d_inode(dentry->d_parent), dentry);
+	fsnotify_mkdir(dentry);
 	return end_creating(dentry);
 }
 EXPORT_SYMBOL_GPL(debugfs_create_dir);
@@ -559,7 +559,7 @@ struct dentry *debugfs_create_automount(const char *name,
 	inc_nlink(inode);
 	d_instantiate(dentry, inode);
 	inc_nlink(d_inode(dentry->d_parent));
-	fsnotify_mkdir(d_inode(dentry->d_parent), dentry);
+	fsnotify_mkdir(dentry);
 	return end_creating(dentry);
 }
 EXPORT_SYMBOL(debugfs_create_automount);
diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index c53814539070..f15a33399a5b 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -589,7 +589,7 @@ struct dentry *devpts_pty_new(struct pts_fs_info *fsi, int index, void *priv)
 	if (dentry) {
 		dentry->d_fsdata = priv;
 		d_add(dentry, inode);
-		fsnotify_create(d_inode(root), dentry);
+		fsnotify_create(dentry);
 	} else {
 		iput(inode);
 		dentry = ERR_PTR(-ENOMEM);
diff --git a/fs/namei.c b/fs/namei.c
index 0cab6494978c..1d743adf90a0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2911,7 +2911,7 @@ int vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 		return error;
 	error = dir->i_op->create(dir, dentry, mode, want_excl);
 	if (!error)
-		fsnotify_create(dir, dentry);
+		fsnotify_create(dentry);
 	return error;
 }
 EXPORT_SYMBOL(vfs_create);
@@ -2932,7 +2932,7 @@ int vfs_mkobj(struct dentry *dentry, umode_t mode,
 		return error;
 	error = f(dentry, mode, arg);
 	if (!error)
-		fsnotify_create(dir, dentry);
+		fsnotify_create(dentry);
 	return error;
 }
 EXPORT_SYMBOL(vfs_mkobj);
@@ -3081,7 +3081,7 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
 			int acc_mode = op->acc_mode;
 			if (file->f_mode & FMODE_CREATED) {
 				WARN_ON(!(open_flag & O_CREAT));
-				fsnotify_create(dir, dentry);
+				fsnotify_create(dentry);
 				acc_mode = 0;
 			}
 			error = may_open(&file->f_path, acc_mode, open_flag);
@@ -3095,7 +3095,7 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
 				dentry = file->f_path.dentry;
 			}
 			if (file->f_mode & FMODE_CREATED)
-				fsnotify_create(dir, dentry);
+				fsnotify_create(dentry);
 			if (unlikely(d_is_negative(dentry))) {
 				error = -ENOENT;
 			} else {
@@ -3235,7 +3235,7 @@ static int lookup_open(struct nameidata *nd, struct path *path,
 						open_flag & O_EXCL);
 		if (error)
 			goto out_dput;
-		fsnotify_create(dir_inode, dentry);
+		fsnotify_create(dentry);
 	}
 	if (unlikely(create_error) && !dentry->d_inode) {
 		error = create_error;
@@ -3718,7 +3718,7 @@ int vfs_mknod(struct inode *dir, struct dentry *dentry, umode_t mode, dev_t dev)
 
 	error = dir->i_op->mknod(dir, dentry, mode, dev);
 	if (!error)
-		fsnotify_create(dir, dentry);
+		fsnotify_create(dentry);
 	return error;
 }
 EXPORT_SYMBOL(vfs_mknod);
@@ -3816,7 +3816,7 @@ int vfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 
 	error = dir->i_op->mkdir(dir, dentry, mode);
 	if (!error)
-		fsnotify_mkdir(dir, dentry);
+		fsnotify_mkdir(dentry);
 	return error;
 }
 EXPORT_SYMBOL(vfs_mkdir);
@@ -4126,7 +4126,7 @@ int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
 
 	error = dir->i_op->symlink(dir, dentry, oldname);
 	if (!error)
-		fsnotify_create(dir, dentry);
+		fsnotify_create(dentry);
 	return error;
 }
 EXPORT_SYMBOL(vfs_symlink);
@@ -4248,7 +4248,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
 	}
 	inode_unlock(inode);
 	if (!error)
-		fsnotify_link(dir, inode, new_dentry);
+		fsnotify_link(inode, new_dentry);
 	return error;
 }
 EXPORT_SYMBOL(vfs_link);
diff --git a/fs/ocfs2/refcounttree.c b/fs/ocfs2/refcounttree.c
index a35259eebc56..7ff695d1a3e4 100644
--- a/fs/ocfs2/refcounttree.c
+++ b/fs/ocfs2/refcounttree.c
@@ -4417,7 +4417,7 @@ static int ocfs2_vfs_reflink(struct dentry *old_dentry, struct inode *dir,
 		error = ocfs2_reflink(old_dentry, dir, new_dentry, preserve);
 	inode_unlock(inode);
 	if (!error)
-		fsnotify_create(dir, new_dentry);
+		fsnotify_create(new_dentry);
 	return error;
 }
 /*
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 7098c49f3693..4644ddc77595 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -410,7 +410,7 @@ struct dentry *tracefs_create_file(const char *name, umode_t mode,
 	inode->i_fop = fops ? fops : &tracefs_file_operations;
 	inode->i_private = data;
 	d_instantiate(dentry, inode);
-	fsnotify_create(dentry->d_parent->d_inode, dentry);
+	fsnotify_create(dentry);
 	return end_creating(dentry);
 }
 
@@ -435,7 +435,7 @@ static struct dentry *__create_dir(const char *name, struct dentry *parent,
 	inc_nlink(inode);
 	d_instantiate(dentry, inode);
 	inc_nlink(dentry->d_parent->d_inode);
-	fsnotify_mkdir(dentry->d_parent->d_inode, dentry);
+	fsnotify_mkdir(dentry);
 	return end_creating(dentry);
 }
 
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index d00ec5838d6e..4fb2fa0b31d2 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -63,6 +63,12 @@ static inline int fsnotify_path(struct inode *inode, const struct path *path,
 	return fsnotify(inode, mask, path, FSNOTIFY_EVENT_PATH, NULL, 0);
 }
 
+static inline void audit_dentry_child(const struct dentry *dentry,
+				      const unsigned char type)
+{
+	audit_inode_child(d_inode(dentry->d_parent), dentry, type);
+}
+
 /* Simple call site for access decisions */
 static inline int fsnotify_perm(struct file *file, int mask)
 {
@@ -129,7 +135,7 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
 
 	if (source)
 		fsnotify(source, FS_MOVE_SELF, moved, FSNOTIFY_EVENT_DENTRY, NULL, 0);
-	audit_inode_child(new_dir, moved, AUDIT_TYPE_CHILD_CREATE);
+	audit_dentry_child(moved, AUDIT_TYPE_CHILD_CREATE);
 }
 
 /*
@@ -173,9 +179,9 @@ static inline void fsnotify_inoderemove(struct inode *inode)
 /*
  * fsnotify_create - 'name' was linked in
  */
-static inline void fsnotify_create(struct inode *inode, struct dentry *dentry)
+static inline void fsnotify_create(struct dentry *dentry)
 {
-	audit_inode_child(inode, dentry, AUDIT_TYPE_CHILD_CREATE);
+	audit_dentry_child(dentry, AUDIT_TYPE_CHILD_CREATE);
 
 	fsnotify_d_name(dentry, FS_CREATE);
 }
@@ -185,10 +191,10 @@ static inline void fsnotify_create(struct inode *inode, struct dentry *dentry)
  * Note: We have to pass also the linked inode ptr as some filesystems leave
  *   new_dentry->d_inode NULL and instantiate inode pointer later
  */
-static inline void fsnotify_link(struct inode *dir, struct inode *inode, struct dentry *new_dentry)
+static inline void fsnotify_link(struct inode *inode, struct dentry *new_dentry)
 {
 	fsnotify_link_count(inode);
-	audit_inode_child(dir, new_dentry, AUDIT_TYPE_CHILD_CREATE);
+	audit_dentry_child(new_dentry, AUDIT_TYPE_CHILD_CREATE);
 
 	fsnotify_d_name(new_dentry, FS_CREATE);
 }
@@ -196,11 +202,11 @@ static inline void fsnotify_link(struct inode *dir, struct inode *inode, struct
 /*
  * fsnotify_mkdir - directory 'name' was created
  */
-static inline void fsnotify_mkdir(struct inode *inode, struct dentry *dentry)
+static inline void fsnotify_mkdir(struct dentry *dentry)
 {
 	__u32 mask = (FS_CREATE | FS_ISDIR);
 
-	audit_inode_child(inode, dentry, AUDIT_TYPE_CHILD_CREATE);
+	audit_dentry_child(dentry, AUDIT_TYPE_CHILD_CREATE);
 
 	fsnotify_d_name(dentry, mask);
 }
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 4fda18d47e2c..2dc1a0054c40 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -524,7 +524,7 @@ static int __rpc_create(struct inode *dir, struct dentry *dentry,
 	err = __rpc_create_common(dir, dentry, S_IFREG | mode, i_fop, private);
 	if (err)
 		return err;
-	fsnotify_create(dir, dentry);
+	fsnotify_create(dentry);
 	return 0;
 }
 
@@ -539,7 +539,7 @@ static int __rpc_mkdir(struct inode *dir, struct dentry *dentry,
 	if (err)
 		return err;
 	inc_nlink(dir);
-	fsnotify_mkdir(dir, dentry);
+	fsnotify_mkdir(dentry);
 	return 0;
 }
 
@@ -594,7 +594,7 @@ static int __rpc_mkpipe_dentry(struct inode *dir, struct dentry *dentry,
 	rpci = RPC_I(d_inode(dentry));
 	rpci->private = private;
 	rpci->pipe = pipe;
-	fsnotify_create(dir, dentry);
+	fsnotify_create(dentry);
 	return 0;
 }
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 4/5] fsnotify: make MOVED_FROM a dentry event
  2018-11-14 17:43 [PATCH v2 0/5] fsnotify prep work for fanotify dentry events Amir Goldstein
                   ` (2 preceding siblings ...)
  2018-11-14 17:43 ` [PATCH v2 3/5] fsnotify: simplify API for " Amir Goldstein
@ 2018-11-14 17:43 ` Amir Goldstein
  2018-11-20 12:45   ` Jan Kara
  2018-11-14 17:43 ` [PATCH v2 5/5] fsnotify: send event type FSNOTIFY_EVENT_DENTRY to super block mark Amir Goldstein
  4 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-14 17:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

Propagate old/new parent dentries to fsnotify_move() and pass old parent
dentry as FSNOTIFY_EVENT_DENTRY info of MOVED_FROM event.

This change has no effect on current backends. Soon, this will allow
fanotify backend to get MOVED_FROM events.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/debugfs/inode.c       |  5 ++---
 fs/namei.c               | 10 +++++++---
 include/linux/fsnotify.h | 18 +++++++++---------
 3 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index ca9945f7db59..f7c061485f89 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -809,9 +809,8 @@ struct dentry *debugfs_rename(struct dentry *old_dir, struct dentry *old_dentry,
 		goto exit;
 	}
 	d_move(old_dentry, dentry);
-	fsnotify_move(d_inode(old_dir), d_inode(new_dir), old_name.name,
-		d_is_dir(old_dentry),
-		NULL, old_dentry);
+	fsnotify_move(old_dir, new_dir, old_name.name,
+		      d_is_dir(old_dentry), NULL, old_dentry);
 	release_dentry_name_snapshot(&old_name);
 	unlock_rename(new_dir, old_dir);
 	dput(dentry);
diff --git a/fs/namei.c b/fs/namei.c
index 1d743adf90a0..ecaabc081cac 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4394,6 +4394,8 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 {
 	int error;
 	bool is_dir = d_is_dir(old_dentry);
+	struct dentry *old_parent = old_dentry->d_parent;
+	struct dentry *new_parent = new_dentry->d_parent;
 	struct inode *source = old_dentry->d_inode;
 	struct inode *target = new_dentry->d_inode;
 	bool new_is_dir = false;
@@ -4500,10 +4502,12 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		inode_unlock(target);
 	dput(new_dentry);
 	if (!error) {
-		fsnotify_move(old_dir, new_dir, old_name.name, is_dir,
-			      !(flags & RENAME_EXCHANGE) ? target : NULL, old_dentry);
+		fsnotify_move(old_parent, new_parent, old_name.name, is_dir,
+			      !(flags & RENAME_EXCHANGE) ? target : NULL,
+			      old_dentry);
 		if (flags & RENAME_EXCHANGE) {
-			fsnotify_move(new_dir, old_dir, old_dentry->d_name.name,
+			fsnotify_move(new_parent, old_parent,
+				      old_dentry->d_name.name,
 				      new_is_dir, NULL, new_dentry);
 		}
 	}
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 4fb2fa0b31d2..7d7629f99c9d 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -108,17 +108,17 @@ static inline void fsnotify_link_count(struct inode *inode)
 /*
  * fsnotify_move - file old_name at old_dir was moved to new_name at new_dir
  */
-static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
-				 const unsigned char *old_name,
-				 int isdir, struct inode *target, struct dentry *moved)
+static inline void fsnotify_move(struct dentry *old_dir, struct dentry *new_dir,
+				 const unsigned char *old_name, int isdir,
+				 struct inode *target, struct dentry *moved)
 {
-	struct inode *source = moved->d_inode;
+	struct inode *source = d_inode(moved);
 	u32 fs_cookie = fsnotify_get_cookie();
 	__u32 old_dir_mask = FS_MOVED_FROM;
 	__u32 new_dir_mask = FS_MOVED_TO;
 	const unsigned char *new_name = moved->d_name.name;
 
-	if (old_dir == new_dir)
+	if (d_inode(old_dir) == d_inode(new_dir))
 		old_dir_mask |= FS_DN_RENAME;
 
 	if (isdir) {
@@ -126,15 +126,15 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
 		new_dir_mask |= FS_ISDIR;
 	}
 
-	fsnotify(old_dir, old_dir_mask, source, FSNOTIFY_EVENT_INODE, old_name,
-		 fs_cookie);
-	fsnotify_filename(moved->d_parent, new_dir_mask, new_name, fs_cookie);
+	fsnotify_filename(old_dir, old_dir_mask, old_name, fs_cookie);
+	fsnotify_filename(new_dir, new_dir_mask, new_name, fs_cookie);
 
 	if (target)
 		fsnotify_link_count(target);
 
 	if (source)
-		fsnotify(source, FS_MOVE_SELF, moved, FSNOTIFY_EVENT_DENTRY, NULL, 0);
+		fsnotify(source, FS_MOVE_SELF, moved, FSNOTIFY_EVENT_DENTRY,
+			 NULL, 0);
 	audit_dentry_child(moved, AUDIT_TYPE_CHILD_CREATE);
 }
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 5/5] fsnotify: send event type FSNOTIFY_EVENT_DENTRY to super block mark
  2018-11-14 17:43 [PATCH v2 0/5] fsnotify prep work for fanotify dentry events Amir Goldstein
                   ` (3 preceding siblings ...)
  2018-11-14 17:43 ` [PATCH v2 4/5] fsnotify: make MOVED_FROM a dentry event Amir Goldstein
@ 2018-11-14 17:43 ` Amir Goldstein
  4 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2018-11-14 17:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

So far, existence of super block marks could be checked only on events
with data type FSNOTIFY_EVENT_PATH. Use d_sb field of dentry from events
with data type FSNOTIFY_EVENT_DENTRY to send event on super block mark.

This change has no effect on current backends. Soon, this will allow
fanotify backend to receive dentry events on a super block mark.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fsnotify.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 6a120b7f8b94..c9d673f0cbdc 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -338,6 +338,9 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 		mnt = real_mount(((const struct path *)data)->mnt);
 		sb = mnt->mnt.mnt_sb;
 		mnt_or_sb_mask = mnt->mnt_fsnotify_mask | sb->s_fsnotify_mask;
+	} else if (data_is == FSNOTIFY_EVENT_DENTRY) {
+		sb = ((const struct dentry *)data)->d_sb;
+		mnt_or_sb_mask = sb->s_fsnotify_mask;
 	}
 	/* An event "on child" is not intended for a mount/sb mark */
 	if (mask & FS_EVENT_ON_CHILD)
@@ -350,8 +353,8 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	 * SRCU because we have no references to any objects and do not
 	 * need SRCU to keep them "alive".
 	 */
-	if (!to_tell->i_fsnotify_marks &&
-	    (!mnt || (!mnt->mnt_fsnotify_marks && !sb->s_fsnotify_marks)))
+	if (!to_tell->i_fsnotify_marks && (!mnt || !mnt->mnt_fsnotify_marks) &&
+	    (!sb || !sb->s_fsnotify_marks))
 		return 0;
 	/*
 	 * if this is a modify event we may need to clear the ignored masks
@@ -369,6 +372,8 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	if (mnt) {
 		iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
 			fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
+	}
+	if (sb) {
 		iter_info.marks[FSNOTIFY_OBJ_TYPE_SB] =
 			fsnotify_first_mark(&sb->s_fsnotify_marks);
 	}
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-14 17:43 ` [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available Amir Goldstein
@ 2018-11-20 11:32   ` Jan Kara
  2018-11-20 14:36     ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Kara @ 2018-11-20 11:32 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Wed 14-11-18 19:43:40, Amir Goldstein wrote:
> Define a new data type to pass for event FSNOTIFY_EVENT_DENTRY
> and use it whenever a dentry is available instead of passing
> it's ->d_inode as data type FSNOTIFY_EVENT_INODE.
> 
> None of the current fsnotify backends make use of the inode data
> with data type FSNOTIFY_EVENT_INODE - only the data of type
> FSNOTIFY_EVENT_PATH is ever used, so this change has no immediate
> consequences.
> 
> Soon, we are going to use the dentry data type to support more
> events with fanotify backend.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

One general coment: Audit is using FSNOTIFY_EVENT_INODE and this change is
going to break it. So it needs updating as well.

Also I'd prefer a more justified selection of dentry events than 'those
where we can do it'. Because eventually that set will translate to events
available to userspace in fanotify and that should be a reasonably
consistent set. So here I suspect you target at directory modification
events (you call them filename events in the next patch). So just define
which FS_<foo> events these exactly are and convert exactly those event
helpers to dentry ones...

> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index 2ccb08cb5d6a..9dadc0bcd7a9 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -99,14 +99,14 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
>  
>  	fsnotify(old_dir, old_dir_mask, source, FSNOTIFY_EVENT_INODE, old_name,
>  		 fs_cookie);
> -	fsnotify(new_dir, new_dir_mask, source, FSNOTIFY_EVENT_INODE, new_name,
> +	fsnotify(new_dir, new_dir_mask, moved, FSNOTIFY_EVENT_DENTRY, new_name,
>  		 fs_cookie);
>  
>  	if (target)
>  		fsnotify_link_count(target);
>  
>  	if (source)
> -		fsnotify(source, FS_MOVE_SELF, moved->d_inode, FSNOTIFY_EVENT_INODE, NULL, 0);
> +		fsnotify(source, FS_MOVE_SELF, moved, FSNOTIFY_EVENT_DENTRY, NULL, 0);
>  	audit_inode_child(new_dir, moved, AUDIT_TYPE_CHILD_CREATE);
>  }
>  

I'd postpone these fsnotify_move() changes to a patch where you make
fsnotify_move() fully dentry based. Having it converted half-way looks
confusing...

> @@ -168,7 +168,7 @@ static inline void fsnotify_link(struct inode *dir, struct inode *inode, struct
>  	fsnotify_link_count(inode);
>  	audit_inode_child(dir, new_dentry, AUDIT_TYPE_CHILD_CREATE);
>  
> -	fsnotify(dir, FS_CREATE, inode, FSNOTIFY_EVENT_INODE, new_dentry->d_name.name, 0);
> +	fsnotify(dir, FS_CREATE, new_dentry, FSNOTIFY_EVENT_DENTRY, new_dentry->d_name.name, 0);
>  }

So a change like this makes me slightly nervous. What guarantees that
new_dentry->d_inode is going to be the same thing as 'inode' passed in?
Aren't races changing new_dentry before we look at new_dentry->d_inode
possible? In this specific case I don't think anything unexpected is
possible - we hold i_rwsem, hopefully nobody does anything fancy like
instantiating a different inode in their ->link method. But generally
non-obvious changes like this should be split out in separate commits with
justification why the change is safe. 

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/5] fsnotify: annotate filename events
  2018-11-14 17:43 ` [PATCH v2 2/5] fsnotify: annotate filename events Amir Goldstein
@ 2018-11-20 11:59   ` Jan Kara
  2018-11-20 14:58     ` Amir Goldstein
  2018-11-22  8:40     ` Amir Goldstein
  0 siblings, 2 replies; 29+ messages in thread
From: Jan Kara @ 2018-11-20 11:59 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Wed 14-11-18 19:43:41, Amir Goldstein wrote:
> Filename events are referring to events that modify directory entries,
> such as create,delete,rename. Those events should always be reported
> on a watched directory, regardless if FS_EVENT_ON_CHILD is set
> on the watch mask.

OK, I find 'directory modification events' clearer than 'filename events'.
But I can live with your name since I don't really have a better
alternative :). Just please define these events in terms of all FS_<foo>
events that are involved so that everyone is on the same page which events
you mean.

> fsnotify_nameremove() and fsnotify_move() were modified to no longer
> set the FS_EVENT_ON_CHILD event bit. This is a semantic change to
> align with the filename event definition. It has no effect on any
> existing backend, because dnotify and inotify always requets the
> child events and fanotify does not get the delete,rename events.

You keep forgetting about audit ;) But in this case it is fine as it always
sets FS_EVENT_ON_CHILD as well.

> The fsnotify_filename() helper is used to report all the filename
> events. It gets a reference on parent dentry and passes it as the
> data for the event along with the filename.
> 
> fsnotify_filename() is different from fsnotify_parent().
> fsnotify_parent() is intended to report any events that happened on
> child inodes when FS_EVENT_ON_CHILD is requested.
> fsnotify_filename() is intended to report only filename events,
> such as create,mkdir,link. Those events must always be reported
> on a watched directory, regardless if FS_EVENT_ON_CHILD was requested.
> 
> fsnotify_d_name() is a helper for the common case where the
> filename to pass is dentry->d_name.name.
> 
> It is safe to use these helpers with negative or not instantiated
> dentries, such as the case with fsnotify_link() and
> fsnotify_nameremove().
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  include/linux/fsnotify.h | 40 +++++++++++++++++++++++++++++++---------
>  1 file changed, 31 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index 9dadc0bcd7a9..d00ec5838d6e 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -17,8 +17,31 @@
>  #include <linux/slab.h>
>  #include <linux/bug.h>
>  
> +/*
> + * Notify this parent about a filename event (create,delete,rename).
> + * Unlike fsnotify_parent(), the event will be reported regardless of the
> + * FS_EVENT_ON_CHILD mask on the parent inode
> + */

How about specifying this as 'Notify directory 'parent' about a change of
some of its directory entries'? That way you avoid using 'filename' event
in the description which is not defined.

> +static inline int fsnotify_filename(struct dentry *parent, __u32 mask,
> +				    const unsigned char *file_name, u32 cookie)

And how about calling the function fsnotify_dir_change()?

> +{
> +	return fsnotify(d_inode(parent), mask, parent, FSNOTIFY_EVENT_DENTRY,
> +			file_name, cookie);
> +}
> +
> +/*
> + * Call fsnotify_filename() with parent and d_name of this dentry.
> + * Safe to call with negative dentry, e.g. from fsnotify_nameremove()
> + */
> +static inline int fsnotify_d_name(struct dentry *dentry, __u32 mask)

Maybe call this fsnotify_dir_dentry_change()?

> +{
> +	return fsnotify_filename(dentry->d_parent, mask,
> +				 dentry->d_name.name, 0);
> +}
> +
>  /* Notify this dentry's parent about a child's events. */
> -static inline int fsnotify_parent(const struct path *path, struct dentry *dentry, __u32 mask)
> +static inline int fsnotify_parent(const struct path *path,
> +				  struct dentry *dentry, __u32 mask)
>  {
>  	if (!dentry)
>  		dentry = path->dentry;
> @@ -85,8 +108,8 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
>  {
>  	struct inode *source = moved->d_inode;
>  	u32 fs_cookie = fsnotify_get_cookie();
> -	__u32 old_dir_mask = (FS_EVENT_ON_CHILD | FS_MOVED_FROM);
> -	__u32 new_dir_mask = (FS_EVENT_ON_CHILD | FS_MOVED_TO);
> +	__u32 old_dir_mask = FS_MOVED_FROM;
> +	__u32 new_dir_mask = FS_MOVED_TO;

You can just inline these two masks now. There's no point for the variable
anymore.

> @@ -99,8 +122,7 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
>  
>  	fsnotify(old_dir, old_dir_mask, source, FSNOTIFY_EVENT_INODE, old_name,
>  		 fs_cookie);
> -	fsnotify(new_dir, new_dir_mask, moved, FSNOTIFY_EVENT_DENTRY, new_name,
> -		 fs_cookie);
> +	fsnotify_filename(moved->d_parent, new_dir_mask, new_name, fs_cookie);

The same comment as for the patch 1 here. You should justify that
moved->d_parent is actually the same as new_dir and the dentry cannot
change under us. The same holds for all the calls of fsnotify_d_name()
where the directory inode originally passed in gets replaced with
d_inode(dentry->d_parent).

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 4/5] fsnotify: make MOVED_FROM a dentry event
  2018-11-14 17:43 ` [PATCH v2 4/5] fsnotify: make MOVED_FROM a dentry event Amir Goldstein
@ 2018-11-20 12:45   ` Jan Kara
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Kara @ 2018-11-20 12:45 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Wed 14-11-18 19:43:43, Amir Goldstein wrote:
> Propagate old/new parent dentries to fsnotify_move() and pass old parent
> dentry as FSNOTIFY_EVENT_DENTRY info of MOVED_FROM event.
> 
> This change has no effect on current backends. Soon, this will allow
> fanotify backend to get MOVED_FROM events.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

Possibly this patch should go first in the series so that further
fsnotify_move() changes can happen to the whole function...

> diff --git a/fs/namei.c b/fs/namei.c
> index 1d743adf90a0..ecaabc081cac 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -4394,6 +4394,8 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
>  {
>  	int error;
>  	bool is_dir = d_is_dir(old_dentry);
> +	struct dentry *old_parent = old_dentry->d_parent;
> +	struct dentry *new_parent = new_dentry->d_parent;

Again, did you make sure old_dir == old_dentry->d_parent in all cases? If
yes, comment in the changelog would be good. Maybe CC Al Viro on this one
as rename is especially complex.  Chances he'll comment are low but if
there's something seriously broken with this, he will.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-20 11:32   ` Jan Kara
@ 2018-11-20 14:36     ` Amir Goldstein
  2018-11-21 12:51       ` Jan Kara
  0 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-20 14:36 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Tue, Nov 20, 2018 at 1:32 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 14-11-18 19:43:40, Amir Goldstein wrote:
> > Define a new data type to pass for event FSNOTIFY_EVENT_DENTRY
> > and use it whenever a dentry is available instead of passing
> > it's ->d_inode as data type FSNOTIFY_EVENT_INODE.
> >
> > None of the current fsnotify backends make use of the inode data
> > with data type FSNOTIFY_EVENT_INODE - only the data of type
> > FSNOTIFY_EVENT_PATH is ever used, so this change has no immediate
> > consequences.
> >
> > Soon, we are going to use the dentry data type to support more
> > events with fanotify backend.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> One general coment: Audit is using FSNOTIFY_EVENT_INODE and this change is
> going to break it. So it needs updating as well.
>

Ouch! I should really add the audit test suite to my testing routine...

> Also I'd prefer a more justified selection of dentry events than 'those
> where we can do it'. Because eventually that set will translate to events
> available to userspace in fanotify and that should be a reasonably
> consistent set. So here I suspect you target at directory modification
> events (you call them filename events in the next patch). So just define
> which FS_<foo> events these exactly are and convert exactly those event
> helpers to dentry ones...
>

It is not accurate that I target only "directory modification" events.
I also target FS_ATTRIB and in the future also FS_DELETE_SELF,
both applicable to files as well as directories.

But when I think about it... fanotify does not really need the dentry
information,
so I might just be able to do without FSNOTIFY_EVENT_DENTRY
and avert the questions about stability of dentry->d_inode

fanotify_dentry branch uses the dentry information in 3 occasions:

1. if (d_is_dir(dentry) in fanotify_group_event_mask()
    This could be converted to S_ISDIR(inode->i_mode)
2. exportfs_encode_fh(dentry,... in fanotify_alloc_fid()
    Can be converted to exportfs_encode_inode_fh(inode,...
3. statfs_by_dentry(dentry,... in fanotify_alloc_fid()
    Can be converted to statfs_by_dentry(inode->i_sb->sb_root,...

I will leave the patch to annotate "directory change" events,
but the rest of this prep series can be discarded.

Does that sound reasonable?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/5] fsnotify: annotate filename events
  2018-11-20 11:59   ` Jan Kara
@ 2018-11-20 14:58     ` Amir Goldstein
  2018-11-21 13:18       ` Jan Kara
  2018-11-22  8:40     ` Amir Goldstein
  1 sibling, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-20 14:58 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Tue, Nov 20, 2018 at 1:59 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 14-11-18 19:43:41, Amir Goldstein wrote:
> > Filename events are referring to events that modify directory entries,
> > such as create,delete,rename. Those events should always be reported
> > on a watched directory, regardless if FS_EVENT_ON_CHILD is set
> > on the watch mask.
>
> OK, I find 'directory modification events' clearer than 'filename events'.
> But I can live with your name since I don't really have a better
> alternative :). Just please define these events in terms of all FS_<foo>
> events that are involved so that everyone is on the same page which events
> you mean.
>

>From a later fanotify patch:

/*
 * Events whose reported fid is the parent directory.
 * fanotify may get support for reporting the filename in the future.
 * For now, listener only gets notified that a create/delete/rename took
 * place in that directory.
 */
#define FANOTIFY_FILENAME_EVENTS        (FAN_MOVE | FAN_CREATE | FAN_DELETE)

I went back and forth with this trying to come up with a better
name and DIR_MODIFY_EVENTS did cross my mind, but the
problem is that FS_MODIFY|FS_ISDIR is technically also a directory
modification event, so we are really looking at "directory entry modification"
and I didn't like the sounds of DIRENT_EVENTS.

So I went for a name that described the information reported in the events
which is parent+filename.

Since you say you can live with this choice, I will add FS_FILENAME_EVENTS
with a similar comment in this patch.

> > fsnotify_nameremove() and fsnotify_move() were modified to no longer
> > set the FS_EVENT_ON_CHILD event bit. This is a semantic change to
> > align with the filename event definition. It has no effect on any
> > existing backend, because dnotify and inotify always requets the
> > child events and fanotify does not get the delete,rename events.
>
> You keep forgetting about audit ;) But in this case it is fine as it always
> sets FS_EVENT_ON_CHILD as well.
>

Right... :-/

> > The fsnotify_filename() helper is used to report all the filename
> > events. It gets a reference on parent dentry and passes it as the
> > data for the event along with the filename.
> >
> > fsnotify_filename() is different from fsnotify_parent().
> > fsnotify_parent() is intended to report any events that happened on
> > child inodes when FS_EVENT_ON_CHILD is requested.
> > fsnotify_filename() is intended to report only filename events,
> > such as create,mkdir,link. Those events must always be reported
> > on a watched directory, regardless if FS_EVENT_ON_CHILD was requested.
> >
> > fsnotify_d_name() is a helper for the common case where the
> > filename to pass is dentry->d_name.name.
> >
> > It is safe to use these helpers with negative or not instantiated
> > dentries, such as the case with fsnotify_link() and
> > fsnotify_nameremove().
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  include/linux/fsnotify.h | 40 +++++++++++++++++++++++++++++++---------
> >  1 file changed, 31 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> > index 9dadc0bcd7a9..d00ec5838d6e 100644
> > --- a/include/linux/fsnotify.h
> > +++ b/include/linux/fsnotify.h
> > @@ -17,8 +17,31 @@
> >  #include <linux/slab.h>
> >  #include <linux/bug.h>
> >
> > +/*
> > + * Notify this parent about a filename event (create,delete,rename).
> > + * Unlike fsnotify_parent(), the event will be reported regardless of the
> > + * FS_EVENT_ON_CHILD mask on the parent inode
> > + */
>
> How about specifying this as 'Notify directory 'parent' about a change of
> some of its directory entries'? That way you avoid using 'filename' event
> in the description which is not defined.
>

Sure. Although I am going to define filename events now...

> > +static inline int fsnotify_filename(struct dentry *parent, __u32 mask,
> > +                                 const unsigned char *file_name, u32 cookie)
>
> And how about calling the function fsnotify_dir_change()?

Not comfortable with this name because of fsnotify_change()
being passed a directory sounds like it should call here.
The name of this helper signifies that it takes a filename argument
and pass a non NULL filename argument to fsnotify().
Nothing more, nothing less.

>
> > +{
> > +     return fsnotify(d_inode(parent), mask, parent, FSNOTIFY_EVENT_DENTRY,
> > +                     file_name, cookie);
> > +}
> > +
> > +/*
> > + * Call fsnotify_filename() with parent and d_name of this dentry.
> > + * Safe to call with negative dentry, e.g. from fsnotify_nameremove()
> > + */
> > +static inline int fsnotify_d_name(struct dentry *dentry, __u32 mask)
>
> Maybe call this fsnotify_dir_dentry_change()?
>

Similar reasoning as above.
The name of this wrapper signifies that it passed dentry->d_name as
a non NULL filename argument to fsnotify().
Nothing more, nothing less.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-20 14:36     ` Amir Goldstein
@ 2018-11-21 12:51       ` Jan Kara
  2018-11-21 16:13         ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Kara @ 2018-11-21 12:51 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Tue 20-11-18 16:36:31, Amir Goldstein wrote:
> On Tue, Nov 20, 2018 at 1:32 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 14-11-18 19:43:40, Amir Goldstein wrote:
> > > Define a new data type to pass for event FSNOTIFY_EVENT_DENTRY
> > > and use it whenever a dentry is available instead of passing
> > > it's ->d_inode as data type FSNOTIFY_EVENT_INODE.
> > >
> > > None of the current fsnotify backends make use of the inode data
> > > with data type FSNOTIFY_EVENT_INODE - only the data of type
> > > FSNOTIFY_EVENT_PATH is ever used, so this change has no immediate
> > > consequences.
> > >
> > > Soon, we are going to use the dentry data type to support more
> > > events with fanotify backend.
> > >
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> >
> > One general coment: Audit is using FSNOTIFY_EVENT_INODE and this change is
> > going to break it. So it needs updating as well.
> >
> 
> Ouch! I should really add the audit test suite to my testing routine...
> 
> > Also I'd prefer a more justified selection of dentry events than 'those
> > where we can do it'. Because eventually that set will translate to events
> > available to userspace in fanotify and that should be a reasonably
> > consistent set. So here I suspect you target at directory modification
> > events (you call them filename events in the next patch). So just define
> > which FS_<foo> events these exactly are and convert exactly those event
> > helpers to dentry ones...
> >
> 
> It is not accurate that I target only "directory modification" events.
> I also target FS_ATTRIB and in the future also FS_DELETE_SELF,
> both applicable to files as well as directories.
> 
> But when I think about it... fanotify does not really need the dentry
> information,
> so I might just be able to do without FSNOTIFY_EVENT_DENTRY
> and avert the questions about stability of dentry->d_inode

OK, that would certainly make things simpler.

> fanotify_dentry branch uses the dentry information in 3 occasions:
> 
> 1. if (d_is_dir(dentry) in fanotify_group_event_mask()
>     This could be converted to S_ISDIR(inode->i_mode)

Sure.

> 2. exportfs_encode_fh(dentry,... in fanotify_alloc_fid()
>     Can be converted to exportfs_encode_inode_fh(inode,...

If you have the parent inode then yes. In lot of cases we do have it, not
sure if in all of them (but likely yes so that we can do proper
FS_EVENT_ON_CHILD) handling.

> 3. statfs_by_dentry(dentry,... in fanotify_alloc_fid()
>     Can be converted to statfs_by_dentry(inode->i_sb->sb_root,...

This might be problematic e.g. with btrfs subvolumes which have each
different fsid so the fsid-handle pair might be actually invalid or even
point to a file with different contents. Maybe we could just store the
fsid in the fsnotify_mark when it is created and use it when generating
events? That would also get rid of the overhead of calling statfs for each
generated event which I don't like...

> I will leave the patch to annotate "directory change" events,
> but the rest of this prep series can be discarded.
> 
> Does that sound reasonable?

Yes.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/5] fsnotify: annotate filename events
  2018-11-20 14:58     ` Amir Goldstein
@ 2018-11-21 13:18       ` Jan Kara
  2018-11-21 15:40         ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Kara @ 2018-11-21 13:18 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Tue 20-11-18 16:58:31, Amir Goldstein wrote:
> On Tue, Nov 20, 2018 at 1:59 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 14-11-18 19:43:41, Amir Goldstein wrote:
> > > Filename events are referring to events that modify directory entries,
> > > such as create,delete,rename. Those events should always be reported
> > > on a watched directory, regardless if FS_EVENT_ON_CHILD is set
> > > on the watch mask.
> >
> > OK, I find 'directory modification events' clearer than 'filename events'.
> > But I can live with your name since I don't really have a better
> > alternative :). Just please define these events in terms of all FS_<foo>
> > events that are involved so that everyone is on the same page which events
> > you mean.
> >
> 
> From a later fanotify patch:
> 
> /*
>  * Events whose reported fid is the parent directory.
>  * fanotify may get support for reporting the filename in the future.
>  * For now, listener only gets notified that a create/delete/rename took
>  * place in that directory.
>  */
> #define FANOTIFY_FILENAME_EVENTS        (FAN_MOVE | FAN_CREATE | FAN_DELETE)
> 
> I went back and forth with this trying to come up with a better
> name and DIR_MODIFY_EVENTS did cross my mind, but the
> problem is that FS_MODIFY|FS_ISDIR is technically also a directory
> modification event, so we are really looking at "directory entry modification"
> and I didn't like the sounds of DIRENT_EVENTS.

But we never generate FS_MODIFY|FS_ISDIR events so I don't think there's a
big space for confusion (and I've deliberately used CHANGE instead of
MODIFY to make the distinction even clearer). FWIW
FANOTIFY_DIRENT_MODIFY_EVENTS would also look better than _FILENAME_EVENTS
to me.

> So I went for a name that described the information reported in the events
> which is parent+filename.

Well, but you don't report parent + filename, you report file handle and it
isn't clear whether name will ever get reported. What initially confused me
about filename events was that it sounded like you want to report all
events (so also open / read / write / ...) for a _filename_.

> Since you say you can live with this choice, I will add FS_FILENAME_EVENTS
> with a similar comment in this patch.

Talking about this more I'd really prefer if we could change the name to
something else...

> > > +static inline int fsnotify_filename(struct dentry *parent, __u32 mask,
> > > +                                 const unsigned char *file_name, u32 cookie)
> >
> > And how about calling the function fsnotify_dir_change()?
> 
> Not comfortable with this name because of fsnotify_change()
> being passed a directory sounds like it should call here.
> The name of this helper signifies that it takes a filename argument
> and pass a non NULL filename argument to fsnotify().
> Nothing more, nothing less.

I find the current name confusing because, as I wrote above, it creates a
feeling in me that we are notifying filename that something happens with
it. But instead we notify a directory that something happens with filenames
inside it... Generally we have names of form fsnotify_<what-happened> or
fsnotify_<who-to-tell> and fsnotify_filename() does not fall into either of
the cathegories which confused my brain trying to fit it in one of the two
cathegories and failing. Maybe fsnotify_dir_op() because it is just a helper
function for various directory operations?

> > > +{
> > > +     return fsnotify(d_inode(parent), mask, parent, FSNOTIFY_EVENT_DENTRY,
> > > +                     file_name, cookie);
> > > +}
> > > +
> > > +/*
> > > + * Call fsnotify_filename() with parent and d_name of this dentry.
> > > + * Safe to call with negative dentry, e.g. from fsnotify_nameremove()
> > > + */
> > > +static inline int fsnotify_d_name(struct dentry *dentry, __u32 mask)
> >
> > Maybe call this fsnotify_dir_dentry_change()?
> >
> 
> Similar reasoning as above.
> The name of this wrapper signifies that it passed dentry->d_name as
> a non NULL filename argument to fsnotify().
> Nothing more, nothing less.

And same objection to the original name here...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/5] fsnotify: annotate filename events
  2018-11-21 13:18       ` Jan Kara
@ 2018-11-21 15:40         ` Amir Goldstein
  2018-11-22  7:45           ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-21 15:40 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

On Wed, Nov 21, 2018 at 3:18 PM Jan Kara <jack@suse.cz> wrote:
>
> On Tue 20-11-18 16:58:31, Amir Goldstein wrote:
> > On Tue, Nov 20, 2018 at 1:59 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Wed 14-11-18 19:43:41, Amir Goldstein wrote:
> > > > Filename events are referring to events that modify directory entries,
> > > > such as create,delete,rename. Those events should always be reported
> > > > on a watched directory, regardless if FS_EVENT_ON_CHILD is set
> > > > on the watch mask.
> > >
> > > OK, I find 'directory modification events' clearer than 'filename events'.
> > > But I can live with your name since I don't really have a better
> > > alternative :). Just please define these events in terms of all FS_<foo>
> > > events that are involved so that everyone is on the same page which events
> > > you mean.
> > >
> >
> > From a later fanotify patch:
> >
> > /*
> >  * Events whose reported fid is the parent directory.
> >  * fanotify may get support for reporting the filename in the future.
> >  * For now, listener only gets notified that a create/delete/rename took
> >  * place in that directory.
> >  */
> > #define FANOTIFY_FILENAME_EVENTS        (FAN_MOVE | FAN_CREATE | FAN_DELETE)
> >
> > I went back and forth with this trying to come up with a better
> > name and DIR_MODIFY_EVENTS did cross my mind, but the
> > problem is that FS_MODIFY|FS_ISDIR is technically also a directory
> > modification event, so we are really looking at "directory entry modification"
> > and I didn't like the sounds of DIRENT_EVENTS.
>
> But we never generate FS_MODIFY|FS_ISDIR events so I don't think there's a
> big space for confusion (and I've deliberately used CHANGE instead of
> MODIFY to make the distinction even clearer). FWIW
> FANOTIFY_DIRENT_MODIFY_EVENTS would also look better than _FILENAME_EVENTS
> to me.
>

Fair enough. I'll change to FANOTIFY_DIRENT_MODIFY_EVENTS
and similar named helpers and comments in fsnotify.h.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-21 12:51       ` Jan Kara
@ 2018-11-21 16:13         ` Amir Goldstein
  2018-11-22  9:52           ` Jan Kara
  0 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-21 16:13 UTC (permalink / raw)
  To: Jan Kara
  Cc: Matthew Bobrowski, linux-fsdevel, Mark Fasheh, Chris Mason, Josef Bacik

>
> > 2. exportfs_encode_fh(dentry,... in fanotify_alloc_fid()
> >     Can be converted to exportfs_encode_inode_fh(inode,...
>
> If you have the parent inode then yes. In lot of cases we do have it, not
> sure if in all of them (but likely yes so that we can do proper
> FS_EVENT_ON_CHILD) handling.
>

We do not need the parent inode, because we are not encoding a
"connectable" file handle. we need:
exportfs_encode_inode_fh(inode, (struct fid *)fid->f_handle,
                                  &dwords,  NULL);


> > 3. statfs_by_dentry(dentry,... in fanotify_alloc_fid()
> >     Can be converted to statfs_by_dentry(inode->i_sb->sb_root,...
>
> This might be problematic e.g. with btrfs subvolumes which have each
> different fsid so the fsid-handle pair might be actually invalid or even
> point to a file with different contents. Maybe we could just store the
> fsid in the fsnotify_mark when it is created and use it when generating
> events? That would also get rid of the overhead of calling statfs for each
> generated event which I don't like...
>

OK, I was not being accurate when I wrote that these are the only 3 places
we use dentry. Those are the only 3 places in fanotify, but we also use
dentry->d_sb in fsnotify() to get to the sb mark of course, so we will be using
inode->i_sb instead.

w.r.t btrfs - btrfs has a single sb for multiple subvolumes so by definition the
FAN_MARK_FILESYSTEM feature is only capable of watching ALL subvolumes.

If we wanted to implement subvolume watch for btrfs, we would need
to support attaching a mark to a btrfs subvolume struct (or fs_view [1]).

Basically, the purpose, of fid->fsid is:
1. A key for finding a mount point to use as mount_fd argument to
open_by_handle_at()
2. Make fid unique across the system

Since btrfs file handles encode the subvolume root object id in the file handle,
fid->fsid is only needed to find a btrfs mount of the same sb (blockdev).

Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with the
single super block struct, so all dentries in all subvolumes will
return the same
fsid: btrfs_sb(dentry->d_sb)->fsid.

CC some btrfs folks to correct me if I am wrong.

Thanks,
Amir.

[1] https://lwn.net/Articles/753917/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/5] fsnotify: annotate filename events
  2018-11-21 15:40         ` Amir Goldstein
@ 2018-11-22  7:45           ` Amir Goldstein
  2018-11-22  9:33             ` Jan Kara
  0 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-22  7:45 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

> > But we never generate FS_MODIFY|FS_ISDIR events so I don't think there's a
> > big space for confusion (and I've deliberately used CHANGE instead of
> > MODIFY to make the distinction even clearer). FWIW
> > FANOTIFY_DIRENT_MODIFY_EVENTS would also look better than _FILENAME_EVENTS
> > to me.
> >
>
> Fair enough. I'll change to FANOTIFY_DIRENT_MODIFY_EVENTS
> and similar named helpers and comments in fsnotify.h.
>

On second thought, if you don't object, I will opt for brevity and use:
FANOTIFY_DIRENT_EVENTS.
I don't thing that dropping the _MODIFY_ /_change part of the name is
going to be a source of ambiguity.

/* Notify this directory about a change in one of its directory entries. */
static inline int __fsnotify_dirent(struct inode *dir, __u32 mask,
                                    const unsigned char *file_name, u32 cookie)
/*
 * Notify this dentry's parent about a change in the directory entry
 * associated with this dentry.
 * Unlike fsnotify_parent(), the event will be reported regardless of the
 * FS_EVENT_ON_CHILD mask on the parent inode
 * Safe to call with negative dentry, e.g. from fsnotify_nameremove()
 */
static inline int fsnotify_dirent(struct dentry *dentry, __u32 mask)

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/5] fsnotify: annotate filename events
  2018-11-20 11:59   ` Jan Kara
  2018-11-20 14:58     ` Amir Goldstein
@ 2018-11-22  8:40     ` Amir Goldstein
  1 sibling, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2018-11-22  8:40 UTC (permalink / raw)
  To: Jan Kara; +Cc: Matthew Bobrowski, linux-fsdevel

> > @@ -85,8 +108,8 @@ static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
> >  {
> >       struct inode *source = moved->d_inode;
> >       u32 fs_cookie = fsnotify_get_cookie();
> > -     __u32 old_dir_mask = (FS_EVENT_ON_CHILD | FS_MOVED_FROM);
> > -     __u32 new_dir_mask = (FS_EVENT_ON_CHILD | FS_MOVED_TO);
> > +     __u32 old_dir_mask = FS_MOVED_FROM;
> > +     __u32 new_dir_mask = FS_MOVED_TO;
>
> You can just inline these two masks now. There's no point for the variable
> anymore.
>

He? why not?
We still this this:

        if (old_dir == new_dir)
                old_dir_mask |= FS_DN_RENAME;

        if (isdir) {
                old_dir_mask |= FS_ISDIR;
                new_dir_mask |= FS_ISDIR;
        }

@isdir refers to the d_type of the modified dirent.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/5] fsnotify: annotate filename events
  2018-11-22  7:45           ` Amir Goldstein
@ 2018-11-22  9:33             ` Jan Kara
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Kara @ 2018-11-22  9:33 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel

On Thu 22-11-18 09:45:06, Amir Goldstein wrote:
> > > But we never generate FS_MODIFY|FS_ISDIR events so I don't think there's a
> > > big space for confusion (and I've deliberately used CHANGE instead of
> > > MODIFY to make the distinction even clearer). FWIW
> > > FANOTIFY_DIRENT_MODIFY_EVENTS would also look better than _FILENAME_EVENTS
> > > to me.
> > >
> >
> > Fair enough. I'll change to FANOTIFY_DIRENT_MODIFY_EVENTS
> > and similar named helpers and comments in fsnotify.h.
> >
> 
> On second thought, if you don't object, I will opt for brevity and use:
> FANOTIFY_DIRENT_EVENTS.
> I don't thing that dropping the _MODIFY_ /_change part of the name is
> going to be a source of ambiguity.
> 
> /* Notify this directory about a change in one of its directory entries. */
> static inline int __fsnotify_dirent(struct inode *dir, __u32 mask,
>                                     const unsigned char *file_name, u32 cookie)
> /*
>  * Notify this dentry's parent about a change in the directory entry
>  * associated with this dentry.
>  * Unlike fsnotify_parent(), the event will be reported regardless of the
>  * FS_EVENT_ON_CHILD mask on the parent inode
>  * Safe to call with negative dentry, e.g. from fsnotify_nameremove()
>  */
> static inline int fsnotify_dirent(struct dentry *dentry, __u32 mask)

Fine by me. Thanks!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-21 16:13         ` Amir Goldstein
@ 2018-11-22  9:52           ` Jan Kara
  2018-11-22 12:36             ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Kara @ 2018-11-22  9:52 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, Mark Fasheh,
	Chris Mason, Josef Bacik

On Wed 21-11-18 18:13:17, Amir Goldstein wrote:
> >
> > > 2. exportfs_encode_fh(dentry,... in fanotify_alloc_fid()
> > >     Can be converted to exportfs_encode_inode_fh(inode,...
> >
> > If you have the parent inode then yes. In lot of cases we do have it, not
> > sure if in all of them (but likely yes so that we can do proper
> > FS_EVENT_ON_CHILD) handling.
> >
> 
> We do not need the parent inode, because we are not encoding a
> "connectable" file handle. we need:
> exportfs_encode_inode_fh(inode, (struct fid *)fid->f_handle,
>                                   &dwords,  NULL);

Ah, you're right. I thought any handle has to be "connectable" as you say
but apparently open-by-handle is fine even with not connectable ones.
Thanks for enlightening me :)

> > > 3. statfs_by_dentry(dentry,... in fanotify_alloc_fid()
> > >     Can be converted to statfs_by_dentry(inode->i_sb->sb_root,...
> >
> > This might be problematic e.g. with btrfs subvolumes which have each
> > different fsid so the fsid-handle pair might be actually invalid or even
> > point to a file with different contents. Maybe we could just store the
> > fsid in the fsnotify_mark when it is created and use it when generating
> > events? That would also get rid of the overhead of calling statfs for each
> > generated event which I don't like...
> >
> 
> OK, I was not being accurate when I wrote that these are the only 3 places
> we use dentry. Those are the only 3 places in fanotify, but we also use
> dentry->d_sb in fsnotify() to get to the sb mark of course, so we will be using
> inode->i_sb instead.

Sure.

> w.r.t btrfs - btrfs has a single sb for multiple subvolumes so by
> definition the FAN_MARK_FILESYSTEM feature is only capable of watching
> ALL subvolumes.

Agreed but then if some event happens on inode A in subvolume S, you have
to be sure the handle+fsid you return will allow you to open the file with
that content and not inode A on subvolume T which has a different
contents...

> If we wanted to implement subvolume watch for btrfs, we would need
> to support attaching a mark to a btrfs subvolume struct (or fs_view [1]).

No, that's not what I meant.

> Basically, the purpose, of fid->fsid is:
> 1. A key for finding a mount point to use as mount_fd argument to
> open_by_handle_at()
> 2. Make fid unique across the system

Understood, that's what I thought.

> Since btrfs file handles encode the subvolume root object id in the file
> handle, fid->fsid is only needed to find a btrfs mount of the same sb
> (blockdev).

Ah good. I didn't realize btrfs will record subvolume root in the file
handle. On a second thought how else could NFS mount work, right, but it
didn't occur to me yesterday.

> Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with
> the single super block struct, so all dentries in all subvolumes will
> return the same fsid: btrfs_sb(dentry->d_sb)->fsid.

This is not true AFAICT. Looking at btrfs_statfs() I can see:

        buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
        buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
        /* Mask in the root object ID too, to disambiguate subvols */
        buf->f_fsid.val[0] ^=
                BTRFS_I(d_inode(dentry))->root->root_key.objectid >> 32;
        buf->f_fsid.val[1] ^=
                BTRFS_I(d_inode(dentry))->root->root_key.objectid;

So subvolume root is xored into the FSID. Thus dentries from different
subvolumes are going to return different fsids...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-22  9:52           ` Jan Kara
@ 2018-11-22 12:36             ` Amir Goldstein
  2018-11-22 13:26               ` Jan Kara
  0 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-22 12:36 UTC (permalink / raw)
  To: Jan Kara
  Cc: Matthew Bobrowski, linux-fsdevel, Mark Fasheh, Chris Mason, Josef Bacik

> > w.r.t btrfs - btrfs has a single sb for multiple subvolumes so by
> > definition the FAN_MARK_FILESYSTEM feature is only capable of watching
> > ALL subvolumes.
>
> Agreed but then if some event happens on inode A in subvolume S, you have
> to be sure the handle+fsid you return will allow you to open the file with
> that content and not inode A on subvolume T which has a different
> contents...

So as written below, the btrfs file handle is unique across all subvolumes.
fsid is therefore not needed to find the subvolume. It is only needed to find
the btrfs super block and listening on many block devices and/or many
filesystem types.

> > Basically, the purpose, of fid->fsid is:
> > 1. A key for finding a mount point to use as mount_fd argument to
> > open_by_handle_at()
> > 2. Make fid unique across the system
>
> Understood, that's what I thought.
>
> > Since btrfs file handles encode the subvolume root object id in the file
> > handle, fid->fsid is only needed to find a btrfs mount of the same sb
> > (blockdev).
>
> Ah good. I didn't realize btrfs will record subvolume root in the file
> handle. On a second thought how else could NFS mount work, right, but it
> didn't occur to me yesterday.
>
> > Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with
> > the single super block struct, so all dentries in all subvolumes will
> > return the same fsid: btrfs_sb(dentry->d_sb)->fsid.
>
> This is not true AFAICT. Looking at btrfs_statfs() I can see:
>
>         buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
>         buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
>         /* Mask in the root object ID too, to disambiguate subvols */
>         buf->f_fsid.val[0] ^=
>                 BTRFS_I(d_inode(dentry))->root->root_key.objectid >> 32;
>         buf->f_fsid.val[1] ^=
>                 BTRFS_I(d_inode(dentry))->root->root_key.objectid;
>
> So subvolume root is xored into the FSID. Thus dentries from different
> subvolumes are going to return different fsids...
>

Right... how could I have missed that :-/

I do not want to bring back FSNOTIFY_EVENT_DENTRY just for that
and I saw how many flaws you pointed to in $SUBJECT patch.
Instead, I will use:
statfs_by_dentry(d_find_any_alias(inode) ?: inode->i_sb->sb_root,...

So we are only left with the corner case of fsnotify_inoderemove()
on btrfs and the like. I checked that for all the rest of fsnotify hooks
an alias is guarantied to exist at the time of the hook.

I could go with either of two options:
1. No support for FAN_DELETE_SELF as my v2 patch set already does
2. Best effort support for FAN_DELETE_SELF - it works for non-btrfs
    and if application listens on a single filesystem and ignores fsid
    (document this culprit).

What do you think?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-22 12:36             ` Amir Goldstein
@ 2018-11-22 13:26               ` Jan Kara
  2018-11-22 15:18                 ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Kara @ 2018-11-22 13:26 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, Mark Fasheh,
	Chris Mason, Josef Bacik

On Thu 22-11-18 14:36:35, Amir Goldstein wrote:
> > > Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with
> > > the single super block struct, so all dentries in all subvolumes will
> > > return the same fsid: btrfs_sb(dentry->d_sb)->fsid.
> >
> > This is not true AFAICT. Looking at btrfs_statfs() I can see:
> >
> >         buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
> >         buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
> >         /* Mask in the root object ID too, to disambiguate subvols */
> >         buf->f_fsid.val[0] ^=
> >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid >> 32;
> >         buf->f_fsid.val[1] ^=
> >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid;
> >
> > So subvolume root is xored into the FSID. Thus dentries from different
> > subvolumes are going to return different fsids...
> >
> 
> Right... how could I have missed that :-/
> 
> I do not want to bring back FSNOTIFY_EVENT_DENTRY just for that
> and I saw how many flaws you pointed to in $SUBJECT patch.
> Instead, I will use:
> statfs_by_dentry(d_find_any_alias(inode) ?: inode->i_sb->sb_root,...

So what about my proposal to store fsid in the notification mark when it is
created and then use it when that mark results in event being generated?
When mark is created, we have full path available, so getting proper fsid
is trivial. Furthermore if the behavior is documented like that, it is
fairly easy for userspace to track fsids it should care about and translate
them to proper file descriptors for open_by_handle().

This would get rid of statfs() on every event creation (which I don't like
much) and also avoids these problems "how to get fsid for inode". What do
you think?

								Honza

> So we are only left with the corner case of fsnotify_inoderemove()
> on btrfs and the like. I checked that for all the rest of fsnotify hooks
> an alias is guarantied to exist at the time of the hook.
> 
> I could go with either of two options:
> 1. No support for FAN_DELETE_SELF as my v2 patch set already does
> 2. Best effort support for FAN_DELETE_SELF - it works for non-btrfs
>     and if application listens on a single filesystem and ignores fsid
>     (document this culprit).
> 
> What do you think?
> 
> Thanks,
> Amir.
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-22 13:26               ` Jan Kara
@ 2018-11-22 15:18                 ` Amir Goldstein
  2018-11-22 19:42                   ` Amir Goldstein
  2018-11-23 12:52                   ` Btrfs and fanotify filesystem watches Jan Kara
  0 siblings, 2 replies; 29+ messages in thread
From: Amir Goldstein @ 2018-11-22 15:18 UTC (permalink / raw)
  To: Jan Kara
  Cc: Matthew Bobrowski, linux-fsdevel, Mark Fasheh, Chris Mason, Josef Bacik

On Thu, Nov 22, 2018 at 3:26 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 22-11-18 14:36:35, Amir Goldstein wrote:
> > > > Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with
> > > > the single super block struct, so all dentries in all subvolumes will
> > > > return the same fsid: btrfs_sb(dentry->d_sb)->fsid.
> > >
> > > This is not true AFAICT. Looking at btrfs_statfs() I can see:
> > >
> > >         buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
> > >         buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
> > >         /* Mask in the root object ID too, to disambiguate subvols */
> > >         buf->f_fsid.val[0] ^=
> > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid >> 32;
> > >         buf->f_fsid.val[1] ^=
> > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid;
> > >
> > > So subvolume root is xored into the FSID. Thus dentries from different
> > > subvolumes are going to return different fsids...
> > >
> >
> > Right... how could I have missed that :-/
> >
> > I do not want to bring back FSNOTIFY_EVENT_DENTRY just for that
> > and I saw how many flaws you pointed to in $SUBJECT patch.
> > Instead, I will use:
> > statfs_by_dentry(d_find_any_alias(inode) ?: inode->i_sb->sb_root,...
>
> So what about my proposal to store fsid in the notification mark when it is
> created and then use it when that mark results in event being generated?
> When mark is created, we have full path available, so getting proper fsid
> is trivial. Furthermore if the behavior is documented like that, it is
> fairly easy for userspace to track fsids it should care about and translate
> them to proper file descriptors for open_by_handle().
>
> This would get rid of statfs() on every event creation (which I don't like
> much) and also avoids these problems "how to get fsid for inode". What do
> you think?
>

That's interesting. I like the simplicity.
What happens when there are 2 btrfs subvols /subvol1 /subvol2
and the application obviously doesn't know about this and does:
fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol1);
statfs("/subvol1",...);
fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol2);
statfs("/subvol2",...);

Now the second fanotify_mark() call just updates the existing super block
mark mask, but does not change the fsid on the mark, so all events will have
fsid of subvol1 that was stored when first creating the mark.

fanotify-watch application (for example) hashes the watches (paths) under
/subvol2 by fid with fsid of subvol2, so events cannot get matched back to
"watch" (i.e. path).

And when trying to open_by_handle fid with fhandle from /subvol2
using mount_fd of /subvol1, I suppose we can either get ESTALE
or a disconnected dentry, because object from /subvol2 cannot
have a path inside /subvol1....

Am I making the issue clear? or maybe I am missing something?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-22 15:18                 ` Amir Goldstein
@ 2018-11-22 19:42                   ` Amir Goldstein
  2018-11-23 12:56                     ` Jan Kara
  2018-11-23 12:52                   ` Btrfs and fanotify filesystem watches Jan Kara
  1 sibling, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-22 19:42 UTC (permalink / raw)
  To: Jan Kara
  Cc: Matthew Bobrowski, linux-fsdevel, Mark Fasheh, Chris Mason, Josef Bacik

On Thu, Nov 22, 2018 at 5:18 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Thu, Nov 22, 2018 at 3:26 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Thu 22-11-18 14:36:35, Amir Goldstein wrote:
> > > > > Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with
> > > > > the single super block struct, so all dentries in all subvolumes will
> > > > > return the same fsid: btrfs_sb(dentry->d_sb)->fsid.
> > > >
> > > > This is not true AFAICT. Looking at btrfs_statfs() I can see:
> > > >
> > > >         buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
> > > >         buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
> > > >         /* Mask in the root object ID too, to disambiguate subvols */
> > > >         buf->f_fsid.val[0] ^=
> > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid >> 32;
> > > >         buf->f_fsid.val[1] ^=
> > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid;
> > > >
> > > > So subvolume root is xored into the FSID. Thus dentries from different
> > > > subvolumes are going to return different fsids...
> > > >
> > >
> > > Right... how could I have missed that :-/
> > >
> > > I do not want to bring back FSNOTIFY_EVENT_DENTRY just for that
> > > and I saw how many flaws you pointed to in $SUBJECT patch.
> > > Instead, I will use:
> > > statfs_by_dentry(d_find_any_alias(inode) ?: inode->i_sb->sb_root,...
> >
> > So what about my proposal to store fsid in the notification mark when it is
> > created and then use it when that mark results in event being generated?
> > When mark is created, we have full path available, so getting proper fsid
> > is trivial. Furthermore if the behavior is documented like that, it is
> > fairly easy for userspace to track fsids it should care about and translate
> > them to proper file descriptors for open_by_handle().
> >
> > This would get rid of statfs() on every event creation (which I don't like
> > much) and also avoids these problems "how to get fsid for inode". What do
> > you think?
> >
>
> That's interesting. I like the simplicity.
> What happens when there are 2 btrfs subvols /subvol1 /subvol2
> and the application obviously doesn't know about this and does:
> fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol1);
> statfs("/subvol1",...);
> fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol2);
> statfs("/subvol2",...);
>
> Now the second fanotify_mark() call just updates the existing super block
> mark mask, but does not change the fsid on the mark, so all events will have
> fsid of subvol1 that was stored when first creating the mark.
>
> fanotify-watch application (for example) hashes the watches (paths) under
> /subvol2 by fid with fsid of subvol2, so events cannot get matched back to
> "watch" (i.e. path).
>
> And when trying to open_by_handle fid with fhandle from /subvol2
> using mount_fd of /subvol1, I suppose we can either get ESTALE
> or a disconnected dentry, because object from /subvol2 cannot
> have a path inside /subvol1....
>
> Am I making the issue clear? or maybe I am missing something?
>

How about this compromise:

- On fanotify_mark() call statfs() for both dentry and dentry->d_sb->sb_root
- If they produce the same fsid, cache the fsid in the mark
  If they do not match invalidate existing cache and never check again
- When encoding fid, use cached fsid if exists, otherwise fallback to
  statfs_by_dentry(find_any_alias(inode) ?: inode->i_sb->sb_root)

Maybe better to return -EXDEV from fanotify_mark() on mismatch of fsid
of dentry and dentry->d_sb->sb_root? because it doesn't look like the
FAN_MARK_FILESYSTEM is going to be very useful for btrfs (??).

Other ideas?

Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Btrfs and fanotify filesystem watches
  2018-11-22 15:18                 ` Amir Goldstein
  2018-11-22 19:42                   ` Amir Goldstein
@ 2018-11-23 12:52                   ` Jan Kara
  2018-11-23 13:34                     ` Amir Goldstein
  1 sibling, 1 reply; 29+ messages in thread
From: Jan Kara @ 2018-11-23 12:52 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, Mark Fasheh,
	Chris Mason, Josef Bacik, linux-btrfs

Changed subject to better match what we discuss and added btrfs list to CC.

On Thu 22-11-18 17:18:25, Amir Goldstein wrote:
> On Thu, Nov 22, 2018 at 3:26 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Thu 22-11-18 14:36:35, Amir Goldstein wrote:
> > > > > Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with
> > > > > the single super block struct, so all dentries in all subvolumes will
> > > > > return the same fsid: btrfs_sb(dentry->d_sb)->fsid.
> > > >
> > > > This is not true AFAICT. Looking at btrfs_statfs() I can see:
> > > >
> > > >         buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
> > > >         buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
> > > >         /* Mask in the root object ID too, to disambiguate subvols */
> > > >         buf->f_fsid.val[0] ^=
> > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid >> 32;
> > > >         buf->f_fsid.val[1] ^=
> > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid;
> > > >
> > > > So subvolume root is xored into the FSID. Thus dentries from different
> > > > subvolumes are going to return different fsids...
> > > >
> > >
> > > Right... how could I have missed that :-/
> > >
> > > I do not want to bring back FSNOTIFY_EVENT_DENTRY just for that
> > > and I saw how many flaws you pointed to in $SUBJECT patch.
> > > Instead, I will use:
> > > statfs_by_dentry(d_find_any_alias(inode) ?: inode->i_sb->sb_root,...
> >
> > So what about my proposal to store fsid in the notification mark when it is
> > created and then use it when that mark results in event being generated?
> > When mark is created, we have full path available, so getting proper fsid
> > is trivial. Furthermore if the behavior is documented like that, it is
> > fairly easy for userspace to track fsids it should care about and translate
> > them to proper file descriptors for open_by_handle().
> >
> > This would get rid of statfs() on every event creation (which I don't like
> > much) and also avoids these problems "how to get fsid for inode". What do
> > you think?
> >
> 
> That's interesting. I like the simplicity.
> What happens when there are 2 btrfs subvols /subvol1 /subvol2
> and the application obviously doesn't know about this and does:
> fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol1);
> statfs("/subvol1",...);
> fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol2);
> statfs("/subvol2",...);
> 
> Now the second fanotify_mark() call just updates the existing super block
> mark mask, but does not change the fsid on the mark, so all events will have
> fsid of subvol1 that was stored when first creating the mark.

Yes.

> fanotify-watch application (for example) hashes the watches (paths) under
> /subvol2 by fid with fsid of subvol2, so events cannot get matched back to
> "watch" (i.e. path).

I agree this can be confusing... but with btrfs fanotify-watch will be
confused even with your current code, won't it? Because FAN_MARK_FILESYSTEM
on /subvol1 (with fsid A) is going to return also events on inodes from
/subvol2 (with fsid B). So your current code will return event with fsid B
which fanotify-watch has no way to match back and can get confused.

So currently application can get events with fsid it has never seen, with
the code as I suggest it can get "wrong" fsid. That is confusing but still
somewhat better?

The core of the problem is that btrfs does not have "the superblock
identifier" that would correspond to FAN_MARK_FILESYSTEM scope of events
that we could use.

> And when trying to open_by_handle fid with fhandle from /subvol2
> using mount_fd of /subvol1, I suppose we can either get ESTALE
> or a disconnected dentry, because object from /subvol2 cannot
> have a path inside /subvol1....

So open_by_handle() should work fine even if we get mount_fd of /subvol1
and handle for inode on /subvol2. mount_fd in open_by_handle() is really
only used to get the superblock and that is the same.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-22 19:42                   ` Amir Goldstein
@ 2018-11-23 12:56                     ` Jan Kara
  2018-11-23 13:42                       ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Kara @ 2018-11-23 12:56 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, Mark Fasheh,
	Chris Mason, Josef Bacik

On Thu 22-11-18 21:42:45, Amir Goldstein wrote:
> On Thu, Nov 22, 2018 at 5:18 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Thu, Nov 22, 2018 at 3:26 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Thu 22-11-18 14:36:35, Amir Goldstein wrote:
> > > > > > Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with
> > > > > > the single super block struct, so all dentries in all subvolumes will
> > > > > > return the same fsid: btrfs_sb(dentry->d_sb)->fsid.
> > > > >
> > > > > This is not true AFAICT. Looking at btrfs_statfs() I can see:
> > > > >
> > > > >         buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
> > > > >         buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
> > > > >         /* Mask in the root object ID too, to disambiguate subvols */
> > > > >         buf->f_fsid.val[0] ^=
> > > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid >> 32;
> > > > >         buf->f_fsid.val[1] ^=
> > > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid;
> > > > >
> > > > > So subvolume root is xored into the FSID. Thus dentries from different
> > > > > subvolumes are going to return different fsids...
> > > > >
> > > >
> > > > Right... how could I have missed that :-/
> > > >
> > > > I do not want to bring back FSNOTIFY_EVENT_DENTRY just for that
> > > > and I saw how many flaws you pointed to in $SUBJECT patch.
> > > > Instead, I will use:
> > > > statfs_by_dentry(d_find_any_alias(inode) ?: inode->i_sb->sb_root,...
> > >
> > > So what about my proposal to store fsid in the notification mark when it is
> > > created and then use it when that mark results in event being generated?
> > > When mark is created, we have full path available, so getting proper fsid
> > > is trivial. Furthermore if the behavior is documented like that, it is
> > > fairly easy for userspace to track fsids it should care about and translate
> > > them to proper file descriptors for open_by_handle().
> > >
> > > This would get rid of statfs() on every event creation (which I don't like
> > > much) and also avoids these problems "how to get fsid for inode". What do
> > > you think?
> > >
> >
> > That's interesting. I like the simplicity.
> > What happens when there are 2 btrfs subvols /subvol1 /subvol2
> > and the application obviously doesn't know about this and does:
> > fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol1);
> > statfs("/subvol1",...);
> > fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol2);
> > statfs("/subvol2",...);
> >
> > Now the second fanotify_mark() call just updates the existing super block
> > mark mask, but does not change the fsid on the mark, so all events will have
> > fsid of subvol1 that was stored when first creating the mark.
> >
> > fanotify-watch application (for example) hashes the watches (paths) under
> > /subvol2 by fid with fsid of subvol2, so events cannot get matched back to
> > "watch" (i.e. path).
> >
> > And when trying to open_by_handle fid with fhandle from /subvol2
> > using mount_fd of /subvol1, I suppose we can either get ESTALE
> > or a disconnected dentry, because object from /subvol2 cannot
> > have a path inside /subvol1....
> >
> > Am I making the issue clear? or maybe I am missing something?
> >
> 
> How about this compromise:
> 
> - On fanotify_mark() call statfs() for both dentry and dentry->d_sb->sb_root
> - If they produce the same fsid, cache the fsid in the mark
>   If they do not match invalidate existing cache and never check again
> - When encoding fid, use cached fsid if exists, otherwise fallback to
>   statfs_by_dentry(find_any_alias(inode) ?: inode->i_sb->sb_root)

I don't think this is really better - see my previous email.

> Maybe better to return -EXDEV from fanotify_mark() on mismatch of fsid
> of dentry and dentry->d_sb->sb_root? because it doesn't look like the
> FAN_MARK_FILESYSTEM is going to be very useful for btrfs (??).

Well, I think FAN_MARK_FILESYSTEM is useful. Path events have no problem,
events using FID will work as well when using open_by_handle(), just the
reported fsid may be confusing... I'm undecided what is better: returning
EXDEV or just silently dropping the other fsid.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Btrfs and fanotify filesystem watches
  2018-11-23 12:52                   ` Btrfs and fanotify filesystem watches Jan Kara
@ 2018-11-23 13:34                     ` Amir Goldstein
  2018-11-23 17:53                       ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-23 13:34 UTC (permalink / raw)
  To: Jan Kara
  Cc: Matthew Bobrowski, linux-fsdevel, Mark Fasheh, Chris Mason,
	Josef Bacik, Linux Btrfs

On Fri, Nov 23, 2018 at 2:52 PM Jan Kara <jack@suse.cz> wrote:
>
> Changed subject to better match what we discuss and added btrfs list to CC.
>
> On Thu 22-11-18 17:18:25, Amir Goldstein wrote:
> > On Thu, Nov 22, 2018 at 3:26 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Thu 22-11-18 14:36:35, Amir Goldstein wrote:
> > > > > > Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with
> > > > > > the single super block struct, so all dentries in all subvolumes will
> > > > > > return the same fsid: btrfs_sb(dentry->d_sb)->fsid.
> > > > >
> > > > > This is not true AFAICT. Looking at btrfs_statfs() I can see:
> > > > >
> > > > >         buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
> > > > >         buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
> > > > >         /* Mask in the root object ID too, to disambiguate subvols */
> > > > >         buf->f_fsid.val[0] ^=
> > > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid >> 32;
> > > > >         buf->f_fsid.val[1] ^=
> > > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid;
> > > > >
> > > > > So subvolume root is xored into the FSID. Thus dentries from different
> > > > > subvolumes are going to return different fsids...
> > > > >
> > > >
> > > > Right... how could I have missed that :-/
> > > >
> > > > I do not want to bring back FSNOTIFY_EVENT_DENTRY just for that
> > > > and I saw how many flaws you pointed to in $SUBJECT patch.
> > > > Instead, I will use:
> > > > statfs_by_dentry(d_find_any_alias(inode) ?: inode->i_sb->sb_root,...
> > >
> > > So what about my proposal to store fsid in the notification mark when it is
> > > created and then use it when that mark results in event being generated?
> > > When mark is created, we have full path available, so getting proper fsid
> > > is trivial. Furthermore if the behavior is documented like that, it is
> > > fairly easy for userspace to track fsids it should care about and translate
> > > them to proper file descriptors for open_by_handle().
> > >
> > > This would get rid of statfs() on every event creation (which I don't like
> > > much) and also avoids these problems "how to get fsid for inode". What do
> > > you think?
> > >
> >
> > That's interesting. I like the simplicity.
> > What happens when there are 2 btrfs subvols /subvol1 /subvol2
> > and the application obviously doesn't know about this and does:
> > fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol1);
> > statfs("/subvol1",...);
> > fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol2);
> > statfs("/subvol2",...);
> >
> > Now the second fanotify_mark() call just updates the existing super block
> > mark mask, but does not change the fsid on the mark, so all events will have
> > fsid of subvol1 that was stored when first creating the mark.
>
> Yes.
>
> > fanotify-watch application (for example) hashes the watches (paths) under
> > /subvol2 by fid with fsid of subvol2, so events cannot get matched back to
> > "watch" (i.e. path).
>
> I agree this can be confusing... but with btrfs fanotify-watch will be
> confused even with your current code, won't it? Because FAN_MARK_FILESYSTEM
> on /subvol1 (with fsid A) is going to return also events on inodes from
> /subvol2 (with fsid B). So your current code will return event with fsid B
> which fanotify-watch has no way to match back and can get confused.
>
> So currently application can get events with fsid it has never seen, with
> the code as I suggest it can get "wrong" fsid. That is confusing but still
> somewhat better?

It's not better IMO because the purpose of the fid in event is a unique key
that application can use to match a path it already indexed.
wrong fsid => no match.
Using open_by_handle_at() to resolve unknown fid is a limited solution
and I don't think it is going to work in this case (see below).

>
> The core of the problem is that btrfs does not have "the superblock
> identifier" that would correspond to FAN_MARK_FILESYSTEM scope of events
> that we could use.
>
> > And when trying to open_by_handle fid with fhandle from /subvol2
> > using mount_fd of /subvol1, I suppose we can either get ESTALE
> > or a disconnected dentry, because object from /subvol2 cannot
> > have a path inside /subvol1....
>
> So open_by_handle() should work fine even if we get mount_fd of /subvol1
> and handle for inode on /subvol2. mount_fd in open_by_handle() is really
> only used to get the superblock and that is the same.

I don't think it will work fine.
do_handle_to_path() will compose a path from /subvol1 mnt and dentry from
/subvol2. This may resolve to a full path that does not really exist,
so application
cannot match it to anything that is can use name_to_handle_at() to identify.

The whole thing just sounds too messy. At the very least we need more time
to communicate with btrfs developers and figure this out, so I am going to
go with -EXDEV for any attempt to set *any* mark on a group with
FAN_REPORT_FID if fsid of fanotify_mark() path argument is different
from fsid of path->dentry->d_sb->s_root.

We can relax that later if we figure out a better way.

BTW, I am also going to go with -ENODEV for zero fsid (e.g. tmpfs).
tmpfs can be easily fixed to have non zero fsid if filesystem watch on tmpfs is
important.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available
  2018-11-23 12:56                     ` Jan Kara
@ 2018-11-23 13:42                       ` Amir Goldstein
  0 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2018-11-23 13:42 UTC (permalink / raw)
  To: Jan Kara
  Cc: Matthew Bobrowski, linux-fsdevel, Mark Fasheh, Chris Mason, Josef Bacik

> > How about this compromise:
> >
> > - On fanotify_mark() call statfs() for both dentry and dentry->d_sb->sb_root
> > - If they produce the same fsid, cache the fsid in the mark
> >   If they do not match invalidate existing cache and never check again
> > - When encoding fid, use cached fsid if exists, otherwise fallback to
> >   statfs_by_dentry(find_any_alias(inode) ?: inode->i_sb->sb_root)
>
> I don't think this is really better - see my previous email.

No I think it is horid, because results are unexpected and the same
type of event
on the same object can produce different fsid over time.

>
> > Maybe better to return -EXDEV from fanotify_mark() on mismatch of fsid
> > of dentry and dentry->d_sb->sb_root? because it doesn't look like the
> > FAN_MARK_FILESYSTEM is going to be very useful for btrfs (??).
>
> Well, I think FAN_MARK_FILESYSTEM is useful. Path events have no problem,

I wasn't indenting to deny btrfs of FAN_MARK_FILESYSTEM with event->fd.
Using a group with FAN_REPORT_FID will have limited support on btrfs
(only for setting marks on root volume?).

> events using FID will work as well when using open_by_handle(), just the
> reported fsid may be confusing... I'm undecided what is better: returning
> EXDEV or just silently dropping the other fsid.
>

As I wrote in other thread, I don't think open_by_handle() is really going to
work. you may get an open file, but with an invalid path that is not useful
for filesystem monitoring application.

So if you don't object, I'll go with EXDEV and we can see later about
relaxing it.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Btrfs and fanotify filesystem watches
  2018-11-23 13:34                     ` Amir Goldstein
@ 2018-11-23 17:53                       ` Amir Goldstein
  2018-11-27  8:35                         ` Jan Kara
  0 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2018-11-23 17:53 UTC (permalink / raw)
  To: Jan Kara
  Cc: Matthew Bobrowski, linux-fsdevel, Mark Fasheh, Chris Mason,
	Josef Bacik, Linux Btrfs

On Fri, Nov 23, 2018 at 3:34 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Nov 23, 2018 at 2:52 PM Jan Kara <jack@suse.cz> wrote:
> >
> > Changed subject to better match what we discuss and added btrfs list to CC.
> >
> > On Thu 22-11-18 17:18:25, Amir Goldstein wrote:
> > > On Thu, Nov 22, 2018 at 3:26 PM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Thu 22-11-18 14:36:35, Amir Goldstein wrote:
> > > > > > > Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with
> > > > > > > the single super block struct, so all dentries in all subvolumes will
> > > > > > > return the same fsid: btrfs_sb(dentry->d_sb)->fsid.
> > > > > >
> > > > > > This is not true AFAICT. Looking at btrfs_statfs() I can see:
> > > > > >
> > > > > >         buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
> > > > > >         buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
> > > > > >         /* Mask in the root object ID too, to disambiguate subvols */
> > > > > >         buf->f_fsid.val[0] ^=
> > > > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid >> 32;
> > > > > >         buf->f_fsid.val[1] ^=
> > > > > >                 BTRFS_I(d_inode(dentry))->root->root_key.objectid;
> > > > > >
> > > > > > So subvolume root is xored into the FSID. Thus dentries from different
> > > > > > subvolumes are going to return different fsids...
> > > > > >
> > > > >
> > > > > Right... how could I have missed that :-/
> > > > >
> > > > > I do not want to bring back FSNOTIFY_EVENT_DENTRY just for that
> > > > > and I saw how many flaws you pointed to in $SUBJECT patch.
> > > > > Instead, I will use:
> > > > > statfs_by_dentry(d_find_any_alias(inode) ?: inode->i_sb->sb_root,...
> > > >
> > > > So what about my proposal to store fsid in the notification mark when it is
> > > > created and then use it when that mark results in event being generated?
> > > > When mark is created, we have full path available, so getting proper fsid
> > > > is trivial. Furthermore if the behavior is documented like that, it is
> > > > fairly easy for userspace to track fsids it should care about and translate
> > > > them to proper file descriptors for open_by_handle().
> > > >
> > > > This would get rid of statfs() on every event creation (which I don't like
> > > > much) and also avoids these problems "how to get fsid for inode". What do
> > > > you think?
> > > >
> > >
> > > That's interesting. I like the simplicity.
> > > What happens when there are 2 btrfs subvols /subvol1 /subvol2
> > > and the application obviously doesn't know about this and does:
> > > fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol1);
> > > statfs("/subvol1",...);
> > > fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM, ... /subvol2);
> > > statfs("/subvol2",...);
> > >
> > > Now the second fanotify_mark() call just updates the existing super block
> > > mark mask, but does not change the fsid on the mark, so all events will have
> > > fsid of subvol1 that was stored when first creating the mark.
> >
> > Yes.
> >
> > > fanotify-watch application (for example) hashes the watches (paths) under
> > > /subvol2 by fid with fsid of subvol2, so events cannot get matched back to
> > > "watch" (i.e. path).
> >
> > I agree this can be confusing... but with btrfs fanotify-watch will be
> > confused even with your current code, won't it? Because FAN_MARK_FILESYSTEM
> > on /subvol1 (with fsid A) is going to return also events on inodes from
> > /subvol2 (with fsid B). So your current code will return event with fsid B
> > which fanotify-watch has no way to match back and can get confused.
> >
> > So currently application can get events with fsid it has never seen, with
> > the code as I suggest it can get "wrong" fsid. That is confusing but still
> > somewhat better?
>
> It's not better IMO because the purpose of the fid in event is a unique key
> that application can use to match a path it already indexed.
> wrong fsid => no match.
> Using open_by_handle_at() to resolve unknown fid is a limited solution
> and I don't think it is going to work in this case (see below).
>
> >
> > The core of the problem is that btrfs does not have "the superblock
> > identifier" that would correspond to FAN_MARK_FILESYSTEM scope of events
> > that we could use.
> >
> > > And when trying to open_by_handle fid with fhandle from /subvol2
> > > using mount_fd of /subvol1, I suppose we can either get ESTALE
> > > or a disconnected dentry, because object from /subvol2 cannot
> > > have a path inside /subvol1....
> >
> > So open_by_handle() should work fine even if we get mount_fd of /subvol1
> > and handle for inode on /subvol2. mount_fd in open_by_handle() is really
> > only used to get the superblock and that is the same.
>
> I don't think it will work fine.
> do_handle_to_path() will compose a path from /subvol1 mnt and dentry from
> /subvol2. This may resolve to a full path that does not really exist,
> so application
> cannot match it to anything that is can use name_to_handle_at() to identify.
>
> The whole thing just sounds too messy. At the very least we need more time
> to communicate with btrfs developers and figure this out, so I am going to
> go with -EXDEV for any attempt to set *any* mark on a group with
> FAN_REPORT_FID if fsid of fanotify_mark() path argument is different
> from fsid of path->dentry->d_sb->s_root.
>
> We can relax that later if we figure out a better way.
>
> BTW, I am also going to go with -ENODEV for zero fsid (e.g. tmpfs).
> tmpfs can be easily fixed to have non zero fsid if filesystem watch on tmpfs is
> important.
>

Well, this is interesting... I have implemented the -EXDEV logic and it
works as expected. I can set a filesystem global watch on the main
btrfs mount and not allowed to set a global watch on a subvolume.

The interesting part is that the global watch on the main btrfs volume
more useful than I though it would be. The file handles reported by the
main volume global watch are resolved to correct paths in the main volume.
I guess this is because a btrfs subvolume looks like a directory tree
in the global
namespace to vfs. See below.

So I will continue based on this working POC:
https://github.com/amir73il/linux/commits/fanotify_fid

Note that in the POC, fsid is cached in mark connector as you suggested.
It is still buggy, but my prototype always decodes file handles from the first
path argument given to the program, so it just goes to show that by getting
fsid of the main btrfs volume, the application will be able to properly decode
file handles and resolve correct paths.

The bottom line is that btrfs will have the full functionality of super block
monitoring with no ability to watch (or ignore) a single subvolume
(unless by using a mount mark).

Thanks,
Amir.

=============

root@kvm-xfstests:~# mount -t btrfs
/dev/vde on /vde type btrfs (rw,relatime,space_cache,subvolid=5,subvol=/)
/dev/vde on /mnt type btrfs
(rw,relatime,space_cache,subvolid=257,subvol=/subvol)

root@kvm-xfstests:~# /vtmp/inotifywait -m -e open,close -g /mnt
Setting up global filesystem watches.
Couldn't add global watch(es) /mnt...: Invalid cross-device link

root@kvm-xfstests:~# /vtmp/inotifywait -m -e open,close -g /vde &
Setting up global filesystem watches.
Watches established.

root@kvm-xfstests:~# mkdir -p /mnt/a/b/c/d/e/ && touch /mnt/a/b/c/d/e/x
Start watching /vde/subvol/a (fid=107...).
/vde/subvol/a OPEN
/vde/subvol/a CLOSE_NOWRITE,CLOSE
/vde/subvol/a CLOSE_NOWRITE,OPEN,CLOSE
Start watching /vde/subvol/a/b (fid=108...).
/vde/subvol/a/b CLOSE_NOWRITE,OPEN,CLOSE
Start watching /vde/subvol/a/b/c (fid=109...).
/vde/subvol/a/b/c CLOSE_NOWRITE,OPEN,CLOSE
Start watching /vde/subvol/a/b/c/d (fid=10a...).
/vde/subvol/a/b/c/d CLOSE_NOWRITE,OPEN,CLOSE
/vde/subvol/a/b CLOSE_NOWRITE,OPEN,CLOSE
/vde/subvol/a/b/c CLOSE_NOWRITE,OPEN,CLOSE
Start watching /vde/subvol/a/b/c/d/e/x (fid=10c...).
/vde/subvol/a/b/c/d/e/x CLOSE_WRITE,OPEN,CLOSE
/vde/subvol/a/b/c/d CLOSE_NOWRITE,OPEN,CLOSE
/vde/subvol/a/b/c/d/e/x CLOSE_NOWRITE,OPEN,CLOSE

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Btrfs and fanotify filesystem watches
  2018-11-23 17:53                       ` Amir Goldstein
@ 2018-11-27  8:35                         ` Jan Kara
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Kara @ 2018-11-27  8:35 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jan Kara, Matthew Bobrowski, linux-fsdevel, Mark Fasheh,
	Chris Mason, Josef Bacik, Linux Btrfs

On Fri 23-11-18 19:53:11, Amir Goldstein wrote:
> On Fri, Nov 23, 2018 at 3:34 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > So open_by_handle() should work fine even if we get mount_fd of /subvol1
> > > and handle for inode on /subvol2. mount_fd in open_by_handle() is really
> > > only used to get the superblock and that is the same.
> >
> > I don't think it will work fine.
> > do_handle_to_path() will compose a path from /subvol1 mnt and dentry from
> > /subvol2. This may resolve to a full path that does not really exist,
> > so application
> > cannot match it to anything that is can use name_to_handle_at() to identify.
> >
> > The whole thing just sounds too messy. At the very least we need more time
> > to communicate with btrfs developers and figure this out, so I am going to
> > go with -EXDEV for any attempt to set *any* mark on a group with
> > FAN_REPORT_FID if fsid of fanotify_mark() path argument is different
> > from fsid of path->dentry->d_sb->s_root.
> >
> > We can relax that later if we figure out a better way.
> >
> > BTW, I am also going to go with -ENODEV for zero fsid (e.g. tmpfs).
> > tmpfs can be easily fixed to have non zero fsid if filesystem watch on tmpfs is
> > important.
> >
> 
> Well, this is interesting... I have implemented the -EXDEV logic and it
> works as expected. I can set a filesystem global watch on the main
> btrfs mount and not allowed to set a global watch on a subvolume.
> 
> The interesting part is that the global watch on the main btrfs volume
> more useful than I though it would be. The file handles reported by the
> main volume global watch are resolved to correct paths in the main volume.
> I guess this is because a btrfs subvolume looks like a directory tree
> in the global
> namespace to vfs. See below.
> 
> So I will continue based on this working POC:
> https://github.com/amir73il/linux/commits/fanotify_fid
> 
> Note that in the POC, fsid is cached in mark connector as you suggested.
> It is still buggy, but my prototype always decodes file handles from the first
> path argument given to the program, so it just goes to show that by getting
> fsid of the main btrfs volume, the application will be able to properly decode
> file handles and resolve correct paths.
> 
> The bottom line is that btrfs will have the full functionality of super block
> monitoring with no ability to watch (or ignore) a single subvolume
> (unless by using a mount mark).

Sounds good. I'll check the new version of your series.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2018-11-27 19:32 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-14 17:43 [PATCH v2 0/5] fsnotify prep work for fanotify dentry events Amir Goldstein
2018-11-14 17:43 ` [PATCH v2 1/5] fsnotify: pass dentry instead of inode when available Amir Goldstein
2018-11-20 11:32   ` Jan Kara
2018-11-20 14:36     ` Amir Goldstein
2018-11-21 12:51       ` Jan Kara
2018-11-21 16:13         ` Amir Goldstein
2018-11-22  9:52           ` Jan Kara
2018-11-22 12:36             ` Amir Goldstein
2018-11-22 13:26               ` Jan Kara
2018-11-22 15:18                 ` Amir Goldstein
2018-11-22 19:42                   ` Amir Goldstein
2018-11-23 12:56                     ` Jan Kara
2018-11-23 13:42                       ` Amir Goldstein
2018-11-23 12:52                   ` Btrfs and fanotify filesystem watches Jan Kara
2018-11-23 13:34                     ` Amir Goldstein
2018-11-23 17:53                       ` Amir Goldstein
2018-11-27  8:35                         ` Jan Kara
2018-11-14 17:43 ` [PATCH v2 2/5] fsnotify: annotate filename events Amir Goldstein
2018-11-20 11:59   ` Jan Kara
2018-11-20 14:58     ` Amir Goldstein
2018-11-21 13:18       ` Jan Kara
2018-11-21 15:40         ` Amir Goldstein
2018-11-22  7:45           ` Amir Goldstein
2018-11-22  9:33             ` Jan Kara
2018-11-22  8:40     ` Amir Goldstein
2018-11-14 17:43 ` [PATCH v2 3/5] fsnotify: simplify API for " Amir Goldstein
2018-11-14 17:43 ` [PATCH v2 4/5] fsnotify: make MOVED_FROM a dentry event Amir Goldstein
2018-11-20 12:45   ` Jan Kara
2018-11-14 17:43 ` [PATCH v2 5/5] fsnotify: send event type FSNOTIFY_EVENT_DENTRY to super block mark Amir Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).