linux-fsdevel.vger.kernel.org archive mirror
* [RFC][PATCH 0/6] fanotify: super block root watch
@ 2017-03-13 13:20 Amir Goldstein
  2017-03-13 13:20 ` [RFC][PATCH 1/6] fanotify: add a " Amir Goldstein
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Amir Goldstein @ 2017-03-13 13:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: Eric Paris, Marko Rauhamaa, linux-fsdevel

This is the 4th and last part of the fanotify super block watch work.

I am posting this last part in preparation for the fsnotify discussion
at LSF/MM.

The issue that this work sets out to solve is the poor scalability of
recursive inotify watches.

Other operating systems have a scalable way of watching changes on
a large file system. Windows has USN Journal, macOS has FSEvents
and BSD has kevents.

Currently, the only way in Linux to monitor file system 'filename events'
(e.g. create/delete/move) is a recursive inotify watch, and
this method scales very poorly for large directory trees.
Besides the rapidly growing probability of needing a full scan, pinning
all directory inodes wastes a lot of memory.

The efforts to merge fanotify took several steps in the direction
of solving the scalability issue, but they did not go all the way
to provide the functionality required to replace inotify.

=== For Reviewers ===

I split the work into 4 parts, which are denoted by 4 branch heads,
for the convenience of reviewers:

1. A cleanup series with no functional or performance impact [1][5].

2. Adds the super block root watch functionality to fsnotify infrastructure,
   without adding user API and without adding support in any backend [2][6].

3. Adds the user API and functionality of reporting 'filename events'
   (e.g. create/delete/rename) to the fanotify backend [3][7].

4. [This posting] Adds the user API and functionality for reporting all events
   on a super block via the fanotify backend [4].

The fanotify super block watch feature is currently being tested
by my employer and by other interested parties as well.

Adding more tests and possibly writing a dedicated test suite has
been on my TODO list for a while. I have experimented with some
options, but do not have much to show for it yet.
In the meantime, I started an fsnotify-TODO wiki [8] and listed
some test requirements based on some earlier discussions with Jan.

Comments and thoughts are most welcome.

Thanks!
Amir.

[1] https://github.com/amir73il/linux/commits/fsnotify_dentry
[2] https://github.com/amir73il/linux/commits/fsnotify_sb
[3] https://github.com/amir73il/linux/commits/fanotify_dentry
[4] https://github.com/amir73il/linux/commits/fanotify_sb
[5] http://marc.info/?l=linux-fsdevel&m=148198446127338&w=2
[6] http://marc.info/?l=linux-fsdevel&m=148224725519139&w=2
[7] http://marc.info/?l=linux-fsdevel&m=147612772031987&w=2
[8] https://github.com/amir73il/fsnotify-utils/wiki/fsnotify-TODO

Amir Goldstein (6):
  fanotify: add a super block root watch
  fanotify: report events to sb root with fanotify_file_event_info
  fanotify: pass file handle on sb root watcher events
  fanotify: report file name to root inode watch with FS_EVENT_ON_CHILD
  fanotify: export FAN_ONDIR to user
  fanotify: filter events by root mark mount point

 fs/notify/fanotify/fanotify.c      |  83 ++++++++++++++++++++++++------
 fs/notify/fanotify/fanotify.h      |  40 +++++++++++++--
 fs/notify/fanotify/fanotify_user.c | 102 ++++++++++++++++++++++++++-----------
 fs/notify/fsnotify.c               |  13 +++--
 include/linux/fsnotify_backend.h   |  13 +++++
 include/uapi/linux/fanotify.h      |   7 ++-
 6 files changed, 205 insertions(+), 53 deletions(-)

-- 
2.7.4

* [RFC][PATCH 1/6] fanotify: add a super block root watch
  2017-03-13 13:20 [RFC][PATCH 0/6] fanotify: super block root watch Amir Goldstein
@ 2017-03-13 13:20 ` Amir Goldstein
  2017-03-13 13:20 ` [RFC][PATCH 2/6] fanotify: report events to sb root with fanotify_file_event_info Amir Goldstein
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Amir Goldstein @ 2017-03-13 13:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: Eric Paris, Marko Rauhamaa, linux-fsdevel

Set a watch on the super block's root inode including the
mask bit FAN_EVENT_ON_SB to get notified on the events
of all the inodes on the same super block.

When requesting to add a super block root watch, any
file on the file system can be passed as input argument
for fanotify_mark(). The mark will be added to the root
inode of the file system that file belongs to.

The super block watch cannot be set with FAN_MARK_MOUNT flag,
because a super block watch is an inode mark, not a mount mark.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
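Note for readers, not part of the commit: the two argument checks that
this patch touches in fanotify_mark() can be modeled in userspace.
What follows is a minimal sketch under stated assumptions, not the
kernel code. FAN_EVENT_ON_SB and FAN_EVENT_ON_DESCENDANT use the values
from the uapi hunk below and FAN_MARK_MOUNT uses the existing uapi
value. MODEL_ALL_EVENTS and model_check_mark_args() are hypothetical
names invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Values taken from this patch's uapi hunk. */
#define FAN_EVENT_ON_SB		0x01000000u
#define FAN_EVENT_ON_CHILD	0x08000000u
#define FAN_EVENT_ON_DESCENDANT	(FAN_EVENT_ON_CHILD | FAN_EVENT_ON_SB)

/* Existing uapi value. */
#define FAN_MARK_MOUNT		0x00000010u

/*
 * Hypothetical stand-in for the event bits accepted in a mark mask
 * (access, modify, close_write, close_nowrite, open). The real set in
 * this series may be larger.
 */
#define MODEL_ALL_EVENTS	0x0000003bu

/*
 * Hypothetical userspace model of the checks, not kernel code.
 * Returns 0 if (mask, flags) would pass them, -EINVAL otherwise.
 */
int model_check_mark_args(uint32_t mask, unsigned int flags)
{
	/* Bits outside the accepted set are rejected. */
	if (mask & ~(MODEL_ALL_EVENTS | FAN_EVENT_ON_DESCENDANT))
		return -EINVAL;

	/* A super block root watch is an inode mark, not a mount mark. */
	if ((mask & FAN_EVENT_ON_SB) && (flags & FAN_MARK_MOUNT))
		return -EINVAL;

	return 0;
}
```

In this model, a mask of FAN_EVENT_ON_SB plus a modify bit passes with
no flags and fails once FAN_MARK_MOUNT is added.
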
 fs/notify/fanotify/fanotify.c      |  1 +
 fs/notify/fanotify/fanotify_user.c | 15 ++++++++++++---
 include/uapi/linux/fanotify.h      |  3 +++
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index a647e7b..67feeb6 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -237,6 +237,7 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 	BUILD_BUG_ON(FAN_DELETE_SELF != FS_DELETE_SELF);
 	BUILD_BUG_ON(FAN_MOVE_SELF != FS_MOVE_SELF);
 	BUILD_BUG_ON(FAN_EVENT_ON_CHILD != FS_EVENT_ON_CHILD);
+	BUILD_BUG_ON(FAN_EVENT_ON_SB != FS_EVENT_ON_SB);
 	BUILD_BUG_ON(FAN_Q_OVERFLOW != FS_Q_OVERFLOW);
 	BUILD_BUG_ON(FAN_OPEN_PERM != FS_OPEN_PERM);
 	BUILD_BUG_ON(FAN_ACCESS_PERM != FS_ACCESS_PERM);
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 13015ee..e57c82a 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -902,9 +902,9 @@ SYSCALL_DEFINE5(fanotify_mark, int, fanotify_fd, unsigned int, flags,
 	}
 
 #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
-	if (mask & ~(FAN_ALL_EVENTS | FAN_ALL_PERM_EVENTS | FAN_EVENT_ON_CHILD))
+	if (mask & ~(FAN_ALL_EVENTS | FAN_ALL_PERM_EVENTS | FAN_EVENT_ON_DESCENDANT))
 #else
-	if (mask & ~(FAN_ALL_EVENTS | FAN_EVENT_ON_CHILD))
+	if (mask & ~(FAN_ALL_EVENTS | FAN_EVENT_ON_DESCENDANT))
 #endif
 		return -EINVAL;
 
@@ -927,6 +927,12 @@ SYSCALL_DEFINE5(fanotify_mark, int, fanotify_fd, unsigned int, flags,
 	    group->priority == FS_PRIO_0)
 		goto fput_and_out;
 
+	/* Super block root watch is not a mount watch */
+	ret = -EINVAL;
+	if ((mask & FAN_EVENT_ON_SB) &&
+	    (flags & FAN_MARK_MOUNT))
+		goto fput_and_out;
+
 	if (flags & FAN_MARK_FLUSH) {
 		ret = 0;
 		if (flags & FAN_MARK_MOUNT)
@@ -941,7 +947,9 @@ SYSCALL_DEFINE5(fanotify_mark, int, fanotify_fd, unsigned int, flags,
 		goto fput_and_out;
 
 	/* inode held in place by reference to path; group by fget on fd */
-	if (!(flags & FAN_MARK_MOUNT))
+	if (mask & FAN_EVENT_ON_SB)
+		inode = path.dentry->d_sb->s_root->d_inode;
+	else if (!(flags & FAN_MARK_MOUNT))
 		inode = path.dentry->d_inode;
 
 	/*
@@ -950,6 +958,7 @@ SYSCALL_DEFINE5(fanotify_mark, int, fanotify_fd, unsigned int, flags,
 	 * even if the events happened on another mount point.
 	 */
 	if ((flags & FAN_MARK_MOUNT) ||
+	    (mask & FAN_EVENT_ON_SB) ||
 	    group->fanotify_data.flags & FAN_EVENT_INFO_PARENT)
 		mnt = path.mnt;
 
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index 95b8335..b202e09 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -25,8 +25,11 @@
 
 #define FAN_ONDIR		0x40000000	/* event occurred against dir */
 
+#define FAN_EVENT_ON_SB		0x01000000	/* interested in all sb inodes */
 #define FAN_EVENT_ON_CHILD	0x08000000	/* interested in child events */
 
+#define FAN_EVENT_ON_DESCENDANT	(FAN_EVENT_ON_CHILD | FAN_EVENT_ON_SB)
+
 /* helper events */
 #define FAN_CLOSE		(FAN_CLOSE_WRITE | FAN_CLOSE_NOWRITE) /* close */
 #define FAN_MOVE		(FAN_MOVED_FROM | FAN_MOVED_TO) /* moves */
-- 
2.7.4

* [RFC][PATCH 2/6] fanotify: report events to sb root with fanotify_file_event_info
  2017-03-13 13:20 [RFC][PATCH 0/6] fanotify: super block root watch Amir Goldstein
  2017-03-13 13:20 ` [RFC][PATCH 1/6] fanotify: add a " Amir Goldstein
@ 2017-03-13 13:20 ` Amir Goldstein
  2017-03-13 13:20 ` [RFC][PATCH 3/6] fanotify: pass file handle on sb root watcher events Amir Goldstein
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Amir Goldstein @ 2017-03-13 13:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: Eric Paris, Marko Rauhamaa, linux-fsdevel

Allocate a variable length fanotify_file_event_info struct to
report events when the FAN_EVENT_ON_SB mask bit is set on the event.

This is needed to provide extra info on the event for the
super block watcher.

An exception is made for permission events, which use the
fanotify_perm_event_info struct even when reporting to a
super block watcher.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
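Note for readers, not part of the commit: the struct selection rule
described above can be summarized as a small classifier. This is a
sketch under assumptions, not the kernel code. It assumes
CONFIG_FANOTIFY_ACCESS_PERMISSIONS is enabled. MODEL_FILENAME_EVENTS is
a guess at FAN_FILENAME_EVENTS (create, delete, moved_from, moved_to),
which is defined in part 3 of this work and not shown in this thread.
The enum and model_classify_event() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define FAN_EVENT_ON_SB		0x01000000u	/* from patch 1/6 */
#define FAN_ALL_PERM_EVENTS	0x00030000u	/* open_perm | access_perm */

/* Assumed value: moved_from | moved_to | create | delete. */
#define MODEL_FILENAME_EVENTS	0x000003c0u

/* Which event struct the model expects to back an event. */
enum model_event_kind {
	MODEL_PLAIN,	/* struct fanotify_event_info */
	MODEL_PERM,	/* struct fanotify_perm_event_info */
	MODEL_FILE,	/* struct fanotify_file_event_info */
};

/* Hypothetical mirror of FANOTIFY_IS_PE() followed by FANOTIFY_IS_FE(). */
enum model_event_kind model_classify_event(uint32_t mask)
{
	/* Permission events keep their own struct, even for sb watchers. */
	if (mask & FAN_ALL_PERM_EVENTS)
		return MODEL_PERM;

	/* Filename events and sb root events may carry file info. */
	if (mask & (MODEL_FILENAME_EVENTS | FAN_EVENT_ON_SB))
		return MODEL_FILE;

	return MODEL_PLAIN;
}
```

The permission check comes first so that an event with both a
permission bit and FAN_EVENT_ON_SB is never cast to the file event
struct.
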
 fs/notify/fanotify/fanotify.c      | 19 +++++++++++++------
 fs/notify/fanotify/fanotify.h      | 21 +++++++++++++++++++++
 fs/notify/fanotify/fanotify_user.c |  8 ++++----
 3 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 67feeb6..8a88d88 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -41,7 +41,7 @@ static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
 	 * the event structure we have created in fanotify_handle_event() is the
 	 * one we should check for permission response.
 	 */
-	if (event->mask & FAN_ALL_PERM_EVENTS)
+	if (FANOTIFY_IS_PE(event))
 		return 0;
 #endif
 
@@ -169,11 +169,18 @@ struct fanotify_event_info *fanotify_alloc_event(struct inode *inode, u32 mask,
 #endif
 
 	/*
-	 * For filename events (create,delete,rename), path points to the
+	 * For filename events (create,delete,move), path points to the
 	 * directory and name holds the entry name, so allocate a variable
 	 * length fanotify_file_event_info struct.
+	 *
+	 * When non permission events are reported on super block root watch,
+	 * they may need to carry extra file information. So always allocate
+	 * fanotify_file_event_info struct for those events, even if data len
+	 * ends up being 0.
+	 * This makes it easier to know when an event struct should be cast
+	 * to FANOTIFY_FE(), e.g. in fanotify_free_event().
 	 */
-	if (mask & FAN_FILENAME_EVENTS) {
+	if (mask & (FAN_FILENAME_EVENTS | FAN_EVENT_ON_SB)) {
 		struct fanotify_file_event_info *ffe;
 		int alloc_len = sizeof(*ffe);
 		int len = 0;
@@ -276,7 +283,7 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 	}
 
 #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
-	if (mask & FAN_ALL_PERM_EVENTS) {
+	if (FANOTIFY_IS_PE(fsn_event)) {
 		ret = fanotify_get_response(group, FANOTIFY_PE(fsn_event));
 		fsnotify_destroy_event(group, fsn_event);
 	}
@@ -301,13 +308,13 @@ static void fanotify_free_event(struct fsnotify_event *fsn_event)
 	path_put(&event->path);
 	put_pid(event->tgid);
 #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
-	if (fsn_event->mask & FAN_ALL_PERM_EVENTS) {
+	if (FANOTIFY_IS_PE(fsn_event)) {
 		kmem_cache_free(fanotify_perm_event_cachep,
 				FANOTIFY_PE(fsn_event));
 		return;
 	}
 #endif
-	if (fsn_event->mask & FAN_FILENAME_EVENTS) {
+	if (FANOTIFY_IS_FE(fsn_event)) {
 		kfree(FANOTIFY_FE(fsn_event));
 		return;
 	}
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index d0bb7acb4..d23b35b 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -54,8 +54,29 @@ FANOTIFY_PE(struct fsnotify_event *fse)
 {
 	return container_of(fse, struct fanotify_perm_event_info, fae.fse);
 }
+
+/* Should use fanotify_perm_event_info for this event? */
+static inline bool FANOTIFY_IS_PE(struct fsnotify_event *fse)
+{
+	return fse->mask & FAN_ALL_PERM_EVENTS;
+}
+#else
+static inline bool FANOTIFY_IS_PE(struct fsnotify_event *fse)
+{
+	return false;
+}
 #endif
 
+/* Should use fanotify_file_event_info for this event? */
+static inline bool FANOTIFY_IS_FE(struct fsnotify_event *fse)
+{
+	if (FANOTIFY_IS_PE(fse))
+		return false;
+
+	/* Non permission events reported on root may carry file info */
+	return fse->mask & (FAN_FILENAME_EVENTS | FAN_EVENT_ON_SB);
+}
+
 static inline struct fanotify_file_event_info *
 FANOTIFY_FE(struct fsnotify_event *fse)
 {
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index e57c82a..191bd7d 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -229,7 +229,7 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	if (ret < 0)
 		return ret;
 
-	if ((event->mask & FAN_FILENAME_EVENTS) &&
+	if (FANOTIFY_IS_FE(event) &&
 	    (group->fanotify_data.flags & FAN_EVENT_INFO_NAME)) {
 		ffe = FANOTIFY_FE(event);
 		pad_name_len = round_event_name_len(ffe);
@@ -261,7 +261,7 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	}
 
 #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
-	if (event->mask & FAN_ALL_PERM_EVENTS)
+	if (FANOTIFY_IS_PE(event))
 		FANOTIFY_PE(event)->fd = fd;
 #endif
 
@@ -338,7 +338,7 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,
 		 * Permission events get queued to wait for response.  Other
 		 * events can be destroyed now.
 		 */
-		if (!(kevent->mask & FAN_ALL_PERM_EVENTS)) {
+		if (!FANOTIFY_IS_PE(kevent)) {
 			fsnotify_destroy_event(group, kevent);
 			if (ret < 0)
 				break;
@@ -428,7 +428,7 @@ static int fanotify_release(struct inode *ignored, struct file *file)
 	 */
 	while (!fsnotify_notify_queue_is_empty(group)) {
 		fsn_event = fsnotify_remove_first_event(group);
-		if (!(fsn_event->mask & FAN_ALL_PERM_EVENTS)) {
+		if (!FANOTIFY_IS_PE(fsn_event)) {
 			spin_unlock(&group->notification_lock);
 			fsnotify_destroy_event(group, fsn_event);
 			spin_lock(&group->notification_lock);
-- 
2.7.4

* [RFC][PATCH 3/6] fanotify: pass file handle on sb root watcher events
  2017-03-13 13:20 [RFC][PATCH 0/6] fanotify: super block root watch Amir Goldstein
  2017-03-13 13:20 ` [RFC][PATCH 1/6] fanotify: add a " Amir Goldstein
  2017-03-13 13:20 ` [RFC][PATCH 2/6] fanotify: report events to sb root with fanotify_file_event_info Amir Goldstein
@ 2017-03-13 13:20 ` Amir Goldstein
  2017-03-13 13:20 ` [RFC][PATCH 4/6] fanotify: report file name to root inode watch with FS_EVENT_ON_CHILD Amir Goldstein
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Amir Goldstein @ 2017-03-13 13:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: Eric Paris, Marko Rauhamaa, linux-fsdevel

When the user requests the FAN_EVENT_INFO_FH flag in fanotify_init(),
the data of sb root watcher events will start with the reported object's
file handle, followed by an optional filename for filename events.

The file handle can be used as a unique object identifier of the affected
inode.
It also makes holding the dentry reference and passing a file descriptor
to the user redundant, so they may be made optional later on.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
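Note for readers, not part of the commit: the size of the variable
length data that follows the event metadata can be worked out by hand.
The sketch below mirrors the arithmetic of round_event_data_len() from
the diff in userspace. It assumes FAN_EVENT_METADATA_LEN is 24 bytes
and that the fixed part of struct file_handle is 8 bytes.
MODEL_METADATA_LEN, MODEL_FH_HEADER_LEN and model_event_data_len() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizeof(struct fanotify_event_metadata). */
#define MODEL_METADATA_LEN	24
/* Assumed sizeof(struct file_handle) without the f_handle bytes. */
#define MODEL_FH_HEADER_LEN	8

/*
 * Hypothetical userspace mirror of round_event_data_len():
 * optional file handle (header plus handle bytes), optional name plus
 * its NUL terminator, rounded up to a multiple of the metadata length.
 */
size_t model_event_data_len(size_t name_len, size_t handle_bytes)
{
	size_t data_len = 0;

	if (name_len)
		data_len += name_len + 1;
	if (handle_bytes)
		data_len += MODEL_FH_HEADER_LEN + handle_bytes;

	/* Round up; an empty payload stays 0. */
	return (data_len + MODEL_METADATA_LEN - 1) /
		MODEL_METADATA_LEN * MODEL_METADATA_LEN;
}
```

For example, a 5 byte name with an 8 byte handle needs 6 + 16 = 22
bytes and is padded to 24, while a 20 byte name with a 24 byte handle
needs 21 + 32 = 53 bytes and is padded to 72.
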
 fs/notify/fanotify/fanotify.c      | 40 ++++++++++++++++-----
 fs/notify/fanotify/fanotify.h      | 19 ++++++++--
 fs/notify/fanotify/fanotify_user.c | 72 +++++++++++++++++++++++++++-----------
 include/uapi/linux/fanotify.h      |  4 ++-
 4 files changed, 102 insertions(+), 33 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 8a88d88..ca9894c 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -148,7 +148,8 @@ static bool fanotify_should_send_event(struct fsnotify_mark *inode_mark,
 	return true;
 }
 
-struct fanotify_event_info *fanotify_alloc_event(struct inode *inode, u32 mask,
+struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group,
+						 struct inode *inode, u32 mask,
 						 const struct path *path,
 						 const char *file_name)
 {
@@ -183,19 +184,42 @@ struct fanotify_event_info *fanotify_alloc_event(struct inode *inode, u32 mask,
 	if (mask & (FAN_FILENAME_EVENTS | FAN_EVENT_ON_SB)) {
 		struct fanotify_file_event_info *ffe;
 		int alloc_len = sizeof(*ffe);
-		int len = 0;
+		int name_len = 0;
 
-		if ((mask & FAN_FILENAME_EVENTS) && file_name) {
-			len = strlen(file_name);
-			alloc_len += len + 1;
+		if ((mask & FAN_FILENAME_EVENTS) && file_name &&
+		    (group->fanotify_data.flags & FAN_EVENT_INFO_NAME)) {
+			name_len = strlen(file_name);
+			alloc_len += name_len + 1;
 		}
 		ffe = kmalloc(alloc_len, GFP_KERNEL);
 		if (!ffe)
 			return NULL;
 		event = &ffe->fae;
-		ffe->name_len = len;
-		if (len)
+		ffe->name_len = name_len;
+		if (name_len)
 			strcpy(ffe->name, file_name);
+
+		ffe->fh.handle_type = FILEID_INVALID;
+		ffe->fh.handle_bytes = 0;
+		if ((mask & FAN_EVENT_ON_SB) &&
+		    (group->fanotify_data.flags & FAN_EVENT_INFO_FH)) {
+			/*
+			 * Encode only parent (dentry) for filename events
+			 * and both parent and child for other events.
+			 * ffe->fid is big enough to encode xfs type 0x82:
+			 * 64bit parent+child inodes and 32bit generations
+			 */
+			int handle_dwords = sizeof(ffe->fid) >> 2;
+			int type = exportfs_encode_fh(path->dentry,
+					(struct fid *)&ffe->fid,
+					&handle_dwords,
+					!(mask & FAN_FILENAME_EVENTS));
+
+			if (type > 0 && type < FILEID_INVALID) {
+				ffe->fh.handle_type = type;
+				ffe->fh.handle_bytes = handle_dwords << 2;
+			}
+		}
 		goto init;
 	}
 
@@ -267,7 +291,7 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 	pr_debug("%s: group=%p inode=%p mask=%x\n", __func__, group, inode,
 		 mask);
 
-	event = fanotify_alloc_event(inode, mask, &path, file_name);
+	event = fanotify_alloc_event(group, inode, mask, &path, file_name);
 	if (unlikely(!event))
 		return -ENOMEM;
 
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index d23b35b..cbe8b2a 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -1,6 +1,7 @@
 #include <linux/fsnotify_backend.h>
 #include <linux/path.h>
 #include <linux/slab.h>
+#include <linux/exportfs.h>
 
 extern struct kmem_cache *fanotify_event_cachep;
 extern struct kmem_cache *fanotify_perm_event_cachep;
@@ -20,6 +21,13 @@ struct fanotify_event_info {
 	struct pid *tgid;
 };
 
+struct fanotify_fid64 {
+	u64 ino;
+	u32 gen;
+	u64 parent_ino;
+	u32 parent_gen;
+} __attribute__((packed));
+
 /*
  * Structure for fanotify events with variable length data.
  * It gets allocated in fanotify_handle_event() and freed
@@ -28,11 +36,16 @@ struct fanotify_event_info {
 struct fanotify_file_event_info {
 	struct fanotify_event_info fae;
 	/*
+	 * For events reported to sb root record the file handle
+	 */
+	struct file_handle fh;
+	struct fanotify_fid64 fid;	/* make this allocated? */
+	/*
 	 * For filename events (create,delete,rename), path points to the
 	 * directory and name holds the entry name
 	 */
 	int name_len;
-	char name[];
+	char name[];	/* make this allocated? */
 };
 
 #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
@@ -88,7 +101,7 @@ static inline struct fanotify_event_info *FANOTIFY_E(struct fsnotify_event *fse)
 	return container_of(fse, struct fanotify_event_info, fse);
 }
 
-struct fanotify_event_info *fanotify_alloc_event(struct inode *inode, u32 mask,
+struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group,
+						 struct inode *inode, u32 mask,
 						 const struct path *path,
-						 struct path *path,
 						 const char *file_name);
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 191bd7d..adef7b0 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -45,6 +45,20 @@ static struct kmem_cache *fanotify_mark_cache __read_mostly;
 struct kmem_cache *fanotify_event_cachep __read_mostly;
 struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
 
+static int round_event_data_len(struct fanotify_file_event_info *event)
+{
+	int data_len = 0;
+
+	if (!event->name_len && !event->fh.handle_bytes)
+		return 0;
+
+	if (event->name_len)
+		data_len += event->name_len + 1;
+	if (event->fh.handle_bytes)
+		data_len +=  sizeof(event->fh) + event->fh.handle_bytes;
+	return roundup(data_len, FAN_EVENT_METADATA_LEN);
+}
+
 /*
  * Get an fsnotify notification event if one exists and is small
  * enough to fit in "count". Return an error pointer if the count
@@ -55,14 +69,23 @@ struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
 static struct fsnotify_event *get_one_event(struct fsnotify_group *group,
 					    size_t count)
 {
-	assert_spin_locked(&group->notification_lock);
+	size_t event_size = FAN_EVENT_METADATA_LEN;
+	struct fsnotify_event *event;
 
-	pr_debug("%s: group=%p count=%zd\n", __func__, group, count);
+	assert_spin_locked(&group->notification_lock);
 
 	if (fsnotify_notify_queue_is_empty(group))
 		return NULL;
 
-	if (FAN_EVENT_METADATA_LEN > count)
+	event = fsnotify_peek_first_event(group);
+
+	pr_debug("%s: group=%p event=%p count=%zd\n", __func__,
+		 group, event, count);
+
+	if (FANOTIFY_IS_FE(event))
+		event_size += round_event_data_len(FANOTIFY_FE(event));
+
+	if (event_size > count)
 		return ERR_PTR(-EINVAL);
 
 	/* held the notification_lock the whole time, so this is the
@@ -206,13 +229,6 @@ static int process_access_response(struct fsnotify_group *group,
 }
 #endif
 
-static int round_event_name_len(struct fanotify_file_event_info *event)
-{
-	if (!event->name_len)
-		return 0;
-	return roundup(event->name_len + 1, FAN_EVENT_METADATA_LEN);
-}
-
 static ssize_t copy_event_to_user(struct fsnotify_group *group,
 				  struct fsnotify_event *event,
 				  char __user *buf)
@@ -221,7 +237,7 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	struct fanotify_file_event_info *ffe = NULL;
 	struct file *f;
 	int fd, ret;
-	size_t pad_name_len = 0;
+	size_t pad_data_len = 0;
 
 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
 
@@ -230,10 +246,11 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 		return ret;
 
 	if (FANOTIFY_IS_FE(event) &&
-	    (group->fanotify_data.flags & FAN_EVENT_INFO_NAME)) {
+	    (group->fanotify_data.flags &
+	     (FAN_EVENT_INFO_NAME | FAN_EVENT_INFO_FH))) {
 		ffe = FANOTIFY_FE(event);
-		pad_name_len = round_event_name_len(ffe);
-		fanotify_event_metadata.event_len += pad_name_len;
+		pad_data_len = round_event_data_len(ffe);
+		fanotify_event_metadata.event_len += pad_data_len;
 	}
 
 	fd = fanotify_event_metadata.fd;
@@ -249,14 +266,27 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	 * with zeros.
 	 */
 	ret = -EFAULT;
-	if (ffe && pad_name_len) {
-		/* copy the filename */
-		if (copy_to_user(buf, ffe->name, ffe->name_len))
-			goto out_close_fd;
-		buf += ffe->name_len;
+	if (ffe && pad_data_len) {
+		if (ffe->fh.handle_bytes) {
+			int fh_len = sizeof(ffe->fh) + ffe->fh.handle_bytes;
+
+			/* copy the file handle (bytes,type,fid) */
+			if (copy_to_user(buf, &ffe->fh, fh_len))
+				goto out_close_fd;
+			buf += fh_len;
+			pad_data_len -= fh_len;
+		}
+
+		if (ffe->name_len) {
+			/* copy the filename */
+			if (copy_to_user(buf, ffe->name, ffe->name_len))
+				goto out_close_fd;
+			buf += ffe->name_len;
+			pad_data_len -= ffe->name_len;
+		}
 
 		/* fill userspace with 0's */
-		if (clear_user(buf, pad_name_len - ffe->name_len))
+		if (clear_user(buf, pad_data_len))
 			goto out_close_fd;
 	}
 
@@ -803,7 +833,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	group->fanotify_data.user = user;
 	atomic_inc(&user->fanotify_listeners);
 
-	oevent = fanotify_alloc_event(NULL, FS_Q_OVERFLOW, NULL, NULL);
+	oevent = fanotify_alloc_event(group, NULL, FS_Q_OVERFLOW, NULL, NULL);
 	if (unlikely(!oevent)) {
 		fd = -ENOMEM;
 		goto out_destroy_group;
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index b202e09..86f0321 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -51,8 +51,10 @@
 /* These bits determine the format of the reported events */
 #define FAN_EVENT_INFO_PARENT	0x00000100	/* Event fd maybe of parent */
 #define FAN_EVENT_INFO_NAME	0x00000200	/* Event data has filename */
+#define FAN_EVENT_INFO_FH	0x00000400	/* Event data has filehandle */
 #define FAN_ALL_EVENT_INFO_BITS (FAN_EVENT_INFO_PARENT | \
-				 FAN_EVENT_INFO_NAME)
+				 FAN_EVENT_INFO_NAME | \
+				 FAN_EVENT_INFO_FH)
 
 #define FAN_ALL_INIT_FLAGS	(FAN_CLOEXEC | FAN_NONBLOCK | \
 				 FAN_ALL_CLASS_BITS | \
-- 
2.7.4

* [RFC][PATCH 4/6] fanotify: report file name to root inode watch with FS_EVENT_ON_CHILD
  2017-03-13 13:20 [RFC][PATCH 0/6] fanotify: super block root watch Amir Goldstein
                   ` (2 preceding siblings ...)
  2017-03-13 13:20 ` [RFC][PATCH 3/6] fanotify: pass file handle on sb root watcher events Amir Goldstein
@ 2017-03-13 13:20 ` Amir Goldstein
  2017-03-13 13:20 ` [RFC][PATCH 5/6] fanotify: export FAN_ONDIR to user Amir Goldstein
  2017-03-13 13:20 ` [RFC][PATCH 6/6] fanotify: filter events by root mark mount point Amir Goldstein
  5 siblings, 0 replies; 7+ messages in thread
From: Amir Goldstein @ 2017-03-13 13:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: Eric Paris, Marko Rauhamaa, linux-fsdevel

When adding a root watch with FS_EVENT_ON_DESCENDANT
flags (FS_EVENT_ON_CHILD|FS_EVENT_ON_SB), deliver non directory
events to root inode as if they are reported to the parent
inode, including the file name.

This is only relevant to the events open/modify/attrib/close,
which can be reported to both parent and self.
Filename events (create/delete/move) always include the file name
and self events (move_self/delete_self) never include the file name.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
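Note for readers, not part of the commit: the choice of which inode
receives a child event in __fsnotify_parent() can be modeled as a pure
function. This is a sketch under assumptions, not the kernel code. The
FS_EVENT_ON_SB value is taken from the BUILD_BUG_ON in patch 1/6.
MODEL_EVENTS_POSS_ON_CHILD is assumed to match FS_EVENTS_POSS_ON_CHILD.
The enum and the model_*() functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FS_EVENT_ON_SB		0x01000000u
#define FS_EVENT_ON_CHILD	0x08000000u
#define FS_EVENT_ON_DESCENDANT	(FS_EVENT_ON_CHILD | FS_EVENT_ON_SB)
#define FS_ISDIR		0x40000000u

/* Assumed to match FS_EVENTS_POSS_ON_CHILD. */
#define MODEL_EVENTS_POSS_ON_CHILD	0x000303ffu

enum model_target { MODEL_NONE, MODEL_PARENT, MODEL_SB_ROOT };

/* Mirrors fsnotify_sb_root_watches_descendants() on a plain mask. */
bool model_root_watches_descendants(uint32_t root_mask)
{
	/* Both descendant bits must be set on the root inode mask. */
	if ((root_mask & FS_EVENT_ON_DESCENDANT) != FS_EVENT_ON_DESCENDANT)
		return false;
	/* The root must also care about some event possible on a child. */
	return (root_mask & MODEL_EVENTS_POSS_ON_CHILD) != 0;
}

/* Pick who gets the "on child" event for a dentry. */
enum model_target model_pick_target(bool parent_watched, uint32_t root_mask,
				    uint32_t event_mask)
{
	if (parent_watched)
		return MODEL_PARENT;
	/* Parent is not watching, but the root inode may be. */
	if (model_root_watches_descendants(root_mask) &&
	    !(event_mask & FS_ISDIR))
		return MODEL_SB_ROOT;
	return MODEL_NONE;
}
```

A watching parent always wins in this model. The root inode only stands
in for the parent when the event is not on a directory.
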
 fs/notify/fanotify/fanotify.c    | 16 +++++++++++++---
 fs/notify/fsnotify.c             | 13 +++++++++----
 include/linux/fsnotify_backend.h | 13 +++++++++++++
 3 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index ca9894c..4b74e56 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -186,11 +186,20 @@ struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group,
 		int alloc_len = sizeof(*ffe);
 		int name_len = 0;
 
-		if ((mask & FAN_FILENAME_EVENTS) && file_name &&
+		/*
+		 * We need to report the file name either for filename events
+		 * (create,delete,move) or for events that happen on non
+		 * directory inodes when reporting file ids to root sb inode
+		 * and only if user has requested to get filename info.
+		 */
+		if (file_name &&
+		    ((mask & FAN_FILENAME_EVENTS) ||
+		     !d_is_dir(path->dentry)) &&
 		    (group->fanotify_data.flags & FAN_EVENT_INFO_NAME)) {
 			name_len = strlen(file_name);
 			alloc_len += name_len + 1;
 		}
+
 		ffe = kmalloc(alloc_len, GFP_KERNEL);
 		if (!ffe)
 			return NULL;
@@ -204,8 +213,9 @@ struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group,
 		if ((mask & FAN_EVENT_ON_SB) &&
 		    (group->fanotify_data.flags & FAN_EVENT_INFO_FH)) {
 			/*
-			 * Encode only parent (dentry) for filename events
-			 * and both parent and child for other events.
+			 * Encode only parent inode for filename events
+			 * and events on directories. Encode both parent
+			 * and child inodes for other events.
 			 * ffe->fid is big enough to encode xfs type 0x82:
 			 * 64bit parent+child inodes and 32bit generations
 			 */
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 9f0c988..b99e51d 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -95,11 +95,16 @@ int __fsnotify_parent(const struct path *path, struct dentry *dentry, __u32 mask
 	if (!dentry)
 		dentry = path->dentry;
 
-	if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
+	if (dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED) {
+		parent = dget_parent(dentry);
+	} else if (unlikely(fsnotify_sb_root_watches_descendants(dentry)) &&
+		   !(mask & FS_ISDIR)) {
+		/* Parent is not watching, but root inode is watching */
+		parent = dget(dentry->d_sb->s_root);
+	} else {
 		return 0;
-
-	parent = dget_parent(dentry);
-	p_inode = parent->d_inode;
+	}
+	p_inode = d_inode(parent);
 
 	if (unlikely(!fsnotify_inode_watches_children(p_inode)))
 		__fsnotify_update_child_dentry_flags(p_inode);
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 3be81d9..e23c549 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -298,6 +298,19 @@ static inline int fsnotify_inode_watches_children(struct inode *inode)
 	return inode->i_fsnotify_mask & FS_EVENTS_POSS_ON_CHILD;
 }
 
+static inline int fsnotify_sb_root_watches_descendants(struct dentry *dentry)
+{
+	struct inode *root = dentry->d_sb->s_root->d_inode;
+
+	/* All FS_EVENT_ON_DESCENDANT flags are set if root inode may care */
+	if ((root->i_fsnotify_mask & FS_EVENT_ON_DESCENDANT) !=
+	     FS_EVENT_ON_DESCENDANT)
+		return 0;
+	/* root inode might care about distant child events, does it care about
+	 * the specific set of events that can happen on a child? */
+	return root->i_fsnotify_mask & FS_EVENTS_POSS_ON_CHILD;
+}
+
 /*
  * Update the dentry with a flag indicating the interest of its parent to receive
  * filesystem events when those events happens to this dentry->d_inode.
-- 
2.7.4

* [RFC][PATCH 5/6] fanotify: export FAN_ONDIR to user
  2017-03-13 13:20 [RFC][PATCH 0/6] fanotify: super block root watch Amir Goldstein
                   ` (3 preceding siblings ...)
  2017-03-13 13:20 ` [RFC][PATCH 4/6] fanotify: report file name to root inode watch with FS_EVENT_ON_CHILD Amir Goldstein
@ 2017-03-13 13:20 ` Amir Goldstein
  2017-03-13 13:20 ` [RFC][PATCH 6/6] fanotify: filter events by root mark mount point Amir Goldstein
  5 siblings, 0 replies; 7+ messages in thread
From: Amir Goldstein @ 2017-03-13 13:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: Eric Paris, Marko Rauhamaa, linux-fsdevel

A user who requested the FAN_EVENT_INFO_PARENT flag
in fanotify_init() will get the additional FAN_ONDIR
information when the event subject is a directory.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
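Note for readers, not part of the commit: the effect on the mask that
userspace sees can be shown with a small model. This is a sketch under
assumptions, not the kernel code. FAN_EVENT_INFO_PARENT uses the value
visible in the uapi context of patch 3/6. MODEL_ALL_OUTGOING_EVENTS is
a stand-in for FAN_ALL_OUTGOING_EVENTS and may not match the set used
in this series. model_user_mask() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define FAN_ONDIR		0x40000000u
#define FAN_EVENT_INFO_PARENT	0x00000100u

/*
 * Stand-in for FAN_ALL_OUTGOING_EVENTS: basic events, permission
 * events and queue overflow. Assumed, not taken from this series.
 */
#define MODEL_ALL_OUTGOING_EVENTS	0x0003403bu

/* Hypothetical model of the mask computed in fill_event_metadata(). */
uint32_t model_user_mask(uint32_t event_mask, unsigned int init_flags)
{
	uint32_t user_mask = MODEL_ALL_OUTGOING_EVENTS;

	/* FAN_ONDIR is only exposed to groups that asked for parent info. */
	if (init_flags & FAN_EVENT_INFO_PARENT)
		user_mask |= FAN_ONDIR;

	return event_mask & user_mask;
}
```

An open event on a directory therefore reads as a plain open for a
group without FAN_EVENT_INFO_PARENT, and as open plus FAN_ONDIR for a
group with it.
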
 fs/notify/fanotify/fanotify_user.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index adef7b0..bc1ccd0 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -147,17 +147,22 @@ static int fill_event_metadata(struct fsnotify_group *group,
 {
 	int ret = 0;
 	struct fanotify_event_info *event;
+	__u32 user_mask = FAN_ALL_OUTGOING_EVENTS;
 
 	pr_debug("%s: group=%p metadata=%p event=%p\n", __func__,
 		 group, metadata, fsn_event);
 
+	/* FAN_ONDIR is important for dentry events */
+	if (group->fanotify_data.flags & FAN_EVENT_INFO_PARENT)
+		user_mask |= FAN_ONDIR;
+
 	*file = NULL;
 	event = container_of(fsn_event, struct fanotify_event_info, fse);
 	metadata->event_len = FAN_EVENT_METADATA_LEN;
 	metadata->metadata_len = FAN_EVENT_METADATA_LEN;
 	metadata->vers = FANOTIFY_METADATA_VERSION;
 	metadata->reserved = 0;
-	metadata->mask = fsn_event->mask & FAN_ALL_OUTGOING_EVENTS;
+	metadata->mask = fsn_event->mask & user_mask;
 	metadata->pid = pid_vnr(event->tgid);
 	if (unlikely(fsn_event->mask & FAN_Q_OVERFLOW))
 		metadata->fd = FAN_NOFD;
-- 
2.7.4

* [RFC][PATCH 6/6] fanotify: filter events by root mark mount point
  2017-03-13 13:20 [RFC][PATCH 0/6] fanotify: super block root watch Amir Goldstein
                   ` (4 preceding siblings ...)
  2017-03-13 13:20 ` [RFC][PATCH 5/6] fanotify: export FAN_ONDIR to user Amir Goldstein
@ 2017-03-13 13:20 ` Amir Goldstein
  5 siblings, 0 replies; 7+ messages in thread
From: Amir Goldstein @ 2017-03-13 13:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: Eric Paris, Marko Rauhamaa, linux-fsdevel

When adding a super block root watch from a mount point that is not mounted
on the root of the file system, filter out events on file system objects
that happen outside this mount point directory (on non-descendant objects).

This is not like FAN_MARK_MOUNT which filters only events that happened
on the mount of the mark. All events on file system objects are reported
as long as these objects are accessible from the mark mount point.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
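Note for readers, not part of the commit: the visibility test added to
fanotify_should_send_event() is an ancestor walk. The sketch below
models it on a toy tree in userspace and is not the kernel code.
struct model_dentry and the model_*() functions are hypothetical names.
The toy tree marks its root with a NULL parent, which is an assumption
of the model only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy dentry: only the parent link matters here. Root has parent NULL. */
struct model_dentry {
	const struct model_dentry *parent;
};

/* True if anc is a proper ancestor of d, in the spirit of d_ancestor(). */
bool model_is_ancestor(const struct model_dentry *anc,
		       const struct model_dentry *d)
{
	for (d = d->parent; d; d = d->parent) {
		if (d == anc)
			return true;
	}
	return false;
}

/*
 * Hypothetical model of the new filter: with no mark mount there is no
 * filtering; otherwise the object must be the mount root or below it.
 */
bool model_visible_from_mount(const struct model_dentry *mnt_root,
			      const struct model_dentry *d)
{
	if (!mnt_root)
		return true;
	return d == mnt_root || model_is_ancestor(mnt_root, d);
}
```

With a tree of root, root/a, root/b and root/a/file, a watch added from
a mount rooted at root/a would see events on root/a and root/a/file but
not on root or root/b.
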
 fs/notify/fanotify/fanotify.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 4b74e56..e016ade 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -96,11 +96,12 @@ static bool fanotify_should_send_event(struct fsnotify_mark *inode_mark,
 {
 	__u32 marks_mask, marks_ignored_mask;
 	const struct path *path = data;
+	struct vfsmount *mark_mnt = inode_mark ? inode_mark->mnt : NULL;
 	struct dentry *dentry = path->dentry;
 
-	pr_debug("%s: inode_mark=%p vfsmnt_mark=%p mask=%x"
+	pr_debug("%s: inode_mark=%p vfsmnt_mark=%p mark_mnt=%p mask=%x"
 		 " data_type=%d\n", __func__, inode_mark, vfsmnt_mark,
-		 event_mask, data_type);
+		 mark_mnt, event_mask, data_type);
 
 	/* if we don't have enough info to send an event to userspace say no */
 	if (data_type != FSNOTIFY_EVENT_PATH &&
@@ -145,6 +146,14 @@ static bool fanotify_should_send_event(struct fsnotify_mark *inode_mark,
 	      ~marks_ignored_mask))
 		return false;
 
+	/*
+	 * Only interested in dentry events visible from the mount
+	 * from which the root watch was added
+	 */
+	if (mark_mnt && mark_mnt->mnt_root != dentry &&
+	    d_ancestor(mark_mnt->mnt_root, dentry) == NULL)
+		return false;
+
 	return true;
 }
 
-- 
2.7.4
