All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] Performance improvement for fanotify merge
@ 2021-03-04 10:48 Amir Goldstein
  2021-03-04 10:48 ` [PATCH v2 1/5] fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue Amir Goldstein
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Amir Goldstein @ 2021-03-04 10:48 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel

Jan,

Following is v2 for the fanotify_merge() performance improvements.

For more details on functional and performance tests please refer to
v1 cover letter [1].

This version is much simpler than v1 using standard hlist.
It was rebased and tested against 5.12-rc1 using LTP tests [2].

Thanks,
Amir.

Chanes since v1:
- Use hlist instead of multi notification lists
- Handling all hashing within fanotify backend
- Cram event key member together with event type
- Remove ifdefs and use constant queue hash bits
- Address other review comments on v1

[1] https://lore.kernel.org/linux-fsdevel/20210202162010.305971-1-amir73il@gmail.com/
[2] https://github.com/amir73il/ltp/commits/fanotify_merge

Amir Goldstein (5):
  fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue
  fanotify: reduce event objectid to 29-bit hash
  fanotify: mix event info and pid into merge key hash
  fsnotify: use hash table for faster events merge
  fanotify: limit number of event merge attempts

 fs/notify/fanotify/fanotify.c        | 150 +++++++++++++++++++--------
 fs/notify/fanotify/fanotify.h        |  46 +++++++-
 fs/notify/fanotify/fanotify_user.c   |  65 ++++++++++--
 fs/notify/inotify/inotify_fsnotify.c |   9 +-
 fs/notify/inotify/inotify_user.c     |   7 +-
 fs/notify/notification.c             |  64 ++++++------
 include/linux/fsnotify_backend.h     |  23 ++--
 7 files changed, 263 insertions(+), 101 deletions(-)

-- 
2.30.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 1/5] fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue
  2021-03-04 10:48 [PATCH v2 0/5] Performance improvement for fanotify merge Amir Goldstein
@ 2021-03-04 10:48 ` Amir Goldstein
  2021-03-04 10:48 ` [PATCH v2 2/5] fanotify: reduce event objectid to 29-bit hash Amir Goldstein
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Amir Goldstein @ 2021-03-04 10:48 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel

Current code has an assumtion that fsnotify_notify_queue_is_empty() is
called to verify that queue is not empty before trying to peek or remove
an event from queue.

Remove this assumption by moving the fsnotify_notify_queue_is_empty()
into the functions, allow them to return NULL value and check return
value by all callers.

This is a prep patch for multi event queues.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify_user.c | 26 +++++++++++-------
 fs/notify/inotify/inotify_user.c   |  5 ++--
 fs/notify/notification.c           | 42 ++++++++++++++----------------
 include/linux/fsnotify_backend.h   |  8 +++++-
 4 files changed, 44 insertions(+), 37 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 9e0c1afac8bd..16162207e886 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -100,24 +100,30 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,
 {
 	size_t event_size = FAN_EVENT_METADATA_LEN;
 	struct fanotify_event *event = NULL;
+	struct fsnotify_event *fsn_event;
 	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
 
 	pr_debug("%s: group=%p count=%zd\n", __func__, group, count);
 
 	spin_lock(&group->notification_lock);
-	if (fsnotify_notify_queue_is_empty(group))
+	fsn_event = fsnotify_peek_first_event(group);
+	if (!fsn_event)
 		goto out;
 
-	if (fid_mode) {
-		event_size += fanotify_event_info_len(fid_mode,
-			FANOTIFY_E(fsnotify_peek_first_event(group)));
-	}
+	event = FANOTIFY_E(fsn_event);
+	if (fid_mode)
+		event_size += fanotify_event_info_len(fid_mode, event);
 
 	if (event_size > count) {
 		event = ERR_PTR(-EINVAL);
 		goto out;
 	}
-	event = FANOTIFY_E(fsnotify_remove_first_event(group));
+
+	/*
+	 * Held the notification_lock the whole time, so this is the
+	 * same event we peeked above.
+	 */
+	fsnotify_remove_first_event(group);
 	if (fanotify_is_perm_event(event->mask))
 		FANOTIFY_PERM(event)->state = FAN_EVENT_REPORTED;
 out:
@@ -573,6 +579,7 @@ static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t
 static int fanotify_release(struct inode *ignored, struct file *file)
 {
 	struct fsnotify_group *group = file->private_data;
+	struct fsnotify_event *fsn_event;
 
 	/*
 	 * Stop new events from arriving in the notification queue. since
@@ -601,13 +608,12 @@ static int fanotify_release(struct inode *ignored, struct file *file)
 	 * dequeue them and set the response. They will be freed once the
 	 * response is consumed and fanotify_get_response() returns.
 	 */
-	while (!fsnotify_notify_queue_is_empty(group)) {
-		struct fanotify_event *event;
+	while ((fsn_event = fsnotify_remove_first_event(group))) {
+		struct fanotify_event *event = FANOTIFY_E(fsn_event);
 
-		event = FANOTIFY_E(fsnotify_remove_first_event(group));
 		if (!(event->mask & FANOTIFY_PERM_EVENTS)) {
 			spin_unlock(&group->notification_lock);
-			fsnotify_destroy_event(group, &event->fse);
+			fsnotify_destroy_event(group, fsn_event);
 		} else {
 			finish_permission_event(group, FANOTIFY_PERM(event),
 						FAN_ALLOW);
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index c71be4fb7dc5..a6c95bd64618 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -146,10 +146,9 @@ static struct fsnotify_event *get_one_event(struct fsnotify_group *group,
 	size_t event_size = sizeof(struct inotify_event);
 	struct fsnotify_event *event;
 
-	if (fsnotify_notify_queue_is_empty(group))
-		return NULL;
-
 	event = fsnotify_peek_first_event(group);
+	if (!event)
+		return NULL;
 
 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
 
diff --git a/fs/notify/notification.c b/fs/notify/notification.c
index 75d79d6d3ef0..001cfe7d2e4e 100644
--- a/fs/notify/notification.c
+++ b/fs/notify/notification.c
@@ -47,13 +47,6 @@ u32 fsnotify_get_cookie(void)
 }
 EXPORT_SYMBOL_GPL(fsnotify_get_cookie);
 
-/* return true if the notify queue is empty, false otherwise */
-bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group)
-{
-	assert_spin_locked(&group->notification_lock);
-	return list_empty(&group->notification_list) ? true : false;
-}
-
 void fsnotify_destroy_event(struct fsnotify_group *group,
 			    struct fsnotify_event *event)
 {
@@ -141,33 +134,36 @@ void fsnotify_remove_queued_event(struct fsnotify_group *group,
 }
 
 /*
- * Remove and return the first event from the notification list.  It is the
- * responsibility of the caller to destroy the obtained event
+ * Return the first event on the notification list without removing it.
+ * Returns NULL if the list is empty.
  */
-struct fsnotify_event *fsnotify_remove_first_event(struct fsnotify_group *group)
+struct fsnotify_event *fsnotify_peek_first_event(struct fsnotify_group *group)
 {
-	struct fsnotify_event *event;
-
 	assert_spin_locked(&group->notification_lock);
 
-	pr_debug("%s: group=%p\n", __func__, group);
+	if (fsnotify_notify_queue_is_empty(group))
+		return NULL;
 
-	event = list_first_entry(&group->notification_list,
-				 struct fsnotify_event, list);
-	fsnotify_remove_queued_event(group, event);
-	return event;
+	return list_first_entry(&group->notification_list,
+				struct fsnotify_event, list);
 }
 
 /*
- * This will not remove the event, that must be done with
- * fsnotify_remove_first_event()
+ * Remove and return the first event from the notification list.  It is the
+ * responsibility of the caller to destroy the obtained event
  */
-struct fsnotify_event *fsnotify_peek_first_event(struct fsnotify_group *group)
+struct fsnotify_event *fsnotify_remove_first_event(struct fsnotify_group *group)
 {
-	assert_spin_locked(&group->notification_lock);
+	struct fsnotify_event *event = fsnotify_peek_first_event(group);
 
-	return list_first_entry(&group->notification_list,
-				struct fsnotify_event, list);
+	if (!event)
+		return NULL;
+
+	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
+
+	fsnotify_remove_queued_event(group, event);
+
+	return event;
 }
 
 /*
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index e5409b83e731..7eb979bfc141 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -495,7 +495,13 @@ static inline void fsnotify_queue_overflow(struct fsnotify_group *group)
 	fsnotify_add_event(group, group->overflow_event, NULL);
 }
 
-/* true if the group notification queue is empty */
+static inline bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group)
+{
+	assert_spin_locked(&group->notification_lock);
+
+	return list_empty(&group->notification_list);
+}
+
 extern bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group);
 /* return, but do not dequeue the first event on the notification queue */
 extern struct fsnotify_event *fsnotify_peek_first_event(struct fsnotify_group *group);
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/5] fanotify: reduce event objectid to 29-bit hash
  2021-03-04 10:48 [PATCH v2 0/5] Performance improvement for fanotify merge Amir Goldstein
  2021-03-04 10:48 ` [PATCH v2 1/5] fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue Amir Goldstein
@ 2021-03-04 10:48 ` Amir Goldstein
  2021-03-04 10:48 ` [PATCH v2 3/5] fanotify: mix event info and pid into merge key hash Amir Goldstein
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Amir Goldstein @ 2021-03-04 10:48 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel

objectid is only used by fanotify backend and it is just an optimization
for event merge before comparing all fields in event.

Move the objectid member from common struct fsnotify_event into struct
fanotify_event and reduce it to 29-bit hash to cram it together with the
3-bit event type.

Events of different types are never merged, so the combination of event
type and hash form a 32-bit key for fast compare of events.

This reduces the size of events by one pointer and paves the way for
adding hashed queue support for fanotify.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c        | 25 ++++++++++++-------------
 fs/notify/fanotify/fanotify.h        | 16 +++++++++++++---
 fs/notify/inotify/inotify_fsnotify.c |  2 +-
 fs/notify/inotify/inotify_user.c     |  2 +-
 include/linux/fsnotify_backend.h     |  5 +----
 5 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 1192c9953620..8a2bb6954e02 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -88,16 +88,12 @@ static bool fanotify_name_event_equal(struct fanotify_name_event *fne1,
 	return fanotify_info_equal(info1, info2);
 }
 
-static bool fanotify_should_merge(struct fsnotify_event *old_fsn,
-				  struct fsnotify_event *new_fsn)
+static bool fanotify_should_merge(struct fanotify_event *old,
+				  struct fanotify_event *new)
 {
-	struct fanotify_event *old, *new;
+	pr_debug("%s: old=%p new=%p\n", __func__, old, new);
 
-	pr_debug("%s: old=%p new=%p\n", __func__, old_fsn, new_fsn);
-	old = FANOTIFY_E(old_fsn);
-	new = FANOTIFY_E(new_fsn);
-
-	if (old_fsn->objectid != new_fsn->objectid ||
+	if (old->hash != new->hash ||
 	    old->type != new->type || old->pid != new->pid)
 		return false;
 
@@ -133,10 +129,9 @@ static bool fanotify_should_merge(struct fsnotify_event *old_fsn,
 static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
 {
 	struct fsnotify_event *test_event;
-	struct fanotify_event *new;
+	struct fanotify_event *old, *new = FANOTIFY_E(event);
 
 	pr_debug("%s: list=%p event=%p\n", __func__, list, event);
-	new = FANOTIFY_E(event);
 
 	/*
 	 * Don't merge a permission event with any other event so that we know
@@ -147,8 +142,9 @@ static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
 		return 0;
 
 	list_for_each_entry_reverse(test_event, list, list) {
-		if (fanotify_should_merge(test_event, event)) {
-			FANOTIFY_E(test_event)->mask |= new->mask;
+		old = FANOTIFY_E(test_event);
+		if (fanotify_should_merge(old, new)) {
+			old->mask |= new->mask;
 			return 1;
 		}
 	}
@@ -533,6 +529,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	struct mem_cgroup *old_memcg;
 	struct inode *child = NULL;
 	bool name_event = false;
+	unsigned int hash = 0;
 
 	if ((fid_mode & FAN_REPORT_DIR_FID) && dirid) {
 		/*
@@ -600,8 +597,10 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	 * Use the victim inode instead of the watching inode as the id for
 	 * event queue, so event reported on parent is merged with event
 	 * reported on child when both directory and child watches exist.
+	 * Hash object id for queue merge.
 	 */
-	fanotify_init_event(event, (unsigned long)id, mask);
+	hash = hash_ptr(id, FANOTIFY_EVENT_HASH_BITS);
+	fanotify_init_event(event, hash, mask);
 	if (FAN_GROUP_FLAG(group, FAN_REPORT_TID))
 		event->pid = get_pid(task_pid(current));
 	else
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 896c819a1786..d531f0cfa46f 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -135,19 +135,29 @@ enum fanotify_event_type {
 	FANOTIFY_EVENT_TYPE_PATH,
 	FANOTIFY_EVENT_TYPE_PATH_PERM,
 	FANOTIFY_EVENT_TYPE_OVERFLOW, /* struct fanotify_event */
+	__FANOTIFY_EVENT_TYPE_NUM
 };
 
+#define FANOTIFY_EVENT_TYPE_BITS \
+	(ilog2(__FANOTIFY_EVENT_TYPE_NUM - 1) + 1)
+#define FANOTIFY_EVENT_HASH_BITS \
+	(32 - FANOTIFY_EVENT_TYPE_BITS)
+
 struct fanotify_event {
 	struct fsnotify_event fse;
 	u32 mask;
-	enum fanotify_event_type type;
+	struct {
+		unsigned int type : FANOTIFY_EVENT_TYPE_BITS;
+		unsigned int hash : FANOTIFY_EVENT_HASH_BITS;
+	};
 	struct pid *pid;
 };
 
 static inline void fanotify_init_event(struct fanotify_event *event,
-				       unsigned long id, u32 mask)
+				       unsigned int hash, u32 mask)
 {
-	fsnotify_init_event(&event->fse, id);
+	fsnotify_init_event(&event->fse);
+	event->hash = hash;
 	event->mask = mask;
 	event->pid = NULL;
 }
diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index 1901d799909b..0533bacbd584 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -107,7 +107,7 @@ int inotify_handle_inode_event(struct fsnotify_mark *inode_mark, u32 mask,
 		mask &= ~IN_ISDIR;
 
 	fsn_event = &event->fse;
-	fsnotify_init_event(fsn_event, 0);
+	fsnotify_init_event(fsn_event);
 	event->mask = mask;
 	event->wd = i_mark->wd;
 	event->sync_cookie = cookie;
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index a6c95bd64618..98f61b31745a 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -641,7 +641,7 @@ static struct fsnotify_group *inotify_new_group(unsigned int max_events)
 		return ERR_PTR(-ENOMEM);
 	}
 	group->overflow_event = &oevent->fse;
-	fsnotify_init_event(group->overflow_event, 0);
+	fsnotify_init_event(group->overflow_event);
 	oevent->mask = FS_Q_OVERFLOW;
 	oevent->wd = -1;
 	oevent->sync_cookie = 0;
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 7eb979bfc141..fc98f9f88d12 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -167,7 +167,6 @@ struct fsnotify_ops {
  */
 struct fsnotify_event {
 	struct list_head list;
-	unsigned long objectid;	/* identifier for queue merges */
 };
 
 /*
@@ -582,11 +581,9 @@ extern void fsnotify_put_mark(struct fsnotify_mark *mark);
 extern void fsnotify_finish_user_wait(struct fsnotify_iter_info *iter_info);
 extern bool fsnotify_prepare_user_wait(struct fsnotify_iter_info *iter_info);
 
-static inline void fsnotify_init_event(struct fsnotify_event *event,
-				       unsigned long objectid)
+static inline void fsnotify_init_event(struct fsnotify_event *event)
 {
 	INIT_LIST_HEAD(&event->list);
-	event->objectid = objectid;
 }
 
 #else
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 3/5] fanotify: mix event info and pid into merge key hash
  2021-03-04 10:48 [PATCH v2 0/5] Performance improvement for fanotify merge Amir Goldstein
  2021-03-04 10:48 ` [PATCH v2 1/5] fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue Amir Goldstein
  2021-03-04 10:48 ` [PATCH v2 2/5] fanotify: reduce event objectid to 29-bit hash Amir Goldstein
@ 2021-03-04 10:48 ` Amir Goldstein
  2021-03-16 15:18   ` Jan Kara
  2021-03-04 10:48 ` [PATCH v2 4/5] fsnotify: use hash table for faster events merge Amir Goldstein
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Amir Goldstein @ 2021-03-04 10:48 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel

Improve the merge key hash by mixing more values relevant for merge.

For example, all FAN_CREATE name events in the same dir used to have the
same merge key based on the dir inode.  With this change the created
file name is mixed into the merge key.

The object id that was used as merge key is redundant to the event info
so it is no longer mixed into the hash.

Permission events are not hashed, so no need to hash their info.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c | 87 ++++++++++++++++++++++++-----------
 fs/notify/fanotify/fanotify.h |  5 ++
 2 files changed, 66 insertions(+), 26 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 8a2bb6954e02..db30db61a815 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -14,6 +14,7 @@
 #include <linux/audit.h>
 #include <linux/sched/mm.h>
 #include <linux/statfs.h>
+#include <linux/stringhash.h>
 
 #include "fanotify.h"
 
@@ -22,12 +23,24 @@ static bool fanotify_path_equal(struct path *p1, struct path *p2)
 	return p1->mnt == p2->mnt && p1->dentry == p2->dentry;
 }
 
+static unsigned int fanotify_hash_path(const struct path *path)
+{
+	return hash_ptr(path->dentry, FANOTIFY_EVENT_HASH_BITS) ^
+		hash_ptr(path->mnt, FANOTIFY_EVENT_HASH_BITS);
+}
+
 static inline bool fanotify_fsid_equal(__kernel_fsid_t *fsid1,
 				       __kernel_fsid_t *fsid2)
 {
 	return fsid1->val[0] == fsid2->val[0] && fsid1->val[1] == fsid2->val[1];
 }
 
+static unsigned int fanotify_hash_fsid(__kernel_fsid_t *fsid)
+{
+	return hash_32(fsid->val[0], FANOTIFY_EVENT_HASH_BITS) ^
+		hash_32(fsid->val[1], FANOTIFY_EVENT_HASH_BITS);
+}
+
 static bool fanotify_fh_equal(struct fanotify_fh *fh1,
 			      struct fanotify_fh *fh2)
 {
@@ -38,6 +51,16 @@ static bool fanotify_fh_equal(struct fanotify_fh *fh1,
 		!memcmp(fanotify_fh_buf(fh1), fanotify_fh_buf(fh2), fh1->len);
 }
 
+static unsigned int fanotify_hash_fh(struct fanotify_fh *fh)
+{
+	long salt = (long)fh->type | (long)fh->len << 8;
+
+	/*
+	 * full_name_hash() works long by long, so it handles fh buf optimally.
+	 */
+	return full_name_hash((void *)salt, fanotify_fh_buf(fh), fh->len);
+}
+
 static bool fanotify_fid_event_equal(struct fanotify_fid_event *ffe1,
 				     struct fanotify_fid_event *ffe2)
 {
@@ -325,7 +348,8 @@ static int fanotify_encode_fh_len(struct inode *inode)
  * Return 0 on failure to encode.
  */
 static int fanotify_encode_fh(struct fanotify_fh *fh, struct inode *inode,
-			      unsigned int fh_len, gfp_t gfp)
+			      unsigned int fh_len, unsigned int *hash,
+			      gfp_t gfp)
 {
 	int dwords, type = 0;
 	char *ext_buf = NULL;
@@ -368,6 +392,9 @@ static int fanotify_encode_fh(struct fanotify_fh *fh, struct inode *inode,
 	fh->type = type;
 	fh->len = fh_len;
 
+	/* Mix fh into event merge key */
+	*hash ^= fanotify_hash_fh(fh);
+
 	return FANOTIFY_FH_HDR_LEN + fh_len;
 
 out_err:
@@ -421,6 +448,7 @@ static struct inode *fanotify_dfid_inode(u32 event_mask, const void *data,
 }
 
 static struct fanotify_event *fanotify_alloc_path_event(const struct path *path,
+							unsigned int *hash,
 							gfp_t gfp)
 {
 	struct fanotify_path_event *pevent;
@@ -431,6 +459,7 @@ static struct fanotify_event *fanotify_alloc_path_event(const struct path *path,
 
 	pevent->fae.type = FANOTIFY_EVENT_TYPE_PATH;
 	pevent->path = *path;
+	*hash ^= fanotify_hash_path(path);
 	path_get(path);
 
 	return &pevent->fae;
@@ -456,6 +485,7 @@ static struct fanotify_event *fanotify_alloc_perm_event(const struct path *path,
 
 static struct fanotify_event *fanotify_alloc_fid_event(struct inode *id,
 						       __kernel_fsid_t *fsid,
+						       unsigned int *hash,
 						       gfp_t gfp)
 {
 	struct fanotify_fid_event *ffe;
@@ -466,16 +496,18 @@ static struct fanotify_event *fanotify_alloc_fid_event(struct inode *id,
 
 	ffe->fae.type = FANOTIFY_EVENT_TYPE_FID;
 	ffe->fsid = *fsid;
+	*hash ^= fanotify_hash_fsid(fsid);
 	fanotify_encode_fh(&ffe->object_fh, id, fanotify_encode_fh_len(id),
-			   gfp);
+			   hash, gfp);
 
 	return &ffe->fae;
 }
 
 static struct fanotify_event *fanotify_alloc_name_event(struct inode *id,
 							__kernel_fsid_t *fsid,
-							const struct qstr *file_name,
+							const struct qstr *name,
 							struct inode *child,
+							unsigned int *hash,
 							gfp_t gfp)
 {
 	struct fanotify_name_event *fne;
@@ -488,24 +520,30 @@ static struct fanotify_event *fanotify_alloc_name_event(struct inode *id,
 	size = sizeof(*fne) + FANOTIFY_FH_HDR_LEN + dir_fh_len;
 	if (child_fh_len)
 		size += FANOTIFY_FH_HDR_LEN + child_fh_len;
-	if (file_name)
-		size += file_name->len + 1;
+	if (name)
+		size += name->len + 1;
 	fne = kmalloc(size, gfp);
 	if (!fne)
 		return NULL;
 
 	fne->fae.type = FANOTIFY_EVENT_TYPE_FID_NAME;
 	fne->fsid = *fsid;
+	*hash ^= fanotify_hash_fsid(fsid);
 	info = &fne->info;
 	fanotify_info_init(info);
 	dfh = fanotify_info_dir_fh(info);
-	info->dir_fh_totlen = fanotify_encode_fh(dfh, id, dir_fh_len, 0);
+	info->dir_fh_totlen = fanotify_encode_fh(dfh, id, dir_fh_len, hash, 0);
 	if (child_fh_len) {
 		ffh = fanotify_info_file_fh(info);
-		info->file_fh_totlen = fanotify_encode_fh(ffh, child, child_fh_len, 0);
+		info->file_fh_totlen = fanotify_encode_fh(ffh, child,
+							child_fh_len, hash, 0);
+	}
+	if (name) {
+		long salt = name->len;
+
+		fanotify_info_copy_name(info, name);
+		*hash ^= full_name_hash((void *)salt, name->name, name->len);
 	}
-	if (file_name)
-		fanotify_info_copy_name(info, file_name);
 
 	pr_debug("%s: ino=%lu size=%u dir_fh_len=%u child_fh_len=%u name_len=%u name='%.*s'\n",
 		 __func__, id->i_ino, size, dir_fh_len, child_fh_len,
@@ -530,6 +568,8 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	struct inode *child = NULL;
 	bool name_event = false;
 	unsigned int hash = 0;
+	unsigned long ondir = (mask & FAN_ONDIR) ? 1UL : 0;
+	struct pid *pid;
 
 	if ((fid_mode & FAN_REPORT_DIR_FID) && dirid) {
 		/*
@@ -537,8 +577,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 		 * report the child fid for events reported on a non-dir child
 		 * in addition to reporting the parent fid and maybe child name.
 		 */
-		if ((fid_mode & FAN_REPORT_FID) &&
-		    id != dirid && !(mask & FAN_ONDIR))
+		if ((fid_mode & FAN_REPORT_FID) && id != dirid && !ondir)
 			child = id;
 
 		id = dirid;
@@ -559,8 +598,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 		if (!(fid_mode & FAN_REPORT_NAME)) {
 			name_event = !!child;
 			file_name = NULL;
-		} else if ((mask & ALL_FSNOTIFY_DIRENT_EVENTS) ||
-			   !(mask & FAN_ONDIR)) {
+		} else if ((mask & ALL_FSNOTIFY_DIRENT_EVENTS) || !ondir) {
 			name_event = true;
 		}
 	}
@@ -583,28 +621,25 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 		event = fanotify_alloc_perm_event(path, gfp);
 	} else if (name_event && (file_name || child)) {
 		event = fanotify_alloc_name_event(id, fsid, file_name, child,
-						  gfp);
+						  &hash, gfp);
 	} else if (fid_mode) {
-		event = fanotify_alloc_fid_event(id, fsid, gfp);
+		event = fanotify_alloc_fid_event(id, fsid, &hash, gfp);
 	} else {
-		event = fanotify_alloc_path_event(path, gfp);
+		event = fanotify_alloc_path_event(path, &hash, gfp);
 	}
 
 	if (!event)
 		goto out;
 
-	/*
-	 * Use the victim inode instead of the watching inode as the id for
-	 * event queue, so event reported on parent is merged with event
-	 * reported on child when both directory and child watches exist.
-	 * Hash object id for queue merge.
-	 */
-	hash = hash_ptr(id, FANOTIFY_EVENT_HASH_BITS);
-	fanotify_init_event(event, hash, mask);
 	if (FAN_GROUP_FLAG(group, FAN_REPORT_TID))
-		event->pid = get_pid(task_pid(current));
+		pid = get_pid(task_pid(current));
 	else
-		event->pid = get_pid(task_tgid(current));
+		pid = get_pid(task_tgid(current));
+
+	/* Mix event info, FAN_ONDIR flag and pid into event merge key */
+	hash ^= hash_long((unsigned long)pid | ondir, FANOTIFY_EVENT_HASH_BITS);
+	fanotify_init_event(event, hash, mask);
+	event->pid = pid;
 
 out:
 	set_active_memcg(old_memcg);
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index d531f0cfa46f..9871f76cd9c2 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -115,6 +115,11 @@ static inline void fanotify_info_init(struct fanotify_info *info)
 	info->name_len = 0;
 }
 
+static inline unsigned int fanotify_info_len(struct fanotify_info *info)
+{
+	return info->dir_fh_totlen + info->file_fh_totlen + info->name_len;
+}
+
 static inline void fanotify_info_copy_name(struct fanotify_info *info,
 					   const struct qstr *name)
 {
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 4/5] fsnotify: use hash table for faster events merge
  2021-03-04 10:48 [PATCH v2 0/5] Performance improvement for fanotify merge Amir Goldstein
                   ` (2 preceding siblings ...)
  2021-03-04 10:48 ` [PATCH v2 3/5] fanotify: mix event info and pid into merge key hash Amir Goldstein
@ 2021-03-04 10:48 ` Amir Goldstein
  2021-03-04 10:48 ` [PATCH v2 5/5] fanotify: limit number of event merge attempts Amir Goldstein
  2021-03-16 15:39 ` [PATCH v2 0/5] Performance improvement for fanotify merge Jan Kara
  5 siblings, 0 replies; 10+ messages in thread
From: Amir Goldstein @ 2021-03-04 10:48 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel

In order to improve event merge performance, hash events in a 128 size
hash table by the event merge key.

The fanotify_event size grows by two pointers, but we just reduced its
size by removing the objectid member, so overall its size is increased
by one pointer.

Permission events and overflow event are not merged so they are also
not hashed.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c        | 40 +++++++++++++++++++++++-----
 fs/notify/fanotify/fanotify.h        | 25 +++++++++++++++++
 fs/notify/fanotify/fanotify_user.c   | 39 +++++++++++++++++++++++++++
 fs/notify/inotify/inotify_fsnotify.c |  7 ++---
 fs/notify/notification.c             | 22 ++++++++++-----
 include/linux/fsnotify_backend.h     | 10 ++++---
 6 files changed, 123 insertions(+), 20 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index db30db61a815..1795facc5439 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -149,12 +149,15 @@ static bool fanotify_should_merge(struct fanotify_event *old,
 }
 
 /* and the list better be locked by something too! */
-static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
+static int fanotify_merge(struct fsnotify_group *group,
+			  struct fsnotify_event *event)
 {
-	struct fsnotify_event *test_event;
 	struct fanotify_event *old, *new = FANOTIFY_E(event);
+	unsigned int bucket = fanotify_event_hash_bucket(group, new);
+	struct hlist_head *hlist = &group->fanotify_data.merge_hash[bucket];
 
-	pr_debug("%s: list=%p event=%p\n", __func__, list, event);
+	pr_debug("%s: group=%p event=%p bucket=%u\n", __func__,
+		 group, event, bucket);
 
 	/*
 	 * Don't merge a permission event with any other event so that we know
@@ -164,8 +167,7 @@ static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
 	if (fanotify_is_perm_event(new->mask))
 		return 0;
 
-	list_for_each_entry_reverse(test_event, list, list) {
-		old = FANOTIFY_E(test_event);
+	hlist_for_each_entry(old, hlist, merge_list) {
 		if (fanotify_should_merge(old, new)) {
 			old->mask |= new->mask;
 			return 1;
@@ -203,8 +205,11 @@ static int fanotify_get_response(struct fsnotify_group *group,
 			return ret;
 		}
 		/* Event not yet reported? Just remove it. */
-		if (event->state == FAN_EVENT_INIT)
+		if (event->state == FAN_EVENT_INIT) {
 			fsnotify_remove_queued_event(group, &event->fae.fse);
+			/* Permission events are not supposed to be hashed */
+			WARN_ON_ONCE(!hlist_unhashed(&event->fae.merge_list));
+		}
 		/*
 		 * Event may be also answered in case signal delivery raced
 		 * with wakeup. In that case we have nothing to do besides
@@ -679,6 +684,24 @@ static __kernel_fsid_t fanotify_get_fsid(struct fsnotify_iter_info *iter_info)
 	return fsid;
 }
 
+/*
+ * Add an event to hash table for faster merge.
+ */
+static void fanotify_insert_event(struct fsnotify_group *group,
+				  struct fsnotify_event *fsn_event)
+{
+	struct fanotify_event *event = FANOTIFY_E(fsn_event);
+	unsigned int bucket = fanotify_event_hash_bucket(group, event);
+	struct hlist_head *hlist = &group->fanotify_data.merge_hash[bucket];
+
+	assert_spin_locked(&group->notification_lock);
+
+	pr_debug("%s: group=%p event=%p bucket=%u\n", __func__,
+		 group, event, bucket);
+
+	hlist_add_head(&event->merge_list, hlist);
+}
+
 static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
 				 const void *data, int data_type,
 				 struct inode *dir,
@@ -749,7 +772,9 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
 	}
 
 	fsn_event = &event->fse;
-	ret = fsnotify_add_event(group, fsn_event, fanotify_merge);
+	ret = fsnotify_add_event(group, fsn_event, fanotify_merge,
+				 fanotify_is_hashed_event(mask) ?
+				 fanotify_insert_event : NULL);
 	if (ret) {
 		/* Permission events shouldn't be merged */
 		BUG_ON(ret == 1 && mask & FANOTIFY_PERM_EVENTS);
@@ -772,6 +797,7 @@ static void fanotify_free_group_priv(struct fsnotify_group *group)
 {
 	struct user_struct *user;
 
+	kfree(group->fanotify_data.merge_hash);
 	user = group->fanotify_data.user;
 	atomic_dec(&user->fanotify_listeners);
 	free_uid(user);
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 9871f76cd9c2..4a5e555dc3d2 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -3,6 +3,7 @@
 #include <linux/path.h>
 #include <linux/slab.h>
 #include <linux/exportfs.h>
+#include <linux/hashtable.h>
 
 extern struct kmem_cache *fanotify_mark_cache;
 extern struct kmem_cache *fanotify_fid_event_cachep;
@@ -150,6 +151,7 @@ enum fanotify_event_type {
 
 struct fanotify_event {
 	struct fsnotify_event fse;
+	struct hlist_node merge_list;	/* List for hashed merge */
 	u32 mask;
 	struct {
 		unsigned int type : FANOTIFY_EVENT_TYPE_BITS;
@@ -162,6 +164,7 @@ static inline void fanotify_init_event(struct fanotify_event *event,
 				       unsigned int hash, u32 mask)
 {
 	fsnotify_init_event(&event->fse);
+	INIT_HLIST_NODE(&event->merge_list);
 	event->hash = hash;
 	event->mask = mask;
 	event->pid = NULL;
@@ -299,3 +302,25 @@ static inline struct path *fanotify_event_path(struct fanotify_event *event)
 	else
 		return NULL;
 }
+
+/*
+ * Use 128 size hash table to speed up events merge.
+ */
+#define FANOTIFY_HTABLE_BITS	(7)
+#define FANOTIFY_HTABLE_SIZE	(1 << FANOTIFY_HTABLE_BITS)
+#define FANOTIFY_HTABLE_MASK	(FANOTIFY_HTABLE_SIZE - 1)
+
+/*
+ * Permission events and overflow event do not get merged - don't hash them.
+ */
+static inline bool fanotify_is_hashed_event(u32 mask)
+{
+	return !fanotify_is_perm_event(mask) && !(mask & FS_Q_OVERFLOW);
+}
+
+static inline unsigned int fanotify_event_hash_bucket(
+						struct fsnotify_group *group,
+						struct fanotify_event *event)
+{
+	return event->hash & FANOTIFY_HTABLE_MASK;
+}
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 16162207e886..b89f332248bd 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -89,6 +89,23 @@ static int fanotify_event_info_len(unsigned int fid_mode,
 	return info_len;
 }
 
+/*
+ * Remove an hashed event from merge hash table.
+ */
+static void fanotify_unhash_event(struct fsnotify_group *group,
+				  struct fanotify_event *event)
+{
+	assert_spin_locked(&group->notification_lock);
+
+	pr_debug("%s: group=%p event=%p bucket=%u\n", __func__,
+		 group, event, fanotify_event_hash_bucket(group, event));
+
+	if (WARN_ON_ONCE(hlist_unhashed(&event->merge_list)))
+		return;
+
+	hlist_del_init(&event->merge_list);
+}
+
 /*
  * Get an fanotify notification event if one exists and is small
  * enough to fit in "count". Return an error pointer if the count
@@ -126,6 +143,8 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,
 	fsnotify_remove_first_event(group);
 	if (fanotify_is_perm_event(event->mask))
 		FANOTIFY_PERM(event)->state = FAN_EVENT_REPORTED;
+	if (fanotify_is_hashed_event(event->mask))
+		fanotify_unhash_event(group, event);
 out:
 	spin_unlock(&group->notification_lock);
 	return event;
@@ -925,6 +944,20 @@ static struct fsnotify_event *fanotify_alloc_overflow_event(void)
 	return &oevent->fse;
 }
 
+static struct hlist_head *fanotify_alloc_merge_hash(void)
+{
+	struct hlist_head *hash;
+
+	hash = kmalloc(sizeof(struct hlist_head) << FANOTIFY_HTABLE_BITS,
+		       GFP_KERNEL_ACCOUNT);
+	if (!hash)
+		return NULL;
+
+	__hash_init(hash, FANOTIFY_HTABLE_SIZE);
+
+	return hash;
+}
+
 /* fanotify syscalls */
 SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 {
@@ -993,6 +1026,12 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	atomic_inc(&user->fanotify_listeners);
 	group->memcg = get_mem_cgroup_from_mm(current->mm);
 
+	group->fanotify_data.merge_hash = fanotify_alloc_merge_hash();
+	if (!group->fanotify_data.merge_hash) {
+		fd = -ENOMEM;
+		goto out_destroy_group;
+	}
+
 	group->overflow_event = fanotify_alloc_overflow_event();
 	if (unlikely(!group->overflow_event)) {
 		fd = -ENOMEM;
diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index 0533bacbd584..d1a64daa0171 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -46,9 +46,10 @@ static bool event_compare(struct fsnotify_event *old_fsn,
 	return false;
 }
 
-static int inotify_merge(struct list_head *list,
-			  struct fsnotify_event *event)
+static int inotify_merge(struct fsnotify_group *group,
+			 struct fsnotify_event *event)
 {
+	struct list_head *list = &group->notification_list;
 	struct fsnotify_event *last_event;
 
 	last_event = list_entry(list->prev, struct fsnotify_event, list);
@@ -115,7 +116,7 @@ int inotify_handle_inode_event(struct fsnotify_mark *inode_mark, u32 mask,
 	if (len)
 		strcpy(event->name, name->name);
 
-	ret = fsnotify_add_event(group, fsn_event, inotify_merge);
+	ret = fsnotify_add_event(group, fsn_event, inotify_merge, NULL);
 	if (ret) {
 		/* Our event wasn't used in the end. Free it. */
 		fsnotify_destroy_event(group, fsn_event);
diff --git a/fs/notify/notification.c b/fs/notify/notification.c
index 001cfe7d2e4e..0582c7209bc0 100644
--- a/fs/notify/notification.c
+++ b/fs/notify/notification.c
@@ -68,16 +68,22 @@ void fsnotify_destroy_event(struct fsnotify_group *group,
 }
 
 /*
- * Add an event to the group notification queue.  The group can later pull this
- * event off the queue to deal with.  The function returns 0 if the event was
- * added to the queue, 1 if the event was merged with some other queued event,
+ * Try to Add an event to the notification queue.
+ * The group can later pull this event off the queue to deal with.
+ * The group can use the @merge hook to merge the event with a queued event.
+ * The group can use the @insert hook to insert the event into hash table.
+ * The function returns:
+ * 0 if the event was added to a queue
+ * 1 if the event was merged with some other queued event
  * 2 if the event was not queued - either the queue of events has overflown
- * or the group is shutting down.
+ *   or the group is shutting down.
  */
 int fsnotify_add_event(struct fsnotify_group *group,
 		       struct fsnotify_event *event,
-		       int (*merge)(struct list_head *,
-				    struct fsnotify_event *))
+		       int (*merge)(struct fsnotify_group *,
+				    struct fsnotify_event *),
+		       void (*insert)(struct fsnotify_group *,
+				      struct fsnotify_event *))
 {
 	int ret = 0;
 	struct list_head *list = &group->notification_list;
@@ -104,7 +110,7 @@ int fsnotify_add_event(struct fsnotify_group *group,
 	}
 
 	if (!list_empty(list) && merge) {
-		ret = merge(list, event);
+		ret = merge(group, event);
 		if (ret) {
 			spin_unlock(&group->notification_lock);
 			return ret;
@@ -114,6 +120,8 @@ int fsnotify_add_event(struct fsnotify_group *group,
 queue:
 	group->q_len++;
 	list_add_tail(&event->list, list);
+	if (insert)
+		insert(group, event);
 	spin_unlock(&group->notification_lock);
 
 	wake_up(&group->notification_waitq);
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index fc98f9f88d12..63fb766f0f3e 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -233,6 +233,8 @@ struct fsnotify_group {
 #endif
 #ifdef CONFIG_FANOTIFY
 		struct fanotify_group_private_data {
+			/* Hash table of events for merge */
+			struct hlist_head *merge_hash;
 			/* allows a group to block waiting for a userspace response */
 			struct list_head access_list;
 			wait_queue_head_t access_waitq;
@@ -486,12 +488,14 @@ extern void fsnotify_destroy_event(struct fsnotify_group *group,
 /* attach the event to the group notification queue */
 extern int fsnotify_add_event(struct fsnotify_group *group,
 			      struct fsnotify_event *event,
-			      int (*merge)(struct list_head *,
-					   struct fsnotify_event *));
+			      int (*merge)(struct fsnotify_group *,
+					   struct fsnotify_event *),
+			      void (*insert)(struct fsnotify_group *,
+					     struct fsnotify_event *));
 /* Queue overflow event to a notification group */
 static inline void fsnotify_queue_overflow(struct fsnotify_group *group)
 {
-	fsnotify_add_event(group, group->overflow_event, NULL);
+	fsnotify_add_event(group, group->overflow_event, NULL, NULL);
 }
 
 static inline bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group)
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 5/5] fanotify: limit number of event merge attempts
  2021-03-04 10:48 [PATCH v2 0/5] Performance improvement for fanotify merge Amir Goldstein
                   ` (3 preceding siblings ...)
  2021-03-04 10:48 ` [PATCH v2 4/5] fsnotify: use hash table for faster events merge Amir Goldstein
@ 2021-03-04 10:48 ` Amir Goldstein
  2021-03-16 15:39 ` [PATCH v2 0/5] Performance improvement for fanotify merge Jan Kara
  5 siblings, 0 replies; 10+ messages in thread
From: Amir Goldstein @ 2021-03-04 10:48 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel

Event merges are expensive when event queue size is large, so limit the
linear search to 128 merge tests.

In combination with 128 size hash table, there is a potential to merge
with up to 16K events in the hashed queue.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 1795facc5439..e9384de29f6c 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -148,6 +148,9 @@ static bool fanotify_should_merge(struct fanotify_event *old,
 	return false;
 }
 
+/* Limit event merges to limit CPU overhead per event */
+#define FANOTIFY_MAX_MERGE_EVENTS 128
+
 /* and the list better be locked by something too! */
 static int fanotify_merge(struct fsnotify_group *group,
 			  struct fsnotify_event *event)
@@ -155,6 +158,7 @@ static int fanotify_merge(struct fsnotify_group *group,
 	struct fanotify_event *old, *new = FANOTIFY_E(event);
 	unsigned int bucket = fanotify_event_hash_bucket(group, new);
 	struct hlist_head *hlist = &group->fanotify_data.merge_hash[bucket];
+	int i = 0;
 
 	pr_debug("%s: group=%p event=%p bucket=%u\n", __func__,
 		 group, event, bucket);
@@ -168,6 +172,8 @@ static int fanotify_merge(struct fsnotify_group *group,
 		return 0;
 
 	hlist_for_each_entry(old, hlist, merge_list) {
+		if (++i > FANOTIFY_MAX_MERGE_EVENTS)
+			break;
 		if (fanotify_should_merge(old, new)) {
 			old->mask |= new->mask;
 			return 1;
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/5] fanotify: mix event info and pid into merge key hash
  2021-03-04 10:48 ` [PATCH v2 3/5] fanotify: mix event info and pid into merge key hash Amir Goldstein
@ 2021-03-16 15:18   ` Jan Kara
  2021-03-17  9:26     ` Amir Goldstein
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kara @ 2021-03-16 15:18 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, linux-fsdevel

On Thu 04-03-21 12:48:24, Amir Goldstein wrote:
> Improve the merge key hash by mixing more values relevant for merge.
> 
> For example, all FAN_CREATE name events in the same dir used to have the
> same merge key based on the dir inode.  With this change the created
> file name is mixed into the merge key.
> 
> The object id that was used as merge key is redundant to the event info
> so it is no longer mixed into the hash.
> 
> Permission events are not hashed, so no need to hash their info.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

...

> @@ -530,6 +568,8 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
>  	struct inode *child = NULL;
>  	bool name_event = false;
>  	unsigned int hash = 0;
> +	unsigned long ondir = (mask & FAN_ONDIR) ? 1UL : 0;
> +	struct pid *pid;

I've made a tiny change here and changed 'ondir' to bool since I don't see
a strong reason to play games like this. Otherwise I took the patch as is.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 0/5] Performance improvement for fanotify merge
  2021-03-04 10:48 [PATCH v2 0/5] Performance improvement for fanotify merge Amir Goldstein
                   ` (4 preceding siblings ...)
  2021-03-04 10:48 ` [PATCH v2 5/5] fanotify: limit number of event merge attempts Amir Goldstein
@ 2021-03-16 15:39 ` Jan Kara
  5 siblings, 0 replies; 10+ messages in thread
From: Jan Kara @ 2021-03-16 15:39 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, linux-fsdevel

Hi Amir!

On Thu 04-03-21 12:48:21, Amir Goldstein wrote:
> Jan,
> 
> Following is v2 for the fanotify_merge() performance improvements.
> 
> For more details on functional and performance tests please refer to
> v1 cover letter [1].
> 
> This version is much simpler than v1 using standard hlist.
> It was rebased and tested against 5.12-rc1 using LTP tests [2].

Thanks for the patches. I've just changed that one small thing, otherwise
the patches look fine so I've merged them to my tree.

								Honza

> 
> Thanks,
> Amir.
> 
> Chanes since v1:
> - Use hlist instead of multi notification lists
> - Handling all hashing within fanotify backend
> - Cram event key member together with event type
> - Remove ifdefs and use constant queue hash bits
> - Address other review comments on v1
> 
> [1] https://lore.kernel.org/linux-fsdevel/20210202162010.305971-1-amir73il@gmail.com/
> [2] https://github.com/amir73il/ltp/commits/fanotify_merge
> 
> Amir Goldstein (5):
>   fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue
>   fanotify: reduce event objectid to 29-bit hash
>   fanotify: mix event info and pid into merge key hash
>   fsnotify: use hash table for faster events merge
>   fanotify: limit number of event merge attempts
> 
>  fs/notify/fanotify/fanotify.c        | 150 +++++++++++++++++++--------
>  fs/notify/fanotify/fanotify.h        |  46 +++++++-
>  fs/notify/fanotify/fanotify_user.c   |  65 ++++++++++--
>  fs/notify/inotify/inotify_fsnotify.c |   9 +-
>  fs/notify/inotify/inotify_user.c     |   7 +-
>  fs/notify/notification.c             |  64 ++++++------
>  include/linux/fsnotify_backend.h     |  23 ++--
>  7 files changed, 263 insertions(+), 101 deletions(-)
> 
> -- 
> 2.30.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/5] fanotify: mix event info and pid into merge key hash
  2021-03-16 15:18   ` Jan Kara
@ 2021-03-17  9:26     ` Amir Goldstein
  2021-03-17 10:17       ` Jan Kara
  0 siblings, 1 reply; 10+ messages in thread
From: Amir Goldstein @ 2021-03-17  9:26 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel

On Tue, Mar 16, 2021 at 5:18 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 04-03-21 12:48:24, Amir Goldstein wrote:
> > Improve the merge key hash by mixing more values relevant for merge.
> >
> > For example, all FAN_CREATE name events in the same dir used to have the
> > same merge key based on the dir inode.  With this change the created
> > file name is mixed into the merge key.
> >
> > The object id that was used as merge key is redundant to the event info
> > so it is no longer mixed into the hash.
> >
> > Permission events are not hashed, so no need to hash their info.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> ...
>
> > @@ -530,6 +568,8 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
> >       struct inode *child = NULL;
> >       bool name_event = false;
> >       unsigned int hash = 0;
> > +     unsigned long ondir = (mask & FAN_ONDIR) ? 1UL : 0;
> > +     struct pid *pid;
>
> I've made a tiny change here and changed 'ondir' to bool since I don't see
> a strong reason to play games like this. Otherwise I took the patch as is.
>

OK, so you kept this arithmetics with a bool:

(unsigned long)pid | ondir

I suppose there's no harm.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/5] fanotify: mix event info and pid into merge key hash
  2021-03-17  9:26     ` Amir Goldstein
@ 2021-03-17 10:17       ` Jan Kara
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Kara @ 2021-03-17 10:17 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, linux-fsdevel

On Wed 17-03-21 11:26:36, Amir Goldstein wrote:
> On Tue, Mar 16, 2021 at 5:18 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Thu 04-03-21 12:48:24, Amir Goldstein wrote:
> > > Improve the merge key hash by mixing more values relevant for merge.
> > >
> > > For example, all FAN_CREATE name events in the same dir used to have the
> > > same merge key based on the dir inode.  With this change the created
> > > file name is mixed into the merge key.
> > >
> > > The object id that was used as merge key is redundant to the event info
> > > so it is no longer mixed into the hash.
> > >
> > > Permission events are not hashed, so no need to hash their info.
> > >
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> >
> > ...
> >
> > > @@ -530,6 +568,8 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
> > >       struct inode *child = NULL;
> > >       bool name_event = false;
> > >       unsigned int hash = 0;
> > > +     unsigned long ondir = (mask & FAN_ONDIR) ? 1UL : 0;
> > > +     struct pid *pid;
> >
> > I've made a tiny change here and changed 'ondir' to bool since I don't see
> > a strong reason to play games like this. Otherwise I took the patch as is.
> >
> 
> OK, so you kept this arithmetics with a bool:
> 
> (unsigned long)pid | ondir
> 
> I suppose there's no harm.

Yes. Bool is guaranteed to be typed to int (and then subsequently lifted to
unsigned long) in this expression with 'false' translated to 0 and 'true'
translated to 1. So everything works as expected.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-03-17 10:18 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-04 10:48 [PATCH v2 0/5] Performance improvement for fanotify merge Amir Goldstein
2021-03-04 10:48 ` [PATCH v2 1/5] fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue Amir Goldstein
2021-03-04 10:48 ` [PATCH v2 2/5] fanotify: reduce event objectid to 29-bit hash Amir Goldstein
2021-03-04 10:48 ` [PATCH v2 3/5] fanotify: mix event info and pid into merge key hash Amir Goldstein
2021-03-16 15:18   ` Jan Kara
2021-03-17  9:26     ` Amir Goldstein
2021-03-17 10:17       ` Jan Kara
2021-03-04 10:48 ` [PATCH v2 4/5] fsnotify: use hash table for faster events merge Amir Goldstein
2021-03-04 10:48 ` [PATCH v2 5/5] fanotify: limit number of event merge attempts Amir Goldstein
2021-03-16 15:39 ` [PATCH v2 0/5] Performance improvement for fanotify merge Jan Kara

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.