* [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events
@ 2017-04-03 15:33 Jan Kara
  2017-04-03 15:33 ` [PATCH 01/35] fsnotify: Remove unnecessary tests when showing fdinfo Jan Kara
                   ` (34 more replies)
  0 siblings, 35 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Hello,

This is the seventh revision of my patches to avoid SRCU stalls when fanotify
waits for response to permission events from userspace processes. Thanks to
Amir, Paul, and Miklos for review! It also passes a new LTP test that tries to
provoke hangs in fanotify subsystem when there are unanswered fanotify
permission events. If nobody has more objections, I'll push the changes to my
tree to queue them for the next merge window.

Changes since v6:
* Added Reviewed-by tags from Miklos
* Improved couple of comments suggested by Miklos
* Fixed possible NULL pointer dereference in audit_tree
* Cleaned up some patches based on Miklos' feedback

Changes since v5:
* Added Reviewed-by tags from Amir
* Fixed up __rcu annotation
* Fixed minor issues spotted by 0-day in the middle of the series
* Added fsnotify_attach_connector_to_object()
* Removed igrab()/iput() from fsnotify_recalc_mask()

Changes since v4:
* Further split up of patches as requested by Miklos
* Moved some hunks between patches to make things more logical
* Couple of smaller improvements suggested by Miklos
* Rebased on top of 4.11-rc2

Changes since v3:
* added Reviewed-by tags
* split adding of fsnotify_mark_connector into 4 smaller parts as Miklos asked
* simplified API of fsnotify_prepare/finish_user_wait()

Changes since v2:
* added Reviewed-by tags
* dropped fsnotify_put_list() abstraction
* use rcu_assign_pointer() where appropriate

Changes since v1:
* renamed fsnotify_mark_list to fsnotify_mark_connector and couple other
  things
* updated some comments and changelogs to better explain what is going on
* made audit use inode pointer as a key again
* added Reviewed-by tags
* dropped two audit fixes that got already merged
* added cleanup of mark destruction functions

Patch set overview
------------------

Currently, fanotify waits for a response to a permission event from a
userspace process while holding the fsnotify_mark_srcu lock. As a consequence,
when the userspace process takes long to respond or does not respond at all,
the fsnotify_mark_srcu period can never complete, which blocks reclaim of any
notification marks and also blocks any process that called synchronize_srcu()
on fsnotify_mark_srcu. Effectively, this eventually blocks anybody interacting
with the notification subsystem. Miklos has some real world reports of this
happening. Although this is in principle a problem of a broken userspace
application (which furthermore has to have CAP_SYS_ADMIN in init_user_ns, so
it is not a security problem), it is still nasty that a simple error can
block the kernel like this.

This patch set solves this problem. The basic idea of the solution is that
when fanotify needs to wait for response from userspace process, it grabs
reference to the mark which generated the event and drops fsnotify_mark_srcu
lock. When userspace responds, we grab fsnotify_mark_srcu again, drop
the mark reference, and continue iterating the list of marks attached to the
inode / vfsmount delivering the event to other notification groups. What
complicates this simple approach is that the mark for which we wait for
response has to stay pinned in the list of marks attached to the inode /
vfsmount so that we can resume iteration of the list when userspace responds
but on the other hand when the inode gets unlinked while we wait for userspace
response, we need to destroy the mark (or at least detach it from the inode).
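
The pin-and-drop idea described above can be modelled in plain userspace C.
This is only a toy sketch, not the code from the series: every toy_* name is
made up for illustration, a simple counter stands in for the
fsnotify_mark_srcu read-side section, and the "wait" is just a callback. It
shows the ordering only: take a mark reference, leave the read-side section,
wait, re-enter the read-side section, drop the reference.

```c
#include <assert.h>

/* Toy stand-ins for illustration; none of these are kernel APIs. */
struct toy_mark {
	int refcnt;	/* references pinning the mark in memory */
};

/* Models read-side nesting of fsnotify_mark_srcu. */
static int toy_srcu_readers;

/* Records how many readers were active while "waiting" for userspace. */
static int toy_readers_seen_during_wait = -1;

static void toy_srcu_read_lock(void)
{
	toy_srcu_readers++;
}

static void toy_srcu_read_unlock(void)
{
	assert(toy_srcu_readers > 0);
	toy_srcu_readers--;
}

/* Pin the mark, then leave the read-side section before sleeping. */
static void toy_prepare_user_wait(struct toy_mark *mark)
{
	mark->refcnt++;
	toy_srcu_read_unlock();
}

/* Re-enter the read-side section, then drop the pin. */
static void toy_finish_user_wait(struct toy_mark *mark)
{
	toy_srcu_read_lock();
	mark->refcnt--;
}

/* Hypothetical reply callback: notes the reader count and allows the event. */
static int toy_allow_reply(void)
{
	toy_readers_seen_during_wait = toy_srcu_readers;
	return 0;
}

/* Deliver one permission event; wait_for_reply models blocking on userspace. */
static int toy_handle_perm_event(struct toy_mark *mark,
				 int (*wait_for_reply)(void))
{
	int ret;

	toy_srcu_read_lock();
	toy_prepare_user_wait(mark);
	ret = wait_for_reply();		/* no read-side section held here */
	toy_finish_user_wait(mark);
	toy_srcu_read_unlock();
	return ret;
}
```

Calling toy_handle_perm_event(&mark, toy_allow_reply) leaves the mark's
reference count and the reader count as they were before the call, and the
callback runs with no reader active. The model deliberately ignores the hard
part named above, namely keeping the mark on the object's list while the
object itself may go away.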

The first 5 patches contain some initial fixes and cleanups. Patches 6-17
implement attaching of marks to inode / vfsmount via a dedicated structure
which allows us to detach list of marks from the object without having to
destroy the list itself. Patches 18-20 implement removal of mark from the
list of marks attached to an object when last mark reference is dropped.
Patches 21-24 then implement dropping of SRCU lock when waiting on response
from userspace. Patches 25-33 are mostly trivial cleanups that get rid of
trivial wrappers and one pointer in the mark structure.
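
To make the "dedicated structure" idea concrete, here is a small userspace
sketch of the indirection. It is an illustration under stated assumptions,
not the kernel implementation: the toy_* types and functions are
hypothetical, the list is a bare singly linked list, there is no locking, and
the caller supplies the connector storage instead of it being allocated. The
object only holds a pointer to the connector, so detaching clears two
pointers and leaves the list of marks intact.

```c
#include <assert.h>
#include <stddef.h>

/* Toy types for illustration only; not the kernel structures. */
struct toy_mark {
	struct toy_mark *next;
};

struct toy_connector {
	struct toy_mark *first;	/* head of the list of marks */
	void *obj;		/* owning object, NULL once detached */
};

struct toy_inode {
	struct toy_connector *marks;	/* NULL while no marks are attached */
};

/*
 * Add a mark to the inode. 'conn' is caller-provided storage that is used
 * only when the inode has no connector yet (an assumption of this sketch).
 */
static void toy_attach(struct toy_inode *inode, struct toy_connector *conn,
		       struct toy_mark *mark)
{
	assert(mark != NULL);
	if (!inode->marks) {
		conn->first = NULL;
		conn->obj = inode;
		inode->marks = conn;
	}
	mark->next = inode->marks->first;
	inode->marks->first = mark;
}

/* Cut the link between inode and connector; the list itself survives. */
static struct toy_connector *toy_detach(struct toy_inode *inode)
{
	struct toy_connector *conn = inode->marks;

	if (conn) {
		conn->obj = NULL;
		inode->marks = NULL;
	}
	return conn;
}
```

After toy_detach() the inode no longer references any marks and could be
freed, while anybody still walking the list through the connector sees the
same marks as before.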

Patches have survived testing with inotify/fanotify tests in LTP.

Finally, to ease experimenting with the patches I've pushed them out to
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify

								Honza

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 01/35] fsnotify: Remove unnecessary tests when showing fdinfo
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
@ 2017-04-03 15:33 ` Jan Kara
  2017-04-03 15:33 ` [PATCH 02/35] inotify: Remove inode pointers from debug messages Jan Kara
                   ` (33 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

show_fdinfo() iterates group's list of marks. All marks found there are
guaranteed to be alive and they stay so until we release
group->mark_mutex. So remove unnecessary tests whether mark is alive.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/fdinfo.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
index fd98e5100cab..601a59c8d87e 100644
--- a/fs/notify/fdinfo.c
+++ b/fs/notify/fdinfo.c
@@ -76,8 +76,7 @@ static void inotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark)
 	struct inotify_inode_mark *inode_mark;
 	struct inode *inode;
 
-	if (!(mark->flags & FSNOTIFY_MARK_FLAG_ALIVE) ||
-	    !(mark->flags & FSNOTIFY_MARK_FLAG_INODE))
+	if (!(mark->flags & FSNOTIFY_MARK_FLAG_INODE))
 		return;
 
 	inode_mark = container_of(mark, struct inotify_inode_mark, fsn_mark);
@@ -113,9 +112,6 @@ static void fanotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark)
 	unsigned int mflags = 0;
 	struct inode *inode;
 
-	if (!(mark->flags & FSNOTIFY_MARK_FLAG_ALIVE))
-		return;
-
 	if (mark->flags & FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY)
 		mflags |= FAN_MARK_IGNORED_SURV_MODIFY;
 
-- 
2.10.2


* [PATCH 02/35] inotify: Remove inode pointers from debug messages
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
  2017-04-03 15:33 ` [PATCH 01/35] fsnotify: Remove unnecessary tests when showing fdinfo Jan Kara
@ 2017-04-03 15:33 ` Jan Kara
  2017-04-03 15:33 ` [PATCH 03/35] fanotify: Move recalculation of inode / vfsmount mask under mark_mutex Jan Kara
                   ` (32 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Printing inode pointers in warnings has dubious value, and with future
changes we won't be able to get them easily without either taking locks
or risking an oops along the way. So just remove inode pointers from the
warning messages.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/inotify/inotify_fsnotify.c |  4 ++--
 fs/notify/inotify/inotify_user.c     | 25 ++++++++++---------------
 2 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index 1aeb837ae414..f310d8368a2d 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -156,8 +156,8 @@ static int idr_callback(int id, void *p, void *data)
 	 * BUG() that was here.
 	 */
 	if (fsn_mark)
-		printk(KERN_WARNING "fsn_mark->group=%p inode=%p wd=%d\n",
-			fsn_mark->group, fsn_mark->inode, i_mark->wd);
+		printk(KERN_WARNING "fsn_mark->group=%p wd=%d\n",
+			fsn_mark->group, i_mark->wd);
 	return 0;
 }
 
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index 498d609b26c7..b82a507a5367 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -429,18 +429,16 @@ static void inotify_remove_from_idr(struct fsnotify_group *group,
 	 * if it wasn't....
 	 */
 	if (wd == -1) {
-		WARN_ONCE(1, "%s: i_mark=%p i_mark->wd=%d i_mark->group=%p"
-			" i_mark->inode=%p\n", __func__, i_mark, i_mark->wd,
-			i_mark->fsn_mark.group, i_mark->fsn_mark.inode);
+		WARN_ONCE(1, "%s: i_mark=%p i_mark->wd=%d i_mark->group=%p\n",
+			__func__, i_mark, i_mark->wd, i_mark->fsn_mark.group);
 		goto out;
 	}
 
 	/* Lets look in the idr to see if we find it */
 	found_i_mark = inotify_idr_find_locked(group, wd);
 	if (unlikely(!found_i_mark)) {
-		WARN_ONCE(1, "%s: i_mark=%p i_mark->wd=%d i_mark->group=%p"
-			" i_mark->inode=%p\n", __func__, i_mark, i_mark->wd,
-			i_mark->fsn_mark.group, i_mark->fsn_mark.inode);
+		WARN_ONCE(1, "%s: i_mark=%p i_mark->wd=%d i_mark->group=%p\n",
+			__func__, i_mark, i_mark->wd, i_mark->fsn_mark.group);
 		goto out;
 	}
 
@@ -451,12 +449,10 @@ static void inotify_remove_from_idr(struct fsnotify_group *group,
 	 */
 	if (unlikely(found_i_mark != i_mark)) {
 		WARN_ONCE(1, "%s: i_mark=%p i_mark->wd=%d i_mark->group=%p "
-			"mark->inode=%p found_i_mark=%p found_i_mark->wd=%d "
-			"found_i_mark->group=%p found_i_mark->inode=%p\n",
-			__func__, i_mark, i_mark->wd, i_mark->fsn_mark.group,
-			i_mark->fsn_mark.inode, found_i_mark, found_i_mark->wd,
-			found_i_mark->fsn_mark.group,
-			found_i_mark->fsn_mark.inode);
+			"found_i_mark=%p found_i_mark->wd=%d "
+			"found_i_mark->group=%p\n", __func__, i_mark,
+			i_mark->wd, i_mark->fsn_mark.group, found_i_mark,
+			found_i_mark->wd, found_i_mark->fsn_mark.group);
 		goto out;
 	}
 
@@ -466,9 +462,8 @@ static void inotify_remove_from_idr(struct fsnotify_group *group,
 	 * one ref grabbed by inotify_idr_find
 	 */
 	if (unlikely(atomic_read(&i_mark->fsn_mark.refcnt) < 3)) {
-		printk(KERN_ERR "%s: i_mark=%p i_mark->wd=%d i_mark->group=%p"
-			" i_mark->inode=%p\n", __func__, i_mark, i_mark->wd,
-			i_mark->fsn_mark.group, i_mark->fsn_mark.inode);
+		printk(KERN_ERR "%s: i_mark=%p i_mark->wd=%d i_mark->group=%p\n",
+			 __func__, i_mark, i_mark->wd, i_mark->fsn_mark.group);
 		/* we can't really recover with bad ref cnting.. */
 		BUG();
 	}
-- 
2.10.2


* [PATCH 03/35] fanotify: Move recalculation of inode / vfsmount mask under mark_mutex
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
  2017-04-03 15:33 ` [PATCH 01/35] fsnotify: Remove unnecessary tests when showing fdinfo Jan Kara
  2017-04-03 15:33 ` [PATCH 02/35] inotify: Remove inode pointers from debug messages Jan Kara
@ 2017-04-03 15:33 ` Jan Kara
  2017-04-03 15:33 ` [PATCH 04/35] audit: Abstract hash key handling Jan Kara
                   ` (31 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Move recalculation of inode / vfsmount notification mask under
group->mark_mutex of the mark which was modified. These are the only
places where mask recalculation happens without the mark being protected
from detaching from inode / vfsmount, which will cause issues with the
following patches.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/fanotify/fanotify_user.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 2b37f2785834..c5e69870287f 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -542,6 +542,8 @@ static int fanotify_remove_vfsmount_mark(struct fsnotify_group *group,
 
 	removed = fanotify_mark_remove_from_mask(fsn_mark, mask, flags,
 						 &destroy_mark);
+	if (removed & real_mount(mnt)->mnt_fsnotify_mask)
+		fsnotify_recalc_vfsmount_mask(mnt);
 	if (destroy_mark)
 		fsnotify_detach_mark(fsn_mark);
 	mutex_unlock(&group->mark_mutex);
@@ -549,9 +551,6 @@ static int fanotify_remove_vfsmount_mark(struct fsnotify_group *group,
 		fsnotify_free_mark(fsn_mark);
 
 	fsnotify_put_mark(fsn_mark);
-	if (removed & real_mount(mnt)->mnt_fsnotify_mask)
-		fsnotify_recalc_vfsmount_mask(mnt);
-
 	return 0;
 }
 
@@ -572,6 +571,8 @@ static int fanotify_remove_inode_mark(struct fsnotify_group *group,
 
 	removed = fanotify_mark_remove_from_mask(fsn_mark, mask, flags,
 						 &destroy_mark);
+	if (removed & inode->i_fsnotify_mask)
+		fsnotify_recalc_inode_mask(inode);
 	if (destroy_mark)
 		fsnotify_detach_mark(fsn_mark);
 	mutex_unlock(&group->mark_mutex);
@@ -580,8 +581,6 @@ static int fanotify_remove_inode_mark(struct fsnotify_group *group,
 
 	/* matches the fsnotify_find_inode_mark() */
 	fsnotify_put_mark(fsn_mark);
-	if (removed & inode->i_fsnotify_mask)
-		fsnotify_recalc_inode_mask(inode);
 
 	return 0;
 }
@@ -657,10 +656,9 @@ static int fanotify_add_vfsmount_mark(struct fsnotify_group *group,
 		}
 	}
 	added = fanotify_mark_add_to_mask(fsn_mark, mask, flags);
-	mutex_unlock(&group->mark_mutex);
-
 	if (added & ~real_mount(mnt)->mnt_fsnotify_mask)
 		fsnotify_recalc_vfsmount_mask(mnt);
+	mutex_unlock(&group->mark_mutex);
 
 	fsnotify_put_mark(fsn_mark);
 	return 0;
@@ -695,10 +693,9 @@ static int fanotify_add_inode_mark(struct fsnotify_group *group,
 		}
 	}
 	added = fanotify_mark_add_to_mask(fsn_mark, mask, flags);
-	mutex_unlock(&group->mark_mutex);
-
 	if (added & ~inode->i_fsnotify_mask)
 		fsnotify_recalc_inode_mask(inode);
+	mutex_unlock(&group->mark_mutex);
 
 	fsnotify_put_mark(fsn_mark);
 	return 0;
-- 
2.10.2


* [PATCH 04/35] audit: Abstract hash key handling
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (2 preceding siblings ...)
  2017-04-03 15:33 ` [PATCH 03/35] fanotify: Move recalculation of inode / vfsmount mask under mark_mutex Jan Kara
@ 2017-04-03 15:33 ` Jan Kara
  2017-04-04 20:38   ` Paul Moore
  2017-04-03 15:33 ` [PATCH 05/35] audit_tree: Use mark flags to check whether mark is alive Jan Kara
                   ` (30 subsequent siblings)
  34 siblings, 1 reply; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Audit tree currently uses inode pointer as a key into the hash table.
Getting that from notification mark will be somewhat more difficult with
coming fsnotify changes. So abstract getting of hash key from the audit
chunk and inode so that we can change the method to obtain a key easily.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
CC: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/audit_tree.c | 39 ++++++++++++++++++++++++++++-----------
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index 7b44195da81b..11c7ac441624 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -163,33 +163,48 @@ enum {HASH_SIZE = 128};
 static struct list_head chunk_hash_heads[HASH_SIZE];
 static __cacheline_aligned_in_smp DEFINE_SPINLOCK(hash_lock);
 
-static inline struct list_head *chunk_hash(const struct inode *inode)
+/* Function to return search key in our hash from inode. */
+static unsigned long inode_to_key(const struct inode *inode)
 {
-	unsigned long n = (unsigned long)inode / L1_CACHE_BYTES;
+	return (unsigned long)inode;
+}
+
+/*
+ * Function to return search key in our hash from chunk. Key 0 is special and
+ * should never be present in the hash.
+ */
+static unsigned long chunk_to_key(struct audit_chunk *chunk)
+{
+	return (unsigned long)chunk->mark.inode;
+}
+
+static inline struct list_head *chunk_hash(unsigned long key)
+{
+	unsigned long n = key / L1_CACHE_BYTES;
 	return chunk_hash_heads + n % HASH_SIZE;
 }
 
 /* hash_lock & entry->lock is held by caller */
 static void insert_hash(struct audit_chunk *chunk)
 {
-	struct fsnotify_mark *entry = &chunk->mark;
+	unsigned long key = chunk_to_key(chunk);
 	struct list_head *list;
 
-	if (!entry->inode)
+	if (!key)
 		return;
-	list = chunk_hash(entry->inode);
+	list = chunk_hash(key);
 	list_add_rcu(&chunk->hash, list);
 }
 
 /* called under rcu_read_lock */
 struct audit_chunk *audit_tree_lookup(const struct inode *inode)
 {
-	struct list_head *list = chunk_hash(inode);
+	unsigned long key = inode_to_key(inode);
+	struct list_head *list = chunk_hash(key);
 	struct audit_chunk *p;
 
 	list_for_each_entry_rcu(p, list, hash) {
-		/* mark.inode may have gone NULL, but who cares? */
-		if (p->mark.inode == inode) {
+		if (chunk_to_key(p) == key) {
 			atomic_long_inc(&p->refs);
 			return p;
 		}
@@ -588,7 +603,8 @@ int audit_remove_tree_rule(struct audit_krule *rule)
 
 static int compare_root(struct vfsmount *mnt, void *arg)
 {
-	return d_backing_inode(mnt->mnt_root) == arg;
+	return inode_to_key(d_backing_inode(mnt->mnt_root)) ==
+	       (unsigned long)arg;
 }
 
 void audit_trim_trees(void)
@@ -623,9 +639,10 @@ void audit_trim_trees(void)
 		list_for_each_entry(node, &tree->chunks, list) {
 			struct audit_chunk *chunk = find_chunk(node);
 			/* this could be NULL if the watch is dying else where... */
-			struct inode *inode = chunk->mark.inode;
 			node->index |= 1U<<31;
-			if (iterate_mounts(compare_root, inode, root_mnt))
+			if (iterate_mounts(compare_root,
+					   (void *)chunk_to_key(chunk),
+					   root_mnt))
 				node->index &= ~(1U<<31);
 		}
 		spin_unlock(&hash_lock);
-- 
2.10.2


* [PATCH 05/35] audit_tree: Use mark flags to check whether mark is alive
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (3 preceding siblings ...)
  2017-04-03 15:33 ` [PATCH 04/35] audit: Abstract hash key handling Jan Kara
@ 2017-04-03 15:33 ` Jan Kara
  2017-04-04 20:47   ` Paul Moore
  2017-04-03 15:33 ` [PATCH 06/35] fsnotify: Update comments Jan Kara
                   ` (29 subsequent siblings)
  34 siblings, 1 reply; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Currently audit code checks mark->inode to verify whether a mark is
still alive. Switch that to checking mark flags, as that is more logical
and the current way will become unreliable in the future.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/audit_tree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index 11c7ac441624..f12bd40fb8f1 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -248,7 +248,7 @@ static void untag_chunk(struct node *p)
 
 	mutex_lock(&entry->group->mark_mutex);
 	spin_lock(&entry->lock);
-	if (chunk->dead || !entry->inode) {
+	if (chunk->dead || !(entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
 		spin_unlock(&entry->lock);
 		mutex_unlock(&entry->group->mark_mutex);
 		if (new)
@@ -408,7 +408,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
 
 	mutex_lock(&old_entry->group->mark_mutex);
 	spin_lock(&old_entry->lock);
-	if (!old_entry->inode) {
+	if (!(old_entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
 		/* old_entry is being shot, lets just lie */
 		spin_unlock(&old_entry->lock);
 		mutex_unlock(&old_entry->group->mark_mutex);
-- 
2.10.2


* [PATCH 06/35] fsnotify: Update comments
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (4 preceding siblings ...)
  2017-04-03 15:33 ` [PATCH 05/35] audit_tree: Use mark flags to check whether mark is alive Jan Kara
@ 2017-04-03 15:33 ` Jan Kara
  2017-04-03 15:33 ` [PATCH 07/35] fsnotify: Move mark list head from object into dedicated structure Jan Kara
                   ` (28 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Add a comment that lifetime of a notification mark is protected by SRCU
and remove a comment about clearing of marks attached to the inode. It
is stale and a more up-to-date version is at fsnotify_destroy_marks(),
which is the function handling this case.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/mark.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 6043306e8e21..44836e539169 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -51,7 +51,7 @@
  *
  * LIFETIME:
  * Inode marks survive between when they are added to an inode and when their
- * refcnt==0.
+ * refcnt==0. Marks are also protected by fsnotify_mark_srcu.
  *
  * The inode mark can be cleared for a number of different reasons including:
  * - The inode is unlinked for the last time.  (fsnotify_inode_remove)
@@ -61,17 +61,6 @@
  * - The fsnotify_group associated with the mark is going away and all such marks
  *   need to be cleaned up. (fsnotify_clear_marks_by_group)
  *
- * Worst case we are given an inode and need to clean up all the marks on that
- * inode.  We take i_lock and walk the i_fsnotify_marks safely.  For each
- * mark on the list we take a reference (so the mark can't disappear under us).
- * We remove that mark form the inode's list of marks and we add this mark to a
- * private list anchored on the stack using i_free_list; we walk i_free_list
- * and before we destroy the mark we make sure that we dont race with a
- * concurrent destroy_group by getting a ref to the marks group and taking the
- * groups mutex.
-
- * Very similarly for freeing by group, except we use free_g_list.
- *
  * This has the very interesting property of being able to run concurrently with
  * any (or all) other directions.
  */
-- 
2.10.2


* [PATCH 07/35] fsnotify: Move mark list head from object into dedicated structure
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (5 preceding siblings ...)
  2017-04-03 15:33 ` [PATCH 06/35] fsnotify: Update comments Jan Kara
@ 2017-04-03 15:33 ` Jan Kara
  2017-04-03 15:33 ` [PATCH 08/35] fsnotify: Move object pointer to fsnotify_mark_connector Jan Kara
                   ` (27 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Currently notification marks are attached to object (inode or vfsmnt) by
a hlist_head in the object. The list is also protected by a spinlock in
the object. So while there is any mark attached to the list of marks,
the object must be pinned in memory (and thus e.g. last iput() deleting
inode cannot happen). Also for list iteration in fsnotify() to work, we
must hold fsnotify_mark_srcu lock so that mark itself and
mark->obj_list.next cannot get freed. Thus we are required to wait for
response to fanotify events from userspace process with
fsnotify_mark_srcu lock held. That causes issues when userspace process
is buggy and does not reply to some event - basically the whole
notification subsystem gets eventually stuck.

So to be able to drop fsnotify_mark_srcu lock while waiting for
response, we have to pin the mark in memory and make sure it stays in
the object list (as removing the mark waiting for response could lead to
lost notification events for groups later in the list). However we don't
want inode reclaim to block on such mark as that would lead to system
just locking up elsewhere.

This commit is the first in the series that paves the way towards solving
these conflicting lifetime needs. Instead of anchoring the list of marks
directly in the object, we anchor it in a dedicated structure
(fsnotify_mark_connector) and just point to that structure from the
object. The following commits will also add to the structure a spinlock
protecting the list and a pointer to the object.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/inode.c                       |  6 +--
 fs/mount.h                       |  2 +-
 fs/namespace.c                   |  6 +--
 fs/notify/fsnotify.c             | 32 +++++++++++-----
 fs/notify/fsnotify.h             | 16 ++++----
 fs/notify/inode_mark.c           |  8 ++--
 fs/notify/mark.c                 | 80 +++++++++++++++++++++++++++++++++-------
 fs/notify/vfsmount_mark.c        |  8 ++--
 include/linux/fs.h               |  4 +-
 include/linux/fsnotify_backend.h | 10 +++++
 kernel/auditsc.c                 |  7 +++-
 11 files changed, 132 insertions(+), 47 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 88110fd0b282..750e952d2918 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -234,6 +234,9 @@ void __destroy_inode(struct inode *inode)
 	inode_detach_wb(inode);
 	security_inode_free(inode);
 	fsnotify_inode_delete(inode);
+#ifdef CONFIG_FSNOTIFY
+	fsnotify_connector_free(&inode->i_fsnotify_marks);
+#endif
 	locks_free_lock_context(inode);
 	if (!inode->i_nlink) {
 		WARN_ON(atomic_long_read(&inode->i_sb->s_remove_count) == 0);
@@ -371,9 +374,6 @@ void inode_init_once(struct inode *inode)
 	INIT_LIST_HEAD(&inode->i_lru);
 	address_space_init_once(&inode->i_data);
 	i_size_ordered_init(inode);
-#ifdef CONFIG_FSNOTIFY
-	INIT_HLIST_HEAD(&inode->i_fsnotify_marks);
-#endif
 }
 EXPORT_SYMBOL(inode_init_once);
 
diff --git a/fs/mount.h b/fs/mount.h
index 2826543a131d..bc409360a03b 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -59,7 +59,7 @@ struct mount {
 	struct mountpoint *mnt_mp;	/* where is it mounted */
 	struct hlist_node mnt_mp_list;	/* list mounts with the same mountpoint */
 #ifdef CONFIG_FSNOTIFY
-	struct hlist_head mnt_fsnotify_marks;
+	struct fsnotify_mark_connector *mnt_fsnotify_marks;
 	__u32 mnt_fsnotify_mask;
 #endif
 	int mnt_id;			/* mount identifier */
diff --git a/fs/namespace.c b/fs/namespace.c
index cc1375eff88c..2625e1d97a3a 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -236,9 +236,6 @@ static struct mount *alloc_vfsmnt(const char *name)
 		INIT_LIST_HEAD(&mnt->mnt_slave_list);
 		INIT_LIST_HEAD(&mnt->mnt_slave);
 		INIT_HLIST_NODE(&mnt->mnt_mp_list);
-#ifdef CONFIG_FSNOTIFY
-		INIT_HLIST_HEAD(&mnt->mnt_fsnotify_marks);
-#endif
 		init_fs_pin(&mnt->mnt_umount, drop_mountpoint);
 	}
 	return mnt;
@@ -1111,6 +1108,9 @@ static void cleanup_mnt(struct mount *mnt)
 	if (unlikely(mnt->mnt_pins.first))
 		mnt_pin_kill(mnt);
 	fsnotify_vfsmount_delete(&mnt->mnt);
+#ifdef CONFIG_FSNOTIFY
+	fsnotify_connector_free(&mnt->mnt_fsnotify_marks);
+#endif
 	dput(mnt->mnt.mnt_root);
 	deactivate_super(mnt->mnt.mnt_sb);
 	mnt_free_id(mnt);
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index b41515d3f081..eae621a18ac9 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -193,6 +193,7 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	struct hlist_node *inode_node = NULL, *vfsmount_node = NULL;
 	struct fsnotify_mark *inode_mark = NULL, *vfsmount_mark = NULL;
 	struct fsnotify_group *inode_group, *vfsmount_group;
+	struct fsnotify_mark_connector *inode_conn, *vfsmount_conn;
 	struct mount *mnt;
 	int idx, ret = 0;
 	/* global tests shouldn't care about events on child only the specific event */
@@ -210,8 +211,8 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	 * SRCU because we have no references to any objects and do not
 	 * need SRCU to keep them "alive".
 	 */
-	if (hlist_empty(&to_tell->i_fsnotify_marks) &&
-	    (!mnt || hlist_empty(&mnt->mnt_fsnotify_marks)))
+	if (!to_tell->i_fsnotify_marks &&
+	    (!mnt || !mnt->mnt_fsnotify_marks))
 		return 0;
 	/*
 	 * if this is a modify event we may need to clear the ignored masks
@@ -226,16 +227,24 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	idx = srcu_read_lock(&fsnotify_mark_srcu);
 
 	if ((mask & FS_MODIFY) ||
-	    (test_mask & to_tell->i_fsnotify_mask))
-		inode_node = srcu_dereference(to_tell->i_fsnotify_marks.first,
-					      &fsnotify_mark_srcu);
+	    (test_mask & to_tell->i_fsnotify_mask)) {
+		inode_conn = lockless_dereference(to_tell->i_fsnotify_marks);
+		if (inode_conn)
+			inode_node = srcu_dereference(inode_conn->list.first,
+						      &fsnotify_mark_srcu);
+	}
 
 	if (mnt && ((mask & FS_MODIFY) ||
 		    (test_mask & mnt->mnt_fsnotify_mask))) {
-		vfsmount_node = srcu_dereference(mnt->mnt_fsnotify_marks.first,
-						 &fsnotify_mark_srcu);
-		inode_node = srcu_dereference(to_tell->i_fsnotify_marks.first,
-					      &fsnotify_mark_srcu);
+		inode_conn = lockless_dereference(to_tell->i_fsnotify_marks);
+		if (inode_conn)
+			inode_node = srcu_dereference(inode_conn->list.first,
+						      &fsnotify_mark_srcu);
+		vfsmount_conn = lockless_dereference(mnt->mnt_fsnotify_marks);
+		if (vfsmount_conn)
+			vfsmount_node = srcu_dereference(
+						vfsmount_conn->list.first,
+						&fsnotify_mark_srcu);
 	}
 
 	/*
@@ -293,6 +302,8 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 }
 EXPORT_SYMBOL_GPL(fsnotify);
 
+extern struct kmem_cache *fsnotify_mark_connector_cachep;
+
 static __init int fsnotify_init(void)
 {
 	int ret;
@@ -303,6 +314,9 @@ static __init int fsnotify_init(void)
 	if (ret)
 		panic("initializing fsnotify_mark_srcu");
 
+	fsnotify_mark_connector_cachep = KMEM_CACHE(fsnotify_mark_connector,
+						    SLAB_PANIC);
+
 	return 0;
 }
 core_initcall(fsnotify_init);
diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index 0a3bc2cf192c..eb64c59c9ad1 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -15,7 +15,7 @@ extern void fsnotify_flush_notify(struct fsnotify_group *group);
 extern struct srcu_struct fsnotify_mark_srcu;
 
 /* Calculate mask of events for a list of marks */
-extern u32 fsnotify_recalc_mask(struct hlist_head *head);
+extern u32 fsnotify_recalc_mask(struct fsnotify_mark_connector *conn);
 
 /* compare two groups for sorting of marks lists */
 extern int fsnotify_compare_groups(struct fsnotify_group *a,
@@ -24,7 +24,7 @@ extern int fsnotify_compare_groups(struct fsnotify_group *a,
 extern void fsnotify_set_inode_mark_mask_locked(struct fsnotify_mark *fsn_mark,
 						__u32 mask);
 /* Add mark to a proper place in mark list */
-extern int fsnotify_add_mark_list(struct hlist_head *head,
+extern int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
 				  struct fsnotify_mark *mark,
 				  int allow_dups);
 /* add a mark to an inode */
@@ -41,19 +41,21 @@ extern void fsnotify_destroy_vfsmount_mark(struct fsnotify_mark *mark);
 /* inode specific destruction of a mark */
 extern void fsnotify_destroy_inode_mark(struct fsnotify_mark *mark);
 /* Find mark belonging to given group in the list of marks */
-extern struct fsnotify_mark *fsnotify_find_mark(struct hlist_head *head,
-						struct fsnotify_group *group);
+extern struct fsnotify_mark *fsnotify_find_mark(
+					struct fsnotify_mark_connector *conn,
+					struct fsnotify_group *group);
 /* Destroy all marks in the given list protected by 'lock' */
-extern void fsnotify_destroy_marks(struct hlist_head *head, spinlock_t *lock);
+extern void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn,
+				   spinlock_t *lock);
 /* run the list of all marks associated with inode and destroy them */
 static inline void fsnotify_clear_marks_by_inode(struct inode *inode)
 {
-	fsnotify_destroy_marks(&inode->i_fsnotify_marks, &inode->i_lock);
+	fsnotify_destroy_marks(inode->i_fsnotify_marks, &inode->i_lock);
 }
 /* run the list of all marks associated with vfsmount and destroy them */
 static inline void fsnotify_clear_marks_by_mount(struct vfsmount *mnt)
 {
-	fsnotify_destroy_marks(&real_mount(mnt)->mnt_fsnotify_marks,
+	fsnotify_destroy_marks(real_mount(mnt)->mnt_fsnotify_marks,
 			       &mnt->mnt_root->d_lock);
 }
 /* prepare for freeing all marks associated with given group */
diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
index a3645249f7ec..e8c6b822ff8d 100644
--- a/fs/notify/inode_mark.c
+++ b/fs/notify/inode_mark.c
@@ -37,7 +37,7 @@
 void fsnotify_recalc_inode_mask(struct inode *inode)
 {
 	spin_lock(&inode->i_lock);
-	inode->i_fsnotify_mask = fsnotify_recalc_mask(&inode->i_fsnotify_marks);
+	inode->i_fsnotify_mask = fsnotify_recalc_mask(inode->i_fsnotify_marks);
 	spin_unlock(&inode->i_lock);
 
 	__fsnotify_update_child_dentry_flags(inode);
@@ -60,7 +60,7 @@ void fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
 	 * hold the inode->i_lock, so this is the perfect time to update the
 	 * inode->i_fsnotify_mask
 	 */
-	inode->i_fsnotify_mask = fsnotify_recalc_mask(&inode->i_fsnotify_marks);
+	inode->i_fsnotify_mask = fsnotify_recalc_mask(inode->i_fsnotify_marks);
 	spin_unlock(&inode->i_lock);
 }
 
@@ -82,7 +82,7 @@ struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group,
 	struct fsnotify_mark *mark;
 
 	spin_lock(&inode->i_lock);
-	mark = fsnotify_find_mark(&inode->i_fsnotify_marks, group);
+	mark = fsnotify_find_mark(inode->i_fsnotify_marks, group);
 	spin_unlock(&inode->i_lock);
 
 	return mark;
@@ -135,7 +135,7 @@ int fsnotify_add_inode_mark(struct fsnotify_mark *mark,
 	mark->inode = inode;
 	ret = fsnotify_add_mark_list(&inode->i_fsnotify_marks, mark,
 				     allow_dups);
-	inode->i_fsnotify_mask = fsnotify_recalc_mask(&inode->i_fsnotify_marks);
+	inode->i_fsnotify_mask = fsnotify_recalc_mask(inode->i_fsnotify_marks);
 	spin_unlock(&inode->i_lock);
 
 	return ret;
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 44836e539169..24b6191bd6c6 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -83,6 +83,8 @@
 #define FSNOTIFY_REAPER_DELAY	(1)	/* 1 jiffy */
 
 struct srcu_struct fsnotify_mark_srcu;
+struct kmem_cache *fsnotify_mark_connector_cachep;
+
 static DEFINE_SPINLOCK(destroy_lock);
 static LIST_HEAD(destroy_list);
 
@@ -104,12 +106,15 @@ void fsnotify_put_mark(struct fsnotify_mark *mark)
 }
 
 /* Calculate mask of events for a list of marks */
-u32 fsnotify_recalc_mask(struct hlist_head *head)
+u32 fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 {
 	u32 new_mask = 0;
 	struct fsnotify_mark *mark;
 
-	hlist_for_each_entry(mark, head, obj_list)
+	if (!conn)
+		return 0;
+
+	hlist_for_each_entry(mark, &conn->list, obj_list)
 		new_mask |= mark->mask;
 	return new_mask;
 }
@@ -220,10 +225,14 @@ void fsnotify_destroy_mark(struct fsnotify_mark *mark,
 	fsnotify_free_mark(mark);
 }
 
-void fsnotify_destroy_marks(struct hlist_head *head, spinlock_t *lock)
+void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn,
+			    spinlock_t *lock)
 {
 	struct fsnotify_mark *mark;
 
+	if (!conn)
+		return;
+
 	while (1) {
 		/*
 		 * We have to be careful since we can race with e.g.
@@ -233,11 +242,12 @@ void fsnotify_destroy_marks(struct hlist_head *head, spinlock_t *lock)
 		 * calling fsnotify_destroy_mark() more than once is fine.
 		 */
 		spin_lock(lock);
-		if (hlist_empty(head)) {
+		if (hlist_empty(&conn->list)) {
 			spin_unlock(lock);
 			break;
 		}
-		mark = hlist_entry(head->first, struct fsnotify_mark, obj_list);
+		mark = hlist_entry(conn->list.first, struct fsnotify_mark,
+				   obj_list);
 		/*
 		 * We don't update i_fsnotify_mask / mnt_fsnotify_mask here
 		 * since inode / mount is going away anyway. So just remove
@@ -251,6 +261,14 @@ void fsnotify_destroy_marks(struct hlist_head *head, spinlock_t *lock)
 	}
 }
 
+void fsnotify_connector_free(struct fsnotify_mark_connector **connp)
+{
+	if (*connp) {
+		kmem_cache_free(fsnotify_mark_connector_cachep, *connp);
+		*connp = NULL;
+	}
+}
+
 void fsnotify_set_mark_mask_locked(struct fsnotify_mark *mark, __u32 mask)
 {
 	assert_spin_locked(&mark->lock);
@@ -304,21 +322,54 @@ int fsnotify_compare_groups(struct fsnotify_group *a, struct fsnotify_group *b)
 	return -1;
 }
 
-/* Add mark into proper place in given list of marks */
-int fsnotify_add_mark_list(struct hlist_head *head, struct fsnotify_mark *mark,
-			   int allow_dups)
+static int fsnotify_attach_connector_to_object(
+					struct fsnotify_mark_connector **connp)
+{
+	struct fsnotify_mark_connector *conn;
+
+	conn = kmem_cache_alloc(fsnotify_mark_connector_cachep, GFP_ATOMIC);
+	if (!conn)
+		return -ENOMEM;
+	INIT_HLIST_HEAD(&conn->list);
+	/*
+	 * Make sure 'conn' initialization is visible. Matches
+	 * lockless_dereference() in fsnotify().
+	 */
+	smp_wmb();
+	*connp = conn;
+
+	return 0;
+}
+
+/*
+ * Add mark into proper place in given list of marks. These marks may be used
+ * for the fsnotify backend to determine which event types should be delivered
+ * to which group and for which inodes. These marks are ordered according to
+ * priority, highest number first, and then by the group's location in memory.
+ */
+int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
+			   struct fsnotify_mark *mark, int allow_dups)
 {
 	struct fsnotify_mark *lmark, *last = NULL;
+	struct fsnotify_mark_connector *conn;
 	int cmp;
+	int err;
+
+	if (!*connp) {
+		err = fsnotify_attach_connector_to_object(connp);
+		if (err)
+			return err;
+	}
+	conn = *connp;
 
 	/* is mark the first mark? */
-	if (hlist_empty(head)) {
-		hlist_add_head_rcu(&mark->obj_list, head);
+	if (hlist_empty(&conn->list)) {
+		hlist_add_head_rcu(&mark->obj_list, &conn->list);
 		return 0;
 	}
 
 	/* should mark be in the middle of the current list? */
-	hlist_for_each_entry(lmark, head, obj_list) {
+	hlist_for_each_entry(lmark, &conn->list, obj_list) {
 		last = lmark;
 
 		if ((lmark->group == mark->group) && !allow_dups)
@@ -419,12 +470,15 @@ int fsnotify_add_mark(struct fsnotify_mark *mark, struct fsnotify_group *group,
  * Given a list of marks, find the mark associated with given group. If found
  * take a reference to that mark and return it, else return NULL.
  */
-struct fsnotify_mark *fsnotify_find_mark(struct hlist_head *head,
+struct fsnotify_mark *fsnotify_find_mark(struct fsnotify_mark_connector *conn,
 					 struct fsnotify_group *group)
 {
 	struct fsnotify_mark *mark;
 
-	hlist_for_each_entry(mark, head, obj_list) {
+	if (!conn)
+		return NULL;
+
+	hlist_for_each_entry(mark, &conn->list, obj_list) {
 		if (mark->group == group) {
 			fsnotify_get_mark(mark);
 			return mark;
diff --git a/fs/notify/vfsmount_mark.c b/fs/notify/vfsmount_mark.c
index a8fcab68faef..28815d5cba7c 100644
--- a/fs/notify/vfsmount_mark.c
+++ b/fs/notify/vfsmount_mark.c
@@ -43,7 +43,7 @@ void fsnotify_recalc_vfsmount_mask(struct vfsmount *mnt)
 	struct mount *m = real_mount(mnt);
 
 	spin_lock(&mnt->mnt_root->d_lock);
-	m->mnt_fsnotify_mask = fsnotify_recalc_mask(&m->mnt_fsnotify_marks);
+	m->mnt_fsnotify_mask = fsnotify_recalc_mask(m->mnt_fsnotify_marks);
 	spin_unlock(&mnt->mnt_root->d_lock);
 }
 
@@ -60,7 +60,7 @@ void fsnotify_destroy_vfsmount_mark(struct fsnotify_mark *mark)
 	hlist_del_init_rcu(&mark->obj_list);
 	mark->mnt = NULL;
 
-	m->mnt_fsnotify_mask = fsnotify_recalc_mask(&m->mnt_fsnotify_marks);
+	m->mnt_fsnotify_mask = fsnotify_recalc_mask(m->mnt_fsnotify_marks);
 	spin_unlock(&mnt->mnt_root->d_lock);
 }
 
@@ -75,7 +75,7 @@ struct fsnotify_mark *fsnotify_find_vfsmount_mark(struct fsnotify_group *group,
 	struct fsnotify_mark *mark;
 
 	spin_lock(&mnt->mnt_root->d_lock);
-	mark = fsnotify_find_mark(&m->mnt_fsnotify_marks, group);
+	mark = fsnotify_find_mark(m->mnt_fsnotify_marks, group);
 	spin_unlock(&mnt->mnt_root->d_lock);
 
 	return mark;
@@ -101,7 +101,7 @@ int fsnotify_add_vfsmount_mark(struct fsnotify_mark *mark,
 	spin_lock(&mnt->mnt_root->d_lock);
 	mark->mnt = mnt;
 	ret = fsnotify_add_mark_list(&m->mnt_fsnotify_marks, mark, allow_dups);
-	m->mnt_fsnotify_mask = fsnotify_recalc_mask(&m->mnt_fsnotify_marks);
+	m->mnt_fsnotify_mask = fsnotify_recalc_mask(m->mnt_fsnotify_marks);
 	spin_unlock(&mnt->mnt_root->d_lock);
 
 	return ret;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7251f7bb45e8..66e52342be2d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -546,6 +546,8 @@ is_uncached_acl(struct posix_acl *acl)
 #define IOP_XATTR	0x0008
 #define IOP_DEFAULT_READLINK	0x0010
 
+struct fsnotify_mark_connector;
+
 /*
  * Keep mostly read-only and often accessed (especially for
  * the RCU path lookup and 'stat' data) fields at the beginning
@@ -645,7 +647,7 @@ struct inode {
 
 #ifdef CONFIG_FSNOTIFY
 	__u32			i_fsnotify_mask; /* all events this inode cares about */
-	struct hlist_head	i_fsnotify_marks;
+	struct fsnotify_mark_connector	*i_fsnotify_marks;
 #endif
 
 #if IS_ENABLED(CONFIG_FS_ENCRYPTION)
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index e6e689b5569e..8b63085f8855 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -195,6 +195,15 @@ struct fsnotify_group {
 #define FSNOTIFY_EVENT_INODE	2
 
 /*
+ * Inode / vfsmount point to this structure which tracks all marks attached to
+ * the inode / vfsmount. The structure is freed only when inode / vfsmount gets
+ * freed.
+ */
+struct fsnotify_mark_connector {
+	struct hlist_head list;
+};
+
+/*
  * A mark is simply an object attached to an in core inode which allows an
  * fsnotify listener to indicate they are either no longer interested in events
  * of a type matching mask or only interested in those events.
@@ -346,6 +355,7 @@ extern void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group)
 extern void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group);
 /* run all the marks in a group, and clear all of the marks where mark->flags & flags is true*/
 extern void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group, unsigned int flags);
+extern void fsnotify_connector_free(struct fsnotify_mark_connector **connp);
 extern void fsnotify_get_mark(struct fsnotify_mark *mark);
 extern void fsnotify_put_mark(struct fsnotify_mark *mark);
 extern void fsnotify_unmount_inodes(struct super_block *sb);
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index d6a8de5f8fa3..bf7b7ca295d0 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -73,6 +73,7 @@
 #include <linux/ctype.h>
 #include <linux/string.h>
 #include <linux/uaccess.h>
+#include <linux/fsnotify_backend.h>
 #include <uapi/linux/limits.h>
 
 #include "audit.h"
@@ -1596,7 +1597,8 @@ static inline void handle_one(const struct inode *inode)
 	struct audit_tree_refs *p;
 	struct audit_chunk *chunk;
 	int count;
-	if (likely(hlist_empty(&inode->i_fsnotify_marks)))
+	if (likely(!inode->i_fsnotify_marks ||
+		   hlist_empty(&inode->i_fsnotify_marks->list)))
 		return;
 	context = current->audit_context;
 	p = context->trees;
@@ -1639,7 +1641,8 @@ static void handle_path(const struct dentry *dentry)
 	seq = read_seqbegin(&rename_lock);
 	for(;;) {
 		struct inode *inode = d_backing_inode(d);
-		if (inode && unlikely(!hlist_empty(&inode->i_fsnotify_marks))) {
+		if (inode && unlikely(inode->i_fsnotify_marks &&
+		    !hlist_empty(&inode->i_fsnotify_marks->list))) {
 			struct audit_chunk *chunk;
 			chunk = audit_tree_lookup(inode);
 			if (chunk) {
-- 
2.10.2


* [PATCH 08/35] fsnotify: Move object pointer to fsnotify_mark_connector
From: Jan Kara @ 2017-04-03 15:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Move the pointer to the inode / vfsmount from the mark itself into the
fsnotify_mark_connector structure. Marks now reach their object through
mark->connector. This is another step towards decoupling inode /
vfsmount lifetime from notification mark lifetime.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
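Note for readers: the layout change can be sketched in userspace as
below. This is only an illustrative model, not kernel code. The struct
and function names (connector, mark, connector_init, mark_inode) and the
stand-in inode / vfsmount types are hypothetical; only the flag-plus-union
layout and the NULL connector on a detached mark are taken from the
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; hypothetical, illustration only. */
struct inode { unsigned long i_ino; };
struct vfsmount { int mnt_id; };

#define OBJ_TYPE_INODE    0x01
#define OBJ_TYPE_VFSMOUNT 0x02

/* Models fsnotify_mark_connector: object type flag plus object pointer. */
struct connector {
	unsigned int flags;
	union {
		struct inode *inode;
		struct vfsmount *mnt;
	};
};

/* Models fsnotify_mark: it no longer stores the object, only the connector. */
struct mark {
	struct connector *connector;	/* NULL once the mark is detached */
};

/* Record which object the connector belongs to, as the attach helper does. */
static void connector_init(struct connector *conn, struct inode *inode,
			   struct vfsmount *mnt)
{
	if (inode) {
		conn->flags = OBJ_TYPE_INODE;
		conn->inode = inode;
	} else {
		conn->flags = OBJ_TYPE_VFSMOUNT;
		conn->mnt = mnt;
	}
}

/* Return the inode a mark is attached to, or NULL if detached or a mount mark. */
static struct inode *mark_inode(const struct mark *mark)
{
	if (!mark->connector || !(mark->connector->flags & OBJ_TYPE_INODE))
		return NULL;
	return mark->connector->inode;
}
```

In this model every former mark->inode access becomes a two-step lookup
through the connector, which is why callers that can see a detached mark
have to check the connector for NULL first.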
 fs/notify/dnotify/dnotify.c      |  4 ++--
 fs/notify/fdinfo.c               | 12 ++++++------
 fs/notify/fsnotify.h             |  3 ++-
 fs/notify/inode_mark.c           | 18 +++++++-----------
 fs/notify/mark.c                 | 32 ++++++++++++++++++++++----------
 fs/notify/vfsmount_mark.c        | 12 +++++-------
 include/linux/fsnotify_backend.h | 17 ++++++++++-------
 kernel/audit_tree.c              | 25 ++++++++++++++++++++-----
 8 files changed, 74 insertions(+), 49 deletions(-)

diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index 5a4ec309e283..5024729dba23 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -69,8 +69,8 @@ static void dnotify_recalc_inode_mask(struct fsnotify_mark *fsn_mark)
 	if (old_mask == new_mask)
 		return;
 
-	if (fsn_mark->inode)
-		fsnotify_recalc_inode_mask(fsn_mark->inode);
+	if (fsn_mark->connector)
+		fsnotify_recalc_inode_mask(fsn_mark->connector->inode);
 }
 
 /*
diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
index 601a59c8d87e..dd63aa9a6f9a 100644
--- a/fs/notify/fdinfo.c
+++ b/fs/notify/fdinfo.c
@@ -76,11 +76,11 @@ static void inotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark)
 	struct inotify_inode_mark *inode_mark;
 	struct inode *inode;
 
-	if (!(mark->flags & FSNOTIFY_MARK_FLAG_INODE))
+	if (!(mark->connector->flags & FSNOTIFY_OBJ_TYPE_INODE))
 		return;
 
 	inode_mark = container_of(mark, struct inotify_inode_mark, fsn_mark);
-	inode = igrab(mark->inode);
+	inode = igrab(mark->connector->inode);
 	if (inode) {
 		/*
 		 * IN_ALL_EVENTS represents all of the mask bits
@@ -115,8 +115,8 @@ static void fanotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark)
 	if (mark->flags & FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY)
 		mflags |= FAN_MARK_IGNORED_SURV_MODIFY;
 
-	if (mark->flags & FSNOTIFY_MARK_FLAG_INODE) {
-		inode = igrab(mark->inode);
+	if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_INODE) {
+		inode = igrab(mark->connector->inode);
 		if (!inode)
 			return;
 		seq_printf(m, "fanotify ino:%lx sdev:%x mflags:%x mask:%x ignored_mask:%x ",
@@ -125,8 +125,8 @@ static void fanotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark)
 		show_mark_fhandle(m, inode);
 		seq_putc(m, '\n');
 		iput(inode);
-	} else if (mark->flags & FSNOTIFY_MARK_FLAG_VFSMOUNT) {
-		struct mount *mnt = real_mount(mark->mnt);
+	} else if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT) {
+		struct mount *mnt = real_mount(mark->connector->mnt);
 
 		seq_printf(m, "fanotify mnt_id:%x mflags:%x mask:%x ignored_mask:%x\n",
 			   mnt->mnt_id, mflags, mark->mask, mark->ignored_mask);
diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index eb64c59c9ad1..dd1a6798c9cd 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -26,6 +26,7 @@ extern void fsnotify_set_inode_mark_mask_locked(struct fsnotify_mark *fsn_mark,
 /* Add mark to a proper place in mark list */
 extern int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
 				  struct fsnotify_mark *mark,
+				  struct inode *inode, struct vfsmount *mnt,
 				  int allow_dups);
 /* add a mark to an inode */
 extern int fsnotify_add_inode_mark(struct fsnotify_mark *mark,
@@ -44,7 +45,7 @@ extern void fsnotify_destroy_inode_mark(struct fsnotify_mark *mark);
 extern struct fsnotify_mark *fsnotify_find_mark(
 					struct fsnotify_mark_connector *conn,
 					struct fsnotify_group *group);
-/* Destroy all marks in the given list protected by 'lock' */
+/* Destroy all marks connected via given connector protected by 'lock' */
 extern void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn,
 				   spinlock_t *lock);
 /* run the list of all marks associated with inode and destroy them */
diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
index e8c6b822ff8d..1644ba09efd4 100644
--- a/fs/notify/inode_mark.c
+++ b/fs/notify/inode_mark.c
@@ -45,7 +45,7 @@ void fsnotify_recalc_inode_mask(struct inode *inode)
 
 void fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
 {
-	struct inode *inode = mark->inode;
+	struct inode *inode = mark->connector->inode;
 
 	BUG_ON(!mutex_is_locked(&mark->group->mark_mutex));
 	assert_spin_locked(&mark->lock);
@@ -53,7 +53,7 @@ void fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
 	spin_lock(&inode->i_lock);
 
 	hlist_del_init_rcu(&mark->obj_list);
-	mark->inode = NULL;
+	mark->connector = NULL;
 
 	/*
 	 * this mark is now off the inode->i_fsnotify_marks list and we
@@ -69,7 +69,7 @@ void fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
  */
 void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group)
 {
-	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_MARK_FLAG_INODE);
+	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_OBJ_TYPE_INODE);
 }
 
 /*
@@ -99,11 +99,10 @@ void fsnotify_set_inode_mark_mask_locked(struct fsnotify_mark *mark,
 
 	assert_spin_locked(&mark->lock);
 
-	if (mask &&
-	    mark->inode &&
+	if (mask && mark->connector &&
 	    !(mark->flags & FSNOTIFY_MARK_FLAG_OBJECT_PINNED)) {
 		mark->flags |= FSNOTIFY_MARK_FLAG_OBJECT_PINNED;
-		inode = igrab(mark->inode);
+		inode = igrab(mark->connector->inode);
 		/*
 		 * we shouldn't be able to get here if the inode wasn't
 		 * already safely held in memory.  But bug in case it
@@ -126,15 +125,12 @@ int fsnotify_add_inode_mark(struct fsnotify_mark *mark,
 {
 	int ret;
 
-	mark->flags |= FSNOTIFY_MARK_FLAG_INODE;
-
 	BUG_ON(!mutex_is_locked(&group->mark_mutex));
 	assert_spin_locked(&mark->lock);
 
 	spin_lock(&inode->i_lock);
-	mark->inode = inode;
-	ret = fsnotify_add_mark_list(&inode->i_fsnotify_marks, mark,
-				     allow_dups);
+	ret = fsnotify_add_mark_list(&inode->i_fsnotify_marks, mark, inode,
+				     NULL, allow_dups);
 	inode->i_fsnotify_mask = fsnotify_recalc_mask(inode->i_fsnotify_marks);
 	spin_unlock(&inode->i_lock);
 
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 24b6191bd6c6..3d6e7a8e58be 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -142,10 +142,10 @@ void fsnotify_detach_mark(struct fsnotify_mark *mark)
 
 	mark->flags &= ~FSNOTIFY_MARK_FLAG_ATTACHED;
 
-	if (mark->flags & FSNOTIFY_MARK_FLAG_INODE) {
-		inode = mark->inode;
+	if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_INODE) {
+		inode = mark->connector->inode;
 		fsnotify_destroy_inode_mark(mark);
-	} else if (mark->flags & FSNOTIFY_MARK_FLAG_VFSMOUNT)
+	} else if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT)
 		fsnotify_destroy_vfsmount_mark(mark);
 	else
 		BUG();
@@ -275,7 +275,7 @@ void fsnotify_set_mark_mask_locked(struct fsnotify_mark *mark, __u32 mask)
 
 	mark->mask = mask;
 
-	if (mark->flags & FSNOTIFY_MARK_FLAG_INODE)
+	if (mark->connector && mark->connector->flags & FSNOTIFY_OBJ_TYPE_INODE)
 		fsnotify_set_inode_mark_mask_locked(mark, mask);
 }
 
@@ -323,7 +323,9 @@ int fsnotify_compare_groups(struct fsnotify_group *a, struct fsnotify_group *b)
 }
 
 static int fsnotify_attach_connector_to_object(
-					struct fsnotify_mark_connector **connp)
+					struct fsnotify_mark_connector **connp,
+					struct inode *inode,
+					struct vfsmount *mnt)
 {
 	struct fsnotify_mark_connector *conn;
 
@@ -331,6 +333,13 @@ static int fsnotify_attach_connector_to_object(
 	if (!conn)
 		return -ENOMEM;
 	INIT_HLIST_HEAD(&conn->list);
+	if (inode) {
+		conn->flags = FSNOTIFY_OBJ_TYPE_INODE;
+		conn->inode = inode;
+	} else {
+		conn->flags = FSNOTIFY_OBJ_TYPE_VFSMOUNT;
+		conn->mnt = mnt;
+	}
 	/*
 	 * Make sure 'conn' initialization is visible. Matches
 	 * lockless_dereference() in fsnotify().
@@ -348,7 +357,8 @@ static int fsnotify_attach_connector_to_object(
  * priority, highest number first, and then by the group's location in memory.
  */
 int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
-			   struct fsnotify_mark *mark, int allow_dups)
+			   struct fsnotify_mark *mark, struct inode *inode,
+			   struct vfsmount *mnt, int allow_dups)
 {
 	struct fsnotify_mark *lmark, *last = NULL;
 	struct fsnotify_mark_connector *conn;
@@ -356,7 +366,7 @@ int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
 	int err;
 
 	if (!*connp) {
-		err = fsnotify_attach_connector_to_object(connp);
+		err = fsnotify_attach_connector_to_object(connp, inode, mnt);
 		if (err)
 			return err;
 	}
@@ -365,7 +375,7 @@ int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
 	/* is mark the first mark? */
 	if (hlist_empty(&conn->list)) {
 		hlist_add_head_rcu(&mark->obj_list, &conn->list);
-		return 0;
+		goto added;
 	}
 
 	/* should mark be in the middle of the current list? */
@@ -378,13 +388,15 @@ int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
 		cmp = fsnotify_compare_groups(lmark->group, mark->group);
 		if (cmp >= 0) {
 			hlist_add_before_rcu(&mark->obj_list, &lmark->obj_list);
-			return 0;
+			goto added;
 		}
 	}
 
 	BUG_ON(last == NULL);
 	/* mark should be the last entry.  last is the current last entry */
 	hlist_add_behind_rcu(&mark->obj_list, &last->obj_list);
+added:
+	mark->connector = conn;
 	return 0;
 }
 
@@ -507,7 +519,7 @@ void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group,
 	 */
 	mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
 	list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list) {
-		if (mark->flags & flags)
+		if (mark->connector->flags & flags)
 			list_move(&mark->g_list, &to_free);
 	}
 	mutex_unlock(&group->mark_mutex);
diff --git a/fs/notify/vfsmount_mark.c b/fs/notify/vfsmount_mark.c
index 28815d5cba7c..e04e33ef02d4 100644
--- a/fs/notify/vfsmount_mark.c
+++ b/fs/notify/vfsmount_mark.c
@@ -31,7 +31,7 @@
 
 void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group)
 {
-	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_MARK_FLAG_VFSMOUNT);
+	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_OBJ_TYPE_VFSMOUNT);
 }
 
 /*
@@ -49,7 +49,7 @@ void fsnotify_recalc_vfsmount_mask(struct vfsmount *mnt)
 
 void fsnotify_destroy_vfsmount_mark(struct fsnotify_mark *mark)
 {
-	struct vfsmount *mnt = mark->mnt;
+	struct vfsmount *mnt = mark->connector->mnt;
 	struct mount *m = real_mount(mnt);
 
 	BUG_ON(!mutex_is_locked(&mark->group->mark_mutex));
@@ -58,7 +58,7 @@ void fsnotify_destroy_vfsmount_mark(struct fsnotify_mark *mark)
 	spin_lock(&mnt->mnt_root->d_lock);
 
 	hlist_del_init_rcu(&mark->obj_list);
-	mark->mnt = NULL;
+	mark->connector = NULL;
 
 	m->mnt_fsnotify_mask = fsnotify_recalc_mask(m->mnt_fsnotify_marks);
 	spin_unlock(&mnt->mnt_root->d_lock);
@@ -93,14 +93,12 @@ int fsnotify_add_vfsmount_mark(struct fsnotify_mark *mark,
 	struct mount *m = real_mount(mnt);
 	int ret;
 
-	mark->flags |= FSNOTIFY_MARK_FLAG_VFSMOUNT;
-
 	BUG_ON(!mutex_is_locked(&group->mark_mutex));
 	assert_spin_locked(&mark->lock);
 
 	spin_lock(&mnt->mnt_root->d_lock);
-	mark->mnt = mnt;
-	ret = fsnotify_add_mark_list(&m->mnt_fsnotify_marks, mark, allow_dups);
+	ret = fsnotify_add_mark_list(&m->mnt_fsnotify_marks, mark, NULL, mnt,
+				     allow_dups);
 	m->mnt_fsnotify_mask = fsnotify_recalc_mask(m->mnt_fsnotify_marks);
 	spin_unlock(&mnt->mnt_root->d_lock);
 
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 8b63085f8855..06f9a2cc1463 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -200,6 +200,13 @@ struct fsnotify_group {
  * freed.
  */
 struct fsnotify_mark_connector {
+#define FSNOTIFY_OBJ_TYPE_INODE		0x01
+#define FSNOTIFY_OBJ_TYPE_VFSMOUNT	0x02
+	unsigned int flags;	/* Type of object [lock] */
+	union {	/* Object pointer [lock] */
+		struct inode *inode;
+		struct vfsmount *mnt;
+	};
 	struct hlist_head list;
 };
 
@@ -234,14 +241,10 @@ struct fsnotify_mark {
 	spinlock_t lock;
 	/* List of marks for inode / vfsmount [obj_lock] */
 	struct hlist_node obj_list;
-	union {	/* Object pointer [mark->lock, group->mark_mutex] */
-		struct inode *inode;	/* inode this mark is associated with */
-		struct vfsmount *mnt;	/* vfsmount this mark is associated with */
-	};
+	/* Head of list of marks for an object [mark->lock, group->mark_mutex] */
+	struct fsnotify_mark_connector *connector;
 	/* Events types to ignore [mark->lock, group->mark_mutex] */
 	__u32 ignored_mask;
-#define FSNOTIFY_MARK_FLAG_INODE		0x01
-#define FSNOTIFY_MARK_FLAG_VFSMOUNT		0x02
 #define FSNOTIFY_MARK_FLAG_OBJECT_PINNED	0x04
 #define FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY	0x08
 #define FSNOTIFY_MARK_FLAG_ALIVE		0x10
@@ -353,7 +356,7 @@ extern void fsnotify_free_mark(struct fsnotify_mark *mark);
 extern void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group);
 /* run all the marks in a group, and clear all of the inode marks */
 extern void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group);
-/* run all the marks in a group, and clear all of the marks where mark->flags & flags is true*/
+/* run all the marks in a group, and clear all of the marks attached to given object type */
 extern void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group, unsigned int flags);
 extern void fsnotify_connector_free(struct fsnotify_mark_connector **connp);
 extern void fsnotify_get_mark(struct fsnotify_mark *mark);
diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index f12bd40fb8f1..4d4f3284a9e3 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -172,10 +172,25 @@ static unsigned long inode_to_key(const struct inode *inode)
 /*
  * Function to return search key in our hash from chunk. Key 0 is special and
  * should never be present in the hash.
+ *
+ * Must be called with chunk->mark.lock held to protect from connector
+ * becoming NULL.
  */
+static unsigned long __chunk_to_key(struct audit_chunk *chunk)
+{
+	if (!chunk->mark.connector)
+		return 0;
+	return (unsigned long)chunk->mark.connector->inode;
+}
+
 static unsigned long chunk_to_key(struct audit_chunk *chunk)
 {
-	return (unsigned long)chunk->mark.inode;
+	unsigned long key;
+
+	spin_lock(&chunk->mark.lock);
+	key = __chunk_to_key(chunk);
+	spin_unlock(&chunk->mark.lock);
+	return key;
 }
 
 static inline struct list_head *chunk_hash(unsigned long key)
@@ -187,7 +202,7 @@ static inline struct list_head *chunk_hash(unsigned long key)
 /* hash_lock & entry->lock is held by caller */
 static void insert_hash(struct audit_chunk *chunk)
 {
-	unsigned long key = chunk_to_key(chunk);
+	unsigned long key = __chunk_to_key(chunk);
 	struct list_head *list;
 
 	if (!key)
@@ -276,8 +291,8 @@ static void untag_chunk(struct node *p)
 	if (!new)
 		goto Fallback;
 
-	if (fsnotify_add_mark_locked(&new->mark, entry->group, entry->inode,
-				     NULL, 1)) {
+	if (fsnotify_add_mark_locked(&new->mark, entry->group,
+				     entry->connector->inode, NULL, 1)) {
 		fsnotify_put_mark(&new->mark);
 		goto Fallback;
 	}
@@ -418,7 +433,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
 	}
 
 	if (fsnotify_add_mark_locked(chunk_entry, old_entry->group,
-				     old_entry->inode, NULL, 1)) {
+			     old_entry->connector->inode, NULL, 1)) {
 		spin_unlock(&old_entry->lock);
 		mutex_unlock(&old_entry->group->mark_mutex);
 		fsnotify_put_mark(chunk_entry);
-- 
2.10.2


* [PATCH 09/35] fsnotify: Make fsnotify_mark_connector hold inode reference
From: Jan Kara @ 2017-04-03 15:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Currently the inode reference is held by individual fsnotify marks.
Change the rules so that the inode reference is held by the
fsnotify_mark_connector structure whenever its list of marks is
non-empty. This simplifies the code and is more logical.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
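Note for readers: the new ownership rule can be modelled in userspace as
below. This is a sketch under stated assumptions, not the kernel
implementation. The plain integer refcount, the nr_marks counter standing
in for the hlist, and the names attach_mark, detach_mark and put_inode
are hypothetical. Taking the reference when the first mark is attached is
also an assumption of this model. What is taken from the patch is the
detach contract: the inode is handed back to the caller only when the
list became empty, and the caller then drops the reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct inode with a plain refcount. */
struct inode { int refcount; };

/* Models the connector: it owns one inode reference while nr_marks > 0. */
struct connector {
	struct inode *inode;
	int nr_marks;		/* stands in for the hlist of marks */
};

struct mark { struct connector *connector; };

/* Attach a mark; in this model the first mark takes the inode reference. */
static void attach_mark(struct connector *conn, struct mark *mark)
{
	if (conn->nr_marks++ == 0)
		conn->inode->refcount++;	/* models igrab() */
	mark->connector = conn;
}

/*
 * Detach a mark. Mirrors the new fsnotify_destroy_inode_mark() contract:
 * return the inode only when the list became empty, so the caller can
 * drop the reference once it has released its locks.
 */
static struct inode *detach_mark(struct mark *mark)
{
	struct connector *conn = mark->connector;
	bool empty = (--conn->nr_marks == 0);

	mark->connector = NULL;
	return empty ? conn->inode : NULL;
}

/* Models iput(): drop the reference handed back by detach_mark(), if any. */
static void put_inode(struct inode *inode)
{
	if (inode)
		inode->refcount--;
}
```

With two marks attached, the first detach_mark() returns NULL and the
count is untouched; only the last detach hands the inode back. Per-mark
pin tracking is no longer needed in this scheme.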
 fs/notify/fsnotify.h             |  4 +---
 fs/notify/inode_mark.c           | 30 +++++-------------------------
 fs/notify/mark.c                 | 17 ++++++-----------
 include/linux/fsnotify_backend.h | 12 ++++++------
 4 files changed, 18 insertions(+), 45 deletions(-)

diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index dd1a6798c9cd..1a2aec65ebd8 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -21,8 +21,6 @@ extern u32 fsnotify_recalc_mask(struct fsnotify_mark_connector *conn);
 extern int fsnotify_compare_groups(struct fsnotify_group *a,
 				   struct fsnotify_group *b);
 
-extern void fsnotify_set_inode_mark_mask_locked(struct fsnotify_mark *fsn_mark,
-						__u32 mask);
 /* Add mark to a proper place in mark list */
 extern int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
 				  struct fsnotify_mark *mark,
@@ -40,7 +38,7 @@ extern int fsnotify_add_vfsmount_mark(struct fsnotify_mark *mark,
 /* vfsmount specific destruction of a mark */
 extern void fsnotify_destroy_vfsmount_mark(struct fsnotify_mark *mark);
 /* inode specific destruction of a mark */
-extern void fsnotify_destroy_inode_mark(struct fsnotify_mark *mark);
+extern struct inode *fsnotify_destroy_inode_mark(struct fsnotify_mark *mark);
 /* Find mark belonging to given group in the list of marks */
 extern struct fsnotify_mark *fsnotify_find_mark(
 					struct fsnotify_mark_connector *conn,
diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
index 1644ba09efd4..c3873b6920e7 100644
--- a/fs/notify/inode_mark.c
+++ b/fs/notify/inode_mark.c
@@ -43,9 +43,10 @@ void fsnotify_recalc_inode_mask(struct inode *inode)
 	__fsnotify_update_child_dentry_flags(inode);
 }
 
-void fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
+struct inode *fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
 {
 	struct inode *inode = mark->connector->inode;
+	bool empty;
 
 	BUG_ON(!mutex_is_locked(&mark->group->mark_mutex));
 	assert_spin_locked(&mark->lock);
@@ -53,6 +54,7 @@ void fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
 	spin_lock(&inode->i_lock);
 
 	hlist_del_init_rcu(&mark->obj_list);
+	empty = hlist_empty(&mark->connector->list);
 	mark->connector = NULL;
 
 	/*
@@ -62,6 +64,8 @@ void fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
 	 */
 	inode->i_fsnotify_mask = fsnotify_recalc_mask(inode->i_fsnotify_marks);
 	spin_unlock(&inode->i_lock);
+
+	return empty ? inode : NULL;
 }
 
 /*
@@ -89,30 +93,6 @@ struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group,
 }
 
 /*
- * If we are setting a mark mask on an inode mark we should pin the inode
- * in memory.
- */
-void fsnotify_set_inode_mark_mask_locked(struct fsnotify_mark *mark,
-					 __u32 mask)
-{
-	struct inode *inode;
-
-	assert_spin_locked(&mark->lock);
-
-	if (mask && mark->connector &&
-	    !(mark->flags & FSNOTIFY_MARK_FLAG_OBJECT_PINNED)) {
-		mark->flags |= FSNOTIFY_MARK_FLAG_OBJECT_PINNED;
-		inode = igrab(mark->connector->inode);
-		/*
-		 * we shouldn't be able to get here if the inode wasn't
-		 * already safely held in memory.  But bug in case it
-		 * ever is wrong.
-		 */
-		BUG_ON(!inode);
-	}
-}
-
-/*
  * Attach an initialized mark to a given inode.
  * These marks may be used for the fsnotify backend to determine which
  * event types should be delivered to which group and for which inodes.  These
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 3d6e7a8e58be..8a15c64fbe80 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -142,10 +142,9 @@ void fsnotify_detach_mark(struct fsnotify_mark *mark)
 
 	mark->flags &= ~FSNOTIFY_MARK_FLAG_ATTACHED;
 
-	if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_INODE) {
-		inode = mark->connector->inode;
-		fsnotify_destroy_inode_mark(mark);
-	} else if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT)
+	if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_INODE)
+		inode = fsnotify_destroy_inode_mark(mark);
+	else if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT)
 		fsnotify_destroy_vfsmount_mark(mark);
 	else
 		BUG();
@@ -160,7 +159,7 @@ void fsnotify_detach_mark(struct fsnotify_mark *mark)
 
 	spin_unlock(&mark->lock);
 
-	if (inode && (mark->flags & FSNOTIFY_MARK_FLAG_OBJECT_PINNED))
+	if (inode)
 		iput(inode);
 
 	atomic_dec(&group->num_marks);
@@ -274,9 +273,6 @@ void fsnotify_set_mark_mask_locked(struct fsnotify_mark *mark, __u32 mask)
 	assert_spin_locked(&mark->lock);
 
 	mark->mask = mask;
-
-	if (mark->connector && mark->connector->flags & FSNOTIFY_OBJ_TYPE_INODE)
-		fsnotify_set_inode_mark_mask_locked(mark, mask);
 }
 
 void fsnotify_set_mark_ignored_mask_locked(struct fsnotify_mark *mark, __u32 mask)
@@ -375,6 +371,8 @@ int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
 	/* is mark the first mark? */
 	if (hlist_empty(&conn->list)) {
 		hlist_add_head_rcu(&mark->obj_list, &conn->list);
+		if (inode)
+			__iget(inode);
 		goto added;
 	}
 
@@ -441,9 +439,6 @@ int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
 	} else {
 		BUG();
 	}
-
-	/* this will pin the object if appropriate */
-	fsnotify_set_mark_mask_locked(mark, mark->mask);
 	spin_unlock(&mark->lock);
 
 	if (inode)
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 06f9a2cc1463..96333fb09309 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -196,8 +196,9 @@ struct fsnotify_group {
 
 /*
  * Inode / vfsmount point to this structure which tracks all marks attached to
- * the inode / vfsmount. The structure is freed only when inode / vfsmount gets
- * freed.
+ * the inode / vfsmount. The reference to inode / vfsmount is held by this
+ * structure whenever the list is non-empty. The structure is freed only when
+ * inode / vfsmount gets freed.
  */
 struct fsnotify_mark_connector {
 #define FSNOTIFY_OBJ_TYPE_INODE		0x01
@@ -245,10 +246,9 @@ struct fsnotify_mark {
 	struct fsnotify_mark_connector *connector;
 	/* Events types to ignore [mark->lock, group->mark_mutex] */
 	__u32 ignored_mask;
-#define FSNOTIFY_MARK_FLAG_OBJECT_PINNED	0x04
-#define FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY	0x08
-#define FSNOTIFY_MARK_FLAG_ALIVE		0x10
-#define FSNOTIFY_MARK_FLAG_ATTACHED		0x20
+#define FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY	0x01
+#define FSNOTIFY_MARK_FLAG_ALIVE		0x02
+#define FSNOTIFY_MARK_FLAG_ATTACHED		0x04
 	unsigned int flags;		/* flags [mark->lock] */
 	void (*free_mark)(struct fsnotify_mark *mark); /* called on final put+free */
 };
-- 
2.10.2

* [PATCH 10/35] fsnotify: Remove indirection from mark list addition
From: Jan Kara @ 2017-04-03 15:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Adding a notification mark to an object list is currently done through the
fsnotify_add_{inode|vfsmount}_mark() helpers, which are called from
fsnotify_add_mark_locked() and in turn call fsnotify_add_mark_list().
Remove this unnecessary indirection to simplify the code.

Pushing all the locking into fsnotify_add_mark_list() also allows us to
allocate the connector structure with GFP_KERNEL instead of GFP_ATOMIC,
since the allocation no longer happens under a spinlock.
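
The allocate-then-publish pattern this enables can be sketched in userspace
roughly as follows. This is only an illustration under stated assumptions,
not the kernel code: struct connector and attach_connector() are
hypothetical names, calloc() stands in for the GFP_KERNEL allocation, and a
pthread mutex stands in for the object's spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fsnotify_mark_connector. */
struct connector {
	int nr_marks;
};

/*
 * Allocate outside the lock (the allocation may sleep), then publish the
 * new connector under the lock. If somebody else attached a connector in
 * the meantime, keep theirs and free ours.
 * Returns 0 on success, -1 if the allocation failed.
 */
static int attach_connector(struct connector **connp, pthread_mutex_t *lock)
{
	struct connector *conn;

	assert(connp != NULL && lock != NULL);

	conn = calloc(1, sizeof(*conn));
	if (!conn)
		return -1;

	pthread_mutex_lock(lock);
	if (!*connp) {
		*connp = conn;
		conn = NULL;	/* ownership passed to *connp */
	}
	pthread_mutex_unlock(lock);

	free(conn);		/* no-op unless we lost the race */
	return 0;
}
```

Calling attach_connector() twice on the same pointer leaves the first
connector in place; the second allocation is simply discarded.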

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/fsnotify.h      | 14 ----------
 fs/notify/inode_mark.c    | 25 ------------------
 fs/notify/mark.c          | 66 ++++++++++++++++++++++++++++++-----------------
 fs/notify/vfsmount_mark.c | 24 -----------------
 4 files changed, 43 insertions(+), 86 deletions(-)

diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index 1a2aec65ebd8..0354338aad78 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -21,20 +21,6 @@ extern u32 fsnotify_recalc_mask(struct fsnotify_mark_connector *conn);
 extern int fsnotify_compare_groups(struct fsnotify_group *a,
 				   struct fsnotify_group *b);
 
-/* Add mark to a proper place in mark list */
-extern int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
-				  struct fsnotify_mark *mark,
-				  struct inode *inode, struct vfsmount *mnt,
-				  int allow_dups);
-/* add a mark to an inode */
-extern int fsnotify_add_inode_mark(struct fsnotify_mark *mark,
-				   struct fsnotify_group *group, struct inode *inode,
-				   int allow_dups);
-/* add a mark to a vfsmount */
-extern int fsnotify_add_vfsmount_mark(struct fsnotify_mark *mark,
-				      struct fsnotify_group *group, struct vfsmount *mnt,
-				      int allow_dups);
-
 /* vfsmount specific destruction of a mark */
 extern void fsnotify_destroy_vfsmount_mark(struct fsnotify_mark *mark);
 /* inode specific destruction of a mark */
diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
index c3873b6920e7..87bef7d802db 100644
--- a/fs/notify/inode_mark.c
+++ b/fs/notify/inode_mark.c
@@ -92,31 +92,6 @@ struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group,
 	return mark;
 }
 
-/*
- * Attach an initialized mark to a given inode.
- * These marks may be used for the fsnotify backend to determine which
- * event types should be delivered to which group and for which inodes.  These
- * marks are ordered according to priority, highest number first, and then by
- * the group's location in memory.
- */
-int fsnotify_add_inode_mark(struct fsnotify_mark *mark,
-			    struct fsnotify_group *group, struct inode *inode,
-			    int allow_dups)
-{
-	int ret;
-
-	BUG_ON(!mutex_is_locked(&group->mark_mutex));
-	assert_spin_locked(&mark->lock);
-
-	spin_lock(&inode->i_lock);
-	ret = fsnotify_add_mark_list(&inode->i_fsnotify_marks, mark, inode,
-				     NULL, allow_dups);
-	inode->i_fsnotify_mask = fsnotify_recalc_mask(inode->i_fsnotify_marks);
-	spin_unlock(&inode->i_lock);
-
-	return ret;
-}
-
 /**
  * fsnotify_unmount_inodes - an sb is unmounting.  handle any watched inodes.
  * @sb: superblock being unmounted.
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 8a15c64fbe80..e8c2f829ce65 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -320,12 +320,13 @@ int fsnotify_compare_groups(struct fsnotify_group *a, struct fsnotify_group *b)
 
 static int fsnotify_attach_connector_to_object(
 					struct fsnotify_mark_connector **connp,
+					spinlock_t *lock,
 					struct inode *inode,
 					struct vfsmount *mnt)
 {
 	struct fsnotify_mark_connector *conn;
 
-	conn = kmem_cache_alloc(fsnotify_mark_connector_cachep, GFP_ATOMIC);
+	conn = kmem_cache_alloc(fsnotify_mark_connector_cachep, GFP_KERNEL);
 	if (!conn)
 		return -ENOMEM;
 	INIT_HLIST_HEAD(&conn->list);
@@ -341,7 +342,12 @@ static int fsnotify_attach_connector_to_object(
 	 * lockless_dereference() in fsnotify().
 	 */
 	smp_wmb();
-	*connp = conn;
+	spin_lock(lock);
+	if (!*connp)
+		*connp = conn;
+	else
+		kmem_cache_free(fsnotify_mark_connector_cachep, conn);
+	spin_unlock(lock);
 
 	return 0;
 }
@@ -352,20 +358,35 @@ static int fsnotify_attach_connector_to_object(
  * to which group and for which inodes. These marks are ordered according to
  * priority, highest number first, and then by the group's location in memory.
  */
-int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
-			   struct fsnotify_mark *mark, struct inode *inode,
-			   struct vfsmount *mnt, int allow_dups)
+static int fsnotify_add_mark_list(struct fsnotify_mark *mark,
+				  struct inode *inode, struct vfsmount *mnt,
+				  int allow_dups)
 {
 	struct fsnotify_mark *lmark, *last = NULL;
 	struct fsnotify_mark_connector *conn;
+	struct fsnotify_mark_connector **connp;
+	spinlock_t *lock;
 	int cmp;
-	int err;
+	int err = 0;
+
+	if (WARN_ON(!inode && !mnt))
+		return -EINVAL;
+	if (inode) {
+		connp = &inode->i_fsnotify_marks;
+		lock = &inode->i_lock;
+	} else {
+		connp = &real_mount(mnt)->mnt_fsnotify_marks;
+		lock = &mnt->mnt_root->d_lock;
+	}
 
 	if (!*connp) {
-		err = fsnotify_attach_connector_to_object(connp, inode, mnt);
+		err = fsnotify_attach_connector_to_object(connp, lock,
+							  inode, mnt);
 		if (err)
 			return err;
 	}
+	spin_lock(&mark->lock);
+	spin_lock(lock);
 	conn = *connp;
 
 	/* is mark the first mark? */
@@ -380,8 +401,10 @@ int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
 	hlist_for_each_entry(lmark, &conn->list, obj_list) {
 		last = lmark;
 
-		if ((lmark->group == mark->group) && !allow_dups)
-			return -EEXIST;
+		if ((lmark->group == mark->group) && !allow_dups) {
+			err = -EEXIST;
+			goto out_err;
+		}
 
 		cmp = fsnotify_compare_groups(lmark->group, mark->group);
 		if (cmp >= 0) {
@@ -395,7 +418,10 @@ int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp,
 	hlist_add_behind_rcu(&mark->obj_list, &last->obj_list);
 added:
 	mark->connector = conn;
-	return 0;
+out_err:
+	spin_unlock(lock);
+	spin_unlock(&mark->lock);
+	return err;
 }
 
 /*
@@ -427,22 +453,16 @@ int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
 	list_add(&mark->g_list, &group->marks_list);
 	atomic_inc(&group->num_marks);
 	fsnotify_get_mark(mark); /* for i_list and g_list */
-
-	if (inode) {
-		ret = fsnotify_add_inode_mark(mark, group, inode, allow_dups);
-		if (ret)
-			goto err;
-	} else if (mnt) {
-		ret = fsnotify_add_vfsmount_mark(mark, group, mnt, allow_dups);
-		if (ret)
-			goto err;
-	} else {
-		BUG();
-	}
 	spin_unlock(&mark->lock);
 
+	ret = fsnotify_add_mark_list(mark, inode, mnt, allow_dups);
+	if (ret)
+		goto err;
+
 	if (inode)
-		__fsnotify_update_child_dentry_flags(inode);
+		fsnotify_recalc_inode_mask(inode);
+	else
+		fsnotify_recalc_vfsmount_mask(mnt);
 
 	return ret;
 err:
diff --git a/fs/notify/vfsmount_mark.c b/fs/notify/vfsmount_mark.c
index e04e33ef02d4..49ccbdb74f82 100644
--- a/fs/notify/vfsmount_mark.c
+++ b/fs/notify/vfsmount_mark.c
@@ -80,27 +80,3 @@ struct fsnotify_mark *fsnotify_find_vfsmount_mark(struct fsnotify_group *group,
 
 	return mark;
 }
-
-/*
- * Attach an initialized mark to a given group and vfsmount.
- * These marks may be used for the fsnotify backend to determine which
- * event types should be delivered to which groups.
- */
-int fsnotify_add_vfsmount_mark(struct fsnotify_mark *mark,
-			       struct fsnotify_group *group, struct vfsmount *mnt,
-			       int allow_dups)
-{
-	struct mount *m = real_mount(mnt);
-	int ret;
-
-	BUG_ON(!mutex_is_locked(&group->mark_mutex));
-	assert_spin_locked(&mark->lock);
-
-	spin_lock(&mnt->mnt_root->d_lock);
-	ret = fsnotify_add_mark_list(&m->mnt_fsnotify_marks, mark, NULL, mnt,
-				     allow_dups);
-	m->mnt_fsnotify_mask = fsnotify_recalc_mask(m->mnt_fsnotify_marks);
-	spin_unlock(&mnt->mnt_root->d_lock);
-
-	return ret;
-}
-- 
2.10.2

* [PATCH 11/35] fsnotify: Move fsnotify_destroy_marks()
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Move fsnotify_destroy_marks() later in fs/notify/mark.c. It will need
some functions that are defined after its current location. No
functional change.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/mark.c | 72 ++++++++++++++++++++++++++++----------------------------
 1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index e8c2f829ce65..b3f83ed6e8be 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -224,42 +224,6 @@ void fsnotify_destroy_mark(struct fsnotify_mark *mark,
 	fsnotify_free_mark(mark);
 }
 
-void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn,
-			    spinlock_t *lock)
-{
-	struct fsnotify_mark *mark;
-
-	if (!conn)
-		return;
-
-	while (1) {
-		/*
-		 * We have to be careful since we can race with e.g.
-		 * fsnotify_clear_marks_by_group() and once we drop 'lock',
-		 * mark can get removed from the obj_list and destroyed. But
-		 * we are holding mark reference so mark cannot be freed and
-		 * calling fsnotify_destroy_mark() more than once is fine.
-		 */
-		spin_lock(lock);
-		if (hlist_empty(&conn->list)) {
-			spin_unlock(lock);
-			break;
-		}
-		mark = hlist_entry(conn->list.first, struct fsnotify_mark,
-				   obj_list);
-		/*
-		 * We don't update i_fsnotify_mask / mnt_fsnotify_mask here
-		 * since inode / mount is going away anyway. So just remove
-		 * mark from the list.
-		 */
-		hlist_del_init_rcu(&mark->obj_list);
-		fsnotify_get_mark(mark);
-		spin_unlock(lock);
-		fsnotify_destroy_mark(mark, mark->group);
-		fsnotify_put_mark(mark);
-	}
-}
-
 void fsnotify_connector_free(struct fsnotify_mark_connector **connp)
 {
 	if (*connp) {
@@ -580,6 +544,42 @@ void fsnotify_detach_group_marks(struct fsnotify_group *group)
 	}
 }
 
+void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn,
+			    spinlock_t *lock)
+{
+	struct fsnotify_mark *mark;
+
+	if (!conn)
+		return;
+
+	while (1) {
+		/*
+		 * We have to be careful since we can race with e.g.
+		 * fsnotify_clear_marks_by_group() and once we drop 'lock',
+		 * mark can get removed from the obj_list and destroyed. But
+		 * we are holding mark reference so mark cannot be freed and
+		 * calling fsnotify_destroy_mark() more than once is fine.
+		 */
+		spin_lock(lock);
+		if (hlist_empty(&conn->list)) {
+			spin_unlock(lock);
+			break;
+		}
+		mark = hlist_entry(conn->list.first, struct fsnotify_mark,
+				   obj_list);
+		/*
+		 * We don't update i_fsnotify_mask / mnt_fsnotify_mask here
+		 * since inode / mount is going away anyway. So just remove
+		 * mark from the list.
+		 */
+		hlist_del_init_rcu(&mark->obj_list);
+		fsnotify_get_mark(mark);
+		spin_unlock(lock);
+		fsnotify_destroy_mark(mark, mark->group);
+		fsnotify_put_mark(mark);
+	}
+}
+
 /*
  * Nothing fancy, just initialize lists and locks and counters.
  */
-- 
2.10.2

* [PATCH 12/35] fsnotify: Move locking into fsnotify_recalc_mask()
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Move the acquisition of the locks protecting a list of marks into
fsnotify_recalc_mask(). This reduces code churn in the following patch,
which changes the lock protecting the list of marks.
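
As a rough userspace sketch of the resulting shape (an illustration only,
not the kernel code): a locked helper ORs together the masks of all marks
and stores the result in the object, and the public function takes the lock
itself. The names struct object, recalc_mask_locked() and recalc_mask() are
hypothetical, and a pthread mutex stands in for i_lock / d_lock. The real
function additionally updates child dentry flags for inodes after dropping
the lock, which is omitted here.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the fsnotify structures. */
struct mark {
	uint32_t mask;
	struct mark *next;
};

struct object {
	pthread_mutex_t lock;	/* protects 'marks' and 'mask' */
	struct mark *marks;
	uint32_t mask;		/* union of all mark masks */
};

/* Caller must hold obj->lock. */
static void recalc_mask_locked(struct object *obj)
{
	uint32_t new_mask = 0;
	struct mark *m;

	assert(obj != NULL);
	for (m = obj->marks; m; m = m->next)
		new_mask |= m->mask;
	obj->mask = new_mask;
}

/* Takes the lock itself, so callers no longer need to know which lock. */
static void recalc_mask(struct object *obj)
{
	if (!obj)
		return;

	pthread_mutex_lock(&obj->lock);
	recalc_mask_locked(obj);
	pthread_mutex_unlock(&obj->lock);
}
```

Because the lock is chosen inside recalc_mask(), a later change of the
locking scheme only has to touch this one function rather than every caller.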

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/dnotify/dnotify.c      |  3 +--
 fs/notify/fsnotify.h             |  3 ---
 fs/notify/inode_mark.c           | 18 +++---------------
 fs/notify/mark.c                 | 40 ++++++++++++++++++++++++++++++----------
 fs/notify/vfsmount_mark.c        | 13 +++----------
 include/linux/fsnotify_backend.h |  2 ++
 6 files changed, 39 insertions(+), 40 deletions(-)

diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index 5024729dba23..41b2a070761c 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -69,8 +69,7 @@ static void dnotify_recalc_inode_mask(struct fsnotify_mark *fsn_mark)
 	if (old_mask == new_mask)
 		return;
 
-	if (fsn_mark->connector)
-		fsnotify_recalc_inode_mask(fsn_mark->connector->inode);
+	fsnotify_recalc_mask(fsn_mark->connector);
 }
 
 /*
diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index 0354338aad78..96051780d50e 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -14,9 +14,6 @@ extern void fsnotify_flush_notify(struct fsnotify_group *group);
 /* protects reads of inode and vfsmount marks list */
 extern struct srcu_struct fsnotify_mark_srcu;
 
-/* Calculate mask of events for a list of marks */
-extern u32 fsnotify_recalc_mask(struct fsnotify_mark_connector *conn);
-
 /* compare two groups for sorting of marks lists */
 extern int fsnotify_compare_groups(struct fsnotify_group *a,
 				   struct fsnotify_group *b);
diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
index 87bef7d802db..9b2f4e6eb8eb 100644
--- a/fs/notify/inode_mark.c
+++ b/fs/notify/inode_mark.c
@@ -30,17 +30,9 @@
 
 #include "../internal.h"
 
-/*
- * Recalculate the inode->i_fsnotify_mask, or the mask of all FS_* event types
- * any notifier is interested in hearing for this inode.
- */
 void fsnotify_recalc_inode_mask(struct inode *inode)
 {
-	spin_lock(&inode->i_lock);
-	inode->i_fsnotify_mask = fsnotify_recalc_mask(inode->i_fsnotify_marks);
-	spin_unlock(&inode->i_lock);
-
-	__fsnotify_update_child_dentry_flags(inode);
+	fsnotify_recalc_mask(inode->i_fsnotify_marks);
 }
 
 struct inode *fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
@@ -57,14 +49,10 @@ struct inode *fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
 	empty = hlist_empty(&mark->connector->list);
 	mark->connector = NULL;
 
-	/*
-	 * this mark is now off the inode->i_fsnotify_marks list and we
-	 * hold the inode->i_lock, so this is the perfect time to update the
-	 * inode->i_fsnotify_mask
-	 */
-	inode->i_fsnotify_mask = fsnotify_recalc_mask(inode->i_fsnotify_marks);
 	spin_unlock(&inode->i_lock);
 
+	fsnotify_recalc_mask(inode->i_fsnotify_marks);
+
 	return empty ? inode : NULL;
 }
 
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index b3f83ed6e8be..06faf166c7ae 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -105,18 +105,40 @@ void fsnotify_put_mark(struct fsnotify_mark *mark)
 	}
 }
 
-/* Calculate mask of events for a list of marks */
-u32 fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
+static void __fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 {
 	u32 new_mask = 0;
 	struct fsnotify_mark *mark;
 
-	if (!conn)
-		return 0;
-
 	hlist_for_each_entry(mark, &conn->list, obj_list)
 		new_mask |= mark->mask;
-	return new_mask;
+	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
+		conn->inode->i_fsnotify_mask = new_mask;
+	else if (conn->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT)
+		real_mount(conn->mnt)->mnt_fsnotify_mask = new_mask;
+}
+
+/*
+ * Calculate mask of events for a list of marks. The caller must make sure
+ * connector cannot disappear under us (usually by holding a mark->lock or
+ * mark->group->mark_mutex for a mark on this list).
+ */
+void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
+{
+	if (!conn)
+		return;
+
+	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
+		spin_lock(&conn->inode->i_lock);
+	else
+		spin_lock(&conn->mnt->mnt_root->d_lock);
+	__fsnotify_recalc_mask(conn);
+	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE) {
+		spin_unlock(&conn->inode->i_lock);
+		__fsnotify_update_child_dentry_flags(conn->inode);
+	} else {
+		spin_unlock(&conn->mnt->mnt_root->d_lock);
+	}
 }
 
 /*
@@ -423,10 +445,8 @@ int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
 	if (ret)
 		goto err;
 
-	if (inode)
-		fsnotify_recalc_inode_mask(inode);
-	else
-		fsnotify_recalc_vfsmount_mask(mnt);
+	if (mark->mask)
+		fsnotify_recalc_mask(mark->connector);
 
 	return ret;
 err:
diff --git a/fs/notify/vfsmount_mark.c b/fs/notify/vfsmount_mark.c
index 49ccbdb74f82..ffe0d7098cba 100644
--- a/fs/notify/vfsmount_mark.c
+++ b/fs/notify/vfsmount_mark.c
@@ -34,17 +34,9 @@ void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group)
 	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_OBJ_TYPE_VFSMOUNT);
 }
 
-/*
- * Recalculate the mnt->mnt_fsnotify_mask, or the mask of all FS_* event types
- * any notifier is interested in hearing for this mount point
- */
 void fsnotify_recalc_vfsmount_mask(struct vfsmount *mnt)
 {
-	struct mount *m = real_mount(mnt);
-
-	spin_lock(&mnt->mnt_root->d_lock);
-	m->mnt_fsnotify_mask = fsnotify_recalc_mask(m->mnt_fsnotify_marks);
-	spin_unlock(&mnt->mnt_root->d_lock);
+	fsnotify_recalc_mask(real_mount(mnt)->mnt_fsnotify_marks);
 }
 
 void fsnotify_destroy_vfsmount_mark(struct fsnotify_mark *mark)
@@ -60,8 +52,9 @@ void fsnotify_destroy_vfsmount_mark(struct fsnotify_mark *mark)
 	hlist_del_init_rcu(&mark->obj_list);
 	mark->connector = NULL;
 
-	m->mnt_fsnotify_mask = fsnotify_recalc_mask(m->mnt_fsnotify_marks);
 	spin_unlock(&mnt->mnt_root->d_lock);
+
+	fsnotify_recalc_mask(m->mnt_fsnotify_marks);
 }
 
 /*
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 96333fb09309..b954f1b2571c 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -327,6 +327,8 @@ extern struct fsnotify_event *fsnotify_remove_first_event(struct fsnotify_group
 
 /* functions used to manipulate the marks attached to inodes */
 
+/* Calculate mask of events for a list of marks */
+extern void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn);
 /* run all marks associated with a vfsmount and update mnt->mnt_fsnotify_mask */
 extern void fsnotify_recalc_vfsmount_mask(struct vfsmount *mnt);
 /* run all marks associated with an inode and update inode->i_fsnotify_mask */
-- 
2.10.2

* [PATCH 13/35] fsnotify: Move locking into fsnotify_find_mark()
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Move locking of a mark list into fsnotify_find_mark(). This reduces code
churn in the following patch, which changes the lock protecting the list.
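
A minimal userspace sketch of a lookup that does its own locking might look
like this. It is an illustration under assumptions, not the kernel code:
struct connector, struct mark and find_mark() are hypothetical names, a
plain integer stands in for the mark's reference count, and a pthread mutex
stands in for the list lock. The important point is that the reference is
taken before the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the fsnotify structures. */
struct group {
	int id;
};

struct mark {
	struct group *group;
	int refcount;
	struct mark *next;
};

struct connector {
	pthread_mutex_t lock;	/* protects 'marks' */
	struct mark *marks;
};

/*
 * Find the mark belonging to 'group'. On success the mark is returned
 * with an extra reference, taken while the list lock is still held so
 * that the mark cannot be freed between lookup and use.
 */
static struct mark *find_mark(struct connector *conn, struct group *group)
{
	struct mark *m;

	assert(group != NULL);
	if (!conn)
		return NULL;

	pthread_mutex_lock(&conn->lock);
	for (m = conn->marks; m; m = m->next) {
		if (m->group == group) {
			m->refcount++;
			break;
		}
	}
	pthread_mutex_unlock(&conn->lock);

	return m;	/* NULL if the loop ran off the end of the list */
}
```

Unlike the patch, this sketch uses a single unlock path by breaking out of
the loop; the behaviour is the same.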

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/inode_mark.c    | 8 +-------
 fs/notify/mark.c          | 8 ++++++++
 fs/notify/vfsmount_mark.c | 7 +------
 3 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
index 9b2f4e6eb8eb..f05fc49b8242 100644
--- a/fs/notify/inode_mark.c
+++ b/fs/notify/inode_mark.c
@@ -71,13 +71,7 @@ void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group)
 struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group,
 					       struct inode *inode)
 {
-	struct fsnotify_mark *mark;
-
-	spin_lock(&inode->i_lock);
-	mark = fsnotify_find_mark(inode->i_fsnotify_marks, group);
-	spin_unlock(&inode->i_lock);
-
-	return mark;
+	return fsnotify_find_mark(inode->i_fsnotify_marks, group);
 }
 
 /**
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 06faf166c7ae..0830e0af997a 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -485,16 +485,24 @@ struct fsnotify_mark *fsnotify_find_mark(struct fsnotify_mark_connector *conn,
 					 struct fsnotify_group *group)
 {
 	struct fsnotify_mark *mark;
+	spinlock_t *lock;
 
 	if (!conn)
 		return NULL;
 
+	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
+		lock = &conn->inode->i_lock;
+	else
+		lock = &conn->mnt->mnt_root->d_lock;
+	spin_lock(lock);
 	hlist_for_each_entry(mark, &conn->list, obj_list) {
 		if (mark->group == group) {
 			fsnotify_get_mark(mark);
+			spin_unlock(lock);
 			return mark;
 		}
 	}
+	spin_unlock(lock);
 	return NULL;
 }
 
diff --git a/fs/notify/vfsmount_mark.c b/fs/notify/vfsmount_mark.c
index ffe0d7098cba..3476ee44b2c5 100644
--- a/fs/notify/vfsmount_mark.c
+++ b/fs/notify/vfsmount_mark.c
@@ -65,11 +65,6 @@ struct fsnotify_mark *fsnotify_find_vfsmount_mark(struct fsnotify_group *group,
 						  struct vfsmount *mnt)
 {
 	struct mount *m = real_mount(mnt);
-	struct fsnotify_mark *mark;
 
-	spin_lock(&mnt->mnt_root->d_lock);
-	mark = fsnotify_find_mark(m->mnt_fsnotify_marks, group);
-	spin_unlock(&mnt->mnt_root->d_lock);
-
-	return mark;
+	return fsnotify_find_mark(m->mnt_fsnotify_marks, group);
 }
-- 
2.10.2

* [PATCH 14/35] fsnotify: Determine lock in fsnotify_destroy_marks()
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Instead of passing a spinlock into fsnotify_destroy_marks(), determine it
directly in that function from the connector type. This will reduce code
churn when changing the lock protecting the list of marks.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/fsnotify.h | 10 ++++------
 fs/notify/mark.c     |  9 +++++++--
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index 96051780d50e..225924274f8a 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -26,19 +26,17 @@ extern struct inode *fsnotify_destroy_inode_mark(struct fsnotify_mark *mark);
 extern struct fsnotify_mark *fsnotify_find_mark(
 					struct fsnotify_mark_connector *conn,
 					struct fsnotify_group *group);
-/* Destroy all marks connected via given connector protected by 'lock' */
-extern void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn,
-				   spinlock_t *lock);
+/* Destroy all marks connected via given connector */
+extern void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn);
 /* run the list of all marks associated with inode and destroy them */
 static inline void fsnotify_clear_marks_by_inode(struct inode *inode)
 {
-	fsnotify_destroy_marks(inode->i_fsnotify_marks, &inode->i_lock);
+	fsnotify_destroy_marks(inode->i_fsnotify_marks);
 }
 /* run the list of all marks associated with vfsmount and destroy them */
 static inline void fsnotify_clear_marks_by_mount(struct vfsmount *mnt)
 {
-	fsnotify_destroy_marks(real_mount(mnt)->mnt_fsnotify_marks,
-			       &mnt->mnt_root->d_lock);
+	fsnotify_destroy_marks(real_mount(mnt)->mnt_fsnotify_marks);
 }
 /* prepare for freeing all marks associated with given group */
 extern void fsnotify_detach_group_marks(struct fsnotify_group *group);
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 0830e0af997a..f32ca924c44e 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -572,14 +572,19 @@ void fsnotify_detach_group_marks(struct fsnotify_group *group)
 	}
 }
 
-void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn,
-			    spinlock_t *lock)
+void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn)
 {
 	struct fsnotify_mark *mark;
+	spinlock_t *lock;
 
 	if (!conn)
 		return;
 
+	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
+		lock = &conn->inode->i_lock;
+	else
+		lock = &conn->mnt->mnt_root->d_lock;
+
 	while (1) {
 		/*
 		 * We have to be careful since we can race with e.g.
-- 
2.10.2

* [PATCH 15/35] fsnotify: Remove indirection from fsnotify_detach_mark()
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

fsnotify_detach_mark() calls fsnotify_destroy_inode_mark() or
fsnotify_destroy_vfsmount_mark() to remove a mark from the object list.
These two functions are, however, very similar and differ only in the lock
they use to protect the object list of marks. Simplify the code by
removing the indirection and removing the mark from the object list in a
common function.
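
The common detach path can be sketched in userspace as below. This is an
illustration only, with hypothetical names (struct object,
detach_from_object(), detach_mark()): a singly linked list replaces the
hlist, a pthread mutex replaces the spinlock, and decrementing a counter
stands in for iput(). In the patch only inodes are returned for a reference
drop; the sketch has a single object type.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the fsnotify structures. */
struct object {
	int refcount;
};

struct mark;

struct connector {
	pthread_mutex_t lock;	/* protects 'marks' */
	struct object *obj;	/* pinned while 'marks' is non-empty */
	struct mark *marks;
};

struct mark {
	struct connector *conn;
	struct mark *next;
};

/*
 * Unlink 'mark' from its connector's list. If the list became empty,
 * return the object whose reference the caller must drop once it has
 * released its own locks; otherwise return NULL.
 */
static struct object *detach_from_object(struct mark *mark)
{
	struct connector *conn = mark->conn;
	struct object *to_put = NULL;
	struct mark **pp;

	assert(conn != NULL);

	pthread_mutex_lock(&conn->lock);
	for (pp = &conn->marks; *pp; pp = &(*pp)->next) {
		if (*pp == mark) {
			*pp = mark->next;
			mark->next = NULL;
			break;
		}
	}
	if (!conn->marks)
		to_put = conn->obj;
	mark->conn = NULL;
	pthread_mutex_unlock(&conn->lock);

	return to_put;
}

static void detach_mark(struct mark *mark)
{
	struct object *obj = detach_from_object(mark);

	if (obj)
		obj->refcount--;	/* stands in for iput() */
}
```

Returning the object instead of dropping the reference inside the helper
lets the caller defer the drop until it no longer holds any spinlock.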

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/fsnotify.h      |  4 ----
 fs/notify/inode_mark.c    | 21 ---------------------
 fs/notify/mark.c          | 32 ++++++++++++++++++++++++++------
 fs/notify/vfsmount_mark.c | 18 ------------------
 4 files changed, 26 insertions(+), 49 deletions(-)

diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index 225924274f8a..510f027bdf0f 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -18,10 +18,6 @@ extern struct srcu_struct fsnotify_mark_srcu;
 extern int fsnotify_compare_groups(struct fsnotify_group *a,
 				   struct fsnotify_group *b);
 
-/* vfsmount specific destruction of a mark */
-extern void fsnotify_destroy_vfsmount_mark(struct fsnotify_mark *mark);
-/* inode specific destruction of a mark */
-extern struct inode *fsnotify_destroy_inode_mark(struct fsnotify_mark *mark);
 /* Find mark belonging to given group in the list of marks */
 extern struct fsnotify_mark *fsnotify_find_mark(
 					struct fsnotify_mark_connector *conn,
diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
index f05fc49b8242..080b6d8b9973 100644
--- a/fs/notify/inode_mark.c
+++ b/fs/notify/inode_mark.c
@@ -35,27 +35,6 @@ void fsnotify_recalc_inode_mask(struct inode *inode)
 	fsnotify_recalc_mask(inode->i_fsnotify_marks);
 }
 
-struct inode *fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
-{
-	struct inode *inode = mark->connector->inode;
-	bool empty;
-
-	BUG_ON(!mutex_is_locked(&mark->group->mark_mutex));
-	assert_spin_locked(&mark->lock);
-
-	spin_lock(&inode->i_lock);
-
-	hlist_del_init_rcu(&mark->obj_list);
-	empty = hlist_empty(&mark->connector->list);
-	mark->connector = NULL;
-
-	spin_unlock(&inode->i_lock);
-
-	fsnotify_recalc_mask(inode->i_fsnotify_marks);
-
-	return empty ? inode : NULL;
-}
-
 /*
  * Given a group clear all of the inode marks associated with that group.
  */
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index f32ca924c44e..08ab7b252322 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -141,6 +141,30 @@ void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 	}
 }
 
+static struct inode *fsnotify_detach_from_object(struct fsnotify_mark *mark)
+{
+	struct fsnotify_mark_connector *conn;
+	struct inode *inode = NULL;
+	spinlock_t *lock;
+
+	conn = mark->connector;
+	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
+		lock = &conn->inode->i_lock;
+	else
+		lock = &conn->mnt->mnt_root->d_lock;
+	spin_lock(lock);
+	hlist_del_init_rcu(&mark->obj_list);
+	if (hlist_empty(&conn->list)) {
+		if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
+			inode = conn->inode;
+	}
+	mark->connector = NULL;
+	spin_unlock(lock);
+	fsnotify_recalc_mask(conn);
+
+	return inode;
+}
+
 /*
  * Remove mark from inode / vfsmount list, group list, drop inode reference
  * if we got one.
@@ -164,12 +188,8 @@ void fsnotify_detach_mark(struct fsnotify_mark *mark)
 
 	mark->flags &= ~FSNOTIFY_MARK_FLAG_ATTACHED;
 
-	if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_INODE)
-		inode = fsnotify_destroy_inode_mark(mark);
-	else if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT)
-		fsnotify_destroy_vfsmount_mark(mark);
-	else
-		BUG();
+	inode = fsnotify_detach_from_object(mark);
+
 	/*
 	 * Note that we didn't update flags telling whether inode cares about
 	 * what's happening with children. We update these flags from
diff --git a/fs/notify/vfsmount_mark.c b/fs/notify/vfsmount_mark.c
index 3476ee44b2c5..26da5c209944 100644
--- a/fs/notify/vfsmount_mark.c
+++ b/fs/notify/vfsmount_mark.c
@@ -39,24 +39,6 @@ void fsnotify_recalc_vfsmount_mask(struct vfsmount *mnt)
 	fsnotify_recalc_mask(real_mount(mnt)->mnt_fsnotify_marks);
 }
 
-void fsnotify_destroy_vfsmount_mark(struct fsnotify_mark *mark)
-{
-	struct vfsmount *mnt = mark->connector->mnt;
-	struct mount *m = real_mount(mnt);
-
-	BUG_ON(!mutex_is_locked(&mark->group->mark_mutex));
-	assert_spin_locked(&mark->lock);
-
-	spin_lock(&mnt->mnt_root->d_lock);
-
-	hlist_del_init_rcu(&mark->obj_list);
-	mark->connector = NULL;
-
-	spin_unlock(&mnt->mnt_root->d_lock);
-
-	fsnotify_recalc_mask(m->mnt_fsnotify_marks);
-}
-
 /*
  * given a group and vfsmount, find the mark associated with that combination.
  * if found take a reference to that mark and return it, else return NULL
-- 
2.10.2


* [PATCH 16/35] fsnotify: Avoid double locking in fsnotify_detach_from_object()
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (14 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 15/35] fsnotify: Remove indirection from fsnotify_detach_mark() Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 17/35] fsnotify: Remove useless list deletion and comment Jan Kara
                   ` (18 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

We take the object list lock in fsnotify_detach_from_object() twice: once
to detach the mark and a second time to recalculate the mask. That is
unnecessary, and later it will become problematic as we will free the
connector as soon as there is no mark in it. So move recalculation of the
fsnotify mask into the same critical section that detaches the mark.

This also removes recalculation of child dentry flags from
fsnotify_detach_from_object(). That is fine, however: those flags will
get recalculated once some event happens on a child.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
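A minimal userspace sketch of the single critical section, assuming a
pthread mutex in place of the object list lock and a small array in place
of the hlist. The names mark_list, recalc_mask_locked() and detach_mark()
are made up for illustration and do not exist in the kernel:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MAX_MARKS 8	/* arbitrary bound for this sketch */

/* Hypothetical userspace model of an object and its list of marks. */
struct mark_list {
	pthread_mutex_t lock;		/* stands in for the object list lock */
	uint32_t marks[MAX_MARKS];	/* event mask of each attached mark */
	int nr_marks;
	uint32_t object_mask;		/* cached OR of all mark masks */
};

/* Recompute the cached mask. The caller must hold list->lock. */
static void recalc_mask_locked(struct mark_list *list)
{
	uint32_t new_mask = 0;

	for (int i = 0; i < list->nr_marks; i++)
		new_mask |= list->marks[i];
	list->object_mask = new_mask;
}

/*
 * Remove the mark at index idx and refresh the cached mask under one
 * lock acquisition. Nothing touches the list after the unlock.
 */
static void detach_mark(struct mark_list *list, int idx)
{
	pthread_mutex_lock(&list->lock);
	assert(idx >= 0 && idx < list->nr_marks);
	/* Order of marks does not matter here, so move the last one down. */
	list->marks[idx] = list->marks[list->nr_marks - 1];
	list->nr_marks--;
	recalc_mask_locked(list);
	pthread_mutex_unlock(&list->lock);
}
```

Because the mask is refreshed before the lock is dropped, the function
never dereferences the list once it is unlocked, which is the property
that matters when the structure may be freed right after the last
detach.
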
 fs/notify/mark.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 08ab7b252322..416ba91750a9 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -158,9 +158,9 @@ static struct inode *fsnotify_detach_from_object(struct fsnotify_mark *mark)
 		if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
 			inode = conn->inode;
 	}
+	__fsnotify_recalc_mask(conn);
 	mark->connector = NULL;
 	spin_unlock(lock);
-	fsnotify_recalc_mask(conn);
 
 	return inode;
 }
-- 
2.10.2


* [PATCH 17/35] fsnotify: Remove useless list deletion and comment
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (15 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 16/35] fsnotify: Avoid double locking in fsnotify_detach_from_object() Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 18/35] fsnotify: Lock object list with connector lock Jan Kara
                   ` (17 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

After removing all the indirection it is clear that

hlist_del_init_rcu(&mark->obj_list);

in fsnotify_destroy_marks() is not needed, as the mark gets removed from
the list shortly afterwards in fsnotify_destroy_mark() ->
fsnotify_detach_mark() -> fsnotify_detach_from_object(). Also, there is
no problem with the mark being visible on the object list while we call
fsnotify_destroy_mark(), as parallel destruction of marks from several
places is properly handled (as mentioned in the comment in
fsnotify_destroy_marks()). So just remove the list removal and also the
stale comment.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/mark.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 416ba91750a9..b5b641a2b557 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -620,12 +620,6 @@ void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn)
 		}
 		mark = hlist_entry(conn->list.first, struct fsnotify_mark,
 				   obj_list);
-		/*
-		 * We don't update i_fsnotify_mask / mnt_fsnotify_mask here
-		 * since inode / mount is going away anyway. So just remove
-		 * mark from the list.
-		 */
-		hlist_del_init_rcu(&mark->obj_list);
 		fsnotify_get_mark(mark);
 		spin_unlock(lock);
 		fsnotify_destroy_mark(mark, mark->group);
-- 
2.10.2


* [PATCH 18/35] fsnotify: Lock object list with connector lock
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (16 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 17/35] fsnotify: Remove useless list deletion and comment Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 19/35] fsnotify: Free fsnotify_mark_connector when there is no mark attached Jan Kara
                   ` (16 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

So far the list of marks attached to an object (inode / vfsmount) was
protected by i_lock or mnt_root->d_lock. This dictates that the list
must be empty before the object can be destroyed, although the list is
now anchored in the fsnotify_mark_connector structure. Protect the list
with a spinlock in the fsnotify_mark_connector structure to decouple the
lifetime of the list of marks from the lifetime of the object. This also
simplifies the code quite a bit since we no longer have to differentiate
between inode and vfsmount lists in quite a few places.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
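With the lock living in the connector itself, installing the connector
can no longer be serialized by that lock, so the patch publishes it with
cmpxchg(). A rough userspace analogue using C11 atomics is sketched
below. The struct layout and the attach_connector() name are assumptions
for illustration only, and calloc()/free() stand in for the kmem cache:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fsnotify_mark_connector. */
struct connector {
	int nr_marks;
};

/*
 * Make sure *slot points to a connector. Returns 0 on success or -1 if
 * the allocation fails. If two threads race, exactly one allocation is
 * published and the loser frees its own copy.
 */
static int attach_connector(struct connector *_Atomic *slot)
{
	struct connector *expected = NULL;
	struct connector *conn;

	assert(slot);
	conn = calloc(1, sizeof(*conn));
	if (!conn)
		return -1;
	/*
	 * The default (sequentially consistent) compare-exchange orders
	 * the initialization above before the pointer becomes visible.
	 */
	if (!atomic_compare_exchange_strong(slot, &expected, conn)) {
		/* Someone else installed a connector first. */
		free(conn);
	}
	return 0;
}
```

A caller that finds the slot empty calls attach_connector() and then
re-reads the slot; whichever thread won the race, the slot now holds a
fully initialized connector.
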
 fs/notify/mark.c                 | 90 ++++++++++++++--------------------------
 include/linux/fsnotify_backend.h |  3 +-
 2 files changed, 34 insertions(+), 59 deletions(-)

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index b5b641a2b557..bfb415d0d757 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -33,7 +33,7 @@
  *
  * group->mark_mutex
  * mark->lock
- * inode->i_lock
+ * mark->connector->lock
  *
  * group->mark_mutex protects the marks_list anchored inside a given group and
  * each mark is hooked via the g_list.  It also protects the groups private
@@ -44,10 +44,12 @@
  * is assigned to as well as the access to a reference of the inode/vfsmount
  * that is being watched by the mark.
  *
- * inode->i_lock protects the i_fsnotify_marks list anchored inside a
- * given inode and each mark is hooked via the i_list. (and sorta the
- * free_i_list)
+ * mark->connector->lock protects the list of marks anchored inside an
+ * inode / vfsmount and each mark is hooked via the i_list.
  *
+ * A list of notification marks relating to inode / mnt is contained in
+ * fsnotify_mark_connector. That structure is alive as long as there are any
+ * marks in the list and is also protected by fsnotify_mark_srcu.
  *
  * LIFETIME:
  * Inode marks survive between when they are added to an inode and when their
@@ -110,8 +112,10 @@ static void __fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 	u32 new_mask = 0;
 	struct fsnotify_mark *mark;
 
+	assert_spin_locked(&conn->lock);
 	hlist_for_each_entry(mark, &conn->list, obj_list)
 		new_mask |= mark->mask;
+
 	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
 		conn->inode->i_fsnotify_mask = new_mask;
 	else if (conn->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT)
@@ -128,31 +132,20 @@ void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 	if (!conn)
 		return;
 
-	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
-		spin_lock(&conn->inode->i_lock);
-	else
-		spin_lock(&conn->mnt->mnt_root->d_lock);
+	spin_lock(&conn->lock);
 	__fsnotify_recalc_mask(conn);
-	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE) {
-		spin_unlock(&conn->inode->i_lock);
+	spin_unlock(&conn->lock);
+	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
 		__fsnotify_update_child_dentry_flags(conn->inode);
-	} else {
-		spin_unlock(&conn->mnt->mnt_root->d_lock);
-	}
 }
 
 static struct inode *fsnotify_detach_from_object(struct fsnotify_mark *mark)
 {
 	struct fsnotify_mark_connector *conn;
 	struct inode *inode = NULL;
-	spinlock_t *lock;
 
 	conn = mark->connector;
-	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
-		lock = &conn->inode->i_lock;
-	else
-		lock = &conn->mnt->mnt_root->d_lock;
-	spin_lock(lock);
+	spin_lock(&conn->lock);
 	hlist_del_init_rcu(&mark->obj_list);
 	if (hlist_empty(&conn->list)) {
 		if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
@@ -160,7 +153,7 @@ static struct inode *fsnotify_detach_from_object(struct fsnotify_mark *mark)
 	}
 	__fsnotify_recalc_mask(conn);
 	mark->connector = NULL;
-	spin_unlock(lock);
+	spin_unlock(&conn->lock);
 
 	return inode;
 }
@@ -326,7 +319,6 @@ int fsnotify_compare_groups(struct fsnotify_group *a, struct fsnotify_group *b)
 
 static int fsnotify_attach_connector_to_object(
 					struct fsnotify_mark_connector **connp,
-					spinlock_t *lock,
 					struct inode *inode,
 					struct vfsmount *mnt)
 {
@@ -335,6 +327,7 @@ static int fsnotify_attach_connector_to_object(
 	conn = kmem_cache_alloc(fsnotify_mark_connector_cachep, GFP_KERNEL);
 	if (!conn)
 		return -ENOMEM;
+	spin_lock_init(&conn->lock);
 	INIT_HLIST_HEAD(&conn->list);
 	if (inode) {
 		conn->flags = FSNOTIFY_OBJ_TYPE_INODE;
@@ -344,16 +337,13 @@ static int fsnotify_attach_connector_to_object(
 		conn->mnt = mnt;
 	}
 	/*
-	 * Make sure 'conn' initialization is visible. Matches
-	 * lockless_dereference() in fsnotify().
+	 * cmpxchg() provides the barrier so that readers of *connp can see
+	 * only initialized structure
 	 */
-	smp_wmb();
-	spin_lock(lock);
-	if (!*connp)
-		*connp = conn;
-	else
+	if (cmpxchg(connp, NULL, conn)) {
+		/* Someone else created list structure for us */
 		kmem_cache_free(fsnotify_mark_connector_cachep, conn);
-	spin_unlock(lock);
+	}
 
 	return 0;
 }
@@ -371,35 +361,30 @@ static int fsnotify_add_mark_list(struct fsnotify_mark *mark,
 	struct fsnotify_mark *lmark, *last = NULL;
 	struct fsnotify_mark_connector *conn;
 	struct fsnotify_mark_connector **connp;
-	spinlock_t *lock;
 	int cmp;
 	int err = 0;
 
 	if (WARN_ON(!inode && !mnt))
 		return -EINVAL;
-	if (inode) {
+	if (inode)
 		connp = &inode->i_fsnotify_marks;
-		lock = &inode->i_lock;
-	} else {
+	else
 		connp = &real_mount(mnt)->mnt_fsnotify_marks;
-		lock = &mnt->mnt_root->d_lock;
-	}
 
 	if (!*connp) {
-		err = fsnotify_attach_connector_to_object(connp, lock,
-							  inode, mnt);
+		err = fsnotify_attach_connector_to_object(connp, inode, mnt);
 		if (err)
 			return err;
 	}
 	spin_lock(&mark->lock);
-	spin_lock(lock);
 	conn = *connp;
+	spin_lock(&conn->lock);
 
 	/* is mark the first mark? */
 	if (hlist_empty(&conn->list)) {
 		hlist_add_head_rcu(&mark->obj_list, &conn->list);
 		if (inode)
-			__iget(inode);
+			igrab(inode);
 		goto added;
 	}
 
@@ -425,7 +410,7 @@ static int fsnotify_add_mark_list(struct fsnotify_mark *mark,
 added:
 	mark->connector = conn;
 out_err:
-	spin_unlock(lock);
+	spin_unlock(&conn->lock);
 	spin_unlock(&mark->lock);
 	return err;
 }
@@ -449,7 +434,7 @@ int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
 	 * LOCKING ORDER!!!!
 	 * group->mark_mutex
 	 * mark->lock
-	 * inode->i_lock
+	 * mark->connector->lock
 	 */
 	spin_lock(&mark->lock);
 	mark->flags |= FSNOTIFY_MARK_FLAG_ALIVE | FSNOTIFY_MARK_FLAG_ATTACHED;
@@ -505,24 +490,19 @@ struct fsnotify_mark *fsnotify_find_mark(struct fsnotify_mark_connector *conn,
 					 struct fsnotify_group *group)
 {
 	struct fsnotify_mark *mark;
-	spinlock_t *lock;
 
 	if (!conn)
 		return NULL;
 
-	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
-		lock = &conn->inode->i_lock;
-	else
-		lock = &conn->mnt->mnt_root->d_lock;
-	spin_lock(lock);
+	spin_lock(&conn->lock);
 	hlist_for_each_entry(mark, &conn->list, obj_list) {
 		if (mark->group == group) {
 			fsnotify_get_mark(mark);
-			spin_unlock(lock);
+			spin_unlock(&conn->lock);
 			return mark;
 		}
 	}
-	spin_unlock(lock);
+	spin_unlock(&conn->lock);
 	return NULL;
 }
 
@@ -595,16 +575,10 @@ void fsnotify_detach_group_marks(struct fsnotify_group *group)
 void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn)
 {
 	struct fsnotify_mark *mark;
-	spinlock_t *lock;
 
 	if (!conn)
 		return;
 
-	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
-		lock = &conn->inode->i_lock;
-	else
-		lock = &conn->mnt->mnt_root->d_lock;
-
 	while (1) {
 		/*
 		 * We have to be careful since we can race with e.g.
@@ -613,15 +587,15 @@ void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn)
 		 * we are holding mark reference so mark cannot be freed and
 		 * calling fsnotify_destroy_mark() more than once is fine.
 		 */
-		spin_lock(lock);
+		spin_lock(&conn->lock);
 		if (hlist_empty(&conn->list)) {
-			spin_unlock(lock);
+			spin_unlock(&conn->lock);
 			break;
 		}
 		mark = hlist_entry(conn->list.first, struct fsnotify_mark,
 				   obj_list);
 		fsnotify_get_mark(mark);
-		spin_unlock(lock);
+		spin_unlock(&conn->lock);
 		fsnotify_destroy_mark(mark, mark->group);
 		fsnotify_put_mark(mark);
 	}
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index b954f1b2571c..02c6fac652a4 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -201,6 +201,7 @@ struct fsnotify_group {
  * inode / vfsmount gets freed.
  */
 struct fsnotify_mark_connector {
+	spinlock_t lock;
 #define FSNOTIFY_OBJ_TYPE_INODE		0x01
 #define FSNOTIFY_OBJ_TYPE_VFSMOUNT	0x02
 	unsigned int flags;	/* Type of object [lock] */
@@ -240,7 +241,7 @@ struct fsnotify_mark {
 	struct list_head g_list;
 	/* Protects inode / mnt pointers, flags, masks */
 	spinlock_t lock;
-	/* List of marks for inode / vfsmount [obj_lock] */
+	/* List of marks for inode / vfsmount [connector->lock] */
 	struct hlist_node obj_list;
 	/* Head of list of marks for an object [mark->lock, group->mark_mutex] */
 	struct fsnotify_mark_connector *connector;
-- 
2.10.2


* [PATCH 19/35] fsnotify: Free fsnotify_mark_connector when there is no mark attached
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (17 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 18/35] fsnotify: Lock object list with connector lock Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 20/35] inotify: Do not drop mark reference under idr_lock Jan Kara
                   ` (15 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Currently we free the fsnotify_mark_connector structure only when the
inode / vfsmount is getting freed. This can however impose noticeable
memory overhead when marks get attached to inodes only temporarily. So
free the connector structure once the last mark is detached from the
object. Since the notification infrastructure can be working with the
connector under the protection of fsnotify_mark_srcu, we have to be
careful and free the fsnotify_mark_connector only after an SRCU grace
period passes.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
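The deferred freeing can be pictured with the userspace sketch below. It
assumes a pthread mutex in place of destroy_lock and an empty
wait_for_readers() stub where the kernel code calls synchronize_srcu();
queue_connector_free() and reap_connectors() are illustrative names, not
kernel functions:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct fsnotify_mark_connector. */
struct connector {
	struct connector *destroy_next;	/* link on the pending-free list */
};

static pthread_mutex_t destroy_lock = PTHREAD_MUTEX_INITIALIZER;
static struct connector *destroy_list;	/* connectors waiting to be freed */

/*
 * Placeholder for synchronize_srcu(): a real implementation would block
 * until every reader that might still see a queued connector is done.
 */
static void wait_for_readers(void)
{
}

/* Queue a connector that new readers can no longer reach. */
static void queue_connector_free(struct connector *conn)
{
	assert(conn);
	pthread_mutex_lock(&destroy_lock);
	conn->destroy_next = destroy_list;
	destroy_list = conn;
	pthread_mutex_unlock(&destroy_lock);
}

/*
 * Worker body: take the whole batch, wait once for readers, then free.
 * Returns the number of connectors freed.
 */
static int reap_connectors(void)
{
	struct connector *conn, *victim;
	int freed = 0;

	pthread_mutex_lock(&destroy_lock);
	conn = destroy_list;
	destroy_list = NULL;
	pthread_mutex_unlock(&destroy_lock);

	wait_for_readers();
	while (conn) {
		victim = conn;
		conn = conn->destroy_next;
		free(victim);
		freed++;
	}
	return freed;
}
```

Detaching the whole list before waiting means one grace period covers
every connector queued so far, and connectors queued during the wait are
simply picked up by the next run of the worker.
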
 fs/inode.c                       |   3 -
 fs/mount.h                       |   2 +-
 fs/namespace.c                   |   3 -
 fs/notify/fsnotify.c             |   9 ++-
 fs/notify/fsnotify.h             |  10 +--
 fs/notify/inode_mark.c           |   2 +-
 fs/notify/mark.c                 | 152 ++++++++++++++++++++++++++++-----------
 fs/notify/vfsmount_mark.c        |   2 +-
 include/linux/fs.h               |   2 +-
 include/linux/fsnotify_backend.h |  11 +--
 kernel/auditsc.c                 |   6 +-
 11 files changed, 136 insertions(+), 66 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 750e952d2918..131b2bcebc48 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -234,9 +234,6 @@ void __destroy_inode(struct inode *inode)
 	inode_detach_wb(inode);
 	security_inode_free(inode);
 	fsnotify_inode_delete(inode);
-#ifdef CONFIG_FSNOTIFY
-	fsnotify_connector_free(&inode->i_fsnotify_marks);
-#endif
 	locks_free_lock_context(inode);
 	if (!inode->i_nlink) {
 		WARN_ON(atomic_long_read(&inode->i_sb->s_remove_count) == 0);
diff --git a/fs/mount.h b/fs/mount.h
index bc409360a03b..bf1fda6eed8f 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -59,7 +59,7 @@ struct mount {
 	struct mountpoint *mnt_mp;	/* where is it mounted */
 	struct hlist_node mnt_mp_list;	/* list mounts with the same mountpoint */
 #ifdef CONFIG_FSNOTIFY
-	struct fsnotify_mark_connector *mnt_fsnotify_marks;
+	struct fsnotify_mark_connector __rcu *mnt_fsnotify_marks;
 	__u32 mnt_fsnotify_mask;
 #endif
 	int mnt_id;			/* mount identifier */
diff --git a/fs/namespace.c b/fs/namespace.c
index 2625e1d97a3a..b3b115bd4e1e 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1108,9 +1108,6 @@ static void cleanup_mnt(struct mount *mnt)
 	if (unlikely(mnt->mnt_pins.first))
 		mnt_pin_kill(mnt);
 	fsnotify_vfsmount_delete(&mnt->mnt);
-#ifdef CONFIG_FSNOTIFY
-	fsnotify_connector_free(&mnt->mnt_fsnotify_marks);
-#endif
 	dput(mnt->mnt.mnt_root);
 	deactivate_super(mnt->mnt.mnt_sb);
 	mnt_free_id(mnt);
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index eae621a18ac9..d512ef9f75fc 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -228,7 +228,8 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 
 	if ((mask & FS_MODIFY) ||
 	    (test_mask & to_tell->i_fsnotify_mask)) {
-		inode_conn = lockless_dereference(to_tell->i_fsnotify_marks);
+		inode_conn = srcu_dereference(to_tell->i_fsnotify_marks,
+					      &fsnotify_mark_srcu);
 		if (inode_conn)
 			inode_node = srcu_dereference(inode_conn->list.first,
 						      &fsnotify_mark_srcu);
@@ -236,11 +237,13 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 
 	if (mnt && ((mask & FS_MODIFY) ||
 		    (test_mask & mnt->mnt_fsnotify_mask))) {
-		inode_conn = lockless_dereference(to_tell->i_fsnotify_marks);
+		inode_conn = srcu_dereference(to_tell->i_fsnotify_marks,
+					      &fsnotify_mark_srcu);
 		if (inode_conn)
 			inode_node = srcu_dereference(inode_conn->list.first,
 						      &fsnotify_mark_srcu);
-		vfsmount_conn = lockless_dereference(mnt->mnt_fsnotify_marks);
+		vfsmount_conn = srcu_dereference(mnt->mnt_fsnotify_marks,
+					         &fsnotify_mark_srcu);
 		if (vfsmount_conn)
 			vfsmount_node = srcu_dereference(
 						vfsmount_conn->list.first,
diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index 510f027bdf0f..72050b75ca8c 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -20,19 +20,19 @@ extern int fsnotify_compare_groups(struct fsnotify_group *a,
 
 /* Find mark belonging to given group in the list of marks */
 extern struct fsnotify_mark *fsnotify_find_mark(
-					struct fsnotify_mark_connector *conn,
-					struct fsnotify_group *group);
+				struct fsnotify_mark_connector __rcu **connp,
+				struct fsnotify_group *group);
 /* Destroy all marks connected via given connector */
-extern void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn);
+extern void fsnotify_destroy_marks(struct fsnotify_mark_connector __rcu **connp);
 /* run the list of all marks associated with inode and destroy them */
 static inline void fsnotify_clear_marks_by_inode(struct inode *inode)
 {
-	fsnotify_destroy_marks(inode->i_fsnotify_marks);
+	fsnotify_destroy_marks(&inode->i_fsnotify_marks);
 }
 /* run the list of all marks associated with vfsmount and destroy them */
 static inline void fsnotify_clear_marks_by_mount(struct vfsmount *mnt)
 {
-	fsnotify_destroy_marks(real_mount(mnt)->mnt_fsnotify_marks);
+	fsnotify_destroy_marks(&real_mount(mnt)->mnt_fsnotify_marks);
 }
 /* prepare for freeing all marks associated with given group */
 extern void fsnotify_detach_group_marks(struct fsnotify_group *group);
diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
index 080b6d8b9973..b9370316727e 100644
--- a/fs/notify/inode_mark.c
+++ b/fs/notify/inode_mark.c
@@ -50,7 +50,7 @@ void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group)
 struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group,
 					       struct inode *inode)
 {
-	return fsnotify_find_mark(inode->i_fsnotify_marks, group);
+	return fsnotify_find_mark(&inode->i_fsnotify_marks, group);
 }
 
 /**
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index bfb415d0d757..824095db5a3b 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -89,10 +89,14 @@ struct kmem_cache *fsnotify_mark_connector_cachep;
 
 static DEFINE_SPINLOCK(destroy_lock);
 static LIST_HEAD(destroy_list);
+static struct fsnotify_mark_connector *connector_destroy_list;
 
 static void fsnotify_mark_destroy_workfn(struct work_struct *work);
 static DECLARE_DELAYED_WORK(reaper_work, fsnotify_mark_destroy_workfn);
 
+static void fsnotify_connector_destroy_workfn(struct work_struct *work);
+static DECLARE_WORK(connector_reaper_work, fsnotify_connector_destroy_workfn);
+
 void fsnotify_get_mark(struct fsnotify_mark *mark)
 {
 	atomic_inc(&mark->refcnt);
@@ -139,22 +143,73 @@ void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 		__fsnotify_update_child_dentry_flags(conn->inode);
 }
 
+/* Free all connectors queued for freeing once SRCU period ends */
+static void fsnotify_connector_destroy_workfn(struct work_struct *work)
+{
+	struct fsnotify_mark_connector *conn, *free;
+
+	spin_lock(&destroy_lock);
+	conn = connector_destroy_list;
+	connector_destroy_list = NULL;
+	spin_unlock(&destroy_lock);
+
+	synchronize_srcu(&fsnotify_mark_srcu);
+	while (conn) {
+		free = conn;
+		conn = conn->destroy_next;
+		kmem_cache_free(fsnotify_mark_connector_cachep, free);
+	}
+}
+
+
+static struct inode *fsnotify_detach_connector_from_object(
+					struct fsnotify_mark_connector *conn)
+{
+	struct inode *inode = NULL;
+
+	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE) {
+		inode = conn->inode;
+		rcu_assign_pointer(inode->i_fsnotify_marks, NULL);
+		inode->i_fsnotify_mask = 0;
+		conn->inode = NULL;
+		conn->flags &= ~FSNOTIFY_OBJ_TYPE_INODE;
+	} else if (conn->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT) {
+		rcu_assign_pointer(real_mount(conn->mnt)->mnt_fsnotify_marks,
+				   NULL);
+		real_mount(conn->mnt)->mnt_fsnotify_mask = 0;
+		conn->mnt = NULL;
+		conn->flags &= ~FSNOTIFY_OBJ_TYPE_VFSMOUNT;
+	}
+
+	return inode;
+}
+
 static struct inode *fsnotify_detach_from_object(struct fsnotify_mark *mark)
 {
 	struct fsnotify_mark_connector *conn;
 	struct inode *inode = NULL;
+	bool free_conn = false;
 
 	conn = mark->connector;
 	spin_lock(&conn->lock);
 	hlist_del_init_rcu(&mark->obj_list);
 	if (hlist_empty(&conn->list)) {
-		if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
-			inode = conn->inode;
+		inode = fsnotify_detach_connector_from_object(conn);
+		free_conn = true;
+	} else {
+		__fsnotify_recalc_mask(conn);
 	}
-	__fsnotify_recalc_mask(conn);
 	mark->connector = NULL;
 	spin_unlock(&conn->lock);
 
+	if (free_conn) {
+		spin_lock(&destroy_lock);
+		conn->destroy_next = connector_destroy_list;
+		connector_destroy_list = conn;
+		spin_unlock(&destroy_lock);
+		queue_work(system_unbound_wq, &connector_reaper_work);
+	}
+
 	return inode;
 }
 
@@ -259,14 +314,6 @@ void fsnotify_destroy_mark(struct fsnotify_mark *mark,
 	fsnotify_free_mark(mark);
 }
 
-void fsnotify_connector_free(struct fsnotify_mark_connector **connp)
-{
-	if (*connp) {
-		kmem_cache_free(fsnotify_mark_connector_cachep, *connp);
-		*connp = NULL;
-	}
-}
-
 void fsnotify_set_mark_mask_locked(struct fsnotify_mark *mark, __u32 mask)
 {
 	assert_spin_locked(&mark->lock);
@@ -318,9 +365,9 @@ int fsnotify_compare_groups(struct fsnotify_group *a, struct fsnotify_group *b)
 }
 
 static int fsnotify_attach_connector_to_object(
-					struct fsnotify_mark_connector **connp,
-					struct inode *inode,
-					struct vfsmount *mnt)
+				struct fsnotify_mark_connector __rcu **connp,
+				struct inode *inode,
+				struct vfsmount *mnt)
 {
 	struct fsnotify_mark_connector *conn;
 
@@ -331,7 +378,7 @@ static int fsnotify_attach_connector_to_object(
 	INIT_HLIST_HEAD(&conn->list);
 	if (inode) {
 		conn->flags = FSNOTIFY_OBJ_TYPE_INODE;
-		conn->inode = inode;
+		conn->inode = igrab(inode);
 	} else {
 		conn->flags = FSNOTIFY_OBJ_TYPE_VFSMOUNT;
 		conn->mnt = mnt;
@@ -342,6 +389,8 @@ static int fsnotify_attach_connector_to_object(
 	 */
 	if (cmpxchg(connp, NULL, conn)) {
 		/* Someone else created list structure for us */
+		if (inode)
+			iput(inode);
 		kmem_cache_free(fsnotify_mark_connector_cachep, conn);
 	}
 
@@ -349,6 +398,34 @@ static int fsnotify_attach_connector_to_object(
 }
 
 /*
+ * Get mark connector, make sure it is alive and return with its lock held.
+ * This is for users that get connector pointer from inode or mount. Users that
+ * hold reference to a mark on the list may directly lock connector->lock as
+ * they are sure list cannot go away under them.
+ */
+static struct fsnotify_mark_connector *fsnotify_grab_connector(
+				struct fsnotify_mark_connector __rcu **connp)
+{
+	struct fsnotify_mark_connector *conn;
+	int idx;
+
+	idx = srcu_read_lock(&fsnotify_mark_srcu);
+	conn = srcu_dereference(*connp, &fsnotify_mark_srcu);
+	if (!conn)
+		goto out;
+	spin_lock(&conn->lock);
+	if (!(conn->flags & (FSNOTIFY_OBJ_TYPE_INODE |
+			     FSNOTIFY_OBJ_TYPE_VFSMOUNT))) {
+		spin_unlock(&conn->lock);
+		srcu_read_unlock(&fsnotify_mark_srcu, idx);
+		return NULL;
+	}
+out:
+	srcu_read_unlock(&fsnotify_mark_srcu, idx);
+	return conn;
+}
+
+/*
  * Add mark into proper place in given list of marks. These marks may be used
  * for the fsnotify backend to determine which event types should be delivered
  * to which group and for which inodes. These marks are ordered according to
@@ -360,7 +437,7 @@ static int fsnotify_add_mark_list(struct fsnotify_mark *mark,
 {
 	struct fsnotify_mark *lmark, *last = NULL;
 	struct fsnotify_mark_connector *conn;
-	struct fsnotify_mark_connector **connp;
+	struct fsnotify_mark_connector __rcu **connp;
 	int cmp;
 	int err = 0;
 
@@ -370,21 +447,20 @@ static int fsnotify_add_mark_list(struct fsnotify_mark *mark,
 		connp = &inode->i_fsnotify_marks;
 	else
 		connp = &real_mount(mnt)->mnt_fsnotify_marks;
-
-	if (!*connp) {
+restart:
+	spin_lock(&mark->lock);
+	conn = fsnotify_grab_connector(connp);
+	if (!conn) {
+		spin_unlock(&mark->lock);
 		err = fsnotify_attach_connector_to_object(connp, inode, mnt);
 		if (err)
 			return err;
+		goto restart;
 	}
-	spin_lock(&mark->lock);
-	conn = *connp;
-	spin_lock(&conn->lock);
 
 	/* is mark the first mark? */
 	if (hlist_empty(&conn->list)) {
 		hlist_add_head_rcu(&mark->obj_list, &conn->list);
-		if (inode)
-			igrab(inode);
 		goto added;
 	}
 
@@ -486,15 +562,17 @@ int fsnotify_add_mark(struct fsnotify_mark *mark, struct fsnotify_group *group,
  * Given a list of marks, find the mark associated with given group. If found
  * take a reference to that mark and return it, else return NULL.
  */
-struct fsnotify_mark *fsnotify_find_mark(struct fsnotify_mark_connector *conn,
-					 struct fsnotify_group *group)
+struct fsnotify_mark *fsnotify_find_mark(
+				struct fsnotify_mark_connector __rcu **connp,
+				struct fsnotify_group *group)
 {
+	struct fsnotify_mark_connector *conn;
 	struct fsnotify_mark *mark;
 
+	conn = fsnotify_grab_connector(connp);
 	if (!conn)
 		return NULL;
 
-	spin_lock(&conn->lock);
 	hlist_for_each_entry(mark, &conn->list, obj_list) {
 		if (mark->group == group) {
 			fsnotify_get_mark(mark);
@@ -572,26 +650,20 @@ void fsnotify_detach_group_marks(struct fsnotify_group *group)
 	}
 }
 
-void fsnotify_destroy_marks(struct fsnotify_mark_connector *conn)
+/* Destroy all marks attached to inode / vfsmount */
+void fsnotify_destroy_marks(struct fsnotify_mark_connector __rcu **connp)
 {
+	struct fsnotify_mark_connector *conn;
 	struct fsnotify_mark *mark;
 
-	if (!conn)
-		return;
-
-	while (1) {
+	while ((conn = fsnotify_grab_connector(connp))) {
 		/*
 		 * We have to be careful since we can race with e.g.
-		 * fsnotify_clear_marks_by_group() and once we drop 'lock',
-		 * mark can get removed from the obj_list and destroyed. But
-		 * we are holding mark reference so mark cannot be freed and
-		 * calling fsnotify_destroy_mark() more than once is fine.
+		 * fsnotify_clear_marks_by_group() and once we drop the list
+		 * lock, mark can get removed from the obj_list and destroyed.
+		 * But we are holding mark reference so mark cannot be freed
+		 * and calling fsnotify_destroy_mark() more than once is fine.
 		 */
-		spin_lock(&conn->lock);
-		if (hlist_empty(&conn->list)) {
-			spin_unlock(&conn->lock);
-			break;
-		}
 		mark = hlist_entry(conn->list.first, struct fsnotify_mark,
 				   obj_list);
 		fsnotify_get_mark(mark);
diff --git a/fs/notify/vfsmount_mark.c b/fs/notify/vfsmount_mark.c
index 26da5c209944..dd5f3fcbccfb 100644
--- a/fs/notify/vfsmount_mark.c
+++ b/fs/notify/vfsmount_mark.c
@@ -48,5 +48,5 @@ struct fsnotify_mark *fsnotify_find_vfsmount_mark(struct fsnotify_group *group,
 {
 	struct mount *m = real_mount(mnt);
 
-	return fsnotify_find_mark(m->mnt_fsnotify_marks, group);
+	return fsnotify_find_mark(&m->mnt_fsnotify_marks, group);
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 66e52342be2d..c0b6150c5fcc 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -647,7 +647,7 @@ struct inode {
 
 #ifdef CONFIG_FSNOTIFY
 	__u32			i_fsnotify_mask; /* all events this inode cares about */
-	struct fsnotify_mark_connector	*i_fsnotify_marks;
+	struct fsnotify_mark_connector __rcu	*i_fsnotify_marks;
 #endif
 
 #if IS_ENABLED(CONFIG_FS_ENCRYPTION)
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 02c6fac652a4..84d71b6f75f6 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -197,8 +197,8 @@ struct fsnotify_group {
 /*
  * Inode / vfsmount point to this structure which tracks all marks attached to
  * the inode / vfsmount. The reference to inode / vfsmount is held by this
- * structure whenever the list is non-empty. The structure is freed only when
- * inode / vfsmount gets freed.
+ * structure. We destroy this structure when there are no more marks attached
+ * to it. The structure is protected by fsnotify_mark_srcu.
  */
 struct fsnotify_mark_connector {
 	spinlock_t lock;
@@ -209,7 +209,11 @@ struct fsnotify_mark_connector {
 		struct inode *inode;
 		struct vfsmount *mnt;
 	};
-	struct hlist_head list;
+	union {
+		struct hlist_head list;
+		/* Used listing heads to free after srcu period expires */
+		struct fsnotify_mark_connector *destroy_next;
+	};
 };
 
 /*
@@ -361,7 +365,6 @@ extern void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group)
 extern void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group);
 /* run all the marks in a group, and clear all of the marks attached to given object type */
 extern void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group, unsigned int flags);
-extern void fsnotify_connector_free(struct fsnotify_mark_connector **connp);
 extern void fsnotify_get_mark(struct fsnotify_mark *mark);
 extern void fsnotify_put_mark(struct fsnotify_mark *mark);
 extern void fsnotify_unmount_inodes(struct super_block *sb);
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index bf7b7ca295d0..d383c33540af 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1597,8 +1597,7 @@ static inline void handle_one(const struct inode *inode)
 	struct audit_tree_refs *p;
 	struct audit_chunk *chunk;
 	int count;
-	if (likely(!inode->i_fsnotify_marks ||
-		   hlist_empty(&inode->i_fsnotify_marks->list)))
+	if (likely(!inode->i_fsnotify_marks))
 		return;
 	context = current->audit_context;
 	p = context->trees;
@@ -1641,8 +1640,7 @@ static void handle_path(const struct dentry *dentry)
 	seq = read_seqbegin(&rename_lock);
 	for(;;) {
 		struct inode *inode = d_backing_inode(d);
-		if (inode && unlikely(inode->i_fsnotify_marks &&
-		    !hlist_empty(&inode->i_fsnotify_marks->list))) {
+		if (inode && unlikely(inode->i_fsnotify_marks)) {
 			struct audit_chunk *chunk;
 			chunk = audit_tree_lookup(inode);
 			if (chunk) {
-- 
2.10.2

* [PATCH 20/35] inotify: Do not drop mark reference under idr_lock
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (18 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 19/35] fsnotify: Free fsnotify_mark_connector when there is no mark attached Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 21/35] fsnotify: Move queueing of mark for destruction into fsnotify_put_mark() Jan Kara
                   ` (14 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Dropping a mark reference can result in the mark being freed. Although this
should not happen in inotify_remove_from_idr(), since the caller should hold
another reference, do not needlessly risk a lockup right after the WARN_ON.
Also fold do_inotify_remove_from_idr() into its single call site, as that
function really is just two lines of real code.
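
As an illustration only, the ordering this patch aims for can be sketched in
userspace C. All names below (struct obj, table_lock, table_remove) are
hypothetical stand-ins and not kernel API; a pthread mutex plays the role of
idr_lock. The reference owned by the table is dropped under the lock, where it
cannot be the last one, while the lookup reference, which may be the last, is
dropped only after the unlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted mark; not the kernel structure. */
struct obj {
	atomic_int refcnt;
	bool freed;		/* set by the release path instead of calling free() */
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static bool table_lock_held;	/* tracked only so the sketch can check itself */

/* Drop a reference; the release path must never run under table_lock. */
static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcnt, 1) == 1) {
		assert(!table_lock_held);
		o->freed = true;
	}
}

/* Remove o from an imaginary table that owns one reference to it. */
static void table_remove(struct obj *o)
{
	pthread_mutex_lock(&table_lock);
	table_lock_held = true;
	atomic_fetch_add(&o->refcnt, 1);	/* reference taken by the lookup */
	/* ... unlink o from the table here ... */
	obj_put(o);	/* table's reference; cannot be the last one here */
	table_lock_held = false;
	pthread_mutex_unlock(&table_lock);

	obj_put(o);	/* lookup reference, possibly the last: dropped unlocked */
}
```

If the final obj_put() were moved back above the unlock, the assertion in the
release path would fire whenever the table held the only other reference.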

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/inotify/inotify_user.c | 24 ++++++------------------
 1 file changed, 6 insertions(+), 18 deletions(-)

diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index b82a507a5367..f9113e57ef33 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -395,21 +395,6 @@ static struct inotify_inode_mark *inotify_idr_find(struct fsnotify_group *group,
 	return i_mark;
 }
 
-static void do_inotify_remove_from_idr(struct fsnotify_group *group,
-				       struct inotify_inode_mark *i_mark)
-{
-	struct idr *idr = &group->inotify_data.idr;
-	spinlock_t *idr_lock = &group->inotify_data.idr_lock;
-	int wd = i_mark->wd;
-
-	assert_spin_locked(idr_lock);
-
-	idr_remove(idr, wd);
-
-	/* removed from the idr, drop that ref */
-	fsnotify_put_mark(&i_mark->fsn_mark);
-}
-
 /*
  * Remove the mark from the idr (if present) and drop the reference
  * on the mark because it was in the idr.
@@ -417,6 +402,7 @@ static void do_inotify_remove_from_idr(struct fsnotify_group *group,
 static void inotify_remove_from_idr(struct fsnotify_group *group,
 				    struct inotify_inode_mark *i_mark)
 {
+	struct idr *idr = &group->inotify_data.idr;
 	spinlock_t *idr_lock = &group->inotify_data.idr_lock;
 	struct inotify_inode_mark *found_i_mark = NULL;
 	int wd;
@@ -468,13 +454,15 @@ static void inotify_remove_from_idr(struct fsnotify_group *group,
 		BUG();
 	}
 
-	do_inotify_remove_from_idr(group, i_mark);
+	idr_remove(idr, wd);
+	/* Removed from the idr, drop that ref. */
+	fsnotify_put_mark(&i_mark->fsn_mark);
 out:
+	i_mark->wd = -1;
+	spin_unlock(idr_lock);
 	/* match the ref taken by inotify_idr_find_locked() */
 	if (found_i_mark)
 		fsnotify_put_mark(&found_i_mark->fsn_mark);
-	i_mark->wd = -1;
-	spin_unlock(idr_lock);
 }
 
 /*
-- 
2.10.2

* [PATCH 21/35] fsnotify: Move queueing of mark for destruction into fsnotify_put_mark()
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (19 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 20/35] inotify: Do not drop mark reference under idr_lock Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 22/35] fsnotify: Detach mark from object list when last reference is dropped Jan Kara
                   ` (13 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Currently we queue a mark into the list of marks for destruction in
__fsnotify_free_mark() and keep the last mark reference dangling. After the
worker waits for an SRCU period, it drops the last reference to the mark,
which frees it. This scheme has the disadvantage that if we hold a
reference to a mark and drop and reacquire the SRCU lock, the mark can get
freed immediately, which is inconvenient and which we will need to
avoid in the future.

Move to a scheme where the mark is queued into the list of marks for
destruction when the last reference to the mark is dropped. Also drop the
reference to the mark held by the group list as soon as the mark is removed
from that list, instead of dropping it only from the destruction
worker.
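
A minimal single-threaded sketch of the new scheme, under the assumption that
a plain singly linked list stands in for destroy_list and that reap() stands
in for the delayed work that runs after the SRCU period. The names struct obj,
obj_get, obj_put and reap are hypothetical; the kernel additionally protects
the list with destroy_lock, which is omitted here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mark; not the kernel structure. */
struct obj {
	atomic_int refcnt;
	struct obj *destroy_next;
	bool freed;
};

/* Objects waiting for the reaper (no locking: the sketch is single-threaded). */
static struct obj *destroy_list;

static void obj_get(struct obj *o)
{
	/* Taking a reference on a dead object is a bug, as the new WARN_ON_ONCE notes. */
	assert(atomic_load(&o->refcnt) > 0);
	atomic_fetch_add(&o->refcnt, 1);
}

/* Dropping the last reference only queues the object; it does not free it. */
static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcnt, 1) == 1) {
		o->destroy_next = destroy_list;
		destroy_list = o;
	}
}

/* Reaper: marks every queued object as freed and returns how many it handled. */
static int reap(void)
{
	int n = 0;

	while (destroy_list) {
		struct obj *o = destroy_list;

		destroy_list = o->destroy_next;
		o->freed = true;
		n++;
	}
	return n;
}
```

With this shape, holding any reference is enough to keep the object off the
destruction list, regardless of whether a grace period has elapsed meanwhile.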

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/inotify/inotify_user.c |  3 +-
 fs/notify/mark.c                 | 73 ++++++++++++++++------------------------
 2 files changed, 30 insertions(+), 46 deletions(-)

diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index f9113e57ef33..43cbd1b178c9 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -444,10 +444,9 @@ static void inotify_remove_from_idr(struct fsnotify_group *group,
 
 	/*
 	 * One ref for being in the idr
-	 * one ref held by the caller trying to kill us
 	 * one ref grabbed by inotify_idr_find
 	 */
-	if (unlikely(atomic_read(&i_mark->fsn_mark.refcnt) < 3)) {
+	if (unlikely(atomic_read(&i_mark->fsn_mark.refcnt) < 2)) {
 		printk(KERN_ERR "%s: i_mark=%p i_mark->wd=%d i_mark->group=%p\n",
 			 __func__, i_mark, i_mark->wd, i_mark->fsn_mark.group);
 		/* we can't really recover with bad ref cnting.. */
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 824095db5a3b..df66d708a7ec 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -99,15 +99,18 @@ static DECLARE_WORK(connector_reaper_work, fsnotify_connector_destroy_workfn);
 
 void fsnotify_get_mark(struct fsnotify_mark *mark)
 {
+	WARN_ON_ONCE(!atomic_read(&mark->refcnt));
 	atomic_inc(&mark->refcnt);
 }
 
 void fsnotify_put_mark(struct fsnotify_mark *mark)
 {
 	if (atomic_dec_and_test(&mark->refcnt)) {
-		if (mark->group)
-			fsnotify_put_group(mark->group);
-		mark->free_mark(mark);
+		spin_lock(&destroy_lock);
+		list_add(&mark->g_list, &destroy_list);
+		spin_unlock(&destroy_lock);
+		queue_delayed_work(system_unbound_wq, &reaper_work,
+				   FSNOTIFY_REAPER_DELAY);
 	}
 }
 
@@ -217,14 +220,18 @@ static struct inode *fsnotify_detach_from_object(struct fsnotify_mark *mark)
  * Remove mark from inode / vfsmount list, group list, drop inode reference
  * if we got one.
  *
- * Must be called with group->mark_mutex held.
+ * Must be called with group->mark_mutex held. The caller must either hold
+ * reference to the mark or be protected by fsnotify_mark_srcu.
  */
 void fsnotify_detach_mark(struct fsnotify_mark *mark)
 {
 	struct inode *inode = NULL;
 	struct fsnotify_group *group = mark->group;
 
-	BUG_ON(!mutex_is_locked(&group->mark_mutex));
+	WARN_ON_ONCE(!mutex_is_locked(&group->mark_mutex));
+	WARN_ON_ONCE(!srcu_read_lock_held(&fsnotify_mark_srcu) &&
+		     atomic_read(&mark->refcnt) < 1 +
+			!!(mark->flags & FSNOTIFY_MARK_FLAG_ATTACHED));
 
 	spin_lock(&mark->lock);
 
@@ -253,18 +260,20 @@ void fsnotify_detach_mark(struct fsnotify_mark *mark)
 		iput(inode);
 
 	atomic_dec(&group->num_marks);
+
+	/* Drop mark reference acquired in fsnotify_add_mark_locked() */
+	fsnotify_put_mark(mark);
 }
 
 /*
- * Prepare mark for freeing and add it to the list of marks prepared for
- * freeing. The actual freeing must happen after SRCU period ends and the
- * caller is responsible for this.
+ * Free fsnotify mark. The mark is actually only marked as being freed.  The
+ * freeing is actually happening only once last reference to the mark is
+ * dropped from a workqueue which first waits for srcu period end.
  *
- * The function returns true if the mark was added to the list of marks for
- * freeing. The function returns false if someone else has already called
- * __fsnotify_free_mark() for the mark.
+ * Caller must have a reference to the mark or be protected by
+ * fsnotify_mark_srcu.
  */
-static bool __fsnotify_free_mark(struct fsnotify_mark *mark)
+void fsnotify_free_mark(struct fsnotify_mark *mark)
 {
 	struct fsnotify_group *group = mark->group;
 
@@ -272,7 +281,7 @@ static bool __fsnotify_free_mark(struct fsnotify_mark *mark)
 	/* something else already called this function on this mark */
 	if (!(mark->flags & FSNOTIFY_MARK_FLAG_ALIVE)) {
 		spin_unlock(&mark->lock);
-		return false;
+		return;
 	}
 	mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
 	spin_unlock(&mark->lock);
@@ -284,25 +293,6 @@ static bool __fsnotify_free_mark(struct fsnotify_mark *mark)
 	 */
 	if (group->ops->freeing_mark)
 		group->ops->freeing_mark(mark, group);
-
-	spin_lock(&destroy_lock);
-	list_add(&mark->g_list, &destroy_list);
-	spin_unlock(&destroy_lock);
-
-	return true;
-}
-
-/*
- * Free fsnotify mark. The freeing is actually happening from a workqueue which
- * first waits for srcu period end. Caller must have a reference to the mark
- * or be protected by fsnotify_mark_srcu.
- */
-void fsnotify_free_mark(struct fsnotify_mark *mark)
-{
-	if (__fsnotify_free_mark(mark)) {
-		queue_delayed_work(system_unbound_wq, &reaper_work,
-				   FSNOTIFY_REAPER_DELAY);
-	}
 }
 
 void fsnotify_destroy_mark(struct fsnotify_mark *mark,
@@ -531,20 +521,13 @@ int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
 
 	return ret;
 err:
-	mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
+	mark->flags &= ~(FSNOTIFY_MARK_FLAG_ALIVE |
+			 FSNOTIFY_MARK_FLAG_ATTACHED);
 	list_del_init(&mark->g_list);
-	fsnotify_put_group(group);
-	mark->group = NULL;
 	atomic_dec(&group->num_marks);
-
 	spin_unlock(&mark->lock);
 
-	spin_lock(&destroy_lock);
-	list_add(&mark->g_list, &destroy_list);
-	spin_unlock(&destroy_lock);
-	queue_delayed_work(system_unbound_wq, &reaper_work,
-				FSNOTIFY_REAPER_DELAY);
-
+	fsnotify_put_mark(mark);
 	return ret;
 }
 
@@ -645,7 +628,7 @@ void fsnotify_detach_group_marks(struct fsnotify_group *group)
 		fsnotify_get_mark(mark);
 		fsnotify_detach_mark(mark);
 		mutex_unlock(&group->mark_mutex);
-		__fsnotify_free_mark(mark);
+		fsnotify_free_mark(mark);
 		fsnotify_put_mark(mark);
 	}
 }
@@ -703,7 +686,9 @@ void fsnotify_mark_destroy_list(void)
 
 	list_for_each_entry_safe(mark, next, &private_destroy_list, g_list) {
 		list_del_init(&mark->g_list);
-		fsnotify_put_mark(mark);
+		if (mark->group)
+			fsnotify_put_group(mark->group);
+		mark->free_mark(mark);
 	}
 }
 
-- 
2.10.2

* [PATCH 22/35] fsnotify: Detach mark from object list when last reference is dropped
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (20 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 21/35] fsnotify: Move queueing of mark for destruction into fsnotify_put_mark() Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 23/35] fsnotify: Remove special handling of mark destruction on group shutdown Jan Kara
                   ` (12 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Instead of removing a mark from the object list in fsnotify_detach_mark(),
remove the mark when the last reference to it is dropped. This will
allow fanotify to wait for a userspace response to an event without having to
hold onto fsnotify_mark_srcu.

To avoid pinning inodes by an elevated refcount (and thus e.g. delaying
file deletion) while someone holds a mark reference, we now also detach the
connector from the object in fsnotify_destroy_marks(), and not only after
removing the last mark from the list as was the case until now.
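
The reworked fsnotify_put_mark() in this patch relies on atomic_dec_and_lock(),
which returns true with the lock held only when the decrement brings the count
to zero. The following is a userspace approximation of that behaviour, not the
kernel implementation: dec_and_lock is a hypothetical name, and C11 atomics
plus a pthread mutex stand in for atomic_t and the connector spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Decrement *cnt. Return true, with *lock held, only if the count dropped
 * to zero; otherwise return false with the lock not held.
 */
static bool dec_and_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
	int old = atomic_load(cnt);

	/* Fast path: not the last reference, so no lock is needed. */
	while (old > 1) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return false;
	}

	/* Slow path: this may be the last reference, so decide under the lock. */
	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(cnt, 1) == 1)
		return true;	/* count hit zero; the caller unlocks */
	pthread_mutex_unlock(lock);
	return false;
}
```

Because the transition to zero happens under the lock, a list walker that holds
the same lock and finds the object still linked can safely take a new
reference, which is the property the comment in fsnotify_put_mark() asks for.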

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/mark.c                 | 147 +++++++++++++++++++++++----------------
 include/linux/fsnotify_backend.h |   4 +-
 kernel/audit_tree.c              |  31 ++++-----
 3 files changed, 105 insertions(+), 77 deletions(-)

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index df66d708a7ec..21c7791362c8 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -49,7 +49,13 @@
  *
  * A list of notification marks relating to inode / mnt is contained in
  * fsnotify_mark_connector. That structure is alive as long as there are any
- * marks in the list and is also protected by fsnotify_mark_srcu.
+ * marks in the list and is also protected by fsnotify_mark_srcu. A mark gets
+ * detached from fsnotify_mark_connector when last reference to the mark is
+ * dropped.  Thus having mark reference is enough to protect mark->connector
+ * pointer and to make sure fsnotify_mark_connector cannot disappear. Also
+ * because we remove mark from g_list before dropping mark reference associated
+ * with that, any mark found through g_list is guaranteed to have
+ * mark->connector set until we drop group->mark_mutex.
  *
  * LIFETIME:
  * Inode marks survive between when they are added to an inode and when their
@@ -103,26 +109,16 @@ void fsnotify_get_mark(struct fsnotify_mark *mark)
 	atomic_inc(&mark->refcnt);
 }
 
-void fsnotify_put_mark(struct fsnotify_mark *mark)
-{
-	if (atomic_dec_and_test(&mark->refcnt)) {
-		spin_lock(&destroy_lock);
-		list_add(&mark->g_list, &destroy_list);
-		spin_unlock(&destroy_lock);
-		queue_delayed_work(system_unbound_wq, &reaper_work,
-				   FSNOTIFY_REAPER_DELAY);
-	}
-}
-
 static void __fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 {
 	u32 new_mask = 0;
 	struct fsnotify_mark *mark;
 
 	assert_spin_locked(&conn->lock);
-	hlist_for_each_entry(mark, &conn->list, obj_list)
-		new_mask |= mark->mask;
-
+	hlist_for_each_entry(mark, &conn->list, obj_list) {
+		if (mark->flags & FSNOTIFY_MARK_FLAG_ATTACHED)
+			new_mask |= mark->mask;
+	}
 	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
 		conn->inode->i_fsnotify_mask = new_mask;
 	else if (conn->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT)
@@ -131,8 +127,9 @@ static void __fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 
 /*
  * Calculate mask of events for a list of marks. The caller must make sure
- * connector cannot disappear under us (usually by holding a mark->lock or
- * mark->group->mark_mutex for a mark on this list).
+ * connector and connector->inode cannot disappear under us.  Callers achieve
+ * this by holding a mark->lock or mark->group->mark_mutex for a mark on this
+ * list.
  */
 void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 {
@@ -164,7 +161,6 @@ static void fsnotify_connector_destroy_workfn(struct work_struct *work)
 	}
 }
 
-
 static struct inode *fsnotify_detach_connector_from_object(
 					struct fsnotify_mark_connector *conn)
 {
@@ -187,14 +183,34 @@ static struct inode *fsnotify_detach_connector_from_object(
 	return inode;
 }
 
-static struct inode *fsnotify_detach_from_object(struct fsnotify_mark *mark)
+static void fsnotify_final_mark_destroy(struct fsnotify_mark *mark)
+{
+	if (mark->group)
+		fsnotify_put_group(mark->group);
+	mark->free_mark(mark);
+}
+
+void fsnotify_put_mark(struct fsnotify_mark *mark)
 {
 	struct fsnotify_mark_connector *conn;
 	struct inode *inode = NULL;
 	bool free_conn = false;
 
+	/* Catch marks that were actually never attached to object */
+	if (!mark->connector) {
+		if (atomic_dec_and_test(&mark->refcnt))
+			fsnotify_final_mark_destroy(mark);
+		return;
+	}
+
+	/*
+	 * We have to be careful so that traversals of obj_list under lock can
+	 * safely grab mark reference.
+	 */
+	if (!atomic_dec_and_lock(&mark->refcnt, &mark->connector->lock))
+		return;
+
 	conn = mark->connector;
-	spin_lock(&conn->lock);
 	hlist_del_init_rcu(&mark->obj_list);
 	if (hlist_empty(&conn->list)) {
 		inode = fsnotify_detach_connector_from_object(conn);
@@ -205,6 +221,8 @@ static struct inode *fsnotify_detach_from_object(struct fsnotify_mark *mark)
 	mark->connector = NULL;
 	spin_unlock(&conn->lock);
 
+	iput(inode);
+
 	if (free_conn) {
 		spin_lock(&destroy_lock);
 		conn->destroy_next = connector_destroy_list;
@@ -212,20 +230,31 @@ static struct inode *fsnotify_detach_from_object(struct fsnotify_mark *mark)
 		spin_unlock(&destroy_lock);
 		queue_work(system_unbound_wq, &connector_reaper_work);
 	}
-
-	return inode;
+	/*
+	 * Note that we didn't update flags telling whether inode cares about
+	 * what's happening with children. We update these flags from
+	 * __fsnotify_parent() lazily when next event happens on one of our
+	 * children.
+	 */
+	spin_lock(&destroy_lock);
+	list_add(&mark->g_list, &destroy_list);
+	spin_unlock(&destroy_lock);
+	queue_delayed_work(system_unbound_wq, &reaper_work,
+			   FSNOTIFY_REAPER_DELAY);
 }
 
 /*
- * Remove mark from inode / vfsmount list, group list, drop inode reference
- * if we got one.
+ * Mark mark as detached, remove it from group list. Mark still stays in object
+ * list until its last reference is dropped. Note that we rely on mark being
+ * removed from group list before corresponding reference to it is dropped. In
+ * particular we rely on mark->connector being valid while we hold
+ * group->mark_mutex if we found the mark through g_list.
  *
  * Must be called with group->mark_mutex held. The caller must either hold
  * reference to the mark or be protected by fsnotify_mark_srcu.
  */
 void fsnotify_detach_mark(struct fsnotify_mark *mark)
 {
-	struct inode *inode = NULL;
 	struct fsnotify_group *group = mark->group;
 
 	WARN_ON_ONCE(!mutex_is_locked(&group->mark_mutex));
@@ -234,31 +263,15 @@ void fsnotify_detach_mark(struct fsnotify_mark *mark)
 			!!(mark->flags & FSNOTIFY_MARK_FLAG_ATTACHED));
 
 	spin_lock(&mark->lock);
-
 	/* something else already called this function on this mark */
 	if (!(mark->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
 		spin_unlock(&mark->lock);
 		return;
 	}
-
 	mark->flags &= ~FSNOTIFY_MARK_FLAG_ATTACHED;
-
-	inode = fsnotify_detach_from_object(mark);
-
-	/*
-	 * Note that we didn't update flags telling whether inode cares about
-	 * what's happening with children. We update these flags from
-	 * __fsnotify_parent() lazily when next event happens on one of our
-	 * children.
-	 */
-
 	list_del_init(&mark->g_list);
-
 	spin_unlock(&mark->lock);
 
-	if (inode)
-		iput(inode);
-
 	atomic_dec(&group->num_marks);
 
 	/* Drop mark reference acquired in fsnotify_add_mark_locked() */
@@ -458,7 +471,9 @@ static int fsnotify_add_mark_list(struct fsnotify_mark *mark,
 	hlist_for_each_entry(lmark, &conn->list, obj_list) {
 		last = lmark;
 
-		if ((lmark->group == mark->group) && !allow_dups) {
+		if ((lmark->group == mark->group) &&
+		    (lmark->flags & FSNOTIFY_MARK_FLAG_ATTACHED) &&
+		    !allow_dups) {
 			err = -EEXIST;
 			goto out_err;
 		}
@@ -509,7 +524,7 @@ int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
 	mark->group = group;
 	list_add(&mark->g_list, &group->marks_list);
 	atomic_inc(&group->num_marks);
-	fsnotify_get_mark(mark); /* for i_list and g_list */
+	fsnotify_get_mark(mark); /* for g_list */
 	spin_unlock(&mark->lock);
 
 	ret = fsnotify_add_mark_list(mark, inode, mnt, allow_dups);
@@ -557,7 +572,8 @@ struct fsnotify_mark *fsnotify_find_mark(
 		return NULL;
 
 	hlist_for_each_entry(mark, &conn->list, obj_list) {
-		if (mark->group == group) {
+		if (mark->group == group &&
+		    (mark->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
 			fsnotify_get_mark(mark);
 			spin_unlock(&conn->lock);
 			return mark;
@@ -637,23 +653,38 @@ void fsnotify_detach_group_marks(struct fsnotify_group *group)
 void fsnotify_destroy_marks(struct fsnotify_mark_connector __rcu **connp)
 {
 	struct fsnotify_mark_connector *conn;
-	struct fsnotify_mark *mark;
+	struct fsnotify_mark *mark, *old_mark = NULL;
+	struct inode *inode;
 
-	while ((conn = fsnotify_grab_connector(connp))) {
-		/*
-		 * We have to be careful since we can race with e.g.
-		 * fsnotify_clear_marks_by_group() and once we drop the list
-		 * lock, mark can get removed from the obj_list and destroyed.
-		 * But we are holding mark reference so mark cannot be freed
-		 * and calling fsnotify_destroy_mark() more than once is fine.
-		 */
-		mark = hlist_entry(conn->list.first, struct fsnotify_mark,
-				   obj_list);
+	conn = fsnotify_grab_connector(connp);
+	if (!conn)
+		return;
+	/*
+	 * We have to be careful since we can race with e.g.
+	 * fsnotify_clear_marks_by_group() and once we drop the conn->lock, the
+	 * list can get modified. However we are holding mark reference and
+	 * thus our mark cannot be removed from obj_list so we can continue
+	 * iteration after regaining conn->lock.
+	 */
+	hlist_for_each_entry(mark, &conn->list, obj_list) {
 		fsnotify_get_mark(mark);
 		spin_unlock(&conn->lock);
+		if (old_mark)
+			fsnotify_put_mark(old_mark);
+		old_mark = mark;
 		fsnotify_destroy_mark(mark, mark->group);
-		fsnotify_put_mark(mark);
+		spin_lock(&conn->lock);
 	}
+	/*
+	 * Detach list from object now so that we don't pin inode until all
+	 * mark references get dropped. It would lead to strange results such
+	 * as delaying inode deletion or blocking unmount.
+	 */
+	inode = fsnotify_detach_connector_from_object(conn);
+	spin_unlock(&conn->lock);
+	if (old_mark)
+		fsnotify_put_mark(old_mark);
+	iput(inode);
 }
 
 /*
@@ -686,9 +717,7 @@ void fsnotify_mark_destroy_list(void)
 
 	list_for_each_entry_safe(mark, next, &private_destroy_list, g_list) {
 		list_del_init(&mark->g_list);
-		if (mark->group)
-			fsnotify_put_group(mark->group);
-		mark->free_mark(mark);
+		fsnotify_final_mark_destroy(mark);
 	}
 }
 
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 84d71b6f75f6..a483614b25d0 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -245,9 +245,9 @@ struct fsnotify_mark {
 	struct list_head g_list;
 	/* Protects inode / mnt pointers, flags, masks */
 	spinlock_t lock;
-	/* List of marks for inode / vfsmount [connector->lock] */
+	/* List of marks for inode / vfsmount [connector->lock, mark ref] */
 	struct hlist_node obj_list;
-	/* Head of list of marks for an object [mark->lock, group->mark_mutex] */
+	/* Head of list of marks for an object [mark ref] */
 	struct fsnotify_mark_connector *connector;
 	/* Events types to ignore [mark->lock, group->mark_mutex] */
 	__u32 ignored_mask;
diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index 4d4f3284a9e3..672ca1512888 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -172,27 +172,18 @@ static unsigned long inode_to_key(const struct inode *inode)
 /*
  * Function to return search key in our hash from chunk. Key 0 is special and
  * should never be present in the hash.
- *
- * Must be called with chunk->mark.lock held to protect from connector
- * becoming NULL.
  */
-static unsigned long __chunk_to_key(struct audit_chunk *chunk)
+static unsigned long chunk_to_key(struct audit_chunk *chunk)
 {
-	if (!chunk->mark.connector)
+	/*
+	 * We have a reference to the mark so it should be attached to a
+	 * connector.
+	 */
+	if (WARN_ON_ONCE(!chunk->mark.connector))
 		return 0;
 	return (unsigned long)chunk->mark.connector->inode;
 }
 
-static unsigned long chunk_to_key(struct audit_chunk *chunk)
-{
-	unsigned long key;
-
-	spin_lock(&chunk->mark.lock);
-	key = __chunk_to_key(chunk);
-	spin_unlock(&chunk->mark.lock);
-	return key;
-}
-
 static inline struct list_head *chunk_hash(unsigned long key)
 {
 	unsigned long n = key / L1_CACHE_BYTES;
@@ -202,7 +193,7 @@ static inline struct list_head *chunk_hash(unsigned long key)
 /* hash_lock & entry->lock is held by caller */
 static void insert_hash(struct audit_chunk *chunk)
 {
-	unsigned long key = __chunk_to_key(chunk);
+	unsigned long key = chunk_to_key(chunk);
 	struct list_head *list;
 
 	if (!key)
@@ -263,6 +254,10 @@ static void untag_chunk(struct node *p)
 
 	mutex_lock(&entry->group->mark_mutex);
 	spin_lock(&entry->lock);
+	/*
+	 * mark_mutex protects mark from getting detached and thus also from
+	 * mark->connector->inode getting NULL.
+	 */
 	if (chunk->dead || !(entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
 		spin_unlock(&entry->lock);
 		mutex_unlock(&entry->group->mark_mutex);
@@ -423,6 +418,10 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
 
 	mutex_lock(&old_entry->group->mark_mutex);
 	spin_lock(&old_entry->lock);
+	/*
+	 * mark_mutex protects mark from getting detached and thus also from
+	 * mark->connector->inode getting NULL.
+	 */
 	if (!(old_entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
 		/* old_entry is being shot, lets just lie */
 		spin_unlock(&old_entry->lock);
-- 
2.10.2

* [PATCH 23/35] fsnotify: Remove special handling of mark destruction on group shutdown
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (21 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 22/35] fsnotify: Detach mark from object list when last reference is dropped Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 24/35] fsnotify: Provide framework for dropping SRCU lock in ->handle_event Jan Kara
                   ` (11 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Currently we queue all marks for destruction on group shutdown and then
destroy them from fsnotify_destroy_group() instead of from a worker thread,
which is the usual path. However, the worker can already be processing some
list of marks to destroy, so this does not guarantee that all marks are really
destroyed by the time the group is shut down. This isn't a big problem, as
each mark holds a group reference and thus the group stays partially alive
until all marks are really freed, but there's no point in complicating
our lives: just wait for the delayed work to finish instead.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/fsnotify.h |  6 ++----
 fs/notify/group.c    | 10 ++++++----
 fs/notify/mark.c     |  7 ++++---
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index 72050b75ca8c..2a92dc06198c 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -36,10 +36,8 @@ static inline void fsnotify_clear_marks_by_mount(struct vfsmount *mnt)
 }
 /* prepare for freeing all marks associated with given group */
 extern void fsnotify_detach_group_marks(struct fsnotify_group *group);
-/*
- * wait for fsnotify_mark_srcu period to end and free all marks in destroy_list
- */
-extern void fsnotify_mark_destroy_list(void);
+/* Wait until all marks queued for destruction are destroyed */
+extern void fsnotify_wait_marks_destroyed(void);
 
 /*
  * update the dentry->d_flags of all of inode's children to indicate if inode cares
diff --git a/fs/notify/group.c b/fs/notify/group.c
index fbe3cbebec16..0fb4aadcc19f 100644
--- a/fs/notify/group.c
+++ b/fs/notify/group.c
@@ -66,14 +66,16 @@ void fsnotify_destroy_group(struct fsnotify_group *group)
 	 */
 	fsnotify_group_stop_queueing(group);
 
-	/* clear all inode marks for this group, attach them to destroy_list */
+	/* Clear all marks for this group and queue them for destruction */
 	fsnotify_detach_group_marks(group);
 
 	/*
-	 * Wait for fsnotify_mark_srcu period to end and free all marks in
-	 * destroy_list
+	 * Wait until all marks get really destroyed. We could actually destroy
+	 * them ourselves instead of waiting for worker to do it, however that
+	 * would be racy as worker can already be processing some marks before
+	 * we even entered fsnotify_destroy_group().
 	 */
-	fsnotify_mark_destroy_list();
+	fsnotify_wait_marks_destroyed();
 
 	/*
 	 * Since we have waited for fsnotify_mark_srcu in
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 21c7791362c8..f916b71c9139 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -703,7 +703,7 @@ void fsnotify_init_mark(struct fsnotify_mark *mark,
  * Destroy all marks in destroy_list, waits for SRCU period to finish before
  * actually freeing marks.
  */
-void fsnotify_mark_destroy_list(void)
+static void fsnotify_mark_destroy_workfn(struct work_struct *work)
 {
 	struct fsnotify_mark *mark, *next;
 	struct list_head private_destroy_list;
@@ -721,7 +721,8 @@ void fsnotify_mark_destroy_list(void)
 	}
 }
 
-static void fsnotify_mark_destroy_workfn(struct work_struct *work)
+/* Wait for all marks queued for destruction to be actually destroyed */
+void fsnotify_wait_marks_destroyed(void)
 {
-	fsnotify_mark_destroy_list();
+	flush_delayed_work(&reaper_work);
 }
-- 
2.10.2

* [PATCH 24/35] fsnotify: Provide framework for dropping SRCU lock in ->handle_event
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (22 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 23/35] fsnotify: Remove special handling of mark destruction on group shutdown Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 25/35] fsnotify: Pass fsnotify_iter_info into handle_event handler Jan Kara
                   ` (10 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

fanotify wants to drop fsnotify_mark_srcu lock when waiting for response
from userspace so that the whole notification subsystem is not blocked
during that time. This patch provides a framework for safely getting
mark reference for a mark found in the object list which pins the mark
in that list. We can then drop fsnotify_mark_srcu, wait for userspace
response and then safely continue iteration of the object list once we
reacquire fsnotify_mark_srcu.
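
For readers who want to see why a "get reference unless already zero"
primitive is needed for marks found by lockless traversal, here is a rough
userspace sketch using C11 atomics. The toy_* names are hypothetical and
exist only for illustration; the patch itself relies on the kernel's
atomic_inc_not_zero() in fsnotify_get_mark_safe().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a mark's reference count. */
struct toy_mark {
	atomic_int refcnt;
};

/*
 * Take a reference only if the count is still non-zero. A mark whose
 * count already hit zero is on its way to destruction and must not be
 * revived, so the caller has to cope with failure.
 */
static bool toy_get_mark_safe(struct toy_mark *mark)
{
	int old = atomic_load(&mark->refcnt);

	while (old != 0) {
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&mark->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the last reference went away. */
static bool toy_put_mark(struct toy_mark *mark)
{
	return atomic_fetch_sub(&mark->refcnt, 1) == 1;
}
```

A plain increment would be wrong here: it could bump a count from 0 to 1
after the final put has already queued the mark for freeing.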

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/fsnotify.h             |  6 +++
 fs/notify/group.c                |  1 +
 fs/notify/mark.c                 | 82 ++++++++++++++++++++++++++++++++++++++++
 include/linux/fsnotify_backend.h |  5 +++
 4 files changed, 94 insertions(+)

diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index 2a92dc06198c..86383c7865c0 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -8,6 +8,12 @@
 
 #include "../mount.h"
 
+struct fsnotify_iter_info {
+	struct fsnotify_mark *inode_mark;
+	struct fsnotify_mark *vfsmount_mark;
+	int srcu_idx;
+};
+
 /* destroy all events sitting in this groups notification queue */
 extern void fsnotify_flush_notify(struct fsnotify_group *group);
 
diff --git a/fs/notify/group.c b/fs/notify/group.c
index 0fb4aadcc19f..79439cdf16e0 100644
--- a/fs/notify/group.c
+++ b/fs/notify/group.c
@@ -126,6 +126,7 @@ struct fsnotify_group *fsnotify_alloc_group(const struct fsnotify_ops *ops)
 	/* set to 0 when there a no external references to this group */
 	atomic_set(&group->refcnt, 1);
 	atomic_set(&group->num_marks, 0);
+	atomic_set(&group->user_waits, 0);
 
 	spin_lock_init(&group->notification_lock);
 	INIT_LIST_HEAD(&group->notification_list);
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index f916b71c9139..c4f43a6acd9a 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -109,6 +109,16 @@ void fsnotify_get_mark(struct fsnotify_mark *mark)
 	atomic_inc(&mark->refcnt);
 }
 
+/*
+ * Get mark reference when we found the mark via lockless traversal of object
+ * list. Mark can be already removed from the list by now and on its way to be
+ * destroyed once SRCU period ends.
+ */
+static bool fsnotify_get_mark_safe(struct fsnotify_mark *mark)
+{
+	return atomic_inc_not_zero(&mark->refcnt);
+}
+
 static void __fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 {
 	u32 new_mask = 0;
@@ -243,6 +253,72 @@ void fsnotify_put_mark(struct fsnotify_mark *mark)
 			   FSNOTIFY_REAPER_DELAY);
 }
 
+bool fsnotify_prepare_user_wait(struct fsnotify_iter_info *iter_info)
+{
+	struct fsnotify_group *group;
+
+	if (WARN_ON_ONCE(!iter_info->inode_mark && !iter_info->vfsmount_mark))
+		return false;
+
+	if (iter_info->inode_mark)
+		group = iter_info->inode_mark->group;
+	else
+		group = iter_info->vfsmount_mark->group;
+
+	/*
+	 * Since acquisition of mark reference is an atomic op as well, we can
+	 * be sure this inc is seen before any effect of refcount increment.
+	 */
+	atomic_inc(&group->user_waits);
+
+	if (iter_info->inode_mark) {
+		/* This can fail if mark is being removed */
+		if (!fsnotify_get_mark_safe(iter_info->inode_mark))
+			goto out_wait;
+	}
+	if (iter_info->vfsmount_mark) {
+		if (!fsnotify_get_mark_safe(iter_info->vfsmount_mark))
+			goto out_inode;
+	}
+
+	/*
+	 * Now that both marks are pinned by refcount in the inode / vfsmount
+	 * lists, we can drop SRCU lock, and safely resume the list iteration
+	 * once userspace returns.
+	 */
+	srcu_read_unlock(&fsnotify_mark_srcu, iter_info->srcu_idx);
+
+	return true;
+out_inode:
+	if (iter_info->inode_mark)
+		fsnotify_put_mark(iter_info->inode_mark);
+out_wait:
+	if (atomic_dec_and_test(&group->user_waits) && group->shutdown)
+		wake_up(&group->notification_waitq);
+	return false;
+}
+
+void fsnotify_finish_user_wait(struct fsnotify_iter_info *iter_info)
+{
+	struct fsnotify_group *group = NULL;
+
+	iter_info->srcu_idx = srcu_read_lock(&fsnotify_mark_srcu);
+	if (iter_info->inode_mark) {
+		group = iter_info->inode_mark->group;
+		fsnotify_put_mark(iter_info->inode_mark);
+	}
+	if (iter_info->vfsmount_mark) {
+		group = iter_info->vfsmount_mark->group;
+		fsnotify_put_mark(iter_info->vfsmount_mark);
+	}
+	/*
+	 * We abuse notification_waitq on group shutdown for waiting for all
+	 * marks pinned when waiting for userspace.
+	 */
+	if (atomic_dec_and_test(&group->user_waits) && group->shutdown)
+		wake_up(&group->notification_waitq);
+}
+
 /*
  * Mark mark as detached, remove it from group list. Mark still stays in object
  * list until its last reference is dropped. Note that we rely on mark being
@@ -647,6 +723,12 @@ void fsnotify_detach_group_marks(struct fsnotify_group *group)
 		fsnotify_free_mark(mark);
 		fsnotify_put_mark(mark);
 	}
+	/*
+	 * Some marks can still be pinned when waiting for response from
+	 * userspace. Wait for those now. fsnotify_prepare_user_wait() will
+	 * not succeed now so this wait is race-free.
+	 */
+	wait_event(group->notification_waitq, !atomic_read(&group->user_waits));
 }
 
 /* Destroy all marks attached to inode / vfsmount */
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index a483614b25d0..5bb6d988b9f6 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -80,6 +80,7 @@ struct fsnotify_event;
 struct fsnotify_mark;
 struct fsnotify_event_private_data;
 struct fsnotify_fname;
+struct fsnotify_iter_info;
 
 /*
  * Each group much define these ops.  The fsnotify infrastructure will call
@@ -163,6 +164,8 @@ struct fsnotify_group {
 	struct fsnotify_event *overflow_event;	/* Event we queue when the
 						 * notification list is too
 						 * full */
+	atomic_t user_waits;		/* Number of tasks waiting for user
+					 * response */
 
 	/* groups can define private fields here or use the void *private */
 	union {
@@ -368,6 +371,8 @@ extern void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group, un
 extern void fsnotify_get_mark(struct fsnotify_mark *mark);
 extern void fsnotify_put_mark(struct fsnotify_mark *mark);
 extern void fsnotify_unmount_inodes(struct super_block *sb);
+extern void fsnotify_finish_user_wait(struct fsnotify_iter_info *iter_info);
+extern bool fsnotify_prepare_user_wait(struct fsnotify_iter_info *iter_info);
 
 /* put here because inotify does some weird stuff when destroying watches */
 extern void fsnotify_init_event(struct fsnotify_event *event,
-- 
2.10.2

* [PATCH 25/35] fsnotify: Pass fsnotify_iter_info into handle_event handler
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (23 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 24/35] fsnotify: Provide framework for dropping SRCU lock in ->handle_event Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 26/35] fanotify: Release SRCU lock when waiting for userspace response Jan Kara
                   ` (9 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Pass fsnotify_iter_info into ->handle_event() handler so that it can
release and reacquire SRCU lock via fsnotify_prepare_user_wait() and
fsnotify_finish_user_wait() functions.  These functions also make sure
current marks are appropriately pinned so that iteration protected by
SRCU in fsnotify() stays safe.
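
The pinning done by the prepare/finish pair is all-or-nothing: either both
marks are pinned and the waiter is counted, or nothing changes. A rough
userspace model of that accounting follows. All toy_* names are
hypothetical, the SRCU lock is only mentioned in comments, and the wakeup
of a shutting-down group is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel structures. */
struct toy_mark {
	atomic_int refcnt;
};

struct toy_iter_info {
	struct toy_mark *inode_mark;
	struct toy_mark *vfsmount_mark;
};

/* Take a reference unless the count already dropped to zero. */
static bool toy_pin(struct toy_mark *mark)
{
	int old = atomic_load(&mark->refcnt);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&mark->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

static void toy_unpin(struct toy_mark *mark)
{
	atomic_fetch_sub(&mark->refcnt, 1);
}

/*
 * Pin every mark in 'info' or none of them. On success the caller stays
 * counted in 'user_waits' until toy_finish_user_wait() runs.
 */
static bool toy_prepare_user_wait(struct toy_iter_info *info,
				  atomic_int *user_waits)
{
	if (!info->inode_mark && !info->vfsmount_mark)
		return false;

	atomic_fetch_add(user_waits, 1);
	if (info->inode_mark && !toy_pin(info->inode_mark))
		goto out_wait;
	if (info->vfsmount_mark && !toy_pin(info->vfsmount_mark))
		goto out_inode;
	/* The real code drops the SRCU read lock at this point. */
	return true;

out_inode:
	if (info->inode_mark)
		toy_unpin(info->inode_mark);
out_wait:
	atomic_fetch_sub(user_waits, 1);
	return false;
}

/* Undo a successful prepare: drop the pins and the waiter count. */
static void toy_finish_user_wait(struct toy_iter_info *info,
				 atomic_int *user_waits)
{
	/* The real code retakes the SRCU read lock at this point. */
	if (info->inode_mark)
		toy_unpin(info->inode_mark);
	if (info->vfsmount_mark)
		toy_unpin(info->vfsmount_mark);
	atomic_fetch_sub(user_waits, 1);
}
```

A handler would call the finish function only after the prepare function
returned true; on false it must not sleep waiting for userspace.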

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/dnotify/dnotify.c          |  3 ++-
 fs/notify/fanotify/fanotify.c        |  3 ++-
 fs/notify/fsnotify.c                 | 19 +++++++++++++------
 fs/notify/inotify/inotify.h          |  3 ++-
 fs/notify/inotify/inotify_fsnotify.c |  3 ++-
 fs/notify/inotify/inotify_user.c     |  2 +-
 include/linux/fsnotify_backend.h     |  3 ++-
 kernel/audit_fsnotify.c              |  3 ++-
 kernel/audit_tree.c                  |  3 ++-
 kernel/audit_watch.c                 |  3 ++-
 10 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index 41b2a070761c..aba165ae3397 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -85,7 +85,8 @@ static int dnotify_handle_event(struct fsnotify_group *group,
 				struct fsnotify_mark *inode_mark,
 				struct fsnotify_mark *vfsmount_mark,
 				u32 mask, const void *data, int data_type,
-				const unsigned char *file_name, u32 cookie)
+				const unsigned char *file_name, u32 cookie,
+				struct fsnotify_iter_info *iter_info)
 {
 	struct dnotify_mark *dn_mark;
 	struct dnotify_struct *dn;
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index e5f7e47de68e..ec80a51cbb3d 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -174,7 +174,8 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 				 struct fsnotify_mark *inode_mark,
 				 struct fsnotify_mark *fanotify_mark,
 				 u32 mask, const void *data, int data_type,
-				 const unsigned char *file_name, u32 cookie)
+				 const unsigned char *file_name, u32 cookie,
+				 struct fsnotify_iter_info *iter_info)
 {
 	int ret = 0;
 	struct fanotify_event_info *event;
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index d512ef9f75fc..c4afb6a88268 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -127,7 +127,8 @@ static int send_to_group(struct inode *to_tell,
 			 struct fsnotify_mark *vfsmount_mark,
 			 __u32 mask, const void *data,
 			 int data_is, u32 cookie,
-			 const unsigned char *file_name)
+			 const unsigned char *file_name,
+			 struct fsnotify_iter_info *iter_info)
 {
 	struct fsnotify_group *group = NULL;
 	__u32 inode_test_mask = 0;
@@ -178,7 +179,7 @@ static int send_to_group(struct inode *to_tell,
 
 	return group->ops->handle_event(group, to_tell, inode_mark,
 					vfsmount_mark, mask, data, data_is,
-					file_name, cookie);
+					file_name, cookie, iter_info);
 }
 
 /*
@@ -194,8 +195,9 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	struct fsnotify_mark *inode_mark = NULL, *vfsmount_mark = NULL;
 	struct fsnotify_group *inode_group, *vfsmount_group;
 	struct fsnotify_mark_connector *inode_conn, *vfsmount_conn;
+	struct fsnotify_iter_info iter_info;
 	struct mount *mnt;
-	int idx, ret = 0;
+	int ret = 0;
 	/* global tests shouldn't care about events on child only the specific event */
 	__u32 test_mask = (mask & ~FS_EVENT_ON_CHILD);
 
@@ -224,7 +226,7 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	    !(mnt && test_mask & mnt->mnt_fsnotify_mask))
 		return 0;
 
-	idx = srcu_read_lock(&fsnotify_mark_srcu);
+	iter_info.srcu_idx = srcu_read_lock(&fsnotify_mark_srcu);
 
 	if ((mask & FS_MODIFY) ||
 	    (test_mask & to_tell->i_fsnotify_mask)) {
@@ -284,8 +286,13 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 				vfsmount_mark = NULL;
 			}
 		}
+
+		iter_info.inode_mark = inode_mark;
+		iter_info.vfsmount_mark = vfsmount_mark;
+
 		ret = send_to_group(to_tell, inode_mark, vfsmount_mark, mask,
-				    data, data_is, cookie, file_name);
+				    data, data_is, cookie, file_name,
+				    &iter_info);
 
 		if (ret && (mask & ALL_FSNOTIFY_PERM_EVENTS))
 			goto out;
@@ -299,7 +306,7 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	}
 	ret = 0;
 out:
-	srcu_read_unlock(&fsnotify_mark_srcu, idx);
+	srcu_read_unlock(&fsnotify_mark_srcu, iter_info.srcu_idx);
 
 	return ret;
 }
diff --git a/fs/notify/inotify/inotify.h b/fs/notify/inotify/inotify.h
index 7c461fd49c4c..7a966f456269 100644
--- a/fs/notify/inotify/inotify.h
+++ b/fs/notify/inotify/inotify.h
@@ -27,7 +27,8 @@ extern int inotify_handle_event(struct fsnotify_group *group,
 				struct fsnotify_mark *inode_mark,
 				struct fsnotify_mark *vfsmount_mark,
 				u32 mask, const void *data, int data_type,
-				const unsigned char *file_name, u32 cookie);
+				const unsigned char *file_name, u32 cookie,
+				struct fsnotify_iter_info *iter_info);
 
 extern const struct fsnotify_ops inotify_fsnotify_ops;
 
diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index f310d8368a2d..ccd6a4055e0c 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -68,7 +68,8 @@ int inotify_handle_event(struct fsnotify_group *group,
 			 struct fsnotify_mark *inode_mark,
 			 struct fsnotify_mark *vfsmount_mark,
 			 u32 mask, const void *data, int data_type,
-			 const unsigned char *file_name, u32 cookie)
+			 const unsigned char *file_name, u32 cookie,
+			 struct fsnotify_iter_info *iter_info)
 {
 	struct inotify_inode_mark *i_mark;
 	struct inotify_event_info *event;
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index 43cbd1b178c9..05b268ec0f5f 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -474,7 +474,7 @@ void inotify_ignored_and_remove_idr(struct fsnotify_mark *fsn_mark,
 
 	/* Queue ignore event for the watch */
 	inotify_handle_event(group, NULL, fsn_mark, NULL, FS_IN_IGNORED,
-			     NULL, FSNOTIFY_EVENT_NONE, NULL, 0);
+			     NULL, FSNOTIFY_EVENT_NONE, NULL, 0, NULL);
 
 	i_mark = container_of(fsn_mark, struct inotify_inode_mark, fsn_mark);
 	/* remove this mark from the idr */
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 5bb6d988b9f6..744a4b9076f9 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -99,7 +99,8 @@ struct fsnotify_ops {
 			    struct fsnotify_mark *inode_mark,
 			    struct fsnotify_mark *vfsmount_mark,
 			    u32 mask, const void *data, int data_type,
-			    const unsigned char *file_name, u32 cookie);
+			    const unsigned char *file_name, u32 cookie,
+			    struct fsnotify_iter_info *iter_info);
 	void (*free_group_priv)(struct fsnotify_group *group);
 	void (*freeing_mark)(struct fsnotify_mark *mark, struct fsnotify_group *group);
 	void (*free_event)(struct fsnotify_event *event);
diff --git a/kernel/audit_fsnotify.c b/kernel/audit_fsnotify.c
index 7ea57e516029..e8b371ff1e91 100644
--- a/kernel/audit_fsnotify.c
+++ b/kernel/audit_fsnotify.c
@@ -168,7 +168,8 @@ static int audit_mark_handle_event(struct fsnotify_group *group,
 				    struct fsnotify_mark *inode_mark,
 				    struct fsnotify_mark *vfsmount_mark,
 				    u32 mask, const void *data, int data_type,
-				    const unsigned char *dname, u32 cookie)
+				    const unsigned char *dname, u32 cookie,
+				    struct fsnotify_iter_info *iter_info)
 {
 	struct audit_fsnotify_mark *audit_mark;
 	const struct inode *inode = NULL;
diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index 672ca1512888..dbd7a9606065 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -989,7 +989,8 @@ static int audit_tree_handle_event(struct fsnotify_group *group,
 				   struct fsnotify_mark *inode_mark,
 				   struct fsnotify_mark *vfsmount_mark,
 				   u32 mask, const void *data, int data_type,
-				   const unsigned char *file_name, u32 cookie)
+				   const unsigned char *file_name, u32 cookie,
+				   struct fsnotify_iter_info *iter_info)
 {
 	return 0;
 }
diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c
index f79e4658433d..6caaf087801f 100644
--- a/kernel/audit_watch.c
+++ b/kernel/audit_watch.c
@@ -472,7 +472,8 @@ static int audit_watch_handle_event(struct fsnotify_group *group,
 				    struct fsnotify_mark *inode_mark,
 				    struct fsnotify_mark *vfsmount_mark,
 				    u32 mask, const void *data, int data_type,
-				    const unsigned char *dname, u32 cookie)
+				    const unsigned char *dname, u32 cookie,
+				    struct fsnotify_iter_info *iter_info)
 {
 	const struct inode *inode;
 	struct audit_parent *parent;
-- 
2.10.2

* [PATCH 26/35] fanotify: Release SRCU lock when waiting for userspace response
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (24 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 25/35] fsnotify: Pass fsnotify_iter_info into handle_event handler Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 27/35] fsnotify: Remove fsnotify_set_mark_{,ignored_}mask_locked() Jan Kara
                   ` (8 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

When a userspace task processing fanotify permission events misbehaves
and does not respond, fsnotify_mark_srcu is held indefinitely, which
causes further hangs in the whole notification subsystem. Although we
cannot easily solve the problem of operations blocked waiting for a
response from userspace, we can at least somewhat localize the damage by
dropping the SRCU lock before waiting for the userspace response and
reacquiring it when userspace responds.
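
The resulting decision flow can be modelled in a few lines of userspace C.
This is only a sketch: the toy_* names and the enum values are
hypothetical, the wait for userspace is reduced to a function argument,
and mapping a denial to -EPERM is an assumption about the part of the
switch statement that this patch does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical response codes; not the real FAN_* constants. */
enum toy_response { TOY_NONE = 0, TOY_ALLOW = 1, TOY_DENY = 2 };

/*
 * 'marks_pinned' stands in for the result of the prepare step and
 * 'from_user' for the answer that would arrive while sleeping.
 */
static int toy_get_response(bool marks_pinned, enum toy_response from_user)
{
	enum toy_response response;

	if (!marks_pinned)
		response = TOY_ALLOW;	/* raced with mark deletion: let it pass */
	else
		response = from_user;	/* wait and finish step would go here */

	/* Assumed conversion: anything other than "allow" denies. */
	return response == TOY_ALLOW ? 0 : -EPERM;
}
```

The point of the first branch is that a failed prepare never blocks: the
mark is going away, so there is nobody left to ask.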

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/fanotify/fanotify.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index ec80a51cbb3d..461c21ebebeb 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -57,14 +57,26 @@ static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
 
 #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
 static int fanotify_get_response(struct fsnotify_group *group,
-				 struct fanotify_perm_event_info *event)
+				 struct fanotify_perm_event_info *event,
+				 struct fsnotify_iter_info *iter_info)
 {
 	int ret;
 
 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
 
+	/*
+	 * fsnotify_prepare_user_wait() fails if we race with mark deletion.
+	 * Just let the operation pass in that case.
+	 */
+	if (!fsnotify_prepare_user_wait(iter_info)) {
+		event->response = FAN_ALLOW;
+		goto out;
+	}
+
 	wait_event(group->fanotify_data.access_waitq, event->response);
 
+	fsnotify_finish_user_wait(iter_info);
+out:
 	/* userspace responded, convert to something usable */
 	switch (event->response) {
 	case FAN_ALLOW:
@@ -216,7 +228,8 @@ static int fanotify_handle_event(struct fsnotify_group *group,
 
 #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
 	if (mask & FAN_ALL_PERM_EVENTS) {
-		ret = fanotify_get_response(group, FANOTIFY_PE(fsn_event));
+		ret = fanotify_get_response(group, FANOTIFY_PE(fsn_event),
+					    iter_info);
 		fsnotify_destroy_event(group, fsn_event);
 	}
 #endif
-- 
2.10.2

* [PATCH 27/35] fsnotify: Remove fsnotify_set_mark_{,ignored_}mask_locked()
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (25 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 26/35] fanotify: Release SRCU lock when waiting for userspace response Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 28/35] fsnotify: Remove fsnotify_recalc_{inode|vfsmount}_mask() Jan Kara
                   ` (7 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

These helpers are now only a simple assignment and just obfuscate
what is going on. Remove them.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/dnotify/dnotify.c        |  9 +++------
 fs/notify/fanotify/fanotify_user.c |  9 ++++-----
 fs/notify/inotify/inotify_user.c   |  6 ++----
 fs/notify/mark.c                   | 14 --------------
 include/linux/fsnotify_backend.h   |  4 ----
 5 files changed, 9 insertions(+), 33 deletions(-)

diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index aba165ae3397..5940c75541a7 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -52,7 +52,7 @@ struct dnotify_mark {
  */
 static void dnotify_recalc_inode_mask(struct fsnotify_mark *fsn_mark)
 {
-	__u32 new_mask, old_mask;
+	__u32 new_mask = 0;
 	struct dnotify_struct *dn;
 	struct dnotify_mark *dn_mark  = container_of(fsn_mark,
 						     struct dnotify_mark,
@@ -60,14 +60,11 @@ static void dnotify_recalc_inode_mask(struct fsnotify_mark *fsn_mark)
 
 	assert_spin_locked(&fsn_mark->lock);
 
-	old_mask = fsn_mark->mask;
-	new_mask = 0;
 	for (dn = dn_mark->dn; dn != NULL; dn = dn->dn_next)
 		new_mask |= (dn->dn_mask & ~FS_DN_MULTISHOT);
-	fsnotify_set_mark_mask_locked(fsn_mark, new_mask);
-
-	if (old_mask == new_mask)
+	if (fsn_mark->mask == new_mask)
 		return;
+	fsn_mark->mask = new_mask;
 
 	fsnotify_recalc_mask(fsn_mark->connector);
 }
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index c5e69870287f..cf38a345032f 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -511,13 +511,12 @@ static __u32 fanotify_mark_remove_from_mask(struct fsnotify_mark *fsn_mark,
 			tmask &= ~FAN_ONDIR;
 
 		oldmask = fsn_mark->mask;
-		fsnotify_set_mark_mask_locked(fsn_mark, tmask);
+		fsn_mark->mask = tmask;
 	} else {
 		__u32 tmask = fsn_mark->ignored_mask & ~mask;
 		if (flags & FAN_MARK_ONDIR)
 			tmask &= ~FAN_ONDIR;
-
-		fsnotify_set_mark_ignored_mask_locked(fsn_mark, tmask);
+		fsn_mark->ignored_mask = tmask;
 	}
 	*destroy = !(fsn_mark->mask | fsn_mark->ignored_mask);
 	spin_unlock(&fsn_mark->lock);
@@ -599,13 +598,13 @@ static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
 			tmask |= FAN_ONDIR;
 
 		oldmask = fsn_mark->mask;
-		fsnotify_set_mark_mask_locked(fsn_mark, tmask);
+		fsn_mark->mask = tmask;
 	} else {
 		__u32 tmask = fsn_mark->ignored_mask | mask;
 		if (flags & FAN_MARK_ONDIR)
 			tmask |= FAN_ONDIR;
 
-		fsnotify_set_mark_ignored_mask_locked(fsn_mark, tmask);
+		fsn_mark->ignored_mask = tmask;
 		if (flags & FAN_MARK_IGNORED_SURV_MODIFY)
 			fsn_mark->flags |= FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY;
 	}
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index 05b268ec0f5f..69739b26c7e4 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -513,14 +513,12 @@ static int inotify_update_existing_watch(struct fsnotify_group *group,
 	i_mark = container_of(fsn_mark, struct inotify_inode_mark, fsn_mark);
 
 	spin_lock(&fsn_mark->lock);
-
 	old_mask = fsn_mark->mask;
 	if (add)
-		fsnotify_set_mark_mask_locked(fsn_mark, (fsn_mark->mask | mask));
+		fsn_mark->mask |= mask;
 	else
-		fsnotify_set_mark_mask_locked(fsn_mark, mask);
+		fsn_mark->mask = mask;
 	new_mask = fsn_mark->mask;
-
 	spin_unlock(&fsn_mark->lock);
 
 	if (old_mask != new_mask) {
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index c4f43a6acd9a..ae33e9f91849 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -393,20 +393,6 @@ void fsnotify_destroy_mark(struct fsnotify_mark *mark,
 	fsnotify_free_mark(mark);
 }
 
-void fsnotify_set_mark_mask_locked(struct fsnotify_mark *mark, __u32 mask)
-{
-	assert_spin_locked(&mark->lock);
-
-	mark->mask = mask;
-}
-
-void fsnotify_set_mark_ignored_mask_locked(struct fsnotify_mark *mark, __u32 mask)
-{
-	assert_spin_locked(&mark->lock);
-
-	mark->ignored_mask = mask;
-}
-
 /*
  * Sorting function for lists of fsnotify marks.
  *
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 744a4b9076f9..63354cd86a7b 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -347,10 +347,6 @@ extern void fsnotify_init_mark(struct fsnotify_mark *mark, void (*free_mark)(str
 extern struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group, struct inode *inode);
 /* find (and take a reference) to a mark associated with group and vfsmount */
 extern struct fsnotify_mark *fsnotify_find_vfsmount_mark(struct fsnotify_group *group, struct vfsmount *mnt);
-/* set the ignored_mask of a mark */
-extern void fsnotify_set_mark_ignored_mask_locked(struct fsnotify_mark *mark, __u32 mask);
-/* set the mask of a mark (might pin the object into memory */
-extern void fsnotify_set_mark_mask_locked(struct fsnotify_mark *mark, __u32 mask);
 /* attach the mark to both the group and the inode */
 extern int fsnotify_add_mark(struct fsnotify_mark *mark, struct fsnotify_group *group,
 			     struct inode *inode, struct vfsmount *mnt, int allow_dups);
-- 
2.10.2

* [PATCH 28/35] fsnotify: Remove fsnotify_recalc_{inode|vfsmount}_mask()
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (26 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 27/35] fsnotify: Remove fsnotify_set_mark_{,ignored_}mask_locked() Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 29/35] fsnotify: Inline fsnotify_clear_{inode|vfsmount}_mark_group() Jan Kara
                   ` (6 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

These helpers are just very thin wrappers now. Remove them.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/fanotify/fanotify_user.c | 8 ++++----
 fs/notify/inode_mark.c             | 5 -----
 fs/notify/inotify/inotify_user.c   | 2 +-
 fs/notify/vfsmount_mark.c          | 5 -----
 include/linux/fsnotify_backend.h   | 4 ----
 5 files changed, 5 insertions(+), 19 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index cf38a345032f..24fa3f24b9ad 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -542,7 +542,7 @@ static int fanotify_remove_vfsmount_mark(struct fsnotify_group *group,
 	removed = fanotify_mark_remove_from_mask(fsn_mark, mask, flags,
 						 &destroy_mark);
 	if (removed & real_mount(mnt)->mnt_fsnotify_mask)
-		fsnotify_recalc_vfsmount_mask(mnt);
+		fsnotify_recalc_mask(real_mount(mnt)->mnt_fsnotify_marks);
 	if (destroy_mark)
 		fsnotify_detach_mark(fsn_mark);
 	mutex_unlock(&group->mark_mutex);
@@ -571,7 +571,7 @@ static int fanotify_remove_inode_mark(struct fsnotify_group *group,
 	removed = fanotify_mark_remove_from_mask(fsn_mark, mask, flags,
 						 &destroy_mark);
 	if (removed & inode->i_fsnotify_mask)
-		fsnotify_recalc_inode_mask(inode);
+		fsnotify_recalc_mask(inode->i_fsnotify_marks);
 	if (destroy_mark)
 		fsnotify_detach_mark(fsn_mark);
 	mutex_unlock(&group->mark_mutex);
@@ -656,7 +656,7 @@ static int fanotify_add_vfsmount_mark(struct fsnotify_group *group,
 	}
 	added = fanotify_mark_add_to_mask(fsn_mark, mask, flags);
 	if (added & ~real_mount(mnt)->mnt_fsnotify_mask)
-		fsnotify_recalc_vfsmount_mask(mnt);
+		fsnotify_recalc_mask(real_mount(mnt)->mnt_fsnotify_marks);
 	mutex_unlock(&group->mark_mutex);
 
 	fsnotify_put_mark(fsn_mark);
@@ -693,7 +693,7 @@ static int fanotify_add_inode_mark(struct fsnotify_group *group,
 	}
 	added = fanotify_mark_add_to_mask(fsn_mark, mask, flags);
 	if (added & ~inode->i_fsnotify_mask)
-		fsnotify_recalc_inode_mask(inode);
+		fsnotify_recalc_mask(inode->i_fsnotify_marks);
 	mutex_unlock(&group->mark_mutex);
 
 	fsnotify_put_mark(fsn_mark);
diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
index b9370316727e..2188329da3c2 100644
--- a/fs/notify/inode_mark.c
+++ b/fs/notify/inode_mark.c
@@ -30,11 +30,6 @@
 
 #include "../internal.h"
 
-void fsnotify_recalc_inode_mask(struct inode *inode)
-{
-	fsnotify_recalc_mask(inode->i_fsnotify_marks);
-}
-
 /*
  * Given a group clear all of the inode marks associated with that group.
  */
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index 69739b26c7e4..b3b2a464a03c 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -529,7 +529,7 @@ static int inotify_update_existing_watch(struct fsnotify_group *group,
 
 		/* update the inode with this new fsn_mark */
 		if (dropped || do_inode)
-			fsnotify_recalc_inode_mask(inode);
+			fsnotify_recalc_mask(inode->i_fsnotify_marks);
 
 	}
 
diff --git a/fs/notify/vfsmount_mark.c b/fs/notify/vfsmount_mark.c
index dd5f3fcbccfb..41bff46576c2 100644
--- a/fs/notify/vfsmount_mark.c
+++ b/fs/notify/vfsmount_mark.c
@@ -34,11 +34,6 @@ void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group)
 	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_OBJ_TYPE_VFSMOUNT);
 }
 
-void fsnotify_recalc_vfsmount_mask(struct vfsmount *mnt)
-{
-	fsnotify_recalc_mask(real_mount(mnt)->mnt_fsnotify_marks);
-}
-
 /*
  * given a group and vfsmount, find the mark associated with that combination.
  * if found take a reference to that mark and return it, else return NULL
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 63354cd86a7b..6d09c6ff9810 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -338,10 +338,6 @@ extern struct fsnotify_event *fsnotify_remove_first_event(struct fsnotify_group
 
 /* Calculate mask of events for a list of marks */
 extern void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn);
-/* run all marks associated with a vfsmount and update mnt->mnt_fsnotify_mask */
-extern void fsnotify_recalc_vfsmount_mask(struct vfsmount *mnt);
-/* run all marks associated with an inode and update inode->i_fsnotify_mask */
-extern void fsnotify_recalc_inode_mask(struct inode *inode);
 extern void fsnotify_init_mark(struct fsnotify_mark *mark, void (*free_mark)(struct fsnotify_mark *mark));
 /* find (and take a reference) to a mark associated with group and inode */
 extern struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group, struct inode *inode);
-- 
2.10.2

* [PATCH 29/35] fsnotify: Inline fsnotify_clear_{inode|vfsmount}_mark_group()
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (27 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 28/35] fsnotify: Remove fsnotify_recalc_{inode|vfsmount}_mask() Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 30/35] fsnotify: Rename fsnotify_clear_marks_by_group_flags() Jan Kara
                   ` (5 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Inline these helpers as they are very thin. We still keep them as we
don't want to expose details about how list type is determined.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
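For illustration only, a minimal userspace sketch of the idea: the generic
worker takes a type, while thin static inline wrappers let callers name the
object kind without knowing how the type is encoded. All toy_* and TOY_*
names are hypothetical stand-ins, not kernel API, and the group is reduced
to two counters.

```c
#include <assert.h>

/* Hypothetical stand-ins for the object type bits; not kernel definitions. */
#define TOY_OBJ_TYPE_INODE	0x01u
#define TOY_OBJ_TYPE_VFSMOUNT	0x02u

/* Toy group: one counter of attached marks per object type. */
struct toy_group {
	int inode_marks;
	int vfsmount_marks;
};

/* Generic worker: clear marks of the given type(s), return how many. */
static int toy_clear_marks_by_type(struct toy_group *group, unsigned int type)
{
	int cleared = 0;

	assert((type & ~(TOY_OBJ_TYPE_INODE | TOY_OBJ_TYPE_VFSMOUNT)) == 0);
	if (type & TOY_OBJ_TYPE_INODE) {
		cleared += group->inode_marks;
		group->inode_marks = 0;
	}
	if (type & TOY_OBJ_TYPE_VFSMOUNT) {
		cleared += group->vfsmount_marks;
		group->vfsmount_marks = 0;
	}
	return cleared;
}

/* Thin wrappers: callers name the object, not the type constant. */
static inline int toy_clear_inode_marks(struct toy_group *group)
{
	return toy_clear_marks_by_type(group, TOY_OBJ_TYPE_INODE);
}

static inline int toy_clear_vfsmount_marks(struct toy_group *group)
{
	return toy_clear_marks_by_type(group, TOY_OBJ_TYPE_VFSMOUNT);
}
```

Keeping the wrappers in the header costs nothing at run time, and a later
change to how the type is represented would only touch the wrappers.
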
 fs/notify/inode_mark.c           |  8 --------
 fs/notify/vfsmount_mark.c        |  5 -----
 include/linux/fsnotify_backend.h | 14 ++++++++++----
 3 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
index 2188329da3c2..bdc15f736082 100644
--- a/fs/notify/inode_mark.c
+++ b/fs/notify/inode_mark.c
@@ -31,14 +31,6 @@
 #include "../internal.h"
 
 /*
- * Given a group clear all of the inode marks associated with that group.
- */
-void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group)
-{
-	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_OBJ_TYPE_INODE);
-}
-
-/*
  * given a group and inode, find the mark associated with that combination.
  * if found take a reference to that mark and return it, else return NULL
  */
diff --git a/fs/notify/vfsmount_mark.c b/fs/notify/vfsmount_mark.c
index 41bff46576c2..1e692c56deec 100644
--- a/fs/notify/vfsmount_mark.c
+++ b/fs/notify/vfsmount_mark.c
@@ -29,11 +29,6 @@
 #include <linux/fsnotify_backend.h>
 #include "fsnotify.h"
 
-void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group)
-{
-	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_OBJ_TYPE_VFSMOUNT);
-}
-
 /*
  * given a group and vfsmount, find the mark associated with that combination.
  * if found take a reference to that mark and return it, else return NULL
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 6d09c6ff9810..700b4fa991d4 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -355,12 +355,18 @@ extern void fsnotify_destroy_mark(struct fsnotify_mark *mark,
 extern void fsnotify_detach_mark(struct fsnotify_mark *mark);
 /* free mark */
 extern void fsnotify_free_mark(struct fsnotify_mark *mark);
-/* run all the marks in a group, and clear all of the vfsmount marks */
-extern void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group);
-/* run all the marks in a group, and clear all of the inode marks */
-extern void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group);
 /* run all the marks in a group, and clear all of the marks attached to given object type */
 extern void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group, unsigned int flags);
+/* run all the marks in a group, and clear all of the vfsmount marks */
+static inline void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group)
+{
+	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_OBJ_TYPE_VFSMOUNT);
+}
+/* run all the marks in a group, and clear all of the inode marks */
+static inline void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group)
+{
+	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_OBJ_TYPE_INODE);
+}
 extern void fsnotify_get_mark(struct fsnotify_mark *mark);
 extern void fsnotify_put_mark(struct fsnotify_mark *mark);
 extern void fsnotify_unmount_inodes(struct super_block *sb);
-- 
2.10.2

* [PATCH 30/35] fsnotify: Rename fsnotify_clear_marks_by_group_flags()
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (28 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 29/35] fsnotify: Inline fsnotify_clear_{inode|vfsmount}_mark_group() Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 31/35] fsnotify: Remove fsnotify_detach_group_marks() Jan Kara
                   ` (4 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

The _flags() suffix in the function name was more confusing than helpful,
so just remove it. Also rename the argument from 'flags' to 'type' to
better explain what the function expects.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Suggested-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/mark.c                 | 12 +++++-------
 include/linux/fsnotify_backend.h |  6 +++---
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index ae33e9f91849..89656abbf4f8 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -67,7 +67,7 @@
  * - The fs the inode is on is unmounted.  (fsnotify_inode_delete/fsnotify_unmount_inodes)
  * - Something explicitly requests that it be removed.  (fsnotify_destroy_mark)
  * - The fsnotify_group associated with the mark is going away and all such marks
- *   need to be cleaned up. (fsnotify_clear_marks_by_group)
+ *   need to be cleaned up. (fsnotify_detach_group_marks)
  *
  * This has the very interesting property of being able to run concurrently with
  * any (or all) other directions.
@@ -645,11 +645,9 @@ struct fsnotify_mark *fsnotify_find_mark(
 	return NULL;
 }
 
-/*
- * clear any marks in a group in which mark->flags & flags is true
- */
-void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group,
-					 unsigned int flags)
+/* Clear any marks in a group with given type */
+void fsnotify_clear_marks_by_group(struct fsnotify_group *group,
+				   unsigned int type)
 {
 	struct fsnotify_mark *lmark, *mark;
 	LIST_HEAD(to_free);
@@ -665,7 +663,7 @@ void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group,
 	 */
 	mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
 	list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list) {
-		if (mark->connector->flags & flags)
+		if (mark->connector->flags & type)
 			list_move(&mark->g_list, &to_free);
 	}
 	mutex_unlock(&group->mark_mutex);
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 700b4fa991d4..d6bbd5acdac1 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -356,16 +356,16 @@ extern void fsnotify_detach_mark(struct fsnotify_mark *mark);
 /* free mark */
 extern void fsnotify_free_mark(struct fsnotify_mark *mark);
 /* run all the marks in a group, and clear all of the marks attached to given object type */
-extern void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group, unsigned int flags);
+extern void fsnotify_clear_marks_by_group(struct fsnotify_group *group, unsigned int type);
 /* run all the marks in a group, and clear all of the vfsmount marks */
 static inline void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group)
 {
-	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_OBJ_TYPE_VFSMOUNT);
+	fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_TYPE_VFSMOUNT);
 }
 /* run all the marks in a group, and clear all of the inode marks */
 static inline void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group)
 {
-	fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_OBJ_TYPE_INODE);
+	fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_TYPE_INODE);
 }
 extern void fsnotify_get_mark(struct fsnotify_mark *mark);
 extern void fsnotify_put_mark(struct fsnotify_mark *mark);
-- 
2.10.2

* [PATCH 31/35] fsnotify: Remove fsnotify_detach_group_marks()
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (29 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 30/35] fsnotify: Rename fsnotify_clear_marks_by_group_flags() Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 32/35] fsnotify: Remove fsnotify_find_{inode|vfsmount}_mark() Jan Kara
                   ` (3 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

The function is already mostly contained in what
fsnotify_clear_marks_by_group() does. Just update that function to not
select marks when all of them should be destroyed and remove
fsnotify_detach_group_marks().

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
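For illustration only, a minimal userspace sketch of the control flow
described above: when all types are requested, the selection pass is
skipped and the drain loop runs directly on the group's own list. The
toy_* and OBJ_* names are hypothetical, the list is a plain singly linked
list, and locking, reference counting and the wait for pending user
responses are left out on purpose.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
#define OBJ_TYPE_INODE		0x01u
#define OBJ_TYPE_VFSMOUNT	0x02u
#define OBJ_ALL_TYPES		(OBJ_TYPE_INODE | OBJ_TYPE_VFSMOUNT)

struct toy_mark {
	unsigned int type;
	struct toy_mark *next;
};

struct toy_group {
	struct toy_mark *marks;	/* singly linked list of attached marks */
};

/* Detach every mark whose type matches; return how many were detached. */
static int toy_clear_marks(struct toy_group *group, unsigned int type)
{
	struct toy_mark *to_free = NULL;
	struct toy_mark **head = &to_free;
	int detached = 0;

	assert(type != 0 && (type & ~OBJ_ALL_TYPES) == 0);

	if (type == OBJ_ALL_TYPES) {
		/* Nothing to select: drain the group's own list directly. */
		head = &group->marks;
	} else {
		/* Selection step: move matching marks to a private list. */
		struct toy_mark **pp = &group->marks;

		while (*pp) {
			struct toy_mark *m = *pp;

			if (m->type & type) {
				*pp = m->next;
				m->next = to_free;
				to_free = m;
			} else {
				pp = &m->next;
			}
		}
	}

	/* Common drain loop, shared by both cases. */
	while (*head) {
		struct toy_mark *m = *head;

		*head = m->next;
		m->next = NULL;
		detached++;
	}
	return detached;
}
```

The point of the sketch is only that one drain loop can serve both callers
once the list head is chosen up front.
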
 fs/notify/fsnotify.h             |  2 --
 fs/notify/group.c                |  9 +++++++-
 fs/notify/mark.c                 | 45 +++++++++-------------------------------
 include/linux/fsnotify_backend.h |  2 ++
 4 files changed, 20 insertions(+), 38 deletions(-)

diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index 86383c7865c0..3ec593c32684 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -40,8 +40,6 @@ static inline void fsnotify_clear_marks_by_mount(struct vfsmount *mnt)
 {
 	fsnotify_destroy_marks(&real_mount(mnt)->mnt_fsnotify_marks);
 }
-/* prepare for freeing all marks associated with given group */
-extern void fsnotify_detach_group_marks(struct fsnotify_group *group);
 /* Wait until all marks queued for destruction are destroyed */
 extern void fsnotify_wait_marks_destroyed(void);
 
diff --git a/fs/notify/group.c b/fs/notify/group.c
index 79439cdf16e0..32357534de18 100644
--- a/fs/notify/group.c
+++ b/fs/notify/group.c
@@ -67,7 +67,14 @@ void fsnotify_destroy_group(struct fsnotify_group *group)
 	fsnotify_group_stop_queueing(group);
 
 	/* Clear all marks for this group and queue them for destruction */
-	fsnotify_detach_group_marks(group);
+	fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_ALL_TYPES);
+
+	/*
+	 * Some marks can still be pinned when waiting for response from
+	 * userspace. Wait for those now. fsnotify_prepare_user_wait() will
+	 * not succeed now so this wait is race-free.
+	 */
+	wait_event(group->notification_waitq, !atomic_read(&group->user_waits));
 
 	/*
 	 * Wait until all marks get really destroyed. We could actually destroy
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 89656abbf4f8..9f3364ef19d3 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -67,7 +67,7 @@
  * - The fs the inode is on is unmounted.  (fsnotify_inode_delete/fsnotify_unmount_inodes)
  * - Something explicitly requests that it be removed.  (fsnotify_destroy_mark)
  * - The fsnotify_group associated with the mark is going away and all such marks
- *   need to be cleaned up. (fsnotify_detach_group_marks)
+ *   need to be cleaned up. (fsnotify_clear_marks_by_group)
  *
  * This has the very interesting property of being able to run concurrently with
  * any (or all) other directions.
@@ -651,7 +651,13 @@ void fsnotify_clear_marks_by_group(struct fsnotify_group *group,
 {
 	struct fsnotify_mark *lmark, *mark;
 	LIST_HEAD(to_free);
+	struct list_head *head = &to_free;
 
+	/* Skip selection step if we want to clear all marks. */
+	if (type == FSNOTIFY_OBJ_ALL_TYPES) {
+		head = &group->marks_list;
+		goto clear;
+	}
 	/*
 	 * We have to be really careful here. Anytime we drop mark_mutex, e.g.
 	 * fsnotify_clear_marks_by_inode() can come and free marks. Even in our
@@ -668,13 +674,14 @@ void fsnotify_clear_marks_by_group(struct fsnotify_group *group,
 	}
 	mutex_unlock(&group->mark_mutex);
 
+clear:
 	while (1) {
 		mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
-		if (list_empty(&to_free)) {
+		if (list_empty(head)) {
 			mutex_unlock(&group->mark_mutex);
 			break;
 		}
-		mark = list_first_entry(&to_free, struct fsnotify_mark, g_list);
+		mark = list_first_entry(head, struct fsnotify_mark, g_list);
 		fsnotify_get_mark(mark);
 		fsnotify_detach_mark(mark);
 		mutex_unlock(&group->mark_mutex);
@@ -683,38 +690,6 @@ void fsnotify_clear_marks_by_group(struct fsnotify_group *group,
 	}
 }
 
-/*
- * Given a group, prepare for freeing all the marks associated with that group.
- * The marks are attached to the list of marks prepared for destruction, the
- * caller is responsible for freeing marks in that list after SRCU period has
- * ended.
- */
-void fsnotify_detach_group_marks(struct fsnotify_group *group)
-{
-	struct fsnotify_mark *mark;
-
-	while (1) {
-		mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
-		if (list_empty(&group->marks_list)) {
-			mutex_unlock(&group->mark_mutex);
-			break;
-		}
-		mark = list_first_entry(&group->marks_list,
-					struct fsnotify_mark, g_list);
-		fsnotify_get_mark(mark);
-		fsnotify_detach_mark(mark);
-		mutex_unlock(&group->mark_mutex);
-		fsnotify_free_mark(mark);
-		fsnotify_put_mark(mark);
-	}
-	/*
-	 * Some marks can still be pinned when waiting for response from
-	 * userspace. Wait for those now. fsnotify_prepare_user_wait() will
-	 * not succeed now so this wait is race-free.
-	 */
-	wait_event(group->notification_waitq, !atomic_read(&group->user_waits));
-}
-
 /* Destroy all marks attached to inode / vfsmount */
 void fsnotify_destroy_marks(struct fsnotify_mark_connector __rcu **connp)
 {
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index d6bbd5acdac1..7287cba42a66 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -208,6 +208,8 @@ struct fsnotify_mark_connector {
 	spinlock_t lock;
 #define FSNOTIFY_OBJ_TYPE_INODE		0x01
 #define FSNOTIFY_OBJ_TYPE_VFSMOUNT	0x02
+#define FSNOTIFY_OBJ_ALL_TYPES		(FSNOTIFY_OBJ_TYPE_INODE | \
+					 FSNOTIFY_OBJ_TYPE_VFSMOUNT)
 	unsigned int flags;	/* Type of object [lock] */
 	union {	/* Object pointer [lock] */
 		struct inode *inode;
-- 
2.10.2

* [PATCH 32/35] fsnotify: Remove fsnotify_find_{inode|vfsmount}_mark()
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (30 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 31/35] fsnotify: Remove fsnotify_detach_group_marks() Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 33/35] fsnotify: Drop inode_mark.c Jan Kara
                   ` (2 subsequent siblings)
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

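These are very thin wrappers, so just remove them and call
fsnotify_find_mark() directly. Its declaration moves from
fs/notify/fsnotify.h to include/linux/fsnotify_backend.h so that callers
outside fs/notify can use it. Drop fs/notify/vfsmount_mark.c as it is
empty now.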

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/Makefile                 |  2 +-
 fs/notify/dnotify/dnotify.c        |  4 ++--
 fs/notify/fanotify/fanotify_user.c | 12 ++++++-----
 fs/notify/fsnotify.h               |  4 ----
 fs/notify/inode_mark.c             | 10 ---------
 fs/notify/inotify/inotify_user.c   |  2 +-
 fs/notify/vfsmount_mark.c          | 42 --------------------------------------
 include/linux/fsnotify_backend.h   |  8 ++++----
 kernel/audit_tree.c                |  3 ++-
 kernel/audit_watch.c               |  2 +-
 10 files changed, 18 insertions(+), 71 deletions(-)
 delete mode 100644 fs/notify/vfsmount_mark.c

diff --git a/fs/notify/Makefile b/fs/notify/Makefile
index 96d3420d0242..ebb64a0282d1 100644
--- a/fs/notify/Makefile
+++ b/fs/notify/Makefile
@@ -1,5 +1,5 @@
 obj-$(CONFIG_FSNOTIFY)		+= fsnotify.o notification.o group.o inode_mark.o \
-				   mark.o vfsmount_mark.o fdinfo.o
+				   mark.o fdinfo.o
 
 obj-y			+= dnotify/
 obj-y			+= inotify/
diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index 5940c75541a7..b77d8d049e4d 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -157,7 +157,7 @@ void dnotify_flush(struct file *filp, fl_owner_t id)
 	if (!S_ISDIR(inode->i_mode))
 		return;
 
-	fsn_mark = fsnotify_find_inode_mark(dnotify_group, inode);
+	fsn_mark = fsnotify_find_mark(&inode->i_fsnotify_marks, dnotify_group);
 	if (!fsn_mark)
 		return;
 	dn_mark = container_of(fsn_mark, struct dnotify_mark, fsn_mark);
@@ -313,7 +313,7 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg)
 	mutex_lock(&dnotify_group->mark_mutex);
 
 	/* add the new_fsn_mark or find an old one. */
-	fsn_mark = fsnotify_find_inode_mark(dnotify_group, inode);
+	fsn_mark = fsnotify_find_mark(&inode->i_fsnotify_marks, dnotify_group);
 	if (fsn_mark) {
 		dn_mark = container_of(fsn_mark, struct dnotify_mark, fsn_mark);
 		spin_lock(&fsn_mark->lock);
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 24fa3f24b9ad..5a82bbb79f55 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -533,7 +533,8 @@ static int fanotify_remove_vfsmount_mark(struct fsnotify_group *group,
 	int destroy_mark;
 
 	mutex_lock(&group->mark_mutex);
-	fsn_mark = fsnotify_find_vfsmount_mark(group, mnt);
+	fsn_mark = fsnotify_find_mark(&real_mount(mnt)->mnt_fsnotify_marks,
+				      group);
 	if (!fsn_mark) {
 		mutex_unlock(&group->mark_mutex);
 		return -ENOENT;
@@ -562,7 +563,7 @@ static int fanotify_remove_inode_mark(struct fsnotify_group *group,
 	int destroy_mark;
 
 	mutex_lock(&group->mark_mutex);
-	fsn_mark = fsnotify_find_inode_mark(group, inode);
+	fsn_mark = fsnotify_find_mark(&inode->i_fsnotify_marks, group);
 	if (!fsn_mark) {
 		mutex_unlock(&group->mark_mutex);
 		return -ENOENT;
@@ -578,7 +579,7 @@ static int fanotify_remove_inode_mark(struct fsnotify_group *group,
 	if (destroy_mark)
 		fsnotify_free_mark(fsn_mark);
 
-	/* matches the fsnotify_find_inode_mark() */
+	/* matches the fsnotify_find_mark() */
 	fsnotify_put_mark(fsn_mark);
 
 	return 0;
@@ -646,7 +647,8 @@ static int fanotify_add_vfsmount_mark(struct fsnotify_group *group,
 	__u32 added;
 
 	mutex_lock(&group->mark_mutex);
-	fsn_mark = fsnotify_find_vfsmount_mark(group, mnt);
+	fsn_mark = fsnotify_find_mark(&real_mount(mnt)->mnt_fsnotify_marks,
+				      group);
 	if (!fsn_mark) {
 		fsn_mark = fanotify_add_new_mark(group, NULL, mnt);
 		if (IS_ERR(fsn_mark)) {
@@ -683,7 +685,7 @@ static int fanotify_add_inode_mark(struct fsnotify_group *group,
 		return 0;
 
 	mutex_lock(&group->mark_mutex);
-	fsn_mark = fsnotify_find_inode_mark(group, inode);
+	fsn_mark = fsnotify_find_mark(&inode->i_fsnotify_marks, group);
 	if (!fsn_mark) {
 		fsn_mark = fanotify_add_new_mark(group, inode, NULL);
 		if (IS_ERR(fsn_mark)) {
diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index 3ec593c32684..bf012e8ecd14 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -24,10 +24,6 @@ extern struct srcu_struct fsnotify_mark_srcu;
 extern int fsnotify_compare_groups(struct fsnotify_group *a,
 				   struct fsnotify_group *b);
 
-/* Find mark belonging to given group in the list of marks */
-extern struct fsnotify_mark *fsnotify_find_mark(
-				struct fsnotify_mark_connector __rcu **connp,
-				struct fsnotify_group *group);
 /* Destroy all marks connected via given connector */
 extern void fsnotify_destroy_marks(struct fsnotify_mark_connector __rcu **connp);
 /* run the list of all marks associated with inode and destroy them */
diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
index bdc15f736082..5cc317bad082 100644
--- a/fs/notify/inode_mark.c
+++ b/fs/notify/inode_mark.c
@@ -30,16 +30,6 @@
 
 #include "../internal.h"
 
-/*
- * given a group and inode, find the mark associated with that combination.
- * if found take a reference to that mark and return it, else return NULL
- */
-struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group,
-					       struct inode *inode)
-{
-	return fsnotify_find_mark(&inode->i_fsnotify_marks, group);
-}
-
 /**
  * fsnotify_unmount_inodes - an sb is unmounting.  handle any watched inodes.
  * @sb: superblock being unmounted.
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index b3b2a464a03c..a5e4411362f2 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -506,7 +506,7 @@ static int inotify_update_existing_watch(struct fsnotify_group *group,
 
 	mask = inotify_arg_to_mask(arg);
 
-	fsn_mark = fsnotify_find_inode_mark(group, inode);
+	fsn_mark = fsnotify_find_mark(&inode->i_fsnotify_marks, group);
 	if (!fsn_mark)
 		return -ENOENT;
 
diff --git a/fs/notify/vfsmount_mark.c b/fs/notify/vfsmount_mark.c
deleted file mode 100644
index 1e692c56deec..000000000000
--- a/fs/notify/vfsmount_mark.c
+++ /dev/null
@@ -1,42 +0,0 @@
-/*
- *  Copyright (C) 2008 Red Hat, Inc., Eric Paris <eparis@redhat.com>
- *
- *  This program is free software; you can redistribute it and/or modify
- *  it under the terms of the GNU General Public License as published by
- *  the Free Software Foundation; either version 2, or (at your option)
- *  any later version.
- *
- *  This program is distributed in the hope that it will be useful,
- *  but WITHOUT ANY WARRANTY; without even the implied warranty of
- *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- *  GNU General Public License for more details.
- *
- *  You should have received a copy of the GNU General Public License
- *  along with this program; see the file COPYING.  If not, write to
- *  the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
- */
-
-#include <linux/fs.h>
-#include <linux/init.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/mount.h>
-#include <linux/mutex.h>
-#include <linux/spinlock.h>
-
-#include <linux/atomic.h>
-
-#include <linux/fsnotify_backend.h>
-#include "fsnotify.h"
-
-/*
- * given a group and vfsmount, find the mark associated with that combination.
- * if found take a reference to that mark and return it, else return NULL
- */
-struct fsnotify_mark *fsnotify_find_vfsmount_mark(struct fsnotify_group *group,
-						  struct vfsmount *mnt)
-{
-	struct mount *m = real_mount(mnt);
-
-	return fsnotify_find_mark(&m->mnt_fsnotify_marks, group);
-}
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 7287cba42a66..2ef0e04c5a9d 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -341,10 +341,10 @@ extern struct fsnotify_event *fsnotify_remove_first_event(struct fsnotify_group
 /* Calculate mask of events for a list of marks */
 extern void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn);
 extern void fsnotify_init_mark(struct fsnotify_mark *mark, void (*free_mark)(struct fsnotify_mark *mark));
-/* find (and take a reference) to a mark associated with group and inode */
-extern struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group, struct inode *inode);
-/* find (and take a reference) to a mark associated with group and vfsmount */
-extern struct fsnotify_mark *fsnotify_find_vfsmount_mark(struct fsnotify_group *group, struct vfsmount *mnt);
+/* Find mark belonging to given group in the list of marks */
+extern struct fsnotify_mark *fsnotify_find_mark(
+				struct fsnotify_mark_connector __rcu **connp,
+				struct fsnotify_group *group);
 /* attach the mark to both the group and the inode */
 extern int fsnotify_add_mark(struct fsnotify_mark *mark, struct fsnotify_group *group,
 			     struct inode *inode, struct vfsmount *mnt, int allow_dups);
diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index dbd7a9606065..3bc8e9320cf1 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -391,7 +391,8 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
 	struct node *p;
 	int n;
 
-	old_entry = fsnotify_find_inode_mark(audit_tree_group, inode);
+	old_entry = fsnotify_find_mark(&inode->i_fsnotify_marks,
+				       audit_tree_group);
 	if (!old_entry)
 		return create_chunk(inode, tree);
 
diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c
index 6caaf087801f..956fa584c239 100644
--- a/kernel/audit_watch.c
+++ b/kernel/audit_watch.c
@@ -102,7 +102,7 @@ static inline struct audit_parent *audit_find_parent(struct inode *inode)
 	struct audit_parent *parent = NULL;
 	struct fsnotify_mark *entry;
 
-	entry = fsnotify_find_inode_mark(audit_watch_group, inode);
+	entry = fsnotify_find_mark(&inode->i_fsnotify_marks, audit_watch_group);
 	if (entry)
 		parent = container_of(entry, struct audit_parent, mark);
 
-- 
2.10.2

* [PATCH 33/35] fsnotify: Drop inode_mark.c
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (31 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 32/35] fsnotify: Remove fsnotify_find_{inode|vfsmount}_mark() Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 34/35] fsnotify: Add group pointer in fsnotify_init_mark() Jan Kara
  2017-04-03 15:34 ` [PATCH 35/35] fsnotify: Move ->free_mark callback to fsnotify_ops Jan Kara
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

inode_mark.c now contains only a single function. Move it to
fs/notify/fsnotify.c and remove inode_mark.c.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/Makefile     |  4 +--
 fs/notify/fsnotify.c   | 57 ++++++++++++++++++++++++++++++++
 fs/notify/inode_mark.c | 88 --------------------------------------------------
 3 files changed, 59 insertions(+), 90 deletions(-)
 delete mode 100644 fs/notify/inode_mark.c

diff --git a/fs/notify/Makefile b/fs/notify/Makefile
index ebb64a0282d1..3e969ae91b60 100644
--- a/fs/notify/Makefile
+++ b/fs/notify/Makefile
@@ -1,5 +1,5 @@
-obj-$(CONFIG_FSNOTIFY)		+= fsnotify.o notification.o group.o inode_mark.o \
-				   mark.o fdinfo.o
+obj-$(CONFIG_FSNOTIFY)		+= fsnotify.o notification.o group.o mark.o \
+				   fdinfo.o
 
 obj-y			+= dnotify/
 obj-y			+= inotify/
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index c4afb6a88268..01a9f0f007d4 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -41,6 +41,63 @@ void __fsnotify_vfsmount_delete(struct vfsmount *mnt)
 	fsnotify_clear_marks_by_mount(mnt);
 }
 
+/**
+ * fsnotify_unmount_inodes - an sb is unmounting.  handle any watched inodes.
+ * @sb: superblock being unmounted.
+ *
+ * Called during unmount with no locks held, so needs to be safe against
+ * concurrent modifiers. We temporarily drop sb->s_inode_list_lock and CAN block.
+ */
+void fsnotify_unmount_inodes(struct super_block *sb)
+{
+	struct inode *inode, *iput_inode = NULL;
+
+	spin_lock(&sb->s_inode_list_lock);
+	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
+		/*
+		 * We cannot __iget() an inode in state I_FREEING,
+		 * I_WILL_FREE, or I_NEW which is fine because by that point
+		 * the inode cannot have any associated watches.
+		 */
+		spin_lock(&inode->i_lock);
+		if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) {
+			spin_unlock(&inode->i_lock);
+			continue;
+		}
+
+		/*
+		 * If i_count is zero, the inode cannot have any watches and
+		 * doing an __iget/iput with MS_ACTIVE clear would actually
+		 * evict all inodes with zero i_count from icache which is
+		 * unnecessarily violent and may in fact be illegal to do.
+		 */
+		if (!atomic_read(&inode->i_count)) {
+			spin_unlock(&inode->i_lock);
+			continue;
+		}
+
+		__iget(inode);
+		spin_unlock(&inode->i_lock);
+		spin_unlock(&sb->s_inode_list_lock);
+
+		if (iput_inode)
+			iput(iput_inode);
+
+		/* for each watch, send FS_UNMOUNT and then remove it */
+		fsnotify(inode, FS_UNMOUNT, inode, FSNOTIFY_EVENT_INODE, NULL, 0);
+
+		fsnotify_inode_delete(inode);
+
+		iput_inode = inode;
+
+		spin_lock(&sb->s_inode_list_lock);
+	}
+	spin_unlock(&sb->s_inode_list_lock);
+
+	if (iput_inode)
+		iput(iput_inode);
+}
+
 /*
  * Given an inode, first check if we care what happens to our children.  Inotify
  * and dnotify both tell their parents about events.  If we care about any event
diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c
deleted file mode 100644
index 5cc317bad082..000000000000
--- a/fs/notify/inode_mark.c
+++ /dev/null
@@ -1,88 +0,0 @@
-/*
- *  Copyright (C) 2008 Red Hat, Inc., Eric Paris <eparis@redhat.com>
- *
- *  This program is free software; you can redistribute it and/or modify
- *  it under the terms of the GNU General Public License as published by
- *  the Free Software Foundation; either version 2, or (at your option)
- *  any later version.
- *
- *  This program is distributed in the hope that it will be useful,
- *  but WITHOUT ANY WARRANTY; without even the implied warranty of
- *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- *  GNU General Public License for more details.
- *
- *  You should have received a copy of the GNU General Public License
- *  along with this program; see the file COPYING.  If not, write to
- *  the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
- */
-
-#include <linux/fs.h>
-#include <linux/init.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/mutex.h>
-#include <linux/spinlock.h>
-
-#include <linux/atomic.h>
-
-#include <linux/fsnotify_backend.h>
-#include "fsnotify.h"
-
-#include "../internal.h"
-
-/**
- * fsnotify_unmount_inodes - an sb is unmounting.  handle any watched inodes.
- * @sb: superblock being unmounted.
- *
- * Called during unmount with no locks held, so needs to be safe against
- * concurrent modifiers. We temporarily drop sb->s_inode_list_lock and CAN block.
- */
-void fsnotify_unmount_inodes(struct super_block *sb)
-{
-	struct inode *inode, *iput_inode = NULL;
-
-	spin_lock(&sb->s_inode_list_lock);
-	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
-		/*
-		 * We cannot __iget() an inode in state I_FREEING,
-		 * I_WILL_FREE, or I_NEW which is fine because by that point
-		 * the inode cannot have any associated watches.
-		 */
-		spin_lock(&inode->i_lock);
-		if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) {
-			spin_unlock(&inode->i_lock);
-			continue;
-		}
-
-		/*
-		 * If i_count is zero, the inode cannot have any watches and
-		 * doing an __iget/iput with MS_ACTIVE clear would actually
-		 * evict all inodes with zero i_count from icache which is
-		 * unnecessarily violent and may in fact be illegal to do.
-		 */
-		if (!atomic_read(&inode->i_count)) {
-			spin_unlock(&inode->i_lock);
-			continue;
-		}
-
-		__iget(inode);
-		spin_unlock(&inode->i_lock);
-		spin_unlock(&sb->s_inode_list_lock);
-
-		if (iput_inode)
-			iput(iput_inode);
-
-		/* for each watch, send FS_UNMOUNT and then remove it */
-		fsnotify(inode, FS_UNMOUNT, inode, FSNOTIFY_EVENT_INODE, NULL, 0);
-
-		fsnotify_inode_delete(inode);
-
-		iput_inode = inode;
-
-		spin_lock(&sb->s_inode_list_lock);
-	}
-	spin_unlock(&sb->s_inode_list_lock);
-
-	if (iput_inode)
-		iput(iput_inode);
-}
-- 
2.10.2

* [PATCH 34/35] fsnotify: Add group pointer in fsnotify_init_mark()
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (32 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 33/35] fsnotify: Drop inode_mark.c Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  2017-04-03 15:34 ` [PATCH 35/35] fsnotify: Move ->free_mark callback to fsnotify_ops Jan Kara
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

Currently we initialize mark->group only in fsnotify_add_mark_locked().
However, we will need to access the fsnotify_ops of the corresponding group
from fsnotify_put_mark(), so mark->group has to be initialized earlier. Do
that in fsnotify_init_mark(). As a consequence, once fsnotify_init_mark()
has been called on a mark, the mark has to be destroyed by
fsnotify_put_mark().
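
The lifetime rule described above can be illustrated with a minimal userspace
sketch. It is not kernel code: the lc_* names are hypothetical stand-ins for
fsnotify_group and fsnotify_mark, plain integers replace the atomic counters,
and free() replaces the per-mark callback. It only models the point that an
initialized mark pins its group, so an error path has to drop the mark
reference rather than free the memory directly.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for fsnotify_group / fsnotify_mark. */
struct lc_group {
	int refcnt;
};

struct lc_mark {
	int refcnt;
	struct lc_group *group;
};

/* Model of the new rule: initialization already pins the group. */
static void lc_init_mark(struct lc_mark *mark, struct lc_group *group)
{
	mark->refcnt = 1;
	group->refcnt++;
	mark->group = group;
}

/* Dropping the last mark reference also drops the group reference. */
static void lc_put_mark(struct lc_mark *mark)
{
	if (--mark->refcnt > 0)
		return;
	mark->group->refcnt--;
	free(mark);
}

/*
 * Error path of a caller whose "add mark" step failed after init.
 * Returns the group refcount left behind.
 */
static int lc_init_then_fail(struct lc_group *group)
{
	struct lc_mark *mark = malloc(sizeof(*mark));

	if (!mark)
		return group->refcnt;
	lc_init_mark(mark, group);
	assert(mark->group == group);	/* the mark now holds a group reference */
	lc_put_mark(mark);		/* a plain free(mark) would leak that reference */
	return group->refcnt;
}
```

Starting from a group with one reference, lc_init_then_fail() leaves the count
at one again, which is presumably why the audit error paths in the diff below
switch from direct free helpers to fsnotify_put_mark().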

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/dnotify/dnotify.c        |  5 ++---
 fs/notify/fanotify/fanotify_user.c |  4 ++--
 fs/notify/inotify/inotify_user.c   |  5 ++---
 fs/notify/mark.c                   | 17 ++++++++++-------
 include/linux/fsnotify_backend.h   | 12 +++++++-----
 kernel/audit_fsnotify.c            |  7 ++++---
 kernel/audit_tree.c                | 15 ++++++++-------
 kernel/audit_watch.c               |  5 +++--
 8 files changed, 38 insertions(+), 32 deletions(-)

diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index b77d8d049e4d..f9d500fd7b9a 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -305,7 +305,7 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg)
 
 	/* set up the new_fsn_mark and new_dn_mark */
 	new_fsn_mark = &new_dn_mark->fsn_mark;
-	fsnotify_init_mark(new_fsn_mark, dnotify_free_mark);
+	fsnotify_init_mark(new_fsn_mark, dnotify_group, dnotify_free_mark);
 	new_fsn_mark->mask = mask;
 	new_dn_mark->dn = NULL;
 
@@ -318,8 +318,7 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg)
 		dn_mark = container_of(fsn_mark, struct dnotify_mark, fsn_mark);
 		spin_lock(&fsn_mark->lock);
 	} else {
-		fsnotify_add_mark_locked(new_fsn_mark, dnotify_group, inode,
-					 NULL, 0);
+		fsnotify_add_mark_locked(new_fsn_mark, inode, NULL, 0);
 		spin_lock(&new_fsn_mark->lock);
 		fsn_mark = new_fsn_mark;
 		dn_mark = new_dn_mark;
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 5a82bbb79f55..d5775f054be7 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -628,8 +628,8 @@ static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
 	if (!mark)
 		return ERR_PTR(-ENOMEM);
 
-	fsnotify_init_mark(mark, fanotify_free_mark);
-	ret = fsnotify_add_mark_locked(mark, group, inode, mnt, 0);
+	fsnotify_init_mark(mark, group, fanotify_free_mark);
+	ret = fsnotify_add_mark_locked(mark, inode, mnt, 0);
 	if (ret) {
 		fsnotify_put_mark(mark);
 		return ERR_PTR(ret);
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index a5e4411362f2..07febafd826e 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -558,7 +558,7 @@ static int inotify_new_watch(struct fsnotify_group *group,
 	if (unlikely(!tmp_i_mark))
 		return -ENOMEM;
 
-	fsnotify_init_mark(&tmp_i_mark->fsn_mark, inotify_free_mark);
+	fsnotify_init_mark(&tmp_i_mark->fsn_mark, group, inotify_free_mark);
 	tmp_i_mark->fsn_mark.mask = mask;
 	tmp_i_mark->wd = -1;
 
@@ -574,8 +574,7 @@ static int inotify_new_watch(struct fsnotify_group *group,
 	}
 
 	/* we are on the idr, now get on the inode */
-	ret = fsnotify_add_mark_locked(&tmp_i_mark->fsn_mark, group, inode,
-				       NULL, 0);
+	ret = fsnotify_add_mark_locked(&tmp_i_mark->fsn_mark, inode, NULL, 0);
 	if (ret) {
 		/* we failed to get on the inode, get off the idr */
 		inotify_remove_from_idr(group, tmp_i_mark);
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 9f3364ef19d3..2f743e2035e4 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -563,10 +563,10 @@ static int fsnotify_add_mark_list(struct fsnotify_mark *mark,
  * These marks may be used for the fsnotify backend to determine which
  * event types should be delivered to which group.
  */
-int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
-			     struct fsnotify_group *group, struct inode *inode,
+int fsnotify_add_mark_locked(struct fsnotify_mark *mark, struct inode *inode,
 			     struct vfsmount *mnt, int allow_dups)
 {
+	struct fsnotify_group *group = mark->group;
 	int ret = 0;
 
 	BUG_ON(inode && mnt);
@@ -582,8 +582,6 @@ int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
 	spin_lock(&mark->lock);
 	mark->flags |= FSNOTIFY_MARK_FLAG_ALIVE | FSNOTIFY_MARK_FLAG_ATTACHED;
 
-	fsnotify_get_group(group);
-	mark->group = group;
 	list_add(&mark->g_list, &group->marks_list);
 	atomic_inc(&group->num_marks);
 	fsnotify_get_mark(mark); /* for g_list */
@@ -608,12 +606,14 @@ int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
 	return ret;
 }
 
-int fsnotify_add_mark(struct fsnotify_mark *mark, struct fsnotify_group *group,
-		      struct inode *inode, struct vfsmount *mnt, int allow_dups)
+int fsnotify_add_mark(struct fsnotify_mark *mark, struct inode *inode,
+		      struct vfsmount *mnt, int allow_dups)
 {
 	int ret;
+	struct fsnotify_group *group = mark->group;
+
 	mutex_lock(&group->mark_mutex);
-	ret = fsnotify_add_mark_locked(mark, group, inode, mnt, allow_dups);
+	ret = fsnotify_add_mark_locked(mark, inode, mnt, allow_dups);
 	mutex_unlock(&group->mark_mutex);
 	return ret;
 }
@@ -732,12 +732,15 @@ void fsnotify_destroy_marks(struct fsnotify_mark_connector __rcu **connp)
  * Nothing fancy, just initialize lists and locks and counters.
  */
 void fsnotify_init_mark(struct fsnotify_mark *mark,
+			struct fsnotify_group *group,
 			void (*free_mark)(struct fsnotify_mark *mark))
 {
 	memset(mark, 0, sizeof(*mark));
 	spin_lock_init(&mark->lock);
 	atomic_set(&mark->refcnt, 1);
 	mark->free_mark = free_mark;
+	fsnotify_get_group(group);
+	mark->group = group;
 }
 
 /*
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 2ef0e04c5a9d..a64518e36bd5 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -340,15 +340,17 @@ extern struct fsnotify_event *fsnotify_remove_first_event(struct fsnotify_group
 
 /* Calculate mask of events for a list of marks */
 extern void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn);
-extern void fsnotify_init_mark(struct fsnotify_mark *mark, void (*free_mark)(struct fsnotify_mark *mark));
+extern void fsnotify_init_mark(struct fsnotify_mark *mark,
+			       struct fsnotify_group *group,
+			       void (*free_mark)(struct fsnotify_mark *mark));
 /* Find mark belonging to given group in the list of marks */
 extern struct fsnotify_mark *fsnotify_find_mark(
 				struct fsnotify_mark_connector __rcu **connp,
 				struct fsnotify_group *group);
-/* attach the mark to both the group and the inode */
-extern int fsnotify_add_mark(struct fsnotify_mark *mark, struct fsnotify_group *group,
-			     struct inode *inode, struct vfsmount *mnt, int allow_dups);
-extern int fsnotify_add_mark_locked(struct fsnotify_mark *mark, struct fsnotify_group *group,
+/* attach the mark to the inode or vfsmount */
+extern int fsnotify_add_mark(struct fsnotify_mark *mark, struct inode *inode,
+			     struct vfsmount *mnt, int allow_dups);
+extern int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
 				    struct inode *inode, struct vfsmount *mnt, int allow_dups);
 /* given a group and a mark, flag mark to be freed when all references are dropped */
 extern void fsnotify_destroy_mark(struct fsnotify_mark *mark,
diff --git a/kernel/audit_fsnotify.c b/kernel/audit_fsnotify.c
index e8b371ff1e91..2522ceaca758 100644
--- a/kernel/audit_fsnotify.c
+++ b/kernel/audit_fsnotify.c
@@ -103,15 +103,16 @@ struct audit_fsnotify_mark *audit_alloc_mark(struct audit_krule *krule, char *pa
 		goto out;
 	}
 
-	fsnotify_init_mark(&audit_mark->mark, audit_fsnotify_free_mark);
+	fsnotify_init_mark(&audit_mark->mark, audit_fsnotify_group,
+			   audit_fsnotify_free_mark);
 	audit_mark->mark.mask = AUDIT_FS_EVENTS;
 	audit_mark->path = pathname;
 	audit_update_mark(audit_mark, dentry->d_inode);
 	audit_mark->rule = krule;
 
-	ret = fsnotify_add_mark(&audit_mark->mark, audit_fsnotify_group, inode, NULL, true);
+	ret = fsnotify_add_mark(&audit_mark->mark, inode, NULL, true);
 	if (ret < 0) {
-		audit_fsnotify_mark_free(audit_mark);
+		fsnotify_put_mark(&audit_mark->mark);
 		audit_mark = ERR_PTR(ret);
 	}
 out:
diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index 3bc8e9320cf1..7b17c3d6a286 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -154,7 +154,8 @@ static struct audit_chunk *alloc_chunk(int count)
 		INIT_LIST_HEAD(&chunk->owners[i].list);
 		chunk->owners[i].index = i;
 	}
-	fsnotify_init_mark(&chunk->mark, audit_tree_destroy_watch);
+	fsnotify_init_mark(&chunk->mark, audit_tree_group,
+			   audit_tree_destroy_watch);
 	chunk->mark.mask = FS_IN_IGNORED;
 	return chunk;
 }
@@ -262,7 +263,7 @@ static void untag_chunk(struct node *p)
 		spin_unlock(&entry->lock);
 		mutex_unlock(&entry->group->mark_mutex);
 		if (new)
-			free_chunk(new);
+			fsnotify_put_mark(&new->mark);
 		goto out;
 	}
 
@@ -286,8 +287,8 @@ static void untag_chunk(struct node *p)
 	if (!new)
 		goto Fallback;
 
-	if (fsnotify_add_mark_locked(&new->mark, entry->group,
-				     entry->connector->inode, NULL, 1)) {
+	if (fsnotify_add_mark_locked(&new->mark, entry->connector->inode,
+				     NULL, 1)) {
 		fsnotify_put_mark(&new->mark);
 		goto Fallback;
 	}
@@ -352,7 +353,7 @@ static int create_chunk(struct inode *inode, struct audit_tree *tree)
 		return -ENOMEM;
 
 	entry = &chunk->mark;
-	if (fsnotify_add_mark(entry, audit_tree_group, inode, NULL, 0)) {
+	if (fsnotify_add_mark(entry, inode, NULL, 0)) {
 		fsnotify_put_mark(entry);
 		return -ENOSPC;
 	}
@@ -428,11 +429,11 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
 		spin_unlock(&old_entry->lock);
 		mutex_unlock(&old_entry->group->mark_mutex);
 		fsnotify_put_mark(old_entry);
-		free_chunk(chunk);
+		fsnotify_put_mark(&chunk->mark);
 		return -ENOENT;
 	}
 
-	if (fsnotify_add_mark_locked(chunk_entry, old_entry->group,
+	if (fsnotify_add_mark_locked(chunk_entry,
 			     old_entry->connector->inode, NULL, 1)) {
 		spin_unlock(&old_entry->lock);
 		mutex_unlock(&old_entry->group->mark_mutex);
diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c
index 956fa584c239..e32efed86828 100644
--- a/kernel/audit_watch.c
+++ b/kernel/audit_watch.c
@@ -157,9 +157,10 @@ static struct audit_parent *audit_init_parent(struct path *path)
 
 	INIT_LIST_HEAD(&parent->watches);
 
-	fsnotify_init_mark(&parent->mark, audit_watch_free_mark);
+	fsnotify_init_mark(&parent->mark, audit_watch_group,
+			   audit_watch_free_mark);
 	parent->mark.mask = AUDIT_FS_WATCH;
-	ret = fsnotify_add_mark(&parent->mark, audit_watch_group, inode, NULL, 0);
+	ret = fsnotify_add_mark(&parent->mark, inode, NULL, 0);
 	if (ret < 0) {
 		audit_free_parent(parent);
 		return ERR_PTR(ret);
-- 
2.10.2

* [PATCH 35/35] fsnotify: Move ->free_mark callback to fsnotify_ops
  2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
                   ` (33 preceding siblings ...)
  2017-04-03 15:34 ` [PATCH 34/35] fsnotify: Add group pointer in fsnotify_init_mark() Jan Kara
@ 2017-04-03 15:34 ` Jan Kara
  34 siblings, 0 replies; 43+ messages in thread
From: Jan Kara @ 2017-04-03 15:34 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Miklos Szeredi, Amir Goldstein, Paul Moore, Jan Kara

The pointer to the ->free_mark callback unnecessarily occupies one long in
each fsnotify_mark, although it is the same for all marks from one
notification group. Move the callback pointer to fsnotify_ops.
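
As a rough userspace sketch of the idea (again with hypothetical ops_* names,
not the kernel structures), the free callback can live in one shared ops table
per group, and the final destroy step looks it up through the mark's group.
The group pointer is read before the mark is freed, and the group reference is
dropped only after the callback has run, matching the order used in the
mark.c hunk below.

```c
#include <assert.h>
#include <stdlib.h>

struct ops_mark;

/* One table per group type instead of one pointer per mark. */
struct ops_table {
	void (*free_mark)(struct ops_mark *mark);
};

struct ops_group {
	int refcnt;
	const struct ops_table *ops;
};

struct ops_mark {
	struct ops_group *group;
};

static int ops_freed_count;	/* counts callback invocations for the demo */

static void ops_free_mark(struct ops_mark *mark)
{
	ops_freed_count++;
	free(mark);
}

static const struct ops_table ops_demo_table = {
	.free_mark = ops_free_mark,
};

/* Final destroy: call the group's callback, then drop the group reference. */
static void ops_final_mark_destroy(struct ops_mark *mark)
{
	struct ops_group *group = mark->group;	/* read before mark is freed */

	if (!group)
		return;
	group->ops->free_mark(mark);
	group->refcnt--;
}

/* Returns the group refcount after destroying one mark, or -1 on allocation failure. */
static int ops_demo(void)
{
	struct ops_group group = { .refcnt = 2, .ops = &ops_demo_table };
	struct ops_mark *mark = malloc(sizeof(*mark));

	if (!mark)
		return -1;
	mark->group = &group;
	ops_final_mark_destroy(mark);
	return group.refcnt;
}
```

In this model the mark structure shrinks by one function pointer, at the cost
of one extra indirection on the (rare) final free.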

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/notify/dnotify/dnotify.c          |  3 ++-
 fs/notify/fanotify/fanotify.c        |  6 ++++++
 fs/notify/fanotify/fanotify.h        |  1 +
 fs/notify/fanotify/fanotify_user.c   |  9 ++-------
 fs/notify/inotify/inotify.h          |  1 +
 fs/notify/inotify/inotify_fsnotify.c | 11 +++++++++++
 fs/notify/inotify/inotify_user.c     | 14 ++------------
 fs/notify/mark.c                     | 13 +++++++------
 include/linux/fsnotify_backend.h     |  6 +++---
 kernel/audit_fsnotify.c              |  4 ++--
 kernel/audit_tree.c                  |  4 ++--
 kernel/audit_watch.c                 |  4 ++--
 12 files changed, 41 insertions(+), 35 deletions(-)

diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index f9d500fd7b9a..2430a0415995 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -135,6 +135,7 @@ static void dnotify_free_mark(struct fsnotify_mark *fsn_mark)
 
 static struct fsnotify_ops dnotify_fsnotify_ops = {
 	.handle_event = dnotify_handle_event,
+	.free_mark = dnotify_free_mark,
 };
 
 /*
@@ -305,7 +306,7 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg)
 
 	/* set up the new_fsn_mark and new_dn_mark */
 	new_fsn_mark = &new_dn_mark->fsn_mark;
-	fsnotify_init_mark(new_fsn_mark, dnotify_group, dnotify_free_mark);
+	fsnotify_init_mark(new_fsn_mark, dnotify_group);
 	new_fsn_mark->mask = mask;
 	new_dn_mark->dn = NULL;
 
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 461c21ebebeb..2fa99aeaa095 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -262,8 +262,14 @@ static void fanotify_free_event(struct fsnotify_event *fsn_event)
 	kmem_cache_free(fanotify_event_cachep, event);
 }
 
+static void fanotify_free_mark(struct fsnotify_mark *fsn_mark)
+{
+	kmem_cache_free(fanotify_mark_cache, fsn_mark);
+}
+
 const struct fsnotify_ops fanotify_fsnotify_ops = {
 	.handle_event = fanotify_handle_event,
 	.free_group_priv = fanotify_free_group_priv,
 	.free_event = fanotify_free_event,
+	.free_mark = fanotify_free_mark,
 };
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 4500a74f8d38..4eb6f5efa282 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -2,6 +2,7 @@
 #include <linux/path.h>
 #include <linux/slab.h>
 
+extern struct kmem_cache *fanotify_mark_cache;
 extern struct kmem_cache *fanotify_event_cachep;
 extern struct kmem_cache *fanotify_perm_event_cachep;
 
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index d5775f054be7..bf306d4f72f7 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -41,7 +41,7 @@
 
 extern const struct fsnotify_ops fanotify_fsnotify_ops;
 
-static struct kmem_cache *fanotify_mark_cache __read_mostly;
+struct kmem_cache *fanotify_mark_cache __read_mostly;
 struct kmem_cache *fanotify_event_cachep __read_mostly;
 struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
 
@@ -445,11 +445,6 @@ static const struct file_operations fanotify_fops = {
 	.llseek		= noop_llseek,
 };
 
-static void fanotify_free_mark(struct fsnotify_mark *fsn_mark)
-{
-	kmem_cache_free(fanotify_mark_cache, fsn_mark);
-}
-
 static int fanotify_find_path(int dfd, const char __user *filename,
 			      struct path *path, unsigned int flags)
 {
@@ -628,7 +623,7 @@ static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
 	if (!mark)
 		return ERR_PTR(-ENOMEM);
 
-	fsnotify_init_mark(mark, group, fanotify_free_mark);
+	fsnotify_init_mark(mark, group);
 	ret = fsnotify_add_mark_locked(mark, inode, mnt, 0);
 	if (ret) {
 		fsnotify_put_mark(mark);
diff --git a/fs/notify/inotify/inotify.h b/fs/notify/inotify/inotify.h
index 7a966f456269..9ff67b61da8a 100644
--- a/fs/notify/inotify/inotify.h
+++ b/fs/notify/inotify/inotify.h
@@ -31,6 +31,7 @@ extern int inotify_handle_event(struct fsnotify_group *group,
 				struct fsnotify_iter_info *iter_info);
 
 extern const struct fsnotify_ops inotify_fsnotify_ops;
+extern struct kmem_cache *inotify_inode_mark_cachep;
 
 #ifdef CONFIG_INOTIFY_USER
 static inline void dec_inotify_instances(struct ucounts *ucounts)
diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index ccd6a4055e0c..8b73332735ba 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -176,9 +176,20 @@ static void inotify_free_event(struct fsnotify_event *fsn_event)
 	kfree(INOTIFY_E(fsn_event));
 }
 
+/* ding dong the mark is dead */
+static void inotify_free_mark(struct fsnotify_mark *fsn_mark)
+{
+	struct inotify_inode_mark *i_mark;
+
+	i_mark = container_of(fsn_mark, struct inotify_inode_mark, fsn_mark);
+
+	kmem_cache_free(inotify_inode_mark_cachep, i_mark);
+}
+
 const struct fsnotify_ops inotify_fsnotify_ops = {
 	.handle_event = inotify_handle_event,
 	.free_group_priv = inotify_free_group_priv,
 	.free_event = inotify_free_event,
 	.freeing_mark = inotify_freeing_mark,
+	.free_mark = inotify_free_mark,
 };
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index 07febafd826e..7cc7d3fb1862 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -47,7 +47,7 @@
 /* configurable via /proc/sys/fs/inotify/ */
 static int inotify_max_queued_events __read_mostly;
 
-static struct kmem_cache *inotify_inode_mark_cachep __read_mostly;
+struct kmem_cache *inotify_inode_mark_cachep __read_mostly;
 
 #ifdef CONFIG_SYSCTL
 
@@ -483,16 +483,6 @@ void inotify_ignored_and_remove_idr(struct fsnotify_mark *fsn_mark,
 	dec_inotify_watches(group->inotify_data.ucounts);
 }
 
-/* ding dong the mark is dead */
-static void inotify_free_mark(struct fsnotify_mark *fsn_mark)
-{
-	struct inotify_inode_mark *i_mark;
-
-	i_mark = container_of(fsn_mark, struct inotify_inode_mark, fsn_mark);
-
-	kmem_cache_free(inotify_inode_mark_cachep, i_mark);
-}
-
 static int inotify_update_existing_watch(struct fsnotify_group *group,
 					 struct inode *inode,
 					 u32 arg)
@@ -558,7 +548,7 @@ static int inotify_new_watch(struct fsnotify_group *group,
 	if (unlikely(!tmp_i_mark))
 		return -ENOMEM;
 
-	fsnotify_init_mark(&tmp_i_mark->fsn_mark, group, inotify_free_mark);
+	fsnotify_init_mark(&tmp_i_mark->fsn_mark, group);
 	tmp_i_mark->fsn_mark.mask = mask;
 	tmp_i_mark->wd = -1;
 
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 2f743e2035e4..55955ded338d 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -195,9 +195,12 @@ static struct inode *fsnotify_detach_connector_from_object(
 
 static void fsnotify_final_mark_destroy(struct fsnotify_mark *mark)
 {
-	if (mark->group)
-		fsnotify_put_group(mark->group);
-	mark->free_mark(mark);
+	struct fsnotify_group *group = mark->group;
+
+	if (WARN_ON_ONCE(!group))
+		return;
+	group->ops->free_mark(mark);
+	fsnotify_put_group(group);
 }
 
 void fsnotify_put_mark(struct fsnotify_mark *mark)
@@ -732,13 +735,11 @@ void fsnotify_destroy_marks(struct fsnotify_mark_connector __rcu **connp)
  * Nothing fancy, just initialize lists and locks and counters.
  */
 void fsnotify_init_mark(struct fsnotify_mark *mark,
-			struct fsnotify_group *group,
-			void (*free_mark)(struct fsnotify_mark *mark))
+			struct fsnotify_group *group)
 {
 	memset(mark, 0, sizeof(*mark));
 	spin_lock_init(&mark->lock);
 	atomic_set(&mark->refcnt, 1);
-	mark->free_mark = free_mark;
 	fsnotify_get_group(group);
 	mark->group = group;
 }
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index a64518e36bd5..c6c69318752b 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -104,6 +104,8 @@ struct fsnotify_ops {
 	void (*free_group_priv)(struct fsnotify_group *group);
 	void (*freeing_mark)(struct fsnotify_mark *mark, struct fsnotify_group *group);
 	void (*free_event)(struct fsnotify_event *event);
+	/* called on final put+free to free memory */
+	void (*free_mark)(struct fsnotify_mark *mark);
 };
 
 /*
@@ -261,7 +263,6 @@ struct fsnotify_mark {
 #define FSNOTIFY_MARK_FLAG_ALIVE		0x02
 #define FSNOTIFY_MARK_FLAG_ATTACHED		0x04
 	unsigned int flags;		/* flags [mark->lock] */
-	void (*free_mark)(struct fsnotify_mark *mark); /* called on final put+free */
 };
 
 #ifdef CONFIG_FSNOTIFY
@@ -341,8 +342,7 @@ extern struct fsnotify_event *fsnotify_remove_first_event(struct fsnotify_group
 /* Calculate mask of events for a list of marks */
 extern void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn);
 extern void fsnotify_init_mark(struct fsnotify_mark *mark,
-			       struct fsnotify_group *group,
-			       void (*free_mark)(struct fsnotify_mark *mark));
+			       struct fsnotify_group *group);
 /* Find mark belonging to given group in the list of marks */
 extern struct fsnotify_mark *fsnotify_find_mark(
 				struct fsnotify_mark_connector __rcu **connp,
diff --git a/kernel/audit_fsnotify.c b/kernel/audit_fsnotify.c
index 2522ceaca758..4aad0a467fed 100644
--- a/kernel/audit_fsnotify.c
+++ b/kernel/audit_fsnotify.c
@@ -103,8 +103,7 @@ struct audit_fsnotify_mark *audit_alloc_mark(struct audit_krule *krule, char *pa
 		goto out;
 	}
 
-	fsnotify_init_mark(&audit_mark->mark, audit_fsnotify_group,
-			   audit_fsnotify_free_mark);
+	fsnotify_init_mark(&audit_mark->mark, audit_fsnotify_group);
 	audit_mark->mark.mask = AUDIT_FS_EVENTS;
 	audit_mark->path = pathname;
 	audit_update_mark(audit_mark, dentry->d_inode);
@@ -203,6 +202,7 @@ static int audit_mark_handle_event(struct fsnotify_group *group,
 
 static const struct fsnotify_ops audit_mark_fsnotify_ops = {
 	.handle_event =	audit_mark_handle_event,
+	.free_mark = audit_fsnotify_free_mark,
 };
 
 static int __init audit_fsnotify_init(void)
diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index 7b17c3d6a286..ed006ffb1cba 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -154,8 +154,7 @@ static struct audit_chunk *alloc_chunk(int count)
 		INIT_LIST_HEAD(&chunk->owners[i].list);
 		chunk->owners[i].index = i;
 	}
-	fsnotify_init_mark(&chunk->mark, audit_tree_group,
-			   audit_tree_destroy_watch);
+	fsnotify_init_mark(&chunk->mark, audit_tree_group);
 	chunk->mark.mask = FS_IN_IGNORED;
 	return chunk;
 }
@@ -1013,6 +1012,7 @@ static void audit_tree_freeing_mark(struct fsnotify_mark *entry, struct fsnotify
 static const struct fsnotify_ops audit_tree_ops = {
 	.handle_event = audit_tree_handle_event,
 	.freeing_mark = audit_tree_freeing_mark,
+	.free_mark = audit_tree_destroy_watch,
 };
 
 static int __init audit_tree_init(void)
diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c
index e32efed86828..13d30a8dfc56 100644
--- a/kernel/audit_watch.c
+++ b/kernel/audit_watch.c
@@ -157,8 +157,7 @@ static struct audit_parent *audit_init_parent(struct path *path)
 
 	INIT_LIST_HEAD(&parent->watches);
 
-	fsnotify_init_mark(&parent->mark, audit_watch_group,
-			   audit_watch_free_mark);
+	fsnotify_init_mark(&parent->mark, audit_watch_group);
 	parent->mark.mask = AUDIT_FS_WATCH;
 	ret = fsnotify_add_mark(&parent->mark, inode, NULL, 0);
 	if (ret < 0) {
@@ -508,6 +507,7 @@ static int audit_watch_handle_event(struct fsnotify_group *group,
 
 static const struct fsnotify_ops audit_watch_fsnotify_ops = {
 	.handle_event = 	audit_watch_handle_event,
+	.free_mark =		audit_watch_free_mark,
 };
 
 static int __init audit_watch_init(void)
-- 
2.10.2

* Re: [PATCH 04/35] audit: Abstract hash key handling
  2017-04-03 15:33 ` [PATCH 04/35] audit: Abstract hash key handling Jan Kara
@ 2017-04-04 20:38   ` Paul Moore
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Moore @ 2017-04-04 20:38 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Miklos Szeredi, Amir Goldstein

On Mon, Apr 3, 2017 at 11:33 AM, Jan Kara <jack@suse.cz> wrote:
> Audit tree currently uses inode pointer as a key into the hash table.
> Getting that from notification mark will be somewhat more difficult with
> coming fsnotify changes. So abstract getting of hash key from the audit
> chunk and inode so that we can change the method to obtain a key easily.
>
> Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
> CC: Paul Moore <paul@paul-moore.com>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  kernel/audit_tree.c | 39 ++++++++++++++++++++++++++++-----------
>  1 file changed, 28 insertions(+), 11 deletions(-)

Thanks Jan.

Acked-by: Paul Moore <paul@paul-moore.com>

> diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> index 7b44195da81b..11c7ac441624 100644
> --- a/kernel/audit_tree.c
> +++ b/kernel/audit_tree.c
> @@ -163,33 +163,48 @@ enum {HASH_SIZE = 128};
>  static struct list_head chunk_hash_heads[HASH_SIZE];
>  static __cacheline_aligned_in_smp DEFINE_SPINLOCK(hash_lock);
>
> -static inline struct list_head *chunk_hash(const struct inode *inode)
> +/* Function to return search key in our hash from inode. */
> +static unsigned long inode_to_key(const struct inode *inode)
>  {
> -       unsigned long n = (unsigned long)inode / L1_CACHE_BYTES;
> +       return (unsigned long)inode;
> +}
> +
> +/*
> + * Function to return search key in our hash from chunk. Key 0 is special and
> + * should never be present in the hash.
> + */
> +static unsigned long chunk_to_key(struct audit_chunk *chunk)
> +{
> +       return (unsigned long)chunk->mark.inode;
> +}
> +
> +static inline struct list_head *chunk_hash(unsigned long key)
> +{
> +       unsigned long n = key / L1_CACHE_BYTES;
>         return chunk_hash_heads + n % HASH_SIZE;
>  }
>
>  /* hash_lock & entry->lock is held by caller */
>  static void insert_hash(struct audit_chunk *chunk)
>  {
> -       struct fsnotify_mark *entry = &chunk->mark;
> +       unsigned long key = chunk_to_key(chunk);
>         struct list_head *list;
>
> -       if (!entry->inode)
> +       if (!key)
>                 return;
> -       list = chunk_hash(entry->inode);
> +       list = chunk_hash(key);
>         list_add_rcu(&chunk->hash, list);
>  }
>
>  /* called under rcu_read_lock */
>  struct audit_chunk *audit_tree_lookup(const struct inode *inode)
>  {
> -       struct list_head *list = chunk_hash(inode);
> +       unsigned long key = inode_to_key(inode);
> +       struct list_head *list = chunk_hash(key);
>         struct audit_chunk *p;
>
>         list_for_each_entry_rcu(p, list, hash) {
> -               /* mark.inode may have gone NULL, but who cares? */
> -               if (p->mark.inode == inode) {
> +               if (chunk_to_key(p) == key) {
>                         atomic_long_inc(&p->refs);
>                         return p;
>                 }
> @@ -588,7 +603,8 @@ int audit_remove_tree_rule(struct audit_krule *rule)
>
>  static int compare_root(struct vfsmount *mnt, void *arg)
>  {
> -       return d_backing_inode(mnt->mnt_root) == arg;
> +       return inode_to_key(d_backing_inode(mnt->mnt_root)) ==
> +              (unsigned long)arg;
>  }
>
>  void audit_trim_trees(void)
> @@ -623,9 +639,10 @@ void audit_trim_trees(void)
>                 list_for_each_entry(node, &tree->chunks, list) {
>                         struct audit_chunk *chunk = find_chunk(node);
>                         /* this could be NULL if the watch is dying else where... */
> -                       struct inode *inode = chunk->mark.inode;
>                         node->index |= 1U<<31;
> -                       if (iterate_mounts(compare_root, inode, root_mnt))
> +                       if (iterate_mounts(compare_root,
> +                                          (void *)chunk_to_key(chunk),
> +                                          root_mnt))
>                                 node->index &= ~(1U<<31);
>                 }
>                 spin_unlock(&hash_lock);
> --
> 2.10.2
>



-- 
paul moore
www.paul-moore.com

* Re: [PATCH 05/35] audit_tree: Use mark flags to check whether mark is alive
  2017-04-03 15:33 ` [PATCH 05/35] audit_tree: Use mark flags to check whether mark is alive Jan Kara
@ 2017-04-04 20:47   ` Paul Moore
  2017-04-05  7:38     ` Jan Kara
  0 siblings, 1 reply; 43+ messages in thread
From: Paul Moore @ 2017-04-04 20:47 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Miklos Szeredi, Amir Goldstein

On Mon, Apr 3, 2017 at 11:33 AM, Jan Kara <jack@suse.cz> wrote:
> Currently audit code uses checking of mark->inode to verify whether mark
> is still alive. Switch that to checking mark flags as that is more
> logical and current way will become unreliable in future.
>
> Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  kernel/audit_tree.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Should audit_tree.c:insert_hash() also be updated in a similar manner?

> diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> index 11c7ac441624..f12bd40fb8f1 100644
> --- a/kernel/audit_tree.c
> +++ b/kernel/audit_tree.c
> @@ -248,7 +248,7 @@ static void untag_chunk(struct node *p)
>
>         mutex_lock(&entry->group->mark_mutex);
>         spin_lock(&entry->lock);
> -       if (chunk->dead || !entry->inode) {
> +       if (chunk->dead || !(entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
>                 spin_unlock(&entry->lock);
>                 mutex_unlock(&entry->group->mark_mutex);
>                 if (new)
> @@ -408,7 +408,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
>
>         mutex_lock(&old_entry->group->mark_mutex);
>         spin_lock(&old_entry->lock);
> -       if (!old_entry->inode) {
> +       if (!(old_entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
>                 /* old_entry is being shot, lets just lie */
>                 spin_unlock(&old_entry->lock);
>                 mutex_unlock(&old_entry->group->mark_mutex);
> --
> 2.10.2
>

-- 
paul moore
www.paul-moore.com

* Re: [PATCH 05/35] audit_tree: Use mark flags to check whether mark is alive
  2017-04-04 20:47   ` Paul Moore
@ 2017-04-05  7:38     ` Jan Kara
  2017-04-06 11:51       ` Paul Moore
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Kara @ 2017-04-05  7:38 UTC (permalink / raw)
  To: Paul Moore; +Cc: Jan Kara, linux-fsdevel, Miklos Szeredi, Amir Goldstein

On Tue 04-04-17 16:47:11, Paul Moore wrote:
> On Mon, Apr 3, 2017 at 11:33 AM, Jan Kara <jack@suse.cz> wrote:
> > Currently audit code uses checking of mark->inode to verify whether mark
> > is still alive. Switch that to checking mark flags as that is more
> > logical and current way will become unreliable in future.
> >
> > Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  kernel/audit_tree.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Should audit_tree.c:insert_hash() also be updated in a similar manner?

Do you mean the part which has become chunk_to_key()? The code there is
correct as is, since we just use the value of the inode pointer for hashing
but never dereference it. Also, some of the callers of chunk_to_key() don't
hold locks that would prevent chunk->mark.connector->inode from becoming NULL
just after we checked the FSNOTIFY_MARK_FLAG_ATTACHED flag, so in practice
nothing would change. So I've decided to leave that code as is. That being
said, I don't have a strong opinion about it, so if you prefer checking the
flag, I can do it that way.
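
To illustrate what "use the value of the pointer for hashing but never
dereference it" means, here is a small userspace sketch modelled on
chunk_hash() from patch 04. The key_* names are hypothetical, and the value 64
is only an assumed stand-in for L1_CACHE_BYTES.

```c
#include <assert.h>
#include <stdint.h>

enum { KEY_HASH_SIZE = 128 };	/* HASH_SIZE in the quoted patch */
enum { KEY_CACHE_BYTES = 64 };	/* assumed stand-in for L1_CACHE_BYTES */

/* Turn a pointer into an opaque key; the pointer is never dereferenced. */
static unsigned long key_from_pointer(const void *ptr)
{
	return (unsigned long)(uintptr_t)ptr;
}

/* Map a key to a bucket index the same way chunk_hash() picks a list head. */
static unsigned long key_to_bucket(unsigned long key)
{
	unsigned long n = key / KEY_CACHE_BYTES;

	return n % KEY_HASH_SIZE;
}
```

Because only the numeric value is used, a stale pointer can at worst select a
bucket in which the lookup finds nothing; it cannot cause a bad memory access.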

								Honza

> 
> > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> > index 11c7ac441624..f12bd40fb8f1 100644
> > --- a/kernel/audit_tree.c
> > +++ b/kernel/audit_tree.c
> > @@ -248,7 +248,7 @@ static void untag_chunk(struct node *p)
> >
> >         mutex_lock(&entry->group->mark_mutex);
> >         spin_lock(&entry->lock);
> > -       if (chunk->dead || !entry->inode) {
> > +       if (chunk->dead || !(entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
> >                 spin_unlock(&entry->lock);
> >                 mutex_unlock(&entry->group->mark_mutex);
> >                 if (new)
> > @@ -408,7 +408,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
> >
> >         mutex_lock(&old_entry->group->mark_mutex);
> >         spin_lock(&old_entry->lock);
> > -       if (!old_entry->inode) {
> > +       if (!(old_entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
> >                 /* old_entry is being shot, lets just lie */
> >                 spin_unlock(&old_entry->lock);
> >                 mutex_unlock(&old_entry->group->mark_mutex);
> > --
> > 2.10.2
> >
> 
> -- 
> paul moore
> www.paul-moore.com
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH 05/35] audit_tree: Use mark flags to check whether mark is alive
  2017-04-05  7:38     ` Jan Kara
@ 2017-04-06 11:51       ` Paul Moore
  2017-04-10 15:31         ` Jan Kara
  0 siblings, 1 reply; 43+ messages in thread
From: Paul Moore @ 2017-04-06 11:51 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Miklos Szeredi, Amir Goldstein

On Wed, Apr 5, 2017 at 3:38 AM, Jan Kara <jack@suse.cz> wrote:
> On Tue 04-04-17 16:47:11, Paul Moore wrote:
>> On Mon, Apr 3, 2017 at 11:33 AM, Jan Kara <jack@suse.cz> wrote:
>> > Currently audit code uses checking of mark->inode to verify whether mark
>> > is still alive. Switch that to checking mark flags as that is more
>> > logical and current way will become unreliable in future.
>> >
>> > Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
>> > Signed-off-by: Jan Kara <jack@suse.cz>
>> > ---
>> >  kernel/audit_tree.c | 4 ++--
>> >  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> Should audit_tree.c:insert_hash() also be updated in a similar manner?
>
> Do you mean the part which has become chunk_to_key()? ...

No, I was talking about the if conditional near the top of the
function that checks to see if the fsnotify_mark's inode is non-NULL;
it seems like you would also want to convert that to a
FSNOTIFY_MARK_FLAG_ATTACHED check, yes?

* Re: [PATCH 05/35] audit_tree: Use mark flags to check whether mark is alive
  2017-04-06 11:51       ` Paul Moore
@ 2017-04-10 15:31         ` Jan Kara
  2017-04-10 15:43           ` Jan Kara
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Kara @ 2017-04-10 15:31 UTC (permalink / raw)
  To: Paul Moore; +Cc: Jan Kara, linux-fsdevel, Miklos Szeredi, Amir Goldstein

On Thu 06-04-17 07:51:48, Paul Moore wrote:
> On Wed, Apr 5, 2017 at 3:38 AM, Jan Kara <jack@suse.cz> wrote:
> > On Tue 04-04-17 16:47:11, Paul Moore wrote:
> >> On Mon, Apr 3, 2017 at 11:33 AM, Jan Kara <jack@suse.cz> wrote:
> >> > Currently audit code uses checking of mark->inode to verify whether mark
> >> > is still alive. Switch that to checking mark flags as that is more
> >> > logical and current way will become unreliable in future.
> >> >
> >> > Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
> >> > Signed-off-by: Jan Kara <jack@suse.cz>
> >> > ---
> >> >  kernel/audit_tree.c | 4 ++--
> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> Should audit_tree.c:insert_hash() also be updated in a similar manner?
> >
> > Do you mean the part which has become chunk_to_key()? ...
> 
> No, I was talking about the if conditional near the top of the
> function that checks to see if the fsnotify_mark's inode is non-NULL;
> it seems like you would also want to convert that to a
> FSNOTIFY_MARK_FLAG_ATTACHED, yes?

Ah, that one. Yes, that can certainly become a FSNOTIFY_MARK_FLAG_ATTACHED
check, although I cannot currently come up with a situation where it would
matter. But it looks safer that way, so I'll change that check as you
suggest.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH 05/35] audit_tree: Use mark flags to check whether mark is alive
  2017-04-10 15:31         ` Jan Kara
@ 2017-04-10 15:43           ` Jan Kara
  2017-04-11 20:27             ` Paul Moore
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Kara @ 2017-04-10 15:43 UTC (permalink / raw)
  To: Paul Moore; +Cc: Jan Kara, linux-fsdevel, Miklos Szeredi, Amir Goldstein

[-- Attachment #1: Type: text/plain, Size: 1455 bytes --]

On Mon 10-04-17 17:31:26, Jan Kara wrote:
> On Thu 06-04-17 07:51:48, Paul Moore wrote:
> > On Wed, Apr 5, 2017 at 3:38 AM, Jan Kara <jack@suse.cz> wrote:
> > > On Tue 04-04-17 16:47:11, Paul Moore wrote:
> > >> On Mon, Apr 3, 2017 at 11:33 AM, Jan Kara <jack@suse.cz> wrote:
> > >> > Currently audit code uses checking of mark->inode to verify whether mark
> > >> > is still alive. Switch that to checking mark flags as that is more
> > >> > logical and current way will become unreliable in future.
> > >> >
> > >> > Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
> > >> > Signed-off-by: Jan Kara <jack@suse.cz>
> > >> > ---
> > >> >  kernel/audit_tree.c | 4 ++--
> > >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >>
> > >> Should audit_tree.c:insert_hash() also be updated in a similar manner?
> > >
> > > Do you mean the part which has become chunk_to_key()? ...
> > 
> > No, I was talking about the if conditional near the top of the
> > function that checks to see if the fsnotify_mark's inode is non-NULL;
> > it seems like you would also want to convert that to a
> > FSNOTIFY_MARK_FLAG_ATTACHED, yes?
> 
> Ah, that one. Yes, that can certainly become a FSNOTIFY_MARK_FLAG_ATTACHED
> check although I cannot currently come up with a situation where it would
> matter. But it looks safer that way so I'll change that check as you
> suggest.

Attached is the resulting patch.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-audit_tree-Use-mark-flags-to-check-whether-mark-is-a.patch --]
[-- Type: text/x-patch, Size: 1738 bytes --]

From 43471d15df0e7c40ca4df1513fc1dcf5765396ac Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Mon, 3 Apr 2017 16:47:58 +0200
Subject: [PATCH] audit_tree: Use mark flags to check whether mark is alive

Currently audit code uses checking of mark->inode to verify whether mark
is still alive. Switch that to checking mark flags as that is more
logical and current way will become unreliable in future.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/audit_tree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index 11c7ac441624..51451245936a 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -190,7 +190,7 @@ static void insert_hash(struct audit_chunk *chunk)
 	unsigned long key = chunk_to_key(chunk);
 	struct list_head *list;
 
-	if (!key)
+	if (!(chunk->mark.flags & FSNOTIFY_MARK_FLAG_ATTACHED))
 		return;
 	list = chunk_hash(key);
 	list_add_rcu(&chunk->hash, list);
@@ -248,7 +248,7 @@ static void untag_chunk(struct node *p)
 
 	mutex_lock(&entry->group->mark_mutex);
 	spin_lock(&entry->lock);
-	if (chunk->dead || !entry->inode) {
+	if (chunk->dead || !(entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
 		spin_unlock(&entry->lock);
 		mutex_unlock(&entry->group->mark_mutex);
 		if (new)
@@ -408,7 +408,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
 
 	mutex_lock(&old_entry->group->mark_mutex);
 	spin_lock(&old_entry->lock);
-	if (!old_entry->inode) {
+	if (!(old_entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) {
 		/* old_entry is being shot, lets just lie */
 		spin_unlock(&old_entry->lock);
 		mutex_unlock(&old_entry->group->mark_mutex);
-- 
2.12.0


* Re: [PATCH 05/35] audit_tree: Use mark flags to check whether mark is alive
  2017-04-10 15:43           ` Jan Kara
@ 2017-04-11 20:27             ` Paul Moore
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Moore @ 2017-04-11 20:27 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Miklos Szeredi, Amir Goldstein

On Mon, Apr 10, 2017 at 11:43 AM, Jan Kara <jack@suse.cz> wrote:
> On Mon 10-04-17 17:31:26, Jan Kara wrote:
>> On Thu 06-04-17 07:51:48, Paul Moore wrote:
>> > On Wed, Apr 5, 2017 at 3:38 AM, Jan Kara <jack@suse.cz> wrote:
>> > > On Tue 04-04-17 16:47:11, Paul Moore wrote:
>> > >> On Mon, Apr 3, 2017 at 11:33 AM, Jan Kara <jack@suse.cz> wrote:
>> > >> > Currently audit code uses checking of mark->inode to verify whether mark
>> > >> > is still alive. Switch that to checking mark flags as that is more
>> > >> > logical and current way will become unreliable in future.
>> > >> >
>> > >> > Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
>> > >> > Signed-off-by: Jan Kara <jack@suse.cz>
>> > >> > ---
>> > >> >  kernel/audit_tree.c | 4 ++--
>> > >> >  1 file changed, 2 insertions(+), 2 deletions(-)
>> > >>
>> > >> Should audit_tree.c:insert_hash() also be updated in a similar manner?
>> > >
>> > > Do you mean the part which has become chunk_to_key()? ...
>> >
>> > No, I was talking about the if conditional near the top of the
>> > function that checks to see if the fsnotify_mark's inode is non-NULL;
>> > it seems like you would also want to convert that to a
>> > FSNOTIFY_MARK_FLAG_ATTACHED, yes?
>>
>> Ah, that one. Yes, that can certainly become a FSNOTIFY_MARK_FLAG_ATTACHED
>> check although I cannot currently come up with a situation where it would
>> matter. But it looks safer that way so I'll change that check as you
>> suggest.
>
> Attached is the resulting patch.

<grumpy old maintainer voice>
Please post patches inline, it is so much easier for me to review that way.
</grumpy old maintainer voice>

Acked-by: Paul Moore <paul@paul-moore.com>

-- 
paul moore
www.paul-moore.com

end of thread, other threads:[~2017-04-11 20:27 UTC | newest]

Thread overview: 43+ messages
2017-04-03 15:33 [PATCH 0/35 v7] fsnotify: Avoid SRCU stalls with fanotify permission events Jan Kara
2017-04-03 15:33 ` [PATCH 01/35] fsnotify: Remove unnecessary tests when showing fdinfo Jan Kara
2017-04-03 15:33 ` [PATCH 02/35] inotify: Remove inode pointers from debug messages Jan Kara
2017-04-03 15:33 ` [PATCH 03/35] fanotify: Move recalculation of inode / vfsmount mask under mark_mutex Jan Kara
2017-04-03 15:33 ` [PATCH 04/35] audit: Abstract hash key handling Jan Kara
2017-04-04 20:38   ` Paul Moore
2017-04-03 15:33 ` [PATCH 05/35] audit_tree: Use mark flags to check whether mark is alive Jan Kara
2017-04-04 20:47   ` Paul Moore
2017-04-05  7:38     ` Jan Kara
2017-04-06 11:51       ` Paul Moore
2017-04-10 15:31         ` Jan Kara
2017-04-10 15:43           ` Jan Kara
2017-04-11 20:27             ` Paul Moore
2017-04-03 15:33 ` [PATCH 06/35] fsnotify: Update comments Jan Kara
2017-04-03 15:33 ` [PATCH 07/35] fsnotify: Move mark list head from object into dedicated structure Jan Kara
2017-04-03 15:33 ` [PATCH 08/35] fsnotify: Move object pointer to fsnotify_mark_connector Jan Kara
2017-04-03 15:33 ` [PATCH 09/35] fsnotify: Make fsnotify_mark_connector hold inode reference Jan Kara
2017-04-03 15:33 ` [PATCH 10/35] fsnotify: Remove indirection from mark list addition Jan Kara
2017-04-03 15:34 ` [PATCH 11/35] fsnotify: Move fsnotify_destroy_marks() Jan Kara
2017-04-03 15:34 ` [PATCH 12/35] fsnotify: Move locking into fsnotify_recalc_mask() Jan Kara
2017-04-03 15:34 ` [PATCH 13/35] fsnotify: Move locking into fsnotify_find_mark() Jan Kara
2017-04-03 15:34 ` [PATCH 14/35] fsnotify: Determine lock in fsnotify_destroy_marks() Jan Kara
2017-04-03 15:34 ` [PATCH 15/35] fsnotify: Remove indirection from fsnotify_detach_mark() Jan Kara
2017-04-03 15:34 ` [PATCH 16/35] fsnotify: Avoid double locking in fsnotify_detach_from_object() Jan Kara
2017-04-03 15:34 ` [PATCH 17/35] fsnotify: Remove useless list deletion and comment Jan Kara
2017-04-03 15:34 ` [PATCH 18/35] fsnotify: Lock object list with connector lock Jan Kara
2017-04-03 15:34 ` [PATCH 19/35] fsnotify: Free fsnotify_mark_connector when there is no mark attached Jan Kara
2017-04-03 15:34 ` [PATCH 20/35] inotify: Do not drop mark reference under idr_lock Jan Kara
2017-04-03 15:34 ` [PATCH 21/35] fsnotify: Move queueing of mark for destruction into fsnotify_put_mark() Jan Kara
2017-04-03 15:34 ` [PATCH 22/35] fsnotify: Detach mark from object list when last reference is dropped Jan Kara
2017-04-03 15:34 ` [PATCH 23/35] fsnotify: Remove special handling of mark destruction on group shutdown Jan Kara
2017-04-03 15:34 ` [PATCH 24/35] fsnotify: Provide framework for dropping SRCU lock in ->handle_event Jan Kara
2017-04-03 15:34 ` [PATCH 25/35] fsnotify: Pass fsnotify_iter_info into handle_event handler Jan Kara
2017-04-03 15:34 ` [PATCH 26/35] fanotify: Release SRCU lock when waiting for userspace response Jan Kara
2017-04-03 15:34 ` [PATCH 27/35] fsnotify: Remove fsnotify_set_mark_{,ignored_}mask_locked() Jan Kara
2017-04-03 15:34 ` [PATCH 28/35] fsnotify: Remove fsnotify_recalc_{inode|vfsmount}_mask() Jan Kara
2017-04-03 15:34 ` [PATCH 29/35] fsnotify: Inline fsnotify_clear_{inode|vfsmount}_mark_group() Jan Kara
2017-04-03 15:34 ` [PATCH 30/35] fsnotify: Rename fsnotify_clear_marks_by_group_flags() Jan Kara
2017-04-03 15:34 ` [PATCH 31/35] fsnotify: Remove fsnotify_detach_group_marks() Jan Kara
2017-04-03 15:34 ` [PATCH 32/35] fsnotify: Remove fsnotify_find_{inode|vfsmount}_mark() Jan Kara
2017-04-03 15:34 ` [PATCH 33/35] fsnotify: Drop inode_mark.c Jan Kara
2017-04-03 15:34 ` [PATCH 34/35] fsnotify: Add group pointer in fsnotify_init_mark() Jan Kara
2017-04-03 15:34 ` [PATCH 35/35] fsnotify: Move ->free_mark callback to fsnotify_ops Jan Kara
