linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stephen Brennan <stephen.s.brennan@oracle.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Al Viro <viro@zeniv.linux.org.uk>,
	Stephen Brennan <stephen.s.brennan@oracle.com>,
	Amir Goldstein <amir73il@gmail.com>
Subject: [PATCH v4 5/5] fsnotify: require inode lock held during child flag update
Date: Fri, 11 Nov 2022 14:06:14 -0800	[thread overview]
Message-ID: <20221111220614.991928-6-stephen.s.brennan@oracle.com> (raw)
In-Reply-To: <20221111220614.991928-1-stephen.s.brennan@oracle.com>

With the prior changes to fsnotify, it is now possible for
fsnotify_recalc_mask() to return before all children flags have been
set. Imagine that two CPUs attempt to add a mark to an inode which would
require watching the children of that directory:

CPU 1:                                 CPU 2:

fsnotify_recalc_mask() {
  spin_lock();
  update_children = ...
  __fsnotify_recalc_mask();
  update_children = ...
  spin_unlock();
  // update_children is true!
  fsnotify_conn_set_children_dentry_flags() {
    // updating flags ...
    cond_resched();

                                       fsnotify_recalc_mask() {
                                         spin_lock();
                                         update_children = ...
                                         __fsnotify_recalc_mask();
                                         update_children = ...
                                         spin_unlock();
                                         // update_children is false
                                       }
                                       // returns to userspace, but
                                       // not all children are marked
    // continue updating flags
   }
}

To prevent this situation, hold the directory inode lock. This ensures
that any concurrent update to the mask will block until the update is
complete, so that we can guarantee that child flags are set prior to
returning.

Since the directory inode lock is now held during iteration over
d_subdirs, we are guaranteed that __d_move() cannot remove the dentry we
hold, so we no longer need check whether we should retry iteration. We
also are guaranteed that no cursors are moving through the list, since
simple_readdir() holds the inode read lock. Simplify the iteration by
removing this logic.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
---
 fs/notify/fsnotify.c | 25 +++++++++----------------
 fs/notify/mark.c     |  8 ++++++++
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 0ba61211456c..b5778775b88d 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -102,6 +102,8 @@ void fsnotify_sb_delete(struct super_block *sb)
  * on a child we run all of our children and set a dentry flag saying that the
  * parent cares.  Thus when an event happens on a child it can quickly tell
  * if there is a need to find a parent and send the event to the parent.
+ *
+ * Context: inode locked exclusive
  */
 void fsnotify_update_children_dentry_flags(struct inode *inode, bool watched)
 {
@@ -124,22 +126,16 @@ void fsnotify_update_children_dentry_flags(struct inode *inode, bool watched)
 	 * over d_subdirs which will allow us to sleep.
 	 */
 	spin_lock(&alias->d_lock);
-retry:
 	list_for_each_entry(child, &alias->d_subdirs, d_child) {
 		/*
-		 * We need to hold a reference while we sleep. But we cannot
-		 * sleep holding a reference to a cursor, or we risk skipping
-		 * through the list.
-		 *
-		 * When we wake, dput() could free the dentry, invalidating the
-		 * list pointers.  We can't look at the list pointers until we
-		 * re-lock the parent, and we can't dput() once we have the
-		 * parent locked.  So the solution is to hold onto our reference
-		 * and free it the *next* time we drop alias->d_lock: either at
-		 * the end of the function, or at the time of the next sleep.
+		 * We need to hold a reference while we sleep. When we wake,
+		 * dput() could free the dentry, invalidating the list pointers.
+		 * We can't look at the list pointers until we re-lock the
+		 * parent, and we can't dput() once we have the parent locked.
+		 * So the solution is to hold onto our reference and free it the
+		 * *next* time we drop alias->d_lock: either at the end of the
+		 * function, or at the time of the next sleep.
 		 */
-		if (child->d_flags & DCACHE_DENTRY_CURSOR)
-			continue;
 		if (need_resched()) {
 			dget(child);
 			spin_unlock(&alias->d_lock);
@@ -147,9 +143,6 @@ void fsnotify_update_children_dentry_flags(struct inode *inode, bool watched)
 			last_ref = child;
 			cond_resched();
 			spin_lock(&alias->d_lock);
-			/* Check for races with __d_move() */
-			if (child->d_parent != alias)
-				goto retry;
 		}
 
 		/*
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 6797a2952f87..f39cd88ad778 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -203,10 +203,15 @@ static void fsnotify_conn_set_children_dentry_flags(
 void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 {
 	bool update_children;
+	struct inode *inode = NULL;
 
 	if (!conn)
 		return;
 
+	if (conn->type == FSNOTIFY_OBJ_TYPE_INODE) {
+		inode = fsnotify_conn_inode(conn);
+		inode_lock(inode);
+	}
 	spin_lock(&conn->lock);
 	update_children = !fsnotify_conn_watches_children(conn);
 	__fsnotify_recalc_mask(conn);
@@ -219,6 +224,9 @@ void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 	 */
 	if (update_children)
 		fsnotify_conn_set_children_dentry_flags(conn);
+
+	if (inode)
+		inode_unlock(inode);
 }
 
 /* Free all connectors queued for freeing once SRCU period ends */
-- 
2.34.1


  parent reply	other threads:[~2022-11-11 22:07 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-13 22:27 [RFC] fsnotify: allow sleepable child dentry flag update Stephen Brennan
2022-10-13 23:51 ` Al Viro
2022-11-01 21:47   ` Stephen Brennan
2022-10-14  8:01 ` Amir Goldstein
2022-10-17  7:59   ` Stephen Brennan
2022-10-17 11:44     ` Amir Goldstein
2022-10-17 16:59       ` Stephen Brennan
2022-10-17 17:42         ` Amir Goldstein
2022-10-17  9:09   ` Jan Kara
2022-10-18  4:12 ` [PATCH 0/2] fsnotify: fix softlockups iterating over d_subdirs Stephen Brennan
2022-10-18  4:12   ` [PATCH 1/2] fsnotify: Protect i_fsnotify_mask and child flags with inode rwsem Stephen Brennan
2022-10-18  7:39     ` Amir Goldstein
2022-10-21  0:33       ` Stephen Brennan
2022-10-21  7:22         ` Amir Goldstein
2022-10-18  4:12   ` [PATCH 2/2] fsnotify: allow sleepable child flag update Stephen Brennan
2022-10-18  5:36     ` Amir Goldstein
2022-10-27  7:50     ` kernel test robot
2022-10-27  8:44       ` Yujie Liu
2022-10-27 22:12         ` Stephen Brennan
2022-10-18  8:07   ` [PATCH 0/2] fsnotify: fix softlockups iterating over d_subdirs Amir Goldstein
2022-10-18 23:52     ` Stephen Brennan
2022-10-19  5:33       ` Amir Goldstein
2022-10-27 22:06         ` Stephen Brennan
2022-10-28  8:58           ` Amir Goldstein
2022-10-21  1:03   ` [PATCH v2 0/3] " Stephen Brennan
2022-10-21  1:03     ` [PATCH v2 1/3] fsnotify: Use d_find_any_alias to get dentry associated with inode Stephen Brennan
2022-10-21  9:25       ` Amir Goldstein
2022-10-21  1:03     ` [PATCH v2 2/3] fsnotify: Protect i_fsnotify_mask and child flags with inode rwsem Stephen Brennan
2022-10-21  4:01       ` kernel test robot
2022-10-21  8:22       ` Amir Goldstein
2022-10-21  9:18         ` Amir Goldstein
2022-10-25 18:02           ` Stephen Brennan
2022-10-26  5:41             ` Amir Goldstein
2022-10-21  9:17       ` Christian Brauner
2022-10-21  9:21         ` Amir Goldstein
2022-10-21  1:03     ` [PATCH v2 3/3] fsnotify: allow sleepable child flag update Stephen Brennan
2022-10-28  0:10     ` [PATCH v3 0/3] fsnotify: fix softlockups iterating over d_subdirs Stephen Brennan
2022-10-28  0:10       ` [PATCH v3 1/3] fsnotify: Use d_find_any_alias to get dentry associated with inode Stephen Brennan
2022-11-10  1:12         ` Stephen Brennan
2022-10-28  0:10       ` [PATCH v3 2/3] fsnotify: Protect i_fsnotify_mask and child flags with inode rwsem Stephen Brennan
2022-10-28  9:11         ` Amir Goldstein
2022-11-10  0:03         ` kernel test robot
2022-11-10  1:06           ` Stephen Brennan
2022-10-28  0:10       ` [PATCH v3 3/3] fsnotify: allow sleepable child flag update Stephen Brennan
2022-10-28  9:32         ` Amir Goldstein
2022-11-01 21:25           ` Stephen Brennan
2022-11-01 17:51       ` [PATCH v3 0/3] fsnotify: fix softlockups iterating over d_subdirs Jan Kara
2022-11-01 20:48         ` Stephen Brennan
2022-11-02  8:55           ` Amir Goldstein
2022-11-10 20:04             ` Stephen Brennan
2022-11-02 17:52           ` Jan Kara
2022-11-04 23:33             ` Stephen Brennan
2022-11-07 11:56               ` Jan Kara
2022-11-11 22:06       ` [PATCH v4 0/5] " Stephen Brennan
2022-11-11 22:06         ` [PATCH v4 1/5] fsnotify: clear PARENT_WATCHED flags lazily Stephen Brennan
2022-11-11 22:06         ` [PATCH v4 2/5] fsnotify: Use d_find_any_alias to get dentry associated with inode Stephen Brennan
2022-11-12  8:53           ` Amir Goldstein
2022-11-11 22:06         ` [PATCH v4 3/5] dnotify: move fsnotify_recalc_mask() outside spinlock Stephen Brennan
2022-11-12  9:06           ` Amir Goldstein
2022-11-11 22:06         ` [PATCH v4 4/5] fsnotify: allow sleepable child flag update Stephen Brennan
2022-11-12 10:00           ` Amir Goldstein
2022-11-15  7:10           ` kernel test robot
2022-11-11 22:06         ` Stephen Brennan [this message]
2022-11-12  9:42           ` [PATCH v4 5/5] fsnotify: require inode lock held during " Amir Goldstein
2022-11-11 22:08         ` [PATCH v4 0/5] fsnotify: fix softlockups iterating over d_subdirs Stephen Brennan
2022-11-22 11:50         ` Jan Kara
2022-11-22 14:03           ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221111220614.991928-6-stephen.s.brennan@oracle.com \
    --to=stephen.s.brennan@oracle.com \
    --cc=amir73il@gmail.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).