All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH REPOST] fs/namespace: Boost the mount_lock.lock owner instead of spinning on PREEMPT_RT.
@ 2021-11-25 12:07 Sebastian Andrzej Siewior
  2021-11-26 13:24 ` Christian Brauner
  0 siblings, 1 reply; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-11-25 12:07 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Christian Brauner, Alexander Viro, Peter Zijlstra, John Ogness,
	Thomas Gleixner

The MNT_WRITE_HOLD flag is used to hold back any new writers while the
mount point is about to be made read-only. __mnt_want_write() then loops
with disabled preemption until this flag disappears. Callers of
mnt_hold_writers() (which sets the flag) hold the spinlock_t of
mount_lock (seqlock_t) which disables preemption on !PREEMPT_RT and
ensures the task is not scheduled away so that the spinning side spins
for a long time.

On PREEMPT_RT the spinlock_t does not disable preemption and so it is
possible that the task setting MNT_WRITE_HOLD is preempted by task with
higher priority which then spins infinitely waiting for MNT_WRITE_HOLD
to get removed.

Acquire mount_lock::lock which is held by setter of MNT_WRITE_HOLD. This
will PI-boost the owner and wait until the lock is dropped and which
means that MNT_WRITE_HOLD is cleared again.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20211025152218.opvcqfku2lhqvp4o@linutronix.de
---
 fs/namespace.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 659a8f39c61af..3ab45b47b2860 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -343,8 +343,24 @@ int __mnt_want_write(struct vfsmount *m)
 	 * incremented count after it has set MNT_WRITE_HOLD.
 	 */
 	smp_mb();
-	while (READ_ONCE(mnt->mnt.mnt_flags) & MNT_WRITE_HOLD)
-		cpu_relax();
+	might_lock(&mount_lock.lock);
+	while (READ_ONCE(mnt->mnt.mnt_flags) & MNT_WRITE_HOLD) {
+		if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
+			cpu_relax();
+		} else {
+			/*
+			 * This prevents priority inversion, if the task
+			 * setting MNT_WRITE_HOLD got preempted on a remote
+			 * CPU, and it prevents life lock if the task setting
+			 * MNT_WRITE_HOLD has a lower priority and is bound to
+			 * the same CPU as the task that is spinning here.
+			 */
+			preempt_enable();
+			lock_mount_hash();
+			unlock_mount_hash();
+			preempt_disable();
+		}
+	}
 	/*
 	 * After the slowpath clears MNT_WRITE_HOLD, mnt_is_readonly will
 	 * be set to match its requirements. So we must not load that until
-- 
2.34.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-02-08 14:08 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-25 12:07 [PATCH REPOST] fs/namespace: Boost the mount_lock.lock owner instead of spinning on PREEMPT_RT Sebastian Andrzej Siewior
2021-11-26 13:24 ` Christian Brauner
2021-11-26 14:04   ` Sebastian Andrzej Siewior
2022-01-25 14:40   ` Sebastian Andrzej Siewior
2022-01-25 15:49     ` Christian Brauner
2022-01-25 16:46       ` Sebastian Andrzej Siewior
2022-02-08 14:07       ` Sebastian Andrzej Siewior

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.