linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/3] vfs: remove lockdep fs freeze weirdness
@ 2020-11-09 18:16 Darrick J. Wong
  2020-11-09 18:16 ` [PATCH 1/3] vfs: remove lockdep bogosity in __sb_start_write Darrick J. Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Darrick J. Wong @ 2020-11-09 18:16 UTC (permalink / raw)
  To: darrick.wong
  Cc: Christoph Hellwig, linux-xfs, david, hch, fdmanana, linux-fsdevel

Hi all,

A long time ago, XFS got tangled up with lockdep because the last thing
it does during a fs freeze is use a transaction to flush the superblock
to disk.  Transactions require int(ernal) write freeze protection, which
implies a recursive grab of the intwrite lock and makes lockdep all sad.

This was "solved" in lockdep back in 2012 as commit 5accdf82ba25c by
adding a "convert XFS' blocking fsfreeze lock attempts into a trylock"
behavior in v6 of a patch[1] that Jan Kara sent to try to fix fs freeze
handling.  The behavior was not in the v5[0] patch, nor was there any
discussion for any of the v5 patches that would suggest why things
changed from v5 to v6.

Commit f4b554af9931 in 2015 created the current weird logic in
__sb_start_write, which converts recursive freeze lock grabs into
trylocks whose return values are ignored(!!!).  XFS solved the problem
by creating a variant of transactions (XFS_TRANS_NO_WRITECOUNT) that
don't grab intwrite freeze protection, thus making lockdep's solution
unnecessary.  The commit claims that Dave Chinner explained that the
trylock hack + comment could be removed, but the patch author never did
that, and lore doesn't seem to know where or when Dave actually said
that?

Now it's 2020, and still nobody removed this from __sb_start_write.
Worse yet, nowadays lock_is_held returns 1 if lockdep is built-in but
offline.  This causes attempts to grab the pagefaults freeze lock
synchronously to turn into unchecked trylocks!  Hilarity ensues if a
page fault races with fsfreeze and loses, which causes us to break the
locking model.

This finally came to a head in 5.10-rc1 because the new lockdep bugs
introduced during the merge window caused this maintainer to hit the
weird case where sb_start_pagefault can return without having taken the
freeze lock, leading to test failures and memory corruption.

Since this insanity is dangerous and hasn't been needed by xfs since the
late 2.6(???) days, kill it with fire.

v2: refactor helpers to be more cohesive
v3: move more cohesive helpers to header files

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=remove-freeze-weirdness-5.11
---
 fs/aio.c           |    2 +-
 fs/io_uring.c      |    3 +--
 fs/super.c         |   49 -------------------------------------------------
 include/linux/fs.h |   38 +++++++++++++++++++++++++++-----------
 4 files changed, 29 insertions(+), 63 deletions(-)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/3] vfs: remove lockdep bogosity in __sb_start_write
  2020-11-09 18:16 [PATCH v3 0/3] vfs: remove lockdep fs freeze weirdness Darrick J. Wong
@ 2020-11-09 18:16 ` Darrick J. Wong
  2020-11-10 11:33   ` Jan Kara
  2020-11-09 18:16 ` [PATCH 2/3] vfs: separate __sb_start_write into blocking and non-blocking helpers Darrick J. Wong
  2020-11-09 18:17 ` [PATCH 3/3] vfs: move __sb_{start,end}_write* to fs.h Darrick J. Wong
  2 siblings, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2020-11-09 18:16 UTC (permalink / raw)
  To: darrick.wong
  Cc: Christoph Hellwig, linux-xfs, david, hch, fdmanana, linux-fsdevel

From: Darrick J. Wong <darrick.wong@oracle.com>

__sb_start_write has some weird looking lockdep code that claims to
exist to handle nested freeze locking requests from xfs.  The code as
written seems broken -- if we think we hold a read lock on any of the
higher freeze levels (e.g. we hold SB_FREEZE_WRITE and are trying to
lock SB_FREEZE_PAGEFAULT), it converts a blocking lock attempt into a
trylock.

However, it's not correct to downgrade a blocking lock attempt to a
trylock unless the downgrading code or the callers are prepared to deal
with that situation.  Neither __sb_start_write nor its callers handle
this at all.  For example:

sb_start_pagefault ignores the return value completely, with the result
that if xfs_filemap_fault loses a race with a different thread trying to
fsfreeze, it will proceed without pagefault freeze protection (thereby
breaking locking rules) and then unlocks the pagefault freeze lock that
it doesn't own on its way out (thereby corrupting the lock state), which
leads to a system hang shortly afterwards.

Normally, this won't happen because our ownership of a read lock on a
higher freeze protection level blocks fsfreeze from grabbing a write
lock on that higher level.  *However*, if lockdep is offline,
lock_is_held_type unconditionally returns 1, which means that
percpu_rwsem_is_held returns 1, which means that __sb_start_write
unconditionally converts blocking freeze lock attempts into trylocks,
even when we *don't* hold anything that would block a fsfreeze.

Apparently this all held together until 5.10-rc1, when bugs in lockdep
caused lockdep to shut itself off early in an fstests run, and once
fstests gets to the "race writes with freezer" tests, kaboom.  This
might explain the long trail of vanishingly infrequent livelocks in
fstests after lockdep goes offline that I've never been able to
diagnose.

We could fix it by spinning on the trylock if wait==true, but AFAICT the
locking works fine if lockdep is not built at all (and I didn't see any
complaints running fstests overnight), so remove this snippet entirely.

NOTE: Commit f4b554af9931 in 2015 created the current weird logic (which
used to exist in a different form in commit 5accdf82ba25c from 2012) in
__sb_start_write.  XFS solved this whole problem in the late 2.6 era by
creating a variant of transactions (XFS_TRANS_NO_WRITECOUNT) that don't
grab intwrite freeze protection, thus making lockdep's solution
unnecessary.  The commit claims that Dave Chinner explained that the
trylock hack + comment could be removed, but nobody ever did.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/super.c |   33 ++++-----------------------------
 1 file changed, 4 insertions(+), 29 deletions(-)


diff --git a/fs/super.c b/fs/super.c
index a51c2083cd6b..e1fd667454d4 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1647,36 +1647,11 @@ EXPORT_SYMBOL(__sb_end_write);
  */
 int __sb_start_write(struct super_block *sb, int level, bool wait)
 {
-	bool force_trylock = false;
-	int ret = 1;
+	if (!wait)
+		return percpu_down_read_trylock(sb->s_writers.rw_sem + level-1);
 
-#ifdef CONFIG_LOCKDEP
-	/*
-	 * We want lockdep to tell us about possible deadlocks with freezing
-	 * but it's it bit tricky to properly instrument it. Getting a freeze
-	 * protection works as getting a read lock but there are subtle
-	 * problems. XFS for example gets freeze protection on internal level
-	 * twice in some cases, which is OK only because we already hold a
-	 * freeze protection also on higher level. Due to these cases we have
-	 * to use wait == F (trylock mode) which must not fail.
-	 */
-	if (wait) {
-		int i;
-
-		for (i = 0; i < level - 1; i++)
-			if (percpu_rwsem_is_held(sb->s_writers.rw_sem + i)) {
-				force_trylock = true;
-				break;
-			}
-	}
-#endif
-	if (wait && !force_trylock)
-		percpu_down_read(sb->s_writers.rw_sem + level-1);
-	else
-		ret = percpu_down_read_trylock(sb->s_writers.rw_sem + level-1);
-
-	WARN_ON(force_trylock && !ret);
-	return ret;
+	percpu_down_read(sb->s_writers.rw_sem + level-1);
+	return 1;
 }
 EXPORT_SYMBOL(__sb_start_write);
 


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/3] vfs: separate __sb_start_write into blocking and non-blocking helpers
  2020-11-09 18:16 [PATCH v3 0/3] vfs: remove lockdep fs freeze weirdness Darrick J. Wong
  2020-11-09 18:16 ` [PATCH 1/3] vfs: remove lockdep bogosity in __sb_start_write Darrick J. Wong
@ 2020-11-09 18:16 ` Darrick J. Wong
  2020-11-10 11:35   ` Jan Kara
  2020-11-10 18:32   ` Christoph Hellwig
  2020-11-09 18:17 ` [PATCH 3/3] vfs: move __sb_{start,end}_write* to fs.h Darrick J. Wong
  2 siblings, 2 replies; 9+ messages in thread
From: Darrick J. Wong @ 2020-11-09 18:16 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs, david, hch, fdmanana, linux-fsdevel

From: Darrick J. Wong <darrick.wong@oracle.com>

Break this function into two helpers so that it's obvious that the
trylock versions return a value that must be checked, and the blocking
versions don't require that.  While we're at it, clean up the return
type mismatch.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/aio.c           |    2 +-
 fs/io_uring.c      |    3 +--
 fs/super.c         |   18 ++++++++++++------
 include/linux/fs.h |   21 +++++++++++----------
 4 files changed, 25 insertions(+), 19 deletions(-)


diff --git a/fs/aio.c b/fs/aio.c
index c45c20d87538..6a21d8919409 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1572,7 +1572,7 @@ static int aio_write(struct kiocb *req, const struct iocb *iocb,
 		 * we return to userspace.
 		 */
 		if (S_ISREG(file_inode(file)->i_mode)) {
-			__sb_start_write(file_inode(file)->i_sb, SB_FREEZE_WRITE, true);
+			sb_start_write(file_inode(file)->i_sb);
 			__sb_writers_release(file_inode(file)->i_sb, SB_FREEZE_WRITE);
 		}
 		req->ki_flags |= IOCB_WRITE;
diff --git a/fs/io_uring.c b/fs/io_uring.c
index b42dfa0243bf..4cbaddfe3d80 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3532,8 +3532,7 @@ static int io_write(struct io_kiocb *req, bool force_nonblock,
 	 * we return to userspace.
 	 */
 	if (req->flags & REQ_F_ISREG) {
-		__sb_start_write(file_inode(req->file)->i_sb,
-					SB_FREEZE_WRITE, true);
+		sb_start_write(file_inode(req->file)->i_sb);
 		__sb_writers_release(file_inode(req->file)->i_sb,
 					SB_FREEZE_WRITE);
 	}
diff --git a/fs/super.c b/fs/super.c
index e1fd667454d4..59aa59279133 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1645,16 +1645,22 @@ EXPORT_SYMBOL(__sb_end_write);
  * This is an internal function, please use sb_start_{write,pagefault,intwrite}
  * instead.
  */
-int __sb_start_write(struct super_block *sb, int level, bool wait)
+void __sb_start_write(struct super_block *sb, int level)
 {
-	if (!wait)
-		return percpu_down_read_trylock(sb->s_writers.rw_sem + level-1);
-
-	percpu_down_read(sb->s_writers.rw_sem + level-1);
-	return 1;
+	percpu_down_read(sb->s_writers.rw_sem + level - 1);
 }
 EXPORT_SYMBOL(__sb_start_write);
 
+/*
+ * This is an internal function, please use sb_start_{write,pagefault,intwrite}
+ * instead.
+ */
+bool __sb_start_write_trylock(struct super_block *sb, int level)
+{
+	return percpu_down_read_trylock(sb->s_writers.rw_sem + level - 1);
+}
+EXPORT_SYMBOL_GPL(__sb_start_write_trylock);
+
 /**
  * sb_wait_write - wait until all writers to given file system finish
  * @sb: the super for which we wait
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0bd126418bb6..305989afd49c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1581,7 +1581,8 @@ extern struct timespec64 current_time(struct inode *inode);
  */
 
 void __sb_end_write(struct super_block *sb, int level);
-int __sb_start_write(struct super_block *sb, int level, bool wait);
+void __sb_start_write(struct super_block *sb, int level);
+bool __sb_start_write_trylock(struct super_block *sb, int level);
 
 #define __sb_writers_acquired(sb, lev)	\
 	percpu_rwsem_acquire(&(sb)->s_writers.rw_sem[(lev)-1], 1, _THIS_IP_)
@@ -1645,12 +1646,12 @@ static inline void sb_end_intwrite(struct super_block *sb)
  */
 static inline void sb_start_write(struct super_block *sb)
 {
-	__sb_start_write(sb, SB_FREEZE_WRITE, true);
+	__sb_start_write(sb, SB_FREEZE_WRITE);
 }
 
-static inline int sb_start_write_trylock(struct super_block *sb)
+static inline bool sb_start_write_trylock(struct super_block *sb)
 {
-	return __sb_start_write(sb, SB_FREEZE_WRITE, false);
+	return __sb_start_write_trylock(sb, SB_FREEZE_WRITE);
 }
 
 /**
@@ -1674,7 +1675,7 @@ static inline int sb_start_write_trylock(struct super_block *sb)
  */
 static inline void sb_start_pagefault(struct super_block *sb)
 {
-	__sb_start_write(sb, SB_FREEZE_PAGEFAULT, true);
+	__sb_start_write(sb, SB_FREEZE_PAGEFAULT);
 }
 
 /*
@@ -1692,12 +1693,12 @@ static inline void sb_start_pagefault(struct super_block *sb)
  */
 static inline void sb_start_intwrite(struct super_block *sb)
 {
-	__sb_start_write(sb, SB_FREEZE_FS, true);
+	__sb_start_write(sb, SB_FREEZE_FS);
 }
 
-static inline int sb_start_intwrite_trylock(struct super_block *sb)
+static inline bool sb_start_intwrite_trylock(struct super_block *sb)
 {
-	return __sb_start_write(sb, SB_FREEZE_FS, false);
+	return __sb_start_write_trylock(sb, SB_FREEZE_FS);
 }
 
 
@@ -2756,14 +2757,14 @@ static inline void file_start_write(struct file *file)
 {
 	if (!S_ISREG(file_inode(file)->i_mode))
 		return;
-	__sb_start_write(file_inode(file)->i_sb, SB_FREEZE_WRITE, true);
+	sb_start_write(file_inode(file)->i_sb);
 }
 
 static inline bool file_start_write_trylock(struct file *file)
 {
 	if (!S_ISREG(file_inode(file)->i_mode))
 		return true;
-	return __sb_start_write(file_inode(file)->i_sb, SB_FREEZE_WRITE, false);
+	return sb_start_write_trylock(file_inode(file)->i_sb);
 }
 
 static inline void file_end_write(struct file *file)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/3] vfs: move __sb_{start,end}_write* to fs.h
  2020-11-09 18:16 [PATCH v3 0/3] vfs: remove lockdep fs freeze weirdness Darrick J. Wong
  2020-11-09 18:16 ` [PATCH 1/3] vfs: remove lockdep bogosity in __sb_start_write Darrick J. Wong
  2020-11-09 18:16 ` [PATCH 2/3] vfs: separate __sb_start_write into blocking and non-blocking helpers Darrick J. Wong
@ 2020-11-09 18:17 ` Darrick J. Wong
  2020-11-10 11:36   ` Jan Kara
  2020-11-10 18:32   ` Christoph Hellwig
  2 siblings, 2 replies; 9+ messages in thread
From: Darrick J. Wong @ 2020-11-09 18:17 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs, david, hch, fdmanana, linux-fsdevel

From: Darrick J. Wong <darrick.wong@oracle.com>

Now that we've straightened out the callers, move these three functions
to fs.h since they're fairly trivial.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/super.c         |   30 ------------------------------
 include/linux/fs.h |   21 ++++++++++++++++++---
 2 files changed, 18 insertions(+), 33 deletions(-)


diff --git a/fs/super.c b/fs/super.c
index 59aa59279133..98bb0629ee10 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1631,36 +1631,6 @@ int super_setup_bdi(struct super_block *sb)
 }
 EXPORT_SYMBOL(super_setup_bdi);
 
-/*
- * This is an internal function, please use sb_end_{write,pagefault,intwrite}
- * instead.
- */
-void __sb_end_write(struct super_block *sb, int level)
-{
-	percpu_up_read(sb->s_writers.rw_sem + level-1);
-}
-EXPORT_SYMBOL(__sb_end_write);
-
-/*
- * This is an internal function, please use sb_start_{write,pagefault,intwrite}
- * instead.
- */
-void __sb_start_write(struct super_block *sb, int level)
-{
-	percpu_down_read(sb->s_writers.rw_sem + level - 1);
-}
-EXPORT_SYMBOL(__sb_start_write);
-
-/*
- * This is an internal function, please use sb_start_{write,pagefault,intwrite}
- * instead.
- */
-bool __sb_start_write_trylock(struct super_block *sb, int level)
-{
-	return percpu_down_read_trylock(sb->s_writers.rw_sem + level - 1);
-}
-EXPORT_SYMBOL_GPL(__sb_start_write_trylock);
-
 /**
  * sb_wait_write - wait until all writers to given file system finish
  * @sb: the super for which we wait
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 305989afd49c..6dabd019cab0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1580,9 +1580,24 @@ extern struct timespec64 current_time(struct inode *inode);
  * Snapshotting support.
  */
 
-void __sb_end_write(struct super_block *sb, int level);
-void __sb_start_write(struct super_block *sb, int level);
-bool __sb_start_write_trylock(struct super_block *sb, int level);
+/*
+ * These are internal functions, please use sb_start_{write,pagefault,intwrite}
+ * instead.
+ */
+static inline void __sb_end_write(struct super_block *sb, int level)
+{
+	percpu_up_read(sb->s_writers.rw_sem + level-1);
+}
+
+static inline void __sb_start_write(struct super_block *sb, int level)
+{
+	percpu_down_read(sb->s_writers.rw_sem + level - 1);
+}
+
+static inline bool __sb_start_write_trylock(struct super_block *sb, int level)
+{
+	return percpu_down_read_trylock(sb->s_writers.rw_sem + level - 1);
+}
 
 #define __sb_writers_acquired(sb, lev)	\
 	percpu_rwsem_acquire(&(sb)->s_writers.rw_sem[(lev)-1], 1, _THIS_IP_)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] vfs: remove lockdep bogosity in __sb_start_write
  2020-11-09 18:16 ` [PATCH 1/3] vfs: remove lockdep bogosity in __sb_start_write Darrick J. Wong
@ 2020-11-10 11:33   ` Jan Kara
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Kara @ 2020-11-10 11:33 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, linux-xfs, david, fdmanana, linux-fsdevel

On Mon 09-11-20 10:16:50, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> __sb_start_write has some weird looking lockdep code that claims to
> exist to handle nested freeze locking requests from xfs.  The code as
> written seems broken -- if we think we hold a read lock on any of the
> higher freeze levels (e.g. we hold SB_FREEZE_WRITE and are trying to
> lock SB_FREEZE_PAGEFAULT), it converts a blocking lock attempt into a
> trylock.
> 
> However, it's not correct to downgrade a blocking lock attempt to a
> trylock unless the downgrading code or the callers are prepared to deal
> with that situation.  Neither __sb_start_write nor its callers handle
> this at all.  For example:
> 
> sb_start_pagefault ignores the return value completely, with the result
> that if xfs_filemap_fault loses a race with a different thread trying to
> fsfreeze, it will proceed without pagefault freeze protection (thereby
> breaking locking rules) and then unlocks the pagefault freeze lock that
> it doesn't own on its way out (thereby corrupting the lock state), which
> leads to a system hang shortly afterwards.
> 
> Normally, this won't happen because our ownership of a read lock on a
> higher freeze protection level blocks fsfreeze from grabbing a write
> lock on that higher level.  *However*, if lockdep is offline,
> lock_is_held_type unconditionally returns 1, which means that
> percpu_rwsem_is_held returns 1, which means that __sb_start_write
> unconditionally converts blocking freeze lock attempts into trylocks,
> even when we *don't* hold anything that would block a fsfreeze.
> 
> Apparently this all held together until 5.10-rc1, when bugs in lockdep
> caused lockdep to shut itself off early in an fstests run, and once
> fstests gets to the "race writes with freezer" tests, kaboom.  This
> might explain the long trail of vanishingly infrequent livelocks in
> fstests after lockdep goes offline that I've never been able to
> diagnose.
> 
> We could fix it by spinning on the trylock if wait==true, but AFAICT the
> locking works fine if lockdep is not built at all (and I didn't see any
> complaints running fstests overnight), so remove this snippet entirely.
> 
> NOTE: Commit f4b554af9931 in 2015 created the current weird logic (which
> used to exist in a different form in commit 5accdf82ba25c from 2012) in
> __sb_start_write.  XFS solved this whole problem in the late 2.6 era by
> creating a variant of transactions (XFS_TRANS_NO_WRITECOUNT) that don't
> grab intwrite freeze protection, thus making lockdep's solution
> unnecessary.  The commit claims that Dave Chinner explained that the
> trylock hack + comment could be removed, but nobody ever did.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thanks for cleaning this up. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/super.c |   33 ++++-----------------------------
>  1 file changed, 4 insertions(+), 29 deletions(-)
> 
> 
> diff --git a/fs/super.c b/fs/super.c
> index a51c2083cd6b..e1fd667454d4 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1647,36 +1647,11 @@ EXPORT_SYMBOL(__sb_end_write);
>   */
>  int __sb_start_write(struct super_block *sb, int level, bool wait)
>  {
> -	bool force_trylock = false;
> -	int ret = 1;
> +	if (!wait)
> +		return percpu_down_read_trylock(sb->s_writers.rw_sem + level-1);
>  
> -#ifdef CONFIG_LOCKDEP
> -	/*
> -	 * We want lockdep to tell us about possible deadlocks with freezing
> -	 * but it's it bit tricky to properly instrument it. Getting a freeze
> -	 * protection works as getting a read lock but there are subtle
> -	 * problems. XFS for example gets freeze protection on internal level
> -	 * twice in some cases, which is OK only because we already hold a
> -	 * freeze protection also on higher level. Due to these cases we have
> -	 * to use wait == F (trylock mode) which must not fail.
> -	 */
> -	if (wait) {
> -		int i;
> -
> -		for (i = 0; i < level - 1; i++)
> -			if (percpu_rwsem_is_held(sb->s_writers.rw_sem + i)) {
> -				force_trylock = true;
> -				break;
> -			}
> -	}
> -#endif
> -	if (wait && !force_trylock)
> -		percpu_down_read(sb->s_writers.rw_sem + level-1);
> -	else
> -		ret = percpu_down_read_trylock(sb->s_writers.rw_sem + level-1);
> -
> -	WARN_ON(force_trylock && !ret);
> -	return ret;
> +	percpu_down_read(sb->s_writers.rw_sem + level-1);
> +	return 1;
>  }
>  EXPORT_SYMBOL(__sb_start_write);
>  
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] vfs: separate __sb_start_write into blocking and non-blocking helpers
  2020-11-09 18:16 ` [PATCH 2/3] vfs: separate __sb_start_write into blocking and non-blocking helpers Darrick J. Wong
@ 2020-11-10 11:35   ` Jan Kara
  2020-11-10 18:32   ` Christoph Hellwig
  1 sibling, 0 replies; 9+ messages in thread
From: Jan Kara @ 2020-11-10 11:35 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, hch, fdmanana, linux-fsdevel

On Mon 09-11-20 10:16:57, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Break this function into two helpers so that it's obvious that the
> trylock versions return a value that must be checked, and the blocking
> versions don't require that.  While we're at it, clean up the return
> type mismatch.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/aio.c           |    2 +-
>  fs/io_uring.c      |    3 +--
>  fs/super.c         |   18 ++++++++++++------
>  include/linux/fs.h |   21 +++++++++++----------
>  4 files changed, 25 insertions(+), 19 deletions(-)
> 
> 
> diff --git a/fs/aio.c b/fs/aio.c
> index c45c20d87538..6a21d8919409 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -1572,7 +1572,7 @@ static int aio_write(struct kiocb *req, const struct iocb *iocb,
>  		 * we return to userspace.
>  		 */
>  		if (S_ISREG(file_inode(file)->i_mode)) {
> -			__sb_start_write(file_inode(file)->i_sb, SB_FREEZE_WRITE, true);
> +			sb_start_write(file_inode(file)->i_sb);
>  			__sb_writers_release(file_inode(file)->i_sb, SB_FREEZE_WRITE);
>  		}
>  		req->ki_flags |= IOCB_WRITE;
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index b42dfa0243bf..4cbaddfe3d80 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -3532,8 +3532,7 @@ static int io_write(struct io_kiocb *req, bool force_nonblock,
>  	 * we return to userspace.
>  	 */
>  	if (req->flags & REQ_F_ISREG) {
> -		__sb_start_write(file_inode(req->file)->i_sb,
> -					SB_FREEZE_WRITE, true);
> +		sb_start_write(file_inode(req->file)->i_sb);
>  		__sb_writers_release(file_inode(req->file)->i_sb,
>  					SB_FREEZE_WRITE);
>  	}
> diff --git a/fs/super.c b/fs/super.c
> index e1fd667454d4..59aa59279133 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1645,16 +1645,22 @@ EXPORT_SYMBOL(__sb_end_write);
>   * This is an internal function, please use sb_start_{write,pagefault,intwrite}
>   * instead.
>   */
> -int __sb_start_write(struct super_block *sb, int level, bool wait)
> +void __sb_start_write(struct super_block *sb, int level)
>  {
> -	if (!wait)
> -		return percpu_down_read_trylock(sb->s_writers.rw_sem + level-1);
> -
> -	percpu_down_read(sb->s_writers.rw_sem + level-1);
> -	return 1;
> +	percpu_down_read(sb->s_writers.rw_sem + level - 1);
>  }
>  EXPORT_SYMBOL(__sb_start_write);
>  
> +/*
> + * This is an internal function, please use sb_start_{write,pagefault,intwrite}
> + * instead.
> + */
> +bool __sb_start_write_trylock(struct super_block *sb, int level)
> +{
> +	return percpu_down_read_trylock(sb->s_writers.rw_sem + level - 1);
> +}
> +EXPORT_SYMBOL_GPL(__sb_start_write_trylock);
> +
>  /**
>   * sb_wait_write - wait until all writers to given file system finish
>   * @sb: the super for which we wait
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 0bd126418bb6..305989afd49c 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1581,7 +1581,8 @@ extern struct timespec64 current_time(struct inode *inode);
>   */
>  
>  void __sb_end_write(struct super_block *sb, int level);
> -int __sb_start_write(struct super_block *sb, int level, bool wait);
> +void __sb_start_write(struct super_block *sb, int level);
> +bool __sb_start_write_trylock(struct super_block *sb, int level);
>  
>  #define __sb_writers_acquired(sb, lev)	\
>  	percpu_rwsem_acquire(&(sb)->s_writers.rw_sem[(lev)-1], 1, _THIS_IP_)
> @@ -1645,12 +1646,12 @@ static inline void sb_end_intwrite(struct super_block *sb)
>   */
>  static inline void sb_start_write(struct super_block *sb)
>  {
> -	__sb_start_write(sb, SB_FREEZE_WRITE, true);
> +	__sb_start_write(sb, SB_FREEZE_WRITE);
>  }
>  
> -static inline int sb_start_write_trylock(struct super_block *sb)
> +static inline bool sb_start_write_trylock(struct super_block *sb)
>  {
> -	return __sb_start_write(sb, SB_FREEZE_WRITE, false);
> +	return __sb_start_write_trylock(sb, SB_FREEZE_WRITE);
>  }
>  
>  /**
> @@ -1674,7 +1675,7 @@ static inline int sb_start_write_trylock(struct super_block *sb)
>   */
>  static inline void sb_start_pagefault(struct super_block *sb)
>  {
> -	__sb_start_write(sb, SB_FREEZE_PAGEFAULT, true);
> +	__sb_start_write(sb, SB_FREEZE_PAGEFAULT);
>  }
>  
>  /*
> @@ -1692,12 +1693,12 @@ static inline void sb_start_pagefault(struct super_block *sb)
>   */
>  static inline void sb_start_intwrite(struct super_block *sb)
>  {
> -	__sb_start_write(sb, SB_FREEZE_FS, true);
> +	__sb_start_write(sb, SB_FREEZE_FS);
>  }
>  
> -static inline int sb_start_intwrite_trylock(struct super_block *sb)
> +static inline bool sb_start_intwrite_trylock(struct super_block *sb)
>  {
> -	return __sb_start_write(sb, SB_FREEZE_FS, false);
> +	return __sb_start_write_trylock(sb, SB_FREEZE_FS);
>  }
>  
>  
> @@ -2756,14 +2757,14 @@ static inline void file_start_write(struct file *file)
>  {
>  	if (!S_ISREG(file_inode(file)->i_mode))
>  		return;
> -	__sb_start_write(file_inode(file)->i_sb, SB_FREEZE_WRITE, true);
> +	sb_start_write(file_inode(file)->i_sb);
>  }
>  
>  static inline bool file_start_write_trylock(struct file *file)
>  {
>  	if (!S_ISREG(file_inode(file)->i_mode))
>  		return true;
> -	return __sb_start_write(file_inode(file)->i_sb, SB_FREEZE_WRITE, false);
> +	return sb_start_write_trylock(file_inode(file)->i_sb);
>  }
>  
>  static inline void file_end_write(struct file *file)
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] vfs: move __sb_{start,end}_write* to fs.h
  2020-11-09 18:17 ` [PATCH 3/3] vfs: move __sb_{start,end}_write* to fs.h Darrick J. Wong
@ 2020-11-10 11:36   ` Jan Kara
  2020-11-10 18:32   ` Christoph Hellwig
  1 sibling, 0 replies; 9+ messages in thread
From: Jan Kara @ 2020-11-10 11:36 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, hch, fdmanana, linux-fsdevel

On Mon 09-11-20 10:17:04, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Now that we've straightened out the callers, move these three functions
> to fs.h since they're fairly trivial.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/super.c         |   30 ------------------------------
>  include/linux/fs.h |   21 ++++++++++++++++++---
>  2 files changed, 18 insertions(+), 33 deletions(-)
> 
> 
> diff --git a/fs/super.c b/fs/super.c
> index 59aa59279133..98bb0629ee10 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1631,36 +1631,6 @@ int super_setup_bdi(struct super_block *sb)
>  }
>  EXPORT_SYMBOL(super_setup_bdi);
>  
> -/*
> - * This is an internal function, please use sb_end_{write,pagefault,intwrite}
> - * instead.
> - */
> -void __sb_end_write(struct super_block *sb, int level)
> -{
> -	percpu_up_read(sb->s_writers.rw_sem + level-1);
> -}
> -EXPORT_SYMBOL(__sb_end_write);
> -
> -/*
> - * This is an internal function, please use sb_start_{write,pagefault,intwrite}
> - * instead.
> - */
> -void __sb_start_write(struct super_block *sb, int level)
> -{
> -	percpu_down_read(sb->s_writers.rw_sem + level - 1);
> -}
> -EXPORT_SYMBOL(__sb_start_write);
> -
> -/*
> - * This is an internal function, please use sb_start_{write,pagefault,intwrite}
> - * instead.
> - */
> -bool __sb_start_write_trylock(struct super_block *sb, int level)
> -{
> -	return percpu_down_read_trylock(sb->s_writers.rw_sem + level - 1);
> -}
> -EXPORT_SYMBOL_GPL(__sb_start_write_trylock);
> -
>  /**
>   * sb_wait_write - wait until all writers to given file system finish
>   * @sb: the super for which we wait
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 305989afd49c..6dabd019cab0 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1580,9 +1580,24 @@ extern struct timespec64 current_time(struct inode *inode);
>   * Snapshotting support.
>   */
>  
> -void __sb_end_write(struct super_block *sb, int level);
> -void __sb_start_write(struct super_block *sb, int level);
> -bool __sb_start_write_trylock(struct super_block *sb, int level);
> +/*
> + * These are internal functions, please use sb_start_{write,pagefault,intwrite}
> + * instead.
> + */
> +static inline void __sb_end_write(struct super_block *sb, int level)
> +{
> +	percpu_up_read(sb->s_writers.rw_sem + level-1);
> +}
> +
> +static inline void __sb_start_write(struct super_block *sb, int level)
> +{
> +	percpu_down_read(sb->s_writers.rw_sem + level - 1);
> +}
> +
> +static inline bool __sb_start_write_trylock(struct super_block *sb, int level)
> +{
> +	return percpu_down_read_trylock(sb->s_writers.rw_sem + level - 1);
> +}
>  
>  #define __sb_writers_acquired(sb, lev)	\
>  	percpu_rwsem_acquire(&(sb)->s_writers.rw_sem[(lev)-1], 1, _THIS_IP_)
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] vfs: separate __sb_start_write into blocking and non-blocking helpers
  2020-11-09 18:16 ` [PATCH 2/3] vfs: separate __sb_start_write into blocking and non-blocking helpers Darrick J. Wong
  2020-11-10 11:35   ` Jan Kara
@ 2020-11-10 18:32   ` Christoph Hellwig
  1 sibling, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2020-11-10 18:32 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, hch, fdmanana, linux-fsdevel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] vfs: move __sb_{start,end}_write* to fs.h
  2020-11-09 18:17 ` [PATCH 3/3] vfs: move __sb_{start,end}_write* to fs.h Darrick J. Wong
  2020-11-10 11:36   ` Jan Kara
@ 2020-11-10 18:32   ` Christoph Hellwig
  1 sibling, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2020-11-10 18:32 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, hch, fdmanana, linux-fsdevel

On Mon, Nov 09, 2020 at 10:17:04AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Now that we've straightened out the callers, move these three functions
> to fs.h since they're fairly trivial.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-11-10 18:32 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-09 18:16 [PATCH v3 0/3] vfs: remove lockdep fs freeze weirdness Darrick J. Wong
2020-11-09 18:16 ` [PATCH 1/3] vfs: remove lockdep bogosity in __sb_start_write Darrick J. Wong
2020-11-10 11:33   ` Jan Kara
2020-11-09 18:16 ` [PATCH 2/3] vfs: separate __sb_start_write into blocking and non-blocking helpers Darrick J. Wong
2020-11-10 11:35   ` Jan Kara
2020-11-10 18:32   ` Christoph Hellwig
2020-11-09 18:17 ` [PATCH 3/3] vfs: move __sb_{start,end}_write* to fs.h Darrick J. Wong
2020-11-10 11:36   ` Jan Kara
2020-11-10 18:32   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).