All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] Rearrange inode locking
@ 2020-06-22 16:20 Goldwyn Rodrigues
  2020-06-22 16:20 ` [PATCH 1/8] btrfs: Move pos increment and pagecache extension to btrfs_buffered_write() Goldwyn Rodrigues
                   ` (7 more replies)
  0 siblings, 8 replies; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-22 16:20 UTC (permalink / raw)
  To: linux-btrfs

This series attempts to arrange inode locking and unlocking to be more
aligned to ext4 and xfs, and makes it simpler in logic. The main goal is
to have shared inode lock for direct reads and direct writes within EOF
to make sure we do not race with truncate.

The advantage is that we get rid of btrfs_inode->dio_sem in the DIO
path.

This patch is on top of btrfs-iomap-dio work and survived a run of
xfstests.

Git: https://github.com/goldwynr/linux/tree/btrfs-inode-lock

 btrfs_inode.h |   10 -
 ctree.h       |    8 +
 file.c        |  355 +++++++++++++++++++++++++++++++---------------------------
 inode.c       |   13 +-
 4 files changed, 207 insertions(+), 179 deletions(-)

-- 
Goldwyn



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/8] btrfs: Move pos increment and pagecache extension to btrfs_buffered_write()
  2020-06-22 16:20 [PATCH 0/8] Rearrange inode locking Goldwyn Rodrigues
@ 2020-06-22 16:20 ` Goldwyn Rodrigues
  2020-06-22 16:20 ` [PATCH 2/8] btrfs: Move FS error state bit early during write Goldwyn Rodrigues
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-22 16:20 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

While we do this, correct the call to pagecache_isize_extended():
 - pagecache_isisze_extended needs to be called to the starting of the
   write as opposed to i_size
 - We don't need to check range befor the call, this is done in the
   function

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/file.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 7f75adf70bcd..01377a7d1d13 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1585,6 +1585,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 	int ret = 0;
 	bool only_release_metadata = false;
 	bool force_page_uptodate = false;
+	loff_t old_isize = i_size_read(inode);
 
 	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
 			PAGE_SIZE / (sizeof(struct page *)));
@@ -1806,6 +1807,10 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 	}
 
 	extent_changeset_free(data_reserved);
+	if (num_written > 0) {
+		pagecache_isize_extended(inode, old_isize, iocb->ki_pos);
+		iocb->ki_pos += num_written;
+	}
 	return num_written ? num_written : ret;
 }
 
@@ -1926,7 +1931,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	loff_t pos;
 	size_t count;
 	loff_t oldsize;
-	int clean_page = 0;
 
 	if (!(iocb->ki_flags & IOCB_DIRECT) &&
 	    (iocb->ki_flags & IOCB_NOWAIT))
@@ -1998,8 +2002,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 			inode_unlock(inode);
 			goto out;
 		}
-		if (start_pos > round_up(oldsize, fs_info->sectorsize))
-			clean_page = 1;
 	}
 
 	if (sync)
@@ -2009,11 +2011,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 		num_written = btrfs_direct_write(iocb, from);
 	} else {
 		num_written = btrfs_buffered_write(iocb, from);
-		if (num_written > 0)
-			iocb->ki_pos = pos + num_written;
-		if (clean_page)
-			pagecache_isize_extended(inode, oldsize,
-						i_size_read(inode));
 	}
 
 	inode_unlock(inode);
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/8] btrfs: Move FS error state bit early during write
  2020-06-22 16:20 [PATCH 0/8] Rearrange inode locking Goldwyn Rodrigues
  2020-06-22 16:20 ` [PATCH 1/8] btrfs: Move pos increment and pagecache extension to btrfs_buffered_write() Goldwyn Rodrigues
@ 2020-06-22 16:20 ` Goldwyn Rodrigues
  2020-06-30  8:22   ` David Sterba
  2020-06-22 16:20 ` [PATCH 3/8] btrfs: Introduce btrfs_write_check() Goldwyn Rodrigues
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-22 16:20 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

We don't need the inode locked to check for the error bit. Move the
check early.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/file.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 01377a7d1d13..9ec30ccbd67a 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1932,6 +1932,15 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	size_t count;
 	loff_t oldsize;
 
+	/*
+	 * If BTRFS flips readonly due to some impossible error
+	 * (fs_info->fs_state now has BTRFS_SUPER_FLAG_ERROR),
+	 * although we have opened a file as writable, we have
+	 * to stop this write operation to ensure FS consistency.
+	 */
+	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
+		return -EROFS;
+
 	if (!(iocb->ki_flags & IOCB_DIRECT) &&
 	    (iocb->ki_flags & IOCB_NOWAIT))
 		return -EOPNOTSUPP;
@@ -1971,18 +1980,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 		goto out;
 	}
 
-	/*
-	 * If BTRFS flips readonly due to some impossible error
-	 * (fs_info->fs_state now has BTRFS_SUPER_FLAG_ERROR),
-	 * although we have opened a file as writable, we have
-	 * to stop this write operation to ensure FS consistency.
-	 */
-	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
-		inode_unlock(inode);
-		err = -EROFS;
-		goto out;
-	}
-
 	/*
 	 * We reserve space for updating the inode when we reserve space for the
 	 * extent we are going to write, so we will enospc out there.  We don't
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/8] btrfs: Introduce btrfs_write_check()
  2020-06-22 16:20 [PATCH 0/8] Rearrange inode locking Goldwyn Rodrigues
  2020-06-22 16:20 ` [PATCH 1/8] btrfs: Move pos increment and pagecache extension to btrfs_buffered_write() Goldwyn Rodrigues
  2020-06-22 16:20 ` [PATCH 2/8] btrfs: Move FS error state bit early during write Goldwyn Rodrigues
@ 2020-06-22 16:20 ` Goldwyn Rodrigues
  2020-06-22 16:20 ` [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock() Goldwyn Rodrigues
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-22 16:20 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

btrfs_write_check() checks for all parameters before the write.
This simplifies with error situations with respect to inode unlocking.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/file.c | 138 +++++++++++++++++++++++++-----------------------
 1 file changed, 71 insertions(+), 67 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 9ec30ccbd67a..ba7c3b2cf1c5 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1567,6 +1567,76 @@ static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos,
 	return ret;
 }
 
+static void update_time_for_write(struct inode *inode)
+{
+	struct timespec64 now;
+
+	if (IS_NOCMTIME(inode))
+		return;
+
+	now = current_time(inode);
+	if (!timespec64_equal(&inode->i_mtime, &now))
+		inode->i_mtime = now;
+
+	if (!timespec64_equal(&inode->i_ctime, &now))
+		inode->i_ctime = now;
+
+	if (IS_I_VERSION(inode))
+		inode_inc_iversion(inode);
+}
+
+static size_t btrfs_write_check(struct kiocb *iocb, struct iov_iter *from)
+{
+	int err;
+	loff_t oldsize;
+	loff_t start_pos;
+	struct file *file = iocb->ki_filp;
+	struct inode *inode = file_inode(file);
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	loff_t pos = iocb->ki_pos;
+	size_t count = iov_iter_count(from);
+
+	err = generic_write_checks(iocb, from);
+	if (err <= 0)
+		return err;
+
+	if (iocb->ki_flags & IOCB_NOWAIT) {
+		/*
+		 * We will allocate space in case nodatacow is not set,
+		 * so bail
+		 */
+		if (!(BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
+					      BTRFS_INODE_PREALLOC)) ||
+		    check_can_nocow(BTRFS_I(inode), pos, &count) <= 0)
+			return -EAGAIN;
+	}
+
+	current->backing_dev_info = inode_to_bdi(inode);
+	err = file_remove_privs(file);
+	if (err)
+		return err;
+
+	/*
+	 * We reserve space for updating the inode when we reserve space for the
+	 * extent we are going to write, so we will enospc out there.  We don't
+	 * need to start yet another transaction to update the inode as we will
+	 * update the inode when we finish writing whatever data we write.
+	 */
+	update_time_for_write(inode);
+
+	start_pos = round_down(pos, fs_info->sectorsize);
+	oldsize = i_size_read(inode);
+	if (start_pos > oldsize) {
+		/* Expand hole size to cover write data, preventing empty gap */
+		u64 end_pos = round_up(pos + count,
+				   fs_info->sectorsize);
+		err = btrfs_cont_expand(inode, oldsize, end_pos);
+		if (err)
+			return err;
+	}
+	return count;
+}
+
 static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 					       struct iov_iter *i)
 {
@@ -1898,24 +1968,6 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 	return written ? written : err;
 }
 
-static void update_time_for_write(struct inode *inode)
-{
-	struct timespec64 now;
-
-	if (IS_NOCMTIME(inode))
-		return;
-
-	now = current_time(inode);
-	if (!timespec64_equal(&inode->i_mtime, &now))
-		inode->i_mtime = now;
-
-	if (!timespec64_equal(&inode->i_ctime, &now))
-		inode->i_ctime = now;
-
-	if (IS_I_VERSION(inode))
-		inode_inc_iversion(inode);
-}
-
 static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 				    struct iov_iter *from)
 {
@@ -1923,14 +1975,9 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	struct inode *inode = file_inode(file);
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct btrfs_root *root = BTRFS_I(inode)->root;
-	u64 start_pos;
-	u64 end_pos;
 	ssize_t num_written = 0;
 	const bool sync = iocb->ki_flags & IOCB_DSYNC;
 	ssize_t err;
-	loff_t pos;
-	size_t count;
-	loff_t oldsize;
 
 	/*
 	 * If BTRFS flips readonly due to some impossible error
@@ -1952,55 +1999,12 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 		inode_lock(inode);
 	}
 
-	err = generic_write_checks(iocb, from);
+	err = btrfs_write_check(iocb, from);
 	if (err <= 0) {
-		inode_unlock(inode);
-		return err;
-	}
-
-	pos = iocb->ki_pos;
-	count = iov_iter_count(from);
-	if (iocb->ki_flags & IOCB_NOWAIT) {
-		/*
-		 * We will allocate space in case nodatacow is not set,
-		 * so bail
-		 */
-		if (!(BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
-					      BTRFS_INODE_PREALLOC)) ||
-		    check_can_nocow(BTRFS_I(inode), pos, &count) <= 0) {
-			inode_unlock(inode);
-			return -EAGAIN;
-		}
-	}
-
-	current->backing_dev_info = inode_to_bdi(inode);
-	err = file_remove_privs(file);
-	if (err) {
 		inode_unlock(inode);
 		goto out;
 	}
 
-	/*
-	 * We reserve space for updating the inode when we reserve space for the
-	 * extent we are going to write, so we will enospc out there.  We don't
-	 * need to start yet another transaction to update the inode as we will
-	 * update the inode when we finish writing whatever data we write.
-	 */
-	update_time_for_write(inode);
-
-	start_pos = round_down(pos, fs_info->sectorsize);
-	oldsize = i_size_read(inode);
-	if (start_pos > oldsize) {
-		/* Expand hole size to cover write data, preventing empty gap */
-		end_pos = round_up(pos + count,
-				   fs_info->sectorsize);
-		err = btrfs_cont_expand(inode, oldsize, end_pos);
-		if (err) {
-			inode_unlock(inode);
-			goto out;
-		}
-	}
-
 	if (sync)
 		atomic_inc(&BTRFS_I(inode)->sync_writers);
 
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock()
  2020-06-22 16:20 [PATCH 0/8] Rearrange inode locking Goldwyn Rodrigues
                   ` (2 preceding siblings ...)
  2020-06-22 16:20 ` [PATCH 3/8] btrfs: Introduce btrfs_write_check() Goldwyn Rodrigues
@ 2020-06-22 16:20 ` Goldwyn Rodrigues
  2020-06-24 16:19   ` David Sterba
  2020-06-30  8:34   ` David Sterba
  2020-06-22 16:20 ` [PATCH 5/8] btrfs: Push inode locking and unlocking in buffered/direct write Goldwyn Rodrigues
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-22 16:20 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

btrfs_inode_lock/unlock() acquires the inode->i_rwsem depending on the
flags passed. ilock_flags determines the type of lock to be taken:

BTRFS_ILOCK_SHARED - for shared locks, for possible parallel DIO
BTRFS_ILOCK_TRY - for the RWF_NOWAIT sequence

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/ctree.h |  8 ++++++++
 fs/btrfs/file.c  | 51 +++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 161533040978..346fea668ca0 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2953,6 +2953,14 @@ void btrfs_update_ioctl_balance_args(struct btrfs_fs_info *fs_info,
 			       struct btrfs_ioctl_balance_args *bargs);
 
 /* file.c */
+
+/* ilock flags definition */
+#define BTRFS_ILOCK_SHARED	0x1
+#define BTRFS_ILOCK_TRY 	0x2
+
+int btrfs_inode_lock(struct inode *inode, int ilock_flags);
+void btrfs_inode_unlock(struct inode *inode, int ilock_flags);
+
 int __init btrfs_auto_defrag_init(void);
 void __cold btrfs_auto_defrag_exit(void);
 int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index ba7c3b2cf1c5..1a9a0a9e4b3d 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -70,6 +70,37 @@ static int __compare_inode_defrag(struct inode_defrag *defrag1,
 		return 0;
 }
 
+int btrfs_inode_lock(struct inode *inode, int ilock_flags)
+{
+	if (ilock_flags & BTRFS_ILOCK_SHARED) {
+		if (ilock_flags & BTRFS_ILOCK_TRY) {
+			if (!inode_trylock_shared(inode))
+				return -EAGAIN;
+			else
+				return 0;
+		}
+		inode_lock_shared(inode);
+	} else {
+		if (ilock_flags & BTRFS_ILOCK_TRY) {
+			if (!inode_trylock(inode))
+				return -EAGAIN;
+			else
+				return 0;
+		}
+		inode_lock(inode);
+	}
+	return 0;
+}
+
+void btrfs_inode_unlock(struct inode *inode, int ilock_flags)
+{
+	if (ilock_flags & BTRFS_ILOCK_SHARED)
+		inode_unlock_shared(inode);
+	else
+		inode_unlock(inode);
+
+}
+
 /* pop a record for an inode into the defrag tree.  The lock
  * must be held already
  *
@@ -1978,6 +2009,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	ssize_t num_written = 0;
 	const bool sync = iocb->ki_flags & IOCB_DSYNC;
 	ssize_t err;
+	int ilock_flags = 0;
 
 	/*
 	 * If BTRFS flips readonly due to some impossible error
@@ -1992,12 +2024,12 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	    (iocb->ki_flags & IOCB_NOWAIT))
 		return -EOPNOTSUPP;
 
-	if (iocb->ki_flags & IOCB_NOWAIT) {
-		if (!inode_trylock(inode))
-			return -EAGAIN;
-	} else {
-		inode_lock(inode);
-	}
+	if (iocb->ki_flags & IOCB_NOWAIT)
+		ilock_flags |= BTRFS_ILOCK_TRY;
+
+	err = btrfs_inode_lock(inode, ilock_flags);
+	if (err < 0)
+		goto out;
 
 	err = btrfs_write_check(iocb, from);
 	if (err <= 0) {
@@ -2014,7 +2046,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 		num_written = btrfs_buffered_write(iocb, from);
 	}
 
-	inode_unlock(inode);
+	btrfs_inode_unlock(inode, ilock_flags);
 
 	/*
 	 * We also have to set last_sub_trans to the current log transid,
@@ -3547,9 +3579,10 @@ static ssize_t btrfs_direct_read(struct kiocb *iocb, struct iov_iter *to)
 	if (is_sync_kiocb(iocb))
 		flags |= IOMAP_DIOF_WAIT_FOR_COMPLETION;
 
-	inode_lock_shared(inode);
+	btrfs_inode_lock(inode, BTRFS_ILOCK_SHARED);
         ret = iomap_dio_rw(iocb, to, &btrfs_dio_iomap_ops, &btrfs_dops, flags);
-	inode_unlock_shared(inode);
+	btrfs_inode_unlock(inode, BTRFS_ILOCK_SHARED);
+
 	return ret;
 }
 
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/8] btrfs: Push inode locking and unlocking in buffered/direct write
  2020-06-22 16:20 [PATCH 0/8] Rearrange inode locking Goldwyn Rodrigues
                   ` (3 preceding siblings ...)
  2020-06-22 16:20 ` [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock() Goldwyn Rodrigues
@ 2020-06-22 16:20 ` Goldwyn Rodrigues
  2020-06-22 16:20 ` [PATCH 6/8] btrfs: Use shared inode lock for direct writes within EOF Goldwyn Rodrigues
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-22 16:20 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Push inode locking and unlocking closer to where we perform the I/O. For
this we need to move the write checks inside the respective functions as
well.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/file.c | 70 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 46 insertions(+), 24 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 1a9a0a9e4b3d..aa6be931620b 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1672,7 +1672,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 					       struct iov_iter *i)
 {
 	struct file *file = iocb->ki_filp;
-	loff_t pos = iocb->ki_pos;
+	loff_t pos;
 	struct inode *inode = file_inode(file);
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct btrfs_root *root = BTRFS_I(inode)->root;
@@ -1687,14 +1687,29 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 	bool only_release_metadata = false;
 	bool force_page_uptodate = false;
 	loff_t old_isize = i_size_read(inode);
+	int ilock_flags = 0;
+
+	if (iocb->ki_flags & IOCB_NOWAIT)
+		ilock_flags |= BTRFS_ILOCK_TRY;
+
+	ret = btrfs_inode_lock(inode, ilock_flags);
+	if (ret < 0)
+		return ret;
+
+	ret = btrfs_write_check(iocb, i);
+	if (ret <= 0)
+		goto out;
+	pos = iocb->ki_pos;
 
 	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
 			PAGE_SIZE / (sizeof(struct page *)));
 	nrptrs = min(nrptrs, current->nr_dirtied_pause - current->nr_dirtied);
 	nrptrs = max(nrptrs, 8);
 	pages = kmalloc_array(nrptrs, sizeof(struct page *), GFP_KERNEL);
-	if (!pages)
-		return -ENOMEM;
+	if (!pages) {
+		ret = -ENOMEM;
+		goto out;
+	}
 
 	while (iov_iter_count(i) > 0) {
 		struct extent_state *cached_state = NULL;
@@ -1912,6 +1927,8 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 		pagecache_isize_extended(inode, old_isize, iocb->ki_pos);
 		iocb->ki_pos += num_written;
 	}
+out:
+	btrfs_inode_unlock(inode, ilock_flags);
 	return num_written ? num_written : ret;
 }
 
@@ -1942,6 +1959,18 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 	size_t count = 0;
 	bool relock = false;
 	int flags = IOMAP_DIOF_PGINVALID_FAIL;
+	int ilock_flags = 0;
+
+	if (iocb->ki_flags & IOCB_NOWAIT)
+		ilock_flags |= BTRFS_ILOCK_TRY;
+
+	err = btrfs_inode_lock(inode, ilock_flags);
+	if (err < 0)
+		goto error;
+
+	err = btrfs_write_check(iocb, from);
+	if (err <= 0)
+		goto error;
 
 	if (check_direct_IO(fs_info, from, pos))
 		goto buffered;
@@ -1956,7 +1985,8 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 		inode_unlock(inode);
 		relock = true;
 	} else if (iocb->ki_flags & IOCB_NOWAIT) {
-		return -EAGAIN;
+		err = -EAGAIN;
+		goto out;
 	}
 
 	if (is_sync_kiocb(iocb))
@@ -1970,10 +2000,13 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 	if (relock)
 		inode_lock(inode);
 
-	if (written < 0 || !iov_iter_count(from))
-		return written;
+	if (written < 0 || !iov_iter_count(from)) {
+		err = written;
+		goto error;
+	}
 
 buffered:
+	btrfs_inode_unlock(inode, ilock_flags);
 	pos = iocb->ki_pos;
 	written_buffered = btrfs_buffered_write(iocb, from);
 	if (written_buffered < 0) {
@@ -1995,8 +2028,13 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 	iocb->ki_pos = pos + written_buffered;
 	invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT,
 				 endbyte >> PAGE_SHIFT);
+
 out:
 	return written ? written : err;
+
+error:
+	btrfs_inode_unlock(inode, ilock_flags);
+	return err;
 }
 
 static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
@@ -2008,8 +2046,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	ssize_t num_written = 0;
 	const bool sync = iocb->ki_flags & IOCB_DSYNC;
-	ssize_t err;
-	int ilock_flags = 0;
+	ssize_t err = 0;
 
 	/*
 	 * If BTRFS flips readonly due to some impossible error
@@ -2024,19 +2061,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	    (iocb->ki_flags & IOCB_NOWAIT))
 		return -EOPNOTSUPP;
 
-	if (iocb->ki_flags & IOCB_NOWAIT)
-		ilock_flags |= BTRFS_ILOCK_TRY;
-
-	err = btrfs_inode_lock(inode, ilock_flags);
-	if (err < 0)
-		goto out;
-
-	err = btrfs_write_check(iocb, from);
-	if (err <= 0) {
-		inode_unlock(inode);
-		goto out;
-	}
-
 	if (sync)
 		atomic_inc(&BTRFS_I(inode)->sync_writers);
 
@@ -2046,8 +2070,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 		num_written = btrfs_buffered_write(iocb, from);
 	}
 
-	btrfs_inode_unlock(inode, ilock_flags);
-
 	/*
 	 * We also have to set last_sub_trans to the current log transid,
 	 * otherwise subsequent syncs to a file that's been synced in this
@@ -2061,7 +2083,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 
 	if (sync)
 		atomic_dec(&BTRFS_I(inode)->sync_writers);
-out:
+
 	current->backing_dev_info = NULL;
 	return num_written ? num_written : err;
 }
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 6/8] btrfs: Use shared inode lock for direct writes within EOF
  2020-06-22 16:20 [PATCH 0/8] Rearrange inode locking Goldwyn Rodrigues
                   ` (4 preceding siblings ...)
  2020-06-22 16:20 ` [PATCH 5/8] btrfs: Push inode locking and unlocking in buffered/direct write Goldwyn Rodrigues
@ 2020-06-22 16:20 ` Goldwyn Rodrigues
  2020-06-30  8:42   ` David Sterba
  2020-06-30  9:37   ` Filipe Manana
  2020-06-22 16:20 ` [PATCH 7/8] btrfs: Remove dio_sem Goldwyn Rodrigues
  2020-06-22 16:20 ` [PATCH 8/8] btrfs: Move generic_write_sync() to btrfs_buffered_write() Goldwyn Rodrigues
  7 siblings, 2 replies; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-22 16:20 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

This is to parallelize direct writes within EOF or with direct I/O
reads. This covers the race with truncate() accidentally increasing the
filesize.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/file.c | 25 +++++++------------------
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index aa6be931620b..c446a4aeb867 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1957,12 +1957,18 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 	loff_t endbyte;
 	int err;
 	size_t count = 0;
-	bool relock = false;
 	int flags = IOMAP_DIOF_PGINVALID_FAIL;
 	int ilock_flags = 0;
 
 	if (iocb->ki_flags & IOCB_NOWAIT)
 		ilock_flags |= BTRFS_ILOCK_TRY;
+	/*
+	 * If the write DIO within EOF,  use a shared lock
+	 */
+	if (pos + count <= i_size_read(inode))
+		ilock_flags |= BTRFS_ILOCK_SHARED;
+	else if (iocb->ki_flags & IOCB_NOWAIT)
+		return -EAGAIN;
 
 	err = btrfs_inode_lock(inode, ilock_flags);
 	if (err < 0)
@@ -1975,20 +1981,6 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 	if (check_direct_IO(fs_info, from, pos))
 		goto buffered;
 
-	count = iov_iter_count(from);
-	/*
-	 * If the write DIO is beyond the EOF, we need update the isize, but it
-	 * is protected by i_mutex. So we can not unlock the i_mutex at this
-	 * case.
-	 */
-	if (pos + count <= inode->i_size) {
-		inode_unlock(inode);
-		relock = true;
-	} else if (iocb->ki_flags & IOCB_NOWAIT) {
-		err = -EAGAIN;
-		goto out;
-	}
-
 	if (is_sync_kiocb(iocb))
 		flags |= IOMAP_DIOF_WAIT_FOR_COMPLETION;
 
@@ -1997,9 +1989,6 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 			       flags);
 	up_read(&BTRFS_I(inode)->dio_sem);
 
-	if (relock)
-		inode_lock(inode);
-
 	if (written < 0 || !iov_iter_count(from)) {
 		err = written;
 		goto error;
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 7/8] btrfs: Remove dio_sem
  2020-06-22 16:20 [PATCH 0/8] Rearrange inode locking Goldwyn Rodrigues
                   ` (5 preceding siblings ...)
  2020-06-22 16:20 ` [PATCH 6/8] btrfs: Use shared inode lock for direct writes within EOF Goldwyn Rodrigues
@ 2020-06-22 16:20 ` Goldwyn Rodrigues
  2020-06-22 16:20 ` [PATCH 8/8] btrfs: Move generic_write_sync() to btrfs_buffered_write() Goldwyn Rodrigues
  7 siblings, 0 replies; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-22 16:20 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Since we handle all synchronization using inode_lock()/unlock(), we
don't need dio_sem to handle the race with btrfs_sync_file()

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/btrfs_inode.h | 10 ----------
 fs/btrfs/file.c        | 14 --------------
 fs/btrfs/inode.c       |  1 -
 3 files changed, 25 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index aeff56a0e105..86c5482e0fdd 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -187,16 +187,6 @@ struct btrfs_inode {
 	/* Hook into fs_info->delayed_iputs */
 	struct list_head delayed_iput;
 
-	/*
-	 * To avoid races between lockless (i_mutex not held) direct IO writes
-	 * and concurrent fsync requests. Direct IO writes must acquire read
-	 * access on this semaphore for creating an extent map and its
-	 * corresponding ordered extent. The fast fsync path must acquire write
-	 * access on this semaphore before it collects ordered extents and
-	 * extent maps.
-	 */
-	struct rw_semaphore dio_sem;
-
 	struct inode vfs_inode;
 };
 
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index c446a4aeb867..c8127406fcc6 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1984,10 +1984,8 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 	if (is_sync_kiocb(iocb))
 		flags |= IOMAP_DIOF_WAIT_FOR_COMPLETION;
 
-	down_read(&BTRFS_I(inode)->dio_sem);
 	written = iomap_dio_rw(iocb, from, &btrfs_dio_iomap_ops, &btrfs_dops,
 			       flags);
-	up_read(&BTRFS_I(inode)->dio_sem);
 
 	if (written < 0 || !iov_iter_count(from)) {
 		err = written;
@@ -2165,13 +2163,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 
 	inode_lock(inode);
 
-	/*
-	 * We take the dio_sem here because the tree log stuff can race with
-	 * lockless dio writes and get an extent map logged for an extent we
-	 * never waited on.  We need it this high up for lockdep reasons.
-	 */
-	down_write(&BTRFS_I(inode)->dio_sem);
-
 	atomic_inc(&root->log_batch);
 
 	/*
@@ -2209,7 +2200,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	 */
 	ret = start_ordered_ops(inode, start, end);
 	if (ret) {
-		up_write(&BTRFS_I(inode)->dio_sem);
 		inode_unlock(inode);
 		goto out;
 	}
@@ -2223,7 +2213,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	 */
 	ret = btrfs_wait_ordered_range(inode, start, (u64)end - (u64)start + 1);
 	if (ret) {
-		up_write(&BTRFS_I(inode)->dio_sem);
 		inode_unlock(inode);
 		goto out;
 	}
@@ -2247,7 +2236,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 		 * checked called fsync.
 		 */
 		ret = filemap_check_wb_err(inode->i_mapping, file->f_wb_err);
-		up_write(&BTRFS_I(inode)->dio_sem);
 		inode_unlock(inode);
 		goto out;
 	}
@@ -2266,7 +2254,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	trans = btrfs_start_transaction(root, 0);
 	if (IS_ERR(trans)) {
 		ret = PTR_ERR(trans);
-		up_write(&BTRFS_I(inode)->dio_sem);
 		inode_unlock(inode);
 		goto out;
 	}
@@ -2287,7 +2274,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	 * file again, but that will end up using the synchronization
 	 * inside btrfs_sync_log to keep things safe.
 	 */
-	up_write(&BTRFS_I(inode)->dio_sem);
 	inode_unlock(inode);
 
 	if (ret != BTRFS_NO_LOG_SYNC) {
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 31ac8c682f19..7e71f42e7ec0 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8433,7 +8433,6 @@ struct inode *btrfs_alloc_inode(struct super_block *sb)
 	INIT_LIST_HEAD(&ei->delalloc_inodes);
 	INIT_LIST_HEAD(&ei->delayed_iput);
 	RB_CLEAR_NODE(&ei->rb_node);
-	init_rwsem(&ei->dio_sem);
 
 	return inode;
 }
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 8/8] btrfs: Move generic_write_sync() to  btrfs_buffered_write()
  2020-06-22 16:20 [PATCH 0/8] Rearrange inode locking Goldwyn Rodrigues
                   ` (6 preceding siblings ...)
  2020-06-22 16:20 ` [PATCH 7/8] btrfs: Remove dio_sem Goldwyn Rodrigues
@ 2020-06-22 16:20 ` Goldwyn Rodrigues
  7 siblings, 0 replies; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-22 16:20 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

For direct I/O writes, generic_write_sync() gets called twice, once with
iomap_dio_complete() and the next time in btrfs_file_write_iter(). Move
it within btrfs_buffered_write(). In the process, we have to move
btrfs_inode->last_sub_trans as well to btrfs_buffered_write() and
btrfs_dio_iomap_end() respectively.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/file.c  | 23 +++++++++++------------
 fs/btrfs/inode.c | 12 ++++++++----
 2 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index c8127406fcc6..8bbde5474297 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1929,6 +1929,17 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 	}
 out:
 	btrfs_inode_unlock(inode, ilock_flags);
+	/*
+	 * We also have to set last_sub_trans to the current log transid,
+	 * otherwise subsequent syncs to a file that's been synced in this
+	 * transaction will appear to have already occurred.
+	 */
+	spin_lock(&BTRFS_I(inode)->lock);
+	BTRFS_I(inode)->last_sub_trans = root->log_transid;
+	spin_unlock(&BTRFS_I(inode)->lock);
+	if (num_written > 0)
+		num_written = generic_write_sync(iocb, num_written);
+
 	return num_written ? num_written : ret;
 }
 
@@ -2030,7 +2041,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	struct file *file = iocb->ki_filp;
 	struct inode *inode = file_inode(file);
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-	struct btrfs_root *root = BTRFS_I(inode)->root;
 	ssize_t num_written = 0;
 	const bool sync = iocb->ki_flags & IOCB_DSYNC;
 	ssize_t err = 0;
@@ -2057,17 +2067,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 		num_written = btrfs_buffered_write(iocb, from);
 	}
 
-	/*
-	 * We also have to set last_sub_trans to the current log transid,
-	 * otherwise subsequent syncs to a file that's been synced in this
-	 * transaction will appear to have already occurred.
-	 */
-	spin_lock(&BTRFS_I(inode)->lock);
-	BTRFS_I(inode)->last_sub_trans = root->log_transid;
-	spin_unlock(&BTRFS_I(inode)->lock);
-	if (num_written > 0)
-		num_written = generic_write_sync(iocb, num_written);
-
 	if (sync)
 		atomic_dec(&BTRFS_I(inode)->sync_writers);
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7e71f42e7ec0..e0cfb1a5b952 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7416,10 +7416,11 @@ static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, loff_t length,
 	struct btrfs_dio_data *dio_data = iomap->private;
 	size_t submitted = dio_data->submitted;
 	const bool write = !!(flags & IOMAP_WRITE);
+	struct btrfs_inode *bi = BTRFS_I(inode);
 
 	if (!write && (iomap->type == IOMAP_HOLE)) {
 		/* If reading from a hole, unlock and return */
-		unlock_extent(&BTRFS_I(inode)->io_tree, pos, pos + length - 1);
+		unlock_extent(&bi->io_tree, pos, pos + length - 1);
 		goto out;
 	}
 
@@ -7429,8 +7430,7 @@ static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, loff_t length,
 		if (write)
 			__endio_write_update_ordered(inode, pos, length, false);
 		else
-			unlock_extent(&BTRFS_I(inode)->io_tree, pos,
-				      pos + length - 1);
+			unlock_extent(&bi->io_tree, pos, pos + length - 1);
 		ret = -ENOTBLK;
 	}
 
@@ -7439,8 +7439,12 @@ static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, loff_t length,
 			btrfs_delalloc_release_space(inode,
 					dio_data->data_reserved, pos,
 					dio_data->reserve, true);
-		btrfs_delalloc_release_extents(BTRFS_I(inode), dio_data->length);
+		btrfs_delalloc_release_extents(bi, dio_data->length);
 		extent_changeset_free(dio_data->data_reserved);
+		spin_lock(&bi->lock);
+		bi->last_sub_trans = bi->root->log_transid;
+		spin_unlock(&bi->lock);
+
 	}
 out:
 	kfree(dio_data);
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock()
  2020-06-22 16:20 ` [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock() Goldwyn Rodrigues
@ 2020-06-24 16:19   ` David Sterba
  2020-06-25 17:34     ` Goldwyn Rodrigues
  2020-06-30  8:34   ` David Sterba
  1 sibling, 1 reply; 20+ messages in thread
From: David Sterba @ 2020-06-24 16:19 UTC (permalink / raw)
  To: Goldwyn Rodrigues; +Cc: linux-btrfs, Goldwyn Rodrigues

On Mon, Jun 22, 2020 at 11:20:13AM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> btrfs_inode_lock/unlock() acquires the inode->i_rwsem depending on the
> flags passed. ilock_flags determines the type of lock to be taken:
> 
> BTRFS_ILOCK_SHARED - for shared locks, for possible parallel DIO
> BTRFS_ILOCK_TRY - for the RWF_NOWAIT sequence
> 
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
>  fs/btrfs/ctree.h |  8 ++++++++
>  fs/btrfs/file.c  | 51 +++++++++++++++++++++++++++++++++++++++---------
>  2 files changed, 50 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 161533040978..346fea668ca0 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -2953,6 +2953,14 @@ void btrfs_update_ioctl_balance_args(struct btrfs_fs_info *fs_info,
>  			       struct btrfs_ioctl_balance_args *bargs);
>  
>  /* file.c */
> +
> +/* ilock flags definition */
> +#define BTRFS_ILOCK_SHARED	0x1
> +#define BTRFS_ILOCK_TRY 	0x2

Please use enums and add them to a new file inode.h.

> +int btrfs_inode_lock(struct inode *inode, int ilock_flags);
> +void btrfs_inode_unlock(struct inode *inode, int ilock_flags);
> +
>  int __init btrfs_auto_defrag_init(void);
>  void __cold btrfs_auto_defrag_exit(void);
>  int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans,
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index ba7c3b2cf1c5..1a9a0a9e4b3d 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -70,6 +70,37 @@ static int __compare_inode_defrag(struct inode_defrag *defrag1,
>  		return 0;
>  }
>  
> +int btrfs_inode_lock(struct inode *inode, int ilock_flags)
> +void btrfs_inode_unlock(struct inode *inode, int ilock_flags)

And the helpers should be in inode.c, not file.c

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock()
  2020-06-24 16:19   ` David Sterba
@ 2020-06-25 17:34     ` Goldwyn Rodrigues
  2020-06-26 11:34       ` Hans van Kranenburg
  0 siblings, 1 reply; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-25 17:34 UTC (permalink / raw)
  To: dsterba, linux-btrfs, Goldwyn Rodrigues

On 18:19 24/06, David Sterba wrote:
> On Mon, Jun 22, 2020 at 11:20:13AM -0500, Goldwyn Rodrigues wrote:
> > From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > 
> > btrfs_inode_lock/unlock() acquires the inode->i_rwsem depending on the
> > flags passed. ilock_flags determines the type of lock to be taken:
> > 
> > BTRFS_ILOCK_SHARED - for shared locks, for possible parallel DIO
> > BTRFS_ILOCK_TRY - for the RWF_NOWAIT sequence
> > 
> > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > ---
> >  fs/btrfs/ctree.h |  8 ++++++++
> >  fs/btrfs/file.c  | 51 +++++++++++++++++++++++++++++++++++++++---------
> >  2 files changed, 50 insertions(+), 9 deletions(-)
> > 
> > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> > index 161533040978..346fea668ca0 100644
> > --- a/fs/btrfs/ctree.h
> > +++ b/fs/btrfs/ctree.h
> > @@ -2953,6 +2953,14 @@ void btrfs_update_ioctl_balance_args(struct btrfs_fs_info *fs_info,
> >  			       struct btrfs_ioctl_balance_args *bargs);
> >  
> >  /* file.c */
> > +
> > +/* ilock flags definition */
> > +#define BTRFS_ILOCK_SHARED	0x1
> > +#define BTRFS_ILOCK_TRY 	0x2
> 
> Please use enums and add them to a new file inode.h.

These are bitwise flags. Wouldn't it be ugly if you have to set the flag
with:

flags |= (1 << BTRFS_ILOCK_TRY);

rather than

flags |= BTRFS_ILOCK_TRY;

> 
> > +int btrfs_inode_lock(struct inode *inode, int ilock_flags);
> > +void btrfs_inode_unlock(struct inode *inode, int ilock_flags);
> > +
> >  int __init btrfs_auto_defrag_init(void);
> >  void __cold btrfs_auto_defrag_exit(void);
> >  int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans,
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index ba7c3b2cf1c5..1a9a0a9e4b3d 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -70,6 +70,37 @@ static int __compare_inode_defrag(struct inode_defrag *defrag1,
> >  		return 0;
> >  }
> >  
> > +int btrfs_inode_lock(struct inode *inode, int ilock_flags)
> > +void btrfs_inode_unlock(struct inode *inode, int ilock_flags)
> 
> And the helpers should be in inode.c, not file.c

Yes, inode.c is more appropriate. Perhaps I should change other calls to
inode_lock(inode) to btrfs_inode_lock(inode, 0) for uniformity.

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock()
  2020-06-25 17:34     ` Goldwyn Rodrigues
@ 2020-06-26 11:34       ` Hans van Kranenburg
  2020-06-26 11:59         ` David Sterba
  0 siblings, 1 reply; 20+ messages in thread
From: Hans van Kranenburg @ 2020-06-26 11:34 UTC (permalink / raw)
  To: Goldwyn Rodrigues, dsterba, linux-btrfs, Goldwyn Rodrigues

On 6/25/20 7:34 PM, Goldwyn Rodrigues wrote:
> On 18:19 24/06, David Sterba wrote:
>> On Mon, Jun 22, 2020 at 11:20:13AM -0500, Goldwyn Rodrigues wrote:
>>> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>>>
>>> btrfs_inode_lock/unlock() acquires the inode->i_rwsem depending on the
>>> flags passed. ilock_flags determines the type of lock to be taken:
>>>
>>> BTRFS_ILOCK_SHARED - for shared locks, for possible parallel DIO
>>> BTRFS_ILOCK_TRY - for the RWF_NOWAIT sequence
>>>
>>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
>>> ---
>>>  fs/btrfs/ctree.h |  8 ++++++++
>>>  fs/btrfs/file.c  | 51 +++++++++++++++++++++++++++++++++++++++---------
>>>  2 files changed, 50 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>>> index 161533040978..346fea668ca0 100644
>>> --- a/fs/btrfs/ctree.h
>>> +++ b/fs/btrfs/ctree.h
>>> @@ -2953,6 +2953,14 @@ void btrfs_update_ioctl_balance_args(struct btrfs_fs_info *fs_info,
>>>  			       struct btrfs_ioctl_balance_args *bargs);
>>>  
>>>  /* file.c */
>>> +
>>> +/* ilock flags definition */
>>> +#define BTRFS_ILOCK_SHARED	0x1
>>> +#define BTRFS_ILOCK_TRY 	0x2
>>
>> Please use enums and add them to a new file inode.h.
> 
> These are bitwise flags.

I suspect that in that case it's more common to do:

#define BTRFS_ILOCK_SHARED	(1 << 0)
#define BTRFS_ILOCK_TRY 	(1 << 1)

> Wouldn't it be ugly if you have to set the flag
> with:
> 
> flags |= (1 << BTRFS_ILOCK_TRY);
> 
> rather than
> 
> flags |= BTRFS_ILOCK_TRY;
> 
>>
>>> +int btrfs_inode_lock(struct inode *inode, int ilock_flags);
>>> +void btrfs_inode_unlock(struct inode *inode, int ilock_flags);
>>> +
>>>  int __init btrfs_auto_defrag_init(void);
>>>  void __cold btrfs_auto_defrag_exit(void);
>>>  int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans,
>>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>>> index ba7c3b2cf1c5..1a9a0a9e4b3d 100644
>>> --- a/fs/btrfs/file.c
>>> +++ b/fs/btrfs/file.c
>>> @@ -70,6 +70,37 @@ static int __compare_inode_defrag(struct inode_defrag *defrag1,
>>>  		return 0;
>>>  }
>>>  
>>> +int btrfs_inode_lock(struct inode *inode, int ilock_flags)
>>> +void btrfs_inode_unlock(struct inode *inode, int ilock_flags)
>>
>> And the helpers should be in inode.c, not file.c
> 
> Yes, inode.c is more appropriate. Perhaps I should change other calls to
> inode_lock(inode) to btrfs_inode_lock(inode, 0) for uniformity.
> 

Hans

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock()
  2020-06-26 11:34       ` Hans van Kranenburg
@ 2020-06-26 11:59         ` David Sterba
  0 siblings, 0 replies; 20+ messages in thread
From: David Sterba @ 2020-06-26 11:59 UTC (permalink / raw)
  To: Hans van Kranenburg
  Cc: Goldwyn Rodrigues, dsterba, linux-btrfs, Goldwyn Rodrigues

On Fri, Jun 26, 2020 at 01:34:05PM +0200, Hans van Kranenburg wrote:
> On 6/25/20 7:34 PM, Goldwyn Rodrigues wrote:
> > On 18:19 24/06, David Sterba wrote:
> >> On Mon, Jun 22, 2020 at 11:20:13AM -0500, Goldwyn Rodrigues wrote:
> >>> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> >>>
> >>> btrfs_inode_lock/unlock() acquires the inode->i_rwsem depending on the
> >>> flags passed. ilock_flags determines the type of lock to be taken:
> >>>
> >>> BTRFS_ILOCK_SHARED - for shared locks, for possible parallel DIO
> >>> BTRFS_ILOCK_TRY - for the RWF_NOWAIT sequence
> >>>
> >>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> >>> ---
> >>>  fs/btrfs/ctree.h |  8 ++++++++
> >>>  fs/btrfs/file.c  | 51 +++++++++++++++++++++++++++++++++++++++---------
> >>>  2 files changed, 50 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> >>> index 161533040978..346fea668ca0 100644
> >>> --- a/fs/btrfs/ctree.h
> >>> +++ b/fs/btrfs/ctree.h
> >>> @@ -2953,6 +2953,14 @@ void btrfs_update_ioctl_balance_args(struct btrfs_fs_info *fs_info,
> >>>  			       struct btrfs_ioctl_balance_args *bargs);
> >>>  
> >>>  /* file.c */
> >>> +
> >>> +/* ilock flags definition */
> >>> +#define BTRFS_ILOCK_SHARED	0x1
> >>> +#define BTRFS_ILOCK_TRY 	0x2
> >>
> >> Please use enums and add them to a new file inode.h.
> > 
> > These are bitwise flags.
> 
> I suspect that in that case it's more common to do:
> 
> #define BTRFS_ILOCK_SHARED	(1 << 0)
> #define BTRFS_ILOCK_TRY 	(1 << 1)

Yeah, that looks more clear that it's meant for bitwise operations.

We don't have any other enum for bitwise flags so define is ok for now.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/8] btrfs: Move FS error state bit early during write
  2020-06-22 16:20 ` [PATCH 2/8] btrfs: Move FS error state bit early during write Goldwyn Rodrigues
@ 2020-06-30  8:22   ` David Sterba
  2020-06-30 14:36     ` Goldwyn Rodrigues
  0 siblings, 1 reply; 20+ messages in thread
From: David Sterba @ 2020-06-30  8:22 UTC (permalink / raw)
  To: Goldwyn Rodrigues; +Cc: linux-btrfs, Goldwyn Rodrigues

On Mon, Jun 22, 2020 at 11:20:11AM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> We don't need the inode locked to check for the error bit. Move the
> check early.

This lacks explanation why it's not needed. I've checked history of the
code and it seems the error state flags has been after all other checks
since long, starting in acce952b0263825d. But it's part of a bigger
patch and is not specific about this call site.

If something checks state and changes the location, we need to make sure
the state is not affected by code between the old and new location.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock()
  2020-06-22 16:20 ` [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock() Goldwyn Rodrigues
  2020-06-24 16:19   ` David Sterba
@ 2020-06-30  8:34   ` David Sterba
  1 sibling, 0 replies; 20+ messages in thread
From: David Sterba @ 2020-06-30  8:34 UTC (permalink / raw)
  To: Goldwyn Rodrigues; +Cc: linux-btrfs, Goldwyn Rodrigues

On Mon, Jun 22, 2020 at 11:20:13AM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> btrfs_inode_lock/unlock() acquires the inode->i_rwsem depending on the
> flags passed. ilock_flags determines the type of lock to be taken:
> 
> BTRFS_ILOCK_SHARED - for shared locks, for possible parallel DIO
> BTRFS_ILOCK_TRY - for the RWF_NOWAIT sequence
> 
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
>  fs/btrfs/ctree.h |  8 ++++++++
>  fs/btrfs/file.c  | 51 +++++++++++++++++++++++++++++++++++++++---------
>  2 files changed, 50 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 161533040978..346fea668ca0 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -2953,6 +2953,14 @@ void btrfs_update_ioctl_balance_args(struct btrfs_fs_info *fs_info,
>  			       struct btrfs_ioctl_balance_args *bargs);
>  
>  /* file.c */
> +
> +/* ilock flags definition */
> +#define BTRFS_ILOCK_SHARED	0x1
> +#define BTRFS_ILOCK_TRY 	0x2
> +
> +int btrfs_inode_lock(struct inode *inode, int ilock_flags);
> +void btrfs_inode_unlock(struct inode *inode, int ilock_flags);

There's another lock in btrfs_inode, and the locking functions added
here have the 'btrfs_' prefix, yet take the VFS inode structure. This is
confusing.

The flags are not btrfs-specific so it could be even a generic VFS
helper but for now I think it can be in fs/btrfs until it's finalized.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/8] btrfs: Use shared inode lock for direct writes within EOF
  2020-06-22 16:20 ` [PATCH 6/8] btrfs: Use shared inode lock for direct writes within EOF Goldwyn Rodrigues
@ 2020-06-30  8:42   ` David Sterba
  2020-06-30 14:24     ` Goldwyn Rodrigues
  2020-06-30  9:37   ` Filipe Manana
  1 sibling, 1 reply; 20+ messages in thread
From: David Sterba @ 2020-06-30  8:42 UTC (permalink / raw)
  To: Goldwyn Rodrigues; +Cc: linux-btrfs, Goldwyn Rodrigues

On Mon, Jun 22, 2020 at 11:20:15AM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> This is to parallelize direct writes within EOF or with direct I/O
> reads. This covers the race with truncate() accidentally increasing the
> filesize.

Please describe the race condition in more detail and how the DIO/EOF
parallelization is supposed to work.

> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
>  fs/btrfs/file.c | 25 +++++++------------------
>  1 file changed, 7 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index aa6be931620b..c446a4aeb867 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1957,12 +1957,18 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
>  	loff_t endbyte;
>  	int err;
>  	size_t count = 0;
> -	bool relock = false;
>  	int flags = IOMAP_DIOF_PGINVALID_FAIL;
>  	int ilock_flags = 0;
>  
>  	if (iocb->ki_flags & IOCB_NOWAIT)
>  		ilock_flags |= BTRFS_ILOCK_TRY;
> +	/*
> +	 * If the write DIO within EOF,  use a shared lock
> +	 */
> +	if (pos + count <= i_size_read(inode))

Inode size is now read outside of the inode lock, so it could change
until the lock is taken a few lines below.

> +		ilock_flags |= BTRFS_ILOCK_SHARED;
> +	else if (iocb->ki_flags & IOCB_NOWAIT)
> +		return -EAGAIN;
>  
>  	err = btrfs_inode_lock(inode, ilock_flags);

Is it necessary to revalidate that 'pos + count < i_size' still holds
when the lock was taken as SHARED?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/8] btrfs: Use shared inode lock for direct writes within EOF
  2020-06-22 16:20 ` [PATCH 6/8] btrfs: Use shared inode lock for direct writes within EOF Goldwyn Rodrigues
  2020-06-30  8:42   ` David Sterba
@ 2020-06-30  9:37   ` Filipe Manana
  2020-06-30 14:20     ` Goldwyn Rodrigues
  1 sibling, 1 reply; 20+ messages in thread
From: Filipe Manana @ 2020-06-30  9:37 UTC (permalink / raw)
  To: Goldwyn Rodrigues; +Cc: linux-btrfs, Goldwyn Rodrigues

On Mon, Jun 22, 2020 at 5:22 PM Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
>
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>
> This is to parallelize direct writes within EOF or with direct I/O
> reads. This covers the race with truncate() accidentally increasing the
> filesize.
>
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
>  fs/btrfs/file.c | 25 +++++++------------------
>  1 file changed, 7 insertions(+), 18 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index aa6be931620b..c446a4aeb867 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1957,12 +1957,18 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
>         loff_t endbyte;
>         int err;
>         size_t count = 0;
> -       bool relock = false;
>         int flags = IOMAP_DIOF_PGINVALID_FAIL;
>         int ilock_flags = 0;
>
>         if (iocb->ki_flags & IOCB_NOWAIT)
>                 ilock_flags |= BTRFS_ILOCK_TRY;
> +       /*
> +        * If the write DIO within EOF,  use a shared lock
> +        */
> +       if (pos + count <= i_size_read(inode))
> +               ilock_flags |= BTRFS_ILOCK_SHARED;
> +       else if (iocb->ki_flags & IOCB_NOWAIT)
> +               return -EAGAIN;

In the next iteration, please rebase the patchset on a more recent misc-next.

That hunk returning -EAGAIN is buggy and was removed a couple weeks
ago in a patchset fixing several bugs with NOWAIT writes.

Thanks.

>
>         err = btrfs_inode_lock(inode, ilock_flags);
>         if (err < 0)
> @@ -1975,20 +1981,6 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
>         if (check_direct_IO(fs_info, from, pos))
>                 goto buffered;
>
> -       count = iov_iter_count(from);
> -       /*
> -        * If the write DIO is beyond the EOF, we need update the isize, but it
> -        * is protected by i_mutex. So we can not unlock the i_mutex at this
> -        * case.
> -        */
> -       if (pos + count <= inode->i_size) {
> -               inode_unlock(inode);
> -               relock = true;
> -       } else if (iocb->ki_flags & IOCB_NOWAIT) {
> -               err = -EAGAIN;
> -               goto out;
> -       }
> -
>         if (is_sync_kiocb(iocb))
>                 flags |= IOMAP_DIOF_WAIT_FOR_COMPLETION;
>
> @@ -1997,9 +1989,6 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
>                                flags);
>         up_read(&BTRFS_I(inode)->dio_sem);
>
> -       if (relock)
> -               inode_lock(inode);
> -
>         if (written < 0 || !iov_iter_count(from)) {
>                 err = written;
>                 goto error;
> --
> 2.25.0
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/8] btrfs: Use shared inode lock for direct writes within EOF
  2020-06-30  9:37   ` Filipe Manana
@ 2020-06-30 14:20     ` Goldwyn Rodrigues
  0 siblings, 0 replies; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-30 14:20 UTC (permalink / raw)
  To: Filipe Manana; +Cc: linux-btrfs

On 10:37 30/06, Filipe Manana wrote:
> On Mon, Jun 22, 2020 at 5:22 PM Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
> >
> > From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> >
> > This is to parallelize direct writes within EOF or with direct I/O
> > reads. This covers the race with truncate() accidentally increasing the
> > filesize.
> >
> > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > ---
> >  fs/btrfs/file.c | 25 +++++++------------------
> >  1 file changed, 7 insertions(+), 18 deletions(-)
> >
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index aa6be931620b..c446a4aeb867 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -1957,12 +1957,18 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
> >         loff_t endbyte;
> >         int err;
> >         size_t count = 0;
> > -       bool relock = false;
> >         int flags = IOMAP_DIOF_PGINVALID_FAIL;
> >         int ilock_flags = 0;
> >
> >         if (iocb->ki_flags & IOCB_NOWAIT)
> >                 ilock_flags |= BTRFS_ILOCK_TRY;
> > +       /*
> > +        * If the write DIO within EOF,  use a shared lock
> > +        */
> > +       if (pos + count <= i_size_read(inode))
> > +               ilock_flags |= BTRFS_ILOCK_SHARED;
> > +       else if (iocb->ki_flags & IOCB_NOWAIT)
> > +               return -EAGAIN;
> 
> In the next iteration, please rebase the patchset on a more recent misc-next.
> 
> That hunk returning -EAGAIN is buggy and was removed a couple weeks
> ago in a patchset fixing several bugs with NOWAIT writes.
> 

I worked against the vanilla and it is already ported in the git tree
since your patch was added in the meantime.

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/8] btrfs: Use shared inode lock for direct writes within EOF
  2020-06-30  8:42   ` David Sterba
@ 2020-06-30 14:24     ` Goldwyn Rodrigues
  0 siblings, 0 replies; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-30 14:24 UTC (permalink / raw)
  To: dsterba, linux-btrfs, Goldwyn Rodrigues

On 10:42 30/06, David Sterba wrote:
> On Mon, Jun 22, 2020 at 11:20:15AM -0500, Goldwyn Rodrigues wrote:
> > From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > 
> > This is to parallelize direct writes within EOF or with direct I/O
> > reads. This covers the race with truncate() accidentally increasing the
> > filesize.
> 
> Please describe the race condition in more detail and how the DIO/EOF
> parallelization is supposed to work.
> 
> > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > ---
> >  fs/btrfs/file.c | 25 +++++++------------------
> >  1 file changed, 7 insertions(+), 18 deletions(-)
> > 
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index aa6be931620b..c446a4aeb867 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -1957,12 +1957,18 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
> >  	loff_t endbyte;
> >  	int err;
> >  	size_t count = 0;
> > -	bool relock = false;
> >  	int flags = IOMAP_DIOF_PGINVALID_FAIL;
> >  	int ilock_flags = 0;
> >  
> >  	if (iocb->ki_flags & IOCB_NOWAIT)
> >  		ilock_flags |= BTRFS_ILOCK_TRY;
> > +	/*
> > +	 * If the write DIO within EOF,  use a shared lock
> > +	 */
> > +	if (pos + count <= i_size_read(inode))
> 
> Inode size is now read outside of the inode lock, so it could change
> until the lock is taken a few lines below.

Good catch. This should be re-checked in btrfs_write_check().
Thanks.

> 
> > +		ilock_flags |= BTRFS_ILOCK_SHARED;
> > +	else if (iocb->ki_flags & IOCB_NOWAIT)
> > +		return -EAGAIN;
> >  
> >  	err = btrfs_inode_lock(inode, ilock_flags);
> 
> Is it necessary to revalidate that 'pos + count < i_size' still holds
> when the lock was taken as SHARED?

Yes, and convert the lock, if not. 

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/8] btrfs: Move FS error state bit early during write
  2020-06-30  8:22   ` David Sterba
@ 2020-06-30 14:36     ` Goldwyn Rodrigues
  0 siblings, 0 replies; 20+ messages in thread
From: Goldwyn Rodrigues @ 2020-06-30 14:36 UTC (permalink / raw)
  To: dsterba, linux-btrfs, Goldwyn Rodrigues

On 10:22 30/06, David Sterba wrote:
> On Mon, Jun 22, 2020 at 11:20:11AM -0500, Goldwyn Rodrigues wrote:
> > From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > 
> > We don't need the inode locked to check for the error bit. Move the
> > check early.
> 
> This lacks explanation why it's not needed. I've checked history of the
> code and it seems the error state flags has been after all other checks
> since long, starting in acce952b0263825d. But it's part of a bigger
> patch and is not specific about this call site.
> 
> If something checks state and changes the location, we need to make sure
> the state is not affected by code between the old and new location.

This is a filesystem state bit as opposed to a file level state bit. The
bit states that you don't let writes proceed because of a filesystem
error. Why would you need a inode level lock for that? It is better to
return -EROFS early enough rather than take a lock, check for filesystem
failure, release the lock and then return an error.

The other check is in start_transaction, which perhaps is for
writes-in-flight.

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2020-06-30 14:36 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-22 16:20 [PATCH 0/8] Rearrange inode locking Goldwyn Rodrigues
2020-06-22 16:20 ` [PATCH 1/8] btrfs: Move pos increment and pagecache extension to btrfs_buffered_write() Goldwyn Rodrigues
2020-06-22 16:20 ` [PATCH 2/8] btrfs: Move FS error state bit early during write Goldwyn Rodrigues
2020-06-30  8:22   ` David Sterba
2020-06-30 14:36     ` Goldwyn Rodrigues
2020-06-22 16:20 ` [PATCH 3/8] btrfs: Introduce btrfs_write_check() Goldwyn Rodrigues
2020-06-22 16:20 ` [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock() Goldwyn Rodrigues
2020-06-24 16:19   ` David Sterba
2020-06-25 17:34     ` Goldwyn Rodrigues
2020-06-26 11:34       ` Hans van Kranenburg
2020-06-26 11:59         ` David Sterba
2020-06-30  8:34   ` David Sterba
2020-06-22 16:20 ` [PATCH 5/8] btrfs: Push inode locking and unlocking in buffered/direct write Goldwyn Rodrigues
2020-06-22 16:20 ` [PATCH 6/8] btrfs: Use shared inode lock for direct writes within EOF Goldwyn Rodrigues
2020-06-30  8:42   ` David Sterba
2020-06-30 14:24     ` Goldwyn Rodrigues
2020-06-30  9:37   ` Filipe Manana
2020-06-30 14:20     ` Goldwyn Rodrigues
2020-06-22 16:20 ` [PATCH 7/8] btrfs: Remove dio_sem Goldwyn Rodrigues
2020-06-22 16:20 ` [PATCH 8/8] btrfs: Move generic_write_sync() to btrfs_buffered_write() Goldwyn Rodrigues

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.