* [PATCH v2 0/2] Fix locking for btrfs direct writes
@ 2020-12-15 18:06 Goldwyn Rodrigues
  2020-12-15 18:06 ` [PATCH 1/2] iomap: Separate out generic_write_sync() from iomap_dio_complete() Goldwyn Rodrigues
  2020-12-15 18:06 ` [PATCH 2/2] btrfs: Make btrfs_direct_write atomic with respect to inode_lock Goldwyn Rodrigues
  0 siblings, 2 replies; 8+ messages in thread
From: Goldwyn Rodrigues @ 2020-12-15 18:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-btrfs; +Cc: darrick.wong, hch, nborisov, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

btrfs direct write takes the inode lock to perform the direct write.
On failure or an incomplete write, it falls back to a buffered write.
Before starting the buffered write, it releases the inode lock and then
reacquires it. This may lead to corruption if another process writes
around the same offset between the unlock and the relock. These patches
change the flow so that the lock is taken once before the write and
released only after the I/O is complete.


Goldwyn Rodrigues (2):
  iomap: Separate out generic_write_sync() from iomap_dio_complete()
  btrfs: Make btrfs_direct_write atomic with respect to inode_lock

 fs/btrfs/file.c       | 69 +++++++++++++++++++++++++------------------
 fs/iomap/direct-io.c  | 16 ++++++++--
 include/linux/iomap.h |  2 +-
 3 files changed, 54 insertions(+), 33 deletions(-)

-- 
2.29.2
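
The locking change described in the cover letter can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct model_inode`, `model_lock()`, `model_write_old()` and the other `model_*` names are hypothetical stand-ins, not btrfs or iomap symbols, and the "lock" is a plain flag that counts how often it is released. The old flow drops the lock between the direct phase and the buffered fallback; the new flow holds it across both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode with its lock. */
struct model_inode {
	bool locked;
	int unlock_count;	/* how many times the lock was released */
};

static void model_lock(struct model_inode *ino)
{
	assert(!ino->locked);
	ino->locked = true;
}

static void model_unlock(struct model_inode *ino)
{
	assert(ino->locked);
	ino->locked = false;
	ino->unlock_count++;
}

/* Direct phase: writes at most dio_limit bytes, must run under the lock. */
static size_t model_direct_phase(const struct model_inode *ino, size_t len,
				 size_t dio_limit)
{
	assert(ino->locked);
	return len < dio_limit ? len : dio_limit;
}

/* Buffered fallback: writes the remainder, must run under the lock. */
static size_t model_buffered_phase(const struct model_inode *ino, size_t len)
{
	assert(ino->locked);
	return len;
}

/* Old flow: the lock is released and retaken before the fallback. */
static size_t model_write_old(struct model_inode *ino, size_t len,
			      size_t dio_limit)
{
	size_t done;

	model_lock(ino);
	done = model_direct_phase(ino, len, dio_limit);
	model_unlock(ino);	/* window in which another writer can get in */
	if (done < len) {
		model_lock(ino);
		done += model_buffered_phase(ino, len - done);
		model_unlock(ino);
	}
	return done;
}

/* New flow: one lock hold covers the direct phase and the fallback. */
static size_t model_write_new(struct model_inode *ino, size_t len,
			      size_t dio_limit)
{
	size_t done;

	model_lock(ino);
	done = model_direct_phase(ino, len, dio_limit);
	if (done < len)
		done += model_buffered_phase(ino, len - done);
	model_unlock(ino);
	return done;
}
```

In this model an 8192-byte write with a 4096-byte direct limit releases the lock twice in the old flow and once in the new one; the extra release is the window the series is meant to close.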



* [PATCH 1/2] iomap: Separate out generic_write_sync() from iomap_dio_complete()
  2020-12-15 18:06 [PATCH v2 0/2] Fix locking for btrfs direct writes Goldwyn Rodrigues
@ 2020-12-15 18:06 ` Goldwyn Rodrigues
  2020-12-15 21:24     ` kernel test robot
  2020-12-15 22:16   ` Dave Chinner
  2020-12-15 18:06 ` [PATCH 2/2] btrfs: Make btrfs_direct_write atomic with respect to inode_lock Goldwyn Rodrigues
  1 sibling, 2 replies; 8+ messages in thread
From: Goldwyn Rodrigues @ 2020-12-15 18:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-btrfs; +Cc: darrick.wong, hch, nborisov, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

This introduces a separate function __iomap_dio_complete() which
completes the Direct I/O without performing the write sync.

Filesystems such as btrfs which require an inode_lock for sync can call
__iomap_dio_complete() and must perform sync on their own after unlock.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/iomap/direct-io.c  | 16 +++++++++++++---
 include/linux/iomap.h |  2 +-
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 933f234d5bec..11a108f39fd9 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -76,7 +76,7 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
 		dio->submit.cookie = submit_bio(bio);
 }
 
-ssize_t iomap_dio_complete(struct iomap_dio *dio)
+ssize_t __iomap_dio_complete(struct iomap_dio *dio)
 {
 	const struct iomap_dio_ops *dops = dio->dops;
 	struct kiocb *iocb = dio->iocb;
@@ -119,18 +119,28 @@ ssize_t iomap_dio_complete(struct iomap_dio *dio)
 	}
 
 	inode_dio_end(file_inode(iocb->ki_filp));
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(__iomap_dio_complete);
+
+ssize_t iomap_dio_complete(struct iomap_dio *dio)
+{
+	ssize_t ret;
+
+	ret = __iomap_dio_complete(dio);
 	/*
 	 * If this is a DSYNC write, make sure we push it to stable storage now
 	 * that we've written data.
 	 */
 	if (ret > 0 && (dio->flags & IOMAP_DIO_NEED_SYNC))
-		ret = generic_write_sync(iocb, ret);
+		ret = generic_write_sync(dio->iocb, ret);
 
 	kfree(dio);
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(iomap_dio_complete);
+
 
 static void iomap_dio_complete_work(struct work_struct *work)
 {
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 5bd3cac4df9c..5785dc0b8ec5 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -262,7 +262,7 @@ ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 struct iomap_dio *__iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		const struct iomap_ops *ops, const struct iomap_dio_ops *dops,
 		bool wait_for_completion);
-ssize_t iomap_dio_complete(struct iomap_dio *dio);
+ssize_t __iomap_dio_complete(struct iomap_dio *dio);
 int iomap_dio_iopoll(struct kiocb *kiocb, bool spin);
 
 #ifdef CONFIG_SWAP
-- 
2.29.2



* [PATCH 2/2] btrfs: Make btrfs_direct_write atomic with respect to inode_lock
  2020-12-15 18:06 [PATCH v2 0/2] Fix locking for btrfs direct writes Goldwyn Rodrigues
  2020-12-15 18:06 ` [PATCH 1/2] iomap: Separate out generic_write_sync() from iomap_dio_complete() Goldwyn Rodrigues
@ 2020-12-15 18:06 ` Goldwyn Rodrigues
  2020-12-15 22:13   ` Darrick J. Wong
  1 sibling, 1 reply; 8+ messages in thread
From: Goldwyn Rodrigues @ 2020-12-15 18:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-btrfs; +Cc: darrick.wong, hch, nborisov, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

btrfs_direct_write() falls back to a buffered write when btrfs is not
able to perform or complete a direct I/O. During the fallback, the
inode lock is released and reacquired. This does not guarantee the
atomicity of the entire write, since another writer can acquire the
lock between the unlock and the relock.

__btrfs_buffered_write() is used to perform the fallback write; it
performs the write without acquiring the lock or running the write checks.

fa54fc76db94 ("btrfs: push inode locking and unlocking into buffered/direct write")
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/file.c | 69 ++++++++++++++++++++++++++++---------------------
 1 file changed, 40 insertions(+), 29 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 0e41459b8de6..9fc768b951f1 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1638,11 +1638,11 @@ static int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from,
 	return 0;
 }
 
-static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
+static noinline ssize_t __btrfs_buffered_write(struct kiocb *iocb,
 					       struct iov_iter *i)
 {
 	struct file *file = iocb->ki_filp;
-	loff_t pos;
+	loff_t pos = iocb->ki_pos;
 	struct inode *inode = file_inode(file);
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct page **pages = NULL;
@@ -1656,24 +1656,9 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 	bool only_release_metadata = false;
 	bool force_page_uptodate = false;
 	loff_t old_isize = i_size_read(inode);
-	unsigned int ilock_flags = 0;
-
-	if (iocb->ki_flags & IOCB_NOWAIT)
-		ilock_flags |= BTRFS_ILOCK_TRY;
-
-	ret = btrfs_inode_lock(inode, ilock_flags);
-	if (ret < 0)
-		return ret;
-
-	ret = generic_write_checks(iocb, i);
-	if (ret <= 0)
-		goto out;
 
-	ret = btrfs_write_check(iocb, i, ret);
-	if (ret < 0)
-		goto out;
+	lockdep_assert_held(&inode->i_rwsem);
 
-	pos = iocb->ki_pos;
 	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
 			PAGE_SIZE / (sizeof(struct page *)));
 	nrptrs = min(nrptrs, current->nr_dirtied_pause - current->nr_dirtied);
@@ -1877,10 +1862,37 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 		iocb->ki_pos += num_written;
 	}
 out:
-	btrfs_inode_unlock(inode, ilock_flags);
 	return num_written ? num_written : ret;
 }
 
+static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
+					       struct iov_iter *i)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+	unsigned int ilock_flags = 0;
+	ssize_t ret;
+
+	if (iocb->ki_flags & IOCB_NOWAIT)
+		ilock_flags |= BTRFS_ILOCK_TRY;
+
+	ret = btrfs_inode_lock(inode, ilock_flags);
+	if (ret < 0)
+		return ret;
+
+	ret = generic_write_checks(iocb, i);
+	if (ret <= 0)
+		goto out;
+
+	ret = btrfs_write_check(iocb, i, ret);
+	if (ret < 0)
+		goto out;
+
+	ret = __btrfs_buffered_write(iocb, i);
+out:
+	btrfs_inode_unlock(inode, ilock_flags);
+	return ret;
+}
+
 static ssize_t check_direct_IO(struct btrfs_fs_info *fs_info,
 			       const struct iov_iter *iter, loff_t offset)
 {
@@ -1927,10 +1939,8 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 	}
 
 	err = btrfs_write_check(iocb, from, err);
-	if (err < 0) {
-		btrfs_inode_unlock(inode, ilock_flags);
+	if (err < 0)
 		goto out;
-	}
 
 	pos = iocb->ki_pos;
 	/*
@@ -1944,22 +1954,19 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 		goto relock;
 	}
 
-	if (check_direct_IO(fs_info, from, pos)) {
-		btrfs_inode_unlock(inode, ilock_flags);
+	if (check_direct_IO(fs_info, from, pos))
 		goto buffered;
-	}
 
 	dio = __iomap_dio_rw(iocb, from, &btrfs_dio_iomap_ops,
 			     &btrfs_dio_ops, is_sync_kiocb(iocb));
 
-	btrfs_inode_unlock(inode, ilock_flags);
-
 	if (IS_ERR_OR_NULL(dio)) {
 		err = PTR_ERR_OR_ZERO(dio);
 		if (err < 0 && err != -ENOTBLK)
 			goto out;
 	} else {
-		written = iomap_dio_complete(dio);
+		written = __iomap_dio_complete(dio);
+		kfree(dio);
 	}
 
 	if (written < 0 || !iov_iter_count(from)) {
@@ -1969,7 +1976,7 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 
 buffered:
 	pos = iocb->ki_pos;
-	written_buffered = btrfs_buffered_write(iocb, from);
+	written_buffered = __btrfs_buffered_write(iocb, from);
 	if (written_buffered < 0) {
 		err = written_buffered;
 		goto out;
@@ -1990,6 +1997,10 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 	invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT,
 				 endbyte >> PAGE_SHIFT);
 out:
+	btrfs_inode_unlock(inode, ilock_flags);
+	if (written > 0)
+		generic_write_sync(iocb, written);
+
 	return written ? written : err;
 }
 
-- 
2.29.2



* Re: [PATCH 1/2] iomap: Separate out generic_write_sync() from iomap_dio_complete()
  2020-12-15 18:06 ` [PATCH 1/2] iomap: Separate out generic_write_sync() from iomap_dio_complete() Goldwyn Rodrigues
@ 2020-12-15 21:24     ` kernel test robot
  2020-12-15 22:16   ` Dave Chinner
  1 sibling, 0 replies; 8+ messages in thread
From: kernel test robot @ 2020-12-15 21:24 UTC (permalink / raw)
  To: Goldwyn Rodrigues, linux-fsdevel, linux-btrfs
  Cc: kbuild-all, darrick.wong, hch, nborisov, Goldwyn Rodrigues

[-- Attachment #1: Type: text/plain, Size: 2483 bytes --]

Hi Goldwyn,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on kdave/for-next]
[also build test WARNING on v5.10 next-20201215]
[cannot apply to xfs-linux/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Goldwyn-Rodrigues/Fix-locking-for-btrfs-direct-writes/20201216-021312
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
config: sparc-randconfig-s031-20201215 (attached as .config)
compiler: sparc-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.3-184-g1b896707-dirty
        # https://github.com/0day-ci/linux/commit/4706fd8a8832b4948c25abc5fec38a017704d828
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Goldwyn-Rodrigues/Fix-locking-for-btrfs-direct-writes/20201216-021312
        git checkout 4706fd8a8832b4948c25abc5fec38a017704d828
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=sparc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> fs/iomap/direct-io.c:127:9: warning: no previous prototype for 'iomap_dio_complete' [-Wmissing-prototypes]
     127 | ssize_t iomap_dio_complete(struct iomap_dio *dio)
         |         ^~~~~~~~~~~~~~~~~~

"sparse warnings: (new ones prefixed by >>)"


vim +/iomap_dio_complete +127 fs/iomap/direct-io.c

   126	
 > 127	ssize_t iomap_dio_complete(struct iomap_dio *dio)
   128	{
   129		ssize_t ret;
   130	
   131		ret = __iomap_dio_complete(dio);
   132		/*
   133		 * If this is a DSYNC write, make sure we push it to stable storage now
   134		 * that we've written data.
   135		 */
   136		if (ret > 0 && (dio->flags & IOMAP_DIO_NEED_SYNC))
   137			ret = generic_write_sync(dio->iocb, ret);
   138	
   139		kfree(dio);
   140	
   141		return ret;
   142	}
   143	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
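
The warning above appears because patch 1 replaces the `iomap_dio_complete()` declaration in include/linux/iomap.h while the function itself stays non-static. A minimal sketch of the two usual ways to satisfy -Wmissing-prototypes follows; `demo_complete()` and `demo_complete_local()` are hypothetical names, and which option suits the patch depends on whether callers outside the file remain, which the report does not say.

```c
/* Hypothetical names; not the iomap functions themselves. */

/* Option 1: keep a prototype visible, normally in a shared header. */
long demo_complete(long written);

long demo_complete(long written)
{
	return written;
}

/* Option 2: make the function file-local, so no prototype is expected. */
static long demo_complete_local(long written)
{
	return written;
}
```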

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25723 bytes --]


* Re: [PATCH 2/2] btrfs: Make btrfs_direct_write atomic with respect to inode_lock
  2020-12-15 18:06 ` [PATCH 2/2] btrfs: Make btrfs_direct_write atomic with respect to inode_lock Goldwyn Rodrigues
@ 2020-12-15 22:13   ` Darrick J. Wong
  2020-12-16 21:07     ` Goldwyn Rodrigues
  0 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2020-12-15 22:13 UTC (permalink / raw)
  To: Goldwyn Rodrigues
  Cc: linux-fsdevel, linux-btrfs, hch, nborisov, Goldwyn Rodrigues

On Tue, Dec 15, 2020 at 12:06:36PM -0600, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> btrfs_direct_write() falls back to buffered write in case btrfs is not
> able to perform or complete a direct I/O. During the fallback
> inode lock is unlocked and relocked. This does not guarantee the
> atomicity of the entire write since the lock can be acquired by another
> write between unlock and relock.
> 
> __btrfs_buffered_write() is used to perform the direct fallback write,
> which performs the write without acquiring the lock or checks.

Er... can you grab the inode lock before deciding which of the IO
path(s) you're going to take?  Then you'd always have an atomic write
even if fallback happens.

(Also vaguely wondering why this needs even more slicing and dicing of
the iomap directio functions...)

--D

> 
> fa54fc76db94 ("btrfs: push inode locking and unlocking into buffered/direct write")
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
>  fs/btrfs/file.c | 69 ++++++++++++++++++++++++++++---------------------
>  1 file changed, 40 insertions(+), 29 deletions(-)
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 0e41459b8de6..9fc768b951f1 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1638,11 +1638,11 @@ static int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from,
>  	return 0;
>  }
>  
> -static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
> +static noinline ssize_t __btrfs_buffered_write(struct kiocb *iocb,
>  					       struct iov_iter *i)
>  {
>  	struct file *file = iocb->ki_filp;
> -	loff_t pos;
> +	loff_t pos = iocb->ki_pos;
>  	struct inode *inode = file_inode(file);
>  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>  	struct page **pages = NULL;
> @@ -1656,24 +1656,9 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>  	bool only_release_metadata = false;
>  	bool force_page_uptodate = false;
>  	loff_t old_isize = i_size_read(inode);
> -	unsigned int ilock_flags = 0;
> -
> -	if (iocb->ki_flags & IOCB_NOWAIT)
> -		ilock_flags |= BTRFS_ILOCK_TRY;
> -
> -	ret = btrfs_inode_lock(inode, ilock_flags);
> -	if (ret < 0)
> -		return ret;
> -
> -	ret = generic_write_checks(iocb, i);
> -	if (ret <= 0)
> -		goto out;
>  
> -	ret = btrfs_write_check(iocb, i, ret);
> -	if (ret < 0)
> -		goto out;
> +	lockdep_assert_held(&inode->i_rwsem);
>  
> -	pos = iocb->ki_pos;
>  	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
>  			PAGE_SIZE / (sizeof(struct page *)));
>  	nrptrs = min(nrptrs, current->nr_dirtied_pause - current->nr_dirtied);
> @@ -1877,10 +1862,37 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>  		iocb->ki_pos += num_written;
>  	}
>  out:
> -	btrfs_inode_unlock(inode, ilock_flags);
>  	return num_written ? num_written : ret;
>  }
>  
> +static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
> +					       struct iov_iter *i)
> +{
> +	struct inode *inode = file_inode(iocb->ki_filp);
> +	unsigned int ilock_flags = 0;
> +	ssize_t ret;
> +
> +	if (iocb->ki_flags & IOCB_NOWAIT)
> +		ilock_flags |= BTRFS_ILOCK_TRY;
> +
> +	ret = btrfs_inode_lock(inode, ilock_flags);
> +	if (ret < 0)
> +		return ret;
> +
> +	ret = generic_write_checks(iocb, i);
> +	if (ret <= 0)
> +		goto out;
> +
> +	ret = btrfs_write_check(iocb, i, ret);
> +	if (ret < 0)
> +		goto out;
> +
> +	ret = __btrfs_buffered_write(iocb, i);
> +out:
> +	btrfs_inode_unlock(inode, ilock_flags);
> +	return ret;
> +}
> +
>  static ssize_t check_direct_IO(struct btrfs_fs_info *fs_info,
>  			       const struct iov_iter *iter, loff_t offset)
>  {
> @@ -1927,10 +1939,8 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
>  	}
>  
>  	err = btrfs_write_check(iocb, from, err);
> -	if (err < 0) {
> -		btrfs_inode_unlock(inode, ilock_flags);
> +	if (err < 0)
>  		goto out;
> -	}
>  
>  	pos = iocb->ki_pos;
>  	/*
> @@ -1944,22 +1954,19 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
>  		goto relock;
>  	}
>  
> -	if (check_direct_IO(fs_info, from, pos)) {
> -		btrfs_inode_unlock(inode, ilock_flags);
> +	if (check_direct_IO(fs_info, from, pos))
>  		goto buffered;
> -	}
>  
>  	dio = __iomap_dio_rw(iocb, from, &btrfs_dio_iomap_ops,
>  			     &btrfs_dio_ops, is_sync_kiocb(iocb));
>  
> -	btrfs_inode_unlock(inode, ilock_flags);
> -
>  	if (IS_ERR_OR_NULL(dio)) {
>  		err = PTR_ERR_OR_ZERO(dio);
>  		if (err < 0 && err != -ENOTBLK)
>  			goto out;
>  	} else {
> -		written = iomap_dio_complete(dio);
> +		written = __iomap_dio_complete(dio);
> +		kfree(dio);
>  	}
>  
>  	if (written < 0 || !iov_iter_count(from)) {
> @@ -1969,7 +1976,7 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
>  
>  buffered:
>  	pos = iocb->ki_pos;
> -	written_buffered = btrfs_buffered_write(iocb, from);
> +	written_buffered = __btrfs_buffered_write(iocb, from);
>  	if (written_buffered < 0) {
>  		err = written_buffered;
>  		goto out;
> @@ -1990,6 +1997,10 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
>  	invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT,
>  				 endbyte >> PAGE_SHIFT);
>  out:
> +	btrfs_inode_unlock(inode, ilock_flags);
> +	if (written > 0)
> +		generic_write_sync(iocb, written);
> +
>  	return written ? written : err;
>  }
>  
> -- 
> 2.29.2
> 


* Re: [PATCH 1/2] iomap: Separate out generic_write_sync() from iomap_dio_complete()
  2020-12-15 18:06 ` [PATCH 1/2] iomap: Separate out generic_write_sync() from iomap_dio_complete() Goldwyn Rodrigues
  2020-12-15 21:24     ` kernel test robot
@ 2020-12-15 22:16   ` Dave Chinner
  1 sibling, 0 replies; 8+ messages in thread
From: Dave Chinner @ 2020-12-15 22:16 UTC (permalink / raw)
  To: Goldwyn Rodrigues
  Cc: linux-fsdevel, linux-btrfs, darrick.wong, hch, nborisov,
	Goldwyn Rodrigues

On Tue, Dec 15, 2020 at 12:06:35PM -0600, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> This introduces a separate function __iomap_dio_complete() which
> completes the Direct I/O without performing the write sync.
> 
> Filesystems such as btrfs which require an inode_lock for sync can call
> __iomap_dio_complete() and must perform sync on their own after unlock.
> 
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
>  fs/iomap/direct-io.c  | 16 +++++++++++++---
>  include/linux/iomap.h |  2 +-
>  2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 933f234d5bec..11a108f39fd9 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -76,7 +76,7 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
>  		dio->submit.cookie = submit_bio(bio);
>  }
>  
> -ssize_t iomap_dio_complete(struct iomap_dio *dio)
> +ssize_t __iomap_dio_complete(struct iomap_dio *dio)
>  {
>  	const struct iomap_dio_ops *dops = dio->dops;
>  	struct kiocb *iocb = dio->iocb;
> @@ -119,18 +119,28 @@ ssize_t iomap_dio_complete(struct iomap_dio *dio)
>  	}
>  
>  	inode_dio_end(file_inode(iocb->ki_filp));
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(__iomap_dio_complete);
> +
> +ssize_t iomap_dio_complete(struct iomap_dio *dio)
> +{
> +	ssize_t ret;
> +
> +	ret = __iomap_dio_complete(dio);
>  	/*
>  	 * If this is a DSYNC write, make sure we push it to stable storage now
>  	 * that we've written data.
>  	 */
>  	if (ret > 0 && (dio->flags & IOMAP_DIO_NEED_SYNC))
> -		ret = generic_write_sync(iocb, ret);
> +		ret = generic_write_sync(dio->iocb, ret);
>  
>  	kfree(dio);
>  
>  	return ret;
>  }
> -EXPORT_SYMBOL_GPL(iomap_dio_complete);
> +

NACK.

If you don't want iomap_dio_complete to do O_DSYNC work after
successfully writing data, strip those flags out of the kiocb
before you call iomap_dio_rw() and do it yourself after calling
iomap_dio_complete().

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
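
The alternative suggested in this reply can be sketched in userspace as follows. This is an illustration under assumptions, not kernel code: `DEMO_DSYNC` stands in for the O_DSYNC-related kiocb flag, and `struct demo_iocb`, `demo_sync()`, `demo_dio_complete()` and `demo_direct_write()` are hypothetical names. The caller remembers whether a sync was requested, hides the flag from the generic completion path, and performs the sync itself after dropping its lock.

```c
#include <stdbool.h>

#define DEMO_DSYNC 0x1u	/* hypothetical stand-in for the sync-request flag */

struct demo_iocb {
	unsigned int flags;
	bool locked;		/* models the inode lock being held */
	int syncs_under_lock;	/* syncs that ran while locked */
	int syncs_after_unlock;	/* syncs that ran after unlock */
};

/* Records where the sync ran, so the ordering can be checked. */
static void demo_sync(struct demo_iocb *iocb)
{
	if (iocb->locked)
		iocb->syncs_under_lock++;
	else
		iocb->syncs_after_unlock++;
}

/* Stand-in for the generic completion: syncs only if the flag is set. */
static long demo_dio_complete(struct demo_iocb *iocb, long written)
{
	if (written > 0 && (iocb->flags & DEMO_DSYNC))
		demo_sync(iocb);
	return written;
}

static long demo_direct_write(struct demo_iocb *iocb, long len)
{
	bool want_sync = (iocb->flags & DEMO_DSYNC) != 0;
	long written;

	iocb->locked = true;
	iocb->flags &= ~DEMO_DSYNC;	/* generic path must not sync */
	written = demo_dio_complete(iocb, len);
	iocb->locked = false;

	if (want_sync) {
		iocb->flags |= DEMO_DSYNC;	/* restore the caller's flag */
		if (written > 0)
			demo_sync(iocb);	/* sync after the lock is dropped */
	}
	return written;
}
```

With this shape the generic completion helper needs no new variant; the filesystem-specific ordering stays in the caller.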


* Re: [PATCH 2/2] btrfs: Make btrfs_direct_write atomic with respect to inode_lock
  2020-12-15 22:13   ` Darrick J. Wong
@ 2020-12-16 21:07     ` Goldwyn Rodrigues
  0 siblings, 0 replies; 8+ messages in thread
From: Goldwyn Rodrigues @ 2020-12-16 21:07 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-fsdevel, linux-btrfs, hch, nborisov

On 14:13 15/12, Darrick J. Wong wrote:
> On Tue, Dec 15, 2020 at 12:06:36PM -0600, Goldwyn Rodrigues wrote:
> > From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > 
> > btrfs_direct_write() falls back to buffered write in case btrfs is not
> > able to perform or complete a direct I/O. During the fallback
> > inode lock is unlocked and relocked. This does not guarantee the
> > atomicity of the entire write since the lock can be acquired by another
> > write between unlock and relock.
> > 
> > __btrfs_buffered_write() is used to perform the direct fallback write,
> > which performs the write without acquiring the lock or checks.
> 
> Er... can you grab the inode lock before deciding which of the IO
> path(s) you're going to take?  Then you'd always have an atomic write
> even if fallback happens.

No, since this is a fallback option which also works if the I/O is
incomplete.

> 
> (Also vaguely wondering why this needs even more slicing and dicing of
> the iomap directio functions...)

I would most likely go with Dave's method of storing the flag in the
function and calling iomap dio functions without IOCB_DSYNC flag. This
way we don't have to change iomap.

-- 
Goldwyn
