Linux-ext4 Archive on lore.kernel.org
 help / color / Atom feed
From: Jan Kara <jack@suse.cz>
To: Matthew Bobrowski <mbobrowski@mbobrowski.org>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	jack@suse.cz, tytso@mit.edu, riteshh@linux.ibm.com
Subject: Re: [PATCH 2/5] ext4: move inode extension/truncate code out from ext4_iomap_end()
Date: Wed, 28 Aug 2019 21:59:14 +0200
Message-ID: <20190828195914.GF22343@quack2.suse.cz> (raw)
In-Reply-To: <774754e9b2afc541df619921f7743d98c5c6a358.1565609891.git.mbobrowski@mbobrowski.org>

On Mon 12-08-19 22:52:53, Matthew Bobrowski wrote:
> In preparation for implementing the direct IO write code path modifications
> that make us of iomap infrastructure we need to move out the inode
> extension/truncate code from ext4_iomap_end() callback. For direct IO, if the
> current code remained it would behave incorrectly. If we update the inode size
> prior to converting unwritten extents we run the risk of allowing a racing
> direct IO read operation to find unwritten extents before they are converted.
> 
> The inode extension/truncate has been moved out into a new function
> ext4_handle_inode_extension(). This will be used by both direct IO and DAX
> code paths if the write results with the inode being extended.
> 
> Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
> ---
>  fs/ext4/file.c  | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  fs/ext4/inode.c | 48 +--------------------------------------------
>  2 files changed, 60 insertions(+), 48 deletions(-)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 360eff7b6aa2..7470800c63b7 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -33,6 +33,7 @@
>  #include "ext4_jbd2.h"
>  #include "xattr.h"
>  #include "acl.h"
> +#include "truncate.h"
>  
>  static bool ext4_dio_checks(struct inode *inode)
>  {
> @@ -234,12 +235,62 @@ static ssize_t ext4_write_checks(struct kiocb *iocb, struct iov_iter *from)
>  	return iov_iter_count(from);
>  }
>  
> +static int ext4_handle_inode_extension(struct inode *inode, loff_t size,
> +				       size_t count)
> +{
> +	handle_t *handle;
> +	bool truncate = false;
> +	ext4_lblk_t written_blk, end_blk;
> +	int ret = 0, blkbits = inode->i_blkbits;
> +
> +	handle = ext4_journal_start(inode, EXT4_HT_INODE, 2);
> +	if (IS_ERR(handle)) {
> +		ret = PTR_ERR(handle);
> +		goto orphan_del;
> +	}
> +
> +	if (ext4_update_inode_size(inode, size))
> +		ext4_mark_inode_dirty(handle, inode);
> +
> +	/*
> +	 * We may need truncate allocated but not written blocks
> +	 * beyond EOF.
> +	 */
> +	written_blk = ALIGN(size, 1 << blkbits);
> +	end_blk = ALIGN(size + count, 1 << blkbits);

So this seems to imply that 'size' is really offset where IO started but
ext4_update_inode_size(inode, size) above suggests 'size' is really where
IO has ended and that's indeed what you pass into
ext4_handle_inode_extension(). So I'd just make the calling convention for
ext4_handle_inode_extension() less confusing and pass 'offset' and 'len'
and fixup the math inside the function...

Otherwise the patch looks OK to me.

								Honza


> +	if (written_blk < end_blk && ext4_can_truncate(inode))
> +		truncate = true;
> +
> +	/*
> +	 * Remove the inode from the orphan list if it has been
> +	 * extended and everything went OK.
> +	 */
> +	if (!truncate && inode->i_nlink)
> +		ext4_orphan_del(handle, inode);
> +	ext4_journal_stop(handle);
> +
> +	if (truncate) {
> +		ext4_truncate_failed_write(inode);
> +orphan_del:
> +		/*
> +		 * If the truncate operation failed early the inode
> +		 * may still be on the orphan list. In that case, we
> +		 * need try remove the inode from the linked list in
> +		 * memory.
> +		 */
> +		if (inode->i_nlink)
> +			ext4_orphan_del(NULL, inode);
> +	}
> +	return ret;
> +}
> +
>  #ifdef CONFIG_FS_DAX
>  static ssize_t
>  ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  {
> -	struct inode *inode = file_inode(iocb->ki_filp);
> +	int err;
>  	ssize_t ret;
> +	struct inode *inode = file_inode(iocb->ki_filp);
>  
>  	if (!inode_trylock(inode)) {
>  		if (iocb->ki_flags & IOCB_NOWAIT)
> @@ -257,6 +308,13 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  		goto out;
>  
>  	ret = dax_iomap_rw(iocb, from, &ext4_iomap_ops);
> +
> +	if (ret > 0 && iocb->ki_pos > i_size_read(inode)) {
> +		err = ext4_handle_inode_extension(inode, iocb->ki_pos,
> +						  iov_iter_count(from));
> +		if (err)
> +			ret = err;
> +	}
>  out:
>  	inode_unlock(inode);
>  	if (ret > 0)
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 420fe3deed39..761ce6286b05 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3601,53 +3601,7 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
>  static int ext4_iomap_end(struct inode *inode, loff_t offset, loff_t length,
>  			  ssize_t written, unsigned flags, struct iomap *iomap)
>  {
> -	int ret = 0;
> -	handle_t *handle;
> -	int blkbits = inode->i_blkbits;
> -	bool truncate = false;
> -
> -	if (!(flags & IOMAP_WRITE) || (flags & IOMAP_FAULT))
> -		return 0;
> -
> -	handle = ext4_journal_start(inode, EXT4_HT_INODE, 2);
> -	if (IS_ERR(handle)) {
> -		ret = PTR_ERR(handle);
> -		goto orphan_del;
> -	}
> -	if (ext4_update_inode_size(inode, offset + written))
> -		ext4_mark_inode_dirty(handle, inode);
> -	/*
> -	 * We may need to truncate allocated but not written blocks beyond EOF.
> -	 */
> -	if (iomap->offset + iomap->length > 
> -	    ALIGN(inode->i_size, 1 << blkbits)) {
> -		ext4_lblk_t written_blk, end_blk;
> -
> -		written_blk = (offset + written) >> blkbits;
> -		end_blk = (offset + length) >> blkbits;
> -		if (written_blk < end_blk && ext4_can_truncate(inode))
> -			truncate = true;
> -	}
> -	/*
> -	 * Remove inode from orphan list if we were extending a inode and
> -	 * everything went fine.
> -	 */
> -	if (!truncate && inode->i_nlink &&
> -	    !list_empty(&EXT4_I(inode)->i_orphan))
> -		ext4_orphan_del(handle, inode);
> -	ext4_journal_stop(handle);
> -	if (truncate) {
> -		ext4_truncate_failed_write(inode);
> -orphan_del:
> -		/*
> -		 * If truncate failed early the inode might still be on the
> -		 * orphan list; we need to make sure the inode is removed from
> -		 * the orphan list in that case.
> -		 */
> -		if (inode->i_nlink)
> -			ext4_orphan_del(NULL, inode);
> -	}
> -	return ret;
> +	return 0;
>  }
>  
>  const struct iomap_ops ext4_iomap_ops = {
> -- 
> 2.16.4
> 
> 
> -- 
> Matthew Bobrowski
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  parent reply index

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-12 12:52 [PATCH 0/5] ext4: direct IO via iomap infrastructure Matthew Bobrowski
2019-08-12 12:52 ` [PATCH 1/5] ext4: introduce direct IO read code path using " Matthew Bobrowski
2019-08-12 17:18   ` Christoph Hellwig
2019-08-12 20:17     ` Matthew Wilcox
2019-08-13 10:45       ` Matthew Bobrowski
2019-08-12 12:52 ` [PATCH 2/5] ext4: move inode extension/truncate code out from ext4_iomap_end() Matthew Bobrowski
2019-08-12 17:18   ` Christoph Hellwig
2019-08-13 10:46     ` Matthew Bobrowski
2019-08-28 19:59   ` Jan Kara [this message]
2019-08-28 21:54     ` Matthew Bobrowski
2019-08-29  8:18       ` Jan Kara
2019-08-12 12:53 ` [PATCH 3/5] iomap: modify ->end_io() calling convention Matthew Bobrowski
2019-08-12 17:18   ` Christoph Hellwig
2019-08-13 10:43     ` Matthew Bobrowski
2019-08-12 12:53 ` [PATCH 4/5] ext4: introduce direct IO write code path using iomap infrastructure Matthew Bobrowski
2019-08-12 17:04   ` RITESH HARJANI
2019-08-13 12:58     ` Matthew Bobrowski
2019-08-13 14:35       ` Darrick J. Wong
2019-08-14  9:51         ` Matthew Bobrowski
2019-08-12 17:34   ` Christoph Hellwig
2019-08-13 10:45     ` Matthew Bobrowski
2019-08-28 20:26   ` Jan Kara
2019-08-28 22:32     ` Dave Chinner
2019-08-29  8:03       ` Jan Kara
2019-08-29 11:47       ` Matthew Bobrowski
2019-08-29 11:45     ` Matthew Bobrowski
2019-08-29 12:38       ` Jan Kara
2019-08-12 12:53 ` [PATCH 5/5] ext4: clean up redundant buffer_head direct IO code Matthew Bobrowski
2019-08-12 17:31 ` [PATCH 0/5] ext4: direct IO via iomap infrastructure RITESH HARJANI
2019-08-13 11:10   ` Matthew Bobrowski
2019-08-13 12:27     ` RITESH HARJANI
2019-08-14  9:48       ` Matthew Bobrowski
2019-08-14 11:58         ` RITESH HARJANI
2019-08-21 13:14       ` Matthew Bobrowski
2019-08-22 12:00         ` Matthew Bobrowski
2019-08-22 14:11           ` Ritesh Harjani
2019-08-24  3:18             ` Matthew Bobrowski
2019-08-24  3:55               ` Darrick J. Wong
2019-08-24 23:04                 ` Christoph Hellwig
2019-08-27  9:52                   ` Matthew Bobrowski
2019-08-28 12:05                     ` Matthew Bobrowski
2019-08-28 14:27                       ` Theodore Y. Ts'o
2019-08-28 18:02                         ` Jan Kara
2019-08-29  6:36                           ` Christoph Hellwig
2019-08-29 11:20                             ` Matthew Bobrowski
2019-08-29 14:41                               ` Christoph Hellwig
2019-08-23 13:43           ` [RFC 1/1] ext4: PoC implementation of option-1 Ritesh Harjani
2019-08-23 13:49             ` Ritesh Harjani

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190828195914.GF22343@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=mbobrowski@mbobrowski.org \
    --cc=riteshh@linux.ibm.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-ext4 Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-ext4/0 linux-ext4/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-ext4 linux-ext4/ https://lore.kernel.org/linux-ext4 \
		linux-ext4@vger.kernel.org
	public-inbox-index linux-ext4

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-ext4


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git