Linux-ext4 Archive on lore.kernel.org
 help / color / Atom feed
From: Matthew Bobrowski <mbobrowski@mbobrowski.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	jack@suse.cz, tytso@mit.edu, riteshh@linux.ibm.com
Subject: Re: [PATCH 4/5] ext4: introduce direct IO write code path using iomap infrastructure
Date: Tue, 13 Aug 2019 20:45:03 +1000
Message-ID: <20190813104501.GA28622@poseidon.bobrowski.net> (raw)
In-Reply-To: <20190812173403.GD24564@infradead.org>

On Mon, Aug 12, 2019 at 10:34:03AM -0700, Christoph Hellwig wrote:
> > +	if (error) {
> > +		if (offset + size > i_size_read(inode))
> > +			ext4_truncate_failed_write(inode);
> > +
> > +		/*
> > +		 * The inode may have been placed onto the orphan list
> > +		 * as a result of an extension. However, an error may
> > +		 * have been encountered prior to being able to
> > +		 * complete the write operation. Perform any necessary
> > +		 * clean up in this case.
> > +		 */
> > +		if (!list_empty(&EXT4_I(inode)->i_orphan)) {
> > +			handle = ext4_journal_start(inode, EXT4_HT_INODE, 2);
> > +			if (IS_ERR(handle)) {
> > +				if (inode->i_nlink)
> > +					ext4_orphan_del(NULL, inode);
> > +				return PTR_ERR(handle);
> > +			}
> > +
> > +			if (inode->i_nlink)
> > +				ext4_orphan_del(handle, inode);
> > +			ext4_journal_stop(handle);
> > +		}
> > +		return error;
> 
> I'd split this branch into a separate function just to keep the
> end_io handler tidy.

Good idea. I'll do that...

> > +	if (ret == -EIOCBQUEUED && (unaligned_aio || extend))
> > +		inode_dio_wait(inode);
> > +
> > +	if (ret >= 0 && iov_iter_count(from)) {
> > +		overwrite ? inode_unlock_shared(inode) : inode_unlock(inode);
> > +		return ext4_buffered_write_iter(iocb, from);
> > +	}
> > +out:
> > +	overwrite ? inode_unlock_shared(inode) : inode_unlock(inode);
> > +	return ret;
> 
> the ? : expression here is weird.
> 
> I'd write this as:
> 
> 	if (overwrite)
> 		inode_unlock_shared(inode);
> 	else
> 		inode_unlock(inode);
> 
> 	if (ret >= 0 && iov_iter_count(from))
> 		return ext4_buffered_write_iter(iocb, from);
> 	return ret;
> 
> and handle the only place we jump to the current out label manually,
> as that always does an exclusive unlock anyway.

Yeah, the ternary operators do look weird here and I'd prefer if we
also dropped them. I was at a point where I was trying to clean up
some of the code, but I had been staring at the screen for so long my
brain went numb and couldn't think of how to do this neatly. I'm happy
with this suggestion. :-)

> > +		if (IS_DAX(inode)) {
> > +			ret = ext4_map_blocks(handle, inode, &map,
> > +					      EXT4_GET_BLOCKS_CREATE_ZERO);
> > +		} else {
> > +			/*
> > +			 * DAX and direct IO are the only two
> > +			 * operations currently supported with
> > +			 * IOMAP_WRITE.
> > +			 */
> > +			WARN_ON(!(flags & IOMAP_DIRECT));
> > +			if (round_down(offset, i_blocksize(inode)) >=
> > +			    i_size_read(inode)) {
> > +				ret = ext4_map_blocks(handle, inode, &map,
> > +						      EXT4_GET_BLOCKS_CREATE);
> > +			} else if (!ext4_test_inode_flag(inode,
> > +							 EXT4_INODE_EXTENTS)) {
> > +				/*
> > +				 * We cannot fill holes in indirect
> > +				 * tree based inodes as that could
> > +				 * expose stale data in the case of a
> > +				 * crash. Use magic error code to
> > +				 * fallback to buffered IO.
> > +				 */
> > +				ret = ext4_map_blocks(handle, inode, &map, 0);
> > +				if (ret == 0)
> > +					ret = -ENOTBLK;
> > +			} else {
> > +				ret = ext4_map_blocks(handle, inode, &map,
> > +						      EXT4_GET_BLOCKS_IO_CREATE_EXT);
> > +			}
> > +		}
> 
> I think this could be simplified down to something like:
> 
> 		int flags = 0;
> 
> 		...
> 
> 		/*
> 		 * DAX and direct IO are the only two operations currently
> 		 * supported with IOMAP_WRITE.
> 		 */
> 		WARN_ON(!IS_DAX(inode) && !(flags & IOMAP_DIRECT));
> 
> 		if (IS_DAX(inode))
> 			flags = EXT4_GET_BLOCKS_CREATE_ZERO;
> 		else if (round_down(offset, i_blocksize(inode)) >=
> 				i_size_read(inode)) {
> 			flags = EXT4_GET_BLOCKS_CREATE;
> 		else if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
> 			flags = EXT4_GET_BLOCKS_IO_CREATE_EXT;
> 
> 		/*
> 		 * We cannot fill holes in indirect tree based inodes as that
> 		 * could expose stale data in the case of a crash.  Use the
> 		 * magic error code to fallback to buffered IO.
> 		 */
> 		if (!flags && !ret)
> 			ret = -ENOTBLK;

This also seems OK to me.

> > @@ -3601,6 +3631,8 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
> >  static int ext4_iomap_end(struct inode *inode, loff_t offset, loff_t length,
> >  			  ssize_t written, unsigned flags, struct iomap *iomap)
> >  {
> > +	if (flags & IOMAP_DIRECT && written == 0)
> > +		return -ENOTBLK;
> 
> This probably wants a comment, too.  But do we actually ever end up
> here?

Sure, I can append a comment. Also, I don't believe that we can
completely drop the ->iomap_end() callback as hinted in one of your
other comments. The reason I say this is because we still need this to
catch the case where an error an occurs within 'iomap_actor_t'. If
that happens to be, within iomap_dio_rw() we wait for IO completion
before returning and then we fallback to buffered IO to complete the
remainder of the IO. We will also be able to reuse the extent that was
allocated when preparing for direct IO if we do this.

--M

  reply index

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-12 12:52 [PATCH 0/5] ext4: direct IO via " Matthew Bobrowski
2019-08-12 12:52 ` [PATCH 1/5] ext4: introduce direct IO read code path using " Matthew Bobrowski
2019-08-12 17:18   ` Christoph Hellwig
2019-08-12 20:17     ` Matthew Wilcox
2019-08-13 10:45       ` Matthew Bobrowski
2019-08-12 12:52 ` [PATCH 2/5] ext4: move inode extension/truncate code out from ext4_iomap_end() Matthew Bobrowski
2019-08-12 17:18   ` Christoph Hellwig
2019-08-13 10:46     ` Matthew Bobrowski
2019-08-28 19:59   ` Jan Kara
2019-08-28 21:54     ` Matthew Bobrowski
2019-08-29  8:18       ` Jan Kara
2019-08-12 12:53 ` [PATCH 3/5] iomap: modify ->end_io() calling convention Matthew Bobrowski
2019-08-12 17:18   ` Christoph Hellwig
2019-08-13 10:43     ` Matthew Bobrowski
2019-08-12 12:53 ` [PATCH 4/5] ext4: introduce direct IO write code path using iomap infrastructure Matthew Bobrowski
2019-08-12 17:04   ` RITESH HARJANI
2019-08-13 12:58     ` Matthew Bobrowski
2019-08-13 14:35       ` Darrick J. Wong
2019-08-14  9:51         ` Matthew Bobrowski
2019-08-12 17:34   ` Christoph Hellwig
2019-08-13 10:45     ` Matthew Bobrowski [this message]
2019-08-28 20:26   ` Jan Kara
2019-08-28 22:32     ` Dave Chinner
2019-08-29  8:03       ` Jan Kara
2019-08-29 11:47       ` Matthew Bobrowski
2019-08-29 11:45     ` Matthew Bobrowski
2019-08-29 12:38       ` Jan Kara
2019-08-12 12:53 ` [PATCH 5/5] ext4: clean up redundant buffer_head direct IO code Matthew Bobrowski
2019-08-12 17:31 ` [PATCH 0/5] ext4: direct IO via iomap infrastructure RITESH HARJANI
2019-08-13 11:10   ` Matthew Bobrowski
2019-08-13 12:27     ` RITESH HARJANI
2019-08-14  9:48       ` Matthew Bobrowski
2019-08-14 11:58         ` RITESH HARJANI
2019-08-21 13:14       ` Matthew Bobrowski
2019-08-22 12:00         ` Matthew Bobrowski
2019-08-22 14:11           ` Ritesh Harjani
2019-08-24  3:18             ` Matthew Bobrowski
2019-08-24  3:55               ` Darrick J. Wong
2019-08-24 23:04                 ` Christoph Hellwig
2019-08-27  9:52                   ` Matthew Bobrowski
2019-08-28 12:05                     ` Matthew Bobrowski
2019-08-28 14:27                       ` Theodore Y. Ts'o
2019-08-28 18:02                         ` Jan Kara
2019-08-29  6:36                           ` Christoph Hellwig
2019-08-29 11:20                             ` Matthew Bobrowski
2019-08-29 14:41                               ` Christoph Hellwig
2019-08-23 13:43           ` [RFC 1/1] ext4: PoC implementation of option-1 Ritesh Harjani
2019-08-23 13:49             ` Ritesh Harjani

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190813104501.GA28622@poseidon.bobrowski.net \
    --to=mbobrowski@mbobrowski.org \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=riteshh@linux.ibm.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-ext4 Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-ext4/0 linux-ext4/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-ext4 linux-ext4/ https://lore.kernel.org/linux-ext4 \
		linux-ext4@vger.kernel.org
	public-inbox-index linux-ext4

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-ext4


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git