From: Jan Kara <jack@suse.cz>
To: Matthew Bobrowski <mbobrowski@mbobrowski.org>
Cc: Jan Kara <jack@suse.cz>, "Theodore Y. Ts'o" <tytso@mit.edu>,
adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org,
linux-fsdevel@vger.kernel.org, hch@infradead.org,
david@fromorbit.com, darrick.wong@oracle.com
Subject: Re: [PATCH v6 00/11] ext4: port direct I/O to iomap infrastructure
Date: Thu, 31 Oct 2019 17:54:16 +0100
Message-ID: <20191031165416.GD13321@quack2.suse.cz>
In-Reply-To: <20191031091639.GB28679@bobrowski>

On Thu 31-10-19 20:16:41, Matthew Bobrowski wrote:
> On Wed, Oct 30, 2019 at 12:39:18PM +0100, Jan Kara wrote:
> > On Wed 30-10-19 12:26:52, Jan Kara wrote:
> > > On Wed 30-10-19 13:00:24, Matthew Bobrowski wrote:
> > > > On Tue, Oct 29, 2019 at 07:34:01PM -0400, Theodore Y. Ts'o wrote:
> > > > > On Tue, Oct 29, 2019 at 07:31:59PM -0400, Theodore Y. Ts'o wrote:
> > > > > > Hi Matthew, it looks like there are a number of problems with this
> > > > > > patch series when using the ext3 backwards compatibility mode (e.g.,
> > > > > > no extents enabled).
> > > > > >
> > > > > > So the following configurations are failing:
> > > > > >
> > > > > > kvm-xfstests -c ext3 generic/091 generic/240 generic/263
> > > >
> > > > This is one mode that I didn't get around to testing. Let me take a
> > > > look at the above and get back to you.
> > >
> > > If I should guess, I'd start looking at what that -ENOTBLK fallback from
> > > direct IO ends up doing as we seem to be hitting that path...
> >
> > Hum, actually no. This write from fsx output:
> >
> > 24( 24 mod 256): WRITE 0x23000 thru 0x285ff (0x5600 bytes)
> >
> > should have allocated blocks to where the failed write was going (0x24000).
> > But still I'd expect some interaction between how buffered writes to holes
> > interact with following direct IO writes... One of the subtle differences
> > we have introduced with iomap conversion is that the old code in
> > __generic_file_write_iter() did fsync & invalidate written range after
> > buffered write fallback and we don't seem to do that now (probably should
> > be fixed regardless of relation to this bug).
>
> After performing some debugging this afternoon, I quickly realised
> that the fix for this is rather trivial. Within the previous direct
> I/O implementation, we passed EXT4_GET_BLOCKS_CREATE to
> ext4_map_blocks() for any writes to inodes without extents. I seem to
> have missed that here and consequently block allocation for a write
> wasn't performing correctly in such cases.

No, this is not correct. For inodes without extents we used
ext4_dio_get_block() and passed DIO_SKIP_HOLES to __blockdev_direct_IO().
DIO_SKIP_HOLES means that if the starting block is within i_size, we pass
'create == 0' to the get_blocks() function, and thus ext4_dio_get_block()
passes '0' as the flags argument to ext4_map_blocks(), similarly to what
you do.

And indeed, for inodes without extents we must fall back to buffered IO
when filling holes inside a file to avoid stale data exposure (a racing DIO
read could read the block contents before data is written to it if we used
EXT4_GET_BLOCKS_CREATE).

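To make the DIO_SKIP_HOLES rule above concrete, here is a minimal userspace
sketch of the decision for an inode without extents. It is a model only, not
the kernel code: struct model_inode, enum dio_action, model_create_flag() and
model_dio_write() are hypothetical names introduced for illustration, and the
i_size comparison follows the check in fs/direct-io.c as I recall it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a non-extent inode; not a kernel type. */
struct model_inode {
	uint64_t i_size;   /* file size in bytes */
	unsigned blkbits;  /* log2 of the filesystem block size */
};

enum dio_action {
	DIO_WRITE_MAPPED,      /* block already allocated: write it directly */
	DIO_ALLOCATE,          /* block beyond i_size: allocate, then write */
	DIO_FALLBACK_BUFFERED  /* hole inside i_size: fall back to buffered IO */
};

/*
 * Models the DIO_SKIP_HOLES rule: 'create' is cleared when the starting
 * block of the write lies within i_size, so holes there are not filled.
 */
static bool model_create_flag(const struct model_inode *inode, uint64_t pos)
{
	uint64_t start_blk = pos >> inode->blkbits;

	if (inode->i_size &&
	    start_blk <= ((inode->i_size - 1) >> inode->blkbits))
		return false;
	return true;
}

/* Decide what a direct IO write starting at 'pos' would do in this model. */
static enum dio_action model_dio_write(const struct model_inode *inode,
				       uint64_t pos, bool block_mapped)
{
	if (block_mapped)
		return DIO_WRITE_MAPPED;
	return model_create_flag(inode, pos) ? DIO_ALLOCATE
					     : DIO_FALLBACK_BUFFERED;
}
```

With a 4k block size and i_size of 0x30000, an unmapped block at 0x24000
takes the buffered fallback in this model, while an unmapped block at
0x30000 (past EOF) may be allocated directly.
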
> Also, I agree, the fsync + page cache invalidation bits need to be
> implemented. I'm just thinking to branch out within
> ext4_buffered_write_iter() and implement those bits there i.e.
>
>     ...
>     ret = generic_perform_write();
>
>     if (ret > 0 && iocb->ki_flags & IOCB_DIRECT) {
>         err = filemap_write_and_wait_range();
>
>         if (!err)
>             invalidate_mapping_pages();
>     ...
>
> AFAICT, this would be the most appropriate place to put it? Or, did
> you have something else in mind?

Yes, either this, or maybe doing it in ext4_dio_write_iter() after returning
from ext4_buffered_write_iter() would be even more logical.

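For readers less familiar with the pattern, the "write through the page
cache, then flush and drop the cached range" sequence has a rough userspace
analogue, sketched below. This is an assumption-laden illustration, not the
ext4 change itself: fallback_write_sync_invalidate() is a hypothetical
helper, fdatasync() stands in for filemap_write_and_wait_range() (it flushes
the whole file rather than just the written range), and
posix_fadvise(POSIX_FADV_DONTNEED) stands in for invalidate_mapping_pages().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: perform a buffered write, then write the data back
 * and ask the kernel to drop the cached pages for the written range, so a
 * later direct IO on the same range does not see stale page cache state.
 * Returns the number of bytes written, or a negative errno value.
 */
static ssize_t fallback_write_sync_invalidate(int fd, const void *buf,
					      size_t len, off_t off)
{
	ssize_t ret = pwrite(fd, buf, len, off);

	if (ret < 0)
		return -errno;

	if (ret > 0) {
		/* Flush dirty data to stable storage (whole file, not a range). */
		if (fdatasync(fd) < 0)
			return -errno;

		/* Advisory only: a failure to drop pages is not treated as an error. */
		(void)posix_fadvise(fd, off, ret, POSIX_FADV_DONTNEED);
	}
	return ret;
}
```

The ordering matters in the same way as in the quoted snippet: the pages are
only dropped after the writeback has succeeded, since dirty pages cannot
simply be discarded.
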
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR