From: Matthew Bobrowski <email@example.com>
To: Christoph Hellwig <firstname.lastname@example.org>
Cc: Jan Kara <email@example.com>, "Theodore Y. Ts'o" <firstname.lastname@example.org>,
	"Darrick J. Wong" <email@example.com>, Ritesh Harjani <firstname.lastname@example.org>,
	email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org
Subject: Re: [PATCH 0/5] ext4: direct IO via iomap infrastructure
Date: Thu, 29 Aug 2019 21:20:50 +1000
Message-ID: <20190829112048.GA2486@poseidon.bobrowski.net>
In-Reply-To: <20190829063608.GA17426@infradead.org>

Awesome, and thank you *all* for your very valuable input.

On Wed, Aug 28, 2019 at 11:36:08PM -0700, Christoph Hellwig wrote:
> On Wed, Aug 28, 2019 at 08:02:15PM +0200, Jan Kara wrote:
> > > The original reason why we created the DIO_STATE_UNWRITTEN flag was a
> > > fast path, where the common case is writing blocks to an existing
> > > location in a file where the blocks are already allocated, and marked
> > > as written. So consulting the on-disk extent tree to determine
> > > whether unwritten extents need to be converted and/or split is
> > > certainly doable. However, it's expensive for the common case. So
> > > having a hint whether we need to schedule a workqueue to possibly
> > > convert an unwritten region is helpful. If we can just free the bio
> > > and exit the I/O completion handler without having to take shared
> > > locks to examine the on-disk extent tree, so much the better.
> >
> > Yes, but for determining whether extent conversion on IO completion is
> > needed we now use IOMAP_DIO_UNWRITTEN flag iomap infrastructure provides to
> > us. So we don't have to track this internally in ext4 anymore.
>
> Exactly. As mentioned before the ioend to track unwritten thing was
> in XFS by the time ext4 copied the ioend approach, but we actually got
> rid of that long before the iomap conversion. Maybe to make everything
> easier to understand and bisect you might want to get rid of the ioend
> for direct I/O in ext4 as a prep patch as well.
>
> The relevant commit is: 273dda76f757 ("xfs: don't use ioends for direct
> write completions")

Uh ha! So, we conclude that there's no need to muck around with hairy
ioends, or the need to denote whether there are unwritten extents held
against the inode using a tricky state flag for that matter.

> > > To be honest, I'm not 100% sure what would happen if we removed that
> > > restriction; it might be that things would work just fine (just slower
> > > in some workloads), or whether there is some hidden dependency that
> > > would explode. I suspect we'd have to try the experiment to be sure.
> >
> > As far as I remember the concern was that extent split may need block
> > allocation and we may not have enough free blocks to do it. These days we
> > have some blocks reserved in the filesystem to accommodate unexpected extent
> > splits so this shouldn't happen anymore so the only real concern is the
> > wasted performance due to unnecessary extent merge & split. Kind of a
> > stress test for this would be to fire off lots of sequential AIO DIO
> > requests against a hole in a file.
>
> Well, you can always add a don't merge flag to the actual allocation.
> You might still get a merge for pathological case (fallocate adjacent
> to a dio write just submitted), but if the merging is such a performance
> overhead there are easy ways to avoid it for the common case.

After I've posted through the next version of this patch series, I will
attempt to perform some stress testing to see what the performance hit
could potentially be.
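
Illustrative sketch (not part of the original message): the fast path
discussed above, where the completion handler only does the expensive
extent conversion when the I/O layer reports that the write landed on
unwritten extents, could look roughly like the userspace mock below.
Every name prefixed with sketch_ or SKETCH_ is hypothetical, the flag
value is a stand-in rather than the kernel's definition, and the
conversion step is a counter instead of real extent-tree work. It is a
sketch under those assumptions, not the code from the patch series.

```c
#include <assert.h>

/* Hypothetical stand-in for the IOMAP_DIO_UNWRITTEN hint mentioned in the
 * thread. The value is chosen for illustration only. */
#define SKETCH_DIO_UNWRITTEN (1u << 0)

/* Counts how often the mock slow path ran, so callers can observe it. */
static int sketch_conversions;

/* Mock of the expensive step: in a real filesystem this would consult the
 * on-disk extent tree and convert (and possibly split) unwritten extents. */
static int sketch_convert_unwritten(long long offset, long long size)
{
	(void)offset;
	assert(size > 0);	/* caller only converts non-empty writes */
	sketch_conversions++;
	return 0;
}

/* Mock direct I/O completion handler. The common case (blocks already
 * allocated and written) returns without touching the extent tree. */
static int sketch_dio_end_io(long long offset, long long size, int error,
			     unsigned int flags)
{
	if (error)
		return error;	/* failed I/O: nothing to convert */

	if (size > 0 && (flags & SKETCH_DIO_UNWRITTEN))
		return sketch_convert_unwritten(offset, size);

	return 0;		/* fast path: no conversion needed */
}
```

With this shape, a completed write without the hint set never reaches
sketch_convert_unwritten(), which is the property the quoted discussion
cares about for the common overwrite case.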