linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>,
	Goldwyn Rodrigues <rgoldwyn@suse.de>,
	linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org,
	fdmanana@gmail.com, dsterba@suse.cz, darrick.wong@oracle.com,
	cluster-devel@redhat.com, linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org
Subject: Re: always fall back to buffered I/O after invalidation failures, was: Re: [PATCH 2/6] iomap: IOMAP_DIO_RWF_NO_STALE_PAGECACHE return if page invalidation fails
Date: Wed, 8 Jul 2020 14:54:37 +0100	[thread overview]
Message-ID: <20200708135437.GP25523@casper.infradead.org> (raw)
In-Reply-To: <20200708065127.GM2005@dread.disaster.area>

On Wed, Jul 08, 2020 at 04:51:27PM +1000, Dave Chinner wrote:
> On Tue, Jul 07, 2020 at 03:00:30PM +0200, Christoph Hellwig wrote:
> > On Tue, Jul 07, 2020 at 01:57:05PM +0100, Matthew Wilcox wrote:
> > > Indeed, I'm in favour of not invalidating
> > > the page cache at all for direct I/O.  For reads, I think the page cache
> > > should be used to satisfy any portion of the read which is currently
> > > cached.  For writes, I think we should write into the page cache pages
> > > which currently exist, and then force those pages to be written back,
> > > but left in cache.
> > 
> > Something like that, yes.
> 
> So are we really willing to take the performance regression that
> occurs from copying out of the page cache consuming lots more CPU
> than an actual direct IO read? Or that direct IO writes suddenly
> serialise because there are page cache pages and now we have to do
> buffered IO?
> 
> Direct IO should be a deterministic, zero-copy IO path to/from
> storage. Using the CPU to copy data during direct IO is the complete
> opposite of the intended functionality, not to mention the behaviour
> that many applications have been careful designed and tuned for.

Direct I/O isn't deterministic though.  If the file isn't shared, then
it works great, but as soon as you get mixed buffered and direct I/O,
everything is already terrible.  Direct I/Os perform pagecache lookups
already, but instead of using the data that we found in the cache, we
(if it's dirty) write it back, wait for the write to complete, remove
the page from the pagecache and then perform another I/O to get the data
that we just wrote out!  And then the app that's using buffered I/O has
to read it back in again.

Nobody's proposing changing Direct I/O to exclusively work through the
pagecache.  The proposal is to behave less weirdly when there's already
data in the pagecache.

I have had an objection raised off-list.  In a scenario with a block
device shared between two systems and an application which does direct
I/O, everything is normally fine.  If one of the systems uses tar to
back up the contents of the block device then the application on that
system will no longer see the writes from the other system because
there's nothing to invalidate the pagecache on the first system.

Unfortunately, this is in direct conflict with the performance
problem caused by some little arsewipe deciding to do:

$ while true; do dd if=/lib/x86_64-linux-gnu/libc-2.30.so iflag=direct of=/dev/null; done

... doesn't hurt me because my root filesystem is on ext4 which doesn't
purge the cache.  But anything using iomap gets all the pages for libc
kicked out of the cache, and that's a lot of fun.

  reply	other threads:[~2020-07-08 14:15 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-29 19:23 [PATCH 0/6 v10] btrfs direct-io using iomap Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 1/6] iomap: Convert wait_for_completion to flags Goldwyn Rodrigues
2020-06-29 23:03   ` David Sterba
2020-06-30 16:35   ` David Sterba
2020-07-01  7:34     ` Johannes Thumshirn
2020-07-01  7:50   ` Christoph Hellwig
2020-06-29 19:23 ` [PATCH 2/6] iomap: IOMAP_DIO_RWF_NO_STALE_PAGECACHE return if page invalidation fails Goldwyn Rodrigues
2020-07-01  7:53   ` always fall back to buffered I/O after invalidation failures, was: " Christoph Hellwig
2020-07-07 12:43     ` Goldwyn Rodrigues
2020-07-07 12:57       ` Matthew Wilcox
2020-07-07 13:00         ` Christoph Hellwig
2020-07-08  6:51           ` Dave Chinner
2020-07-08 13:54             ` Matthew Wilcox [this message]
2020-07-08 16:54               ` Christoph Hellwig
2020-07-08 17:11                 ` Matthew Wilcox
2020-07-09  8:26                 ` [Cluster-devel] " Steven Whitehouse
2020-07-09  2:25               ` Dave Chinner
2020-07-09 16:09                 ` Darrick J. Wong
2020-07-09 17:05                   ` Matthew Wilcox
2020-07-09 17:10                     ` Darrick J. Wong
2020-07-09 22:59                       ` Dave Chinner
2020-07-10 16:03                         ` Christoph Hellwig
2020-07-12 11:36                 ` Avi Kivity
2020-07-07 13:49         ` Goldwyn Rodrigues
2020-07-07 14:01           ` Darrick J. Wong
2020-07-07 14:30             ` Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 3/6] btrfs: switch to iomap_dio_rw() for dio Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 4/6] fs: remove dio_end_io() Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 5/6] btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 6/6] btrfs: split btrfs_direct_IO to read and write part Goldwyn Rodrigues

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200708135437.GP25523@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=cluster-devel@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=dsterba@suse.cz \
    --cc=fdmanana@gmail.com \
    --cc=hch@lst.de \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=rgoldwyn@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).