linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>,
	Goldwyn Rodrigues <rgoldwyn@suse.de>,
	linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org,
	fdmanana@gmail.com, dsterba@suse.cz, darrick.wong@oracle.com,
	cluster-devel@redhat.com, linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org
Subject: Re: always fall back to buffered I/O after invalidation failures, was: Re: [PATCH 2/6] iomap: IOMAP_DIO_RWF_NO_STALE_PAGECACHE return if page invalidation fails
Date: Wed, 8 Jul 2020 16:51:27 +1000	[thread overview]
Message-ID: <20200708065127.GM2005@dread.disaster.area> (raw)
In-Reply-To: <20200707130030.GA13870@lst.de>

On Tue, Jul 07, 2020 at 03:00:30PM +0200, Christoph Hellwig wrote:
> On Tue, Jul 07, 2020 at 01:57:05PM +0100, Matthew Wilcox wrote:
> > On Tue, Jul 07, 2020 at 07:43:46AM -0500, Goldwyn Rodrigues wrote:
> > > On  9:53 01/07, Christoph Hellwig wrote:
> > > > On Mon, Jun 29, 2020 at 02:23:49PM -0500, Goldwyn Rodrigues wrote:
> > > > > From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > > > > 
> > > > > For direct I/O, add the flag IOMAP_DIO_RWF_NO_STALE_PAGECACHE to indicate
> > > > > that if the page invalidation fails, return back control to the
> > > > > filesystem so it may fallback to buffered mode.
> > > > > 
> > > > > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > > > 
> > > > I'd like to start a discussion of this shouldn't really be the
> > > > default behavior.  If we have page cache that can't be invalidated it
> > > > actually makes a whole lot of sense to not do direct I/O, avoid the
> > > > warnings, etc.
> > > > 
> > > > Adding all the relevant lists.
> > > 
> > > Since no one responded so far, let me see if I can stir the cauldron :)
> > > 
> > > What error should be returned in case of such an error? I think the
> > 
> > Christoph's message is ambiguous.  I don't know if he means "fail the
> > I/O with an error" or "satisfy the I/O through the page cache".  I'm
> > strongly in favour of the latter.
> 
> Same here.  Sorry if my previous mail was unclear.
> 
> > Indeed, I'm in favour of not invalidating
> > the page cache at all for direct I/O.  For reads, I think the page cache
> > should be used to satisfy any portion of the read which is currently
> > cached.  For writes, I think we should write into the page cache pages
> > which currently exist, and then force those pages to be written back,
> > but left in cache.
> 
> Something like that, yes.

So are we really willing to take the performance regression that
occurs from copying out of the page cache consuming lots more CPU
than an actual direct IO read? Or that direct IO writes suddenly
serialise because there are page cache pages and now we have to do
buffered IO?

Direct IO should be a deterministic, zero-copy IO path to/from
storage. Using the CPU to copy data during direct IO is the complete
opposite of the intended functionality, not to mention the behaviour
that many applications have been careful designed and tuned for.

Hence I think that forcing iomap to use cached pages for DIO is a
non-starter. I have no problems with providing infrastructure that
allows filesystems to -opt in- to using buffered IO for the direct
IO path. However, the change in IO behaviour caused by unpredicably
switching between direct IO and buffered IO (e.g. suddening DIO
writes serialise -all IO-) will cause unacceptible performance
regressions for many applications and be -very difficult to
diagnose- in production systems.

IOWs, we need to let the individual filesystems decide how they want
to use the page cache for direct IO. Just because we have new direct
IO infrastructure (i.e. iomap) it does not mean we can just make
wholesale changes to the direct IO path behaviour...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2020-07-08  6:51 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-29 19:23 [PATCH 0/6 v10] btrfs direct-io using iomap Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 1/6] iomap: Convert wait_for_completion to flags Goldwyn Rodrigues
2020-06-29 23:03   ` David Sterba
2020-06-30 16:35   ` David Sterba
2020-07-01  7:34     ` Johannes Thumshirn
2020-07-01  7:50   ` Christoph Hellwig
2020-06-29 19:23 ` [PATCH 2/6] iomap: IOMAP_DIO_RWF_NO_STALE_PAGECACHE return if page invalidation fails Goldwyn Rodrigues
2020-07-01  7:53   ` always fall back to buffered I/O after invalidation failures, was: " Christoph Hellwig
2020-07-07 12:43     ` Goldwyn Rodrigues
2020-07-07 12:57       ` Matthew Wilcox
2020-07-07 13:00         ` Christoph Hellwig
2020-07-08  6:51           ` Dave Chinner [this message]
2020-07-08 13:54             ` Matthew Wilcox
2020-07-08 16:54               ` Christoph Hellwig
2020-07-08 17:11                 ` Matthew Wilcox
2020-07-09  8:26                 ` [Cluster-devel] " Steven Whitehouse
2020-07-09  2:25               ` Dave Chinner
2020-07-09 16:09                 ` Darrick J. Wong
2020-07-09 17:05                   ` Matthew Wilcox
2020-07-09 17:10                     ` Darrick J. Wong
2020-07-09 22:59                       ` Dave Chinner
2020-07-10 16:03                         ` Christoph Hellwig
2020-07-12 11:36                 ` Avi Kivity
2020-07-07 13:49         ` Goldwyn Rodrigues
2020-07-07 14:01           ` Darrick J. Wong
2020-07-07 14:30             ` Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 3/6] btrfs: switch to iomap_dio_rw() for dio Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 4/6] fs: remove dio_end_io() Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 5/6] btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 6/6] btrfs: split btrfs_direct_IO to read and write part Goldwyn Rodrigues

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200708065127.GM2005@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=cluster-devel@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=dsterba@suse.cz \
    --cc=fdmanana@gmail.com \
    --cc=hch@lst.de \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=rgoldwyn@suse.de \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).