linux-fsdevel.vger.kernel.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>,
	Goldwyn Rodrigues <rgoldwyn@suse.de>,
	linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org,
	fdmanana@gmail.com, dsterba@suse.cz, darrick.wong@oracle.com,
	cluster-devel@redhat.com, linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org
Subject: Re: always fall back to buffered I/O after invalidation failures, was: Re: [PATCH 2/6] iomap: IOMAP_DIO_RWF_NO_STALE_PAGECACHE return if page invalidation fails
Date: Wed, 8 Jul 2020 18:11:52 +0100
Message-ID: <20200708171152.GV25523@casper.infradead.org>
In-Reply-To: <20200708165412.GA637@lst.de>

On Wed, Jul 08, 2020 at 06:54:12PM +0200, Christoph Hellwig wrote:
> On Wed, Jul 08, 2020 at 02:54:37PM +0100, Matthew Wilcox wrote:
> > Direct I/O isn't deterministic though.  If the file isn't shared, then
> > it works great, but as soon as you get mixed buffered and direct I/O,
> > everything is already terrible.  Direct I/Os perform pagecache lookups
> > already, but instead of using the data that we found in the cache, we
> > (if it's dirty) write it back, wait for the write to complete, remove
> > the page from the pagecache and then perform another I/O to get the data
> > that we just wrote out!  And then the app that's using buffered I/O has
> > to read it back in again.
> 
> Mostly agreed.  That being said I suspect invalidating clean cache
> might still be a good idea.  The original idea was mostly on how
> to deal with invalidation failures of any kind, but falling back for
> any kind of dirty cache also makes at least some sense.

That's certainly the btrfs problem that needs to be solved, but I think
it's all part of the directio misdesign.
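A toy sketch of the sequence described in the quoted text above: a direct read finds the page in the cache, writes it back if dirty, drops it, and then goes to the device anyway. This is only an illustrative model, not kernel code. The class `ToyFile`, its methods, and the I/O counters are all hypothetical names, and the model assumes invalidation always succeeds, which is exactly what the patch under discussion cannot assume.

```python
class ToyFile:
    """Hypothetical model of one file: backing 'disk' blocks plus a page cache."""

    def __init__(self, nblocks):
        self.disk = [b"old"] * nblocks
        self.cache = {}        # block index -> (data, dirty flag)
        self.disk_reads = 0    # counts simulated device reads
        self.disk_writes = 0   # counts simulated device writes

    def buffered_write(self, idx, data):
        # Buffered writes only dirty the cached page; nothing hits the disk yet.
        self.cache[idx] = (data, True)

    def buffered_read(self, idx):
        # Buffered reads are served from the cache, filling it on a miss.
        if idx not in self.cache:
            self.disk_reads += 1
            self.cache[idx] = (self.disk[idx], False)
        return self.cache[idx][0]

    def direct_read(self, idx):
        # Direct reads look the page up, but never use the cached data.
        page = self.cache.get(idx)
        if page is not None:
            data, dirty = page
            if dirty:
                # Write back and (in this model, implicitly) wait for completion.
                self.disk[idx] = data
                self.disk_writes += 1
            # Remove the page from the cache, clean or dirty.
            del self.cache[idx]
        # Then issue another I/O to fetch what may have just been written out.
        self.disk_reads += 1
        return self.disk[idx]
```

In this model, a `buffered_write` followed by a `direct_read` and a `buffered_read` of the same block costs one device write and two device reads, although the data was in memory the whole time.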

> > I have had an objection raised off-list.  In a scenario with a block
> > device shared between two systems and an application which does direct
> > I/O, everything is normally fine.  If one of the systems uses tar to
> > back up the contents of the block device then the application on that
> > system will no longer see the writes from the other system because
> > there's nothing to invalidate the pagecache on the first system.
> 
> Err, WTF?  If someone accesses shared block devices with random
> applications all bets are off anyway.

That doesn't mean that customers don't do it.  It is, of course, not
recommended, but we suspect people do it anyway.  Because it does
work, unfortunately.  I'd be open to making this exact situation
deterministically not work (eg disallowing mixing O_DIRECT and
non-O_DIRECT openers of block devices), but making it suddenly
non-deterministically give you old data is a non-starter.
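One way to picture the "deterministically not work" option is a per-device count of openers in each mode, with any open that would mix the two modes refused outright. The sketch below is a toy model under that assumption, not an existing kernel interface. `ToyBlockDevice` is a hypothetical name, and returning EBUSY is an arbitrary choice made for illustration.

```python
import errno


class ToyBlockDevice:
    """Hypothetical model of a block device that refuses mixed-mode openers."""

    def __init__(self):
        self.direct_openers = 0
        self.buffered_openers = 0

    def open(self, direct):
        # Refuse any open that would mix O_DIRECT and buffered users.
        # EBUSY is an assumption made for this sketch.
        if direct and self.buffered_openers:
            raise OSError(errno.EBUSY, "buffered openers present")
        if not direct and self.direct_openers:
            raise OSError(errno.EBUSY, "O_DIRECT openers present")
        if direct:
            self.direct_openers += 1
        else:
            self.buffered_openers += 1

    def close(self, direct):
        # Drop one opener of the given mode.
        if direct:
            self.direct_openers -= 1
        else:
            self.buffered_openers -= 1
```

Under this policy, the tar backup in the scenario above would fail at open time while the direct I/O application holds the device, rather than silently populating a pagecache that nothing invalidates.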

> > Unfortunately, this is in direct conflict with the performance
> > problem caused by some little arsewipe deciding to do:
> > 
> > $ while true; do dd if=/lib/x86_64-linux-gnu/libc-2.30.so iflag=direct of=/dev/null; done
> > 
> > ... doesn't hurt me because my root filesystem is on ext4 which doesn't
> > purge the cache.  But anything using iomap gets all the pages for libc
> > kicked out of the cache, and that's a lot of fun.
> 
> ext4 uses iomap..

I happen to be running an older kernel that doesn't on this laptop ;-)

Thread overview: 30+ messages
2020-06-29 19:23 [PATCH 0/6 v10] btrfs direct-io using iomap Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 1/6] iomap: Convert wait_for_completion to flags Goldwyn Rodrigues
2020-06-29 23:03   ` David Sterba
2020-06-30 16:35   ` David Sterba
2020-07-01  7:34     ` Johannes Thumshirn
2020-07-01  7:50   ` Christoph Hellwig
2020-06-29 19:23 ` [PATCH 2/6] iomap: IOMAP_DIO_RWF_NO_STALE_PAGECACHE return if page invalidation fails Goldwyn Rodrigues
2020-07-01  7:53   ` always fall back to buffered I/O after invalidation failures, was: " Christoph Hellwig
2020-07-07 12:43     ` Goldwyn Rodrigues
2020-07-07 12:57       ` Matthew Wilcox
2020-07-07 13:00         ` Christoph Hellwig
2020-07-08  6:51           ` Dave Chinner
2020-07-08 13:54             ` Matthew Wilcox
2020-07-08 16:54               ` Christoph Hellwig
2020-07-08 17:11                 ` Matthew Wilcox [this message]
2020-07-09  8:26                 ` [Cluster-devel] " Steven Whitehouse
2020-07-09  2:25               ` Dave Chinner
2020-07-09 16:09                 ` Darrick J. Wong
2020-07-09 17:05                   ` Matthew Wilcox
2020-07-09 17:10                     ` Darrick J. Wong
2020-07-09 22:59                       ` Dave Chinner
2020-07-10 16:03                         ` Christoph Hellwig
2020-07-12 11:36                 ` Avi Kivity
2020-07-07 13:49         ` Goldwyn Rodrigues
2020-07-07 14:01           ` Darrick J. Wong
2020-07-07 14:30             ` Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 3/6] btrfs: switch to iomap_dio_rw() for dio Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 4/6] fs: remove dio_end_io() Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 5/6] btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 6/6] btrfs: split btrfs_direct_IO to read and write part Goldwyn Rodrigues
