linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Avi Kivity <avi@scylladb.com>
To: Dave Chinner <david@fromorbit.com>, Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>,
	Goldwyn Rodrigues <rgoldwyn@suse.de>,
	linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org,
	fdmanana@gmail.com, dsterba@suse.cz, darrick.wong@oracle.com,
	cluster-devel@redhat.com, linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org
Subject: Re: always fall back to buffered I/O after invalidation failures, was: Re: [PATCH 2/6] iomap: IOMAP_DIO_RWF_NO_STALE_PAGECACHE return if page invalidation fails
Date: Sun, 12 Jul 2020 14:36:28 +0300	[thread overview]
Message-ID: <f86a687a-29bf-ef9c-844c-4354de9a65bb@scylladb.com> (raw)
In-Reply-To: <20200709022527.GQ2005@dread.disaster.area>


On 09/07/2020 05.25, Dave Chinner wrote:
>
>> Nobody's proposing changing Direct I/O to exclusively work through the
>> pagecache.  The proposal is to behave less weirdly when there's already
>> data in the pagecache.
> No, the proposal it to make direct IO behave *less
> deterministically* if there is data in the page cache.
>
> e.g. Instead of having a predicatable submission CPU overhead and
> read latency of 100us for your data, this proposal makes the claim
> that it is always better to burn 10x the IO submission CPU for a
> single IO to copy the data and give that specific IO 10x lower
> latency than it is to submit 10 async IOs to keep the IO pipeline
> full.
>
> What it fails to take into account is that in spending that CPU time
> to copy the data, we haven't submitted 10 other IOs and so the
> actual in-flight IO for the application has decreased. If
> performance comes from keeping the IO pipeline as close to 100% full
> as possible, then copying the data out of the page cache will cause
> performance regressions.
>
> i.e. Hit 5 page cache pages in 5 IOs in a row, and the IO queue
> depth craters because we've only fulfilled 5 complete IOs instead of
> submitting 50 entire IOs. This is the hidden cost of synchronous IO
> via CPU data copying vs async IO via hardware offload, and if we
> take that into account we must look at future hardware performance
> trends to determine if this cost is going to increase or decrease in
> future.
>
> That is: CPUs are not getting faster anytime soon. IO subsystems are
> still deep in the "performance doubles every 2 years" part of the
> technology curve (pcie 3.0->4.0 just happened, 4->5 is a year away,
> 5->6 is 3-4 years away, etc). Hence our reality is that we are deep
> within a performance trend curve that tells us synchronous CPU
> operations are not getting faster, but IO bandwidth and IOPS are
> going to increase massively over the next 5-10 years. Hence putting
> (already expensive) synchronous CPU operations in the asynchronous
> zero-data-touch IO fast path is -exactly the wrong direction to be
> moving-.
>
> This is simple math. The gap between IO latency and bandwidth and
> CPU addressable memory latency and bandwidth is closing all the
> time, and the closer that gap gets the less sense it makes to use
> CPU addressable memory for buffering syscall based read and write
> IO. We are not quite yet at the cross-over point, but we really
> aren't that far from it.
>
>

My use-case supports this. The application uses AIO+DIO, but backup may 
bring pages into page cache. For me, it is best to ignore page cache (as 
long as it's clean, which it is for backup) and serve from disk as usual.



  parent reply	other threads:[~2020-07-12 11:36 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-29 19:23 [PATCH 0/6 v10] btrfs direct-io using iomap Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 1/6] iomap: Convert wait_for_completion to flags Goldwyn Rodrigues
2020-06-29 23:03   ` David Sterba
2020-06-30 16:35   ` David Sterba
2020-07-01  7:34     ` Johannes Thumshirn
2020-07-01  7:50   ` Christoph Hellwig
2020-06-29 19:23 ` [PATCH 2/6] iomap: IOMAP_DIO_RWF_NO_STALE_PAGECACHE return if page invalidation fails Goldwyn Rodrigues
2020-07-01  7:53   ` always fall back to buffered I/O after invalidation failures, was: " Christoph Hellwig
2020-07-07 12:43     ` Goldwyn Rodrigues
2020-07-07 12:57       ` Matthew Wilcox
2020-07-07 13:00         ` Christoph Hellwig
2020-07-08  6:51           ` Dave Chinner
2020-07-08 13:54             ` Matthew Wilcox
2020-07-08 16:54               ` Christoph Hellwig
2020-07-08 17:11                 ` Matthew Wilcox
2020-07-09  8:26                 ` [Cluster-devel] " Steven Whitehouse
2020-07-09  2:25               ` Dave Chinner
2020-07-09 16:09                 ` Darrick J. Wong
2020-07-09 17:05                   ` Matthew Wilcox
2020-07-09 17:10                     ` Darrick J. Wong
2020-07-09 22:59                       ` Dave Chinner
2020-07-10 16:03                         ` Christoph Hellwig
2020-07-12 11:36                 ` Avi Kivity [this message]
2020-07-07 13:49         ` Goldwyn Rodrigues
2020-07-07 14:01           ` Darrick J. Wong
2020-07-07 14:30             ` Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 3/6] btrfs: switch to iomap_dio_rw() for dio Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 4/6] fs: remove dio_end_io() Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 5/6] btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK Goldwyn Rodrigues
2020-06-29 19:23 ` [PATCH 6/6] btrfs: split btrfs_direct_IO to read and write part Goldwyn Rodrigues

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f86a687a-29bf-ef9c-844c-4354de9a65bb@scylladb.com \
    --to=avi@scylladb.com \
    --cc=cluster-devel@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=dsterba@suse.cz \
    --cc=fdmanana@gmail.com \
    --cc=hch@lst.de \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=rgoldwyn@suse.de \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).