From: Wu Fengguang <fengguang.wu@intel.com>
To: Theodore Tso <tytso@mit.edu>,
	Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	Chris Mason <chris.mason@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"Li, Shaohua" <shaohua.li@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"richard@rsk.demon.co.uk" <richard@rsk.demon.co.uk>,
	"jens.axboe@oracle.com" <jens.axboe@oracle.com>
Subject: Re: regression in page writeback
Date: Fri, 2 Oct 2009 16:19:53 +0800
Message-ID: <20091002081953.GA14529@localhost>
In-Reply-To: <20091002025502.GA14246@localhost>

On Fri, Oct 02, 2009 at 10:55:02AM +0800, Wu Fengguang wrote:
> On Fri, Oct 02, 2009 at 05:54:38AM +0800, Theodore Ts'o wrote:
> > On Thu, Oct 01, 2009 at 11:14:29PM +0800, Wu Fengguang wrote:
> > > Yes and no. Yes if the queue was empty for the slow device. No if the
> > > queue was full, in which case IO submission speed = IO complete speed
> > > for previously queued requests.
> > > 
> > > So wbc.timeout will be accurate for IO submission time, and mostly
> > > accurate for IO completion time. The transient queue fill up phase
> > > shall not be a big problem?
> > 
> > So the problem is if we have a mixed workload where there are lots of
> > large contiguous writes, and lots of small writes which are fsync'ed()
> > --- for example, consider the workload of copying lots of big DVD
> > images combined with the infamous firefox-we-must-write-out-300-megs-of-
> > small-random-writes-and-then-fsync-them-on-every-single-url-click-so-
> > that-every-last-visited-page-is-preserved-for-history-bar-autocompletion
> > workload.  The big writes, if they are contiguous, could take 1-2 seconds
> > on a very slow, ancient laptop disk, and that will hold up any kind of
> > small synchronous activities --- such as either a disk read or a firefox-
> > triggered fsync().
> 
> Yes, that's a problem. The SYNC/ASYNC elevator queues can help here.
> 
> In IO submission paths, fsync writes will not be blocked by non-sync
> writes because __filemap_fdatawrite_range() starts foreground sync
> for the inode.

> Without the congestion backoff, it will now have to
> compete for the queue with bdi-flush. Should not be a big problem though.

I'd like to correct this: get_request_wait() uses one queue for SYNC
rw and another for ASYNC rw. So fsync won't compete with background
flush for the request queue. That's perfect: when an fsync arrives, CFQ
will give it a green channel and, to some extent, hold back background
flushes.
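
To illustrate the point about separate SYNC and ASYNC request pools,
here is a toy model in Python. It is only a sketch, not kernel code:
the class RequestQueue, its try_get_request() method, and the limit of
128 requests per pool are all made up for illustration, and a real
caller would sleep rather than get False back.

```python
class RequestQueue:
    """Toy model of a request queue with one pool per direction class.

    Hypothetical names; this only mimics the idea that sync and async
    requests are counted (and waited for) separately.
    """

    def __init__(self, nr_requests=128):
        self.limit = nr_requests
        self.in_flight = {"sync": 0, "async": 0}

    def try_get_request(self, is_sync):
        """Allocate a request, or return False where a caller would block."""
        pool = "sync" if is_sync else "async"
        if self.in_flight[pool] >= self.limit:
            return False
        self.in_flight[pool] += 1
        return True


def fsync_blocked_by_flush(nr_requests=128):
    """Fill the async pool as background flush would, then try one sync write."""
    q = RequestQueue(nr_requests)
    while q.try_get_request(is_sync=False):
        pass  # background flush keeps submitting until its pool is full
    # The fsync write allocates from the sync pool, which is still empty.
    return not q.try_get_request(is_sync=True)
```

In this model fsync_blocked_by_flush() returns False: a full ASYNC pool
does not stop a SYNC allocation. What the elevator then does with the
queued requests is a separate question the model does not cover.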

> There's still the problem of IO submission time != IO completion time,
> due to fluctuations of randomness and more. However that's a general
> and unavoidable problem.  Both the wbc.timeout scheme and the
> "wbc.nr_to_write based on estimated throughput" scheme are based on
> _past_ requests and it's simply impossible to have a 100% accurate
> scheme. In principle, wbc.timeout will only be inferior at IO startup
> time. In the steady state of 100% full queue, it is actually estimating
> the IO throughput implicitly :)

Another difference between wbc.timeout and adaptive wbc.nr_to_write
shows up when many _read_ requests or fsyncs arrive: these SYNC rw
requests will significantly lower the ASYNC writeback throughput, if
not stall it completely. With timeout, writeback of the inode will be
aborted with only a few pages written; with nr_to_write, a good number
of the inode's pages will be written, at the cost of taking a long
time.

IMHO the nr_to_write behavior seems more efficient. What do you think?
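
A back-of-the-envelope model of that trade-off, in Python. Everything
here is an assumption for illustration: the function names, the
constant throughput during one pass, and the sample numbers (a 1 second
timeout, an nr_to_write of 10000 pages, and ASYNC throughput dropping
from 10000 to 100 pages/s under SYNC load). It also assumes the inode
has more dirty pages than either limit allows.

```python
def writeback_with_timeout(rate_pages_per_s, timeout_s):
    """Time-bounded pass: returns (pages_written, seconds_spent)."""
    if rate_pages_per_s < 0:
        raise ValueError("rate must not be negative")
    # The pass always ends at the timeout, however little was written.
    return int(rate_pages_per_s * timeout_s), timeout_s


def writeback_with_nr_to_write(rate_pages_per_s, nr_to_write):
    """Page-bounded pass: returns (pages_written, seconds_spent)."""
    if rate_pages_per_s <= 0:
        raise ValueError("a fully stalled queue never finishes in this model")
    # The pass always writes the full quota, however long that takes.
    return nr_to_write, nr_to_write / rate_pages_per_s


# Unloaded disk: both schemes behave the same.
idle_timeout = writeback_with_timeout(10000, 1.0)      # (10000, 1.0)
idle_quota = writeback_with_nr_to_write(10000, 10000)  # (10000, 1.0)

# Under heavy SYNC load the ASYNC rate drops to 100 pages/s.
busy_timeout = writeback_with_timeout(100, 1.0)        # (100, 1.0)
busy_quota = writeback_with_nr_to_write(100, 10000)    # (10000, 100.0)
```

With these made-up numbers, the timeout pass gives up on the inode
after 100 pages, while the nr_to_write pass writes all 10000 pages but
holds the inode for 100 seconds.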

Thanks,
Fengguang

> > That's why the IO completion time matters; it causes latency problems
> > for slow disks and mixed large and small write workloads.  It was the
> > original reason for the 1024 MAX_WRITEBACK_PAGES, which might have
> > made sense 10 years ago back when disks were a lot slower.  One of the
> > advantages of an auto-tuning algorithm, beyond auto-adjusting for
> > different types of hardware, is that we don't need to worry about
> > arbitrary and magic caps becoming obsolete due to technological
> > changes.  :-)
> 
> Yeah, I'm a big fan of auto-tuning :)
> 
> Thanks,
> Fengguang
