linux-kernel.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@transmeta.com>
To: Ben LaHaise <bcrl@redhat.com>
Cc: Daniel Phillips <phillips@bonn-fries.net>,
	Rik van Riel <riel@conectiva.com.br>,
	<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>
Subject: Re: [RFC][DATA] re "ongoing vm suckage"
Date: Fri, 3 Aug 2001 22:28:31 -0700 (PDT)
Message-ID: <Pine.LNX.4.33.0108032216350.1032-100000@penguin.transmeta.com>
In-Reply-To: <Pine.LNX.4.33.0108040055090.11200-100000@touchme.toronto.redhat.com>


On Sat, 4 Aug 2001, Ben LaHaise wrote:
>
> > How about capping the number of requests to something sane, like 128? Then
> > the natural request allocation (together with the batching that we already
> > have) should work just dandy.
>
> This has other drawbacks that are quite serious: namely, the order in
> which io is submitted to the block layer is not anywhere close to optimal
> for getting useful amounts of work done.

Now this is true _whatever_ we do.

We all agree that we have to cap the thing somewhere, no?

Which means that we may be cutting off at a point where if we didn't cut
off, we could have merged better etc. So that problem we have regardless
of whether we count bh's submitted to ll_rw_block() or we count requests
submitted to the actual IO layer.

The advantage of cutting off on a per-request basis is:

 - doing contiguous IO is "almost free" on most hardware today. So it's ok
   to allow a lot more IO if it's contiguous - because the cost of doing
   one request (even if large) is usually much lower than the cost of
   doing two (smaller) requests.

 - What we really want to do is to have a sliding window of active
   requests - enough to get reasonable elevator behaviour, and small
   enough to get reasonable latency. Again, for both of these, the
   "request" is the right entity - latency comes mostly from seeks (ie
   between request boundaries), and similarly the elevator obviously works
   on request boundaries too, not on "bh" boundaries.
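
A minimal sketch of the per-request cutoff described above, assuming a
toy model where each buffer is one block and a "request" is a contiguous
run of blocks. The names (Request, RequestWindow, submit, complete_one)
are hypothetical and are not kernel interfaces; the only number taken
from this thread is the cap of 128.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """One contiguous run of blocks: [start, start + count)."""
    start: int
    count: int = 1

    @property
    def end(self) -> int:
        return self.start + self.count


@dataclass
class RequestWindow:
    """Hypothetical queue that caps in-flight *requests*, not buffers."""
    max_requests: int = 128  # the cap floated earlier in the thread
    active: list = field(default_factory=list)

    def submit(self, block: int) -> bool:
        """Queue one block-sized buffer; return False if the caller must wait."""
        for req in self.active:
            if block == req.end:        # back merge: extends an existing run
                req.count += 1
                return True
            if block + 1 == req.start:  # front merge: prepends to a run
                req.start -= 1
                req.count += 1
                return True
        if len(self.active) >= self.max_requests:
            return False                # window full: throttle the submitter
        self.active.append(Request(block))
        return True

    def complete_one(self) -> Request:
        """Retire the oldest request, freeing one slot in the window."""
        return self.active.pop(0)


window = RequestWindow(max_requests=2)
for block in range(1000):               # contiguous IO: all merged into one request
    window.submit(block)
window.submit(50_000)                   # a seek away: takes the second slot
blocked = not window.submit(90_000)     # a third request would exceed the cap
```

In this model a thousand contiguous buffers cost one slot, while three
scattered buffers already hit a cap of two. Counting bh's instead would
throttle the contiguous writer first, which is the wrong way around.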

Also, I doubt it makes all that much sense to change the number of queue
entries based on memory size. It probably makes more sense to scale the
number of requests by disk speed, for example.

[ Although there's almost certainly some amount of correlation - if you
  have 2GB of RAM, you probably have fast disks too. But not the linear
  function that we currently have. ]
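
One possible way to scale by disk speed, sketched under assumptions that
are not in this mail: pick a latency budget and divide it by the typical
time the disk needs per request. The function name, the 500 ms budget
and the clamp bounds of 16 and 128 are all made up for illustration.

```python
def queue_depth(avg_request_ms: float,
                latency_budget_ms: float = 500.0,
                min_depth: int = 16,
                max_depth: int = 128) -> int:
    """Hypothetical sizing rule: a faster disk gets a deeper queue.

    avg_request_ms is the assumed mean service time of one request,
    seek included. Memory size does not enter into it at all.
    """
    if avg_request_ms <= 0:
        raise ValueError("avg_request_ms must be positive")
    depth = int(latency_budget_ms / avg_request_ms)
    # Clamp so that very slow or very fast disks still get a sane window.
    return max(min_depth, min(max_depth, depth))


slow_disk = queue_depth(avg_request_ms=100.0)
fast_disk = queue_depth(avg_request_ms=1.0)
```

The point of the sketch is only the shape of the function: the depth
follows the device, and it saturates instead of growing linearly.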

>			  This situation only gets worse
> as more and more tasks find that they need to clean buffers in order to
> allocate memory, and start throwing more and more buffers from different
> tasks into the io queue (think what happens when two tasks are walking
> the dirty buffer lists locking buffers and then attempting to allocate a
> request which then delays one of the tasks).

Note that this really is a situation we've had forever.

There are good reasons to believe that we should do a better job of
sorting the IO requests at a higher level in _addition_ to the low-level
elevator. Filesystems should strive to allocate blocks contiguously etc,
and we should strive to keep (and write out) the dirty lists etc in a
somewhat chronological order to take advantage of usually contiguous writes
(and maybe actively sort the dirty queue on writes that are _not_ going to
have good locality, like swapping).
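
A toy illustration of why sorting above the elevator can help, assuming
the worst case where only back-to-back submissions get merged (a real
elevator also merges against requests that are already queued). The
helper count_requests and the block numbers are hypothetical.

```python
def count_requests(blocks):
    """Number of requests needed if only adjacent submissions can merge."""
    requests = 0
    prev = None
    for block in blocks:
        if prev is None or block != prev + 1:
            requests += 1  # not contiguous with the previous block: new request
        prev = block
    return requests


# Dirty buffers from two tasks, interleaved in submission order.
interleaved = [100, 900, 101, 901, 102, 902]

unsorted_cost = count_requests(interleaved)
sorted_cost = count_requests(sorted(interleaved))
```

With these inputs the interleaved order needs six requests and the
sorted order needs two, so the same window of requests covers three
times as much IO.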

		Linus

