linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nick Piggin <piggin@cyberone.com.au>
To: Andrea Arcangeli <andrea@suse.de>
Cc: Con Kolivas <ckolivas@yahoo.com.au>,
	lkml <linux-kernel@vger.kernel.org>, Jens Axboe <axboe@suse.de>
Subject: Re: stochastic fair queueing in the elevator [Re: [BENCHMARK] 2.4.20-ck3 / aa / rmap with contest]
Date: Mon, 10 Feb 2003 14:13:29 +1100	[thread overview]
Message-ID: <3E4718D9.4030805@cyberone.com.au> (raw)
In-Reply-To: 20030209144622.GB31401@dualathlon.random

Andrea Arcangeli wrote:

>The only way to get the minimal possible latency and maximal fariness is
>my new stochastic fair queueing idea.
>
Sounds nice. I would like to see per process disk distribution.

>
>
>The troughput/latency curve has only a few points where the throughput
>incrase while latency decrease. The miniumum latency doesn't necessairly
>mean minimal throughput as maxmium latency doesn't necessairly imply
>maximal throughput.
>
>But clearly maximum throughput likely implys not minium latency and the
>other way around.
>
>I described my idea to Jens a few days ago and he's working on it right
>now AFIK, incidentally he had an equivalent idea in the past and he had
>some old code that he's starting with.  For the details on how it works
>you can read the stochastic fair queueing in the network packet
>scheduler (net/sched/sch_sfq.c).
>
>The design I proposed is to have multiple I/O queues, where to apply the
>elevator, and to choose the queue in function of the task->pid that is
>sumbitting the bh/bio. You'll have to apply an hash to the pid and you
>probably want a perturbation timer that will change the hash function
>every 30 sec or so. Plus I want a special queue for everything
>asynchronoys. So that the asynchronoys queue will be elevated and
>reordered like today since it's not latency critical. I suggeted Jens to
>differentiate it by checking current->mm, all the async stuff is
>normally submitted by the kernel daemons. A more finegrined way is to
>add a bitflag that you have to set between the bh lock and the submit_bh
>and that will be cleared by unlock_buffer (or equivalent one for the bio
>in 2.5). But the current->mm will take care of async-io with keventd too
>so it should be just fine (modulo the first aio submit but it's a minor
>issue).
>
>Then the lowlevel drivers will pick requests from these queues in round
>robin (other variations may be possible too, like two requests per queue
>or stuff like that, but 1 request per queue should give an unbelieveable
>responsiveness to the system, something that we never experienced before).
>
However dependant reads can not merge with each other obviously so
you could end up say submitting 4K reads per process.

>
>
>This stochastic fair queueing I/O scheduling algorithm will be the best
>for desktop/multimedia, i.e. when your priority is that xmms or mplayer
>never skips no matter how many cp /dev/zero . you're running in
>background. Or also for a multiuser system. There is no other possible
>definitive fix IMHO. The only other possible fix would be to reduce the
>I/O queue to say 512kbytes and to take advantage of the FIFO behaviour
>of the queue wakeups, I tried that, it works so well, you can trivially
>test it with my elevator-lowlatency by just changing a line, but the
>problem is 512k is a too small I/O pipeline, i.e. it is not enough to
>guarantee the maximal throughput during contigous I/O. 2M of queue is
>the miniumum for using at best 100Mbyte arrays like my raid box.
>
But your solution also does nothing for sequential IO throughput in
the presence of more than one submitter. Two sequential readers on
different parts of the disk for example. Submit 128K request (4ms),
seek (5ms), submit 128K request (4ms), so your 30MB/s disk gets
less than 15MB/s here. Same with writes.

I think you should be giving each process a timeslice, that way you
don't need to try to account for request size or seek vs IO time and
is easily tunable. This should solve the above problem with say 50ms
timeslices.

It also does little to help dependant reads vs writes or streaming
reads.

For reads, anticipatory scheduling can be very helpful in theory
however it remains to be seen if I can make it work without adding
too much complexity. Your fair queueing should go nicely on top
if I can make it work. I do like your idea though!

Nick


  reply	other threads:[~2003-02-10  3:03 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-02-09 13:30 [BENCHMARK] 2.4.20-ck3 / aa / rmap with contest Con Kolivas
2003-02-09 14:46 ` stochastic fair queueing in the elevator [Re: [BENCHMARK] 2.4.20-ck3 / aa / rmap with contest] Andrea Arcangeli
2003-02-10  3:13   ` Nick Piggin [this message]
2003-02-10  3:52     ` Rik van Riel
2003-02-10  4:44       ` Nick Piggin
2003-02-10  5:15         ` usbaudio.c 2.5.59 John
2003-02-10  7:26         ` stochastic fair queueing in the elevator [Re: [BENCHMARK] 2.4.20-ck3 / aa / rmap with contest] Andrea Arcangeli
2003-02-10  7:43           ` Nick Piggin
2003-02-10  3:42   ` Rik van Riel
2003-02-10  4:15     ` Rik van Riel
2003-02-10  4:19       ` David Lang
2003-02-10  4:29         ` Rik van Riel
2003-02-10  7:20           ` Andrea Arcangeli
2003-02-10  4:33         ` Andrew Morton
2003-02-10  4:47           ` Rik van Riel
2003-02-10  7:31             ` Andrea Arcangeli
2003-02-10  4:51           ` Jakob Oestergaard
2003-02-10  4:58             ` Nick Piggin
2003-02-10  5:10               ` Jakob Oestergaard
2003-02-10  6:06                 ` Valdis.Kletnieks
2003-02-10  6:31                   ` Jakob Oestergaard
2003-02-10  7:36               ` Andrea Arcangeli
2003-02-10  7:41                 ` Nick Piggin
2003-02-10  8:08                   ` Andrea Arcangeli
2003-02-10  8:19                     ` Andrew Morton
2003-02-10  8:56                       ` Andrea Arcangeli
2003-02-10  9:09                         ` Andrew Morton
2003-02-10  9:14                           ` Andrea Arcangeli
2003-02-10 10:07                           ` Hans Reiser
2003-02-10 10:15                             ` Andrea Arcangeli
2003-02-10 10:40                               ` Nick Piggin
2003-02-10 11:10                                 ` Andrea Arcangeli
2003-02-10 11:21                                   ` Andrew Morton
2003-02-10 11:31                                     ` Andrea Arcangeli
2003-02-10 11:24                                   ` Nick Piggin
2003-02-10 11:39                                     ` Andrea Arcangeli
2003-02-10 11:45                                       ` Nick Piggin
2003-02-10 12:00                                         ` Andrea Arcangeli
2003-02-10 12:11                                           ` Nick Piggin
2003-02-10 12:22                                             ` Andrea Arcangeli
2003-02-10 12:36                                               ` Nick Piggin
2003-02-10 12:47                                                 ` Andrea Arcangeli
2003-02-10 13:26                                           ` Rik van Riel
2003-02-10 11:48                                       ` Andrew Morton
2003-02-10 11:53                                         ` Nick Piggin
2003-02-10 12:10                                           ` Andrea Arcangeli
2003-02-10 12:14                                             ` Nick Piggin
2003-02-10 12:26                                               ` Andrea Arcangeli
2003-02-10 12:12                                           ` Andrew Morton
2003-02-10 12:25                                             ` Andrea Arcangeli
2003-02-10 12:27                                             ` Nick Piggin
2003-02-10 12:30                                               ` Andrea Arcangeli
2003-02-10 12:34                                                 ` Nick Piggin
2003-02-10 12:43                                                   ` Andrea Arcangeli
2003-02-10 12:55                                                     ` Nick Piggin
2003-02-10 13:30                                             ` Rik van Riel
2003-02-11 19:13                                               ` Rod Van Meter
2003-02-10 12:09                                         ` Andrea Arcangeli
2003-02-10 12:17                                           ` Nick Piggin
2003-02-10 12:28                                             ` Andrea Arcangeli
2003-02-10 12:58                                           ` Hans Reiser
2003-02-10 13:18                                             ` Andrea Arcangeli
2003-02-10 20:14                                               ` Hans Reiser
2003-02-10 13:19                                             ` Nick Piggin
2003-02-10 14:49                                         ` stochastic fair queueing in the elevator [Re: [BENCHMARK] 2 Giuliano Pochini
2003-02-10 15:05                                           ` Andrea Arcangeli
2003-02-10 11:25                                   ` stochastic fair queueing in the elevator [Re: [BENCHMARK] 2.4.20-ck3 / aa / rmap with contest] Hans Reiser
2003-02-10 11:42                                     ` Andrew Morton
2003-02-10 13:00                                       ` Hans Reiser
2003-02-10 10:48                             ` Andrew Morton
2003-02-10 10:55                               ` Nick Piggin
2003-02-10 11:21                               ` Andrea Arcangeli
2003-02-10 11:33                                 ` Andrew Morton
2003-02-10 11:43                                   ` Andrea Arcangeli
2003-02-10 11:39                                 ` Nick Piggin
2003-02-10  9:59                       ` Hans Reiser
2003-02-10 10:06                         ` Andrew Morton
2003-02-10 10:17                           ` Hans Reiser
2003-02-10 10:39                           ` Hans Reiser
2003-02-10  8:27                     ` Nick Piggin
2003-02-10  9:02                       ` Andrea Arcangeli
2003-02-10  9:18                         ` Nick Piggin
2003-02-10 20:33                         ` Kurt Garloff
2003-02-10 21:43                           ` Rik van Riel
2003-02-10  5:01             ` Andrew Morton
2003-02-10  7:34             ` Andrea Arcangeli
2003-02-10  4:44     ` Rik van Riel
2003-02-10  7:31       ` Andrea Arcangeli
2003-02-10  7:17     ` Andrea Arcangeli
2003-02-10  7:39       ` Nick Piggin
2003-02-10 10:03     ` stochastic fair queueing in the elevator [Re: [BENCHMARK] 2 Giuliano Pochini
2003-02-10 16:23   ` stochastic fair queueing in the elevator [Re: [BENCHMARK] 2.4.20-ck3 / aa / rmap with contest] Pavel Machek
2003-02-11 11:49     ` Andrea Arcangeli
2003-02-11 12:43       ` Jens Axboe
2003-02-11 14:28         ` Jason Lunz
2003-02-11 14:41           ` Jens Axboe
2003-02-11 17:17             ` Jason Lunz
2003-02-11 20:19               ` Jens Axboe
2003-02-10 16:47   ` Pavel Machek
2003-02-11 11:01     ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3E4718D9.4030805@cyberone.com.au \
    --to=piggin@cyberone.com.au \
    --cc=andrea@suse.de \
    --cc=axboe@suse.de \
    --cc=ckolivas@yahoo.com.au \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).