linux-kernel.vger.kernel.org archive mirror
From: "Mike Snitzer" <snitzer@gmail.com>
To: "Linus Torvalds" <torvalds@linux-foundation.org>
Cc: "Mel Gorman" <mel@csn.ul.ie>,
	"Martin Knoblauch" <spamtrap@knobisoft.de>,
	"Fengguang Wu" <wfg@mail.ustc.edu.cn>,
	"Peter Zijlstra" <peterz@infradead.org>,
	jplatte@naasa.net, "Ingo Molnar" <mingo@elte.hu>,
	linux-kernel@vger.kernel.org,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
	James.Bottomley@steeleye.com
Subject: Re: regression: 100% io-wait with 2.6.24-rcX
Date: Fri, 18 Jan 2008 15:00:28 -0500
Message-ID: <170fa0d20801181200p50556132v3a9bafc9ad9e8c91@mail.gmail.com>
In-Reply-To: <alpine.LFD.1.00.0801180931250.2957@woody.linux-foundation.org>

On Jan 18, 2008 12:46 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Fri, 18 Jan 2008, Mel Gorman wrote:
> >
> > Right, and this is consistent with other complaints about the PFN of the
> > page mattering to some hardware.
>
> I don't think it's actually the PFN per se.
>
> I think it's simply that some controllers (quite probably affected by both
> driver and hardware limits) have some subtle interactions with the size of
> the IO commands.
>
> For example, let's say that you have a controller that has some limit X on
> the size of IO in flight (whether due to hardware or driver issues doesn't
> really matter) in addition to a limit on the size of the scatter-gather
> list. They all tend to have limits, and they differ.
>
> Now, the PFN doesn't matter per se, but the allocation pattern definitely
> matters for whether the IO's are physically contiguous, and thus matters
> for the size of the scatter-gather thing.
>
> Now, generally the rule-of-thumb is that you want big commands, so
> physical merging is good for you, but I could well imagine that the IO
> limits interact, and end up hurting each other. Let's say that a better
> allocation order allows for bigger contiguous physical areas, and thus
> fewer scatter-gather entries.
>
> What does that result in? The obvious answer is
>
>   "Better performance obviously, because the controller needs to do fewer
>    scatter-gather lookups, and the requests are bigger, because there are
>    fewer IO's that hit scatter-gather limits!"
>
> Agreed?
>
> Except maybe the *real* answer for some controllers ends up being
>
>   "Worse performance, because individual commands grow because they don't
>    hit the per-command limits, but now we hit the global size-in-flight
>    limits and have many fewer of these good commands in flight. And while
>    the commands are larger, it means that there are fewer outstanding
>    commands, which can mean that the disk cannot schedule things
>    as well, or makes high latency of command generation by the controller
>    much more visible because there aren't enough concurrent requests
>    queued up to hide it"
>
> Is this the reason? I have no idea. But somebody who knows the AACRAID
> hardware and driver limits might think about interactions like that.
> Sometimes you actually might want to have smaller individual commands if
> there is some other limit that means that it can be more advantageous to
> have many small requests over a few big ones.
>
> RAID might well make it worse. Maybe small requests work better because
> they are simpler to schedule because they only hit one disk (eg if you
> have simple striping)! So that's another reason why one *large* request
> may actually be slower than two requests half the size, even if it's
> against the "normal rule".
>
> And it may be that that AACRAID box takes a big hit on DIO exactly because
> DIO has been optimized almost purely for making one command as big as
> possible.
>
> Just a theory.
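
To put rough numbers on the trade-off described above, here is a minimal back-of-the-envelope sketch. The 1024k in-flight budget and the queue depth of 32 are assumed values chosen only for illustration; they are not known AACRAID limits, and `commands_in_flight` is a hypothetical helper.

```python
def commands_in_flight(inflight_budget_kb: int, command_kb: int, queue_depth: int) -> int:
    """Commands that fit at once under both a size-in-flight budget and a slot limit."""
    if command_kb <= 0:
        raise ValueError("command_kb must be positive")
    return min(queue_depth, inflight_budget_kb // command_kb)


# Hypothetical limits, for illustration only (not measured on any controller).
BUDGET_KB = 1024
QUEUE_DEPTH = 32

for command_kb in (16, 64, 192):
    n = commands_in_flight(BUDGET_KB, command_kb, QUEUE_DEPTH)
    print(f"{command_kb:>3}k commands: {n} in flight")
```

Under these assumed limits, 16k commands fill all 32 slots, while 192k commands leave only 5 outstanding, which is the "fewer outstanding commands" effect the theory relies on.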

Oddly enough, I'm seeing the opposite here with 2.6.22.16 w/ AACRAID
configured with 5 LUNs (each a 2-disk HW RAID0, 1024k stripe size).  That
is, with dd the avgrq-sz (from iostat) shows DIO to be ~130k whereas
non-DIO is a mere ~13k! (NOTE: with aacraid, max_hw_sectors_kb=192)
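
For context on that 192k ceiling, a rough sketch of how one 8192k write would be carved up if the per-request cap were the only constraint. This is an idealized assumption: the effective cap is normally max_sectors_kb, which can be lower than max_hw_sectors_kb, and scatter-gather limits can shrink requests further. `split_into_requests` is a hypothetical helper, not a kernel interface.

```python
def split_into_requests(io_kb: int, max_request_kb: int) -> list[int]:
    """Split one I/O into request sizes (in KB), each no larger than the cap."""
    if io_kb < 0 or max_request_kb <= 0:
        raise ValueError("io_kb must be >= 0 and max_request_kb must be > 0")
    full, rest = divmod(io_kb, max_request_kb)
    return [max_request_kb] * full + ([rest] if rest else [])


# bs=8192k from the dd runs, capped at max_hw_sectors_kb=192.
requests = split_into_requests(8192, 192)
print(len(requests), "requests, last one", requests[-1], "k")
print("average size: %.1fk" % (sum(requests) / len(requests)))
```

With these inputs the split is 43 requests, the last one 128k, so an observed average of ~130k sits well below what the cap alone would allow.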

DIO cmdline:  dd if=/dev/zero of=/dev/sdX bs=8192k count=1k oflag=direct
non-DIO cmdline: dd if=/dev/zero of=/dev/sdX bs=8192k count=1k

DIO is ~80MB/s on each of the 5 LUNs for a total of ~400MB/s
non-DIO is only ~12MB/s on each of the 5 LUNs for a mere ~70MB/s aggregate
(deadline w/ nr_requests=32)
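
For anyone comparing setups, a small sketch that collects the queue settings mentioned here from sysfs. It assumes the usual /sys/block/<dev>/queue layout; the device name "sda" is a placeholder and `read_queue_settings` is a hypothetical helper. Attributes that cannot be read are reported as None.

```python
from pathlib import Path

# Block-queue attributes relevant to this comparison.
QUEUE_ATTRS = ("scheduler", "nr_requests", "max_sectors_kb", "max_hw_sectors_kb")


def read_queue_settings(device: str, sysfs_root: str = "/sys/block") -> dict:
    """Return the queue attributes for one block device as stripped strings."""
    queue_dir = Path(sysfs_root) / device / "queue"
    settings = {}
    for attr in QUEUE_ATTRS:
        try:
            settings[attr] = (queue_dir / attr).read_text().strip()
        except OSError:
            # Missing device or attribute: record it rather than fail.
            settings[attr] = None
    return settings


if __name__ == "__main__":
    # "sda" is a placeholder; substitute the LUN under test.
    for attr, value in read_queue_settings("sda").items():
        print(f"{attr}: {value}")
```

The scheduler file lists all available schedulers with the active one in square brackets.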

Calls into question the theory of small requests being beneficial for
AACRAID.  Martin, what are you seeing for the avg request size when
you're conducting your AACRAID tests?

I can fire up 2.6.24-rc8 in short order to see if things are vastly
improved (as Martin seems to indicate that he is happy with AACRAID on
2.6.24-rc8).  Although even Martin's AACRAID numbers from 2.6.19.2 are
still quite good (relative to mine).  Martin, can you share any tuning
you may have done to get AACRAID to where it is for you right now?

regards,
Mike
