From: jw schultz <jw@pegasys.ws>
To: linux-kernel@vger.kernel.org
Subject: Re: raid0 slower than devices it is assembled of?
Date: Wed, 17 Dec 2003 18:18:42 -0800
Message-ID: <20031218021842.GF9137@pegasys.ws>
In-Reply-To: <brqlbu$7vh$1@gatekeeper.tmr.com>

On Wed, Dec 17, 2003 at 10:29:18PM +0000, bill davidsen wrote:
> In article <Pine.LNX.4.58.0312160825570.1599@home.osdl.org>,
> Linus Torvalds  <torvalds@osdl.org> wrote:
> | 
> | 
> | On Tue, 16 Dec 2003, Helge Hafting wrote:
> | >
> | > Raid-0 is ideally N times faster than a single disk, when
> | > you have N disks.
> | 
> | Well, that's a _really_ "ideal" world. Ideal to the point of being
> | unrealistic.
> | 
> | In most real-world situations, latency is at least as important as
> | throughput, and often dominates the story. At which point RAID-0 doesn't
> | improve performance one iota (it might make the seeks shorter, but since
> | seek latency tends to be dominated by things like rotational delay and
> | settle times, that's unlikely to be a really noticeable issue).
> 
> Don't forget time in o/s queues; once an array gets loaded that may
> dominate the mechanical latency and transfer times. If you call "access
> time" the sum of all latency between syscall and the first data
> transfer, then reading from multiple drives doesn't reliably help until
> you get the transfer time from an i/o somewhere between 2 and 4x the
> access time. So if the transfer time for a typical i/o is less than 2x
> the typical access time, gains are unlikely. If you set the stripe size
> high enough to make it likely that a typical i/o falls on a single drive
> you usually win. And when the transfer time reaches 4x the access time,
> you almost always win with a split.
> 
> So if you are copying 100MB elements you probably win by spreading the
> i/o, but for more normal things it doesn't much matter.
> 
> THERE'S ONE EXCEPTION: if you have a f/s type which puts the inodes at
> the beginning of the space, and you are creating and deleting a LOT of
> files, with a large stripe you will beat the snot out of one drive and
> the system will bottleneck no end. In that one case you gain by using
> small stripe size and spreading the head motion, even though the file
> i/o itself may really rot. Makes me wish for a f/s which could put the
> inodes in some distributed pattern.
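
The 2x/4x rule of thumb quoted above can be put into numbers.
This is only a sketch of that rule as stated; the function names,
the 12 ms access time and the 40 MB/s transfer rate are made-up
illustration values, not measurements from this thread.

```python
def transfer_ms(size_bytes, mb_per_s):
    """Time to move size_bytes at a sustained rate (1 MB = 10**6 bytes)."""
    return size_bytes / (mb_per_s * 1e6) * 1000.0


def split_verdict(transfer_time_ms, access_time_ms):
    """Classify an i/o by the quoted rule: compare transfer time to access time."""
    ratio = transfer_time_ms / access_time_ms
    if ratio < 2:
        return "gains unlikely"
    if ratio < 4:
        return "may help"
    return "almost always wins"


if __name__ == "__main__":
    access = 12.0  # assumed access time in ms (queueing + seek + rotation)
    for size in (64 * 1024, 1_000_000, 100_000_000):
        t = transfer_ms(size, 40)  # assumed 40 MB/s per drive
        print(size, round(t, 2), split_verdict(t, access))
```

With those assumed numbers a 64 KiB request sits far below the 2x
mark, while a 100 MB copy is far above the 4x mark, which matches
the "100MB elements" remark above.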

If I recall correctly, ext2, like UFS, splits the inode table
up and puts parts of it at the beginning of each cylinder or
block group.  Inode assignment is based on an allocation
rule that spreads them across the disks so that the file data
and inode will be near each other.  ext[23] also has

       -R raid-options
              Set  raid-related options for the filesystem.  Raid
              options are comma separated, and may take an argument
              using  the  equals ('=') sign.  The following
              options are supported:

                   stride=stripe-size
                          Configure the  filesystem  for  a RAID
                          array   with   stripe-size filesystem
                          blocks per stripe.

The purpose of the stride option is so that the inode
table pieces won't all wind up on the same disks, as would
happen if stripe size aligns with block group size, but are
staggered instead.
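
A toy model of why the staggering matters.  This is not the
mke2fs algorithm, just a simplified sketch: it assumes plain
RAID-0 striping in chunks of chunk_blocks filesystem blocks, and
it assumes the per-group metadata is pushed forward by
stride * group_number blocks.  All function and parameter names
are hypothetical.

```python
def disk_for_block(block, chunk_blocks, n_disks):
    """Disk index holding a filesystem block under simple RAID-0 striping."""
    return (block // chunk_blocks) % n_disks


def metadata_disks(n_groups, blocks_per_group, chunk_blocks, n_disks, stride=0):
    """Disk on which each block group's metadata starts.

    stride=0 models no staggering; otherwise the start is offset by
    stride * group (an assumed, simplified staggering rule).
    """
    disks = []
    for group in range(n_groups):
        offset = (stride * group) % blocks_per_group
        start = group * blocks_per_group + offset
        disks.append(disk_for_block(start, chunk_blocks, n_disks))
    return disks


if __name__ == "__main__":
    # 32768 blocks per group (4 KiB blocks), 16-block chunks, 2 disks.
    print(metadata_disks(4, 32768, 16, 2))             # no staggering
    print(metadata_disks(4, 32768, 16, 2, stride=16))  # staggered
```

In this model, when the group size is a multiple of the full
stripe, every group's metadata starts on disk 0 without the
stride and alternates between the disks with it.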

My recollection is that one or both of XFS and JFS store
the inode table in extents which are allocated on demand, so
I would hope they also make inode and file data locality a
priority.




-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw@pegasys.ws

		Remember Cernan and Schmitt
