From: MRK <mrk@shiftmail.org>
To: Richard Scobie <richard@sauce.co.nz>
Cc: Mark Knecht <markknecht@gmail.com>,
	Learner Study <learner.study@gmail.com>,
	linux-raid@vger.kernel.org, keld@dkuug.dk
Subject: Re: Linux Raid performance
Date: Mon, 05 Apr 2010 13:20:14 +0200
Message-ID: <4BB9C76E.7080607@shiftmail.org>
In-Reply-To: <4BB91FBC.10504@sauce.co.nz>

Richard Scobie wrote:
> MRK wrote:
>
>> If this is so, the newer LSI controllers at 6.0 Gbit/s might be able to
>> do better (they supposedly have a faster chip). Also, maybe one could buy
>> more controller cards and divide the drives among those. These two
>
> Yes, both of these would work.
>
> Someone posted previously on this list and was writing at 1.7GB/s 
> using 10 x 15K SAS drives in md RAID0. He did mention the throughput was 
> higher with the LSI SAS2 cards, even with SAS1 port expanders connected.

Not so fast... actually, I see a problem with the earlier deduction about 
where the bottleneck is.

The answer from the LSI engineer suggests that the bottleneck with SATA 
is the number of IOPS, because five connections are established and then 
torn down for each I/O. And this is independent of the size transferred 
by each I/O operation via DMA (the overhead of the data transfer itself 
is the same in the SAS and SATA cases; it's always the same DMA chip 
doing the transfer).

However, if the total number of IOPS really is the bottleneck for SATA 
with the 3.0 Gbit/s LSI cards, why don't they slow down a single SSD 
doing 4K random I/O?

Look at this:
http://www.anandtech.com/show/2954/5
An OCZ Vertex LE doing 162 MB/s of 4K-aligned random writes; that works 
out to 41472 IOPS of independent, unmergeable requests. And that is SATA, 
not SAS. This was on Windows, and unfortunately we don't know which 
controller was used for the benchmark.
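
For reference, the arithmetic behind that figure (assuming the 
benchmark's 162 MB/s is really MiB/s, which is what makes the numbers 
come out even):

    $ echo $((162 * 1024 / 4))    # 162 MiB/s divided by 4 KiB per write
    41472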

During an MD-RAID sequential dd write I have seen Linux (via iostat -x 1) 
merging requests by a factor of at least 400 (sometimes much higher), so 
I suppose the requests issued to the controller would be at least 1.6 MB 
long (the original requests are certainly not shorter than 4K, and 
4K x 400 = 1.6 MB).
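
For anyone who wants to reproduce that observation, something along 
these lines (the device name is just a placeholder for one array member):

    # run the sequential dd write through MD in one terminal;
    # in another, watch a member disk:
    $ iostat -x 1 /dev/sdb
    # wrqm/s over w/s approximates the merge factor; avgrq-sz is the
    # average issued request size in 512-byte sectors (1.6 MB ~ 3200).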

If the system tops out at about 600 MB/s and the writes issued are 1.6 MB 
long or more, it means the controller tops out at 375 IOPS or fewer.
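
Spelled out (decimal megabytes, matching the rough numbers above):

    $ echo $((4 * 400))            # KiB per merged request, ~1.6 MB
    1600
    $ echo $((600 * 1000 / 1600))  # 600 MB/s over 1.6 MB requests
    375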

So how come the controller in the anandtech test above is capable of 
41472 IOPS?


This is also interesting:

Richard Scobie wrote:
> This bottleneck is the SAS controller, at least in my case. I did the 
> same math regarding the streaming performance of one drive times the 
> number of drives and wondered where the shortfall was, after tests 
> showed I could only do streaming reads at 850MB/s on the same array. 
I think if you use dd to read from the 16 underlying devices 
simultaneously and independently, without going through MD (output to 
/dev/null), you should obtain the full aggregate disk speed of 1.4 GB/s 
or so. I think I did this test in the past and noticed exactly that. Can 
you try? I no longer have our big disk array in my hands :-(
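
Something along these lines, with the device names adjusted to your 
setup (sd[b-q] here is just a placeholder for the 16 members):

    # read every member disk in parallel, bypassing MD entirely
    for d in /dev/sd[b-q]; do
        dd if=$d of=/dev/null bs=1M count=4096 &
    done
    wait
    # sum the per-disk rates dd reports, or watch iostat -x 1 meanwhile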

