From: Roberto Spadim <roberto@spadim.com.br>
To: "Keld Jørn Simonsen" <keld@keldix.com>
Cc: Jon Nelson <jnelson-linux-raid@jamponi.net>, linux-raid@vger.kernel.org
Subject: Re: What's the typical RAID10 setup?
Date: Wed, 2 Feb 2011 23:57:17 -0200
Message-ID: <AANLkTik=GNgLNK47=aFegbX1_6veMZSgBsxhL8=sN1QT@mail.gmail.com>
In-Reply-To: <AANLkTinROfr2EevWZnq3jzZCW7ctrR_1c6GSevnjQfhK@mail.gmail.com>

I have updated the page again; some of the questions are explained there now
(https://bbs.archlinux.org/viewtopic.php?pid=887345).
Note that this question (an optional I/O mirror scheduler algorithm) is
quite old (about 1.5 years: Chris Worley [Fri, 16 October 2009 21:07] [ID
#2019215]):

http://www.issociate.de/board/post/499463/Load-balancing_mirrors_w/_asymmetric_performance.html


2011/2/2 Roberto Spadim <roberto@spadim.com.br>:
> Nice. I don't know if it's a problem of a single thread;
> I think it's a problem of async read commands being executed in parallel.
> I posted again at https://bbs.archlinux.org/viewtopic.php?pid=887345 --
> please see the history at the end of the page.
> I'm talking about a 5000 rpm disk mixed with a 7000 rpm disk.
> I think we can optimize the mirror read algorithm, and it's not very hard:
> for hard disks of the same speed, closest head is good;
> for solid-state disks of the same speed, round robin is good;
> for any mix of devices, time-based is good.
>
> The differences?
> Hard disk: time to position the head is high; time to read can be small.
> Solid state: time to position is small; time to read is small (some
> SSDs are old and have a low read rate).
> NBD: time depends on the server's hard/solid-state disk plus the network
> time -- but let's not think about NBD yet.
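>
> To make the "model" idea concrete, here is a rough user-space sketch.
> All the names are my invention (nothing like this exists in the kernel);
> times are in microseconds:
>
> /* Hypothetical per-mirror timing model -- an illustrative sketch only. */
> struct mirror_time_model {
>     unsigned long seek_us;     /* average time to position the head    */
>     unsigned long per_kb_us;   /* average transfer time per KiB        */
>     unsigned long pending_us;  /* estimated time of already-queued I/O */
> };
>
> /* Estimated completion time of a read of 'kb' KiB on this mirror. */
> static unsigned long estimate_read_us(const struct mirror_time_model *m,
>                                       unsigned long kb)
> {
>     return m->pending_us + m->seek_us + kb * m->per_kb_us;
> }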
>
> 2011/2/2 Keld Jørn Simonsen <keld@keldix.com>:
>> Hmm, Roberto, I think we are already close to the theoretical maximum
>> with some of the raid1/raid10 code, and my nose tells me
>> that we can gain more by minimizing CPU usage,
>> or maybe by using some threading in the raid modules -- they
>> all run single-threaded.
>>
>> Best regards
>> keld
>>
>>
>> On Wed, Feb 02, 2011 at 06:28:27PM -0200, Roberto Spadim wrote:
>>> Earlier I posted this thread to this page:
>>> https://bbs.archlinux.org/viewtopic.php?pid=887267
>>> to keep the traffic on this mailing list down.
>>>
>>> 2011/2/2 Keld Jørn Simonsen <keld@keldix.com>:
>>> > Hmm, Roberto, where are the gains?
>>>
>>> It's difficult to explain... NCQ and the Linux I/O scheduler don't
>>> help a mirror; they help a single device.
>>> A new scheduler for mirrors can be written (round robin, closest head,
>>> and others).
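>>>
>>> For instance, the two classic policies look roughly like this in user
>>> space (illustrative only: 'nmirrors' and 'head_pos[]' are names I
>>> invented, and the real md raid1 code is organized differently):
>>>
>>> #include <stdlib.h>   /* labs() */
>>>
>>> /* Round robin: rotate through the mirrors on every read. */
>>> static int pick_round_robin(int nmirrors)
>>> {
>>>     static int last = -1;
>>>     last = (last + 1) % nmirrors;
>>>     return last;
>>> }
>>>
>>> /* Closest head: pick the mirror whose head is nearest the target. */
>>> static int pick_closest_head(int nmirrors, const long head_pos[],
>>>                              long target_sector)
>>> {
>>>     int i, best = 0;
>>>     for (i = 1; i < nmirrors; i++)
>>>         if (labs(head_pos[i] - target_sector) <
>>>             labs(head_pos[best] - target_sector))
>>>             best = i;
>>>     return best;
>>> }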
>>>
>>> > I think it is hard to make raid1 better than it is today.
>>> I don't think so. The closest-head idea applies only to (rotational)
>>> hard disks, not to solid-state disks. But let's leave SSDs aside and
>>> talk only about hard disks. In a mirror with a 5000 rpm and a 10000 rpm
>>> disk, will reads be faster from the 10000 rpm disk? We don't know the
>>> I/O model of that device, but it probably will be faster -- yet when
>>> it's busy we could use the 5000 rpm disk instead... That's the point:
>>> closest head alone doesn't help; we also need to know the queue (the
>>> list of I/Os being processed) and the time to read the current I/O.
>>>
>>> > Normally the driver orders the reads to minimize head movement
>>> > and loss from rotational latency. Where can we improve that?
>>>
>>> There is no way to improve that -- it's very good! But it works per
>>> hard disk, not per mirror. Since we can know that a disk is busy, we
>>> can use another mirror (another disk with the same information);
>>> that's what I want.
>>>
>>> > Also, what about conflicts with the elevator algorithm?
>>> Elevators are based on a model of the disk. Think of a disk as:
>>> Linux elevator + NCQ + physical disk. The sum of those three sources
>>> of information gives us the time-based information to select the best
>>> device.
>>> With more complex code (per elevator) we could compute the time spent
>>> executing each request, but that's a lot of work.
>>> For a first version, let's only think about the parameters of our
>>> model (Linux elevator + NCQ + disk).
>>> In a second version we could implement per-elevator time calculation
>>> (a network block device (NBD) has an elevator at the server side, plus
>>> the TCP/IP stack at both the client and server sides, right?).
>>>
>>> > There are several scheduling algorithms available, and each has
>>> > its merits. Will your new scheme work against these?
>>> > Or is your new scheme just another scheduling algorithm?
>>>
>>> It's a scheduler for mirrors.
>>> Round robin is an algorithm for mirrors;
>>> closest head is an algorithm for mirrors;
>>> my 'new' algorithm would also be for mirrors (if anyone will help me
>>> code it for the Linux kernel, hehehe -- I haven't written Linux kernel
>>> code yet, only user-space code).
>>>
>>> noop, deadline and cfq aren't for mirrors; they address the
>>> single-device problem (and raid0-style layouts -- linear or stripe --
>>> which behave like one hard disk with more than one head).
>>>
>>> > I think I learned that scheduling is per drive, not per file system.
>>> Yes, you learned right! =)
>>> /dev/md0 (raid1) is a device with its own scheduling (closest head,
>>> round robin);
>>> /dev/sda is a device with scheduling (noop, deadline, cfq, others);
>>> /dev/sda1 is a device with scheduling (it sends all I/O directly to
>>> /dev/sda).
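>>>
>>> (To see the layering: the per-disk elevator is exposed in sysfs at
>>> /sys/block/sda/queue/scheduler, while raid1's read policy is
>>> hard-coded in read_balance() in drivers/md/raid1.c. A trivial
>>> user-space check -- assuming a disk named sda exists:)
>>>
>>> #include <stdio.h>
>>>
>>> /* Print the elevator currently selected for sda; the active one is
>>>  * shown in brackets, e.g. "noop deadline [cfq]". */
>>> int main(void)
>>> {
>>>     char line[128];
>>>     FILE *f = fopen("/sys/block/sda/queue/scheduler", "r");
>>>     if (!f)
>>>         return 1;
>>>     if (fgets(line, sizeof line, f))
>>>         printf("sda elevator: %s", line);
>>>     fclose(f);
>>>     return 0;
>>> }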
>>>
>>> The new algorithm is only for mirrors (raid1). I don't remember
>>> whether raid5/6 are mirror-based too; if they are, they could be
>>> optimized with this algorithm as well.
>>>
>>> raid0 doesn't have mirrors, but the information is striped across
>>> devices (not for linear); that's why it can be faster: it can do
>>> parallel reads.
>>>
>>> With closest head we can't always use the best disk: we may use a
>>> single disk all the time just because its head is closer, even if
>>> it's not the fastest disk. (That's why write-mostly was implemented:
>>> we don't use those devices for reads, only for writes, or when a
>>> mirror fails. But it's not perfect for speed; a better algorithm can
>>> be made. For identical disks, round robin works well -- better than
>>> closest head if they are solid-state disks.)
>>> OK, under high load, maybe closest mirror is better than this
>>> algorithm? Yes, if you only use hard disks. But if you mix hard disks
>>> + solid state + network block devices + floppy disks + any other
>>> device, there is no best algorithm for I/O over mirrors today.
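>>>
>>> (For reference, write-mostly is a per-device flag given when creating
>>> or adding members; the device names below are just examples. To keep
>>> a slow second disk out of the read path:
>>>
>>>     mdadm --create /dev/md0 --level=1 --raid-devices=2 \
>>>         /dev/sda1 --write-mostly /dev/sdb1
>>>
>>> Reads then go to /dev/sda1 unless it fails.)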
>>>
>>>
>>> > and is it reading or writing or both? Normally we are dependent on the
>>> > reading, as we cannot process data before we have read them.
>>> > OTOH writing is less time critical, as nobody is waiting for it.
>>> It must be implemented for both writes and reads: on writes only for
>>> the time accounting, on reads to select the best mirror.
>>> For writes we must write to all mirrors (synchronous writes are
>>> better; async isn't safe across a power failure).
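>>>
>>> A sketch of that write side, reusing the mirror_time_model struct
>>> from the sketch above (illustrative only: submit_write() and now_us()
>>> are stand-ins I made up, declared extern just so the unit compiles;
>>> assume kb > 0):
>>>
>>> extern unsigned long now_us(void);
>>> extern void submit_write(int mirror, const void *buf, unsigned long kb);
>>>
>>> /* Fan the write out to every mirror, and feed the measured latency
>>>  * back into that mirror's timing model (a simple moving average). */
>>> static void mirror_write(struct mirror_time_model models[], int nmirrors,
>>>                          const void *buf, unsigned long kb)
>>> {
>>>     int i;
>>>     for (i = 0; i < nmirrors; i++) {
>>>         unsigned long t0 = now_us();
>>>         submit_write(i, buf, kb);    /* every mirror gets the write */
>>>         models[i].per_kb_us =
>>>             (3 * models[i].per_kb_us + (now_us() - t0) / kb) / 4;
>>>     }
>>> }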
>>>
>>> > Or is it maximum thruput you want?
>>> > Or a mix, given some restraints?
>>> It's maximum performance: what is the best strategy to spend the
>>> least time executing the current I/O, based on the time to access the
>>> disk, the time to read the bytes, and the time to wait for other I/Os
>>> already being executed.
>>>
>>> That's for mirror selection, not for the disks' own I/O.
>>> For the disks we can keep the noop, deadline or cfq schedulers,
>>> and TCP/IP tweaks for network block devices.
>>>
>>> A model-identification step must run first, to tell the
>>> mirror-selection algorithm what the model of each device is.
>>> Model: time to read X bytes, time to move the head, time to start a
>>> read, time to write -- times per byte, per KB, per unit.
>>> Then calculate the time on each device and select the one with the
>>> minimal calculated value as the mirror to execute our read.
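>>>
>>> In code the selection step is tiny once the model exists -- again a
>>> user-space sketch only, reusing the invented mirror_time_model and
>>> estimate_read_us() names from above:
>>>
>>> /* Pick the mirror with the smallest estimated completion time. */
>>> static int pick_time_based(const struct mirror_time_model models[],
>>>                            int nmirrors, unsigned long kb)
>>> {
>>>     int i, best = 0;
>>>     unsigned long t, tmin = estimate_read_us(&models[0], kb);
>>>     for (i = 1; i < nmirrors; i++) {
>>>         t = estimate_read_us(&models[i], kb);
>>>         if (t < tmin) {
>>>             tmin = t;
>>>             best = i;
>>>         }
>>>     }
>>>     return best;
>>> }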
>>>
>>>
>>> >
>>> > best regards
>>> > keld
>>>
>>> Thanks, Keld.
>>>
>>> Sorry if I'm making the mailing-list thread so long.
>>>
>>>
>>>
>>> --
>>> Roberto Spadim
>>> Spadim Technology / SPAEmpresarial
>>
>
>
>
> --
> Roberto Spadim
> Spadim Technology / SPAEmpresarial
>



-- 
Roberto Spadim
Spadim Technology / SPAEmpresarial
