From: Vladislav Bolkhovitin <vst@vlnb.net>
To: Wu Fengguang <fengguang.wu@intel.com>,
Ronald Moesbergen <intercommit@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"kosaki.motohiro@jp.fujitsu.com" <kosaki.motohiro@jp.fujitsu.com>,
"Alan.Brunelle@hp.com" <Alan.Brunelle@hp.com>,
"hifumi.hisashi@oss.ntt.co.jp" <hifumi.hisashi@oss.ntt.co.jp>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"jens.axboe@oracle.com" <jens.axboe@oracle.com>,
"randy.dunlap@oracle.com" <randy.dunlap@oracle.com>,
Bart Van Assche <bart.vanassche@gmail.com>
Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev
Date: Mon, 29 Jun 2009 17:04:57 +0400 [thread overview]
Message-ID: <4A48BBF9.6050408@vlnb.net> (raw)
In-Reply-To: <20090629125434.GA8416@localhost>
Wu Fengguang, on 06/29/2009 04:54 PM wrote:
> On Mon, Jun 29, 2009 at 06:55:40PM +0800, Vladislav Bolkhovitin wrote:
>> Ronald Moesbergen, on 06/29/2009 02:26 PM wrote:
>>> 2009/6/29 Wu Fengguang <fengguang.wu@intel.com>:
>>>> On Sat, Jun 20, 2009 at 08:29:31PM +0800, Vladislav Bolkhovitin wrote:
>>>>> Wu Fengguang, on 06/20/2009 07:55 AM wrote:
>>>>>> On Fri, Jun 19, 2009 at 03:04:36AM +0800, Andrew Morton wrote:
>>>>>>> On Sun, 7 Jun 2009 06:45:38 +0800
>>>>>>> Wu Fengguang <fengguang.wu@intel.com> wrote:
>>>>>>>
>>>>>>>>>> Do you have a place where the raw blktrace data can be retrieved for
>>>>>>>>>> more in-depth analysis?
>>>>>>>>> I think your comment is really adequate. In another thread, Wu Fengguang pointed
>>>>>>>>> out the same issue.
>>>>>>>>> I and Wu also wait his analysis.
>>>>>>>> And do it with a large readahead size :)
>>>>>>>>
>>>>>>>> Alan, this was my analysis:
>>>>>>>>
>>>>>>>> : Hifumi, can you help retest with some large readahead size?
>>>>>>>> :
>>>>>>>> : Your readahead size (128K) is smaller than your max_sectors_kb (256K),
>>>>>>>> : so two readahead IO requests get merged into one real IO, that means
>>>>>>>> : half of the readahead requests are delayed.
>>>>>>>>
>>>>>>>> ie. two readahead requests get merged and complete together, thus the effective
>>>>>>>> IO size is doubled but at the same time it becomes completely synchronous IO.
>>>>>>>>
>>>>>>>> :
>>>>>>>> : The IO completion size goes down from 512 to 256 sectors:
>>>>>>>> :
>>>>>>>> : before patch:
>>>>>>>> : 8,0 3 177955 50.050313976 0 C R 8724991 + 512 [0]
>>>>>>>> : 8,0 3 177966 50.053380250 0 C R 8725503 + 512 [0]
>>>>>>>> : 8,0 3 177977 50.056970395 0 C R 8726015 + 512 [0]
>>>>>>>> : 8,0 3 177988 50.060326743 0 C R 8726527 + 512 [0]
>>>>>>>> : 8,0 3 177999 50.063922341 0 C R 8727039 + 512 [0]
>>>>>>>> :
>>>>>>>> : after patch:
>>>>>>>> : 8,0 3 257297 50.000760847 0 C R 9480703 + 256 [0]
>>>>>>>> : 8,0 3 257306 50.003034240 0 C R 9480959 + 256 [0]
>>>>>>>> : 8,0 3 257307 50.003076338 0 C R 9481215 + 256 [0]
>>>>>>>> : 8,0 3 257323 50.004774693 0 C R 9481471 + 256 [0]
>>>>>>>> : 8,0 3 257332 50.006865854 0 C R 9481727 + 256 [0]
>>>>>>>>
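[Editorial aside: the blktrace field layout above can be checked mechanically. This small sketch, not part of the thread, parses the quoted samples and pulls out the size of each completion ("C") event, making the 512 -> 256 sector drop explicit.]

```python
# Extract the '+ <sectors>' size from blktrace completion events.
# Trace samples are the ones quoted in the message above.
before = """\
8,0 3 177955 50.050313976 0 C R 8724991 + 512 [0]
8,0 3 177966 50.053380250 0 C R 8725503 + 512 [0]
8,0 3 177977 50.056970395 0 C R 8726015 + 512 [0]
8,0 3 177988 50.060326743 0 C R 8726527 + 512 [0]
8,0 3 177999 50.063922341 0 C R 8727039 + 512 [0]
"""
after = """\
8,0 3 257297 50.000760847 0 C R 9480703 + 256 [0]
8,0 3 257306 50.003034240 0 C R 9480959 + 256 [0]
8,0 3 257307 50.003076338 0 C R 9481215 + 256 [0]
8,0 3 257323 50.004774693 0 C R 9481471 + 256 [0]
8,0 3 257332 50.006865854 0 C R 9481727 + 256 [0]
"""

def completion_sizes(trace):
    """Return the sector count of every completion ('C') event."""
    sizes = []
    for line in trace.splitlines():
        f = line.split()
        if f and f[5] == "C":        # event type field
            sizes.append(int(f[9]))  # size in sectors, after the '+'
    return sizes

print(completion_sizes(before))  # [512, 512, 512, 512, 512]
print(completion_sizes(after))   # [256, 256, 256, 256, 256]
```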
>>>>>>> I haven't sent readahead-add-blk_run_backing_dev.patch in to Linus yet
>>>>>>> and it's looking like 2.6.32 material, if ever.
>>>>>>>
>>>>>>> If it turns out to be wonderful, we could always ask the -stable
>>>>>>> maintainers to put it in 2.6.x.y I guess.
>>>>>> Agreed. The expected (and interesting) test on a properly configured
>>>>>> HW RAID has not happened yet, hence the theory remains unsupported.
>>>>> Hmm, do you see anything improper in the Ronald's setup (see
>>>>> http://sourceforge.net/mailarchive/forum.php?thread_name=a0272b440906030714g67eabc5k8f847fb1e538cc62%40mail.gmail.com&forum_name=scst-devel)?
>>>>> It is HW RAID based.
>>>> No. Ronald's HW RAID performance is reasonably good. I meant Hifumi's
>>>> RAID performance is too bad and may be improved by increasing the
>>>> readahead size, hehe.
>>>>
>>>>> As I already wrote, we can ask Ronald to perform any needed tests.
>>>> Thanks! Ronald's test results are:
>>>>
>>>> 231 MB/s HW RAID
>>>> 69.6 MB/s HW RAID + SCST
>>>> 89.7 MB/s HW RAID + SCST + this patch
>>>>
>>>> So this patch seems to help SCST, but again it would be better to
>>>> improve the SCST throughput first - it is now quite sub-optimal.
>>>> (Sorry for the long delay: currently I have not got an idea on
>>>> how to measure such timing issues.)
>>>>
>>>> And if Ronald could provide the HW RAID performance with this patch,
>>>> then we can confirm if this patch really makes a difference for RAID.
>>> I just tested raw HW RAID throughput with the patch applied, same
>>> readahead setting (512KB), and it doesn't look promising:
>>>
>>> ./blockdev-perftest -d -r /dev/cciss/c0d0
>>> blocksize W W W R R R
>>> 67108864 -1 -1 -1 5.59686 5.4098 5.45396
>>> 33554432 -1 -1 -1 6.18616 6.13232 5.96124
>>> 16777216 -1 -1 -1 7.6757 7.32139 7.4966
>>> 8388608 -1 -1 -1 8.82793 9.02057 9.01055
>>> 4194304 -1 -1 -1 12.2289 12.6804 12.19
>>> 2097152 -1 -1 -1 13.3012 13.706 14.7542
>>> 1048576 -1 -1 -1 11.7577 12.3609 11.9507
>>> 524288 -1 -1 -1 12.4112 12.2383 11.9105
>>> 262144 -1 -1 -1 7.30687 7.4417 7.38246
>>> 131072 -1 -1 -1 7.95752 7.95053 8.60796
>>> 65536 -1 -1 -1 10.1282 10.1286 10.1956
>>> 32768 -1 -1 -1 9.91857 9.98597 10.8421
>>> 16384 -1 -1 -1 10.8267 10.8899 10.8718
>>> 8192 -1 -1 -1 12.0345 12.5275 12.005
>>> 4096 -1 -1 -1 15.1537 15.0771 15.1753
>>> 2048 -1 -1 -1 25.432 24.8985 25.4303
>>> 1024 -1 -1 -1 45.2674 45.2707 45.3504
>>> 512 -1 -1 -1 87.9405 88.5047 87.4726
>>>
>>> It dropped down to 189 MB/s. :(
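[Editorial aside: the 189 MB/s figure can be reproduced from the table, assuming each blockdev-perftest run transfers a fixed 1 GiB (an assumption on my part, not stated in the thread): throughput is total size over elapsed time, and the fastest 67108864-byte-block read run took 5.4098 s.]

```python
# Convert a blockdev-perftest run time to MB/s, assuming a fixed
# 1 GiB transfer per run (hypothetical; adjust if the tool differs).
def throughput_mbs(seconds, total_mib=1024.0):
    return total_mib / seconds

print(round(throughput_mbs(5.4098)))  # fastest 64 MiB-block read run
```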
>> Ronald,
>>
>> Can you, please, rerun this test locally on the target with the latest
>> version of blockdev-perftest, which produces much more readable results,
>
> Is blockdev-perftest publicly available? It's not obvious from a Google search.
>
>> for the following 6 cases:
>>
>> 1. Default vanilla 2.6.29 kernel, default parameters, including read-ahead
>
> Why not 2.6.30? :)
We started with 2.6.29, so why not finish with it? That would save
Ronald the additional effort of moving to 2.6.30.
>> 2. Default vanilla 2.6.29 kernel, 512 KB read-ahead, the rest is default
>
> How about 2MB RAID readahead size? That transforms into about 512KB
> per-disk readahead size.
OK. Ronald, can you run 4 more test cases, please:
7. Default vanilla 2.6.29 kernel, 2MB read-ahead, the rest is default
8. Default vanilla 2.6.29 kernel, 2MB read-ahead, 64 KB
max_sectors_kb, the rest is default
9. Patched by the Fengguang's patch vanilla 2.6.29 kernel, 2MB
read-ahead, the rest is default
10. Patched by the Fengguang's patch vanilla 2.6.29 kernel, 2MB
read-ahead, 64 KB max_sectors_kb, the rest is default
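[Editorial aside: the arithmetic behind "2MB RAID readahead ~ 512KB per-disk" is that a striped array spreads a readahead window across its data disks, so the per-disk share is roughly the array readahead divided by the number of data disks. A minimal sketch, assuming 4 data disks to match the quoted figures:]

```python
# Per-disk share of an array-level readahead window on a striped
# array (data_disks = 4 is an assumption matching the 2 MB -> 512 KB
# figure in the thread, not a number stated there).
def per_disk_readahead_kb(array_ra_kb, data_disks):
    return array_ra_kb // data_disks

print(per_disk_readahead_kb(2048, 4))  # 2 MB array readahead -> 512 KB per disk
```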
>> 3. Default vanilla 2.6.29 kernel, 512 KB read-ahead, 64 KB
>> max_sectors_kb, the rest is default
>>
>> 4. Patched by the Fengguang's patch http://lkml.org/lkml/2009/5/21/319
>> vanilla 2.6.29 kernel, default parameters, including read-ahead
>>
>> 5. Patched by the Fengguang's patch vanilla 2.6.29 kernel, 512 KB
>> read-ahead, the rest is default
>>
>> 6. Patched by the Fengguang's patch vanilla 2.6.29 kernel, 512 KB
>> read-ahead, 64 KB max_sectors_kb, the rest is default
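[Editorial aside: a note on applying the settings in the test matrix. The `blockdev --setra` tool takes its argument in 512-byte sectors rather than KB, so the read-ahead values above need converting; `max_sectors_kb` is set directly in KB via sysfs. The device path below is an example taken from the test earlier in the thread.]

```python
# Convert a read-ahead size in KB to the 512-byte-sector units
# expected by 'blockdev --setra'.
def ra_kb_to_sectors(ra_kb):
    return ra_kb * 1024 // 512

print(ra_kb_to_sectors(512))   # e.g. blockdev --setra 1024 /dev/cciss/c0d0
print(ra_kb_to_sectors(2048))  # e.g. blockdev --setra 4096 /dev/cciss/c0d0
# max_sectors_kb needs no conversion:
#   echo 64 > /sys/block/<dev>/queue/max_sectors_kb
```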
>
> Thanks,
> Fengguang
>
>