From: Ronald Moesbergen <intercommit@gmail.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Vladislav Bolkhovitin <vst@vlnb.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	"kosaki.motohiro@jp.fujitsu.com" <kosaki.motohiro@jp.fujitsu.com>,
	"Alan.Brunelle@hp.com" <Alan.Brunelle@hp.com>,
	"hifumi.hisashi@oss.ntt.co.jp" <hifumi.hisashi@oss.ntt.co.jp>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"jens.axboe@oracle.com" <jens.axboe@oracle.com>,
	"randy.dunlap@oracle.com" <randy.dunlap@oracle.com>
Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev
Date: Mon, 29 Jun 2009 12:26:16 +0200
Message-ID: <a0272b440906290326rcd63849j2513f6ee9b9bf93e@mail.gmail.com>
In-Reply-To: <20090629093423.GB1315@localhost>

2009/6/29 Wu Fengguang <fengguang.wu@intel.com>:
> On Sat, Jun 20, 2009 at 08:29:31PM +0800, Vladislav Bolkhovitin wrote:
>>
>> Wu Fengguang, on 06/20/2009 07:55 AM wrote:
>> > On Fri, Jun 19, 2009 at 03:04:36AM +0800, Andrew Morton wrote:
>> >> On Sun, 7 Jun 2009 06:45:38 +0800
>> >> Wu Fengguang <fengguang.wu@intel.com> wrote:
>> >>
>> >>>>> Do you have a place where the raw blktrace data can be retrieved for
>> >>>>> more in-depth analysis?
>> >>>> I think your comment is really adequate. In another thread, Wu Fengguang pointed
>> >>>> out the same issue.
>> >>>> Wu and I are also waiting for his analysis.
>> >>> And do it with a large readahead size :)
>> >>>
>> >>> Alan, this was my analysis:
>> >>>
>> >>> : Hifumi, can you help retest with some large readahead size?
>> >>> :
>> >>> : Your readahead size (128K) is smaller than your max_sectors_kb (256K),
>> >>> : so two readahead IO requests get merged into one real IO, that means
>> >>> : half of the readahead requests are delayed.
>> >>>
>> >>> ie. two readahead requests get merged and complete together, thus the effective
>> >>> IO size is doubled but at the same time it becomes completely synchronous IO.
>> >>>
>> >>> :
>> >>> : The IO completion size goes down from 512 to 256 sectors:
>> >>> :
>> >>> : before patch:
>> >>> :   8,0    3   177955    50.050313976     0  C   R 8724991 + 512 [0]
>> >>> :   8,0    3   177966    50.053380250     0  C   R 8725503 + 512 [0]
>> >>> :   8,0    3   177977    50.056970395     0  C   R 8726015 + 512 [0]
>> >>> :   8,0    3   177988    50.060326743     0  C   R 8726527 + 512 [0]
>> >>> :   8,0    3   177999    50.063922341     0  C   R 8727039 + 512 [0]
>> >>> :
>> >>> : after patch:
>> >>> :   8,0    3   257297    50.000760847     0  C   R 9480703 + 256 [0]
>> >>> :   8,0    3   257306    50.003034240     0  C   R 9480959 + 256 [0]
>> >>> :   8,0    3   257307    50.003076338     0  C   R 9481215 + 256 [0]
>> >>> :   8,0    3   257323    50.004774693     0  C   R 9481471 + 256 [0]
>> >>> :   8,0    3   257332    50.006865854     0  C   R 9481727 + 256 [0]
>> >>>
>> >> I haven't sent readahead-add-blk_run_backing_dev.patch in to Linus yet
>> >> and it's looking like 2.6.32 material, if ever.
>> >>
>> >> If it turns out to be wonderful, we could always ask the -stable
>> >> maintainers to put it in 2.6.x.y I guess.
>> >
>> > Agreed. The expected (and interesting) test on a properly configured
>> > HW RAID has not happened yet, hence the theory remains unsupported.
>>
>> Hmm, do you see anything improper in Ronald's setup (see
>> http://sourceforge.net/mailarchive/forum.php?thread_name=a0272b440906030714g67eabc5k8f847fb1e538cc62%40mail.gmail.com&forum_name=scst-devel)?
>> It is HW RAID based.
>
> No. Ronald's HW RAID performance is reasonably good.  I meant Hifumi's
> RAID performance is too bad and may be improved by increasing the
> readahead size, hehe.
>
>> As I already wrote, we can ask Ronald to perform any needed tests.
>
> Thanks!  Ronald's test results are:
>
> 231   MB/s   HW RAID
>  69.6 MB/s   HW RAID + SCST
>  89.7 MB/s   HW RAID + SCST + this patch
>
> So this patch seems to help SCST, but again it would be better to
> improve the SCST throughput first - it is now quite sub-optimal.
> (Sorry for the long delay: currently I have not got an idea on
>  how to measure such timing issues.)
>
> And if Ronald could provide the HW RAID performance with this patch,
> then we can confirm if this patch really makes a difference for RAID.

I just tested raw HW RAID throughput with the patch applied, same
readahead setting (512KB), and it doesn't look promising:

./blockdev-perftest -d -r /dev/cciss/c0d0
blocksize        W        W        W        R        R        R
 67108864       -1       -1       -1  5.59686   5.4098  5.45396
 33554432       -1       -1       -1  6.18616  6.13232  5.96124
 16777216       -1       -1       -1   7.6757  7.32139   7.4966
  8388608       -1       -1       -1  8.82793  9.02057  9.01055
  4194304       -1       -1       -1  12.2289  12.6804    12.19
  2097152       -1       -1       -1  13.3012   13.706  14.7542
  1048576       -1       -1       -1  11.7577  12.3609  11.9507
   524288       -1       -1       -1  12.4112  12.2383  11.9105
   262144       -1       -1       -1  7.30687   7.4417  7.38246
   131072       -1       -1       -1  7.95752  7.95053  8.60796
    65536       -1       -1       -1  10.1282  10.1286  10.1956
    32768       -1       -1       -1  9.91857  9.98597  10.8421
    16384       -1       -1       -1  10.8267  10.8899  10.8718
     8192       -1       -1       -1  12.0345  12.5275   12.005
     4096       -1       -1       -1  15.1537  15.0771  15.1753
     2048       -1       -1       -1   25.432  24.8985  25.4303
     1024       -1       -1       -1  45.2674  45.2707  45.3504
      512       -1       -1       -1  87.9405  88.5047  87.4726

It dropped down to 189 MB/s. :(

Ronald.
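
The table above lists elapsed seconds per run, not throughput. A rough sketch
of the conversion follows. The message does not state how much data each run
transfers, so the 1 GiB figure below is an assumption, and only three rows
from the table are copied in. Under that assumption the fastest 64 MiB run
works out to roughly 189 MiB/s, which is at least consistent with the number
reported.

```python
# ASSUMPTION: each run transfers 1 GiB in total; the message does not say so.
TRANSFER_MIB = 1024

# Read times in seconds, copied from three rows of the table (three runs each).
read_times = {
    67108864: (5.59686, 5.4098, 5.45396),
    524288: (12.4112, 12.2383, 11.9105),
    512: (87.9405, 88.5047, 87.4726),
}

def throughput_mib_s(seconds: float) -> float:
    """Throughput in MiB/s for one run, given the assumed transfer size."""
    return TRANSFER_MIB / seconds

for blocksize, times in read_times.items():
    best = throughput_mib_s(min(times))                # fastest of the runs
    mean = throughput_mib_s(sum(times) / len(times))   # based on mean time
    print(f"{blocksize:>9} bytes: best {best:6.1f} MiB/s, mean {mean:6.1f} MiB/s")
```

Whether the reported figure is the best run or an average is not stated
either; the sketch prints both so they can be compared.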
