From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ronald Moesbergen
To: Vladislav Bolkhovitin
Cc: Wu Fengguang, linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
 kosaki.motohiro@jp.fujitsu.com, Alan.Brunelle@hp.com,
 hifumi.hisashi@oss.ntt.co.jp, linux-fsdevel@vger.kernel.org,
 jens.axboe@oracle.com, randy.dunlap@oracle.com, Bart Van Assche
Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev
Date: Mon, 6 Jul 2009 16:37:04 +0200
In-Reply-To: <4A51DC0A.10302@vlnb.net>
References: <4A3CD62B.1020407@vlnb.net> <20090629142124.GA28945@localhost>
 <20090629150109.GA3534@localhost> <4A48DFC5.3090205@vlnb.net>
 <20090630010414.GB31418@localhost> <4A49EEF9.6010205@vlnb.net>
 <4A4DE3C1.5080307@vlnb.net> <4A51DC0A.10302@vlnb.net>
X-Mailing-List: linux-kernel@vger.kernel.org

2009/7/6 Vladislav Bolkhovitin:
> (Restored the original list of recipients in this thread as I was asked.)
>
> Hi Ronald,
>
> Ronald Moesbergen, on 07/04/2009 07:19 PM wrote:
>>
>> 2009/7/3 Vladislav Bolkhovitin:
>>>
>>> Ronald Moesbergen, on 07/03/2009 01:14 PM wrote:
>>>>>>
>>>>>> OK, now I tend to agree on decreasing max_sectors_kb and increasing
>>>>>> read_ahead_kb. But before actually trying to push that idea I'd like to
>>>>>> - do more benchmarks
>>>>>> - figure out why context readahead didn't help SCST performance
>>>>>>   (previous traces show that context readahead is submitting perfect
>>>>>>   large io requests, so I wonder if it's some io scheduler bug)
>>>>>
>>>>> Because, as we found out, without your http://lkml.org/lkml/2009/5/21/319
>>>>> patch read-ahead was nearly disabled, hence there was no difference in
>>>>> which algorithm was used?
>>>>>
>>>>> Ronald, can you run the following tests, please? This time with 2 hosts,
>>>>> initiator (client) and target (server) connected using 1 Gbps iSCSI. It
>>>>> would be best if vanilla 2.6.29 were run on the client, but any other
>>>>> kernel is fine as well, just specify which. Blockdev-perftest should be
>>>>> run as before in buffered mode, i.e. with the "-a" switch.
>>>>>
>>>>> 1. All defaults on the client, on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with all default
>>>>> settings.
>>>>>
>>>>> 2. All defaults on the client, on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with default RA
>>>>> size and 64KB max_sectors_kb.
>>>>>
>>>>> 3. All defaults on the client, on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with 2MB RA size
>>>>> and default max_sectors_kb.
>>>>>
>>>>> 4. All defaults on the client, on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with 2MB RA size
>>>>> and 64KB max_sectors_kb.
>>>>>
>>>>> 5. All defaults on the client, on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 patch and with the
>>>>> context RA patch. RA size and max_sectors_kb are default. For your
>>>>> convenience I committed the backported context RA patches into the SCST
>>>>> SVN repository.
>>>>>
>>>>> 6. All defaults on the client, on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 and context RA patches
>>>>> with default RA size and 64KB max_sectors_kb.
>>>>>
>>>>> 7. All defaults on the client, on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 and context RA patches
>>>>> with 2MB RA size and default max_sectors_kb.
>>>>>
>>>>> 8. All defaults on the client, on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 and context RA patches
>>>>> with 2MB RA size and 64KB max_sectors_kb.
>>>>>
>>>>> 9. On the client default RA size and 64KB max_sectors_kb. On the server
>>>>> vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
>>>>> context RA patches with 2MB RA size and 64KB max_sectors_kb.
>>>>>
>>>>> 10. On the client 2MB RA size and default max_sectors_kb. On the server
>>>>> vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
>>>>> context RA patches with 2MB RA size and 64KB max_sectors_kb.
>>>>>
>>>>> 11. On the client 2MB RA size and 64KB max_sectors_kb. On the server
>>>>> vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
>>>>> context RA patches with 2MB RA size and 64KB max_sectors_kb.
>>>>
>>>> Ok, done. Performance is pretty bad overall :(
>>>>
>>>> The kernels I used:
>>>> client kernel: 2.6.26-15lenny3 (debian)
>>>> server kernel: 2.6.29.5 with blk_dev_run patch
>>>>
>>>> And I adjusted the blockdev-perftest script to drop caches on both the
>>>> server (via ssh) and the client.
>>>>
>>>> The results:
>>>>
>>
>> ... previous results ...
>>
>>> Those are on the server without the io_context-2.6.29 and readahead-2.6.29
>>> patches applied and with the CFQ scheduler, correct?
>>>
>>> Then we see how the reordering of requests caused by many I/O threads
>>> submitting I/O in separate I/O contexts badly affects performance, and no
>>> RA, especially with the default 128KB RA size, can solve it. Less
>>> max_sectors_kb on the client => more requests it sends at once => more
>>> reordering on the server => worse throughput. Although, Fengguang, in
>>> theory, context RA with 2MB RA size should considerably help it, no?
>>>
>>> Ronald, can you perform those tests again with both the io_context-2.6.29
>>> and readahead-2.6.29 patches applied on the server, please?
>>
>> Hi Vlad,
>>
>> I have retested with the patches you requested (and got access to the
>> systems today :) ). The results are better, but still not great.
>>
>> client kernel: 2.6.26-15lenny3 (debian)
>> server kernel: 2.6.29.5 with io_context and readahead patch
>>
>> 5) client: default, server: default
>> blocksize       R        R        R   R(avg,    R(std        R
>>  (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>>  67108864  18.303   19.867   18.481   54.299    1.961    0.848
>>  33554432  18.321   17.681   18.708   56.181    1.314    1.756
>>  16777216  17.816   17.406   19.257   56.494    2.410    3.531
>>   8388608  18.077   17.727   19.338   55.789    2.056    6.974
>>   4194304  17.918   16.601   18.287   58.276    2.454   14.569
>>   2097152  17.426   17.334   17.610   58.661    0.384   29.331
>>   1048576  19.358   18.764   17.253   55.607    2.734   55.607
>>    524288  17.951   18.163   17.440   57.379    0.983  114.757
>>    262144  18.196   17.724   17.520   57.499    0.907  229.995
>>    131072  18.342   18.259   17.551   56.751    1.131  454.010
>>     65536  17.733   18.572   17.134   57.548    1.893  920.766
>>     32768  19.081   19.321   17.364   55.213    2.673 1766.818
>>     16384  17.181   18.729   17.731   57.343    2.033 3669.932
>>
>> 6) client: default, server: 64 max_sectors_kb, RA default
>> blocksize       R        R        R   R(avg,    R(std        R
>>  (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>>  67108864  21.790   20.062   19.534   50.153    2.304    0.784
>>  33554432  20.212   19.744   19.564   51.623    0.706    1.613
>>  16777216  20.404   19.329   19.738   51.680    1.148    3.230
>>   8388608  20.170   20.772   19.509   50.852    1.304    6.356
>>   4194304  19.334   18.742   18.522   54.296    0.978   13.574
>>   2097152  19.413   18.858   18.884   53.758    0.715   26.879
>>   1048576  20.472   18.755   18.476   53.347    2.377   53.347
>>    524288  19.120   20.104   18.404   53.378    1.925  106.756
>>    262144  20.337   19.213   18.636   52.866    1.901  211.464
>>    131072  19.199   18.312   19.970   53.510    1.900  428.083
>>     65536  19.855   20.114   19.592   51.584    0.555  825.342
>>     32768  20.586   18.724   20.340   51.592    2.204 1650.941
>>     16384  21.119   19.834   19.594   50.792    1.651 3250.669
>>
>> 7) client: default, server: default max_sectors_kb, RA 2MB
>> blocksize       R        R        R   R(avg,    R(std        R
>>  (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>>  67108864  17.767   16.489   16.949   60.050    1.842    0.938
>>  33554432  16.777   17.034   17.102   60.341    0.500    1.886
>>  16777216  18.509   16.784   16.971   58.891    2.537    3.681
>>   8388608  18.058   17.949   17.599   57.313    0.632    7.164
>>   4194304  18.286   17.648   17.026   58.055    1.692   14.514
>>   2097152  17.387   18.451   17.875   57.226    1.388   28.613
>>   1048576  18.270   17.698   17.570   57.397    0.969   57.397
>>    524288  16.708   17.900   17.233   59.306    1.668  118.611
>>    262144  18.041   17.381   18.035   57.484    1.011  229.934
>>    131072  17.994   17.777   18.146   56.981    0.481  455.844
>>     65536  17.097   18.597   17.737   57.563    1.975  921.011
>>     32768  17.167   17.035   19.693   57.254    3.721 1832.127
>>     16384  17.144   16.664   17.623   59.762    1.367 3824.774
>>
>> 8) client: default, server: 64 max_sectors_kb, RA 2MB
>> blocksize       R        R        R   R(avg,    R(std        R
>>  (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>>  67108864  20.003   21.133   19.308   50.894    1.881    0.795
>>  33554432  19.448   20.015   18.908   52.657    1.222    1.646
>>  16777216  19.964   19.350   19.106   52.603    0.967    3.288
>>   8388608  18.961   19.213   19.318   53.437    0.419    6.680
>>   4194304  18.135   19.508   19.361   53.948    1.788   13.487
>>   2097152  18.753   19.471   18.367   54.315    1.306   27.158
>>   1048576  19.189   18.586   18.867   54.244    0.707   54.244
>>    524288  18.985   19.199   18.840   53.874    0.417  107.749
>>    262144  19.064   21.143   19.674   51.398    2.204  205.592
>>    131072  18.691   18.664   19.116   54.406    0.594  435.245
>>     65536  18.468   20.673   18.554   53.389    2.729  854.229
>>     32768  20.401   21.156   19.552   50.323    1.623 1610.331
>>     16384  19.532   20.028   20.466   51.196    0.977 3276.567
>>
>> 9) client: 64 max_sectors_kb, default RA. server: 64 max_sectors_kb, RA 2MB
>> blocksize       R        R        R   R(avg,    R(std        R
>>  (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>>  67108864  16.458   16.649   17.346   60.919    1.364    0.952
>>  33554432  16.479   16.744   17.069   61.096    0.878    1.909
>>  16777216  17.128   16.585   17.112   60.456    0.910    3.778
>>   8388608  17.322   16.780   16.885   60.262    0.824    7.533
>>   4194304  17.530   16.725   16.756   60.250    1.299   15.063
>>   2097152  16.580   17.875   16.619   60.221    2.076   30.110
>>   1048576  17.550   17.406   17.075   59.049    0.681   59.049
>>    524288  16.492   18.211   16.832   59.718    2.519  119.436
>>    262144  17.241   17.115   17.365   59.397    0.352  237.588
>>    131072  17.430   16.902   17.511   59.271    0.936  474.167
>>     65536  16.726   16.894   17.246   60.404    0.768  966.461
>>     32768  16.662   17.517   17.052   59.989    1.224 1919.658
>>     16384  17.429   16.793   16.753   60.285    1.085 3858.268
>>
>> 10) client: default max_sectors_kb, 2MB RA. server: 64 max_sectors_kb, RA 2MB
>> blocksize       R        R        R   R(avg,    R(std        R
>>  (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>>  67108864  17.601   18.334   17.379   57.650    1.307    0.901
>>  33554432  18.281   18.128   17.169   57.381    1.610    1.793
>>  16777216  17.660   17.875   17.356   58.091    0.703    3.631
>>   8388608  17.724   17.810   18.383   56.992    0.918    7.124
>>   4194304  17.475   17.770   19.003   56.704    2.031   14.176
>>   2097152  17.287   17.674   18.492   57.516    1.604   28.758
>>   1048576  17.972   17.460   18.777   56.721    1.689   56.721
>>    524288  18.680   18.952   19.445   53.837    0.890  107.673
>>    262144  18.070   18.337   18.639   55.817    0.707  223.270
>>    131072  16.990   16.651   16.862   60.832    0.507  486.657
>>     65536  17.707   16.972   17.520   58.870    1.066  941.924
>>     32768  17.767   17.208   17.205   58.887    0.885 1884.399
>>     16384  18.258   17.252   18.035   57.407    1.407 3674.059
>>
>> 11) client: 64 max_sectors_kb, 2MB RA. server: 64 max_sectors_kb, RA 2MB
>> blocksize       R        R        R   R(avg,    R(std        R
>>  (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>>  67108864  17.993   18.307   18.718   55.850    0.902    0.873
>>  33554432  19.554   18.485   17.902   54.988    1.993    1.718
>>  16777216  18.829   18.236   18.748   55.052    0.785    3.441
>>   8388608  21.152   19.065   18.738   52.257    2.745    6.532
>>   4194304  19.131   19.703   17.850   54.288    2.268   13.572
>>   2097152  19.093   19.152   19.509   53.196    0.504   26.598
>>   1048576  19.371   18.775   18.804   53.953    0.772   53.953
>>    524288  20.003   17.911   18.602   54.470    2.476  108.940
>>    262144  19.182   19.460   18.476   53.809    1.183  215.236
>>    131072  19.403   19.192   18.907   53.429    0.567  427.435
>>     65536  19.502   19.656   18.599   53.219    1.309  851.509
>>     32768  18.746   18.747   18.250   55.119    0.701 1763.817
>>     16384  20.977   19.437   18.840   51.951    2.319 3324.862
>
> The results look inconsistent with what you had previously (89.7 MB/s).
> How can you explain it?

I had more patches applied with that test (scst_exec_req_fifo-2.6.29,
put_page_callback-2.6.29), and I used a different dd command:

dd if=/dev/sdc of=/dev/zero bs=512K count=2000

But all that said, I can't reproduce speeds that high now. I must have made
a mistake back then (maybe I forgot to clear the pagecache).

> I think, most likely, there was some confusion between the tested and
> patched versions of the kernel, or you forgot to apply the io_context patch.
> Please recheck.

The tests above were definitely done right. I just rechecked the patches,
and I do see an average increase of about 10MB/s over an unpatched kernel.
But overall the performance is still pretty bad.

Ronald.
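P.S. For reference, this is roughly what the per-test setup looks like in the
adjusted blockdev-perftest script. It is only a sketch: the device names
(/dev/sdb for the exported disk on the server, /dev/sdc for the iSCSI disk on
the client), the "server" hostname and the exact blockdev-perftest invocation
are illustrative examples, not copied verbatim from my setup.

  # On the server: set read-ahead and request size for the exported device
  # (example device name)
  echo 2048 > /sys/block/sdb/queue/read_ahead_kb     # 2MB RA
  echo 64   > /sys/block/sdb/queue/max_sectors_kb    # 64KB max_sectors_kb

  # On the client: same knobs on the iSCSI disk (example device name)
  echo 2048 > /sys/block/sdc/queue/read_ahead_kb
  echo 64   > /sys/block/sdc/queue/max_sectors_kb

  # Drop the page cache on the client and (via ssh) on the server before
  # every run, so only cold-cache reads are measured
  sync; echo 3 > /proc/sys/vm/drop_caches
  ssh root@server 'sync; echo 3 > /proc/sys/vm/drop_caches'

  # Then the buffered-mode run itself ("-a" switch), against the iSCSI disk
  ./blockdev-perftest -a /dev/sdc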