linux-kernel.vger.kernel.org archive mirror
* Re: big ext3 sequential write improvement in 2.5.51-mm1 gone in 2.5.53-mm1?
@ 2003-01-24  1:26 rwhron
  2003-01-24  2:10 ` Andrew Morton
From: rwhron @ 2003-01-24  1:26 UTC
  To: akpm; +Cc: linux-kernel, lse-tech

>> >lovely.  These two files have perfectly intermingled blocks.  

>> Writeback? or read?

> Both.

>  The filesystems need fixing....

Did you add a secret sauce to 2.5.59-mm2?  10x sequential
write improvement on ext3 for multiple tiobench threads.

Quad P3 Xeon (4GB ram)
8 GB files
4K blocksize
32 threads
Rate = MB/sec
latency in milliseconds

Sequential Writes ext3
                              Avg       Maximum     Lat%     Lat%    CPU
Kernel       Rate  (CPU%)   Latency     Latency      >2s     >10s    Eff
----------  ------------------------------------------------------------
2.4.20aa1    11.85 72.77%    11.814    21802.73  0.05036  0.00000     16
2.5.59        3.42 17.36%    83.976  3109518.52  0.11253  0.05088     20
2.5.59-mm2   32.39 34.28%     7.742   340597.62  0.04287  0.01765     94

Similar improvement for seq writes for 2, 4, 8, 16, 64, 128, 256 threads.

Sequential reads on ext3 with 2.5.59-mm2 improve around 3x for various
thread counts.  Below is 32 threads.

Sequential Reads ext3
                              Avg       Maximum     Lat%     Lat%    CPU
Kernel       Rate  (CPU%)   Latency     Latency      >2s     >10s    Eff
----------  ------------------------------------------------------------
2.4.20aa1     8.24  7.21%    28.587   449134.11  0.10395  0.07086    114
2.5.59        9.50  5.50%    36.703     4310.62  0.00000  0.00000    173
2.5.59-mm2   35.28 17.69%    10.173    18950.56  0.01010  0.00000    199


-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html



* Re: big ext3 sequential write improvement in 2.5.51-mm1 gone in 2.5.53-mm1?
  2003-01-24  1:26 big ext3 sequential write improvement in 2.5.51-mm1 gone in 2.5.53-mm1? rwhron
@ 2003-01-24  2:10 ` Andrew Morton
  2003-01-24  2:33   ` Nick Piggin
From: Andrew Morton @ 2003-01-24  2:10 UTC
  To: rwhron; +Cc: linux-kernel, lse-tech

rwhron@earthlink.net wrote:
>
> Did you add a secret sauce to 2.5.59-mm2?

I have not been paying any attention to the I/O scheduler changes for a
couple of months, so I can't say exactly what caused this.  Possibly Nick's
batch expiry logic which causes the scheduler to alternate between reading
and writing with fairly coarse granularity.

>  10x sequential write improvement on ext3 for multiple tiobench threads.

OK...  

I _have_ been paying attention to the IO scheduler for the past few days. 
-mm5 will have the first draft of the anticipatory IO scheduler.  This of
course is yielding tremendous improvements in bandwidth when there are
competing reads and writes.
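
The gist, for anyone who hasn't been following the patches: after a
read completes, the disk is deliberately left idle for a few
milliseconds in case the same process submits another nearby read,
rather than immediately seeking away to service queued writes.  A toy
user-space sketch of that decision (not the kernel code; the names and
thresholds below are invented for illustration):

    /* Toy sketch of the anticipation heuristic.  This is NOT the
     * kernel's elevator code; the names and thresholds are invented. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdbool.h>

    struct request { long sector; bool is_read; int pid; };

    /*
     * We just finished a read for `last_pid' near `last_sector'.
     * Decide whether to dispatch the next queued request now, or to
     * idle for a few milliseconds hoping the same reader submits
     * another nearby read first.
     */
    static bool idle_for_reader(const struct request *next,
                                int last_pid, long last_sector)
    {
        if (next && next->is_read && next->pid == last_pid &&
            labs(next->sector - last_sector) < 128)
            return false;   /* a good read is already queued: take it */
        return true;        /* otherwise wait briefly before doing writes */
    }

    int main(void)
    {
        struct request w = { .sector = 900000, .is_read = false, .pid = 42 };

        printf("idle waiting for the reader? %s\n",
               idle_for_reader(&w, 7, 1000) ? "yes" : "no");
        return 0;
    }

The real scheduler obviously has to bound the idle time and give up on
readers that wander off; the sketch only shows the basic bet being made.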

I expect it will take another week or two to get the I/O scheduler changes
really settled down.  Your assistance in thoroughly benching that would be
appreciated.

> 2.4.20aa1     8.24  7.21%    28.587   449134.11  0.10395  0.07086    114
> 2.5.59        9.50  5.50%    36.703     4310.62  0.00000  0.00000    173
> 2.5.59-mm2   35.28 17.69%    10.173    18950.56  0.01010  0.00000    199

boggle.




* Re: big ext3 sequential write improvement in 2.5.51-mm1 gone in 2.5.53-mm1?
  2003-01-24  2:10 ` Andrew Morton
@ 2003-01-24  2:33   ` Nick Piggin
From: Nick Piggin @ 2003-01-24  2:33 UTC
  To: Andrew Morton; +Cc: rwhron, linux-kernel, lse-tech

Andrew Morton wrote:

>rwhron@earthlink.net wrote:
>
>>Did you add a secret sauce to 2.5.59-mm2?
>>
>
>I have not been paying any attention to the I/O scheduler changes for a
>couple of months, so I can't say exactly what caused this.  Possibly Nick's
>batch expiry logic which causes the scheduler to alternate between reading
>and writing with fairly coarse granularity.
>
Yes, however tiobench doesn't mix the two.  The batch_expire probably
helps by giving longer batches between servicing expired requests.
The deadline-np-42 patch also eliminates corner cases in which requests
could be starved for a long time.  A large batch_expire as in mm2 is
not a good solution without my anticipatory scheduling stuff, though,
as writes really starve reads.
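
To make the batching point concrete: the loop being tuned has roughly
this shape (a toy sketch, not the actual deadline elevator; the real
batch_expire units and value differ):

    /* Toy sketch of coarse batched dispatch -- not the real deadline
     * scheduler.  "batch_expire" here just means "how long we keep
     * draining one direction before switching". */
    #include <stdio.h>

    #define BATCH_EXPIRE_MS 500            /* invented value */
    #define SERVICE_MS        5            /* pretend cost per request */

    enum dir { READS, WRITES };

    int main(void)
    {
        int queued[2] = { 3, 1000 };       /* few reads, a big dirty backlog */
        enum dir d = READS;
        int now = 0;

        while (queued[READS] || queued[WRITES]) {
            if (!queued[d]) {              /* nothing this way, just flip */
                d = (d == READS) ? WRITES : READS;
                continue;
            }
            int batch_end = now + BATCH_EXPIRE_MS;
            while (now < batch_end && queued[d]) {
                queued[d]--;
                now += SERVICE_MS;
            }
            printf("t=%5d ms: finished a %s batch\n",
                   now, d == READS ? "read" : "write");
            d = (d == READS) ? WRITES : READS;
        }
        return 0;
    }

A read that arrives just after the switch to writes sits behind the
whole write batch, which is the starvation above; anticipation changes
how willing the scheduler is to leave the read side at all.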

>
>
>> 10x sequential write improvement on ext3 for multiple tiobench threads.
>>
>
>OK...  
>
>I _have_ been paying attention to the IO scheduler for the past few days. 
>-mm5 will have the first draft of the anticipatory IO scheduler.  This of
>course is yielding tremendous improvements in bandwidth when there are
>competing reads and writes.
>
>I expect it will take another week or two to get the I/O scheduler changes
>really settled down.  Your assistance in thoroughly benching that would be
>appreciated.
>
>
>>2.4.20aa1     8.24  7.21%    28.587   449134.11  0.10395  0.07086    114
>>2.5.59        9.50  5.50%    36.703     4310.62  0.00000  0.00000    173
>>2.5.59-mm2   35.28 17.69%    10.173    18950.56  0.01010  0.00000    199
>>
>
>boggle.
>
I'm happy with that as long as they aren't too dependent on the phase of
the moon. The initial deadline scheduler had quite a lot of problems with
these workloads.



* Re: big ext3 sequential write improvement in 2.5.51-mm1 gone in 2.5.53-mm1?
  2003-01-24 21:19 rwhron
@ 2003-01-24 21:39 ` Andrew Morton
From: Andrew Morton @ 2003-01-24 21:39 UTC
  To: rwhron; +Cc: piggin, linux-kernel, lse-tech

rwhron@earthlink.net wrote:
>
> > It is important to specify how much memory you have, and how you are
> > invoking qsbench.
> 
> There is 3.75 GB of ram.  I grab MemTotal from /proc/meminfo, and run
> 4 qsbench processes.  Each qsbench uses 30% of MemTotal (1089 megs).  

Yes, 2.5 sucks at that.  Run `top', observe how in 2.4, one qsbench instance
grabs all the CPU time, then exits.  The remaining three can now complete
with no swapout at all.


* Re: big ext3 sequential write improvement in 2.5.51-mm1 gone in 2.5.53-mm1?
@ 2003-01-24 21:19 rwhron
  2003-01-24 21:39 ` Andrew Morton
From: rwhron @ 2003-01-24 21:19 UTC
  To: akpm, piggin; +Cc: linux-kernel, lse-tech

> qsbench isn't really a thing which should be optimised for.

The way I run qsbench simulates an uncommon workload.

> It is important to specify how much memory you have, and how you are
> invoking qsbench.

There is 3.75 GB of ram.  I grab MemTotal from /proc/meminfo, and run
4 qsbench processes.  Each qsbench uses 30% of MemTotal (1089 megs).  
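
For reference, the sizing step is only this arithmetic (a minimal
sketch; the driver script and the exact qsbench command line are not
reproduced here):

    /* Minimal sketch of the sizing step: read MemTotal from
     * /proc/meminfo and take 30% of it for each qsbench process.
     * Only the arithmetic is shown; the actual driver script and
     * qsbench arguments are not. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/meminfo", "r");
        char line[128];
        unsigned long kb = 0;

        if (!f)
            return 1;
        while (fgets(line, sizeof(line), f))
            if (sscanf(line, "MemTotal: %lu kB", &kb) == 1)
                break;
        fclose(f);

        printf("MemTotal = %lu MB; per-process size = %lu MB (x4 processes)\n",
               kb / 1024, kb * 30 / 100 / 1024);
        return 0;
    }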

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html



* Re: big ext3 sequential write improvement in 2.5.51-mm1 gone in 2.5.53-mm1?
  2003-01-16  1:50 rwhron
@ 2003-01-16  6:31 ` Andrew Morton
From: Andrew Morton @ 2003-01-16  6:31 UTC
  To: rwhron; +Cc: linux-kernel

rwhron@earthlink.net wrote:
>
> On a quad xeon running tiobench...
> The throughput and max latency for ext3 sequential writes
> looks very good when threads >= 2 on 2.5.51-mm1.
> 
> Did 2.5.51-mm1 mount ext3 as ext2?  I have ext2 logs for
> 2.5.51-mm1 and they look similar to the ext3 results.
> The other 2.5 kernels from around that time look more
> like 2.5.53-mm1.
> 

Dunno.  There have been about 7,000 different versions of the I/O scheduler
in that time and it seems a bit prone to butterfly effects.

Or maybe you accidentally ran the 2.5.51-mm1 tests on uniprocessor? 
Multithreaded tiobench on SMP brings out the worst behaviour in the ext2 and
ext3 block allocators.  Look:

<start tiobench>
<wait a while>
<kill it all off>

    quad:/mnt/sde5/tiobench> ls -ltr 
    ...
    -rw-------    1 akpm     akpm     860971008 Jan 15 22:02 _956_tiotest.0
    -rw-------    1 akpm     akpm     840470528 Jan 15 22:03 _956_tiotest.1

OK, 800 megs.

    quad:/mnt/sde5/tiobench> 0 bmap _956_tiotest.0|wc
     199224  597671 6751187

wtf?  It's taking 200,000 separate chunks of disk.

    quad:/mnt/sde5/tiobench> expr 860971008 / 199224
    4321

so the average chunk size is a little over 4k.

    quad:/mnt/sde5/tiobench> 0 bmap _956_tiotest.0 | tail -50000 | head -10
    149770-149770: 1845103-1845103 (1)
    149771-149771: 1845105-1845105 (1)
    149772-149772: 1845107-1845107 (1)
    149773-149773: 1845109-1845109 (1)
    149774-149774: 1845111-1845111 (1)
    149775-149775: 1845113-1845113 (1)
    149776-149776: 1845115-1845115 (1)
    149777-149777: 1845117-1845117 (1)
    149778-149778: 1845119-1845119 (1)
    149779-149779: 1845121-1845121 (1)

lovely.  These two files have perfectly intermingled blocks.  Writeback
bandwidth goes from 20 megabytes per second to about 0.5.

It doesn't happen on uniprocessor because each tiobench instance gets to run
for a timeslice, during which it is able to allocate a decent number of
contiguous blocks.

ext2 has block preallocation and will intermingle in 32k units, not 4k units.
So it's still crap, only not so smelly.
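
The pattern behind it is just two processes growing files in the same
directory one small block at a time.  A throwaway reproducer along
these lines (a sketch, not tiobench; whether the blocks actually
interleave depends on the allocator and on the writers really running
concurrently, as on SMP):

    /* Throwaway reproducer for the allocation pattern above: two
     * processes each append 4k blocks to their own file in the same
     * directory at the same time.  This is a sketch, not tiobench. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define NBLOCKS 4096                /* 16 MB per file; scale up as needed */

    static void writer(const char *name)
    {
        char buf[4096];
        int i, fd = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0600);

        if (fd < 0) { perror(name); exit(1); }
        memset(buf, 'x', sizeof(buf));
        for (i = 0; i < NBLOCKS; i++)
            if (write(fd, buf, sizeof(buf)) != sizeof(buf)) { perror("write"); exit(1); }
        close(fd);
        exit(0);
    }

    int main(void)
    {
        if (fork() == 0)
            writer("tiotest-like.0");
        if (fork() == 0)
            writer("tiotest-like.1");
        wait(NULL);
        wait(NULL);
        /* then look at the two files' layouts, e.g. with bmap as above */
        return 0;
    }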

Does it matter much in practice?   Sometimes, not often.

Is it crap?  Yes.

Do I have time to do anything about it?   Probably not.




* big ext3 sequential write improvement in 2.5.51-mm1 gone in 2.5.53-mm1?
@ 2003-01-16  1:50 rwhron
  2003-01-16  6:31 ` Andrew Morton
From: rwhron @ 2003-01-16  1:50 UTC
  To: akpm; +Cc: linux-kernel

On a quad xeon running tiobench...
The throughput and max latency for ext3 sequential writes
looks very good when threads >= 2 on 2.5.51-mm1.

Did 2.5.51-mm1 mount ext3 as ext2?  I have ext2 logs for
2.5.51-mm1 and they look similar to the ext3 results.
The other 2.5 kernels from around that time look more
like 2.5.53-mm1.

file size = 8192 megs
block size = 4096 bytes
rate in megabytes/second
latency in milliseconds

Sequential Writes
              Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier    Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
------------  ---  ------ ------ --------- -----------  -------- -------- -----
2.5.51-mm1      1   55.58 46.70%     0.234    18461.07   0.00834  0.00000   119
2.5.51-mm1      2   38.00 31.54%     0.563    18460.95   0.00896  0.00000   120
2.5.51-mm1      4   35.28 32.69%     1.041    84910.77   0.01306  0.00057   108
2.5.51-mm1      8   34.86 32.97%     2.090   113261.88   0.02433  0.00387   106
2.5.51-mm1     16   34.79 32.80%     3.786   216278.13   0.02923  0.01054   106
2.5.51-mm1     32   33.25 32.31%     7.083   331456.04   0.03152  0.01411   103
2.5.51-mm1     64   31.77 32.14%    14.020   604095.22   0.03772  0.02094    99
2.5.51-mm1    128   30.59 31.60%    25.436   653761.04   0.04019  0.02298    97
2.5.51-mm1    256   32.45 34.83%    47.633   598925.79   0.06914  0.04615    93

Sequential Writes
              Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier    Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
------------  ---  ------ ------ --------- -----------  -------- -------- -----
2.5.53-mm1      1   52.74 68.60%     0.604    19544.48   0.01731  0.00000    77
2.5.53-mm1      2    2.70 4.951%     7.589    54571.99   0.12716  0.00739    55
2.5.53-mm1      4    2.78 34.78%    13.966   467805.71   0.16842  0.03018     8
2.5.53-mm1      8    2.93 59.73%    26.819  1008655.17   0.19922  0.04420     5
2.5.53-mm1     16    3.14 26.13%    45.610  1939797.82   0.14705  0.05607    12
2.5.53-mm1     32    3.35 19.17%    80.421  3055837.66   0.12188  0.04888    17
2.5.53-mm1     64    3.43 15.13%   163.323  4284106.34   0.11868  0.05264    23
2.5.53-mm1    128    3.66 20.04%   260.372  5148947.62   0.12889  0.04530    18
2.5.53-mm1    256    4.26 20.30%   382.981  3094442.29   0.20232  0.06323    21

There is another odd thing in some of the 2.5 ext3 results.  Several of the
kernels show a jump in throughput at 256 threads. 

Sequential Writes
              Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier    Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
------------  ---  ------ ------ --------- -----------  -------- -------- -----
2.5.56          1   53.17 69.99%     0.194    36612.36   0.00029  0.00005    76
2.5.56          2    2.53 4.728%     7.549  1219600.59   0.05112  0.00205    53
2.5.56          4    2.58 73.97%    15.141   823168.02   0.05078  0.01531     3
2.5.56          8    2.67 179.9%    29.981   641722.67   0.07091  0.04382     1
2.5.56         16    3.34 136.6%    47.075  1416051.92   0.11807  0.09304     2
2.5.56         32    2.93 124.8%   100.112  1842078.09   0.18826  0.14262     2
2.5.56         64    3.66 37.46%   147.693  4216304.67   0.12394  0.06661    10
2.5.56        128    4.01 17.11%   237.592  4194864.65   0.10777  0.05642    23
2.5.56        256   12.64 48.78%   353.895  3741404.43   0.10434  0.05335    26

2.4 has a more gentle degradation in throughput and max latency for seq writes 
on ext3:

Sequential Writes
              Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier    Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
------------- ---  ------ ------ --------- -----------  -------- -------- -----
2.4.20-pre10    1   37.71 56.08%     0.288     4315.58   0.00000  0.00000    67
2.4.20-pre10    2   33.01 98.65%     0.592     5517.10   0.00010  0.00000    33
2.4.20-pre10    4   30.83 153.3%     1.162     3684.74   0.00000  0.00000    20
2.4.20-pre10    8   24.86 126.9%     2.523     7436.22   0.00058  0.00000    20
2.4.20-pre10   16   21.21 104.0%     4.893     9132.94   0.00992  0.00000    20
2.4.20-pre10   32   18.14 97.27%    10.394    13451.42   0.09843  0.00000    19
2.4.20-pre10   64   15.63 90.39%    22.679    18888.44   0.39897  0.00000    17
2.4.20-pre10  128   12.03 78.06%    54.387    31156.69   1.12638  0.00038    15
2.4.20-pre10  256    9.94 71.13%   134.323    61604.97   2.87437  0.03022    14

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html


