linux-kernel.vger.kernel.org archive mirror
* Strange IDE performance change in 2.6.1-rc1 (again)
@ 2004-01-02 16:02 Paolo Ornati
  2004-01-02 18:08 ` Ed Sweetman
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Ornati @ 2004-01-02 16:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: William Lee Irwin III, Ed Sweetman

[-- Attachment #1: Type: text/plain, Size: 2331 bytes --]

As I have already said, I have noticed a strange IDE performance change
when upgrading from 2.6.0 to 2.6.1-rc1.

Now I have more data (and a graph) to show: the test is done using
"hdparm -t /dev/hda" at various readahead settings (from 0 to 512).

o SCRIPT

#!/bin/bash

MIN=0
MAX=512

ra=$MIN
while test $ra -le $MAX; do
    hdparm -a $ra /dev/hda > /dev/null;    # set the readahead value
    echo -n $ra$'\t';
    # average two "hdparm -t" runs at this readahead setting
    s1=`hdparm -t /dev/hda | grep 'Timing' | cut -d'=' -f2| cut -d' ' -f2`;
    s2=`hdparm -t /dev/hda | grep 'Timing' | cut -d'=' -f2| cut -d' ' -f2`;
    s=`echo "scale=2; ($s1+$s2)/2" | bc`;
    echo $s;
    ra=$(($ra+16));
done


o RESULTS for 2.6.0  (readahead / speed)

0	13.30
16	13.52
32	13.76
48	31.81
64	31.83
80	31.90
96	31.86
112	31.82
128	31.89
144	31.93
160	31.89
176	31.86
192	31.93
208	31.91
224	31.87
240	31.18
256	26.41
272	27.52
288	31.74
304	27.29
320	27.23
336	25.44
352	27.59
368	27.32
384	31.84
400	28.03
416	28.07
432	20.46
448	28.59
464	28.63
480	23.95
496	27.21
512	22.38


o RESULTS for 2.6.1-rc1  (readahead / speed)

0	13.34
16	25.86
32	26.27
48	24.81
64	26.26
80	24.88
96	27.09
112	24.88
128	26.31
144	24.79
160	26.31
176	24.51
192	25.86
208	24.35
224	26.48
240	24.82
256	26.38
272	24.60
288	31.15
304	24.61
320	26.69
336	24.54
352	26.23
368	24.87
384	25.91
400	25.74
416	26.45
432	23.61
448	26.44
464	24.36
480	26.80
496	24.60
512	26.49


The graph is attached. (x = readahead && y = MB/s)

The kernel config for 2.6.0 is attached (for 2.6.1-rc1 I have just used 
"make oldconfig").

INFO on my HD:

/dev/hda:

 Model=WDC WD200BB-53AUA1, FwRev=18.20D18, SerialNo=WD-WMA6Y1501425
 Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
 RawCHS=16383/16/63, TrkSize=57600, SectSize=600, ECCbytes=40
 BuffType=DualPortCache, BuffSize=2048kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=39102336
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 *udma4 udma5
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: device does not report version:  1 2 3 4 5

INFO on my IDE controller:

00:04.1 IDE interface: VIA Technologies, Inc. VT82C586/B/686A/B PIPC Bus 
Master IDE (rev 10)


Comments are welcome.

Bye,

-- 
	Paolo Ornati
	Linux v2.4.23







[-- Attachment #2: graph.png --]
[-- Type: image/png, Size: 5870 bytes --]

[-- Attachment #3: config-2.6.0.gz --]
[-- Type: application/x-gzip, Size: 3611 bytes --]


* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-02 16:02 Strange IDE performance change in 2.6.1-rc1 (again) Paolo Ornati
@ 2004-01-02 18:08 ` Ed Sweetman
  2004-01-02 21:04   ` Paolo Ornati
  0 siblings, 1 reply; 38+ messages in thread
From: Ed Sweetman @ 2004-01-02 18:08 UTC (permalink / raw)
  To: Paolo Ornati; +Cc: linux-kernel, William Lee Irwin III



I do not see this behavior, and I'm using the same IDE chipset driver
(though not the same IDE chipset).  By the way, readahead for all my
other drives is set to 8192 during these tests, but changing them showed
no effect on my numbers.



/dev/hda:

  Model=Maxtor 6Y120P0, FwRev=YAR41VW0, SerialNo=Y40D924E
  Config={ Fixed }
  RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=57
  BuffType=DualPortCache, BuffSize=7936kB, MaxMultSect=16, MultSect=16
  CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=240121728
  IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
  PIO modes:  pio0 pio1 pio2 pio3 pio4
  DMA modes:  mdma0 mdma1 mdma2
  UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5 udma6
  AdvancedPM=yes: disabled (255) WriteCache=enabled
  Drive conforms to: (null):

00:07.1 IDE interface: VIA Technologies, Inc. 
VT82C586A/B/VT82C686/A/B/VT8233/A/C/VT8235 PIPC Bus Master IDE (rev 06)

It's a VT82C686A.

128
/dev/hda:
  Timing buffered disk reads:  130 MB in  3.05 seconds =  42.69 MB/sec

256
/dev/hda:
  Timing buffered disk reads:  134 MB in  3.03 seconds =  44.27 MB/sec

512
/dev/hda:
  Timing buffered disk reads:  136 MB in  3.00 seconds =  45.33 MB/sec

8192
/dev/hda:
  Timing buffered disk reads:  140 MB in  3.03 seconds =  46.24 MB/sec




Note: sometimes, when moving back to a lower readahead, my speed does not
decrease to the values you see here. Throughput on my system always goes
up (on average) with higher readahead numbers, maxing out at 8192, no
matter the buffer size, speed, or position of the IDE drive.

hdparm -t is difficult to get really accurate, which is why running it
multiple times is suggested.  I see differences of 4 MB/s on subsequent
runs without changing anything, so run hdparm -t at least 3-4 times for
each readahead value.

I suggest trying 128, 256, 512 and 8192 as readahead values and skipping
all those numbers in between.

If you still see lower numbers on average at the top end, try nicing
hdparm to -20.  Also, update to a newer hdparm (v5.4); you seem to be
using an older one.
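
A rough sketch of that procedure, niced and with the values suggested
above (only an illustration, not a definitive benchmark):

#!/bin/bash
# Run "hdparm -t" several times per readahead value, niced to -20.
for ra in 128 256 512 8192; do
    hdparm -a $ra /dev/hda > /dev/null    # set readahead
    echo "readahead = $ra"
    for i in 1 2 3 4; do
        nice -n -20 hdparm -t /dev/hda | grep 'Timing'
    done
done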

Paolo Ornati wrote:
> As I have already said, I have noticed a strange IDE performance change
> when upgrading from 2.6.0 to 2.6.1-rc1.
> [...]



* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-02 18:08 ` Ed Sweetman
@ 2004-01-02 21:04   ` Paolo Ornati
  2004-01-02 21:27     ` Valdis.Kletnieks
                       ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Paolo Ornati @ 2004-01-02 21:04 UTC (permalink / raw)
  To: Ed Sweetman; +Cc: linux-kernel

On Friday 02 January 2004 19:08, Ed Sweetman wrote:
>
>
> Note: sometimes, when moving back to a lower readahead, my speed does not
> decrease to the values you see here. Throughput on my system always goes
> up (on average) with higher readahead numbers, maxing out at 8192, no
> matter the buffer size, speed, or position of the IDE drive.
>
> hdparm -t is difficult to get really accurate, which is why running it
> multiple times is suggested.  I see differences of 4 MB/s on subsequent
> runs without changing anything, so run hdparm -t at least 3-4 times for
> each readahead value.
>
> I suggest trying 128, 256, 512 and 8192 as readahead values and skipping
> all those numbers in between.
>
> If you still see lower numbers on average at the top end, try nicing
> hdparm to -20.  Also, update to a newer hdparm (v5.4); you seem to be
> using an older one.
>

ok, hdparm updated to v5.4

and this is the new script:
_____________________________________________________________________
#!/bin/bash

# This script assumes hdparm v5.4

NR_TESTS=3
RA_VALUES="64 128 256 8192"

killall5
sync
hdparm -a 0 /dev/hda > /dev/null
hdparm -t /dev/hda > /dev/null    # initial throwaway run (output discarded)

for ra in $RA_VALUES; do
    hdparm -a $ra /dev/hda > /dev/null;    # set the readahead value
    echo -n $ra$'\t';
    tot=0;
    # average NR_TESTS timed runs at this readahead setting
    for i in `seq $NR_TESTS`; do
	tmp=`nice -n '-20' hdparm -t /dev/hda|grep 'Timing'|tr -d ' '|cut -d'=' -f2|cut -d'M' -f1`;
	tot=`echo "scale=2; $tot+$tmp" | bc`;
    done;
    s=`echo "scale=2; $tot/$NR_TESTS" | bc`;
    echo $s;
done
_____________________________________________________________________


The results are like the previous.

2.6.0:
64        31.91
128      31.89
256      26.22	# during the transfer HD LED blinks
8192    26.26	# during the transfer HD LED blinks

2.6.1-rc1:
64        25.84	# during the transfer HD LED blinks
128      25.85	# during the transfer HD LED blinks
256      25.90	# during the transfer HD LED blinks
8192    26.42	# during the transfer HD LED blinks

I have tried with and without "nice -n '-20'" but without any visible changes.

Performance with 2.4:
with kernel 2.4.23 && readahead = 8 I get 31.89 MB/s...
changing readahead doesn't seem to affect the speed too much.

Bye

-- 
	Paolo Ornati
	Linux v2.4.23



* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-02 21:04   ` Paolo Ornati
@ 2004-01-02 21:27     ` Valdis.Kletnieks
  2004-01-03 10:20       ` Paolo Ornati
  2004-01-02 21:32     ` Mike Fedyk
  2004-01-03  3:33     ` Tobias Diedrich
  2 siblings, 1 reply; 38+ messages in thread
From: Valdis.Kletnieks @ 2004-01-02 21:27 UTC (permalink / raw)
  To: Paolo Ornati; +Cc: Ed Sweetman, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 986 bytes --]

On Fri, 02 Jan 2004 22:04:27 +0100, Paolo Ornati said:

> The results are like the previous.
> 
> 2.6.0:
> 64        31.91
> 128      31.89
> 256      26.22	# during the transfer HD LED blinks
> 8192    26.26	# during the transfer HD LED blinks
> 
> 2.6.1-rc1:
> 64        25.84	# during the transfer HD LED blinks
> 128      25.85	# during the transfer HD LED blinks
> 256      25.90	# during the transfer HD LED blinks
> 8192    26.42	# during the transfer HD LED blinks
> 
> I have tried with and without "nice -n '-20'" but without any visible changes

Do you get different numbers if you boot with:

elevator=as
elevator=deadline
elevator=cfq  (for -mm kernels)
elevator=noop

(You may need to build a kernel with these configured - the symbols are:

CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y  (-mm kernels only)

and can be selected in the 'General Setup' menu - they should all be
built by default unless you've selected EMBEDDED).
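
For example (just a sketch -- the paths, the root= value and the
boot-loader syntax are illustrative and depend on your setup):

# check that the schedulers were built in:
grep CONFIG_IOSCHED /usr/src/linux/.config

# GRUB: add the option to the kernel line in menu.lst, e.g.
#   kernel /boot/vmlinuz-2.6.1-rc1 root=/dev/hda1 ro elevator=deadline
# LILO: add to /etc/lilo.conf and re-run lilo:
#   append="elevator=deadline"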

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]


* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-02 21:04   ` Paolo Ornati
  2004-01-02 21:27     ` Valdis.Kletnieks
@ 2004-01-02 21:32     ` Mike Fedyk
  2004-01-02 22:34       ` Martin Josefsson
  2004-01-03 10:20       ` Paolo Ornati
  2004-01-03  3:33     ` Tobias Diedrich
  2 siblings, 2 replies; 38+ messages in thread
From: Mike Fedyk @ 2004-01-02 21:32 UTC (permalink / raw)
  To: Paolo Ornati; +Cc: Ed Sweetman, linux-kernel

On Fri, Jan 02, 2004 at 10:04:27PM +0100, Paolo Ornati wrote:
> NR_TESTS=3
> RA_VALUES="64 128 256 8192"

Can you add more samples between 128 and 256, maybe at intervals of 32?
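
With your script that would just mean, for instance, something like:

RA_VALUES="128 160 192 224 256"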

Have there been any ide updates in 2.6.1-rc1?


* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-02 21:32     ` Mike Fedyk
@ 2004-01-02 22:34       ` Martin Josefsson
  2004-01-03 11:13         ` Paolo Ornati
  2004-01-03 10:20       ` Paolo Ornati
  1 sibling, 1 reply; 38+ messages in thread
From: Martin Josefsson @ 2004-01-02 22:34 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: Paolo Ornati, Ed Sweetman, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 378 bytes --]

On Fri, 2004-01-02 at 22:32, Mike Fedyk wrote:

> Have there been any ide updates in 2.6.1-rc1?

I see that a readahead patch was applied just before -rc1 was released.

found it in bk-commits-head

Subject: [PATCH] readahead: multiple performance fixes
Message-Id:  <200312310120.hBV1KLZN012971@hera.kernel.org>

Maybe Paolo can try backing it out.
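
One way to back it out (a sketch -- this assumes the patch from that
commit mail has been saved as readahead-fixes.patch):

cd linux-2.6.1-rc1
patch -p1 -R < readahead-fixes.patch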

-- 
/Martin

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-02 21:04   ` Paolo Ornati
  2004-01-02 21:27     ` Valdis.Kletnieks
  2004-01-02 21:32     ` Mike Fedyk
@ 2004-01-03  3:33     ` Tobias Diedrich
  2004-01-03  4:15       ` Valdis.Kletnieks
  2 siblings, 1 reply; 38+ messages in thread
From: Tobias Diedrich @ 2004-01-03  3:33 UTC (permalink / raw)
  To: linux-kernel

Here are some numbers I got on 2.4 and 2.6 with hdparm.

2.4.23-acl-preempt-lowlatency:
0: 47.18 47.18
8: 47.18 47.18
16: 47.18 47.18
32: 47.18 47.18
64: 47.18 47.02
128: 47.18 47.18

2.6.0:
0: 28.68 28.73
8: 28.87 28.76
16: 28.82 28.83
256: 43.77 44.13
512: 24.86 24.86
1024: 26.49

Note: The last number is missing because I used "hdparm -a${x}t", and it
      seems that this sets the readahead _after_ the measurement; I had
      to compensate for that after I noticed it with the following
      measurement.
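
(The fix is to set the readahead in a separate hdparm invocation before
the timing run, roughly:

hdparm -a ${x} /dev/hda > /dev/null
hdparm -t /dev/hda

which is what the corrected runs below do.)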

2.6.0 with preempt enabled, now 3 repeats and
hdparm -a${x}t /dev/hda
instead of
hdparm -a${x}tT /dev/hda

0:    28.09 28.11 28.17
128:  41.52 41.44 40.94
256:  41.07 41.39 41.32
512:  24.59 25.04 24.84
1024: 26.49 26.30

2.6.1-rc1 without preempt and corrected script to do the readahead
setting first, anticipatory scheduler:

0:    28.92 28.91 28.49
128:  33.78 33.60 33.62
256:  33.62 33.55 33.60
512:  33.54 33.54 33.41
1024: 33.60 33.60 33.43

2.6.1-rc1, noop scheduler:

0:    28.36 28.86 28.82
128:  33.45 33.50 33.52
256:  33.45 33.51 33.52
512:  33.23 33.51 33.51
1024: 33.52 33.54 33.54

Very interesting tidbit:

with 2.6.1-rc1 and "dd if=/dev/hda of=/dev/null" I get stable 28 MB/s,
but with "cat < /dev/hda > /dev/null" I get 48 MB/s according to "vmstat
5".

oprofile report for 2.6.0, the second run IIRC:
CPU: Athlon, speed 1477.56 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 738778
vma      samples  %           app name                 symbol name
c01f0b3e 16325    38.9944     vmlinux                  __copy_to_user_ll
c022fdd7 3600      8.5991     vmlinux                  ide_outb
c010cb41 3374      8.0592     vmlinux                  mask_and_ack_8259A
c01117c4 2670      6.3776     vmlinux                  mark_offset_tsc
00000000 1761      4.2064     hdparm                   (no symbols)
c022fd9b 1448      3.4587     vmlinux                  ide_inb
c010ca69 1335      3.1888     vmlinux                  enable_8259A_irq
c01088b4 862       2.0590     vmlinux                  irq_entries_start
c0111a45 502       1.1991     vmlinux                  delay_tsc
c0106a8c 484       1.1561     vmlinux                  default_idle
c010ca16 414       0.9889     vmlinux                  disable_8259A_irq
c0109154 257       0.6139     vmlinux                  apic_timer_interrupt
c022fde2 217       0.5183     vmlinux                  ide_outbsync
c012fa13 206       0.4921     vmlinux                  mempool_alloc
c012da03 198       0.4729     vmlinux                  do_generic_mapping_read
c01471a4 195       0.4658     vmlinux                  drop_buffers
c013045a 191       0.4562     vmlinux                  __rmqueue
c0144515 180       0.4300     vmlinux                  unlock_buffer
c0218999 179       0.4276     vmlinux                  blk_rq_map_sg
c022fe0a 179       0.4276     vmlinux                  ide_outl
c013335f 177       0.4228     vmlinux                  kmem_cache_free
c01307bd 163       0.3893     vmlinux                  buffered_rmqueue
c0219b09 159       0.3798     vmlinux                  __make_request
c014613c 157       0.3750     vmlinux                  block_read_full_page
c0116b39 150       0.3583     vmlinux                  schedule
c01332ce 149       0.3559     vmlinux                  kmem_cache_alloc
c0144b2f 136       0.3249     vmlinux                  end_buffer_async_read
c012fb42 111       0.2651     vmlinux                  mempool_free
c014741f 110       0.2627     vmlinux                  init_buffer_head
c01ee88b 109       0.2604     vmlinux                  radix_tree_insert
c0130108 105       0.2508     vmlinux                  bad_range
c0133152 98        0.2341     vmlinux                  free_block
c01071ca 96        0.2293     vmlinux                  __switch_to
c012d71f 96        0.2293     vmlinux                  find_get_page
c014597e 93        0.2221     vmlinux                  create_empty_buffers
c022dd3a 90        0.2150     vmlinux                  ide_do_request
c01301ba 84        0.2006     vmlinux                  free_pages_bulk
c01eeab3 84        0.2006     vmlinux                  radix_tree_delete
c012d421 81        0.1935     vmlinux                  add_to_page_cache
c013236e 80        0.1911     vmlinux                  page_cache_readahead
c012d5e8 79        0.1887     vmlinux                  unlock_page
c011813c 76        0.1815     vmlinux                  prepare_to_wait
c01308e1 74        0.1768     vmlinux                  __alloc_pages
c0147512 71        0.1696     vmlinux                  bio_alloc
c0146f5a 68        0.1624     vmlinux                  submit_bh
c01ee922 67        0.1600     vmlinux                  radix_tree_lookup
c01f0729 66        0.1576     vmlinux                  fast_clear_page
c01306d4 66        0.1576     vmlinux                  free_hot_cold_page
c022d12c 66        0.1576     vmlinux                  ide_end_request
c022e349 65        0.1553     vmlinux                  ide_intr
c0116247 65        0.1553     vmlinux                  recalc_task_prio
c021e20d 61        0.1457     vmlinux                  as_merge
c012dd4d 60        0.1433     vmlinux                  file_read_actor
c012d514 60        0.1433     vmlinux                  page_waitqueue
c022db45 58        0.1385     vmlinux                  start_request
c0218800 57        0.1362     vmlinux                  blk_recount_segments
c01445f6 56        0.1338     vmlinux                  __set_page_buffers
c0148fc0 56        0.1338     vmlinux                  max_block
c010a36c 54        0.1290     vmlinux                  handle_IRQ_event
c021db03 53        0.1266     vmlinux                  as_move_to_dispatch
c0236409 53        0.1266     vmlinux                  lba_48_rw_disk
c012d8ee 52        0.1242     vmlinux                  find_get_pages
c021d4a9 51        0.1218     vmlinux                  as_update_iohist
c0146f2d 51        0.1218     vmlinux                  end_bio_bh_io_sync
c021a10c 51        0.1218     vmlinux                  submit_bio
c021a313 50        0.1194     vmlinux                  __end_that_request_first
c0116391 49        0.1170     vmlinux                  try_to_wake_up
c021e1b4 48        0.1147     vmlinux                  as_queue_empty
c01201fc 48        0.1147     vmlinux                  del_timer
c01473d5 48        0.1147     vmlinux                  free_buffer_head
c0219fdf 47        0.1123     vmlinux                  generic_make_request
c0147259 47        0.1123     vmlinux                  try_to_free_buffers
c021dc81 46        0.1099     vmlinux                  as_dispatch_request
c0219439 45        0.1075     vmlinux                  get_request
c0134795 45        0.1075     vmlinux                  invalidate_complete_page
c0218ac9 45        0.1075     vmlinux                  ll_back_merge_fn
c01161f8 43        0.1027     vmlinux                  effective_prio
c0120006 42        0.1003     vmlinux                  __mod_timer
c0132fa8 41        0.0979     vmlinux                  cache_alloc_refill
c0147393 40        0.0955     vmlinux                  alloc_buffer_head
c01451f5 39        0.0932     vmlinux                  create_buffers
c0134344 39        0.0932     vmlinux                  release_pages
c0236118 37        0.0884     vmlinux                  __ide_do_rw_disk
c021e35d 37        0.0884     vmlinux                  as_merged_request
c010a5b9 36        0.0860     vmlinux                  do_IRQ
c0134a52 36        0.0860     vmlinux                  invalidate_mapping_pages
c0111763 36        0.0860     vmlinux                  sched_clock
c012d540 36        0.0860     vmlinux                  wait_on_page_bit
c021909d 35        0.0836     vmlinux                  blk_run_queues
c021d88c 32        0.0764     vmlinux                  as_remove_queued_request
c0147c8c 32        0.0764     vmlinux                  bio_endio
c0217ddc 32        0.0764     vmlinux                  elv_try_last_merge
c023c716 32        0.0764     vmlinux                  ide_build_dmatable
00000000 31        0.0740     libc-2.3.2.so            (no symbols)
c011d0a0 31        0.0740     vmlinux                  do_softirq
c014913b 30        0.0717     vmlinux                  blkdev_get_block
00000000 29        0.0693     ld-2.3.2.so              (no symbols)
c0134517 29        0.0693     vmlinux                  __pagevec_lru_add
c01474cc 29        0.0693     vmlinux                  bio_destructor
c02311bb 29        0.0693     vmlinux                  do_rw_taskfile
c0231f7d 29        0.0693     vmlinux                  ide_cmd_type_parser
c014583c 29        0.0693     vmlinux                  set_bh_page
c0132227 27        0.0645     vmlinux                  do_page_cache_readahead
c0219805 27        0.0645     vmlinux                  drive_stat_acct
c0230a11 27        0.0645     vmlinux                  ide_execute_command
c01201b2 27        0.0645     vmlinux                  mod_timer
c021a77f 26        0.0621     vmlinux                  get_io_context
c011672e 26        0.0621     vmlinux                  scheduler_tick
00000000 25        0.0597     bash                     (no symbols)
c0217bec 25        0.0597     vmlinux                  elv_queue_empty
c012059e 24        0.0573     vmlinux                  update_one_process
c0231000 23        0.0549     vmlinux                  SELECT_DRIVE
c023cb24 23        0.0549     vmlinux                  __ide_dma_read
c021d3a2 23        0.0549     vmlinux                  as_can_break_anticipation
c0109134 23        0.0549     vmlinux                  common_interrupt
c0218fb4 23        0.0549     vmlinux                  generic_unplug_device
c01ee966 22        0.0525     vmlinux                  __lookup
c012d194 22        0.0525     vmlinux                  __remove_from_page_cache
c02001ed 22        0.0525     vmlinux                  add_timer_randomness
c021deef 22        0.0525     vmlinux                  as_add_request
c021a4dd 22        0.0525     vmlinux                  end_that_request_last
c012fbc8 22        0.0525     vmlinux                  mempool_free_slab
c0131f78 22        0.0525     vmlinux                  read_pages
c02361cd 21        0.0502     vmlinux                  get_command
c01eef2b 21        0.0502     vmlinux                  rb_next
c021cf02 20        0.0478     vmlinux                  as_add_arq_hash
c021e80b 20        0.0478     vmlinux                  as_set_request
c023c547 20        0.0478     vmlinux                  ide_build_sglist
c012fbb8 20        0.0478     vmlinux                  mempool_alloc_slab
c010da6b 20        0.0478     vmlinux                  timer_interrupt
00000000 19        0.0454     oprofiled26              (no symbols)
c021d740 19        0.0454     vmlinux                  as_completed_request
c0230231 19        0.0454     vmlinux                  drive_is_ready
c02302e1 19        0.0454     vmlinux                  ide_wait_stat
c01341c1 19        0.0454     vmlinux                  mark_page_accessed
c013055b 19        0.0454     vmlinux                  rmqueue_bulk
c012069a 19        0.0454     vmlinux                  run_timer_softirq
c021da9c 18        0.0430     vmlinux                  as_fifo_expired
c021d95f 18        0.0430     vmlinux                  as_remove_dispatched_request
c01181d7 17        0.0406     vmlinux                  finish_wait
c013246e 17        0.0406     vmlinux                  handle_ra_miss
c021e0f2 16        0.0382     vmlinux                  as_insert_request
c010ad15 16        0.0382     vmlinux                  disable_irq_nosync
c01f0785 16        0.0382     vmlinux                  fast_copy_page
c010a441 16        0.0382     vmlinux                  note_interrupt
c01ee7aa 16        0.0382     vmlinux                  radix_tree_preload
c0120817 15        0.0358     vmlinux                  do_timer
c01087ee 15        0.0358     vmlinux                  restore_all
c01f04bc 14        0.0334     vmlinux                  __delay
c021d47e 14        0.0334     vmlinux                  as_can_anticipate
c021d684 14        0.0334     vmlinux                  as_update_arq
c0147eec 14        0.0334     vmlinux                  bio_hw_segments
c011d18d 14        0.0334     vmlinux                  raise_softirq
c01086c5 14        0.0334     vmlinux                  ret_from_intr
c01204c6 14        0.0334     vmlinux                  update_wall_time_one_tick
c011702f 13        0.0311     vmlinux                  __wake_up_common
c021da16 13        0.0311     vmlinux                  as_remove_request
c0147ecf 13        0.0311     vmlinux                  bio_phys_segments
c0218e97 13        0.0311     vmlinux                  blk_plug_device
c021a7e0 13        0.0311     vmlinux                  copy_io_context
c0136cdd 13        0.0311     vmlinux                  copy_page_range
c0106ae1 13        0.0311     vmlinux                  cpu_idle
c0235577 13        0.0311     vmlinux                  default_end_request
c010938c 13        0.0311     vmlinux                  page_fault
c01eef63 13        0.0311     vmlinux                  rb_prev
c023cc6c 12        0.0287     vmlinux                  __ide_dma_begin
c021cfd8 12        0.0287     vmlinux                  as_add_arq_rb
c021d315 12        0.0287     vmlinux                  as_close_req
c013685b 12        0.0287     vmlinux                  blk_queue_bounce
c013847d 12        0.0287     vmlinux                  do_no_page
c0217a19 12        0.0287     vmlinux                  elv_merge
c0217b01 12        0.0287     vmlinux                  elv_next_request
c023c4ca 12        0.0287     vmlinux                  ide_dma_intr
c013041e 12        0.0287     vmlinux                  prep_new_page
c012d678 11        0.0263     vmlinux                  __lock_page
c021d178 11        0.0263     vmlinux                  as_find_next_arq
c01444e5 11        0.0263     vmlinux                  bh_waitq_head
c021a4af 11        0.0263     vmlinux                  end_that_request_first
c0108702 11        0.0263     vmlinux                  need_resched
c01eee3f 11        0.0263     vmlinux                  rb_erase
c0145878 11        0.0263     vmlinux                  try_to_release_page
c01444fa 11        0.0263     vmlinux                  wake_up_buffer
c0144623 10        0.0239     vmlinux                  __clear_page_buffers
c023cca4 10        0.0239     vmlinux                  __ide_dma_end
c021d07f 10        0.0239     vmlinux                  as_choose_req
c021e7c2 10        0.0239     vmlinux                  as_put_request
c0147155 10        0.0239     vmlinux                  check_ttfb_buffer
c026493f 10        0.0239     vmlinux                  i8042_interrupt
c01341ef 10        0.0239     vmlinux                  lru_cache_add
c014735a 10        0.0239     vmlinux                  recalc_bh_state
c02198b0 9         0.0215     vmlinux                  __blk_put_request
c021e1f4 9         0.0215     vmlinux                  as_latter_request
c010a503 9         0.0215     vmlinux                  enable_irq
c0231efa 9         0.0215     vmlinux                  ide_handler_parser
c0124348 9         0.0215     vmlinux                  notifier_call_chain
c013bd66 9         0.0215     vmlinux                  page_remove_rmap
c0116fdc 9         0.0215     vmlinux                  preempt_schedule
c011707a 8         0.0191     vmlinux                  __wake_up
c012d4e7 8         0.0191     vmlinux                  add_to_page_cache_lru
c014921e 8         0.0191     vmlinux                  blkdev_readpage
c0217c9d 8         0.0191     vmlinux                  elv_completed_request
c0113745 8         0.0191     vmlinux                  smp_apic_timer_interrupt
c028f61d 8         0.0191     vmlinux                  sync_buffer
c012056b 8         0.0191     vmlinux                  update_wall_time
c023cdac 7         0.0167     vmlinux                  __ide_dma_count
c0132cef 7         0.0167     vmlinux                  cache_init_objs
c0217c05 7         0.0167     vmlinux                  elv_latter_request
c0231e74 7         0.0167     vmlinux                  ide_pre_handler_parser
c023caac 7         0.0167     vmlinux                  ide_start_dma
c021a6d7 7         0.0167     vmlinux                  put_io_context
c01eec00 7         0.0167     vmlinux                  rb_insert_color
c01f0517 6         0.0143     vmlinux                  __const_udelay
c0130c1c 6         0.0143     vmlinux                  __pagevec_free
c021d259 6         0.0143     vmlinux                  as_antic_stop
c021ce97 6         0.0143     vmlinux                  as_get_io_context
c021dec8 6         0.0143     vmlinux                  as_next_request
c021cec5 6         0.0143     vmlinux                  as_remove_merge_hints
c0217bc7 6         0.0143     vmlinux                  elv_remove_request
c012d7a4 6         0.0143     vmlinux                  find_lock_page
c0236508 6         0.0143     vmlinux                  ide_do_rw_disk
c013bcba 6         0.0143     vmlinux                  page_add_rmap
c013cf5a 6         0.0143     vmlinux                  shmem_getpage
c013c39c 6         0.0143     vmlinux                  shmem_swp_alloc
c0131ba4 6         0.0143     vmlinux                  test_clear_page_dirty
00000000 5         0.0119     ISO8859-1.so             (no symbols)
c01eecce 5         0.0119     vmlinux                  __rb_erase_color
c01f05dc 5         0.0119     vmlinux                  _mmx_memcpy
c013413c 5         0.0119     vmlinux                  activate_page
c028f76c 5         0.0119     vmlinux                  add_event_entry
c011701d 5         0.0119     vmlinux                  default_wake_function
c021986e 5         0.0119     vmlinux                  disk_round_stats
c0217a35 5         0.0119     vmlinux                  elv_merged_request
c010c9e8 5         0.0119     vmlinux                  end_8259A_irq
c02193b3 5         0.0119     vmlinux                  freed_request
c011ff72 5         0.0119     vmlinux                  internal_add_timer
c01eeb7d 5         0.0119     vmlinux                  radix_tree_node_ctor
c012080d 5         0.0119     vmlinux                  run_local_timers
00000000 4         0.0096     nmbd                     (no symbols)
c023cd21 4         0.0096     vmlinux                  __ide_dma_test_irq
c013320a 4         0.0096     vmlinux                  cache_flusharray
c022e022 4         0.0096     vmlinux                  do_ide_request
c0217c4c 4         0.0096     vmlinux                  elv_set_request
c0139e36 4         0.0096     vmlinux                  find_vma
c013885e 4         0.0096     vmlinux                  handle_mm_fault
c0117c4e 4         0.0096     vmlinux                  io_schedule
c0256be8 4         0.0096     vmlinux                  uhci_hub_status_data
c0136ff3 4         0.0096     vmlinux                  zap_pte_range
c0217a99 3         0.0072     vmlinux                  __elv_add_request
c028f4ed 3         0.0072     vmlinux                  add_sample_entry
c021cfb8 3         0.0072     vmlinux                  as_find_first_arq
c0118233 3         0.0072     vmlinux                  autoremove_wake_function
c0147672 3         0.0072     vmlinux                  bio_put
c0132d68 3         0.0072     vmlinux                  cache_grow
c010920c 3         0.0072     vmlinux                  device_not_available
c0217c71 3         0.0072     vmlinux                  elv_put_request
c017f280 3         0.0072     vmlinux                  journal_switch_revoke_table
c0144ce6 3         0.0072     vmlinux                  mark_buffer_async_read
c0229b32 3         0.0072     vmlinux                  mdio_read
c0133701 3         0.0072     vmlinux                  reap_timer_fnc
c01180f8 3         0.0072     vmlinux                  remove_wait_queue
c010e4eb 3         0.0072     vmlinux                  restore_fpu
c0259c77 3         0.0072     vmlinux                  stall_callback
c01087ac 3         0.0072     vmlinux                  system_call
00000000 2         0.0048     apache                   (no symbols)
00000000 2         0.0048     cupsd                    (no symbols)
c015cadc 2         0.0048     vmlinux                  __mark_inode_dirty
c0131abc 2         0.0048     vmlinux                  __set_page_dirty_nobuffers
c0200345 2         0.0048     vmlinux                  add_disk_randomness
c0217e50 2         0.0048     vmlinux                  clear_queue_congested
c017b4da 2         0.0048     vmlinux                  do_get_write_access
c01154e7 2         0.0048     vmlinux                  do_page_fault
c0137b16 2         0.0048     vmlinux                  do_wp_page
c01555d9 2         0.0048     vmlinux                  dput
c01442a9 2         0.0048     vmlinux                  fget_light
c012e044 2         0.0048     vmlinux                  generic_file_read
c025a39b 2         0.0048     vmlinux                  hc_state_transitions
c0231f7a 2         0.0048     vmlinux                  ide_post_handler_parser
c0133391 2         0.0048     vmlinux                  kfree
c016271f 2         0.0048     vmlinux                  load_elf_binary
c025a326 2         0.0048     vmlinux                  ports_active
c013c174 2         0.0048     vmlinux                  pte_chain_alloc
c013c292 2         0.0048     vmlinux                  shmem_swp_entry
c0123389 2         0.0048     vmlinux                  sys_rt_sigprocmask
c0134735 2         0.0048     vmlinux                  truncate_complete_page
c0202375 2         0.0048     vmlinux                  tty_write
c015803a 2         0.0048     vmlinux                  update_atime
c0143527 2         0.0048     vmlinux                  vfs_read
c0242712 2         0.0048     vmlinux                  vgacon_cursor
c02436fc 2         0.0048     vmlinux                  vgacon_scroll
c012645e 2         0.0048     vmlinux                  worker_thread
c0206a54 2         0.0048     vmlinux                  write_chan
00000000 1         0.0024     gawk                     (no symbols)
00000000 1         0.0024     libc-2.3.2.so            (no symbols)
00000000 1         0.0024     ls                       (no symbols)
00000000 1         0.0024     tee                      (no symbols)
c0145d1c 1         0.0024     vmlinux                  __block_prepare_write
c01f0b94 1         0.0024     vmlinux                  __copy_from_user_ll
c01456a2 1         0.0024     vmlinux                  __find_get_block
c012de1f 1         0.0024     vmlinux                  __generic_file_aio_read
c0130b8e 1         0.0024     vmlinux                  __get_free_pages
c017c9b4 1         0.0024     vmlinux                  __journal_file_buffer
c0134483 1         0.0024     vmlinux                  __pagevec_release
c0153801 1         0.0024     vmlinux                  __posix_lock_file
c013c128 1         0.0024     vmlinux                  __pte_chain_free
c0144ec2 1         0.0024     vmlinux                  __set_page_dirty_buffers
c015cbde 1         0.0024     vmlinux                  __sync_single_inode
c028f45a 1         0.0024     vmlinux                  add_kernel_ctx_switch
c021cf4d 1         0.0024     vmlinux                  as_find_arq_hash
c0218f2f 1         0.0024     vmlinux                  blk_remove_plug
c01472f5 1         0.0024     vmlinux                  block_sync_page
c0146e4d 1         0.0024     vmlinux                  block_write_full_page
c0228b23 1         0.0024     vmlinux                  boomerang_interrupt
c014de24 1         0.0024     vmlinux                  cached_lookup
c01d0de4 1         0.0024     vmlinux                  cap_bprm_set_security
c01d10d7 1         0.0024     vmlinux                  cap_vm_enough_memory
c0136b10 1         0.0024     vmlinux                  clear_page_tables
c01f097f 1         0.0024     vmlinux                  clear_user
c01189c6 1         0.0024     vmlinux                  copy_files
c0118527 1         0.0024     vmlinux                  copy_mm
c014b52b 1         0.0024     vmlinux                  copy_strings
c0106e05 1         0.0024     vmlinux                  copy_thread
c014b4ed 1         0.0024     vmlinux                  count
c0158f04 1         0.0024     vmlinux                  dnotify_parent
c01382ac 1         0.0024     vmlinux                  do_anonymous_page
c020ef74 1         0.0024     vmlinux                  do_con_trol
c011c4f4 1         0.0024     vmlinux                  do_getitimer
c0152745 1         0.0024     vmlinux                  do_poll
c0151fbf 1         0.0024     vmlinux                  do_select
c014347f 1         0.0024     vmlinux                  do_sync_read
c0133682 1         0.0024     vmlinux                  drain_array_locked
c0118269 1         0.0024     vmlinux                  dup_task_struct
c0172d1c 1         0.0024     vmlinux                  ext3_get_inode_loc
c0171437 1         0.0024     vmlinux                  ext3_getblk
c016e989 1         0.0024     vmlinux                  ext3_new_block
c01514cf 1         0.0024     vmlinux                  fasync_helper
c014426d 1         0.0024     vmlinux                  fget
c0144342 1         0.0024     vmlinux                  file_move
c012e30a 1         0.0024     vmlinux                  filemap_nopage
c013fbab 1         0.0024     vmlinux                  free_page_and_swap_cache
c0130c70 1         0.0024     vmlinux                  free_pages
c0157d5e 1         0.0024     vmlinux                  generic_delete_inode
c014aa08 1         0.0024     vmlinux                  generic_fillattr
c028f5d5 1         0.0024     vmlinux                  get_slots
c014868c 1         0.0024     vmlinux                  get_super
c014dbb0 1         0.0024     vmlinux                  getname
c01cccfd 1         0.0024     vmlinux                  grow_ary
c0108471 1         0.0024     vmlinux                  handle_signal
c0264b57 1         0.0024     vmlinux                  i8042_timer_func
c0157ff6 1         0.0024     vmlinux                  inode_times_differ
c01cd0e4 1         0.0024     vmlinux                  ipc_lock
c017b9b0 1         0.0024     vmlinux                  journal_get_write_access
c017f8af 1         0.0024     vmlinux                  journal_write_metadata_buffer
c017f2d8 1         0.0024     vmlinux                  journal_write_revoke_records
c014e113 1         0.0024     vmlinux                  link_path_walk
c0162400 1         0.0024     vmlinux                  load_elf_interp
c0134273 1         0.0024     vmlinux                  lru_add_drain
c010a13c 1         0.0024     vmlinux                  math_state_restore
c014edfd 1         0.0024     vmlinux                  may_open
c01f0887 1         0.0024     vmlinux                  mmx_clear_page
c014ea19 1         0.0024     vmlinux                  path_lookup
c014d48b 1         0.0024     vmlinux                  pipe_poll
c0151e25 1         0.0024     vmlinux                  poll_freewait
c011aa13 1         0.0024     vmlinux                  profile_hook
c0136ba4 1         0.0024     vmlinux                  pte_alloc_map
c01262f4 1         0.0024     vmlinux                  queue_work
c01ee764 1         0.0024     vmlinux                  radix_tree_node_alloc
c0120f94 1         0.0024     vmlinux                  recalc_sigpending
c011a4d8 1         0.0024     vmlinux                  release_console_sem
c028f589 1         0.0024     vmlinux                  release_mm
c010be88 1         0.0024     vmlinux                  release_x86_irqs
c0139160 1         0.0024     vmlinux                  remove_shared_vm_struct
c01086f8 1         0.0024     vmlinux                  resume_kernel
c024a6e0 1         0.0024     vmlinux                  rh_report_status
c01268f1 1         0.0024     vmlinux                  schedule_work
c0155e2d 1         0.0024     vmlinux                  select_parent
c0107f3e 1         0.0024     vmlinux                  setup_sigcontext
c01325f3 1         0.0024     vmlinux                  slab_destroy
c017ad84 1         0.0024     vmlinux                  start_this_handle
c01f0938 1         0.0024     vmlinux                  strncpy_from_user
c01f09d2 1         0.0024     vmlinux                  strnlen_user
c015d0c2 1         0.0024     vmlinux                  sync_inodes_sb
c015ce4f 1         0.0024     vmlinux                  sync_sb_inodes
c01484e9 1         0.0024     vmlinux                  sync_supers
c0151775 1         0.0024     vmlinux                  sys_ioctl
c015226a 1         0.0024     vmlinux                  sys_select
c01371b5 1         0.0024     vmlinux                  unmap_page_range
c013a061 1         0.0024     vmlinux                  unmap_vma
c0120655 1         0.0024     vmlinux                  update_process_times
c015d031 1         0.0024     vmlinux                  writeback_inodes

HTH,

-- 
Tobias						PGP: http://9ac7e0bc.2ya.com
np: CF-Theme


* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-03  3:33     ` Tobias Diedrich
@ 2004-01-03  4:15       ` Valdis.Kletnieks
  2004-01-03 13:39         ` Tobias Diedrich
  2004-01-04  3:02         ` jw schultz
  0 siblings, 2 replies; 38+ messages in thread
From: Valdis.Kletnieks @ 2004-01-03  4:15 UTC (permalink / raw)
  To: Tobias Diedrich; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 443 bytes --]

On Sat, 03 Jan 2004 04:33:28 +0100, Tobias Diedrich <ranma@gmx.at>  said:

> Very interesting tidbit:
> 
> with 2.6.1-rc1 and "dd if=/dev/hda of=/dev/null" I get stable 28 MB/s,
> but with "cat < /dev/hda > /dev/null" I get 48 MB/s according to "vmstat
> 5".

'cat' is probably doing a stat() on stdout and seeing it's connected to /dev/null
and not even bothering to do the write() call.  I've seen similar behavior in other
GNU utilities.  
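
A quick way to check that sort of hypothesis (a sketch -- it assumes
strace is installed and /tmp/bigfile is just some large file) is to look
at the syscalls cat actually makes:

strace -e trace=read,write,mmap,mmap2 cat < /tmp/bigfile > /dev/null 2> /tmp/cat.trace
grep -c 'write(' /tmp/cat.trace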

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]


* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-02 21:32     ` Mike Fedyk
  2004-01-02 22:34       ` Martin Josefsson
@ 2004-01-03 10:20       ` Paolo Ornati
  1 sibling, 0 replies; 38+ messages in thread
From: Paolo Ornati @ 2004-01-03 10:20 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: Ed Sweetman, linux-kernel

On Friday 02 January 2004 22:32, Mike Fedyk wrote:
> On Fri, Jan 02, 2004 at 10:04:27PM +0100, Paolo Ornati wrote:
> > NR_TESTS=3
> > RA_VALUES="64 128 256 8192"
>
> Can you add more samples between 128 and 256, maybe at intervals of 32?

YES

2.6.0:
128	31.66
160	31.88
192	30.93
224	31.18
256	26.16	# HD LED blinking

2.6.1-rc1:
128	25.91	# HD LED blinking
160	26.00	# HD LED blinking
192	26.06	# HD LED blinking
224	25.94	# HD LED blinking
256	25.96	# HD LED blinking

bye

-- 
	Paolo Ornati
	Linux v2.4.23




* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-02 21:27     ` Valdis.Kletnieks
@ 2004-01-03 10:20       ` Paolo Ornati
  0 siblings, 0 replies; 38+ messages in thread
From: Paolo Ornati @ 2004-01-03 10:20 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Ed Sweetman, linux-kernel

On Friday 02 January 2004 22:27, you wrote:
>
> Do you get different numbers if you boot with:
>
> elevator=as
> elevator=deadline
> elevator=cfq  (for -mm kernels)
> elevator=noop
>

Changing the I/O scheduler doesn't seem to affect performance too much...

AS (the one already used)
> > 2.6.0:
> > 64        31.91
> > 128      31.89
> > 256      26.22	# during the transfer HD LED blinks
> > 8192    26.26	# during the transfer HD LED blinks
> >
> > 2.6.1-rc1:
> > 64        25.84	# during the transfer HD LED blinks
> > 128      25.85	# during the transfer HD LED blinks
> > 256      25.90	# during the transfer HD LED blinks
> > 8192    26.42	# during the transfer HD LED blinks

DEADLINE
2.6.0:
64	31.89
128	31.90
256	26.18
8192	26.22

2.6.1-rc1:
64	25.90
128	26.14
256	26.06
8192	26.45

NOOP
2.6.0:
64	31.90
128	31.76
256	26.05
8192	26.20

2.6.1-rc1:
64	25.91
128	26.23
256	26.16
8192	26.40


Bye

-- 
	Paolo Ornati
	Linux v2.4.23



* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-02 22:34       ` Martin Josefsson
@ 2004-01-03 11:13         ` Paolo Ornati
  2004-01-03 22:40           ` Andrew Morton
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Ornati @ 2004-01-03 11:13 UTC (permalink / raw)
  To: Martin Josefsson; +Cc: Ram Pai, linux-kernel

On Friday 02 January 2004 23:34, you wrote:
> On Fri, 2004-01-02 at 22:32, Mike Fedyk wrote:
> > Have there been any ide updates in 2.6.1-rc1?
>
> I see that a readahead patch was applied just before -rc1 was released.
>
> found it in bk-commits-head
>
> Subject: [PATCH] readahead: multiple performance fixes
> Message-Id:  <200312310120.hBV1KLZN012971@hera.kernel.org>
>
> Maybe Paolo can try backing it out.

YES, YES, YES...

Reverting the "readahead: multiple performance fixes" patch, performance
came back to the 2.6.0 level.

2.6.0:
64        31.91
128      31.89
256      26.22
8192    26.26

2.6.1-rc1 (readahead patch reverted):
64	31.84
128	31.86
256	25.93
8192	26.16

I know these are only performance in sequential data reads... and real life 
is another thing... but I think the author of the patch should be informed 
(Ram Pai).

for Ram Pai:
_____________________________________________________________________
My first message:
http://www.ussg.iu.edu/hypermail/linux/kernel/0401.0/0004.html

This thread:
Strange IDE performance change in 2.6.1-rc1 (again)
http://www.ussg.iu.edu/hypermail/linux/kernel/0401.0/0289.html
(look at the graph)

Any comments?
_____________________________________________________________________

Bye

-- 
	Paolo Ornati
	Linux v2.4.23



* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-03  4:15       ` Valdis.Kletnieks
@ 2004-01-03 13:39         ` Tobias Diedrich
  2004-01-03 20:56           ` Tobias Diedrich
  2004-01-04  3:02         ` jw schultz
  1 sibling, 1 reply; 38+ messages in thread
From: Tobias Diedrich @ 2004-01-03 13:39 UTC (permalink / raw)
  To: linux-kernel

Valdis.Kletnieks@vt.edu wrote:

> 'cat' is probably doing a stat() on stdout and seeing it's connected
> to /dev/null and not even bothering to do the write() call.  I've seen
> similar behavior in other GNU utilities.

I can't see any special casing for /dev/null in cat's source, but I
forgot to check dd with bigger block size. It's ok with bs=4096...
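
That is, just spelling out the two commands being compared (512 bytes is
dd's default block size):

dd if=/dev/hda of=/dev/null bs=512     # slow case
dd if=/dev/hda of=/dev/null bs=4096    # ok on 2.6.1-rc1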

-- 
Tobias						PGP: http://9ac7e0bc.2ya.com
This mail is made of 100% recycled bits.


* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-03 13:39         ` Tobias Diedrich
@ 2004-01-03 20:56           ` Tobias Diedrich
  0 siblings, 0 replies; 38+ messages in thread
From: Tobias Diedrich @ 2004-01-03 20:56 UTC (permalink / raw)
  To: linux-kernel

I wrote:

> Valdis.Kletnieks@vt.edu wrote:
> 
> > 'cat' is probably doing a stat() on stdout and seeing it's connected
> > to /dev/null and not even bothering to do the write() call.  I've seen
> > similar behavior in other GNU utilities.
> 
> I can't see any special casing for /dev/null in cat's source, but I
> forgot to check dd with bigger block size. It's ok with bs=4096...

However with 2.4 dd performs fine even with bs=512.

-- 
Tobias						PGP: http://9ac7e0bc.2ya.com
Be vigilant!
np: PHILFUL3

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-03 11:13         ` Paolo Ornati
@ 2004-01-03 22:40           ` Andrew Morton
  2004-01-04 14:30             ` Paolo Ornati
  2004-01-04 17:15             ` Buffer and Page cache coherent? was: " Mike Fedyk
  0 siblings, 2 replies; 38+ messages in thread
From: Andrew Morton @ 2004-01-03 22:40 UTC (permalink / raw)
  To: Paolo Ornati; +Cc: gandalf, linuxram, linux-kernel

Paolo Ornati <ornati@lycos.it> wrote:
>
> I know these are only performance in sequential data reads... and real life 
>  is another thing... but I think the author of the patch should be informed 
>  (Ram Pai).

There does seem to be something whacky going on with readahead against
blockdevices.  Perhaps it is related to the soft blocksize.  I've never
been able to reproduce any of this.

Be aware that buffered reads for blockdevs are treated fairly differently
from buffered reads for regular files: they only use lowmem and we always
attach buffer_heads and perform I/O against them.

No effort was made to optimise buffered blockdev reads because it is not
very important and my main interest was in data coherency and filesystem
metadata consistency.

If you observe the same things reading from regular files then that is more
important.
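
A minimal way to compare the two read paths (a sketch; the file name is
just an example):

hdparm -t /dev/hda                         # buffered blockdev read
time cat /mnt/test/big_file > /dev/null    # buffered regular-file read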



* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-03  4:15       ` Valdis.Kletnieks
  2004-01-03 13:39         ` Tobias Diedrich
@ 2004-01-04  3:02         ` jw schultz
  1 sibling, 0 replies; 38+ messages in thread
From: jw schultz @ 2004-01-04  3:02 UTC (permalink / raw)
  To: linux-kernel

On Fri, Jan 02, 2004 at 11:15:18PM -0500, Valdis.Kletnieks@vt.edu wrote:
> On Sat, 03 Jan 2004 04:33:28 +0100, Tobias Diedrich <ranma@gmx.at>  said:
> 
> > Very interesting tidbit:
> > 
> > with 2.6.1-rc1 and "dd if=/dev/hda of=/dev/null" I get stable 28 MB/s,
> > but with "cat < /dev/hda > /dev/null" I get 48 MB/s according to "vmstat
> > 5".
> 
> 'cat' is probably doing a stat() on stdout and seeing it's connected to /dev/null
> and not even bothering to do the write() call.  I've seen similar behavior in other
> GNU utilities.  

That is unlikely.

However, I have seen some versions of cat check the input file and, if
it is mappable, mmap() it instead of read()ing it.  Given that a write to
/dev/null returns the count without doing copy_from_user, the mapped
pages are never faulted in, so there is no disk I/O.
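
A simple way to test that explanation (a sketch): writing the mmap()ed
pages into a pipe does copy them, so they have to be faulted in and real
disk I/O should show up again in "vmstat 5":

cat < /dev/hda | cat > /dev/null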




-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw@pegasys.ws

		Remember Cernan and Schmitt


* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-03 22:40           ` Andrew Morton
@ 2004-01-04 14:30             ` Paolo Ornati
  2004-01-05 23:19               ` Ram Pai
  2004-03-29 15:45               ` Ram Pai
  2004-01-04 17:15             ` Buffer and Page cache coherent? was: " Mike Fedyk
  1 sibling, 2 replies; 38+ messages in thread
From: Paolo Ornati @ 2004-01-04 14:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: gandalf, linuxram, linux-kernel

On Saturday 03 January 2004 23:40, Andrew Morton wrote:
> Paolo Ornati <ornati@lycos.it> wrote:
> > I know these are only performance in sequential data reads... and real
> > life is another thing... but I think the author of the patch should be
> > informed (Ram Pai).
>
> There does seem to be something whacky going on with readahead against
> blockdevices.  Perhaps it is related to the soft blocksize.  I've never
> been able to reproduce any of this.
>
> Be aware that buffered reads for blockdevs are treated fairly differently
> from buffered reads for regular files: they only use lowmem and we always
> attach buffer_heads and perform I/O against them.
>
> No effort was made to optimise buffered blockdev reads because it is not
> very important and my main interest was in data coherency and filesystem
> metadata consistency.
>
> If you observe the same things reading from regular files then that is
> more important.

I have done some tests with this stupid script and it seems that you are 
right:
_____________________________________________________________________
#!/bin/sh

DEV=/dev/hda7
MOUNT_DIR=mnt
BIG_FILE=$MOUNT_DIR/big_file

mount $DEV $MOUNT_DIR
if [ ! -f $BIG_FILE ]; then
    echo "[DD] $BIG_FILE"
    dd if=/dev/zero of=$BIG_FILE bs=1M count=1024
    umount $MOUNT_DIR
    mount $DEV $MOUNT_DIR
fi

killall5
sleep 2
sync
sleep 2

time cat $BIG_FILE > /dev/null
umount $MOUNT_DIR
_____________________________________________________________________


Results for plain 2.6.1-rc1 (A) and 2.6.1-rc1 without Ram Pai's patch (B):

o readahead = 256 (default setting)

(A)
real	0m43.596s
user	0m0.153s
sys	0m5.602s

real	0m42.971s
user	0m0.136s
sys	0m5.571s

real	0m42.888s
user	0m0.137s
sys	0m5.648s

(B)
real    0m43.520s
user    0m0.130s
sys     0m5.615s

real	0m42.930s
user	0m0.154s
sys	0m5.745s

real	0m42.937s
user	0m0.120s
sys	0m5.751s


o readahead = 128

(A)
real	0m35.932s
user	0m0.133s
sys	0m5.926s

real	0m35.925s
user	0m0.146s
sys	0m5.930s

real	0m35.892s
user	0m0.145s
sys	0m5.946s

(B)
real	0m35.957s
user	0m0.136s
sys	0m6.041s

real	0m35.958s
user	0m0.136s
sys	0m5.957s

real	0m35.924s
user	0m0.146s
sys	0m6.069s


o readahead = 64
(A)
real	0m35.284s
user	0m0.137s
sys	0m6.182s

real	0m35.267s
user	0m0.134s
sys	0m6.110s

real	0m35.260s
user	0m0.149s
sys	0m6.003s


(B)
real	0m35.210s
user	0m0.149s
sys	0m6.009s

real	0m35.341s
user	0m0.151s
sys	0m6.119s

real	0m35.151s
user	0m0.144s
sys	0m6.195s


I don't notice any big difference between kernel A and kernel B...

From these tests the best readahead value for my HD seems to be 64... and
the default setting (256) seems just wrong.

With 2.4.23 kernel and readahead = 8 I get results like these:

real	0m40.085s
user	0m0.130s
sys	0m4.560s

real	0m40.058s
user	0m0.090s
sys	0m4.630s

Bye.

-- 
	Paolo Ornati
	Linux v2.4.23




* Buffer and Page cache coherent? was: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-03 22:40           ` Andrew Morton
  2004-01-04 14:30             ` Paolo Ornati
@ 2004-01-04 17:15             ` Mike Fedyk
  2004-01-04 22:10               ` Andrew Morton
  1 sibling, 1 reply; 38+ messages in thread
From: Mike Fedyk @ 2004-01-04 17:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Paolo Ornati, gandalf, linuxram, linux-kernel

On Sat, Jan 03, 2004 at 02:40:03PM -0800, Andrew Morton wrote:
> No effort was made to optimise buffered blockdev reads because it is not
> very important and my main interest was in data coherency and filesystem
> metadata consistency.

Does that mean that blockdev reads will populate the pagecache in 2.6?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Buffer and Page cache coherent? was: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-04 17:15             ` Buffer and Page cache coherent? was: " Mike Fedyk
@ 2004-01-04 22:10               ` Andrew Morton
  2004-01-04 23:22                 ` Mike Fedyk
  0 siblings, 1 reply; 38+ messages in thread
From: Andrew Morton @ 2004-01-04 22:10 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: ornati, gandalf, linuxram, linux-kernel

Mike Fedyk <mfedyk@matchmail.com> wrote:
>
> On Sat, Jan 03, 2004 at 02:40:03PM -0800, Andrew Morton wrote:
> > No effort was made to optimise buffered blockdev reads because it is not
> > very important and my main interest was in data coherency and filesystem
> > metadata consistency.
> 
> Does that mean that blockdev reads will populate the pagecache in 2.6?

They have since 2.4.10.  The pagecache is the only cacheing entity for file
(and blockdev) data.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Buffer and Page cache coherent? was: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-04 22:10               ` Andrew Morton
@ 2004-01-04 23:22                 ` Mike Fedyk
  2004-01-04 23:32                   ` Andrew Morton
  0 siblings, 1 reply; 38+ messages in thread
From: Mike Fedyk @ 2004-01-04 23:22 UTC (permalink / raw)
  To: Andrew Morton; +Cc: ornati, gandalf, linuxram, linux-kernel

On Sun, Jan 04, 2004 at 02:10:30PM -0800, Andrew Morton wrote:
> Mike Fedyk <mfedyk@matchmail.com> wrote:
> >
> > On Sat, Jan 03, 2004 at 02:40:03PM -0800, Andrew Morton wrote:
> > > No effort was made to optimise buffered blockdev reads because it is not
> > > very important and my main interest was in data coherency and filesystem
> > > metadata consistency.
> > 
> > Does that mean that blockdev reads will populate the pagecache in 2.6?
> 
> They have since 2.4.10.  The pagecache is the only cacheing entity for file
> (and blockdev) data.

There was a large thread after 2.4.10 was released about speeding up the
boot process by reading the underlying blockdev of the root partition in
block order.

Unfortunately at the time reading the files through the pagecache would
cause a second read of the data even if it was already buffered.  I don't
remember the exact details.

Are you saying this is now resolved?  And the above optimization will work?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Buffer and Page cache coherent? was: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-04 23:22                 ` Mike Fedyk
@ 2004-01-04 23:32                   ` Andrew Morton
  2004-01-04 23:45                     ` Mike Fedyk
  0 siblings, 1 reply; 38+ messages in thread
From: Andrew Morton @ 2004-01-04 23:32 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: ornati, gandalf, linuxram, linux-kernel

Mike Fedyk <mfedyk@matchmail.com> wrote:
>
> On Sun, Jan 04, 2004 at 02:10:30PM -0800, Andrew Morton wrote:
> > Mike Fedyk <mfedyk@matchmail.com> wrote:
> > >
> > > On Sat, Jan 03, 2004 at 02:40:03PM -0800, Andrew Morton wrote:
> > > > No effort was made to optimise buffered blockdev reads because it is not
> > > > very important and my main interest was in data coherency and filesystem
> > > > metadata consistency.
> > > 
> > > Does that mean that blockdev reads will populate the pagecache in 2.6?
> > 
> > They have since 2.4.10.  The pagecache is the only cacheing entity for file
> > (and blockdev) data.
> 
> There was a large thread after 2.4.10 was released about speeding up the
> boot proces by reading the underlying blockdev of the root partition in
> block order.
> 
> Unfortunately at the time reading the files through the pagecache would
> cause a second read of the data even if it was already buffered.  I don't
> remember the exact details.

The pagecache is a cache-per-inode.  So the cache for a regular file is not
coherent with the cache for /dev/hda1 is not coherent with the cache for
/dev/hda.

> Are you saying this is now resolved?  And the above optimization will work?

It will not.  And I doubt if it will make much difference anyway.  I once
wrote a gizmo which a) generated tables describing pagecache contents
immediately after bootup and b) used that info to prepopulate pagecache
with an optimised seek pattern after boot.  It was only worth 10-15%.  One
would need an intermediate step which re-laid out the relevant files to get
useful speedups.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Buffer and Page cache coherent? was: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-04 23:32                   ` Andrew Morton
@ 2004-01-04 23:45                     ` Mike Fedyk
  2004-01-05  0:23                       ` Andrew Morton
  0 siblings, 1 reply; 38+ messages in thread
From: Mike Fedyk @ 2004-01-04 23:45 UTC (permalink / raw)
  To: Andrew Morton; +Cc: ornati, gandalf, linuxram, linux-kernel

On Sun, Jan 04, 2004 at 03:32:58PM -0800, Andrew Morton wrote:
> Mike Fedyk <mfedyk@matchmail.com> wrote:
> >
> > On Sun, Jan 04, 2004 at 02:10:30PM -0800, Andrew Morton wrote:
> > > Mike Fedyk <mfedyk@matchmail.com> wrote:
> > > >
> > > > On Sat, Jan 03, 2004 at 02:40:03PM -0800, Andrew Morton wrote:
> > > > > No effort was made to optimise buffered blockdev reads because it is not
> > > > > very important and my main interest was in data coherency and filesystem
> > > > > metadata consistency.
> > > > 
> > > > Does that mean that blockdev reads will populate the pagecache in 2.6?
> > > 
> > > They have since 2.4.10.  The pagecache is the only cacheing entity for file
> > > (and blockdev) data.
> > 
> > There was a large thread after 2.4.10 was released about speeding up the
> > boot proces by reading the underlying blockdev of the root partition in
> > block order.
> > 
> > Unfortunately at the time reading the files through the pagecache would
> > cause a second read of the data even if it was already buffered.  I don't
> > remember the exact details.
> 
> The pagecache is a cache-per-inode.  So the cache for a regular file is not
> coherent with the cache for /dev/hda1 is not coherent with the cache for
> /dev/hda.

That's what I remember from the old thread.  Thanks.

Buffers are attached to a page, and blockdev reads will not save
pagecache reads.

So in what way is the buffer cache coherent with the pagecache?

> > Are you saying this is now resolved?  And the above optimization will work?
> 
> It will not.  And I doubt if it will make much difference anyway.  I once
> wrote a gizmo which a) generated tables describing pagecache contents
> immediately after bootup and b) used that info to prepopulate pagecache
> with an optimised seek pattern after boot.  It was only worth 10-15%.  One
> would need an intermediate step which relaid-out the relevant files to get
> useful speedups.

Any progress on that pagecache coherent block relocation patch you had for
ext3? :)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Buffer and Page cache coherent? was: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-04 23:45                     ` Mike Fedyk
@ 2004-01-05  0:23                       ` Andrew Morton
  0 siblings, 0 replies; 38+ messages in thread
From: Andrew Morton @ 2004-01-05  0:23 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: ornati, gandalf, linuxram, linux-kernel

Mike Fedyk <mfedyk@matchmail.com> wrote:
>
> So in what way is the buffer cache coherent with the pagecache?
> 

There is no "buffer cache" in Linux.  There is a pagecache for /etc/passwd
and there is a pagecache for /dev/hda1.  They are treated pretty much
identically.  The kernel attaches buffer_heads to those pagecache pages
when needed - generally when it wants to deal with individual disk blocks.

>  Any progress on that pagecache coherent block relocation patch you had for
>  ext3? :)

No.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-04 14:30             ` Paolo Ornati
@ 2004-01-05 23:19               ` Ram Pai
  2004-01-07 14:59                 ` Paolo Ornati
  2004-03-29 15:45               ` Ram Pai
  1 sibling, 1 reply; 38+ messages in thread
From: Ram Pai @ 2004-01-05 23:19 UTC (permalink / raw)
  To: Paolo Ornati; +Cc: Andrew Morton, gandalf, linux-kernel

Sorry I was on vacation and could not get back earlier.

I do not exactly know the reason why sequential reads on blockdevices
have regressed. One probable reason is that the same lazy-read
optimization which helps large random reads is regressing the sequential
read performance.

Note: the patch waits until the last page in the current window is being
read before triggering a new readahead. By the time the readahead
request is satisfied, the next sequential read may already have been
requested. Hence there is some loss of parallelism here. However, given
that large-size random reads are the most common case, this patch attacks
that case.

If you revert just the lazy-read optimization, you might see no
regression for sequential reads.

Let me see if I can verify this,
Ram Pai


On Sun, 2004-01-04 at 06:30, Paolo Ornati wrote:
> On Saturday 03 January 2004 23:40, Andrew Morton wrote:
> > Paolo Ornati <ornati@lycos.it> wrote:
> > > I know these are only performance in sequential data reads... and real
> > > life is another thing... but I think the author of the patch should be
> > > informed (Ram Pai).
> >
> > There does seem to be something whacky going on with readahead against
> > blockdevices.  Perhaps it is related to the soft blocksize.  I've never
> > been able to reproduce any of this.
> >
> > Be aware that buffered reads for blockdevs are treated fairly differently
> > from buffered reads for regular files: they only use lowmem and we always
> > attach buffer_heads and perform I/O against them.
> >
> > No effort was made to optimise buffered blockdev reads because it is not
> > very important and my main interest was in data coherency and filesystem
> > metadata consistency.
> >
> > If you observe the same things reading from regular files then that is
> > more important.
> 
> I have done some tests with this stupid script and it seems that you are 
> right:
> _____________________________________________________________________
> #!/bin/sh
> 
> DEV=/dev/hda7
> MOUNT_DIR=mnt
> BIG_FILE=$MOUNT_DIR/big_file
> 
> mount $DEV $MOUNT_DIR
> if [ ! -f $BIG_FILE ]; then
>     echo "[DD] $BIG_FILE"
>     dd if=/dev/zero of=$BIG_FILE bs=1M count=1024
>     umount $MOUNT_DIR
>     mount $DEV $MOUNT_DIR
> fi
> 
> killall5
> sleep 2
> sync
> sleep 2
> 
> time cat $BIG_FILE > /dev/null
> umount $MOUNT_DIR
> _____________________________________________________________________
> 
> 
> Results for plain 2.6.1-rc1 (A) and 2.6.1-rc1 without Ram Pai's patch (B):
> 
> o readahead = 256 (default setting)
> 
> (A)
> real	0m43.596s
> user	0m0.153s
> sys	0m5.602s
> 
> real	0m42.971s
> user	0m0.136s
> sys	0m5.571s
> 
> real	0m42.888s
> user	0m0.137s
> sys	0m5.648s
> 
> (B)
> real    0m43.520s
> user    0m0.130s
> sys     0m5.615s
> 
> real	0m42.930s
> user	0m0.154s
> sys	0m5.745s
> 
> real	0m42.937s
> user	0m0.120s
> sys	0m5.751s
> 
> 
> o readahead = 128
> 
> (A)
> real	0m35.932s
> user	0m0.133s
> sys	0m5.926s
> 
> real	0m35.925s
> user	0m0.146s
> sys	0m5.930s
> 
> real	0m35.892s
> user	0m0.145s
> sys	0m5.946s
> 
> (B)
> real	0m35.957s
> user	0m0.136s
> sys	0m6.041s
> 
> real	0m35.958s
> user	0m0.136s
> sys	0m5.957s
> 
> real	0m35.924s
> user	0m0.146s
> sys	0m6.069s
> 
> 
> o readahead = 64
> (A)
> real	0m35.284s
> user	0m0.137s
> sys	0m6.182s
> 
> real	0m35.267s
> user	0m0.134s
> sys	0m6.110s
> 
> real	0m35.260s
> user	0m0.149s
> sys	0m6.003s
> 
> 
> (B)
> real	0m35.210s
> user	0m0.149s
> sys	0m6.009s
> 
> real	0m35.341s
> user	0m0.151s
> sys	0m6.119s
> 
> real	0m35.151s
> user	0m0.144s
> sys	0m6.195s
> 
> 
> I don't notice any big difference between kernel A and kernel B....
> 
> From these tests the best readahead value for my HD seems to be 64... and 
> the default setting (256) just wrong.
> 
> With 2.4.23 kernel and readahead = 8 I get results like these:
> 
> real	0m40.085s
> user	0m0.130s
> sys	0m4.560s
> 
> real	0m40.058s
> user	0m0.090s
> sys	0m4.630s
> 
> Bye.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-05 23:19               ` Ram Pai
@ 2004-01-07 14:59                 ` Paolo Ornati
  2004-01-07 19:23                   ` Ram Pai
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Ornati @ 2004-01-07 14:59 UTC (permalink / raw)
  To: Ram Pai; +Cc: Andrew Morton, gandalf, linux-kernel

On Tuesday 06 January 2004 00:19, you wrote:
> Sorry I was on vacation and could not get back earlier.
>
> I do not exactly know the reason why sequential reads on blockdevices
> has regressed. One probable reason is that the same lazy-read
> optimization which helps large random reads is regressing the sequential
> read performance.
>
> Note: the patch, waits till the last page in the current window is being
> read, before triggering a new readahead. By the time the readahead
> request is satisfied, the next sequential read may already have been
> requested. Hence there is some loss of parallelism here. However given
> that largesize random reads is the most common case; this patch attacks
> that case.
>
> If you revert back just the lazy-read optimization, you might see no
> regression for sequential reads,

I have tried to revert it out:

--- mm/readahead.c.orig	2004-01-07 15:17:00.000000000 +0100
+++ mm/readahead.c.my	2004-01-07 15:33:13.000000000 +0100
@@ -480,7 +480,8 @@
 		 * If we read in earlier we run the risk of wasting
 		 * the ahead window.
 		 */
-		if (ra->ahead_start == 0 && offset == (ra->start + ra->size -1)) {
+		if (ra->ahead_start == 0) {
 			ra->ahead_start = ra->start + ra->size;
 			ra->ahead_size = ra->next_size;

but the sequential read performance is still the same!

Reverting the other part of the patch (the one that touches mm/filemap.c), the
sequential read performance comes back to what it was in 2.6.0.

I don't know why... but it does.

>
> Let me see if I can verify this,
> Ram Pai
>

Bye

-- 
	Paolo Ornati
	Linux v2.4.23



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-07 14:59                 ` Paolo Ornati
@ 2004-01-07 19:23                   ` Ram Pai
  2004-01-07 20:12                     ` Paolo Ornati
  0 siblings, 1 reply; 38+ messages in thread
From: Ram Pai @ 2004-01-07 19:23 UTC (permalink / raw)
  To: Paolo Ornati; +Cc: Andrew Morton, gandalf, linux-kernel

On Wed, 2004-01-07 at 06:59, Paolo Ornati wrote:
> On Tuesday 06 January 2004 00:19, you wrote:
> > Sorry I was on vacation and could not get back earlier.
> >
> > I do not exactly know the reason why sequential reads on blockdevices
> > has regressed. One probable reason is that the same lazy-read
> > optimization which helps large random reads is regressing the sequential
> > read performance.
> >
> > Note: the patch, waits till the last page in the current window is being
> > read, before triggering a new readahead. By the time the readahead
> > request is satisfied, the next sequential read may already have been
> > requested. Hence there is some loss of parallelism here. However given
> > that largesize random reads is the most common case; this patch attacks
> > that case.
> >
> > If you revert back just the lazy-read optimization, you might see no
> > regression for sequential reads,
> 
> I have tried to revert it out:
> 
> --- mm/readahead.c.orig	2004-01-07 15:17:00.000000000 +0100
> +++ mm/readahead.c.my	2004-01-07 15:33:13.000000000 +0100
> @@ -480,7 +480,8 @@
>  		 * If we read in earlier we run the risk of wasting
>  		 * the ahead window.
>  		 */
> -		if (ra->ahead_start == 0 && offset == (ra->start + ra->size -1)) {
> +		if (ra->ahead_start == 0) {
>  			ra->ahead_start = ra->start + ra->size;
>  			ra->ahead_size = ra->next_size;
> 
> but the sequential read performance is still the same !
> 
> Reverting out the other part of the patch (that touches mm/filemap.c) the
> sequential read performance comes back like in 2.6.0.

I tried on my lab machine with scsi disks. (I don't have access currently
to a spare machine with ide disks.)

I find that reverting the changes in mm/filemap.c and then reverting the
lazy-read optimization gives much better sequential read performance on
blockdevices.  Is this your observation on IDE disks too?


> 
> I don't know why... but it does.

Let's see. I think my theory is partly the reason. But the changes in
filemap.c seem to be influencing it more.


> 
> >
> > Let me see if I can verify this,
> > Ram Pai
> >
> 
> Bye


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-07 19:23                   ` Ram Pai
@ 2004-01-07 20:12                     ` Paolo Ornati
  2004-01-07 23:57                       ` Andrew Morton
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Ornati @ 2004-01-07 20:12 UTC (permalink / raw)
  To: Ram Pai; +Cc: Andrew Morton, gandalf, linux-kernel

On Wednesday 07 January 2004 20:23, Ram Pai wrote:
>
> I tried on my lab machine with scsi disks. (I dont have access currently
> to a spare machine with ide disks.)
>
> I find that reverting the changes in mm/filemap.c and then reverting the
> lazy-read optimization gives much better sequential read performance on
> blockdevices.  Is this your observation on IDE disks too?

Yes and no.
I have only tried to revert the lazy-read optimization (without any visible 
change), so I reapplied it AND THEN I reverted the changes in 
mm/filemap.c... and the performance came back.

>
> > I don't know why... but it does.
>
> Lets see. I think my theory is partly the reason. But the changes in
> filemap.c seems to be influencing more.

YES, I agree.
I haven't done a lot of tests but it seems to me that the changes in 
mm/filemap.c are the only things that influence the sequential read 
performance on my disk.

-- 
	Paolo Ornati
	Linux v2.4.23


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-07 20:12                     ` Paolo Ornati
@ 2004-01-07 23:57                       ` Andrew Morton
  2004-01-08  7:31                         ` Ram Pai
  2004-01-09  1:05                         ` Ram Pai
  0 siblings, 2 replies; 38+ messages in thread
From: Andrew Morton @ 2004-01-07 23:57 UTC (permalink / raw)
  To: Paolo Ornati; +Cc: linuxram, gandalf, linux-kernel

Paolo Ornati <ornati@lycos.it> wrote:
>
> I haven't done a lot of tests but it seems to me that the changes in 
> mm/filemap.c are the only things that influence the sequential read 
> performance on my disk.

The fact that this only happens when reading a blockdev (true?) is a big
hint.   Maybe it is because regular files implement ->readpages.

If the below patch makes read throughput worse on regular files too then
that would confirm the idea.

diff -puN mm/readahead.c~a mm/readahead.c
--- 25/mm/readahead.c~a	Wed Jan  7 15:56:32 2004
+++ 25-akpm/mm/readahead.c	Wed Jan  7 15:56:36 2004
@@ -103,11 +103,6 @@ static int read_pages(struct address_spa
 	struct pagevec lru_pvec;
 	int ret = 0;
 
-	if (mapping->a_ops->readpages) {
-		ret = mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
-		goto out;
-	}
-
 	pagevec_init(&lru_pvec, 0);
 	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
 		struct page *page = list_to_page(pages);

_


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-07 23:57                       ` Andrew Morton
@ 2004-01-08  7:31                         ` Ram Pai
  2004-01-09  1:05                         ` Ram Pai
  1 sibling, 0 replies; 38+ messages in thread
From: Ram Pai @ 2004-01-08  7:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Paolo Ornati, gandalf, linux-kernel

On Wed, 2004-01-07 at 15:57, Andrew Morton wrote:
> Paolo Ornati <ornati@lycos.it> wrote:
> >
> > I haven't done a lot of tests but it seems to me that the changes in 
> > mm/filemap.c are the only things that influence the sequential read 
> > performance on my disk.
> 
> The fact that this only happens when reading a blockdev (true?) is a big
> hint.   Maybe it is because regular files implement ->readpages.
> 
> If the below patch makes read throughput worse on regular files too then
> that would confirm the idea.

No, the throughput did not worsen with the patch for regular files (on a
scsi disk). Let's see what Paolo Ornati finds.

It's something to do with the changes in filemap.c,
RP


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-07 23:57                       ` Andrew Morton
  2004-01-08  7:31                         ` Ram Pai
@ 2004-01-09  1:05                         ` Ram Pai
  2004-01-09  1:17                           ` Andrew Morton
  1 sibling, 1 reply; 38+ messages in thread
From: Ram Pai @ 2004-01-09  1:05 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Paolo Ornati, gandalf, linux-kernel

Ok, I did some analysis and found that 'hdparm -t <device> '
generates reads which are of size 1M. This means 256 page requests are
generated by a single read.  

do_generic_mapping_read() gets the request to read 256 pages. But with
the latest change, this function calls do_pagecache_readahead() to keep 
256 pages ready in the cache. And after having done that,
do_generic_mapping_read() tries to access those 256 pages.
But by then some of the pages may have been replaced under low-pagecache
conditions. Hence we end up spending extra time reading those pages
into the page cache again.

I think the same problem must exist while reading files too. Paolo
Ornati used the cat command to read the file. cat generates just one page
request per read, and hence the problem did not show up. The problem must
show up if 'dd if=big_file of=/dev/null bs=1M count=256' is used.
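[A minimal user-space reader, not part of the thread's test scripts, can
make the read-size difference explicit; the file path and buffer-size
arguments below are illustrative.]
_____________________________________________________________________
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

/* Read a file sequentially with a caller-chosen buffer size and report
 * how many read() calls were issued: 4096 mimics a page-at-a-time
 * reader (as cat is described above), 1048576 mimics dd bs=1M, which
 * hands the kernel a 256-page request in one go. */
int main(int argc, char **argv)
{
	if (argc != 3) {
		fprintf(stderr, "usage: %s <file> <buffer-size>\n", argv[0]);
		return 1;
	}

	size_t bs = (size_t)atol(argv[2]);
	char *buf = malloc(bs);
	int fd = open(argv[1], O_RDONLY);
	long calls = 0;
	ssize_t n;

	if (fd < 0 || buf == NULL) {
		perror("setup");
		return 1;
	}

	while ((n = read(fd, buf, bs)) > 0)
		calls++;

	printf("%ld read() calls of up to %zu bytes each\n", calls, bs);
	free(buf);
	close(fd);
	return 0;
}
_____________________________________________________________________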

To conclude, I think the bug is with the changes to filemap.c 
If the changes are reverted the regression seen with blockdevices should
go away.

Well this is my theory, somebody should validate it,
RP





On Wed, 2004-01-07 at 15:57, Andrew Morton wrote:
> Paolo Ornati <ornati@lycos.it> wrote:
> >
> > I haven't done a lot of tests but it seems to me that the changes in 
> > mm/filemap.c are the only things that influence the sequential read 
> > performance on my disk.
> 
> The fact that this only happens when reading a blockdev (true?) is a big
> hint.   Maybe it is because regular files implement ->readpages.
> 
> If the below patch makes read throughput worse on regular files too then
> that would confirm the idea.
> 
> diff -puN mm/readahead.c~a mm/readahead.c
> --- 25/mm/readahead.c~a	Wed Jan  7 15:56:32 2004
> +++ 25-akpm/mm/readahead.c	Wed Jan  7 15:56:36 2004
> @@ -103,11 +103,6 @@ static int read_pages(struct address_spa
>  	struct pagevec lru_pvec;
>  	int ret = 0;
>  
> -	if (mapping->a_ops->readpages) {
> -		ret = mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
> -		goto out;
> -	}
> -
>  	pagevec_init(&lru_pvec, 0);
>  	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
>  		struct page *page = list_to_page(pages);
> 
> _
> 
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-09  1:05                         ` Ram Pai
@ 2004-01-09  1:17                           ` Andrew Morton
  2004-01-09 19:15                             ` Ram Pai
  0 siblings, 1 reply; 38+ messages in thread
From: Andrew Morton @ 2004-01-09  1:17 UTC (permalink / raw)
  To: Ram Pai; +Cc: ornati, gandalf, linux-kernel

Ram Pai <linuxram@us.ibm.com> wrote:
>
> Ok, I did some analysis and found that 'hdparm -t <device> '
> generates reads which are of size 1M. This means 256 page requests are
> generated by a single read.  
> 
> do_generic_mapping_read()  gets the request to read 256 pages. But with
> the latest change, this function calls do_pagecahce_readahead() to keep 
> 256 pages ready in cache. And after having done that
> do_generic_mapping_read() tries to access those 256 pages.
> But by then some of the pages may have been replaced under low pagecache
> conditions. Hence we end up spending extra time reading those pages
> again into the page cache.
> 
> I think the same problem must exist while reading files too. Paulo
> Ornati used cat command to read the file. cat just generates 1 page
> request per read and hence the problem did not show up. The problem must
> show up if 'dd if=big_file of=/dev/null bs=1M count=256' is used.
> 
> To conclude, I think the bug is with the changes to filemap.c 
> If the changes are reverted the regression seen with blockdevices should
> go away.
> 
> Well this is my theory, somebody should validate it,

One megabyte seems like far too little memory to be triggering the effect
which you describe.  But yes, the risk is certainly there.

You could verify this with:

--- 25/mm/filemap.c~a	Thu Jan  8 17:15:57 2004
+++ 25-akpm/mm/filemap.c	Thu Jan  8 17:16:06 2004
@@ -629,8 +629,10 @@ find_page:
 			handle_ra_miss(mapping, ra, index);
 			goto no_cached_page;
 		}
-		if (!PageUptodate(page))
+		if (!PageUptodate(page)) {
+			printk("eek!\n");
 			goto page_not_up_to_date;
+		}
 page_ok:
 		/* If users can be writing to this page using arbitrary
 		 * virtual addresses, take care about potential aliasing


But still, that up-front readahead loop is undesirable and yes, it would be
better if we could go back to the original design in there.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-09  1:17                           ` Andrew Morton
@ 2004-01-09 19:15                             ` Ram Pai
  2004-01-09 19:44                               ` Andrew Morton
  2004-01-10 14:48                               ` Paolo Ornati
  0 siblings, 2 replies; 38+ messages in thread
From: Ram Pai @ 2004-01-09 19:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: ornati, gandalf, linux-kernel

On Thu, 2004-01-08 at 17:17, Andrew Morton wrote:
> Ram Pai <linuxram@us.ibm.com> wrote:
 
> > 
> > Well this is my theory, somebody should validate it,
> 
> One megabyte seems like far too litte memory to be triggering the effect
> which you describe.  But yes, the risk is certainly there.
> 
> You could verify this with:
> 
 
I cannot exactly reproduce what Paolo Ornati is seeing.

Paolo: please validate the following,

1) see whether you see a regression with files replacing the 
   cat command in your script with
       dd if=big_file of=/dev/null bs=1M count=256

2) and if you do, check if you see a bunch of 'eek' with Andrew's 
        following patch. (NOTE: without reverting the changes
        in filemap.c)

--------------------------------------------------------------------------

--- 25/mm/filemap.c~a   Thu Jan  8 17:15:57 2004
+++ 25-akpm/mm/filemap.c        Thu Jan  8 17:16:06 2004
@@ -629,8 +629,10 @@ find_page:
                        handle_ra_miss(mapping, ra, index);
                        goto no_cached_page;
                }
-               if (!PageUptodate(page))
+               if (!PageUptodate(page)) {
+                       printk("eek!\n");
                        goto page_not_up_to_date;
+               }
 page_ok:
                /* If users can be writing to this page using arbitrary
                 * virtual addresses, take care about potential aliasing

---------------------------------------------------------------------------


Thanks,
RP




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-09 19:15                             ` Ram Pai
@ 2004-01-09 19:44                               ` Andrew Morton
  2004-01-10 14:48                               ` Paolo Ornati
  1 sibling, 0 replies; 38+ messages in thread
From: Andrew Morton @ 2004-01-09 19:44 UTC (permalink / raw)
  To: Ram Pai; +Cc: ornati, gandalf, linux-kernel

Ram Pai <linuxram@us.ibm.com> wrote:
>
> 1) see whether you see a regression with files replacing the 
>     cat command in your script with
>         dd if=big_file of=/dev/null bs=1M count=256

You'll need to unmount and remount the fs in between to remove the file
from pagecache.  Or use fadvise() to remove the pagecache.  There's a
little tool which does that in 

http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz
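[For reference, a minimal stand-in for that fadvise helper can be built
around posix_fadvise(); this sketch only implements the "dontneed" case
used later in the thread and is not the actual ext3-tools source.]
_____________________________________________________________________
#define _XOPEN_SOURCE 600
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

/* Usage: ./fadvise <file> <offset> <length> dontneed
 * Asks the kernel to drop cached pages for the given byte range. */
int main(int argc, char **argv)
{
	if (argc != 5 || strcmp(argv[4], "dontneed") != 0) {
		fprintf(stderr, "usage: %s <file> <offset> <len> dontneed\n",
			argv[0]);
		return 1;
	}

	int fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror(argv[1]);
		return 1;
	}

	int err = posix_fadvise(fd, (off_t)atoll(argv[2]),
				(off_t)atoll(argv[3]), POSIX_FADV_DONTNEED);
	if (err != 0) {
		fprintf(stderr, "posix_fadvise: %s\n", strerror(err));
		close(fd);
		return 1;
	}

	close(fd);
	return 0;
}
_____________________________________________________________________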


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-09 19:15                             ` Ram Pai
  2004-01-09 19:44                               ` Andrew Morton
@ 2004-01-10 14:48                               ` Paolo Ornati
  2004-01-10 16:00                                 ` Ed Sweetman
  1 sibling, 1 reply; 38+ messages in thread
From: Paolo Ornati @ 2004-01-10 14:48 UTC (permalink / raw)
  To: Ram Pai, Andrew Morton; +Cc: gandalf, linux-kernel

On Friday 09 January 2004 20:15, Ram Pai wrote:
> On Thu, 2004-01-08 at 17:17, Andrew Morton wrote:
> > Ram Pai <linuxram@us.ibm.com> wrote:
> > > Well this is my theory, somebody should validate it,
> >
> > One megabyte seems like far too litte memory to be triggering the
> > effect which you describe.  But yes, the risk is certainly there.
> >
> > You could verify this with:
>
> I cannot exactly reproduce what Pualo Ornati is seeing.
>
> Pualo: Request you to validate the following,
>
> 1) see whether you see a regression with files replacing the
>    cat command in your script with
>        dd if=big_file of=/dev/null bs=1M count=256
>
> 2) and if you do, check if you see a bunch of 'eek' with Andrew's
>         following patch. (NOTE: without reverting the changes
>         in filemap.c)
>
> -------------------------------------------------------------------------
>-
>
> --- 25/mm/filemap.c~a   Thu Jan  8 17:15:57 2004
> +++ 25-akpm/mm/filemap.c        Thu Jan  8 17:16:06 2004
> @@ -629,8 +629,10 @@ find_page:
>                         handle_ra_miss(mapping, ra, index);
>                         goto no_cached_page;
>                 }
> -               if (!PageUptodate(page))
> +               if (!PageUptodate(page)) {
> +                       printk("eek!\n");
>                         goto page_not_up_to_date;
> +               }
>  page_ok:
>                 /* If users can be writing to this page using arbitrary
>                  * virtual addresses, take care about potential aliasing
>
> -------------------------------------------------------------------------

Ok, this patch seems to be for the -mm tree... I have applied it by hand (on a 
vanilla 2.6.1-rc1).

For my tests I've used this script:

#!/bin/sh

RA_VALS="256 128 64"
FILE="/big_file"
SIZE=`stat -c '%s' $FILE`
NR_TESTS="3"
LINUX=`uname -r`

echo "HD test for Penguin $LINUX"

killall5
sync
sleep 3

for ra in $RA_VALS; do
    hdparm -a $ra /dev/hda
    for i in `seq $NR_TESTS`; do
    echo "_ _ _ _ _ _ _ _ _"
	./fadvise $FILE 0 $SIZE dontneed
	time dd if=$FILE of=/dev/null bs=1M count=256
    done
    echo "________________________________"
done


RESULTS (2.6.0 / 2.6.1-rc1)

HD test for Penguin 2.6.0

/dev/hda:
 setting fs readahead to 256
 readahead    = 256 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.427s
user	0m0.002s
sys	0m1.722s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.963s
user	0m0.000s
sys	0m1.760s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.291s
user	0m0.001s
sys	0m1.713s
________________________________

/dev/hda:
 setting fs readahead to 128
 readahead    = 128 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m9.910s
user	0m0.003s
sys	0m1.882s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m9.693s
user	0m0.003s
sys	0m1.860s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m9.733s
user	0m0.004s
sys	0m1.922s
________________________________

/dev/hda:
 setting fs readahead to 64
 readahead    = 64 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m9.107s
user	0m0.000s
sys	0m2.026s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m9.227s
user	0m0.004s
sys	0m1.984s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m9.152s
user	0m0.002s
sys	0m2.013s
________________________________


HD test for Penguin 2.6.1-rc1

/dev/hda:
 setting fs readahead to 256
 readahead    = 256 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.984s
user	0m0.002s
sys	0m1.751s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.704s
user	0m0.002s
sys	0m1.766s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.886s
user	0m0.002s
sys	0m1.731s
________________________________

/dev/hda:
 setting fs readahead to 128
 readahead    = 128 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.120s
user	0m0.001s
sys	0m1.830s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.596s
user	0m0.005s
sys	0m1.764s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.481s
user	0m0.002s
sys	0m1.727s
________________________________

/dev/hda:
 setting fs readahead to 64
 readahead    = 64 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.361s
user	0m0.006s
sys	0m1.782s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.655s
user	0m0.002s
sys	0m1.778s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real	0m11.369s
user	0m0.004s
sys	0m1.798s
________________________________


As you can see, 2.6.0 performance increases when setting readahead from 256 to 
64 (64 seems to be the best value), while 2.6.1-rc1 performance doesn't change 
much.

I noticed that on 2.6.0 with readahead set to 256 the HD LED blinks 
during the data transfer, while with lower values (128 / 64) it stays on.
On 2.6.1-rc1, instead, the HD LED blinks with almost any value (I must set it 
to 8 to see it stay on).

ANSWERS:

1) YES... I see a regression with files ;-(

2) YES, I also see a bunch of "eek!" (a mountain of "eek!")

Bye

-- 
	Paolo Ornati
	Linux v2.4.24


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-10 14:48                               ` Paolo Ornati
@ 2004-01-10 16:00                                 ` Ed Sweetman
  2004-01-10 16:19                                   ` Ed Sweetman
  2004-01-10 17:29                                   ` Paolo Ornati
  0 siblings, 2 replies; 38+ messages in thread
From: Ed Sweetman @ 2004-01-10 16:00 UTC (permalink / raw)
  To: Paolo Ornati; +Cc: Ram Pai, Andrew Morton, gandalf, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 6262 bytes --]

Paolo Ornati wrote:
> On Friday 09 January 2004 20:15, Ram Pai wrote:
> 
>>On Thu, 2004-01-08 at 17:17, Andrew Morton wrote:
>>
>>>Ram Pai <linuxram@us.ibm.com> wrote:
>>>
>>>>Well this is my theory, somebody should validate it,
>>>
>>>One megabyte seems like far too litte memory to be triggering the
>>>effect which you describe.  But yes, the risk is certainly there.
>>>
>>>You could verify this with:
>>
>>I cannot exactly reproduce what Pualo Ornati is seeing.
>>
>>Pualo: Request you to validate the following,
>>
>>1) see whether you see a regression with files replacing the
>>   cat command in your script with
>>       dd if=big_file of=/dev/null bs=1M count=256
>>
>>2) and if you do, check if you see a bunch of 'eek' with Andrew's
>>        following patch. (NOTE: without reverting the changes
>>        in filemap.c)
>>
>>-------------------------------------------------------------------------
>>-
>>
>>--- 25/mm/filemap.c~a   Thu Jan  8 17:15:57 2004
>>+++ 25-akpm/mm/filemap.c        Thu Jan  8 17:16:06 2004
>>@@ -629,8 +629,10 @@ find_page:
>>                        handle_ra_miss(mapping, ra, index);
>>                        goto no_cached_page;
>>                }
>>-               if (!PageUptodate(page))
>>+               if (!PageUptodate(page)) {
>>+                       printk("eek!\n");
>>                        goto page_not_up_to_date;
>>+               }
>> page_ok:
>>                /* If users can be writing to this page using arbitrary
>>                 * virtual addresses, take care about potential aliasing
>>
>>-------------------------------------------------------------------------
> 
> 
> Ok, this patch seems for -mm tree... I have applied it by hand (on a vanilla 
> 2.6.1-rc1).
> 
> For my tests I've used this script:
> 
> #!/bin/sh
> 
> RA_VALS="256 128 64"
> FILE="/big_file"
> SIZE=`stat -c '%s' $FILE`
> NR_TESTS="3"
> LINUX=`uname -r`
> 
> echo "HD test for Penguin $LINUX"
> 
> killall5
> sync
> sleep 3
> 
> for ra in $RA_VALS; do
>     hdparm -a $ra /dev/hda
>     for i in `seq $NR_TESTS`; do
>     echo "_ _ _ _ _ _ _ _ _"
> 	./fadvise $FILE 0 $SIZE dontneed
> 	time dd if=$FILE of=/dev/null bs=1M count=256
>     done
>     echo "________________________________"
> done
> 
> 
> RESULTS (2.6.0 / 2.6.1-rc1)
> 
> HD test for Penguin 2.6.0
> 
> /dev/hda:
>  setting fs readahead to 256
>  readahead    = 256 (on)
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.427s
> user	0m0.002s
> sys	0m1.722s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.963s
> user	0m0.000s
> sys	0m1.760s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.291s
> user	0m0.001s
> sys	0m1.713s
> ________________________________
> 
> /dev/hda:
>  setting fs readahead to 128
>  readahead    = 128 (on)
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m9.910s
> user	0m0.003s
> sys	0m1.882s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m9.693s
> user	0m0.003s
> sys	0m1.860s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m9.733s
> user	0m0.004s
> sys	0m1.922s
> ________________________________
> 
> /dev/hda:
>  setting fs readahead to 64
>  readahead    = 64 (on)
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m9.107s
> user	0m0.000s
> sys	0m2.026s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m9.227s
> user	0m0.004s
> sys	0m1.984s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m9.152s
> user	0m0.002s
> sys	0m2.013s
> ________________________________
> 
> 
> HD test for Penguin 2.6.1-rc1
> 
> /dev/hda:
>  setting fs readahead to 256
>  readahead    = 256 (on)
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.984s
> user	0m0.002s
> sys	0m1.751s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.704s
> user	0m0.002s
> sys	0m1.766s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.886s
> user	0m0.002s
> sys	0m1.731s
> ________________________________
> 
> /dev/hda:
>  setting fs readahead to 128
>  readahead    = 128 (on)
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.120s
> user	0m0.001s
> sys	0m1.830s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.596s
> user	0m0.005s
> sys	0m1.764s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.481s
> user	0m0.002s
> sys	0m1.727s
> ________________________________
> 
> /dev/hda:
>  setting fs readahead to 64
>  readahead    = 64 (on)
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.361s
> user	0m0.006s
> sys	0m1.782s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.655s
> user	0m0.002s
> sys	0m1.778s
> _ _ _ _ _ _ _ _ _
> 256+0 records in
> 256+0 records out
> 
> real	0m11.369s
> user	0m0.004s
> sys	0m1.798s
> ________________________________
> 
> 
> As you can see 2.6.0 performances increase setting readahead from 256 to 64 
> (64 seems to be the best value) while 2.6.1-rc1 performances don't change 
> too much.
> 
> I noticed that on 2.6.0 with readahead setted at 256 the HD LED blinks 
> during the data transfer while with lower values (128 / 64) it stays on.
> Instead on 2.6.1-rc1 HD LED blinks with almost any values (I must set it at 
> 8 to see it stable on).
> 
> ANSWERS:
> 
> 1) YES... I see a regression with files ;-(
> 
> 2) YES, I see also a bunch of "eek!" (a mountain of "eek!")
> 
> Bye
> 


I'm using 2.6.0-mm1 and I see no difference from setting readahead to 
anything on my extent-enabled partitions. So it appears that the filesystem 
plays a big part in your numbers here, not just hdd attributes or settings.

The partition FILE is on is an ext3 partition with extents enabled. Despite 
not having fadvise (what is this anyway?) the numbers are all real and 
no error occurred. Extents totally rock for this type of data access, as 
you can see below.

Stick to non-fs tests if you want to benchmark fs-independent code. Not 
everyone is going to be able to come up with the same results as you, and 
as such a possible fix could actually be detrimental, and we'd be stuck 
in a loop of "ide regression" mails.








[-- Attachment #2: output --]
[-- Type: text/plain, Size: 3070 bytes --]

HD test for Penguin 2.6.0-mm1-extents

/dev/hda:
 setting fs readahead to 8192
 readahead    = 8192 (on)
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.098793 seconds (244300323 bytes/sec)

real	0m1.100s
user	0m0.005s
sys	0m1.096s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.102250 seconds (243534086 bytes/sec)

real	0m1.104s
user	0m0.000s
sys	0m1.104s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.096914 seconds (244718759 bytes/sec)

real	0m1.098s
user	0m0.001s
sys	0m1.097s
________________________________

/dev/hda:
 setting fs readahead to 256
 readahead    = 256 (on)
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.104646 seconds (243005877 bytes/sec)

real	0m1.106s
user	0m0.001s
sys	0m1.105s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.100904 seconds (243831834 bytes/sec)

real	0m1.102s
user	0m0.000s
sys	0m1.103s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.102060 seconds (243576076 bytes/sec)

real	0m1.104s
user	0m0.002s
sys	0m1.101s
________________________________

/dev/hda:
 setting fs readahead to 128
 readahead    = 128 (on)
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.100799 seconds (243855121 bytes/sec)

real	0m1.102s
user	0m0.000s
sys	0m1.102s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.101516 seconds (243696385 bytes/sec)

real	0m1.103s
user	0m0.002s
sys	0m1.101s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.100963 seconds (243818758 bytes/sec)

real	0m1.102s
user	0m0.000s
sys	0m1.103s
________________________________

/dev/hda:
 setting fs readahead to 64
 readahead    = 64 (on)
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.104634 seconds (243008498 bytes/sec)

real	0m1.106s
user	0m0.002s
sys	0m1.105s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.102107 seconds (243565703 bytes/sec)

real	0m1.104s
user	0m0.003s
sys	0m1.100s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.104429 seconds (243053595 bytes/sec)

real	0m1.106s
user	0m0.000s
sys	0m1.106s
________________________________

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-10 16:00                                 ` Ed Sweetman
@ 2004-01-10 16:19                                   ` Ed Sweetman
  2004-01-10 17:29                                     ` Paolo Ornati
  2004-01-10 17:29                                   ` Paolo Ornati
  1 sibling, 1 reply; 38+ messages in thread
From: Ed Sweetman @ 2004-01-10 16:19 UTC (permalink / raw)
  To: Paolo Ornati; +Cc: Ram Pai, Andrew Morton, gandalf, linux-kernel

Ed Sweetman wrote:
> Paolo Ornati wrote:
> 
>> On Friday 09 January 2004 20:15, Ram Pai wrote:
>>
>>> On Thu, 2004-01-08 at 17:17, Andrew Morton wrote:
>>>
>>>> Ram Pai <linuxram@us.ibm.com> wrote:
>>>>
>>>>> Well this is my theory, somebody should validate it,
>>>>
>>>>
>>>> One megabyte seems like far too litte memory to be triggering the
>>>> effect which you describe.  But yes, the risk is certainly there.
>>>>
>>>> You could verify this with:
>>>
>>>
>>> I cannot exactly reproduce what Pualo Ornati is seeing.
>>>
>>> Pualo: Request you to validate the following,
>>>
>>> 1) see whether you see a regression with files replacing the
>>>   cat command in your script with
>>>       dd if=big_file of=/dev/null bs=1M count=256
>>>
>>> 2) and if you do, check if you see a bunch of 'eek' with Andrew's
>>>        following patch. (NOTE: without reverting the changes
>>>        in filemap.c)
>>>
>>> ------------------------------------------------------------------------- 
>>>
>>> -
>>>
>>> --- 25/mm/filemap.c~a   Thu Jan  8 17:15:57 2004
>>> +++ 25-akpm/mm/filemap.c        Thu Jan  8 17:16:06 2004
>>> @@ -629,8 +629,10 @@ find_page:
>>>                        handle_ra_miss(mapping, ra, index);
>>>                        goto no_cached_page;
>>>                }
>>> -               if (!PageUptodate(page))
>>> +               if (!PageUptodate(page)) {
>>> +                       printk("eek!\n");
>>>                        goto page_not_up_to_date;
>>> +               }
>>> page_ok:
>>>                /* If users can be writing to this page using arbitrary
>>>                 * virtual addresses, take care about potential aliasing
>>>
>>> ------------------------------------------------------------------------- 
>>>
>>
>>
>>
>> Ok, this patch seems for -mm tree... I have applied it by hand (on a 
>> vanilla 2.6.1-rc1).
>>
>> For my tests I've used this script:
>>
>> #!/bin/sh
>>
>> RA_VALS="256 128 64"
>> FILE="/big_file"
>> SIZE=`stat -c '%s' $FILE`
>> NR_TESTS="3"
>> LINUX=`uname -r`
>>
>> echo "HD test for Penguin $LINUX"
>>
>> killall5
>> sync
>> sleep 3
>>
>> for ra in $RA_VALS; do
>>     hdparm -a $ra /dev/hda
>>     for i in `seq $NR_TESTS`; do
>>     echo "_ _ _ _ _ _ _ _ _"
>>     ./fadvise $FILE 0 $SIZE dontneed
>>     time dd if=$FILE of=/dev/null bs=1M count=256
>>     done
>>     echo "________________________________"
>> done
>>
>>
>> RESULTS (2.6.0 / 2.6.1-rc1)
>>
>> HD test for Penguin 2.6.0
>>
>> /dev/hda:
>>  setting fs readahead to 256
>>  readahead    = 256 (on)
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.427s
>> user    0m0.002s
>> sys    0m1.722s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.963s
>> user    0m0.000s
>> sys    0m1.760s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.291s
>> user    0m0.001s
>> sys    0m1.713s
>> ________________________________
>>
>> /dev/hda:
>>  setting fs readahead to 128
>>  readahead    = 128 (on)
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m9.910s
>> user    0m0.003s
>> sys    0m1.882s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m9.693s
>> user    0m0.003s
>> sys    0m1.860s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m9.733s
>> user    0m0.004s
>> sys    0m1.922s
>> ________________________________
>>
>> /dev/hda:
>>  setting fs readahead to 64
>>  readahead    = 64 (on)
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m9.107s
>> user    0m0.000s
>> sys    0m2.026s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m9.227s
>> user    0m0.004s
>> sys    0m1.984s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m9.152s
>> user    0m0.002s
>> sys    0m2.013s
>> ________________________________
>>
>>
>> HD test for Penguin 2.6.1-rc1
>>
>> /dev/hda:
>>  setting fs readahead to 256
>>  readahead    = 256 (on)
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.984s
>> user    0m0.002s
>> sys    0m1.751s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.704s
>> user    0m0.002s
>> sys    0m1.766s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.886s
>> user    0m0.002s
>> sys    0m1.731s
>> ________________________________
>>
>> /dev/hda:
>>  setting fs readahead to 128
>>  readahead    = 128 (on)
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.120s
>> user    0m0.001s
>> sys    0m1.830s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.596s
>> user    0m0.005s
>> sys    0m1.764s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.481s
>> user    0m0.002s
>> sys    0m1.727s
>> ________________________________
>>
>> /dev/hda:
>>  setting fs readahead to 64
>>  readahead    = 64 (on)
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.361s
>> user    0m0.006s
>> sys    0m1.782s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.655s
>> user    0m0.002s
>> sys    0m1.778s
>> _ _ _ _ _ _ _ _ _
>> 256+0 records in
>> 256+0 records out
>>
>> real    0m11.369s
>> user    0m0.004s
>> sys    0m1.798s
>> ________________________________
>>
>>
>> As you can see 2.6.0 performances increase setting readahead from 256 
>> to 64 (64 seems to be the best value) while 2.6.1-rc1 performances 
>> don't change too much.
>>
>> I noticed that on 2.6.0 with readahead setted at 256 the HD LED blinks 
>> during the data transfer while with lower values (128 / 64) it stays on.
>> Instead on 2.6.1-rc1 HD LED blinks with almost any values (I must set 
>> it at 8 to see it stable on).
>>
>> ANSWERS:
>>
>> 1) YES... I see a regression with files ;-(
>>
>> 2) YES, I see also a bunch of "eek!" (a mountain of "eek!")
>>
>> Bye
>>
> 
> 
> I'm using 2.6.0-mm1 and i see no difference from setting readahead to 
> anything on my extent enabled partitions. So it appears that filesystem 
> plays a big part in your numbers here, not just hdd attributes or settings.
> 
> The partition FILE is on is an ext3 + extents enabled partition. Despite 
> not having fadvise (what is this anyway?) the numbers are all real and 
> no error occured. Extents totally rocks for this type of data access, as 
> you can see below.
> 
> Stick to non-fs tests if you want to benchmark fs independent code. Not 
> everyone is going to be able to come up with the same results as you and 
> as such a possible fix could actually be detrimental, and we'd be stuck 
> in a loop of "ide regression" mails.
> 

Debian unstable's dd may also be detecting that it's writing to /dev/null 
and just not doing anything. I know extents are fast and make certain 
manipulations much faster than plain ext3, but 256MB/sec is 
really too fast. So in either case this test does not look 
usable to me.


I don't know why you don't also try 8192 for readahead; measuring 
performance by the duration or intensity of the hdd's LED is not very 
sound. I actually copy large files to and from parts of the same ext3 
partition at over 20MB/sec sustained, and hdparm shows its highest numbers 
under it. For me it doesn't get any faster than that.  So what does this 
all say? Maybe all these performance numbers are just as much based on 
your readahead value as they are on the position of the moon and the 
rest of the system and its hardware. Btw, what is the value of your HZ 
environment variable? Debian still sets it to 100, I set it to 1024, not 
really sure if it made any difference.

I'm using the VIA IDE driver and so are you; I'm not seeing the type of 
regression that you are, my dd doesn't do what your dd does, and our hdds 
are different.  The regression in the kernels could just as easily be 
due to a regression in the scheduler and have nothing to do with the ide 
drivers.  Have you tried taking 2.6.0 (whichever version you see 
changes with your readahead values on), adding only the new 
ide code from the kernel where you don't see any changes, so that 
everything else stays the same but only ide has been "upgraded", and 
checking whether you see the same regression?  I don't think you will. The 
readahead affects how often you have to ask the hdd to read from the 
platter, and waiting on io can possibly affect how your kernel schedules 
it. Faster drives would thus not be affected the same way, which could 
explain why none of the conclusions and results you've found match 
my system.

Or I could be completely wrong and something could be going bad with the 
ide drivers.  I just don't see how that could be the case without me having 
the same performance regression you have, when we both use the same ide 
driver (just slightly different chipsets).



> 
> 
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> HD test for Penguin 2.6.0-mm1-extents
> 
> /dev/hda:
>  setting fs readahead to 8192
>  readahead    = 8192 (on)
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.098793 seconds (244300323 bytes/sec)
> 
> real	0m1.100s
> user	0m0.005s
> sys	0m1.096s
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.102250 seconds (243534086 bytes/sec)
> 
> real	0m1.104s
> user	0m0.000s
> sys	0m1.104s
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.096914 seconds (244718759 bytes/sec)
> 
> real	0m1.098s
> user	0m0.001s
> sys	0m1.097s
> ________________________________
> 
> /dev/hda:
>  setting fs readahead to 256
>  readahead    = 256 (on)
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.104646 seconds (243005877 bytes/sec)
> 
> real	0m1.106s
> user	0m0.001s
> sys	0m1.105s
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.100904 seconds (243831834 bytes/sec)
> 
> real	0m1.102s
> user	0m0.000s
> sys	0m1.103s
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.102060 seconds (243576076 bytes/sec)
> 
> real	0m1.104s
> user	0m0.002s
> sys	0m1.101s
> ________________________________
> 
> /dev/hda:
>  setting fs readahead to 128
>  readahead    = 128 (on)
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.100799 seconds (243855121 bytes/sec)
> 
> real	0m1.102s
> user	0m0.000s
> sys	0m1.102s
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.101516 seconds (243696385 bytes/sec)
> 
> real	0m1.103s
> user	0m0.002s
> sys	0m1.101s
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.100963 seconds (243818758 bytes/sec)
> 
> real	0m1.102s
> user	0m0.000s
> sys	0m1.103s
> ________________________________
> 
> /dev/hda:
>  setting fs readahead to 64
>  readahead    = 64 (on)
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.104634 seconds (243008498 bytes/sec)
> 
> real	0m1.106s
> user	0m0.002s
> sys	0m1.105s
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.102107 seconds (243565703 bytes/sec)
> 
> real	0m1.104s
> user	0m0.003s
> sys	0m1.100s
> _ _ _ _ _ _ _ _ _
> /tester: line 18: ./fadvise: No such file or directory
> 256+0 records in
> 256+0 records out
> 268435456 bytes transferred in 1.104429 seconds (243053595 bytes/sec)
> 
> real	0m1.106s
> user	0m0.000s
> sys	0m1.106s
> ________________________________



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-10 16:00                                 ` Ed Sweetman
  2004-01-10 16:19                                   ` Ed Sweetman
@ 2004-01-10 17:29                                   ` Paolo Ornati
  1 sibling, 0 replies; 38+ messages in thread
From: Paolo Ornati @ 2004-01-10 17:29 UTC (permalink / raw)
  To: Ed Sweetman; +Cc: Ram Pai, Andrew Morton, gandalf, linux-kernel

On Saturday 10 January 2004 17:00, Ed Sweetman wrote:
>
> I'm using 2.6.0-mm1 and i see no difference from setting readahead to
> anything on my extent enabled partitions. So it appears that filesystem
> plays a big part in your numbers here, not just hdd attributes or
> settings.
>
> The partition FILE is on is an ext3 + extents enabled partition. Despite
> not having fadvise (what is this anyway?) the numbers are all real and
> no error occured. Extents totally rocks for this type of data access, as
> you can see below.
>
> Stick to non-fs tests if you want to benchmark fs independent code. Not
> everyone is going to be able to come up with the same results as you and
> as such a possible fix could actually be detrimental, and we'd be stuck
> in a loop of "ide regression" mails.

To run my script correctly you _MUST_ have the "fadvise" tool (my script 
assumes it is installed in the current directory).

This is what Andrew said:
_____________________________________________________________________
You'll need to unmount and remount the fs in between to remove the file
from pagecache.  Or use fadvise() to remove the pagecache.  There's a
little tool which does that in 

http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz
_____________________________________________________________________

so "fadvise" is a simple tool that calls "fadvise64" system call.
This system call lets you do some useful things: for example you can discard 
all the cached pages for a file, that is what my command does.
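
For illustration, here is a minimal stand-alone equivalent using the glibc 
posix_fadvise() wrapper (this is not the actual tool from Andrew's 
ext3-tools tarball, just a sketch of the idea):

#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	int fd, rc;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}
	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	/* ask the kernel to drop the cached pages for the whole file */
	rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
	if (rc)
		fprintf(stderr, "posix_fadvise: %s\n", strerror(rc));
	close(fd);
	return rc ? 1 : 0;
}

The point is simply to drop the file from the page cache before each timed 
read (or, as Andrew said, unmount and remount the filesystem), so that the 
data really comes from the disk and not from RAM.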

-- 
	Paolo Ornati
	Linux v2.4.24



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-10 16:19                                   ` Ed Sweetman
@ 2004-01-10 17:29                                     ` Paolo Ornati
  0 siblings, 0 replies; 38+ messages in thread
From: Paolo Ornati @ 2004-01-10 17:29 UTC (permalink / raw)
  To: Ed Sweetman; +Cc: Ram Pai, Andrew Morton, gandalf, linux-kernel

On Saturday 10 January 2004 17:19, Ed Sweetman wrote:
>
> debian unstable's dd may also be seeing that it's writing to /dev/null
> and just not doing anything. I know extents are fast and make certain
> manipulations on them extremely faster than plain ext3 but 256MB/sec is
> really really too fast. So in either case it looks like this test is not
> usable to me.

Yes... 256 MB/s is a bit too high!
Can you try with "fadvise" installed?
Anyway, I think your theory is right... so installing "fadvise" you will 
NOT see any big difference.

>
>
> I dont know why you dont also try 8192 for readahead, measuring

Because readahead set to 8192 gives me BAD performance!

> performance by the duration or intensity of the hdd is led is not very
> sound. i actually copy large files to and from parts of the same ext3
> partition at over 20MB/sec sustained hdparm shows it's highest numbers
> under it. For me it doesn't get any faster than that.  So what's this
> all say, maybe all these performance numbers are just as much based on
> your readahead value as they are on the position of the moon and the
> rest of the system and it's hardware. btw, what is the value of your HZ
> environment variable, debian still sets it to 100, i set it to 1024, not
> really sure if it made any difference.
>
> i'm using the via ide driver, so are you, i'm not seeing the type of
> regression that you are, my dd doesn't do what your dd does. our hdds
> are different.  The regression in the kernels could just as easily be
> due to a regression in the schedular and nothing to do with the ide
> drivers.  Have you tried just using 2.6.0 (whatever version you see
> changes with your readahead values) then the same kernel with the new
> ide code from the kernel you dont see any changes so you're running
> everything else the same but only ide has been "upgraded" and see if you
> see the same regression.  I dont think you will. the readahead effects

Yes, the correct way to proceed is as you say....
BUT read the whole story:

1) using "hdparm -t /dev/hda" I found an IDE performance regression (in 
sequential reads) upgrading from 2.6.0 to 2.6.1-rc1

2) someone told me to try reverting this patch:
"readahead: multiple performance fixes"

Reverting it in the 2.6.1-rc1 kernel gives me the same IDE performance that 
2.6.0 has.

3) Since the 2.6.0 and 2.6.1-rc1 (with "readahead: multiple performance 
fixes" reverted) kernels give me the same results for any IDE performance 
test I do --> I treat them as if they were the same thing ;-)


The part of the patch that gives me all these problems has already been 
found, and it is quite small:

diff -Nru a/mm/filemap.c b/mm/filemap.c

--- a/mm/filemap.c	Sat Jan  3 02:29:08 2004
+++ b/mm/filemap.c	Sat Jan  3 02:29:08 2004
@@ -587,13 +587,22 @@
 			     read_actor_t actor)
 {
 	struct inode *inode = mapping->host;
-	unsigned long index, offset;
+	unsigned long index, offset, last;
 	struct page *cached_page;
 	int error;
 
 	cached_page = NULL;
 	index = *ppos >> PAGE_CACHE_SHIFT;
 	offset = *ppos & ~PAGE_CACHE_MASK;
+	last = (*ppos + desc->count) >> PAGE_CACHE_SHIFT;
+
+	/*
+	 * Let the readahead logic know upfront about all
+	 * the pages we'll need to satisfy this request
+	 */
+	for (; index < last; index++)
+		page_cache_readahead(mapping, ra, filp, index);
+	index = *ppos >> PAGE_CACHE_SHIFT;
 
 	for (;;) {
 		struct page *page;
@@ -612,7 +621,6 @@
 		}
 
 		cond_resched();
-		page_cache_readahead(mapping, ra, filp, index);
 
 		nr = nr - offset;
 find_page:


-- 
	Paolo Ornati
	Linux v2.4.24



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Strange IDE performance change in 2.6.1-rc1 (again)
  2004-01-04 14:30             ` Paolo Ornati
  2004-01-05 23:19               ` Ram Pai
@ 2004-03-29 15:45               ` Ram Pai
  1 sibling, 0 replies; 38+ messages in thread
From: Ram Pai @ 2004-03-29 15:45 UTC (permalink / raw)
  To: Administrator; +Cc: Andrew Morton, gandalf, linux-kernel

Sorry I was on vacation and could not get back earlier.

I do not know exactly why sequential reads on block devices have regressed.
One probable reason is that the same lazy-read optimization which helps
large random reads is hurting sequential read performance.

Note: the patch waits until the last page in the current window is being
read before triggering a new readahead.  By the time that readahead
request is satisfied, the next sequential read may already have been
requested, so there is some loss of parallelism here.  However, given that
large random reads are the most common case, this patch attacks that case.

If you revert just the lazy-read optimization, you might see no
regression for sequential reads.
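
Roughly, the difference in trigger timing looks like this (a toy user-space
model; the names below are made up for illustration and are not the real
structures or functions in mm/readahead.c):

#include <stdio.h>

struct ra_window {
	unsigned long start;	/* first page of the current window */
	unsigned long size;	/* window size, in pages */
};

static void submit_readahead(const char *who, unsigned long start,
			     unsigned long size)
{
	printf("%s: queue readahead for pages %lu..%lu\n",
	       who, start, start + size - 1);
}

/* old behaviour: queue the next window as soon as we enter the current
 * one, so its I/O overlaps with consuming the current window */
static void trigger_eager(struct ra_window *ra, unsigned long index)
{
	if (index == ra->start) {
		submit_readahead("eager", ra->start + ra->size, ra->size);
		ra->start += ra->size;
	}
}

/* lazy behaviour: wait for the *last* page of the current window, so the
 * next readahead is only started once the window is nearly consumed */
static void trigger_lazy(struct ra_window *ra, unsigned long index)
{
	if (index == ra->start + ra->size - 1) {
		submit_readahead("lazy", ra->start + ra->size, ra->size);
		ra->start += ra->size;
	}
}

int main(void)
{
	struct ra_window eager = { 0, 8 };
	struct ra_window lazy  = { 0, 8 };
	unsigned long i;

	for (i = 0; i < 24; i++) {	/* simulate a sequential reader */
		trigger_eager(&eager, i);
		trigger_lazy(&lazy, i);
	}
	return 0;
}

For a strictly sequential reader the lazy trigger leaves less time for the
next window's I/O to complete before the reader needs it, which could
explain the small sequential-read regression reported here.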

Let me see if I can verify this,
Ram Pai


On Sun, 2004-01-04 at 06:30, Paolo Ornati wrote:
> On Saturday 03 January 2004 23:40, Andrew Morton wrote:
> > Paolo Ornati <ornati@lycos.it> wrote:
> > > I know these are only performance in sequential data reads... and real
> > > life is another thing... but I think the author of the patch should be
> > > informed (Ram Pai).
> >
> > There does seem to be something whacky going on with readahead against
> > blockdevices.  Perhaps it is related to the soft blocksize.  I've never
> > been able to reproduce any of this.
> >
> > Be aware that buffered reads for blockdevs are treated fairly differently
> > from buffered reads for regular files: they only use lowmem and we always
> > attach buffer_heads and perform I/O against them.
> >
> > No effort was made to optimise buffered blockdev reads because it is not
> > very important and my main interest was in data coherency and filesystem
> > metadata consistency.
> >
> > If you observe the same things reading from regular files then that is
> > more important.
> 
> I have done some tests with this stupid script and it seems that you are 
> right:
> _____________________________________________________________________
> #!/bin/sh
> 
> DEV=/dev/hda7
> MOUNT_DIR=mnt
> BIG_FILE=$MOUNT_DIR/big_file
> 
> mount $DEV $MOUNT_DIR
> if [ ! -f $BIG_FILE ]; then
>     echo "[DD] $BIG_FILE"
>     dd if=/dev/zero of=$BIG_FILE bs=1M count=1024
>     umount $MOUNT_DIR
>     mount $DEV $MOUNT_DIR
> fi
> 
> killall5
> sleep 2
> sync
> sleep 2
> 
> time cat $BIG_FILE > /dev/null
> umount $MOUNT_DIR
> _____________________________________________________________________
> 
> 
> Results for plain 2.6.1-rc1 (A) and 2.6.1-rc1 without Ram Pai's patch (B):
> 
> o readahead = 256 (default setting)
> 
> (A)
> real	0m43.596s
> user	0m0.153s
> sys	0m5.602s
> 
> real	0m42.971s
> user	0m0.136s
> sys	0m5.571s
> 
> real	0m42.888s
> user	0m0.137s
> sys	0m5.648s
> 
> (B)
> real    0m43.520s
> user    0m0.130s
> sys     0m5.615s
> 
> real	0m42.930s
> user	0m0.154s
> sys	0m5.745s
> 
> real	0m42.937s
> user	0m0.120s
> sys	0m5.751s
> 
> 
> o readahead = 128
> 
> (A)
> real	0m35.932s
> user	0m0.133s
> sys	0m5.926s
> 
> real	0m35.925s
> user	0m0.146s
> sys	0m5.930s
> 
> real	0m35.892s
> user	0m0.145s
> sys	0m5.946s
> 
> (B)
> real	0m35.957s
> user	0m0.136s
> sys	0m6.041s
> 
> real	0m35.958s
> user	0m0.136s
> sys	0m5.957s
> 
> real	0m35.924s
> user	0m0.146s
> sys	0m6.069s
> 
> 
> o readahead = 64
> (A)
> real	0m35.284s
> user	0m0.137s
> sys	0m6.182s
> 
> real	0m35.267s
> user	0m0.134s
> sys	0m6.110s
> 
> real	0m35.260s
> user	0m0.149s
> sys	0m6.003s
> 
> 
> (B)
> real	0m35.210s
> user	0m0.149s
> sys	0m6.009s
> 
> real	0m35.341s
> user	0m0.151s
> sys	0m6.119s
> 
> real	0m35.151s
> user	0m0.144s
> sys	0m6.195s
> 
> 
> I don't notice any big difference between kernel A and kernel B....
> 
> From these tests the best readahead value for my HD seems to be 64... and 
> the default setting (256) just wrong.
> 
> With 2.4.23 kernel and readahead = 8 I get results like these:
> 
> real	0m40.085s
> user	0m0.130s
> sys	0m4.560s
> 
> real	0m40.058s
> user	0m0.090s
> sys	0m4.630s
> 
> Bye.


^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2004-03-29 15:45 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-01-02 16:02 Strange IDE performance change in 2.6.1-rc1 (again) Paolo Ornati
2004-01-02 18:08 ` Ed Sweetman
2004-01-02 21:04   ` Paolo Ornati
2004-01-02 21:27     ` Valdis.Kletnieks
2004-01-03 10:20       ` Paolo Ornati
2004-01-02 21:32     ` Mike Fedyk
2004-01-02 22:34       ` Martin Josefsson
2004-01-03 11:13         ` Paolo Ornati
2004-01-03 22:40           ` Andrew Morton
2004-01-04 14:30             ` Paolo Ornati
2004-01-05 23:19               ` Ram Pai
2004-01-07 14:59                 ` Paolo Ornati
2004-01-07 19:23                   ` Ram Pai
2004-01-07 20:12                     ` Paolo Ornati
2004-01-07 23:57                       ` Andrew Morton
2004-01-08  7:31                         ` Ram Pai
2004-01-09  1:05                         ` Ram Pai
2004-01-09  1:17                           ` Andrew Morton
2004-01-09 19:15                             ` Ram Pai
2004-01-09 19:44                               ` Andrew Morton
2004-01-10 14:48                               ` Paolo Ornati
2004-01-10 16:00                                 ` Ed Sweetman
2004-01-10 16:19                                   ` Ed Sweetman
2004-01-10 17:29                                     ` Paolo Ornati
2004-01-10 17:29                                   ` Paolo Ornati
2004-03-29 15:45               ` Ram Pai
2004-01-04 17:15             ` Buffer and Page cache coherent? was: " Mike Fedyk
2004-01-04 22:10               ` Andrew Morton
2004-01-04 23:22                 ` Mike Fedyk
2004-01-04 23:32                   ` Andrew Morton
2004-01-04 23:45                     ` Mike Fedyk
2004-01-05  0:23                       ` Andrew Morton
2004-01-03 10:20       ` Paolo Ornati
2004-01-03  3:33     ` Tobias Diedrich
2004-01-03  4:15       ` Valdis.Kletnieks
2004-01-03 13:39         ` Tobias Diedrich
2004-01-03 20:56           ` Tobias Diedrich
2004-01-04  3:02         ` jw schultz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).