linux-kernel.vger.kernel.org archive mirror
* scsi vs ide performance on fsync's
@ 2001-03-02 17:42 Jeremy Hansen
  2001-03-02 18:39 ` Steve Lord
  2001-03-02 20:56 ` Linus Torvalds
  0 siblings, 2 replies; 64+ messages in thread
From: Jeremy Hansen @ 2001-03-02 17:42 UTC
  To: linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2599 bytes --]


We're doing some mysql benchmarking.  For some reason it seems that ide
drives are currently beating a scsi raid array and it seems to be related
to fsync's.  Bonnie stats show the scsi array blowing away ide as
expected, but mysql tests still have the ide drive winning on plain insert
speeds.  Can anyone explain how this is possible, or perhaps explain how
our testing may be flawed?

Here's the bonnie stats:

IDE Drive:

Version 1.00g       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
jeremy         300M  9026  94 17524  12  8173   9  7269  83 23678   7 102.9   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   469  98  1476  98 16855  89   459  98  7132  99   688  25


SCSI Array:

Version 1.00g       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
orville        300M  8433 100 134143  99 127982  99  8016 100 374457  99 1583.4   6
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   503  13 +++++ +++   538  13   490  13 +++++ +++   428  11

So...obviously from the bonnie stats, the scsi array blows away the
ide...but here's what we get for fsync stats using the little c program
I've attached:

IDE Drive:

jeremy:~# time ./xlog file.out fsync

real    0m1.850s
user    0m0.000s
sys     0m0.220s

SCSI Array:

[root@orville mysql_data]# time /root/xlog file.out fsync

real    0m23.586s
user    0m0.010s
sys     0m0.110s


I would appreciate any help understanding what I'm seeing here and any
suggestions on how to improve the performance.

The SCSI adapter on the raid array is an Adaptec 39160, the raid
controller is a CMD-7040.  Kernel 2.4.0 using XFS for the filesystem on
the raid array, kernel 2.2.18 on ext2 on the IDE drive.  The filesystem is
not the problem, as I get almost the exact same results running this on
ext2 on the raid array.

Thanks
-jeremy

-- 
this is my sig.







[-- Attachment #2: Type: TEXT/PLAIN, Size: 792 bytes --]

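#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

/* One fixed-size log record, written once per loop iteration. */
struct Entry
{
	int count;
	char string[50];
};


int main(int argc, char **argv)
{
	int fd;
	struct Entry *trans;
	int x;
	int do_sync;

	if(argc < 2)
	{
		fprintf(stderr, "Usage: %s <file> [fsync]\n", argv[0]);
		return 1;
	}

	/* Sync after every write only when the second argument is "fsync". */
	do_sync = (argc > 2 && strcmp(argv[2], "fsync") == 0);

	if((fd = creat(argv[1], 0666)) == -1)
	{
		printf("Could not open file %s\n", argv[1]);
		return 1;
	}

	for(x=0; x < 2000; ++x)
	{
		trans = malloc(sizeof(struct Entry));
		if(trans == NULL)
		{
			perror("malloc");
			close(fd);
			return 1;
		}

		/* Zero the record so no uninitialized bytes reach the file. */
		memset(trans, 0, sizeof(struct Entry));
		trans->count = x;
		strcpy(trans->string, "Blah Blah Blah Blah Blah Blah Blah");

		if(write(fd, (char *)trans, sizeof(struct Entry)) == -1)
		{
			perror("write");
		}

		/* Despite the "fsync" keyword, the call made is fdatasync(). */
		if(do_sync && fdatasync(fd) != 0)
		{
			perror("Error");
		}

		free(trans);
	}
	close(fd);

	return 0;
}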

[parent not found: <Pine.LNX.4.33L2.0103021033190.6176-200000@srv2.ecropolis.com>]
* Re: scsi vs ide performance on fsync's
@ 2001-03-06  5:27 Douglas Gilbert
  2001-03-06  5:45 ` Linus Torvalds
  2001-03-06  6:43 ` Jonathan Morton
  0 siblings, 2 replies; 64+ messages in thread
From: Douglas Gilbert @ 2001-03-06  5:27 UTC
  To: Linus Torvalds; +Cc: linux-kernel

 	

Linus Torvalds wrote:

> Well, it's entirely possible that the mid-level SCSI layer is doing
> something horribly stupid.

Well it's in good company as FreeBSD 4.2 on the same hardware
returns the same result (including IDE timings that were too
fast). My timepeg analysis showed that the SCSI disk was consuming
the time, not any of the SCSI layers.

> On the other hand, it's also entirely possible that IDE is just a lot
> better than what the SCSI-bigots tend to claim. It's not all that
> surprising, considering that the PC industry has pushed untold billions of
> dollars into improving IDE, with SCSI as nary a consideration. The above
> may just simply be the Truth, with a capital T.

What exactly do you think fsync() and fdatasync() should
do? If they need to wait for dirty buffers to get flushed
to the disk oxide then multiple reported IDE results to
this thread are defying physics.


Doug Gilbert

* Re: scsi vs ide performance on fsync's
@ 2001-03-06 17:14 David Balazic
  2001-03-06 17:46 ` Gregory Maxwell
  2001-03-06 18:23 ` Jonathan Morton
  0 siblings, 2 replies; 64+ messages in thread
From: David Balazic @ 2001-03-06 17:14 UTC
  To: chromi; +Cc: linux-kernel

(please CC me, not subscribed: david.balazic@uni-mb.si)

Jonathan Morton (chromi@cyberspace.org) wrote :

> The OS needs to know the physical act of writing data has finished before     
> it tells the m/board to cut the power - period. Pathological data sets      
> included - they are the worst case which every engineer must take into  
> account. Out of interest, does Linux guarantee this, in the light of what
> we've uncovered? If so, perhaps it could use the same technique to fix    
> fdatasync() and family...

Linux currently ignores the drive's write cache, AFAICT.
Recently I asked a similar question, about flushing drive caches at shutdown:
Subject: "Flusing caches on shutdown"
message archived at:
http://boudicca.tux.org/hypermail/linux-kernel/2001week08/0157.html
Body attached at end of this message.

The answer (and only reply) was:
[ archived at : http://boudicca.tux.org/hypermail/linux-kernel/2001week08/0211.html ]
--- begin quote ---
From: Ingo Oeser (ingo.oeser@informatik.tu-chemnitz.de)

On Mon, Feb 19, 2001 at 01:45:57PM +0100, David Balazic wrote:
> It is a good idea IMO to flush the write cache of storage devices 
> at shutdown and other critical moments. 
     
Not needed. All device drivers should disable the write caches of
devices that need some signal other than being switched off by
the power button to flush themselves.
     
> Losing data at powerdown due to write caches has been reported, 
> so this is not a theoretical problem. Also the journaled filesystems 
> are safe only in theory if the journal is not stored on non-volatile 
> memory, which is not guaranteed in the current kernel. 
     
Fine. If users/admins have write caching enabled, they either
know what they do, or should disable it (which is the default for
all mass storage drivers AFAIK).
     
Hardware Level caching is only good for OSes which have broken
drivers and broken caching (like plain old DOS).

Linux does a good job in caching and cache control at software
level.
     
Regards

Ingo Oeser
--- end quote ---

My original mail :
--- begin quote ---
   (( CC me the replies, as I'm not subscribed to LKML ))

   Hi! 
     
   It is a good idea IMO to flush the write cache of storage devices
   at shutdown and other critical moments.
   I browsed through linux-2.4.1 and see no use of the SYNCHRONIZE CACHE
   SCSI command ( curiously it is defined in several other files
   besides include/scsi/scsi.h , grep returns :
   drivers/scsi/pci2000.h:#define SCSIOP_SYNCHRONIZE_CACHE 0x35
   drivers/scsi/psi_dale.h:#define SCSIOP_SYNCHRONIZE_CACHE 0x35
   drivers/scsi/psi240i.h:#define SCSIOP_SYNCHRONIZE_CACHE 0x35
   ) 

   I couldn't find evidence of the use of the equivalent ATA command either
   ( FLUSH CACHE , command code E7h ).
   Also add ATAPI to the list. ( and all other interfaces. I checked just SCSI
   and ATA )

   Losing data at powerdown due to write caches has been reported,
   so this is not a theoretical problem. Also the journaled filesystems
   are safe only in theory if the journal is not stored on non-volatile 
   memory, which is not guaranteed in the current kernel.

   What is the official word on this issue ?
   I think this is important to the "enterprise" guys, at the least.
                                           
   Sincerely,
   david 
   
   PS: CC me , as I'm not subscribed to LKML
--- end quote ---

-- 
David Balazic
--------------
"Be excellent to each other." - Bill & Ted
- - - - - - - - - - - - - - - - - - - - - -


* Re: scsi vs ide performance on fsync's
@ 2001-03-06 19:42 David Balazic
  2001-03-06 20:37 ` Jens Axboe
  0 siblings, 1 reply; 64+ messages in thread
From: David Balazic @ 2001-03-06 19:42 UTC
  To: torvalds; +Cc: linux-kernel

Linus Torvalds himself wrote :

> On Tue, 6 Mar 2001, Alan Cox wrote: 
> > 
> > > > I don't know if there is any way to turn off a write buffer on an IDE disk. 
> > > You want a forced set of commands to kill caching at init? 
> > 
> > Wrong model 
> > 
> > You want a write barrier. Write buffering (at least for short intervals) in 
> > the drive is very sensible. The kernel needs to be able to send drivers a write 
> > barrier which will not be completed with outstanding commands before the 
> > barrier. 
> 
> Agreed. 
> 
> Write buffering is incredibly useful on a disk - for all the same reasons 
> that an OS wants to do it. The disk can use write buffering to speed up 
> writes a lot - not just lower the _perceived_ latency by the OS, but to 
> actually improve performance too. 
> 
> But Alan is right - we need a "sync" command or something. I don't know 
> if IDE has one (it already might, for all I know). 

ATA, SCSI and ATAPI all have a cache-flush command (FLUSH CACHE in ATA,
SYNCHRONIZE CACHE in SCSI). (*)
Whether the drives implement it is another question ...

(*) references : 
  ATA-6 draft standard from www.t13.org
  MtFuji document from ????????


-- 
David Balazic
--------------
"Be excellent to each other." - Bill & Ted
- - - - - - - - - - - - - - - - - - - - - -

* Re: scsi vs ide performance on fsync's
@ 2001-03-07 12:47 David Balazic
  0 siblings, 0 replies; 64+ messages in thread
From: David Balazic @ 2001-03-07 12:47 UTC
  To: andre; +Cc: linux-kernel

Andre Hedrick (andre@linux-ide.org) wrote on Wed Mar 07 2001 - 01:58:44 EST :

> On Wed, 7 Mar 2001, Jonathan Morton wrote: 
 
[ snip ]
 
 
> > >Since all OSes that enable WC at init will flush 
> > >it at shutdown and do a periodic purge with in-activity. 
> > 
> > But Linux doesn't, as has been pointed out earlier. We need to fix Linux. 
> 
> Friend I have fixed this some time ago but it is bundled with TASKFILE 
> that is not going to arrive until 2.5. Because I need a way to execute 
> this and hold the driver until it is complete, regardless of the shutdown 
> method. 

I don't understand 100%.
Is TASKFILE required to do proper write cache flushing ?

> > >Err, last time I checked all good devices flush their write caching on their 
> > >own to take advantage of having a maximum cache for prefetching. 
> > 
> > Which doesn't work if the buffer is filled up by the OS 0.5 seconds before 
> > the power goes. 
> 
> Maybe that is why there is a vendor disk-cache dump zone on the edge of 
> the platters...just maybe you need to buy your drives from somebody that 
> does this and has a predictive sector stretcher as the energy from the 
> inertia by the DC three-phase motor executes the dump. 

So where is a list of drives that do this ?
www.list-of-hardware-that-doesnt-suck.com is not responding ...
 
> Ever wondered why modern drives have open collectors on the data bus? 

no :-)


-- 
David Balazic
--------------
"Be excellent to each other." - Bill & Ted
- - - - - - - - - - - - - - - - - - - - - -

[parent not found: <1epyyz1.etswlv1kmicnqM%smurf@noris.de>]

end of thread, other threads:[~2001-03-12 18:52 UTC | newest]

Thread overview: 64+ messages
2001-03-02 17:42 scsi vs ide performance on fsync's Jeremy Hansen
2001-03-02 18:39 ` Steve Lord
2001-03-02 19:17   ` Chris Mason
2001-03-02 19:25     ` Steve Lord
2001-03-02 19:27       ` Jeremy Hansen
2001-03-02 19:38       ` Chris Mason
2001-03-02 19:41         ` Steve Lord
2001-03-05 13:23         ` Andi Kleen
2001-03-02 19:25     ` Andre Hedrick
2001-03-03  1:55     ` Dan Hollis
2001-03-02 20:56 ` Linus Torvalds
2001-03-06  2:13   ` Jeremy Hansen
2001-03-06  2:25     ` Linus Torvalds
2001-03-06  3:30     ` Jonathan Morton
2001-03-06  4:05       ` Linus Torvalds
2001-03-06  7:03       ` Andre Hedrick
2001-03-06  8:24       ` Jonathan Morton
2001-03-06 12:22         ` Rik van Riel
2001-03-06 14:08         ` Jonathan Morton
2001-03-07 16:50           ` Pavel Machek
2001-03-06 19:41         ` Andre Hedrick
2001-03-07  5:25         ` Jonathan Morton
2001-03-07  6:58           ` Andre Hedrick
2001-03-09 11:39       ` Jonathan Morton
     [not found] <Pine.LNX.4.33L2.0103021033190.6176-200000@srv2.ecropolis.com>
     [not found] ` <054201c0a33d$55ee5870$e1de11cc@csihq.com>
2001-03-04 20:10   ` Douglas Gilbert
2001-03-04 21:28     ` Ishikawa
2001-03-06  0:11     ` Douglas Gilbert
2001-03-06  5:27 Douglas Gilbert
2001-03-06  5:45 ` Linus Torvalds
2001-03-06  7:12   ` Andre Hedrick
2001-03-06 12:09     ` Alan Cox
2001-03-06 18:44       ` Linus Torvalds
2001-03-07 13:48         ` Stephen C. Tweedie
2001-03-07 14:13           ` Jens Axboe
2001-03-12 18:50           ` Andre Hedrick
2001-03-06 13:50     ` Mike Black
2001-03-06 16:02       ` Jeremy Hansen
2001-03-07 18:27         ` Jeremy Hansen
2001-03-07 18:36           ` Linus Torvalds
2001-03-08 11:06             ` Stephen C. Tweedie
2001-03-06 16:57       ` Jonathan Morton
2001-03-06  6:43 ` Jonathan Morton
2001-03-06 13:03   ` dean gaudet
2001-03-06 13:15     ` dean gaudet
2001-03-06 13:45     ` Jonathan Morton
2001-03-06 17:14 David Balazic
2001-03-06 17:46 ` Gregory Maxwell
2001-03-06 18:23 ` Jonathan Morton
2001-03-06 23:27   ` Mark Hahn
2001-03-06 19:42 David Balazic
2001-03-06 20:37 ` Jens Axboe
2001-03-07 13:51   ` Stephen C. Tweedie
2001-03-07 14:12     ` Jens Axboe
2001-03-07 15:05       ` Stephen C. Tweedie
2001-03-07 18:51         ` Jens Axboe
2001-03-07 19:10           ` Stephen C. Tweedie
2001-03-07 20:15             ` Jens Axboe
2001-03-07 20:56               ` Stephen C. Tweedie
2001-03-07 20:59                 ` Jens Axboe
2001-03-08 15:45                 ` Chris Mason
2001-03-07 12:47 David Balazic
     [not found] <1epyyz1.etswlv1kmicnqM%smurf@noris.de>
2001-03-09  6:59 ` Matthias Urlichs
2001-03-09 11:51   ` Jens Axboe
2001-03-09 14:26     ` Matthias Urlichs
