linux-kernel.vger.kernel.org archive mirror
* Re: Adaptec vs Symbios performance
       [not found] <200111032318.fA3NIQY62745@aslan.scsiguy.com>
@ 2001-11-04  3:50 ` Stephan von Krawczynski
  2001-11-04  5:47   ` Justin T. Gibbs
  2001-11-04 14:17   ` Stephan von Krawczynski
  0 siblings, 2 replies; 14+ messages in thread
From: Stephan von Krawczynski @ 2001-11-04  3:50 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel, groudier

> >Hello Justin, hello Gerard
> >
> >I am currently looking for reasons for the bad behaviour of the aic7xxx driver
> >in a shared-interrupt setup and its generally not-nice behaviour in a
> >multi-tasking environment.
>
> Can you be more specific?

Yes, of course :-)                                                    
What I am seeing over here is that aic7xxx is _significantly_ slower
than symbios _in the exact same context_. I refused to use the "new"
driver for as long as possible because, right from the first test, I had
the "feeling" that it hurts overall machine performance in some way: the
box seems _slow_ and less responsive than it was with the old aic driver.
When I directly compared it with symbios (LSI Logic hardware sold by
Tekram) I additionally found that it seems to hurt the interrupt
performance of a network card sharing its interrupt with the aic, which
again does not happen with symbios. I have seen such behaviour before:
in nearly every driver I formerly wrote for shared-interrupt systems I
had to add code that _prevents_ locking out other interrupt users by
walking through the driver's own code indefinitely under high load.
But, of course, you _know_ this. Nobody writes a driver like the new
aic7xxx _and_ doesn't know :-)
My guess is that this knowledge made you write the comment I ripped from
your code about using a bottom half handler instead of dealing with the
workload in a hardware interrupt. Again, I have by no means read your
code completely or the like. I simply tried to find the hardware
interrupt routine and see whether it does significant eli (EverLasting
Interrupt ;-) stuff - and I found your comment.
Can you re-comment from today's point of view?
                                                                      
> >This is nice. I cannot read the complete code around it (it is derived
> >from aic7xxx_linux.c) but if I understand the naming and comments
> >correctly, some workload is done inside the hardware interrupt (which
> >shouldn't happen), which would very much match my tests showing bad
> >overall performance behaviour. Obviously this code is old (read the
> >comment) and needs reworking.
> >Comments?
>
> I won't comment on whether deferring this work until outside of
> an interrupt context would help your "problem" until I understand
> what you are complaining about. 8-)
                                                                      
In a nutshell:                                                        
a) long lasting interrupt workloads prevent normal process activity   
(creating latency and sticky behaviour)                               
b) long lasting interrupt workloads play bad on other interrupt users 
(e.g. on the same shared interrupt)                                   
I can see _both_ comparing aic with symbios.                          
                                                                      
Regards,                                                              
Stephan                                                               
                                                                      
                                                                      
                                                                      

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-04  5:47   ` Justin T. Gibbs
@ 2001-11-04  5:23     ` Gérard Roudier
  0 siblings, 0 replies; 14+ messages in thread
From: Gérard Roudier @ 2001-11-04  5:23 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: Stephan von Krawczynski, linux-kernel



On Sat, 3 Nov 2001, Justin T. Gibbs wrote:

[...]

> >I can see _both_ comparing aic with symbios.
>
> I'm not sure that you would see much of a difference if you set the
> symbios driver to use 253 commands per-device.  I haven't looked at

This is discouraged. :)
Better, IMO, to compare behaviours with realistic queue depths. As you
know, more than 64 for hard disks does not make sense (yet).
Personally, I use 64 under FreeBSD and 16 under Linux. Guess why? :-)

> the sym driver for some time, but last I remember it does not use
> a bottom half handler and handles queue throttling internally.  It

There is no BH in the driver. The stock sym53c8xx even uses scsi_obsolete,
which requires more work in interrupt context for command completion.
SYM-2, which comes back from FreeBSD, uses the threaded EH stuff that just
queues to a BH on completion. Stephan may want to give SYM-2 a try, IMO.

> may perform less work at interrupt time than the aic7xxx driver if
> locally queued I/O is compiled into a format suitable for controller
> consumption rather than queue the ScsiCmnd structure provided by
> the mid-layer.  The aic7xxx driver has to convert a ScsiCmnd into a
> controller data structure to service an internal queue and this can
> take a bit of time.

The sym* drivers also use an internal data structure to handle I/Os. The
SCSI script does not know about any O/S specific data structure.

> It would be interesting to know if there is a disparity in the TPS numbers
> and tag depths in your comparisons.  Higher tag depth usually means
> higher TPS which may also mean less interactive response from the
> system.  All things being equal, I would expect the sym and aic7xxx
> drivers to perform about the same.

Agreed.

  Gérard.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-04  3:50 ` Adaptec vs Symbios performance Stephan von Krawczynski
@ 2001-11-04  5:47   ` Justin T. Gibbs
  2001-11-04  5:23     ` Gérard Roudier
  2001-11-04 14:17   ` Stephan von Krawczynski
  1 sibling, 1 reply; 14+ messages in thread
From: Justin T. Gibbs @ 2001-11-04  5:47 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel, groudier

>Can you re-comment from today's point of view?

I believe that if you were to set the tag depth in the new aic7xxx
driver to a level similar to either the symbios or the old aic7xxx
driver, the problem you describe would go away.  The driver
will only perform internal queuing if a device cannot handle the
original queue depth exported to the SCSI mid-layer.  Since the
mid-layer provides no mechanism for proper, dynamic, throttling,
queuing at the driver level will always be required when the driver
determines that a target cannot accept additional commands.  The default
used by the older driver, 8, seems to work for most drives.  So, no
internal queuing is required.  If you are really concerned about
interrupt latency, this will also be a win as you will reduce your
transaction throughput and thus the frequency of interrupts seen
by the controller.
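
In code, the throttling pattern described above looks roughly like this (a
minimal sketch with hypothetical struct and function names, not the actual
aic7xxx code):

/* The driver tracks how many commands a target currently accepts and
 * holds the overflow on a driver-internal list. All names hypothetical. */
struct xcmd {                        /* stand-in for the mid-layer command  */
        struct xcmd     *next;
};

struct dev_queue {
        unsigned int    active;      /* commands outstanding at the target  */
        unsigned int    allowed;     /* depth the target accepts right now  */
        struct xcmd     *head;       /* driver-internal overflow queue      */
        struct xcmd     *tail;
};

extern void send_to_controller(struct xcmd *cmd);   /* build SCB, start it  */

/* entry point from the mid-layer (queuecommand) */
static void example_queue(struct dev_queue *dq, struct xcmd *cmd)
{
        if (dq->active >= dq->allowed) {             /* target saturated     */
                cmd->next = NULL;
                if (dq->tail != NULL)
                        dq->tail->next = cmd;
                else
                        dq->head = cmd;
                dq->tail = cmd;
                return;
        }
        dq->active++;
        send_to_controller(cmd);
}

/* completion path: shrink the window on QUEUE FULL, otherwise refill it */
static void example_done(struct dev_queue *dq, int queue_full)
{
        dq->active--;
        if (queue_full && dq->active > 0)
                dq->allowed = dq->active;            /* what the device took */
        while (dq->active < dq->allowed && dq->head != NULL) {
                struct xcmd *next = dq->head;

                dq->head = next->next;
                if (dq->head == NULL)
                        dq->tail = NULL;
                dq->active++;
                send_to_controller(next);
        }
}

With the default depth of 8 accepted by the drive, example_queue never takes
the overflow branch, which is the point being made above.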

>> I won't comment on whether deferring this work until outside of     
>> an interrupt context would help your "problem" until I understand   
>> what you are complaining about. 8-)                                 
>                                                                      
>In a nutshell:                                                        
>a) long lasting interrupt workloads prevent normal process activity   
>(creating latency and sticky behaviour)                               

Deferring the work to outside of interrupt context will not, in
general, allow non-kernel processes to run any sooner.  Only interrupt
handlers that don't block on the io-request lock (may it die a horrible
death) would be allowed to pre-empt this activity.  Even in this case,
there will be times, albeit much shorter, that this interrupt
will be blocked by the per-controller spin-lock used to protect
driver data structures and access to the card's registers.

If your processes are really feeling sluggish, you are probably doing
*a lot* of I/O.  The only thing that might help is some interrupt
coalescing algorithm in the aic7xxx driver's firmware.  Since these
chips do not have an easily utilized timer facility any such algorithm
would be tricky to implement.  I've thought about it, but not enough
to implement it yet.

>b) long lasting interrupt workloads play bad on other interrupt users 
>(e.g. on the same shared interrupt)                                   

Sure.  As the comment suggests, the driver should use a bottom half
handler or whatever new deferral mechanism is currently the rage
in Linux.  When I first ported the driver, it was targeted to be a
module, suitable for a driver diskette, to replace the old driver.
Things have changed since then, and this area should be revisited.
Internal queuing was not required in the original FreeBSD driver and
this is something the mid-layer should do on a driver's behalf, but
I've already ranted enough about that.

>I can see _both_ comparing aic with symbios.                          

I'm not sure that you would see much of a difference if you set the
symbios driver to use 253 commands per-device.  I haven't looked at
the sym driver for some time, but last I remember it does not use
a bottom half handler and handles queue throttling internally.  It
may perform less work at interrupt time than the aic7xxx driver if 
locally queued I/O is compiled into a format suitable for controller
consumption rather than queue the ScsiCmnd structure provided by
the mid-layer.  The aic7xxx driver has to convert a ScsiCmnd into a
controller data structure to service an internal queue and this can
take a bit of time.

It would be interesting to know if there is a disparity in the TPS numbers
and tag depths in your comparisons.  Higher tag depth usually means
higher TPS which may also mean less interactive response from the
system.  All things being equal, I would expect the sym and aic7xxx
drivers to perform about the same.

--
Justin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-04  3:50 ` Adaptec vs Symbios performance Stephan von Krawczynski
  2001-11-04  5:47   ` Justin T. Gibbs
@ 2001-11-04 14:17   ` Stephan von Krawczynski
  2001-11-04 18:10     ` Justin T. Gibbs
                       ` (2 more replies)
  1 sibling, 3 replies; 14+ messages in thread
From: Stephan von Krawczynski @ 2001-11-04 14:17 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel, groudier

On Sat, 03 Nov 2001 22:47:39 -0700 "Justin T. Gibbs" <gibbs@scsiguy.com> wrote:

> >Can you re-comment from today's point of view?
> 
> I believe that if you were to set the tag depth in the new aic7xxx
> driver to a level similar to either the symbios or the old aic7xxx
> driver, that the problem you describe would go away.

Nope.
I know the stuff :-) I already took tcq down to 8 (as in the old driver) back
when I compared the old and new drivers. Indeed I found out that everything is
a lot worse when using tcq 256 (which doesn't work anyway and gets down to 128
in real life with my IBM harddrive). After using depth 8 the comparison to
symbios is just as described. Though I must admit that the symbios driver
takes tcq down from 8 to 4 according to its boot-up message. Do you think it
will make a noticeable difference if I hardcode the depth to 4 in the aic7xxx
driver?

>  The driver
> will only perform internal queuing if a device cannot handle the
> original queue depth exported to the SCSI mid-layer.  Since the
> mid-layer provides no mechanism for proper, dynamic, throttling,
> queuing at the driver level will always be required when the driver
> determines that a target cannot accept additional commands.  The default
> used by the older driver, 8, seems to work for most drives.  So, no
> internal queuing is required.  If you are really concerned about
> interrupt latency, this will also be a win as you will reduce your
> transaction throughput and thus the frequency of interrupts seen
> by the controller.

Hm, this is not really true in my experience. Since a harddrive operates on a
completely different time scale than pure software, it may well be that
building up internal data not directly inside the hardware interrupt, but at
some higher layer, costs no noticeable performance, _if_ it is done right.
"Right" here obviously means there must be no synchronous linkage between this
higher layer and the hardware interrupt, in the sense that the higher layer
has to wait for the hardware interrupt's completion. But this is all pretty
"down to earth" stuff you know anyway.

> >> I won't comment on whether deferring this work until outside of     
> >> an interrupt context would help your "problem" until I understand   
> >> what you are complaining about. 8-)                                 
> >                                                                      
> >In a nutshell:                                                        
> >a) long lasting interrupt workloads prevent normal process activity   
> >(creating latency and sticky behaviour)                               
> 
> Deferring the work to outside of interrupt context will not, in
> general, allow non-kernel processes to run any sooner.

Kernel processes would be completely sufficient. If you hit the allocation
routines (e.g.) the whole system enters a hiccup state :-).

>  Only interrupt
> handlers that don't block on the io-request lock (may it die a horrible
> death) would be allowed to pre-empt this activity.  Even in this case,
> there will be times, albeit much shorter, that this interrupt
> will be blocked by the per-controller spin-lock used to protect
> driver data structures and access to the card's registers.

Well, this is a natural thing. You always have to protect exclusively used
resources like controller registers, but doubtless things turn out better the
less exclusiveness you have (and what can be more exclusive than a hardware
interrupt?).

> If your processes are really feeling sluggish, you are probably doing
> *a lot* of I/O.

Yes, of course. I wouldn't have complained in the first place _not_ knowing
that symbios does it better.

> The only thing that might help is some interrupt
> coalescing algorithm in the aic7xxx driver's firmware.  Since these
> chips do not have an easily utilized timer facility any such algorithm
> would be tricky to implement.  I've thought about it, but not enough
> to implement it yet.

I cannot comment on that, I don't know what Gerard really does here.

> >b) long lasting interrupt workloads play bad on other interrupt users 
> >(e.g. on the same shared interrupt)                                   
> 
> Sure.  As the comment suggests, the driver should use a bottom half
> handler or whatever new deferral mechanism is currently the rage
> in Linux.

Do you think this is complex in implementation?

> [...]
> >I can see _both_ comparing aic with symbios.                          
> 
> I'm not sure that you would see much of a difference if you set the
> symbios driver to use 253 commands per-device.

As stated earlier I took both drivers to comparable values (8).

> It would be interesting to know if there is a disparity in the TPS numbers
> and tag depths in your comparisons.  Higher tag depth usually means
> higher TPS which may also mean less interactive response from the
> system.  All things being equal, I would expect the sym and aic7xxx
> drivers to perform about the same.

I can confirm that. 253 is a bad joke in terms of interactive responsiveness
during high load.
Probably the configured default value should be reduced considerably.
253 feels like old IDE.
Yes, I know this comment hurt you badly ;-)
In my eyes the changes required in your driver are _not_ that big. The gain
would be noticeable. I am not saying it's a bad driver, really not, I would
only suggest some refinement. I know _you_ can do a bit better, prove me
right ;-)

Regards,
Stephan


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-04 18:35     ` Stephan von Krawczynski
@ 2001-11-04 16:31       ` Gérard Roudier
  2001-11-04 19:13       ` Justin T. Gibbs
  2001-11-04 19:56       ` Stephan von Krawczynski
  2 siblings, 0 replies; 14+ messages in thread
From: Gérard Roudier @ 2001-11-04 16:31 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: Justin T. Gibbs, linux-kernel, groudier



Hi Stephan,

The difference in performance for your CD (slow device) between aic7xxx
and sym53c8xx using equally capable HBAs (notably Ultra-160) cannot for a
single second be believed to be due to a design flaw in the aic7xxx driver.

Instead of trying to prove Justin wrong with his driver, you should look
into your system configuration and/or provide Justin with accurate
information and/or do different testings in order to get some clue about
the real cause.

You may have triggered a software/hardware bug somewhere, but I am
convinced that it cannot be a driver design bug.

In order to help Justin work on your problem, you should for example
report:

- The device configuration you set up in the controller EEPROM/NVRAM.
- The kernel boot-up messages.
- Your kernel configuration.
- Etc...

You might for example have unintentionally configured some devices in the
HBA set-up so that disconnection is not granted. Such a configuration MISTAKE
is likely to kill SCSI performance a LOT.

  Gérard.

PS: If you are interested in Justin's ability to design software for SCSI,
then you may want to have a look into all FreeBSD IO-related stuff owned
by Justin.


On Sun, 4 Nov 2001, Stephan von Krawczynski wrote:

> On Sun, 04 Nov 2001 11:10:26 -0700 "Justin T. Gibbs" <gibbs@scsiguy.com> wrote:
>
> > >On Sat, 03 Nov 2001 22:47:39 -0700 "Justin T. Gibbs" <gibbs@scsiguy.com> wrote:
> > Show me where the real problem is, and I'll fix it.  I'll add the bottom
> > half handler too eventually, but I don't see it as a pressing item.  I'm
> > much more interested in why you are seeing the behavior you are and exactly
> > what, quantitatively, that behavior is.
>
> Hm, what more specific can I tell you, than:
>
> Take my box with
>
> Host: scsi1 Channel: 00 Id: 03 Lun: 00
>   Vendor: TEAC     Model: CD-ROM CD-532S   Rev: 1.0A
>   Type:   CD-ROM                           ANSI SCSI revision: 02
> Host: scsi0 Channel: 00 Id: 08 Lun: 00
>   Vendor: IBM      Model: DDYS-T36950N     Rev: S96H
>   Type:   Direct-Access                    ANSI SCSI revision: 03
>
> and an aic7xxx driver. Start xcdroast and read a cd image. You get something
> between 2968,4 and 3168,2 kB/s throughput measured from xcdroast.
>
> Now redo this with a Tekram controller (which is sym53c1010) and you get
> throughput of 3611,1 to 3620,2 kB/s.
> No special stuff or background processes or anything else involved. I wonder
> how much simpler a test could be.
> Give me values to compare from _your_ setup.
>
> If you redo this test with nfs-load (copy files from some client to your
> test-box acting as nfs-server) you will end up at 1926 - 2631 kB/s throughput
> with aic, but 3395 - 3605 kB/s with symbios.
>
> If you need more on that picture, then redo the last and start _some_
> application in the background during the test (like mozilla). Time how long it
> takes until the application is up and running.
> If you are really unlucky you have your mail-client open during test and let it
> get mails via pop3 in a MH folder (lots of small files). You have a high chance
> that your mail-client is unusable until xcdroast is finished with cd reading -
> but not with symbios.
>
> ??
>
> Regards,
> Stephan



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-04 14:17   ` Stephan von Krawczynski
@ 2001-11-04 18:10     ` Justin T. Gibbs
  2001-11-04 18:35     ` Stephan von Krawczynski
  2001-11-04 19:02     ` Stephan von Krawczynski
  2 siblings, 0 replies; 14+ messages in thread
From: Justin T. Gibbs @ 2001-11-04 18:10 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel, groudier

>On Sat, 03 Nov 2001 22:47:39 -0700 "Justin T. Gibbs" <gibbs@scsiguy.com> wrote:
>
>> >Can you re-comment from today's point of view?
>> 
>> I believe that if you were to set the tag depth in the new aic7xxx
>> driver to a level similar to either the symbios or the old aic7xxx
>> driver, that the problem you describe would go away.
>
>Nope.
>I know the stuff :-) I already took tcq down to 8 (as in old driver) back at
>the times I compared old and new drivers.

Then you will have to find some other reason for the difference in
performance.  Internal queuing is not a factor with any reasonable
modern drive when the depth is set at 8.

>Indeed I found out that everything is a lot worse if using tcq 256 (which
>doesn't work anyway and gets down to 128 in real life using my IBM harddrive). 

The driver cannot know if you are using an external RAID controller or
an IBM drive or a Quantum Fireball.  It is my belief that in a true
multi-tasking workload, giving the device as much work to chew on
as it can handle is always best.  Your sequential bandwidth may
be a bit less, but sequential I/O is not that interesting in my opinion.

>After using depth 8 the comparison to
>symbios is just as described. Though I must admit, that the symbios driver
>takes down tcq from 8 to 4 according to his boot-up message. Do you think it
>will make a noticeable difference if I hardcode the depth to 4 in the aic7xxx
>driver?

As mentioned above, I would not expect any difference.

>>  The driver
>> will only perform internal queuing if a device cannot handle the
>> original queue depth exported to the SCSI mid-layer.  Since the
>> mid-layer provides no mechanism for proper, dynamic, throttling,
>> queuing at the driver level will always be required when the driver
>> determines that a target cannot accept additional commands.  The default
>> used by the older driver, 8, seems to work for most drives.  So, no
>> internal queuing is required.  If you are really concerned about
>> interrupt latency, this will also be a win as you will reduce your
>> transaction throughput and thus the frequency of interrupts seen
>> by the controller.
>
>Hm, this is not really true in my experience. Since a harddrive is in a
>completely other time-framing than pure software issues it may well be, that
>building up internal data not directly inside the hardware interrupt, but on a
>somewhere higher layer, is no noticeable performance loss, _if_ it is done
>right. "Right" here means obviously there must not be a synchronous linkage
>between this higher layer and the hardware interrupt in this sense that the
>higher layer has to wait on hardware interrupts' completion. But this is all
>pretty "down to earth" stuff you know anyways.

I don't understand how your comments relate to mine.  In a perfect world,
and with a "real" SCSI layer in Linux, the driver would never have any
queued data above and beyond what it can directly send to the device.
Since Linux lets you set the queue depth only at startup, before you can
dynamically determine a useful value, the driver has little choice.  To
say it more directly, internal queuing is not something I wanted in the
design - in fact it makes it more complicated and less efficient.

>> Deferring the work to outside of interrupt context will not, in
>> general, allow non-kernel processes to run any sooner.
>
>Kernel processes would be completely sufficient. If you hit the allocation
>routines (e.g.) the whole system enters a hiccup state :-).

But even those kernel processes will not run unless they have a higher
priority than the bottom half handler.  I can't stress this enough...
interactive performance will not change if this is done because kernel
tasks take priority over user tasks.

>> If your processes are really feeling sluggish, you are probably doing
>> *a lot* of I/O.
>
>Yes, of course. I wouldn't have complained in the first place _not_ knowing
>that symbios does it better.

I wish you could be a bit more quantitative in your analysis.  It seems
clear to me that the area you're pointing to is not the cause of your
complaint.  Without a quantitative analysis, I can't help you figure
this out.

>> Sure.  As the comment suggests, the driver should use a bottom half
>> handler or whatever new deferral mechanism is currently the rage
>> in Linux.
>
>Do you think this is complex in implementation?

No, but doing anything like this requires some research to find a solution
that works for all kernel versions the driver supports.  I hope I don't need
three different implementations to make this work.  Regardless, this change
will not make any difference in your problem.

>> It would be interesting to know if there is a disparity in the TPS numbers
>> and tag depths in your comparisons.  Higher tag depth usually means
>> higher TPS which may also mean less interactive response from the
>> system.  All things being equal, I would expect the sym and aic7xxx
>> drivers to perform about the same.
>
>I can confirm that. 253 is a bad joke in terms of interactive responsiveness
>during high load.

It's there for throughput, not interactive performance.  I'm sure if you
were doing things like news expirations, you'd appreciate the higher number
(up to the 128 tags your drives support).

>Probably the configured standard value should be taken down remarkably.
>253 feels like old IDE.
>Yes, I know this comment hurt you badly ;-)

Not really.  Each to their own.  You can tune your system however you
see fit.

>In my eyes the changes required in your driver are _not_ that big. The gain
>would be noticeable. I don't say its a bad driver, really not, I would only
>suggest some refinement. I know _you_ can do a bit better, prove me right ;-)

Show me where the real problem is, and I'll fix it.  I'll add the bottom
half handler too eventually, but I don't see it as a pressing item.  I'm
much more interested in why you are seeing the behavior you are and exactly
what, quantitatively, that behavior is.

--
Justin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-04 14:17   ` Stephan von Krawczynski
  2001-11-04 18:10     ` Justin T. Gibbs
@ 2001-11-04 18:35     ` Stephan von Krawczynski
  2001-11-04 16:31       ` Gérard Roudier
                         ` (2 more replies)
  2001-11-04 19:02     ` Stephan von Krawczynski
  2 siblings, 3 replies; 14+ messages in thread
From: Stephan von Krawczynski @ 2001-11-04 18:35 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel, groudier

On Sun, 04 Nov 2001 11:10:26 -0700 "Justin T. Gibbs" <gibbs@scsiguy.com> wrote:

> >On Sat, 03 Nov 2001 22:47:39 -0700 "Justin T. Gibbs" <gibbs@scsiguy.com> wrote:
> Show me where the real problem is, and I'll fix it.  I'll add the bottom
> half handler too eventually, but I don't see it as a pressing item.  I'm
> much more interested in why you are seeing the behavior you are and exactly
> what, quantitatively, that behavior is.

Hm, what could I tell you that is more specific than this:

Take my box with

Host: scsi1 Channel: 00 Id: 03 Lun: 00
  Vendor: TEAC     Model: CD-ROM CD-532S   Rev: 1.0A
  Type:   CD-ROM                           ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 08 Lun: 00
  Vendor: IBM      Model: DDYS-T36950N     Rev: S96H
  Type:   Direct-Access                    ANSI SCSI revision: 03

and an aic7xxx driver. Start xcdroast and read a cd image. You get something
between 2968,4 and 3168,2 kB/s throughput measured from xcdroast.

Now redo this with a Tekram controller (which is sym53c1010) and you get
throughput of 3611,1 to 3620,2 kB/s.
No special stuff or background processes or anything else involved. I wonder
how much simpler a test could be.
Give me values to compare from _your_ setup.

If you redo this test with nfs-load (copy files from some client to your
test-box acting as nfs-server) you will end up at 1926 - 2631 kB/s throughput
with aic, but 3395 - 3605 kB/s with symbios.

If you need more to complete the picture, then redo the last test and start
_some_ application (like mozilla) in the background during the test. Time how
long it takes until the application is up and running.
If you are really unlucky you have your mail client open during the test and
let it fetch mail via pop3 into an MH folder (lots of small files). There is a
high chance that your mail client is unusable until xcdroast has finished
reading the cd - but not with symbios.

??

Regards,
Stephan



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-04 14:17   ` Stephan von Krawczynski
  2001-11-04 18:10     ` Justin T. Gibbs
  2001-11-04 18:35     ` Stephan von Krawczynski
@ 2001-11-04 19:02     ` Stephan von Krawczynski
  2 siblings, 0 replies; 14+ messages in thread
From: Stephan von Krawczynski @ 2001-11-04 19:02 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel, groudier

On Sun, 04 Nov 2001 11:10:26 -0700 "Justin T. Gibbs" <gibbs@scsiguy.com> wrote:

> >Nope.
> >I know the stuff :-) I already took tcq down to 8 (as in old driver) back at
> >the times I compared old an new driver.
> 
> Then you will have to find some other reason for the difference in
> performance.  Internal queuing is not a factor with any reasonable
> modern drive when the depth is set at 8.

Hm, obviously we could start right from the beginning and ask people with aic
controllers and symbios controllers for some comparison figures. Hopefully some
people are interested.

Here we go:
Hello out there :-)
we need your help. If you own a SCSI controller from Adaptec or one with an
NCR/Symbios chipset, can you please do the following:
reboot your box, start xcdroast and read in a data cd. Tell us: the brand of
your cdrom, how much RAM you have, your processor type, and the throughput as
measured by xcdroast. It would be nice if you tried several times.
We are not really interested in the hard figures, but want to extract some
"global" tendency.

Thank you for your cooperation, 

Stephan

PS: my values are (I obviously have both controllers):

Adaptec:

Drive TEAC-CD-532S (30x), 1 GB RAM, 2 x PIII 1GHz
test 1: 2998,9 kB/s
test 2: 2968,4 kB/s
test 3: 3168,2 kB/s

Tekram (symbios)

Drive TEAC-CD-532S (30x), 1 GB RAM, 2 x PIII 1GHz
test 1: 3619,3 kB/s
test 2: 3611,1 kB/s
test 3: 3620,2 kB/s


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-04 18:35     ` Stephan von Krawczynski
  2001-11-04 16:31       ` Gérard Roudier
@ 2001-11-04 19:13       ` Justin T. Gibbs
  2001-11-04 19:56       ` Stephan von Krawczynski
  2 siblings, 0 replies; 14+ messages in thread
From: Justin T. Gibbs @ 2001-11-04 19:13 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel, groudier

>Hm, what more specific can I tell you, than:

Well, the numbers paint a different picture than your previous
comments.  You never mentioned a performance disparity, only a
loss in interactive performance.

>Take my box with
>
>Host: scsi1 Channel: 00 Id: 03 Lun: 00
>  Vendor: TEAC     Model: CD-ROM CD-532S   Rev: 1.0A
>  Type:   CD-ROM                           ANSI SCSI revision: 02
>Host: scsi0 Channel: 00 Id: 08 Lun: 00
>  Vendor: IBM      Model: DDYS-T36950N     Rev: S96H
>  Type:   Direct-Access                    ANSI SCSI revision: 03
>
>and an aic7xxx driver.

A full dmesg would be better.  Right now I have no idea what kind
of aic7xxx controller you are using, the speed and type of CPU,
the chipset in the machine, etc. etc.  In general, I'd rather see
the raw data than a version edited down based on the conclusions
you've already drawn or on what you feel is important.

>Start xcdroast and read a cd image. You get something
>between 2968,4 and 3168,2 kB/s throughput measured from xcdroast.
>
>Now redo this with a Tekram controller (which is sym53c1010) and you get
>throughput of 3611,1 to 3620,2 kB/s.

Were both tests performed from cold boot to a new file in the same
directory with similar amounts of that filesystem in use?

>No special stuff or background processes or anything else involved. I wonder
>how much simpler a test could be.

It doesn't matter how simple it is if you've never mentioned it before.
Your tone is somewhat indignant.  Do you not understand why this
data is important to understanding and correcting the problem?

>Give me values to compare from _your_ setup.

Send me a c1010. 8-)

>If you redo this test with nfs-load (copy files from some client to your
>test-box acting as nfs-server) you will end up at 1926 - 2631 kB/s throughput
>with aic, but 3395 - 3605 kB/s with symbios.

What is the interrupt load during these tests?  Have you verified that
disconnection is enabled for all devices on the aic7xxx controller?

>If you need more on that picture, then redo the last and start _some_
>application in the background during the test (like mozilla). Time how long it
>takes until the application is up and running. 

Since you are experiencing the problem, can't you time it?  There is
little guarantee that I will be able to reproduce the exact scenario
you are describing.  As I mentioned before, I don't have a c1010,
so I cannot perform the comparison you feel is so telling.

This does not look like an interrupt latency problem.

--
Justin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-04 18:35     ` Stephan von Krawczynski
  2001-11-04 16:31       ` Gérard Roudier
  2001-11-04 19:13       ` Justin T. Gibbs
@ 2001-11-04 19:56       ` Stephan von Krawczynski
  2001-11-04 20:43         ` Justin T. Gibbs
  2 siblings, 1 reply; 14+ messages in thread
From: Stephan von Krawczynski @ 2001-11-04 19:56 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel, groudier

On Sun, 04 Nov 2001 12:13:20 -0700 "Justin T. Gibbs" <gibbs@scsiguy.com> wrote:

> >Hm, what more specific can I tell you, than:
> 
> Well, the numbers paint a different picture than your previous
> comments.  You never mentioned a performance disparity, only a
> loss in interactive performance.

See:

Date:	Wed, 31 Oct 2001 16:45:39 +0100
From:	Stephan von Krawczynski <skraw@ithnet.com>
To:	linux-kernel <linux-kernel@vger.kernel.org>
Subject: The good, the bad & the ugly (or VM, block devices, and SCSI :-)
Message-Id: <20011031164539.29c04ee0.skraw@ithnet.com>

> A full dmesg would be better.  Right now I have no idea what kind
> of aic7xxx controller you are using,

Adaptec A29160 (see above mail). Remarkably, I have a 32-bit PCI bus, not a
64-bit one. This is an Asus CUV4X-D board.


> the speed and type of CPU,

2 x PIII 1GHz

> the chipset in the machine,

00:00.0 Host bridge: VIA Technologies, Inc. VT82C693A/694x [Apollo PRO133x] (rev c4)
00:01.0 PCI bridge: VIA Technologies, Inc. VT82C598/694x [Apollo MVP3/Pro133x AGP]
00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
00:04.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
00:04.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16)
00:04.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16)
00:04.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
00:09.0 PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 03)
00:0a.0 Network controller: Elsa AG QuickStep 1000 (rev 01)
00:0b.0 SCSI storage controller: Symbios Logic Inc. (formerly NCR) 53c1010 Ultra3 SCSI Adapter (rev 01)
00:0b.1 SCSI storage controller: Symbios Logic Inc. (formerly NCR) 53c1010 Ultra3 SCSI Adapter (rev 01)
00:0d.0 Multimedia audio controller: Creative Labs SB Live! EMU10000 (rev 07)
00:0d.1 Input device controller: Creative Labs SB Live! (rev 07)
01:00.0 VGA compatible controller: nVidia Corporation NV11 (rev b2)
02:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:05.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:06.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:07.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)

> Were both tests performed from cold boot

I rechecked that several times; it made no difference.

> to a new file in the same
> directory with similar amounts of that filesystem in use?

Yes. There is no difference whether the file is a) new or b) overwritten.
Anyway, both test cases use the same filesystems; I really exchanged only the
controllers, everything else is completely the same.
I just did another test run with symbios, _after_ heavy nfs and I/O action on
the box and with about 145 MB currently in swap. Result: 3620,1 kB/s. A _very_
stable showing from symbios.

> >No special stuff or background processes or anything else involved. I wonder
> >how much simpler a test could be.
> 
> It doesn't matter how simple it is if you've never mentioned it before.

Sorry, but there was nothing left out on my side, see above.

> Your tone is somewhat indignant.  Do you not understand why this
> data is important to understanding and correcting the problem?

Sorry for that, it is unintentional. Though my written English may look nice,
keep in mind that I am not a native English speaker, so some things may come
across a bit rougher than intended.

> >Give me values to compare from _your_ setup.
> 
> Send me a c1010. 8-)

Sorry, misunderstanding. What I meant was: how fast can you read data from your
cd-rom attached to some adaptec controller?

> >If you redo this test with nfs-load (copy files from some client to your
> >test-box acting as nfs-server) you will end up at 1926 - 2631 kB/s throughput
> >with aic, but 3395 - 3605 kB/s with symbios.
> 
> What is the interrupt load during these tests?

How can I present you an exact figure on this?

>  Have you verified that
> disconnection is enabled for all devices on the aic7xxx controller?

yes.

> This does not look like an interrupt latency problem.

Based on which thoughts?

Regards,
Stephan



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-04 19:56       ` Stephan von Krawczynski
@ 2001-11-04 20:43         ` Justin T. Gibbs
  2001-11-05 12:18           ` Matthias Andree
  0 siblings, 1 reply; 14+ messages in thread
From: Justin T. Gibbs @ 2001-11-04 20:43 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel, groudier

>See:
>
>Date:	Wed, 31 Oct 2001 16:45:39 +0100
>From:	Stephan von Krawczynski <skraw@ithnet.com>
>To:	linux-kernel <linux-kernel@vger.kernel.org>
>Subject: The good, the bad & the ugly (or VM, block devices, and SCSI :-)
>Message-Id: <20011031164539.29c04ee0.skraw@ithnet.com>

<Sigh> I don't read all of the LK list and the mail was not cc'd
to me, so I did not see this thread.

>> A full dmesg would be better.  Right now I have no idea what kind
>> of aic7xxx controller you are using,
>
>Adaptec A29160 (see above mail). Remarkably is I have a 32 bit PCI bus, no 64
>bit. This is an Asus CUV4X-D board.

*Please stop editing things*.  I need the actual boot messages from
the detection of the aic7xxx card.  It would also be nice to see
the output of /proc/scsi/aic7xxx/<card #>

>> the speed and type of CPU,
>
>2 x PIII 1GHz

Dmesg please.

>Sorry, misunderstanding. What I meant was: how fast can you read data
>from your cd-rom attached to some adaptec controller?

I'll run some tests tomorrow at work.  I'm sure the results will
be dependent on the cdrom in question but they may show something.

>> >If you redo this test with nfs-load (copy files from some client to your
>> >test-box acting as nfs-server) you will end up at 1926 - 2631 kB/s throughput
>> >with aic, but 3395 - 3605 kB/s with symbios.
>> 
>> What is the interrupt load during these tests?
>
>How can I present you an exact figure on this?

Isn't there a systat or vmstat equivalent under Linux that gives you
interrupt rates?  I'll poke around tomorrow when I'm in front of a Linux
box.

>>  Have you verified that
>> disconnection is enabled for all devices on the aic7xxx controller?
>
>yes.

The driver may not be seeing the same things as SCSI-Select for some
strange reason.  Again, just email me a full dmesg after a successful
boot along with the /proc/scsi/aic7xxx/ output.

>> This does not look like an interrupt latency problem.
>
>Based on which thoughts?

It really looks like a bug in the driver's round-robin code or perhaps
a difference in how many transactions we allow to be queued in the
untagged case.

Can you re-run your tests with the output directed to /dev/null for cdrom
reads and also perform some benchmarks against your disk?  The benchmarks
should operate on one device only at a time with as little I/O to any other
device during the test.
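
For such a disk or cdrom benchmark, a bare sequential-read timer along the
following lines would do (a sketch; the device path, block size and read
length are assumptions to adjust for the drive under test):

/* Rough sequential-read throughput timer (hypothetical helper). */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

int main(int argc, char **argv)
{
        const char *dev = (argc > 1) ? argv[1] : "/dev/cdrom";
        const size_t blk = 64 * 1024;                 /* 64 KB per read()   */
        const long long total = 64LL * 1024 * 1024;   /* stop after 64 MB   */
        char *buf = malloc(blk);
        long long done = 0;
        struct timeval t0, t1;
        double secs;
        int fd = open(dev, O_RDONLY);

        if (fd < 0) { perror(dev); return 1; }
        if (buf == NULL) { perror("malloc"); return 1; }

        gettimeofday(&t0, NULL);
        while (done < total) {
                ssize_t n = read(fd, buf, blk);
                if (n <= 0)                           /* EOF or error: stop */
                        break;
                done += n;
        }
        gettimeofday(&t1, NULL);

        secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        if (secs <= 0)
                secs = 1e-6;
        printf("%lld bytes in %.2f s = %.1f kB/s\n",
               done, secs, done / secs / 1000.0);
        close(fd);
        free(buf);
        return 0;
}

Running it once per controller against the same drive, with nothing else
active, would isolate the driver from xcdroast and the filesystem.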

--
Justin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-04 20:43         ` Justin T. Gibbs
@ 2001-11-05 12:18           ` Matthias Andree
  0 siblings, 0 replies; 14+ messages in thread
From: Matthias Andree @ 2001-11-05 12:18 UTC (permalink / raw)
  To: linux-kernel

On Sun, 04 Nov 2001, Justin T. Gibbs wrote:

> Isn't there a systat or vmstat equivalent under Linux that gives you
> interrupt rates?  I'll poke around tomorrow when I'm in front of a Linux
> box.

vmstat is usually available; systat/iostat and the like are not
ubiquitous, however.
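
A rough way to get those interrupt-rate figures is to sample
/proc/interrupts before and after a test interval and take the difference;
a small helper of roughly this shape would do (a sketch with hypothetical
names; only the CPU0 column is counted):

/* Samples /proc/interrupts twice and prints per-IRQ rates (hypothetical). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_IRQS 256

/* read /proc/interrupts; store the IRQ label and CPU0 count of each line */
static int snap(char names[][32], unsigned long counts[])
{
        FILE *f = fopen("/proc/interrupts", "r");
        char line[512];
        int n = 0;

        if (f == NULL)
                return -1;
        fgets(line, sizeof(line), f);           /* skip the CPU header line */
        while (n < MAX_IRQS && fgets(line, sizeof(line), f) != NULL) {
                char *colon = strchr(line, ':');
                if (colon == NULL)
                        continue;
                *colon = '\0';
                strncpy(names[n], line, 31);
                names[n][31] = '\0';
                counts[n] = strtoul(colon + 1, NULL, 10);   /* CPU0 column */
                n++;
        }
        fclose(f);
        return n;
}

int main(void)
{
        static char names[MAX_IRQS][32];
        static unsigned long before[MAX_IRQS], after[MAX_IRQS];
        int n1, n2, n, i;

        n1 = snap(names, before);
        sleep(10);                              /* run the I/O test meanwhile */
        n2 = snap(names, after);
        if (n1 < 0 || n2 < 0)
                return 1;
        n = (n2 < n1) ? n2 : n1;
        for (i = 0; i < n; i++)
                printf("IRQ%-10s %8lu interrupts/s (CPU0)\n",
                       names[i], (after[i] - before[i]) / 10);
        return 0;
}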

-- 
Matthias Andree

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Adaptec vs Symbios performance
  2001-11-03 22:53 ` Adaptec vs Symbios performance Stephan von Krawczynski
@ 2001-11-03 23:01   ` arjan
  0 siblings, 0 replies; 14+ messages in thread
From: arjan @ 2001-11-03 23:01 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

In article <200111032253.XAA20342@webserver.ithnet.com> you wrote:

> Hello Justin, hello Gerard                                            
>                                                                      
> I am currently looking for reasons for the bad behaviour of the aic7xxx driver
> in a shared-interrupt setup and its generally not-nice behaviour in a
> multi-tasking environment.
> Here is what I found in the code:                                     

>         * It would be nice to run the device queues from a           
>         * bottom half handler, but as there is no way to             
>         * dynamically register one, we'll have to postpone           
>         * that until we get integrated into the kernel.              
>         */                    

sounds like a good tasklet candidate......
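
A minimal sketch of what such a tasklet-based split of the ISR quoted
elsewhere in this thread might look like with the 2.4 tasklet API; the
bh_tasklet field in platform_data and the exact placement are assumptions,
not the actual driver change:

#include <linux/interrupt.h>    /* tasklet_init, tasklet_schedule */

/* Bottom half: run the device and completion queues outside hard-IRQ
 * context.  Uses the same helpers as the quoted ISR; bh_tasklet is a
 * hypothetical addition to ahc->platform_data. */
static void ahc_linux_bh(unsigned long data)
{
        struct ahc_softc *ahc = (struct ahc_softc *)data;
        struct ahc_cmd *acmd;
        u_long flags;

        ahc_lock(ahc, &flags);
        ahc_linux_run_device_queues(ahc);
        acmd = TAILQ_FIRST(&ahc->platform_data->completeq);
        TAILQ_INIT(&ahc->platform_data->completeq);
        ahc_unlock(ahc, &flags);
        if (acmd != NULL)
                ahc_linux_run_complete_queue(ahc, acmd);
}

/* Hard interrupt: acknowledge and service the chip, then defer the rest. */
void
ahc_linux_isr(int irq, void *dev_id, struct pt_regs *regs)
{
        struct ahc_softc *ahc = (struct ahc_softc *)dev_id;
        u_long flags;

        ahc_lock(ahc, &flags);
        ahc_intr(ahc);
        ahc_unlock(ahc, &flags);
        tasklet_schedule(&ahc->platform_data->bh_tasklet);
}

/* somewhere in controller attach:
 *   tasklet_init(&ahc->platform_data->bh_tasklet,
 *                ahc_linux_bh, (unsigned long)ahc);
 */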

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Adaptec vs Symbios performance
  2001-11-02 22:42 Google's mm problem - not reproduced on 2.4.13 Ben Smith
@ 2001-11-03 22:53 ` Stephan von Krawczynski
  2001-11-03 23:01   ` arjan
  0 siblings, 1 reply; 14+ messages in thread
From: Stephan von Krawczynski @ 2001-11-03 22:53 UTC (permalink / raw)
  To: linux-kernel; +Cc: groudier

Hello Justin, hello Gerard                                            
                                                                      
I am currently looking for reasons for the bad behaviour of the aic7xxx driver
in a shared-interrupt setup and its generally not-nice behaviour in a
multi-tasking environment.
Here is what I found in the code:                                     
                                                                      
/*                                                                    
 * SCSI controller interrupt handler.                                 
 */                                                                   
void                                                                  
ahc_linux_isr(int irq, void *dev_id, struct pt_regs * regs)           
{                                                                     
        struct ahc_softc *ahc;                                        
        struct ahc_cmd *acmd;                                         
        u_long flags;                                                 
                                                                      
        ahc = (struct ahc_softc *) dev_id;                            
        ahc_lock(ahc, &flags);                                        
        ahc_intr(ahc);                                                
        /*                                                            
         * It would be nice to run the device queues from a           
         * bottom half handler, but as there is no way to             
         * dynamically register one, we'll have to postpone           
         * that until we get integrated into the kernel.              
         */                                                           
        ahc_linux_run_device_queues(ahc);                             
        acmd = TAILQ_FIRST(&ahc->platform_data->completeq);           
        TAILQ_INIT(&ahc->platform_data->completeq);                   
        ahc_unlock(ahc, &flags);                                      
        if (acmd != NULL)                                             
                ahc_linux_run_complete_queue(ahc, acmd);              
}                                                                     
                                                                      
This is nice. I cannot read the complete code around it (it is derived
from aic7xxx_linux.c) but if I understand the naming and comments
correctly, some workload is done inside the hardware interrupt (which
shouldn't happen), which would very much match my tests showing bad
overall performance behaviour. Obviously this code is old (read the
comment) and needs reworking.
Comments?                                                             
                                                                      
Regards,                                                              
Stephan                                                               
                                                                      

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2001-11-05 12:19 UTC | newest]

Thread overview: 14+ messages
     [not found] <200111032318.fA3NIQY62745@aslan.scsiguy.com>
2001-11-04  3:50 ` Adaptec vs Symbios performance Stephan von Krawczynski
2001-11-04  5:47   ` Justin T. Gibbs
2001-11-04  5:23     ` Gérard Roudier
2001-11-04 14:17   ` Stephan von Krawczynski
2001-11-04 18:10     ` Justin T. Gibbs
2001-11-04 18:35     ` Stephan von Krawczynski
2001-11-04 16:31       ` Gérard Roudier
2001-11-04 19:13       ` Justin T. Gibbs
2001-11-04 19:56       ` Stephan von Krawczynski
2001-11-04 20:43         ` Justin T. Gibbs
2001-11-05 12:18           ` Matthias Andree
2001-11-04 19:02     ` Stephan von Krawczynski
2001-11-02 22:42 Google's mm problem - not reproduced on 2.4.13 Ben Smith
2001-11-03 22:53 ` Adaptec vs Symbios performance Stephan von Krawczynski
2001-11-03 23:01   ` arjan
