* Re: [PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c]
       [not found] <41DED9FA.7080106@pobox.com>
@ 2005-01-08  8:33 ` Paul Gortmaker
  2005-01-08 15:45   ` Alan Cox
From: Paul Gortmaker @ 2005-01-08  8:33 UTC
  To: Linux Kernel Mailing List; +Cc: Jeff Garzik, Alan Cox


> From: Alan Cox <alan@lxorguk.ukuu.org.uk>
>
>
>This was done because it benchmarked materially faster than the
>skb_padto version you just reverted. It's only 64 bytes on the stack and
>it's cached.
>
>i.e. the old 8390 code was quite intentional, done because short-packet
>padding commonly occurs on very old machines where clock count matters.
>Because of its commonness I actually hand-optimised this one when I did
>the original fixes, to avoid doing extra memory allocations.
>
>Summary: Please revert.
>
>Alan
>
>

Is it possible that skb_padto has since got its act together?  The reason I
ask is that I just dusted off a crusty 386dx40 (doesn't get much older
than that) with a wd8013.  As a basic test, I did ttcp Tx tests with small
packets and they came out, to all intents and purposes, identical.  The kernel
was 2.6.10, with the stack variant vs. skb_padto; each size test ran 3 times,
and I even tested packets bigger than ETH_ZLEN as a (hopefully) invariant
control.  I've attached the edited-down results below.
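
For concreteness, the two variants under test look roughly like this (a
minimal from-memory sketch of the padding logic in the drivers/net/8390.c
transmit path; names and details may differ from the exact driver code):

	/* Variant A: hand-optimised on-stack scratch area (stock 2.6.10) */
	char buf[ETH_ZLEN];
	char *data = skb->data;
	int send_length = skb->len;

	if (skb->len < ETH_ZLEN) {
		memset(buf, 0, ETH_ZLEN);	/* zeroed pad, no data leak */
		memcpy(buf, skb->data, skb->len);
		data = buf;
		send_length = ETH_ZLEN;
	}
	/* ... program the NIC Tx from (data, send_length) ... */

	/* Variant B: the generic helper (the patch under discussion) */
	if (skb->len < ETH_ZLEN) {
		skb = skb_padto(skb, ETH_ZLEN);	/* may reallocate the skb */
		if (skb == NULL)
			return 0;	/* allocation failed; packet dropped */
	}
	/* ... program the NIC Tx from (skb->data, ETH_ZLEN) ... */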

Paul.

--- Stock 2.6.10 kernel with 8390 doing the padding ---

ttcp-t: buflen=1, nbuf=32768, align=16384/0, port=5001  tcp  -> g
1: ttcp-t: 32768 bytes in 32.81 real seconds = 0.98 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 998.79
   ttcp-t: 3.5user 29.2sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw
2: ttcp-t: 32768 bytes in 32.80 real seconds = 0.98 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 998.99
   ttcp-t: 3.7user 29.0sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw
3: ttcp-t: 32768 bytes in 32.80 real seconds = 0.98 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.02, calls/sec = 999.10
   ttcp-t: 0.2user 32.5sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw

ttcp-t: buflen=10, nbuf=32768, align=16384/0, port=5001  tcp  -> g
1: ttcp-t: 327680 bytes in 32.80 real seconds = 9.76 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.02, calls/sec = 999.06
   ttcp-t: 1.3user 31.4sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw
2: ttcp-t: 327680 bytes in 32.80 real seconds = 9.76 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 999.02
   ttcp-t: 1.5user 31.2sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw
3: ttcp-t: 327680 bytes in 32.80 real seconds = 9.76 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.02, calls/sec = 999.07
   ttcp-t: 2.3user 30.3sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw

ttcp-t: buflen=40, nbuf=32768, align=16384/0, port=5001  tcp  -> g
1: ttcp-t: 1310720 bytes in 32.80 real seconds = 39.02 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 998.96
   ttcp-t: 14.2user 18.5sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw
2: ttcp-t: 1310720 bytes in 32.82 real seconds = 39.00 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 998.37
   ttcp-t: 14.6user 18.1sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+24csw
3: ttcp-t: 1310720 bytes in 32.80 real seconds = 39.02 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 998.91
   ttcp-t: 15.1user 17.5sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw

ttcp-t: buflen=100, nbuf=32768, align=16384/0, port=5001  tcp  -> g
1: ttcp-t: 3276800 bytes in 32.86 real seconds = 97.37 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 997.05
   ttcp-t: 0.1user 32.7sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw
2: ttcp-t: 3276800 bytes in 32.93 real seconds = 97.19 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 995.22
   ttcp-t: 0.1user 32.7sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+24csw
3: ttcp-t: 3276800 bytes in 32.83 real seconds = 97.47 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 998.12
   ttcp-t: 0.0user 32.7sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+22csw


--- 2.6.10 kernel with 8390 calling the kernel's skb_padto ---

ttcp-t: buflen=1, nbuf=32768, align=16384/0, port=5001  tcp  -> g
1: ttcp-t: 32768 bytes in 32.81 real seconds = 0.98 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 998.68
   ttcp-t: 1.5user 31.2sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw
2: ttcp-t: 32768 bytes in 32.79 real seconds = 0.98 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.02, calls/sec = 999.36
   ttcp-t: 4.3user 28.3sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+24csw
3: ttcp-t: 32768 bytes in 32.80 real seconds = 0.98 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 998.99
   ttcp-t: 9.3user 23.3sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+24csw

ttcp-t: buflen=10, nbuf=32768, align=16384/0, port=5001  tcp  -> g
1: ttcp-t: 327680 bytes in 32.80 real seconds = 9.76 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.02, calls/sec = 999.14
   ttcp-t: 12.5user 20.2sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw
2: ttcp-t: 327680 bytes in 32.78 real seconds = 9.76 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.02, calls/sec = 999.52
   ttcp-t: 1.2user 31.5sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw
3: ttcp-t: 327680 bytes in 32.78 real seconds = 9.76 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.02, calls/sec = 999.60
   ttcp-t: 5.6user 27.1sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+24csw

ttcp-t: buflen=40, nbuf=32768, align=16384/0, port=5001  tcp  -> g
1: ttcp-t: 1310720 bytes in 32.80 real seconds = 39.03 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.02, calls/sec = 999.09
   ttcp-t: 17.3user 15.4sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+22csw
2: ttcp-t: 1310720 bytes in 32.80 real seconds = 39.02 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 999.00
   ttcp-t: 18.2user 14.5sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+24csw
3: ttcp-t: 1310720 bytes in 32.80 real seconds = 39.03 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.02, calls/sec = 999.06
   ttcp-t: 17.8user 14.9sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+22csw

ttcp-t: buflen=100, nbuf=32768, align=16384/0, port=5001  tcp  -> g
1: ttcp-t: 3276800 bytes in 32.81 real seconds = 97.54 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 998.77
   ttcp-t: 0.0user 32.7sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw
2: ttcp-t: 3276800 bytes in 32.82 real seconds = 97.51 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 998.54
   ttcp-t: 0.0user 32.7sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw
3: ttcp-t: 3276800 bytes in 32.83 real seconds = 97.46 KB/sec +++
   ttcp-t: 32768 I/O calls, msec/call = 1.03, calls/sec = 997.99
   ttcp-t: 0.0user 32.7sys 0:32real 99% 0i+0d 0maxrss 0+0pf 0+23csw




* Re: [PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c]
  2005-01-08  8:33 ` [PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c] Paul Gortmaker
@ 2005-01-08 15:45   ` Alan Cox
  2005-01-10  2:41     ` [PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c Paul Gortmaker
From: Alan Cox @ 2005-01-08 15:45 UTC
  To: Paul Gortmaker; +Cc: Linux Kernel Mailing List, Jeff Garzik

On Sad, 2005-01-08 at 08:33, Paul Gortmaker wrote:
> Is it possible that skb_padto has since got its act together?  The reason I
> ask is that I just dusted off a crusty 386dx40 (doesn't get much older

Could be - kmalloc has probably improved, but the skbuffs have got far
more complex.

> than that) with a wd8013.  As a basic test, I did ttcp Tx tests with small
> packets and they came out, to all intents and purposes, identical.  The kernel
> was 2.6.10, with the stack variant vs. skb_padto; each size test ran 3 times,
> and I even tested packets bigger than ETH_ZLEN as a (hopefully) invariant
> control.  I've attached the edited-down results below.

What are you testing?  I don't see the relationship between network
throughput and efficiency on this device.

Drop it on a Pentium or late 486 and use the TSC to compare the two code
paths. One is much, much more efficient.
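
For reference on where the skb_padto cost would go: the 2.6.10-era helpers
are roughly as below (a from-memory sketch; the authoritative code lives in
include/linux/skbuff.h and net/core/skbuff.c). Only the cloned/no-tailroom
case pays for a fresh GFP_ATOMIC allocation:

	/* inline fast path: full-size packets cost a single compare */
	static inline struct sk_buff *skb_padto(struct sk_buff *skb,
						unsigned int len)
	{
		unsigned int size = skb->len;

		if (likely(size >= len))
			return skb;
		return skb_pad(skb, len - size);
	}

	/* slow path: memset in place when there is tailroom,
	   otherwise copy into a bigger skb and free the old one */
	struct sk_buff *skb_pad(struct sk_buff *skb, int pad)
	{
		struct sk_buff *nskb;

		if (!skb_cloned(skb) && skb_tailroom(skb) >= pad) {
			memset(skb->data + skb->len, 0, pad);
			return skb;
		}

		nskb = skb_copy_expand(skb, skb_headroom(skb),
				       skb_tailroom(skb) + pad, GFP_ATOMIC);
		kfree_skb(skb);
		if (nskb)
			memset(nskb->data + nskb->len, 0, pad);
		return nskb;
	}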



* Re: [PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c
  2005-01-08 15:45   ` Alan Cox
@ 2005-01-10  2:41     ` Paul Gortmaker
  2005-01-10 18:28       ` Alan Cox
From: Paul Gortmaker @ 2005-01-10  2:41 UTC
  To: Alan Cox; +Cc: Linux Kernel Mailing List, Jeff Garzik

Alan Cox wrote:

>On Sad, 2005-01-08 at 08:33, Paul Gortmaker wrote:
>  
>
>>Is it possible that skb_padto has since got its act together?  The reason I
>>ask is that I just dusted off a crusty 386dx40 (doesn't get much older
>>
>
>Could be - kmalloc has probably improved, but the skbuffs have got far
>more complex.
>
>
>>than that) with a wd8013.  As a basic test, I did ttcp Tx tests with small
>>packets and they came out, to all intents and purposes, identical.  The kernel
>>was 2.6.10, with the stack variant vs. skb_padto; each size test ran 3 times,
>>and I even tested packets bigger than ETH_ZLEN as a (hopefully) invariant
>>control.  I've attached the edited-down results below.
>>
>
>What are you testing?  I don't see the relationship between network
>throughput and efficiency on this device.

I was thinking that for a sufficiently slow CPU running at 100% (hence the
lowly 386), the throughput would be influenced by how efficiently we can
get in, get the Tx set up, and get out.

>Drop it on a Pentium or late 486 and use the TSC to compare the two code
>paths. One is much, much more efficient.

Using rdtscl over the area affected by the patch on the two variants, for a
sample of 234 small packets, I see an average of 4141 cycles using the
existing stack scratch area, and 4162 using skb_padto.  That is a
difference of about 0.5%, which is significantly less than the typical
spread in the samples themselves.  To help give a relevant scale, feeding
it a larger 1400-byte packet comes in at around 60,000 cycles on this
particular box.  Am I being optimistic to see this as good news -- meaning
that there is no longer a need for driver-specific padto implementations,
whereas it looks like there was back when you did your tests?
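
(Roughly the sort of instrumentation used -- a minimal sketch assuming the
i386 rdtscl() macro from <asm/msr.h>; the actual probe points may have
differed:

	unsigned long t0, t1;

	rdtscl(t0);
	/* ... the area affected by the patch: pad + Tx setup ... */
	rdtscl(t1);
	printk(KERN_DEBUG "8390 tx setup: %lu cycles\n", t1 - t0);

Averaging t1 - t0 over a run of small packets gives per-packet cycle
counts like those quoted above.)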

Paul.






* Re: [PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c
  2005-01-10  2:41     ` [PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c Paul Gortmaker
@ 2005-01-10 18:28       ` Alan Cox
  2005-02-19  5:20         ` Jeff Garzik
From: Alan Cox @ 2005-01-10 18:28 UTC
  To: Paul Gortmaker; +Cc: Linux Kernel Mailing List, Jeff Garzik

On Llu, 2005-01-10 at 02:41, Paul Gortmaker wrote:
> Using rdtscl over the area affected by the patch on the two variants, for a
> sample of 234 small packets, I see an average of 4141 cycles using the
> existing stack scratch area, and 4162 using skb_padto.  That is a
> difference of about 0.5%, which is significantly less than the typical
> spread in the samples themselves.  To help give a relevant scale, feeding
> it a larger 1400-byte packet comes in at around 60,000 cycles on this
> particular box.  Am I being optimistic to see this as good news -- meaning
> that there is no longer a need for driver-specific padto implementations,
> whereas it looks like there was back when you did your tests?

It means that padto has improved a lot (or the underlying allocators have).
It also still means the patch makes the code slower and introduces
changes that have no benefit into the kernel, so while it's good to see
it's not relevant, it's still a pointless change.



* Re: [PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c
  2005-01-10 18:28       ` Alan Cox
@ 2005-02-19  5:20         ` Jeff Garzik
  2005-02-19 14:06           ` Alan Cox
From: Jeff Garzik @ 2005-02-19  5:20 UTC
  To: Alan Cox; +Cc: Paul Gortmaker, Linux Kernel Mailing List

Alan Cox wrote:
> On Llu, 2005-01-10 at 02:41, Paul Gortmaker wrote:
> 
>>Using rdtscl over the area affected by the patch on the two variants, for a
>>sample of 234 small packets, I see an average of 4141 cycles using the
>>existing stack scratch area, and 4162 using skb_padto.  That is a
>>difference of about 0.5%, which is significantly less than the typical
>>spread in the samples themselves.  To help give a relevant scale, feeding
>>it a larger 1400-byte packet comes in at around 60,000 cycles on this
>>particular box.  Am I being optimistic to see this as good news -- meaning
>>that there is no longer a need for driver-specific padto implementations,
>>whereas it looks like there was back when you did your tests?
> 
> 
> It means that padto has improved a lot (or the underlying allocators have).
> It also still means the patch makes the code slower and introduces
> changes that have no benefit into the kernel, so while it's good to see
> it's not relevant, it's still a pointless change.

So the verdict is to revert?

	Jeff





* Re: [PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c
  2005-02-19  5:20         ` Jeff Garzik
@ 2005-02-19 14:06           ` Alan Cox
From: Alan Cox @ 2005-02-19 14:06 UTC
  To: Jeff Garzik; +Cc: Paul Gortmaker, Linux Kernel Mailing List

On Sad, 2005-02-19 at 05:20, Jeff Garzik wrote:
> > It means that padto has improved a lot (or the underlying allocators have).
> > It also still means the patch makes the code slower and introduces
> > changes that have no benefit into the kernel, so while it's good to see
> > it's not relevant, it's still a pointless change.
> 
> So the verdict is to revert?

Not sure. The old code is known to work and is a fraction faster; the new
code makes the driver use the same logic as the others. Having seen the
numbers from Paul, I personally don't care either way.



* Re: [PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c
       [not found] <200501070309.j0739IG6007753@hera.kernel.org>
@ 2005-01-07 15:46 ` Alan Cox
From: Alan Cox @ 2005-01-07 15:46 UTC
  To: Linux Kernel Mailing List; +Cc: jgarzik

On Llu, 2004-12-27 at 20:18, Linux Kernel Mailing List wrote:
> ChangeSet 1.1938.455.1, 2004/12/27 15:18:56-05:00, penguin@muskoka.com
> 
> 	[PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c
> 	
> 	The 8390 driver had been fixed for leaking information in short packets
> 	prior to skb_padto() existing.  This change gets rid of the scratch area on
> 	the stack and makes it use skb_padto().  Thanks to Mark Smith for bringing
> 	this to my attention.
> 	
> 	Signed-off-by: Paul Gortmaker <p_gortmaker@yahoo.com>
> 	Signed-off-by: Jeff Garzik <jgarzik@pobox.com>

This was done because it benchmarked materially faster than the
skb_padto version you just reverted. It's only 64 bytes on the stack and
it's cached.

i.e. the old 8390 code was quite intentional, done because short-packet
padding commonly occurs on very old machines where clock count matters.
Because of its commonness I actually hand-optimised this one when I did
the original fixes, to avoid doing extra memory allocations.

Summary: Please revert.

Alan


