linux-kernel.vger.kernel.org archive mirror
* Re: Alan Shih: "TCP IP Offloading Interface"
@ 2003-07-14 18:46 David griego
  2003-07-14 19:02 ` Jeff Garzik
  2003-07-14 19:42 ` Alan Cox
  0 siblings, 2 replies; 25+ messages in thread
From: David griego @ 2003-07-14 18:46 UTC (permalink / raw)
  To: alan; +Cc: linux-kernel, dagriego

IMHO, there are several cases for some type of TCP/IP offload.  One is for
embedded systems that are just not capable of doing 1Gbps+.  Another is with
10GbE, where even high-end servers will not be able to keep up with TCP
processing/data movement at these speeds.  Not being proactive in adopting
TCP/IP offload will force Linux into accepting some scheme that will not
necessarily be the best.


>Alan Shih wrote:
>>Has anyone worked on a standard interface between TOE and Linux? (ie. 
>>something like Trapeze/Myrinet's GMS?)
>>
>>Or TOE is a forbidden discussion? Any effort in making Linux the OS for 
>>TOE at all even though Linux is a little too heavy for it?
>
>
>
>I do not foresee there _ever_ being a TOE interface for Linux.
>
>
>It's not a forbidden discussion, but, the networking developers tend to
>ignore people who mention TOE because it's been discussed to death, and
>no evidence has ever been presented to prove it has advantages where it
>matters, and it has significant _dis_advantages from the get-go.
>
>
>I really should write an LKML FAQ entry for TOE.
>
>
>        Jeff

* Re: Alan Shih: "TCP IP Offloading Interface"
@ 2003-07-14 19:14 David griego
  2003-07-14 19:26 ` Jeff Garzik
  2003-07-14 19:46 ` Alan Cox
  0 siblings, 2 replies; 25+ messages in thread
From: David griego @ 2003-07-14 19:14 UTC (permalink / raw)
  To: jgarzik; +Cc: alan, linux-kernel

How does one measure the reliability and security of current software TCP/IP
stacks?  Some standard set of tests would have to be identified, and the TOEs
would need to be tested against it to ensure that they meet some minimum
standard.  I would suggest offloading the minimum amount from the OS so that
most of the control could be maintained by the OS stack.  This would also
make failover/routing changes between TOE-TOE and TOE-NIC easier.  Current
offloads such as checksum and segmentation will not be enough for 10GbE
processing, so it would have to be something more than we have today.
David


>From: Jeff Garzik <jgarzik@pobox.com>
>To: David griego <dagriego@hotmail.com>
>CC: alan@storlinksemi.com,  linux-kernel@vger.kernel.org
>Subject: Re: Alan Shih: "TCP IP Offloading Interface"
>Date: Mon, 14 Jul 2003 15:02:35 -0400
>
>David griego wrote:
>>IMHO, there are several cases for some type of TCP/IP offload.  One is for
>>embedded systems that are just not capable of doing 1Gbps+.  Another is
>>with 10GbE, even high end servers will not be able to keep up with TCP
>>processing/data movement at these speeds.  Not being proactive in adopting
>>TCP/IP offload will force Linux into accepting some scheme that will not
>>necessarily be best.
>
>
>How does one evaluate a TOE stack to be sure that all the security fixes in 
>Linux are also in that stack?
>
>How does one evaluate a TOE stack to be sure it doesn't add new security 
>holes that Linux never had?
>
>	Jeff
>
>
>

* Re: Alan Shih: "TCP IP Offloading Interface"
@ 2003-07-14 19:43 David griego
  2003-07-14 20:03 ` Jeff Garzik
  2003-07-14 20:05 ` Alan Cox
  0 siblings, 2 replies; 25+ messages in thread
From: David griego @ 2003-07-14 19:43 UTC (permalink / raw)
  To: jgarzik; +Cc: alan, linux-kernel

>Jeff Garzik wrote:
>Anything beyond basic host-only TOE adds massive complexity for very little 
>gain:  interfacing netfilter and routing code with a black box we _hope_ 
>will act properly sounds like suicide.
Keep most of this on the host, and offload only the performance path, as the
Alacritech TOE does.

>All this is vague handwaving without supporting evidence.  So far we get 
>stuff like Internet2 speed records _without_ TOE.  And Linux currently 
>supports 10gige...  and hosts are just going to keep getting faster and 
>faster.

Intel Clusters and Network Storage Volume Platforms Lab reported that it
takes about 1MHz to process 1Mbps on a PIII.  Using this rule of thumb (they
showed it scaling from 400MHz to 800MHz) it would take 10GHz to process
10Mbps.  Well, you might say "what about multi-processors?"  This would be
good for people that have multi-processors, but there is a large segment of
embedded processors that are not going to have SMP, or be at 10GHz, anytime
soon.  Besides that, interrupt processing does not scale linearly across
MPs.  The truth is that communication speeds are outpacing processor
speeds at this time.
David

>
>	Jeff
>
>
>

* Re: Alan Shih: "TCP IP Offloading Interface"
@ 2003-07-14 20:19 David griego
  2003-07-14 20:31 ` Alan Shih
  2003-07-14 20:34 ` Alan Cox
  0 siblings, 2 replies; 25+ messages in thread
From: David griego @ 2003-07-14 20:19 UTC (permalink / raw)
  To: alan; +Cc: alan, linux-kernel

Embedded does not simply include toasters and fridges; it also includes NAS
and SAN appliances as well as telco gear.  These types of devices have
advanced memory subsystems and run processors such as PPC and ARM.  One of
the most limiting factors in these types of devices is power consumption.
This usually limits the number of cores and the frequency of these cores.
Offloading the processing of protocol stacks to ASICs would have a great
impact on performance.  If you are going to embed a high-frequency chip in
your embedded devices, I would recommend developing a heater, not a fridge.


>From: Alan Cox <alan@lxorguk.ukuu.org.uk>
>To: David griego <dagriego@hotmail.com>
>CC: alan@storlinksemi.com,   Linux Kernel Mailing List 
><linux-kernel@vger.kernel.org>
>Subject: Re: Alan Shih: "TCP IP Offloading Interface"
>Date: 14 Jul 2003 20:42:53 +0100
>
>On Mon, 2003-07-14 at 19:46, David griego wrote:
> > IMHO, there are several cases for some type of TCP/IP offload.  One is for
> > embedded systems that are just not capable of doing 1Gbps+.  Another is with
>
>My fridge doesn't need to do 10Gbit a second, and for most other
>embedded the constraints are ram bandwidth and nothing else. Since
>deeply embedded stuff also doesn't run with MMUs or runs 'partially
>trusted' most of the VM games and the socket api games also go away.

See the PPC and ARM architectures for the use of MMUs in embedded systems.

>
>I've done deeply embedded tcp/ip. I don't buy the argument, embedded
>gains the least of all from ToE.
>
> > 10GbE, even high end servers will not be able to keep up with TCP
> > processing/data movement at these speeds.  Not being proactive in adopting
>
>They said that about 10Mbit until Van showed them a thing or two. They
>said it about 100Mbit, they said it about gigabit.

Not the case for embedded.  I understand your viewpoint from the server 
space though.
>
> > TCP/IP offload will force Linux into accepting some scheme that will not
> > necessarily be best.
>
>TCP/IP is an exercise in two things when you are running at speed
>
>1.	Finding the memory bandwidth - ToE doesn't help, checksums do,
>	sg does, on card target buffers do with decent chipsets.

A TOE enabled with RDDP would help eliminate the kernel to user space copy 
(and in the case of SAMBA the copy back to the kernel).  This would reduce 
the memory system loading by a third to a half.
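
One way to see where "a third to a half" could come from, as a rough sketch.  The accounting is an assumption, not a measurement: every pass over the payload (a DMA transfer, a CPU copy, the application reading the data) counts as one unit of memory-system load, which understates a CPU copy since that both reads and writes, and caches are ignored.  The step lists are illustrative.

```python
from fractions import Fraction

# Passes over the payload on a plain receive path (assumed, simplified).
PLAIN_RECEIVER = [
    "NIC DMA into kernel buffer",
    "kernel-to-user copy",
    "application reads data",
]

# Passes for a file server such as SAMBA writing received data to disk.
SAMBA_WRITE = [
    "NIC DMA into kernel buffer",
    "kernel-to-user copy",
    "user-to-kernel copy",
    "DMA out to disk",
]


def saving(passes: list[str]) -> Fraction:
    """Fraction of passes removed if direct placement eliminates every copy."""
    copies = [p for p in passes if p.endswith("copy")]
    return Fraction(len(copies), len(passes))


if __name__ == "__main__":
    print("plain receiver saving:", saving(PLAIN_RECEIVER))
    print("SAMBA write saving:   ", saving(SAMBA_WRITE))
```

Under this counting the plain receiver loses one pass in three and the SAMBA path two in four; a finer model that charges a copy for both its read and its write would give larger savings.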
>
>2.	Handling in order perfectly predicted data streams. ToE is
>	overkill for this. Thats about latency to memory and touching
>	as little as possible. The main CPU has a rather good connection
>	to main memory.
>
Yes, RDDP would be nice to have though for the reason stated for #1, so the 
hardware would need to at least be TCP aware.

>ToE is also horribly vulnerable to attack because putting it on a card
>dictates relatively low CPU power and low power consumption as well as
>rather nasty pricing issues. Historically low power devices have
>repeatedly been screwed by attackers hitting software or other slow
>paths in the device to attack it.
The use of ASICs could ensure that TCP processing runs at wire speed.

>
>This is before we get into the delights of multipath routing across
>different vendors cards, firewalling, traffic shaping, retrofitting new
>features, questions about what happens with an old ToE card when its
>got a hole...
Try to keep the datapath processing on the TOE, and everything else in the
OS.  Also give the API the ability to turn off the TOE if a hole exists and
use the card like a regular NIC.
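
A minimal sketch of what such a switch might look like.  Everything here is hypothetical (the class, the method names and the "toe0" device name are made up for illustration and are not an existing kernel or driver API); it only shows the idea that established-connection data may go through the card while everything else, or everything once offload is disabled, stays with the host stack.

```python
from dataclasses import dataclass
from enum import Enum


class DataPath(Enum):
    OFFLOAD = "offload"  # data for the connection is handled on the card
    HOST = "host"        # card acts as a regular NIC, host stack does TCP


@dataclass
class ToeDevice:
    """Hypothetical handle for a TOE-capable card (illustration only)."""

    name: str
    offload_enabled: bool = True

    def disable_offload(self) -> None:
        # For example after a hole is found in the card's TCP engine.
        self.offload_enabled = False

    def path_for(self, established: bool) -> DataPath:
        # Only established-connection data is a candidate for offload;
        # connection setup and all other traffic stay in the host stack.
        if self.offload_enabled and established:
            return DataPath.OFFLOAD
        return DataPath.HOST


if __name__ == "__main__":
    dev = ToeDevice("toe0")
    print(dev.path_for(established=True))
    dev.disable_offload()
    print(dev.path_for(established=True))
```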
>
>The internet land speed record is held by a non ToE system, let me know
>when that changes.
>
Layer one network processing is often handled by ASICs, and some of the
fastest encryption engines are hardware.  I suggest we don't wait until
you're proven wrong before making a decision on TOE.

* Re: Alan Shih: "TCP IP Offloading Interface"
@ 2003-07-14 20:23 David griego
  0 siblings, 0 replies; 25+ messages in thread
From: David griego @ 2003-07-14 20:23 UTC (permalink / raw)
  To: jgarzik; +Cc: alan, linux-kernel




>From: Jeff Garzik <jgarzik@pobox.com>
>To: David griego <dagriego@hotmail.com>
>CC: alan@storlinksemi.com,  linux-kernel@vger.kernel.org
>Subject: Re: Alan Shih: "TCP IP Offloading Interface"
>Date: Mon, 14 Jul 2003 16:03:01 -0400
>
>David griego wrote:
>>Intel Clusters and Network Storage Volume Platforms Lab reported that it 
>>takes about 1MHz to process 1Mbps on a PIII.  Using this rule of thumb 
>>(they showed it scaling from 400MHz to 800MHz) it would take 10GHz to 
>>process 10Mbps.  Well you might say "what about multi-processors?"  This
>
>Um.  It doesn't take nearly 10Ghz to handle 10Mbps, or even 100Mbps.
Err.  Make that 10GHz for 10Gbps :-)
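
A quick sanity check of that rule of thumb, as a minimal sketch.  It assumes the 1MHz-per-1Mbps figure scales linearly well beyond the 400MHz to 800MHz range that was measured, which is an extrapolation; the function name and the list of link speeds are only illustrative.

```python
def required_mhz(link_mbps: float, mhz_per_mbps: float = 1.0) -> float:
    """CPU clock in MHz consumed by TCP processing, assuming linear scaling."""
    return link_mbps * mhz_per_mbps


if __name__ == "__main__":
    # Link speeds in Mbps: 10Mbit, 100Mbit, gigabit and 10GbE.
    for mbps in (10, 100, 1_000, 10_000):
        ghz = required_mhz(mbps) / 1_000
        print(f"{mbps:>6} Mbps -> about {ghz:g} GHz of PIII-class CPU")
```

Under that assumption 10Mbps costs only about 10MHz, and it is 10GbE that lands at 10GHz.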
>
>
>>would be good for people that have multi-processors, but there is a large
>>segment of embedded processors that are not going to have SMP, or be at 10GHz
>>anytime soon.  Besides that processing interrupts does not scale across
>>MPs linearly.  The truth is that communication speeds are outpacing
>>processor speeds at this time.
>
>If the host CPU is a bottleneck after large-send and checksums have been 
>offloaded, then logically you aren't getting any work done _anyway_. You 
>have to interface with the net stack at some point, in which case you incur 
>a fixed cost, for socket handling, TCP exception handling, etc.
There is still other processing going on, such as RAID, NFS, or CIFS.
>
>Maybe somebody needs to be looking into AMP (asymmetric multiprocessing), 
>too.
There is a nice article on AMP for ATM.  I'll try to find a pointer.
>
>	Jeff
>
>
>

* Re: Alan Shih: "TCP IP Offloading Interface"
@ 2003-07-14 20:29 David griego
  0 siblings, 0 replies; 25+ messages in thread
From: David griego @ 2003-07-14 20:29 UTC (permalink / raw)
  To: alan; +Cc: jgarzik, alan, linux-kernel

>From: Alan Cox <alan@lxorguk.ukuu.org.uk>
>To: David griego <dagriego@hotmail.com>
>CC: jgarzik@pobox.com, alan@storlinksemi.com,   Linux Kernel Mailing List 
><linux-kernel@vger.kernel.org>
>Subject: Re: Alan Shih: "TCP IP Offloading Interface"
>Date: 14 Jul 2003 21:05:53 +0100
>
>On Mon, 2003-07-14 at 20:43, David griego wrote:
> > Intel Clusters and Network Storage Volume Platforms Lab reported that it
> > takes about 1MHz to process 1Mbps on a PIII.  Using this rule of thumb (they
>
>1MHz to process 1Mbit doing what - file I/O to and from disk, web serving
>- because ToE or otherwise I still have to process the data I receive
>and do something useful with it unless I'm just a router, firewall or
>load balancer. If you want to argue about using gate arrays and hardware
>to accelerate IP routing, balancing and firewall filter cams then you
>might get somewhere - but they dont need to talk TCP.
This was stream testing with nTTCP, so no other I/O work was being done.
Freedom from TCP processing would allow for other tasks such as RAID and
storage virtualization.
>
>Also if it's 1MHz per 1Mbit worst case and your ToE engine isn't entirely
>hardware paths capable of sustaining 10Gbit/sec, what happens when I hit
>you with 10Gbit of carefully chosen non optimal frames ?
I'll let the hardware teams figure that out.  From my understanding it is
going to be done.

* Re: Alan Shih: "TCP IP Offloading Interface"
@ 2003-07-14 21:51 David griego
  0 siblings, 0 replies; 25+ messages in thread
From: David griego @ 2003-07-14 21:51 UTC (permalink / raw)
  To: alan; +Cc: alan, linux-kernel

>From: Alan Cox <alan@lxorguk.ukuu.org.uk>
> > Layer one network processing is often handled by ASICs, also some of the
> > fastest encryption engines are hardware.  I suggest we don't wait until
> > you're proven wrong before making a decision on TOE.
>
>You don't have to. You can go build and test and maintain a set of TOE patches.
>Nobody is stopping you. Lots of Linux code exists because someone decided that
>the official story was wrong and proved it so.
>
Our team has done this twice already for Linux (one TOE in software, one as an
ASIC).  It can be hard to make decisions on tradeoffs when the general
consensus in Linux is to not support TOE.  I'm sure that once everything is
said and done we will provide a driver for our TOE to the community.
Support from other OS vendors has been better, and feedback from them will
definitely influence our hardware design.
>
>Alan
>

[parent not found: <Sea2-F66GGORm1u51rM00012573@hotmail.com>]

end of thread, other threads:[~2003-07-17 22:24 UTC | newest]

Thread overview: 25+ messages
2003-07-14 18:46 Alan Shih: "TCP IP Offloading Interface" David griego
2003-07-14 19:02 ` Jeff Garzik
2003-07-14 21:22   ` Deepak Saxena
2003-07-14 21:45     ` Jeff Garzik
2003-07-15  5:27     ` Werner Almesberger
2003-07-14 19:42 ` Alan Cox
2003-07-14 19:14 David griego
2003-07-14 19:26 ` Jeff Garzik
2003-07-15 12:42   ` Jesse Pollard
2003-07-14 19:46 ` Alan Cox
2003-07-14 19:43 David griego
2003-07-14 20:03 ` Jeff Garzik
2003-07-14 20:23   ` Alan Cox
2003-07-14 20:05 ` Alan Cox
2003-07-14 20:30   ` Shawn
2003-07-15  5:58   ` Werner Almesberger
2003-07-14 20:19 David griego
2003-07-14 20:31 ` Alan Shih
2003-07-14 20:34 ` Alan Cox
2003-07-14 21:53   ` Deepak Saxena
2003-07-17 22:31     ` Bill Davidsen
2003-07-14 20:23 David griego
2003-07-14 20:29 David griego
2003-07-14 21:51 David griego
     [not found] <Sea2-F66GGORm1u51rM00012573@hotmail.com>
2003-07-15 11:18 ` Alan Cox
