archive mirror
 help / color / mirror / Atom feed
From: Alan Cox <>
To: David griego <>
	Linux Kernel Mailing List <>
Subject: Re: Alan Shih: "TCP IP Offloading Interface"
Date: 14 Jul 2003 21:34:08 +0100	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

On Llu, 2003-07-14 at 21:19, David griego wrote:
> Embedded does not simply include toasters and fridges, it also includes NAS 
> and SAN appliances as well as telco gear.  These types of devices have 

I have some sympathy with SAN developers, because you can treat the link
as dedicated and just for the SAN so you can implement iSCSI optimised
code. The right engineering approach probably consists of removing iSCSI
and doing the job right but I appreciate there are non engineering

> advanced memory subsystems and run processors such as PPC and ARM.  One of 
> the most limiting factors in these types of devices is power consumption.  

Memory bandwidth is your killer quite often, you have to keep the CPU
away from the data and often even off the memory bus holding most of the

> >deeply embedded stuff also doesn't run with MMUs or runs 'partially
> >trusted' most of the VM games and the socket api games also go away.
> See PPC and ARM architecture for the use of MMUs in embedded systems

You still have the partial trust stuff and the ability to violate the
socket api in the interest of speed - that helps.

> >TCP/IP is an exercise in two things when you are running at speed
> >
> >1.	Finding the memory bandwidth - ToE doesn't help, checksums do,
> >	sg does, on card target buffers do with decent chipsets.
> A TOE enabled with RDDP would help eliminate the kernel to user space copy 
> (and in the case of SAMBA the copy back to the kernel).  This would reduce 
> the memory system loading by a third to a half.

Thats mostly an API issue. Socket API found non optimal after 20 years.
Hardly shocking news - the cluster people already changed API, Larry
McVoy proposed stuff like splice years ago because he saw it coming.

> >2.	Handling in order perfectly predicted data streams. ToE is
> >	overkill for this. Thats about latency to memory and touching
> >	as little as possible. The main CPU has a rather good connection
> >	to main memory.
> >
> Yes, RDDP would be nice to have though for the reason stated for #1, so the 
> hardware would need to at least be TCP aware.

TCP aware hardware is good, segmentation, prediction, buffer/sequence
matchers etc but you don't want the policy parts in silicon

> >repeatedly been screwed by attackers hitting software or other slow
> >paths in the device to attack it.
> The use of ASICs could ensure that TCP processing is as quick as wire speed

Only if your ASIC worst case is wire speed. If your ASIC has one path it
must drop to software for and has a low powered internal CPU to fix up,
you just lost and you'll plow like a cisco with too many filter rules to
do in silicon.

> Try to keep the datapath processing on the TOE, and everything else in the 
> OS.  Also give the API the ability to turn of the TOE if a hole exists and 
> use it like a regular NIC.

Some of this is certainly about how you partition the load - its no
different to things like RAID. We've had hardware raid, software raid,
and finally people are popping up with neat hybrids. We've had software
audio, hardware audio and again now we are seeing hybrids that do
different bits in hardware and software - ditto modems.

> >The internet land speed record is held by a non ToE system, let me know
> >when that changes.
> >
> Layer one network processing is often handled by ASICS, also some of the 
> fastest encryption engines are hardware.  I suggest we don't wait until your 
> proven wrong before making a decision on TOE.

You don't have to. You can go build and test and maintain a set of TOE patches.
Nobody is stopping you. Lots of Linux code exists because someone decided that
the official story was wrong and proved it so.


  parent reply	other threads:[~2003-07-14 20:25 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-07-14 20:19 Alan Shih: "TCP IP Offloading Interface" David griego
2003-07-14 20:31 ` Alan Shih
2003-07-16 22:28   ` Error compiling, scsi 2.6.0-test1 backblue
2003-07-16 23:09     ` Rahul Karnik
2003-07-17 11:51     ` Alan Cox
2003-07-17 18:59       ` backblue
2003-07-18 16:14         ` Joshua Schmidlkofer
2003-07-18 17:00         ` Steven Cole
2003-07-24  7:51           ` Doug Ledford
2003-07-14 20:34 ` Alan Cox [this message]
2003-07-14 21:53   ` Alan Shih: "TCP IP Offloading Interface" Deepak Saxena
2003-07-17 22:31     ` Bill Davidsen
     [not found] <>
2003-07-15 11:18 ` Alan Cox
  -- strict thread matches above, loose matches on Subject: below --
2003-07-14 21:51 David griego
2003-07-14 20:29 David griego
2003-07-14 20:23 David griego
2003-07-14 19:43 David griego
2003-07-14 20:03 ` Jeff Garzik
2003-07-14 20:23   ` Alan Cox
2003-07-14 20:05 ` Alan Cox
2003-07-14 20:30   ` Shawn
2003-07-15  5:58   ` Werner Almesberger
2003-07-14 19:14 David griego
2003-07-14 19:26 ` Jeff Garzik
2003-07-15 12:42   ` Jesse Pollard
2003-07-14 19:46 ` Alan Cox
2003-07-14 18:46 David griego
2003-07-14 19:02 ` Jeff Garzik
2003-07-14 21:22   ` Deepak Saxena
2003-07-14 21:45     ` Jeff Garzik
2003-07-15  5:27     ` Werner Almesberger
2003-07-14 19:42 ` Alan Cox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).