From mboxrd@z Thu Jan 1 00:00:00 1970
From: "David S. Miller"
Subject: Re: Tigon3 5701 PCI-X recv performance problem
Date: Wed, 8 Oct 2003 13:50:30 -0700
Sender: netdev-bounce@oss.sgi.com
Message-ID: <20031008135030.3dad33f9.davem@redhat.com>
References: <3F844578.40306@sgi.com>
	<20031008101046.376abc3b.davem@redhat.com>
	<3F8455BE.8080300@sgi.com>
	<20031008183742.GA24822@wotan.suse.de>
	<20031008122223.1ba5ac79.davem@redhat.com>
	<20031008202248.GA15611@oldwotan.suse.de>
	<20031008132402.64984528.davem@redhat.com>
	<20031008203306.GB15611@oldwotan.suse.de>
	<20031008133248.1583ddcf.davem@redhat.com>
	<20031008204618.GC15611@oldwotan.suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: ak@suse.de, modica@sgi.com, johnip@sgi.com, netdev@oss.sgi.com,
	jgarzik@pobox.com, jes@sgi.com
Return-path:
To: Andi Kleen
In-Reply-To: <20031008204618.GC15611@oldwotan.suse.de>
Errors-to: netdev-bounce@oss.sgi.com
List-Id: netdev.vger.kernel.org

On Wed, 8 Oct 2003 22:46:18 +0200
Andi Kleen wrote:

> On Wed, Oct 08, 2003 at 01:32:48PM -0700, David S. Miller wrote:
> > The page chunk allocator is meant to make it easier to put the
> > non-header parts in the frag list of the SKB, see? It means we
> > don't need to do anything special in the networking, all the
> > receive paths handle frag'd RX packets properly.
>
> Sure, but to handle the sub-allocation you need a destructor per
> fragment. (Otherwise how do you want to share a page between
> different packets?)

Aha, no you don't, that's the beauty of it.

Let's say we've packed 4 packets into a page (or 10 into 2 pages,
whatever the optimal packing is). As you attach each chunk to an SKB,
you bump the page's reference count (if a buffer straddles 2 or more
pages, you use one frag entry for each of those pages and bump each
count as appropriate).

As far as the networking is concerned, it's some page cache page or
whatever; it doesn't care. Then kfree_skb(skb) just does the right
thing by putting all the pages, and when a page's count goes to zero
it gets freed. (There's a rough sketch of this scheme below.)

> BTW I think this should all also be ifdef'ed under
> CONFIG_SLOW_UNALIGNMENT. I certainly don't want any of this on
> x86-64, where unaligned access costs only one cycle.

I agree; I don't even want this ridiculous crap on sparc64, where I
can get the unaligned trap handler down to 30 or 40 cycles, or even
less.

BTW, your highmem example is interesting, but even more interesting
are the cards that do the magic multiple-TCP-packet coalescing so
that the data parts are all page-aligned. They want infrastructure
like this.
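
For concreteness, here is a rough, hypothetical C sketch of the scheme
described above: a pool carves fixed-size chunks out of a page, each
chunk attached to an SKB takes one page reference via get_page(), and
kfree_skb() drops those references so the page is freed when its last
chunk goes away. It is written against later skb helpers
(skb_fill_page_desc(), skb_shinfo()); the rx_pool/rx_attach_chunk
names and the RX_CHUNK_SIZE value are invented for illustration, not
taken from any real driver or from the patch under discussion.

#include <linux/skbuff.h>
#include <linux/mm.h>
#include <linux/gfp.h>

#define RX_CHUNK_SIZE 1536	/* room for one MTU-sized frame */

struct rx_pool {
	struct page	*page;		/* page currently being carved up */
	unsigned int	offset;		/* next free chunk within it */
};

/*
 * Attach the next chunk of the pool's page to skb as a fragment;
 * len must be <= RX_CHUNK_SIZE and the skb must have a free frag
 * slot.  One get_page() per fragment is the whole trick: no
 * per-fragment destructor is needed, kfree_skb() ends up doing
 * put_page() on every frag page, and a page is freed once its last
 * chunk is gone.  (A chunk straddling two pages would simply take
 * one frag entry and one get_page() per page, as described above.)
 */
static int rx_attach_chunk(struct rx_pool *pool, struct sk_buff *skb,
			   unsigned int len)
{
	if (!pool->page || pool->offset + RX_CHUNK_SIZE > PAGE_SIZE) {
		if (pool->page)
			put_page(pool->page);	/* drop the pool's own ref */
		pool->page = alloc_page(GFP_ATOMIC);
		if (!pool->page)
			return -ENOMEM;
		pool->offset = 0;
	}

	get_page(pool->page);			/* the fragment's reference */
	skb_fill_page_desc(skb, skb_shinfo(skb)->nr_frags,
			   pool->page, pool->offset, len);
	skb->len      += len;
	skb->data_len += len;
	skb->truesize += RX_CHUNK_SIZE;

	pool->offset += RX_CHUNK_SIZE;
	return 0;
}

In a real driver the NIC would be handed the DMA address of each chunk
(pci_map_page() on page + offset) when the RX ring is refilled; that
part is omitted here.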