linux-kernel.vger.kernel.org archive mirror
From: "odain2@mindspring.com" <odain2@mindspring.com>
To: linux-kernel@vger.kernel.org
Subject: CONFIG_PACKET_MMAP revisited
Date: Tue, 28 Oct 2003 23:09:13 -0500	[thread overview]
Message-ID: <176730-2200310329491330@M2W026.mail2web.com> (raw)

I've been looking into faster ways to do packet captures and I stumbled on
the following discussion on the Linux Kernel mailing list:

http://www.ussg.iu.edu/hypermail/linux/kernel/0202.2/1173.html

In that discussion Jamie Lokier suggested having a memory buffer that's
shared between user and kernel space and having the NIC do DMA transfers
directly into that buffer, as an alternative to using Alexey's shared
ring buffer code.  The argument was that this would avoid the copy the
kernel does from the DMA buffer to the memory-mapped ring buffer.
However, Alan Cox pointed out that the main cost of that copy is getting
the data from system memory (where the NIC put it via DMA) into the L1
cache (DMA doesn't do any cache coherence, so the data can't land in the
cache directly).  Compared to that, the copy itself (presumably from L1
cache to L1 cache) is insignificant, and since you'll need to get the
data into L1 cache to use it anyway, the memory copy is virtually free.

I'm wondering if this takes all of the costs into account.  If I
understand how this works, the user-space application can't get at the
packet without a context switch so that the kernel can first copy the
packet to the shared buffer.  The cost of that context switch is pretty
high, and it seems to me to be the main bottleneck.  I believe that in
normal operation each packet (or, with NICs that do interrupt
coalescing, every n packets) causes an interrupt, which causes a context
switch; the kernel then copies the data from the DMA buffer to the
shared buffer and does a RETI.  That's fairly expensive.  If, on the
other hand, data could be copied directly to user-space-accessible
memory, the NIC wouldn't need to generate any interrupts and the kernel
wouldn't need to get involved at all (this assumes a NIC that can be
configured not to generate interrupts).  The user-space application
could then poll the shared buffer and process packets as fast as
possible (some synchronization mechanism is clearly needed here, but I
think some of the high-end programmable NICs could do this).  Would this
not be significantly more efficient than the current implementation?
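
Concretely, the receive loop I have in mind would look something like
this against today's ring (again just a sketch, reusing the headers from
the snippet above; the TP_STATUS_USER / TP_STATUS_KERNEL handshake is
the existing one, and I'm only assuming the NIC could fill the frames
without interrupting):

  /* Sketch: consume frames from the mapped ring with no syscall
   * per packet. */
  static void rx_loop(char *ring, const struct tpacket_req *req)
  {
      unsigned int i = 0;

      for (;;) {
          struct tpacket_hdr *hdr =
              (struct tpacket_hdr *)(ring + i * req->tp_frame_size);

          /* Frame still owned by the kernel: busy-wait.  A real
           * application might fall back to poll(2) here. */
          if (!(hdr->tp_status & TP_STATUS_USER))
              continue;

          unsigned char *pkt = (unsigned char *)hdr + hdr->tp_mac;
          /* ... process hdr->tp_len bytes of packet data at pkt ... */

          hdr->tp_status = TP_STATUS_KERNEL;   /* hand the frame back */
          i = (i + 1) % req->tp_frame_nr;
      }
  }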

Thanks,
Oliver

PS: I'm not a mailing list subscriber so CCs on responses would be
appreciated.



Thread overview: 6+ messages
2003-10-29  4:09 odain2 [this message]
2003-10-29  4:50 ` CONFIG_PACKET_MMAP revisited Jamie Lokier
2003-11-06 11:08 ` Gianni Tedesco
2003-11-06 14:13   ` Oliver Dain
2003-11-06 14:31     ` Gianni Tedesco
2003-11-06 15:29       ` P
