From: Eric Dumazet <eric.dumazet@gmail.com>
To: nhorman@tuxdriver.com
Cc: netdev@vger.kernel.org, davem@davemloft.net, jpirko@redhat.com
Subject: Re: [PATCH] Enhance AF_PACKET implementation to not require high order contiguous memory allocation
Date: Mon, 25 Oct 2010 22:38:26 +0200
Message-ID: <1288039106.3296.4.camel@edumazet-laptop>
In-Reply-To: <1288033566-2091-1-git-send-email-nhorman@tuxdriver.com>

On Monday, 25 October 2010 at 15:06 -0400, nhorman@tuxdriver.com wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> 
> It was shown to me recently that systems under high load were driven very deep
> into swap when tcpdump was run.  The reason this happened was because the
> AF_PACKET protocol has a SET_RINGBUFFER socket option that allows the user space
> application to specify how many entries an AF_PACKET socket will have and how
> large each entry will be.  It seems the default setting for tcpdump is to set
> the ring buffer to 32 entries of 64 KB each, which implies 32 order 5
> allocations.  That's difficult under good circumstances, and horrid under memory
> pressure.
> 
> I thought it would be good to make that a bit more usable.  I was going to do a
> simple conversion of the ring buffer from contiguous pages to iovecs, but
> unfortunately, the metadata which AF_PACKET places in these buffers can easily
> span a page boundary, and given that these buffers get mapped into user space,
> and the data layout doesn't easily allow for a change to padding between frames
> to avoid that, a simple iovec change is just going to break user space ABI
> consistency.
> 
> So instead I've done this.  This patch does the aforementioned change,
> allocating an array of pages instead of one contiguous chunk, and then vmaps the
> array into a contiguous memory space, so that it can still be accessed in the
> same way it was before.  This allows for consistent user and kernel space
> behavior for memory mapped AF_PACKET sockets, while at the same time relieving
> the memory pressure placed on a system when tcpdump defaults are used.
> 
> Tested successfully by me.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> ---
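
[Editor's sketch, not part of the original message.] The allocation order
mentioned in the quoted text depends on the page size. The helper below is a
userspace stand-in for the kernel's get_order(); the function name
block_order and the 4 KiB page size used in the usage note are assumptions
made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Smallest order such that (page_size << order) >= size.
 * Hypothetical userspace helper mirroring what the kernel's get_order()
 * computes; page_size is passed in explicitly because it is an assumption.
 */
static unsigned int block_order(size_t size, size_t page_size)
{
    unsigned int order = 0;
    size_t span = page_size;

    /* Double the span until it covers the requested size. */
    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}
```

Under the 4 KiB page assumption, block_order(64 * 1024, 4096) evaluates to 4
(16 contiguous pages per block) and block_order(128 * 1024, 4096) to 5, so
the order for a given block size should be checked against the page size of
the system in question.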

Strange, because the last time I looked at this code, libpcap was making
several attempts, reducing the page order each time until the allocation
no longer failed...

(It tries to get high order pages, maybe to reduce TLB pressure...)
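
[Editor's sketch, not part of the original message.] The retry behaviour
described above could look roughly like the loop below. This is not
libpcap's actual code: ring_try_fn, ring_setup_with_fallback and
accept_up_to are hypothetical names, and the callback stands in for
whatever call requests the ring (in a real capture program, presumably a
setsockopt() with PACKET_RX_RING on the AF_PACKET socket).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical callback: returns true if a ring with the given block size
 * could be set up. A real implementation would wrap the ring-buffer
 * setsockopt() call here.
 */
typedef bool (*ring_try_fn)(size_t block_size, void *ctx);

/*
 * Try start_block first, then halve the block size (one page order lower
 * each time) until the request succeeds or drops below min_block.
 * Returns the accepted block size, or 0 if every attempt failed.
 */
static size_t ring_setup_with_fallback(size_t start_block, size_t min_block,
                                       ring_try_fn try_ring, void *ctx)
{
    for (size_t sz = start_block; sz > 0 && sz >= min_block; sz /= 2) {
        if (try_ring(sz, ctx))
            return sz;
    }
    return 0;
}

/*
 * Demo stand-in for an allocator under memory pressure: accepts any block
 * size up to the limit pointed to by ctx.
 */
static bool accept_up_to(size_t block_size, void *ctx)
{
    const size_t *limit = ctx;
    return block_size <= *limit;
}
```

With a limit of 16384 bytes, ring_setup_with_fallback(65536, 4096,
accept_up_to, &limit) tries 65536 and 32768, then settles on 16384. Starting
high and stepping down keeps the large blocks when they are available and
degrades gracefully when they are not.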

I remember adding __GFP_NOWARN to avoid a kernel message, while tcpdump
was actually working...



