From: Eric Dumazet <eric.dumazet@gmail.com>
To: Eric B Munson <emunson@akamai.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	James Morris <jmorris@namei.org>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	Patrick McHardy <kaber@trash.net>,
	netdev@vger.kernel.org, linux-api@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Allow TCP connections to cache SYN packet for userspace inspection
Date: Fri, 01 May 2015 11:42:57 -0700
Message-ID: <1430505777.3711.135.camel@edumazet-glaptop2.roam.corp.google.com>
In-Reply-To: <1430502237-5619-1-git-send-email-emunson@akamai.com>

On Fri, 2015-05-01 at 13:43 -0400, Eric B Munson wrote:
> In order to enable policy decisions in userspace, the data contained in
> the SYN packet would be useful for tracking or identifying connections.
> Only parts of this data are available to userspace after the handshake
> is completed.  This patch exposes a new setsockopt() option that will,
> when used with a listening socket, ask the kernel to cache the skb
> holding the SYN packet for retrieval later.  The SYN skbs will not be
> saved while the kernel is in syn cookie mode.
> 
> The same option will ask the kernel for the packet headers when used
> with getsockopt() with the socket returned from accept().  The cached
> packet will only be available for the first getsockopt() call; the skb
> is consumed after the requested data is copied to userspace.  Subsequent
> calls will return -ENOENT.  Because of this behavior, getsockopt() will
> return -E2BIG if the caller supplied a buffer that is too small to hold
> the skb header.
> 
> Signed-off-by: Eric B Munson <emunson@akamai.com>
> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Cc: James Morris <jmorris@namei.org>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: netdev@vger.kernel.org
> Cc: linux-api@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---

We have a similar patch here at Google, but we do not hold one skb and
dst per saved SYN. That can be ~4KB for some drivers.

Instead, we kmalloc() only the needed part (the headers), usually less
than 128 bytes, and store the length in the first byte of this allocation.

This makes a huge difference if you want to have ~4 million request socks.
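
The layout described above (one small allocation, length in the first
byte, headers after it) could look roughly like the userspace sketch
below. It is not the Google patch: the names saved_syn_alloc(),
saved_syn_len() and saved_syn_data() are hypothetical, and malloc()
stands in for kmalloc(). Taking the figures above at face value, 4 million
entries at ~4KB each is on the order of 16GB, while 4 million entries of
at most 129 bytes is roughly 0.5GB.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy `len` bytes of packet headers into a fresh
 * buffer laid out as [len (1 byte)][headers (len bytes)].
 * Returns NULL if the length does not fit in one byte or on allocation
 * failure. The caller releases the buffer with free().
 */
static uint8_t *saved_syn_alloc(const uint8_t *hdr, size_t len)
{
	uint8_t *buf;

	assert(hdr != NULL);
	if (len == 0 || len > UINT8_MAX)
		return NULL;		/* length must fit in the first byte */

	buf = malloc(1 + len);		/* stand-in for kmalloc() */
	if (!buf)
		return NULL;

	buf[0] = (uint8_t)len;		/* length prefix */
	memcpy(buf + 1, hdr, len);	/* headers only, no skb or dst */
	return buf;
}

/* Length of the saved headers, read back from the first byte. */
static size_t saved_syn_len(const uint8_t *buf)
{
	return buf[0];
}

/* Pointer to the saved headers, just past the length byte. */
static const uint8_t *saved_syn_data(const uint8_t *buf)
{
	return buf + 1;
}
```

A one-byte prefix caps the saved headers at 255 bytes, which is enough
for the "usually less than 128 bytes" case mentioned above.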





