netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Craig Gallek <kraigatgoog@gmail.com>
Cc: netdev@vger.kernel.org, David Miller <davem@davemloft.net>
Subject: Re: [PATCH net-next 3/4] soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF
Date: Thu, 24 Dec 2015 09:36:40 -0700	[thread overview]
Message-ID: <20151224163639.GA90724@ast-mbp.thefacebook.com> (raw)
In-Reply-To: <1450814710-17850-4-git-send-email-kraigatgoog@gmail.com>

On Tue, Dec 22, 2015 at 03:05:09PM -0500, Craig Gallek wrote:
> From: Craig Gallek <kraig@google.com>
> 
> Expose socket options for setting a classic or extended BPF program
> for use when selecting sockets in an SO_REUSEPORT group.  These options
> can be used on the first socket to belong to a group before bind or
> on any socket in the group after bind.
> 
> This change includes refactoring of the existing sk_filter code to
> allow reuse of the existing BPF filter validation checks.
> 
> Signed-off-by: Craig Gallek <kraig@google.com>

interesting stuff.

> +static struct sock *run_bpf(struct sock_reuseport *reuse, u16 socks,
> +			    struct bpf_prog *prog, struct sk_buff *skb,
> +			    int hdr_len)
> +{
> +	struct sk_buff *nskb = NULL;
> +	u32 index;
> +
> +	if (skb_shared(skb)) {
> +		nskb = skb_clone(skb, GFP_ATOMIC);
> +		if (!nskb)
> +			return NULL;
> +		skb = nskb;
> +	}

what is the typical case here skb_shared or not?

> +	/* temporarily advance data past protocol header */
> +	if (skb_headlen(skb) < hdr_len || !skb_pull_inline(skb, hdr_len)) {

though bpf core will read just fine past linear part of the packet,
here we're limiting this feature only to packets where udp header is
part of headlen. Will it make it somewhat unreliable?
May be we can avoid doing this pull/push and use negative load
instructions with SKF_NET_OFF ? Something like:
load_word(skb, SKF_NET_OFF + sizeof(struct udphdr)));

>  /**
>   *  reuseport_select_sock - Select a socket from an SO_REUSEPORT group.
>   *  @sk: First socket in the group.
> - *  @hash: Use this hash to select.
> + *  @hash: When no BPF filter is available, use this hash to select.
> + *  @skb: skb to run through BPF filter.
> + *  @hdr_len: BPF filter expects skb data pointer at payload data.  If
> + *    the skb does not yet point at the payload, this parameter represents
> + *    how far the pointer needs to advance to reach the payload.

what is the use case of this?
Do you expect programs to be stateful?

> +				sk2 = reuseport_select_sock(sk, hash, NULL, 0);
...
> +				sk2 = reuseport_select_sock(sk, hash, skb,
> +							sizeof(struct udphdr));

these are the cases that comment is trying to explain?
Meaning the bpf program needs to understand well enough when udp stack
is calling it ?

Will do more careful review of bpf bits once I'm back from PTO.

  reply	other threads:[~2015-12-24 16:36 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-22 20:05 [PATCH net-next 0/4] Faster SO_REUSEPORT Craig Gallek
2015-12-22 20:05 ` [PATCH net-next 1/4] soreuseport: define reuseport groups Craig Gallek
2015-12-22 21:40   ` David Miller
2015-12-22 21:58     ` Craig Gallek
2015-12-22 22:03       ` David Miller
2015-12-22 22:11   ` kbuild test robot
2015-12-22 22:39     ` Craig Gallek
2015-12-22 20:05 ` [PATCH net-next 2/4] soreuseport: fast reuseport UDP socket selection Craig Gallek
2015-12-22 20:05 ` [PATCH net-next 3/4] soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF Craig Gallek
2015-12-24 16:36   ` Alexei Starovoitov [this message]
2015-12-22 20:05 ` [PATCH net-next 4/4] soreuseport: BPF selection functional test Craig Gallek
2015-12-26 19:05 [PATCH net-next 3/4] soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF Craig Gallek
2015-12-29 17:16 ` Craig Gallek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151224163639.GA90724@ast-mbp.thefacebook.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=davem@davemloft.net \
    --cc=kraigatgoog@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).