From: Martin KaFai Lau <kafai@fb.com>
To: Joe Stringer <joe@wand.net.nz>
Cc: <bpf@vger.kernel.org>, <netdev@vger.kernel.org>,
<daniel@iogearbox.net>, <ast@kernel.org>,
<eric.dumazet@gmail.com>, <lmb@cloudflare.com>
Subject: Re: [PATCHv3 bpf-next 0/5] Add bpf_sk_assign eBPF helper
Date: Fri, 27 Mar 2020 11:46:21 -0700
Message-ID: <20200327184621.67324727o5rtu42p@kafai-mbp>
In-Reply-To: <20200327042556.11560-1-joe@wand.net.nz>
On Thu, Mar 26, 2020 at 09:25:51PM -0700, Joe Stringer wrote:
> Introduce a new helper that allows assigning a previously-found socket
> to the skb as the packet is received towards the stack, to cause the
> stack to guide the packet towards that socket subject to local routing
> configuration. The intention is to support TProxy use cases more
> directly from eBPF programs attached at TC ingress, to simplify and
> streamline Linux stack configuration in scale environments with Cilium.
>
> Normally in ip{,6}_rcv_core(), the skb will be orphaned, dropping any
> existing socket reference associated with the skb. Existing tproxy
> implementations in netfilter get around this restriction by running the
> tproxy logic after ip_rcv_core() in the PREROUTING table. However, this
> is not an option for TC-based logic (including eBPF programs attached at
> TC ingress).
>
> This series introduces the BPF helper bpf_sk_assign() to associate the
> socket with the skb on the ingress path as the packet is passed up the
> stack. The initial patch in the series simply takes a reference on the
> socket to ensure safety, but later patches relax this for listen
> sockets.
>
> To ensure delivery to the relevant socket, we still consult the routing
> table. For full examples of how to configure it, see the tests in patch #5;
> the simplest form of the route would look like this:
>
> $ ip route add local default dev lo
>
> This series is laid out as follows:
> * Patch 1 extends the eBPF API to add sk_assign() and defines a new
> socket free function to allow the later paths to understand when the
> socket associated with the skb should be kept through receive.
> * Patches 2-3 optimize the receive path to avoid taking a reference on
> listener sockets during receive.
> * Patches 4-5 extend the selftests with examples of the new
> functionality and validation of correct behaviour.
>
> Changes since v2:
> * Add selftests for UDP socket redirection
> * Drop the early demux optimization patch (defer for more testing)
> * Fix check for orphaning after TC act return
> * Tidy up the tests to clean up properly and be less noisy.
>
> Changes since v1:
> * Replace the metadata_dst approach with using the skb->destructor to
> determine whether the socket has been prefetched. This is much
> simpler.
> * Avoid taking a reference on listener sockets during receive
> * Restrict assigning sockets across namespaces
> * Restrict assigning SO_REUSEPORT sockets
> * Fix cookie usage for socket dst check
> * Rebase the tests against test_progs infrastructure
> * Tidy up commit messages
lgtm.
Acked-by: Martin KaFai Lau <kafai@fb.com>