From: Cong Wang <xiyou.wangcong@gmail.com>
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
	wangdongdong.6@bytedance.com, jiang.wang@bytedance.com,
	Cong Wang <cong.wang@bytedance.com>
Subject: [Patch bpf-next 00/19] sock_map: add non-TCP and cross-protocol support
Date: Tue,  2 Feb 2021 20:16:17 -0800
Message-ID: <20210203041636.38555-1-xiyou.wangcong@gmail.com>

From: Cong Wang <cong.wang@bytedance.com>

Currently sockmap fully supports only TCP; UDP is partially supported,
in that UDP sockets can only be added to a sockmap. This patchset
extends sockmap with: 1) full UDP support; 2) full AF_UNIX dgram
support; 3) cross-protocol support. Our goal is to allow socket
splicing between AF_UNIX dgram and UDP.

At a high level, each protocol must implement ->sendmsg_locked() and
->read_sock() to support sockmap redirection. To update the sock
proto, a new op, ->update_proto(), is introduced, which each protocol
must also implement. This is slightly harder for AF_UNIX, as it does
not have a full struct proto implementation and redirection.
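
As a rough illustration of the requirement above, here is a minimal
userspace sketch, not kernel code. Everything prefixed with toy_ is
hypothetical and invented for this example; only the three hook names
come from the text. It models a protocol as a table of function
pointers and treats a socket as eligible for redirection only when all
three hooks are present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_sock;

/* Hypothetical stand-in for the per-protocol hooks named in the text. */
struct toy_proto_ops {
	int (*sendmsg_locked)(struct toy_sock *sk, const void *buf, size_t len);
	int (*read_sock)(struct toy_sock *sk, void *buf, size_t len);
	int (*update_proto)(struct toy_sock *sk, bool restore);
};

struct toy_sock {
	const char *proto_name;
	const struct toy_proto_ops *ops;
};

/* A socket can take part in redirection only if every hook exists. */
bool toy_can_redirect(const struct toy_sock *sk)
{
	const struct toy_proto_ops *ops = sk ? sk->ops : NULL;

	return ops && ops->sendmsg_locked && ops->read_sock && ops->update_proto;
}

/* Placeholder hooks: pretend to send or receive exactly len bytes. */
int toy_nop_send(struct toy_sock *sk, const void *buf, size_t len)
{
	assert(sk != NULL);
	(void)buf;
	return (int)len;
}

int toy_nop_read(struct toy_sock *sk, void *buf, size_t len)
{
	assert(sk != NULL);
	memset(buf, 0, len);
	return (int)len;
}

int toy_nop_update(struct toy_sock *sk, bool restore)
{
	assert(sk != NULL);
	(void)restore;
	return 0;
}

/* Read from one socket and send to another, whatever their protocols. */
int toy_redirect(struct toy_sock *from, struct toy_sock *to,
		 void *buf, size_t len)
{
	int n;

	if (!toy_can_redirect(from) || !toy_can_redirect(to))
		return -1;
	n = from->ops->read_sock(from, buf, len);
	return n < 0 ? n : to->ops->sendmsg_locked(to, buf, (size_t)n);
}

const struct toy_proto_ops toy_full_ops = {
	.sendmsg_locked = toy_nop_send,
	.read_sock = toy_nop_read,
	.update_proto = toy_nop_update,
};

/* Models a protocol that implements only part of the interface. */
const struct toy_proto_ops toy_partial_ops = {
	.sendmsg_locked = toy_nop_send,
};
```

A socket using toy_partial_ops is rejected by toy_redirect(), which is
the situation a protocol is in before it implements all three hooks.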

To support cross-protocol redirection, we have to make the skb
independent of protocols, which is extremely hard given how creatively
UDP uses dev_scratch. Fortunately, we can pass an skmsg instead of an
skb when redirecting to ingress; the only thing that needs to be added
is a new ->recvmsg() to retrieve the skmsg. On the egress side, a new
skb is allocated behind skb_send_sock_locked(), so it comes for free.
Another big barrier is the skb CB, which was hard-coded as
TCP_SKB_CB(); I switched it to an skb ext to solve this problem.
Please see patch 3 for more details.
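
To make the CB problem concrete, here is a small userspace model, not
the kernel implementation. All toy_ names and struct layouts are
hypothetical and chosen only for illustration. It assumes a fixed-size
scratch area that each protocol interprets in its own way, and compares
storing redirect metadata in that shared area with storing it in a
separate side allocation, which is the general idea behind moving to an
skb extension.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_CB_SIZE 48

/* Redirect metadata that a verdict step wants to attach to a packet. */
struct toy_redir {
	int target_id;
	unsigned int flags;
};

/* Hypothetical protocol-private view of the same scratch bytes. */
struct toy_udp_cb {
	unsigned int len;
};

struct toy_skb {
	unsigned char cb[TOY_CB_SIZE];	/* shared per-layer scratch area */
	struct toy_redir *ext;		/* optional side allocation */
};

void toy_udp_cb_set(struct toy_skb *skb, unsigned int len)
{
	struct toy_udp_cb cb = { .len = len };

	memcpy(skb->cb, &cb, sizeof(cb));
}

unsigned int toy_udp_cb_len(const struct toy_skb *skb)
{
	struct toy_udp_cb cb;

	memcpy(&cb, skb->cb, sizeof(cb));
	return cb.len;
}

/* Overlay style: metadata lives in the scratch area and overwrites
 * whatever the owning protocol kept there. */
void toy_redir_set_cb(struct toy_skb *skb, int target_id)
{
	struct toy_redir r = { .target_id = target_id, .flags = 0 };

	assert(sizeof(r) <= sizeof(skb->cb));
	memcpy(skb->cb, &r, sizeof(r));
}

/* Extension style: metadata lives beside the packet, so the scratch
 * area stays under the protocol's control. */
int toy_redir_set_ext(struct toy_skb *skb, int target_id)
{
	if (!skb->ext) {
		skb->ext = calloc(1, sizeof(*skb->ext));
		if (!skb->ext)
			return -1;
	}
	skb->ext->target_id = target_id;
	return 0;
}

void toy_skb_release(struct toy_skb *skb)
{
	free(skb->ext);
	skb->ext = NULL;
}
```

In this model, calling toy_redir_set_ext() leaves the value written by
toy_udp_cb_set() intact, while toy_redir_set_cb() overwrites it.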

This patchset passes all tests, both the existing ones and the new
ones added in this patchset.

---

Cong Wang (19):
  bpf: rename BPF_STREAM_PARSER to BPF_SOCK_MAP
  skmsg: get rid of struct sk_psock_parser
  skmsg: use skb ext instead of TCP_SKB_CB
  sock_map: rename skb_parser and skb_verdict
  sock_map: introduce BPF_SK_SKB_VERDICT
  sock: introduce sk_prot->update_proto()
  udp: implement ->sendmsg_locked()
  udp: implement ->read_sock() for sockmap
  udp: add ->read_sock() and ->sendmsg_locked() to ipv6
  af_unix: implement ->sendmsg_locked for dgram socket
  af_unix: implement ->read_sock() for sockmap
  af_unix: implement ->update_proto()
  af_unix: set TCP_ESTABLISHED for datagram sockets too
  skmsg: extract __tcp_bpf_recvmsg() and tcp_bpf_wait_data()
  udp: implement udp_bpf_recvmsg() for sockmap
  af_unix: implement unix_dgram_bpf_recvmsg()
  sock_map: update sock type checks
  selftests/bpf: add test cases for unix and udp sockmap
  selftests/bpf: add test case for redirection between udp and unix

 MAINTAINERS                                   |   1 +
 include/linux/bpf.h                           |   4 +-
 include/linux/bpf_types.h                     |   2 +-
 include/linux/skbuff.h                        |   4 +
 include/linux/skmsg.h                         |  90 +++-
 include/net/af_unix.h                         |  13 +
 include/net/ipv6.h                            |   1 +
 include/net/sock.h                            |   3 +
 include/net/tcp.h                             |  33 +-
 include/net/udp.h                             |   9 +-
 include/uapi/linux/bpf.h                      |   1 +
 kernel/bpf/syscall.c                          |   1 +
 net/Kconfig                                   |  14 +-
 net/core/Makefile                             |   2 +-
 net/core/filter.c                             |   3 +-
 net/core/skbuff.c                             |   7 +
 net/core/skmsg.c                              | 223 +++++---
 net/core/sock_map.c                           | 128 ++---
 net/ipv4/Makefile                             |   2 +-
 net/ipv4/af_inet.c                            |   2 +
 net/ipv4/tcp_bpf.c                            | 130 +----
 net/ipv4/tcp_ipv4.c                           |   3 +
 net/ipv4/udp.c                                |  68 ++-
 net/ipv4/udp_bpf.c                            |  78 ++-
 net/ipv6/af_inet6.c                           |   2 +
 net/ipv6/tcp_ipv6.c                           |   3 +
 net/ipv6/udp.c                                |  30 +-
 net/tls/tls_sw.c                              |   4 +-
 net/unix/Makefile                             |   1 +
 net/unix/af_unix.c                            | 105 +++-
 net/unix/unix_bpf.c                           |  99 ++++
 tools/bpf/bpftool/common.c                    |   1 +
 tools/bpf/bpftool/prog.c                      |   1 +
 tools/include/uapi/linux/bpf.h                |   1 +
 .../selftests/bpf/prog_tests/sockmap_listen.c | 475 +++++++++++++++++-
 .../selftests/bpf/progs/test_sockmap_listen.c |  24 +-
 36 files changed, 1233 insertions(+), 335 deletions(-)
 create mode 100644 net/unix/unix_bpf.c

-- 
2.25.1



Thread overview: 40+ messages
2021-02-03  4:16 Cong Wang [this message]
2021-02-03  4:16 ` [Patch bpf-next 01/19] bpf: rename BPF_STREAM_PARSER to BPF_SOCK_MAP Cong Wang
2021-02-05 10:32   ` Jakub Sitnicki
2021-02-09  1:40     ` Cong Wang
2021-02-08  8:21   ` John Fastabend
2021-02-08  9:50     ` Lorenz Bauer
2021-02-09  1:45     ` Cong Wang
2021-02-09  6:48       ` John Fastabend
2021-02-03  4:16 ` [Patch bpf-next 02/19] skmsg: get rid of struct sk_psock_parser Cong Wang
2021-02-05 11:25   ` Jakub Sitnicki
2021-02-08  8:39     ` John Fastabend
2021-02-09  0:19       ` Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 03/19] skmsg: use skb ext instead of TCP_SKB_CB Cong Wang
2021-02-05 22:09   ` Jakub Sitnicki
2021-02-08 18:56     ` Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 04/19] sock_map: rename skb_parser and skb_verdict Cong Wang
2021-02-08  8:27   ` John Fastabend
2021-02-03  4:16 ` [Patch bpf-next 05/19] sock_map: introduce BPF_SK_SKB_VERDICT Cong Wang
2021-02-08  8:31   ` John Fastabend
2021-02-03  4:16 ` [Patch bpf-next 06/19] sock: introduce sk_prot->update_proto() Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 07/19] udp: implement ->sendmsg_locked() Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 08/19] udp: implement ->read_sock() for sockmap Cong Wang
2021-02-08  9:48   ` Lorenz Bauer
2021-02-09  1:35     ` Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 09/19] udp: add ->read_sock() and ->sendmsg_locked() to ipv6 Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 10/19] af_unix: implement ->sendmsg_locked for dgram socket Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 11/19] af_unix: implement ->read_sock() for sockmap Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 12/19] af_unix: implement ->update_proto() Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 13/19] af_unix: set TCP_ESTABLISHED for datagram sockets too Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 14/19] skmsg: extract __tcp_bpf_recvmsg() and tcp_bpf_wait_data() Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 15/19] udp: implement udp_bpf_recvmsg() for sockmap Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 16/19] af_unix: implement unix_dgram_bpf_recvmsg() Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 17/19] sock_map: update sock type checks Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 18/19] selftests/bpf: add test cases for unix and udp sockmap Cong Wang
2021-02-05 10:53   ` Jakub Sitnicki
2021-02-08 18:43     ` Cong Wang
2021-02-03  4:16 ` [Patch bpf-next 19/19] selftests/bpf: add test case for redirection between udp and unix Cong Wang
2021-02-03 17:48 ` [Patch bpf-next 00/19] sock_map: add non-TCP and cross-protocol support Alexei Starovoitov
2021-02-03 19:22   ` Cong Wang
2021-02-03 20:29     ` John Fastabend
