[bpf-next PATCH 00/16] bpf,sockmap: sendmsg/sendfile ULP

From: John Fastabend @ 2018-03-05 19:50 UTC
  To: ast, daniel; +Cc: netdev, davejwatson

This series adds a BPF hook for sendmsg and sendfile by using
the ULP infrastructure and sockmap. A simple pseudocode example
would be:

  // load the programs
  bpf_prog_load(SOCKMAP_TCP_MSG_PROG, BPF_PROG_TYPE_SK_MSG,
                &obj, &msg_prog);

  // lookup the sockmap
  bpf_map_msg = bpf_object__find_map_by_name(obj, "my_sock_map");

  // get fd for sockmap
  map_fd_msg = bpf_map__fd(bpf_map_msg);

  // attach program to sockmap
  bpf_prog_attach(msg_prog, map_fd_msg, BPF_SK_MSG_VERDICT, 0);

  // Add a socket 'fd' to sockmap at location 'i'
  bpf_map_update_elem(map_fd_msg, &i, &fd, BPF_ANY);

After the above snippet, any socket added to the map runs
msg_prog on sendmsg and sendfile system calls.

Two additional helpers are added: bpf_msg_apply_bytes() and
bpf_msg_cork_bytes(). With bpf_msg_apply_bytes(), a BPF program
can tell the infrastructure how many bytes the given verdict
should apply to. There are two cases. In the first, the BPF program
applies the verdict to fewer bytes than are in the current
sendmsg/sendfile call. The infrastructure applies the verdict to
the first N bytes of the message, then runs the BPF program again
with the data pointers recalculated to start at byte N+1. In the
second, the BPF program applies a verdict to more bytes than are in
the current sendmsg or sendfile system call. The infrastructure
then caches the verdict and applies it to future sendmsg/sendfile
calls until the byte limit is reached. This avoids the overhead of
running BPF programs on large payloads.
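
The two apply_bytes cases can be illustrated with a small userspace
model. This is only a sketch of the semantics described above, not
code from the series: struct apply_state and model_apply_sendmsg()
are hypothetical names, and the model assumes the program requests
the same byte count on every run and that a request of 0 means "the
rest of this call".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model state; this is not a kernel structure. */
struct apply_state {
	size_t remaining;  /* bytes still covered by the cached verdict */
	unsigned int runs; /* number of times the "BPF program" ran */
};

/*
 * Model one sendmsg/sendfile call of 'len' bytes. Each time the
 * program runs it asks for its verdict to cover 'apply' bytes.
 * Assumption of this model: apply == 0 covers the rest of the call.
 */
static void model_apply_sendmsg(struct apply_state *st, size_t len,
				size_t apply)
{
	size_t off = 0;

	assert(st != NULL);

	while (off < len) {
		size_t chunk;

		if (st->remaining == 0) {
			/* No cached verdict left: run the program again. */
			st->runs++;
			st->remaining = apply ? apply : len - off;
		}

		/* Apply the verdict to at most 'remaining' bytes. */
		chunk = len - off;
		if (chunk > st->remaining)
			chunk = st->remaining;

		st->remaining -= chunk;
		off += chunk;
	}
}
```

In this model, a single 100-byte call with apply = 40 runs the
program three times (covering 40, 40 and 20 bytes) and leaves 20
bytes of cached verdict. Three 100-byte calls with apply = 1000 run
the program once and leave 700 bytes of cached verdict.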

The helper bpf_msg_cork_bytes() handles a different case, where
a BPF program cannot reach a verdict on a msg until it receives
more bytes AND the program doesn't want to forward the packet
until it is known to be "good". An example is a user (albeit
probably a dumb one) sending messages in 1B system calls. The BPF
program can call bpf_msg_cork_bytes() with the byte count N it
needs to reach a verdict, and the program will then only be called
again once N bytes have been received.
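
The corking behaviour can be sketched the same way in userspace.
Again this is an illustrative model under stated assumptions, not
the kernel implementation: struct cork_state and
model_corked_send() are hypothetical names, and the model assumes
the program needs 'want' bytes in total and releases everything
queued once it reaches a verdict.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model state; this is not a kernel structure. */
struct cork_state {
	size_t queued;     /* bytes held back so far */
	size_t cork;       /* byte count requested via cork; 0 = not corked */
	unsigned int runs; /* number of times the "BPF program" ran */
};

/*
 * Model one send of 'len' bytes when the program needs 'want' bytes
 * to reach a verdict. Returns the number of bytes released.
 */
static size_t model_corked_send(struct cork_state *st, size_t len,
				size_t want)
{
	size_t released;

	assert(st != NULL);

	st->queued += len;

	/* Still corked: hold the data, do not run the program. */
	if (st->cork && st->queued < st->cork)
		return 0;

	st->runs++;

	/* Program sees too little data and asks to be corked. */
	if (st->queued < want) {
		st->cork = want;
		return 0;
	}

	/* Verdict reached: uncork and release everything queued. */
	st->cork = 0;
	released = st->queued;
	st->queued = 0;
	return released;
}
```

With want = 4 and four 1-byte sends, the model runs the program on
the first send (which corks) and again on the fourth, when all 4
bytes are released together. The two sends in between do not run
the program at all.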

For more examples please review the sample program. There are
examples for all the actions and helpers there.

Patches 1-7 implement the above sockmap/BPF infrastructure. The
remaining patches flesh out some minimal selftests and the sample
sockmap program. The sockmap sample program is the main vehicle
for testing this infrastructure and will be moved into selftests
shortly. The final patch in this series is a simple shell script
to run a set of tests. These are the tests I run after any changes
to sockmap. The next task on the list after this series is to
push those into selftests so we can avoid manually testing.

A couple of notes on future items in the pipeline:

  0. move sample sockmap programs into selftests (noted above)
  1. add additional support for TCP flags; most are ignored now.
  2. add a Documentation/bpf/sockmap file for details
  3. support stacked ULP types to allow this and ktls to cooperate
  4. add ingress flag support; redirect only supports egress here,
     while the other redirect helpers support both ingress and
     egress flags.

Thanks,
John

Notes: I could have squashed the test patches down into a single
patch, but I left them as is. This makes the patch count a bit large
but makes the sample sockmap updates a bit more incremental. Also,
the majority of the patches are testing patches, so I think 16
patches is reasonable.

---

John Fastabend (16):
      sock: make static tls function alloc_sg generic sock helper
      sockmap: convert refcnt to an atomic refcnt
      net: do_tcp_sendpages flag to avoid SKBTX_SHARED_FRAG
      net: generalize sk_alloc_sg to work with scatterlist rings
      bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
      bpf: sockmap, add bpf_msg_apply_bytes() helper
      bpf: sockmap, add msg_cork_bytes() helper
      bpf: add map tests for BPF_PROG_TYPE_SK_MSG
      bpf: add verifier tests for BPF_PROG_TYPE_SK_MSG
      bpf: sockmap sample, add option to attach SK_MSG program
      bpf: sockmap sample, add sendfile test
      bpf: sockmap sample, add data verification option
      bpf: sockmap, add sample option to test apply_bytes helper
      bpf: sockmap sample support for bpf_msg_cork_bytes()
      sockmap: add SK_DROP tests
      bpf: sockmap test script

 include/linux/bpf.h                                |    1 
 include/linux/bpf_types.h                          |    1 
 include/linux/filter.h                             |   17 
 include/linux/socket.h                             |    1 
 include/net/sock.h                                 |    4 
 include/uapi/linux/bpf.h                           |   30 +
 include/uapi/linux/bpf_common.h                    |    7 
 kernel/bpf/sockmap.c                               |  927 +++++++++++++++++++-
 kernel/bpf/syscall.c                               |   14 
 kernel/bpf/verifier.c                              |    5 
 net/core/filter.c                                  |  138 +++
 net/core/sock.c                                    |   61 +
 net/ipv4/tcp.c                                     |    4 
 net/tls/tls_sw.c                                   |   69 -
 samples/bpf/bpf_load.c                             |    8 
 samples/sockmap/sockmap_kern.c                     |  146 +++
 samples/sockmap/sockmap_test.sh                    |  387 ++++++++
 samples/sockmap/sockmap_user.c                     |  269 +++++-
 tools/include/uapi/linux/bpf.h                     |   30 +
 tools/lib/bpf/libbpf.c                             |    1 
 tools/testing/selftests/bpf/Makefile               |    2 
 tools/testing/selftests/bpf/bpf_helpers.h          |    8 
 tools/testing/selftests/bpf/sockmap_parse_prog.c   |   15 
 tools/testing/selftests/bpf/sockmap_verdict_prog.c |    7 
 tools/testing/selftests/bpf/test_maps.c            |   55 +
 tools/testing/selftests/bpf/test_verifier.c        |   54 +
 26 files changed, 2125 insertions(+), 136 deletions(-)
 create mode 100755 samples/sockmap/sockmap_test.sh

