All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples
@ 2014-06-28  0:05 Alexei Starovoitov
  2014-06-28  0:05 ` [PATCH RFC net-next 01/14] net: filter: split filter.c into two files Alexei Starovoitov
                   ` (14 more replies)
  0 siblings, 15 replies; 108+ messages in thread
From: Alexei Starovoitov @ 2014-06-28  0:05 UTC (permalink / raw)
  To: David S. Miller
  Cc: Ingo Molnar, Linus Torvalds, Steven Rostedt, Daniel Borkmann,
	Chema Gonzalez, Eric Dumazet, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Jiri Olsa, Thomas Gleixner,
	H. Peter Anvin, Andrew Morton, Kees Cook, linux-api, netdev,
	linux-kernel

Hi All,

this patch set demonstrates the potential of eBPF.

First patch "net: filter: split filter.c into two files" splits eBPF interpreter
out of networking into kernel/bpf/. The goal for BPF subsystem is to be usable
in NET-less configuration. Though the whole set is marked is RFC, the 1st patch
is good to go. Similar version of the patch that was posted few weeks ago, but
was deferred. I'm assuming due to lack of forward visibility. I hope that this
patch set shows what eBPF is capable of and where it's heading.

Other patches expose eBPF instruction set to user space and introduce concepts
of maps and programs accessible via syscall.

'maps' is a generic storage of different types for sharing data between kernel
and userspace. Maps are referrenced by global id. Root can create multiple
maps of different types where key/value are opaque bytes of data. It's up to
user space and eBPF program to decide what they store in the maps.

eBPF programs are similar to kernel modules. They live in global space and
have unique prog_id. Each program is a safe run-to-completion set of
instructions. eBPF verifier statically determines that the program terminates
and safe to execute. During verification the program takes a hold of maps
that it intends to use, so selected maps cannot be removed until program is
unloaded. The program can be attached to different events. These events can
be packets, tracepoint events and other types in the future. New event triggers
execution of the program which may store information about the event in the maps.
Beyond storing data the programs may call into in-kernel helper functions
which may, for example, dump stack, do trace_printk or other forms of live
kernel debugging. Same program can be attached to multiple events. Different
programs can access the same map:

  tracepoint  tracepoint  tracepoint    sk_buff    sk_buff
   event A     event B     event C      on eth0    on eth1
    |             |          |            |          |
    |             |          |            |          |
    --> tracing <--      tracing       socket      socket
         prog_1           prog_2       prog_3      prog_4
         |  |               |            |
      |---  -----|  |-------|           map_3
    map_1       map_2

User space (via syscall) and eBPF programs access maps concurrently.

Last two patches are sample code. 1st demonstrates stateful packet inspection.
It counts tcp and udp packets on eth0. Should be easy to see how this eBPF
framework can be used for network analytics.
2nd sample does simple 'drop monitor'. It attaches to kfree_skb tracepoint
event and counts number of packet drops at particular $pc location.
User space periodically summarizes what eBPF programs recorded.
In these two samples the eBPF programs are tiny and written in 'assembler'
with macroses. More complex programs can be written C (llvm backend is not
part of this diff to reduce 'huge' perception).
Since eBPF is fully JITed on x64, the cost of running eBPF program is very
small even for high frequency events. Here are the numbers comparing
flow_dissector in C vs eBPF:
  x86_64 skb_flow_dissect() same skb (all cached)         -  42 nsec per call
  x86_64 skb_flow_dissect() different skbs (cache misses) - 141 nsec per call
eBPF+jit skb_flow_dissect() same skb (all cached)         -  51 nsec per call
eBPF+jit skb_flow_dissect() different skbs (cache misses) - 135 nsec per call

Detailed explanation on eBPF verifier and safety is in patch 08/14

Thanks
Alexei

minor todo: rename 'struct sock_filter_int' into 'struct bpf_insn'. It's not
part of this diff to reduce size.
  
------

The following changes since commit c1c27fb9b3040a2559d4d3e1183afa8c106bc94a:

  Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next (2014-06-27 12:59:38 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ast/bpf master

for you to fetch changes up to 4c8da0f21220087e38894c69339cddc64c1220f9:

  samples: bpf: example of tracing filters with eBPF (2014-06-27 15:22:07 -0700)

----------------------------------------------------------------
Alexei Starovoitov (14):
      net: filter: split filter.c into two files
      net: filter: split filter.h and expose eBPF to user space
      bpf: introduce syscall(BPF, ...) and BPF maps
      bpf: update MAINTAINERS entry
      bpf: add lookup/update/delete/iterate methods to BPF maps
      bpf: add hashtable type of BPF maps
      bpf: expand BPF syscall with program load/unload
      bpf: add eBPF verifier
      bpf: allow eBPF programs to use maps
      net: sock: allow eBPF programs to be attached to sockets
      tracing: allow eBPF programs to be attached to events
      samples: bpf: add mini eBPF library to manipulate maps and programs
      samples: bpf: example of stateful socket filtering
      samples: bpf: example of tracing filters with eBPF

 Documentation/networking/filter.txt    |  302 +++++++
 MAINTAINERS                            |    9 +
 arch/alpha/include/uapi/asm/socket.h   |    2 +
 arch/avr32/include/uapi/asm/socket.h   |    2 +
 arch/cris/include/uapi/asm/socket.h    |    2 +
 arch/frv/include/uapi/asm/socket.h     |    2 +
 arch/ia64/include/uapi/asm/socket.h    |    2 +
 arch/m32r/include/uapi/asm/socket.h    |    2 +
 arch/mips/include/uapi/asm/socket.h    |    2 +
 arch/mn10300/include/uapi/asm/socket.h |    2 +
 arch/parisc/include/uapi/asm/socket.h  |    2 +
 arch/powerpc/include/uapi/asm/socket.h |    2 +
 arch/s390/include/uapi/asm/socket.h    |    2 +
 arch/sparc/include/uapi/asm/socket.h   |    2 +
 arch/x86/syscalls/syscall_64.tbl       |    1 +
 arch/xtensa/include/uapi/asm/socket.h  |    2 +
 include/linux/bpf.h                    |  135 +++
 include/linux/filter.h                 |  304 +------
 include/linux/ftrace_event.h           |    5 +
 include/linux/syscalls.h               |    2 +
 include/trace/bpf_trace.h              |   29 +
 include/trace/ftrace.h                 |   10 +
 include/uapi/asm-generic/socket.h      |    2 +
 include/uapi/asm-generic/unistd.h      |    4 +-
 include/uapi/linux/Kbuild              |    1 +
 include/uapi/linux/bpf.h               |  403 +++++++++
 kernel/Makefile                        |    1 +
 kernel/bpf/Makefile                    |    1 +
 kernel/bpf/core.c                      |  548 ++++++++++++
 kernel/bpf/hashtab.c                   |  371 +++++++++
 kernel/bpf/syscall.c                   |  778 +++++++++++++++++
 kernel/bpf/verifier.c                  | 1431 ++++++++++++++++++++++++++++++++
 kernel/sys_ni.c                        |    3 +
 kernel/trace/Kconfig                   |    1 +
 kernel/trace/Makefile                  |    1 +
 kernel/trace/bpf_trace.c               |  217 +++++
 kernel/trace/trace.h                   |    3 +
 kernel/trace/trace_events.c            |    7 +
 kernel/trace/trace_events_filter.c     |   72 +-
 net/core/filter.c                      |  646 +++-----------
 net/core/sock.c                        |   13 +
 samples/bpf/.gitignore                 |    1 +
 samples/bpf/Makefile                   |   15 +
 samples/bpf/dropmon.c                  |  127 +++
 samples/bpf/libbpf.c                   |  114 +++
 samples/bpf/libbpf.h                   |   18 +
 samples/bpf/sock_example.c             |  160 ++++
 47 files changed, 4941 insertions(+), 820 deletions(-)
 create mode 100644 include/linux/bpf.h
 create mode 100644 include/trace/bpf_trace.h
 create mode 100644 include/uapi/linux/bpf.h
 create mode 100644 kernel/bpf/Makefile
 create mode 100644 kernel/bpf/core.c
 create mode 100644 kernel/bpf/hashtab.c
 create mode 100644 kernel/bpf/syscall.c
 create mode 100644 kernel/bpf/verifier.c
 create mode 100644 kernel/trace/bpf_trace.c
 create mode 100644 samples/bpf/.gitignore
 create mode 100644 samples/bpf/Makefile
 create mode 100644 samples/bpf/dropmon.c
 create mode 100644 samples/bpf/libbpf.c
 create mode 100644 samples/bpf/libbpf.h
 create mode 100644 samples/bpf/sock_example.c

-- 
1.7.9.5


^ permalink raw reply	[flat|nested] 108+ messages in thread

end of thread, other threads:[~2014-07-05 21:59 UTC | newest]

Thread overview: 108+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-28  0:05 [PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples Alexei Starovoitov
2014-06-28  0:05 ` [PATCH RFC net-next 01/14] net: filter: split filter.c into two files Alexei Starovoitov
2014-07-02  4:23   ` Namhyung Kim
2014-07-02  4:23     ` Namhyung Kim
2014-07-02  5:35     ` Alexei Starovoitov
2014-07-02  5:35       ` Alexei Starovoitov
2014-06-28  0:05 ` [PATCH RFC net-next 02/14] net: filter: split filter.h and expose eBPF to user space Alexei Starovoitov
2014-06-28  0:05 ` [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps Alexei Starovoitov
2014-06-28  0:16   ` Andy Lutomirski
2014-06-28  5:55     ` Alexei Starovoitov
2014-06-28  6:25       ` Andy Lutomirski
2014-06-28  6:25         ` Andy Lutomirski
2014-06-28  6:43         ` Alexei Starovoitov
2014-06-28  6:43           ` Alexei Starovoitov
2014-06-28 15:34           ` Andy Lutomirski
2014-06-28 15:34             ` Andy Lutomirski
2014-06-28 20:49             ` Alexei Starovoitov
2014-06-28 20:49               ` Alexei Starovoitov
2014-06-29  1:52               ` Andy Lutomirski
2014-06-29  1:52                 ` Andy Lutomirski
2014-06-29  6:36                 ` Alexei Starovoitov
2014-06-29  6:36                   ` Alexei Starovoitov
2014-06-30 22:09                   ` Andy Lutomirski
2014-06-30 22:09                     ` Andy Lutomirski
2014-07-01  5:47                     ` Alexei Starovoitov
2014-07-01 15:11                       ` Andy Lutomirski
2014-07-01 15:11                         ` Andy Lutomirski
2014-07-02  5:33                         ` Alexei Starovoitov
2014-07-02  5:33                           ` Alexei Starovoitov
2014-07-03  1:43                           ` Andy Lutomirski
2014-07-03  1:43                             ` Andy Lutomirski
2014-07-03  2:29                             ` Alexei Starovoitov
2014-07-04 15:17                               ` Andy Lutomirski
2014-07-04 15:17                                 ` Andy Lutomirski
2014-07-05 21:59                                 ` Alexei Starovoitov
2014-06-28  0:05 ` [PATCH RFC net-next 04/14] bpf: update MAINTAINERS entry Alexei Starovoitov
2014-06-28  0:18   ` Joe Perches
2014-06-28  5:59     ` Alexei Starovoitov
2014-06-28  5:59       ` Alexei Starovoitov
2014-06-28  0:05 ` [PATCH RFC net-next 05/14] bpf: add lookup/update/delete/iterate methods to BPF maps Alexei Starovoitov
2014-06-28  0:05 ` [PATCH RFC net-next 06/14] bpf: add hashtable type of " Alexei Starovoitov
2014-06-28  0:05 ` [PATCH RFC net-next 07/14] bpf: expand BPF syscall with program load/unload Alexei Starovoitov
2014-06-28  0:19   ` Andy Lutomirski
2014-06-28  0:19     ` Andy Lutomirski
2014-06-28  6:12     ` Alexei Starovoitov
2014-06-28  6:28       ` Andy Lutomirski
2014-06-28  7:26         ` Alexei Starovoitov
2014-06-28  7:26           ` Alexei Starovoitov
2014-06-28 15:21           ` Greg KH
2014-06-28 15:21             ` Greg KH
2014-06-28 15:35             ` Andy Lutomirski
2014-06-30 20:39               ` Alexei Starovoitov
2014-06-30 10:06       ` David Laight
2014-06-30 10:06         ` David Laight
2014-06-28  0:06 ` [PATCH RFC net-next 08/14] bpf: add eBPF verifier Alexei Starovoitov
2014-06-28 16:01   ` Andy Lutomirski
2014-06-28 16:01     ` Andy Lutomirski
2014-06-28 20:25     ` Alexei Starovoitov
2014-06-28 20:25       ` Alexei Starovoitov
2014-06-29  1:58       ` Andy Lutomirski
2014-06-29  6:20         ` Alexei Starovoitov
2014-06-29  6:20           ` Alexei Starovoitov
2014-07-01  8:05   ` Daniel Borkmann
2014-07-01  8:05     ` Daniel Borkmann
2014-07-01 20:04     ` Alexei Starovoitov
2014-07-01 20:04       ` Alexei Starovoitov
2014-07-02  8:11       ` David Laight
2014-07-02  8:11         ` David Laight
2014-07-02 22:43         ` Alexei Starovoitov
2014-07-02 22:43           ` Alexei Starovoitov
2014-07-02  5:05   ` Namhyung Kim
2014-07-02  5:05     ` Namhyung Kim
2014-07-02  5:57     ` Alexei Starovoitov
2014-07-02 22:22   ` Chema Gonzalez
2014-07-02 23:04     ` Alexei Starovoitov
2014-07-02 23:04       ` Alexei Starovoitov
2014-07-02 23:35       ` Chema Gonzalez
2014-07-03  0:01         ` Alexei Starovoitov
2014-07-03  0:01           ` Alexei Starovoitov
2014-07-03  9:13       ` David Laight
2014-07-03  9:13         ` David Laight
2014-07-03 17:41         ` Alexei Starovoitov
2014-06-28  0:06 ` [PATCH RFC net-next 09/14] bpf: allow eBPF programs to use maps Alexei Starovoitov
2014-06-28  0:06   ` Alexei Starovoitov
2014-06-28  0:06 ` [PATCH RFC net-next 10/14] net: sock: allow eBPF programs to be attached to sockets Alexei Starovoitov
2014-06-28  0:06 ` [PATCH RFC net-next 11/14] tracing: allow eBPF programs to be attached to events Alexei Starovoitov
2014-07-01  8:30   ` Daniel Borkmann
2014-07-01  8:30     ` Daniel Borkmann
2014-07-01 20:06     ` Alexei Starovoitov
2014-07-01 20:06       ` Alexei Starovoitov
2014-07-02  5:32   ` Namhyung Kim
2014-07-02  5:32     ` Namhyung Kim
2014-07-02  6:14     ` Alexei Starovoitov
2014-07-02  6:14       ` Alexei Starovoitov
2014-07-02  6:39       ` Namhyung Kim
2014-07-02  7:29         ` Alexei Starovoitov
2014-06-28  0:06 ` [PATCH RFC net-next 12/14] samples: bpf: add mini eBPF library to manipulate maps and programs Alexei Starovoitov
2014-06-28  0:06 ` [PATCH RFC net-next 13/14] samples: bpf: example of stateful socket filtering Alexei Starovoitov
2014-06-28  0:21   ` Andy Lutomirski
2014-06-28  6:21     ` Alexei Starovoitov
2014-06-28  6:21       ` Alexei Starovoitov
2014-06-28  0:06 ` [PATCH RFC net-next 14/14] samples: bpf: example of tracing filters with eBPF Alexei Starovoitov
2014-06-30 23:09 ` [PATCH RFC net-next 00/14] BPF syscall, maps, verifier, samples Kees Cook
2014-06-30 23:09   ` Kees Cook
2014-07-01  7:18   ` Daniel Borkmann
2014-07-01  7:18     ` Daniel Borkmann
2014-07-02 16:39     ` Kees Cook
2014-07-02 16:39       ` Kees Cook

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.