* [PATCH bpf-next v1 00/19] bpf: implement bpf iterator for kernel data
@ 2020-04-27 20:12 Yonghong Song
  2020-04-27 20:12 ` [PATCH bpf-next v1 01/19] net: refactor net assignment for seq_net_private structure Yonghong Song
                   ` (18 more replies)
  0 siblings, 19 replies; 85+ messages in thread
From: Yonghong Song @ 2020-04-27 20:12 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, Martin KaFai Lau, netdev
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team

Motivation:
  The current ways to dump kernel data structures are mostly:
    1. /proc system
    2. various specific tools like "ss" which require kernel support.
    3. drgn
  The drawback of the first two is that whenever you want to dump more, you
  need to change the kernel. For example, Martin wants to dump socket local
  storage with "ss". A kernel change is needed for it to work ([1]).
  This is also the direct motivation for this work.

  drgn ([2]) solves this problem nicely, and no kernel change is needed.
  But since drgn is not able to verify the validity of a particular pointer value,
  it might present wrong results in rare cases.
  
  In this patch set, we introduce the bpf iterator. Initial kernel changes are
  still needed for the kernel data of interest, but a later data structure
  change will not require kernel changes any more. The bpf program itself can
  adapt to new data structure changes. This gives a certain flexibility with
  guaranteed correctness.
  
  In this patch set, kernel seq_ops is used to facilitate iterating through
  kernel data, similar to current /proc and many other lossless kernel
  dumping facilities. In the future, different iterators can be
  implemented to trade off losslessness for other criteria e.g. no
  repeated object visits, etc.
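
  As a rough illustration of the start/next/show/stop pattern that seq_ops
  follows, here is a minimal userspace mock. It is only a sketch: struct
  mock_seq_ops, struct int_table and mock_seq_dump() are hypothetical names
  and are not the kernel's struct seq_operations or any code from this
  series. The show step stands in for the point where a bpf program would be
  called on each visited object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace mock of a start/next/show/stop iteration
 * protocol. This is NOT the kernel's struct seq_operations. */
struct mock_seq_ops {
    void *(*start)(void *priv, size_t *pos);
    void *(*next)(void *priv, void *v, size_t *pos);
    int   (*show)(void *priv, void *v, char *buf, size_t len);
    void  (*stop)(void *priv, void *v);
};

/* Example data set to walk: a plain array of integers. */
struct int_table {
    const int *items;
    size_t count;
};

/* Return the object at *pos, or NULL when the walk is finished. */
static void *tbl_start(void *priv, size_t *pos)
{
    struct int_table *t = priv;

    return *pos < t->count ? (void *)&t->items[*pos] : NULL;
}

/* Advance the position and return the next object, or NULL at the end. */
static void *tbl_next(void *priv, void *v, size_t *pos)
{
    (void)v;
    ++*pos;
    return tbl_start(priv, pos);
}

/* Format one object; a bpf program would run at this step. */
static int tbl_show(void *priv, void *v, char *buf, size_t len)
{
    (void)priv;
    return snprintf(buf, len, "%d\n", *(const int *)v);
}

/* Nothing to release in this mock. */
static void tbl_stop(void *priv, void *v)
{
    (void)priv;
    (void)v;
}

const struct mock_seq_ops int_table_ops = {
    tbl_start, tbl_next, tbl_show, tbl_stop
};

/* Visit every object once and append its text to out.
 * Returns the number of bytes written, not counting the final NUL. */
size_t mock_seq_dump(const struct mock_seq_ops *ops, void *priv,
                     char *out, size_t cap)
{
    size_t pos = 0, used = 0;
    void *v;

    assert(ops && out && cap > 0);
    v = ops->start(priv, &pos);
    while (v) {
        int n = ops->show(priv, v, out + used, cap - used);

        if (n < 0 || (size_t)n >= cap - used)
            break;              /* error or buffer full: stop early */
        used += (size_t)n;
        v = ops->next(priv, v, &pos);
    }
    ops->stop(priv, v);
    out[used] = '\0';           /* drop any partially written record */
    return used;
}
```

  Calling mock_seq_dump(&int_table_ops, &table, buf, sizeof(buf)) on a table
  holding 1, 2, 3 fills buf with one line per element. Every element is
  visited exactly once in this mock because the array does not change during
  the walk; the kernel case has to cope with concurrent updates.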

User Interface:
  1. Similar to prog/map/link, the iterator can be pinned into a
     path within a bpffs mount point.
  2. The bpftool command can pin an iterator to a file
         bpftool iter pin <bpf_prog.o> <path>
  3. Use `cat <path>` to dump the contents.
     Use `rm -f <path>` to remove the pinned iterator.
  4. The anonymous iterator can be created as well.
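
  Since a pinned iterator is dumped with `cat <path>`, a program can consume
  it with ordinary open()/read() calls. The sketch below assumes that reading
  the file until EOF yields the whole dump; dump_fd() and dump_path() are
  hypothetical helper names, not part of libbpf or bpftool. dump_fd() takes
  any readable file descriptor, so it could presumably also be used on a
  descriptor obtained for an anonymous iterator, but how that descriptor is
  created is not shown here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Copy everything readable from fd to out.
 * Returns the number of bytes copied, or -1 on error. */
long dump_fd(int fd, FILE *out)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    assert(out != NULL);
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        if (fwrite(buf, 1, (size_t)n, out) != (size_t)n)
            return -1;
        total += n;
    }
    return n < 0 ? -1 : total;
}

/* Open path read-only and dump it, like `cat <path>`.
 * Returns the number of bytes copied, or -1 on error. */
long dump_path(const char *path, FILE *out)
{
    long total;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    total = dump_fd(fd, out);
    close(fd);
    return total;
}
```

  For example, dump_path("/sys/fs/bpf/my_iter", stdout) would print the
  iterator output, where /sys/fs/bpf/my_iter is a made-up pin path.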

  Please see patch #17 and #18 for bpf programs and bpf iterator
  output examples.

  Note that certain iterators are namespace aware. For example,
  task and task_file targets only iterate through the current pid namespace.
  ipv6_route and netlink will iterate through the current net namespace.

  Please see individual patches for implementation details.

Performance:
  The bpf iterator provides in-kernel aggregation abilities
  for kernel data. This can greatly improve performance
  compared to e.g., iterating all process directories under /proc.
  For example, I did an experiment on my VM with an application forking
  a varying number of tasks, with each forked process opening a varying
  number of files. The following are the results, with latency in microseconds:

    # of forked tasks   # of open files    # of bpf_prog calls  latency (us)
    100                 100                11503                7586
    1000                1000               1013203              709513
    10000               100                1130203              764519

  The number of bpf_prog calls may be more than forked tasks multiplied by
  open files since there are other tasks running on the system.
  The bpf program is a do-nothing program. One million bpf calls take
  less than one second.
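
  The per-call cost implied by the table can be worked out directly. The
  sketch below divides each reported latency by the number of bpf_prog calls
  and also computes how many calls exceed tasks times files. The struct and
  function names are made up for this illustration; the numbers are copied
  from the table above. Note that the result is end-to-end latency per call,
  so it includes the iteration cost and not only the bpf program itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One row of the results table above. */
struct bench_row {
    long tasks;       /* # of forked tasks */
    long files;       /* # of open files per task */
    long calls;       /* # of bpf_prog calls */
    long latency_us;  /* total latency in microseconds */
};

const struct bench_row bench_rows[] = {
    {   100,  100,   11503,   7586 },
    {  1000, 1000, 1013203, 709513 },
    { 10000,  100, 1130203, 764519 },
};
const size_t bench_row_count = sizeof(bench_rows) / sizeof(bench_rows[0]);

/* Average end-to-end latency per bpf_prog call, in nanoseconds. */
double ns_per_call(const struct bench_row *r)
{
    assert(r->calls > 0);
    return (double)r->latency_us * 1000.0 / (double)r->calls;
}

/* Calls beyond tasks * files, attributed to other tasks on the system. */
long extra_calls(const struct bench_row *r)
{
    return r->calls - r->tasks * r->files;
}

/* Print one summary line per table row. */
void print_bench(FILE *out)
{
    size_t i;

    for (i = 0; i < bench_row_count; i++)
        fprintf(out, "%ld x %ld: %.1f ns/call, %ld extra calls\n",
                bench_rows[i].tasks, bench_rows[i].files,
                ns_per_call(&bench_rows[i]), extra_calls(&bench_rows[i]));
}
```

  All three rows come out at roughly 0.66 to 0.70 microseconds per call,
  which is consistent with one million calls taking less than one second.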

Future Work:
  Although the initial motivation is from Martin's sk_local_storage,
  this patch set does not implement tcp6 sockets and sk_local_storage.
  The /proc/net/tcp6 involves three types of sockets, timewait,
  request and tcp6 sockets. Some kind of type casting or other
  mechanism is needed to handle all these socket types in one
  bpf program. This will be addressed in future work.

  Currently, we do not support kernel data generated by kernel modules.
  This requires some BTF work.

  More work for more iterators, e.g., bpf_progs, cgroups, bpf_map elements, etc.

Changelog:
  RFC v2 ([3]) -> non-RFC v1:
    - rename bpfdump to bpf_iter
    - use bpffs instead of a new file system
    - use bpf_link to streamline and simplify iterator creation.

References:
  [1]: https://lore.kernel.org/bpf/20200225230427.1976129-1-kafai@fb.com
  [2]: https://github.com/osandov/drgn
  [3]: https://lore.kernel.org/bpf/40e427e2-5b15-e9aa-e2cb-42dc1b53d047@gmail.com/T/

Yonghong Song (19):
  net: refactor net assignment for seq_net_private structure
  bpf: implement an interface to register bpf_iter targets
  bpf: add bpf_map iterator
  bpf: allow loading of a bpf_iter program
  bpf: support bpf tracing/iter programs for BPF_LINK_CREATE
  bpf: support bpf tracing/iter programs for BPF_LINK_UPDATE
  bpf: create anonymous bpf iterator
  bpf: create file bpf iterator
  bpf: add PTR_TO_BTF_ID_OR_NULL support
  bpf: add netlink and ipv6_route targets
  bpf: add task and task/file targets
  bpf: add bpf_seq_printf and bpf_seq_write helpers
  bpf: handle spilled PTR_TO_BTF_ID properly when checking
    stack_boundary
  bpf: support variable length array in tracing programs
  tools/libbpf: add bpf_iter support
  tools/bpftool: add bpf_iter support for bptool
  tools/bpf: selftests: add iterator programs for ipv6_route and netlink
  tools/bpf: selftests: add iter progs for bpf_map/task/task_file
  tools/bpf: selftests: add bpf_iter selftests

 fs/proc/proc_net.c                            |   5 +-
 include/linux/bpf.h                           |  33 ++
 include/linux/seq_file_net.h                  |   8 +
 include/uapi/linux/bpf.h                      |  38 +-
 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/bpf_iter.c                         | 358 ++++++++++++++++++
 kernel/bpf/btf.c                              |  38 +-
 kernel/bpf/inode.c                            |  28 ++
 kernel/bpf/map_iter.c                         | 107 ++++++
 kernel/bpf/syscall.c                          |  62 ++-
 kernel/bpf/task_iter.c                        | 319 ++++++++++++++++
 kernel/bpf/verifier.c                         |  47 ++-
 kernel/trace/bpf_trace.c                      | 159 ++++++++
 net/ipv6/ip6_fib.c                            |  71 +++-
 net/ipv6/route.c                              |  30 ++
 net/netlink/af_netlink.c                      |  99 ++++-
 scripts/bpf_helpers_doc.py                    |   2 +
 .../bpftool/Documentation/bpftool-iter.rst    |  71 ++++
 tools/bpf/bpftool/bash-completion/bpftool     |  13 +
 tools/bpf/bpftool/iter.c                      |  84 ++++
 tools/bpf/bpftool/main.c                      |   3 +-
 tools/bpf/bpftool/main.h                      |   1 +
 tools/include/uapi/linux/bpf.h                |  38 +-
 tools/lib/bpf/bpf.c                           |  11 +
 tools/lib/bpf/bpf.h                           |   2 +
 tools/lib/bpf/bpf_tracing.h                   |  23 ++
 tools/lib/bpf/libbpf.c                        |  60 +++
 tools/lib/bpf/libbpf.h                        |  11 +
 tools/lib/bpf/libbpf.map                      |   7 +
 .../selftests/bpf/prog_tests/bpf_iter.c       | 180 +++++++++
 .../selftests/bpf/progs/bpf_iter_bpf_map.c    |  32 ++
 .../selftests/bpf/progs/bpf_iter_ipv6_route.c |  69 ++++
 .../selftests/bpf/progs/bpf_iter_netlink.c    |  77 ++++
 .../selftests/bpf/progs/bpf_iter_task.c       |  29 ++
 .../selftests/bpf/progs/bpf_iter_task_file.c  |  28 ++
 .../selftests/bpf/progs/bpf_iter_test_kern1.c |   4 +
 .../selftests/bpf/progs/bpf_iter_test_kern2.c |   4 +
 .../selftests/bpf/progs/bpf_iter_test_kern3.c |  18 +
 .../bpf/progs/bpf_iter_test_kern_common.h     |  22 ++
 39 files changed, 2174 insertions(+), 19 deletions(-)
 create mode 100644 kernel/bpf/bpf_iter.c
 create mode 100644 kernel/bpf/map_iter.c
 create mode 100644 kernel/bpf/task_iter.c
 create mode 100644 tools/bpf/bpftool/Documentation/bpftool-iter.rst
 create mode 100644 tools/bpf/bpftool/iter.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_iter.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_ipv6_route.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_netlink.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_file.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_test_kern1.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_test_kern2.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_test_kern3.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_test_kern_common.h

-- 
2.24.1



end of thread, last message: 2020-05-02  7:17 UTC

Thread overview: 85+ messages
2020-04-27 20:12 [PATCH bpf-next v1 00/19] bpf: implement bpf iterator for kernel data Yonghong Song
2020-04-27 20:12 ` [PATCH bpf-next v1 01/19] net: refactor net assignment for seq_net_private structure Yonghong Song
2020-04-29  5:38   ` Andrii Nakryiko
2020-04-27 20:12 ` [PATCH bpf-next v1 02/19] bpf: implement an interface to register bpf_iter targets Yonghong Song
2020-04-28 16:20   ` Martin KaFai Lau
2020-04-28 16:50     ` Yonghong Song
2020-04-27 20:12 ` [PATCH bpf-next v1 03/19] bpf: add bpf_map iterator Yonghong Song
2020-04-29  0:37   ` Martin KaFai Lau
2020-04-29  0:48     ` Alexei Starovoitov
2020-04-29  1:15       ` Yonghong Song
2020-04-29  2:44         ` Alexei Starovoitov
2020-04-29  5:09           ` Yonghong Song
2020-04-29  6:08             ` Andrii Nakryiko
2020-04-29  6:20               ` Yonghong Song
2020-04-29  6:30                 ` Alexei Starovoitov
2020-04-29  6:40                   ` Andrii Nakryiko
2020-04-29  6:44                     ` Yonghong Song
2020-04-29 15:34                       ` Alexei Starovoitov
2020-04-29 18:14                         ` Yonghong Song
2020-04-29 19:19                         ` Andrii Nakryiko
2020-04-29 20:15                           ` Yonghong Song
2020-04-30  3:06                             ` Alexei Starovoitov
2020-04-30  4:01                               ` Yonghong Song
2020-04-29  6:34                 ` Martin KaFai Lau
2020-04-29  6:51                   ` Yonghong Song
2020-04-29 19:25                     ` Andrii Nakryiko
2020-04-29  1:02     ` Yonghong Song
2020-04-29  6:04   ` Andrii Nakryiko
2020-04-27 20:12 ` [PATCH bpf-next v1 04/19] bpf: allow loading of a bpf_iter program Yonghong Song
2020-04-29  0:54   ` Martin KaFai Lau
2020-04-29  1:27     ` Yonghong Song
2020-04-27 20:12 ` [PATCH bpf-next v1 05/19] bpf: support bpf tracing/iter programs for BPF_LINK_CREATE Yonghong Song
2020-04-29  1:17   ` [Potential Spoof] " Martin KaFai Lau
2020-04-29  6:25   ` Andrii Nakryiko
2020-04-27 20:12 ` [PATCH bpf-next v1 06/19] bpf: support bpf tracing/iter programs for BPF_LINK_UPDATE Yonghong Song
2020-04-29  1:32   ` Martin KaFai Lau
2020-04-29  5:04     ` Yonghong Song
2020-04-29  5:58       ` Martin KaFai Lau
2020-04-29  6:32         ` Andrii Nakryiko
2020-04-29  6:41           ` Martin KaFai Lau
2020-04-27 20:12 ` [PATCH bpf-next v1 07/19] bpf: create anonymous bpf iterator Yonghong Song
2020-04-29  5:39   ` Martin KaFai Lau
2020-04-29  6:56   ` Andrii Nakryiko
2020-04-29  7:06     ` Yonghong Song
2020-04-29 18:16       ` Andrii Nakryiko
2020-04-29 18:46         ` Martin KaFai Lau
2020-04-29 19:20           ` Yonghong Song
2020-04-29 20:50             ` Martin KaFai Lau
2020-04-29 20:54               ` Yonghong Song
2020-04-29 19:39   ` Andrii Nakryiko
2020-04-27 20:12 ` [PATCH bpf-next v1 08/19] bpf: create file " Yonghong Song
2020-04-29 20:40   ` Andrii Nakryiko
2020-04-30 18:02     ` Yonghong Song
2020-04-27 20:12 ` [PATCH bpf-next v1 09/19] bpf: add PTR_TO_BTF_ID_OR_NULL support Yonghong Song
2020-04-29 20:46   ` Andrii Nakryiko
2020-04-29 20:51     ` Yonghong Song
2020-04-27 20:12 ` [PATCH bpf-next v1 10/19] bpf: add netlink and ipv6_route targets Yonghong Song
2020-04-28 19:49   ` kbuild test robot
2020-04-28 19:49     ` kbuild test robot
2020-04-28 19:50   ` [RFC PATCH] bpf: __bpf_iter__netlink() can be static kbuild test robot
2020-04-28 19:50     ` kbuild test robot
2020-04-27 20:12 ` [PATCH bpf-next v1 11/19] bpf: add task and task/file targets Yonghong Song
2020-04-30  2:08   ` Andrii Nakryiko
2020-05-01 17:23     ` Yonghong Song
2020-05-01 19:01       ` Andrii Nakryiko
2020-04-27 20:12 ` [PATCH bpf-next v1 12/19] bpf: add bpf_seq_printf and bpf_seq_write helpers Yonghong Song
2020-04-28  6:02   ` kbuild test robot
2020-04-28  6:02     ` kbuild test robot
2020-04-28 16:35     ` Yonghong Song
2020-04-28 16:35       ` Yonghong Song
2020-04-30 20:06       ` Andrii Nakryiko
2020-04-27 20:12 ` [PATCH bpf-next v1 13/19] bpf: handle spilled PTR_TO_BTF_ID properly when checking stack_boundary Yonghong Song
2020-04-27 20:12 ` [PATCH bpf-next v1 14/19] bpf: support variable length array in tracing programs Yonghong Song
2020-04-30 20:04   ` Andrii Nakryiko
2020-04-27 20:12 ` [PATCH bpf-next v1 15/19] tools/libbpf: add bpf_iter support Yonghong Song
2020-04-30  1:41   ` Andrii Nakryiko
2020-05-02  7:17     ` Yonghong Song
2020-04-27 20:12 ` [PATCH bpf-next v1 16/19] tools/bpftool: add bpf_iter support for bptool Yonghong Song
2020-04-28  9:27   ` Quentin Monnet
2020-04-28 17:35     ` Yonghong Song
2020-04-29  8:37       ` Quentin Monnet
2020-04-27 20:12 ` [PATCH bpf-next v1 17/19] tools/bpf: selftests: add iterator programs for ipv6_route and netlink Yonghong Song
2020-04-30  2:12   ` Andrii Nakryiko
2020-04-27 20:12 ` [PATCH bpf-next v1 18/19] tools/bpf: selftests: add iter progs for bpf_map/task/task_file Yonghong Song
2020-04-27 20:12 ` [PATCH bpf-next v1 19/19] tools/bpf: selftests: add bpf_iter selftests Yonghong Song
