* [PATCH v3 bpf-next 01/14] bpf: Introduce bpf_arena.
From: Alexei Starovoitov @ 2024-03-08 1:07 UTC
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Introduce bpf_arena, which is a sparse shared memory region between the bpf
program and user space.
Use cases:
1. User space mmap-s bpf_arena and uses it as a traditional mmap-ed
anonymous region, like memcached or any key/value storage. The bpf
program implements an in-kernel accelerator. XDP prog can search for
a key in bpf_arena and return a value without going to user space.
2. The bpf program builds arbitrary data structures in bpf_arena (hash
tables, rb-trees, sparse arrays), while user space consumes it.
3. bpf_arena is a "heap" of memory from the bpf program's point of view.
User space may mmap it, but the bpf program will not convert pointers
to the user base at run-time, in order to keep the bpf program fast.
Initially, the kernel vm_area and user vma are not populated. User space
can fault in pages within the range. While servicing a page fault,
bpf_arena logic will insert a new page into the kernel and user vmas. The
bpf program can allocate pages from that region via
bpf_arena_alloc_pages(). This kernel function will insert pages into the
kernel vm_area. The subsequent fault-in from user space will populate that
page into the user vma. The BPF_F_SEGV_ON_FAULT flag at arena creation time
can be used to prevent fault-in from user space. In such a case, if a page
is not allocated by the bpf program and not present in the kernel vm_area,
the user process will segfault. This is useful for use cases 2 and 3 above.
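For illustration, here is a minimal user-space sketch of arena creation and
mmap (not part of this patch; error handling is omitted and the helper name
is made up, while the map type, flags and map_extra semantics come from the
code below):

#include <bpf/bpf.h>            /* bpf_map_create(), LIBBPF_OPTS() */
#include <sys/mman.h>

static void *map_arena(unsigned int nr_pages, int *map_fdp)
{
        LIBBPF_OPTS(bpf_map_create_opts, opts,
                    .map_flags = BPF_F_MMAPABLE,   /* mandatory for arena */
                    .map_extra = 0);    /* 0: let mmap() pick user_vm_start */

        /* key_size and value_size must be 0; max_entries is the page count */
        *map_fdp = bpf_map_create(BPF_MAP_TYPE_ARENA, "arena", 0, 0,
                                  nr_pages, &opts);
        /* map the whole arena; pages are faulted in on demand */
        return mmap(NULL, (size_t)nr_pages * 4096, PROT_READ | PROT_WRITE,
                    MAP_SHARED, *map_fdp, 0);
}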
bpf_arena_alloc_pages() is similar to user space mmap(). It allocates
pages either at a specific address within the arena or anywhere in the
arena, with a maple tree tracking the allocated ranges.
bpf_arena_free_pages() is analogous to munmap(): it frees pages and
removes the range from the kernel vm_area and from user process vmas.
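As a sketch, the bpf program side could use these kfuncs as follows (the
map definition and __ksym declarations follow the usual libbpf conventions
and are assumptions here; the kfunc signatures match the ones added below):

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

struct {
        __uint(type, BPF_MAP_TYPE_ARENA);
        __uint(map_flags, BPF_F_MMAPABLE);
        __uint(max_entries, 256);               /* arena of 256 pages */
} arena SEC(".maps");

void *bpf_arena_alloc_pages(void *map, void *addr, __u32 page_cnt,
                            int node_id, __u64 flags) __ksym;
void bpf_arena_free_pages(void *map, void *ptr, __u32 page_cnt) __ksym;

SEC("syscall")
int alloc_and_free(void *ctx)
{
        /* allocate one page anywhere in the arena, then free it */
        void *page = bpf_arena_alloc_pages(&arena, /* addr */ NULL, 1,
                                           /* NUMA_NO_NODE */ -1, 0);

        if (page)
                bpf_arena_free_pages(&arena, page, 1);
        return 0;
}

char _license[] SEC("license") = "GPL";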
bpf_arena can be used as a bpf program "heap" of up to 4GB when the speed
of the bpf program matters more than ease of sharing with user space (use
case 3). In such a case, the BPF_F_NO_USER_CONV flag is recommended. It
tells the verifier to treat the rX = bpf_arena_cast_user(rY) instruction
as a 32-bit move wX = wY, which improves bpf prog performance. Otherwise,
bpf_arena_cast_user is translated by the JIT to conditionally add the
upper 32 bits of user vm_start (if the pointer is not NULL) to arena
pointers before they are stored into memory, so that user space sees them
as valid 64-bit pointers.
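A worked example of that conversion, reusing the addresses from the comment
in arena.c below:

user_vm_start                = 0x7f7d26200000
clear_lo32(user_vm_start)    = 0x7f7d00000000
arena pointer stored by prog =       0x26201008   /* lower 32 bits only */
pointer seen by user space   = 0x7f7d00000000 | 0x26201008
                             = 0x7f7d26201008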
The diff https://github.com/llvm/llvm-project/pull/84410 enables the LLVM
BPF backend to generate the bpf_addr_space_cast() instruction to cast
pointers between address_space(1), which is reserved for bpf_arena
pointers, and the default address space zero. All arena pointers in a bpf
program written in C are tagged as __attribute__((address_space(1))).
Hence, clang provides helpful diagnostics when pointers cross address
spaces. Libbpf and the kernel support only address_space == 1; all other
address space identifiers are reserved.
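For example, arena pointers in bpf C code can be tagged through a shorthand
like this (the __arena macro name is an assumption mirroring the selftests
later in the series; the attribute itself is what clang keys on):

#define __arena __attribute__((address_space(1)))

struct node {
        struct node __arena *next;      /* arena-to-arena link stays in AS(1) */
        __u64 value;
};

struct node __arena *head;      /* converting this to/from a plain
                                 * 'struct node *' makes clang emit
                                 * bpf_addr_space_cast() */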
rX = bpf_addr_space_cast(rY, /* dst_as */ 1, /* src_as */ 0) tells the
verifier that rX->type = PTR_TO_ARENA. Any further operations on a
PTR_TO_ARENA register have to be in the 32-bit domain. The verifier will
mark load/store through PTR_TO_ARENA with PROBE_MEM32. JIT will generate
them as kern_vm_start + 32bit_addr memory accesses. The behavior is similar
to copy_from_kernel_nofault() except that no address checks are necessary.
The address is guaranteed to be in the 4GB range. If the page is not
present, the destination register is zeroed on read, and the operation is
ignored on write.
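Concretely, with the __arena tag from the sketch above, the run-time
semantics look like this:

static __u64 peek_and_bump(__u64 __arena *p)
{
        __u64 v = *p;   /* load via PTR_TO_ARENA: reads 0 if the
                         * backing page is not present */

        *p = v + 1;     /* store via PTR_TO_ARENA: silently ignored
                         * if the backing page is not present */
        return v;
}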
rX = bpf_addr_space_cast(rY, 0, 1) tells the verifier that rX->type =
unknown scalar. If arena->map_flags has BPF_F_NO_USER_CONV set, then the
verifier converts such cast instructions to mov32. Otherwise, JIT will emit
native code equivalent to:
rX = (u32)rY;
if (rY)
rX |= clear_lo32_bits(arena->user_vm_start); /* replace hi32 bits in rX */
After such conversion, the pointer becomes a valid user pointer within
bpf_arena range. The user process can access data structures created in
bpf_arena without any additional computations. For example, a linked list
built by a bpf program can be walked natively by user space.
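A sketch of the user-space side of that example (the convention of
publishing the list head in the first word of the arena is made up here;
any agreed-upon location would work):

#include <stdio.h>

struct node {
        struct node *next;      /* stored as a full 64-bit user pointer
                                 * thanks to bpf_arena_cast_user */
        unsigned long value;
};

static void walk_list(void *base)       /* base = mmap()-ed arena start */
{
        struct node *n = *(struct node **)base;

        for (; n; n = n->next)
                printf("%lu\n", n->value);
}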
Reviewed-by: Barret Rhoden <brho@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
include/linux/bpf.h | 7 +-
include/linux/bpf_types.h | 1 +
include/uapi/linux/bpf.h | 10 +
kernel/bpf/Makefile | 3 +
kernel/bpf/arena.c | 558 +++++++++++++++++++++++++++++++++
kernel/bpf/core.c | 11 +
kernel/bpf/syscall.c | 36 +++
kernel/bpf/verifier.c | 1 +
tools/include/uapi/linux/bpf.h | 10 +
9 files changed, 635 insertions(+), 2 deletions(-)
create mode 100644 kernel/bpf/arena.c
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 95e07673cdc1..ea6ab6e0eef9 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -37,6 +37,7 @@ struct perf_event;
struct bpf_prog;
struct bpf_prog_aux;
struct bpf_map;
+struct bpf_arena;
struct sock;
struct seq_file;
struct btf;
@@ -528,8 +529,8 @@ void bpf_list_head_free(const struct btf_field *field, void *list_head,
struct bpf_spin_lock *spin_lock);
void bpf_rb_root_free(const struct btf_field *field, void *rb_root,
struct bpf_spin_lock *spin_lock);
-
-
+u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena);
+u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena);
int bpf_obj_name_cpy(char *dst, const char *src, unsigned int size);
struct bpf_offload_dev;
@@ -2215,6 +2216,8 @@ int generic_map_delete_batch(struct bpf_map *map,
struct bpf_map *bpf_map_get_curr_or_next(u32 *id);
struct bpf_prog *bpf_prog_get_curr_or_next(u32 *id);
+int bpf_map_alloc_pages(const struct bpf_map *map, gfp_t gfp, int nid,
+ unsigned long nr_pages, struct page **page_array);
#ifdef CONFIG_MEMCG_KMEM
void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
int node);
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 94baced5a1ad..9f2a6b83b49e 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -132,6 +132,7 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_STRUCT_OPS, bpf_struct_ops_map_ops)
BPF_MAP_TYPE(BPF_MAP_TYPE_RINGBUF, ringbuf_map_ops)
BPF_MAP_TYPE(BPF_MAP_TYPE_BLOOM_FILTER, bloom_filter_map_ops)
BPF_MAP_TYPE(BPF_MAP_TYPE_USER_RINGBUF, user_ringbuf_map_ops)
+BPF_MAP_TYPE(BPF_MAP_TYPE_ARENA, arena_map_ops)
BPF_LINK_TYPE(BPF_LINK_TYPE_RAW_TRACEPOINT, raw_tracepoint)
BPF_LINK_TYPE(BPF_LINK_TYPE_TRACING, tracing)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 85ec7fc799d7..e30d943db8a4 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1009,6 +1009,7 @@ enum bpf_map_type {
BPF_MAP_TYPE_BLOOM_FILTER,
BPF_MAP_TYPE_USER_RINGBUF,
BPF_MAP_TYPE_CGRP_STORAGE,
+ BPF_MAP_TYPE_ARENA,
__MAX_BPF_MAP_TYPE
};
@@ -1396,6 +1397,12 @@ enum {
/* BPF token FD is passed in a corresponding command's token_fd field */
BPF_F_TOKEN_FD = (1U << 16),
+
+/* When user space page faults in bpf_arena send SIGSEGV instead of inserting new page */
+ BPF_F_SEGV_ON_FAULT = (1U << 17),
+
+/* Do not translate kernel bpf_arena pointers to user pointers */
+ BPF_F_NO_USER_CONV = (1U << 18),
};
/* Flags for BPF_PROG_QUERY. */
@@ -1467,6 +1474,9 @@ union bpf_attr {
* BPF_MAP_TYPE_BLOOM_FILTER - the lowest 4 bits indicate the
* number of hash functions (if 0, the bloom filter will default
* to using 5 hash functions).
+ *
+ * BPF_MAP_TYPE_ARENA - contains the address where user space
+ * is going to mmap() the arena. It has to be page aligned.
*/
__u64 map_extra;
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 4ce95acfcaa7..368c5d86b5b7 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -15,6 +15,9 @@ obj-${CONFIG_BPF_LSM} += bpf_inode_storage.o
obj-$(CONFIG_BPF_SYSCALL) += disasm.o mprog.o
obj-$(CONFIG_BPF_JIT) += trampoline.o
obj-$(CONFIG_BPF_SYSCALL) += btf.o memalloc.o
+ifeq ($(CONFIG_MMU)$(CONFIG_64BIT),yy)
+obj-$(CONFIG_BPF_SYSCALL) += arena.o
+endif
obj-$(CONFIG_BPF_JIT) += dispatcher.o
ifeq ($(CONFIG_NET),y)
obj-$(CONFIG_BPF_SYSCALL) += devmap.o
diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
new file mode 100644
index 000000000000..86571e760dd6
--- /dev/null
+++ b/kernel/bpf/arena.c
@@ -0,0 +1,558 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <linux/bpf.h>
+#include <linux/btf.h>
+#include <linux/err.h>
+#include <linux/btf_ids.h>
+#include <linux/vmalloc.h>
+#include <linux/pagemap.h>
+
+/*
+ * bpf_arena is a sparsely populated shared memory region between bpf program and
+ * user space process.
+ *
+ * For example on x86-64 the values could be:
+ * user_vm_start 7f7d26200000 // picked by mmap()
+ * kern_vm_start ffffc90001e69000 // picked by get_vm_area()
+ * For user space all pointers within the arena are normal 8-byte addresses.
+ * In this example 7f7d26200000 is the address of the first page (pgoff=0).
+ * The bpf program will access it as: kern_vm_start + lower_32bit_of_user_ptr
+ * (u32)7f7d26200000 -> 26200000
+ * hence
+ * ffffc90001e69000 + 26200000 == ffffc90028069000 is "pgoff=0" within 4Gb
+ * kernel memory region.
+ *
+ * BPF JITs generate the following code to access arena:
+ * mov eax, eax // eax has lower 32-bit of user pointer
+ * mov word ptr [rax + r12 + off], bx
+ * where r12 == kern_vm_start and off is s16.
+ * Hence allocate 4Gb + GUARD_SZ/2 on each side.
+ *
+ * Initially kernel vm_area and user vma are not populated.
+ * User space can fault-in any address which will insert the page
+ * into kernel and user vma.
+ * bpf program can allocate a page via bpf_arena_alloc_pages() kfunc
+ * which will insert it into kernel vm_area.
+ * The later fault-in from user space will populate that page into user vma.
+ */
+
+/* number of bytes addressable by LDX/STX insn with 16-bit 'off' field */
+#define GUARD_SZ (1ull << sizeof(((struct bpf_insn *)0)->off) * 8)
+#define KERN_VM_SZ ((1ull << 32) + GUARD_SZ)
+
+struct bpf_arena {
+ struct bpf_map map;
+ u64 user_vm_start;
+ u64 user_vm_end;
+ struct vm_struct *kern_vm;
+ struct maple_tree mt;
+ struct list_head vma_list;
+ struct mutex lock;
+};
+
+u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena)
+{
+ return arena ? (u64) (long) arena->kern_vm->addr + GUARD_SZ / 2 : 0;
+}
+
+u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena)
+{
+ return arena ? arena->user_vm_start : 0;
+}
+
+static long arena_map_peek_elem(struct bpf_map *map, void *value)
+{
+ return -EOPNOTSUPP;
+}
+
+static long arena_map_push_elem(struct bpf_map *map, void *value, u64 flags)
+{
+ return -EOPNOTSUPP;
+}
+
+static long arena_map_pop_elem(struct bpf_map *map, void *value)
+{
+ return -EOPNOTSUPP;
+}
+
+static long arena_map_delete_elem(struct bpf_map *map, void *value)
+{
+ return -EOPNOTSUPP;
+}
+
+static int arena_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
+{
+ return -EOPNOTSUPP;
+}
+
+static long compute_pgoff(struct bpf_arena *arena, long uaddr)
+{
+ return (u32)(uaddr - (u32)arena->user_vm_start) >> PAGE_SHIFT;
+}
+
+static struct bpf_map *arena_map_alloc(union bpf_attr *attr)
+{
+ struct vm_struct *kern_vm;
+ int numa_node = bpf_map_attr_numa_node(attr);
+ struct bpf_arena *arena;
+ u64 vm_range;
+ int err = -ENOMEM;
+
+ if (attr->key_size || attr->value_size || attr->max_entries == 0 ||
+ /* BPF_F_MMAPABLE must be set */
+ !(attr->map_flags & BPF_F_MMAPABLE) ||
+ /* No unsupported flags present */
+ (attr->map_flags & ~(BPF_F_SEGV_ON_FAULT | BPF_F_MMAPABLE | BPF_F_NO_USER_CONV)))
+ return ERR_PTR(-EINVAL);
+
+ if (attr->map_extra & ~PAGE_MASK)
+ /* If non-zero the map_extra is an expected user VMA start address */
+ return ERR_PTR(-EINVAL);
+
+ vm_range = (u64)attr->max_entries * PAGE_SIZE;
+ if (vm_range > (1ull << 32))
+ return ERR_PTR(-E2BIG);
+
+ if ((attr->map_extra >> 32) != ((attr->map_extra + vm_range - 1) >> 32))
+ /* user vma must not cross 32-bit boundary */
+ return ERR_PTR(-ERANGE);
+
+ kern_vm = get_vm_area(KERN_VM_SZ, VM_SPARSE | VM_USERMAP);
+ if (!kern_vm)
+ return ERR_PTR(-ENOMEM);
+
+ arena = bpf_map_area_alloc(sizeof(*arena), numa_node);
+ if (!arena)
+ goto err;
+
+ arena->kern_vm = kern_vm;
+ arena->user_vm_start = attr->map_extra;
+ if (arena->user_vm_start)
+ arena->user_vm_end = arena->user_vm_start + vm_range;
+
+ INIT_LIST_HEAD(&arena->vma_list);
+ bpf_map_init_from_attr(&arena->map, attr);
+ mt_init_flags(&arena->mt, MT_FLAGS_ALLOC_RANGE);
+ mutex_init(&arena->lock);
+
+ return &arena->map;
+err:
+ free_vm_area(kern_vm);
+ return ERR_PTR(err);
+}
+
+static int existing_page_cb(pte_t *ptep, unsigned long addr, void *data)
+{
+ struct page *page;
+ pte_t pte;
+
+ pte = ptep_get(ptep);
+ if (!pte_present(pte)) /* sanity check */
+ return 0;
+ page = pte_page(pte);
+ /*
+ * We do not update pte here:
+ * 1. Nobody should be accessing bpf_arena's range outside of a kernel bug
+ * 2. TLB flushing is batched or deferred. Even if we clear pte,
+ * the TLB entries can stick around and continue to permit access to
+ * the freed page. So it all relies on 1.
+ */
+ __free_page(page);
+ return 0;
+}
+
+static void arena_map_free(struct bpf_map *map)
+{
+ struct bpf_arena *arena = container_of(map, struct bpf_arena, map);
+
+ /*
+ * Check that user vma-s are not around when bpf map is freed.
+ * mmap() holds vm_file which holds bpf_map refcnt.
+ * munmap() must have happened on vma followed by arena_vm_close()
+ * which would clear arena->vma_list.
+ */
+ if (WARN_ON_ONCE(!list_empty(&arena->vma_list)))
+ return;
+
+ /*
+ * free_vm_area() calls remove_vm_area() that calls free_unmap_vmap_area().
+ * It unmaps everything from vmalloc area and clears pgtables.
+ * Call apply_to_existing_page_range() first to find populated ptes and
+ * free those pages.
+ */
+ apply_to_existing_page_range(&init_mm, bpf_arena_get_kern_vm_start(arena),
+ KERN_VM_SZ - GUARD_SZ, existing_page_cb, NULL);
+ free_vm_area(arena->kern_vm);
+ mtree_destroy(&arena->mt);
+ bpf_map_area_free(arena);
+}
+
+static void *arena_map_lookup_elem(struct bpf_map *map, void *key)
+{
+ return ERR_PTR(-EINVAL);
+}
+
+static long arena_map_update_elem(struct bpf_map *map, void *key,
+ void *value, u64 flags)
+{
+ return -EOPNOTSUPP;
+}
+
+static int arena_map_check_btf(const struct bpf_map *map, const struct btf *btf,
+ const struct btf_type *key_type, const struct btf_type *value_type)
+{
+ return 0;
+}
+
+static u64 arena_map_mem_usage(const struct bpf_map *map)
+{
+ return 0;
+}
+
+struct vma_list {
+ struct vm_area_struct *vma;
+ struct list_head head;
+};
+
+static int remember_vma(struct bpf_arena *arena, struct vm_area_struct *vma)
+{
+ struct vma_list *vml;
+
+ vml = kmalloc(sizeof(*vml), GFP_KERNEL);
+ if (!vml)
+ return -ENOMEM;
+ vma->vm_private_data = vml;
+ vml->vma = vma;
+ list_add(&vml->head, &arena->vma_list);
+ return 0;
+}
+
+static void arena_vm_close(struct vm_area_struct *vma)
+{
+ struct bpf_map *map = vma->vm_file->private_data;
+ struct bpf_arena *arena = container_of(map, struct bpf_arena, map);
+ struct vma_list *vml;
+
+ guard(mutex)(&arena->lock);
+ vml = vma->vm_private_data;
+ list_del(&vml->head);
+ vma->vm_private_data = NULL;
+ kfree(vml);
+}
+
+#define MT_ENTRY ((void *)&arena_map_ops) /* unused. has to be valid pointer */
+
+static vm_fault_t arena_vm_fault(struct vm_fault *vmf)
+{
+ struct bpf_map *map = vmf->vma->vm_file->private_data;
+ struct bpf_arena *arena = container_of(map, struct bpf_arena, map);
+ struct page *page;
+ long kbase, kaddr;
+ int ret;
+
+ kbase = bpf_arena_get_kern_vm_start(arena);
+ kaddr = kbase + (u32)(vmf->address & PAGE_MASK);
+
+ guard(mutex)(&arena->lock);
+ page = vmalloc_to_page((void *)kaddr);
+ if (page)
+ /* already have a page vmap-ed */
+ goto out;
+
+ if (arena->map.map_flags & BPF_F_SEGV_ON_FAULT)
+ /* User space requested to segfault when page is not allocated by bpf prog */
+ return VM_FAULT_SIGSEGV;
+
+ ret = mtree_insert(&arena->mt, vmf->pgoff, MT_ENTRY, GFP_KERNEL);
+ if (ret)
+ return VM_FAULT_SIGSEGV;
+
+ /* Account into memcg of the process that created bpf_arena */
+ ret = bpf_map_alloc_pages(map, GFP_KERNEL | __GFP_ZERO, NUMA_NO_NODE, 1, &page);
+ if (ret) {
+ mtree_erase(&arena->mt, vmf->pgoff);
+ return VM_FAULT_SIGSEGV;
+ }
+
+ ret = vm_area_map_pages(arena->kern_vm, kaddr, kaddr + PAGE_SIZE, &page);
+ if (ret) {
+ mtree_erase(&arena->mt, vmf->pgoff);
+ __free_page(page);
+ return VM_FAULT_SIGSEGV;
+ }
+out:
+ page_ref_add(page, 1);
+ vmf->page = page;
+ return 0;
+}
+
+static const struct vm_operations_struct arena_vm_ops = {
+ .close = arena_vm_close,
+ .fault = arena_vm_fault,
+};
+
+static unsigned long arena_get_unmapped_area(struct file *filp, unsigned long addr,
+ unsigned long len, unsigned long pgoff,
+ unsigned long flags)
+{
+ struct bpf_map *map = filp->private_data;
+ struct bpf_arena *arena = container_of(map, struct bpf_arena, map);
+ long ret;
+
+ if (pgoff)
+ return -EINVAL;
+ if (len > (1ull << 32))
+ return -E2BIG;
+
+ /* if user_vm_start was specified at arena creation time */
+ if (arena->user_vm_start) {
+ if (len > arena->user_vm_end - arena->user_vm_start)
+ return -E2BIG;
+ if (len != arena->user_vm_end - arena->user_vm_start)
+ return -EINVAL;
+ if (addr != arena->user_vm_start)
+ return -EINVAL;
+ }
+
+ ret = current->mm->get_unmapped_area(filp, addr, len * 2, 0, flags);
+ if (IS_ERR_VALUE(ret))
+ return ret;
+ if ((ret >> 32) == ((ret + len - 1) >> 32))
+ return ret;
+ if (WARN_ON_ONCE(arena->user_vm_start))
+ /* checks at map creation time should prevent this */
+ return -EFAULT;
+ return round_up(ret, 1ull << 32);
+}
+
+static int arena_map_mmap(struct bpf_map *map, struct vm_area_struct *vma)
+{
+ struct bpf_arena *arena = container_of(map, struct bpf_arena, map);
+
+ guard(mutex)(&arena->lock);
+ if (arena->user_vm_start && arena->user_vm_start != vma->vm_start)
+ /*
+ * If map_extra was not specified at arena creation time then
+ * 1st user process can do mmap(NULL, ...) to pick user_vm_start
+ * 2nd user process must pass the same addr to mmap(addr, MAP_FIXED..);
+ * or
+ * specify addr in map_extra and
+ * use the same addr later with mmap(addr, MAP_FIXED..);
+ */
+ return -EBUSY;
+
+ if (arena->user_vm_end && arena->user_vm_end != vma->vm_end)
+ /* all user processes must have the same size of mmap-ed region */
+ return -EBUSY;
+
+ /* Earlier checks should prevent this */
+ if (WARN_ON_ONCE(vma->vm_end - vma->vm_start > (1ull << 32) || vma->vm_pgoff))
+ return -EFAULT;
+
+ if (remember_vma(arena, vma))
+ return -ENOMEM;
+
+ arena->user_vm_start = vma->vm_start;
+ arena->user_vm_end = vma->vm_end;
+ /*
+ * bpf_map_mmap() checks that it's being mmaped as VM_SHARED and
+ * clears VM_MAYEXEC. Set VM_DONTEXPAND as well to avoid
+ * potential change of user_vm_start.
+ */
+ vm_flags_set(vma, VM_DONTEXPAND);
+ vma->vm_ops = &arena_vm_ops;
+ return 0;
+}
+
+static int arena_map_direct_value_addr(const struct bpf_map *map, u64 *imm, u32 off)
+{
+ struct bpf_arena *arena = container_of(map, struct bpf_arena, map);
+
+ if ((u64)off > arena->user_vm_end - arena->user_vm_start)
+ return -ERANGE;
+ *imm = (unsigned long)arena->user_vm_start;
+ return 0;
+}
+
+BTF_ID_LIST_SINGLE(bpf_arena_map_btf_ids, struct, bpf_arena)
+const struct bpf_map_ops arena_map_ops = {
+ .map_meta_equal = bpf_map_meta_equal,
+ .map_alloc = arena_map_alloc,
+ .map_free = arena_map_free,
+ .map_direct_value_addr = arena_map_direct_value_addr,
+ .map_mmap = arena_map_mmap,
+ .map_get_unmapped_area = arena_get_unmapped_area,
+ .map_get_next_key = arena_map_get_next_key,
+ .map_push_elem = arena_map_push_elem,
+ .map_peek_elem = arena_map_peek_elem,
+ .map_pop_elem = arena_map_pop_elem,
+ .map_lookup_elem = arena_map_lookup_elem,
+ .map_update_elem = arena_map_update_elem,
+ .map_delete_elem = arena_map_delete_elem,
+ .map_check_btf = arena_map_check_btf,
+ .map_mem_usage = arena_map_mem_usage,
+ .map_btf_id = &bpf_arena_map_btf_ids[0],
+};
+
+static u64 clear_lo32(u64 val)
+{
+ return val & ~(u64)~0U;
+}
+
+/*
+ * Allocate pages and vmap them into kernel vmalloc area.
+ * Later the pages will be mmaped into user space vma.
+ */
+static long arena_alloc_pages(struct bpf_arena *arena, long uaddr, long page_cnt, int node_id)
+{
+ /* user_vm_end/start are fixed before bpf prog runs */
+ long page_cnt_max = (arena->user_vm_end - arena->user_vm_start) >> PAGE_SHIFT;
+ u64 kern_vm_start = bpf_arena_get_kern_vm_start(arena);
+ struct page **pages;
+ long pgoff = 0;
+ u32 uaddr32;
+ int ret, i;
+
+ if (page_cnt > page_cnt_max)
+ return 0;
+
+ if (uaddr) {
+ if (uaddr & ~PAGE_MASK)
+ return 0;
+ pgoff = compute_pgoff(arena, uaddr);
+ if (pgoff + page_cnt > page_cnt_max)
+ /* requested address will be outside of user VMA */
+ return 0;
+ }
+
+ /* zeroing is needed, since alloc_pages_bulk_array() only fills in non-zero entries */
+ pages = kvcalloc(page_cnt, sizeof(struct page *), GFP_KERNEL);
+ if (!pages)
+ return 0;
+
+ guard(mutex)(&arena->lock);
+
+ if (uaddr)
+ ret = mtree_insert_range(&arena->mt, pgoff, pgoff + page_cnt - 1,
+ MT_ENTRY, GFP_KERNEL);
+ else
+ ret = mtree_alloc_range(&arena->mt, &pgoff, MT_ENTRY,
+ page_cnt, 0, page_cnt_max - 1, GFP_KERNEL);
+ if (ret)
+ goto out_free_pages;
+
+ ret = bpf_map_alloc_pages(&arena->map, GFP_KERNEL | __GFP_ZERO,
+ node_id, page_cnt, pages);
+ if (ret)
+ goto out;
+
+ uaddr32 = (u32)(arena->user_vm_start + pgoff * PAGE_SIZE);
+ /* Earlier checks make sure that uaddr32 + page_cnt * PAGE_SIZE will not overflow 32-bit */
+ ret = vm_area_map_pages(arena->kern_vm, kern_vm_start + uaddr32,
+ kern_vm_start + uaddr32 + page_cnt * PAGE_SIZE, pages);
+ if (ret) {
+ for (i = 0; i < page_cnt; i++)
+ __free_page(pages[i]);
+ goto out;
+ }
+ kvfree(pages);
+ return clear_lo32(arena->user_vm_start) + uaddr32;
+out:
+ mtree_erase(&arena->mt, pgoff);
+out_free_pages:
+ kvfree(pages);
+ return 0;
+}
+
+/*
+ * If page is present in vmalloc area, unmap it from vmalloc area,
+ * unmap it from all user space vma-s,
+ * and free it.
+ */
+static void zap_pages(struct bpf_arena *arena, long uaddr, long page_cnt)
+{
+ struct vma_list *vml;
+
+ list_for_each_entry(vml, &arena->vma_list, head)
+ zap_page_range_single(vml->vma, uaddr,
+ PAGE_SIZE * page_cnt, NULL);
+}
+
+static void arena_free_pages(struct bpf_arena *arena, long uaddr, long page_cnt)
+{
+ u64 full_uaddr, uaddr_end;
+ long kaddr, pgoff, i;
+ struct page *page;
+
+ /* only aligned lower 32-bit are relevant */
+ uaddr = (u32)uaddr;
+ uaddr &= PAGE_MASK;
+ full_uaddr = clear_lo32(arena->user_vm_start) + uaddr;
+ uaddr_end = min(arena->user_vm_end, full_uaddr + (page_cnt << PAGE_SHIFT));
+ if (full_uaddr >= uaddr_end)
+ return;
+
+ page_cnt = (uaddr_end - full_uaddr) >> PAGE_SHIFT;
+
+ guard(mutex)(&arena->lock);
+
+ pgoff = compute_pgoff(arena, uaddr);
+ /* clear range */
+ mtree_store_range(&arena->mt, pgoff, pgoff + page_cnt - 1, NULL, GFP_KERNEL);
+
+ if (page_cnt > 1)
+ /* bulk zap if multiple pages being freed */
+ zap_pages(arena, full_uaddr, page_cnt);
+
+ kaddr = bpf_arena_get_kern_vm_start(arena) + uaddr;
+ for (i = 0; i < page_cnt; i++, kaddr += PAGE_SIZE, full_uaddr += PAGE_SIZE) {
+ page = vmalloc_to_page((void *)kaddr);
+ if (!page)
+ continue;
+ if (page_cnt == 1 && page_mapped(page)) /* mapped by some user process */
+ zap_pages(arena, full_uaddr, 1);
+ vm_area_unmap_pages(arena->kern_vm, kaddr, kaddr + PAGE_SIZE);
+ __free_page(page);
+ }
+}
+
+__bpf_kfunc_start_defs();
+
+__bpf_kfunc void *bpf_arena_alloc_pages(void *p__map, void *addr__ign, u32 page_cnt,
+ int node_id, u64 flags)
+{
+ struct bpf_map *map = p__map;
+ struct bpf_arena *arena = container_of(map, struct bpf_arena, map);
+
+ if (map->map_type != BPF_MAP_TYPE_ARENA || flags || !page_cnt)
+ return NULL;
+
+ return (void *)arena_alloc_pages(arena, (long)addr__ign, page_cnt, node_id);
+}
+
+__bpf_kfunc void bpf_arena_free_pages(void *p__map, void *ptr__ign, u32 page_cnt)
+{
+ struct bpf_map *map = p__map;
+ struct bpf_arena *arena = container_of(map, struct bpf_arena, map);
+
+ if (map->map_type != BPF_MAP_TYPE_ARENA || !page_cnt || !ptr__ign)
+ return;
+ arena_free_pages(arena, (long)ptr__ign, page_cnt);
+}
+__bpf_kfunc_end_defs();
+
+BTF_KFUNCS_START(arena_kfuncs)
+BTF_ID_FLAGS(func, bpf_arena_alloc_pages, KF_TRUSTED_ARGS | KF_SLEEPABLE)
+BTF_ID_FLAGS(func, bpf_arena_free_pages, KF_TRUSTED_ARGS | KF_SLEEPABLE)
+BTF_KFUNCS_END(arena_kfuncs)
+
+static const struct btf_kfunc_id_set common_kfunc_set = {
+ .owner = THIS_MODULE,
+ .set = &arena_kfuncs,
+};
+
+static int __init kfunc_init(void)
+{
+ return register_btf_kfunc_id_set(BPF_PROG_TYPE_UNSPEC, &common_kfunc_set);
+}
+late_initcall(kfunc_init);
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 134b7979f537..a8ecf69c7754 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2976,6 +2976,17 @@ void __weak arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp,
{
}
+/* for configs without MMU or 32-bit */
+__weak const struct bpf_map_ops arena_map_ops;
+__weak u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena)
+{
+ return 0;
+}
+__weak u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena)
+{
+ return 0;
+}
+
#ifdef CONFIG_BPF_SYSCALL
static int __init bpf_global_ma_init(void)
{
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index f63f4da4db5e..67923e41a07e 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -164,6 +164,7 @@ static int bpf_map_update_value(struct bpf_map *map, struct file *map_file,
if (bpf_map_is_offloaded(map)) {
return bpf_map_offload_update_elem(map, key, value, flags);
} else if (map->map_type == BPF_MAP_TYPE_CPUMAP ||
+ map->map_type == BPF_MAP_TYPE_ARENA ||
map->map_type == BPF_MAP_TYPE_STRUCT_OPS) {
return map->ops->map_update_elem(map, key, value, flags);
} else if (map->map_type == BPF_MAP_TYPE_SOCKHASH ||
@@ -479,6 +480,39 @@ static void bpf_map_release_memcg(struct bpf_map *map)
}
#endif
+int bpf_map_alloc_pages(const struct bpf_map *map, gfp_t gfp, int nid,
+ unsigned long nr_pages, struct page **pages)
+{
+ unsigned long i, j;
+ struct page *pg;
+ int ret = 0;
+#ifdef CONFIG_MEMCG_KMEM
+ struct mem_cgroup *memcg, *old_memcg;
+
+ memcg = bpf_map_get_memcg(map);
+ old_memcg = set_active_memcg(memcg);
+#endif
+ for (i = 0; i < nr_pages; i++) {
+ pg = alloc_pages_node(nid, gfp | __GFP_ACCOUNT, 0);
+
+ if (pg) {
+ pages[i] = pg;
+ continue;
+ }
+ for (j = 0; j < i; j++)
+ __free_page(pages[j]);
+ ret = -ENOMEM;
+ break;
+ }
+
+#ifdef CONFIG_MEMCG_KMEM
+ set_active_memcg(old_memcg);
+ mem_cgroup_put(memcg);
+#endif
+ return ret;
+}
+
+
static int btf_field_cmp(const void *a, const void *b)
{
const struct btf_field *f1 = a, *f2 = b;
@@ -1176,6 +1210,7 @@ static int map_create(union bpf_attr *attr)
}
if (attr->map_type != BPF_MAP_TYPE_BLOOM_FILTER &&
+ attr->map_type != BPF_MAP_TYPE_ARENA &&
attr->map_extra != 0)
return -EINVAL;
@@ -1265,6 +1300,7 @@ static int map_create(union bpf_attr *attr)
case BPF_MAP_TYPE_LRU_PERCPU_HASH:
case BPF_MAP_TYPE_STRUCT_OPS:
case BPF_MAP_TYPE_CPUMAP:
+ case BPF_MAP_TYPE_ARENA:
if (!bpf_token_capable(token, CAP_BPF))
goto put_token;
break;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index bf084c693507..fbcf2e5e635a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -18108,6 +18108,7 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env,
case BPF_MAP_TYPE_CGRP_STORAGE:
case BPF_MAP_TYPE_QUEUE:
case BPF_MAP_TYPE_STACK:
+ case BPF_MAP_TYPE_ARENA:
break;
default:
verbose(env,
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 85ec7fc799d7..e30d943db8a4 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1009,6 +1009,7 @@ enum bpf_map_type {
BPF_MAP_TYPE_BLOOM_FILTER,
BPF_MAP_TYPE_USER_RINGBUF,
BPF_MAP_TYPE_CGRP_STORAGE,
+ BPF_MAP_TYPE_ARENA,
__MAX_BPF_MAP_TYPE
};
@@ -1396,6 +1397,12 @@ enum {
/* BPF token FD is passed in a corresponding command's token_fd field */
BPF_F_TOKEN_FD = (1U << 16),
+
+/* When user space page faults in bpf_arena send SIGSEGV instead of inserting new page */
+ BPF_F_SEGV_ON_FAULT = (1U << 17),
+
+/* Do not translate kernel bpf_arena pointers to user pointers */
+ BPF_F_NO_USER_CONV = (1U << 18),
};
/* Flags for BPF_PROG_QUERY. */
@@ -1467,6 +1474,9 @@ union bpf_attr {
* BPF_MAP_TYPE_BLOOM_FILTER - the lowest 4 bits indicate the
* number of hash functions (if 0, the bloom filter will default
* to using 5 hash functions).
+ *
+ * BPF_MAP_TYPE_ARENA - contains the address where user space
+ * is going to mmap() the arena. It has to be page aligned.
*/
__u64 map_extra;
--
2.43.0
* Re: [PATCH v3 bpf-next 01/14] bpf: Introduce bpf_arena.
From: Andrii Nakryiko @ 2024-03-11 22:01 UTC
To: Alexei Starovoitov
Cc: bpf, daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
On Thu, Mar 7, 2024 at 5:08 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
[...]
>
> struct bpf_offload_dev;
> @@ -2215,6 +2216,8 @@ int generic_map_delete_batch(struct bpf_map *map,
> struct bpf_map *bpf_map_get_curr_or_next(u32 *id);
> struct bpf_prog *bpf_prog_get_curr_or_next(u32 *id);
>
> +int bpf_map_alloc_pages(const struct bpf_map *map, gfp_t gfp, int nid,
nit: you use more meaningful node_id in arena_alloc_pages(), here
"nid" was a big mystery when looking at just function definition
> + unsigned long nr_pages, struct page **page_array);
> #ifdef CONFIG_MEMCG_KMEM
> void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
> int node);
[...]
> +u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena)
> +{
> + return arena ? (u64) (long) arena->kern_vm->addr + GUARD_SZ / 2 : 0;
> +}
> +
> +u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena)
> +{
> + return arena ? arena->user_vm_start : 0;
> +}
> +
is it anticipated that these helpers can be called with NULL? I might
see this later in the patch set, but if not, these NULL checks would
be best removed to not create wrong expectations.
> +static long arena_map_peek_elem(struct bpf_map *map, void *value)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static long arena_map_push_elem(struct bpf_map *map, void *value, u64 flags)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static long arena_map_pop_elem(struct bpf_map *map, void *value)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static long arena_map_delete_elem(struct bpf_map *map, void *value)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static int arena_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
> +{
> + return -EOPNOTSUPP;
> +}
> +
This is a separate topic, but I'll just mention it here. It was always
confusing to me why we don't just treat all these callbacks as
optional and return -EOPNOTSUPP in generic map code. Unless I miss
something subtle, we should do a round of clean ups and remove dozens
of unnecessary single line callbacks like these throughout the entire
BPF kernel code.
> +static long compute_pgoff(struct bpf_arena *arena, long uaddr)
> +{
> + return (u32)(uaddr - (u32)arena->user_vm_start) >> PAGE_SHIFT;
> +}
> +
[...]
> +static vm_fault_t arena_vm_fault(struct vm_fault *vmf)
> +{
> + struct bpf_map *map = vmf->vma->vm_file->private_data;
> + struct bpf_arena *arena = container_of(map, struct bpf_arena, map);
> + struct page *page;
> + long kbase, kaddr;
> + int ret;
> +
> + kbase = bpf_arena_get_kern_vm_start(arena);
> + kaddr = kbase + (u32)(vmf->address & PAGE_MASK);
> +
> + guard(mutex)(&arena->lock);
> + page = vmalloc_to_page((void *)kaddr);
> + if (page)
> + /* already have a page vmap-ed */
> + goto out;
> +
> + if (arena->map.map_flags & BPF_F_SEGV_ON_FAULT)
> + /* User space requested to segfault when page is not allocated by bpf prog */
> + return VM_FAULT_SIGSEGV;
> +
> + ret = mtree_insert(&arena->mt, vmf->pgoff, MT_ENTRY, GFP_KERNEL);
> + if (ret)
> + return VM_FAULT_SIGSEGV;
> +
> + /* Account into memcg of the process that created bpf_arena */
> + ret = bpf_map_alloc_pages(map, GFP_KERNEL | __GFP_ZERO, NUMA_NO_NODE, 1, &page);
any specific reason to not take into account map->numa_node here?
> + if (ret) {
> + mtree_erase(&arena->mt, vmf->pgoff);
> + return VM_FAULT_SIGSEGV;
> + }
> +
> + ret = vm_area_map_pages(arena->kern_vm, kaddr, kaddr + PAGE_SIZE, &page);
> + if (ret) {
> + mtree_erase(&arena->mt, vmf->pgoff);
> + __free_page(page);
> + return VM_FAULT_SIGSEGV;
> + }
> +out:
> + page_ref_add(page, 1);
> + vmf->page = page;
> + return 0;
> +}
> +
[...]
> +/*
> + * Allocate pages and vmap them into kernel vmalloc area.
> + * Later the pages will be mmaped into user space vma.
> + */
> +static long arena_alloc_pages(struct bpf_arena *arena, long uaddr, long page_cnt, int node_id)
> +{
> + /* user_vm_end/start are fixed before bpf prog runs */
> + long page_cnt_max = (arena->user_vm_end - arena->user_vm_start) >> PAGE_SHIFT;
> + u64 kern_vm_start = bpf_arena_get_kern_vm_start(arena);
> + struct page **pages;
> + long pgoff = 0;
> + u32 uaddr32;
> + int ret, i;
> +
> + if (page_cnt > page_cnt_max)
> + return 0;
> +
> + if (uaddr) {
> + if (uaddr & ~PAGE_MASK)
> + return 0;
> + pgoff = compute_pgoff(arena, uaddr);
> + if (pgoff + page_cnt > page_cnt_max)
As I mentioned offline, is this guaranteed to not overflow? it's not
obvious because at least according to all the types (longs), uaddr can
be arbitrary, so pgoff can be quite large, etc. Might be worthwhile
rewriting as `pgoff > page_cnt_max - page_cnt` or something, just to
make it clear in code it has no chance of overflowing.
> + /* requested address will be outside of user VMA */
> + return 0;
> + }
> +
> + /* zeroing is needed, since alloc_pages_bulk_array() only fills in non-zero entries */
> + pages = kvcalloc(page_cnt, sizeof(struct page *), GFP_KERNEL);
> + if (!pages)
> + return 0;
> +
> + guard(mutex)(&arena->lock);
> +
> + if (uaddr)
> + ret = mtree_insert_range(&arena->mt, pgoff, pgoff + page_cnt - 1,
> + MT_ENTRY, GFP_KERNEL);
> + else
> + ret = mtree_alloc_range(&arena->mt, &pgoff, MT_ENTRY,
> + page_cnt, 0, page_cnt_max - 1, GFP_KERNEL);
mtree_alloc_range() is lacking documentation, unfortunately, so it's
not clear to me whether max should be just `page_cnt_max - 1` as you
have or `page_cnt_max - page_cnt`. There is a "Test a single entry" in
lib/test_maple_tree.c where min == max and size == 4096 which is
expected to work, so I have a feeling that the correct max should be
up to the maximum possible beginning of range, but I might be
mistaken. Can you please double check?
> + if (ret)
> + goto out_free_pages;
> +
> + ret = bpf_map_alloc_pages(&arena->map, GFP_KERNEL | __GFP_ZERO,
> + node_id, page_cnt, pages);
> + if (ret)
> + goto out;
> +
> + uaddr32 = (u32)(arena->user_vm_start + pgoff * PAGE_SIZE);
> + /* Earlier checks make sure that uaddr32 + page_cnt * PAGE_SIZE will not overflow 32-bit */
we checked that `uaddr32 + page_cnt * PAGE_SIZE - 1` won't overflow,
full page_cnt * PAGE_SIZE can actually overflow, so comment is a bit
imprecise. But it's not really clear why it matters here, tbh.
kern_vm_start + uaddr32 + page_cnt * PAGE_SIZE can actually update
upper 32-bits of the kernel-side memory address, is that a problem?
> + ret = vm_area_map_pages(arena->kern_vm, kern_vm_start + uaddr32,
> + kern_vm_start + uaddr32 + page_cnt * PAGE_SIZE, pages);
> + if (ret) {
> + for (i = 0; i < page_cnt; i++)
> + __free_page(pages[i]);
> + goto out;
> + }
> + kvfree(pages);
> + return clear_lo32(arena->user_vm_start) + uaddr32;
> +out:
> + mtree_erase(&arena->mt, pgoff);
> +out_free_pages:
> + kvfree(pages);
> + return 0;
> +}
> +
> +/*
> + * If page is present in vmalloc area, unmap it from vmalloc area,
> + * unmap it from all user space vma-s,
> + * and free it.
> + */
> +static void zap_pages(struct bpf_arena *arena, long uaddr, long page_cnt)
> +{
> + struct vma_list *vml;
> +
> + list_for_each_entry(vml, &arena->vma_list, head)
> + zap_page_range_single(vml->vma, uaddr,
> + PAGE_SIZE * page_cnt, NULL);
> +}
> +
> +static void arena_free_pages(struct bpf_arena *arena, long uaddr, long page_cnt)
> +{
> + u64 full_uaddr, uaddr_end;
> + long kaddr, pgoff, i;
> + struct page *page;
> +
> + /* only aligned lower 32-bit are relevant */
> + uaddr = (u32)uaddr;
> + uaddr &= PAGE_MASK;
> + full_uaddr = clear_lo32(arena->user_vm_start) + uaddr;
> + uaddr_end = min(arena->user_vm_end, full_uaddr + (page_cnt << PAGE_SHIFT));
> + if (full_uaddr >= uaddr_end)
> + return;
> +
> + page_cnt = (uaddr_end - full_uaddr) >> PAGE_SHIFT;
> +
> + guard(mutex)(&arena->lock);
> +
> + pgoff = compute_pgoff(arena, uaddr);
> + /* clear range */
> + mtree_store_range(&arena->mt, pgoff, pgoff + page_cnt - 1, NULL, GFP_KERNEL);
> +
> + if (page_cnt > 1)
> + /* bulk zap if multiple pages being freed */
> + zap_pages(arena, full_uaddr, page_cnt);
> +
> + kaddr = bpf_arena_get_kern_vm_start(arena) + uaddr;
> + for (i = 0; i < page_cnt; i++, kaddr += PAGE_SIZE, full_uaddr += PAGE_SIZE) {
> + page = vmalloc_to_page((void *)kaddr);
> + if (!page)
> + continue;
> + if (page_cnt == 1 && page_mapped(page)) /* mapped by some user process */
> + zap_pages(arena, full_uaddr, 1);
The way you split these zap_pages for page_cnt == 1 and page_cnt > 1
is quite confusing. Why can't you just unconditionally zap_pages()
regardless of page_cnt before this loop? And why for page_cnt == 1 we
have `page_mapped(page)` check, but it's ok to not check this for
page_cnt>1 case?
This asymmetric handling is confusing and suggests something more is
going on here. Or am I overthinking it?
> + vm_area_unmap_pages(arena->kern_vm, kaddr, kaddr + PAGE_SIZE);
> + __free_page(page);
can something else in the kernel somehow get a refcnt on this page?
I.e., is it ok to unconditionally free page here instead of some sort
of put_page() instead?
> + }
> +}
> +
> +__bpf_kfunc_start_defs();
> +
[...]
* Re: [PATCH v3 bpf-next 01/14] bpf: Introduce bpf_arena.
From: Alexei Starovoitov @ 2024-03-11 22:41 UTC
To: Andrii Nakryiko
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Linus Torvalds,
Barret Rhoden, Johannes Weiner, Andrew Morton, Uladzislau Rezki,
Christoph Hellwig, linux-mm, Kernel Team
On Mon, Mar 11, 2024 at 3:01 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Mar 7, 2024 at 5:08 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> [...]
>
> >
> > struct bpf_offload_dev;
> > @@ -2215,6 +2216,8 @@ int generic_map_delete_batch(struct bpf_map *map,
> > struct bpf_map *bpf_map_get_curr_or_next(u32 *id);
> > struct bpf_prog *bpf_prog_get_curr_or_next(u32 *id);
> >
> > +int bpf_map_alloc_pages(const struct bpf_map *map, gfp_t gfp, int nid,
>
> nit: you use more meaningful node_id in arena_alloc_pages(), here
> "nid" was a big mystery when looking at just function definition
nid is a standard name in mm/.
git grep -w nid mm/|wc -l
1064
git grep -w node_id mm/|wc -l
77
Also see slab_nid(), folio_nid().
> > + unsigned long nr_pages, struct page **page_array);
> > #ifdef CONFIG_MEMCG_KMEM
> > void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
> > int node);
>
> [...]
>
> > +u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena)
> > +{
> > + return arena ? (u64) (long) arena->kern_vm->addr + GUARD_SZ / 2 : 0;
> > +}
> > +
> > +u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena)
> > +{
> > + return arena ? arena->user_vm_start : 0;
> > +}
> > +
>
> is it anticipated that these helpers can be called with NULL? I might
> see this later in the patch set, but if not, these NULL checks would
> be best removed to not create wrong expectations.
Looks like you figured it out.
> > +static long arena_map_peek_elem(struct bpf_map *map, void *value)
> > +{
> > + return -EOPNOTSUPP;
> > +}
> > +
> > +static long arena_map_push_elem(struct bpf_map *map, void *value, u64 flags)
> > +{
> > + return -EOPNOTSUPP;
> > +}
> > +
> > +static long arena_map_pop_elem(struct bpf_map *map, void *value)
> > +{
> > + return -EOPNOTSUPP;
> > +}
> > +
> > +static long arena_map_delete_elem(struct bpf_map *map, void *value)
> > +{
> > + return -EOPNOTSUPP;
> > +}
> > +
> > +static int arena_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
> > +{
> > + return -EOPNOTSUPP;
> > +}
> > +
>
> This is a separate topic, but I'll just mention it here. It was always
> confusing to me why we don't just treat all these callbacks as
> optional and return -EOPNOTSUPP in generic map code. Unless I miss
> something subtle, we should do a round of clean ups and remove dozens
> of unnecessary single line callbacks like these throughout the entire
> BPF kernel code.
yeah. could be a generic follow up. agree.
> > +static long compute_pgoff(struct bpf_arena *arena, long uaddr)
> > +{
> > + return (u32)(uaddr - (u32)arena->user_vm_start) >> PAGE_SHIFT;
> > +}
> > +
>
> [...]
>
> > +static vm_fault_t arena_vm_fault(struct vm_fault *vmf)
> > +{
> > + struct bpf_map *map = vmf->vma->vm_file->private_data;
> > + struct bpf_arena *arena = container_of(map, struct bpf_arena, map);
> > + struct page *page;
> > + long kbase, kaddr;
> > + int ret;
> > +
> > + kbase = bpf_arena_get_kern_vm_start(arena);
> > + kaddr = kbase + (u32)(vmf->address & PAGE_MASK);
> > +
> > + guard(mutex)(&arena->lock);
> > + page = vmalloc_to_page((void *)kaddr);
> > + if (page)
> > + /* already have a page vmap-ed */
> > + goto out;
> > +
> > + if (arena->map.map_flags & BPF_F_SEGV_ON_FAULT)
> > + /* User space requested to segfault when page is not allocated by bpf prog */
> > + return VM_FAULT_SIGSEGV;
> > +
> > + ret = mtree_insert(&arena->mt, vmf->pgoff, MT_ENTRY, GFP_KERNEL);
> > + if (ret)
> > + return VM_FAULT_SIGSEGV;
> > +
> > + /* Account into memcg of the process that created bpf_arena */
> > + ret = bpf_map_alloc_pages(map, GFP_KERNEL | __GFP_ZERO, NUMA_NO_NODE, 1, &page);
>
> any specific reason to not take into account map->numa_node here?
There are several reasons for this.
1. To avoid the expectation that map->numa_node is meaningful
for actual run-time allocations.
arena_alloc_pages() takes an explicit nid, and it would conflict
if map->numa_node was also used.
2. In this case it's user space faulting in a page,
so it's best to let NUMA_NO_NODE pick the right one.
> > + if (ret) {
> > + mtree_erase(&arena->mt, vmf->pgoff);
> > + return VM_FAULT_SIGSEGV;
> > + }
> > +
> > + ret = vm_area_map_pages(arena->kern_vm, kaddr, kaddr + PAGE_SIZE, &page);
> > + if (ret) {
> > + mtree_erase(&arena->mt, vmf->pgoff);
> > + __free_page(page);
> > + return VM_FAULT_SIGSEGV;
> > + }
> > +out:
> > + page_ref_add(page, 1);
> > + vmf->page = page;
> > + return 0;
> > +}
> > +
>
> [...]
>
> > +/*
> > + * Allocate pages and vmap them into kernel vmalloc area.
> > + * Later the pages will be mmaped into user space vma.
> > + */
> > +static long arena_alloc_pages(struct bpf_arena *arena, long uaddr, long page_cnt, int node_id)
> > +{
> > + /* user_vm_end/start are fixed before bpf prog runs */
> > + long page_cnt_max = (arena->user_vm_end - arena->user_vm_start) >> PAGE_SHIFT;
> > + u64 kern_vm_start = bpf_arena_get_kern_vm_start(arena);
> > + struct page **pages;
> > + long pgoff = 0;
> > + u32 uaddr32;
> > + int ret, i;
> > +
> > + if (page_cnt > page_cnt_max)
> > + return 0;
> > +
> > + if (uaddr) {
> > + if (uaddr & ~PAGE_MASK)
> > + return 0;
> > + pgoff = compute_pgoff(arena, uaddr);
> > + if (pgoff + page_cnt > page_cnt_max)
>
> As I mentioned offline, is this guaranteed to not overflow? it's not
> obvious because at least according to all the types (longs), uaddr can
> be arbitrary, so pgoff can be quite large, etc. Might be worthwhile
> rewriting as `pgoff > page_cnt_max - page_cnt` or something, just to
> make it clear in code it has no chance of overflowing.
It cannot overflow. All three variables:
page_cnt, pgoff, page_cnt_max are within u32 range.
Look at how they're computed.
> > + /* requested address will be outside of user VMA */
> > + return 0;
> > + }
> > +
> > + /* zeroing is needed, since alloc_pages_bulk_array() only fills in non-zero entries */
> > + pages = kvcalloc(page_cnt, sizeof(struct page *), GFP_KERNEL);
> > + if (!pages)
> > + return 0;
> > +
> > + guard(mutex)(&arena->lock);
> > +
> > + if (uaddr)
> > + ret = mtree_insert_range(&arena->mt, pgoff, pgoff + page_cnt - 1,
> > + MT_ENTRY, GFP_KERNEL);
> > + else
> > + ret = mtree_alloc_range(&arena->mt, &pgoff, MT_ENTRY,
> > + page_cnt, 0, page_cnt_max - 1, GFP_KERNEL);
>
> mtree_alloc_range() is lacking documentation, unfortunately, so it's
> not clear to me whether max should be just `page_cnt_max - 1` as you
> have or `page_cnt_max - page_cnt`. There is a "Test a single entry" in
> lib/test_maple_tree.c where min == max and size == 4096 which is
> expected to work, so I have a feeling that the correct max should be
> up to the maximum possible beginning of range, but I might be
> mistaken. Can you please double check?
Not only did I double-check this, the patch 12 selftest covers
this boundary condition :)
>
> > + if (ret)
> > + goto out_free_pages;
> > +
> > + ret = bpf_map_alloc_pages(&arena->map, GFP_KERNEL | __GFP_ZERO,
> > + node_id, page_cnt, pages);
> > + if (ret)
> > + goto out;
> > +
> > + uaddr32 = (u32)(arena->user_vm_start + pgoff * PAGE_SIZE);
> > + /* Earlier checks make sure that uaddr32 + page_cnt * PAGE_SIZE will not overflow 32-bit */
>
> we checked that `uaddr32 + page_cnt * PAGE_SIZE - 1` won't overflow,
> full page_cnt * PAGE_SIZE can actually overflow, so comment is a bit
> imprecise. But it's not really clear why it matters here, tbh.
> kern_vm_start + uaddr32 + page_cnt * PAGE_SIZE can actually update
> upper 32-bits of the kernel-side memory address, is that a problem?
There is a giant thread in the v2 series between myself and Barret
that goes at length into why we decided to prohibit overflow within the
lower 32 bits.
The shortest tldr: technically we can allow lower 32-bit overflow,
but the code gets very complex.
>
> > + ret = vm_area_map_pages(arena->kern_vm, kern_vm_start + uaddr32,
> > + kern_vm_start + uaddr32 + page_cnt * PAGE_SIZE, pages);
> > + if (ret) {
> > + for (i = 0; i < page_cnt; i++)
> > + __free_page(pages[i]);
> > + goto out;
> > + }
> > + kvfree(pages);
> > + return clear_lo32(arena->user_vm_start) + uaddr32;
> > +out:
> > + mtree_erase(&arena->mt, pgoff);
> > +out_free_pages:
> > + kvfree(pages);
> > + return 0;
> > +}
> > +
> > +/*
> > + * If page is present in vmalloc area, unmap it from vmalloc area,
> > + * unmap it from all user space vma-s,
> > + * and free it.
> > + */
> > +static void zap_pages(struct bpf_arena *arena, long uaddr, long page_cnt)
> > +{
> > + struct vma_list *vml;
> > +
> > + list_for_each_entry(vml, &arena->vma_list, head)
> > + zap_page_range_single(vml->vma, uaddr,
> > + PAGE_SIZE * page_cnt, NULL);
> > +}
> > +
> > +static void arena_free_pages(struct bpf_arena *arena, long uaddr, long page_cnt)
> > +{
> > + u64 full_uaddr, uaddr_end;
> > + long kaddr, pgoff, i;
> > + struct page *page;
> > +
> > + /* only aligned lower 32-bit are relevant */
> > + uaddr = (u32)uaddr;
> > + uaddr &= PAGE_MASK;
> > + full_uaddr = clear_lo32(arena->user_vm_start) + uaddr;
> > + uaddr_end = min(arena->user_vm_end, full_uaddr + (page_cnt << PAGE_SHIFT));
> > + if (full_uaddr >= uaddr_end)
> > + return;
> > +
> > + page_cnt = (uaddr_end - full_uaddr) >> PAGE_SHIFT;
> > +
> > + guard(mutex)(&arena->lock);
> > +
> > + pgoff = compute_pgoff(arena, uaddr);
> > + /* clear range */
> > + mtree_store_range(&arena->mt, pgoff, pgoff + page_cnt - 1, NULL, GFP_KERNEL);
> > +
> > + if (page_cnt > 1)
> > + /* bulk zap if multiple pages being freed */
> > + zap_pages(arena, full_uaddr, page_cnt);
> > +
> > + kaddr = bpf_arena_get_kern_vm_start(arena) + uaddr;
> > + for (i = 0; i < page_cnt; i++, kaddr += PAGE_SIZE, full_uaddr += PAGE_SIZE) {
> > + page = vmalloc_to_page((void *)kaddr);
> > + if (!page)
> > + continue;
> > + if (page_cnt == 1 && page_mapped(page)) /* mapped by some user process */
> > + zap_pages(arena, full_uaddr, 1);
>
> The way you split these zap_pages for page_cnt == 1 and page_cnt > 1
> is quite confusing. Why can't you just unconditionally zap_pages()
> regardless of page_cnt before this loop? And why for page_cnt == 1 we
> have `page_mapped(page)` check, but it's ok to not check this for
> page_cnt>1 case?
>
> This asymmetric handling is confusing and suggests something more is
> going on here. Or am I overthinking it?
It's an important optimization for the common case of page_cnt == 1.
If the page wasn't mapped into some user vma, there is no need to call
zap_pages(), which is slow.
But when page_cnt is big, it's much faster to do the batched zap,
which is what this code does.
For page_cnt == 2 or another small number there is no good optimization
other than checking that all pages in the range are
not page_mapped() and omitting zap_pages().
I don't think such an optimization is worth doing at this point,
since page_cnt == 1 is likely the most common case.
If that changes, it can be optimized later.
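If somebody gets to it, that check could look roughly like this (untested
sketch; range_has_user_mappings() is a hypothetical helper, not part of
this patch):

	static bool range_has_user_mappings(long kaddr, long page_cnt)
	{
		struct page *page;
		long i;

		/* walk the kernel-side mapping one page at a time */
		for (i = 0; i < page_cnt; i++, kaddr += PAGE_SIZE) {
			page = vmalloc_to_page((void *)kaddr);
			if (page && page_mapped(page))
				return true;
		}
		return false;
	}

and arena_free_pages() would do the bulk zap_pages() only when it
returns true.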
> > + vm_area_unmap_pages(arena->kern_vm, kaddr, kaddr + PAGE_SIZE);
> > + __free_page(page);
>
> can something else in the kernel somehow get a refcnt on this page?
> I.e., is it ok to unconditionally free page here instead of some sort
> of put_page() instead?
hmm. __free_page() does the refcounting. It's not an unconditional free.
put_page() is for the case when a folio is present. Here all of them
are privately managed pages. All are single-page individual allocations,
and the normal refcount scheme applies, which is what __free_page() does.
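From my reading of mm/page_alloc.c (paraphrased, not verbatim), for these
order-0 pages it boils down to roughly:

	/* __free_page(page) -> __free_pages(page, 0): */
	if (put_page_testzero(page))	/* drop our ref; free only if it was the last */
		free_the_page(page, 0);

so a page that something else still holds a reference to won't be freed here.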
Thanks for the review.
I believe I answered all the questions. Looks like the only
followup is to generalize all the 'return -EOPNOTSUPP' stubs
across all map types. I'll add it to my todo, unless somebody
has more free cycles.
* Re: [PATCH v3 bpf-next 01/14] bpf: Introduce bpf_arena.
2024-03-11 22:41 ` Alexei Starovoitov
@ 2024-03-11 22:59 ` Andrii Nakryiko
2024-03-12 0:47 ` Alexei Starovoitov
0 siblings, 1 reply; 26+ messages in thread
From: Andrii Nakryiko @ 2024-03-11 22:59 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Linus Torvalds,
Barret Rhoden, Johannes Weiner, Andrew Morton, Uladzislau Rezki,
Christoph Hellwig, linux-mm, Kernel Team
On Mon, Mar 11, 2024 at 3:41 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Mar 11, 2024 at 3:01 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Mar 7, 2024 at 5:08 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > From: Alexei Starovoitov <ast@kernel.org>
> > >
> > > Introduce bpf_arena, which is a sparse shared memory region between the bpf
> > > program and user space.
> > >
> > > Use cases:
> > > 1. User space mmap-s bpf_arena and uses it as a traditional mmap-ed
> > > anonymous region, like memcached or any key/value storage. The bpf
> > > program implements an in-kernel accelerator. XDP prog can search for
> > > a key in bpf_arena and return a value without going to user space.
> > > 2. The bpf program builds arbitrary data structures in bpf_arena (hash
> > > tables, rb-trees, sparse arrays), while user space consumes it.
> > > 3. bpf_arena is a "heap" of memory from the bpf program's point of view.
> > > The user space may mmap it, but bpf program will not convert pointers
> > > to user base at run-time to improve bpf program speed.
> > >
> > > Initially, the kernel vm_area and user vma are not populated. User space
> > > can fault in pages within the range. While servicing a page fault,
> > > bpf_arena logic will insert a new page into the kernel and user vmas. The
> > > bpf program can allocate pages from that region via
> > > bpf_arena_alloc_pages(). This kernel function will insert pages into the
> > > kernel vm_area. The subsequent fault-in from user space will populate that
> > > page into the user vma. The BPF_F_SEGV_ON_FAULT flag at arena creation time
> > > can be used to prevent fault-in from user space. In such a case, if a page
> > > is not allocated by the bpf program and not present in the kernel vm_area,
> > > the user process will segfault. This is useful for use cases 2 and 3 above.
> > >
> > > bpf_arena_alloc_pages() is similar to user space mmap(). It allocates pages
> > > either at a specific address within the arena or allocates a range with the
> > > maple tree. bpf_arena_free_pages() is analogous to munmap(), which frees
> > > pages and removes the range from the kernel vm_area and from user process
> > > vmas.
> > >
> > > bpf_arena can be used as a bpf program "heap" of up to 4GB. The speed of
> > > bpf program is more important than ease of sharing with user space. This is
> > > use case 3. In such a case, the BPF_F_NO_USER_CONV flag is recommended.
> > > It will tell the verifier to treat the rX = bpf_arena_cast_user(rY)
> > > instruction as a 32-bit move wX = wY, which will improve bpf prog
> > > performance. Otherwise, bpf_arena_cast_user is translated by JIT to
> > > conditionally add the upper 32 bits of user vm_start (if the pointer is not
> > > NULL) to arena pointers before they are stored into memory. This way, user
> > > space sees them as valid 64-bit pointers.
> > >
> > > Diff https://github.com/llvm/llvm-project/pull/84410 enables LLVM BPF
> > > backend generate the bpf_addr_space_cast() instruction to cast pointers
> > > between address_space(1) which is reserved for bpf_arena pointers and
> > > default address space zero. All arena pointers in a bpf program written in
> > > C language are tagged as __attribute__((address_space(1))). Hence, clang
> > > provides helpful diagnostics when pointers cross address space. Libbpf and
> > > the kernel support only address_space == 1. All other address space
> > > identifiers are reserved.
> > >
> > > rX = bpf_addr_space_cast(rY, /* dst_as */ 1, /* src_as */ 0) tells the
> > > verifier that rX->type = PTR_TO_ARENA. Any further operations on
> > > PTR_TO_ARENA register have to be in the 32-bit domain. The verifier will
> > > mark load/store through PTR_TO_ARENA with PROBE_MEM32. JIT will generate
> > > them as kern_vm_start + 32bit_addr memory accesses. The behavior is similar
> > > to copy_from_kernel_nofault() except that no address checks are necessary.
> > > The address is guaranteed to be in the 4GB range. If the page is not
> > > present, the destination register is zeroed on read, and the operation is
> > > ignored on write.
> > >
> > > rX = bpf_addr_space_cast(rY, 0, 1) tells the verifier that rX->type =
> > > unknown scalar. If arena->map_flags has BPF_F_NO_USER_CONV set, then the
> > > verifier converts such cast instructions to mov32. Otherwise, JIT will emit
> > > native code equivalent to:
> > > rX = (u32)rY;
> > > if (rY)
> > > rX |= clear_lo32_bits(arena->user_vm_start); /* replace hi32 bits in rX */
> > >
> > > After such conversion, the pointer becomes a valid user pointer within
> > > bpf_arena range. The user process can access data structures created in
> > > bpf_arena without any additional computations. For example, a linked list
> > > built by a bpf program can be walked natively by user space.
> > >
> > > Reviewed-by: Barret Rhoden <brho@google.com>
> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > ---
> > > include/linux/bpf.h | 7 +-
> > > include/linux/bpf_types.h | 1 +
> > > include/uapi/linux/bpf.h | 10 +
> > > kernel/bpf/Makefile | 3 +
> > > kernel/bpf/arena.c | 558 +++++++++++++++++++++++++++++++++
> > > kernel/bpf/core.c | 11 +
> > > kernel/bpf/syscall.c | 36 +++
> > > kernel/bpf/verifier.c | 1 +
> > > tools/include/uapi/linux/bpf.h | 10 +
> > > 9 files changed, 635 insertions(+), 2 deletions(-)
> > > create mode 100644 kernel/bpf/arena.c
> > >
> >
> > [...]
> >
> > >
> > > struct bpf_offload_dev;
> > > @@ -2215,6 +2216,8 @@ int generic_map_delete_batch(struct bpf_map *map,
> > > struct bpf_map *bpf_map_get_curr_or_next(u32 *id);
> > > struct bpf_prog *bpf_prog_get_curr_or_next(u32 *id);
> > >
> > > +int bpf_map_alloc_pages(const struct bpf_map *map, gfp_t gfp, int nid,
> >
> > nit: you use more meaningful node_id in arena_alloc_pages(), here
> > "nid" was a big mystery when looking at just function definition
>
> nid is a standard name in mm/.
> git grep -w nid mm/|wc -l
> 1064
> git grep -w node_id mm/|wc -l
> 77
>
> Also see slab_nid(), folio_nid().
>
ok
> > > + unsigned long nr_pages, struct page **page_array);
> > > #ifdef CONFIG_MEMCG_KMEM
> > > void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
> > > int node);
> >
> > [...]
> >
> > > +u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena)
> > > +{
> > > + return arena ? (u64) (long) arena->kern_vm->addr + GUARD_SZ / 2 : 0;
> > > +}
> > > +
> > > +u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena)
> > > +{
> > > + return arena ? arena->user_vm_start : 0;
> > > +}
> > > +
> >
> > is it anticipated that these helpers can be called with NULL? I might
> > see this later in the patch set, but if not, these NULL checks would
> > be best removed to not create wrong expectations.
>
> Looks like you figured it out.
yep
>
> > > +static long arena_map_peek_elem(struct bpf_map *map, void *value)
> > > +{
> > > + return -EOPNOTSUPP;
> > > +}
> > > +
> > > +static long arena_map_push_elem(struct bpf_map *map, void *value, u64 flags)
> > > +{
> > > + return -EOPNOTSUPP;
> > > +}
> > > +
> > > +static long arena_map_pop_elem(struct bpf_map *map, void *value)
> > > +{
> > > + return -EOPNOTSUPP;
> > > +}
> > > +
> > > +static long arena_map_delete_elem(struct bpf_map *map, void *value)
> > > +{
> > > + return -EOPNOTSUPP;
> > > +}
> > > +
> > > +static int arena_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
> > > +{
> > > + return -EOPNOTSUPP;
> > > +}
> > > +
> >
> > This is a separate topic, but I'll just mention it here. It was always
> > confusing to me why we don't just treat all these callbacks as
> > optional and return -EOPNOTSUPP in generic map code. Unless I miss
> > something subtle, we should do a round of clean ups and remove dozens
> > of unnecessary single line callbacks like these throughout the entire
> > BPF kernel code.
>
> yeah. could be a generic follow up. agree.
>
> > > +static long compute_pgoff(struct bpf_arena *arena, long uaddr)
> > > +{
> > > + return (u32)(uaddr - (u32)arena->user_vm_start) >> PAGE_SHIFT;
> > > +}
> > > +
> >
> > [...]
> >
> > > +static vm_fault_t arena_vm_fault(struct vm_fault *vmf)
> > > +{
> > > + struct bpf_map *map = vmf->vma->vm_file->private_data;
> > > + struct bpf_arena *arena = container_of(map, struct bpf_arena, map);
> > > + struct page *page;
> > > + long kbase, kaddr;
> > > + int ret;
> > > +
> > > + kbase = bpf_arena_get_kern_vm_start(arena);
> > > + kaddr = kbase + (u32)(vmf->address & PAGE_MASK);
> > > +
> > > + guard(mutex)(&arena->lock);
> > > + page = vmalloc_to_page((void *)kaddr);
> > > + if (page)
> > > + /* already have a page vmap-ed */
> > > + goto out;
> > > +
> > > + if (arena->map.map_flags & BPF_F_SEGV_ON_FAULT)
> > > + /* User space requested to segfault when page is not allocated by bpf prog */
> > > + return VM_FAULT_SIGSEGV;
> > > +
> > > + ret = mtree_insert(&arena->mt, vmf->pgoff, MT_ENTRY, GFP_KERNEL);
> > > + if (ret)
> > > + return VM_FAULT_SIGSEGV;
> > > +
> > > + /* Account into memcg of the process that created bpf_arena */
> > > + ret = bpf_map_alloc_pages(map, GFP_KERNEL | __GFP_ZERO, NUMA_NO_NODE, 1, &page);
> >
> > any specific reason to not take into account map->numa_node here?
>
> There are several reasons for this.
> 1. To avoid the expectation that map->numa_node is meaningful
> for actual run-time allocations, since arena_alloc_pages() takes an
> explicit nid and it would conflict if map->numa_node was also used.
> 2. In this case it's user space faulting in a page,
> so it's best to let NUMA_NO_NODE pick the right one.
>
ok
>
> > > + if (ret) {
> > > + mtree_erase(&arena->mt, vmf->pgoff);
> > > + return VM_FAULT_SIGSEGV;
> > > + }
> > > +
> > > + ret = vm_area_map_pages(arena->kern_vm, kaddr, kaddr + PAGE_SIZE, &page);
> > > + if (ret) {
> > > + mtree_erase(&arena->mt, vmf->pgoff);
> > > + __free_page(page);
> > > + return VM_FAULT_SIGSEGV;
> > > + }
> > > +out:
> > > + page_ref_add(page, 1);
> > > + vmf->page = page;
> > > + return 0;
> > > +}
> > > +
> >
> > [...]
> >
> > > +/*
> > > + * Allocate pages and vmap them into kernel vmalloc area.
> > > + * Later the pages will be mmaped into user space vma.
> > > + */
> > > +static long arena_alloc_pages(struct bpf_arena *arena, long uaddr, long page_cnt, int node_id)
> > > +{
> > > + /* user_vm_end/start are fixed before bpf prog runs */
> > > + long page_cnt_max = (arena->user_vm_end - arena->user_vm_start) >> PAGE_SHIFT;
> > > + u64 kern_vm_start = bpf_arena_get_kern_vm_start(arena);
> > > + struct page **pages;
> > > + long pgoff = 0;
> > > + u32 uaddr32;
> > > + int ret, i;
> > > +
> > > + if (page_cnt > page_cnt_max)
> > > + return 0;
> > > +
> > > + if (uaddr) {
> > > + if (uaddr & ~PAGE_MASK)
> > > + return 0;
> > > + pgoff = compute_pgoff(arena, uaddr);
> > > + if (pgoff + page_cnt > page_cnt_max)
> >
> > As I mentioned offline, is this guaranteed to not overflow? it's not
> > obvious because at least according to all the types (longs), uaddr can
> > be arbitrary, so pgoff can be quite large, etc. Might be worthwhile
> > rewriting as `pgoff > page_cnt_max - page_cnt` or something, just to
> > make it clear in code it has no chance of overflowing.
>
> It cannot overflow. All three variables
> (page_cnt, pgoff, page_cnt_max) are within u32 range:
> the arena is capped at 4GB, so each of them is at most 4GB >> PAGE_SHIFT,
> and their sum fits in a long with plenty of room.
> Look at how they're computed.
my point was that we can make it obvious in the code without having to
trace the origins of those values outside of arena_alloc_pages(), but it's
up to you
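e.g. given the earlier `page_cnt > page_cnt_max` check, something like
this is the same test, just obviously overflow-free:

	if (pgoff > page_cnt_max - page_cnt)
		/* requested address will be outside of user VMA */
		return 0;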
>
> > > + /* requested address will be outside of user VMA */
> > > + return 0;
> > > + }
> > > +
> > > + /* zeroing is needed, since alloc_pages_bulk_array() only fills in non-zero entries */
> > > + pages = kvcalloc(page_cnt, sizeof(struct page *), GFP_KERNEL);
> > > + if (!pages)
> > > + return 0;
> > > +
> > > + guard(mutex)(&arena->lock);
> > > +
> > > + if (uaddr)
> > > + ret = mtree_insert_range(&arena->mt, pgoff, pgoff + page_cnt - 1,
> > > + MT_ENTRY, GFP_KERNEL);
> > > + else
> > > + ret = mtree_alloc_range(&arena->mt, &pgoff, MT_ENTRY,
> > > + page_cnt, 0, page_cnt_max - 1, GFP_KERNEL);
> >
> > mtree_alloc_range() is lacking documentation, unfortunately, so it's
> > not clear to me whether max should be just `page_cnt_max - 1` as you
> > have or `page_cnt_max - page_cnt`. There is a "Test a single entry" in
> > lib/test_maple_tree.c where min == max and size == 4096 which is
> > expected to work, so I have a feeling that the correct max should be
> > up to the maximum possible beginning of range, but I might be
> > mistaken. Can you please double check?
>
> Not only did I double check this, the selftest in patch 12 covers
> this boundary condition :)
>
ok, cool. I wish this maple tree API was better documented.
> >
> > > + if (ret)
> > > + goto out_free_pages;
> > > +
> > > + ret = bpf_map_alloc_pages(&arena->map, GFP_KERNEL | __GFP_ZERO,
> > > + node_id, page_cnt, pages);
> > > + if (ret)
> > > + goto out;
> > > +
> > > + uaddr32 = (u32)(arena->user_vm_start + pgoff * PAGE_SIZE);
> > > + /* Earlier checks make sure that uaddr32 + page_cnt * PAGE_SIZE will not overflow 32-bit */
> >
> > we checked that `uaddr32 + page_cnt * PAGE_SIZE - 1` won't overflow,
> > full page_cnt * PAGE_SIZE can actually overflow, so comment is a bit
> > imprecise. But it's not really clear why it matters here, tbh.
> > kern_vm_start + uaddr32 + page_cnt * PAGE_SIZE can actually update
> > upper 32-bits of the kernel-side memory address, is that a problem?
>
> There is a giant thread in the v2 series between Barret and myself
> that goes at length into why we decided to prohibit overflow within the lower 32 bits.
> The shortest tldr: technically we could allow lower 32-bit overflow,
> but the code gets very complex.
no, I get that, my point was a bit different and purely pedantic. It's
only from the user-space address POV that it doesn't overflow 32 bits.
It seems like it can overflow when we translate it into a kernel address
by adding kern_vm_start (`kern_vm = get_vm_area(KERN_VM_SZ,
VM_SPARSE | VM_USERMAP)` doesn't guarantee 4GB alignment, IIUC). But I
don't see why a kernel-side overflow matters (yet the comment sits next
to the code that does the kernel-side range mapping, which is why I
commented; the placement of the comment is what makes it a bit more
confusing).
But I was pointing out that if the user-requested area is exactly 4GB
and user_vm_start is aligned on a 4GB boundary, then user_vm_start +
4GB technically increments the upper 32 bits of user_vm_start.
Which I don't think matters, because it's the exclusive end of the range.
So it's just that the comment is "off by one", because we checked that the
last byte's address (which is user_vm_start + 4GB - 1) doesn't
overflow into the upper 32 bits.
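Concretely, with made-up numbers:

	user_vm_start = 0x200000000;			/* 4GB-aligned */
	user_vm_end   = user_vm_start + (1ULL << 32);	/* exactly 4GB arena */
	/* last byte:     0x2ffffffff -> upper 32 bits still 0x2 */
	/* exclusive end: 0x300000000 -> upper 32 bits already 0x3 */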
>
> >
> > > + ret = vm_area_map_pages(arena->kern_vm, kern_vm_start + uaddr32,
> > > + kern_vm_start + uaddr32 + page_cnt * PAGE_SIZE, pages);
> > > + if (ret) {
> > > + for (i = 0; i < page_cnt; i++)
> > > + __free_page(pages[i]);
> > > + goto out;
> > > + }
> > > + kvfree(pages);
> > > + return clear_lo32(arena->user_vm_start) + uaddr32;
> > > +out:
> > > + mtree_erase(&arena->mt, pgoff);
> > > +out_free_pages:
> > > + kvfree(pages);
> > > + return 0;
> > > +}
> > > +
> > > +/*
> > > + * If page is present in vmalloc area, unmap it from vmalloc area,
> > > + * unmap it from all user space vma-s,
> > > + * and free it.
> > > + */
> > > +static void zap_pages(struct bpf_arena *arena, long uaddr, long page_cnt)
> > > +{
> > > + struct vma_list *vml;
> > > +
> > > + list_for_each_entry(vml, &arena->vma_list, head)
> > > + zap_page_range_single(vml->vma, uaddr,
> > > + PAGE_SIZE * page_cnt, NULL);
> > > +}
> > > +
> > > +static void arena_free_pages(struct bpf_arena *arena, long uaddr, long page_cnt)
> > > +{
> > > + u64 full_uaddr, uaddr_end;
> > > + long kaddr, pgoff, i;
> > > + struct page *page;
> > > +
> > > + /* only aligned lower 32-bit are relevant */
> > > + uaddr = (u32)uaddr;
> > > + uaddr &= PAGE_MASK;
> > > + full_uaddr = clear_lo32(arena->user_vm_start) + uaddr;
> > > + uaddr_end = min(arena->user_vm_end, full_uaddr + (page_cnt << PAGE_SHIFT));
> > > + if (full_uaddr >= uaddr_end)
> > > + return;
> > > +
> > > + page_cnt = (uaddr_end - full_uaddr) >> PAGE_SHIFT;
> > > +
> > > + guard(mutex)(&arena->lock);
> > > +
> > > + pgoff = compute_pgoff(arena, uaddr);
> > > + /* clear range */
> > > + mtree_store_range(&arena->mt, pgoff, pgoff + page_cnt - 1, NULL, GFP_KERNEL);
> > > +
> > > + if (page_cnt > 1)
> > > + /* bulk zap if multiple pages being freed */
> > > + zap_pages(arena, full_uaddr, page_cnt);
> > > +
> > > + kaddr = bpf_arena_get_kern_vm_start(arena) + uaddr;
> > > + for (i = 0; i < page_cnt; i++, kaddr += PAGE_SIZE, full_uaddr += PAGE_SIZE) {
> > > + page = vmalloc_to_page((void *)kaddr);
> > > + if (!page)
> > > + continue;
> > > + if (page_cnt == 1 && page_mapped(page)) /* mapped by some user process */
> > > + zap_pages(arena, full_uaddr, 1);
> >
> > The way you split these zap_pages for page_cnt == 1 and page_cnt > 1
> > is quite confusing. Why can't you just unconditionally zap_pages()
> > regardless of page_cnt before this loop? And why for page_cnt == 1 we
> > have `page_mapped(page)` check, but it's ok to not check this for
> > page_cnt>1 case?
> >
> > This asymmetric handling is confusing and suggests something more is
> > going on here. Or am I overthinking it?
>
> It's an important optimization for the common case of page_cnt == 1.
> If the page wasn't mapped into some user vma, there is no need to call
> zap_pages(), which is slow.
> But when page_cnt is big, it's much faster to do the batched zap,
> which is what this code does.
> For page_cnt == 2 or another small number there is no good optimization
> other than checking that all pages in the range are
> not page_mapped() and omitting zap_pages().
> I don't think such an optimization is worth doing at this point,
> since page_cnt == 1 is likely the most common case.
> If that changes, it can be optimized later.
yep, makes sense, and a small comment stating that would be useful, IMO :)
>
> > > + vm_area_unmap_pages(arena->kern_vm, kaddr, kaddr + PAGE_SIZE);
> > > + __free_page(page);
> >
> > can something else in the kernel somehow get a refcnt on this page?
> > I.e., is it ok to unconditionally free page here instead of some sort
> > of put_page() instead?
>
> hmm. __free_page() does the refcounting. It's not an unconditional free.
> put_page() is for the case when a folio is present. Here all of them
> are privately managed pages. All are single-page individual allocations,
> and the normal refcount scheme applies, which is what __free_page() does.
>
ah, ok, I'm not familiar with mm APIs, hence my confusion, thanks.
> Thanks for the review.
> I believe I answered all the questions. Looks like the only
yep, I've applied patches already, -EOPNOTSUPP is a completely
independent potential clean up
> followup is to generalize all the 'return -EOPNOTSUPP' stubs
> across all map types. I'll add it to my todo, unless somebody
> has more free cycles.
* Re: [PATCH v3 bpf-next 01/14] bpf: Introduce bpf_arena.
2024-03-11 22:59 ` Andrii Nakryiko
@ 2024-03-12 0:47 ` Alexei Starovoitov
0 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-12 0:47 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Linus Torvalds,
Barret Rhoden, Johannes Weiner, Andrew Morton, Uladzislau Rezki,
Christoph Hellwig, linux-mm, Kernel Team
On Mon, Mar 11, 2024 at 3:59 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
>
> no, I get that, my point was a bit different and purely pedantic. It's
> only from the user-space address POV that it doesn't overflow 32 bits.
>
> It seems like it can overflow when we translate it into a kernel address
> by adding kern_vm_start (`kern_vm = get_vm_area(KERN_VM_SZ,
> VM_SPARSE | VM_USERMAP)` doesn't guarantee 4GB alignment, IIUC). But I
> don't see why a kernel-side overflow matters (yet the comment sits next
> to the code that does the kernel-side range mapping, which is why I
> commented; the placement of the comment is what makes it a bit more
> confusing).
Got it. Ok. Will rephrase the comment.
> But I was pointing out that if the user-requested area is exactly 4GB
> and user_vm_start is aligned on a 4GB boundary, then user_vm_start +
> 4GB technically increments the upper 32 bits of user_vm_start.
> Which I don't think matters, because it's the exclusive end of the range.
yes. should be fine. I didn't add a selftest for 4GB, because
not every CI has runners with this much memory.
I guess the selftest can try to allocate and skip the test
on ENOMEM instead of failing it.
Will add it in the follow up.
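Something like this in the test runner, perhaps (a sketch; test__skip() is
the existing selftest helper, the other names are made up):

	/* the bpf prog attempts to allocate 4GB worth of arena pages */
	err = run_arena_huge_alloc(skel);
	if (err == -ENOMEM) {
		test__skip();	/* small runner, not a failure */
		return;
	}
	ASSERT_OK(err, "arena_huge_alloc");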
> > > The way you split these zap_pages for page_cnt == 1 and page_cnt > 1
> > > is quite confusing. Why can't you just unconditionally zap_pages()
> > > regardless of page_cnt before this loop? And why for page_cnt == 1 we
> > > have `page_mapped(page)` check, but it's ok to not check this for
> > > page_cnt>1 case?
> > >
> > > This asymmetric handling is confusing and suggests something more is
> > > going on here. Or am I overthinking it?
> >
> > It's an important optimization for the common case of page_cnt == 1.
> > If the page wasn't mapped into some user vma, there is no need to call
> > zap_pages(), which is slow.
> > But when page_cnt is big, it's much faster to do the batched zap,
> > which is what this code does.
> > For page_cnt == 2 or another small number there is no good optimization
> > other than checking that all pages in the range are
> > not page_mapped() and omitting zap_pages().
> > I don't think such an optimization is worth doing at this point,
> > since page_cnt == 1 is likely the most common case.
> > If that changes, it can be optimized later.
>
> yep, makes sense, and a small comment stating that would be useful, IMO :)
Ok. Will add that too.
* [PATCH v3 bpf-next 02/14] bpf: Disasm support for addr_space_cast instruction.
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
2024-03-08 1:07 ` [PATCH v3 bpf-next 01/14] bpf: Introduce bpf_arena Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-08 1:08 ` [PATCH v3 bpf-next 03/14] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions Alexei Starovoitov
` (13 subsequent siblings)
15 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
LLVM generates an rX = addr_space_cast(rY, dst_addr_space, src_addr_space)
instruction when pointers in a non-zero address space are used by the bpf
program. Recognize this insn in the uapi and in the bpf disassembler.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
include/uapi/linux/bpf.h | 4 ++++
kernel/bpf/disasm.c | 10 ++++++++++
tools/include/uapi/linux/bpf.h | 4 ++++
3 files changed, 18 insertions(+)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index e30d943db8a4..3c42b9f1bada 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1339,6 +1339,10 @@ enum {
*/
#define BPF_PSEUDO_KFUNC_CALL 2
+enum bpf_addr_space_cast {
+ BPF_ADDR_SPACE_CAST = 1,
+};
+
/* flags for BPF_MAP_UPDATE_ELEM command */
enum {
BPF_ANY = 0, /* create new element or update existing */
diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
index 82b2dbdd048f..bd2e2dd04740 100644
--- a/kernel/bpf/disasm.c
+++ b/kernel/bpf/disasm.c
@@ -166,6 +166,12 @@ static bool is_movsx(const struct bpf_insn *insn)
(insn->off == 8 || insn->off == 16 || insn->off == 32);
}
+static bool is_addr_space_cast(const struct bpf_insn *insn)
+{
+ return insn->code == (BPF_ALU64 | BPF_MOV | BPF_X) &&
+ insn->off == BPF_ADDR_SPACE_CAST;
+}
+
void print_bpf_insn(const struct bpf_insn_cbs *cbs,
const struct bpf_insn *insn,
bool allow_ptr_leaks)
@@ -184,6 +190,10 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
insn->code, class == BPF_ALU ? 'w' : 'r',
insn->dst_reg, class == BPF_ALU ? 'w' : 'r',
insn->dst_reg);
+ } else if (is_addr_space_cast(insn)) {
+ verbose(cbs->private_data, "(%02x) r%d = addr_space_cast(r%d, %d, %d)\n",
+ insn->code, insn->dst_reg,
+ insn->src_reg, ((u32)insn->imm) >> 16, (u16)insn->imm);
} else if (BPF_SRC(insn->code) == BPF_X) {
verbose(cbs->private_data, "(%02x) %c%d %s %s%c%d\n",
insn->code, class == BPF_ALU ? 'w' : 'r',
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index e30d943db8a4..3c42b9f1bada 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1339,6 +1339,10 @@ enum {
*/
#define BPF_PSEUDO_KFUNC_CALL 2
+enum bpf_addr_space_cast {
+ BPF_ADDR_SPACE_CAST = 1,
+};
+
/* flags for BPF_MAP_UPDATE_ELEM command */
enum {
BPF_ANY = 0, /* create new element or update existing */
--
2.43.0
* [PATCH v3 bpf-next 03/14] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions.
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
2024-03-08 1:07 ` [PATCH v3 bpf-next 01/14] bpf: Introduce bpf_arena Alexei Starovoitov
2024-03-08 1:08 ` [PATCH v3 bpf-next 02/14] bpf: Disasm support for addr_space_cast instruction Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-11 22:05 ` Andrii Nakryiko
2024-03-08 1:08 ` [PATCH v3 bpf-next 04/14] bpf: Add x86-64 JIT support for bpf_addr_space_cast instruction Alexei Starovoitov
` (12 subsequent siblings)
15 siblings, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW] instructions.
They are similar to PROBE_MEM instructions with the following differences:
- PROBE_MEM has to check that the address is in the kernel range with
a src_reg + insn->off >= TASK_SIZE_MAX + PAGE_SIZE check
- PROBE_MEM doesn't support store
- PROBE_MEM32 relies on the verifier to clear the upper 32 bits in the register
- PROBE_MEM32 adds the 64-bit kern_vm_start address (which is stored in %r12 in the prologue)
Due to bpf_arena construction, such a %r12 + %reg + off16 access is guaranteed
to be within the arena virtual range, so no address check is needed at run-time.
- PROBE_MEM32 allows STX and ST. If they fault, the store is a nop.
When LDX faults, the destination register is zeroed.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
arch/x86/net/bpf_jit_comp.c | 191 +++++++++++++++++++++++++++++++++++-
include/linux/bpf.h | 1 +
include/linux/filter.h | 3 +
3 files changed, 194 insertions(+), 1 deletion(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index e1390d1e331b..38705a1abe62 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -113,6 +113,7 @@ static int bpf_size_to_x86_bytes(int bpf_size)
/* Pick a register outside of BPF range for JIT internal work */
#define AUX_REG (MAX_BPF_JIT_REG + 1)
#define X86_REG_R9 (MAX_BPF_JIT_REG + 2)
+#define X86_REG_R12 (MAX_BPF_JIT_REG + 3)
/*
* The following table maps BPF registers to x86-64 registers.
@@ -139,6 +140,7 @@ static const int reg2hex[] = {
[BPF_REG_AX] = 2, /* R10 temp register */
[AUX_REG] = 3, /* R11 temp register */
[X86_REG_R9] = 1, /* R9 register, 6th function argument */
+ [X86_REG_R12] = 4, /* R12 callee saved */
};
static const int reg2pt_regs[] = {
@@ -167,6 +169,7 @@ static bool is_ereg(u32 reg)
BIT(BPF_REG_8) |
BIT(BPF_REG_9) |
BIT(X86_REG_R9) |
+ BIT(X86_REG_R12) |
BIT(BPF_REG_AX));
}
@@ -205,6 +208,17 @@ static u8 add_2mod(u8 byte, u32 r1, u32 r2)
return byte;
}
+static u8 add_3mod(u8 byte, u32 r1, u32 r2, u32 index)
+{
+ if (is_ereg(r1))
+ byte |= 1;
+ if (is_ereg(index))
+ byte |= 2;
+ if (is_ereg(r2))
+ byte |= 4;
+ return byte;
+}
+
/* Encode 'dst_reg' register into x86-64 opcode 'byte' */
static u8 add_1reg(u8 byte, u32 dst_reg)
{
@@ -645,6 +659,8 @@ static void emit_bpf_tail_call_indirect(struct bpf_prog *bpf_prog,
pop_r12(&prog);
} else {
pop_callee_regs(&prog, callee_regs_used);
+ if (bpf_arena_get_kern_vm_start(bpf_prog->aux->arena))
+ pop_r12(&prog);
}
EMIT1(0x58); /* pop rax */
@@ -704,6 +720,8 @@ static void emit_bpf_tail_call_direct(struct bpf_prog *bpf_prog,
pop_r12(&prog);
} else {
pop_callee_regs(&prog, callee_regs_used);
+ if (bpf_arena_get_kern_vm_start(bpf_prog->aux->arena))
+ pop_r12(&prog);
}
EMIT1(0x58); /* pop rax */
@@ -887,6 +905,18 @@ static void emit_insn_suffix(u8 **pprog, u32 ptr_reg, u32 val_reg, int off)
*pprog = prog;
}
+static void emit_insn_suffix_SIB(u8 **pprog, u32 ptr_reg, u32 val_reg, u32 index_reg, int off)
+{
+ u8 *prog = *pprog;
+
+ if (is_imm8(off)) {
+ EMIT3(add_2reg(0x44, BPF_REG_0, val_reg), add_2reg(0, ptr_reg, index_reg) /* SIB */, off);
+ } else {
+ EMIT2_off32(add_2reg(0x84, BPF_REG_0, val_reg), add_2reg(0, ptr_reg, index_reg) /* SIB */, off);
+ }
+ *pprog = prog;
+}
+
/*
* Emit a REX byte if it will be necessary to address these registers
*/
@@ -968,6 +998,37 @@ static void emit_ldsx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
*pprog = prog;
}
+static void emit_ldx_index(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, u32 index_reg, int off)
+{
+ u8 *prog = *pprog;
+
+ switch (size) {
+ case BPF_B:
+ /* movzx rax, byte ptr [rax + r12 + off] */
+ EMIT3(add_3mod(0x40, src_reg, dst_reg, index_reg), 0x0F, 0xB6);
+ break;
+ case BPF_H:
+ /* movzx rax, word ptr [rax + r12 + off] */
+ EMIT3(add_3mod(0x40, src_reg, dst_reg, index_reg), 0x0F, 0xB7);
+ break;
+ case BPF_W:
+ /* mov eax, dword ptr [rax + r12 + off] */
+ EMIT2(add_3mod(0x40, src_reg, dst_reg, index_reg), 0x8B);
+ break;
+ case BPF_DW:
+ /* mov rax, qword ptr [rax + r12 + off] */
+ EMIT2(add_3mod(0x48, src_reg, dst_reg, index_reg), 0x8B);
+ break;
+ }
+ emit_insn_suffix_SIB(&prog, src_reg, dst_reg, index_reg, off);
+ *pprog = prog;
+}
+
+static void emit_ldx_r12(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
+{
+ emit_ldx_index(pprog, size, dst_reg, src_reg, X86_REG_R12, off);
+}
+
/* STX: *(u8*)(dst_reg + off) = src_reg */
static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
{
@@ -1002,6 +1063,71 @@ static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
*pprog = prog;
}
+/* STX: *(u8*)(dst_reg + index_reg + off) = src_reg */
+static void emit_stx_index(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, u32 index_reg, int off)
+{
+ u8 *prog = *pprog;
+
+ switch (size) {
+ case BPF_B:
+ /* mov byte ptr [rax + r12 + off], al */
+ EMIT2(add_3mod(0x40, dst_reg, src_reg, index_reg), 0x88);
+ break;
+ case BPF_H:
+ /* mov word ptr [rax + r12 + off], ax */
+ EMIT3(0x66, add_3mod(0x40, dst_reg, src_reg, index_reg), 0x89);
+ break;
+ case BPF_W:
+ /* mov dword ptr [rax + r12 + off], eax */
+ EMIT2(add_3mod(0x40, dst_reg, src_reg, index_reg), 0x89);
+ break;
+ case BPF_DW:
+ /* mov qword ptr [rax + r12 + off], rax */
+ EMIT2(add_3mod(0x48, dst_reg, src_reg, index_reg), 0x89);
+ break;
+ }
+ emit_insn_suffix_SIB(&prog, dst_reg, src_reg, index_reg, off);
+ *pprog = prog;
+}
+
+static void emit_stx_r12(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
+{
+ emit_stx_index(pprog, size, dst_reg, src_reg, X86_REG_R12, off);
+}
+
+/* ST: *(u8*)(dst_reg + index_reg + off) = imm32 */
+static void emit_st_index(u8 **pprog, u32 size, u32 dst_reg, u32 index_reg, int off, int imm)
+{
+ u8 *prog = *pprog;
+
+ switch (size) {
+ case BPF_B:
+ /* mov byte ptr [rax + r12 + off], imm8 */
+ EMIT2(add_3mod(0x40, dst_reg, 0, index_reg), 0xC6);
+ break;
+ case BPF_H:
+ /* mov word ptr [rax + r12 + off], imm16 */
+ EMIT3(0x66, add_3mod(0x40, dst_reg, 0, index_reg), 0xC7);
+ break;
+ case BPF_W:
+ /* mov dword ptr [rax + r12 + off], imm32 */
+ EMIT2(add_3mod(0x40, dst_reg, 0, index_reg), 0xC7);
+ break;
+ case BPF_DW:
+ /* mov qword ptr [rax + r12 + off], imm32 */
+ EMIT2(add_3mod(0x48, dst_reg, 0, index_reg), 0xC7);
+ break;
+ }
+ emit_insn_suffix_SIB(&prog, dst_reg, 0, index_reg, off);
+ EMIT(imm, bpf_size_to_x86_bytes(size));
+ *pprog = prog;
+}
+
+static void emit_st_r12(u8 **pprog, u32 size, u32 dst_reg, int off, int imm)
+{
+ emit_st_index(pprog, size, dst_reg, X86_REG_R12, off, imm);
+}
+
static int emit_atomic(u8 **pprog, u8 atomic_op,
u32 dst_reg, u32 src_reg, s16 off, u8 bpf_size)
{
@@ -1043,12 +1169,15 @@ static int emit_atomic(u8 **pprog, u8 atomic_op,
return 0;
}
+#define DONT_CLEAR 1
+
bool ex_handler_bpf(const struct exception_table_entry *x, struct pt_regs *regs)
{
u32 reg = x->fixup >> 8;
/* jump over faulting load and clear dest register */
- *(unsigned long *)((void *)regs + reg) = 0;
+ if (reg != DONT_CLEAR)
+ *(unsigned long *)((void *)regs + reg) = 0;
regs->ip += x->fixup & 0xff;
return true;
}
@@ -1147,11 +1276,14 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
bool tail_call_seen = false;
bool seen_exit = false;
u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
+ u64 arena_vm_start;
int i, excnt = 0;
int ilen, proglen = 0;
u8 *prog = temp;
int err;
+ arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
+
detect_reg_usage(insn, insn_cnt, callee_regs_used,
&tail_call_seen);
@@ -1172,8 +1304,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
push_r12(&prog);
push_callee_regs(&prog, all_callee_regs_used);
} else {
+ if (arena_vm_start)
+ push_r12(&prog);
push_callee_regs(&prog, callee_regs_used);
}
+ if (arena_vm_start)
+ emit_mov_imm64(&prog, X86_REG_R12,
+ arena_vm_start >> 32, (u32) arena_vm_start);
ilen = prog - temp;
if (rw_image)
@@ -1564,6 +1701,56 @@ st: if (is_imm8(insn->off))
emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
break;
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_B:
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_H:
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_W:
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_DW:
+ start_of_ldx = prog;
+ emit_st_r12(&prog, BPF_SIZE(insn->code), dst_reg, insn->off, insn->imm);
+ goto populate_extable;
+
+ /* LDX: dst_reg = *(u8*)(src_reg + r12 + off) */
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_B:
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_H:
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_W:
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_DW:
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_B:
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_H:
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_W:
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_DW:
+ start_of_ldx = prog;
+ if (BPF_CLASS(insn->code) == BPF_LDX)
+ emit_ldx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
+ else
+ emit_stx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
+populate_extable:
+ {
+ struct exception_table_entry *ex;
+ u8 *_insn = image + proglen + (start_of_ldx - temp);
+ s64 delta;
+
+ if (!bpf_prog->aux->extable)
+ break;
+
+ if (excnt >= bpf_prog->aux->num_exentries) {
+ pr_err("mem32 extable bug\n");
+ return -EFAULT;
+ }
+ ex = &bpf_prog->aux->extable[excnt++];
+
+ delta = _insn - (u8 *)&ex->insn;
+ /* switch ex to rw buffer for writes */
+ ex = (void *)rw_image + ((void *)ex - (void *)image);
+
+ ex->insn = delta;
+
+ ex->data = EX_TYPE_BPF;
+
+ ex->fixup = (prog - start_of_ldx) |
+ ((BPF_CLASS(insn->code) == BPF_LDX ? reg2pt_regs[dst_reg] : DONT_CLEAR) << 8);
+ }
+ break;
+
/* LDX: dst_reg = *(u8*)(src_reg + off) */
case BPF_LDX | BPF_MEM | BPF_B:
case BPF_LDX | BPF_PROBE_MEM | BPF_B:
@@ -2036,6 +2223,8 @@ st: if (is_imm8(insn->off))
pop_r12(&prog);
} else {
pop_callee_regs(&prog, callee_regs_used);
+ if (arena_vm_start)
+ pop_r12(&prog);
}
EMIT1(0xC9); /* leave */
emit_return(&prog, image + addrs[i - 1] + (prog - temp));
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ea6ab6e0eef9..8904d1606125 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1458,6 +1458,7 @@ struct bpf_prog_aux {
bool xdp_has_frags;
bool exception_cb;
bool exception_boundary;
+ struct bpf_arena *arena;
/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
const struct btf_type *attach_func_proto;
/* function name for valid attach_btf_id */
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 36cc29a2934c..b119f04ecb0b 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -72,6 +72,9 @@ struct ctl_table_header;
/* unused opcode to mark special ldsx instruction. Same as BPF_IND */
#define BPF_PROBE_MEMSX 0x40
+/* unused opcode to mark special load instruction. Same as BPF_MSH */
+#define BPF_PROBE_MEM32 0xa0
+
/* unused opcode to mark call to interpreter with arguments */
#define BPF_CALL_ARGS 0xe0
--
2.43.0
* Re: [PATCH v3 bpf-next 03/14] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions.
2024-03-08 1:08 ` [PATCH v3 bpf-next 03/14] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions Alexei Starovoitov
@ 2024-03-11 22:05 ` Andrii Nakryiko
2024-03-11 22:44 ` Alexei Starovoitov
0 siblings, 1 reply; 26+ messages in thread
From: Andrii Nakryiko @ 2024-03-11 22:05 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: bpf, daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
On Thu, Mar 7, 2024 at 5:08 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW] instructions.
> They are similar to PROBE_MEM instructions with the following differences:
> - PROBE_MEM has to check that the address is in the kernel range with
> a src_reg + insn->off >= TASK_SIZE_MAX + PAGE_SIZE check
> - PROBE_MEM doesn't support store
> - PROBE_MEM32 relies on the verifier to clear the upper 32 bits in the register
> - PROBE_MEM32 adds the 64-bit kern_vm_start address (which is stored in %r12 in the prologue)
> Due to bpf_arena construction, such a %r12 + %reg + off16 access is guaranteed
> to be within the arena virtual range, so no address check is needed at run-time.
> - PROBE_MEM32 allows STX and ST. If they fault, the store is a nop.
> When LDX faults, the destination register is zeroed.
>
> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
> arch/x86/net/bpf_jit_comp.c | 191 +++++++++++++++++++++++++++++++++++-
> include/linux/bpf.h | 1 +
> include/linux/filter.h | 3 +
> 3 files changed, 194 insertions(+), 1 deletion(-)
>
[...]
> +static u8 add_3mod(u8 byte, u32 r1, u32 r2, u32 index)
> +{
> + if (is_ereg(r1))
> + byte |= 1;
> + if (is_ereg(index))
> + byte |= 2;
> + if (is_ereg(r2))
> + byte |= 4;
> + return byte;
> +}
> +
> /* Encode 'dst_reg' register into x86-64 opcode 'byte' */
> static u8 add_1reg(u8 byte, u32 dst_reg)
> {
> @@ -645,6 +659,8 @@ static void emit_bpf_tail_call_indirect(struct bpf_prog *bpf_prog,
> pop_r12(&prog);
> } else {
> pop_callee_regs(&prog, callee_regs_used);
> + if (bpf_arena_get_kern_vm_start(bpf_prog->aux->arena))
ah, I guess this is where NULL is expected?.. But isn't `if
(bpf_prog->aux->arena)` an equivalent and more straightforward check?
> + pop_r12(&prog);
> }
>
> EMIT1(0x58); /* pop rax */
> @@ -704,6 +720,8 @@ static void emit_bpf_tail_call_direct(struct bpf_prog *bpf_prog,
> pop_r12(&prog);
> } else {
> pop_callee_regs(&prog, callee_regs_used);
> + if (bpf_arena_get_kern_vm_start(bpf_prog->aux->arena))
> + pop_r12(&prog);
> }
>
> EMIT1(0x58); /* pop rax */
[...]
> @@ -1147,11 +1276,14 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
> bool tail_call_seen = false;
> bool seen_exit = false;
> u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
> + u64 arena_vm_start;
> int i, excnt = 0;
> int ilen, proglen = 0;
> u8 *prog = temp;
> int err;
>
> + arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
and I'm guessing here you didn't want that check... I'd probably go
with explicit pointer checks, but ok, it's fine
> +
> detect_reg_usage(insn, insn_cnt, callee_regs_used,
> &tail_call_seen);
>
[...]
* Re: [PATCH v3 bpf-next 03/14] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions.
2024-03-11 22:05 ` Andrii Nakryiko
@ 2024-03-11 22:44 ` Alexei Starovoitov
0 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-11 22:44 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Linus Torvalds,
Barret Rhoden, Johannes Weiner, Andrew Morton, Uladzislau Rezki,
Christoph Hellwig, linux-mm, Kernel Team
On Mon, Mar 11, 2024 at 3:06 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
>
> > + arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
>
> and I'm guessing here you didn't want that check... I'd probably go
> with explicit pointer checks, but ok, it's fine
It's not only here. I expect all JITs will be calling it,
and an explicit NULL check would be replicated in each of them.
Hence I went with a NULL check inside that helper.
Another reason is that the __weak fallback stays trivial:
__weak u64 bpf_arena_get_*()
{ return 0; }
So simple code overall.
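With that, every JIT call site stays as simple as the x86 one in this patch:

	arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
	...
	if (arena_vm_start)
		push_r12(&prog);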
* [PATCH v3 bpf-next 04/14] bpf: Add x86-64 JIT support for bpf_addr_space_cast instruction.
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (2 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 03/14] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-08 1:08 ` [PATCH v3 bpf-next 05/14] bpf: Recognize addr_space_cast instruction in the verifier Alexei Starovoitov
` (11 subsequent siblings)
15 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
LLVM generates the bpf_addr_space_cast instruction when translating
pointers between the native (zero) address space and
__attribute__((address_space(N))).
Address space 1 is reserved as the bpf_arena address space.
rY = addr_space_cast(rX, 0, 1) is processed by the verifier and
converted to a normal 32-bit move: wY = wX
rY = addr_space_cast(rX, 1, 0) has to be converted by the JIT:
aux_reg = upper_32_bits of arena->user_vm_start
aux_reg <<= 32
wY = wX // clear upper 32 bits of dst register
if (wY) // if not zero add upper bits of user_vm_start
wY |= aux_reg
JIT can do it more efficiently:
mov dst_reg32, src_reg32 // 32-bit move
shl dst_reg, 32
or dst_reg, user_vm_start
rol dst_reg, 32
xor r11, r11
test dst_reg32, dst_reg32 // check if lower 32-bit are zero
cmove r11, dst_reg // if so, set dst_reg to zero
// Intel swapped src/dst register encoding in CMOVcc
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
arch/x86/net/bpf_jit_comp.c | 42 ++++++++++++++++++++++++++++++++++++-
include/linux/filter.h | 1 +
kernel/bpf/core.c | 5 +++++
3 files changed, 47 insertions(+), 1 deletion(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 38705a1abe62..27058d7395f6 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1276,13 +1276,14 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
bool tail_call_seen = false;
bool seen_exit = false;
u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
- u64 arena_vm_start;
+ u64 arena_vm_start, user_vm_start;
int i, excnt = 0;
int ilen, proglen = 0;
u8 *prog = temp;
int err;
arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
+ user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena);
detect_reg_usage(insn, insn_cnt, callee_regs_used,
&tail_call_seen);
@@ -1350,6 +1351,40 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
break;
case BPF_ALU64 | BPF_MOV | BPF_X:
+ if (insn->off == BPF_ADDR_SPACE_CAST &&
+ insn->imm == 1U << 16) {
+ if (dst_reg != src_reg)
+ /* 32-bit mov */
+ emit_mov_reg(&prog, false, dst_reg, src_reg);
+ /* shl dst_reg, 32 */
+ maybe_emit_1mod(&prog, dst_reg, true);
+ EMIT3(0xC1, add_1reg(0xE0, dst_reg), 32);
+
+ /* or dst_reg, user_vm_start */
+ maybe_emit_1mod(&prog, dst_reg, true);
+ if (is_axreg(dst_reg))
+ EMIT1_off32(0x0D, user_vm_start >> 32);
+ else
+ EMIT2_off32(0x81, add_1reg(0xC8, dst_reg), user_vm_start >> 32);
+
+ /* rol dst_reg, 32 */
+ maybe_emit_1mod(&prog, dst_reg, true);
+ EMIT3(0xC1, add_1reg(0xC0, dst_reg), 32);
+
+ /* xor r11, r11 */
+ EMIT3(0x4D, 0x31, 0xDB);
+
+ /* test dst_reg32, dst_reg32; check if lower 32-bit are zero */
+ maybe_emit_mod(&prog, dst_reg, dst_reg, false);
+ EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg));
+
+ /* cmove r11, dst_reg; if so, set dst_reg to zero */
+ /* WARNING: Intel swapped src/dst register encoding in CMOVcc !!! */
+ maybe_emit_mod(&prog, AUX_REG, dst_reg, true);
+ EMIT3(0x0F, 0x44, add_2reg(0xC0, AUX_REG, dst_reg));
+ break;
+ }
+ fallthrough;
case BPF_ALU | BPF_MOV | BPF_X:
if (insn->off == 0)
emit_mov_reg(&prog,
@@ -3432,6 +3467,11 @@ void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
}
}
+bool bpf_jit_supports_arena(void)
+{
+ return true;
+}
+
bool bpf_jit_supports_ptr_xchg(void)
{
return true;
diff --git a/include/linux/filter.h b/include/linux/filter.h
index b119f04ecb0b..c99bc3df2d28 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -962,6 +962,7 @@ bool bpf_jit_supports_kfunc_call(void);
bool bpf_jit_supports_far_kfunc_call(void);
bool bpf_jit_supports_exceptions(void);
bool bpf_jit_supports_ptr_xchg(void);
+bool bpf_jit_supports_arena(void);
void arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp), void *cookie);
bool bpf_helper_changes_pkt_data(void *func);
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index a8ecf69c7754..bdbdc75cdcd5 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2932,6 +2932,11 @@ bool __weak bpf_jit_supports_far_kfunc_call(void)
return false;
}
+bool __weak bpf_jit_supports_arena(void)
+{
+ return false;
+}
+
/* Return TRUE if the JIT backend satisfies the following two conditions:
* 1) JIT backend supports atomic_xchg() on pointer-sized words.
* 2) Under the specific arch, the implementation of xchg() is the same
--
2.43.0
* [PATCH v3 bpf-next 05/14] bpf: Recognize addr_space_cast instruction in the verifier.
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (3 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 04/14] bpf: Add x86-64 JIT support for bpf_addr_space_cast instruction Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-08 1:08 ` [PATCH v3 bpf-next 06/14] bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA Alexei Starovoitov
` (10 subsequent siblings)
15 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
rY = addr_space_cast(rX, 0, 1) tells the verifier that rY->type = PTR_TO_ARENA.
Any further operations on a PTR_TO_ARENA register have to be in the 32-bit domain.
The verifier will mark loads/stores through PTR_TO_ARENA with PROBE_MEM32.
The JIT will generate them as kern_vm_start + 32bit_addr memory accesses.
rY = addr_space_cast(rX, 1, 0) tells the verifier that rY->type = unknown scalar.
If arena->map_flags has BPF_F_NO_USER_CONV set, then convert cast_user to mov32 as well.
Otherwise the JIT will convert it to:
rY = (u32)rX;
if (rY)
rY |= arena->user_vm_start & ~(u64)~0U;
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
include/linux/bpf.h | 1 +
include/linux/bpf_verifier.h | 1 +
kernel/bpf/log.c | 3 +
kernel/bpf/syscall.c | 6 ++
kernel/bpf/verifier.c | 107 ++++++++++++++++++++++++++++++++---
5 files changed, 109 insertions(+), 9 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8904d1606125..d0c836ba009d 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -883,6 +883,7 @@ enum bpf_reg_type {
* an explicit null check is required for this struct.
*/
PTR_TO_MEM, /* reg points to valid memory region */
+ PTR_TO_ARENA,
PTR_TO_BUF, /* reg points to a read/write buffer */
PTR_TO_FUNC, /* reg points to a bpf program function */
CONST_PTR_TO_DYNPTR, /* reg points to a const struct bpf_dynptr */
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 4b0f6600e499..7cb1b75eee38 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -548,6 +548,7 @@ struct bpf_insn_aux_data {
u32 seen; /* this insn was processed by the verifier at env->pass_cnt */
bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */
bool zext_dst; /* this insn zero extends dst reg */
+ bool needs_zext; /* alu op needs to clear upper bits */
bool storage_get_func_atomic; /* bpf_*_storage_get() with atomic memory alloc */
bool is_iter_next; /* bpf_iter_<type>_next() kfunc call */
bool call_with_percpu_alloc_ptr; /* {this,per}_cpu_ptr() with prog percpu alloc */
diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
index 63c34e7b0715..2a243cf37c60 100644
--- a/kernel/bpf/log.c
+++ b/kernel/bpf/log.c
@@ -458,6 +458,7 @@ const char *reg_type_str(struct bpf_verifier_env *env, enum bpf_reg_type type)
[PTR_TO_XDP_SOCK] = "xdp_sock",
[PTR_TO_BTF_ID] = "ptr_",
[PTR_TO_MEM] = "mem",
+ [PTR_TO_ARENA] = "arena",
[PTR_TO_BUF] = "buf",
[PTR_TO_FUNC] = "func",
[PTR_TO_MAP_KEY] = "map_key",
@@ -693,6 +694,8 @@ static void print_reg_state(struct bpf_verifier_env *env,
}
verbose(env, "%s", reg_type_str(env, t));
+ if (t == PTR_TO_ARENA)
+ return;
if (t == PTR_TO_STACK) {
if (state->frameno != reg->frameno)
verbose(env, "[%d]", reg->frameno);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 67923e41a07e..07f2a4db4511 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4453,6 +4453,12 @@ static struct bpf_insn *bpf_insn_prepare_dump(const struct bpf_prog *prog,
continue;
}
+ if ((BPF_CLASS(code) == BPF_LDX || BPF_CLASS(code) == BPF_STX ||
+ BPF_CLASS(code) == BPF_ST) && BPF_MODE(code) == BPF_PROBE_MEM32) {
+ insns[i].code = BPF_CLASS(code) | BPF_SIZE(code) | BPF_MEM;
+ continue;
+ }
+
if (code != (BPF_LD | BPF_IMM | BPF_DW))
continue;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index fbcf2e5e635a..1358e20d315a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4386,6 +4386,7 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
case PTR_TO_MEM:
case PTR_TO_FUNC:
case PTR_TO_MAP_KEY:
+ case PTR_TO_ARENA:
return true;
default:
return false;
@@ -5828,6 +5829,8 @@ static int check_ptr_alignment(struct bpf_verifier_env *env,
case PTR_TO_XDP_SOCK:
pointer_desc = "xdp_sock ";
break;
+ case PTR_TO_ARENA:
+ return 0;
default:
break;
}
@@ -6937,6 +6940,9 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
if (!err && value_regno >= 0 && (rdonly_mem || t == BPF_READ))
mark_reg_unknown(env, regs, value_regno);
+ } else if (reg->type == PTR_TO_ARENA) {
+ if (t == BPF_READ && value_regno >= 0)
+ mark_reg_unknown(env, regs, value_regno);
} else {
verbose(env, "R%d invalid mem access '%s'\n", regno,
reg_type_str(env, reg->type));
@@ -8408,6 +8414,7 @@ static int check_func_arg_reg_off(struct bpf_verifier_env *env,
case PTR_TO_MEM | MEM_RINGBUF:
case PTR_TO_BUF:
case PTR_TO_BUF | MEM_RDONLY:
+ case PTR_TO_ARENA:
case SCALAR_VALUE:
return 0;
/* All the rest must be rejected, except PTR_TO_BTF_ID which allows
@@ -13852,6 +13859,21 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env,
dst_reg = &regs[insn->dst_reg];
src_reg = NULL;
+
+ if (dst_reg->type == PTR_TO_ARENA) {
+ struct bpf_insn_aux_data *aux = cur_aux(env);
+
+ if (BPF_CLASS(insn->code) == BPF_ALU64)
+ /*
+ * 32-bit operations zero upper bits automatically.
+ * 64-bit operations need to be converted to 32.
+ */
+ aux->needs_zext = true;
+
+ /* Any arithmetic operations are allowed on arena pointers */
+ return 0;
+ }
+
if (dst_reg->type != SCALAR_VALUE)
ptr_reg = dst_reg;
else
@@ -13969,19 +13991,20 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
} else if (opcode == BPF_MOV) {
if (BPF_SRC(insn->code) == BPF_X) {
- if (insn->imm != 0) {
- verbose(env, "BPF_MOV uses reserved fields\n");
- return -EINVAL;
- }
-
if (BPF_CLASS(insn->code) == BPF_ALU) {
- if (insn->off != 0 && insn->off != 8 && insn->off != 16) {
+ if ((insn->off != 0 && insn->off != 8 && insn->off != 16) ||
+ insn->imm) {
verbose(env, "BPF_MOV uses reserved fields\n");
return -EINVAL;
}
+ } else if (insn->off == BPF_ADDR_SPACE_CAST) {
+ if (insn->imm != 1 && insn->imm != 1u << 16) {
+ verbose(env, "addr_space_cast insn can only convert between address space 1 and 0\n");
+ return -EINVAL;
+ }
} else {
- if (insn->off != 0 && insn->off != 8 && insn->off != 16 &&
- insn->off != 32) {
+ if ((insn->off != 0 && insn->off != 8 && insn->off != 16 &&
+ insn->off != 32) || insn->imm) {
verbose(env, "BPF_MOV uses reserved fields\n");
return -EINVAL;
}
@@ -14008,7 +14031,12 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
struct bpf_reg_state *dst_reg = regs + insn->dst_reg;
if (BPF_CLASS(insn->code) == BPF_ALU64) {
- if (insn->off == 0) {
+ if (insn->imm) {
+ /* off == BPF_ADDR_SPACE_CAST */
+ mark_reg_unknown(env, regs, insn->dst_reg);
+ if (insn->imm == 1) /* cast from as(1) to as(0) */
+ dst_reg->type = PTR_TO_ARENA;
+ } else if (insn->off == 0) {
/* case: R1 = R2
* copy register state to dest reg
*/
@@ -15182,6 +15210,10 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
if (insn->src_reg == BPF_PSEUDO_MAP_VALUE ||
insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) {
+ if (map->map_type == BPF_MAP_TYPE_ARENA) {
+ __mark_reg_unknown(env, dst_reg);
+ return 0;
+ }
dst_reg->type = PTR_TO_MAP_VALUE;
dst_reg->off = aux->map_off;
WARN_ON_ONCE(map->max_entries != 1);
@@ -16568,6 +16600,8 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
* the same stack frame, since fp-8 in foo != fp-8 in bar
*/
return regs_exact(rold, rcur, idmap) && rold->frameno == rcur->frameno;
+ case PTR_TO_ARENA:
+ return true;
default:
return regs_exact(rold, rcur, idmap);
}
@@ -17443,6 +17477,7 @@ static bool reg_type_mismatch_ok(enum bpf_reg_type type)
case PTR_TO_TCP_SOCK:
case PTR_TO_XDP_SOCK:
case PTR_TO_BTF_ID:
+ case PTR_TO_ARENA:
return false;
default:
return true;
@@ -18296,6 +18331,31 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
fdput(f);
return -EBUSY;
}
+ if (map->map_type == BPF_MAP_TYPE_ARENA) {
+ if (env->prog->aux->arena) {
+ verbose(env, "Only one arena per program\n");
+ fdput(f);
+ return -EBUSY;
+ }
+ if (!env->allow_ptr_leaks || !env->bpf_capable) {
+ verbose(env, "CAP_BPF and CAP_PERFMON are required to use arena\n");
+ fdput(f);
+ return -EPERM;
+ }
+ if (!env->prog->jit_requested) {
+ verbose(env, "JIT is required to use arena\n");
+ return -EOPNOTSUPP;
+ }
+ if (!bpf_jit_supports_arena()) {
+ verbose(env, "JIT doesn't support arena\n");
+ return -EOPNOTSUPP;
+ }
+ env->prog->aux->arena = (void *)map;
+ if (!bpf_arena_get_user_vm_start(env->prog->aux->arena)) {
+ verbose(env, "arena's user address must be set via map_extra or mmap()\n");
+ return -EINVAL;
+ }
+ }
fdput(f);
next_insn:
@@ -18917,6 +18977,14 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
env->prog->aux->num_exentries++;
}
continue;
+ case PTR_TO_ARENA:
+ if (BPF_MODE(insn->code) == BPF_MEMSX) {
+ verbose(env, "sign extending loads from arena are not supported yet\n");
+ return -EOPNOTSUPP;
+ }
+ insn->code = BPF_CLASS(insn->code) | BPF_PROBE_MEM32 | BPF_SIZE(insn->code);
+ env->prog->aux->num_exentries++;
+ continue;
default:
continue;
}
@@ -19102,13 +19170,19 @@ static int jit_subprogs(struct bpf_verifier_env *env)
func[i]->aux->nr_linfo = prog->aux->nr_linfo;
func[i]->aux->jited_linfo = prog->aux->jited_linfo;
func[i]->aux->linfo_idx = env->subprog_info[i].linfo_idx;
+ func[i]->aux->arena = prog->aux->arena;
num_exentries = 0;
insn = func[i]->insnsi;
for (j = 0; j < func[i]->len; j++, insn++) {
if (BPF_CLASS(insn->code) == BPF_LDX &&
(BPF_MODE(insn->code) == BPF_PROBE_MEM ||
+ BPF_MODE(insn->code) == BPF_PROBE_MEM32 ||
BPF_MODE(insn->code) == BPF_PROBE_MEMSX))
num_exentries++;
+ if ((BPF_CLASS(insn->code) == BPF_STX ||
+ BPF_CLASS(insn->code) == BPF_ST) &&
+ BPF_MODE(insn->code) == BPF_PROBE_MEM32)
+ num_exentries++;
}
func[i]->aux->num_exentries = num_exentries;
func[i]->aux->tail_call_reachable = env->subprog_info[i].tail_call_reachable;
@@ -19507,6 +19581,21 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
}
for (i = 0; i < insn_cnt;) {
+ if (insn->code == (BPF_ALU64 | BPF_MOV | BPF_X) && insn->imm) {
+ if ((insn->off == BPF_ADDR_SPACE_CAST && insn->imm == 1) ||
+ (((struct bpf_map *)env->prog->aux->arena)->map_flags & BPF_F_NO_USER_CONV)) {
+ /* convert to 32-bit mov that clears upper 32-bit */
+ insn->code = BPF_ALU | BPF_MOV | BPF_X;
+ /* clear off, so it's a normal 'wX = wY' from JIT pov */
+ insn->off = 0;
+ } /* cast from as(0) to as(1) should be handled by JIT */
+ goto next_insn;
+ }
+
+ if (env->insn_aux_data[i + delta].needs_zext)
+ /* Convert BPF_CLASS(insn->code) == BPF_ALU64 to 32-bit ALU */
+ insn->code = BPF_ALU | BPF_OP(insn->code) | BPF_SRC(insn->code);
+
/* Make divide-by-zero exceptions impossible. */
if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) ||
insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) ||
--
2.43.0
* [PATCH v3 bpf-next 06/14] bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA.
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (4 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 05/14] bpf: Recognize addr_space_cast instruction in the verifier Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-08 1:08 ` [PATCH v3 bpf-next 07/14] libbpf: Add __arg_arena to bpf_helpers.h Alexei Starovoitov
` (9 subsequent siblings)
15 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
In global bpf functions recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA.
Note, when the verifier sees:
__weak void foo(struct bar *p)
it recognizes 'p' as PTR_TO_MEM and 'struct bar' has to be a struct with scalars.
Hence the only way to use arena pointers in global functions is to tag them with "arg:arena".
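A minimal sketch of the intended usage (function and variable names here are
invented; __arg_arena is the wrapper macro added in the next patch):
__weak int arena_inc(int __arena *p __arg_arena)
{
	return p ? ++(*p) : 0;
}
The verifier treats 'p' as PTR_TO_ARENA inside the function, so the
dereference is allowed whether the caller passed an arena pointer or a scalar.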
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
include/linux/bpf.h | 1 +
kernel/bpf/btf.c | 19 +++++++++++++++----
kernel/bpf/verifier.c | 15 +++++++++++++++
3 files changed, 31 insertions(+), 4 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index d0c836ba009d..08ad265cb195 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -712,6 +712,7 @@ enum bpf_arg_type {
* on eBPF program stack
*/
ARG_PTR_TO_MEM, /* pointer to valid memory (stack, packet, map value) */
+ ARG_PTR_TO_ARENA,
ARG_CONST_SIZE, /* number of bytes accessed from memory */
ARG_CONST_SIZE_OR_ZERO, /* number of bytes accessed from memory or 0 */
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 170d017e8e4a..90c4a32d89ff 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7111,10 +7111,11 @@ static int btf_get_ptr_to_btf_id(struct bpf_verifier_log *log, int arg_idx,
}
enum btf_arg_tag {
- ARG_TAG_CTX = 0x1,
- ARG_TAG_NONNULL = 0x2,
- ARG_TAG_TRUSTED = 0x4,
- ARG_TAG_NULLABLE = 0x8,
+ ARG_TAG_CTX = BIT_ULL(0),
+ ARG_TAG_NONNULL = BIT_ULL(1),
+ ARG_TAG_TRUSTED = BIT_ULL(2),
+ ARG_TAG_NULLABLE = BIT_ULL(3),
+ ARG_TAG_ARENA = BIT_ULL(4),
};
/* Process BTF of a function to produce high-level expectation of function
@@ -7226,6 +7227,8 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
tags |= ARG_TAG_NONNULL;
} else if (strcmp(tag, "nullable") == 0) {
tags |= ARG_TAG_NULLABLE;
+ } else if (strcmp(tag, "arena") == 0) {
+ tags |= ARG_TAG_ARENA;
} else {
bpf_log(log, "arg#%d has unsupported set of tags\n", i);
return -EOPNOTSUPP;
@@ -7280,6 +7283,14 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
sub->args[i].btf_id = kern_type_id;
continue;
}
+ if (tags & ARG_TAG_ARENA) {
+ if (tags & ~ARG_TAG_ARENA) {
+ bpf_log(log, "arg#%d arena cannot be combined with any other tags\n", i);
+ return -EINVAL;
+ }
+ sub->args[i].arg_type = ARG_PTR_TO_ARENA;
+ continue;
+ }
if (is_global) { /* generic user data pointer */
u32 mem_size;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1358e20d315a..d64f7a9b60e8 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9379,6 +9379,18 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
bpf_log(log, "arg#%d is expected to be non-NULL\n", i);
return -EINVAL;
}
+ } else if (base_type(arg->arg_type) == ARG_PTR_TO_ARENA) {
+ /*
+ * Can pass any value and the kernel won't crash, but
+ * only PTR_TO_ARENA or SCALAR make sense. Everything
+ * else is a bug in the bpf program. Point it out to
+ * the user at the verification time instead of
+ * run-time debug nightmare.
+ */
+ if (reg->type != PTR_TO_ARENA && reg->type != SCALAR_VALUE) {
+ bpf_log(log, "R%d is not a pointer to arena or scalar.\n", regno);
+ return -EINVAL;
+ }
} else if (arg->arg_type == (ARG_PTR_TO_DYNPTR | MEM_RDONLY)) {
ret = process_dynptr_func(env, regno, -1, arg->arg_type, 0);
if (ret)
@@ -20448,6 +20460,9 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
reg->btf = bpf_get_btf_vmlinux(); /* can't fail at this point */
reg->btf_id = arg->btf_id;
reg->id = ++env->id_gen;
+ } else if (base_type(arg->arg_type) == ARG_PTR_TO_ARENA) {
+ /* caller can pass either PTR_TO_ARENA or SCALAR */
+ mark_reg_unknown(env, regs, i);
} else {
WARN_ONCE(1, "BUG: unhandled arg#%d type %d\n",
i - BPF_REG_1, arg->arg_type);
--
2.43.0
* [PATCH v3 bpf-next 07/14] libbpf: Add __arg_arena to bpf_helpers.h
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (5 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 06/14] bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-08 1:08 ` [PATCH v3 bpf-next 08/14] libbpf: Add support for bpf_arena Alexei Starovoitov
` (8 subsequent siblings)
15 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Add __arg_arena to bpf_helpers.h
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/lib/bpf/bpf_helpers.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
index 112b1504e072..cd17f6d0791f 100644
--- a/tools/lib/bpf/bpf_helpers.h
+++ b/tools/lib/bpf/bpf_helpers.h
@@ -193,6 +193,7 @@ enum libbpf_tristate {
#define __arg_nonnull __attribute((btf_decl_tag("arg:nonnull")))
#define __arg_nullable __attribute((btf_decl_tag("arg:nullable")))
#define __arg_trusted __attribute((btf_decl_tag("arg:trusted")))
+#define __arg_arena __attribute((btf_decl_tag("arg:arena")))
#ifndef ___bpf_concat
#define ___bpf_concat(a, b) a ## b
--
2.43.0
* [PATCH v3 bpf-next 08/14] libbpf: Add support for bpf_arena.
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (6 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 07/14] libbpf: Add __arg_arena to bpf_helpers.h Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-08 1:08 ` [PATCH v3 bpf-next 09/14] bpftool: Recognize arena map type Alexei Starovoitov
` (7 subsequent siblings)
15 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
mmap() bpf_arena right after creation, since the kernel needs to
remember the address returned from mmap. This is user_vm_start.
LLVM will generate bpf_arena_cast_user() instructions where
necessary and the JIT will add the upper 32 bits of user_vm_start
to such pointers.
Fix up bpf_map_mmap_sz() to compute mmap size as
map->value_size * map->max_entries for arrays and
PAGE_SIZE * map->max_entries for arena.
Don't set BTF at arena creation time, since the arena map type doesn't support it.
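Roughly, the mmap() that libbpf now issues right after creating an arena map
looks like this sketch (it mirrors the bpf_object__create_maps() hunk below;
error handling trimmed):
const long page_sz = sysconf(_SC_PAGE_SIZE);
size_t mmap_sz = page_sz * map->def.max_entries;

map->mmaped = mmap((void *)map->map_extra, mmap_sz,
		   PROT_READ | PROT_WRITE,
		   map->map_extra ? MAP_SHARED | MAP_FIXED : MAP_SHARED,
		   map->fd, 0);
If map_extra is zero, the kernel picks the address and that becomes user_vm_start.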
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/lib/bpf/libbpf.c | 47 +++++++++++++++++++++++++++++------
tools/lib/bpf/libbpf_probes.c | 7 ++++++
2 files changed, 46 insertions(+), 8 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 567ad367e7aa..34b722071874 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -185,6 +185,7 @@ static const char * const map_type_name[] = {
[BPF_MAP_TYPE_BLOOM_FILTER] = "bloom_filter",
[BPF_MAP_TYPE_USER_RINGBUF] = "user_ringbuf",
[BPF_MAP_TYPE_CGRP_STORAGE] = "cgrp_storage",
+ [BPF_MAP_TYPE_ARENA] = "arena",
};
static const char * const prog_type_name[] = {
@@ -1684,7 +1685,7 @@ static struct bpf_map *bpf_object__add_map(struct bpf_object *obj)
return map;
}
-static size_t bpf_map_mmap_sz(unsigned int value_sz, unsigned int max_entries)
+static size_t array_map_mmap_sz(unsigned int value_sz, unsigned int max_entries)
{
const long page_sz = sysconf(_SC_PAGE_SIZE);
size_t map_sz;
@@ -1694,6 +1695,20 @@ static size_t bpf_map_mmap_sz(unsigned int value_sz, unsigned int max_entries)
return map_sz;
}
+static size_t bpf_map_mmap_sz(const struct bpf_map *map)
+{
+ const long page_sz = sysconf(_SC_PAGE_SIZE);
+
+ switch (map->def.type) {
+ case BPF_MAP_TYPE_ARRAY:
+ return array_map_mmap_sz(map->def.value_size, map->def.max_entries);
+ case BPF_MAP_TYPE_ARENA:
+ return page_sz * map->def.max_entries;
+ default:
+ return 0; /* not supported */
+ }
+}
+
static int bpf_map_mmap_resize(struct bpf_map *map, size_t old_sz, size_t new_sz)
{
void *mmaped;
@@ -1847,7 +1862,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
pr_debug("map '%s' (global data): at sec_idx %d, offset %zu, flags %x.\n",
map->name, map->sec_idx, map->sec_offset, def->map_flags);
- mmap_sz = bpf_map_mmap_sz(map->def.value_size, map->def.max_entries);
+ mmap_sz = bpf_map_mmap_sz(map);
map->mmaped = mmap(NULL, mmap_sz, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
if (map->mmaped == MAP_FAILED) {
@@ -5017,6 +5032,7 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
case BPF_MAP_TYPE_SOCKHASH:
case BPF_MAP_TYPE_QUEUE:
case BPF_MAP_TYPE_STACK:
+ case BPF_MAP_TYPE_ARENA:
create_attr.btf_fd = 0;
create_attr.btf_key_type_id = 0;
create_attr.btf_value_type_id = 0;
@@ -5261,7 +5277,19 @@ bpf_object__create_maps(struct bpf_object *obj)
if (err < 0)
goto err_out;
}
-
+ if (map->def.type == BPF_MAP_TYPE_ARENA) {
+ map->mmaped = mmap((void *)map->map_extra, bpf_map_mmap_sz(map),
+ PROT_READ | PROT_WRITE,
+ map->map_extra ? MAP_SHARED | MAP_FIXED : MAP_SHARED,
+ map->fd, 0);
+ if (map->mmaped == MAP_FAILED) {
+ err = -errno;
+ map->mmaped = NULL;
+ pr_warn("map '%s': failed to mmap arena: %d\n",
+ map->name, err);
+ return err;
+ }
+ }
if (map->init_slots_sz && map->def.type != BPF_MAP_TYPE_PROG_ARRAY) {
err = init_map_in_map_slots(obj, map);
if (err < 0)
@@ -8761,7 +8789,7 @@ static void bpf_map__destroy(struct bpf_map *map)
if (map->mmaped) {
size_t mmap_sz;
- mmap_sz = bpf_map_mmap_sz(map->def.value_size, map->def.max_entries);
+ mmap_sz = bpf_map_mmap_sz(map);
munmap(map->mmaped, mmap_sz);
map->mmaped = NULL;
}
@@ -9995,11 +10023,14 @@ int bpf_map__set_value_size(struct bpf_map *map, __u32 size)
return libbpf_err(-EBUSY);
if (map->mmaped) {
- int err;
size_t mmap_old_sz, mmap_new_sz;
+ int err;
+
+ if (map->def.type != BPF_MAP_TYPE_ARRAY)
+ return -EOPNOTSUPP;
- mmap_old_sz = bpf_map_mmap_sz(map->def.value_size, map->def.max_entries);
- mmap_new_sz = bpf_map_mmap_sz(size, map->def.max_entries);
+ mmap_old_sz = bpf_map_mmap_sz(map);
+ mmap_new_sz = array_map_mmap_sz(size, map->def.max_entries);
err = bpf_map_mmap_resize(map, mmap_old_sz, mmap_new_sz);
if (err) {
pr_warn("map '%s': failed to resize memory-mapped region: %d\n",
@@ -13530,7 +13561,7 @@ int bpf_object__load_skeleton(struct bpf_object_skeleton *s)
for (i = 0; i < s->map_cnt; i++) {
struct bpf_map *map = *s->maps[i].map;
- size_t mmap_sz = bpf_map_mmap_sz(map->def.value_size, map->def.max_entries);
+ size_t mmap_sz = bpf_map_mmap_sz(map);
int prot, map_fd = map->fd;
void **mmaped = s->maps[i].mmaped;
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index ee9b1dbea9eb..302188122439 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -338,6 +338,13 @@ static int probe_map_create(enum bpf_map_type map_type)
key_size = 0;
max_entries = 1;
break;
+ case BPF_MAP_TYPE_ARENA:
+ key_size = 0;
+ value_size = 0;
+ max_entries = 1; /* one page */
+ opts.map_extra = 0; /* can mmap() at any address */
+ opts.map_flags = BPF_F_MMAPABLE;
+ break;
case BPF_MAP_TYPE_HASH:
case BPF_MAP_TYPE_ARRAY:
case BPF_MAP_TYPE_PROG_ARRAY:
--
2.43.0
* [PATCH v3 bpf-next 09/14] bpftool: Recognize arena map type
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (7 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 08/14] libbpf: Add support for bpf_arena Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-11 17:08 ` Quentin Monnet
2024-03-08 1:08 ` [PATCH v3 bpf-next 10/14] libbpf: Recognize __arena global variables Alexei Starovoitov
` (6 subsequent siblings)
15 siblings, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Teach bpftool to recognize arena map type.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/bpf/bpftool/Documentation/bpftool-map.rst | 2 +-
tools/bpf/bpftool/map.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/bpf/bpftool/Documentation/bpftool-map.rst b/tools/bpf/bpftool/Documentation/bpftool-map.rst
index 3b7ba037af95..9d6a314dfd7a 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-map.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-map.rst
@@ -55,7 +55,7 @@ MAP COMMANDS
| | **devmap** | **devmap_hash** | **sockmap** | **cpumap** | **xskmap** | **sockhash**
| | **cgroup_storage** | **reuseport_sockarray** | **percpu_cgroup_storage**
| | **queue** | **stack** | **sk_storage** | **struct_ops** | **ringbuf** | **inode_storage**
-| | **task_storage** | **bloom_filter** | **user_ringbuf** | **cgrp_storage** }
+| | **task_storage** | **bloom_filter** | **user_ringbuf** | **cgrp_storage** | **arena** }
DESCRIPTION
===========
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index f98f7bbea2b1..b89bd792c1d5 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -1463,7 +1463,7 @@ static int do_help(int argc, char **argv)
" devmap | devmap_hash | sockmap | cpumap | xskmap | sockhash |\n"
" cgroup_storage | reuseport_sockarray | percpu_cgroup_storage |\n"
" queue | stack | sk_storage | struct_ops | ringbuf | inode_storage |\n"
- " task_storage | bloom_filter | user_ringbuf | cgrp_storage }\n"
+ " task_storage | bloom_filter | user_ringbuf | cgrp_storage | arena }\n"
" " HELP_SPEC_OPTIONS " |\n"
" {-f|--bpffs} | {-n|--nomount} }\n"
"",
--
2.43.0
* Re: [PATCH v3 bpf-next 09/14] bpftool: Recognize arena map type
2024-03-08 1:08 ` [PATCH v3 bpf-next 09/14] bpftool: Recognize arena map type Alexei Starovoitov
@ 2024-03-11 17:08 ` Quentin Monnet
0 siblings, 0 replies; 26+ messages in thread
From: Quentin Monnet @ 2024-03-11 17:08 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
2024-03-08 01:08 UTC+0000 ~ Alexei Starovoitov
<alexei.starovoitov@gmail.com>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Teach bpftool to recognize arena map type.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
* [PATCH v3 bpf-next 10/14] libbpf: Recognize __arena global variables.
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (8 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 09/14] bpftool: Recognize arena map type Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-11 17:09 ` Quentin Monnet
2024-03-08 1:08 ` [PATCH v3 bpf-next 11/14] bpf: Add helper macro bpf_addr_space_cast() Alexei Starovoitov
` (5 subsequent siblings)
15 siblings, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Andrii Nakryiko <andrii@kernel.org>
LLVM automatically places __arena variables into the ".arena.1" ELF section.
In order to use such global variables, the bpf program must include a
definition of the arena map in the ".maps" section, like:
struct {
__uint(type, BPF_MAP_TYPE_ARENA);
__uint(map_flags, BPF_F_MMAPABLE);
__uint(max_entries, 1000); /* number of pages */
__ulong(map_extra, 2ull << 44); /* start of mmap() region */
} arena SEC(".maps");
libbpf recognizes both uses of the arena and creates a single `struct bpf_map *`
instance in the libbpf APIs.
The ".arena.1" ELF section data is used as the initial data image, which is
exposed through the skeleton and bpf_map__initial_value() to the user, in case
they need to tune it before the load phase. During the load phase, this initial
image is copied over into the mmap()'ed region corresponding to the arena, and
then discarded.
A few small checks here and there had to be added to make sure this
approach works with bpf_map__initial_value(), mostly due to the hard-coded
assumption that map->mmaped is set up with the mmap() syscall and should be
munmap()'ed. For the arena, .arena.1 can be (much) smaller than the maximum
arena size, so this smaller data size has to be tracked separately.
Given it is enforced that there is only one arena per bpf_object
instance, we just keep it in a separate field. This can be generalized
later if necessary.
All global variables from ".arena.1" section are accessible from user space
via skel->arena->name_of_var.
For bss/data/rodata the skeleton/libbpf perform the following sequence:
1. addr = mmap(MAP_ANONYMOUS)
2. user space optionally modifies global vars
3. map_fd = bpf_create_map()
4. bpf_update_map_elem(map_fd, addr) // to store values into the kernel
5. mmap(addr, MAP_FIXED, map_fd)
after step 5, user space sees the values it wrote at step 2 at the same addresses.
The arena doesn't support update_map_elem. Hence skeleton/libbpf do:
1. addr = malloc(sizeof SEC ".arena.1")
2. user space optionally modifies global vars
3. map_fd = bpf_create_map(MAP_TYPE_ARENA)
4. real_addr = mmap(map->map_extra, MAP_SHARED | MAP_FIXED, map_fd)
5. memcpy(real_addr, addr) // this will fault-in and allocate pages
In the end, the look and feel of global data vs __arena global data is the same
from the bpf prog's pov.
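For example (a sketch with a hypothetical skeleton called 'foo' and an
invented variable name), the net effect for the user is:
/* bpf program side */
int __arena counter;

/* user space side */
struct foo *skel = foo__open_and_load();

skel->arena->counter++; /* plain memory access, no syscalls */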
Another complication is:
struct {
__uint(type, BPF_MAP_TYPE_ARENA);
} arena SEC(".maps");
int __arena foo;
int bar;
ptr1 = &foo; // relocation against ".arena.1" section
ptr2 = &arena; // relocation against ".maps" section
ptr3 = &bar; // relocation against ".bss" section
For the kernel, ptr1 and ptr2 both point to the same arena's map_fd,
while ptr3 points to a different global array's map_fd.
For the verifier:
ptr1->type == unknown_scalar
ptr2->type == const_ptr_to_map
ptr3->type == ptr_to_map_value
After verification, from JIT pov all 3 ptr-s are normal ld_imm64 insns.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/bpf/bpftool/gen.c | 13 +++++
tools/lib/bpf/libbpf.c | 118 ++++++++++++++++++++++++++++++++++++----
tools/lib/bpf/libbpf.h | 2 +-
3 files changed, 120 insertions(+), 13 deletions(-)
diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
index a3d72be347b0..4fa4ade1ce74 100644
--- a/tools/bpf/bpftool/gen.c
+++ b/tools/bpf/bpftool/gen.c
@@ -120,6 +120,12 @@ static bool get_datasec_ident(const char *sec_name, char *buf, size_t buf_sz)
static const char *pfxs[] = { ".data", ".rodata", ".bss", ".kconfig" };
int i, n;
+ /* recognize hard coded LLVM section name */
+ if (strcmp(sec_name, ".arena.1") == 0) {
+ /* this is the name to use in skeleton */
+ snprintf(buf, buf_sz, "arena");
+ return true;
+ }
for (i = 0, n = ARRAY_SIZE(pfxs); i < n; i++) {
const char *pfx = pfxs[i];
@@ -250,6 +256,13 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
{
+ size_t tmp_sz;
+
+ if (bpf_map__type(map) == BPF_MAP_TYPE_ARENA && bpf_map__initial_value(map, &tmp_sz)) {
+ snprintf(buf, sz, "arena");
+ return true;
+ }
+
if (!bpf_map__is_internal(map) || !(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
return false;
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 34b722071874..efab29b8935b 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -498,6 +498,7 @@ struct bpf_struct_ops {
#define KSYMS_SEC ".ksyms"
#define STRUCT_OPS_SEC ".struct_ops"
#define STRUCT_OPS_LINK_SEC ".struct_ops.link"
+#define ARENA_SEC ".arena.1"
enum libbpf_map_type {
LIBBPF_MAP_UNSPEC,
@@ -629,6 +630,7 @@ struct elf_state {
Elf *elf;
Elf64_Ehdr *ehdr;
Elf_Data *symbols;
+ Elf_Data *arena_data;
size_t shstrndx; /* section index for section name strings */
size_t strtabidx;
struct elf_sec_desc *secs;
@@ -638,6 +640,7 @@ struct elf_state {
int text_shndx;
int symbols_shndx;
bool has_st_ops;
+ int arena_data_shndx;
};
struct usdt_manager;
@@ -697,6 +700,10 @@ struct bpf_object {
struct usdt_manager *usdt_man;
+ struct bpf_map *arena_map;
+ void *arena_data;
+ size_t arena_data_sz;
+
struct kern_feature_cache *feat_cache;
char *token_path;
int token_fd;
@@ -1443,6 +1450,7 @@ static void bpf_object__elf_finish(struct bpf_object *obj)
elf_end(obj->efile.elf);
obj->efile.elf = NULL;
obj->efile.symbols = NULL;
+ obj->efile.arena_data = NULL;
zfree(&obj->efile.secs);
obj->efile.sec_cnt = 0;
@@ -1851,7 +1859,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
def->value_size = data_sz;
def->max_entries = 1;
def->map_flags = type == LIBBPF_MAP_RODATA || type == LIBBPF_MAP_KCONFIG
- ? BPF_F_RDONLY_PROG : 0;
+ ? BPF_F_RDONLY_PROG : 0;
/* failures are fine because of maps like .rodata.str1.1 */
(void) map_fill_btf_type_info(obj, map);
@@ -2843,6 +2851,32 @@ static int bpf_object__init_user_btf_map(struct bpf_object *obj,
return 0;
}
+static int init_arena_map_data(struct bpf_object *obj, struct bpf_map *map,
+ const char *sec_name, int sec_idx,
+ void *data, size_t data_sz)
+{
+ const long page_sz = sysconf(_SC_PAGE_SIZE);
+ size_t mmap_sz;
+
+ mmap_sz = bpf_map_mmap_sz(obj->arena_map);
+ if (roundup(data_sz, page_sz) > mmap_sz) {
+ pr_warn("elf: sec '%s': declared ARENA map size (%zu) is too small to hold global __arena variables of size %zu\n",
+ sec_name, mmap_sz, data_sz);
+ return -E2BIG;
+ }
+
+ obj->arena_data = malloc(data_sz);
+ if (!obj->arena_data)
+ return -ENOMEM;
+ memcpy(obj->arena_data, data, data_sz);
+ obj->arena_data_sz = data_sz;
+
+ /* make bpf_map__init_value() work for ARENA maps */
+ map->mmaped = obj->arena_data;
+
+ return 0;
+}
+
static int bpf_object__init_user_btf_maps(struct bpf_object *obj, bool strict,
const char *pin_root_path)
{
@@ -2892,6 +2926,33 @@ static int bpf_object__init_user_btf_maps(struct bpf_object *obj, bool strict,
return err;
}
+ for (i = 0; i < obj->nr_maps; i++) {
+ struct bpf_map *map = &obj->maps[i];
+
+ if (map->def.type != BPF_MAP_TYPE_ARENA)
+ continue;
+
+ if (obj->arena_map) {
+ pr_warn("map '%s': only single ARENA map is supported (map '%s' is also ARENA)\n",
+ map->name, obj->arena_map->name);
+ return -EINVAL;
+ }
+ obj->arena_map = map;
+
+ if (obj->efile.arena_data) {
+ err = init_arena_map_data(obj, map, ARENA_SEC, obj->efile.arena_data_shndx,
+ obj->efile.arena_data->d_buf,
+ obj->efile.arena_data->d_size);
+ if (err)
+ return err;
+ }
+ }
+ if (obj->efile.arena_data && !obj->arena_map) {
+ pr_warn("elf: sec '%s': to use global __arena variables the ARENA map should be explicitly declared in SEC(\".maps\")\n",
+ ARENA_SEC);
+ return -ENOENT;
+ }
+
return 0;
}
@@ -3771,6 +3832,9 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
sec_desc->shdr = sh;
sec_desc->data = data;
obj->efile.has_st_ops = true;
+ } else if (strcmp(name, ARENA_SEC) == 0) {
+ obj->efile.arena_data = data;
+ obj->efile.arena_data_shndx = idx;
} else {
pr_info("elf: skipping unrecognized data section(%d) %s\n",
idx, name);
@@ -4400,6 +4464,15 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
type = bpf_object__section_to_libbpf_map_type(obj, shdr_idx);
sym_sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, shdr_idx));
+ /* arena data relocation */
+ if (shdr_idx == obj->efile.arena_data_shndx) {
+ reloc_desc->type = RELO_DATA;
+ reloc_desc->insn_idx = insn_idx;
+ reloc_desc->map_idx = obj->arena_map - obj->maps;
+ reloc_desc->sym_off = sym->st_value;
+ return 0;
+ }
+
/* generic map reference relocation */
if (type == LIBBPF_MAP_UNSPEC) {
if (!bpf_object__shndx_is_maps(obj, shdr_idx)) {
@@ -4940,6 +5013,7 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
bpf_gen__map_freeze(obj->gen_loader, map - obj->maps);
return 0;
}
+
err = bpf_map_update_elem(map->fd, &zero, map->mmaped, 0);
if (err) {
err = -errno;
@@ -5289,6 +5363,10 @@ bpf_object__create_maps(struct bpf_object *obj)
map->name, err);
return err;
}
+ if (obj->arena_data) {
+ memcpy(map->mmaped, obj->arena_data, obj->arena_data_sz);
+ zfree(&obj->arena_data);
+ }
}
if (map->init_slots_sz && map->def.type != BPF_MAP_TYPE_PROG_ARRAY) {
err = init_map_in_map_slots(obj, map);
@@ -8786,13 +8864,9 @@ static void bpf_map__destroy(struct bpf_map *map)
zfree(&map->init_slots);
map->init_slots_sz = 0;
- if (map->mmaped) {
- size_t mmap_sz;
-
- mmap_sz = bpf_map_mmap_sz(map);
- munmap(map->mmaped, mmap_sz);
- map->mmaped = NULL;
- }
+ if (map->mmaped && map->mmaped != map->obj->arena_data)
+ munmap(map->mmaped, bpf_map_mmap_sz(map));
+ map->mmaped = NULL;
if (map->st_ops) {
zfree(&map->st_ops->data);
@@ -8852,6 +8926,8 @@ void bpf_object__close(struct bpf_object *obj)
if (obj->token_fd > 0)
close(obj->token_fd);
+ zfree(&obj->arena_data);
+
free(obj);
}
@@ -10063,18 +10139,26 @@ __u32 bpf_map__btf_value_type_id(const struct bpf_map *map)
int bpf_map__set_initial_value(struct bpf_map *map,
const void *data, size_t size)
{
+ size_t actual_sz;
+
if (map->obj->loaded || map->reused)
return libbpf_err(-EBUSY);
- if (!map->mmaped || map->libbpf_type == LIBBPF_MAP_KCONFIG ||
- size != map->def.value_size)
+ if (!map->mmaped || map->libbpf_type == LIBBPF_MAP_KCONFIG)
+ return libbpf_err(-EINVAL);
+
+ if (map->def.type == BPF_MAP_TYPE_ARENA)
+ actual_sz = map->obj->arena_data_sz;
+ else
+ actual_sz = map->def.value_size;
+ if (size != actual_sz)
return libbpf_err(-EINVAL);
memcpy(map->mmaped, data, size);
return 0;
}
-void *bpf_map__initial_value(struct bpf_map *map, size_t *psize)
+void *bpf_map__initial_value(const struct bpf_map *map, size_t *psize)
{
if (bpf_map__is_struct_ops(map)) {
if (psize)
@@ -10084,7 +10168,12 @@ void *bpf_map__initial_value(struct bpf_map *map, size_t *psize)
if (!map->mmaped)
return NULL;
- *psize = map->def.value_size;
+
+ if (map->def.type == BPF_MAP_TYPE_ARENA)
+ *psize = map->obj->arena_data_sz;
+ else
+ *psize = map->def.value_size;
+
return map->mmaped;
}
@@ -13573,6 +13662,11 @@ int bpf_object__load_skeleton(struct bpf_object_skeleton *s)
continue;
}
+ if (map->def.type == BPF_MAP_TYPE_ARENA) {
+ *mmaped = map->mmaped;
+ continue;
+ }
+
if (map->def.map_flags & BPF_F_RDONLY_PROG)
prot = PROT_READ;
else
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 5723cbbfcc41..7b510761f545 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1014,7 +1014,7 @@ LIBBPF_API int bpf_map__set_map_extra(struct bpf_map *map, __u64 map_extra);
LIBBPF_API int bpf_map__set_initial_value(struct bpf_map *map,
const void *data, size_t size);
-LIBBPF_API void *bpf_map__initial_value(struct bpf_map *map, size_t *psize);
+LIBBPF_API void *bpf_map__initial_value(const struct bpf_map *map, size_t *psize);
/**
* @brief **bpf_map__is_internal()** tells the caller whether or not the
--
2.43.0
* Re: [PATCH v3 bpf-next 10/14] libbpf: Recognize __arena global variables.
2024-03-08 1:08 ` [PATCH v3 bpf-next 10/14] libbpf: Recognize __arena global variables Alexei Starovoitov
@ 2024-03-11 17:09 ` Quentin Monnet
0 siblings, 0 replies; 26+ messages in thread
From: Quentin Monnet @ 2024-03-11 17:09 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
2024-03-08 01:08 UTC+0000 ~ Alexei Starovoitov
<alexei.starovoitov@gmail.com>
> From: Andrii Nakryiko <andrii@kernel.org>
>
> [...]
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
> tools/bpf/bpftool/gen.c | 13 +++++
> tools/lib/bpf/libbpf.c | 118 ++++++++++++++++++++++++++++++++++++----
> tools/lib/bpf/libbpf.h | 2 +-
> 3 files changed, 120 insertions(+), 13 deletions(-)
>
> diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
> index a3d72be347b0..4fa4ade1ce74 100644
> --- a/tools/bpf/bpftool/gen.c
> +++ b/tools/bpf/bpftool/gen.c
> @@ -120,6 +120,12 @@ static bool get_datasec_ident(const char *sec_name, char *buf, size_t buf_sz)
> static const char *pfxs[] = { ".data", ".rodata", ".bss", ".kconfig" };
> int i, n;
>
> + /* recognize hard coded LLVM section name */
> + if (strcmp(sec_name, ".arena.1") == 0) {
> + /* this is the name to use in skeleton */
> + snprintf(buf, buf_sz, "arena");
> + return true;
> + }
> for (i = 0, n = ARRAY_SIZE(pfxs); i < n; i++) {
> const char *pfx = pfxs[i];
>
> @@ -250,6 +256,13 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
>
> static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
> {
> + size_t tmp_sz;
> +
> + if (bpf_map__type(map) == BPF_MAP_TYPE_ARENA && bpf_map__initial_value(map, &tmp_sz)) {
> + snprintf(buf, sz, "arena");
> + return true;
> + }
> +
> if (!bpf_map__is_internal(map) || !(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
> return false;
>
For the bpftool changes:
Acked-by: Quentin Monnet <quentin@isovalent.com>
* [PATCH v3 bpf-next 11/14] bpf: Add helper macro bpf_addr_space_cast()
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (9 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 10/14] libbpf: Recognize __arena global varaibles Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-08 1:08 ` [PATCH v3 bpf-next 12/14] selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages Alexei Starovoitov
` (4 subsequent siblings)
15 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Introduce the helper macro bpf_addr_space_cast() that emits:
rX = rX
instruction with off = BPF_ADDR_SPACE_CAST
and encodes the dst and src address spaces into imm32.
It's useful with older LLVM versions that don't emit this insn automatically.
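For example, a program can convert an arena pointer to a kernel pointer by
hand before dereferencing it (a sketch that assumes the arena map and kfuncs
from the earlier patches in the series; built without __BPF_FEATURE_ARENA_CAST,
where __arena is a no-op):
void *ptr = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);

if (!ptr)
	return 1;
bpf_addr_space_cast(ptr, 0, 1); /* imm32 = (0 << 16) | 1: as(1) -> as(0) */
*(int *)ptr = 1;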
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
.../testing/selftests/bpf/bpf_experimental.h | 43 +++++++++++++++++++
1 file changed, 43 insertions(+)
diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index bc9a0832ae72..a5b9df38c162 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -343,6 +343,49 @@ l_true: \
asm volatile("%[reg]=%[reg]"::[reg]"r"((short)var))
#endif
+/* emit instruction:
+ * rX = rX .off = BPF_ADDR_SPACE_CAST .imm32 = (dst_as << 16) | src_as
+ */
+#ifndef bpf_addr_space_cast
+#define bpf_addr_space_cast(var, dst_as, src_as)\
+ asm volatile(".byte 0xBF; \
+ .ifc %[reg], r0; \
+ .byte 0x00; \
+ .endif; \
+ .ifc %[reg], r1; \
+ .byte 0x11; \
+ .endif; \
+ .ifc %[reg], r2; \
+ .byte 0x22; \
+ .endif; \
+ .ifc %[reg], r3; \
+ .byte 0x33; \
+ .endif; \
+ .ifc %[reg], r4; \
+ .byte 0x44; \
+ .endif; \
+ .ifc %[reg], r5; \
+ .byte 0x55; \
+ .endif; \
+ .ifc %[reg], r6; \
+ .byte 0x66; \
+ .endif; \
+ .ifc %[reg], r7; \
+ .byte 0x77; \
+ .endif; \
+ .ifc %[reg], r8; \
+ .byte 0x88; \
+ .endif; \
+ .ifc %[reg], r9; \
+ .byte 0x99; \
+ .endif; \
+ .short %[off]; \
+ .long %[as]" \
+ : [reg]"+r"(var) \
+ : [off]"i"(BPF_ADDR_SPACE_CAST) \
+ , [as]"i"((dst_as << 16) | src_as));
+#endif
+
/* Description
* Assert that a conditional expression is true.
* Returns
--
2.43.0
* [PATCH v3 bpf-next 12/14] selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (10 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 11/14] bpf: Add helper macro bpf_addr_space_cast() Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-08 1:08 ` [PATCH v3 bpf-next 13/14] selftests/bpf: Add bpf_arena_list test Alexei Starovoitov
` (3 subsequent siblings)
15 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Add unit tests for bpf_arena_alloc/free_pages() functionality
and bpf_arena_common.h with a set of common helpers and macros that
are used in this test and the following patches.
Also modify test_loader, which didn't support running bpf_prog_type_syscall
programs.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/DENYLIST.aarch64 | 1 +
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../testing/selftests/bpf/bpf_arena_common.h | 70 +++++++++
.../selftests/bpf/prog_tests/verifier.c | 2 +
.../selftests/bpf/progs/verifier_arena.c | 146 ++++++++++++++++++
tools/testing/selftests/bpf/test_loader.c | 9 +-
6 files changed, 227 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/bpf_arena_common.h
create mode 100644 tools/testing/selftests/bpf/progs/verifier_arena.c
diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64
index 0445ac38bc07..f9101651747b 100644
--- a/tools/testing/selftests/bpf/DENYLIST.aarch64
+++ b/tools/testing/selftests/bpf/DENYLIST.aarch64
@@ -10,3 +10,4 @@ fill_link_info/kprobe_multi_link_info # bpf_program__attach_kprobe_mu
fill_link_info/kretprobe_multi_link_info # bpf_program__attach_kprobe_multi_opts unexpected error: -95
fill_link_info/kprobe_multi_invalid_ubuff # bpf_program__attach_kprobe_multi_opts unexpected error: -95
missed/kprobe_recursion # missed_kprobe_recursion__attach unexpected error: -95 (errno 95)
+verifier_arena # JIT does not support arena
diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
index cb810a98e78f..aa8a620f3318 100644
--- a/tools/testing/selftests/bpf/DENYLIST.s390x
+++ b/tools/testing/selftests/bpf/DENYLIST.s390x
@@ -4,3 +4,4 @@ exceptions # JIT does not support calling kfunc bpf_throw (excepti
get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace)
stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
verifier_iterating_callbacks
+verifier_arena # JIT does not support arena
diff --git a/tools/testing/selftests/bpf/bpf_arena_common.h b/tools/testing/selftests/bpf/bpf_arena_common.h
new file mode 100644
index 000000000000..bcf195c64a45
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpf_arena_common.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#pragma once
+
+#ifndef WRITE_ONCE
+#define WRITE_ONCE(x, val) ((*(volatile typeof(x) *) &(x)) = (val))
+#endif
+
+#ifndef NUMA_NO_NODE
+#define NUMA_NO_NODE (-1)
+#endif
+
+#ifndef arena_container_of
+#define arena_container_of(ptr, type, member) \
+ ({ \
+ void __arena *__mptr = (void __arena *)(ptr); \
+ ((type *)(__mptr - offsetof(type, member))); \
+ })
+#endif
+
+#ifdef __BPF__ /* when compiled as bpf program */
+
+#ifndef PAGE_SIZE
+#define PAGE_SIZE __PAGE_SIZE
+/*
+ * for older kernels try sizeof(struct genradix_node)
+ * or flexible:
+ * static inline long __bpf_page_size(void) {
+ * return bpf_core_enum_value(enum page_size_enum___l, __PAGE_SIZE___l) ?: sizeof(struct genradix_node);
+ * }
+ * but generated code is not great.
+ */
+#endif
+
+#if defined(__BPF_FEATURE_ARENA_CAST) && !defined(BPF_ARENA_FORCE_ASM)
+#define __arena __attribute__((address_space(1)))
+#define cast_kern(ptr) /* nop for bpf prog. emitted by LLVM */
+#define cast_user(ptr) /* nop for bpf prog. emitted by LLVM */
+#else
+#define __arena
+#define cast_kern(ptr) bpf_addr_space_cast(ptr, 0, 1)
+#define cast_user(ptr) bpf_addr_space_cast(ptr, 1, 0)
+#endif
+
+void __arena* bpf_arena_alloc_pages(void *map, void __arena *addr, __u32 page_cnt,
+ int node_id, __u64 flags) __ksym __weak;
+void bpf_arena_free_pages(void *map, void __arena *ptr, __u32 page_cnt) __ksym __weak;
+
+#else /* when compiled as user space code */
+
+#define __arena
+#define __arg_arena
+#define cast_kern(ptr) /* nop for user space */
+#define cast_user(ptr) /* nop for user space */
+__weak char arena[1];
+
+#ifndef offsetof
+#define offsetof(type, member) ((unsigned long)&((type *)0)->member)
+#endif
+
+static inline void __arena* bpf_arena_alloc_pages(void *map, void *addr, __u32 page_cnt,
+ int node_id, __u64 flags)
+{
+ return NULL;
+}
+static inline void bpf_arena_free_pages(void *map, void __arena *ptr, __u32 page_cnt)
+{
+}
+
+#endif
diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index 9c6072a19745..985273832f89 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -4,6 +4,7 @@
#include "cap_helpers.h"
#include "verifier_and.skel.h"
+#include "verifier_arena.skel.h"
#include "verifier_array_access.skel.h"
#include "verifier_basic_stack.skel.h"
#include "verifier_bitfield_write.skel.h"
@@ -118,6 +119,7 @@ static void run_tests_aux(const char *skel_name,
#define RUN(skel) run_tests_aux(#skel, skel##__elf_bytes, NULL)
void test_verifier_and(void) { RUN(verifier_and); }
+void test_verifier_arena(void) { RUN(verifier_arena); }
void test_verifier_basic_stack(void) { RUN(verifier_basic_stack); }
void test_verifier_bitfield_write(void) { RUN(verifier_bitfield_write); }
void test_verifier_bounds(void) { RUN(verifier_bounds); }
diff --git a/tools/testing/selftests/bpf/progs/verifier_arena.c b/tools/testing/selftests/bpf/progs/verifier_arena.c
new file mode 100644
index 000000000000..5540b05ff9ee
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_arena.c
@@ -0,0 +1,146 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+#include "bpf_arena_common.h"
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARENA);
+ __uint(map_flags, BPF_F_MMAPABLE);
+ __uint(max_entries, 2); /* arena of two pages close to 32-bit boundary */
+ __ulong(map_extra, (1ull << 44) | (~0u - __PAGE_SIZE * 2 + 1)); /* start of mmap() region */
+} arena SEC(".maps");
+
+SEC("syscall")
+__success __retval(0)
+int basic_alloc1(void *ctx)
+{
+#if defined(__BPF_FEATURE_ARENA_CAST)
+ volatile int __arena *page1, *page2, *no_page, *page3;
+
+ page1 = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
+ if (!page1)
+ return 1;
+ *page1 = 1;
+ page2 = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
+ if (!page2)
+ return 2;
+ *page2 = 2;
+ no_page = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
+ if (no_page)
+ return 3;
+ if (*page1 != 1)
+ return 4;
+ if (*page2 != 2)
+ return 5;
+ bpf_arena_free_pages(&arena, (void __arena *)page2, 1);
+ if (*page1 != 1)
+ return 6;
+ if (*page2 != 0) /* use-after-free should return 0 */
+ return 7;
+ page3 = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
+ if (!page3)
+ return 8;
+ *page3 = 3;
+ if (page2 != page3)
+ return 9;
+ if (*page1 != 1)
+ return 10;
+#endif
+ return 0;
+}
+
+SEC("syscall")
+__success __retval(0)
+int basic_alloc2(void *ctx)
+{
+#if defined(__BPF_FEATURE_ARENA_CAST)
+ volatile char __arena *page1, *page2, *page3, *page4;
+
+ page1 = bpf_arena_alloc_pages(&arena, NULL, 2, NUMA_NO_NODE, 0);
+ if (!page1)
+ return 1;
+ page2 = page1 + __PAGE_SIZE;
+ page3 = page1 + __PAGE_SIZE * 2;
+ page4 = page1 - __PAGE_SIZE;
+ *page1 = 1;
+ *page2 = 2;
+ *page3 = 3;
+ *page4 = 4;
+ if (*page1 != 1)
+ return 1;
+ if (*page2 != 2)
+ return 2;
+ if (*page3 != 0)
+ return 3;
+ if (*page4 != 0)
+ return 4;
+ bpf_arena_free_pages(&arena, (void __arena *)page1, 2);
+ if (*page1 != 0)
+ return 5;
+ if (*page2 != 0)
+ return 6;
+ if (*page3 != 0)
+ return 7;
+ if (*page4 != 0)
+ return 8;
+#endif
+ return 0;
+}
+
+struct bpf_arena___l {
+ struct bpf_map map;
+} __attribute__((preserve_access_index));
+
+SEC("syscall")
+__success __retval(0) __log_level(2)
+int basic_alloc3(void *ctx)
+{
+ struct bpf_arena___l *ar = (struct bpf_arena___l *)&arena;
+ volatile char __arena *pages;
+
+ pages = bpf_arena_alloc_pages(&ar->map, NULL, ar->map.max_entries, NUMA_NO_NODE, 0);
+ if (!pages)
+ return 1;
+ return 0;
+}
+
+SEC("iter.s/bpf_map")
+__success __log_level(2)
+int iter_maps1(struct bpf_iter__bpf_map *ctx)
+{
+ struct bpf_map *map = ctx->map;
+
+ if (!map)
+ return 0;
+ bpf_arena_alloc_pages(map, NULL, map->max_entries, 0, 0);
+ return 0;
+}
+
+SEC("iter.s/bpf_map")
+__failure __msg("expected pointer to STRUCT bpf_map")
+int iter_maps2(struct bpf_iter__bpf_map *ctx)
+{
+ struct seq_file *seq = ctx->meta->seq;
+
+ bpf_arena_alloc_pages((void *)seq, NULL, 1, 0, 0);
+ return 0;
+}
+
+SEC("iter.s/bpf_map")
+__failure __msg("untrusted_ptr_bpf_map")
+int iter_maps3(struct bpf_iter__bpf_map *ctx)
+{
+ struct bpf_map *map = ctx->map;
+
+ if (!map)
+ return 0;
+ bpf_arena_alloc_pages(map->inner_map_meta, NULL, map->max_entries, 0, 0);
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
index ba57601c2a4d..524c38e9cde4 100644
--- a/tools/testing/selftests/bpf/test_loader.c
+++ b/tools/testing/selftests/bpf/test_loader.c
@@ -501,7 +501,7 @@ static bool is_unpriv_capable_map(struct bpf_map *map)
}
}
-static int do_prog_test_run(int fd_prog, int *retval)
+static int do_prog_test_run(int fd_prog, int *retval, bool empty_opts)
{
__u8 tmp_out[TEST_DATA_LEN << 2] = {};
__u8 tmp_in[TEST_DATA_LEN] = {};
@@ -514,6 +514,10 @@ static int do_prog_test_run(int fd_prog, int *retval)
.repeat = 1,
);
+ if (empty_opts) {
+ memset(&topts, 0, sizeof(struct bpf_test_run_opts));
+ topts.sz = sizeof(struct bpf_test_run_opts);
+ }
err = bpf_prog_test_run_opts(fd_prog, &topts);
saved_errno = errno;
@@ -649,7 +653,8 @@ void run_subtest(struct test_loader *tester,
}
}
- do_prog_test_run(bpf_program__fd(tprog), &retval);
+ do_prog_test_run(bpf_program__fd(tprog), &retval,
+ bpf_program__type(tprog) == BPF_PROG_TYPE_SYSCALL ? true : false);
if (retval != subspec->retval && subspec->retval != POINTER_VALUE) {
PRINT_FAIL("Unexpected retval: %d != %d\n", retval, subspec->retval);
goto tobj_cleanup;
--
2.43.0
* [PATCH v3 bpf-next 13/14] selftests/bpf: Add bpf_arena_list test.
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (11 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 12/14] selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-08 1:08 ` [PATCH v3 bpf-next 14/14] selftests/bpf: Add bpf_arena_htab test Alexei Starovoitov
` (2 subsequent siblings)
15 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
bpf_arena_alloc.h - implements a page_frag allocator as a bpf program.
bpf_arena_list.h - a doubly linked list as a bpf program.
Compiled as a bpf program and as native C code.
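A sketch of how the two headers are meant to be used together inside a bpf
program (it mirrors the selftest below, which defines 'struct elem' this way):
struct elem {
	struct arena_list_node node;
	__u64 value;
};

struct arena_list_head __arena *list_head;

/* carve an element out of the arena and link it into the list */
struct elem __arena *n = bpf_alloc(sizeof(*n));

if (n) {
	cast_kern(n);
	n->value = 1;
	list_add_head(&n->node, list_head);
}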
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/bpf_arena_alloc.h | 67 ++++++++++++++
tools/testing/selftests/bpf/bpf_arena_list.h | 92 +++++++++++++++++++
.../selftests/bpf/prog_tests/arena_list.c | 68 ++++++++++++++
.../testing/selftests/bpf/progs/arena_list.c | 87 ++++++++++++++++++
4 files changed, 314 insertions(+)
create mode 100644 tools/testing/selftests/bpf/bpf_arena_alloc.h
create mode 100644 tools/testing/selftests/bpf/bpf_arena_list.h
create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_list.c
create mode 100644 tools/testing/selftests/bpf/progs/arena_list.c
diff --git a/tools/testing/selftests/bpf/bpf_arena_alloc.h b/tools/testing/selftests/bpf/bpf_arena_alloc.h
new file mode 100644
index 000000000000..c27678299e0c
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpf_arena_alloc.h
@@ -0,0 +1,67 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#pragma once
+#include "bpf_arena_common.h"
+
+#ifndef __round_mask
+#define __round_mask(x, y) ((__typeof__(x))((y)-1))
+#endif
+#ifndef round_up
+#define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1)
+#endif
+
+#ifdef __BPF__
+#define NR_CPUS (sizeof(struct cpumask) * 8)
+
+static void __arena * __arena page_frag_cur_page[NR_CPUS];
+static int __arena page_frag_cur_offset[NR_CPUS];
+
+/* Simple page_frag allocator */
+static inline void __arena* bpf_alloc(unsigned int size)
+{
+ __u64 __arena *obj_cnt;
+ __u32 cpu = bpf_get_smp_processor_id();
+ void __arena *page = page_frag_cur_page[cpu];
+ int __arena *cur_offset = &page_frag_cur_offset[cpu];
+ int offset;
+
+ size = round_up(size, 8);
+ if (size >= PAGE_SIZE - 8)
+ return NULL;
+ if (!page) {
+refill:
+ page = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
+ if (!page)
+ return NULL;
+ cast_kern(page);
+ page_frag_cur_page[cpu] = page;
+ *cur_offset = PAGE_SIZE - 8;
+ obj_cnt = page + PAGE_SIZE - 8;
+ *obj_cnt = 0;
+ } else {
+ cast_kern(page);
+ obj_cnt = page + PAGE_SIZE - 8;
+ }
+
+ offset = *cur_offset - size;
+ if (offset < 0)
+ goto refill;
+
+ (*obj_cnt)++;
+ *cur_offset = offset;
+ return page + offset;
+}
+
+static inline void bpf_free(void __arena *addr)
+{
+ __u64 __arena *obj_cnt;
+
+ addr = (void __arena *)(((long)addr) & ~(PAGE_SIZE - 1));
+ obj_cnt = addr + PAGE_SIZE - 8;
+ if (--(*obj_cnt) == 0)
+ bpf_arena_free_pages(&arena, addr, 1);
+}
+#else
+static inline void __arena* bpf_alloc(unsigned int size) { return NULL; }
+static inline void bpf_free(void __arena *addr) {}
+#endif
diff --git a/tools/testing/selftests/bpf/bpf_arena_list.h b/tools/testing/selftests/bpf/bpf_arena_list.h
new file mode 100644
index 000000000000..b99b9f408eff
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpf_arena_list.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#pragma once
+#include "bpf_arena_common.h"
+
+struct arena_list_node;
+
+typedef struct arena_list_node __arena arena_list_node_t;
+
+struct arena_list_node {
+ arena_list_node_t *next;
+ arena_list_node_t * __arena *pprev;
+};
+
+struct arena_list_head {
+ struct arena_list_node __arena *first;
+};
+typedef struct arena_list_head __arena arena_list_head_t;
+
+#define list_entry(ptr, type, member) arena_container_of(ptr, type, member)
+
+#define list_entry_safe(ptr, type, member) \
+ ({ typeof(*ptr) * ___ptr = (ptr); \
+ ___ptr ? ({ cast_kern(___ptr); list_entry(___ptr, type, member); }) : NULL; \
+ })
+
+#ifndef __BPF__
+static inline void *bpf_iter_num_new(struct bpf_iter_num *it, int i, int j) { return NULL; }
+static inline void bpf_iter_num_destroy(struct bpf_iter_num *it) {}
+static inline bool bpf_iter_num_next(struct bpf_iter_num *it) { return true; }
+#define cond_break ({})
+#endif
+
+/* Safely walk link list elements. Deletion of elements is allowed. */
+#define list_for_each_entry(pos, head, member) \
+ for (void * ___tmp = (pos = list_entry_safe((head)->first, \
+ typeof(*(pos)), member), \
+ (void *)0); \
+ pos && ({ ___tmp = (void *)pos->member.next; 1; }); \
+ cond_break, \
+ pos = list_entry_safe((void __arena *)___tmp, typeof(*(pos)), member))
+
+static inline void list_add_head(arena_list_node_t *n, arena_list_head_t *h)
+{
+ arena_list_node_t *first = h->first, * __arena *tmp;
+
+ cast_user(first);
+ cast_kern(n);
+ WRITE_ONCE(n->next, first);
+ cast_kern(first);
+ if (first) {
+ tmp = &n->next;
+ cast_user(tmp);
+ WRITE_ONCE(first->pprev, tmp);
+ }
+ cast_user(n);
+ WRITE_ONCE(h->first, n);
+
+ tmp = &h->first;
+ cast_user(tmp);
+ cast_kern(n);
+ WRITE_ONCE(n->pprev, tmp);
+}
+
+static inline void __list_del(arena_list_node_t *n)
+{
+ arena_list_node_t *next = n->next, *tmp;
+ arena_list_node_t * __arena *pprev = n->pprev;
+
+ cast_user(next);
+ cast_kern(pprev);
+ tmp = *pprev;
+ cast_kern(tmp);
+ WRITE_ONCE(tmp, next);
+ if (next) {
+ cast_user(pprev);
+ cast_kern(next);
+ WRITE_ONCE(next->pprev, pprev);
+ }
+}
+
+#define POISON_POINTER_DELTA 0
+
+#define LIST_POISON1 ((void __arena *) 0x100 + POISON_POINTER_DELTA)
+#define LIST_POISON2 ((void __arena *) 0x122 + POISON_POINTER_DELTA)
+
+static inline void list_del(arena_list_node_t *n)
+{
+ __list_del(n);
+ n->next = LIST_POISON1;
+ n->pprev = LIST_POISON2;
+}
diff --git a/tools/testing/selftests/bpf/prog_tests/arena_list.c b/tools/testing/selftests/bpf/prog_tests/arena_list.c
new file mode 100644
index 000000000000..e61886debab1
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/arena_list.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <test_progs.h>
+#include <sys/mman.h>
+#include <network_helpers.h>
+
+#define PAGE_SIZE 4096
+
+#include "bpf_arena_list.h"
+#include "arena_list.skel.h"
+
+struct elem {
+ struct arena_list_node node;
+ __u64 value;
+};
+
+static int list_sum(struct arena_list_head *head)
+{
+ struct elem __arena *n;
+ int sum = 0;
+
+ list_for_each_entry(n, head, node)
+ sum += n->value;
+ return sum;
+}
+
+static void test_arena_list_add_del(int cnt)
+{
+ LIBBPF_OPTS(bpf_test_run_opts, opts);
+ struct arena_list *skel;
+ int expected_sum = (u64)cnt * (cnt - 1) / 2;
+ int ret, sum;
+
+ skel = arena_list__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "arena_list__open_and_load"))
+ return;
+
+ skel->bss->cnt = cnt;
+ ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_list_add), &opts);
+ ASSERT_OK(ret, "ret_add");
+ ASSERT_OK(opts.retval, "retval");
+ if (skel->bss->skip) {
+ printf("%s:SKIP:compiler doesn't support arena_cast\n", __func__);
+ test__skip();
+ goto out;
+ }
+ sum = list_sum(skel->bss->list_head);
+ ASSERT_EQ(sum, expected_sum, "sum of elems");
+ ASSERT_EQ(skel->arena->arena_sum, expected_sum, "__arena sum of elems");
+ ASSERT_EQ(skel->arena->test_val, cnt + 1, "num of elems");
+
+ ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_list_del), &opts);
+ ASSERT_OK(ret, "ret_del");
+ sum = list_sum(skel->bss->list_head);
+ ASSERT_EQ(sum, 0, "sum of list elems after del");
+ ASSERT_EQ(skel->bss->list_sum, expected_sum, "sum of list elems computed by prog");
+ ASSERT_EQ(skel->arena->arena_sum, expected_sum, "__arena sum of elems");
+out:
+ arena_list__destroy(skel);
+}
+
+void test_arena_list(void)
+{
+ if (test__start_subtest("arena_list_1"))
+ test_arena_list_add_del(1);
+ if (test__start_subtest("arena_list_1000"))
+ test_arena_list_add_del(1000);
+}
diff --git a/tools/testing/selftests/bpf/progs/arena_list.c b/tools/testing/selftests/bpf/progs/arena_list.c
new file mode 100644
index 000000000000..cd35b8448435
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/arena_list.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+#include "bpf_experimental.h"
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARENA);
+ __uint(map_flags, BPF_F_MMAPABLE);
+ __uint(max_entries, 100); /* number of pages */
+#ifdef __TARGET_ARCH_arm64
+ __ulong(map_extra, 0x1ull << 32); /* start of mmap() region */
+#else
+ __ulong(map_extra, 0x1ull << 44); /* start of mmap() region */
+#endif
+} arena SEC(".maps");
+
+#include "bpf_arena_alloc.h"
+#include "bpf_arena_list.h"
+
+struct elem {
+ struct arena_list_node node;
+ __u64 value;
+};
+
+struct arena_list_head __arena *list_head;
+int list_sum;
+int cnt;
+bool skip = false;
+
+#ifdef __BPF_FEATURE_ARENA_CAST
+long __arena arena_sum;
+int __arena test_val = 1;
+struct arena_list_head __arena global_head;
+#else
+long arena_sum SEC(".arena.1");
+int test_val SEC(".arena.1");
+#endif
+
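+/* Reading the loop start from a global keeps the compiler and the
+ * verifier from assuming a compile-time-constant trip count.
+ */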
+int zero;
+
+SEC("syscall")
+int arena_list_add(void *ctx)
+{
+#ifdef __BPF_FEATURE_ARENA_CAST
+ __u64 i;
+
+ list_head = &global_head;
+
+ for (i = zero; i < cnt; cond_break, i++) {
+ struct elem __arena *n = bpf_alloc(sizeof(*n));
+
+ test_val++;
+ n->value = i;
+ arena_sum += i;
+ list_add_head(&n->node, list_head);
+ }
+#else
+ skip = true;
+#endif
+ return 0;
+}
+
+SEC("syscall")
+int arena_list_del(void *ctx)
+{
+#ifdef __BPF_FEATURE_ARENA_CAST
+ struct elem __arena *n;
+ int sum = 0;
+
+ arena_sum = 0;
+ list_for_each_entry(n, list_head, node) {
+ sum += n->value;
+ arena_sum += n->value;
+ list_del(&n->node);
+ bpf_free(n);
+ }
+ list_sum = sum;
+#else
+ skip = true;
+#endif
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
--
2.43.0
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v3 bpf-next 14/14] selftests/bpf: Add bpf_arena_htab test.
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (12 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 13/14] selftests/bpf: Add bpf_arena_list test Alexei Starovoitov
@ 2024-03-08 1:08 ` Alexei Starovoitov
2024-03-11 22:45 ` [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Andrii Nakryiko
2024-03-11 22:50 ` patchwork-bot+netdevbpf
15 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-08 1:08 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
bpf_arena_htab.h - hash table implemented as bpf program
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/DENYLIST.aarch64 | 1 +
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
tools/testing/selftests/bpf/bpf_arena_htab.h | 100 ++++++++++++++++++
.../selftests/bpf/prog_tests/arena_htab.c | 88 +++++++++++++++
.../testing/selftests/bpf/progs/arena_htab.c | 48 +++++++++
.../selftests/bpf/progs/arena_htab_asm.c | 5 +
6 files changed, 243 insertions(+)
create mode 100644 tools/testing/selftests/bpf/bpf_arena_htab.h
create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_htab.c
create mode 100644 tools/testing/selftests/bpf/progs/arena_htab.c
create mode 100644 tools/testing/selftests/bpf/progs/arena_htab_asm.c
diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64
index f9101651747b..d8ade15e2789 100644
--- a/tools/testing/selftests/bpf/DENYLIST.aarch64
+++ b/tools/testing/selftests/bpf/DENYLIST.aarch64
@@ -11,3 +11,4 @@ fill_link_info/kretprobe_multi_link_info # bpf_program__attach_kprobe_mu
fill_link_info/kprobe_multi_invalid_ubuff # bpf_program__attach_kprobe_multi_opts unexpected error: -95
missed/kprobe_recursion # missed_kprobe_recursion__attach unexpected error: -95 (errno 95)
verifier_arena # JIT does not support arena
+arena_htab # JIT does not support arena
diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
index aa8a620f3318..f4a2f66a683d 100644
--- a/tools/testing/selftests/bpf/DENYLIST.s390x
+++ b/tools/testing/selftests/bpf/DENYLIST.s390x
@@ -5,3 +5,4 @@ get_stack_raw_tp # user_stack corrupted user stack
stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
verifier_iterating_callbacks
verifier_arena # JIT does not support arena
+arena_htab # JIT does not support arena
diff --git a/tools/testing/selftests/bpf/bpf_arena_htab.h b/tools/testing/selftests/bpf/bpf_arena_htab.h
new file mode 100644
index 000000000000..acc01a876668
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpf_arena_htab.h
@@ -0,0 +1,100 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#pragma once
+#include <errno.h>
+#include "bpf_arena_alloc.h"
+#include "bpf_arena_list.h"
+
+struct htab_bucket {
+ struct arena_list_head head;
+};
+typedef struct htab_bucket __arena htab_bucket_t;
+
+struct htab {
+ htab_bucket_t *buckets;
+ int n_buckets;
+};
+typedef struct htab __arena htab_t;
+
+static inline htab_bucket_t *__select_bucket(htab_t *htab, __u32 hash)
+{
+ htab_bucket_t *b = htab->buckets;
+
+ cast_kern(b);
+ return &b[hash & (htab->n_buckets - 1)];
+}
+
+static inline arena_list_head_t *select_bucket(htab_t *htab, __u32 hash)
+{
+ return &__select_bucket(htab, hash)->head;
+}
+
+struct hashtab_elem {
+ int hash;
+ int key;
+ int value;
+ struct arena_list_node hash_node;
+};
+typedef struct hashtab_elem __arena hashtab_elem_t;
+
+static hashtab_elem_t *lookup_elem_raw(arena_list_head_t *head, __u32 hash, int key)
+{
+ hashtab_elem_t *l;
+
+ list_for_each_entry(l, head, hash_node)
+ if (l->hash == hash && l->key == key)
+ return l;
+
+ return NULL;
+}
+
+static int htab_hash(int key)
+{
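+ /* Identity hash is sufficient for this test's integer keys */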
+ return key;
+}
+
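+/* __arg_arena tells the verifier this argument is an arena pointer;
+ * __weak makes it a standalone global function rather than inlining it.
+ */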
+__weak int htab_lookup_elem(htab_t *htab __arg_arena, int key)
+{
+ hashtab_elem_t *l_old;
+ arena_list_head_t *head;
+
+ cast_kern(htab);
+ head = select_bucket(htab, key);
+ l_old = lookup_elem_raw(head, htab_hash(key), key);
+ if (l_old)
+ return l_old->value;
+ return 0;
+}
+
+__weak int htab_update_elem(htab_t *htab __arg_arena, int key, int value)
+{
+ hashtab_elem_t *l_new = NULL, *l_old;
+ arena_list_head_t *head;
+
+ cast_kern(htab);
+ head = select_bucket(htab, key);
+ l_old = lookup_elem_raw(head, htab_hash(key), key);
+
+ l_new = bpf_alloc(sizeof(*l_new));
+ if (!l_new)
+ return -ENOMEM;
+ l_new->key = key;
+ l_new->hash = htab_hash(key);
+ l_new->value = value;
+
+ list_add_head(&l_new->hash_node, head);
+ if (l_old) {
+ list_del(&l_old->hash_node);
+ bpf_free(l_old);
+ }
+ return 0;
+}
+
+void htab_init(htab_t *htab)
+{
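+ /* Two pages of buckets; n_buckets stays a power of two, which
+ * __select_bucket() relies on for its hash & (n_buckets - 1) mask.
+ */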
+ void __arena *buckets = bpf_arena_alloc_pages(&arena, NULL, 2, NUMA_NO_NODE, 0);
+
+ cast_user(buckets);
+ htab->buckets = buckets;
+ htab->n_buckets = 2 * PAGE_SIZE / sizeof(struct htab_bucket);
+}
diff --git a/tools/testing/selftests/bpf/prog_tests/arena_htab.c b/tools/testing/selftests/bpf/prog_tests/arena_htab.c
new file mode 100644
index 000000000000..0766702de846
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/arena_htab.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <test_progs.h>
+#include <sys/mman.h>
+#include <network_helpers.h>
+
+#include "arena_htab_asm.skel.h"
+#include "arena_htab.skel.h"
+
+#define PAGE_SIZE 4096
+
+#include "bpf_arena_htab.h"
+
+static void test_arena_htab_common(struct htab *htab)
+{
+ int i;
+
+ printf("htab %p buckets %p n_buckets %d\n", htab, htab->buckets, htab->n_buckets);
+ ASSERT_OK_PTR(htab->buckets, "htab->buckets shouldn't be NULL");
+ for (i = 0; htab->buckets && i < 16; i += 4) {
+ /*
+ * Walk htab buckets and linked lists directly: all pointers are
+ * valid in user space even though they were written by the bpf
+ * program.
+ */
+ int val = htab_lookup_elem(htab, i);
+
+ ASSERT_EQ(i, val, "key == value");
+ }
+}
+
+static void test_arena_htab_llvm(void)
+{
+ LIBBPF_OPTS(bpf_test_run_opts, opts);
+ struct arena_htab *skel;
+ struct htab *htab;
+ size_t arena_sz;
+ void *area;
+ int ret;
+
+ skel = arena_htab__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "arena_htab__open_and_load"))
+ return;
+
+ area = bpf_map__initial_value(skel->maps.arena, &arena_sz);
+ /* fault in the page at pgoff == 0 as a sanity check */
+ *(volatile int *)area = 0x55aa;
+
+ /* bpf prog will allocate more pages */
+ ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_htab_llvm), &opts);
+ ASSERT_OK(ret, "ret");
+ ASSERT_OK(opts.retval, "retval");
+ if (skel->bss->skip) {
+ printf("%s:SKIP:compiler doesn't support arena_cast\n", __func__);
+ test__skip();
+ goto out;
+ }
+ htab = skel->bss->htab_for_user;
+ test_arena_htab_common(htab);
+out:
+ arena_htab__destroy(skel);
+}
+
+static void test_arena_htab_asm(void)
+{
+ LIBBPF_OPTS(bpf_test_run_opts, opts);
+ struct arena_htab_asm *skel;
+ struct htab *htab;
+ int ret;
+
+ skel = arena_htab_asm__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "arena_htab_asm__open_and_load"))
+ return;
+
+ ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_htab_asm), &opts);
+ ASSERT_OK(ret, "ret");
+ ASSERT_OK(opts.retval, "retval");
+ htab = skel->bss->htab_for_user;
+ test_arena_htab_common(htab);
+ arena_htab_asm__destroy(skel);
+}
+
+void test_arena_htab(void)
+{
+ if (test__start_subtest("arena_htab_llvm"))
+ test_arena_htab_llvm();
+ if (test__start_subtest("arena_htab_asm"))
+ test_arena_htab_asm();
+}
diff --git a/tools/testing/selftests/bpf/progs/arena_htab.c b/tools/testing/selftests/bpf/progs/arena_htab.c
new file mode 100644
index 000000000000..b7bb712cacfd
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/arena_htab.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+#include "bpf_experimental.h"
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARENA);
+ __uint(map_flags, BPF_F_MMAPABLE);
+ __uint(max_entries, 100); /* number of pages */
+} arena SEC(".maps");
+
+#include "bpf_arena_htab.h"
+
+void __arena *htab_for_user;
+bool skip = false;
+
+int zero = 0;
+
+SEC("syscall")
+int arena_htab_llvm(void *ctx)
+{
+#if defined(__BPF_FEATURE_ARENA_CAST) || defined(BPF_ARENA_FORCE_ASM)
+ struct htab __arena *htab;
+ __u64 i;
+
+ htab = bpf_alloc(sizeof(*htab));
+ cast_kern(htab);
+ htab_init(htab);
+
+ /* first run. No old elems in the table */
+ for (i = zero; i < 1000; i++)
+ htab_update_elem(htab, i, i);
+
+ /* should replace all elems with new ones */
+ for (i = zero; i < 1000; i++)
+ htab_update_elem(htab, i, i);
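+ /* Convert to the user space form before publishing the pointer to user space */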
+ cast_user(htab);
+ htab_for_user = htab;
+#else
+ skip = true;
+#endif
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/arena_htab_asm.c b/tools/testing/selftests/bpf/progs/arena_htab_asm.c
new file mode 100644
index 000000000000..6cd70ea12f0d
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/arena_htab_asm.c
@@ -0,0 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
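+/* Build the same test with forced inline-asm address space casts
+ * instead of the compiler's __BPF_FEATURE_ARENA_CAST support.
+ */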
+#define BPF_ARENA_FORCE_ASM
+#define arena_htab_llvm arena_htab_asm
+#include "arena_htab.c"
--
2.43.0
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena.
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (13 preceding siblings ...)
2024-03-08 1:08 ` [PATCH v3 bpf-next 14/14] selftests/bpf: Add bpf_arena_htab test Alexei Starovoitov
@ 2024-03-11 22:45 ` Andrii Nakryiko
2024-03-11 23:02 ` Alexei Starovoitov
2024-03-11 22:50 ` patchwork-bot+netdevbpf
15 siblings, 1 reply; 26+ messages in thread
From: Andrii Nakryiko @ 2024-03-11 22:45 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: bpf, daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
On Thu, Mar 7, 2024 at 5:08 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> v2->v3:
> - contains bpf bits only, but cc-ing past audience for continuity
> - since prerequisite patches landed, this series focuses on the main
> functionality of bpf_arena.
> - adopted Andrii's approach to support arena in libbpf.
> - simplified LLVM support. Instead of two instructions it's now only one.
> - switched to cond_break (instead of open coded iters) in selftests
> - implemented several follow-ups that will be sent after this set
> . remember first IP and bpf insn that faulted in arena.
> report to user space via bpftool
> . copy paste and tweak glob_match() aka mini-regex as a selftests/bpf
> - see patch 1 for detailed description of bpf_arena
>
> v1->v2:
> - Improved commit log with reasons for using vmap_pages_range() in arena.
> Thanks to Johannes
> - Added support for __arena global variables in bpf programs
> - Fixed race conditions spotted by Barret
> - Fixed wrap32 issue spotted by Barret
> - Fixed bpf_map_mmap_sz() the way Andrii suggested
>
> The work on bpf_arena was inspired by Barret's work:
> https://github.com/google/ghost-userspace/blob/main/lib/queue.bpf.h
> that implements queues, lists and AVL trees completely as bpf programs
> using giant bpf array map and integer indices instead of pointers.
> bpf_arena is a sparse array that allows normal C pointers to be used
> to build such data structures. The last few patches implement a
> page_frag allocator, a linked list and a hash table as bpf programs.
>
> v1:
> bpf programs have multiple options to communicate with user space:
> - Various ring buffers (perf, ftrace, bpf): The data is streamed
> unidirectionally from bpf to user space.
> - Hash map: The bpf program populates elements, and user space consumes
> them via bpf syscall.
> - mmap()-ed array map: Libbpf creates an array map that is directly
> accessed by the bpf program and mmap-ed to user space. It's the fastest
> way. Its disadvantage is that memory for the whole array is reserved at
> the start.
>
> Alexei Starovoitov (13):
> bpf: Introduce bpf_arena.
> bpf: Disasm support for addr_space_cast instruction.
> bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions.
> bpf: Add x86-64 JIT support for bpf_addr_space_cast instruction.
> bpf: Recognize addr_space_cast instruction in the verifier.
> bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA.
> libbpf: Add __arg_arena to bpf_helpers.h
> libbpf: Add support for bpf_arena.
> bpftool: Recognize arena map type
> bpf: Add helper macro bpf_addr_space_cast()
> selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
> selftests/bpf: Add bpf_arena_list test.
> selftests/bpf: Add bpf_arena_htab test.
>
> Andrii Nakryiko (1):
> libbpf: Recognize __arena global varaibles.
>
> arch/x86/net/bpf_jit_comp.c | 231 +++++++-
> include/linux/bpf.h | 10 +-
> include/linux/bpf_types.h | 1 +
> include/linux/bpf_verifier.h | 1 +
> include/linux/filter.h | 4 +
> include/uapi/linux/bpf.h | 14 +
> kernel/bpf/Makefile | 3 +
> kernel/bpf/arena.c | 558 ++++++++++++++++++
> kernel/bpf/btf.c | 19 +-
> kernel/bpf/core.c | 16 +
> kernel/bpf/disasm.c | 10 +
> kernel/bpf/log.c | 3 +
> kernel/bpf/syscall.c | 42 ++
> kernel/bpf/verifier.c | 123 +++-
> .../bpf/bpftool/Documentation/bpftool-map.rst | 2 +-
> tools/bpf/bpftool/gen.c | 13 +
> tools/bpf/bpftool/map.c | 2 +-
> tools/include/uapi/linux/bpf.h | 14 +
> tools/lib/bpf/bpf_helpers.h | 1 +
> tools/lib/bpf/libbpf.c | 163 ++++-
> tools/lib/bpf/libbpf.h | 2 +-
> tools/lib/bpf/libbpf_probes.c | 7 +
> tools/testing/selftests/bpf/DENYLIST.aarch64 | 2 +
> tools/testing/selftests/bpf/DENYLIST.s390x | 2 +
> tools/testing/selftests/bpf/bpf_arena_alloc.h | 67 +++
> .../testing/selftests/bpf/bpf_arena_common.h | 70 +++
> tools/testing/selftests/bpf/bpf_arena_htab.h | 100 ++++
> tools/testing/selftests/bpf/bpf_arena_list.h | 92 +++
> .../testing/selftests/bpf/bpf_experimental.h | 43 ++
> .../selftests/bpf/prog_tests/arena_htab.c | 88 +++
> .../selftests/bpf/prog_tests/arena_list.c | 68 +++
> .../selftests/bpf/prog_tests/verifier.c | 2 +
> .../testing/selftests/bpf/progs/arena_htab.c | 48 ++
> .../selftests/bpf/progs/arena_htab_asm.c | 5 +
> .../testing/selftests/bpf/progs/arena_list.c | 87 +++
> .../selftests/bpf/progs/verifier_arena.c | 146 +++++
> tools/testing/selftests/bpf/test_loader.c | 9 +-
> 37 files changed, 2028 insertions(+), 40 deletions(-)
> create mode 100644 kernel/bpf/arena.c
> create mode 100644 tools/testing/selftests/bpf/bpf_arena_alloc.h
> create mode 100644 tools/testing/selftests/bpf/bpf_arena_common.h
> create mode 100644 tools/testing/selftests/bpf/bpf_arena_htab.h
> create mode 100644 tools/testing/selftests/bpf/bpf_arena_list.h
> create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_htab.c
> create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_list.c
> create mode 100644 tools/testing/selftests/bpf/progs/arena_htab.c
> create mode 100644 tools/testing/selftests/bpf/progs/arena_htab_asm.c
> create mode 100644 tools/testing/selftests/bpf/progs/arena_list.c
> create mode 100644 tools/testing/selftests/bpf/progs/verifier_arena.c
>
> --
> 2.43.0
>
Besides a few comments on patch #1 (and maybe one or two potential
corner case issues I mentioned, which can be easily fixed), the series
looked good. So I've applied patches as is. I fixed typo ("varaibles")
in one of the commit subjects while applying.
Also, in one of the selftests you hard-coded PAGE_SIZE to 4096, which
isn't correct on some architectures, so please see how you can make it
not hard-coded (but still work for both bpf and user code). It seemed
minor enough to not delay patches (either way those architectures
don't support ARENA just yet).
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena.
2024-03-11 22:45 ` [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Andrii Nakryiko
@ 2024-03-11 23:02 ` Alexei Starovoitov
0 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2024-03-11 23:02 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Linus Torvalds,
Barret Rhoden, Johannes Weiner, Andrew Morton, Uladzislau Rezki,
Christoph Hellwig, linux-mm, Kernel Team
On Mon, Mar 11, 2024 at 3:45 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Mar 7, 2024 at 5:08 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> > [...]
>
> Besides a few comments on patch #1 (and maybe one or two potential
> corner case issues I mentioned, which can be easily fixed),
> the series
> looked good. So I've applied patches as is. I fixed typo ("varaibles")
> in one of the commit subjects while applying.
Thanks! That subj typo survived two months of reviews in v1,v2,v3 and
I swear I use ./scripts/checkpatch.pl --codespell all the time.
I guess it got drowned in all of the messages like:
WARNING: 'mmaped' may be misspelled - perhaps 'mapped'?
#356: FILE: tools/lib/bpf/libbpf.c:13666:
+ *mmaped = map->mmaped;
^^^^^^
WARNING: 'mmaped' may be misspelled - perhaps 'mapped'?
#356: FILE: tools/lib/bpf/libbpf.c:13666:
+ *mmaped = map->mmaped;
^^^^^^
> Also, in one of the selftests you hard-coded PAGE_SIZE to 4096, which
> isn't correct on some architectures, so please see how you can make it
> not hard-coded (but still work for both bpf and user code). It seemed
> minor enough to not delay patches (either way those architectures
> don't support ARENA just yet).
yes. It's on todo list already.
I've added #define PAGE_SIZE 4096 to the user space side of the bpf
selftests, because it's used in the bpf_arena_*.h code, which is
compiled twice: as a bpf prog (where it picks up PAGE_SIZE from
vmlinux.h) and as native code.
So the bpf side gets the correct PAGE_SIZE automatically as a nice
compile-time constant, but for user space there is no good PAGE_SIZE
constant to use.
Just doing
#define PAGE_SIZE sysconf(_SC_PAGE_SIZE)
produces inefficient code.
Hence I left it as a todo to figure out later.
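(One possible shape for that todo, sketched here with a hypothetical
page_size() helper that is not part of this series: cache the
sysconf() result so user space pays for the call only once.)

#include <unistd.h>

/* Hypothetical user-space helper: query the page size once and cache
 * it, instead of hard-coding 4096 or calling sysconf() on every use.
 */
static inline long page_size(void)
{
	static long cached;

	if (!cached)
		cached = sysconf(_SC_PAGE_SIZE);
	return cached;
}

That avoids a per-use sysconf() call, though it is still not the
compile-time constant the bpf side gets from vmlinux.h.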
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena.
2024-03-08 1:07 [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Alexei Starovoitov
` (14 preceding siblings ...)
2024-03-11 22:45 ` [PATCH v3 bpf-next 00/14] bpf: Introduce BPF arena Andrii Nakryiko
@ 2024-03-11 22:50 ` patchwork-bot+netdevbpf
15 siblings, 0 replies; 26+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-03-11 22:50 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: bpf, daniel, andrii, torvalds, brho, hannes, akpm, urezki, hch,
linux-mm, kernel-team
Hello:
This series was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:
On Thu, 7 Mar 2024 17:07:58 -0800 you wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> v2->v3:
> - contains bpf bits only, but cc-ing past audience for continuity
> - since prerequisite patches landed, this series focuses on the main
> functionality of bpf_arena.
> - adopted Andrii's approach to support arena in libbpf.
> - simplified LLVM support. Instead of two instructions it's now only one.
> - switched to cond_break (instead of open coded iters) in selftests
> - implemented several follow-ups that will be sent after this set
> . remember first IP and bpf insn that faulted in arena.
> report to user space via bpftool
> . copy paste and tweak glob_match() aka mini-regex as a selftests/bpf
> - see patch 1 for detailed description of bpf_arena
>
> [...]
Here is the summary with links:
- [v3,bpf-next,01/14] bpf: Introduce bpf_arena.
https://git.kernel.org/bpf/bpf-next/c/317460317a02
- [v3,bpf-next,02/14] bpf: Disasm support for addr_space_cast instruction.
https://git.kernel.org/bpf/bpf-next/c/667a86ad9b71
- [v3,bpf-next,03/14] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions.
https://git.kernel.org/bpf/bpf-next/c/2fe99eb0ccf2
- [v3,bpf-next,04/14] bpf: Add x86-64 JIT support for bpf_addr_space_cast instruction.
https://git.kernel.org/bpf/bpf-next/c/142fd4d2dcf5
- [v3,bpf-next,05/14] bpf: Recognize addr_space_cast instruction in the verifier.
https://git.kernel.org/bpf/bpf-next/c/6082b6c328b5
- [v3,bpf-next,06/14] bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA.
https://git.kernel.org/bpf/bpf-next/c/2edc3de6fb65
- [v3,bpf-next,07/14] libbpf: Add __arg_arena to bpf_helpers.h
https://git.kernel.org/bpf/bpf-next/c/4d2b56081c32
- [v3,bpf-next,08/14] libbpf: Add support for bpf_arena.
https://git.kernel.org/bpf/bpf-next/c/79ff13e99169
- [v3,bpf-next,09/14] bpftool: Recognize arena map type
https://git.kernel.org/bpf/bpf-next/c/eed512e8ac64
- [v3,bpf-next,10/14] libbpf: Recognize __arena global varaibles.
https://git.kernel.org/bpf/bpf-next/c/2e7ba4f8fd1f
- [v3,bpf-next,11/14] bpf: Add helper macro bpf_addr_space_cast()
https://git.kernel.org/bpf/bpf-next/c/204c628730c6
- [v3,bpf-next,12/14] selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
https://git.kernel.org/bpf/bpf-next/c/80a4129fcf20
- [v3,bpf-next,13/14] selftests/bpf: Add bpf_arena_list test.
https://git.kernel.org/bpf/bpf-next/c/9f2c156f90a4
- [v3,bpf-next,14/14] selftests/bpf: Add bpf_arena_htab test.
https://git.kernel.org/bpf/bpf-next/c/8df839ae23b8
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 26+ messages in thread