From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Dave Marchevsky <davemarchevsky@fb.com>,
	Delyan Kratunov <delyank@fb.com>
Subject: [PATCH RFC bpf-next v1 00/32] Local kptrs, BPF linked lists
Date: Sun,  4 Sep 2022 22:41:13 +0200
Message-ID: <20220904204145.3089-1-memxor@gmail.com>

WARNING: This is an RFC. WARN_ON_ONCE is sprinkled liberally around the code
(useful while working on this stuff). I'll do a thorough pass and clean all
that up before sending out non-RFC v1.

TODO before non-RFC v1:
 - A lot more corner case tests, failure tests, more tests for the new local
   kptr support. I did test the basic stuff (which the verifier complained
   about when writing linked_list.c).
 - More tests for kptr support in new map types.
 - More self review.

--

This series introduces user defined BPF objects, by introducing the idea of
local kptrs. These are kptrs (strongly typed pointers) that refer to objects of
a user defined type, hence called "local" kptrs. This allows BPF programs to
allocate their own objects, build their own object hierarchies, and use the
basic building blocks provided by BPF runtime to build their own data structures
flexibly.

Then, we introduce support for single ownership BPF linked lists, which can
be put inside BPF maps, or local kptrs, and hold such allocated local kptrs as
elements. It works as an intrusive collection, which is done to allow making
local kptrs part of multiple data structures at the same time in the future.
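
To make "intrusive" and "single ownership" concrete, here is a minimal sketch
in plain userspace C. It is not the API added by this series: every name in it
(list_node, list_head, item, list_push_front, list_pop_front) is hypothetical,
and ownership is only a convention in the comments, where the series has the
verifier enforce it. The node is embedded in the element, so linking needs no
extra allocation, and pushing hands the element to the list while popping hands
it back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration only; not the BPF linked list API. */
struct list_node {
	struct list_node *next;
	struct list_node *prev;
};

struct list_head {
	struct list_node root; /* circular sentinel, never an element */
};

/* The element embeds its node: this is what makes the list intrusive. */
struct item {
	int value;
	struct list_node node;
};

/* Recover the enclosing item from a pointer to its embedded node. */
#define item_of(ptr) \
	((struct item *)((char *)(ptr) - offsetof(struct item, node)))

static void list_init(struct list_head *h)
{
	h->root.next = &h->root;
	h->root.prev = &h->root;
}

/* Ownership of 'it' moves to the list; the caller must not free it. */
static void list_push_front(struct list_head *h, struct item *it)
{
	struct list_node *n;

	assert(it != NULL);
	n = &it->node;
	n->next = h->root.next;
	n->prev = &h->root;
	h->root.next->prev = n;
	h->root.next = n;
}

/* Ownership moves back to the caller; returns NULL if the list is empty. */
static struct item *list_pop_front(struct list_head *h)
{
	struct list_node *n = h->root.next;

	if (n == &h->root)
		return NULL;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = NULL;
	n->prev = NULL;
	return item_of(n);
}
```

Embedding a second list_node in struct item would let the same element sit on
a second list, which is the future direction mentioned above.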

The eventual goal of this and future patches is to allow one to do some limited
form of kernel style programming in BPF C, and allow programmers to build their
own complex data structures flexibly out of basic building blocks.

The key difference will be that such programs are verified to be safe, preserve
runtime integrity of the system, and are proven to be bug free as far as the
invariants of BPF specific APIs are concerned.

One immediate use case that will be using the entire infrastructure this series
is introducing will be managing percpu NMI safe linked lists inside BPF
programs.

The other use case this will serve in the near future will be linking kernel
structures like XDP frame and sk_buff directly into user data structures
(rbtree, pifomap, etc.) for packet queueing. This will follow the single
ownership concept included in this series.

The user has complete control of the internal locking, and hence also the
batching of operations for each critical section.

Eventually, with some more support in future patches, users will be able to
write a fully concurrent RCU protected hash table using BPF_MAP_TYPE_ARRAY for
buckets and embed BPF linked lists in these buckets. All of this will be
possible in safe BPF C, which will be proven for runtime safety by the BPF
verifier.
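
As a rough picture of that layout, the following plain userspace C sketch uses
a fixed array of buckets with an intrusive chain hanging off each one. It is
single threaded and has no locking or RCU, so it shows only the data layout,
not the concurrency. All names (hnode, bucket, entry, table_insert,
table_lookup, NR_BUCKETS) are hypothetical and the modulo hash is a
placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_BUCKETS 8 /* arbitrary size for the sketch */

/* Hypothetical intrusive chain node; not a type from this series. */
struct hnode {
	struct hnode *next;
};

/* One array slot per bucket. A per-bucket lock would sit next to 'first'. */
struct bucket {
	struct hnode *first;
};

/* Hash table entry with the chain node embedded in it. */
struct entry {
	uint32_t key;
	uint32_t value;
	struct hnode node;
};

/* Stand-in for the array map that would hold the buckets. */
static struct bucket table[NR_BUCKETS];

static struct bucket *bucket_for(uint32_t key)
{
	return &table[key % NR_BUCKETS]; /* placeholder hash */
}

/* Link the entry at the front of its bucket's chain. */
static void table_insert(struct entry *e)
{
	struct bucket *b = bucket_for(e->key);

	e->node.next = b->first;
	b->first = &e->node;
}

/* Walk the bucket's chain and return the matching entry, or NULL. */
static struct entry *table_lookup(uint32_t key)
{
	struct hnode *n;

	for (n = bucket_for(key)->first; n != NULL; n = n->next) {
		struct entry *e = (struct entry *)
			((char *)n - offsetof(struct entry, node));

		if (e->key == key)
			return e;
	}
	return NULL;
}
```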

The features, core infrastructure, and other improvements in this set are:
- Allow storing kptrs in local storage and percpu maps.
- Local kptrs - User defined kernel objects.
- bpf_kptr_alloc, bpf_kptr_free to allocate and free them.
- BPF memory object model, similar to what the C and C++ abstract machines
  have. The verifier now reasons about an object's lifetime, i.e. the concepts
  of object lifetime, visibility, construction, and destruction are reified.
  The separation of storage and object lifetime is understood by the verifier.
- Single ownership BPF linked lists.
  - Support for them in BPF maps.
  - Support for them in local kptrs.
- Global spin locks.
- Spin locks inside local kptrs.
- Allow storing local kptrs in all BPF maps with support for kernel kptrs.
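
The separation of storage and object lifetime can be pictured with the plain
userspace C sketch below. It is not how the verifier implements the model: the
series tracks this statically at verification time, while the sketch uses a
runtime state field and assertions as a stand-in. All names (obj_state,
counter_obj, obj_alloc, obj_construct, obj_destruct, obj_free) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Runtime stand-in for state a static checker would track per pointer. */
enum obj_state {
	OBJ_STORAGE_ONLY, /* memory exists, object not yet constructed */
	OBJ_CONSTRUCTED,  /* fields initialized, object is usable */
	OBJ_DESTRUCTED,   /* fields torn down, storage still allocated */
};

struct counter_obj {
	enum obj_state state;
	int count;
};

/* Storage lifetime begins here; the object lifetime has not. */
static struct counter_obj *obj_alloc(void)
{
	struct counter_obj *o = malloc(sizeof(*o));

	if (o)
		o->state = OBJ_STORAGE_ONLY;
	return o;
}

/* Object lifetime begins once the fields are initialized. */
static void obj_construct(struct counter_obj *o)
{
	assert(o->state == OBJ_STORAGE_ONLY);
	o->count = 0;
	o->state = OBJ_CONSTRUCTED;
}

static bool obj_usable(const struct counter_obj *o)
{
	return o->state == OBJ_CONSTRUCTED;
}

/* Object lifetime ends; the storage is still owned by the caller. */
static void obj_destruct(struct counter_obj *o)
{
	assert(o->state == OBJ_CONSTRUCTED);
	o->state = OBJ_DESTRUCTED;
}

/* Storage lifetime ends; only legal once no live object occupies it. */
static void obj_free(struct counter_obj *o)
{
	assert(o->state != OBJ_CONSTRUCTED);
	free(o);
}
```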

Some other notable things:
- Completely static verification of locking.
- Kfunc argument handling has been completely reworked.
- Argument rewriting support for kfuncs.
  Now we can also support inlining a block of BPF insns for certain kfuncs.
- Iteration over all registers in verifier state has a new lambda based
  iterator (and can be nifty or crazy - depending on your love for GNU C).
- Search pruning now understands non-size precise registers.
- A new bpf_experimental.h header as a dumping ground for these APIs.

Any functionality exposed in this series is **NOT** part of UAPI. It is only
available through use of kfuncs, and structs that can be added to map value may
also change their size or name in the future. Hence, every feature in this
series must be considered **EXPERIMENTAL**.

Next steps:
-----------
 * NMI safe percpu single ownership linked lists (using local_t protection).
   - This enables the open coded freelist use case.
 * Lockless linked lists.
 * Allow RCU protected local kptrs. This then allows RCU protected list lookups,
   since spinlock protection for readers does not scale.
 * Introduce explicit RCU read sections (using kfuncs).
 * Introduce bpf_refcount for local kptrs, shared ownership.
 * Introduce shared ownership linked lists.
 * Documentation.

Notes:
------
 * Delyan's work to expose Alexei's BPF memory allocator as a global allocator
   is still needed before this can be merged. For now, direct kmalloc and
   kfree are used.

Links:
------
 * Dave's BPF RB-Tree RFC series
   v1 (Discussion thread)
     https://lore.kernel.org/bpf/20220722183438.3319790-1-davemarchevsky@fb.com
   v2 (With support for static locks)
     https://lore.kernel.org/bpf/20220830172759.4069786-1-davemarchevsky@fb.com
 * BPF Linked Lists Discussion
   https://lore.kernel.org/bpf/CAP01T74U30+yeBHEgmgzTJ-XYxZ0zj71kqCDJtTH9YQNfTK+Xw@mail.gmail.com
 * BPF Memory Allocator from Alexei
   https://lore.kernel.org/bpf/20220902211058.60789-1-alexei.starovoitov@gmail.com
 * BPF Memory Allocator UAPI Discussion
   https://lore.kernel.org/bpf/d3f76b27f4e55ec9e400ae8dcaecbb702a4932e8.camel@fb.com

Daniel Xu (1):
  bpf: Remove duplicate PTR_TO_BTF_ID RO check

Dave Marchevsky (1):
  libbpf: Add support for private BSS map section

Kumar Kartikeya Dwivedi (30):
  bpf: Add copy_map_value_long to copy to remote percpu memory
  bpf: Support kptrs in percpu arraymap
  bpf: Add zero_map_value to zero map value with special fields
  bpf: Support kptrs in percpu hashmap and percpu LRU hashmap
  bpf: Support kptrs in local storage maps
  bpf: Annotate data races in bpf_local_storage
  bpf: Allow specifying volatile type modifier for kptrs
  bpf: Add comment about kptr's PTR_TO_MAP_VALUE handling
  bpf: Rewrite kfunc argument handling
  bpf: Drop kfunc support from btf_check_func_arg_match
  bpf: Support constant scalar arguments for kfuncs
  bpf: Teach verifier about non-size constant arguments
  bpf: Introduce bpf_list_head support for BPF maps
  bpf: Introduce bpf_kptr_alloc helper
  bpf: Add helper macro bpf_expr_for_each_reg_in_vstate
  bpf: Introduce BPF memory object model
  bpf: Support bpf_list_node in local kptrs
  bpf: Support bpf_spin_lock in local kptrs
  bpf: Support bpf_list_head in local kptrs
  bpf: Introduce bpf_kptr_free helper
  bpf: Allow locking bpf_spin_lock global variables
  bpf: Bump BTF_KFUNC_SET_MAX_CNT
  bpf: Add single ownership BPF linked list API
  bpf: Permit NULL checking pointer with non-zero fixed offset
  bpf: Allow storing local kptrs in BPF maps
  bpf: Wire up freeing of bpf_list_heads in maps
  bpf: Add destructor for bpf_list_head in local kptr
  selftests/bpf: Add BTF tag macros for local kptrs, BPF linked lists
  selftests/bpf: Add BPF linked list API tests
  selftests/bpf: Add referenced local kptr tests

 Documentation/bpf/kfuncs.rst                  |   30 +
 include/linux/bpf.h                           |  177 +-
 include/linux/bpf_local_storage.h             |    2 +-
 include/linux/bpf_verifier.h                  |   77 +-
 include/linux/btf.h                           |   76 +-
 include/linux/poison.h                        |    3 +
 kernel/bpf/arraymap.c                         |   43 +-
 kernel/bpf/bpf_local_storage.c                |   53 +-
 kernel/bpf/btf.c                              |  727 +++---
 kernel/bpf/hashtab.c                          |   91 +-
 kernel/bpf/helpers.c                          |  137 +-
 kernel/bpf/map_in_map.c                       |    5 +-
 kernel/bpf/syscall.c                          |  231 +-
 kernel/bpf/verifier.c                         | 2084 ++++++++++++++---
 net/bpf/bpf_dummy_struct_ops.c                |    5 +-
 net/ipv4/bpf_tcp_ca.c                         |    5 +-
 tools/lib/bpf/libbpf.c                        |   65 +-
 .../testing/selftests/bpf/bpf_experimental.h  |  120 +
 .../selftests/bpf/prog_tests/linked_list.c    |   88 +
 .../selftests/bpf/prog_tests/map_kptr.c       |    2 +-
 .../testing/selftests/bpf/progs/linked_list.c |  347 +++
 tools/testing/selftests/bpf/progs/map_kptr.c  |   38 +
 tools/testing/selftests/bpf/verifier/calls.c  |    2 +-
 23 files changed, 3626 insertions(+), 782 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/bpf_experimental.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/linked_list.c
 create mode 100644 tools/testing/selftests/bpf/progs/linked_list.c

-- 
2.34.1

