From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Dave Marchevsky <davemarchevsky@meta.com>,
	Delyan Kratunov <delyank@meta.com>
Subject: [PATCH bpf-next v3 00/24] Local kptrs, BPF linked lists
Date: Thu,  3 Nov 2022 01:56:34 +0530
Message-ID: <20221102202658.963008-1-memxor@gmail.com>

This series introduces user defined BPF objects through the idea of
local kptrs. These are kptrs (strongly typed pointers) that refer to
objects of a user defined type, hence called "local" kptrs. This allows
BPF programs to allocate their own objects, build their own object
hierarchies, and use the basic building blocks provided by BPF runtime
to build their own data structures flexibly.

Then, we introduce support for single ownership BPF linked lists,
which can be put inside BPF maps or local kptrs, and hold such
allocated local kptrs as elements. The list works as an intrusive
collection, so that local kptrs can become part of multiple data
structures at the same time in the future.
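
To illustrate what "intrusive" and "single ownership" mean here, the
following is a minimal userspace model in plain C. It is not the BPF API
from this series: the names list_node, list_head, elem, list_init,
list_push_front and list_pop_front are hypothetical, and the ownership
rule is checked at runtime with a flag, whereas the series enforces it
statically in the verifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Link embedded in each element; the list itself never allocates. */
struct list_node {
	struct list_node *next;
	struct list_node *prev;
	bool owned; /* true while some list owns the enclosing object */
};

/* Circular list with a sentinel node. */
struct list_head {
	struct list_node root;
};

/* Recover the enclosing object from a pointer to its embedded node. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical user defined element type. */
struct elem {
	int data;
	struct list_node node;
};

static void list_init(struct list_head *h)
{
	assert(h != NULL);
	h->root.next = &h->root;
	h->root.prev = &h->root;
	h->root.owned = false;
}

/* Transfers ownership of n to h; fails if n is already owned by a list. */
static bool list_push_front(struct list_head *h, struct list_node *n)
{
	if (n->owned)
		return false;
	n->next = h->root.next;
	n->prev = &h->root;
	h->root.next->prev = n;
	h->root.next = n;
	n->owned = true;
	return true;
}

/* Transfers ownership of the first node back to the caller, or NULL. */
static struct list_node *list_pop_front(struct list_head *h)
{
	struct list_node *n = h->root.next;

	if (n == &h->root)
		return NULL;
	h->root.next = n->next;
	n->next->prev = &h->root;
	n->next = NULL;
	n->prev = NULL;
	n->owned = false;
	return n;
}
```

Because the node lives inside the element, an element type could embed
more than one node and so sit in more than one structure; the model
above keeps a single node for brevity.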

The eventual goal of this and future patches is to allow one to do some
limited form of kernel style programming in BPF C, and allow programmers
to build their own complex data structures flexibly out of basic
building blocks.

The key difference will be that such programs are verified to be safe,
preserve runtime integrity of the system, and are proven to be bug free
as far as the invariants of BPF specific APIs are concerned.

One immediate use case that will use the entire infrastructure
introduced by this series is managing percpu NMI safe linked lists
inside BPF programs.

The other use case this will serve in the near future is linking
kernel structures like XDP frame and sk_buff directly into user data
structures (rbtree, pifomap, etc.) for packet queueing. This will follow
the single ownership concept included in this series.

The user has complete control of the internal locking, and hence also
the batching of operations for each critical section.
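
As a rough illustration of why control over locking matters for
batching, here is a userspace sketch in plain C11. It is not the BPF
spin lock or list API: spin_lock, locked_list, push_one, push_batch and
the lock_acquisitions counter are all hypothetical names, and the
counter exists only to make the difference visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy spin lock built on a C11 atomic flag (illustration only). */
struct spin_lock {
	atomic_flag held;
};

struct node {
	struct node *next;
	int data;
};

struct locked_list {
	struct spin_lock lock;
	struct node *first;
	unsigned long lock_acquisitions; /* instrumentation for the example */
};

static void locked_list_init(struct locked_list *l)
{
	assert(l != NULL);
	atomic_flag_clear(&l->lock.held);
	l->first = NULL;
	l->lock_acquisitions = 0;
}

static void list_lock(struct locked_list *l)
{
	while (atomic_flag_test_and_set_explicit(&l->lock.held,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
	l->lock_acquisitions++; /* updated while holding the lock */
}

static void list_unlock(struct locked_list *l)
{
	atomic_flag_clear_explicit(&l->lock.held, memory_order_release);
}

/* One critical section per element. */
static void push_one(struct locked_list *l, struct node *n)
{
	list_lock(l);
	n->next = l->first;
	l->first = n;
	list_unlock(l);
}

/* One critical section for the whole batch of count elements. */
static void push_batch(struct locked_list *l, struct node *nodes, size_t count)
{
	list_lock(l);
	for (size_t i = 0; i < count; i++) {
		nodes[i].next = l->first;
		l->first = &nodes[i];
	}
	list_unlock(l);
}

/* Counts elements under the lock. */
static size_t list_len(struct locked_list *l)
{
	size_t len = 0;

	list_lock(l);
	for (struct node *n = l->first; n != NULL; n = n->next)
		len++;
	list_unlock(l);
	return len;
}
```

Pushing three elements with push_batch takes the lock once, whereas
three push_one calls would take it three times; the caller chooses the
granularity.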

The features are:
- Local kptrs - User defined kernel objects.
- bpf_obj_new, bpf_obj_drop to allocate and free them.
- Single ownership BPF linked lists.
  - Support for them in BPF maps.
  - Support for them in local kptrs.
- Global spin locks.
- Spin locks inside local kptrs.

Some other notable things:
- Completely static verification of locking.
- Kfunc argument handling has been completely reworked.
- Argument rewriting support for kfuncs.
- Search pruning now understands non-size precise registers.
- A new bpf_experimental.h header as a dumping ground for these APIs.

Any functionality exposed in this series is NOT part of UAPI. It is only
available through kfuncs, and structs that can be added to map values may
also change their size or name in the future. Hence, every feature in this
series must be considered experimental.

Follow-ups:
-----------
 * Support for kptrs (local and kernel) in local storage and percpu maps + kptr tests
 * Fixes for helper access checks rebasing on top of this series

Next steps:
-----------
 * NMI safe percpu single ownership linked lists (using local_t protection).
 * Lockless linked lists.
 * Allow RCU protected local kptrs. This then allows RCU protected list lookups,
   since spinlock protection for readers does not scale.
 * Introduce explicit RCU read sections (using kfuncs).
 * Introduce bpf_refcount for local kptrs, shared ownership.
 * Introduce shared ownership linked lists.
 * Documentation.

Changelog:
----------
 v2 -> v3
 v2: https://lore.kernel.org/bpf/20221013062303.896469-1-memxor@gmail.com

  * Add ack from Dave for patch 5
  * Rename btf_type_fields -> btf_record, btf_type_fields_off ->
    btf_field_offs, rename functions similarly (Alexei)
  * Remove 'kind' component from contains declaration tag (Alexei)
  * Move bpf_list_head, bpf_list_node definitions to UAPI bpf.h (Alexei)
  * Add note in commit log about modifying btf_struct_access API (Dave)
  * Downgrade WARN_ON_ONCE to verbose(env, "...") and return -EFAULT (Dave)
  * Add type_is_local_kptr wrapper to avoid noisy checks (Dave)
  * Remove unused flags parameter from bpf_kptr_new (Alexei)
  * Reword comment in ref_obj_id_set_release_on_unlock (Dave)
  * Fix return type of ref_obj_id_set_release_on_unlock (Dave)
  * Introduce is_bpf_list_api_kfunc to dedup checks (Dave)
  * Disallow BPF_WRITE to untrusted local kptrs
  * Add details about soundness of check_reg_allocation_locked logic
  * List untrusted local kptrs for PROBE_MEM handling

 v1 -> v2
 v1: https://lore.kernel.org/bpf/20221011012240.3149-1-memxor@gmail.com

  * Rebase on bpf-next to resolve merge conflict in DENYLIST.s390x
  * Fix a couple of mental lapses in bpf_list_head_free

 RFC v1 -> v1
 RFC v1: https://lore.kernel.org/bpf/20220904204145.3089-1-memxor@gmail.com

  * Mostly a complete rewrite of BTF parsing, refactor existing code (Kartikeya)
  * Rebase kfunc rewrite for bpf-next, add support for more changes
  * Cache type metadata in BTF to avoid recomputation inside verifier (Kartikeya)
  * Remove __kernel tag, make things similar to map values, reserve bpf_ prefix
  * Rename bpf_kptr_new -> bpf_obj_new, bpf_kptr_drop -> bpf_obj_drop (Alexei)
  * Rename precision state enum values (Alexei)
  * Drop explicit constructor/destructor support (Alexei)
  * Rewrite code for constructing/destructing objects and offload to runtime
  * Minimize duplication in bpf_map_value_off_desc handling (Alexei)
  * Expose global memory allocator (Alexei)
  * Address other nits from Alexei
  * Split out local kptrs in maps, more kptrs in maps support into a follow up

Links:
------
 * Dave's BPF RB-Tree RFC series
   v1 (Discussion thread)
     https://lore.kernel.org/bpf/20220722183438.3319790-1-davemarchevsky@fb.com
   v2 (With support for static locks)
     https://lore.kernel.org/bpf/20220830172759.4069786-1-davemarchevsky@fb.com
 * BPF Linked Lists Discussion
   https://lore.kernel.org/bpf/CAP01T74U30+yeBHEgmgzTJ-XYxZ0zj71kqCDJtTH9YQNfTK+Xw@mail.gmail.com
 * BPF Memory Allocator from Alexei
   https://lore.kernel.org/bpf/20220902211058.60789-1-alexei.starovoitov@gmail.com
 * BPF Memory Allocator UAPI Discussion
   https://lore.kernel.org/bpf/d3f76b27f4e55ec9e400ae8dcaecbb702a4932e8.camel@fb.com

Kumar Kartikeya Dwivedi (24):
  bpf: Document UAPI details for special BPF types
  bpf: Allow specifying volatile type modifier for kptrs
  bpf: Clobber stack slot when writing over spilled PTR_TO_BTF_ID
  bpf: Fix slot type check in check_stack_write_var_off
  bpf: Drop reg_type_may_be_refcounted_or_null
  bpf: Refactor kptr_off_tab into btf_record
  bpf: Consolidate spin_lock, timer management into btf_record
  bpf: Refactor map->off_arr handling
  bpf: Support bpf_list_head in map values
  bpf: Introduce local kptrs
  bpf: Recognize bpf_{spin_lock,list_head,list_node} in local kptrs
  bpf: Verify ownership relationships for user BTF types
  bpf: Support locking bpf_spin_lock in local kptr
  bpf: Allow locking bpf_spin_lock global variables
  bpf: Rewrite kfunc argument handling
  bpf: Drop kfunc bits from btf_check_func_arg_match
  bpf: Support constant scalar arguments for kfuncs
  bpf: Teach verifier about non-size constant arguments
  bpf: Introduce bpf_obj_new
  bpf: Introduce bpf_obj_drop
  bpf: Permit NULL checking pointer with non-zero fixed offset
  bpf: Introduce single ownership BPF linked list API
  selftests/bpf: Add __contains macro to bpf_experimental.h
  selftests/bpf: Add BPF linked list API tests

 Documentation/bpf/bpf_design_QA.rst           |   44 +
 Documentation/bpf/kfuncs.rst                  |   30 +
 include/linux/bpf.h                           |  250 ++-
 include/linux/bpf_verifier.h                  |   22 +-
 include/linux/btf.h                           |   77 +-
 include/linux/filter.h                        |    8 +-
 include/uapi/linux/bpf.h                      |   10 +
 kernel/bpf/arraymap.c                         |   30 +-
 kernel/bpf/bpf_local_storage.c                |    2 +-
 kernel/bpf/btf.c                              | 1205 +++++++------
 kernel/bpf/core.c                             |   14 +
 kernel/bpf/hashtab.c                          |   38 +-
 kernel/bpf/helpers.c                          |  141 +-
 kernel/bpf/local_storage.c                    |    2 +-
 kernel/bpf/map_in_map.c                       |   19 +-
 kernel/bpf/syscall.c                          |  401 +++--
 kernel/bpf/verifier.c                         | 1504 ++++++++++++++---
 net/bpf/bpf_dummy_struct_ops.c                |   14 +-
 net/core/bpf_sk_storage.c                     |    4 +-
 net/core/filter.c                             |   34 +-
 net/ipv4/bpf_tcp_ca.c                         |   13 +-
 net/netfilter/nf_conntrack_bpf.c              |   17 +-
 tools/include/uapi/linux/bpf.h                |   10 +
 tools/testing/selftests/bpf/DENYLIST.s390x    |    1 +
 .../testing/selftests/bpf/bpf_experimental.h  |   63 +
 .../bpf/prog_tests/kfunc_dynptr_param.c       |    2 +-
 .../selftests/bpf/prog_tests/linked_list.c    |   79 +
 .../testing/selftests/bpf/progs/linked_list.c |  330 ++++
 tools/testing/selftests/bpf/verifier/calls.c  |    4 +-
 .../selftests/bpf/verifier/ref_tracking.c     |    4 +-
 30 files changed, 3184 insertions(+), 1188 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/bpf_experimental.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/linked_list.c
 create mode 100644 tools/testing/selftests/bpf/progs/linked_list.c


base-commit: 3a07dcf8f57b9a90b1c07df3e9091fd04baa3036
-- 
2.38.1
