bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii@kernel.org>
To: <bpf@vger.kernel.org>
Cc: <linux-security-module@vger.kernel.org>, <keescook@chromium.org>,
	<brauner@kernel.org>, <lennart@poettering.net>,
	<cyphar@cyphar.com>, <luto@kernel.org>, <kernel-team@meta.com>,
	<sargun@sargun.me>
Subject: [PATCH RESEND v3 bpf-next 00/14] BPF token
Date: Wed, 28 Jun 2023 22:18:18 -0700	[thread overview]
Message-ID: <20230629051832.897119-1-andrii@kernel.org> (raw)

This patch set introduces new BPF object, BPF token, which allows to delegate
a subset of BPF functionality from privileged system-wide daemon (e.g.,
systemd or any other container manager) to a *trusted* unprivileged
application. Trust is the key here. This functionality is not about allowing
unconditional unprivileged BPF usage. Establishing trust, though, is
completely up to the discretion of respective privileged application that
would create a BPF token, as different production setups can and do achieve it
through a combination of different means (signing, LSM, code reviews, etc),
and it's undesirable and infeasible for kernel to enforce any particular way
of validating trustworthiness of particular process.

The main motivation for BPF token is a desire to enable containerized
BPF applications to be used together with user namespaces. This is currently
impossible, as CAP_BPF, required for BPF subsystem usage, cannot be namespaced
or sandboxed, as a general rule. E.g., tracing BPF programs, thanks to BPF
helpers like bpf_probe_read_kernel() and bpf_probe_read_user() can safely read
arbitrary memory, and it's impossible to ensure that they only read memory of
processes belonging to any given namespace. This means that it's impossible to
have namespace-aware CAP_BPF capability, and as such another mechanism to
allow safe usage of BPF functionality is necessary. BPF token and delegation
of it to a trusted unprivileged applications is such mechanism. Kernel makes
no assumption about what "trusted" constitutes in any particular case, and
it's up to specific privileged applications and their surrounding
infrastructure to decide that. What kernel provides is a set of APIs to create
and tune BPF token, and pass it around to privileged BPF commands that are
creating new BPF objects like BPF programs, BPF maps, etc.

Previous attempt at addressing this very same problem ([0]) attempted to
utilize authoritative LSM approach, but was conclusively rejected by upstream
LSM maintainers. BPF token concept is not changing anything about LSM
approach, but can be combined with LSM hooks for very fine-grained security
policy. Some ideas about making BPF token more convenient to use with LSM (in
particular custom BPF LSM programs) was briefly described in recent LSF/MM/BPF
2023 presentation ([1]). E.g., an ability to specify user-provided data
(context), which in combination with BPF LSM would allow implementing a very
dynamic and fine-granular custom security policies on top of BPF token. In the
interest of minimizing API surface area discussions this is going to be
added in follow up patches, as it's not essential to the fundamental concept
of delegatable BPF token.

It should be noted that BPF token is conceptually quite similar to the idea of
/dev/bpf device file, proposed by Song a while ago ([2]). The biggest
difference is the idea of using virtual anon_inode file to hold BPF token and
allowing multiple independent instances of them, each with its own set of
restrictions. BPF pinning solves the problem of exposing such BPF token
through file system (BPF FS, in this case) for cases where transferring FDs
over Unix domain sockets is not convenient. And also, crucially, BPF token
approach is not using any special stateful task-scoped flags. Instead, bpf()
syscall accepts token_fd parameters explicitly for each relevant BPF command.
This addresses main concerns brought up during the /dev/bpf discussion, and
fits better with overall BPF subsystem design.

This patch set adds a basic minimum of functionality to make BPF token useful
and to discuss API and functionality. Currently only low-level libbpf APIs
support passing BPF token around, allowing to test kernel functionality, but
for the most part is not sufficient for real-world applications, which
typically use high-level libbpf APIs based on `struct bpf_object` type. This
was done with the intent to limit the size of patch set and concentrate on
mostly kernel-side changes. All the necessary plumbing for libbpf will be sent
as a separate follow up patch set kernel support makes it upstream.

Another part that should happen once kernel-side BPF token is established, is
a set of conventions between applications (e.g., systemd), tools (e.g.,
bpftool), and libraries (e.g., libbpf) about sharing BPF tokens through BPF FS
at well-defined locations to allow applications take advantage of this in
automatic fashion without explicit code changes on BPF application's side.
But I'd like to postpone this discussion to after BPF token concept lands.

Once important distinctions from v2 that should be noted is a chance in the
semantics of a newly added BPF_TOKEN_CREATE command. Previously,
BPF_TOKEN_CREATE would create BPF token kernel object and return its FD to
user-space, allowing to (optionally) pin it in BPF FS using BPF_OBJ_PIN
command. This v3 version changes this slightly: BPF_TOKEN_CREATE combines BPF
token object creation *and* pinning in BPF FS. Such change ensures that BPF
token is always associated with a specific instance of BPF FS and cannot
"escape" it by application re-pinning it somewhere else using another
BPF_OBJ_PIN call. Now, BPF token can only be pinned once during its creation,
better containing it inside intended container (under assumption BPF FS is set
up in such a way as to not be shared with other containers on the system).

  [0] https://lore.kernel.org/bpf/20230412043300.360803-1-andrii@kernel.org/
  [1] http://vger.kernel.org/bpfconf2023_material/Trusted_unprivileged_BPF_LSFMM2023.pdf
  [2] https://lore.kernel.org/bpf/20190627201923.2589391-2-songliubraving@fb.com/

v3->v3-resend:
  - I started integrating token_fd into bpf_object_open_opts and higher-level
    libbpf bpf_object APIs, but it started going a bit deeper into bpf_object
    implementation details and how libbpf performs feature detection and
    caching, so I decided to keep it separate from this patch set and not
    distract from the mostly kernel-side changes;
v2->v3:
  - make BPF_TOKEN_CREATE pin created BPF token in BPF FS, and disallow
    BPF_OBJ_PIN for BPF token;
v1->v2:
  - fix build failures on Kconfig with CONFIG_BPF_SYSCALL unset;
  - drop BPF_F_TOKEN_UNKNOWN_* flags and simplify UAPI (Stanislav).

Andrii Nakryiko (14):
  bpf: introduce BPF token object
  libbpf: add bpf_token_create() API
  selftests/bpf: add BPF_TOKEN_CREATE test
  bpf: add BPF token support to BPF_MAP_CREATE command
  libbpf: add BPF token support to bpf_map_create() API
  selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command
  bpf: add BPF token support to BPF_BTF_LOAD command
  libbpf: add BPF token support to bpf_btf_load() API
  selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest
  bpf: add BPF token support to BPF_PROG_LOAD command
  bpf: take into account BPF token when fetching helper protos
  bpf: consistenly use BPF token throughout BPF verifier logic
  libbpf: add BPF token support to bpf_prog_load() API
  selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests

 drivers/media/rc/bpf-lirc.c                   |   2 +-
 include/linux/bpf.h                           |  79 ++++-
 include/linux/filter.h                        |   2 +-
 include/uapi/linux/bpf.h                      |  53 ++++
 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/arraymap.c                         |   2 +-
 kernel/bpf/cgroup.c                           |   6 +-
 kernel/bpf/core.c                             |   3 +-
 kernel/bpf/helpers.c                          |   6 +-
 kernel/bpf/inode.c                            |  46 ++-
 kernel/bpf/syscall.c                          | 183 +++++++++---
 kernel/bpf/token.c                            | 201 +++++++++++++
 kernel/bpf/verifier.c                         |  13 +-
 kernel/trace/bpf_trace.c                      |   2 +-
 net/core/filter.c                             |  36 +--
 net/ipv4/bpf_tcp_ca.c                         |   2 +-
 net/netfilter/nf_bpf_link.c                   |   2 +-
 tools/include/uapi/linux/bpf.h                |  53 ++++
 tools/lib/bpf/bpf.c                           |  35 ++-
 tools/lib/bpf/bpf.h                           |  45 ++-
 tools/lib/bpf/libbpf.map                      |   1 +
 .../selftests/bpf/prog_tests/libbpf_probes.c  |   4 +
 .../selftests/bpf/prog_tests/libbpf_str.c     |   6 +
 .../testing/selftests/bpf/prog_tests/token.c  | 277 ++++++++++++++++++
 24 files changed, 957 insertions(+), 104 deletions(-)
 create mode 100644 kernel/bpf/token.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/token.c

-- 
2.34.1


             reply	other threads:[~2023-06-29  5:18 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-29  5:18 Andrii Nakryiko [this message]
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 01/14] bpf: introduce BPF token object Andrii Nakryiko
2023-07-04 12:43   ` Christian Brauner
2023-07-04 13:34     ` Christian Brauner
2023-07-04 23:28     ` Toke Høiland-Jørgensen
2023-07-05  7:20       ` Daniel Borkmann
2023-07-05  8:45         ` Christian Brauner
2023-07-05 12:34           ` Toke Høiland-Jørgensen
2023-07-05 14:16     ` Paul Moore
2023-07-05 14:42       ` Christian Brauner
2023-07-05 16:00         ` Paul Moore
2023-07-05 21:38         ` Andrii Nakryiko
2023-07-06 11:32           ` Toke Høiland-Jørgensen
2023-07-06 20:37             ` Andrii Nakryiko
2023-07-07 13:04               ` Toke Høiland-Jørgensen
2023-07-07 17:58                 ` Andrii Nakryiko
2023-07-07 22:00                   ` Toke Høiland-Jørgensen
2023-07-07 23:58                     ` Andrii Nakryiko
2023-07-10 23:42                       ` Djalal Harouni
2023-07-11 13:33           ` Christian Brauner
2023-07-11 22:06             ` Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 02/14] libbpf: add bpf_token_create() API Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 03/14] selftests/bpf: add BPF_TOKEN_CREATE test Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 04/14] bpf: add BPF token support to BPF_MAP_CREATE command Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 05/14] libbpf: add BPF token support to bpf_map_create() API Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 06/14] selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 07/14] bpf: add BPF token support to BPF_BTF_LOAD command Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 08/14] libbpf: add BPF token support to bpf_btf_load() API Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 09/14] selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 10/14] bpf: add BPF token support to BPF_PROG_LOAD command Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 11/14] bpf: take into account BPF token when fetching helper protos Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 12/14] bpf: consistenly use BPF token throughout BPF verifier logic Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 13/14] libbpf: add BPF token support to bpf_prog_load() API Andrii Nakryiko
2023-06-29  5:18 ` [PATCH RESEND v3 bpf-next 14/14] selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests Andrii Nakryiko
2023-06-29 23:15 ` [PATCH RESEND v3 bpf-next 00/14] BPF token Toke Høiland-Jørgensen
2023-06-30 18:25   ` Andrii Nakryiko
2023-07-04  9:38     ` Christian Brauner
2023-07-04 23:20     ` Toke Høiland-Jørgensen
2023-07-05 12:57       ` Stefano Brivio
2023-07-02  6:59   ` Djalal Harouni
2023-07-04  9:51   ` Christian Brauner
2023-07-04 23:33     ` Toke Høiland-Jørgensen
2023-07-05 20:39     ` Andrii Nakryiko
2023-07-01  2:05 ` Yafang Shao
2023-07-05 20:37   ` Andrii Nakryiko
2023-07-06  1:26     ` Yafang Shao
2023-07-06 20:34       ` Andrii Nakryiko
2023-07-07  1:42         ` Yafang Shao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230629051832.897119-1-andrii@kernel.org \
    --to=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=cyphar@cyphar.com \
    --cc=keescook@chromium.org \
    --cc=kernel-team@meta.com \
    --cc=lennart@poettering.net \
    --cc=linux-security-module@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=sargun@sargun.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).