[PATCH bpf-next v3 00/10] Add cgroup sockaddr hooks for unix sockets

From: Daan De Meyer @ 2023-04-21 16:27 UTC
  To: bpf; +Cc: Daan De Meyer, martin.lau, kernel-team

Changes since v2:

* Configuring the sock addr is now done via a new kfunc bpf_sock_addr_set()
* The addrlen is exposed as u32 in bpf_sock_addr_kern
* Selftests are updated to use the new kfunc and access the sockaddr via
CO-RE
* Added BTF_KFUNC_HOOK_SOCK_ADDR for BPF_PROG_TYPE_CGROUP_SOCK_ADDR
* __cgroup_bpf_run_filter_sock_addr() now returns the modified addrlen

Changes since v1:

* Split into multiple patches instead of one single patch
* Added unix support for all socket address hooks instead of only connect()
* Switched to exposing the socket address length to the bpf hook instead of
recalculating it in kernel space, to properly support abstract unix socket
addresses
* Modified socket address hook tests to calculate the socket address length
once and pass it around everywhere instead of recalculating the actual unix
socket address length on demand.
* Added some missing section name tests for getpeername()/getsockname()
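
To illustrate why the address length has to be carried alongside the address
rather than recalculated, here is a minimal userspace sketch (not code from
this series). It builds a pathname address and an abstract address and returns
the length for each. The helper names unix_path_addr() and
unix_abstract_addr() are hypothetical. An abstract address starts with a NUL
byte, so strlen() on sun_path reports 0 and cannot recover the length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: fill *addr with a pathname unix address.
 * Returns the address length (header + path + terminating NUL),
 * or 0 if the path is empty or does not fit in sun_path.
 */
socklen_t unix_path_addr(struct sockaddr_un *addr, const char *path)
{
	size_t len;

	assert(addr && path);
	len = strlen(path);
	if (len == 0 || len >= sizeof(addr->sun_path))
		return 0;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	memcpy(addr->sun_path, path, len + 1);
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + len + 1);
}

/*
 * Hypothetical helper: fill *addr with an abstract unix address.
 * sun_path[0] is NUL and the name bytes follow; no terminator is
 * counted, so only the returned length says where the name ends.
 * Returns 0 if the name is empty or does not fit.
 */
socklen_t unix_abstract_addr(struct sockaddr_un *addr, const void *name,
			     size_t name_len)
{
	assert(addr && name);
	if (name_len == 0 || name_len > sizeof(addr->sun_path) - 1)
		return 0;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	memcpy(addr->sun_path + 1, name, name_len);
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + name_len);
}
```

A caller would pass the returned length straight to bind() or connect()
together with the address, which is the same "calculate once and pass it
around" pattern the updated tests follow.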

This patch series extends the cgroup sockaddr hooks to include support for
unix sockets. To add support for unix sockets, struct bpf_sock_addr is
extended to expose the unix socket path (sun_path) and the socket address
length to the bpf program. For unix sockets, the address length is writable;
for the other socket address hook types, it is read-only.

I intend to use these new hooks in systemd to reimplement the LogNamespace=
feature, which allows running multiple instances of systemd-journald to
process the logs of different services. systemd-journald also processes
syslog messages. Currently, all services in the same log namespace therefore
have to live in the same private mount namespace, so that systemd can mount
the journal namespace's syslog socket over /dev/log and direct syslog
messages from those services to the correct systemd-journald instance. We
want to relax this requirement so that processes running in disjoint mount
namespaces can still run in the same log namespace. To achieve this, we can
use these new hooks to rewrite the destination address of any connect(),
sendto(), ... syscall that targets /dev/log to the address of the journal
namespace's syslog socket. This transparently performs the redirection
without requiring a mount namespace or mounting over /dev/log.
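
The rewrite described above can be sketched in plain userspace C. This is only
an illustration of the logic, not the BPF program or the kfunc interface from
this series. The function name redirect_syslog() and the target path
NAMESPACE_SYSLOG_PATH are hypothetical placeholders. The point is that the
address length must be updated together with the path, because the replacement
path has a different length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

#define SYSLOG_PATH "/dev/log"
/* Hypothetical target path, chosen only for this illustration. */
#define NAMESPACE_SYSLOG_PATH "/run/example-namespace/dev-log"

_Static_assert(sizeof(NAMESPACE_SYSLOG_PATH) <=
	       sizeof(((struct sockaddr_un *)0)->sun_path),
	       "target path must fit in sun_path");

/*
 * If *addr is the pathname address /dev/log, rewrite it in place to
 * NAMESPACE_SYSLOG_PATH and update *addrlen to match.
 * Returns 1 if the address was rewritten, 0 if it was left untouched.
 */
int redirect_syslog(struct sockaddr_un *addr, socklen_t *addrlen)
{
	const size_t base = offsetof(struct sockaddr_un, sun_path);
	size_t avail, path_len;
	const char *nul;

	assert(addr && addrlen);
	if (addr->sun_family != AF_UNIX || *addrlen <= base)
		return 0;

	/* Only look at the bytes covered by the given address length. */
	avail = *addrlen - base;
	if (avail > sizeof(addr->sun_path))
		avail = sizeof(addr->sun_path);
	nul = memchr(addr->sun_path, '\0', avail);
	path_len = nul ? (size_t)(nul - addr->sun_path) : avail;

	/* Abstract addresses start with NUL, so path_len is 0 and they
	 * never match here. */
	if (path_len != strlen(SYSLOG_PATH) ||
	    memcmp(addr->sun_path, SYSLOG_PATH, path_len) != 0)
		return 0;

	memset(addr->sun_path, 0, sizeof(addr->sun_path));
	memcpy(addr->sun_path, NAMESPACE_SYSLOG_PATH,
	       sizeof(NAMESPACE_SYSLOG_PATH));
	*addrlen = (socklen_t)(base + sizeof(NAMESPACE_SYSLOG_PATH));
	return 1;
}
```

In the series itself this decision would be made inside a cgroup sockaddr
program attached to the relevant hook, with the new address and length set
through the kfunc rather than by writing to the structure directly.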

Aside from the above use case, these hooks can more generally be used to
transparently redirect unix sockets to different addresses as required by
services.

Daan De Meyer (10):
  selftests/bpf: Add missing section name tests for
    getpeername/getsockname
  selftests/bpf: Track sockaddr length in sock addr tests
  bpf: Allow read access to addr_len from cgroup sockaddr programs
  bpf: Add BTF_KFUNC_HOOK_SOCK_ADDR
  bpf: Add bpf_sock_addr_set() to allow writing sockaddr len from bpf
  bpf: Implement cgroup sockaddr hooks for unix sockets
  libbpf: Add support for cgroup unix socket address hooks
  bpftool: Add support for cgroup unix socket address hooks
  selftests/bpf: Add tests for cgroup unix socket address hooks
  documentation/bpf: Document cgroup unix socket address hooks

 Documentation/bpf/libbpf/program_types.rst    |  12 +
 include/linux/bpf-cgroup-defs.h               |   6 +
 include/linux/bpf-cgroup.h                    | 102 ++++---
 include/linux/filter.h                        |   1 +
 include/uapi/linux/bpf.h                      |  14 +-
 kernel/bpf/btf.c                              |   3 +
 kernel/bpf/cgroup.c                           |  27 +-
 kernel/bpf/syscall.c                          |  18 ++
 kernel/bpf/verifier.c                         |   7 +-
 net/core/filter.c                             |  69 ++++-
 net/ipv4/af_inet.c                            |   8 +-
 net/ipv4/ping.c                               |   8 +-
 net/ipv4/tcp_ipv4.c                           |   8 +-
 net/ipv4/udp.c                                |  17 +-
 net/ipv6/af_inet6.c                           |   8 +-
 net/ipv6/ping.c                               |   8 +-
 net/ipv6/tcp_ipv6.c                           |   8 +-
 net/ipv6/udp.c                                |  14 +-
 net/unix/af_unix.c                            | 102 ++++++-
 .../bpftool/Documentation/bpftool-cgroup.rst  |  21 +-
 tools/bpf/bpftool/cgroup.c                    |  17 +-
 tools/bpf/bpftool/common.c                    |   6 +
 tools/include/uapi/linux/bpf.h                |  14 +-
 tools/lib/bpf/libbpf.c                        |  12 +
 tools/testing/selftests/bpf/bpf_kfuncs.h      |  13 +
 .../selftests/bpf/prog_tests/section_names.c  |  50 ++++
 .../testing/selftests/bpf/progs/bindun_prog.c |  59 ++++
 .../selftests/bpf/progs/connectun_prog.c      |  53 ++++
 .../selftests/bpf/progs/recvmsgun_prog.c      |  59 ++++
 .../selftests/bpf/progs/sendmsgun_prog.c      |  53 ++++
 tools/testing/selftests/bpf/test_sock_addr.c  | 263 ++++++++++++++----
 31 files changed, 917 insertions(+), 143 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/bindun_prog.c
 create mode 100644 tools/testing/selftests/bpf/progs/connectun_prog.c
 create mode 100644 tools/testing/selftests/bpf/progs/recvmsgun_prog.c
 create mode 100644 tools/testing/selftests/bpf/progs/sendmsgun_prog.c

--
2.40.0
