bpf.vger.kernel.org archive mirror
* [RFC PATCH v2 00/14] device_cgroup: guard mknod for non-initial user namespace
@ 2023-10-18 10:50 Michael Weiß
From: Michael Weiß @ 2023-10-18 10:50 UTC (permalink / raw)
  To: Alexander Mikhalitsyn, Christian Brauner, Alexei Starovoitov, Paul Moore
  Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Quentin Monnet, Alexander Viro,
	Miklos Szeredi, Amir Goldstein, Serge E. Hallyn, bpf,
	linux-kernel, linux-fsdevel, gyroidos, Michael Weiß

Introduce the flag BPF_DEVCG_ACC_MKNOD_UNS for bpf programs of type
BPF_PROG_TYPE_CGROUP_DEVICE, which allows guarding access to mknod()
in non-initial user namespaces.
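
As a rough illustration of how a device cgroup policy could make use of
such a flag, the following is a plain userspace model, not a loadable
bpf program. The struct mirrors the layout of struct bpf_cgroup_dev_ctx
and the access_type encoding from include/uapi/linux/bpf.h. The numeric
value chosen for the new flag and the names dev_ctx and devcg_policy()
are assumptions made only for this sketch.

```c
#include <stdint.h>

/* Existing values, mirroring include/uapi/linux/bpf.h. */
#define DEVCG_ACC_MKNOD     (1u << 0)
#define DEVCG_ACC_READ      (1u << 1)
#define DEVCG_ACC_WRITE     (1u << 2)
/* ASSUMPTION: the bit used for the new flag is illustrative only. */
#define DEVCG_ACC_MKNOD_UNS (1u << 3)

#define DEVCG_DEV_BLOCK     (1u << 0)
#define DEVCG_DEV_CHAR      (1u << 1)

/* Same field layout as struct bpf_cgroup_dev_ctx. */
struct dev_ctx {
	uint32_t access_type;	/* (DEVCG_ACC_* << 16) | DEVCG_DEV_* */
	uint32_t major;
	uint32_t minor;
};

/*
 * Hypothetical policy: returns 1 to allow and 0 to deny, like a
 * cgroup device program. Only /dev/null (char 1:3) is allowed, and
 * for it every access kind including mknod in a user namespace.
 */
int devcg_policy(const struct dev_ctx *ctx)
{
	uint32_t type = ctx->access_type & 0xffff;
	uint32_t access = ctx->access_type >> 16;
	uint32_t allowed = DEVCG_ACC_MKNOD | DEVCG_ACC_READ |
			   DEVCG_ACC_WRITE | DEVCG_ACC_MKNOD_UNS;

	if (type != DEVCG_DEV_CHAR || ctx->major != 1 || ctx->minor != 3)
		return 0;

	/* Deny if any requested access bit is outside the allowed set. */
	return (access & ~allowed) == 0;
}
```

A policy that drops DEVCG_ACC_MKNOD_UNS from the allowed set would keep
read and write access to the device while still denying mknod() from
inside the user namespace.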

If a container manager restricts its unprivileged (user namespaced)
children with a device cgroup, it is no longer necessary to deny
mknod(). Thus, user space applications may map devices at different
locations in the file system by using mknod() inside the container.

One use case, which we also rely on in GyroidOS, is running virsh for
VMs inside an unprivileged container. virsh creates device nodes,
e.g., "/var/run/libvirt/qemu/11-fgfg.dev/null", which currently fails
in a non-initial userns, even if a cgroup device whitelist with the
corresponding major, minor of /dev/null exists. Thus, in this case
the usual bind mounts or pre-populated device nodes under /dev are
not sufficient.

To circumvent this limitation, allow mknod() by checking CAP_MKNOD
in the userns through the new security_inode_mknod_nscap() hook. The
hook implementation checks if the corresponding permission flag
BPF_DEVCG_ACC_MKNOD_UNS is set for the device in the bpf program.
To avoid creating unusable inodes in user space, the hook also
checks SB_I_NODEV on the corresponding super block.
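
The decision described above can be modeled in userspace roughly as
follows. This is only a sketch of the described conditions, not the
code from the series: the struct mknod_request, its fields and the
function mknod_nscap_model() are hypothetical, and the order of the
checks is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical inputs standing in for kernel state. */
struct mknod_request {
	bool cap_mknod_in_userns;	/* task holds CAP_MKNOD in its userns */
	bool sb_nodev;			/* SB_I_NODEV set on the target super block */
	bool devcg_allows_mknod_uns;	/* bpf program grants BPF_DEVCG_ACC_MKNOD_UNS */
};

/* Returns 0 to permit mknod() and -EPERM to deny it. */
int mknod_nscap_model(const struct mknod_request *req)
{
	/* The capability is checked against the userns, not init_user_ns. */
	if (!req->cap_mknod_in_userns)
		return -EPERM;

	/* A node on a nodev super block could never be opened. */
	if (req->sb_nodev)
		return -EPERM;

	/* The device cgroup program must grant the new flag. */
	if (!req->devcg_allows_mknod_uns)
		return -EPERM;

	return 0;
}
```

In this model all three conditions must hold, so a task that is not
guarded by a device cgroup program granting the flag keeps the current
behavior and is denied.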

Further, the security_sb_alloc_userns() hook is implemented using
cgroup_bpf_current_enabled() to allow usage of device nodes on super
blocks mounted by a guarded task.

Patches 1 to 3 rework the current devcgroup_inode hooks as an LSM.

Patches 4 to 8 rework explicit calls to devcgroup_check_permission
as LSM hooks as well and finalize the conversion of the device_cgroup
subsystem to an LSM.

Patches 9 and 10 introduce new generic security hooks to be used
for the actual mknod device guard implementation.

Patch 11 wires up the security hooks in the vfs.

Patches 12 and 13 provide helper functions in the bpf cgroup
subsystem.

Patch 14 finally implements the LSM hooks to grant access.

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
---
Changes in v2:
- Integrate this as LSM (Christian, Paul)
- Switched to a device cgroup specific flag instead of a generic
  bpf program flag (Christian)
- Do not ignore SB_I_NODEV in fs/namei.c but use LSM hook in
  sb_alloc_super in fs/super.c
- Link to v1: https://lore.kernel.org/r/20230814-devcg_guard-v1-0-654971ab88b1@aisec.fraunhofer.de

Michael Weiß (14):
  device_cgroup: Implement devcgroup hooks as lsm security hooks
  vfs: Remove explicit devcgroup_inode calls
  device_cgroup: Remove explicit devcgroup_inode hooks
  lsm: Add security_dev_permission() hook
  device_cgroup: Implement dev_permission() hook
  block: Switch from devcgroup_check_permission to security hook
  drm/amdkfd: Switch from devcgroup_check_permission to security hook
  device_cgroup: Hide devcgroup functionality completely in lsm
  lsm: Add security_inode_mknod_nscap() hook
  lsm: Add security_sb_alloc_userns() hook
  vfs: Wire up security hooks for lsm-based device guard in userns
  bpf: Add flag BPF_DEVCG_ACC_MKNOD_UNS for device access
  bpf: cgroup: Introduce helper cgroup_bpf_current_enabled()
  device_cgroup: Allow mknod in non-initial userns if guarded

 block/bdev.c                                 |   9 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h        |   7 +-
 fs/namei.c                                   |  24 ++--
 fs/super.c                                   |   6 +-
 include/linux/bpf-cgroup.h                   |   2 +
 include/linux/device_cgroup.h                |  67 -----------
 include/linux/lsm_hook_defs.h                |   4 +
 include/linux/security.h                     |  18 +++
 include/uapi/linux/bpf.h                     |   1 +
 init/Kconfig                                 |   4 +
 kernel/bpf/cgroup.c                          |  14 +++
 security/Kconfig                             |   1 +
 security/Makefile                            |   2 +-
 security/device_cgroup/Kconfig               |   7 ++
 security/device_cgroup/Makefile              |   4 +
 security/{ => device_cgroup}/device_cgroup.c |   3 +-
 security/device_cgroup/device_cgroup.h       |  20 ++++
 security/device_cgroup/lsm.c                 | 114 +++++++++++++++++++
 security/security.c                          |  75 ++++++++++++
 19 files changed, 294 insertions(+), 88 deletions(-)
 delete mode 100644 include/linux/device_cgroup.h
 create mode 100644 security/device_cgroup/Kconfig
 create mode 100644 security/device_cgroup/Makefile
 rename security/{ => device_cgroup}/device_cgroup.c (99%)
 create mode 100644 security/device_cgroup/device_cgroup.h
 create mode 100644 security/device_cgroup/lsm.c


base-commit: 58720809f52779dc0f08e53e54b014209d13eebb
-- 
2.30.2



