Closing the BPF map permission loophole

* Closing the BPF map permission loophole
@ 2022-09-15 10:30 Lorenz Bauer
  2022-09-16  3:03 ` Kumar Kartikeya Dwivedi
                   ` (4 more replies)
  0 siblings, 5 replies; 45+ messages in thread
From: Lorenz Bauer @ 2022-09-15 10:30 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, KP Singh, Stanislav Fomichev
  Cc: bpf

Hi list,

Here is a summary of the talk I gave at LPC '22 titled "Closing the BPF map permission loophole", with slides at [0].

Problem #1: Read-only fds can be modified via a BPF program

1. Craft a BPF program that executes bpf_map_update_elem(read-only fd, ...)
2. Load the program & execute it

The reason is that the verifier only checks bpf_map->map_flags in resolve_pseudo_ldimm64, but ignores fd->f_mode.

Fixing this problem is complicated by the fact that a user may use several distinct fds with differing permissions to refer to the same map, but that the verifier internally only tracks unique struct bpf_map. See [1].

Problem #2: Read-only fds can be "transmuted" into read-write fds via map in map

1. BPF_MAP_UPDATE_ELEM(map in map fd, read-only fd)
2. BPF_MAP_LOOKUP_ELEM(map in map fd) = read-write fd

This was pointed out by Stanislav Fomichev during the LPC session. I've not yet tried this myself.

Problem #3: Read-only fds can be transmuted into read-write fds via object pinning

1. BPF_OBJ_PIN(read-only fd, /sys/fs/bpf/foo)
2. BPF_OBJ_GET(/sys/fs/bpf/foo) = read-write fd

The problem is with BPF_OBJ_PIN semantics: regardless of fd->f_mode, pinning creates an inode that is owned by the current user, with mode o=rw. Even if we made the inode o=r, a user / attacker can still use chmod(2) to change it back to o=rw.

On older kernels, this requires either unprivileged BPF or CAP_BPF, but recently BPF_OBJ_PIN has been made available without CAP_BPF.

This problem also applies to other BPF objects: links, programs, maybe iterators? Since we don't have BPF_F_RDONLY semantics for those the issue is maybe less urgent, but see [2] for some more fun.

A number of ideas were explored during the session:

* In OBJ_PIN, create the inode owned by the user that executed MAP_CREATE, not the user that
  invoked OBJ_PIN. This would allow unprivileged users to create files as another user, which
  seems like a bad idea.
* In OBJ_GET, refuse a read-write fd if the fd passed to OBJ_PIN wasn't read-write. This is not
  possible since we store struct bpf_map * in the inode, so we don't have access to fd->f_mode
  anymore.
* In OBJ_PIN, adjust the mode of the created inode to match fd->f_mode, and later refuse attempts
  to chmod(2). After a cursory glance at the source code it seems like there are no hooks for
  filesystems to influence chmod.

My gut feeling is that the root of the problem is that OBJ_PIN is too permissive. Once an inode exists that is owned by the current user the cat is out of the box.

BPF_F_RDONLY and BPF_F_WRONLY were introduced in 4.15 [3]. If we want to fix this properly, aka relying on BPF_R_RDONLY doesn't introduce a gaping hole, we'll have to do quite a bit of backporting.

I plan on submitting a sledgehammer approach fix for #1 and #2 as discussed with Daniel after my presentation.

#3 is in sore need of further discussion and creativity. One avenue I want to explore is whether we can refuse OBJ_PIN if:
- the current user is not the map creator
- and the fd is not r/w
- and the current user has no CAP_DAC_OVERRIDE (or similar)

Thanks
Lorenz

0: https://lpc.events/event/16/contributions/1372/attachments/977/2059/Plumbers%2022%20Closing%20the%20BPF%20map%20permission%20loophole.pdf
1: https://elixir.bootlin.com/linux/v6.0-rc5/source/kernel/bpf/verifier.c#L12839
2: https://lore.kernel.org/bpf/20210618105526.265003-1-zenczykowski@gmail.com/
3: https://github.com/torvalds/linux/commit/6e71b04a8224

^ permalink raw reply	[flat|nested] 45+ messages in thread