All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>, Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper.
Date: Sat, 17 Apr 2021 16:48:53 +0000	[thread overview]
Message-ID: <YHsRdTqgurSCykt7@zeniv-ca.linux.org.uk> (raw)
In-Reply-To: <20210417143639.kq3nafzlsridtbb6@ast-mbp>

On Sat, Apr 17, 2021 at 07:36:39AM -0700, Alexei Starovoitov wrote:

> The kernel will perform the same work with FDs. The same locks are held
> and the same execution conditions are in both cases. The LSM hooks,
> fsnotify, etc will be called the same way.
> It's no different if new syscall was introduced "sys_foo(int num)" that
> would do { return close_fd(num); }.
> It would opearate in the same user context.

Hmm...  unless I'm misreading the code, one of the call chains would seem to
be sys_bpf() -> bpf_prog_test_run() -> ->test_run() -> ... -> bpf_sys_close().
OK, as long as you make sure bpf_prog_get() does fdput() (i.e. that we
don't have it restructured so that fdget()/fdput() pair would be lifted into
bpf_prog_test_run(), with fdput() moved in place of bpf_prog_put()).

Note that we *really* can not allow close_fd() on anything to be bracketed
by fdget()/fdput() pair; we had bugs of that sort and, as the matter of fact,
still have one in autofs_dev_ioctl().

The trouble happens if you have file F with 2 references, held by descriptor
tables of different processes.  Say, process A has descriptor 6 refering to
it, while B has descriptor 42 doing the same.  Descriptor tables of A and B
are not shared with anyone.

A: fdget(6) 	-> returns a reference to F, refcount _not_ touched
A: close_fd(6)	-> rips the reference to F from descriptor table, does fput(F)
		   refcount drops to 1.
B: close(42)	-> rips the reference to F from B's descriptor table, does fput(F)
		   This time refcount does reach 0 and we use task_work_add() to
		   make sure the destructor (__fput()) runs before B returns to
		   userland.  sys_close() returns and B goes off to userland.
		   On the way out __fput() is run, and among other things,
		   ->release() of F is executed, doing whatever it wants to do.
		   F is freed.
And at that point A, which presumably is using the guts of F, gets screwed.

In case of autofs_dev_ioctl(), it's possible for a thread to end up blocked
inside copy_to_user(), with autofs functions in call chains *AND* module
refcount of autofs not pinned by anything.  The effects of returning into a
function that used to reside in now unmapped page are obviously not pretty...

Basically, the rule is
	* never remove from descriptor table if you currently have an outstadning
fdget() (i.e. without having done the matching fdput() yet).

	That, obviously, covers all ioctls - there we have fdget() done
by sys_ioctl() on the issuing descriptor.  In your case you seem to be
safe, but it's a bloody dangerous minefield - you really need a big warning
in all call sites.  The worst part is that it won't happen with intended
use, so it doesn't show up in routine regression testing.  In particular,
for autofs the normal case is AUTOFS_DEV_IOCTL_CLOSEMOUNT getting passed
a file descriptor of something mounted and *not* the descriptor of
/dev/autofs we are holding fdget() on.  However, there's no way to prevent
a malicious call when we pass exactly that.

	So please, mark all call sites with "make very sure you never get
here with unpaired fdget()".

	BTW, if my reading (re ->test_run()) is correct, what limits the recursion
via bpf_sys_bpf()?

  reply	other threads:[~2021-04-17 16:49 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 01/15] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 02/15] bpf: Introduce bpfptr_t user/kernel pointer Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 03/15] bpf: Prepare bpf syscall to be used from kernel and user space Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 04/15] libbpf: Support for syscall program type Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 05/15] selftests/bpf: Test " Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 06/15] bpf: Make btf_load command to be bpfptr_t compatible Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 07/15] selftests/bpf: Test for btf_load command Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 08/15] bpf: Introduce fd_idx Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 09/15] libbpf: Support for fd_idx Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 10/15] bpf: Add bpf_btf_find_by_name_kind() helper Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper Alexei Starovoitov
2021-04-17  3:42   ` Al Viro
2021-04-17  3:46     ` Alexei Starovoitov
2021-04-17  4:04       ` Al Viro
2021-04-17  5:01         ` Alexei Starovoitov
2021-04-17 14:36           ` Alexei Starovoitov
2021-04-17 16:48             ` Al Viro [this message]
2021-04-17 17:09               ` Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 12/15] libbpf: Change the order of data and text relocations Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file Alexei Starovoitov
2021-04-21  1:34   ` Yonghong Song
2021-04-21  4:46     ` Alexei Starovoitov
2021-04-21  5:30       ` Yonghong Song
2021-04-21  6:06         ` Alexei Starovoitov
2021-04-21 14:05           ` Yonghong Song
2021-04-21 17:46       ` Andrii Nakryiko
2021-04-21 17:50         ` Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 14/15] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 15/15] selftests/bpf: Convert few tests to light skeleton Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YHsRdTqgurSCykt7@zeniv-ca.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.