From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH RFC bpf-next 2/4] bpf: support BPF ksym variables in kernel modules
Date: Fri, 11 Dec 2020 17:52:15 -0800
Message-ID: <20201212015215.zmychededhpv55th@ast-mbp>
In-Reply-To: <CAEf4BzbZK8uZOprwHq_+mh=2Lb27POv5VMW4kB6eyPc_6bcSPg@mail.gmail.com>

On Fri, Dec 11, 2020 at 02:15:28PM -0800, Andrii Nakryiko wrote:
> On Fri, Dec 11, 2020 at 1:27 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Dec 10, 2020 at 08:27:32PM -0800, Andrii Nakryiko wrote:
> > > During BPF program load time, verifier will resolve FD to BTF object and will
> > > take reference on BTF object itself and, for module BTFs, corresponding module
> > > as well, to make sure it won't be unloaded from under running BPF program. The
> > > mechanism used is similar to how bpf_prog keeps track of used bpf_maps.
> > ...
> > > +
> > > +     /* if we reference variables from kernel module, bump its refcount */
> > > +     if (btf_is_module(btf)) {
> > > +             btf_mod->module = btf_try_get_module(btf);
> >
> > Is it necessary to refcnt the module? Correct me if I'm wrong, but
> > for module's BTF we register a notifier. Then the module can be rmmod-ed
> > at any time and we will do btf_put() for corresponding BTF, but that BTF may
> > stay around because bpftool or something is looking at it.
> 
> Correct, BTF object itself doesn't take a refcnt on module.
> 
> > Similarly when prog is attached to raw_tp in a module we currently do try_module_get(),
> > but is it really necessary ? When bpf is attached to a netdev the netdev can
> > be removed and the link will be dangling. May be it makes sense to do the same
> > with modules?  The raw_tp can become dangling after rmmod and the prog won't be
> 
> So for raw_tp it's not the case today. I tested, I attached raw_tp,
> kept triggering it in a loop, and tried to rmmod bpf_testmod. It
> failed, because raw tracepoint takes refcnt on module. rmmod -f

Right. I meant that we can change that behavior if it would make sense to do so.

> bpf_testmod also didn't work, but it's because my kernel wasn't built
> with force-unload enabled for modules. But force-unload is an entirely
> different matter and it's inherently dangerous to do, it can crash and
> corrupt anything in the kernel.
> 
> > executed anymore. So hard coded address of a per-cpu var in a ksym will
> > be pointing to freed mod memory after rmmod, but that's ok, since that prog will
> > never execute.
> 
> Not so fast :) Indeed, if somehow module gets unloaded while we keep
> BPF program loaded, we'll point to unallocated memory **OR** to a
> memory re-used for something else. That's bad. Nothing will crash even
> if it's unmapped memory (due to bpf_probe_read semantics), but we will
> potentially be reading some garbage (not zeroes), if some other module
> re-uses that per-CPU memory.
> 
> As for the BPF program won't be triggered. That's not true in general,
> as you mention yourself below.
> 
> > On the other side if we envision a bpf prog attaching to a vmlinux function
> > and accessing per-cpu or normal ksym in some module it would need to inc refcnt
> > of that module, since we won't be able to guarantee that this prog will
> > not execute any more. So we cannot allow dangling memory addresses.
> 
> That's what my new selftest is doing actually. It's a generic
> sys_enter raw_tp, which doesn't attach to the module, but it does read
> module's per-CPU variable. 

Got it. I see that now.

> So I actually ran a test before posting. I
> successfully unloaded bpf_testmod, but kept running the prog. And it
> kept returning *correct* per-CPU value. Most probably due to per-CPU
> memory not unmapped and not yet reused for something else. But it's a
> really nasty and surprising situation.

You mean you managed to unload it early during development, before
you introduced refcounting of modules?

> Keep in mind, also, that whenever BPF program declares per-cpu
> variable extern, it doesn't know or care whether it will get resolved
> to built-in vmlinux per-CPU variable or module per-CPU variable.
> Restricting attachment to only module-provided hooks is both tedious
> and might be quite surprising sometimes, seems not worth the pain.
> 
> > If latter is what we want to allow then we probably need a test case for it and
> > document the reasons for keeping modules pinned while progs access their data.
> > Since such pinning behavior is different from other bpf attaching cases where
> > underlying objects (like netdev and cgroup) can go away.
> 
> See above, that's already the case for module tracepoints.
> 
> So in summary, I think we should take a refcnt on module, as that's
> already the case for stuff like raw_tp. I can add more comments to
> make this clear, of course.

OK, agreed.

Regarding fd+id in the upper/lower 32 bits of ld_imm64...
That works for ksyms because in the end the pair is converted to a single
address that fits into ld_imm64. That won't work for Alan's case,
where the btf_obj pointer and btf_id are two values (64-bit and 32-bit).
So API-wise it's fine here, but we cannot adopt the same idea everywhere.
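
As a rough sketch of the packing idea (the helper names are hypothetical,
not kernel or libbpf API, and the fd-in-upper/id-in-lower layout is an
assumption taken from the wording above):

```c
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only.
 * Assumed layout: BTF object fd in the upper 32 bits, BTF type id in
 * the lower 32 bits of the 64-bit immediate carried by ld_imm64.
 */
static inline uint64_t pack_btf_fd_id(uint32_t btf_fd, uint32_t btf_id)
{
	return ((uint64_t)btf_fd << 32) | btf_id;
}

/* Recover the fd half of the pair. */
static inline uint32_t unpack_btf_fd(uint64_t imm64)
{
	return (uint32_t)(imm64 >> 32);
}

/* Recover the type id half of the pair. */
static inline uint32_t unpack_btf_id(uint64_t imm64)
{
	return (uint32_t)imm64;
}
```

Two 32-bit values fit exactly, which is why this works when the pair ends
up replaced by one 64-bit address. A 64-bit pointer plus a 32-bit id needs
96 bits and has no room in the same immediate.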

re: patch 4
Please add a non-percpu var to the test, just for completeness.
The pair fd+id should be enough to disambiguate, right?

re: patch 1.
Instead of copy-pasting that hack, please convert it to sys_membarrier(MEMBARRIER_CMD_GLOBAL).
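
A minimal sketch of what such a helper could look like, assuming the
installed kernel headers provide MEMBARRIER_CMD_GLOBAL (Linux 4.16+; older
headers name it MEMBARRIER_CMD_SHARED). The function name is made up for
illustration. Using it as an RCU sync relies on the kernel implementing
this command with a grace-period wait, which is an implementation detail
rather than a documented guarantee:

```c
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel for a global memory barrier.
 * Returns 0 on success, -1 with errno set on failure (for example
 * ENOSYS if the syscall is unavailable, or EINVAL on some nohz_full
 * configurations).
 */
static int sync_rcu_sketch(void)
{
	/* glibc has no wrapper for membarrier(2), so use syscall(2). */
	return (int)syscall(__NR_membarrier, MEMBARRIER_CMD_GLOBAL, 0, 0);
}
```

The caller should check the return value before unloading the module, since
a failed call means no barrier was issued.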

The rest looks good to me.

Thread overview: 10+ messages
2020-12-11  4:27 [PATCH RFC bpf-next 0/4] Support kernel module ksym variables Andrii Nakryiko
2020-12-11  4:27 ` [PATCH RFC bpf-next 1/4] selftests/bpf: sync RCU before unloading bpf_testmod Andrii Nakryiko
2020-12-11  4:27 ` [PATCH RFC bpf-next 2/4] bpf: support BPF ksym variables in kernel modules Andrii Nakryiko
2020-12-11 11:55   ` kernel test robot
2020-12-11 21:27   ` Alexei Starovoitov
2020-12-11 22:15     ` Andrii Nakryiko
2020-12-12  1:52       ` Alexei Starovoitov [this message]
2020-12-12  5:23         ` Andrii Nakryiko
2020-12-11  4:27 ` [PATCH RFC bpf-next 3/4] libbpf: support kernel module ksym externs Andrii Nakryiko
2020-12-11  4:27 ` [PATCH RFC bpf-next 4/4] selftests/bpf: test " Andrii Nakryiko
