bpf.vger.kernel.org archive mirror
From: Daniel Borkmann <daniel@iogearbox.net>
To: Lorenz Bauer <lmb@cloudflare.com>
Cc: Frank Hofmann <fhofmann@cloudflare.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	kernel-team <kernel-team@cloudflare.com>
Subject: Re: bpf_jit_limit close shave
Date: Thu, 23 Sep 2021 13:52:05 +0200
Message-ID: <53e09160-f30d-7d23-e3d0-8f636cd82117@iogearbox.net>
In-Reply-To: <CACAyw9-Ha9RQC_VijJAE02mCX3E09vmDji__Ts8YrsSH4cGiyg@mail.gmail.com>

On 9/23/21 11:16 AM, Lorenz Bauer wrote:
> On Wed, 22 Sept 2021 at 22:51, Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 9/22/21 1:07 PM, Lorenz Bauer wrote:
>>> On Wed, 22 Sept 2021 at 09:20, Frank Hofmann <fhofmann@cloudflare.com> wrote:
>>>>
>>>>> That jit limit is not there on older kernels and doesn't apply to root.
>>>>> How would you notice such a kernel bug in such conditions?
>>>>
>>>> I'm talking about bpf_jit_current - it's an "overall gauge" for
>>>> allocation, priv and unpriv. I understood Lorenz' note as "change it
>>>> so it only tracks unpriv BPF mem usage - since we'll never act on
>>>> privileged usage anyway"
>>>
>>> Yes, that was my suggestion indeed. What Frank is saying: it looks
>>> like our leak of JIT memory is due to a privileged process. By
>>> exempting privileged processes it would be even harder to notice /
>>> debug. That's true, and brings me back to my question: what is
>>> different about JIT memory that we can't do a better limit?
>>
>> The knob with the limit was basically added back then as a band-aid to avoid
>> unprivileged BPF JIT (cBPF or eBPF) eating up all the module memory to the
>> point where we cannot even load kernel modules anymore. Given that memory
>> resource is global, we added the bpf_jit_limit / bpf_jit_current accounting
>> as a fix/heuristic via ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict
>> unpriv allocations"). If we wouldn't account for root, how would such detection
>> proposal work otherwise to block unprivileged? I don't think it's feasible to
>> only account the latter given privileged progs might have occupied most of the
>> budget already.
> 
> Thanks, that was the part I was missing. JITed BPF programs are
> treated like modules (why?). There is a limited space reserved for
> kernel modules.

See bpf_jit_alloc_exec(), which calls module_alloc() to get the r+x memory that
holds the generated opcodes of the image. There is only one such pool in the
system. On x86 in particular, the rationale for using module_alloc() is that
the image is then guaranteed to be within +/- 2GB of where the kernel image
resides. See the encoding of BPF_CALL with __bpf_call_base + imm32, for example.
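
To make the +/- 2GB point concrete, here is a rough userspace sketch of the
arithmetic behind a base + imm32 call encoding. It is not kernel code: the
helper names call_imm_fits() and call_target() are made up for illustration,
and the only assumption taken from above is that the call target is expressed
as a signed 32-bit offset from a base address.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): compute the signed 32-bit
 * immediate that encodes "target" relative to "call_base". Returns false
 * when the distance does not fit in an s32, i.e. the target lies outside
 * roughly +/- 2GB of the base.
 */
static bool call_imm_fits(uint64_t call_base, uint64_t target, int32_t *imm)
{
	/* Unsigned subtraction wraps, so this also handles target < base. */
	int64_t delta = (int64_t)(target - call_base);

	if (delta < INT32_MIN || delta > INT32_MAX)
		return false;

	*imm = (int32_t)delta;
	return true;
}

/* Hypothetical inverse: recover the call target from base + imm32. */
static uint64_t call_target(uint64_t call_base, int32_t imm)
{
	/* Sign-extend imm to 64 bits before adding it to the base. */
	return call_base + (uint64_t)(int64_t)imm;
}
```

Any target further than an s32 away from the base simply cannot be encoded,
which is why the image has to come from a region that is known to be close
to the kernel image.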

> How does the knob solve the "can't load a new module" problem if our
> suggestion / preference is to steer people towards CAP_BPF anyways
> (since unpriv BPF is trouble)? Over time all BPF will be privileged
> and we're in the same mess again?

Keep in mind that the knob was added before CAP_BPF existed. Unprivileged
cBPF->eBPF also goes through the same bpf_jit_alloc_exec() for the JIT, so that
needs to be taken into consideration as well. But if you grant an application
CAP_BPF, then it is essentially privileged. The point of the knob was to prevent
fully unprivileged users from playing bad games.
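
As a simplified model of why the gauge has to count privileged allocations
too, consider the sketch below. It is a userspace illustration only, not the
kernel implementation: struct jit_budget, jit_charge() and jit_uncharge() are
hypothetical names, and the behavior modeled is just what was described in
this thread (one global counter for priv and unpriv, with the limit enforced
only against unprivileged callers).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a global limit plus a running usage gauge. */
struct jit_budget {
	uint64_t limit;   /* stands in for bpf_jit_limit */
	uint64_t current; /* stands in for bpf_jit_current */
};

/*
 * Charge "size" against the shared budget. Every caller is accounted,
 * but only unprivileged callers are refused once the limit is exceeded.
 * Returns 0 on success, -1 when an unprivileged charge is rejected.
 */
static int jit_charge(struct jit_budget *b, uint64_t size, bool privileged)
{
	b->current += size;
	if (b->current > b->limit && !privileged) {
		/* Roll back the charge before rejecting the request. */
		b->current -= size;
		return -1;
	}
	return 0;
}

/* Release a previous charge, e.g. when a JIT image is freed. */
static void jit_uncharge(struct jit_budget *b, uint64_t size)
{
	b->current -= size;
}
```

In this model, privileged programs that already occupy most of the budget
cause later unprivileged charges to fail. That only works because their
usage shows up in the same counter.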

Thanks,
Daniel
