bpf.vger.kernel.org archive mirror
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Lorenz Bauer <lmb@cloudflare.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	kernel-team <kernel-team@cloudflare.com>
Subject: Re: bpf_jit_limit close shave
Date: Tue, 21 Sep 2021 07:34:33 -0700	[thread overview]
Message-ID: <CAADnVQKxmNDET97wfi-k7L_ot9WXDX7CnqPNe=wK=rXpQJDcyg@mail.gmail.com> (raw)
In-Reply-To: <CACAyw9_TjUMu1s46X3jE3ubcszAW3yoj39ADADOFseL0x96MeQ@mail.gmail.com>

On Tue, Sep 21, 2021 at 4:50 AM Lorenz Bauer <lmb@cloudflare.com> wrote:
>
> Hi,
>
> We just had a close shave with bpf_jit_limit. Something on our edge
> caused us to cross the default limit, which made seccomp and xt_bpf
> filters fail to load. Looking at the source made me realise that we
> narrowly avoided taking out our load balancer, which would've been
> pretty bad. We still run the LB with CAP_SYS_ADMIN instead of narrower
> CAP_BPF, CAP_NET_ADMIN. If we had migrated to the lesser capability
> set we would've been prevented from loading new eBPF:
>
> int bpf_jit_charge_modmem(u32 pages)
> {
>     if (atomic_long_add_return(pages, &bpf_jit_current) >
>         (bpf_jit_limit >> PAGE_SHIFT)) {
>         if (!capable(CAP_SYS_ADMIN)) {
>             atomic_long_sub(pages, &bpf_jit_current);
>             return -EPERM;
>         }
>     }
>
>     return 0;
> }
>
> Does it make sense to include !capable(CAP_BPF) in the check?

Good point. Makes sense to add CAP_BPF there.
Taking down critical networking infrastructure because of this limit,
which is supposed to apply to unprivileged users only, is scary indeed.

> This limit reminds me a bit of the memlock issue, where a global limit
> causes coupling between independent systems / processes. Can we remove
> the limit in favour of something more fine grained?

Right. Unfortunately memcg doesn't distinguish kernel module
memory from any other memory. All types of memory are memory,
regardless of whether it's per-cpu memory, bpf map memory, bpf JIT memory, etc.
That's the main reason for the independent knob for JITed memory,
since it's a bit special. It's a crude knob, certainly not perfect.
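For operators who hit this limit today, the crude knob is exposed as a sysctl, net.core.bpf_jit_limit (in bytes). A quick way to inspect and, if necessary, raise it; the value below is illustrative, not a recommendation:

```shell
# Read the current global JIT allocation limit (bytes).
sysctl net.core.bpf_jit_limit

# Raise it (requires root); here to 1 GiB as an example.
sysctl -w net.core.bpf_jit_limit=1073741824
```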


Thread overview: 11+ messages
2021-09-21 11:49 bpf_jit_limit close shave Lorenz Bauer
2021-09-21 14:34 ` Alexei Starovoitov [this message]
2021-09-21 15:52   ` Lorenz Bauer
     [not found]     ` <CABEBQi=WfdJ-h+5+fgFXOptDWSk2Oe_V85gR90G2V+PQh9ME0A@mail.gmail.com>
2021-09-21 19:59       ` Alexei Starovoitov
2021-09-22  8:20         ` Frank Hofmann
2021-09-22 11:07           ` Lorenz Bauer
2021-09-22 21:51             ` Daniel Borkmann
2021-09-23  2:03               ` Alexei Starovoitov
2021-09-23  9:16               ` Lorenz Bauer
2021-09-23 11:52                 ` Daniel Borkmann
2021-09-24 10:35                   ` Lorenz Bauer
