All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Dave Marchevsky <davemarchevsky@fb.com>,
	bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Delyan Kratunov <delyank@fb.com>
Subject: Re: [PATCH RFC bpf-next v1 21/32] bpf: Allow locking bpf_spin_lock global variables
Date: Sun, 11 Sep 2022 15:31:32 -0700	[thread overview]
Message-ID: <20220911223132.vnhyyojwjzdzo4wr@MacBook-Pro-4.local> (raw)
In-Reply-To: <CAEf4BzazaCKMu5FUB_iZ2z+SVtaw-w8VZhA7EBd9oKKB_o299Q@mail.gmail.com>

On Fri, Sep 09, 2022 at 05:21:52PM -0700, Andrii Nakryiko wrote:
> > >
> > > Well, I didn't propose to use suffixes. Currently user can define
> > > SEC(".data.my_custom_use_case").
> >
> > ... and libbpf will mmap such maps and will expose them in skeleton.
> > My point that it's an existing bug.
> 
> hm... it's not a bug, it's a desired feature. I wanted
> 
> int my_var SEC(".data.mine");
> 
> to be just like .data but in a separate map. So no bug here.

".rodata.*" and ".data.*" section names are effectively reserved by the compiler.
Sooner or later there will be trouble if users start mixing compiler
sections with their own section names like ".data.mine".
In bpf backend we specicialy check for '.rodata' prefix and avoid emitting BTF.
llvm can emit .rodata.cst%d, .rodata.str%d.%d, .data.rel, etc
Not sure how many such special sections will be generated once
bpf progs will get bigger, but creating them as maps will waste
plenty of kernel memory due to page align in bpf-array.
llvm should probably combine them when possible and minimize section
usage in general, but that's orthogonal.
Mixing user and compiler sections under the same prefix is just asking for trouble.

> > Compiler generated .rodata.str1.1 sections should not be messed by
> > user space. There is no BTF for them either.
> 
> Shouldn't but could if they wanted to without skeleton as well. In
> generated skeleton there will be an empty struct for this and no field
> for each of compiler's constant. User has to intentionally do
> something to harm themselves, which we can never stop either way.
> 
> So stuff like .rodata.str1.1 exposes a bit of compiler implementation
> details, but overall idea of allowing custom .data.xxx and .rodata.xxx
> sections was to make them mmapable and readable/writable through
> skeleton.

It's not a good idea to expose compiler internals into skeleton
and even worse to ask users to operate in the compiler's namespace.

> Carving out some sub-namespace based on special suffix feels wrong.

agree. suffix doesn't work, since prefix is already owned by the compiler.

> > mmap and subsequent write by user space won't cause a crash for bpf prog,
> > but it won't be doing what C code intended.
> > There is nothing in there for skeleton and user space to see,
> > but such map should be created, populated and map_fd provided to the prog to use.
> >
> > > So I was proposing that we'll just
> > > define a different *prefix*, like SEC(".private.enqueue") and
> > > SEC(".private.dequeue") for your example above, which will be private
> > > to BPF program, not mmap'ed, not exposed in skeleton.
> > >
> > > mmap is a bit orthogonal to exposing in skeleton, you can still
> > > imagine data section that will be allowed to be initialized from
> > > user-space before load but never mmaped afterwards. Just to say that
> > > .nommap doesn't necessarily imply that it shouldn't be in a skeleton.
> >
> > Well. That's true for normal skeleton and for lskel,
> > but not the case for kernel skel that doesn't have mmap.
> > Exposing a map in skel implies that it will be accessed not only
> > after _open and before _load, but after load as well.
> > We can say that mmap != expose in skeleton, but I really don't see
> > such feature being useful.
> 
> It's basically what .rodata is, actually. It's something that's
> settable from user-space before load and that's it. Yes, you can read
> it after load, but no one does it in practice. But we are digressing,
> I understand you want to make this short and sweet and I agree with
> you. I just disagree about wildcard rule for any non-dotted ELF
> section or using special suffix. See below.
> 
> >
> > > So I still prefer special prefix (.private) and declare that this is
> > > both non-mmapable and not-exposed in skeleton.
> > >
> > > As for allowing any section. It just feels unnecessary and long-term
> > > harmful to allow any section name at this point, tbh.
> >
> > Fine. How about a single new character instead of '.private' prefix ?
> > Like SEC("#enqueue") that would mean no-skel and no-mmap ?
> >
> > Or double dot SEC("..enqueue") ?
> >
> > '.private' is too verbose and when it's read in the context of C file
> > looks out of place and confusing.
> 
> As I said, I gave zero thought to .private, I just took it from
> ".bss.private". I'd like to keep it "dotted", so SEC("#something") is
> very "unusual". Me not like.

Why is this unusual? We have SEC("?tc") already.
SEC("#foo") is very similar.
Dot prefix is special. Something compiler will generate
whereas the section that starts with [A-z] is user's.
So reserving a prefix that starts from [A-z] would be wrong.

> For double-dot, could be just SEC("..data") and generalized to
> SEC("..data.<custom>")? BTW, we can add a macro, similar to __kconfig,
> to hide more descriptive and longer name. E.g.,
> 
> struct bpf_spin_lock my_lock __internal;

Macro doesn't help, since the namespace is broken anyway.
'..data' is dangerous because something might be doing strstr(".data")
instead of strcmp(".data") and will match that section erroneously.

> __internal, __private, __secret, don't know, naming is hard.

Right. We already use special meaning for text section names: tc, xdp
which is also far from ideal, but that's too late to change.
For data I'm arguing that only '.[ro]data' should appear in skel
and compiler internals '.[ro]data.*' should not leak to users.
Then emit all of [A-z] starting sections, since that's what compilers
and linkers will do, but reserve a single character like '#'
or whatever other char to mean that this section shouldn't be mmapable.

  reply	other threads:[~2022-09-11 22:31 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-04 20:41 [PATCH RFC bpf-next v1 00/32] Local kptrs, BPF linked lists Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 01/32] bpf: Add copy_map_value_long to copy to remote percpu memory Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 02/32] bpf: Support kptrs in percpu arraymap Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 03/32] bpf: Add zero_map_value to zero map value with special fields Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 04/32] bpf: Support kptrs in percpu hashmap and percpu LRU hashmap Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 05/32] bpf: Support kptrs in local storage maps Kumar Kartikeya Dwivedi
2022-09-07 19:00   ` Alexei Starovoitov
2022-09-08  2:47     ` Kumar Kartikeya Dwivedi
2022-09-09  5:27   ` Martin KaFai Lau
2022-09-09 11:22     ` Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 06/32] bpf: Annotate data races in bpf_local_storage Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 07/32] bpf: Allow specifying volatile type modifier for kptrs Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 08/32] bpf: Add comment about kptr's PTR_TO_MAP_VALUE handling Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 09/32] bpf: Rewrite kfunc argument handling Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 10/32] bpf: Drop kfunc support from btf_check_func_arg_match Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 11/32] bpf: Support constant scalar arguments for kfuncs Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 12/32] bpf: Teach verifier about non-size constant arguments Kumar Kartikeya Dwivedi
2022-09-07 22:11   ` Alexei Starovoitov
2022-09-08  2:49     ` Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 13/32] bpf: Introduce bpf_list_head support for BPF maps Kumar Kartikeya Dwivedi
2022-09-07 22:46   ` Alexei Starovoitov
2022-09-08  2:58     ` Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 14/32] bpf: Introduce bpf_kptr_alloc helper Kumar Kartikeya Dwivedi
2022-09-07 23:30   ` Alexei Starovoitov
2022-09-08  3:01     ` Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 15/32] bpf: Add helper macro bpf_expr_for_each_reg_in_vstate Kumar Kartikeya Dwivedi
2022-09-07 23:48   ` Alexei Starovoitov
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 16/32] bpf: Introduce BPF memory object model Kumar Kartikeya Dwivedi
2022-09-08  0:34   ` Alexei Starovoitov
2022-09-08  2:39     ` Kumar Kartikeya Dwivedi
2022-09-08  3:37       ` Alexei Starovoitov
2022-09-08 11:50         ` Kumar Kartikeya Dwivedi
2022-09-08 14:18           ` Alexei Starovoitov
2022-09-08 14:45             ` Kumar Kartikeya Dwivedi
2022-09-08 15:11               ` Alexei Starovoitov
2022-09-08 15:37                 ` Kumar Kartikeya Dwivedi
2022-09-08 15:59                   ` Alexei Starovoitov
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 17/32] bpf: Support bpf_list_node in local kptrs Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 18/32] bpf: Support bpf_spin_lock " Kumar Kartikeya Dwivedi
2022-09-08  0:35   ` Alexei Starovoitov
2022-09-09  8:25     ` Dave Marchevsky
2022-09-09 11:20       ` Kumar Kartikeya Dwivedi
2022-09-09 14:26         ` Alexei Starovoitov
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 19/32] bpf: Support bpf_list_head " Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 20/32] bpf: Introduce bpf_kptr_free helper Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 21/32] bpf: Allow locking bpf_spin_lock global variables Kumar Kartikeya Dwivedi
2022-09-08  0:27   ` Alexei Starovoitov
2022-09-08  0:39     ` Kumar Kartikeya Dwivedi
2022-09-08  0:55       ` Alexei Starovoitov
2022-09-08  1:00     ` Kumar Kartikeya Dwivedi
2022-09-08  1:08       ` Alexei Starovoitov
2022-09-08  1:15         ` Kumar Kartikeya Dwivedi
2022-09-08  2:39           ` Alexei Starovoitov
2022-09-09  8:13   ` Dave Marchevsky
2022-09-09 11:05     ` Kumar Kartikeya Dwivedi
2022-09-09 14:24       ` Alexei Starovoitov
2022-09-09 14:50         ` Kumar Kartikeya Dwivedi
2022-09-09 14:58           ` Alexei Starovoitov
2022-09-09 18:32             ` Andrii Nakryiko
2022-09-09 19:25               ` Alexei Starovoitov
2022-09-09 20:21                 ` Andrii Nakryiko
2022-09-09 20:57                   ` Alexei Starovoitov
2022-09-10  0:21                     ` Andrii Nakryiko
2022-09-11 22:31                       ` Alexei Starovoitov [this message]
2022-09-20 20:55                         ` Andrii Nakryiko
2022-10-18  4:06                           ` Andrii Nakryiko
2022-09-09 22:30                 ` Dave Marchevsky
2022-09-09 22:49                   ` Kumar Kartikeya Dwivedi
2022-09-09 22:57                     ` Alexei Starovoitov
2022-09-09 23:04                       ` Kumar Kartikeya Dwivedi
2022-09-09 22:51                   ` Alexei Starovoitov
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 22/32] bpf: Bump BTF_KFUNC_SET_MAX_CNT Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 23/32] bpf: Add single ownership BPF linked list API Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 24/32] bpf: Permit NULL checking pointer with non-zero fixed offset Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 25/32] bpf: Allow storing local kptrs in BPF maps Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 26/32] bpf: Wire up freeing of bpf_list_heads in maps Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 27/32] bpf: Add destructor for bpf_list_head in local kptr Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 28/32] bpf: Remove duplicate PTR_TO_BTF_ID RO check Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 29/32] libbpf: Add support for private BSS map section Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 30/32] selftests/bpf: Add BTF tag macros for local kptrs, BPF linked lists Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 31/32] selftests/bpf: Add BPF linked list API tests Kumar Kartikeya Dwivedi
2022-09-04 20:41 ` [PATCH RFC bpf-next v1 32/32] selftests/bpf: Add referenced local kptr tests Kumar Kartikeya Dwivedi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220911223132.vnhyyojwjzdzo4wr@MacBook-Pro-4.local \
    --to=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davemarchevsky@fb.com \
    --cc=delyank@fb.com \
    --cc=memxor@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.