All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Dave Marchevsky <davemarchevsky@meta.com>
Cc: Dave Marchevsky <davemarchevsky@fb.com>,
	bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Kernel Team <kernel-team@fb.com>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH bpf-next 10/13] bpf, x86: BPF_PROBE_MEM handling for insn->off < 0
Date: Wed, 7 Dec 2022 16:47:34 -0800	[thread overview]
Message-ID: <20221208004734.3deulbouezpiehrg@macbook-pro-6.dhcp.thefacebook.com> (raw)
In-Reply-To: <ea0259d9-2f29-bfbc-011f-810d3e2654a8@meta.com>

On Wed, Dec 07, 2022 at 06:39:38PM -0500, Dave Marchevsky wrote:
> >>
> >> 0000000000000000 <less>:
> >> ;       return node_a->key < node_b->key;
> >>        0:       79 22 f0 ff 00 00 00 00 r2 = *(u64 *)(r2 - 0x10)
> >>        1:       79 11 f0 ff 00 00 00 00 r1 = *(u64 *)(r1 - 0x10)
> >>        2:       b4 00 00 00 01 00 00 00 w0 = 0x1
> >> ;       return node_a->key < node_b->key;
> > 
> > I see. That's the same bug.
> > The args to callback should have been PTR_TO_BTF_ID | PTR_TRUSTED with 
> > correct positive offset.
> > Then node_a = container_of(a, struct node_data, node);
> > would have produced correct offset into proper btf_id.
> > 
> > The verifier should be passing into less() the btf_id
> > of struct node_data instead of btf_id of struct bpf_rb_node.
> > 
> 
> The verifier is already passing the struct node_data type, not bpf_rb_node.
> For less() args, and rbtree_{first,remove} retval, mark_reg_datastructure_node
> - added in patch 8 - is doing as you describe.
> 
> Verifier sees less' arg regs as R=ptr_to_node_data(off=16). If it was
> instead passing R=ptr_to_bpf_rb_node(off=0), attempting to access *(reg - 0x10)
> would cause verifier err.

Ahh. I finally got it :)
Please put these details in the commit log when you respin.

> >>        3:       cd 21 01 00 00 00 00 00 if r1 s< r2 goto +0x1 <LBB2_2>
> >>        4:       b4 00 00 00 00 00 00 00 w0 = 0x0
> >>
> >> 0000000000000028 <LBB2_2>:
> >> ;       return node_a->key < node_b->key;
> >>        5:       95 00 00 00 00 00 00 00 exit
> >>
> >> Insns 0 and 1 are loading node_b->key and node_a->key, respectively, using
> >> negative insn->off. Verifier's view or R1 and R2 before insn 0 is
> >> untrusted_ptr_node_data(off=16). If there were some intermediate insns
> >> storing result of container_of() before dereferencing:
> >>
> >>   r3 = (r2 - 0x10)
> >>   r2 = *(u64 *)(r3)
> >>
> >> Verifier would see R3 as untrusted_ptr_node_data(off=0), and load for
> >> r2 would have insn->off = 0. But LLVM decides to just do a load-with-offset
> >> using original arg ptrs to less() instead of storing container_of() ptr
> >> adjustments.
> >>
> >> Since the container_of usage and code pattern in above example's less()
> >> isn't particularly specific to this series, I think there are other scenarios
> >> where such code would be generated and considered this a general bugfix in
> >> cover letter.
> > 
> > imo the negative offset looks specific to two misuses of PTR_UNTRUSTED in this set.
> > 
> 
> If I used PTR_TRUSTED here, the JITted instructions would still do a load like
> r2 = *(u64 *)(r2 - 0x10). There would just be no BPF_PROBE_MEM runtime checking
> insns generated, avoiding negative insn issue there. But the negative insn->off
> load being generated is not specific to PTR_UNTRUSTED.

yep.

> > 
> > Exactly. More flags will only increase the confusion.
> > Please try to make callback args as proper PTR_TRUSTED and disallow calling specific
> > rbtree kfuncs while inside this particular callback to prevent recursion.
> > That would solve all these issues, no?
> > Writing into such PTR_TRUSTED should be still allowed inside cb though it's bogus.
> > 
> > Consider less() receiving btf_id ptr_trusted of struct node_data and it contains
> > both link list and rbtree.
> > It should still be safe to operate on link list part of that node from less()
> > though it's not something we would ever recommend.
> 
> I definitely want to allow writes on non-owning references. In order to properly
> support this, there needs to be a way to designate a field as a "key":
> 
> struct node_data {
>   long key __key;
>   long data;
>   struct bpf_rb_node node;
> };
> 
> or perhaps on the rb_root via __contains or separate tag:
> 
> struct bpf_rb_root groot __contains(struct node_data, node, key);
> 
> This is necessary because rbtree's less() uses key field to determine order, so
> we don't want to allow write to the key field when the node is in a rbtree. If
> such a write were possible the rbtree could easily be placed in an invalid state
> since the new key may mean that the rbtree is no longer sorted. Subsequent add()
> operations would compare less() using the new key, so other nodes will be placed
> in wrong spot as well.
> 
> Since PTR_UNTRUSTED currently allows read but not write, and prevents use of
> non-owning ref as kfunc arg, it seemed to be reasonable tag for less() args.
> 
> I was planning on adding __key / non-owning-ref write support as a followup, but
> adding it as part of this series will probably save a lot of back-and-forth.
> Will try to add it.

Just key mark might not be enough. less() could be doing all sort of complex
logic on more than one field and even global fields.
But what is the concern with writing into 'key' ?
The rbtree will not be sorted. find/add operation will not be correct,
but nothing will crash. At the end bpf_rb_root_free() will walk all
unsorted nodes anyway and free them all.
Even if we pass PTR_TRUSTED | MEM_RDONLY pointers into less() the less()
can still do nonsensical things like returning random true/false.
Doesn't look like an issue to me.

  reply	other threads:[~2022-12-08  0:47 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-06 23:09 [PATCH bpf-next 00/13] BPF rbtree next-gen datastructure Dave Marchevsky
2022-12-06 23:09 ` [PATCH bpf-next 01/13] bpf: Loosen alloc obj test in verifier's reg_btf_record Dave Marchevsky
2022-12-07 16:41   ` Kumar Kartikeya Dwivedi
2022-12-07 18:34     ` Dave Marchevsky
2022-12-07 18:59       ` Alexei Starovoitov
2022-12-07 20:38         ` Dave Marchevsky
2022-12-07 22:46           ` Alexei Starovoitov
2022-12-07 23:42             ` Dave Marchevsky
2022-12-07 19:03       ` Kumar Kartikeya Dwivedi
2022-12-06 23:09 ` [PATCH bpf-next 02/13] bpf: map_check_btf should fail if btf_parse_fields fails Dave Marchevsky
2022-12-07  1:32   ` Alexei Starovoitov
2022-12-07 16:49   ` Kumar Kartikeya Dwivedi
2022-12-07 19:05     ` Alexei Starovoitov
2022-12-17  8:59       ` Dave Marchevsky
2022-12-06 23:09 ` [PATCH bpf-next 03/13] bpf: Minor refactor of ref_set_release_on_unlock Dave Marchevsky
2022-12-06 23:09 ` [PATCH bpf-next 04/13] bpf: rename list_head -> datastructure_head in field info types Dave Marchevsky
2022-12-07  1:41   ` Alexei Starovoitov
2022-12-07 18:52     ` Dave Marchevsky
2022-12-07 19:01       ` Alexei Starovoitov
2022-12-06 23:09 ` [PATCH bpf-next 05/13] bpf: Add basic bpf_rb_{root,node} support Dave Marchevsky
2022-12-07  1:48   ` Alexei Starovoitov
2022-12-06 23:09 ` [PATCH bpf-next 06/13] bpf: Add bpf_rbtree_{add,remove,first} kfuncs Dave Marchevsky
2022-12-07 14:20   ` kernel test robot
2022-12-06 23:09 ` [PATCH bpf-next 07/13] bpf: Add support for bpf_rb_root and bpf_rb_node in kfunc args Dave Marchevsky
2022-12-07  1:51   ` Alexei Starovoitov
2022-12-06 23:09 ` [PATCH bpf-next 08/13] bpf: Add callback validation to kfunc verifier logic Dave Marchevsky
2022-12-07  2:01   ` Alexei Starovoitov
2022-12-17  8:49     ` Dave Marchevsky
2022-12-06 23:09 ` [PATCH bpf-next 09/13] bpf: Special verifier handling for bpf_rbtree_{remove, first} Dave Marchevsky
2022-12-07  2:18   ` Alexei Starovoitov
2022-12-06 23:09 ` [PATCH bpf-next 10/13] bpf, x86: BPF_PROBE_MEM handling for insn->off < 0 Dave Marchevsky
2022-12-07  2:39   ` Alexei Starovoitov
2022-12-07  6:46     ` Dave Marchevsky
2022-12-07 18:06       ` Alexei Starovoitov
2022-12-07 23:39         ` Dave Marchevsky
2022-12-08  0:47           ` Alexei Starovoitov [this message]
2022-12-08  8:50             ` Dave Marchevsky
2022-12-06 23:09 ` [PATCH bpf-next 11/13] bpf: Add bpf_rbtree_{add,remove,first} decls to bpf_experimental.h Dave Marchevsky
2022-12-06 23:09 ` [PATCH bpf-next 12/13] libbpf: Make BTF mandatory if program BTF has spin_lock or alloc_obj type Dave Marchevsky
2022-12-06 23:10 ` [PATCH bpf-next 13/13] selftests/bpf: Add rbtree selftests Dave Marchevsky
2022-12-07  2:50 ` [PATCH bpf-next 00/13] BPF rbtree next-gen datastructure patchwork-bot+netdevbpf
2022-12-07 19:36 ` Kumar Kartikeya Dwivedi
2022-12-07 22:28   ` Dave Marchevsky
2022-12-07 23:06     ` Alexei Starovoitov
2022-12-08  1:18       ` Dave Marchevsky
2022-12-08  3:51         ` Alexei Starovoitov
2022-12-08  8:28           ` Dave Marchevsky
2022-12-08 12:57             ` Kumar Kartikeya Dwivedi
2022-12-08 20:36               ` Alexei Starovoitov
2022-12-08 23:35                 ` Dave Marchevsky
2022-12-09  0:39                   ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221208004734.3deulbouezpiehrg@macbook-pro-6.dhcp.thefacebook.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davemarchevsky@fb.com \
    --cc=davemarchevsky@meta.com \
    --cc=kernel-team@fb.com \
    --cc=memxor@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.