All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	kernel-team@fb.com
Subject: Re: [PATCH bpf-next 7/7] bpf: unify PTR_TO_MAP_{KEY,VALUE} with default case in regsafe()
Date: Wed, 4 Jan 2023 14:35:21 -0800	[thread overview]
Message-ID: <20230104223521.hi2wvabfn7ldgh6o@macbook-pro-6.dhcp.thefacebook.com> (raw)
In-Reply-To: <CAEf4BzaaE1B0Xezb5jrH0p-my4_GEb7EPqfAQVBPLyLkq672=g@mail.gmail.com>

On Tue, Jan 03, 2023 at 02:04:44PM -0800, Andrii Nakryiko wrote:
> > It sounds logical, but it can get tricky with ranges and branch taken logic.
> > Consider something like:
> > R1=(min=2,max=8), R2=(min=1, max=10)
> > if (R1 within R2) // bpf prog is doing its own 'within'
> 
> a bit confused what is "R1 within R2" here and what you mean "bpf prog
> is doing its own 'within'"? Any sort of `R1 < R2` checks (and any
> other op: <=, >=, etc) can't really kick in branch elimination because
> R2_min=1 < R1_max=8, so arithmetically speaking we can't conclude that
> "R1 is always smaller than R2", so both branches would have to be
> examined.

Something like that. Details didn't matter to me.
It was hypothetical 'within' operation just to illustrate the point.

> But I probably misunderstood your example, sorry.
> 
> >   // branch taken kicks in
> > else
> >   // issues that were never checked
> >
> > Now new state has:
> > R1=(min=4,max=6), R2=(min=5, max=5)
> >
> > Both R1 and R2 of new state individually range_within of old safe state,
> > but together the prog may go to the unverified path.
> > Not sure whether it's practical today.
> > You asked for hypothetical, so here it goes :)
> 
> No problem with "hypothetical-ness". But my confusion and argument is
> similarly "in principle"-like. Because if such an example above can be
> constructed then this would be an issue for SCALAR as well, right? And
> if you can bypass verifier's safety with SCALAR, you (hypothetically)
> could use that SCALAR to do out-of-bounds memory access by adding this
> SCALAR to some mem-like register.

Correct. The issue would apply to regular scalar if such 'within' operation
was available.

> So that's my point and my source of confusion: if we don't trust
> var_off+range_within() logic to handle *all* situations correctly,
> then we should be worried about SCALARs just as much as anything else
> (unless, as usual, I missed something).

Yes. I personally don't believe that doing range_within for all regtypes
by default is a safer way forward.
The example wasn't real. It was trying to demonstrate a possible issue.
You insist to see a real example with range_within.
I don't have it. It's a gut feel that it could be there because
I could construct it with fake 'within'.

> > More gut feel than real issue.
> >
> > >
> > > >
> > > > SCALARS and PTR_TO_BTF_ID will likely dominate future bpf progs.
> > > > Keeping default as regs_exact (that does ID match) is safer default.
> > >
> > > It's fine, though the point of this patch set was patch #7, enabling
> > > logic similar to PTR_TO_MAP_VALUE for PTR_TO_MEM and PTR_TO_BUF. I can
> > > send specific fixes for that, no problem. But as I said above, I'm
> > > really curious to understand what kind of situations will lead to
> > > unsafety if we do var_off+range_within checks.
> >
> > PTR_TO_MEM and PTR_TO_BUF explicitly are likely ok despite my convoluted
> > example above.
> > I'm less sure about PTR_TO_BTF_ID. It could be ok.
> > Just feels safer to opt-in each type explicitly.
> 
> Sure, I can just do a simple opt-in, no problem. As I said, mostly
> trying to understand the issue overall.
> 
> For PTR_TO_BTF_ID specifically, I can see how we can enable
> var_off+range_within for cases when we access some array, right? But
> then I think we'll be enforcing that we are staying within the
> boundaries of a single array field, never crossing into another field.

Likely yes, but why?
You're trying hard to collapse the switch statement in regsafe()
while claiming it's a safer way. I don't see it this way.
For example the upcoming active_lock_id would need its own check_ids() call.
It will be necessary for PTR_TO_BTF_ID only.
Why collapse the switch into 'default:' just to bring some back?
The default without checking active_lock_id through check_ids
would be wrong, so collapsed switch doesn't make things safer.

> But just to take a step back, from my perspective var_off and
> range_within are complementary and solve slightly different uses, but
> should be both satisfied:
>   - var_off is not precise with range boundaries (due to some bits too
> coarsely marked as unknown), but it's useful to enforce having a value
> being a multiple of some power-of-2 (e.g., knowing for sure that
> lowest 2 bits are zero means that value is multiple of 4; I haven't
> checked, but I assume we check with for various pointer accesses to
> ensure we don't have misaligned reads). They can be only approximately
> used for actual possible range of values.

Right. var_off is used for alignment checking too. grep tnum_is_aligned.
We have bare minimum of testing for that though.
Only few tests in the test_verifier use BPF_F_STRICT_ALIGNMENT

>   - range_within() can and should be used for *precise* range of value
> tracking, but it can't express that alignment restriction.

Right.

> So while I previously thought that we can do away without var_off, I
> now think there are cases when it's necessary. But if we are sure that
> we handle any SCALAR case correctly for any possible var_off +
> range_within situation, it should be fine to do that for any mem-like
> pointer just as much, as var_off+range_within is basically a MEM +
> SCALAR combined case.

Right. Likely true.

> Anyways, I'm not blocked on this, but I think we'll benefit from
> taking this discussion to its logical conclusion.

Not sure what conclusion you're looking for.

  reply	other threads:[~2023-01-04 22:35 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-23  5:49 [PATCH bpf-next 0/7] BPF verifier state equivalence checks improvements Andrii Nakryiko
2022-12-23  5:49 ` [PATCH bpf-next 1/7] bpf: teach refsafe() to take into account ID remapping Andrii Nakryiko
2022-12-23  5:49 ` [PATCH bpf-next 2/7] bpf: reorganize struct bpf_reg_state fields Andrii Nakryiko
2022-12-23  5:49 ` [PATCH bpf-next 3/7] bpf: generalize MAYBE_NULL vs non-MAYBE_NULL rule Andrii Nakryiko
2022-12-23  5:49 ` [PATCH bpf-next 4/7] bpf: reject non-exact register type matches in regsafe() Andrii Nakryiko
2022-12-23  5:49 ` [PATCH bpf-next 5/7] bpf: perform byte-by-byte comparison only when necessary " Andrii Nakryiko
2022-12-23  5:49 ` [PATCH bpf-next 6/7] bpf: fix regs_exact() logic in regsafe() to remap IDs correctly Andrii Nakryiko
2022-12-23  5:49 ` [PATCH bpf-next 7/7] bpf: unify PTR_TO_MAP_{KEY,VALUE} with default case in regsafe() Andrii Nakryiko
2022-12-28  2:00   ` Alexei Starovoitov
2022-12-29 21:59     ` Andrii Nakryiko
2022-12-30  2:19       ` Alexei Starovoitov
2023-01-03 22:04         ` Andrii Nakryiko
2023-01-04 22:35           ` Alexei Starovoitov [this message]
2023-01-04 23:03             ` Andrii Nakryiko
2023-01-05  0:14               ` Alexei Starovoitov
2023-01-11 19:08                 ` Andrii Nakryiko
2022-12-28  2:10 ` [PATCH bpf-next 0/7] BPF verifier state equivalence checks improvements patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230104223521.hi2wvabfn7ldgh6o@macbook-pro-6.dhcp.thefacebook.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.