bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eduard Zingerman <eddyz87@gmail.com>
To: bpf@vger.kernel.org, ast@kernel.org
Cc: andrii@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev,
	kernel-team@fb.com, yonghong.song@linux.dev, sunhao.th@gmail.com,
	Eduard Zingerman <eddyz87@gmail.com>
Subject: [PATCH bpf-next 0/4] bpf: track find_equal_scalars history on per-instruction level
Date: Thu, 22 Feb 2024 02:50:01 +0200	[thread overview]
Message-ID: <20240222005005.31784-1-eddyz87@gmail.com> (raw)

This is a fix for precision tracking bug reported in [0].
It supersedes my previous attempt to fix similar issue in commit [1].
Here is a minimized test case from [0]:

    0:  call bpf_get_prandom_u32;
    1:  r7 = r0;
    2:  r8 = r0;
    3:  call bpf_get_prandom_u32;
    4:  if r0 > 1 goto +0;
    /* --- checkpoint #1: r7.id=1, r8.id=1 --- */
    5:  if r8 >= r0 goto 9f;
    6:  r8 += r8;
    /* --- checkpoint #2: r7.id=1, r8.id=0 --- */
    7:  if r7 == 0 goto 9f;
    8:  r0 /= 0;
    /* --- checkpoint #3 --- */
    9:  r0 = 42;
    10: exit;

W/o this fix verifier incorrectly assumes that instruction at label
(8) is unreachable. The issue is caused by failure to infer
precision mark for r0 at checkpoint #1:
- first verification path is:
  - (0-4): r0 range [0,1];
  - (5): r8 range [0,0], propagated to r7;
  - (6): r8.id is reset;
  - (7): jump is predicted to happen;
  - (9-10): safe exit.
- when jump at (7) is predicted mark_chain_precision() for r7 is
  called and backtrack_insn() proceeds as follows:
  - at (7) r7 is marked as precise;
  - at (5) r8 is not currently tracked and thus r0 is not marked;
  - at (4-5) boundary logic from [1] is triggered and r7,r8 are marked
    as precise;
  - => r0 precision mark is missed.
- when second branch of (4) is considered, verifier prunes the state
  because r0 is not marked as precise in the visited state.

Basically, backtracking logic fails to notice that at (5)
range information is gained for both r7 and r8, and thus both
r8 and r0 have to be marked as precise.
This happens because [1] can only account for such range
transfers at parent/child state boundaries.

The solution suggested by Andrii Nakryiko in [0] is to use jump
history to remember which registers gained range as a result of
find_equal_scalars() and use this information in backtrack_insn().
Which is what this patch-set does.

The patch-set uses u64 value as a vector of 10-bit values that
identify registers gaining range in find_equal_scalars().
This amounts to maximum of 6 possible values.
To check if such capacity is sufficient I've instrumented kernel
to track a histogram for maximal amount of registers that gain range
in find_equal_scalars per program verification [2].
Measurements done for verifier selftests and Cilium bpf object files
from [3] show that number of such registers is *always* <= 4 and
in 98% of cases it is <= 2.

When tested on a subset of selftests identified by
selftests/bpf/veristat.cfg and Cilium bpf object files from [3]
this patch-set has minimal verification performance impact:

File                      Program                   Insns    (DIFF)  States (DIFF)
------------------------  ------------------------  ---------------  -------------
bpf_host.o                tail_handle_nat_fwd_ipv4     -75 (-0.61%)    -3 (-0.39%)
pyperf180.bpf.o           on_event                     -24 (-0.02%)    -8 (-0.09%)
pyperf600_nounroll.bpf.o  on_event                  -11498 (-2.12%)  +551 (+1.64%)

Note:
  patch #1 is a small refactoring which is not really used by
  subsequent patches, but it fixes a surprising behavior that I hit
  while exploring solutions for the issue at hand,
  thus I decided to keep it.

[0] https://lore.kernel.org/bpf/CAEf4BzZ0xidVCqB47XnkXcNhkPWF6_nTV7yt+_Lf0kcFEut2Mg@mail.gmail.com/
[1] 904e6ddf4133 ("bpf: Use scalar ids in mark_chain_precision()")
[2] https://github.com/eddyz87/bpf/tree/find-equal-scalars-in-jump-history-with-stats
[3] https://github.com/anakryiko/cilium

Eduard Zingerman (4):
  bpf: replace env->cur_hist_ent with a getter function
  bpf: track find_equal_scalars history on per-instruction level
  bpf: remove mark_precise_scalar_ids()
  selftests/bpf: tests for per-insn find_equal_scalars() precision
    tracking

 include/linux/bpf_verifier.h                  |   2 +-
 kernel/bpf/verifier.c                         | 356 ++++++++++--------
 .../selftests/bpf/progs/verifier_scalar_ids.c | 256 +++++++++----
 .../bpf/progs/verifier_subprog_precision.c    |   2 +-
 .../testing/selftests/bpf/verifier/precise.c  |  10 +-
 5 files changed, 395 insertions(+), 231 deletions(-)

--
2.43.0

             reply	other threads:[~2024-02-22  0:50 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-22  0:50 Eduard Zingerman [this message]
2024-02-22  0:50 ` [PATCH bpf-next 1/4] bpf: replace env->cur_hist_ent with a getter function Eduard Zingerman
2024-02-28 19:46   ` Andrii Nakryiko
2024-02-28 21:23     ` Eduard Zingerman
2024-02-22  0:50 ` [PATCH bpf-next 2/4] bpf: track find_equal_scalars history on per-instruction level Eduard Zingerman
2024-02-28 19:58   ` Andrii Nakryiko
2024-02-28 21:16     ` Eduard Zingerman
2024-02-28 21:36       ` Andrii Nakryiko
2024-02-28 22:39         ` Eduard Zingerman
2024-02-28 21:40       ` Andrii Nakryiko
2024-02-28 23:01   ` Andrii Nakryiko
2024-02-28 23:29     ` Eduard Zingerman
2024-03-01 17:34       ` Andrii Nakryiko
2024-03-01 17:44         ` Eduard Zingerman
2024-03-04 23:37           ` Andrii Nakryiko
2024-02-22  0:50 ` [PATCH bpf-next 3/4] bpf: remove mark_precise_scalar_ids() Eduard Zingerman
2024-02-22  0:50 ` [PATCH bpf-next 4/4] selftests/bpf: tests for per-insn find_equal_scalars() precision tracking Eduard Zingerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240222005005.31784-1-eddyz87@gmail.com \
    --to=eddyz87@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=martin.lau@linux.dev \
    --cc=sunhao.th@gmail.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).