All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 0/1] docs/bpf: Add description of register liveness tracking algorithm
@ 2023-01-30 18:23 Eduard Zingerman
  2023-01-30 18:24 ` [PATCH bpf-next v2 1/1] " Eduard Zingerman
  0 siblings, 1 reply; 5+ messages in thread
From: Eduard Zingerman @ 2023-01-30 18:23 UTC (permalink / raw)
  To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yhs, Eduard Zingerman

An overview of the register tracking liveness algorithm.
Previous version posted here: [1].
This update includes fixes suggested by Andrii Nakryiko:
- wording corrected to use term "stack slot" instead of "stack spill";
- parentage chain diagram updated to show nil links for frame #1;
- added example for non-BPF_DW writes behavior;
- explanation in "Read marks propagation for cache hits" is reworked.

[1] https://lore.kernel.org/bpf/20230124220343.2942203-1-eddyz87@gmail.com/

Eduard Zingerman (1):
  docs/bpf: Add description of register liveness tracking algorithm

 Documentation/bpf/verifier.rst | 266 +++++++++++++++++++++++++++++++++
 1 file changed, 266 insertions(+)

-- 
2.39.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH bpf-next v2 1/1] docs/bpf: Add description of register liveness tracking algorithm
  2023-01-30 18:23 [PATCH bpf-next v2 0/1] docs/bpf: Add description of register liveness tracking algorithm Eduard Zingerman
@ 2023-01-30 18:24 ` Eduard Zingerman
  2023-01-30 19:59   ` David Vernet
  2023-01-31  2:33   ` Alexei Starovoitov
  0 siblings, 2 replies; 5+ messages in thread
From: Eduard Zingerman @ 2023-01-30 18:24 UTC (permalink / raw)
  To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yhs, Eduard Zingerman

This is a followup for [1], adds an overview for the register liveness
tracking, covers the following points:
- why register liveness tracking is useful;
- how register parentage chains are constructed;
- how liveness marks are applied using the parentage chains.

[1] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 Documentation/bpf/verifier.rst | 266 +++++++++++++++++++++++++++++++++
 1 file changed, 266 insertions(+)

diff --git a/Documentation/bpf/verifier.rst b/Documentation/bpf/verifier.rst
index 3afa548ec28c..7269933428e1 100644
--- a/Documentation/bpf/verifier.rst
+++ b/Documentation/bpf/verifier.rst
@@ -316,6 +316,272 @@ Pruning considers not only the registers but also the stack (and any spilled
 registers it may hold).  They must all be safe for the branch to be pruned.
 This is implemented in states_equal().
 
+Some technical details about state pruning implementation could be found below.
+
+Register liveness tracking
+--------------------------
+
+In order to make state pruning effective liveness state is tracked for each
+register and stack slot. The basic idea is to identify which registers or stack
+slots were used by children of the state to reach the program exit. Registers
+that were never used could be removed from the cached state thus making more
+states equivalent to a cached state. This could be illustrated by the following
+program::
+
+  0: call bpf_get_prandom_u32()
+  1: r1 = 0
+  2: if r0 == 0 goto +1
+  3: r0 = 1
+  --- checkpoint ---
+  4: r0 = r1
+  5: exit
+
+Suppose that state cache entry is created at instruction #4 (such entries are
+also called "checkpoints" in the text below), verifier reaches this instruction
+in two states:
+
+* r0 = 1, r1 = 0
+* r0 = 0, r1 = 0
+
+However, only the value of register ``r1`` is important to successfully finish
+verification. The goal of liveness tracking algorithm is to spot this fact and
+figure out that both states are actually equivalent.
+
+Data structures
+~~~~~~~~~~~~~~~
+
+Liveness is tracked using the following data structures::
+
+  enum bpf_reg_liveness {
+	REG_LIVE_NONE = 0,
+	REG_LIVE_READ32 = 0x1,
+	REG_LIVE_READ64 = 0x2,
+	REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
+	REG_LIVE_WRITTEN = 0x4,
+	REG_LIVE_DONE = 0x8,
+  };
+
+  struct bpf_reg_state {
+ 	...
+	struct bpf_reg_state *parent;
+ 	...
+	enum bpf_reg_liveness live;
+ 	...
+  };
+
+  struct bpf_stack_state {
+	struct bpf_reg_state spilled_ptr;
+	...
+  };
+
+  struct bpf_func_state {
+	struct bpf_reg_state regs[MAX_BPF_REG];
+        ...
+	struct bpf_stack_state *stack;
+  }
+  
+  struct bpf_verifier_state {
+	struct bpf_func_state *frame[MAX_CALL_FRAMES];
+	struct bpf_verifier_state *parent;
+        ...
+  }
+
+* ``REG_LIVE_NONE`` is an initial value assigned to ``->live`` fields upon new
+  verifier state creation;
+
+* ``REG_LIVE_WRITTEN`` means that the value of the register (or stack slot) is
+  defined by some instruction "executed" within verifier state;
+
+* ``REG_LIVE_READ{32,64}`` means that the value of the register (or stack slot)
+  is read by some instruction "executed" within verifier state;
+
+* ``REG_LIVE_DONE`` is a marker used by ``clean_verifier_state()`` to avoid
+  processing same verifier state multiple times and for some sanity checks;
+
+* ``->live`` field values are formed by combining ``enum bpf_reg_liveness``
+  values using bitwise or.
+
+Register parentage chains
+~~~~~~~~~~~~~~~~~~~~~~~~~
+  
+In order to propagate information between parent and child states register
+parentage chain is established. Each register or stack slot is linked to a
+corresponding register or stack slot in it's parent state via a ``->parent``
+pointer. This link is established upon state creation in function
+``is_state_visited()`` and is never changed.
+
+The rules for correspondence between registers / stack slots are as follows:
+
+* For current stack frame registers and stack slots of the new state are linked
+  to the registers and stack slots of the parent state with the same indices.
+
+* For outer stack frames only caller saved registers (r6-r9) and stack slots are
+  linked to the registers and stack slots of the parent state with the same
+  indices.
+
+This could be illustrated by the following diagram (arrows stand for
+``->parent`` pointers)::
+
+      ...                    ; Frame #0, some instructions
+  --- checkpoint #0 ---
+  1 : r6 = 42                ; Frame #0
+  --- checkpoint #1 ---
+  2 : call foo()             ; Frame #0
+      ...                    ; Frame #1, instructions from foo()
+  --- checkpoint #2 ---
+      ...                    ; Frame #1, instructions from foo()
+  --- checkpoint #3 ---
+      exit                   ; Frame #1, return from foo()
+  3 : r1 = r6                ; Frame #0  <- current state
+  
+             +--------------------------+--------------------------+ 
+             |         Frame #0         |         Frame #1         |
+  Checkpoint +--------------------------+--------------------------+
+  #0         | r0-r5 | r6-r9 | fp-8 ... |                           
+             +--------------------------+                           
+                ^       ^       ^
+                |       |       |          
+  Checkpoint +--------------------------+                           
+  #1         | r0-r5 | r6-r9 | fp-8 ... |
+             +--------------------------+
+                        ^       ^         nil     nil     nil 
+                        |       |          |       |       |          
+  Checkpoint +--------------------------+--------------------------+
+  #2         | r0-r5 | r6-r9 | fp-8 ... | r0-r5 | r6-r9 | fp-8 ... |
+             +--------------------------+--------------------------+
+                        ^       ^          ^       ^       ^  
+                        |       |          |       |       |
+  Checkpoint +--------------------------+--------------------------+
+  #3         | r0-r5 | r6-r9 | fp-8 ... | r0-r5 | r6-r9 | fp-8 ... |
+             +--------------------------+--------------------------+
+                        ^       ^
+                        |       |          
+  Current    +--------------------------+                           
+  state      | r0-r5 | r6-r9 | fp-8 ... |                           
+             +--------------------------+
+                        \           
+                          r6 read mark is propagated via
+                          these links all the way up to
+                          checkpoint #1.
+
+Liveness marks tracking
+~~~~~~~~~~~~~~~~~~~~~~~
+
+For each processed instruction verifier propagates information about reads up
+the parentage chain and saves information about writes in the current state.
+The information about reads is propagated by function ``mark_reg_read()`` which
+could be summarized as follows::
+
+  mark_reg_read(struct bpf_reg_state *state):
+      parent = state->parent
+      while parent:
+          if parent->live & REG_LIVE_WRITTEN:
+              break
+          if parent->live & REG_LIVE_READ64:
+              break
+          parent->live |= REG_LIVE_READ64
+          state = parent
+          parent = state->parent
+
+Note: details about REG_LIVE_READ32 are omitted.
+
+Also note: the read marks are applied to the **parent** state while write marks
+are applied to the **current** state.
+
+Because stack writes could have different sizes ``REG_LIVE_WRITTEN`` marks are
+applied conservatively: stack slots are marked as written only if write size
+corresponds to the size of the register, e.g. see function ``save_register_state()``.
+
+Consider the following example::
+
+  0: (*u64)(r10 - 8) = 0
+  --- checkpoint #0 ---
+  1: (*u32)(r10 - 8) = 1
+  2: r1 = (*u32)(r10 - 8)
+
+Because write size at instruction (1) is smaller than register size the write
+mark will not be added to fp-8 slot when (1) is verified, thus the fp-8 read at
+instruction (2) would propagate read mark for fp-8 up to checkpoint #0.
+
+Once ``BPF_EXIT`` instruction is reached function ``update_branch_counts()`` is
+called to update the ``->branches`` counter for each verifier state in a chain
+of parent verifier states. When ``->branches`` counter reaches zero the verifier
+state becomes a valid entry in a set of cached verifier states.
+
+Each entry of the verifier states cache is post-processed by a function
+``clean_live_states()``. This function marks all registers and stack slots
+without ``REG_LIVE_READ{32,64}`` marks as ``NOT_INIT`` or ``STACK_INVALID``.
+Registers/stack slots marked in this way are ignored in function ``stacksafe()``
+called from ``states_equal()`` when state cache entry is considered for
+equivalence with a current state.
+
+Now it is possible to explain how the example from the beginning of the section
+works::
+
+  0: call bpf_get_prandom_u32()
+  1: r1 = 0
+  2: if r0 == 0 goto +1
+  3: r0 = 1
+  --- checkpoint[0] ---
+  4: r0 = r1
+  5: exit
+
+* At instruction #2 branching point is reached and state ``{ r0 == 0, r1 == 0, pc == 4 }``
+  is pushed to states processing queue (pc stands for program counter).
+  
+* At instruction #4:
+  
+  * ``checkpoint[0]`` states cache entry is created: ``{ r0 == 1, r1 == 0, pc == 4 }``;
+  * ``checkpoint[0].r0`` is marked as written;
+  * ``checkpoint[0].r1`` is marked as read;
+
+* At instruction #5 exit is reached and ``checkpoint[0]`` can now be processed
+  by ``clean_live_states()``, after this processing ``checkpoint[0].r0`` has a
+  read mark and all other registers and stack slots are marked as ``NOT_INIT``
+  or ``STACK_INVALID``
+  
+* The state ``{ r0 == 0, r1 == 0, pc == 4 }`` is popped from the states queue
+  and is compared against a cached state ``{ r1 == 0, pc == 4 }``, the states
+  are considered equivalent.
+
+Read marks propagation for cache hits
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Another important point is handling of read marks when a previously verified
+state is found in the states cache. All read marks present on registers and
+stack slots of the cached state must be propagated over the parentage chain of
+the current state. Function ``propagate_liveness()`` handles this case.
+
+For example, consider the following state parentage chain (S is a
+starting state, A-E are derived states, -> arrows show which state is
+derived from which)::
+
+                      r1 read
+               <-------------                    A[r1] == 0
+                    +---+                        C[r1] == 0
+      S ---> A ---> | B | ---> exit              E[r1] == 1
+      |             |   |
+      ` ---> C ---> | D |
+      |             +---+
+      ` ---> E        
+                      ^
+             ^        |___   suppose all these
+             |             states are at insn #Y
+      suppose all these
+    states are at insn #X
+
+* Chain of states ``S -> A -> B -> exit`` is verified first.
+
+* While ``B -> exit`` is verified, register ``r1`` is read and this read mark is
+  propagated up to state ``A``.
+
+* When chain of states ``C -> D`` is verified the state ``D`` turns out to be
+  equivalent to state ``B``.
+
+* The read mark for ``r1`` has to be propagated to state ``C``, otherwise state
+  ``C`` might get mistakenly marked as equivalent to state ``E`` even though
+  values for register ``r1`` differ between ``C`` and ``E``.
+
 Understanding eBPF verifier messages
 ====================================
 
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH bpf-next v2 1/1] docs/bpf: Add description of register liveness tracking algorithm
  2023-01-30 18:24 ` [PATCH bpf-next v2 1/1] " Eduard Zingerman
@ 2023-01-30 19:59   ` David Vernet
  2023-01-30 23:45     ` Eduard Zingerman
  2023-01-31  2:33   ` Alexei Starovoitov
  1 sibling, 1 reply; 5+ messages in thread
From: David Vernet @ 2023-01-30 19:59 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Mon, Jan 30, 2023 at 08:24:00PM +0200, Eduard Zingerman wrote:
> This is a followup for [1], adds an overview for the register liveness
> tracking, covers the following points:
> - why register liveness tracking is useful;
> - how register parentage chains are constructed;
> - how liveness marks are applied using the parentage chains.
> 
> [1] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
> 
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

This is great, thanks a lot for writing this up. Left some comments.

> ---
>  Documentation/bpf/verifier.rst | 266 +++++++++++++++++++++++++++++++++
>  1 file changed, 266 insertions(+)
> 
> diff --git a/Documentation/bpf/verifier.rst b/Documentation/bpf/verifier.rst
> index 3afa548ec28c..7269933428e1 100644
> --- a/Documentation/bpf/verifier.rst
> +++ b/Documentation/bpf/verifier.rst
> @@ -316,6 +316,272 @@ Pruning considers not only the registers but also the stack (and any spilled
>  registers it may hold).  They must all be safe for the branch to be pruned.
>  This is implemented in states_equal().
>  
> +Some technical details about state pruning implementation could be found below.
> +
> +Register liveness tracking
> +--------------------------
> +
> +In order to make state pruning effective liveness state is tracked for each

Could you please add a comma after "effective"? Will use shorthand
s/xxx/yyy for future suggestions like this.

> +register and stack slot. The basic idea is to identify which registers or stack
> +slots were used by children of the state to reach the program exit. Registers

IMO the phrasing "used by children of the state" may be a bit confusing.
What do you think about this?

"The basic idea is to track which registers and stack slots are actually
used during subseqeuent execution of the program, until program exit is
reached."

> +that were never used could be removed from the cached state thus making more

s/Registers that were never/Registers and stack slots that were never

> +states equivalent to a cached state. This could be illustrated by the following
> +program::
> +
> +  0: call bpf_get_prandom_u32()
> +  1: r1 = 0
> +  2: if r0 == 0 goto +1
> +  3: r0 = 1
> +  --- checkpoint ---
> +  4: r0 = r1
> +  5: exit
> +
> +Suppose that state cache entry is created at instruction #4 (such entries are
> +also called "checkpoints" in the text below), verifier reaches this instruction
> +in two states:

Can we split these into two sentences? Wdyt about the following:

"Suppose that a state cache entry is created at instruction #4 (such
entries are also called "checkpoints" in the text below). The verifier
could reach the instruction with one of two possible register states:"

> +
> +* r0 = 1, r1 = 0
> +* r0 = 0, r1 = 0
> +
> +However, only the value of register ``r1`` is important to successfully finish
> +verification. The goal of liveness tracking algorithm is to spot this fact and

s/ of liveness tracking/of the liveness tracking

> +figure out that both states are actually equivalent.
> +
> +Data structures
> +~~~~~~~~~~~~~~~
> +
> +Liveness is tracked using the following data structures::
> +
> +  enum bpf_reg_liveness {
> +	REG_LIVE_NONE = 0,
> +	REG_LIVE_READ32 = 0x1,
> +	REG_LIVE_READ64 = 0x2,
> +	REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
> +	REG_LIVE_WRITTEN = 0x4,
> +	REG_LIVE_DONE = 0x8,
> +  };

FYI, you have the option of using the .. kernel-doc:: directive here, as
in e.g.:

.. kernel-doc:: include/linux/bpf_verifier.h
  :identifiers: bpf_reg_liveness

For enum bpf_reg_liveness and the other types. Doing so would require
you updating the kernel-doc header comment in
include/linux/bpf_verifier.h to reflect the description in [0], i.e.
something like this for enum bpf_reg_liveness:

[0]: https://docs.kernel.org/doc-guide/kernel-doc.html#structure-union-and-enumeration-documentation

/**
 * enum bpf_reg_liveness - Liveness marks, used for registers and spilled-regs
 * @REG_LIVE_NONE: reg hasn't been read or written this branch
 * @REG_LIVE_READ32: reg was read, so we're sensitive to initial value
 * @REG_LIVE_READ64: likewise, but full 64-bit content matters
 * @REG_LIVE_READ: An OR of both 32 and 64 bit read reg states.
 * @REG_LIVE_WRITTEN: reg was written first, screening off later reads
 * @REG_LIVE_DONE: liveness won't be updating this register anymore
 *
 * Read marks propagate upwards until they find a write mark; they record that
 * "one of this state's descendants read this reg" (and therefore the reg is
 * relevant for states_equal() checks).
 * Write marks collect downwards and do not propagate; they record that "the
 * straight-line code that reached this state (from its parent) wrote this reg"
 * (and therefore that reads propagated from this state or its descendants
 * should not propagate to its parent).
 * A state with a write mark can receive read marks; it just won't propagate
 * them to its parent, since the write mark is a property, not of the state,
 * but of the link between it and its parent.  See mark_reg_read() and
 * mark_stack_slot_read() in kernel/bpf/verifier.c.
 */

Not everyone likes how the rendered HTML looks, and I tend to agree that
it looks pretty blocky, but it at least keeps all of this documentation
in one place and avoids going stale. Also, it has the benefit of
automatically creating a link to the ..kernel-doc:: comment anywhere
that "enum bpf_reg_liveness" is written in the Documentation subtree.

> +
> +  struct bpf_reg_state {
> + 	...
> +	struct bpf_reg_state *parent;
> + 	...
> +	enum bpf_reg_liveness live;
> + 	...
> +  };
> +
> +  struct bpf_stack_state {
> +	struct bpf_reg_state spilled_ptr;
> +	...
> +  };
> +
> +  struct bpf_func_state {
> +	struct bpf_reg_state regs[MAX_BPF_REG];
> +        ...
> +	struct bpf_stack_state *stack;
> +  }
> +  
> +  struct bpf_verifier_state {
> +	struct bpf_func_state *frame[MAX_CALL_FRAMES];
> +	struct bpf_verifier_state *parent;
> +        ...
> +  }
> +
> +* ``REG_LIVE_NONE`` is an initial value assigned to ``->live`` fields upon new
> +  verifier state creation;
> +
> +* ``REG_LIVE_WRITTEN`` means that the value of the register (or stack slot) is
> +  defined by some instruction "executed" within verifier state;
> +
> +* ``REG_LIVE_READ{32,64}`` means that the value of the register (or stack slot)
> +  is read by some instruction "executed" within verifier state;
> +
> +* ``REG_LIVE_DONE`` is a marker used by ``clean_verifier_state()`` to avoid
> +  processing same verifier state multiple times and for some sanity checks;
> +
> +* ``->live`` field values are formed by combining ``enum bpf_reg_liveness``
> +  values using bitwise or.
> +
> +Register parentage chains
> +~~~~~~~~~~~~~~~~~~~~~~~~~
> +  
> +In order to propagate information between parent and child states register

Suggestion:

s/parent and child states register parentage chain/
  parent and child states, a *register parentage chain*

> +parentage chain is established. Each register or stack slot is linked to a
> +corresponding register or stack slot in it's parent state via a ``->parent``

s/it's/its

> +pointer. This link is established upon state creation in function

suggestion: I think you can remove the word 'function'.

> +``is_state_visited()`` and is never changed.
> +
> +The rules for correspondence between registers / stack slots are as follows:
> +
> +* For current stack frame registers and stack slots of the new state are linked
> +  to the registers and stack slots of the parent state with the same indices.

s/For current stack frame registers/For the current stack frame,
registers

> +
> +* For outer stack frames only caller saved registers (r6-r9) and stack slots are
> +  linked to the registers and stack slots of the parent state with the same
> +  indices.

s/For the outer stack frames only/For the outer stack frames, only

> +
> +This could be illustrated by the following diagram (arrows stand for
> +``->parent`` pointers)::
> +
> +      ...                    ; Frame #0, some instructions
> +  --- checkpoint #0 ---
> +  1 : r6 = 42                ; Frame #0
> +  --- checkpoint #1 ---
> +  2 : call foo()             ; Frame #0
> +      ...                    ; Frame #1, instructions from foo()
> +  --- checkpoint #2 ---
> +      ...                    ; Frame #1, instructions from foo()
> +  --- checkpoint #3 ---
> +      exit                   ; Frame #1, return from foo()
> +  3 : r1 = r6                ; Frame #0  <- current state
> +  
> +             +--------------------------+--------------------------+ 
> +             |         Frame #0         |         Frame #1         |
> +  Checkpoint +--------------------------+--------------------------+
> +  #0         | r0-r5 | r6-r9 | fp-8 ... |                           
> +             +--------------------------+                           
> +                ^       ^       ^
> +                |       |       |          
> +  Checkpoint +--------------------------+                           
> +  #1         | r0-r5 | r6-r9 | fp-8 ... |
> +             +--------------------------+
> +                        ^       ^         nil     nil     nil 
> +                        |       |          |       |       |          
> +  Checkpoint +--------------------------+--------------------------+
> +  #2         | r0-r5 | r6-r9 | fp-8 ... | r0-r5 | r6-r9 | fp-8 ... |
> +             +--------------------------+--------------------------+
> +                        ^       ^          ^       ^       ^  
> +                        |       |          |       |       |
> +  Checkpoint +--------------------------+--------------------------+
> +  #3         | r0-r5 | r6-r9 | fp-8 ... | r0-r5 | r6-r9 | fp-8 ... |
> +             +--------------------------+--------------------------+
> +                        ^       ^
> +                        |       |          
> +  Current    +--------------------------+                           
> +  state      | r0-r5 | r6-r9 | fp-8 ... |                           
> +             +--------------------------+
> +                        \           
> +                          r6 read mark is propagated via
> +                          these links all the way up to
> +                          checkpoint #1.

IMO this illustration is kind of confusing as shown, relative to the
descriptions of current / outer above. 'Frame #1' is potentially in a
different function / subprog, correct? So it seems kind of confusing to
see it propagating registers from checkpoint 3 -> checkpoint 2, etc. And
it's propagating r0-r5. Similarly, it looks like 'Current state' isn't
propagating r0-r5. I'm sure I'm misunderstanding something.

> +
> +Liveness marks tracking
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +For each processed instruction verifier propagates information about reads up

s/processed instruction verifier/processed instruction, the verifier

> +the parentage chain and saves information about writes in the current state.
> +The information about reads is propagated by function ``mark_reg_read()`` which
> +could be summarized as follows::
> +
> +  mark_reg_read(struct bpf_reg_state *state):
> +      parent = state->parent
> +      while parent:
> +          if parent->live & REG_LIVE_WRITTEN:
> +              break
> +          if parent->live & REG_LIVE_READ64:
> +              break
> +          parent->live |= REG_LIVE_READ64
> +          state = parent
> +          parent = state->parent
> +
> +Note: details about REG_LIVE_READ32 are omitted.
> +
> +Also note: the read marks are applied to the **parent** state while write marks
> +are applied to the **current** state.
> +
> +Because stack writes could have different sizes ``REG_LIVE_WRITTEN`` marks are
> +applied conservatively: stack slots are marked as written only if write size
> +corresponds to the size of the register, e.g. see function ``save_register_state()``.
> +
> +Consider the following example::
> +
> +  0: (*u64)(r10 - 8) = 0
> +  --- checkpoint #0 ---
> +  1: (*u32)(r10 - 8) = 1
> +  2: r1 = (*u32)(r10 - 8)
> +
> +Because write size at instruction (1) is smaller than register size the write

s/write size/the write size

also

s/than register size the/than register size, the

> +mark will not be added to fp-8 slot when (1) is verified, thus the fp-8 read at
> +instruction (2) would propagate read mark for fp-8 up to checkpoint #0.

s/is verified, thus/is verifier. Thus

also

s/would propagate read mark/would propagate the read mark

In general, this sentence explains mechanically why the write won't
propagate (the write size is smaller than the register size), but I
think it would be useful to explain why sizeof(write) < sizeof(reg)
implies that it won't be propagated.

> +
> +Once ``BPF_EXIT`` instruction is reached function ``update_branch_counts()`` is

s/Once ``BPF_EXIT``/Once the ``BPF_EXIT``

also

s/is reached function/is reached,

> +called to update the ``->branches`` counter for each verifier state in a chain
> +of parent verifier states. When ``->branches`` counter reaches zero the verifier

s/When ``->branches`` counter reaches zero/
  When the ``branches`` counter reaches zero,

> +state becomes a valid entry in a set of cached verifier states.
> +
> +Each entry of the verifier states cache is post-processed by a function
> +``clean_live_states()``. This function marks all registers and stack slots
> +without ``REG_LIVE_READ{32,64}`` marks as ``NOT_INIT`` or ``STACK_INVALID``.
> +Registers/stack slots marked in this way are ignored in function ``stacksafe()``
> +called from ``states_equal()`` when state cache entry is considered for

s/when state cache entry/when a state cache entry

> +equivalence with a current state.
> +
> +Now it is possible to explain how the example from the beginning of the section
> +works::
> +
> +  0: call bpf_get_prandom_u32()
> +  1: r1 = 0
> +  2: if r0 == 0 goto +1
> +  3: r0 = 1
> +  --- checkpoint[0] ---
> +  4: r0 = r1
> +  5: exit
> +
> +* At instruction #2 branching point is reached and state ``{ r0 == 0, r1 == 0, pc == 4 }``
> +  is pushed to states processing queue (pc stands for program counter).
> +  
> +* At instruction #4:
> +  
> +  * ``checkpoint[0]`` states cache entry is created: ``{ r0 == 1, r1 == 0, pc == 4 }``;
> +  * ``checkpoint[0].r0`` is marked as written;
> +  * ``checkpoint[0].r1`` is marked as read;
> +
> +* At instruction #5 exit is reached and ``checkpoint[0]`` can now be processed
> +  by ``clean_live_states()``, after this processing ``checkpoint[0].r0`` has a

s/by ``clean_live_states()``, after/by ``clean_live_states()``. After

> +  read mark and all other registers and stack slots are marked as ``NOT_INIT``
> +  or ``STACK_INVALID``
> +  
> +* The state ``{ r0 == 0, r1 == 0, pc == 4 }`` is popped from the states queue
> +  and is compared against a cached state ``{ r1 == 0, pc == 4 }``, the states
> +  are considered equivalent.
> +
> +Read marks propagation for cache hits
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Another important point is handling of read marks when a previously verified

s/is handling of/is the handling of

> +state is found in the states cache. All read marks present on registers and
> +stack slots of the cached state must be propagated over the parentage chain of
> +the current state. Function ``propagate_liveness()`` handles this case.

Suggestion: Consider expanding the sentence to include why it's
necessary to propagate like this. You explain below after the diagram,
but it might be helpful to clarify here so the reader has some context
when reading over the diagram.

> +
> +For example, consider the following state parentage chain (S is a
> +starting state, A-E are derived states, -> arrows show which state is
> +derived from which)::
> +
> +                      r1 read
> +               <-------------                    A[r1] == 0
> +                    +---+                        C[r1] == 0
> +      S ---> A ---> | B | ---> exit              E[r1] == 1
> +      |             |   |
> +      ` ---> C ---> | D |
> +      |             +---+
> +      ` ---> E        
> +                      ^
> +             ^        |___   suppose all these
> +             |             states are at insn #Y
> +      suppose all these
> +    states are at insn #X
> +
> +* Chain of states ``S -> A -> B -> exit`` is verified first.
> +
> +* While ``B -> exit`` is verified, register ``r1`` is read and this read mark is
> +  propagated up to state ``A``.
> +
> +* When chain of states ``C -> D`` is verified the state ``D`` turns out to be
> +  equivalent to state ``B``.
> +
> +* The read mark for ``r1`` has to be propagated to state ``C``, otherwise state
> +  ``C`` might get mistakenly marked as equivalent to state ``E`` even though
> +  values for register ``r1`` differ between ``C`` and ``E``.
> +
>  Understanding eBPF verifier messages
>  ====================================
>  
> -- 
> 2.39.0
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH bpf-next v2 1/1] docs/bpf: Add description of register liveness tracking algorithm
  2023-01-30 19:59   ` David Vernet
@ 2023-01-30 23:45     ` Eduard Zingerman
  0 siblings, 0 replies; 5+ messages in thread
From: Eduard Zingerman @ 2023-01-30 23:45 UTC (permalink / raw)
  To: David Vernet; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Mon, 2023-01-30 at 13:59 -0600, David Vernet wrote:
> On Mon, Jan 30, 2023 at 08:24:00PM +0200, Eduard Zingerman wrote:
> > This is a followup for [1], adds an overview for the register liveness
> > tracking, covers the following points:
> > - why register liveness tracking is useful;
> > - how register parentage chains are constructed;
> > - how liveness marks are applied using the parentage chains.
> > 
> > [1] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
> > 
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> 
> This is great, thanks a lot for writing this up. Left some comments.
> 
> > ---
> >  Documentation/bpf/verifier.rst | 266 +++++++++++++++++++++++++++++++++
> >  1 file changed, 266 insertions(+)
> > 
> > diff --git a/Documentation/bpf/verifier.rst b/Documentation/bpf/verifier.rst
> > index 3afa548ec28c..7269933428e1 100644
> > --- a/Documentation/bpf/verifier.rst
> > +++ b/Documentation/bpf/verifier.rst
> > @@ -316,6 +316,272 @@ Pruning considers not only the registers but also the stack (and any spilled
> >  registers it may hold).  They must all be safe for the branch to be pruned.
> >  This is implemented in states_equal().
> >  
> > +Some technical details about state pruning implementation could be found below.

Hi David,

Thank you for commenting. I'll update the doc according to your
s/xxx/yyy/ suggestions, sorry for the amount of missing comas, etc.
Regarding some bigger points please see below.

> > +
> > +Register liveness tracking
> > +--------------------------
> > +
> > +In order to make state pruning effective liveness state is tracked for each
> 
> Could you please add a comma after "effective"? Will use shorthand
> s/xxx/yyy for future suggestions like this.
> 
> > +register and stack slot. The basic idea is to identify which registers or stack
> > +slots were used by children of the state to reach the program exit. Registers
> 
> IMO the phrasing "used by children of the state" may be a bit confusing.
> What do you think about this?
> 
> "The basic idea is to track which registers and stack slots are actually
> used during subseqeuent execution of the program, until program exit is
> reached."
> 
> > +that were never used could be removed from the cached state thus making more
> 
> s/Registers that were never/Registers and stack slots that were never
> 
> > +states equivalent to a cached state. This could be illustrated by the following
> > +program::
> > +
> > +  0: call bpf_get_prandom_u32()
> > +  1: r1 = 0
> > +  2: if r0 == 0 goto +1
> > +  3: r0 = 1
> > +  --- checkpoint ---
> > +  4: r0 = r1
> > +  5: exit
> > +
> > +Suppose that state cache entry is created at instruction #4 (such entries are
> > +also called "checkpoints" in the text below), verifier reaches this instruction
> > +in two states:
> 
> Can we split these into two sentences? Wdyt about the following:
> 
> "Suppose that a state cache entry is created at instruction #4 (such
> entries are also called "checkpoints" in the text below). The verifier
> could reach the instruction with one of two possible register states:"
> 
> > +
> > +* r0 = 1, r1 = 0
> > +* r0 = 0, r1 = 0
> > +
> > +However, only the value of register ``r1`` is important to successfully finish
> > +verification. The goal of liveness tracking algorithm is to spot this fact and
> 
> s/ of liveness tracking/of the liveness tracking
> 
> > +figure out that both states are actually equivalent.
> > +
> > +Data structures
> > +~~~~~~~~~~~~~~~
> > +
> > +Liveness is tracked using the following data structures::
> > +
> > +  enum bpf_reg_liveness {
> > +	REG_LIVE_NONE = 0,
> > +	REG_LIVE_READ32 = 0x1,
> > +	REG_LIVE_READ64 = 0x2,
> > +	REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
> > +	REG_LIVE_WRITTEN = 0x4,
> > +	REG_LIVE_DONE = 0x8,
> > +  };
> 
> FYI, you have the option of using the .. kernel-doc:: directive here, as
> in e.g.:
> 
> .. kernel-doc:: include/linux/bpf_verifier.h
>   :identifiers: bpf_reg_liveness
> 
> For enum bpf_reg_liveness and the other types. Doing so would require
> you updating the kernel-doc header comment in
> include/linux/bpf_verifier.h to reflect the description in [0], i.e.
> something like this for enum bpf_reg_liveness:
> 
> [0]: https://docs.kernel.org/doc-guide/kernel-doc.html#structure-union-and-enumeration-documentation
> 
> /**
>  * enum bpf_reg_liveness - Liveness marks, used for registers and spilled-regs
>  * @REG_LIVE_NONE: reg hasn't been read or written this branch
>  * @REG_LIVE_READ32: reg was read, so we're sensitive to initial value
>  * @REG_LIVE_READ64: likewise, but full 64-bit content matters
>  * @REG_LIVE_READ: An OR of both 32 and 64 bit read reg states.
>  * @REG_LIVE_WRITTEN: reg was written first, screening off later reads
>  * @REG_LIVE_DONE: liveness won't be updating this register anymore
>  *
>  * Read marks propagate upwards until they find a write mark; they record that
>  * "one of this state's descendants read this reg" (and therefore the reg is
>  * relevant for states_equal() checks).
>  * Write marks collect downwards and do not propagate; they record that "the
>  * straight-line code that reached this state (from its parent) wrote this reg"
>  * (and therefore that reads propagated from this state or its descendants
>  * should not propagate to its parent).
>  * A state with a write mark can receive read marks; it just won't propagate
>  * them to its parent, since the write mark is a property, not of the state,
>  * but of the link between it and its parent.  See mark_reg_read() and
>  * mark_stack_slot_read() in kernel/bpf/verifier.c.
>  */
> 
> Not everyone likes how the rendered HTML looks, and I tend to agree that
> it looks pretty blocky, but it at least keeps all of this documentation
> in one place and avoids going stale. Also, it has the benefit of
> automatically creating a link to the ..kernel-doc:: comment anywhere
> that "enum bpf_reg_liveness" is written in the Documentation subtree.

Thank you for pointing to this capability. I'd like to stick to the
current variant because I don't need full definitions of struct
bpf_reg_state and others. The only complete definition I want to
describe is enum bpf_reg_liveness, for all others just want to show
the dependencies. I just tried mixing `.. kernel-doc::` for the enum,
and code block for the structs but it looks awkward in the final HTML.

> 
> > +
> > +  struct bpf_reg_state {
> > + 	...
> > +	struct bpf_reg_state *parent;
> > + 	...
> > +	enum bpf_reg_liveness live;
> > + 	...
> > +  };
> > +
> > +  struct bpf_stack_state {
> > +	struct bpf_reg_state spilled_ptr;
> > +	...
> > +  };
> > +
> > +  struct bpf_func_state {
> > +	struct bpf_reg_state regs[MAX_BPF_REG];
> > +        ...
> > +	struct bpf_stack_state *stack;
> > +  }
> > +  
> > +  struct bpf_verifier_state {
> > +	struct bpf_func_state *frame[MAX_CALL_FRAMES];
> > +	struct bpf_verifier_state *parent;
> > +        ...
> > +  }
> > +
> > +* ``REG_LIVE_NONE`` is an initial value assigned to ``->live`` fields upon new
> > +  verifier state creation;
> > +
> > +* ``REG_LIVE_WRITTEN`` means that the value of the register (or stack slot) is
> > +  defined by some instruction "executed" within verifier state;
> > +
> > +* ``REG_LIVE_READ{32,64}`` means that the value of the register (or stack slot)
> > +  is read by some instruction "executed" within verifier state;
> > +
> > +* ``REG_LIVE_DONE`` is a marker used by ``clean_verifier_state()`` to avoid
> > +  processing same verifier state multiple times and for some sanity checks;
> > +
> > +* ``->live`` field values are formed by combining ``enum bpf_reg_liveness``
> > +  values using bitwise or.
> > +
> > +Register parentage chains
> > +~~~~~~~~~~~~~~~~~~~~~~~~~
> > +  
> > +In order to propagate information between parent and child states register
> 
> Suggestion:
> 
> s/parent and child states register parentage chain/
>   parent and child states, a *register parentage chain*
> 
> > +parentage chain is established. Each register or stack slot is linked to a
> > +corresponding register or stack slot in it's parent state via a ``->parent``
> 
> s/it's/its
> 
> > +pointer. This link is established upon state creation in function
> 
> suggestion: I think you can remove the word 'function'.
> 
> > +``is_state_visited()`` and is never changed.
> > +
> > +The rules for correspondence between registers / stack slots are as follows:
> > +
> > +* For current stack frame registers and stack slots of the new state are linked
> > +  to the registers and stack slots of the parent state with the same indices.
> 
> s/For current stack frame registers/For the current stack frame,
> registers
> 
> > +
> > +* For outer stack frames only caller saved registers (r6-r9) and stack slots are
> > +  linked to the registers and stack slots of the parent state with the same
> > +  indices.
> 
> s/For the outer stack frames only/For the outer stack frames, only
> 
> > +
> > +This could be illustrated by the following diagram (arrows stand for
> > +``->parent`` pointers)::
> > +
> > +      ...                    ; Frame #0, some instructions
> > +  --- checkpoint #0 ---
> > +  1 : r6 = 42                ; Frame #0
> > +  --- checkpoint #1 ---
> > +  2 : call foo()             ; Frame #0
> > +      ...                    ; Frame #1, instructions from foo()
> > +  --- checkpoint #2 ---
> > +      ...                    ; Frame #1, instructions from foo()
> > +  --- checkpoint #3 ---
> > +      exit                   ; Frame #1, return from foo()
> > +  3 : r1 = r6                ; Frame #0  <- current state
> > +  
> > +             +--------------------------+--------------------------+ 
> > +             |         Frame #0         |         Frame #1         |
> > +  Checkpoint +--------------------------+--------------------------+
> > +  #0         | r0-r5 | r6-r9 | fp-8 ... |                           
> > +             +--------------------------+                           
> > +                ^       ^       ^
> > +                |       |       |          
> > +  Checkpoint +--------------------------+                           
> > +  #1         | r0-r5 | r6-r9 | fp-8 ... |
> > +             +--------------------------+
> > +                        ^       ^         nil     nil     nil 
> > +                        |       |          |       |       |          
> > +  Checkpoint +--------------------------+--------------------------+
> > +  #2         | r0-r5 | r6-r9 | fp-8 ... | r0-r5 | r6-r9 | fp-8 ... |
> > +             +--------------------------+--------------------------+
> > +                        ^       ^          ^       ^       ^  
> > +                        |       |          |       |       |
> > +  Checkpoint +--------------------------+--------------------------+
> > +  #3         | r0-r5 | r6-r9 | fp-8 ... | r0-r5 | r6-r9 | fp-8 ... |
> > +             +--------------------------+--------------------------+
> > +                        ^       ^
> > +                        |       |          
> > +  Current    +--------------------------+                           
> > +  state      | r0-r5 | r6-r9 | fp-8 ... |                           
> > +             +--------------------------+
> > +                        \           
> > +                          r6 read mark is propagated via
> > +                          these links all the way up to
> > +                          checkpoint #1.
> 
> IMO this illustration is kind of confusing as shown, relative to the
> descriptions of current / outer above. 'Frame #1' is potentially in a
> different function / subprog, correct? So it seems kind of confusing to
> see it propagating registers from checkpoint 3 -> checkpoint 2, etc. And
> it's propagating r0-r5. Similarly, it looks like 'Current state' isn't
> propagating r0-r5. I'm sure I'm misunderstanding something.
> 

Good point. I reviewed the logic in __check_func_call() and it turns
out I missed the actions performed by:
- init_func_state() which cuts all parent links for the newly created
  callee state.
- set_callee_state() which copies r1-r5 from caller byte-to-byte
  (thus including parent link).
I'll update the description and diagram to reflect this.

> > +
> > +Liveness marks tracking
> > +~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +For each processed instruction verifier propagates information about reads up
> 
> s/processed instruction verifier/processed instruction, the verifier
> 
> > +the parentage chain and saves information about writes in the current state.
> > +The information about reads is propagated by function ``mark_reg_read()`` which
> > +could be summarized as follows::
> > +
> > +  mark_reg_read(struct bpf_reg_state *state):
> > +      parent = state->parent
> > +      while parent:
> > +          if parent->live & REG_LIVE_WRITTEN:
> > +              break
> > +          if parent->live & REG_LIVE_READ64:
> > +              break
> > +          parent->live |= REG_LIVE_READ64
> > +          state = parent
> > +          parent = state->parent
> > +
> > +Note: details about REG_LIVE_READ32 are omitted.
> > +
> > +Also note: the read marks are applied to the **parent** state while write marks
> > +are applied to the **current** state.
> > +
> > +Because stack writes could have different sizes ``REG_LIVE_WRITTEN`` marks are
> > +applied conservatively: stack slots are marked as written only if write size
> > +corresponds to the size of the register, e.g. see function ``save_register_state()``.
> > +
> > +Consider the following example::
> > +
> > +  0: (*u64)(r10 - 8) = 0
> > +  --- checkpoint #0 ---
> > +  1: (*u32)(r10 - 8) = 1
> > +  2: r1 = (*u32)(r10 - 8)
> > +
> > +Because write size at instruction (1) is smaller than register size the write
> 
> s/write size/the write size
> 
> also
> 
> s/than register size the/than register size, the
> 
> > +mark will not be added to fp-8 slot when (1) is verified, thus the fp-8 read at
> > +instruction (2) would propagate read mark for fp-8 up to checkpoint #0.
> 
> s/is verified, thus/is verifier. Thus
> 
> also
> 
> s/would propagate read mark/would propagate the read mark
> 
> In general, this sentence explains mechanically why the write won't
> propagate (the write size is smaller than the register size), but I
> think it would be useful to explain why sizeof(write) < sizeof(reg)
> implies that it won't be propagated.
> 
> > +
> > +Once ``BPF_EXIT`` instruction is reached function ``update_branch_counts()`` is
> 
> s/Once ``BPF_EXIT``/Once the ``BPF_EXIT``
> 
> also
> 
> s/is reached function/is reached,
> 
> > +called to update the ``->branches`` counter for each verifier state in a chain
> > +of parent verifier states. When ``->branches`` counter reaches zero the verifier
> 
> s/When ``->branches`` counter reaches zero/
>   When the ``branches`` counter reaches zero,
> 
> > +state becomes a valid entry in a set of cached verifier states.
> > +
> > +Each entry of the verifier states cache is post-processed by a function
> > +``clean_live_states()``. This function marks all registers and stack slots
> > +without ``REG_LIVE_READ{32,64}`` marks as ``NOT_INIT`` or ``STACK_INVALID``.
> > +Registers/stack slots marked in this way are ignored in function ``stacksafe()``
> > +called from ``states_equal()`` when state cache entry is considered for
> 
> s/when state cache entry/when a state cache entry
> 
> > +equivalence with a current state.
> > +
> > +Now it is possible to explain how the example from the beginning of the section
> > +works::
> > +
> > +  0: call bpf_get_prandom_u32()
> > +  1: r1 = 0
> > +  2: if r0 == 0 goto +1
> > +  3: r0 = 1
> > +  --- checkpoint[0] ---
> > +  4: r0 = r1
> > +  5: exit
> > +
> > +* At instruction #2 branching point is reached and state ``{ r0 == 0, r1 == 0, pc == 4 }``
> > +  is pushed to states processing queue (pc stands for program counter).
> > +  
> > +* At instruction #4:
> > +  
> > +  * ``checkpoint[0]`` states cache entry is created: ``{ r0 == 1, r1 == 0, pc == 4 }``;
> > +  * ``checkpoint[0].r0`` is marked as written;
> > +  * ``checkpoint[0].r1`` is marked as read;
> > +
> > +* At instruction #5 exit is reached and ``checkpoint[0]`` can now be processed
> > +  by ``clean_live_states()``, after this processing ``checkpoint[0].r0`` has a
> 
> s/by ``clean_live_states()``, after/by ``clean_live_states()``. After
> 
> > +  read mark and all other registers and stack slots are marked as ``NOT_INIT``
> > +  or ``STACK_INVALID``
> > +  
> > +* The state ``{ r0 == 0, r1 == 0, pc == 4 }`` is popped from the states queue
> > +  and is compared against a cached state ``{ r1 == 0, pc == 4 }``, the states
> > +  are considered equivalent.
> > +
> > +Read marks propagation for cache hits
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Another important point is handling of read marks when a previously verified
> 
> s/is handling of/is the handling of
> 
> > +state is found in the states cache. All read marks present on registers and
> > +stack slots of the cached state must be propagated over the parentage chain of
> > +the current state. Function ``propagate_liveness()`` handles this case.
> 
> Suggestion: Consider expanding the sentence to include why it's
> necessary to propagate like this. You explain below after the diagram,
> but it might be helpful to clarify here so the reader has some context
> when reading over the diagram.

Well, I figured that the diagram below is the simplest way to describe
why this logic is necessary. I'll try to come up with some condensed
description as you suggest.

> 
> > +
> > +For example, consider the following state parentage chain (S is a
> > +starting state, A-E are derived states, -> arrows show which state is
> > +derived from which)::
> > +
> > +                      r1 read
> > +               <-------------                    A[r1] == 0
> > +                    +---+                        C[r1] == 0
> > +      S ---> A ---> | B | ---> exit              E[r1] == 1
> > +      |             |   |
> > +      ` ---> C ---> | D |
> > +      |             +---+
> > +      ` ---> E        
> > +                      ^
> > +             ^        |___   suppose all these
> > +             |             states are at insn #Y
> > +      suppose all these
> > +    states are at insn #X
> > +
> > +* Chain of states ``S -> A -> B -> exit`` is verified first.
> > +
> > +* While ``B -> exit`` is verified, register ``r1`` is read and this read mark is
> > +  propagated up to state ``A``.
> > +
> > +* When chain of states ``C -> D`` is verified the state ``D`` turns out to be
> > +  equivalent to state ``B``.
> > +
> > +* The read mark for ``r1`` has to be propagated to state ``C``, otherwise state
> > +  ``C`` might get mistakenly marked as equivalent to state ``E`` even though
> > +  values for register ``r1`` differ between ``C`` and ``E``.
> > +
> >  Understanding eBPF verifier messages
> >  ====================================
> >  
> > -- 
> > 2.39.0
> > 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH bpf-next v2 1/1] docs/bpf: Add description of register liveness tracking algorithm
  2023-01-30 18:24 ` [PATCH bpf-next v2 1/1] " Eduard Zingerman
  2023-01-30 19:59   ` David Vernet
@ 2023-01-31  2:33   ` Alexei Starovoitov
  1 sibling, 0 replies; 5+ messages in thread
From: Alexei Starovoitov @ 2023-01-31  2:33 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Mon, Jan 30, 2023 at 08:24:00PM +0200, Eduard Zingerman wrote:
> +  
> +             +--------------------------+--------------------------+ 
> +             |         Frame #0         |         Frame #1         |
> +  Checkpoint +--------------------------+--------------------------+
> +  #0         | r0-r5 | r6-r9 | fp-8 ... |                           
> +             +--------------------------+                           
> +                ^       ^       ^
> +                |       |       |          
> +  Checkpoint +--------------------------+                           
> +  #1         | r0-r5 | r6-r9 | fp-8 ... |
> +             +--------------------------+
> +                        ^       ^         nil     nil     nil 
> +                        |       |          |       |       |          
> +  Checkpoint +--------------------------+--------------------------+
> +  #2         | r0-r5 | r6-r9 | fp-8 ... | r0-r5 | r6-r9 | fp-8 ... |
> +             +--------------------------+--------------------------+

Looks like you already realized that frame1 r1-r5 needs to be connected to frame0 r1-r5.
Otherwise when subprog (frame1) reads r1 the read mark won't be propagated
into checkpoint#1 r1 frame0 and r1 in frame0 will be trimmed incorrectly.

> +                        ^       ^          ^       ^       ^  
> +                        |       |          |       |       |
> +  Checkpoint +--------------------------+--------------------------+
> +  #3         | r0-r5 | r6-r9 | fp-8 ... | r0-r5 | r6-r9 | fp-8 ... |
> +             +--------------------------+--------------------------+
> +                        ^       ^
> +                        |       |          
> +  Current    +--------------------------+                           
> +  state      | r0-r5 | r6-r9 | fp-8 ... |                           
> +             +--------------------------+
> +                        \           
> +                          r6 read mark is propagated via
> +                          these links all the way up to
> +                          checkpoint #1.
> +
...
> +Read marks propagation for cache hits
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Another important point is handling of read marks when a previously verified
> +state is found in the states cache. All read marks present on registers and
> +stack slots of the cached state must be propagated over the parentage chain of
> +the current state. Function ``propagate_liveness()`` handles this case.
> +
> +For example, consider the following state parentage chain (S is a
> +starting state, A-E are derived states, -> arrows show which state is
> +derived from which)::
> +
> +                      r1 read
> +               <-------------                    A[r1] == 0
> +                    +---+                        C[r1] == 0
> +      S ---> A ---> | B | ---> exit              E[r1] == 1
> +      |             |   |
> +      ` ---> C ---> | D |
> +      |             +---+
> +      ` ---> E        

The box around B and D is confusing.
The description doesn't explain what the box is for.
If the box is removed the description stays as-is, no?

> +                      ^
> +             ^        |___   suppose all these
> +             |             states are at insn #Y
> +      suppose all these
> +    states are at insn #X
> +
> +* Chain of states ``S -> A -> B -> exit`` is verified first.
> +
> +* While ``B -> exit`` is verified, register ``r1`` is read and this read mark is
> +  propagated up to state ``A``.
> +
> +* When chain of states ``C -> D`` is verified the state ``D`` turns out to be
> +  equivalent to state ``B``.
> +
> +* The read mark for ``r1`` has to be propagated to state ``C``, otherwise state
> +  ``C`` might get mistakenly marked as equivalent to state ``E`` even though
> +  values for register ``r1`` differ between ``C`` and ``E``.

Overall it's awesome.
Please cc Edward Cree <ecree.xilinx@gmail.com> in the respin.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-01-31  2:33 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-30 18:23 [PATCH bpf-next v2 0/1] docs/bpf: Add description of register liveness tracking algorithm Eduard Zingerman
2023-01-30 18:24 ` [PATCH bpf-next v2 1/1] " Eduard Zingerman
2023-01-30 19:59   ` David Vernet
2023-01-30 23:45     ` Eduard Zingerman
2023-01-31  2:33   ` Alexei Starovoitov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.