bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements
@ 2024-04-24 22:40 Cupertino Miranda
  2024-04-24 22:40 ` [PATCH bpf-next v3 1/6] bpf/verifier: replace calls to mark_reg_unknown Cupertino Miranda
                   ` (6 more replies)
  0 siblings, 7 replies; 24+ messages in thread
From: Cupertino Miranda @ 2024-04-24 22:40 UTC (permalink / raw)
  To: bpf; +Cc: Cupertino Miranda

Hi everyone,

This is the third series of these patches.
Thank you Eduard and Yonghong for your reviews.

Regards,
Cupertino

Changes from v1:
 - Reordered patches in the series.
 - Fix refactor to be acurate with original code.
 - Fixed other mentioned small problems.

Changes from v2:
 - Added a patch to replace mark_reg_unknowon for __mark_reg_unknown in
   the context of range computation.
 - Reverted implementation of refactor to v1 which used a simpler
   boolean return value in check function.
 - Further relaxed MUL to allow it to still compute a range when neither
   of its registers is a known value.
 - Simplified tests based on Eduards example.
 - Added messages in selftest commits.

Cupertino Miranda (6):
  bpf/verifier: replace calls to mark_reg_unknown.
  bpf/verifier: refactor checks for range computation
  bpf/verifier: improve XOR and OR range computation
  selftests/bpf: XOR and OR range computation tests.
  bpf/verifier: relax MUL range computation check
  selftests/bpf: MUL range computation tests.

 kernel/bpf/verifier.c                         | 139 ++++++++++--------
 .../selftests/bpf/progs/verifier_bounds.c     |  63 ++++++++
 2 files changed, 137 insertions(+), 65 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH bpf-next v3 1/6] bpf/verifier: replace calls to mark_reg_unknown.
  2024-04-24 22:40 [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements Cupertino Miranda
@ 2024-04-24 22:40 ` Cupertino Miranda
  2024-04-25 16:56   ` Eduard Zingerman
  2024-04-24 22:40 ` [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation Cupertino Miranda
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Cupertino Miranda @ 2024-04-24 22:40 UTC (permalink / raw)
  To: bpf
  Cc: Cupertino Miranda, Yonghong Song, Alexei Starovoitov,
	David Faust, Jose Marchesi, Elena Zannoni

In order to further simplify the code in adjust_scalar_min_max_vals all
the calls to mark_reg_unknown are replaced by __mark_reg_unknown.

Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
---
 kernel/bpf/verifier.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5a7e34e83a5b..6fe641c8ae33 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13704,7 +13704,6 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 				      struct bpf_reg_state *dst_reg,
 				      struct bpf_reg_state src_reg)
 {
-	struct bpf_reg_state *regs = cur_regs(env);
 	u8 opcode = BPF_OP(insn->code);
 	bool src_known;
 	s64 smin_val, smax_val;
@@ -13811,7 +13810,7 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 			/* Shifts greater than 31 or 63 are undefined.
 			 * This includes shifts by a negative number.
 			 */
-			mark_reg_unknown(env, regs, insn->dst_reg);
+			__mark_reg_unknown(env, dst_reg);
 			break;
 		}
 		if (alu32)
@@ -13824,7 +13823,7 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 			/* Shifts greater than 31 or 63 are undefined.
 			 * This includes shifts by a negative number.
 			 */
-			mark_reg_unknown(env, regs, insn->dst_reg);
+			__mark_reg_unknown(env, dst_reg);
 			break;
 		}
 		if (alu32)
@@ -13837,7 +13836,7 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 			/* Shifts greater than 31 or 63 are undefined.
 			 * This includes shifts by a negative number.
 			 */
-			mark_reg_unknown(env, regs, insn->dst_reg);
+			__mark_reg_unknown(env, dst_reg);
 			break;
 		}
 		if (alu32)
@@ -13846,7 +13845,7 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 			scalar_min_max_arsh(dst_reg, &src_reg);
 		break;
 	default:
-		mark_reg_unknown(env, regs, insn->dst_reg);
+		__mark_reg_unknown(env, dst_reg);
 		break;
 	}
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation
  2024-04-24 22:40 [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements Cupertino Miranda
  2024-04-24 22:40 ` [PATCH bpf-next v3 1/6] bpf/verifier: replace calls to mark_reg_unknown Cupertino Miranda
@ 2024-04-24 22:40 ` Cupertino Miranda
  2024-04-25 18:49   ` Eduard Zingerman
  2024-04-25 23:05   ` Andrii Nakryiko
  2024-04-24 22:40 ` [PATCH bpf-next v3 3/6] bpf/verifier: improve XOR and OR " Cupertino Miranda
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: Cupertino Miranda @ 2024-04-24 22:40 UTC (permalink / raw)
  To: bpf
  Cc: Cupertino Miranda, Yonghong Song, Alexei Starovoitov,
	David Faust, Jose Marchesi, Elena Zannoni

Split range computation checks in its own function, isolating pessimitic
range set for dst_reg and failing return to a single point.

Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
---
 kernel/bpf/verifier.c | 141 +++++++++++++++++++++++-------------------
 1 file changed, 77 insertions(+), 64 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 6fe641c8ae33..829a12d263a5 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13695,6 +13695,82 @@ static void scalar_min_max_arsh(struct bpf_reg_state *dst_reg,
 	__update_reg_bounds(dst_reg);
 }
 
+static bool is_const_reg_and_valid(struct bpf_reg_state reg, bool alu32,
+				   bool *valid)
+{
+	s64 smin_val = reg.smin_value;
+	s64 smax_val = reg.smax_value;
+	u64 umin_val = reg.umin_value;
+	u64 umax_val = reg.umax_value;
+
+	s32 s32_min_val = reg.s32_min_value;
+	s32 s32_max_val = reg.s32_max_value;
+	u32 u32_min_val = reg.u32_min_value;
+	u32 u32_max_val = reg.u32_max_value;
+
+	bool known = alu32 ? tnum_subreg_is_const(reg.var_off) :
+			     tnum_is_const(reg.var_off);
+
+	if (alu32) {
+		if ((known &&
+		     (s32_min_val != s32_max_val || u32_min_val != u32_max_val)) ||
+		      s32_min_val > s32_max_val || u32_min_val > u32_max_val)
+			*valid = false;
+	} else {
+		if ((known &&
+		     (smin_val != smax_val || umin_val != umax_val)) ||
+		    smin_val > smax_val || umin_val > umax_val)
+			*valid = false;
+	}
+
+	return known;
+}
+
+static bool is_safe_to_compute_dst_reg_range(struct bpf_insn *insn,
+					     struct bpf_reg_state src_reg)
+{
+	bool src_known;
+	u64 insn_bitness = (BPF_CLASS(insn->code) == BPF_ALU64) ? 64 : 32;
+	bool alu32 = (BPF_CLASS(insn->code) != BPF_ALU64);
+	u8 opcode = BPF_OP(insn->code);
+
+	bool valid_known = true;
+	src_known = is_const_reg_and_valid(src_reg, alu32, &valid_known);
+
+	/* Taint dst register if offset had invalid bounds
+	 * derived from e.g. dead branches.
+	 */
+	if (valid_known == false)
+		return false;
+
+	switch (opcode) {
+	case BPF_ADD:
+	case BPF_SUB:
+	case BPF_AND:
+		return true;
+
+	/* Compute range for the following only if the src_reg is known.
+	 */
+	case BPF_XOR:
+	case BPF_OR:
+	case BPF_MUL:
+		return src_known;
+
+	/* Shift operators range is only computable if shift dimension operand
+	 * is known. Also, shifts greater than 31 or 63 are undefined. This
+	 * includes shifts by a negative number.
+	 */
+	case BPF_LSH:
+	case BPF_RSH:
+	case BPF_ARSH:
+		return (src_known && src_reg.umax_value < insn_bitness);
+	default:
+		break;
+	}
+
+	return false;
+}
+
 /* WARNING: This function does calculations on 64-bit values, but the actual
  * execution may occur on 32-bit values. Therefore, things like bitshifts
  * need extra checks in the 32-bit case.
@@ -13705,51 +13781,10 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 				      struct bpf_reg_state src_reg)
 {
 	u8 opcode = BPF_OP(insn->code);
-	bool src_known;
-	s64 smin_val, smax_val;
-	u64 umin_val, umax_val;
-	s32 s32_min_val, s32_max_val;
-	u32 u32_min_val, u32_max_val;
-	u64 insn_bitness = (BPF_CLASS(insn->code) == BPF_ALU64) ? 64 : 32;
 	bool alu32 = (BPF_CLASS(insn->code) != BPF_ALU64);
 	int ret;
 
-	smin_val = src_reg.smin_value;
-	smax_val = src_reg.smax_value;
-	umin_val = src_reg.umin_value;
-	umax_val = src_reg.umax_value;
-
-	s32_min_val = src_reg.s32_min_value;
-	s32_max_val = src_reg.s32_max_value;
-	u32_min_val = src_reg.u32_min_value;
-	u32_max_val = src_reg.u32_max_value;
-
-	if (alu32) {
-		src_known = tnum_subreg_is_const(src_reg.var_off);
-		if ((src_known &&
-		     (s32_min_val != s32_max_val || u32_min_val != u32_max_val)) ||
-		    s32_min_val > s32_max_val || u32_min_val > u32_max_val) {
-			/* Taint dst register if offset had invalid bounds
-			 * derived from e.g. dead branches.
-			 */
-			__mark_reg_unknown(env, dst_reg);
-			return 0;
-		}
-	} else {
-		src_known = tnum_is_const(src_reg.var_off);
-		if ((src_known &&
-		     (smin_val != smax_val || umin_val != umax_val)) ||
-		    smin_val > smax_val || umin_val > umax_val) {
-			/* Taint dst register if offset had invalid bounds
-			 * derived from e.g. dead branches.
-			 */
-			__mark_reg_unknown(env, dst_reg);
-			return 0;
-		}
-	}
-
-	if (!src_known &&
-	    opcode != BPF_ADD && opcode != BPF_SUB && opcode != BPF_AND) {
+	if (!is_safe_to_compute_dst_reg_range(insn, src_reg)) {
 		__mark_reg_unknown(env, dst_reg);
 		return 0;
 	}
@@ -13806,46 +13841,24 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 		scalar_min_max_xor(dst_reg, &src_reg);
 		break;
 	case BPF_LSH:
-		if (umax_val >= insn_bitness) {
-			/* Shifts greater than 31 or 63 are undefined.
-			 * This includes shifts by a negative number.
-			 */
-			__mark_reg_unknown(env, dst_reg);
-			break;
-		}
 		if (alu32)
 			scalar32_min_max_lsh(dst_reg, &src_reg);
 		else
 			scalar_min_max_lsh(dst_reg, &src_reg);
 		break;
 	case BPF_RSH:
-		if (umax_val >= insn_bitness) {
-			/* Shifts greater than 31 or 63 are undefined.
-			 * This includes shifts by a negative number.
-			 */
-			__mark_reg_unknown(env, dst_reg);
-			break;
-		}
 		if (alu32)
 			scalar32_min_max_rsh(dst_reg, &src_reg);
 		else
 			scalar_min_max_rsh(dst_reg, &src_reg);
 		break;
 	case BPF_ARSH:
-		if (umax_val >= insn_bitness) {
-			/* Shifts greater than 31 or 63 are undefined.
-			 * This includes shifts by a negative number.
-			 */
-			__mark_reg_unknown(env, dst_reg);
-			break;
-		}
 		if (alu32)
 			scalar32_min_max_arsh(dst_reg, &src_reg);
 		else
 			scalar_min_max_arsh(dst_reg, &src_reg);
 		break;
 	default:
-		__mark_reg_unknown(env, dst_reg);
 		break;
 	}
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next v3 3/6] bpf/verifier: improve XOR and OR range computation
  2024-04-24 22:40 [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements Cupertino Miranda
  2024-04-24 22:40 ` [PATCH bpf-next v3 1/6] bpf/verifier: replace calls to mark_reg_unknown Cupertino Miranda
  2024-04-24 22:40 ` [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation Cupertino Miranda
@ 2024-04-24 22:40 ` Cupertino Miranda
  2024-04-25 18:52   ` Eduard Zingerman
  2024-04-24 22:40 ` [PATCH bpf-next v3 4/6] selftests/bpf: XOR and OR range computation tests Cupertino Miranda
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Cupertino Miranda @ 2024-04-24 22:40 UTC (permalink / raw)
  To: bpf
  Cc: Cupertino Miranda, Yonghong Song, Alexei Starovoitov,
	David Faust, Jose Marchesi, Elena Zannoni

Range for XOR and OR operators would not be attempted unless src_reg
would resolve to a single value, i.e. a known constant value.
This condition is unnecessary, and the following XOR/OR operator
handling could compute a possible better range.

Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
---
 kernel/bpf/verifier.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 829a12d263a5..6f956c0936d0 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13747,12 +13747,12 @@ static bool is_safe_to_compute_dst_reg_range(struct bpf_insn *insn,
 	case BPF_ADD:
 	case BPF_SUB:
 	case BPF_AND:
+	case BPF_XOR:
+	case BPF_OR:
 		return true;
 
 	/* Compute range for the following only if the src_reg is known.
 	 */
-	case BPF_XOR:
-	case BPF_OR:
 	case BPF_MUL:
 		return src_known;
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next v3 4/6] selftests/bpf: XOR and OR range computation tests.
  2024-04-24 22:40 [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements Cupertino Miranda
                   ` (2 preceding siblings ...)
  2024-04-24 22:40 ` [PATCH bpf-next v3 3/6] bpf/verifier: improve XOR and OR " Cupertino Miranda
@ 2024-04-24 22:40 ` Cupertino Miranda
  2024-04-25 18:59   ` Eduard Zingerman
  2024-04-25 23:17   ` Andrii Nakryiko
  2024-04-24 22:40 ` [PATCH bpf-next v3 5/6] bpf/verifier: relax MUL range computation check Cupertino Miranda
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: Cupertino Miranda @ 2024-04-24 22:40 UTC (permalink / raw)
  To: bpf
  Cc: Cupertino Miranda, Yonghong Song, Alexei Starovoitov,
	David Faust, Jose Marchesi, Elena Zannoni

Added a test for bound computation in XOR and OR when non constant
values are used and both registers have bounded ranges.

Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
---
 .../selftests/bpf/progs/verifier_bounds.c     | 42 +++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds.c b/tools/testing/selftests/bpf/progs/verifier_bounds.c
index 960998f16306..aeb88a9c7a86 100644
--- a/tools/testing/selftests/bpf/progs/verifier_bounds.c
+++ b/tools/testing/selftests/bpf/progs/verifier_bounds.c
@@ -885,6 +885,48 @@ l1_%=:	r0 = 0;						\
 	: __clobber_all);
 }
 
+SEC("socket")
+__description("bounds check for non const xor src dst")
+__success __log_level(2)
+__msg("5: (af) r0 ^= r6                      ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff))")
+__naked void non_const_xor_src_dst(void)
+{
+	asm volatile ("					\
+	call %[bpf_get_prandom_u32];                    \
+	r6 = r0;					\
+	call %[bpf_get_prandom_u32];                    \
+	r6 &= 0xff;					\
+	r0 &= 0x0f;					\
+	r0 ^= r6;					\
+	exit;						\
+"	:
+	: __imm(bpf_map_lookup_elem),
+	__imm_addr(map_hash_8b),
+	__imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
+SEC("socket")
+__description("bounds check for non const or src dst")
+__success __log_level(2)
+__msg("5: (4f) r0 |= r6                      ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff))")
+__naked void non_const_or_src_dst(void)
+{
+	asm volatile ("					\
+	call %[bpf_get_prandom_u32];                    \
+	r6 = r0;					\
+	call %[bpf_get_prandom_u32];                    \
+	r6 &= 0xff;					\
+	r0 &= 0x0f;					\
+	r0 |= r6;					\
+	exit;						\
+"	:
+	: __imm(bpf_map_lookup_elem),
+	__imm_addr(map_hash_8b),
+	__imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
 SEC("socket")
 __description("bounds checks after 32-bit truncation. test 1")
 __success __failure_unpriv __msg_unpriv("R0 leaks addr")
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next v3 5/6] bpf/verifier: relax MUL range computation check
  2024-04-24 22:40 [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements Cupertino Miranda
                   ` (3 preceding siblings ...)
  2024-04-24 22:40 ` [PATCH bpf-next v3 4/6] selftests/bpf: XOR and OR range computation tests Cupertino Miranda
@ 2024-04-24 22:40 ` Cupertino Miranda
  2024-04-25 19:00   ` Eduard Zingerman
  2024-04-25 23:24   ` Andrii Nakryiko
  2024-04-24 22:40 ` [PATCH bpf-next v3 6/6] selftests/bpf: MUL range computation tests Cupertino Miranda
  2024-04-24 22:45 ` [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements Cupertino Miranda
  6 siblings, 2 replies; 24+ messages in thread
From: Cupertino Miranda @ 2024-04-24 22:40 UTC (permalink / raw)
  To: bpf
  Cc: Cupertino Miranda, Yonghong Song, Alexei Starovoitov,
	David Faust, Jose Marchesi, Elena Zannoni

MUL instruction required that src_reg would be a known value (i.e.
src_reg would be a const value). The condition in this case can be
relaxed, since the range computation algorithm used in current code
already supports a proper range computation for any valid range value on
its operands.

Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
---
 kernel/bpf/verifier.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 6f956c0936d0..760193dac85e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13749,12 +13749,8 @@ static bool is_safe_to_compute_dst_reg_range(struct bpf_insn *insn,
 	case BPF_AND:
 	case BPF_XOR:
 	case BPF_OR:
-		return true;
-
-	/* Compute range for the following only if the src_reg is known.
-	 */
 	case BPF_MUL:
-		return src_known;
+		return true;
 
 	/* Shift operators range is only computable if shift dimension operand
 	 * is known. Also, shifts greater than 31 or 63 are undefined. This
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next v3 6/6] selftests/bpf: MUL range computation tests.
  2024-04-24 22:40 [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements Cupertino Miranda
                   ` (4 preceding siblings ...)
  2024-04-24 22:40 ` [PATCH bpf-next v3 5/6] bpf/verifier: relax MUL range computation check Cupertino Miranda
@ 2024-04-24 22:40 ` Cupertino Miranda
  2024-04-25 19:02   ` Eduard Zingerman
  2024-04-25 23:26   ` Andrii Nakryiko
  2024-04-24 22:45 ` [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements Cupertino Miranda
  6 siblings, 2 replies; 24+ messages in thread
From: Cupertino Miranda @ 2024-04-24 22:40 UTC (permalink / raw)
  To: bpf
  Cc: Cupertino Miranda, Yonghong Song, Alexei Starovoitov,
	David Faust, Jose Marchesi, Elena Zannoni

Added a test for bound computation in MUL when non constant
values are used and both registers have bounded ranges.

Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
---
 .../selftests/bpf/progs/verifier_bounds.c     | 21 +++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds.c b/tools/testing/selftests/bpf/progs/verifier_bounds.c
index aeb88a9c7a86..8fd7e93b112f 100644
--- a/tools/testing/selftests/bpf/progs/verifier_bounds.c
+++ b/tools/testing/selftests/bpf/progs/verifier_bounds.c
@@ -927,6 +927,27 @@ __naked void non_const_or_src_dst(void)
 	: __clobber_all);
 }
 
+SEC("socket")
+__description("bounds check for non const mul regs")
+__success __log_level(2)
+__msg("5: (2f) r0 *= r6                      ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3825,var_off=(0x0; 0xfff))")
+__naked void non_const_mul_regs(void)
+{
+	asm volatile ("					\
+	call %[bpf_get_prandom_u32];                    \
+	r6 = r0;					\
+	call %[bpf_get_prandom_u32];                    \
+	r6 &= 0xff;					\
+	r0 &= 0x0f;					\
+	r0 *= r6;					\
+	exit;						\
+"	:
+	: __imm(bpf_map_lookup_elem),
+	__imm_addr(map_hash_8b),
+	__imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
 SEC("socket")
 __description("bounds checks after 32-bit truncation. test 1")
 __success __failure_unpriv __msg_unpriv("R0 leaks addr")
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements
  2024-04-24 22:40 [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements Cupertino Miranda
                   ` (5 preceding siblings ...)
  2024-04-24 22:40 ` [PATCH bpf-next v3 6/6] selftests/bpf: MUL range computation tests Cupertino Miranda
@ 2024-04-24 22:45 ` Cupertino Miranda
  6 siblings, 0 replies; 24+ messages in thread
From: Cupertino Miranda @ 2024-04-24 22:45 UTC (permalink / raw)
  To: bpf; +Cc: Cupertino Miranda, Eduard Zingerman


+Cc: Eduard Zingerman <eddyz87@gmail.com>

Apologies Eduard, forgot to add you to the CC list in the commits. :(

Cupertino Miranda writes:

> Hi everyone,
>
> This is the third series of these patches.
> Thank you Eduard and Yonghong for your reviews.
>
> Regards,
> Cupertino
>
> Changes from v1:
>  - Reordered patches in the series.
>  - Fix refactor to be acurate with original code.
>  - Fixed other mentioned small problems.
>
> Changes from v2:
>  - Added a patch to replace mark_reg_unknowon for __mark_reg_unknown in
>    the context of range computation.
>  - Reverted implementation of refactor to v1 which used a simpler
>    boolean return value in check function.
>  - Further relaxed MUL to allow it to still compute a range when neither
>    of its registers is a known value.
>  - Simplified tests based on Eduards example.
>  - Added messages in selftest commits.
>
> Cupertino Miranda (6):
>   bpf/verifier: replace calls to mark_reg_unknown.
>   bpf/verifier: refactor checks for range computation
>   bpf/verifier: improve XOR and OR range computation
>   selftests/bpf: XOR and OR range computation tests.
>   bpf/verifier: relax MUL range computation check
>   selftests/bpf: MUL range computation tests.
>
>  kernel/bpf/verifier.c                         | 139 ++++++++++--------
>  .../selftests/bpf/progs/verifier_bounds.c     |  63 ++++++++
>  2 files changed, 137 insertions(+), 65 deletions(-)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 1/6] bpf/verifier: replace calls to mark_reg_unknown.
  2024-04-24 22:40 ` [PATCH bpf-next v3 1/6] bpf/verifier: replace calls to mark_reg_unknown Cupertino Miranda
@ 2024-04-25 16:56   ` Eduard Zingerman
  0 siblings, 0 replies; 24+ messages in thread
From: Eduard Zingerman @ 2024-04-25 16:56 UTC (permalink / raw)
  To: Cupertino Miranda, bpf
  Cc: Yonghong Song, Alexei Starovoitov, David Faust, Jose Marchesi,
	Elena Zannoni

On Wed, 2024-04-24 at 23:40 +0100, Cupertino Miranda wrote:
> In order to further simplify the code in adjust_scalar_min_max_vals all
> the calls to mark_reg_unknown are replaced by __mark_reg_unknown.
> 
> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Faust <david.faust@oracle.com>
> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

Nit: if there would be a v4 for this series (hopefully won't),
     please extend the commit message with something like below:
     
  static void mark_reg_unknown(struct bpf_verifier_env *env,
  			     struct bpf_reg_state *regs, u32 regno)
  {
  	if (WARN_ON(regno >= MAX_BPF_REG)) {
  		... mark all regs not init ...
  		return;
      }
  	__mark_reg_unknown(env, regs + regno);
  }
  
  The 'regno >= MAX_BPF_REG' does not apply to adjust_scalar_min_max_vals(),
  because it is only called from the following stack:
  - check_alu_op
    - adjust_reg_min_max_vals
      - adjust_scalar_min_max_vals
  
  The check_alu_op() does check_reg_arg() which verifies that both src
  and dst register numbers are within bounds.

[...]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation
  2024-04-24 22:40 ` [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation Cupertino Miranda
@ 2024-04-25 18:49   ` Eduard Zingerman
  2024-04-25 23:05   ` Andrii Nakryiko
  1 sibling, 0 replies; 24+ messages in thread
From: Eduard Zingerman @ 2024-04-25 18:49 UTC (permalink / raw)
  To: Cupertino Miranda, bpf
  Cc: Yonghong Song, Alexei Starovoitov, David Faust, Jose Marchesi,
	Elena Zannoni

On Wed, 2024-04-24 at 23:40 +0100, Cupertino Miranda wrote:
> Split range computation checks in its own function, isolating pessimitic
> range set for dst_reg and failing return to a single point.
> 
> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Faust <david.faust@oracle.com>
> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 3/6] bpf/verifier: improve XOR and OR range computation
  2024-04-24 22:40 ` [PATCH bpf-next v3 3/6] bpf/verifier: improve XOR and OR " Cupertino Miranda
@ 2024-04-25 18:52   ` Eduard Zingerman
  0 siblings, 0 replies; 24+ messages in thread
From: Eduard Zingerman @ 2024-04-25 18:52 UTC (permalink / raw)
  To: Cupertino Miranda, bpf
  Cc: Yonghong Song, Alexei Starovoitov, David Faust, Jose Marchesi,
	Elena Zannoni

On Wed, 2024-04-24 at 23:40 +0100, Cupertino Miranda wrote:
> Range for XOR and OR operators would not be attempted unless src_reg
> would resolve to a single value, i.e. a known constant value.
> This condition is unnecessary, and the following XOR/OR operator
> handling could compute a possible better range.
> 
> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Faust <david.faust@oracle.com>
> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

(Please copy ack's from previous versions of the patch if there were
 no significant changes, it makes its easier to review subsequent versions).

[...]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 4/6] selftests/bpf: XOR and OR range computation tests.
  2024-04-24 22:40 ` [PATCH bpf-next v3 4/6] selftests/bpf: XOR and OR range computation tests Cupertino Miranda
@ 2024-04-25 18:59   ` Eduard Zingerman
  2024-04-25 23:17   ` Andrii Nakryiko
  1 sibling, 0 replies; 24+ messages in thread
From: Eduard Zingerman @ 2024-04-25 18:59 UTC (permalink / raw)
  To: Cupertino Miranda, bpf
  Cc: Yonghong Song, Alexei Starovoitov, David Faust, Jose Marchesi,
	Elena Zannoni

On Wed, 2024-04-24 at 23:40 +0100, Cupertino Miranda wrote:
> Added a test for bound computation in XOR and OR when non constant
> values are used and both registers have bounded ranges.
> 
> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Faust <david.faust@oracle.com>
> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 5/6] bpf/verifier: relax MUL range computation check
  2024-04-24 22:40 ` [PATCH bpf-next v3 5/6] bpf/verifier: relax MUL range computation check Cupertino Miranda
@ 2024-04-25 19:00   ` Eduard Zingerman
  2024-04-25 23:24   ` Andrii Nakryiko
  1 sibling, 0 replies; 24+ messages in thread
From: Eduard Zingerman @ 2024-04-25 19:00 UTC (permalink / raw)
  To: Cupertino Miranda, bpf
  Cc: Yonghong Song, Alexei Starovoitov, David Faust, Jose Marchesi,
	Elena Zannoni

On Wed, 2024-04-24 at 23:40 +0100, Cupertino Miranda wrote:
> MUL instruction required that src_reg would be a known value (i.e.
> src_reg would be a const value). The condition in this case can be
> relaxed, since the range computation algorithm used in current code
> already supports a proper range computation for any valid range value on
> its operands.
> 
> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Faust <david.faust@oracle.com>
> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 6/6] selftests/bpf: MUL range computation tests.
  2024-04-24 22:40 ` [PATCH bpf-next v3 6/6] selftests/bpf: MUL range computation tests Cupertino Miranda
@ 2024-04-25 19:02   ` Eduard Zingerman
  2024-04-25 23:26   ` Andrii Nakryiko
  1 sibling, 0 replies; 24+ messages in thread
From: Eduard Zingerman @ 2024-04-25 19:02 UTC (permalink / raw)
  To: Cupertino Miranda, bpf
  Cc: Yonghong Song, Alexei Starovoitov, David Faust, Jose Marchesi,
	Elena Zannoni

On Wed, 2024-04-24 at 23:40 +0100, Cupertino Miranda wrote:
> Added a test for bound computation in MUL when non constant
> values are used and both registers have bounded ranges.
> 
> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Faust <david.faust@oracle.com>
> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation
  2024-04-24 22:40 ` [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation Cupertino Miranda
  2024-04-25 18:49   ` Eduard Zingerman
@ 2024-04-25 23:05   ` Andrii Nakryiko
  2024-04-26 10:20     ` Cupertino Miranda
  1 sibling, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2024-04-25 23:05 UTC (permalink / raw)
  To: Cupertino Miranda
  Cc: bpf, Yonghong Song, Alexei Starovoitov, David Faust,
	Jose Marchesi, Elena Zannoni

On Wed, Apr 24, 2024 at 3:41 PM Cupertino Miranda
<cupertino.miranda@oracle.com> wrote:
>
> Split range computation checks in its own function, isolating pessimitic
> range set for dst_reg and failing return to a single point.
>
> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Faust <david.faust@oracle.com>
> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> ---
>  kernel/bpf/verifier.c | 141 +++++++++++++++++++++++-------------------
>  1 file changed, 77 insertions(+), 64 deletions(-)
>

I know you are moving around pre-existing code, so a bunch of nits
below are to pre-existing code, but let's use this as an opportunity
to clean it up a bit.

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 6fe641c8ae33..829a12d263a5 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -13695,6 +13695,82 @@ static void scalar_min_max_arsh(struct bpf_reg_state *dst_reg,
>         __update_reg_bounds(dst_reg);
>  }
>
> +static bool is_const_reg_and_valid(struct bpf_reg_state reg, bool alu32,

hm.. why passing reg_state by value? Use pointer?

> +                                  bool *valid)
> +{
> +       s64 smin_val = reg.smin_value;
> +       s64 smax_val = reg.smax_value;
> +       u64 umin_val = reg.umin_value;
> +       u64 umax_val = reg.umax_value;
> +

don't add empty line between variable declarations, all variables
should be in a single continuous block

> +       s32 s32_min_val = reg.s32_min_value;
> +       s32 s32_max_val = reg.s32_max_value;
> +       u32 u32_min_val = reg.u32_min_value;
> +       u32 u32_max_val = reg.u32_max_value;
> +

but see below, I'm not sure we even need these local variables, they
don't save all that much typing

> +       bool known = alu32 ? tnum_subreg_is_const(reg.var_off) :
> +                            tnum_is_const(reg.var_off);

"known" is a misnomer, imo. It's `is_const`.

> +
> +       if (alu32) {
> +               if ((known &&
> +                    (s32_min_val != s32_max_val || u32_min_val != u32_max_val)) ||
> +                     s32_min_val > s32_max_val || u32_min_val > u32_max_val)
> +                       *valid = false;
> +       } else {
> +               if ((known &&
> +                    (smin_val != smax_val || umin_val != umax_val)) ||
> +                   smin_val > smax_val || umin_val > umax_val)
> +                       *valid = false;
> +       }
> +
> +       return known;


The above is really hard to follow, especially how known && !known
cases are being handled is very easy to misinterpret. How about we
rewrite the equivalent logic in a few steps:

if (alu32) {
    if (tnum_subreg_is_const(reg.var_off)) {
        return reg->s32_min_value == reg->s32_max_value &&
               reg->u32_min_value == reg->u32_max_value;
    } else {
        return reg->s32_min_value <= reg->s32_max_value &&
               reg->u32_min_value <= reg->u32_max_value;
    }
} else {
   /* same as above for 64-bit bounds */
}

And you don't even need any local variables, while all the important
conditions are a bit more easy to follow? Or is it just me?

> +}
> +
> +static bool is_safe_to_compute_dst_reg_range(struct bpf_insn *insn,
> +                                            struct bpf_reg_state src_reg)
> +{
> +       bool src_known;
> +       u64 insn_bitness = (BPF_CLASS(insn->code) == BPF_ALU64) ? 64 : 32;

whole u64 for this seems like an overkill, I'd just stick to `int`

> +       bool alu32 = (BPF_CLASS(insn->code) != BPF_ALU64);

insn_bitness == 32 ?

> +       u8 opcode = BPF_OP(insn->code);
> +

nit: don't split variables block with empty line

> +       bool valid_known = true;

need an empty line between variable declarations and the rest

> +       src_known = is_const_reg_and_valid(src_reg, alu32, &valid_known);
> +
> +       /* Taint dst register if offset had invalid bounds
> +        * derived from e.g. dead branches.
> +        */
> +       if (valid_known == false)

nit: !valid_known

> +               return false;

given this logic/handling, why not just return false from
is_const_reg_and_valid() if it's a constant, but it's not
valid/consistent? It's simpler and fits the logic and function's name,
no? See my suggestion above

> +
> +       switch (opcode) {

inline opcode variable here, you use it just once

> +       case BPF_ADD:
> +       case BPF_SUB:
> +       case BPF_AND:
> +               return true;
> +
> +       /* Compute range for the following only if the src_reg is known.
> +        */
> +       case BPF_XOR:
> +       case BPF_OR:
> +       case BPF_MUL:
> +               return src_known;
> +
> +       /* Shift operators range is only computable if shift dimension operand
> +        * is known. Also, shifts greater than 31 or 63 are undefined. This
> +        * includes shifts by a negative number.
> +        */
> +       case BPF_LSH:
> +       case BPF_RSH:
> +       case BPF_ARSH:

preserve original comment?

> -                       /* Shifts greater than 31 or 63 are undefined.
> -                        * This includes shifts by a negative number.
> -                        */

> +               return (src_known && src_reg.umax_value < insn_bitness);

nit: unnecessary ()

> +       default:
> +               break;

return false here, and drop return below

> +       }
> +
> +       return false;
> +}
> +
>  /* WARNING: This function does calculations on 64-bit values, but the actual
>   * execution may occur on 32-bit values. Therefore, things like bitshifts
>   * need extra checks in the 32-bit case.

[...]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 4/6] selftests/bpf: XOR and OR range computation tests.
  2024-04-24 22:40 ` [PATCH bpf-next v3 4/6] selftests/bpf: XOR and OR range computation tests Cupertino Miranda
  2024-04-25 18:59   ` Eduard Zingerman
@ 2024-04-25 23:17   ` Andrii Nakryiko
  1 sibling, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2024-04-25 23:17 UTC (permalink / raw)
  To: Cupertino Miranda
  Cc: bpf, Yonghong Song, Alexei Starovoitov, David Faust,
	Jose Marchesi, Elena Zannoni

On Wed, Apr 24, 2024 at 3:41 PM Cupertino Miranda
<cupertino.miranda@oracle.com> wrote:
>
> Added a test for bound computation in XOR and OR when non constant
> values are used and both registers have bounded ranges.
>
> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Faust <david.faust@oracle.com>
> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> ---
>  .../selftests/bpf/progs/verifier_bounds.c     | 42 +++++++++++++++++++
>  1 file changed, 42 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds.c b/tools/testing/selftests/bpf/progs/verifier_bounds.c
> index 960998f16306..aeb88a9c7a86 100644
> --- a/tools/testing/selftests/bpf/progs/verifier_bounds.c
> +++ b/tools/testing/selftests/bpf/progs/verifier_bounds.c
> @@ -885,6 +885,48 @@ l1_%=:     r0 = 0;                                         \
>         : __clobber_all);
>  }
>
> +SEC("socket")
> +__description("bounds check for non const xor src dst")
> +__success __log_level(2)
> +__msg("5: (af) r0 ^= r6                      ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff))")
> +__naked void non_const_xor_src_dst(void)
> +{
> +       asm volatile ("                                 \
> +       call %[bpf_get_prandom_u32];                    \
> +       r6 = r0;                                        \
> +       call %[bpf_get_prandom_u32];                    \
> +       r6 &= 0xff;                                     \
> +       r0 &= 0x0f;                                     \
> +       r0 ^= r6;                                       \
> +       exit;                                           \
> +"      :
> +       : __imm(bpf_map_lookup_elem),
> +       __imm_addr(map_hash_8b),
> +       __imm(bpf_get_prandom_u32)
> +       : __clobber_all);
> +}
> +
> +SEC("socket")
> +__description("bounds check for non const or src dst")
> +__success __log_level(2)
> +__msg("5: (4f) r0 |= r6                      ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff))")
> +__naked void non_const_or_src_dst(void)
> +{
> +       asm volatile ("                                 \
> +       call %[bpf_get_prandom_u32];                    \
> +       r6 = r0;                                        \
> +       call %[bpf_get_prandom_u32];                    \
> +       r6 &= 0xff;                                     \
> +       r0 &= 0x0f;                                     \
> +       r0 |= r6;                                       \

what if we make this case a bit more challenging and interesting and
have some known bits in r0? like add `r0 |= 0x1a0;` before doing `r0 |
= r6;` and make sure that we have a few known bits in var_off() and
range is extended to be 511?

Maybe do something like that for xor case as well, making sure that we
have some bits known to be either zero or one?

> +       exit;                                           \
> +"      :
> +       : __imm(bpf_map_lookup_elem),
> +       __imm_addr(map_hash_8b),
> +       __imm(bpf_get_prandom_u32)
> +       : __clobber_all);
> +}
> +

do we have an existing case for AND as well? That seems to be an
interesting one, as verifier should be able to infer [0, 0xf] range

>  SEC("socket")
>  __description("bounds checks after 32-bit truncation. test 1")
>  __success __failure_unpriv __msg_unpriv("R0 leaks addr")
> --
> 2.39.2
>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 5/6] bpf/verifier: relax MUL range computation check
  2024-04-24 22:40 ` [PATCH bpf-next v3 5/6] bpf/verifier: relax MUL range computation check Cupertino Miranda
  2024-04-25 19:00   ` Eduard Zingerman
@ 2024-04-25 23:24   ` Andrii Nakryiko
  1 sibling, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2024-04-25 23:24 UTC (permalink / raw)
  To: Cupertino Miranda
  Cc: bpf, Yonghong Song, Alexei Starovoitov, David Faust,
	Jose Marchesi, Elena Zannoni

On Wed, Apr 24, 2024 at 3:41 PM Cupertino Miranda
<cupertino.miranda@oracle.com> wrote:
>
> MUL instruction required that src_reg would be a known value (i.e.
> src_reg would be a const value). The condition in this case can be
> relaxed, since the range computation algorithm used in current code
> already supports a proper range computation for any valid range value on
> its operands.
>
> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Faust <david.faust@oracle.com>
> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> ---
>  kernel/bpf/verifier.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 6f956c0936d0..760193dac85e 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -13749,12 +13749,8 @@ static bool is_safe_to_compute_dst_reg_range(struct bpf_insn *insn,
>         case BPF_AND:
>         case BPF_XOR:
>         case BPF_OR:
> -               return true;
> -
> -       /* Compute range for the following only if the src_reg is known.
> -        */
>         case BPF_MUL:
> -               return src_known;
> +               return true;
>

scalar_min_max_mul() and scalar32_min_max_mul() could be implemented
just a touch smarter without becoming non-obvious. E.g., we
pessimistically limit both arguments to U16_MAX, but really we care
about their multiplication not overflowing U32_MAX, which can be
checked very easily (just multiply two max values and check they are
still < U32_MAX; similar checks could be done for 64-bit case). All
this is still trivially provable because we restrict arguments to be
non-negative.

So please consider some follow ups, if you are interested in improving
these parts of verifier.

But this change itself looks good to me:

Acked-by: Andrii Nakryiko <andrii@kernel.org>


>         /* Shift operators range is only computable if shift dimension operand
>          * is known. Also, shifts greater than 31 or 63 are undefined. This
> --
> 2.39.2
>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 6/6] selftests/bpf: MUL range computation tests.
  2024-04-24 22:40 ` [PATCH bpf-next v3 6/6] selftests/bpf: MUL range computation tests Cupertino Miranda
  2024-04-25 19:02   ` Eduard Zingerman
@ 2024-04-25 23:26   ` Andrii Nakryiko
  1 sibling, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2024-04-25 23:26 UTC (permalink / raw)
  To: Cupertino Miranda
  Cc: bpf, Yonghong Song, Alexei Starovoitov, David Faust,
	Jose Marchesi, Elena Zannoni

On Wed, Apr 24, 2024 at 3:41 PM Cupertino Miranda
<cupertino.miranda@oracle.com> wrote:
>
> Added a test for bound computation in MUL when non constant
> values are used and both registers have bounded ranges.
>
> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Faust <david.faust@oracle.com>
> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> ---
>  .../selftests/bpf/progs/verifier_bounds.c     | 21 +++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds.c b/tools/testing/selftests/bpf/progs/verifier_bounds.c
> index aeb88a9c7a86..8fd7e93b112f 100644
> --- a/tools/testing/selftests/bpf/progs/verifier_bounds.c
> +++ b/tools/testing/selftests/bpf/progs/verifier_bounds.c
> @@ -927,6 +927,27 @@ __naked void non_const_or_src_dst(void)
>         : __clobber_all);
>  }
>
> +SEC("socket")
> +__description("bounds check for non const mul regs")
> +__success __log_level(2)
> +__msg("5: (2f) r0 *= r6                      ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3825,var_off=(0x0; 0xfff))")
> +__naked void non_const_mul_regs(void)
> +{
> +       asm volatile ("                                 \
> +       call %[bpf_get_prandom_u32];                    \
> +       r6 = r0;                                        \
> +       call %[bpf_get_prandom_u32];                    \
> +       r6 &= 0xff;                                     \
> +       r0 &= 0x0f;                                     \
> +       r0 *= r6;                                       \
> +       exit;                                           \
> +"      :
> +       : __imm(bpf_map_lookup_elem),
> +       __imm_addr(map_hash_8b),
> +       __imm(bpf_get_prandom_u32)
> +       : __clobber_all);
> +}
> +

LGTM, but it would be a bit more interesting to have a few known bits
as well. Just setting 0x100 bit for r6 and 0x10 bit for r0 (before
multiplication) would test that tnum actually tracks those known bits
during multiplication correctly. Consider it as a follow up, I guess.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  SEC("socket")
>  __description("bounds checks after 32-bit truncation. test 1")
>  __success __failure_unpriv __msg_unpriv("R0 leaks addr")
> --
> 2.39.2
>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation
  2024-04-25 23:05   ` Andrii Nakryiko
@ 2024-04-26 10:20     ` Cupertino Miranda
  2024-04-26 16:11       ` Andrii Nakryiko
  0 siblings, 1 reply; 24+ messages in thread
From: Cupertino Miranda @ 2024-04-26 10:20 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Yonghong Song, Alexei Starovoitov, David Faust,
	Jose Marchesi, Elena Zannoni


Andrii Nakryiko writes:

> On Wed, Apr 24, 2024 at 3:41 PM Cupertino Miranda
> <cupertino.miranda@oracle.com> wrote:
>>
>> Split range computation checks in its own function, isolating pessimitic
>> range set for dst_reg and failing return to a single point.
>>
>> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
>> Cc: Yonghong Song <yonghong.song@linux.dev>
>> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
>> Cc: David Faust <david.faust@oracle.com>
>> Cc: Jose Marchesi <jose.marchesi@oracle.com>
>> Cc: Elena Zannoni <elena.zannoni@oracle.com>
>> ---
>>  kernel/bpf/verifier.c | 141 +++++++++++++++++++++++-------------------
>>  1 file changed, 77 insertions(+), 64 deletions(-)
>>
>
> I know you are moving around pre-existing code, so a bunch of nits
> below are to pre-existing code, but let's use this as an opportunity
> to clean it up a bit.
>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 6fe641c8ae33..829a12d263a5 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -13695,6 +13695,82 @@ static void scalar_min_max_arsh(struct bpf_reg_state *dst_reg,
>>         __update_reg_bounds(dst_reg);
>>  }
>>
>> +static bool is_const_reg_and_valid(struct bpf_reg_state reg, bool alu32,
>
> hm.. why passing reg_state by value? Use pointer?
>
Someone mentioned this in a review already and I forgot to change it.
Apologies if I did not reply on this.

The reason why I pass by value, is more of an approach to programming.
I do it as guarantee to the caller that there is no mutation of
the value.
If it is better or worst from a performance point of view it is
arguable, since although it might appear to copy the value it also provides
more information to the compiler of the intent of the callee function,
allowing it to optimize further.
I personally would leave the copy by value, but I understand if you want
to keep having the same code style.


>> +                                  bool *valid)
>> +{
>> +       s64 smin_val = reg.smin_value;
>> +       s64 smax_val = reg.smax_value;
>> +       u64 umin_val = reg.umin_value;
>> +       u64 umax_val = reg.umax_value;
>> +
>
> don't add empty line between variable declarations, all variables
> should be in a single continuous block
>
>> +       s32 s32_min_val = reg.s32_min_value;
>> +       s32 s32_max_val = reg.s32_max_value;
>> +       u32 u32_min_val = reg.u32_min_value;
>> +       u32 u32_max_val = reg.u32_max_value;
>> +
>
> but see below, I'm not sure we even need these local variables, they
> don't save all that much typing
>
>> +       bool known = alu32 ? tnum_subreg_is_const(reg.var_off) :
>> +                            tnum_is_const(reg.var_off);
>
> "known" is a misnomer, imo. It's `is_const`.
>
>> +
>> +       if (alu32) {
>> +               if ((known &&
>> +                    (s32_min_val != s32_max_val || u32_min_val != u32_max_val)) ||
>> +                     s32_min_val > s32_max_val || u32_min_val > u32_max_val)
>> +                       *valid = false;
>> +       } else {
>> +               if ((known &&
>> +                    (smin_val != smax_val || umin_val != umax_val)) ||
>> +                   smin_val > smax_val || umin_val > umax_val)
>> +                       *valid = false;
>> +       }
>> +
>> +       return known;
>
>
> The above is really hard to follow, especially how known && !known
> cases are being handled is very easy to misinterpret. How about we
> rewrite the equivalent logic in a few steps:
>
> if (alu32) {
>     if (tnum_subreg_is_const(reg.var_off)) {
>         return reg->s32_min_value == reg->s32_max_value &&
>                reg->u32_min_value == reg->u32_max_value;
>     } else {
>         return reg->s32_min_value <= reg->s32_max_value &&
>                reg->u32_min_value <= reg->u32_max_value;
>     }
> } else {
>    /* same as above for 64-bit bounds */
> }
>
> And you don't even need any local variables, while all the important
> conditions are a bit more easy to follow? Or is it just me?
>

With current state of the code, indeed, it seems you don't need the extra
valid argument to pass the extra information.
Considering that the KNOWN value is now only used in the shift
operators, it seems now safe to merge both valid and the return value
from the function, since the logic will result in the same behaviour.

>> +}
>> +
>> +static bool is_safe_to_compute_dst_reg_range(struct bpf_insn *insn,
>> +                                            struct bpf_reg_state src_reg)
>> +{
>> +       bool src_known;
>> +       u64 insn_bitness = (BPF_CLASS(insn->code) == BPF_ALU64) ? 64 : 32;
>
> whole u64 for this seems like an overkill, I'd just stick to `int`
>
>> +       bool alu32 = (BPF_CLASS(insn->code) != BPF_ALU64);
>
> insn_bitness == 32 ?
>
>> +       u8 opcode = BPF_OP(insn->code);
>> +
>
> nit: don't split variables block with empty line
>
>> +       bool valid_known = true;
>
> need an empty line between variable declarations and the rest
>
>> +       src_known = is_const_reg_and_valid(src_reg, alu32, &valid_known);
>> +
>> +       /* Taint dst register if offset had invalid bounds
>> +        * derived from e.g. dead branches.
>> +        */
>> +       if (valid_known == false)
>
> nit: !valid_known
>
>> +               return false;
>
> given this logic/handling, why not just return false from
> is_const_reg_and_valid() if it's a constant, but it's not
> valid/consistent? It's simpler and fits the logic and function's name,
> no? See my suggestion above
>
>> +
>> +       switch (opcode) {
>
> inline opcode variable here, you use it just once
>
>> +       case BPF_ADD:
>> +       case BPF_SUB:
>> +       case BPF_AND:
>> +               return true;
>> +
>> +       /* Compute range for the following only if the src_reg is known.
>> +        */
>> +       case BPF_XOR:
>> +       case BPF_OR:
>> +       case BPF_MUL:
>> +               return src_known;
>> +
>> +       /* Shift operators range is only computable if shift dimension operand
>> +        * is known. Also, shifts greater than 31 or 63 are undefined. This
>> +        * includes shifts by a negative number.
>> +        */
>> +       case BPF_LSH:
>> +       case BPF_RSH:
>> +       case BPF_ARSH:
>
> preserve original comment?
>
>> -                       /* Shifts greater than 31 or 63 are undefined.
>> -                        * This includes shifts by a negative number.
>> -                        */
>
>> +               return (src_known && src_reg.umax_value < insn_bitness);
>
> nit: unnecessary ()
>
>> +       default:
>> +               break;
>
> return false here, and drop return below
>
>> +       }
>> +
>> +       return false;
>> +}
>> +
>>  /* WARNING: This function does calculations on 64-bit values, but the actual
>>   * execution may occur on 32-bit values. Therefore, things like bitshifts
>>   * need extra checks in the 32-bit case.
>
> [...]

Apart from the obvious coding style problems I will address those optimizations
in an independent patch in the end, if you agree with. I would prefer to
separate the improvements to avoid to change semantics in the
refactoring patch, as previously requested by Yonghong.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation
  2024-04-26 10:20     ` Cupertino Miranda
@ 2024-04-26 16:11       ` Andrii Nakryiko
  2024-04-26 16:17         ` Alexei Starovoitov
  0 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2024-04-26 16:11 UTC (permalink / raw)
  To: Cupertino Miranda
  Cc: bpf, Yonghong Song, Alexei Starovoitov, David Faust,
	Jose Marchesi, Elena Zannoni

On Fri, Apr 26, 2024 at 3:20 AM Cupertino Miranda
<cupertino.miranda@oracle.com> wrote:
>
>
> Andrii Nakryiko writes:
>
> > On Wed, Apr 24, 2024 at 3:41 PM Cupertino Miranda
> > <cupertino.miranda@oracle.com> wrote:
> >>
> >> Split range computation checks in its own function, isolating pessimitic
> >> range set for dst_reg and failing return to a single point.
> >>
> >> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> >> Cc: Yonghong Song <yonghong.song@linux.dev>
> >> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> >> Cc: David Faust <david.faust@oracle.com>
> >> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> >> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> >> ---
> >>  kernel/bpf/verifier.c | 141 +++++++++++++++++++++++-------------------
> >>  1 file changed, 77 insertions(+), 64 deletions(-)
> >>
> >
> > I know you are moving around pre-existing code, so a bunch of nits
> > below are to pre-existing code, but let's use this as an opportunity
> > to clean it up a bit.
> >
> >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> >> index 6fe641c8ae33..829a12d263a5 100644
> >> --- a/kernel/bpf/verifier.c
> >> +++ b/kernel/bpf/verifier.c
> >> @@ -13695,6 +13695,82 @@ static void scalar_min_max_arsh(struct bpf_reg_state *dst_reg,
> >>         __update_reg_bounds(dst_reg);
> >>  }
> >>
> >> +static bool is_const_reg_and_valid(struct bpf_reg_state reg, bool alu32,
> >
> > hm.. why passing reg_state by value? Use pointer?
> >
> Someone mentioned this in a review already and I forgot to change it.
> Apologies if I did not reply on this.
>
> The reason why I pass by value, is more of an approach to programming.
> I do it as guarantee to the caller that there is no mutation of
> the value.
> If it is better or worst from a performance point of view it is
> arguable, since although it might appear to copy the value it also provides
> more information to the compiler of the intent of the callee function,
> allowing it to optimize further.
> I personally would leave the copy by value, but I understand if you want
> to keep having the same code style.

It's a pretty big 120-byte structure, so maybe the compiler can
optimize it very well, but I'd still be concerned. Hopefully it can
optimize well even with (const) pointer, if inlining.

But I do insist, if you look at (most? I haven't checked every single
function, of course) other uses in verifier.c, we pass things like
that by pointer. I understand the desire to specify the intent to not
modify it, but that's why you are passing `const struct bpf_reg_state
*reg`, so I think you don't lose anything with that.

>
>
> >> +                                  bool *valid)
> >> +{
> >> +       s64 smin_val = reg.smin_value;
> >> +       s64 smax_val = reg.smax_value;
> >> +       u64 umin_val = reg.umin_value;
> >> +       u64 umax_val = reg.umax_value;
> >> +
> >
> > don't add empty line between variable declarations, all variables
> > should be in a single continuous block
> >
> >> +       s32 s32_min_val = reg.s32_min_value;
> >> +       s32 s32_max_val = reg.s32_max_value;
> >> +       u32 u32_min_val = reg.u32_min_value;
> >> +       u32 u32_max_val = reg.u32_max_value;
> >> +
> >
> > but see below, I'm not sure we even need these local variables, they
> > don't save all that much typing
> >
> >> +       bool known = alu32 ? tnum_subreg_is_const(reg.var_off) :
> >> +                            tnum_is_const(reg.var_off);
> >
> > "known" is a misnomer, imo. It's `is_const`.
> >
> >> +
> >> +       if (alu32) {
> >> +               if ((known &&
> >> +                    (s32_min_val != s32_max_val || u32_min_val != u32_max_val)) ||
> >> +                     s32_min_val > s32_max_val || u32_min_val > u32_max_val)
> >> +                       *valid = false;
> >> +       } else {
> >> +               if ((known &&
> >> +                    (smin_val != smax_val || umin_val != umax_val)) ||
> >> +                   smin_val > smax_val || umin_val > umax_val)
> >> +                       *valid = false;
> >> +       }
> >> +
> >> +       return known;
> >
> >
> > The above is really hard to follow, especially how known && !known
> > cases are being handled is very easy to misinterpret. How about we
> > rewrite the equivalent logic in a few steps:
> >
> > if (alu32) {
> >     if (tnum_subreg_is_const(reg.var_off)) {
> >         return reg->s32_min_value == reg->s32_max_value &&
> >                reg->u32_min_value == reg->u32_max_value;
> >     } else {
> >         return reg->s32_min_value <= reg->s32_max_value &&
> >                reg->u32_min_value <= reg->u32_max_value;
> >     }
> > } else {
> >    /* same as above for 64-bit bounds */
> > }
> >
> > And you don't even need any local variables, while all the important
> > conditions are a bit more easy to follow? Or is it just me?
> >
>
> With current state of the code, indeed, it seems you don't need the extra
> valid argument to pass the extra information.
> Considering that the KNOWN value is now only used in the shift
> operators, it seems now safe to merge both valid and the return value
> from the function, since the logic will result in the same behaviour.
>

cool, let's do it then

> >> +}
> >> +
> >> +static bool is_safe_to_compute_dst_reg_range(struct bpf_insn *insn,
> >> +                                            struct bpf_reg_state src_reg)
> >> +{
> >> +       bool src_known;
> >> +       u64 insn_bitness = (BPF_CLASS(insn->code) == BPF_ALU64) ? 64 : 32;
> >
> > whole u64 for this seems like an overkill, I'd just stick to `int`
> >
> >> +       bool alu32 = (BPF_CLASS(insn->code) != BPF_ALU64);
> >
> > insn_bitness == 32 ?
> >
> >> +       u8 opcode = BPF_OP(insn->code);
> >> +
> >
> > nit: don't split variables block with empty line
> >
> >> +       bool valid_known = true;
> >
> > need an empty line between variable declarations and the rest
> >
> >> +       src_known = is_const_reg_and_valid(src_reg, alu32, &valid_known);
> >> +
> >> +       /* Taint dst register if offset had invalid bounds
> >> +        * derived from e.g. dead branches.
> >> +        */
> >> +       if (valid_known == false)
> >
> > nit: !valid_known
> >
> >> +               return false;
> >
> > given this logic/handling, why not just return false from
> > is_const_reg_and_valid() if it's a constant, but it's not
> > valid/consistent? It's simpler and fits the logic and function's name,
> > no? See my suggestion above
> >
> >> +
> >> +       switch (opcode) {
> >
> > inline opcode variable here, you use it just once
> >
> >> +       case BPF_ADD:
> >> +       case BPF_SUB:
> >> +       case BPF_AND:
> >> +               return true;
> >> +
> >> +       /* Compute range for the following only if the src_reg is known.
> >> +        */
> >> +       case BPF_XOR:
> >> +       case BPF_OR:
> >> +       case BPF_MUL:
> >> +               return src_known;
> >> +
> >> +       /* Shift operators range is only computable if shift dimension operand
> >> +        * is known. Also, shifts greater than 31 or 63 are undefined. This
> >> +        * includes shifts by a negative number.
> >> +        */
> >> +       case BPF_LSH:
> >> +       case BPF_RSH:
> >> +       case BPF_ARSH:
> >
> > preserve original comment?
> >
> >> -                       /* Shifts greater than 31 or 63 are undefined.
> >> -                        * This includes shifts by a negative number.
> >> -                        */
> >
> >> +               return (src_known && src_reg.umax_value < insn_bitness);
> >
> > nit: unnecessary ()
> >
> >> +       default:
> >> +               break;
> >
> > return false here, and drop return below
> >
> >> +       }
> >> +
> >> +       return false;
> >> +}
> >> +
> >>  /* WARNING: This function does calculations on 64-bit values, but the actual
> >>   * execution may occur on 32-bit values. Therefore, things like bitshifts
> >>   * need extra checks in the 32-bit case.
> >
> > [...]
>
> Apart from the obvious coding style problems I will address those optimizations
> in an independent patch in the end, if you agree with. I would prefer to
> separate the improvements to avoid to change semantics in the
> refactoring patch, as previously requested by Yonghong.

sure

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation
  2024-04-26 16:11       ` Andrii Nakryiko
@ 2024-04-26 16:17         ` Alexei Starovoitov
  2024-04-27 22:51           ` Cupertino Miranda
  0 siblings, 1 reply; 24+ messages in thread
From: Alexei Starovoitov @ 2024-04-26 16:17 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Cupertino Miranda, bpf, Yonghong Song, David Faust,
	Jose Marchesi, Elena Zannoni

On Fri, Apr 26, 2024 at 9:12 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Apr 26, 2024 at 3:20 AM Cupertino Miranda
> <cupertino.miranda@oracle.com> wrote:
> >
> >
> > Andrii Nakryiko writes:
> >
> > > On Wed, Apr 24, 2024 at 3:41 PM Cupertino Miranda
> > > <cupertino.miranda@oracle.com> wrote:
> > >>
> > >> Split range computation checks in its own function, isolating pessimitic
> > >> range set for dst_reg and failing return to a single point.
> > >>
> > >> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> > >> Cc: Yonghong Song <yonghong.song@linux.dev>
> > >> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > >> Cc: David Faust <david.faust@oracle.com>
> > >> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> > >> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> > >> ---
> > >>  kernel/bpf/verifier.c | 141 +++++++++++++++++++++++-------------------
> > >>  1 file changed, 77 insertions(+), 64 deletions(-)
> > >>
> > >
> > > I know you are moving around pre-existing code, so a bunch of nits
> > > below are to pre-existing code, but let's use this as an opportunity
> > > to clean it up a bit.
> > >
> > >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > >> index 6fe641c8ae33..829a12d263a5 100644
> > >> --- a/kernel/bpf/verifier.c
> > >> +++ b/kernel/bpf/verifier.c
> > >> @@ -13695,6 +13695,82 @@ static void scalar_min_max_arsh(struct bpf_reg_state *dst_reg,
> > >>         __update_reg_bounds(dst_reg);
> > >>  }
> > >>
> > >> +static bool is_const_reg_and_valid(struct bpf_reg_state reg, bool alu32,
> > >
> > > hm.. why passing reg_state by value? Use pointer?
> > >
> > Someone mentioned this in a review already and I forgot to change it.
> > Apologies if I did not reply on this.
> >
> > The reason why I pass by value, is more of an approach to programming.
> > I do it as guarantee to the caller that there is no mutation of
> > the value.
> > If it is better or worst from a performance point of view it is
> > arguable, since although it might appear to copy the value it also provides
> > more information to the compiler of the intent of the callee function,
> > allowing it to optimize further.
> > I personally would leave the copy by value, but I understand if you want
> > to keep having the same code style.
>
> It's a pretty big 120-byte structure, so maybe the compiler can
> optimize it very well, but I'd still be concerned. Hopefully it can
> optimize well even with (const) pointer, if inlining.
>
> But I do insist, if you look at (most? I haven't checked every single
> function, of course) other uses in verifier.c, we pass things like
> that by pointer. I understand the desire to specify the intent to not
> modify it, but that's why you are passing `const struct bpf_reg_state
> *reg`, so I think you don't lose anything with that.

+1
that "struct bpf_reg_state src_reg" code was written 7 years ago
when bpf_reg_state was small.
We definitely need to fix it. It might even bring
a noticeable runtime improvement.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation
  2024-04-26 16:17         ` Alexei Starovoitov
@ 2024-04-27 22:51           ` Cupertino Miranda
  2024-04-28  3:22             ` Andrii Nakryiko
  0 siblings, 1 reply; 24+ messages in thread
From: Cupertino Miranda @ 2024-04-27 22:51 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Yonghong Song, David Faust, Jose Marchesi,
	Elena Zannoni


Alexei Starovoitov writes:

> On Fri, Apr 26, 2024 at 9:12 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
>>
>> On Fri, Apr 26, 2024 at 3:20 AM Cupertino Miranda
>> <cupertino.miranda@oracle.com> wrote:
>> >
>> >
>> > Andrii Nakryiko writes:
>> >
>> > > On Wed, Apr 24, 2024 at 3:41 PM Cupertino Miranda
>> > > <cupertino.miranda@oracle.com> wrote:
>> > >>
>> > >> Split range computation checks in its own function, isolating pessimitic
>> > >> range set for dst_reg and failing return to a single point.
>> > >>
>> > >> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
>> > >> Cc: Yonghong Song <yonghong.song@linux.dev>
>> > >> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
>> > >> Cc: David Faust <david.faust@oracle.com>
>> > >> Cc: Jose Marchesi <jose.marchesi@oracle.com>
>> > >> Cc: Elena Zannoni <elena.zannoni@oracle.com>
>> > >> ---
>> > >>  kernel/bpf/verifier.c | 141 +++++++++++++++++++++++-------------------
>> > >>  1 file changed, 77 insertions(+), 64 deletions(-)
>> > >>
>> > >
>> > > I know you are moving around pre-existing code, so a bunch of nits
>> > > below are to pre-existing code, but let's use this as an opportunity
>> > > to clean it up a bit.
>> > >
>> > >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> > >> index 6fe641c8ae33..829a12d263a5 100644
>> > >> --- a/kernel/bpf/verifier.c
>> > >> +++ b/kernel/bpf/verifier.c
>> > >> @@ -13695,6 +13695,82 @@ static void scalar_min_max_arsh(struct bpf_reg_state *dst_reg,
>> > >>         __update_reg_bounds(dst_reg);
>> > >>  }
>> > >>
>> > >> +static bool is_const_reg_and_valid(struct bpf_reg_state reg, bool alu32,
>> > >
>> > > hm.. why passing reg_state by value? Use pointer?
>> > >
>> > Someone mentioned this in a review already and I forgot to change it.
>> > Apologies if I did not reply on this.
>> >
>> > The reason why I pass by value, is more of an approach to programming.
>> > I do it as guarantee to the caller that there is no mutation of
>> > the value.
>> > If it is better or worst from a performance point of view it is
>> > arguable, since although it might appear to copy the value it also provides
>> > more information to the compiler of the intent of the callee function,
>> > allowing it to optimize further.
>> > I personally would leave the copy by value, but I understand if you want
>> > to keep having the same code style.
>>
>> It's a pretty big 120-byte structure, so maybe the compiler can
>> optimize it very well, but I'd still be concerned. Hopefully it can
>> optimize well even with (const) pointer, if inlining.
>>
>> But I do insist, if you look at (most? I haven't checked every single
>> function, of course) other uses in verifier.c, we pass things like
>> that by pointer. I understand the desire to specify the intent to not
>> modify it, but that's why you are passing `const struct bpf_reg_state
>> *reg`, so I think you don't lose anything with that.
Well, the const will only guard the pointer from mutating, not the data
pointed by it.

>
> +1
> that "struct bpf_reg_state src_reg" code was written 7 years ago
> when bpf_reg_state was small.
> We definitely need to fix it. It might even bring
> a noticeable runtime improvement.

I forgot to reply to Andrii.

I will change the function prototype to pass the pointer instead.
In any case, please allow me to express my concerns once again, and
explain why I do it.

As a general practice, I personally will only copy a pointer to a
function if there is the intent to allow the function to change the
content of the pointed data.

In my understanding, it is easier for the compiler to optimize both the
caller and the callee when there are less side-effects from that
function call such as a possible memory clobbering.

Since these particular functions are leaf functions (not calling anywhere),
it should be relatively easy for the compiler to infer that the actual
copy is not needed and will likely just inline those calls, resulting in
lots of code being eliminated, which will remove any apparent copies.

I checked the asm file for verifier.c and everything below
adjust_scalar_min_max_vals including itself is inlined, making it
totally irrelevant if you copy the data or the pointer, since the
compiler will identify that the content refers to the same data and all
copies will be classified and removed as dead-code.

All the pointer passing in any context in verifier.c, to my eyes, is more
of a software defect then a virtue.
When there is an actual proven benefit, I am all for it, but not in all
cases.

I had to express my concerns on this and will never speak of it again.
:)

Thanks you all for the reviews. I will prepare a new version on Monday.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation
  2024-04-27 22:51           ` Cupertino Miranda
@ 2024-04-28  3:22             ` Andrii Nakryiko
  2024-04-28 10:56               ` Cupertino Miranda
  0 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2024-04-28  3:22 UTC (permalink / raw)
  To: Cupertino Miranda
  Cc: Alexei Starovoitov, bpf, Yonghong Song, David Faust,
	Jose Marchesi, Elena Zannoni

On Sat, Apr 27, 2024 at 3:51 PM Cupertino Miranda
<cupertino.miranda@oracle.com> wrote:
>
>
> Alexei Starovoitov writes:
>
> > On Fri, Apr 26, 2024 at 9:12 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> >>
> >> On Fri, Apr 26, 2024 at 3:20 AM Cupertino Miranda
> >> <cupertino.miranda@oracle.com> wrote:
> >> >
> >> >
> >> > Andrii Nakryiko writes:
> >> >
> >> > > On Wed, Apr 24, 2024 at 3:41 PM Cupertino Miranda
> >> > > <cupertino.miranda@oracle.com> wrote:
> >> > >>
> >> > >> Split range computation checks in its own function, isolating pessimitic
> >> > >> range set for dst_reg and failing return to a single point.
> >> > >>
> >> > >> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
> >> > >> Cc: Yonghong Song <yonghong.song@linux.dev>
> >> > >> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> >> > >> Cc: David Faust <david.faust@oracle.com>
> >> > >> Cc: Jose Marchesi <jose.marchesi@oracle.com>
> >> > >> Cc: Elena Zannoni <elena.zannoni@oracle.com>
> >> > >> ---
> >> > >>  kernel/bpf/verifier.c | 141 +++++++++++++++++++++++-------------------
> >> > >>  1 file changed, 77 insertions(+), 64 deletions(-)
> >> > >>
> >> > >
> >> > > I know you are moving around pre-existing code, so a bunch of nits
> >> > > below are to pre-existing code, but let's use this as an opportunity
> >> > > to clean it up a bit.
> >> > >
> >> > >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> >> > >> index 6fe641c8ae33..829a12d263a5 100644
> >> > >> --- a/kernel/bpf/verifier.c
> >> > >> +++ b/kernel/bpf/verifier.c
> >> > >> @@ -13695,6 +13695,82 @@ static void scalar_min_max_arsh(struct bpf_reg_state *dst_reg,
> >> > >>         __update_reg_bounds(dst_reg);
> >> > >>  }
> >> > >>
> >> > >> +static bool is_const_reg_and_valid(struct bpf_reg_state reg, bool alu32,
> >> > >
> >> > > hm.. why passing reg_state by value? Use pointer?
> >> > >
> >> > Someone mentioned this in a review already and I forgot to change it.
> >> > Apologies if I did not reply on this.
> >> >
> >> > The reason why I pass by value, is more of an approach to programming.
> >> > I do it as guarantee to the caller that there is no mutation of
> >> > the value.
> >> > If it is better or worst from a performance point of view it is
> >> > arguable, since although it might appear to copy the value it also provides
> >> > more information to the compiler of the intent of the callee function,
> >> > allowing it to optimize further.
> >> > I personally would leave the copy by value, but I understand if you want
> >> > to keep having the same code style.
> >>
> >> It's a pretty big 120-byte structure, so maybe the compiler can
> >> optimize it very well, but I'd still be concerned. Hopefully it can
> >> optimize well even with (const) pointer, if inlining.
> >>
> >> But I do insist, if you look at (most? I haven't checked every single
> >> function, of course) other uses in verifier.c, we pass things like
> >> that by pointer. I understand the desire to specify the intent to not
> >> modify it, but that's why you are passing `const struct bpf_reg_state
> >> *reg`, so I think you don't lose anything with that.
> Well, the const will only guard the pointer from mutating, not the data
> pointed by it.

I didn't propose marking pointer const, but mark pointee type as const:

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4e474ef44e9c..de2bc6fa15da 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -363,12 +363,14 @@ __printf(2, 3) static void verbose(void
*private_data, const char *fmt, ...)
 }

 static void verbose_invalid_scalar(struct bpf_verifier_env *env,
-                                  struct bpf_reg_state *reg,
+                                  const struct bpf_reg_state *reg,
                                   struct bpf_retval_range range,
const char *ctx,
                                   const char *reg_name)
 {
        bool unknown = true;

+       reg->smin_value = 0x1234;
+
        verbose(env, "%s the register %s has", ctx, reg_name);
        if (reg->smin_value > S64_MIN) {
                verbose(env, " smin=%lld", reg->smin_value);

$ make

...

/data/users/andriin/linux/kernel/bpf/verifier.c: In function
‘verbose_invalid_scalar’:
/data/users/andriin/linux/kernel/bpf/verifier.c:372:25: error:
assignment of member ‘smin_value’ in read-only object
  372 |         reg->smin_value = 0x1234;
      |                         ^

...

Works as it logically should.

>
> >
> > +1
> > that "struct bpf_reg_state src_reg" code was written 7 years ago
> > when bpf_reg_state was small.
> > We definitely need to fix it. It might even bring
> > a noticeable runtime improvement.
>
> I forgot to reply to Andrii.
>
> I will change the function prototype to pass the pointer instead.
> In any case, please allow me to express my concerns once again, and
> explain why I do it.
>
> As a general practice, I personally will only copy a pointer to a
> function if there is the intent to allow the function to change the
> content of the pointed data.

I'm not sure why you have this preconception that passing something by
pointer is only for mutation. C language has a straightforward way to
express "this is not going to be changed" with const. You can
circumvent this, of course, but that's an entirely different story.

>
> In my understanding, it is easier for the compiler to optimize both the
> caller and the callee when there are less side-effects from that
> function call such as a possible memory clobbering.
>
> Since these particular functions are leaf functions (not calling anywhere),
> it should be relatively easy for the compiler to infer that the actual
> copy is not needed and will likely just inline those calls, resulting in
> lots of code being eliminated, which will remove any apparent copies.
>
> I checked the asm file for verifier.c and everything below
> adjust_scalar_min_max_vals including itself is inlined, making it
> totally irrelevant if you copy the data or the pointer, since the
> compiler will identify that the content refers to the same data and all
> copies will be classified and removed as dead-code.
>
> All the pointer passing in any context in verifier.c, to my eyes, is more
> of a software defect then a virtue.
> When there is an actual proven benefit, I am all for it, but not in all
> cases.
>
> I had to express my concerns on this and will never speak of it again.
> :)
>
> Thanks you all for the reviews. I will prepare a new version on Monday.

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation
  2024-04-28  3:22             ` Andrii Nakryiko
@ 2024-04-28 10:56               ` Cupertino Miranda
  0 siblings, 0 replies; 24+ messages in thread
From: Cupertino Miranda @ 2024-04-28 10:56 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, bpf, Yonghong Song, David Faust,
	Jose Marchesi, Elena Zannoni


Andrii Nakryiko writes:

> On Sat, Apr 27, 2024 at 3:51 PM Cupertino Miranda
> <cupertino.miranda@oracle.com> wrote:
>>
>>
>> Alexei Starovoitov writes:
>>
>> > On Fri, Apr 26, 2024 at 9:12 AM Andrii Nakryiko
>> > <andrii.nakryiko@gmail.com> wrote:
>> >>
>> >> On Fri, Apr 26, 2024 at 3:20 AM Cupertino Miranda
>> >> <cupertino.miranda@oracle.com> wrote:
>> >> >
>> >> >
>> >> > Andrii Nakryiko writes:
>> >> >
>> >> > > On Wed, Apr 24, 2024 at 3:41 PM Cupertino Miranda
>> >> > > <cupertino.miranda@oracle.com> wrote:
>> >> > >>
>> >> > >> Split range computation checks in its own function, isolating pessimitic
>> >> > >> range set for dst_reg and failing return to a single point.
>> >> > >>
>> >> > >> Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
>> >> > >> Cc: Yonghong Song <yonghong.song@linux.dev>
>> >> > >> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
>> >> > >> Cc: David Faust <david.faust@oracle.com>
>> >> > >> Cc: Jose Marchesi <jose.marchesi@oracle.com>
>> >> > >> Cc: Elena Zannoni <elena.zannoni@oracle.com>
>> >> > >> ---
>> >> > >>  kernel/bpf/verifier.c | 141 +++++++++++++++++++++++-------------------
>> >> > >>  1 file changed, 77 insertions(+), 64 deletions(-)
>> >> > >>
>> >> > >
>> >> > > I know you are moving around pre-existing code, so a bunch of nits
>> >> > > below are to pre-existing code, but let's use this as an opportunity
>> >> > > to clean it up a bit.
>> >> > >
>> >> > >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> >> > >> index 6fe641c8ae33..829a12d263a5 100644
>> >> > >> --- a/kernel/bpf/verifier.c
>> >> > >> +++ b/kernel/bpf/verifier.c
>> >> > >> @@ -13695,6 +13695,82 @@ static void scalar_min_max_arsh(struct bpf_reg_state *dst_reg,
>> >> > >>         __update_reg_bounds(dst_reg);
>> >> > >>  }
>> >> > >>
>> >> > >> +static bool is_const_reg_and_valid(struct bpf_reg_state reg, bool alu32,
>> >> > >
>> >> > > hm.. why passing reg_state by value? Use pointer?
>> >> > >
>> >> > Someone mentioned this in a review already and I forgot to change it.
>> >> > Apologies if I did not reply on this.
>> >> >
>> >> > The reason why I pass by value, is more of an approach to programming.
>> >> > I do it as guarantee to the caller that there is no mutation of
>> >> > the value.
>> >> > If it is better or worst from a performance point of view it is
>> >> > arguable, since although it might appear to copy the value it also provides
>> >> > more information to the compiler of the intent of the callee function,
>> >> > allowing it to optimize further.
>> >> > I personally would leave the copy by value, but I understand if you want
>> >> > to keep having the same code style.
>> >>
>> >> It's a pretty big 120-byte structure, so maybe the compiler can
>> >> optimize it very well, but I'd still be concerned. Hopefully it can
>> >> optimize well even with (const) pointer, if inlining.
>> >>
>> >> But I do insist, if you look at (most? I haven't checked every single
>> >> function, of course) other uses in verifier.c, we pass things like
>> >> that by pointer. I understand the desire to specify the intent to not
>> >> modify it, but that's why you are passing `const struct bpf_reg_state
>> >> *reg`, so I think you don't lose anything with that.
>> Well, the const will only guard the pointer from mutating, not the data
>> pointed by it.
>
> I didn't propose marking pointer const, but mark pointee type as const:
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 4e474ef44e9c..de2bc6fa15da 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -363,12 +363,14 @@ __printf(2, 3) static void verbose(void
> *private_data, const char *fmt, ...)
>  }
>
>  static void verbose_invalid_scalar(struct bpf_verifier_env *env,
> -                                  struct bpf_reg_state *reg,
> +                                  const struct bpf_reg_state *reg,
>                                    struct bpf_retval_range range,
> const char *ctx,
>                                    const char *reg_name)
>  {
>         bool unknown = true;
>
> +       reg->smin_value = 0x1234;
> +
>         verbose(env, "%s the register %s has", ctx, reg_name);
>         if (reg->smin_value > S64_MIN) {
>                 verbose(env, " smin=%lld", reg->smin_value);
>
> $ make
>
> ...
>
> /data/users/andriin/linux/kernel/bpf/verifier.c: In function
> ‘verbose_invalid_scalar’:
> /data/users/andriin/linux/kernel/bpf/verifier.c:372:25: error:
> assignment of member ‘smin_value’ in read-only object
>   372 |         reg->smin_value = 0x1234;
>       |                         ^
>
> ...
>
> Works as it logically should.
>
Your right, pointer is better. I should have validated that myself.
Apologies for the noise. Please disregard all I said.

>>
>> >
>> > +1
>> > that "struct bpf_reg_state src_reg" code was written 7 years ago
>> > when bpf_reg_state was small.
>> > We definitely need to fix it. It might even bring
>> > a noticeable runtime improvement.
>>
>> I forgot to reply to Andrii.
>>
>> I will change the function prototype to pass the pointer instead.
>> In any case, please allow me to express my concerns once again, and
>> explain why I do it.
>>
>> As a general practice, I personally will only copy a pointer to a
>> function if there is the intent to allow the function to change the
>> content of the pointed data.
>
> I'm not sure why you have this preconception that passing something by
> pointer is only for mutation. C language has a straightforward way to
> express "this is not going to be changed" with const. You can
> circumvent this, of course, but that's an entirely different story.
>
>>
>> In my understanding, it is easier for the compiler to optimize both the
>> caller and the callee when there are less side-effects from that
>> function call such as a possible memory clobbering.
>>
>> Since these particular functions are leaf functions (not calling anywhere),
>> it should be relatively easy for the compiler to infer that the actual
>> copy is not needed and will likely just inline those calls, resulting in
>> lots of code being eliminated, which will remove any apparent copies.
>>
>> I checked the asm file for verifier.c and everything below
>> adjust_scalar_min_max_vals including itself is inlined, making it
>> totally irrelevant if you copy the data or the pointer, since the
>> compiler will identify that the content refers to the same data and all
>> copies will be classified and removed as dead-code.
>>
>> All the pointer passing in any context in verifier.c, to my eyes, is more
>> of a software defect then a virtue.
>> When there is an actual proven benefit, I am all for it, but not in all
>> cases.
>>
>> I had to express my concerns on this and will never speak of it again.
>> :)
>>
>> Thanks you all for the reviews. I will prepare a new version on Monday.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2024-04-28 10:56 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-24 22:40 [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements Cupertino Miranda
2024-04-24 22:40 ` [PATCH bpf-next v3 1/6] bpf/verifier: replace calls to mark_reg_unknown Cupertino Miranda
2024-04-25 16:56   ` Eduard Zingerman
2024-04-24 22:40 ` [PATCH bpf-next v3 2/6] bpf/verifier: refactor checks for range computation Cupertino Miranda
2024-04-25 18:49   ` Eduard Zingerman
2024-04-25 23:05   ` Andrii Nakryiko
2024-04-26 10:20     ` Cupertino Miranda
2024-04-26 16:11       ` Andrii Nakryiko
2024-04-26 16:17         ` Alexei Starovoitov
2024-04-27 22:51           ` Cupertino Miranda
2024-04-28  3:22             ` Andrii Nakryiko
2024-04-28 10:56               ` Cupertino Miranda
2024-04-24 22:40 ` [PATCH bpf-next v3 3/6] bpf/verifier: improve XOR and OR " Cupertino Miranda
2024-04-25 18:52   ` Eduard Zingerman
2024-04-24 22:40 ` [PATCH bpf-next v3 4/6] selftests/bpf: XOR and OR range computation tests Cupertino Miranda
2024-04-25 18:59   ` Eduard Zingerman
2024-04-25 23:17   ` Andrii Nakryiko
2024-04-24 22:40 ` [PATCH bpf-next v3 5/6] bpf/verifier: relax MUL range computation check Cupertino Miranda
2024-04-25 19:00   ` Eduard Zingerman
2024-04-25 23:24   ` Andrii Nakryiko
2024-04-24 22:40 ` [PATCH bpf-next v3 6/6] selftests/bpf: MUL range computation tests Cupertino Miranda
2024-04-25 19:02   ` Eduard Zingerman
2024-04-25 23:26   ` Andrii Nakryiko
2024-04-24 22:45 ` [PATCH bpf-next v3 0/6] bpf/verifier: range computation improvements Cupertino Miranda

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).