Re: [PATCH bpf-next v3 07/11] bpf: Fix a false rejection caused by AND operation

From: Xu Kuohai <xukuohai@huaweicloud.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Yonghong Song <yonghong.song@linux.dev>,
	Eduard Zingerman <eddyz87@gmail.com>,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-kselftest@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Matt Bobrowski <mattbobrowski@google.com>,
	Brendan Jackman <jackmanb@chromium.org>,
	Paul Moore <paul@paul-moore.com>,
	James Morris <jmorris@namei.org>,
	"Serge E . Hallyn" <serge@hallyn.com>,
	Khadija Kamran <kamrankhadijadj@gmail.com>,
	Casey Schaufler <casey@schaufler-ca.com>,
	Ondrej Mosnacek <omosnace@redhat.com>,
	Kees Cook <keescook@chromium.org>,
	John Johansen <john.johansen@canonical.com>,
	Lukas Bulwahn <lukas.bulwahn@gmail.com>,
	Roberto Sassu <roberto.sassu@huawei.com>,
	Shung-Hsi Yu <shung-hsi.yu@suse.com>
Subject: Re: [PATCH bpf-next v3 07/11] bpf: Fix a false rejection caused by AND operation
Date: Sun, 28 Apr 2024 23:15:40 +0800	[thread overview]
Message-ID: <4fbce978-9687-48a9-be2a-1c4d76790f7d@huaweicloud.com> (raw)
In-Reply-To: <CAEf4BzZTzftrOCFsfBd81sHDBpmNK+4Jefqa3SSS6NiuncO0tQ@mail.gmail.com>

On 4/27/2024 4:36 AM, Andrii Nakryiko wrote:
> On Tue, Apr 23, 2024 at 7:26 PM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>>
>> On 4/24/2024 5:55 AM, Yonghong Song wrote:
>>>
>>> On 4/20/24 1:33 AM, Xu Kuohai wrote:
>>>> On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
>>>>> On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
>>>>>> From: Xu Kuohai <xukuohai@huawei.com>
>>>>>>
>>>>>> With lsm return value check, the no-alu32 version test_libbpf_get_fd_by_id_opts
>>>>>> is rejected by the verifier, and the log says:
>>>>>>
>>>>>>     0: R1=ctx() R10=fp0
>>>>>>     ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
>>>>>>     0: (b7) r0 = 0                        ; R0_w=0
>>>>>>     1: (79) r2 = *(u64 *)(r1 +0)
>>>>>>     func 'bpf_lsm_bpf_map' arg0 has btf_id 916 type STRUCT 'bpf_map'
>>>>>>     2: R1=ctx() R2_w=trusted_ptr_bpf_map()
>>>>>>     ; if (map != (struct bpf_map *)&data_input) @ test_libbpf_get_fd_by_id_opts.c:29
>>>>>>     2: (18) r3 = 0xffff9742c0951a00       ; R3_w=map_ptr(map=data_input,ks=4,vs=4)
>>>>>>     4: (5d) if r2 != r3 goto pc+4         ; R2_w=trusted_ptr_bpf_map() R3_w=map_ptr(map=data_input,ks=4,vs=4)
>>>>>>     ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
>>>>>>     5: (79) r0 = *(u64 *)(r1 +8)          ; R0_w=scalar() R1=ctx()
>>>>>>     ; if (fmode & FMODE_WRITE) @ test_libbpf_get_fd_by_id_opts.c:32
>>>>>>     6: (67) r0 <<= 62                     ; R0_w=scalar(smax=0x4000000000000000,umax=0xc000000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xc000000000000000))
>>>>>>     7: (c7) r0 s>>= 63                    ; R0_w=scalar(smin=smin32=-1,smax=smax32=0)
>>>>>>     ;  @ test_libbpf_get_fd_by_id_opts.c:0
>>>>>>     8: (57) r0 &= -13                     ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3))
>>>>>>     ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
>>>>>>     9: (95) exit
> 
> [...]
> 
>>
>>       As suggested by Eduard, this patch makes a special case for source
>>       or destination register of '&=' operation being in range [-1, 0].
>>
>>       Meaning that one of the '&=' operands is either:
>>       - all ones, in which case the counterpart is the result of the operation;
>>       - all zeros, in which case zero is the result of the operation.
>>
>>       And MIN and MAX values could be derived based on above two observations.
>>
>>       [0] https://lore.kernel.org/bpf/e62e2971301ca7f2e9eb74fc500c520285cad8f5.camel@gmail.com/
>>       [1] https://github.com/llvm/llvm-project/blob/4523a267829c807f3fc8fab8e5e9613985a51565/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>>
>>       Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
>>       Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 640747b53745..30c551d39329 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -13374,6 +13374,24 @@ static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,
>>           dst_reg->u32_min_value = var32_off.value;
>>           dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
>>
>> +       /* Special case: src_reg is known and dst_reg is in range [-1, 0] */
>> +       if (src_known &&
>> +               dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0 &&
>> +               dst_reg->smin_value == -1 && dst_reg->smax_value == 0) {
> 
> please keep if () condition aligned across multiple lines, it's super
> confusing this way
>

OK, will update the align style

>> +               dst_reg->s32_min_value = min_t(s32, src_reg->s32_min_value, 0);
>> +               dst_reg->s32_max_value = max_t(s32, src_reg->s32_min_value, 0);
> 
> do we need to update tnum parts as well (or reset and re-derive, probably)?
> 
> btw, can't we support src being a range here? the idea is that dst_reg
> either all ones or all zeros. For and it means that it either stays
> all zero, or will be *exactly equal* to src, right? So I think the
> logic would be:
> 
> a) if [s32_min, s32_max] is on the same side of zero, then resulting
> range would be [min(s32_min, 0), max(s32_max, 0)], just like you have
> here
> 
> b) if [s32_min, s32_max] contains zero, then resulting range will be
> exactly [s32_min, s32_max]
> 
> Or did I make a mistake above?
>

Totally agree, the AND of any set with the range [-1,0] is equivalent
to adding number 0 to the set!

Based on this observation, I've rewritten the patch as follows.

diff --git a/include/linux/tnum.h b/include/linux/tnum.h
index 3c13240077b8..5e795d728b9f 100644
--- a/include/linux/tnum.h
+++ b/include/linux/tnum.h
@@ -52,6 +52,9 @@ struct tnum tnum_mul(struct tnum a, struct tnum b);
  /* Return a tnum representing numbers satisfying both @a and @b */
  struct tnum tnum_intersect(struct tnum a, struct tnum b);

+/* Return a tnum representing numbers satisfying either @a or @b */
+struct tnum tnum_union(struct tnum a, struct tnum b);
+
  /* Return @a with all but the lowest @size bytes cleared */
  struct tnum tnum_cast(struct tnum a, u8 size);

diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
index 9dbc31b25e3d..9d4480a683ca 100644
--- a/kernel/bpf/tnum.c
+++ b/kernel/bpf/tnum.c
@@ -150,6 +150,29 @@ struct tnum tnum_intersect(struct tnum a, struct tnum b)
         return TNUM(v & ~mu, mu);
  }

+/*
+ * Each bit has 3 states: unkown, known 0, known 1. If using x to represent
+ * unknown state, the result of the union of two bits is as follows:
+ *
+ *         | x    0    1
+ *    -----+------------
+ *     x   | x    x    x
+ *     0   | x    0    x
+ *     1   | x    x    1
+ *
+ * For tnum a and b, only the bits that are both known 0 or known 1 in a
+ * and b are known in the result of union a and b.
+ */
+struct tnum tnum_union(struct tnum a, struct tnum b)
+{
+       u64 v0, v1, mu;
+
+       mu = a.mask | b.mask; // unkown bits either in a or b
+       v1 = (a.value & b.value) & ~mu; // "known 1" bits in both a and b
+       v0 = (~a.value & ~b.value) & ~mu; // "known 0" bits in both a and b
+       return TNUM(v1, mu | ~(v0 | v1));
+}
+
  struct tnum tnum_cast(struct tnum a, u8 size)
  {
         a.value &= (1ULL << (size * 8)) - 1;
  {
         a.value &= (1ULL << (size * 8)) - 1;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8f0f2e21699e..b69c89bc5cfc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13478,6 +13478,28 @@ static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,
                 return;
         }

+       /* Special case: dst_reg is in range [-1, 0] */
+       if (dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0) {
+               var32_off = tnum_union(src_reg->var_off, tnum_const(0));
+               dst_reg->var_off = tnum_with_subreg(dst_reg->var_off, var32_off);
+               dst_reg->u32_min_value = var32_off.value;
+               dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
+               dst_reg->s32_min_value = min_t(s32, src_reg->s32_min_value, 0);
+               dst_reg->s32_max_value = max_t(s32, src_reg->s32_max_value, 0);
+               return;
+       }
+
+       /* Special case: src_reg is in range [-1, 0] */
+       if (src_reg->s32_min_value == -1 && src_reg->s32_max_value == 0) {
+               var32_off = tnum_union(dst_reg->var_off, tnum_const(0));
+               dst_reg->var_off = tnum_with_subreg(dst_reg->var_off, var32_off);
+               dst_reg->u32_min_value = var32_off.value;
+               dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
+               dst_reg->s32_min_value = min_t(s32, dst_reg->s32_min_value, 0);
+               dst_reg->s32_max_value = max_t(s32, dst_reg->s32_max_value, 0);
+               return;
+       }
+
         /* We get our minimum from the var_off, since that's inherently
          * bitwise.  Our maximum is the minimum of the operands' maxima.
          */
@@ -13508,6 +13530,26 @@ static void scalar_min_max_and(struct bpf_reg_state *dst_reg,
                 return;
         }

+       /* Special case: dst_reg is in range [-1, 0] */
+       if (dst_reg->smin_value == -1 && dst_reg->smax_value == 0) {
+               dst_reg->var_off = tnum_union(src_reg->var_off, tnum_const(0));
+               dst_reg->umin_value = dst_reg->var_off.value;
+               dst_reg->umax_value = min(dst_reg->umax_value, umax_val);
+               dst_reg->smin_value = min_t(s64, src_reg->smin_value, 0);
+               dst_reg->smax_value = max_t(s64, src_reg->smax_value, 0);
+               return;
+       }
+
+       /* Special case: src_reg is in range [-1, 0] */
+       if (src_reg->smin_value == -1 && src_reg->smax_value == 0) {
+               dst_reg->var_off = tnum_union(dst_reg->var_off, tnum_const(0));
+               dst_reg->umin_value = dst_reg->var_off.value;
+               dst_reg->umax_value = min(dst_reg->umax_value, umax_val);
+               dst_reg->smin_value = min_t(s64, dst_reg->smin_value, 0);
+               dst_reg->smax_value = max_t(s64, dst_reg->smax_value, 0);
+               return;
+       }
+

>> +               return;
>> +       }
>> +
>> +       /* Special case: dst_reg is known and src_reg is in range [-1, 0] */
>> +       if (dst_known &&
>> +               src_reg->s32_min_value == -1 && src_reg->s32_max_value == 0 &&
>> +               src_reg->smin_value == -1 && src_reg->smax_value == 0) {
>> +               dst_reg->s32_min_value = min_t(s32, dst_reg->s32_min_value, 0);
>> +               dst_reg->s32_max_value = max_t(s32, dst_reg->s32_min_value, 0);
>> +               return;
>> +       }
>> +
>>           /* Safe to set s32 bounds by casting u32 result into s32 when u32
>>            * doesn't cross sign boundary. Otherwise set s32 bounds to unbounded.
>>            */
> 
> [...]
>