Re: [PATCH bpf-next 1/2] bpf: fix a verifier failure with xor

From: Yonghong Song <yhs@fb.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>, <kernel-team@fb.com>,
	John Fastabend <john.fastabend@gmail.com>, <ecree@solarflare.com>
Subject: Re: [PATCH bpf-next 1/2] bpf: fix a verifier failure with xor
Date: Tue, 25 Aug 2020 20:36:15 -0700	[thread overview]
Message-ID: <f2056e3c-e300-6fa0-8b8e-fa19ed5580bd@fb.com> (raw)
In-Reply-To: <20200826015836.2rlfvhoznylkabp6@ast-mbp.dhcp.thefacebook.com>

On 8/25/20 6:58 PM, Alexei Starovoitov wrote:
> On Mon, Aug 24, 2020 at 11:46:08PM -0700, Yonghong Song wrote:
>> bpf selftest test_progs/test_sk_assign failed with llvm 11 and llvm 12.
>> Compared to llvm 10, llvm 11 and 12 generates xor instruction which
>> is not handled properly in verifier. The following illustrates the
>> problem:
>>
>>    16: (b4) w5 = 0
>>    17: ... R5_w=inv0 ...
>>    ...
>>    132: (a4) w5 ^= 1
>>    133: ... R5_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ...
>>    ...
>>    37: (bc) w8 = w5
>>    38: ... R5=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
>>            R8_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ...
>>    ...
>>    41: (bc) w3 = w8
>>    42: ... R3_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ...
>>    45: (56) if w3 != 0x0 goto pc+1
>>     ... R3_w=inv0 ...
>>    46: (b7) r1 = 34
>>    47: R1_w=inv34 R7=pkt(id=0,off=26,r=38,imm=0)
>>    47: (0f) r7 += r1
>>    48: R1_w=invP34 R3_w=inv0 R7_w=pkt(id=0,off=60,r=38,imm=0)
>>    48: (b4) w9 = 0
>>    49: R1_w=invP34 R3_w=inv0 R7_w=pkt(id=0,off=60,r=38,imm=0)
>>    49: (69) r1 = *(u16 *)(r7 +0)
>>    invalid access to packet, off=60 size=2, R7(id=0,off=60,r=38)
>>    R7 offset is outside of the packet
>>
>> At above insn 132, w5 = 0, but after w5 ^= 1, we give a really conservative
>> value of w5. At insn 45, in reality the condition should be always false.
>> But due to conservative value for w3, the verifier evaluates it could be
>> true and this later leads to verifier failure complaining potential
>> packet out-of-bound access.
>>
>> This patch implemented proper XOR support in verifier.
>> In the above example, we have:
>>    132: R5=invP0
>>    132: (a4) w5 ^= 1
>>    133: R5_w=invP1
>>    ...
>>    37: (bc) w8 = w5
>>    ...
>>    41: (bc) w3 = w8
>>    42: R3_w=invP1
>>    ...
>>    45: (56) if w3 != 0x0 goto pc+1
>>    47: R3_w=invP1
>>    ...
>>    processed 353 insns ...
>> and the verifier can verify the program successfully.
>>
>> Cc: John Fastabend <john.fastabend@gmail.com>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   kernel/bpf/verifier.c | 66 +++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 66 insertions(+)
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index dd24503ab3d3..a08cabc0f683 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -5801,6 +5801,67 @@ static void scalar_min_max_or(struct bpf_reg_state *dst_reg,
>>   	__update_reg_bounds(dst_reg);
>>   }
>>   
>> +static void scalar32_min_max_xor(struct bpf_reg_state *dst_reg,
>> +				 struct bpf_reg_state *src_reg)
>> +{
>> +	bool src_known = tnum_subreg_is_const(src_reg->var_off);
>> +	bool dst_known = tnum_subreg_is_const(dst_reg->var_off);
>> +	struct tnum var32_off = tnum_subreg(dst_reg->var_off);
>> +	s32 smin_val = src_reg->s32_min_value;
>> +
>> +	/* Assuming scalar64_min_max_xor will be called so it is safe
>> +	 * to skip updating register for known case.
>> +	 */
>> +	if (src_known && dst_known)
>> +		return;
> 
> why?
> I've looked at _and() and _or() variants that do the same and
> couldn't quite remember why it's ok to do so.

Yes, I copied what _and() and _or() did. What I thought is
if both known, 64bit scalar_min_max_xor() handled this and did
not go though the approximation below, so that is why we return here.
John, could you confirm?

> 
>> +
>> +	/* We get both minimum and maximum from the var32_off. */
>> +	dst_reg->u32_min_value = var32_off.value;
>> +	dst_reg->u32_max_value = var32_off.value | var32_off.mask;
>> +
>> +	if (dst_reg->s32_min_value >= 0 && smin_val >= 0) {
>> +		/* XORing two positive sign numbers gives a positive,
>> +		 * so safe to cast u32 result into s32.
>> +		 */
>> +		dst_reg->s32_min_value = dst_reg->u32_min_value;
>> +		dst_reg->s32_max_value = dst_reg->u32_max_value;
>> +	} else {
>> +		dst_reg->s32_min_value = S32_MIN;
>> +		dst_reg->s32_max_value = S32_MAX;
>> +	}
>> +}
>> +
>> +static void scalar_min_max_xor(struct bpf_reg_state *dst_reg,
>> +			       struct bpf_reg_state *src_reg)
>> +{
>> +	bool src_known = tnum_is_const(src_reg->var_off);
>> +	bool dst_known = tnum_is_const(dst_reg->var_off);
>> +	s64 smin_val = src_reg->smin_value;
>> +
>> +	if (src_known && dst_known) {
>> +		/* dst_reg->var_off.value has been updated earlier */
> 
> right, but that means that there is sort-of 'bug' (unnecessary operation)
> that caused me a lot of head scratching.
> scalar_min_max_and() and scalar_min_max_or() do the alu in similar situation:
>          if (src_known && dst_known) {
>                  __mark_reg_known(dst_reg, dst_reg->var_off.value |
>                                            src_reg->var_off.value);
> I guess it's still technically correct to repeat alu operation.
> second & and second | won't change the value of dst_reg,
> but it feels that it's correct by accident?
> John ?

I think for or and add, additional dst_reg op src_reg is okay. For 
example, for "and", the computation looks like
    dst = dst & src
    dst = dst & src
result will be the same as
    dst = dst & src
and the second is redundant and can be replaced with dst.
The same for or,
    dst = dst | src
    dst = dst | src
is the same as "dst = dst | src" and the second is redundant. So
for and/or, the __mark_reg_known can just take dst_reg->var_off.value,
but the current code is also correct but can be simplified.

This is not the case xor (^). The extra computation will
change expected value.

> 
>> +		__mark_reg_known(dst_reg, dst_reg->var_off.value);
>> +		return;
>> +	}
>> +
>> +	/* We get both minimum and maximum from the var_off. */
>> +	dst_reg->umin_value = dst_reg->var_off.value;
>> +	dst_reg->umax_value = dst_reg->var_off.value | dst_reg->var_off.mask;
> 
> I think this is correct, but I hope somebody else can analyze this as well.
> John, Ed ?

Please do double check. Thanks.

> 
>> +
>> +	if (dst_reg->smin_value >= 0 && smin_val >= 0) {
>> +		/* XORing two positive sign numbers gives a positive,
>> +		 * so safe to cast u64 result into s64.
>> +		 */
>> +		dst_reg->smin_value = dst_reg->umin_value;
>> +		dst_reg->smax_value = dst_reg->umax_value;
>> +	} else {
>> +		dst_reg->smin_value = S64_MIN;
>> +		dst_reg->smax_value = S64_MAX;
>> +	}
>> +
>> +	__update_reg_bounds(dst_reg);
>> +}