netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] bpf: verifier check for dead branch
@ 2020-08-07 17:30 Jiri Olsa
  2020-08-10  1:21 ` Yonghong Song
  0 siblings, 1 reply; 7+ messages in thread
From: Jiri Olsa @ 2020-08-07 17:30 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: bpf, netdev, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

hi,
we have a customer facing some odd verifier fails on following
sk_skb program:

   0. r2 = *(u32 *)(r1 + data_end)
   1. r4 = *(u32 *)(r1 + data)
   2. r3 = r4
   3. r3 += 42
   4. r1 = 0
   5. if r3 > r2 goto 8
   6. r4 += 14
   7. r1 = r4
   8. if r3 > r2 goto 10
   9. r2 = *(u8 *)(r1 + 9)
  10. r0 = 0
  11. exit

The code checks if the skb data is big enough (5) and if it is,
it prepares pointer in r1 (7), then there's again size check (8)
and finally data load from r1 (9).

It's and odd code, but apparently this is something that can
get produced by clang.

I made selftest out of it and it fails to load with:

  # test_verifier -v 267
  #267/p dead path check FAIL
  Failed to load prog 'Success'!
  0: (61) r2 = *(u32 *)(r1 +80)
  1: (61) r4 = *(u32 *)(r1 +76)
  2: (bf) r3 = r4
  3: (07) r3 += 42
  4: (b7) r1 = 0
  5: (2d) if r3 > r2 goto pc+2

  from 5 to 8: R1_w=inv0 R2_w=pkt_end(id=0,off=0,imm=0) R3_w=pkt(id=0,off=42,r=0,imm=0) R4_w=pkt(id=0,off=0,r=0,imm=0) R10=fp0
  8: (2d) if r3 > r2 goto pc+1
   R1_w=inv0 R2_w=pkt_end(id=0,off=0,imm=0) R3_w=pkt(id=0,off=42,r=42,imm=0) R4_w=pkt(id=0,off=0,r=42,imm=0) R10=fp0
  9: (69) r2 = *(u16 *)(r1 +9)
  R1 invalid mem access 'inv'
  processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 0

The verifier does not seem to take into account that code can't
ever reach instruction 9 if the size check fails and r1 will be
always valid when size check succeeds.

My guess is that verifier does not have such check, but I'm still
scratching on the surface of it, so I could be totally wrong and
missing something.. before I dive in I was wondering you guys
could help me out with some insights or suggestions.

thanks,
jirka


---
 .../testing/selftests/bpf/verifier/ctx_skb.c  | 21 +++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/tools/testing/selftests/bpf/verifier/ctx_skb.c b/tools/testing/selftests/bpf/verifier/ctx_skb.c
index 2e16b8e268f2..54578f1fb662 100644
--- a/tools/testing/selftests/bpf/verifier/ctx_skb.c
+++ b/tools/testing/selftests/bpf/verifier/ctx_skb.c
@@ -346,6 +346,27 @@
 	.result = ACCEPT,
 	.prog_type = BPF_PROG_TYPE_SK_SKB,
 },
+{
+	"dead path check",
+	.insns = {
+	BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,		//  0. r2 = *(u32 *)(r1 + data_end)
+		    offsetof(struct __sk_buff, data_end)),
+	BPF_LDX_MEM(BPF_W, BPF_REG_4, BPF_REG_1,		//  1. r4 = *(u32 *)(r1 + data)
+		    offsetof(struct __sk_buff, data)),
+	BPF_MOV64_REG(BPF_REG_3, BPF_REG_4),			//  2. r3 = r4
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, 42),			//  3. r3 += 42
+	BPF_MOV64_IMM(BPF_REG_1, 0),				//  4. r1 = 0
+	BPF_JMP_REG(BPF_JGT, BPF_REG_3, BPF_REG_2, 2),		//  5. if r3 > r2 goto 8
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, 14),			//  6. r4 += 14
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_4),			//  7. r1 = r4
+	BPF_JMP_REG(BPF_JGT, BPF_REG_3, BPF_REG_2, 1),		//  8. if r3 > r2 goto 10
+	BPF_LDX_MEM(BPF_H, BPF_REG_2, BPF_REG_1, 9),		//  9. r2 = *(u8 *)(r1 + 9)
+	BPF_MOV64_IMM(BPF_REG_0, 0),				// 10. r0 = 0
+	BPF_EXIT_INSN(),					// 11. exit
+	},
+	.result = ACCEPT,
+	.prog_type = BPF_PROG_TYPE_SK_SKB,
+},
 {
 	"overlapping checks for direct packet access SK_SKB",
 	.insns = {
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC] bpf: verifier check for dead branch
  2020-08-07 17:30 [RFC] bpf: verifier check for dead branch Jiri Olsa
@ 2020-08-10  1:21 ` Yonghong Song
  2020-08-10 13:54   ` Jiri Olsa
  0 siblings, 1 reply; 7+ messages in thread
From: Yonghong Song @ 2020-08-10  1:21 UTC (permalink / raw)
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann
  Cc: bpf, netdev, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	John Fastabend, KP Singh



On 8/7/20 10:30 AM, Jiri Olsa wrote:
> hi,
> we have a customer facing some odd verifier fails on following
> sk_skb program:
> 
>     0. r2 = *(u32 *)(r1 + data_end)
>     1. r4 = *(u32 *)(r1 + data)
>     2. r3 = r4
>     3. r3 += 42
>     4. r1 = 0
>     5. if r3 > r2 goto 8
>     6. r4 += 14
>     7. r1 = r4
>     8. if r3 > r2 goto 10
>     9. r2 = *(u8 *)(r1 + 9)
>    10. r0 = 0
>    11. exit
> 
> The code checks if the skb data is big enough (5) and if it is,
> it prepares pointer in r1 (7), then there's again size check (8)
> and finally data load from r1 (9).
> 
> It's and odd code, but apparently this is something that can
> get produced by clang.

Could you provide a test case where clang generates the above code?
I would like to see whether clang can do a better job to avoid
such codes.

> 
> I made selftest out of it and it fails to load with:
> 
>    # test_verifier -v 267
>    #267/p dead path check FAIL
>    Failed to load prog 'Success'!
>    0: (61) r2 = *(u32 *)(r1 +80)
>    1: (61) r4 = *(u32 *)(r1 +76)
>    2: (bf) r3 = r4
>    3: (07) r3 += 42
>    4: (b7) r1 = 0
>    5: (2d) if r3 > r2 goto pc+2
> 
>    from 5 to 8: R1_w=inv0 R2_w=pkt_end(id=0,off=0,imm=0) R3_w=pkt(id=0,off=42,r=0,imm=0) R4_w=pkt(id=0,off=0,r=0,imm=0) R10=fp0
>    8: (2d) if r3 > r2 goto pc+1
>     R1_w=inv0 R2_w=pkt_end(id=0,off=0,imm=0) R3_w=pkt(id=0,off=42,r=42,imm=0) R4_w=pkt(id=0,off=0,r=42,imm=0) R10=fp0
>    9: (69) r2 = *(u16 *)(r1 +9)
>    R1 invalid mem access 'inv'
>    processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 0
> 
> The verifier does not seem to take into account that code can't
> ever reach instruction 9 if the size check fails and r1 will be
> always valid when size check succeeds.
> 
> My guess is that verifier does not have such check, but I'm still
> scratching on the surface of it, so I could be totally wrong and
> missing something.. before I dive in I was wondering you guys
> could help me out with some insights or suggestions.

There are no bits in reg_state to indicate a pkt pointer already goes 
beyond the packet_end. So insn 8 cannot conclude the condition r3 > r2 
is always true...

If a bit to indicate packet pointer exceeding the end, it should work.
Fixing in verifier probably not too hard. But this should be a corner 
case and it will be good to check whether we can avoid it in clang.

> 
> thanks,
> jirka
> 
> 
> ---
>   .../testing/selftests/bpf/verifier/ctx_skb.c  | 21 +++++++++++++++++++
>   1 file changed, 21 insertions(+)
> 
> diff --git a/tools/testing/selftests/bpf/verifier/ctx_skb.c b/tools/testing/selftests/bpf/verifier/ctx_skb.c
> index 2e16b8e268f2..54578f1fb662 100644
> --- a/tools/testing/selftests/bpf/verifier/ctx_skb.c
> +++ b/tools/testing/selftests/bpf/verifier/ctx_skb.c
> @@ -346,6 +346,27 @@
>   	.result = ACCEPT,
>   	.prog_type = BPF_PROG_TYPE_SK_SKB,
>   },
> +{
> +	"dead path check",
> +	.insns = {
> +	BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,		//  0. r2 = *(u32 *)(r1 + data_end)
> +		    offsetof(struct __sk_buff, data_end)),
> +	BPF_LDX_MEM(BPF_W, BPF_REG_4, BPF_REG_1,		//  1. r4 = *(u32 *)(r1 + data)
> +		    offsetof(struct __sk_buff, data)),
> +	BPF_MOV64_REG(BPF_REG_3, BPF_REG_4),			//  2. r3 = r4
> +	BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, 42),			//  3. r3 += 42
> +	BPF_MOV64_IMM(BPF_REG_1, 0),				//  4. r1 = 0
> +	BPF_JMP_REG(BPF_JGT, BPF_REG_3, BPF_REG_2, 2),		//  5. if r3 > r2 goto 8
> +	BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, 14),			//  6. r4 += 14
> +	BPF_MOV64_REG(BPF_REG_1, BPF_REG_4),			//  7. r1 = r4
> +	BPF_JMP_REG(BPF_JGT, BPF_REG_3, BPF_REG_2, 1),		//  8. if r3 > r2 goto 10
> +	BPF_LDX_MEM(BPF_H, BPF_REG_2, BPF_REG_1, 9),		//  9. r2 = *(u8 *)(r1 + 9)
> +	BPF_MOV64_IMM(BPF_REG_0, 0),				// 10. r0 = 0
> +	BPF_EXIT_INSN(),					// 11. exit
> +	},
> +	.result = ACCEPT,
> +	.prog_type = BPF_PROG_TYPE_SK_SKB,
> +},
>   {
>   	"overlapping checks for direct packet access SK_SKB",
>   	.insns = {
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] bpf: verifier check for dead branch
  2020-08-10  1:21 ` Yonghong Song
@ 2020-08-10 13:54   ` Jiri Olsa
  2020-08-10 17:16     ` Yonghong Song
  0 siblings, 1 reply; 7+ messages in thread
From: Jiri Olsa @ 2020-08-10 13:54 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, netdev,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh

On Sun, Aug 09, 2020 at 06:21:01PM -0700, Yonghong Song wrote:
> 
> 
> On 8/7/20 10:30 AM, Jiri Olsa wrote:
> > hi,
> > we have a customer facing some odd verifier fails on following
> > sk_skb program:
> > 
> >     0. r2 = *(u32 *)(r1 + data_end)
> >     1. r4 = *(u32 *)(r1 + data)
> >     2. r3 = r4
> >     3. r3 += 42
> >     4. r1 = 0
> >     5. if r3 > r2 goto 8
> >     6. r4 += 14
> >     7. r1 = r4
> >     8. if r3 > r2 goto 10
> >     9. r2 = *(u8 *)(r1 + 9)
> >    10. r0 = 0
> >    11. exit
> > 
> > The code checks if the skb data is big enough (5) and if it is,
> > it prepares pointer in r1 (7), then there's again size check (8)
> > and finally data load from r1 (9).
> > 
> > It's and odd code, but apparently this is something that can
> > get produced by clang.
> 
> Could you provide a test case where clang generates the above code?
> I would like to see whether clang can do a better job to avoid
> such codes.

I get that code genrated by using recent enough upstream clang
on the attached source.

	/opt/clang/bin/clang --version
	clang version 11.0.0 (https://github.com/llvm/llvm-project.git 4cbfb98eb362b0629d5d1cd113af4427e2904763)
	Target: x86_64-unknown-linux-gnu
	Thread model: posix
	InstalledDir: /opt/clang/bin

	$ llvm-objdump -d verifier-cond-repro.o 

	verifier-cond-repro.o:  file format ELF64-BPF

	Disassembly of section .text:

	0000000000000000 my_prog:
	       0:       61 12 50 00 00 00 00 00 r2 = *(u32 *)(r1 + 80)
	       1:       61 14 4c 00 00 00 00 00 r4 = *(u32 *)(r1 + 76)
	       2:       bf 43 00 00 00 00 00 00 r3 = r4
	       3:       07 03 00 00 2a 00 00 00 r3 += 42
	       4:       b7 01 00 00 00 00 00 00 r1 = 0
	       5:       2d 23 02 00 00 00 00 00 if r3 > r2 goto +2 <LBB0_2>
	       6:       07 04 00 00 0e 00 00 00 r4 += 14
	       7:       bf 41 00 00 00 00 00 00 r1 = r4

	0000000000000040 LBB0_2:
	       8:       2d 23 05 00 00 00 00 00 if r3 > r2 goto +5 <LBB0_5>
	       9:       71 12 09 00 00 00 00 00 r2 = *(u8 *)(r1 + 9)
	      10:       56 02 03 00 11 00 00 00 if w2 != 17 goto +3 <LBB0_5>
	      11:       b4 00 00 00 d2 04 00 00 w0 = 1234
	      12:       69 11 16 00 00 00 00 00 r1 = *(u16 *)(r1 + 22)
	      13:       16 01 01 00 d2 04 00 00 if w1 == 1234 goto +1 <LBB0_6>

	0000000000000070 LBB0_5:
	      14:       b4 00 00 00 ff ff ff ff w0 = -1

	0000000000000078 LBB0_6:
	      15:       95 00 00 00 00 00 00 00 exit


thanks,
jirka


---
// Copyright (c) 2019 Tigera, Inc. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include <stddef.h>
#include <string.h>
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <linux/in.h>
#include <linux/udp.h>
#include <linux/tcp.h>
#include <linux/pkt_cls.h>
#include <sys/socket.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

#include <stddef.h>

#define INLINE inline __attribute__((always_inline))

#define skb_shorter(skb, len) ((void *)(long)(skb)->data + (len) > (void *)(long)skb->data_end)

#define ETH_IPV4_UDP_SIZE (14+20+8)

static INLINE struct iphdr *get_iphdr (struct __sk_buff *skb)
{
	struct iphdr *ip = NULL;
	struct ethhdr *eth;

	if (skb_shorter(skb, ETH_IPV4_UDP_SIZE))
		goto out;

	eth = (void *)(long)skb->data;
	ip = (void *)(eth + 1);

out:
	return ip;
}

int my_prog(struct __sk_buff *skb)
{
	struct iphdr *ip = NULL;
	struct udphdr *udp;
	__u8 proto = 0;

	if (!(ip = get_iphdr(skb)))
		goto out;

	proto = ip->protocol;

	if (proto != IPPROTO_UDP)
		goto out;

	udp = (void*)(ip + 1);

	if (udp->dest != 1234)
		goto out;

	if (!udp)
		goto out;

	return udp->dest;

out:
	return -1;
}


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] bpf: verifier check for dead branch
  2020-08-10 13:54   ` Jiri Olsa
@ 2020-08-10 17:16     ` Yonghong Song
  2020-08-11  7:14       ` Jiri Olsa
  0 siblings, 1 reply; 7+ messages in thread
From: Yonghong Song @ 2020-08-10 17:16 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, netdev,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh



On 8/10/20 6:54 AM, Jiri Olsa wrote:
> On Sun, Aug 09, 2020 at 06:21:01PM -0700, Yonghong Song wrote:
>>
>>
>> On 8/7/20 10:30 AM, Jiri Olsa wrote:
>>> hi,
>>> we have a customer facing some odd verifier fails on following
>>> sk_skb program:
>>>
>>>      0. r2 = *(u32 *)(r1 + data_end)
>>>      1. r4 = *(u32 *)(r1 + data)
>>>      2. r3 = r4
>>>      3. r3 += 42
>>>      4. r1 = 0
>>>      5. if r3 > r2 goto 8
>>>      6. r4 += 14
>>>      7. r1 = r4
>>>      8. if r3 > r2 goto 10
>>>      9. r2 = *(u8 *)(r1 + 9)
>>>     10. r0 = 0
>>>     11. exit
>>>
>>> The code checks if the skb data is big enough (5) and if it is,
>>> it prepares pointer in r1 (7), then there's again size check (8)
>>> and finally data load from r1 (9).
>>>
>>> It's and odd code, but apparently this is something that can
>>> get produced by clang.
>>
>> Could you provide a test case where clang generates the above code?
>> I would like to see whether clang can do a better job to avoid
>> such codes.
> 
> I get that code genrated by using recent enough upstream clang
> on the attached source.

Thanks for the test case. I can reproduce the issue. The following
is why this happens in llvm.
the pseudo IR code looks like
    data = skb->data
    data_end = skb->data_end
    comp = data + 42 > data_end
    ip = select "comp" nullptr "data + some offset"
          <=== select return one of nullptr or "data + some offset" 
based on "comp"
    if comp   // original skb_shorter condition
       ....
    ...
       = ip

In llvm, bpf backend "select" actually inlined "comp" to generate proper
control flow. Therefore, comp is computed twice like below
    data = skb->data
    data_end = skb->data_end
    if (data + 42 > data_end) {
       ip = nullptr; goto block1;
    } else {
       ip = data + some_offset;
       goto block2;
    }
    ...
    if (data + 42 > data_end) // original skb_shorter condition

The issue can be workarounded the source. Just check data + 42 > 
data_end and if failure return. Then you will be able to assign
a value to "ip" conditionally.

Will try to fix this issue in llvm12 as well.
Thanks!


> 
> 	/opt/clang/bin/clang --version
> 	clang version 11.0.0 (https://github.com/llvm/llvm-project.git 4cbfb98eb362b0629d5d1cd113af4427e2904763)
> 	Target: x86_64-unknown-linux-gnu
> 	Thread model: posix
> 	InstalledDir: /opt/clang/bin
> 
> 	$ llvm-objdump -d verifier-cond-repro.o
> 
> 	verifier-cond-repro.o:  file format ELF64-BPF
> 
> 	Disassembly of section .text:
> 
> 	0000000000000000 my_prog:
> 	       0:       61 12 50 00 00 00 00 00 r2 = *(u32 *)(r1 + 80)
> 	       1:       61 14 4c 00 00 00 00 00 r4 = *(u32 *)(r1 + 76)
> 	       2:       bf 43 00 00 00 00 00 00 r3 = r4
> 	       3:       07 03 00 00 2a 00 00 00 r3 += 42
> 	       4:       b7 01 00 00 00 00 00 00 r1 = 0
> 	       5:       2d 23 02 00 00 00 00 00 if r3 > r2 goto +2 <LBB0_2>
> 	       6:       07 04 00 00 0e 00 00 00 r4 += 14
> 	       7:       bf 41 00 00 00 00 00 00 r1 = r4
> 
> 	0000000000000040 LBB0_2:
> 	       8:       2d 23 05 00 00 00 00 00 if r3 > r2 goto +5 <LBB0_5>
> 	       9:       71 12 09 00 00 00 00 00 r2 = *(u8 *)(r1 + 9)
> 	      10:       56 02 03 00 11 00 00 00 if w2 != 17 goto +3 <LBB0_5>
> 	      11:       b4 00 00 00 d2 04 00 00 w0 = 1234
> 	      12:       69 11 16 00 00 00 00 00 r1 = *(u16 *)(r1 + 22)
> 	      13:       16 01 01 00 d2 04 00 00 if w1 == 1234 goto +1 <LBB0_6>
> 
> 	0000000000000070 LBB0_5:
> 	      14:       b4 00 00 00 ff ff ff ff w0 = -1
> 
> 	0000000000000078 LBB0_6:
> 	      15:       95 00 00 00 00 00 00 00 exit
> 
> 
> thanks,
> jirka
> 
> 
> ---
> // Copyright (c) 2019 Tigera, Inc. All rights reserved.
> //
> // Licensed under the Apache License, Version 2.0 (the "License");
> // you may not use this file except in compliance with the License.
> // You may obtain a copy of the License at
> //
> //     https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_licenses_LICENSE-2D2.0&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=bsFf5gOsaXXCIBYvnl6AK78aRCaFS1zhqvVCYbn4JBA&s=5MVLZXPlXMo2B2K5wDP5P3Lmn4-TQTKHvQfvZupEFvs&e=
> //
> // Unless required by applicable law or agreed to in writing, software
> // distributed under the License is distributed on an "AS IS" BASIS,
> // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> // See the License for the specific language governing permissions and
> // limitations under the License.
> 
> #include <stddef.h>
> #include <string.h>
> #include <linux/bpf.h>
> #include <linux/if_ether.h>
> #include <linux/if_packet.h>
> #include <linux/ip.h>
> #include <linux/ipv6.h>
> #include <linux/in.h>
> #include <linux/udp.h>
> #include <linux/tcp.h>
> #include <linux/pkt_cls.h>
> #include <sys/socket.h>
> #include <bpf/bpf_helpers.h>
> #include <bpf/bpf_endian.h>
> 
> #include <stddef.h>
> 
> #define INLINE inline __attribute__((always_inline))
> 
> #define skb_shorter(skb, len) ((void *)(long)(skb)->data + (len) > (void *)(long)skb->data_end)
> 
> #define ETH_IPV4_UDP_SIZE (14+20+8)
> 
> static INLINE struct iphdr *get_iphdr (struct __sk_buff *skb)
> {
> 	struct iphdr *ip = NULL;
> 	struct ethhdr *eth;
> 
> 	if (skb_shorter(skb, ETH_IPV4_UDP_SIZE))
> 		goto out;
> 
> 	eth = (void *)(long)skb->data;
> 	ip = (void *)(eth + 1);
> 
> out:
> 	return ip;
> }
> 
> int my_prog(struct __sk_buff *skb)
> {
> 	struct iphdr *ip = NULL;
> 	struct udphdr *udp;
> 	__u8 proto = 0;
> 
> 	if (!(ip = get_iphdr(skb)))
> 		goto out;
> 
> 	proto = ip->protocol;
> 
> 	if (proto != IPPROTO_UDP)
> 		goto out;
> 
> 	udp = (void*)(ip + 1);
> 
> 	if (udp->dest != 1234)
> 		goto out;
> 
> 	if (!udp)
> 		goto out;
> 
> 	return udp->dest;
> 
> out:
> 	return -1;
> }
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] bpf: verifier check for dead branch
  2020-08-10 17:16     ` Yonghong Song
@ 2020-08-11  7:14       ` Jiri Olsa
  2020-08-11 16:08         ` Yonghong Song
  0 siblings, 1 reply; 7+ messages in thread
From: Jiri Olsa @ 2020-08-11  7:14 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, netdev,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh

On Mon, Aug 10, 2020 at 10:16:12AM -0700, Yonghong Song wrote:

SNIP

> 
> Thanks for the test case. I can reproduce the issue. The following
> is why this happens in llvm.
> the pseudo IR code looks like
>    data = skb->data
>    data_end = skb->data_end
>    comp = data + 42 > data_end
>    ip = select "comp" nullptr "data + some offset"
>          <=== select return one of nullptr or "data + some offset" based on
> "comp"
>    if comp   // original skb_shorter condition
>       ....
>    ...
>       = ip
> 
> In llvm, bpf backend "select" actually inlined "comp" to generate proper
> control flow. Therefore, comp is computed twice like below
>    data = skb->data
>    data_end = skb->data_end
>    if (data + 42 > data_end) {
>       ip = nullptr; goto block1;
>    } else {
>       ip = data + some_offset;
>       goto block2;
>    }
>    ...
>    if (data + 42 > data_end) // original skb_shorter condition
> 
> The issue can be workarounded the source. Just check data + 42 > data_end
> and if failure return. Then you will be able to assign
> a value to "ip" conditionally.

is the change below what you mean? it produces the same code for me:

	diff --git a/tools/testing/selftests/bpf/progs/verifier-cond-repro.c b/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
	index 2f11027d7e67..9c401bd00ab7 100644
	--- a/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
	+++ b/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
	@@ -41,12 +41,10 @@ static INLINE struct iphdr *get_iphdr (struct __sk_buff *skb)
		struct ethhdr *eth;
	 
		if (skb_shorter(skb, ETH_IPV4_UDP_SIZE))
	-		goto out;
	+		return NULL;
	 
		eth = (void *)(long)skb->data;
		ip = (void *)(eth + 1);
	-
	-out:
		return ip;
	 }
	 

I also tried this one:

	diff --git a/tools/testing/selftests/bpf/progs/verifier-cond-repro.c b/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
	index 2f11027d7e67..00ff06fe6fdd 100644
	--- a/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
	+++ b/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
	@@ -57,7 +57,7 @@ int my_prog(struct __sk_buff *skb)
		__u8 proto = 0;
	 
		if (!(ip = get_iphdr(skb)))
	-               goto out;
	+               return -1;
	 
		proto = ip->protocol;

it did just slight change in generated code - added 'w0 = -1'
before the second condition

> 
> Will try to fix this issue in llvm12 as well.
> Thanks!

great, could you please CC me on the changes?

thanks a lot!
jirka


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] bpf: verifier check for dead branch
  2020-08-11  7:14       ` Jiri Olsa
@ 2020-08-11 16:08         ` Yonghong Song
  2020-08-12  7:48           ` Jiri Olsa
  0 siblings, 1 reply; 7+ messages in thread
From: Yonghong Song @ 2020-08-11 16:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, netdev,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh



On 8/11/20 12:14 AM, Jiri Olsa wrote:
> On Mon, Aug 10, 2020 at 10:16:12AM -0700, Yonghong Song wrote:
> 
> SNIP
> 
>>
>> Thanks for the test case. I can reproduce the issue. The following
>> is why this happens in llvm.
>> the pseudo IR code looks like
>>     data = skb->data
>>     data_end = skb->data_end
>>     comp = data + 42 > data_end
>>     ip = select "comp" nullptr "data + some offset"
>>           <=== select return one of nullptr or "data + some offset" based on
>> "comp"
>>     if comp   // original skb_shorter condition
>>        ....
>>     ...
>>        = ip
>>
>> In llvm, bpf backend "select" actually inlined "comp" to generate proper
>> control flow. Therefore, comp is computed twice like below
>>     data = skb->data
>>     data_end = skb->data_end
>>     if (data + 42 > data_end) {
>>        ip = nullptr; goto block1;
>>     } else {
>>        ip = data + some_offset;
>>        goto block2;
>>     }
>>     ...
>>     if (data + 42 > data_end) // original skb_shorter condition
>>
>> The issue can be workarounded the source. Just check data + 42 > data_end
>> and if failure return. Then you will be able to assign
>> a value to "ip" conditionally.

sorry for typo. The above should be "conditionally" -> "unconditionally".

> 
> is the change below what you mean? it produces the same code for me:
> 
> 	diff --git a/tools/testing/selftests/bpf/progs/verifier-cond-repro.c b/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
> 	index 2f11027d7e67..9c401bd00ab7 100644
> 	--- a/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
> 	+++ b/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
> 	@@ -41,12 +41,10 @@ static INLINE struct iphdr *get_iphdr (struct __sk_buff *skb)
> 		struct ethhdr *eth;
> 	
> 		if (skb_shorter(skb, ETH_IPV4_UDP_SIZE))
> 	-		goto out;
> 	+		return NULL;
> 	
> 		eth = (void *)(long)skb->data;
> 		ip = (void *)(eth + 1);
> 	-
> 	-out:
> 		return ip;
> 	 }
> 	
> 
> I also tried this one:
> 
> 	diff --git a/tools/testing/selftests/bpf/progs/verifier-cond-repro.c b/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
> 	index 2f11027d7e67..00ff06fe6fdd 100644
> 	--- a/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
> 	+++ b/tools/testing/selftests/bpf/progs/verifier-cond-repro.c
> 	@@ -57,7 +57,7 @@ int my_prog(struct __sk_buff *skb)
> 		__u8 proto = 0;
> 	
> 		if (!(ip = get_iphdr(skb)))
> 	-               goto out;
> 	+               return -1;
> 	
> 		proto = ip->protocol;
> 
> it did just slight change in generated code - added 'w0 = -1'
> before the second condition

The following is what I mean:

diff --git a/t.c b/t.c
index c6baf28..7bf90dc 100644
--- a/t.c
+++ b/t.c
@@ -37,17 +37,10 @@

  static INLINE struct iphdr *get_iphdr (struct __sk_buff *skb)
  {
-       struct iphdr *ip = NULL;
         struct ethhdr *eth;

-       if (skb_shorter(skb, ETH_IPV4_UDP_SIZE))
-               goto out;
-
         eth = (void *)(long)skb->data;
-       ip = (void *)(eth + 1);
-
-out:
-       return ip;
+       return (void *)(eth + 1);
  }

  int my_prog(struct __sk_buff *skb)
@@ -56,9 +49,10 @@ int my_prog(struct __sk_buff *skb)
         struct udphdr *udp;
         __u8 proto = 0;

-       if (!(ip = get_iphdr(skb)))
+       if (skb_shorter(skb, ETH_IPV4_UDP_SIZE))
                 goto out;

+       ip = get_iphdr(skb);
         proto = ip->protocol;

         if (proto != IPPROTO_UDP)

> 
>>
>> Will try to fix this issue in llvm12 as well.
>> Thanks!
> 
> great, could you please CC me on the changes?

This will be a llvm change. Do you have llvm phabricator login name
https://reviews.llvm.org/
so I can add you as a subscriber?

> 
> thanks a lot!
> jirka
> 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC] bpf: verifier check for dead branch
  2020-08-11 16:08         ` Yonghong Song
@ 2020-08-12  7:48           ` Jiri Olsa
  0 siblings, 0 replies; 7+ messages in thread
From: Jiri Olsa @ 2020-08-12  7:48 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, netdev,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh

On Tue, Aug 11, 2020 at 09:08:13AM -0700, Yonghong Song wrote:
> 
> 
> On 8/11/20 12:14 AM, Jiri Olsa wrote:
> > On Mon, Aug 10, 2020 at 10:16:12AM -0700, Yonghong Song wrote:
> > 
> > SNIP
> > 
> > > 
> > > Thanks for the test case. I can reproduce the issue. The following
> > > is why this happens in llvm.
> > > the pseudo IR code looks like
> > >     data = skb->data
> > >     data_end = skb->data_end
> > >     comp = data + 42 > data_end
> > >     ip = select "comp" nullptr "data + some offset"
> > >           <=== select return one of nullptr or "data + some offset" based on
> > > "comp"
> > >     if comp   // original skb_shorter condition
> > >        ....
> > >     ...
> > >        = ip
> > > 
> > > In llvm, bpf backend "select" actually inlined "comp" to generate proper
> > > control flow. Therefore, comp is computed twice like below
> > >     data = skb->data
> > >     data_end = skb->data_end
> > >     if (data + 42 > data_end) {
> > >        ip = nullptr; goto block1;
> > >     } else {
> > >        ip = data + some_offset;
> > >        goto block2;
> > >     }
> > >     ...
> > >     if (data + 42 > data_end) // original skb_shorter condition
> > > 
> > > The issue can be workarounded the source. Just check data + 42 > data_end
> > > and if failure return. Then you will be able to assign
> > > a value to "ip" conditionally.
> 
> sorry for typo. The above should be "conditionally" -> "unconditionally".

aaah, ok ;-)

> 
> The following is what I mean:
> 
> diff --git a/t.c b/t.c
> index c6baf28..7bf90dc 100644
> --- a/t.c
> +++ b/t.c
> @@ -37,17 +37,10 @@
> 
>  static INLINE struct iphdr *get_iphdr (struct __sk_buff *skb)
>  {
> -       struct iphdr *ip = NULL;
>         struct ethhdr *eth;
> 
> -       if (skb_shorter(skb, ETH_IPV4_UDP_SIZE))
> -               goto out;
> -
>         eth = (void *)(long)skb->data;
> -       ip = (void *)(eth + 1);
> -
> -out:
> -       return ip;
> +       return (void *)(eth + 1);
>  }
> 
>  int my_prog(struct __sk_buff *skb)
> @@ -56,9 +49,10 @@ int my_prog(struct __sk_buff *skb)
>         struct udphdr *udp;
>         __u8 proto = 0;
> 
> -       if (!(ip = get_iphdr(skb)))
> +       if (skb_shorter(skb, ETH_IPV4_UDP_SIZE))
>                 goto out;
> 
> +       ip = get_iphdr(skb);
>         proto = ip->protocol;
> 
>         if (proto != IPPROTO_UDP)
> 
> > 
> > > 
> > > Will try to fix this issue in llvm12 as well.
> > > Thanks!
> > 
> > great, could you please CC me on the changes?
> 
> This will be a llvm change. Do you have llvm phabricator login name
> https://reviews.llvm.org/
> so I can add you as a subscriber?

Jiri (Olsa)
olsajiri@gmail.com

thank,
jirka


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-08-12  7:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-07 17:30 [RFC] bpf: verifier check for dead branch Jiri Olsa
2020-08-10  1:21 ` Yonghong Song
2020-08-10 13:54   ` Jiri Olsa
2020-08-10 17:16     ` Yonghong Song
2020-08-11  7:14       ` Jiri Olsa
2020-08-11 16:08         ` Yonghong Song
2020-08-12  7:48           ` Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).