From mboxrd@z Thu Jan 1 00:00:00 1970 From: Yonghong Song Subject: Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL Date: Tue, 14 Nov 2017 14:58:24 -0800 Message-ID: <0839587a-9520-c844-61a3-01a7a30f0015@fb.com> References: <20171113143047.GH15684@kernel.org> <20171113150820.GA15366@kernel.org> <02c19fc6-b17c-c632-7bf1-bc6746e530f3@iogearbox.net> <20171114125848.GH8836@kernel.org> <20171114134229.GO8836@kernel.org> <32dd93e5-d119-dc9d-1222-b53c31a13b35@fb.com> <0f68d96b-c19d-d154-411b-313acf07d62c@iogearbox.net> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 8bit Cc: Gianluca Borello , Alexei Starovoitov , David Miller , Linux Networking Development Mailing List To: Daniel Borkmann , Arnaldo Carvalho de Melo Return-path: Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:53424 "EHLO mx0b-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751736AbdKNW6w (ORCPT ); Tue, 14 Nov 2017 17:58:52 -0500 In-Reply-To: <0f68d96b-c19d-d154-411b-313acf07d62c@iogearbox.net> Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 11/14/17 12:25 PM, Daniel Borkmann wrote: > On 11/14/2017 07:15 PM, Yonghong Song wrote: >> On 11/14/17 6:19 AM, Daniel Borkmann wrote: >>> On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote: >>>> Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu: >>>>> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote: >>>>>> Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu: >>>>>>> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote: >>>>>>>> libbpf: -- BEGIN DUMP LOG --- >>>>>>>> libbpf: >>>>>>>> 0: (79) r3 = *(u64 *)(r1 +104) >>>>>>>> 1: (b7) r2 = 0 >>>>>>>> 2: (bf) r6 = r1 >>>>>>>> 3: (bf) r1 = r10 >>>>>>>> 4: (07) r1 += -128 >>>>>>>> 5: (b7) r2 = 128 >>>>>>>> 6: (85) call bpf_probe_read_str#45 >>>>>>>> 7: (bf) r1 = r0 >>>>>>>> 8: (07) r1 += -1 >>>>>>>> 9: (67) r1 <<= 32 >>>>>>>> 10: (77) r1 >>= 32 >>>>>>>> 11: (25) if r1 > 0x7f goto pc+11 >>>>>>> >>>>>>> Right, so the compiler is optimizing the two tests into a single one above, >>>>>>> which means lower bound cannot properly be derived again by the verifier due >>>>>>> to this and thus you'll get the error. Similar issue was seen recently [1]. >>>>>>> >>>>>>> Does the below hack work for you? >>>>>>> >>>>>>> int prog([...]) >>>>>>> { >>>>>>>          char filename[128]; >>>>>>>          int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr); >>>>>>>          if (ret > 0) >>>>>>>                  bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, >>>>>>>                                        ret & (sizeof(filename) - 1)); >>>>>>>          return 1; >>>>>>> } >>>>>>> >>>>>>> r0 should keep on tracking bounds here at least: >>>>>>> >>>>>>> prog: >>>>>>>         0:    bf 16 00 00 00 00 00 00     r6 = r1 >>>>>>>         1:    bf a1 00 00 00 00 00 00     r1 = r10 >>>>>>>         2:    07 01 00 00 80 ff ff ff     r1 += -128 >>>>>>>         3:    b7 02 00 00 80 00 00 00     r2 = 128 >>>>>>>         4:    85 00 00 00 2d 00 00 00     call 45 >>>>>>>         5:    67 00 00 00 20 00 00 00     r0 <<= 32 >>>>>>>         6:    c7 00 00 00 20 00 00 00     r0 s>>= 32 >>>>>>>         7:    b7 01 00 00 01 00 00 00     r1 = 1 >>>>>>>         8:    6d 01 0a 00 00 00 00 00     if r1 s> r0 goto 10 >>>>>>>         9:    57 00 00 00 7f 00 00 00     r0 &= 127 >>>>>>>        10:    bf a4 00 00 00 00 00 00     r4 = r10 >>>>>>>        11:    07 04 00 00 80 ff ff ff     r4 += -128 >>>>>>>        12:    bf 61 00 00 00 00 00 00     r1 = r6 >>>>>>>        13:    18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00     r2 = 0ll >>>>>>>        15:    18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00     r3 = 4294967295ll >>>>>>>        17:    bf 05 00 00 00 00 00 00     r5 = r0 >>>>>>>        18:    85 00 00 00 19 00 00 00     call 25 >>>>>>> >>>>>>>    [1] https://urldefense.proofpoint.com/v2/url?u=http-3A__patchwork.ozlabs.org_project_netdev_list_-3Fseries-3D13211&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=z0d6b_hxStA845Kh7epJ-JiFwkiWqUH_z3fEadwqAQY&e= >>>>>> >>>>>> Not yet: >>>>>> >>>>>> 6: (85) call bpf_probe_read_str#45 >>>>>> 7: (bf) r1 = r0 >>>>>> 8: (67) r1 <<= 32 >>>>>> 9: (77) r1 >>= 32 >>>>>> 10: (15) if r1 == 0x0 goto pc+10 >>>>>>   R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0 >>>>>> 11: (57) r0 &= 127 >>>>>> 12: (bf) r4 = r10 >>>>>> 13: (07) r4 += -128 >>>>>> 14: (bf) r1 = r6 >>>>>> 15: (18) r2 = 0xffff92bfc2aba840u >>>>>> 17: (18) r3 = 0xffffffff >>>>>> 19: (bf) r5 = r0 >>>>>> 20: (85) call bpf_perf_event_output#25 >>>>>> invalid stack type R4 off=-128 access_size=0 >>>>>> >>>>>> I'll try updating clang/llvm... >>>>>> >>>>>> Full details: >>>>>> >>>>>> [root@jouet bpf]# cat open.c >>>>>> #include "bpf.h" >>>>>> >>>>>> SEC("prog=do_sys_open filename") >>>>>> int prog(void *ctx, int err, const char __user *filename_ptr) >>>>>> { >>>>>>     char filename[128]; >>>>>>     const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr); >>>>> >>>>> Btw, I was using 'int' here above instead of 'unsigned' as strncpy_from_unsafe() >>>>> could potentially return errors like -EFAULT. >>>> >>>> I changed to int, didn't help >>>> >>>>> Currently having a version compiled from the git tree: >>>>> >>>>> # llc --version >>>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=): >>>>>    LLVM version 6.0.0git-2d810c2 >>>>>    Optimized build. >>>>>    Default target: x86_64-unknown-linux-gnu >>>>>    Host CPU: skylake >>>> >>>> [root@jouet bpf]# llc --version >>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=): >>>>    LLVM version 4.0.0svn >>>> >>>> Old stuff! ;-) Will change, but improving these messages should be on >>>> the radar, I think :-) >>> >>> Yep, agree, I think we need a generic, better solution for this type of >>> issue instead of converting individual helpers to handle 0 min bound and >>> then only bailing out in such case; need to brainstorm a bit on that. >>> >>> I think for the above in your case ... >>> >>>   [...] >>>    6: (85) call bpf_probe_read_str#45 >>>    7: (bf) r1 = r0 >>>    8: (67) r1 <<= 32 >>>    9: (77) r1 >>= 32 >>>   10: (15) if r1 == 0x0 goto pc+10 >>>    R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0 >>>   11: (57) r0 &= 127 >>>   [...] >>> >>> ... the shifts on r1 might be due to using 32 bit type, so if you find >>> a way to avoid these and have the test on r0 directly, we might get there. >>> Perhaps keep using a 64 bit type to avoid them. It would be useful to >>> propagate the deduced bound information back to r0 when we know that >>> neither r0 nor r1 has changed in the meantime. >> >> It is tricky to do in the bpf_program. Compiler tries hard to optimize :-). >> >> The issue is at "r0 &= 127". >> >> 9: (6d) if r1 s> r0 goto pc+10 >>  R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0 >> 10: R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0 >> 10: (57) r0 &= 127 >> 11: R0=inv(id=0,umax_value=127,var_off=(0x0; 0x7f)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0 >> >> One possible solution for this problem is to relax the arg4 type >> from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO. > > Yeah, I know, that's what I mentioned earlier in this thread to resolve it, > but do we really want to add this hack everywhere? :( Potentially any function > having ARG_CONST_SIZE would need to handle size 0 and bail out again in their > helper implementation and it ends up that progs start relying on this runtime > check where we won't be able to get rid of it later on anymore. The compiler actually does the right thing for the below code: int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr); if (ret > 0) bpf_perf_event_output(ctx, &__bpf_stdout__,BPF_F_CURRENT_CPU, filename, ret & (sizeof(filename) - 1)); Just from the above code without consulting bpf_probe_read_str internals, it is totally possible that ret = 128, then ret & (sizeof(filename) - 1) = 0. The issue is that the verifier did not set the "ret" initial range as (-inf, sizeof(filename) - 1). We could have this information associated with helper and feed back to verifier. If we have this range, later for ret & (sizeof(filename) - 1) with ret >= 1, the verifier should be able to conclude ret & (sizeof(filename) - 1) >= 1. To workaround the immediate problem, I tested the following hack with bcc and it works fine. BPF_PERF_OUTPUT(events); int trace(struct pt_regs *ctx) { char filename[128]; int ret = bpf_probe_read_str(filename, sizeof(filename), 0); if (ret > 0) { if (ret == 1) events.perf_submit(ctx, filename, ret); else if (ret < 128) events.perf_submit(ctx, filename, ret); } return 1; } The idea is to make control flow more complex to prevent llvm do certain optimizations. > >> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c >> index a5580c6..a68d8bd 100644 >> --- a/kernel/trace/bpf_trace.c >> +++ b/kernel/trace/bpf_trace.c >> @@ -393,6 +393,9 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map, >>                 }, >>         }; >> >> +       if (unlikely(size == 0)) >> +               return 0; >> + >>         if (unlikely(flags & ~(BPF_F_INDEX_MASK))) >>                 return -EINVAL; >> >> @@ -407,7 +410,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto = { >>         .arg2_type      = ARG_CONST_MAP_PTR, >>         .arg3_type      = ARG_ANYTHING, >>         .arg4_type      = ARG_PTR_TO_MEM, >> -       .arg5_type      = ARG_CONST_SIZE, >> +       .arg5_type      = ARG_CONST_SIZE_OR_ZERO, >>  }; >