netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
@ 2017-11-13 14:30 Arnaldo Carvalho de Melo
  2017-11-13 14:56 ` Daniel Borkmann
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-13 14:30 UTC (permalink / raw)
  To: Gianluca Borello
  Cc: Alexei Starovoitov, Daniel Borkmann, David Miller,
	Linux Networking Development Mailing List

Hi,

	In a5e8c07059d0 ("bpf: add bpf_probe_read_str helper") you
state:

   "This is suboptimal because the size of the string needs to be estimated
    at compile time, causing more memory to be copied than often necessary,
    and can become more problematic if further processing on buf is done,
    for example by pushing it to userspace via bpf_perf_event_output(),
    since the real length of the string is unknown and the entire buffer
    must be copied (and defining an unrolled strnlen() inside the bpf
    program is a very inefficient and unfeasible approach)."

So I went on to try this with 'perf trace' but it isn't working if I use
the return from bpf_probe_read_str(), I must be missing something
here... 

I.e. this works:

[root@jouet bpf]# cat open.c
#include "bpf.h"

SEC("prog=do_sys_open filename")
int prog(void *ctx, int err, const char __user *filename_ptr)
{
	char filename[128];
	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
	perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, 32);
	return 1;
}
[root@jouet bpf]# perf trace -e open,open.c touch /etc/passwd
bpf: builtin compilation failed: -95, try external compiler
     0.000 ( 0.013 ms): touch/14403 open(filename: 0x2ff7ce37, flags: CLOEXEC                             ) ...
     0.013 (         ): __bpf_stdout__:/etc/ld.so.cache..B.................)
     0.015 (         ): perf_bpf_probe:prog:(ffffffffb4260ae0) filename=0x7f7a2ff7ce37)
     0.000 ( 0.021 ms): touch/14403  ... [continued]: open()) = 3
     0.042 ( 0.002 ms): touch/14403 open(filename: 0x30180640, flags: CLOEXEC                             ) ...
     0.044 (         ): __bpf_stdout__:/lib64/libc.so.6.. .......G.........)
     0.045 (         ): perf_bpf_probe:prog:(ffffffffb4260ae0) filename=0x7f7a30180640)
     0.042 ( 0.010 ms): touch/14403  ... [continued]: open()) = 3
     0.301 ( 0.003 ms): touch/14403 open(filename: 0x2fd26c70, flags: CLOEXEC                             ) ...
     0.305 (         ): __bpf_stdout__:/usr/lib/locale/locale-archive......)
     0.306 (         ): perf_bpf_probe:prog:(ffffffffb4260ae0) filename=0x7f7a2fd26c70)
     0.301 ( 0.011 ms): touch/14403  ... [continued]: open()) = 3
     0.360 ( 0.002 ms): touch/14403 open(filename: 0x681f20f3, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) ...
     0.362 (         ): __bpf_stdout__:/etc/passwd....... .......D.........)
     0.363 (         ): perf_bpf_probe:prog:(ffffffffb4260ae0) filename=0x7ffe681f20f3)
     0.360 ( 0.010 ms): touch/14403  ... [continued]: open()) = 3
[root@jouet bpf]#

That bpf.h will set up the maps, etc, its attached if that may be needed
to help figure this out.

But then if I use the return value to push just the string lenght, it
doesn't work:

[root@jouet bpf]# cat open.c
#include "bpf.h"

SEC("prog=do_sys_open filename")
int prog(void *ctx, int err, const char __user *filename_ptr)
{
	char filename[128];
	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
	perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, len);
	return 1;
}
[root@jouet bpf]# perf trace -e open,open.c touch /etc/passwd
bpf: builtin compilation failed: -95, try external compiler
event syntax error: 'open.c'
                     \___ Kernel verifier blocks program loading

(add -v to see detail)
Run 'perf list' for a list of valid events

 Usage: perf trace [<options>] [<command>]
    or: perf trace [<options>] -- <command> [<options>]
    or: perf trace record [<options>] [<command>]
    or: perf trace record [<options>] -- <command> [<options>]

    -e, --event <event>   event/syscall selector. use 'perf list' to list available events
[root@jouet bpf]#

When running this with -v we get the tools/lib/libbpf.c debug that may
help here:

Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:perf_bpf_probe/prog _text+2493152 filename=%si:x64
In map_prologue, ntevs=1
mapping[0]=0
libbpf: create map __bpf_stdout__: fd=3
prologue: pass validation
prologue: fast path
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf: 
0: (79) r3 = *(u64 *)(r1 +104)
1: (b7) r2 = 0
2: (bf) r6 = r1
3: (bf) r7 = r10
4: (07) r7 += -128
5: (bf) r1 = r7
6: (b7) r2 = 128
7: (85) call bpf_probe_read_str#45
8: (bf) r8 = r0
9: (67) r8 <<= 32
10: (77) r8 >>= 32
11: (85) call bpf_get_smp_processor_id#8
12: (bf) r1 = r6
13: (18) r2 = 0xffffa0b5958e16c0
15: (bf) r3 = r0
16: (bf) r4 = r7
17: (bf) r5 = r8
18: (85) call bpf_perf_event_output#25
invalid stack type R4 off=-128 access_size=0

libbpf: -- END LOG --
libbpf: Loading the 0th instance of program 'prog=do_sys_open filename' failed
libbpf: failed to load program 'prog=do_sys_open filename'
libbpf: failed to load object 'open.c'
bpf: load objects failed
event syntax error: 'open.c'
                     \___ Kernel verifier blocks program loading

I tried adding checks for len to try to somehow make sure its all bounds
checked, but couldn't get past that "invalid stack type R4", the problem seems
to be that access_size=0...

Ideas?

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-13 14:30 len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL Arnaldo Carvalho de Melo
@ 2017-11-13 14:56 ` Daniel Borkmann
  2017-11-13 15:08   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2017-11-13 14:56 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Gianluca Borello
  Cc: Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List

On 11/13/2017 03:30 PM, Arnaldo Carvalho de Melo wrote:
> Hi,
> 
> 	In a5e8c07059d0 ("bpf: add bpf_probe_read_str helper") you
> state:
> 
>    "This is suboptimal because the size of the string needs to be estimated
>     at compile time, causing more memory to be copied than often necessary,
>     and can become more problematic if further processing on buf is done,
>     for example by pushing it to userspace via bpf_perf_event_output(),
>     since the real length of the string is unknown and the entire buffer
>     must be copied (and defining an unrolled strnlen() inside the bpf
>     program is a very inefficient and unfeasible approach)."
> 
> So I went on to try this with 'perf trace' but it isn't working if I use
> the return from bpf_probe_read_str(), I must be missing something
> here... 
> 
> I.e. this works:
> 
> [root@jouet bpf]# cat open.c
> #include "bpf.h"
> 
> SEC("prog=do_sys_open filename")
> int prog(void *ctx, int err, const char __user *filename_ptr)
> {
> 	char filename[128];
> 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> 	perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, 32);

By the way, you can just use BPF_F_CURRENT_CPU flag instead of the helper
call get_smp_processor_id() to get current CPU.

> 	return 1;
> }
> [root@jouet bpf]# perf trace -e open,open.c touch /etc/passwd
> bpf: builtin compilation failed: -95, try external compiler
>      0.000 ( 0.013 ms): touch/14403 open(filename: 0x2ff7ce37, flags: CLOEXEC                             ) ...
>      0.013 (         ): __bpf_stdout__:/etc/ld.so.cache..B.................)
>      0.015 (         ): perf_bpf_probe:prog:(ffffffffb4260ae0) filename=0x7f7a2ff7ce37)
>      0.000 ( 0.021 ms): touch/14403  ... [continued]: open()) = 3
>      0.042 ( 0.002 ms): touch/14403 open(filename: 0x30180640, flags: CLOEXEC                             ) ...
>      0.044 (         ): __bpf_stdout__:/lib64/libc.so.6.. .......G.........)
>      0.045 (         ): perf_bpf_probe:prog:(ffffffffb4260ae0) filename=0x7f7a30180640)
>      0.042 ( 0.010 ms): touch/14403  ... [continued]: open()) = 3
>      0.301 ( 0.003 ms): touch/14403 open(filename: 0x2fd26c70, flags: CLOEXEC                             ) ...
>      0.305 (         ): __bpf_stdout__:/usr/lib/locale/locale-archive......)
>      0.306 (         ): perf_bpf_probe:prog:(ffffffffb4260ae0) filename=0x7f7a2fd26c70)
>      0.301 ( 0.011 ms): touch/14403  ... [continued]: open()) = 3
>      0.360 ( 0.002 ms): touch/14403 open(filename: 0x681f20f3, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) ...
>      0.362 (         ): __bpf_stdout__:/etc/passwd....... .......D.........)
>      0.363 (         ): perf_bpf_probe:prog:(ffffffffb4260ae0) filename=0x7ffe681f20f3)
>      0.360 ( 0.010 ms): touch/14403  ... [continued]: open()) = 3
> [root@jouet bpf]#
> 
> That bpf.h will set up the maps, etc, its attached if that may be needed
> to help figure this out.
> 
> But then if I use the return value to push just the string lenght, it
> doesn't work:
> 
> [root@jouet bpf]# cat open.c
> #include "bpf.h"
> 
> SEC("prog=do_sys_open filename")
> int prog(void *ctx, int err, const char __user *filename_ptr)
> {
> 	char filename[128];
> 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> 	perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, len);

The below issue 'invalid stack type R4 off=-128 access_size=0' is basically that
unsigned len is unknown at verification time, thus unbounded. Can you try the
following to see if that passes?

if (len > 0 && len <= sizeof(filename))
    perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, len);

> 	return 1;
> }
> [root@jouet bpf]# perf trace -e open,open.c touch /etc/passwd
> bpf: builtin compilation failed: -95, try external compiler
> event syntax error: 'open.c'
>                      \___ Kernel verifier blocks program loading
> 
> (add -v to see detail)
> Run 'perf list' for a list of valid events
> 
>  Usage: perf trace [<options>] [<command>]
>     or: perf trace [<options>] -- <command> [<options>]
>     or: perf trace record [<options>] [<command>]
>     or: perf trace record [<options>] -- <command> [<options>]
> 
>     -e, --event <event>   event/syscall selector. use 'perf list' to list available events
> [root@jouet bpf]#
> 
> When running this with -v we get the tools/lib/libbpf.c debug that may
> help here:
> 
> Opening /sys/kernel/debug/tracing//kprobe_events write=1
> Writing event: p:perf_bpf_probe/prog _text+2493152 filename=%si:x64
> In map_prologue, ntevs=1
> mapping[0]=0
> libbpf: create map __bpf_stdout__: fd=3
> prologue: pass validation
> prologue: fast path
> libbpf: load bpf program failed: Permission denied
> libbpf: -- BEGIN DUMP LOG ---
> libbpf: 
> 0: (79) r3 = *(u64 *)(r1 +104)
> 1: (b7) r2 = 0
> 2: (bf) r6 = r1
> 3: (bf) r7 = r10
> 4: (07) r7 += -128
> 5: (bf) r1 = r7
> 6: (b7) r2 = 128
> 7: (85) call bpf_probe_read_str#45
> 8: (bf) r8 = r0
> 9: (67) r8 <<= 32
> 10: (77) r8 >>= 32
> 11: (85) call bpf_get_smp_processor_id#8
> 12: (bf) r1 = r6
> 13: (18) r2 = 0xffffa0b5958e16c0
> 15: (bf) r3 = r0
> 16: (bf) r4 = r7
> 17: (bf) r5 = r8
> 18: (85) call bpf_perf_event_output#25
> invalid stack type R4 off=-128 access_size=0
> 
> libbpf: -- END LOG --
> libbpf: Loading the 0th instance of program 'prog=do_sys_open filename' failed
> libbpf: failed to load program 'prog=do_sys_open filename'
> libbpf: failed to load object 'open.c'
> bpf: load objects failed
> event syntax error: 'open.c'
>                      \___ Kernel verifier blocks program loading
> 
> I tried adding checks for len to try to somehow make sure its all bounds
> checked, but couldn't get past that "invalid stack type R4", the problem seems
> to be that access_size=0...
> 
> Ideas?
> 
> - Arnaldo
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-13 14:56 ` Daniel Borkmann
@ 2017-11-13 15:08   ` Arnaldo Carvalho de Melo
  2017-11-14  0:09     ` Daniel Borkmann
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-13 15:08 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Gianluca Borello, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List

[-- Attachment #1: Type: text/plain, Size: 8146 bytes --]

Em Mon, Nov 13, 2017 at 03:56:14PM +0100, Daniel Borkmann escreveu:
> On 11/13/2017 03:30 PM, Arnaldo Carvalho de Melo wrote:
> > Hi,
> > 
> > 	In a5e8c07059d0 ("bpf: add bpf_probe_read_str helper") you
> > state:
> > 
> >    "This is suboptimal because the size of the string needs to be estimated
> >     at compile time, causing more memory to be copied than often necessary,
> >     and can become more problematic if further processing on buf is done,
> >     for example by pushing it to userspace via bpf_perf_event_output(),
> >     since the real length of the string is unknown and the entire buffer
> >     must be copied (and defining an unrolled strnlen() inside the bpf
> >     program is a very inefficient and unfeasible approach)."
> > 
> > So I went on to try this with 'perf trace' but it isn't working if I use
> > the return from bpf_probe_read_str(), I must be missing something
> > here... 
> > 
> > I.e. this works:
> > 
> > [root@jouet bpf]# cat open.c
> > #include "bpf.h"
> > 
> > SEC("prog=do_sys_open filename")
> > int prog(void *ctx, int err, const char __user *filename_ptr)
> > {
> > 	char filename[128];
> > 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> > 	perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, 32);
> 
> By the way, you can just use BPF_F_CURRENT_CPU flag instead of the helper
> call get_smp_processor_id() to get current CPU.

Thanks, switched to it.

> > But then if I use the return value to push just the string lenght, it
> > doesn't work:
> > 
> > [root@jouet bpf]# cat open.c
> > #include "bpf.h"
> > 
> > SEC("prog=do_sys_open filename")
> > int prog(void *ctx, int err, const char __user *filename_ptr)
> > {
> > 	char filename[128];
> > 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> > 	perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, len);
> 
> The below issue 'invalid stack type R4 off=-128 access_size=0' is basically that
> unsigned len is unknown at verification time, thus unbounded. Can you try the
> following to see if that passes?
> 
> if (len > 0 && len <= sizeof(filename))
>     perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, len);

I had it like:

	if (len > 0 && len < 32)

And it didn't helped, now I did exactly as you suggested:

[root@jouet bpf]# cat open.c
#include "bpf.h"

SEC("prog=do_sys_open filename")
int prog(void *ctx, int err, const char __user *filename_ptr)
{
	char filename[128];
	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
	if (len > 0 && len <= sizeof(filename))
		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
	return 1;
}
[root@jouet bpf]# trace -e open,open.c touch /etc/passwd
bpf: builtin compilation failed: -95, try external compiler
event syntax error: 'open.c'
                     \___ Kernel verifier blocks program loading
<SNIP>
[root@jouet bpf]# 

The -v output looks the same:

\x02[root@jouet bpf]# trace -v -e open,open.c touch /etc/passwd
bpf: builtin compilation failed: -95, try external compiler
Kernel build dir is set to /lib/modules/4.14.0-rc6+/build
set env: KBUILD_DIR=/lib/modules/4.14.0-rc6+/build
unset env: KBUILD_OPTS
include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x40e00
set env: CLANG_EXEC=/usr/local/bin/clang
unset env: CLANG_OPTIONS
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
set env: WORKING_DIR=/lib/modules/4.14.0-rc6+/build
set env: CLANG_SOURCE=/home/acme/bpf/open.c
llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
libbpf: loading object 'open.c' from buffer
libbpf: section .strtab, size 103, link 0, flags 0, type=3
libbpf: section .text, size 0, link 0, flags 6, type=1
libbpf: section prog=do_sys_open filename, size 184, link 0, flags 6, type=1
libbpf: found program prog=do_sys_open filename
libbpf: section .relprog=do_sys_open filename, size 16, link 8, flags 0, type=9
libbpf: section maps, size 16, link 0, flags 3, type=1
libbpf: section license, size 4, link 0, flags 3, type=1
libbpf: license of open.c is GPL
libbpf: section version, size 4, link 0, flags 3, type=1
libbpf: kernel version of open.c is 40e00
libbpf: section .symtab, size 144, link 1, flags 0, type=2
libbpf: maps in open.c: 1 maps in 16 bytes
libbpf: map 0 is "__bpf_stdout__"
libbpf: collecting relocating info for: 'prog=do_sys_open filename'
libbpf: relocation: insn_idx=15
libbpf: relocation: find map 0 (__bpf_stdout__) for insn 15
bpf: config program 'prog=do_sys_open filename'
symbol:do_sys_open file:(null) line:0 offset:0 return:0 lazy:(null)
parsing arg: filename into filename
bpf: config 'prog=do_sys_open filename' is ok
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.14.0-rc6+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.14.0-rc6+/build/vmlinux
Try to find probe point from debuginfo.
Matched function: do_sys_open [2a2bbbe]
Probe point found: do_sys_open+0
Searching 'filename' variable in context.
Converting variable filename into trace event.
filename type is (null).
Opening /sys/kernel/debug/tracing//README write=0
Found 1 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:perf_bpf_probe/prog _text+2493152 filename=%si:x64
In map_prologue, ntevs=1
mapping[0]=0
libbpf: create map __bpf_stdout__: fd=3
prologue: pass validation
prologue: fast path
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf: 
0: (79) r3 = *(u64 *)(r1 +104)
1: (b7) r2 = 0
2: (bf) r6 = r1
3: (bf) r1 = r10
4: (07) r1 += -128
5: (b7) r2 = 128
6: (85) call bpf_probe_read_str#45
7: (bf) r1 = r0
8: (07) r1 += -1
9: (67) r1 <<= 32
10: (77) r1 >>= 32
11: (25) if r1 > 0x7f goto pc+11
 R0=inv(id=0) R1=inv(id=0,umax_value=127,var_off=(0x0; 0x7f)) R6=ctx(id=0,off=0,imm=0) R10=fp0
12: (67) r0 <<= 32
13: (77) r0 >>= 32
14: (bf) r4 = r10
15: (07) r4 += -128
16: (bf) r1 = r6
17: (18) r2 = 0xffffa0b74ba91000
19: (18) r3 = 0xffffffff
21: (bf) r5 = r0
22: (85) call bpf_perf_event_output#25
invalid stack type R4 off=-128 access_size=0

libbpf: -- END LOG --
libbpf: Loading the 0th instance of program 'prog=do_sys_open filename' failed
libbpf: failed to load program 'prog=do_sys_open filename'
libbpf: failed to load object 'open.c'
bpf: load objects failed
event syntax error: 'open.c'
                     \___ Kernel verifier blocks program loading

Also:

[root@jouet bpf]# clang -v
clang version 4.0.0 (http://llvm.org/git/clang.git f5be8ba13adc4ba1011a7ccd60c844bd60427c1c) (http://llvm.org/git/llvm.git efca1a37676f4cd276d947658cf90b0fb625abfd)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/7
Selected GCC installation: /usr/lib/gcc/x86_64-redhat-linux/7
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
[root@jouet bpf]#

And now I've really attached that bpf.h header I use.

- Arnaldo

[-- Attachment #2: bpf.h --]
[-- Type: text/plain, Size: 1094 bytes --]

#include <uapi/linux/bpf.h>
struct bpf_map_def {
       unsigned int type;
       unsigned int key_size;
       unsigned int value_size;
       unsigned int max_entries;
};
#define SEC(NAME) __attribute__((section(NAME), used))
static int (*get_smp_processor_id)(void) =
       (void *)BPF_FUNC_get_smp_processor_id;
static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
       (void *)BPF_FUNC_perf_event_output;
static int (*bpf_probe_read_str)(void *dst, int size, const void *unsafe_addr) =
       (void *)BPF_FUNC_probe_read_str;
static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
	(void *)BPF_FUNC_trace_printk;

struct bpf_map_def SEC("maps") __bpf_stdout__ = {
       .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
       .key_size = sizeof(int),
       .value_size = sizeof(u32),
       .max_entries = __NR_CPUS__,
};

#define printf(msg) \
	({ char str[] = msg; perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), &str, sizeof(str)); })

char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-13 15:08   ` Arnaldo Carvalho de Melo
@ 2017-11-14  0:09     ` Daniel Borkmann
  2017-11-14 12:58       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2017-11-14  0:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Gianluca Borello, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List, yhs

On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
> Em Mon, Nov 13, 2017 at 03:56:14PM +0100, Daniel Borkmann escreveu:
>> On 11/13/2017 03:30 PM, Arnaldo Carvalho de Melo wrote:
>>> Hi,
>>>
>>> 	In a5e8c07059d0 ("bpf: add bpf_probe_read_str helper") you
>>> state:
>>>
>>>    "This is suboptimal because the size of the string needs to be estimated
>>>     at compile time, causing more memory to be copied than often necessary,
>>>     and can become more problematic if further processing on buf is done,
>>>     for example by pushing it to userspace via bpf_perf_event_output(),
>>>     since the real length of the string is unknown and the entire buffer
>>>     must be copied (and defining an unrolled strnlen() inside the bpf
>>>     program is a very inefficient and unfeasible approach)."
>>>
>>> So I went on to try this with 'perf trace' but it isn't working if I use
>>> the return from bpf_probe_read_str(), I must be missing something
>>> here... 
>>>
>>> I.e. this works:
>>>
>>> [root@jouet bpf]# cat open.c
>>> #include "bpf.h"
>>>
>>> SEC("prog=do_sys_open filename")
>>> int prog(void *ctx, int err, const char __user *filename_ptr)
>>> {
>>> 	char filename[128];
>>> 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>> 	perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, 32);
>>
>> By the way, you can just use BPF_F_CURRENT_CPU flag instead of the helper
>> call get_smp_processor_id() to get current CPU.
> 
> Thanks, switched to it.
> 
>>> But then if I use the return value to push just the string lenght, it
>>> doesn't work:
>>>
>>> [root@jouet bpf]# cat open.c
>>> #include "bpf.h"
>>>
>>> SEC("prog=do_sys_open filename")
>>> int prog(void *ctx, int err, const char __user *filename_ptr)
>>> {
>>> 	char filename[128];
>>> 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>> 	perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, len);
>>
>> The below issue 'invalid stack type R4 off=-128 access_size=0' is basically that
>> unsigned len is unknown at verification time, thus unbounded. Can you try the
>> following to see if that passes?
>>
>> if (len > 0 && len <= sizeof(filename))
>>     perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, len);
> 
> I had it like:
> 
> 	if (len > 0 && len < 32)
> 
> And it didn't helped, now I did exactly as you suggested:
> 
> [root@jouet bpf]# cat open.c
> #include "bpf.h"
> 
> SEC("prog=do_sys_open filename")
> int prog(void *ctx, int err, const char __user *filename_ptr)
> {
> 	char filename[128];
> 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> 	if (len > 0 && len <= sizeof(filename))
> 		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
> 	return 1;
> }
> [root@jouet bpf]# trace -e open,open.c touch /etc/passwd
> bpf: builtin compilation failed: -95, try external compiler
> event syntax error: 'open.c'
>                      \___ Kernel verifier blocks program loading
> <SNIP>
> [root@jouet bpf]# 
> 
> The -v output looks the same:
> 
> \x02[root@jouet bpf]# trace -v -e open,open.c touch /etc/passwd
> bpf: builtin compilation failed: -95, try external compiler
> Kernel build dir is set to /lib/modules/4.14.0-rc6+/build
> set env: KBUILD_DIR=/lib/modules/4.14.0-rc6+/build
> unset env: KBUILD_OPTS
> include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
> set env: NR_CPUS=4
> set env: LINUX_VERSION_CODE=0x40e00
> set env: CLANG_EXEC=/usr/local/bin/clang
> unset env: CLANG_OPTIONS
> set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
> set env: WORKING_DIR=/lib/modules/4.14.0-rc6+/build
> set env: CLANG_SOURCE=/home/acme/bpf/open.c
> llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
> libbpf: loading object 'open.c' from buffer
> libbpf: section .strtab, size 103, link 0, flags 0, type=3
> libbpf: section .text, size 0, link 0, flags 6, type=1
> libbpf: section prog=do_sys_open filename, size 184, link 0, flags 6, type=1
> libbpf: found program prog=do_sys_open filename
> libbpf: section .relprog=do_sys_open filename, size 16, link 8, flags 0, type=9
> libbpf: section maps, size 16, link 0, flags 3, type=1
> libbpf: section license, size 4, link 0, flags 3, type=1
> libbpf: license of open.c is GPL
> libbpf: section version, size 4, link 0, flags 3, type=1
> libbpf: kernel version of open.c is 40e00
> libbpf: section .symtab, size 144, link 1, flags 0, type=2
> libbpf: maps in open.c: 1 maps in 16 bytes
> libbpf: map 0 is "__bpf_stdout__"
> libbpf: collecting relocating info for: 'prog=do_sys_open filename'
> libbpf: relocation: insn_idx=15
> libbpf: relocation: find map 0 (__bpf_stdout__) for insn 15
> bpf: config program 'prog=do_sys_open filename'
> symbol:do_sys_open file:(null) line:0 offset:0 return:0 lazy:(null)
> parsing arg: filename into filename
> bpf: config 'prog=do_sys_open filename' is ok
> Looking at the vmlinux_path (8 entries long)
> Using /lib/modules/4.14.0-rc6+/build/vmlinux for symbols
> Open Debuginfo file: /lib/modules/4.14.0-rc6+/build/vmlinux
> Try to find probe point from debuginfo.
> Matched function: do_sys_open [2a2bbbe]
> Probe point found: do_sys_open+0
> Searching 'filename' variable in context.
> Converting variable filename into trace event.
> filename type is (null).
> Opening /sys/kernel/debug/tracing//README write=0
> Found 1 probe_trace_events.
> Opening /sys/kernel/debug/tracing//kprobe_events write=1
> Writing event: p:perf_bpf_probe/prog _text+2493152 filename=%si:x64
> In map_prologue, ntevs=1
> mapping[0]=0
> libbpf: create map __bpf_stdout__: fd=3
> prologue: pass validation
> prologue: fast path
> libbpf: load bpf program failed: Permission denied
> libbpf: -- BEGIN DUMP LOG ---
> libbpf: 
> 0: (79) r3 = *(u64 *)(r1 +104)
> 1: (b7) r2 = 0
> 2: (bf) r6 = r1
> 3: (bf) r1 = r10
> 4: (07) r1 += -128
> 5: (b7) r2 = 128
> 6: (85) call bpf_probe_read_str#45
> 7: (bf) r1 = r0
> 8: (07) r1 += -1
> 9: (67) r1 <<= 32
> 10: (77) r1 >>= 32
> 11: (25) if r1 > 0x7f goto pc+11

Right, so the compiler is optimizing the two tests into a single one above,
which means lower bound cannot properly be derived again by the verifier due
to this and thus you'll get the error. Similar issue was seen recently [1].

Does the below hack work for you?

int prog([...])
{
        char filename[128];
        int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
        if (ret > 0)
                bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
                                      ret & (sizeof(filename) - 1));
        return 1;
}

r0 should keep on tracking bounds here at least:

prog:
       0:	bf 16 00 00 00 00 00 00 	r6 = r1
       1:	bf a1 00 00 00 00 00 00 	r1 = r10
       2:	07 01 00 00 80 ff ff ff 	r1 += -128
       3:	b7 02 00 00 80 00 00 00 	r2 = 128
       4:	85 00 00 00 2d 00 00 00 	call 45
       5:	67 00 00 00 20 00 00 00 	r0 <<= 32
       6:	c7 00 00 00 20 00 00 00 	r0 s>>= 32
       7:	b7 01 00 00 01 00 00 00 	r1 = 1
       8:	6d 01 0a 00 00 00 00 00 	if r1 s> r0 goto 10
       9:	57 00 00 00 7f 00 00 00 	r0 &= 127
      10:	bf a4 00 00 00 00 00 00 	r4 = r10
      11:	07 04 00 00 80 ff ff ff 	r4 += -128
      12:	bf 61 00 00 00 00 00 00 	r1 = r6
      13:	18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 	r2 = 0ll
      15:	18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 	r3 = 4294967295ll
      17:	bf 05 00 00 00 00 00 00 	r5 = r0
      18:	85 00 00 00 19 00 00 00 	call 25

  [1] http://patchwork.ozlabs.org/project/netdev/list/?series=13211

>  R0=inv(id=0) R1=inv(id=0,umax_value=127,var_off=(0x0; 0x7f)) R6=ctx(id=0,off=0,imm=0) R10=fp0
> 12: (67) r0 <<= 32
> 13: (77) r0 >>= 32
> 14: (bf) r4 = r10
> 15: (07) r4 += -128
> 16: (bf) r1 = r6
> 17: (18) r2 = 0xffffa0b74ba91000
> 19: (18) r3 = 0xffffffff
> 21: (bf) r5 = r0
> 22: (85) call bpf_perf_event_output#25
> invalid stack type R4 off=-128 access_size=0
> 
> libbpf: -- END LOG --
> libbpf: Loading the 0th instance of program 'prog=do_sys_open filename' failed
> libbpf: failed to load program 'prog=do_sys_open filename'
> libbpf: failed to load object 'open.c'
> bpf: load objects failed
> event syntax error: 'open.c'
>                      \___ Kernel verifier blocks program loading
> 
> Also:
> 
> [root@jouet bpf]# clang -v
> clang version 4.0.0 (http://llvm.org/git/clang.git f5be8ba13adc4ba1011a7ccd60c844bd60427c1c) (http://llvm.org/git/llvm.git efca1a37676f4cd276d947658cf90b0fb625abfd)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
> InstalledDir: /usr/local/bin
> Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/7
> Selected GCC installation: /usr/lib/gcc/x86_64-redhat-linux/7
> Candidate multilib: .;@m64
> Candidate multilib: 32;@m32
> Selected multilib: .;@m64
> [root@jouet bpf]#
> 
> And now I've really attached that bpf.h header I use.
> 
> - Arnaldo
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-14  0:09     ` Daniel Borkmann
@ 2017-11-14 12:58       ` Arnaldo Carvalho de Melo
  2017-11-14 13:09         ` Daniel Borkmann
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-14 12:58 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Gianluca Borello, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List, yhs

Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu:
> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
> > libbpf: -- BEGIN DUMP LOG ---
> > libbpf: 
> > 0: (79) r3 = *(u64 *)(r1 +104)
> > 1: (b7) r2 = 0
> > 2: (bf) r6 = r1
> > 3: (bf) r1 = r10
> > 4: (07) r1 += -128
> > 5: (b7) r2 = 128
> > 6: (85) call bpf_probe_read_str#45
> > 7: (bf) r1 = r0
> > 8: (07) r1 += -1
> > 9: (67) r1 <<= 32
> > 10: (77) r1 >>= 32
> > 11: (25) if r1 > 0x7f goto pc+11
> 
> Right, so the compiler is optimizing the two tests into a single one above,
> which means lower bound cannot properly be derived again by the verifier due
> to this and thus you'll get the error. Similar issue was seen recently [1].
> 
> Does the below hack work for you?
> 
> int prog([...])
> {
>         char filename[128];
>         int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>         if (ret > 0)
>                 bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
>                                       ret & (sizeof(filename) - 1));
>         return 1;
> }
> 
> r0 should keep on tracking bounds here at least:
> 
> prog:
>        0:	bf 16 00 00 00 00 00 00 	r6 = r1
>        1:	bf a1 00 00 00 00 00 00 	r1 = r10
>        2:	07 01 00 00 80 ff ff ff 	r1 += -128
>        3:	b7 02 00 00 80 00 00 00 	r2 = 128
>        4:	85 00 00 00 2d 00 00 00 	call 45
>        5:	67 00 00 00 20 00 00 00 	r0 <<= 32
>        6:	c7 00 00 00 20 00 00 00 	r0 s>>= 32
>        7:	b7 01 00 00 01 00 00 00 	r1 = 1
>        8:	6d 01 0a 00 00 00 00 00 	if r1 s> r0 goto 10
>        9:	57 00 00 00 7f 00 00 00 	r0 &= 127
>       10:	bf a4 00 00 00 00 00 00 	r4 = r10
>       11:	07 04 00 00 80 ff ff ff 	r4 += -128
>       12:	bf 61 00 00 00 00 00 00 	r1 = r6
>       13:	18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 	r2 = 0ll
>       15:	18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 	r3 = 4294967295ll
>       17:	bf 05 00 00 00 00 00 00 	r5 = r0
>       18:	85 00 00 00 19 00 00 00 	call 25
> 
>   [1] http://patchwork.ozlabs.org/project/netdev/list/?series=13211

Not yet:

6: (85) call bpf_probe_read_str#45
7: (bf) r1 = r0
8: (67) r1 <<= 32
9: (77) r1 >>= 32
10: (15) if r1 == 0x0 goto pc+10
 R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
11: (57) r0 &= 127
12: (bf) r4 = r10
13: (07) r4 += -128
14: (bf) r1 = r6
15: (18) r2 = 0xffff92bfc2aba840
17: (18) r3 = 0xffffffff
19: (bf) r5 = r0
20: (85) call bpf_perf_event_output#25
invalid stack type R4 off=-128 access_size=0

I'll try updating clang/llvm...

Full details:

[root@jouet bpf]# cat open.c 
#include "bpf.h"

SEC("prog=do_sys_open filename")
int prog(void *ctx, int err, const char __user *filename_ptr)
{
	char filename[128];
	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
	if (len > 0)
       		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
				  len & (sizeof(filename) - 1));
	return 1;
}
[root@jouet bpf]# perf trace -v -e *open,open.c  usleep 2
bpf: builtin compilation failed: -95, try external compiler
Kernel build dir is set to /lib/modules/4.14.0+/build
set env: KBUILD_DIR=/lib/modules/4.14.0+/build
unset env: KBUILD_OPTS
include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x40e00
set env: CLANG_EXEC=/usr/local/bin/clang
unset env: CLANG_OPTIONS
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
set env: WORKING_DIR=/lib/modules/4.14.0+/build
set env: CLANG_SOURCE=/home/acme/bpf/open.c
llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
libbpf: loading object 'open.c' from buffer
libbpf: section .strtab, size 103, link 0, flags 0, type=3
libbpf: section .text, size 0, link 0, flags 6, type=1
libbpf: section prog=do_sys_open filename, size 168, link 0, flags 6, type=1
libbpf: found program prog=do_sys_open filename
libbpf: section .relprog=do_sys_open filename, size 16, link 8, flags 0, type=9
libbpf: section maps, size 16, link 0, flags 3, type=1
libbpf: section license, size 4, link 0, flags 3, type=1
libbpf: license of open.c is GPL
libbpf: section version, size 4, link 0, flags 3, type=1
libbpf: kernel version of open.c is 40e00
libbpf: section .symtab, size 144, link 1, flags 0, type=2
libbpf: maps in open.c: 1 maps in 16 bytes
libbpf: map 0 is "__bpf_stdout__"
libbpf: collecting relocating info for: 'prog=do_sys_open filename'
libbpf: relocation: insn_idx=13
libbpf: relocation: find map 0 (__bpf_stdout__) for insn 13
bpf: config program 'prog=do_sys_open filename'
symbol:do_sys_open file:(null) line:0 offset:0 return:0 lazy:(null)
parsing arg: filename into filename
bpf: config 'prog=do_sys_open filename' is ok
Failed to open cache(-1): /root/.debug/[kernel.kallsyms]/b5e25bb34abb3ddefa85250978beb530cc2eebd5/probes
Cache open error: -1
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.14.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.14.0+/build/vmlinux
Try to find probe point from debuginfo.
Matched function: do_sys_open [2a2e5db]
Probe point found: do_sys_open+0
Searching 'filename' variable in context.
Converting variable filename into trace event.
filename type is (null).
Opening /sys/kernel/debug/tracing//README write=0
Found 1 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:perf_bpf_probe/prog _text+2493840 filename=%si:x64
In map_prologue, ntevs=1
mapping[0]=0
libbpf: create map __bpf_stdout__: fd=3
prologue: pass validation
prologue: fast path
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf: 
0: (79) r3 = *(u64 *)(r1 +104)
1: (b7) r2 = 0
2: (bf) r6 = r1
3: (bf) r1 = r10
4: (07) r1 += -128
5: (b7) r2 = 128
6: (85) call bpf_probe_read_str#45
7: (bf) r1 = r0
8: (67) r1 <<= 32
9: (77) r1 >>= 32
10: (15) if r1 == 0x0 goto pc+10
 R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
11: (57) r0 &= 127
12: (bf) r4 = r10
13: (07) r4 += -128
14: (bf) r1 = r6
15: (18) r2 = 0xffff92bfc2aba840
17: (18) r3 = 0xffffffff
19: (bf) r5 = r0
20: (85) call bpf_perf_event_output#25
invalid stack type R4 off=-128 access_size=0

libbpf: -- END LOG --
libbpf: Loading the 0th instance of program 'prog=do_sys_open filename' failed
libbpf: failed to load program 'prog=do_sys_open filename'
libbpf: failed to load object 'open.c'
bpf: load objects failed
event syntax error: 'open.c'
                     \___ Kernel verifier blocks program loading

(add -v to see detail)
Run 'perf list' for a list of valid events

 Usage: perf trace [<options>] [<command>]
    or: perf trace [<options>] -- <command> [<options>]
    or: perf trace record [<options>] [<command>]
    or: perf trace record [<options>] -- <command> [<options>]

    -e, --event <event>   event/syscall selector. use 'perf list' to list available events
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Opening /sys/kernel/debug/tracing//uprobe_events write=1
Parsing probe_events: p:perf_bpf_probe/prog _text+2493840 filename=%si:x64
Group:perf_bpf_probe Event:prog probe:p
Writing event: -:perf_bpf_probe/prog
[root@jouet bpf]#

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-14 12:58       ` Arnaldo Carvalho de Melo
@ 2017-11-14 13:09         ` Daniel Borkmann
  2017-11-14 13:42           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2017-11-14 13:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Gianluca Borello, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List, yhs

On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu:
>> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
>>> libbpf: -- BEGIN DUMP LOG ---
>>> libbpf: 
>>> 0: (79) r3 = *(u64 *)(r1 +104)
>>> 1: (b7) r2 = 0
>>> 2: (bf) r6 = r1
>>> 3: (bf) r1 = r10
>>> 4: (07) r1 += -128
>>> 5: (b7) r2 = 128
>>> 6: (85) call bpf_probe_read_str#45
>>> 7: (bf) r1 = r0
>>> 8: (07) r1 += -1
>>> 9: (67) r1 <<= 32
>>> 10: (77) r1 >>= 32
>>> 11: (25) if r1 > 0x7f goto pc+11
>>
>> Right, so the compiler is optimizing the two tests into a single one above,
>> which means lower bound cannot properly be derived again by the verifier due
>> to this and thus you'll get the error. Similar issue was seen recently [1].
>>
>> Does the below hack work for you?
>>
>> int prog([...])
>> {
>>         char filename[128];
>>         int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>         if (ret > 0)
>>                 bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
>>                                       ret & (sizeof(filename) - 1));
>>         return 1;
>> }
>>
>> r0 should keep on tracking bounds here at least:
>>
>> prog:
>>        0:	bf 16 00 00 00 00 00 00 	r6 = r1
>>        1:	bf a1 00 00 00 00 00 00 	r1 = r10
>>        2:	07 01 00 00 80 ff ff ff 	r1 += -128
>>        3:	b7 02 00 00 80 00 00 00 	r2 = 128
>>        4:	85 00 00 00 2d 00 00 00 	call 45
>>        5:	67 00 00 00 20 00 00 00 	r0 <<= 32
>>        6:	c7 00 00 00 20 00 00 00 	r0 s>>= 32
>>        7:	b7 01 00 00 01 00 00 00 	r1 = 1
>>        8:	6d 01 0a 00 00 00 00 00 	if r1 s> r0 goto 10
>>        9:	57 00 00 00 7f 00 00 00 	r0 &= 127
>>       10:	bf a4 00 00 00 00 00 00 	r4 = r10
>>       11:	07 04 00 00 80 ff ff ff 	r4 += -128
>>       12:	bf 61 00 00 00 00 00 00 	r1 = r6
>>       13:	18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 	r2 = 0ll
>>       15:	18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 	r3 = 4294967295ll
>>       17:	bf 05 00 00 00 00 00 00 	r5 = r0
>>       18:	85 00 00 00 19 00 00 00 	call 25
>>
>>   [1] http://patchwork.ozlabs.org/project/netdev/list/?series=13211
> 
> Not yet:
> 
> 6: (85) call bpf_probe_read_str#45
> 7: (bf) r1 = r0
> 8: (67) r1 <<= 32
> 9: (77) r1 >>= 32
> 10: (15) if r1 == 0x0 goto pc+10
>  R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
> 11: (57) r0 &= 127
> 12: (bf) r4 = r10
> 13: (07) r4 += -128
> 14: (bf) r1 = r6
> 15: (18) r2 = 0xffff92bfc2aba840
> 17: (18) r3 = 0xffffffff
> 19: (bf) r5 = r0
> 20: (85) call bpf_perf_event_output#25
> invalid stack type R4 off=-128 access_size=0
> 
> I'll try updating clang/llvm...
> 
> Full details:
> 
> [root@jouet bpf]# cat open.c 
> #include "bpf.h"
> 
> SEC("prog=do_sys_open filename")
> int prog(void *ctx, int err, const char __user *filename_ptr)
> {
> 	char filename[128];
> 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);

Btw, I was using 'int' here above instead of 'unsigned' as strncpy_from_unsafe()
could potentially return errors like -EFAULT.

Currently having a version compiled from the git tree:

# llc --version
LLVM (http://llvm.org/):
  LLVM version 6.0.0git-2d810c2
  Optimized build.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: skylake

  Registered Targets:
    bpf    - BPF (host endian)
    bpfeb  - BPF (big endian)
    bpfel  - BPF (little endian)
    x86    - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64

> 	if (len > 0)
>        		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
> 				  len & (sizeof(filename) - 1));
> 	return 1;
> }

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-14 13:09         ` Daniel Borkmann
@ 2017-11-14 13:42           ` Arnaldo Carvalho de Melo
  2017-11-14 14:19             ` Daniel Borkmann
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-14 13:42 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Gianluca Borello, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List, yhs

Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu:
> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
> > Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu:
> >> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
> >>> libbpf: -- BEGIN DUMP LOG ---
> >>> libbpf: 
> >>> 0: (79) r3 = *(u64 *)(r1 +104)
> >>> 1: (b7) r2 = 0
> >>> 2: (bf) r6 = r1
> >>> 3: (bf) r1 = r10
> >>> 4: (07) r1 += -128
> >>> 5: (b7) r2 = 128
> >>> 6: (85) call bpf_probe_read_str#45
> >>> 7: (bf) r1 = r0
> >>> 8: (07) r1 += -1
> >>> 9: (67) r1 <<= 32
> >>> 10: (77) r1 >>= 32
> >>> 11: (25) if r1 > 0x7f goto pc+11
> >>
> >> Right, so the compiler is optimizing the two tests into a single one above,
> >> which means lower bound cannot properly be derived again by the verifier due
> >> to this and thus you'll get the error. Similar issue was seen recently [1].
> >>
> >> Does the below hack work for you?
> >>
> >> int prog([...])
> >> {
> >>         char filename[128];
> >>         int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> >>         if (ret > 0)
> >>                 bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
> >>                                       ret & (sizeof(filename) - 1));
> >>         return 1;
> >> }
> >>
> >> r0 should keep on tracking bounds here at least:
> >>
> >> prog:
> >>        0:	bf 16 00 00 00 00 00 00 	r6 = r1
> >>        1:	bf a1 00 00 00 00 00 00 	r1 = r10
> >>        2:	07 01 00 00 80 ff ff ff 	r1 += -128
> >>        3:	b7 02 00 00 80 00 00 00 	r2 = 128
> >>        4:	85 00 00 00 2d 00 00 00 	call 45
> >>        5:	67 00 00 00 20 00 00 00 	r0 <<= 32
> >>        6:	c7 00 00 00 20 00 00 00 	r0 s>>= 32
> >>        7:	b7 01 00 00 01 00 00 00 	r1 = 1
> >>        8:	6d 01 0a 00 00 00 00 00 	if r1 s> r0 goto 10
> >>        9:	57 00 00 00 7f 00 00 00 	r0 &= 127
> >>       10:	bf a4 00 00 00 00 00 00 	r4 = r10
> >>       11:	07 04 00 00 80 ff ff ff 	r4 += -128
> >>       12:	bf 61 00 00 00 00 00 00 	r1 = r6
> >>       13:	18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 	r2 = 0ll
> >>       15:	18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 	r3 = 4294967295ll
> >>       17:	bf 05 00 00 00 00 00 00 	r5 = r0
> >>       18:	85 00 00 00 19 00 00 00 	call 25
> >>
> >>   [1] http://patchwork.ozlabs.org/project/netdev/list/?series=13211
> > 
> > Not yet:
> > 
> > 6: (85) call bpf_probe_read_str#45
> > 7: (bf) r1 = r0
> > 8: (67) r1 <<= 32
> > 9: (77) r1 >>= 32
> > 10: (15) if r1 == 0x0 goto pc+10
> >  R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
> > 11: (57) r0 &= 127
> > 12: (bf) r4 = r10
> > 13: (07) r4 += -128
> > 14: (bf) r1 = r6
> > 15: (18) r2 = 0xffff92bfc2aba840
> > 17: (18) r3 = 0xffffffff
> > 19: (bf) r5 = r0
> > 20: (85) call bpf_perf_event_output#25
> > invalid stack type R4 off=-128 access_size=0
> > 
> > I'll try updating clang/llvm...
> > 
> > Full details:
> > 
> > [root@jouet bpf]# cat open.c 
> > #include "bpf.h"
> > 
> > SEC("prog=do_sys_open filename")
> > int prog(void *ctx, int err, const char __user *filename_ptr)
> > {
> > 	char filename[128];
> > 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> 
> Btw, I was using 'int' here above instead of 'unsigned' as strncpy_from_unsafe()
> could potentially return errors like -EFAULT.

I changed to int, didn't help
 
> Currently having a version compiled from the git tree:
> 
> # llc --version
> LLVM (http://llvm.org/):
>   LLVM version 6.0.0git-2d810c2
>   Optimized build.
>   Default target: x86_64-unknown-linux-gnu
>   Host CPU: skylake

[root@jouet bpf]# llc --version
LLVM (http://llvm.org/):
  LLVM version 4.0.0svn

Old stuff! ;-) Will change, but improving these messages should be on
the radar, I think :-)

- Arnaldo
 
>   Registered Targets:
>     bpf    - BPF (host endian)
>     bpfeb  - BPF (big endian)
>     bpfel  - BPF (little endian)
>     x86    - 32-bit X86: Pentium-Pro and above
>     x86-64 - 64-bit X86: EM64T and AMD64
> 
> > 	if (len > 0)
> >        		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
> > 				  len & (sizeof(filename) - 1));
> > 	return 1;
> > }

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-14 13:42           ` Arnaldo Carvalho de Melo
@ 2017-11-14 14:19             ` Daniel Borkmann
  2017-11-14 14:58               ` Arnaldo Carvalho de Melo
  2017-11-14 18:15               ` Yonghong Song
  0 siblings, 2 replies; 20+ messages in thread
From: Daniel Borkmann @ 2017-11-14 14:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Gianluca Borello, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List, yhs

On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu:
>> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
>>> Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu:
>>>> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
>>>>> libbpf: -- BEGIN DUMP LOG ---
>>>>> libbpf: 
>>>>> 0: (79) r3 = *(u64 *)(r1 +104)
>>>>> 1: (b7) r2 = 0
>>>>> 2: (bf) r6 = r1
>>>>> 3: (bf) r1 = r10
>>>>> 4: (07) r1 += -128
>>>>> 5: (b7) r2 = 128
>>>>> 6: (85) call bpf_probe_read_str#45
>>>>> 7: (bf) r1 = r0
>>>>> 8: (07) r1 += -1
>>>>> 9: (67) r1 <<= 32
>>>>> 10: (77) r1 >>= 32
>>>>> 11: (25) if r1 > 0x7f goto pc+11
>>>>
>>>> Right, so the compiler is optimizing the two tests into a single one above,
>>>> which means lower bound cannot properly be derived again by the verifier due
>>>> to this and thus you'll get the error. Similar issue was seen recently [1].
>>>>
>>>> Does the below hack work for you?
>>>>
>>>> int prog([...])
>>>> {
>>>>         char filename[128];
>>>>         int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>>>         if (ret > 0)
>>>>                 bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
>>>>                                       ret & (sizeof(filename) - 1));
>>>>         return 1;
>>>> }
>>>>
>>>> r0 should keep on tracking bounds here at least:
>>>>
>>>> prog:
>>>>        0:	bf 16 00 00 00 00 00 00 	r6 = r1
>>>>        1:	bf a1 00 00 00 00 00 00 	r1 = r10
>>>>        2:	07 01 00 00 80 ff ff ff 	r1 += -128
>>>>        3:	b7 02 00 00 80 00 00 00 	r2 = 128
>>>>        4:	85 00 00 00 2d 00 00 00 	call 45
>>>>        5:	67 00 00 00 20 00 00 00 	r0 <<= 32
>>>>        6:	c7 00 00 00 20 00 00 00 	r0 s>>= 32
>>>>        7:	b7 01 00 00 01 00 00 00 	r1 = 1
>>>>        8:	6d 01 0a 00 00 00 00 00 	if r1 s> r0 goto 10
>>>>        9:	57 00 00 00 7f 00 00 00 	r0 &= 127
>>>>       10:	bf a4 00 00 00 00 00 00 	r4 = r10
>>>>       11:	07 04 00 00 80 ff ff ff 	r4 += -128
>>>>       12:	bf 61 00 00 00 00 00 00 	r1 = r6
>>>>       13:	18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 	r2 = 0ll
>>>>       15:	18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 	r3 = 4294967295ll
>>>>       17:	bf 05 00 00 00 00 00 00 	r5 = r0
>>>>       18:	85 00 00 00 19 00 00 00 	call 25
>>>>
>>>>   [1] http://patchwork.ozlabs.org/project/netdev/list/?series=13211
>>>
>>> Not yet:
>>>
>>> 6: (85) call bpf_probe_read_str#45
>>> 7: (bf) r1 = r0
>>> 8: (67) r1 <<= 32
>>> 9: (77) r1 >>= 32
>>> 10: (15) if r1 == 0x0 goto pc+10
>>>  R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>>> 11: (57) r0 &= 127
>>> 12: (bf) r4 = r10
>>> 13: (07) r4 += -128
>>> 14: (bf) r1 = r6
>>> 15: (18) r2 = 0xffff92bfc2aba840u
>>> 17: (18) r3 = 0xffffffff
>>> 19: (bf) r5 = r0
>>> 20: (85) call bpf_perf_event_output#25
>>> invalid stack type R4 off=-128 access_size=0
>>>
>>> I'll try updating clang/llvm...
>>>
>>> Full details:
>>>
>>> [root@jouet bpf]# cat open.c 
>>> #include "bpf.h"
>>>
>>> SEC("prog=do_sys_open filename")
>>> int prog(void *ctx, int err, const char __user *filename_ptr)
>>> {
>>> 	char filename[128];
>>> 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>
>> Btw, I was using 'int' here above instead of 'unsigned' as strncpy_from_unsafe()
>> could potentially return errors like -EFAULT.
> 
> I changed to int, didn't help
>  
>> Currently having a version compiled from the git tree:
>>
>> # llc --version
>> LLVM (http://llvm.org/):
>>   LLVM version 6.0.0git-2d810c2
>>   Optimized build.
>>   Default target: x86_64-unknown-linux-gnu
>>   Host CPU: skylake
> 
> [root@jouet bpf]# llc --version
> LLVM (http://llvm.org/):
>   LLVM version 4.0.0svn
> 
> Old stuff! ;-) Will change, but improving these messages should be on
> the radar, I think :-)

Yep, agree, I think we need a generic, better solution for this type of
issue instead of converting individual helpers to handle 0 min bound and
then only bailing out in such case; need to brainstorm a bit on that.

I think for the above in your case ...

 [...]
  6: (85) call bpf_probe_read_str#45
  7: (bf) r1 = r0
  8: (67) r1 <<= 32
  9: (77) r1 >>= 32
 10: (15) if r1 == 0x0 goto pc+10
  R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
 11: (57) r0 &= 127
 [...]

... the shifts on r1 might be due to using 32 bit type, so if you find
a way to avoid these and have the test on r0 directly, we might get there.
Perhaps keep using a 64 bit type to avoid them. It would be useful to
propagate the deduced bound information back to r0 when we know that
neither r0 nor r1 has changed in the meantime.

> - Arnaldo
>  
>>   Registered Targets:
>>     bpf    - BPF (host endian)
>>     bpfeb  - BPF (big endian)
>>     bpfel  - BPF (little endian)
>>     x86    - 32-bit X86: Pentium-Pro and above
>>     x86-64 - 64-bit X86: EM64T and AMD64
>>
>>> 	if (len > 0)
>>>        		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
>>> 				  len & (sizeof(filename) - 1));
>>> 	return 1;
>>> }

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-14 14:19             ` Daniel Borkmann
@ 2017-11-14 14:58               ` Arnaldo Carvalho de Melo
  2017-11-14 18:15               ` Yonghong Song
  1 sibling, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-14 14:58 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Gianluca Borello, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List, yhs

Em Tue, Nov 14, 2017 at 03:19:51PM +0100, Daniel Borkmann escreveu:
> On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote:
> > Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu:
> >> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
> >> Currently having a version compiled from the git tree:

> >> # llc --version
> >> LLVM (http://llvm.org/):
> >>   LLVM version 6.0.0git-2d810c2
> >>   Optimized build.
> >>   Default target: x86_64-unknown-linux-gnu
> >>   Host CPU: skylake

> > [root@jouet bpf]# llc --version
> > LLVM (http://llvm.org/):
> >   LLVM version 4.0.0svn

> > Old stuff! ;-) Will change, but improving these messages should be on
> > the radar, I think :-)

> Yep, agree, I think we need a generic, better solution for this type of
> issue instead of converting individual helpers to handle 0 min bound and
> then only bailing out in such case; need to brainstorm a bit on that.
 
> I think for the above in your case ...
 
>  [...]
>   6: (85) call bpf_probe_read_str#45
>   7: (bf) r1 = r0
>   8: (67) r1 <<= 32
>   9: (77) r1 >>= 32
>  10: (15) if r1 == 0x0 goto pc+10
>   R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>  11: (57) r0 &= 127
>  [...]
 
> ... the shifts on r1 might be due to using 32 bit type, so if you find
> a way to avoid these and have the test on r0 directly, we might get there.
> Perhaps keep using a 64 bit type to avoid them. It would be useful to
> propagate the deduced bound information back to r0 when we know that
> neither r0 nor r1 has changed in the meantime.

I changed len/ret to u64, didn't help, updating clang and llvm to see if
that helps...

Will end up working directly with eBPF bytecode, which is what I really
need in 'perf trace', but lets get this sorted out first.

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-14 14:19             ` Daniel Borkmann
  2017-11-14 14:58               ` Arnaldo Carvalho de Melo
@ 2017-11-14 18:15               ` Yonghong Song
  2017-11-14 20:25                 ` Daniel Borkmann
  1 sibling, 1 reply; 20+ messages in thread
From: Yonghong Song @ 2017-11-14 18:15 UTC (permalink / raw)
  To: Daniel Borkmann, Arnaldo Carvalho de Melo
  Cc: Gianluca Borello, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List



On 11/14/17 6:19 AM, Daniel Borkmann wrote:
> On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote:
>> Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu:
>>> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
>>>> Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu:
>>>>> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
>>>>>> libbpf: -- BEGIN DUMP LOG ---
>>>>>> libbpf:
>>>>>> 0: (79) r3 = *(u64 *)(r1 +104)
>>>>>> 1: (b7) r2 = 0
>>>>>> 2: (bf) r6 = r1
>>>>>> 3: (bf) r1 = r10
>>>>>> 4: (07) r1 += -128
>>>>>> 5: (b7) r2 = 128
>>>>>> 6: (85) call bpf_probe_read_str#45
>>>>>> 7: (bf) r1 = r0
>>>>>> 8: (07) r1 += -1
>>>>>> 9: (67) r1 <<= 32
>>>>>> 10: (77) r1 >>= 32
>>>>>> 11: (25) if r1 > 0x7f goto pc+11
>>>>>
>>>>> Right, so the compiler is optimizing the two tests into a single one above,
>>>>> which means lower bound cannot properly be derived again by the verifier due
>>>>> to this and thus you'll get the error. Similar issue was seen recently [1].
>>>>>
>>>>> Does the below hack work for you?
>>>>>
>>>>> int prog([...])
>>>>> {
>>>>>          char filename[128];
>>>>>          int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>>>>          if (ret > 0)
>>>>>                  bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
>>>>>                                        ret & (sizeof(filename) - 1));
>>>>>          return 1;
>>>>> }
>>>>>
>>>>> r0 should keep on tracking bounds here at least:
>>>>>
>>>>> prog:
>>>>>         0:	bf 16 00 00 00 00 00 00 	r6 = r1
>>>>>         1:	bf a1 00 00 00 00 00 00 	r1 = r10
>>>>>         2:	07 01 00 00 80 ff ff ff 	r1 += -128
>>>>>         3:	b7 02 00 00 80 00 00 00 	r2 = 128
>>>>>         4:	85 00 00 00 2d 00 00 00 	call 45
>>>>>         5:	67 00 00 00 20 00 00 00 	r0 <<= 32
>>>>>         6:	c7 00 00 00 20 00 00 00 	r0 s>>= 32
>>>>>         7:	b7 01 00 00 01 00 00 00 	r1 = 1
>>>>>         8:	6d 01 0a 00 00 00 00 00 	if r1 s> r0 goto 10
>>>>>         9:	57 00 00 00 7f 00 00 00 	r0 &= 127
>>>>>        10:	bf a4 00 00 00 00 00 00 	r4 = r10
>>>>>        11:	07 04 00 00 80 ff ff ff 	r4 += -128
>>>>>        12:	bf 61 00 00 00 00 00 00 	r1 = r6
>>>>>        13:	18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 	r2 = 0ll
>>>>>        15:	18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 	r3 = 4294967295ll
>>>>>        17:	bf 05 00 00 00 00 00 00 	r5 = r0
>>>>>        18:	85 00 00 00 19 00 00 00 	call 25
>>>>>
>>>>>    [1] https://urldefense.proofpoint.com/v2/url?u=http-3A__patchwork.ozlabs.org_project_netdev_list_-3Fseries-3D13211&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=z0d6b_hxStA845Kh7epJ-JiFwkiWqUH_z3fEadwqAQY&e=
>>>>
>>>> Not yet:
>>>>
>>>> 6: (85) call bpf_probe_read_str#45
>>>> 7: (bf) r1 = r0
>>>> 8: (67) r1 <<= 32
>>>> 9: (77) r1 >>= 32
>>>> 10: (15) if r1 == 0x0 goto pc+10
>>>>   R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>>>> 11: (57) r0 &= 127
>>>> 12: (bf) r4 = r10
>>>> 13: (07) r4 += -128
>>>> 14: (bf) r1 = r6
>>>> 15: (18) r2 = 0xffff92bfc2aba840u
>>>> 17: (18) r3 = 0xffffffff
>>>> 19: (bf) r5 = r0
>>>> 20: (85) call bpf_perf_event_output#25
>>>> invalid stack type R4 off=-128 access_size=0
>>>>
>>>> I'll try updating clang/llvm...
>>>>
>>>> Full details:
>>>>
>>>> [root@jouet bpf]# cat open.c
>>>> #include "bpf.h"
>>>>
>>>> SEC("prog=do_sys_open filename")
>>>> int prog(void *ctx, int err, const char __user *filename_ptr)
>>>> {
>>>> 	char filename[128];
>>>> 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>>
>>> Btw, I was using 'int' here above instead of 'unsigned' as strncpy_from_unsafe()
>>> could potentially return errors like -EFAULT.
>>
>> I changed to int, didn't help
>>   
>>> Currently having a version compiled from the git tree:
>>>
>>> # llc --version
>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
>>>    LLVM version 6.0.0git-2d810c2
>>>    Optimized build.
>>>    Default target: x86_64-unknown-linux-gnu
>>>    Host CPU: skylake
>>
>> [root@jouet bpf]# llc --version
>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
>>    LLVM version 4.0.0svn
>>
>> Old stuff! ;-) Will change, but improving these messages should be on
>> the radar, I think :-)
> 
> Yep, agree, I think we need a generic, better solution for this type of
> issue instead of converting individual helpers to handle 0 min bound and
> then only bailing out in such case; need to brainstorm a bit on that.
> 
> I think for the above in your case ...
> 
>   [...]
>    6: (85) call bpf_probe_read_str#45
>    7: (bf) r1 = r0
>    8: (67) r1 <<= 32
>    9: (77) r1 >>= 32
>   10: (15) if r1 == 0x0 goto pc+10
>    R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>   11: (57) r0 &= 127
>   [...]
> 
> ... the shifts on r1 might be due to using 32 bit type, so if you find
> a way to avoid these and have the test on r0 directly, we might get there.
> Perhaps keep using a 64 bit type to avoid them. It would be useful to
> propagate the deduced bound information back to r0 when we know that
> neither r0 nor r1 has changed in the meantime.

It is tricky to do in the bpf_program. Compiler tries hard to optimize :-).

The issue is at "r0 &= 127".

9: (6d) if r1 s> r0 goto pc+10
  R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 
0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
10: 
R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 
0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
10: (57) r0 &= 127
11: R0=inv(id=0,umax_value=127,var_off=(0x0; 0x7f)) R1=inv1 
R6=ctx(id=0,off=0,imm=0) R10=fp0

One possible solution for this problem is to relax the arg4 type
from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO.

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index a5580c6..a68d8bd 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -393,6 +393,9 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, 
regs, struct bpf_map *, map,
                 },
         };

+       if (unlikely(size == 0))
+               return 0;
+
         if (unlikely(flags & ~(BPF_F_INDEX_MASK)))
                 return -EINVAL;

@@ -407,7 +410,7 @@ static const struct bpf_func_proto 
bpf_perf_event_output_proto = {
         .arg2_type      = ARG_CONST_MAP_PTR,
         .arg3_type      = ARG_ANYTHING,
         .arg4_type      = ARG_PTR_TO_MEM,
-       .arg5_type      = ARG_CONST_SIZE,
+       .arg5_type      = ARG_CONST_SIZE_OR_ZERO,
  };

> 
>> - Arnaldo
>>   
>>>    Registered Targets:
>>>      bpf    - BPF (host endian)
>>>      bpfeb  - BPF (big endian)
>>>      bpfel  - BPF (little endian)
>>>      x86    - 32-bit X86: Pentium-Pro and above
>>>      x86-64 - 64-bit X86: EM64T and AMD64
>>>
>>>> 	if (len > 0)
>>>>         		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
>>>> 				  len & (sizeof(filename) - 1));
>>>> 	return 1;
>>>> }
> 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-14 18:15               ` Yonghong Song
@ 2017-11-14 20:25                 ` Daniel Borkmann
  2017-11-14 22:58                   ` Yonghong Song
  2017-11-20 13:31                   ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 20+ messages in thread
From: Daniel Borkmann @ 2017-11-14 20:25 UTC (permalink / raw)
  To: Yonghong Song, Arnaldo Carvalho de Melo
  Cc: Gianluca Borello, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List

On 11/14/2017 07:15 PM, Yonghong Song wrote:
> On 11/14/17 6:19 AM, Daniel Borkmann wrote:
>> On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote:
>>> Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu:
>>>> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
>>>>> Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu:
>>>>>> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
>>>>>>> libbpf: -- BEGIN DUMP LOG ---
>>>>>>> libbpf:
>>>>>>> 0: (79) r3 = *(u64 *)(r1 +104)
>>>>>>> 1: (b7) r2 = 0
>>>>>>> 2: (bf) r6 = r1
>>>>>>> 3: (bf) r1 = r10
>>>>>>> 4: (07) r1 += -128
>>>>>>> 5: (b7) r2 = 128
>>>>>>> 6: (85) call bpf_probe_read_str#45
>>>>>>> 7: (bf) r1 = r0
>>>>>>> 8: (07) r1 += -1
>>>>>>> 9: (67) r1 <<= 32
>>>>>>> 10: (77) r1 >>= 32
>>>>>>> 11: (25) if r1 > 0x7f goto pc+11
>>>>>>
>>>>>> Right, so the compiler is optimizing the two tests into a single one above,
>>>>>> which means lower bound cannot properly be derived again by the verifier due
>>>>>> to this and thus you'll get the error. Similar issue was seen recently [1].
>>>>>>
>>>>>> Does the below hack work for you?
>>>>>>
>>>>>> int prog([...])
>>>>>> {
>>>>>>          char filename[128];
>>>>>>          int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>>>>>          if (ret > 0)
>>>>>>                  bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
>>>>>>                                        ret & (sizeof(filename) - 1));
>>>>>>          return 1;
>>>>>> }
>>>>>>
>>>>>> r0 should keep on tracking bounds here at least:
>>>>>>
>>>>>> prog:
>>>>>>         0:    bf 16 00 00 00 00 00 00     r6 = r1
>>>>>>         1:    bf a1 00 00 00 00 00 00     r1 = r10
>>>>>>         2:    07 01 00 00 80 ff ff ff     r1 += -128
>>>>>>         3:    b7 02 00 00 80 00 00 00     r2 = 128
>>>>>>         4:    85 00 00 00 2d 00 00 00     call 45
>>>>>>         5:    67 00 00 00 20 00 00 00     r0 <<= 32
>>>>>>         6:    c7 00 00 00 20 00 00 00     r0 s>>= 32
>>>>>>         7:    b7 01 00 00 01 00 00 00     r1 = 1
>>>>>>         8:    6d 01 0a 00 00 00 00 00     if r1 s> r0 goto 10
>>>>>>         9:    57 00 00 00 7f 00 00 00     r0 &= 127
>>>>>>        10:    bf a4 00 00 00 00 00 00     r4 = r10
>>>>>>        11:    07 04 00 00 80 ff ff ff     r4 += -128
>>>>>>        12:    bf 61 00 00 00 00 00 00     r1 = r6
>>>>>>        13:    18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00     r2 = 0ll
>>>>>>        15:    18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00     r3 = 4294967295ll
>>>>>>        17:    bf 05 00 00 00 00 00 00     r5 = r0
>>>>>>        18:    85 00 00 00 19 00 00 00     call 25
>>>>>>
>>>>>>    [1] https://urldefense.proofpoint.com/v2/url?u=http-3A__patchwork.ozlabs.org_project_netdev_list_-3Fseries-3D13211&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=z0d6b_hxStA845Kh7epJ-JiFwkiWqUH_z3fEadwqAQY&e=
>>>>>
>>>>> Not yet:
>>>>>
>>>>> 6: (85) call bpf_probe_read_str#45
>>>>> 7: (bf) r1 = r0
>>>>> 8: (67) r1 <<= 32
>>>>> 9: (77) r1 >>= 32
>>>>> 10: (15) if r1 == 0x0 goto pc+10
>>>>>   R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>>>>> 11: (57) r0 &= 127
>>>>> 12: (bf) r4 = r10
>>>>> 13: (07) r4 += -128
>>>>> 14: (bf) r1 = r6
>>>>> 15: (18) r2 = 0xffff92bfc2aba840u
>>>>> 17: (18) r3 = 0xffffffff
>>>>> 19: (bf) r5 = r0
>>>>> 20: (85) call bpf_perf_event_output#25
>>>>> invalid stack type R4 off=-128 access_size=0
>>>>>
>>>>> I'll try updating clang/llvm...
>>>>>
>>>>> Full details:
>>>>>
>>>>> [root@jouet bpf]# cat open.c
>>>>> #include "bpf.h"
>>>>>
>>>>> SEC("prog=do_sys_open filename")
>>>>> int prog(void *ctx, int err, const char __user *filename_ptr)
>>>>> {
>>>>>     char filename[128];
>>>>>     const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>>>
>>>> Btw, I was using 'int' here above instead of 'unsigned' as strncpy_from_unsafe()
>>>> could potentially return errors like -EFAULT.
>>>
>>> I changed to int, didn't help
>>>  
>>>> Currently having a version compiled from the git tree:
>>>>
>>>> # llc --version
>>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
>>>>    LLVM version 6.0.0git-2d810c2
>>>>    Optimized build.
>>>>    Default target: x86_64-unknown-linux-gnu
>>>>    Host CPU: skylake
>>>
>>> [root@jouet bpf]# llc --version
>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
>>>    LLVM version 4.0.0svn
>>>
>>> Old stuff! ;-) Will change, but improving these messages should be on
>>> the radar, I think :-)
>>
>> Yep, agree, I think we need a generic, better solution for this type of
>> issue instead of converting individual helpers to handle 0 min bound and
>> then only bailing out in such case; need to brainstorm a bit on that.
>>
>> I think for the above in your case ...
>>
>>   [...]
>>    6: (85) call bpf_probe_read_str#45
>>    7: (bf) r1 = r0
>>    8: (67) r1 <<= 32
>>    9: (77) r1 >>= 32
>>   10: (15) if r1 == 0x0 goto pc+10
>>    R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>>   11: (57) r0 &= 127
>>   [...]
>>
>> ... the shifts on r1 might be due to using 32 bit type, so if you find
>> a way to avoid these and have the test on r0 directly, we might get there.
>> Perhaps keep using a 64 bit type to avoid them. It would be useful to
>> propagate the deduced bound information back to r0 when we know that
>> neither r0 nor r1 has changed in the meantime.
> 
> It is tricky to do in the bpf_program. Compiler tries hard to optimize :-).
> 
> The issue is at "r0 &= 127".
> 
> 9: (6d) if r1 s> r0 goto pc+10
>  R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
> 10: R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
> 10: (57) r0 &= 127
> 11: R0=inv(id=0,umax_value=127,var_off=(0x0; 0x7f)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
> 
> One possible solution for this problem is to relax the arg4 type
> from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO.

Yeah, I know, that's what I mentioned earlier in this thread to resolve it,
but do we really want to add this hack everywhere? :( Potentially any function
having ARG_CONST_SIZE would need to handle size 0 and bail out again in their
helper implementation and it ends up that progs start relying on this runtime
check where we won't be able to get rid of it later on anymore.

> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index a5580c6..a68d8bd 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -393,6 +393,9 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
>                 },
>         };
> 
> +       if (unlikely(size == 0))
> +               return 0;
> +
>         if (unlikely(flags & ~(BPF_F_INDEX_MASK)))
>                 return -EINVAL;
> 
> @@ -407,7 +410,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto = {
>         .arg2_type      = ARG_CONST_MAP_PTR,
>         .arg3_type      = ARG_ANYTHING,
>         .arg4_type      = ARG_PTR_TO_MEM,
> -       .arg5_type      = ARG_CONST_SIZE,
> +       .arg5_type      = ARG_CONST_SIZE_OR_ZERO,
>  };

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-14 20:25                 ` Daniel Borkmann
@ 2017-11-14 22:58                   ` Yonghong Song
  2017-11-21 14:29                     ` Arnaldo Carvalho de Melo
  2017-11-20 13:31                   ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 20+ messages in thread
From: Yonghong Song @ 2017-11-14 22:58 UTC (permalink / raw)
  To: Daniel Borkmann, Arnaldo Carvalho de Melo
  Cc: Gianluca Borello, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List



On 11/14/17 12:25 PM, Daniel Borkmann wrote:
> On 11/14/2017 07:15 PM, Yonghong Song wrote:
>> On 11/14/17 6:19 AM, Daniel Borkmann wrote:
>>> On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote:
>>>> Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu:
>>>>> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
>>>>>> Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu:
>>>>>>> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
>>>>>>>> libbpf: -- BEGIN DUMP LOG ---
>>>>>>>> libbpf:
>>>>>>>> 0: (79) r3 = *(u64 *)(r1 +104)
>>>>>>>> 1: (b7) r2 = 0
>>>>>>>> 2: (bf) r6 = r1
>>>>>>>> 3: (bf) r1 = r10
>>>>>>>> 4: (07) r1 += -128
>>>>>>>> 5: (b7) r2 = 128
>>>>>>>> 6: (85) call bpf_probe_read_str#45
>>>>>>>> 7: (bf) r1 = r0
>>>>>>>> 8: (07) r1 += -1
>>>>>>>> 9: (67) r1 <<= 32
>>>>>>>> 10: (77) r1 >>= 32
>>>>>>>> 11: (25) if r1 > 0x7f goto pc+11
>>>>>>>
>>>>>>> Right, so the compiler is optimizing the two tests into a single one above,
>>>>>>> which means lower bound cannot properly be derived again by the verifier due
>>>>>>> to this and thus you'll get the error. Similar issue was seen recently [1].
>>>>>>>
>>>>>>> Does the below hack work for you?
>>>>>>>
>>>>>>> int prog([...])
>>>>>>> {
>>>>>>>           char filename[128];
>>>>>>>           int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>>>>>>           if (ret > 0)
>>>>>>>                   bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
>>>>>>>                                         ret & (sizeof(filename) - 1));
>>>>>>>           return 1;
>>>>>>> }
>>>>>>>
>>>>>>> r0 should keep on tracking bounds here at least:
>>>>>>>
>>>>>>> prog:
>>>>>>>          0:    bf 16 00 00 00 00 00 00     r6 = r1
>>>>>>>          1:    bf a1 00 00 00 00 00 00     r1 = r10
>>>>>>>          2:    07 01 00 00 80 ff ff ff     r1 += -128
>>>>>>>          3:    b7 02 00 00 80 00 00 00     r2 = 128
>>>>>>>          4:    85 00 00 00 2d 00 00 00     call 45
>>>>>>>          5:    67 00 00 00 20 00 00 00     r0 <<= 32
>>>>>>>          6:    c7 00 00 00 20 00 00 00     r0 s>>= 32
>>>>>>>          7:    b7 01 00 00 01 00 00 00     r1 = 1
>>>>>>>          8:    6d 01 0a 00 00 00 00 00     if r1 s> r0 goto 10
>>>>>>>          9:    57 00 00 00 7f 00 00 00     r0 &= 127
>>>>>>>         10:    bf a4 00 00 00 00 00 00     r4 = r10
>>>>>>>         11:    07 04 00 00 80 ff ff ff     r4 += -128
>>>>>>>         12:    bf 61 00 00 00 00 00 00     r1 = r6
>>>>>>>         13:    18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00     r2 = 0ll
>>>>>>>         15:    18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00     r3 = 4294967295ll
>>>>>>>         17:    bf 05 00 00 00 00 00 00     r5 = r0
>>>>>>>         18:    85 00 00 00 19 00 00 00     call 25
>>>>>>>
>>>>>>>     [1] https://urldefense.proofpoint.com/v2/url?u=http-3A__patchwork.ozlabs.org_project_netdev_list_-3Fseries-3D13211&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=z0d6b_hxStA845Kh7epJ-JiFwkiWqUH_z3fEadwqAQY&e=
>>>>>>
>>>>>> Not yet:
>>>>>>
>>>>>> 6: (85) call bpf_probe_read_str#45
>>>>>> 7: (bf) r1 = r0
>>>>>> 8: (67) r1 <<= 32
>>>>>> 9: (77) r1 >>= 32
>>>>>> 10: (15) if r1 == 0x0 goto pc+10
>>>>>>    R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>>>>>> 11: (57) r0 &= 127
>>>>>> 12: (bf) r4 = r10
>>>>>> 13: (07) r4 += -128
>>>>>> 14: (bf) r1 = r6
>>>>>> 15: (18) r2 = 0xffff92bfc2aba840u
>>>>>> 17: (18) r3 = 0xffffffff
>>>>>> 19: (bf) r5 = r0
>>>>>> 20: (85) call bpf_perf_event_output#25
>>>>>> invalid stack type R4 off=-128 access_size=0
>>>>>>
>>>>>> I'll try updating clang/llvm...
>>>>>>
>>>>>> Full details:
>>>>>>
>>>>>> [root@jouet bpf]# cat open.c
>>>>>> #include "bpf.h"
>>>>>>
>>>>>> SEC("prog=do_sys_open filename")
>>>>>> int prog(void *ctx, int err, const char __user *filename_ptr)
>>>>>> {
>>>>>>      char filename[128];
>>>>>>      const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>>>>
>>>>> Btw, I was using 'int' here above instead of 'unsigned' as strncpy_from_unsafe()
>>>>> could potentially return errors like -EFAULT.
>>>>
>>>> I changed to int, didn't help
>>>>   
>>>>> Currently having a version compiled from the git tree:
>>>>>
>>>>> # llc --version
>>>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
>>>>>     LLVM version 6.0.0git-2d810c2
>>>>>     Optimized build.
>>>>>     Default target: x86_64-unknown-linux-gnu
>>>>>     Host CPU: skylake
>>>>
>>>> [root@jouet bpf]# llc --version
>>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
>>>>     LLVM version 4.0.0svn
>>>>
>>>> Old stuff! ;-) Will change, but improving these messages should be on
>>>> the radar, I think :-)
>>>
>>> Yep, agree, I think we need a generic, better solution for this type of
>>> issue instead of converting individual helpers to handle 0 min bound and
>>> then only bailing out in such case; need to brainstorm a bit on that.
>>>
>>> I think for the above in your case ...
>>>
>>>    [...]
>>>     6: (85) call bpf_probe_read_str#45
>>>     7: (bf) r1 = r0
>>>     8: (67) r1 <<= 32
>>>     9: (77) r1 >>= 32
>>>    10: (15) if r1 == 0x0 goto pc+10
>>>     R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>>>    11: (57) r0 &= 127
>>>    [...]
>>>
>>> ... the shifts on r1 might be due to using 32 bit type, so if you find
>>> a way to avoid these and have the test on r0 directly, we might get there.
>>> Perhaps keep using a 64 bit type to avoid them. It would be useful to
>>> propagate the deduced bound information back to r0 when we know that
>>> neither r0 nor r1 has changed in the meantime.
>>
>> It is tricky to do in the bpf_program. Compiler tries hard to optimize :-).
>>
>> The issue is at "r0 &= 127".
>>
>> 9: (6d) if r1 s> r0 goto pc+10
>>   R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
>> 10: R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
>> 10: (57) r0 &= 127
>> 11: R0=inv(id=0,umax_value=127,var_off=(0x0; 0x7f)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
>>
>> One possible solution for this problem is to relax the arg4 type
>> from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO.
> 
> Yeah, I know, that's what I mentioned earlier in this thread to resolve it,
> but do we really want to add this hack everywhere? :( Potentially any function
> having ARG_CONST_SIZE would need to handle size 0 and bail out again in their
> helper implementation and it ends up that progs start relying on this runtime
> check where we won't be able to get rid of it later on anymore.

The compiler actually does the right thing for the below code:
          int ret = bpf_probe_read_str(filename, sizeof(filename),
                                       filename_ptr);
          if (ret > 0)
            bpf_perf_event_output(ctx, &__bpf_stdout__,BPF_F_CURRENT_CPU,
                            filename, ret & (sizeof(filename) - 1));

Just from the above code without consulting bpf_probe_read_str 
internals, it is totally possible that ret = 128, then
ret & (sizeof(filename) - 1) = 0.
The issue is that the verifier did not set the "ret" initial range as 
(-inf, sizeof(filename) - 1). We could have this information associated 
with helper and feed back to verifier.

If we have this range, later for ret & (sizeof(filename) - 1) with ret 
 >= 1, the verifier should be able to conclude
  ret & (sizeof(filename) - 1) >= 1.

To workaround the immediate problem, I tested the following hack
with bcc and it works fine.

BPF_PERF_OUTPUT(events);
int trace(struct pt_regs *ctx) {
   char filename[128];
   int ret = bpf_probe_read_str(filename, sizeof(filename), 0);
   if (ret > 0) {
     if (ret == 1)
           events.perf_submit(ctx, filename, ret);
     else if (ret < 128)
           events.perf_submit(ctx, filename, ret);
   }
   return 1;
}

The idea is to make control flow more complex to prevent llvm
do certain optimizations.

> 
>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>> index a5580c6..a68d8bd 100644
>> --- a/kernel/trace/bpf_trace.c
>> +++ b/kernel/trace/bpf_trace.c
>> @@ -393,6 +393,9 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
>>                  },
>>          };
>>
>> +       if (unlikely(size == 0))
>> +               return 0;
>> +
>>          if (unlikely(flags & ~(BPF_F_INDEX_MASK)))
>>                  return -EINVAL;
>>
>> @@ -407,7 +410,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto = {
>>          .arg2_type      = ARG_CONST_MAP_PTR,
>>          .arg3_type      = ARG_ANYTHING,
>>          .arg4_type      = ARG_PTR_TO_MEM,
>> -       .arg5_type      = ARG_CONST_SIZE,
>> +       .arg5_type      = ARG_CONST_SIZE_OR_ZERO,
>>   };
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-14 20:25                 ` Daniel Borkmann
  2017-11-14 22:58                   ` Yonghong Song
@ 2017-11-20 13:31                   ` Arnaldo Carvalho de Melo
  2017-11-20 16:47                     ` Yonghong Song
  1 sibling, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-20 13:31 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Yonghong Song, Gianluca Borello, Alexei Starovoitov,
	David Miller, Linux Networking Development Mailing List

Em Tue, Nov 14, 2017 at 09:25:17PM +0100, Daniel Borkmann escreveu:
> On 11/14/2017 07:15 PM, Yonghong Song wrote:
> > On 11/14/17 6:19 AM, Daniel Borkmann wrote:
> >> On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote:
> >>> Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu:
> >>>> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
> >>>>> Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu:
> >>>>>> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
> >>>>>>> libbpf: -- BEGIN DUMP LOG ---
> >>>>>>> libbpf:
> >>>>>>> 0: (79) r3 = *(u64 *)(r1 +104)
> >>>>>>> 1: (b7) r2 = 0
> >>>>>>> 2: (bf) r6 = r1
> >>>>>>> 3: (bf) r1 = r10
> >>>>>>> 4: (07) r1 += -128
> >>>>>>> 5: (b7) r2 = 128
> >>>>>>> 6: (85) call bpf_probe_read_str#45
> >>>>>>> 7: (bf) r1 = r0
> >>>>>>> 8: (07) r1 += -1
> >>>>>>> 9: (67) r1 <<= 32
> >>>>>>> 10: (77) r1 >>= 32
> >>>>>>> 11: (25) if r1 > 0x7f goto pc+11
> >>>>>>
> >>>>>> Right, so the compiler is optimizing the two tests into a single one above,
> >>>>>> which means lower bound cannot properly be derived again by the verifier due
> >>>>>> to this and thus you'll get the error. Similar issue was seen recently [1].
> >>>>>>
> >>>>>> Does the below hack work for you?
> >>>>>>
> >>>>>> int prog([...])
> >>>>>> {
> >>>>>>          char filename[128];
> >>>>>>          int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> >>>>>>          if (ret > 0)
> >>>>>>                  bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
> >>>>>>                                        ret & (sizeof(filename) - 1));
> >>>>>>          return 1;
> >>>>>> }
> >>>>>>
> >>>>>> r0 should keep on tracking bounds here at least:
> >>>>>>
> >>>>>> prog:
> >>>>>>         0:    bf 16 00 00 00 00 00 00     r6 = r1
> >>>>>>         1:    bf a1 00 00 00 00 00 00     r1 = r10
> >>>>>>         2:    07 01 00 00 80 ff ff ff     r1 += -128
> >>>>>>         3:    b7 02 00 00 80 00 00 00     r2 = 128
> >>>>>>         4:    85 00 00 00 2d 00 00 00     call 45
> >>>>>>         5:    67 00 00 00 20 00 00 00     r0 <<= 32
> >>>>>>         6:    c7 00 00 00 20 00 00 00     r0 s>>= 32
> >>>>>>         7:    b7 01 00 00 01 00 00 00     r1 = 1
> >>>>>>         8:    6d 01 0a 00 00 00 00 00     if r1 s> r0 goto 10
> >>>>>>         9:    57 00 00 00 7f 00 00 00     r0 &= 127
> >>>>>>        10:    bf a4 00 00 00 00 00 00     r4 = r10
> >>>>>>        11:    07 04 00 00 80 ff ff ff     r4 += -128
> >>>>>>        12:    bf 61 00 00 00 00 00 00     r1 = r6
> >>>>>>        13:    18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00     r2 = 0ll
> >>>>>>        15:    18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00     r3 = 4294967295ll
> >>>>>>        17:    bf 05 00 00 00 00 00 00     r5 = r0
> >>>>>>        18:    85 00 00 00 19 00 00 00     call 25
> >>>>>>
> >>>>>>    [1] https://urldefense.proofpoint.com/v2/url?u=http-3A__patchwork.ozlabs.org_project_netdev_list_-3Fseries-3D13211&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=z0d6b_hxStA845Kh7epJ-JiFwkiWqUH_z3fEadwqAQY&e=
> >>>>>
> >>>>> Not yet:
> >>>>>
> >>>>> 6: (85) call bpf_probe_read_str#45
> >>>>> 7: (bf) r1 = r0
> >>>>> 8: (67) r1 <<= 32
> >>>>> 9: (77) r1 >>= 32
> >>>>> 10: (15) if r1 == 0x0 goto pc+10
> >>>>>   R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
> >>>>> 11: (57) r0 &= 127
> >>>>> 12: (bf) r4 = r10
> >>>>> 13: (07) r4 += -128
> >>>>> 14: (bf) r1 = r6
> >>>>> 15: (18) r2 = 0xffff92bfc2aba840u
> >>>>> 17: (18) r3 = 0xffffffff
> >>>>> 19: (bf) r5 = r0
> >>>>> 20: (85) call bpf_perf_event_output#25
> >>>>> invalid stack type R4 off=-128 access_size=0
> >>>>>
> >>>>> I'll try updating clang/llvm...
> >>>>>
> >>>>> Full details:
> >>>>>
> >>>>> [root@jouet bpf]# cat open.c
> >>>>> #include "bpf.h"
> >>>>>
> >>>>> SEC("prog=do_sys_open filename")
> >>>>> int prog(void *ctx, int err, const char __user *filename_ptr)
> >>>>> {
> >>>>>     char filename[128];
> >>>>>     const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> >>>>
> >>>> Btw, I was using 'int' here above instead of 'unsigned' as strncpy_from_unsafe()
> >>>> could potentially return errors like -EFAULT.
> >>>
> >>> I changed to int, didn't help
> >>>  
> >>>> Currently having a version compiled from the git tree:
> >>>>
> >>>> # llc --version
> >>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
> >>>>    LLVM version 6.0.0git-2d810c2
> >>>>    Optimized build.
> >>>>    Default target: x86_64-unknown-linux-gnu
> >>>>    Host CPU: skylake
> >>>
> >>> [root@jouet bpf]# llc --version
> >>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
> >>>    LLVM version 4.0.0svn
> >>>
> >>> Old stuff! ;-) Will change, but improving these messages should be on
> >>> the radar, I think :-)
> >>
> >> Yep, agree, I think we need a generic, better solution for this type of
> >> issue instead of converting individual helpers to handle 0 min bound and
> >> then only bailing out in such case; need to brainstorm a bit on that.
> >>
> >> I think for the above in your case ...
> >>
> >>   [...]
> >>    6: (85) call bpf_probe_read_str#45
> >>    7: (bf) r1 = r0
> >>    8: (67) r1 <<= 32
> >>    9: (77) r1 >>= 32
> >>   10: (15) if r1 == 0x0 goto pc+10
> >>    R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
> >>   11: (57) r0 &= 127
> >>   [...]
> >>
> >> ... the shifts on r1 might be due to using 32 bit type, so if you find

Where is it using a 32 bit type?

Here is it again, now using clang 6:

[root@jouet bpf]# trace -v -e *open,open.c  usleep 2
bpf: builtin compilation failed: -95, try external compiler
Kernel build dir is set to /lib/modules/4.14.0+/build
set env: KBUILD_DIR=/lib/modules/4.14.0+/build
unset env: KBUILD_OPTS
include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x40e00
set env: CLANG_EXEC=/usr/local/bin/clang
unset env: CLANG_OPTIONS
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
set env: WORKING_DIR=/lib/modules/4.14.0+/build
set env: CLANG_SOURCE=/home/acme/bpf/open.c
llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
libbpf: loading object 'open.c' from buffer
libbpf: section .strtab, size 103, link 0, flags 0, type=3
libbpf: section .text, size 0, link 0, flags 6, type=1
libbpf: section prog=do_sys_open filename, size 168, link 0, flags 6, type=1
libbpf: found program prog=do_sys_open filename
libbpf: section .relprog=do_sys_open filename, size 16, link 8, flags 0, type=9
libbpf: section maps, size 16, link 0, flags 3, type=1
libbpf: section license, size 4, link 0, flags 3, type=1
libbpf: license of open.c is GPL
libbpf: section version, size 4, link 0, flags 3, type=1
libbpf: kernel version of open.c is 40e00
libbpf: section .symtab, size 144, link 1, flags 0, type=2
libbpf: maps in open.c: 1 maps in 16 bytes
libbpf: map 0 is "__bpf_stdout__"
libbpf: collecting relocating info for: 'prog=do_sys_open filename'
libbpf: relocation: insn_idx=13
libbpf: relocation: find map 0 (__bpf_stdout__) for insn 13
bpf: config program 'prog=do_sys_open filename'
symbol:do_sys_open file:(null) line:0 offset:0 return:0 lazy:(null)
parsing arg: filename into filename
bpf: config 'prog=do_sys_open filename' is ok
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.14.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.14.0+/build/vmlinux
Try to find probe point from debuginfo.
Matched function: do_sys_open [2a2e5f8]
Probe point found: do_sys_open+0
Searching 'filename' variable in context.
Converting variable filename into trace event.
filename type is (null).
Opening /sys/kernel/debug/tracing//README write=0
Found 1 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:perf_bpf_probe/prog _text+2493856 filename=%si:x64
In map_prologue, ntevs=1
mapping[0]=0
libbpf: create map __bpf_stdout__: fd=3
prologue: pass validation
prologue: fast path
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf: 
0: (79) r3 = *(u64 *)(r1 +104)
1: (b7) r2 = 0
2: (bf) r6 = r1
3: (bf) r1 = r10
4: (07) r1 += -128
5: (b7) r2 = 128
6: (85) call bpf_probe_read_str#45
7: (bf) r1 = r0
8: (67) r1 <<= 32
9: (77) r1 >>= 32
10: (15) if r1 == 0x0 goto pc+10
 R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
11: (57) r0 &= 127
12: (bf) r4 = r10
13: (07) r4 += -128
14: (bf) r1 = r6
15: (18) r2 = 0xffff9b56ca5a2a80
17: (18) r3 = 0xffffffff
19: (bf) r5 = r0
20: (85) call bpf_perf_event_output#25
invalid stack type R4 off=-128 access_size=0

libbpf: -- END LOG --
libbpf: Loading the 0th instance of program 'prog=do_sys_open filename' failed
libbpf: failed to load program 'prog=do_sys_open filename'
libbpf: failed to load object 'open.c'
bpf: load objects failed
event syntax error: 'open.c'
                     \___ Kernel verifier blocks program loading

(add -v to see detail)
Run 'perf list' for a list of valid events

 Usage: perf trace [<options>] [<command>]
    or: perf trace [<options>] -- <command> [<options>]
    or: perf trace record [<options>] [<command>]
    or: perf trace record [<options>] -- <command> [<options>]

    -e, --event <event>   event/syscall selector. use 'perf list' to list available events
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Opening /sys/kernel/debug/tracing//uprobe_events write=1
Parsing probe_events: p:perf_bpf_probe/prog _text+2493856 filename=%si:x64
Group:perf_bpf_probe Event:prog probe:p
Writing event: -:perf_bpf_probe/prog
[root@jouet bpf]# /usr/local/bin/clang --version
clang version 6.0.0 (http://llvm.org/git/clang.git 56cc8f8880db2ebc433eeb6b6a707c101467a186) (http://llvm.org/git/llvm.git 3656d83960a4f3fedf6d8f19043abf52379f78c3)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
[root@jouet bpf]# cat open.c 
#include "bpf.h"

SEC("prog=do_sys_open filename")
int prog(void *ctx, int err, const char __user *filename_ptr)
{
	char filename[128];
	u64 len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
	if (len > 0)
       		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
				  len & (sizeof(filename) - 1));
	return 1;
}
[root@jouet bpf]#

> >> a way to avoid these and have the test on r0 directly, we might get there.
> >> Perhaps keep using a 64 bit type to avoid them. It would be useful to
> >> propagate the deduced bound information back to r0 when we know that
> >> neither r0 nor r1 has changed in the meantime.
> > 
> > It is tricky to do in the bpf_program. Compiler tries hard to optimize :-).
> > 
> > The issue is at "r0 &= 127".
> > 
> > 9: (6d) if r1 s> r0 goto pc+10
> >  R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
> > 10: R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
> > 10: (57) r0 &= 127
> > 11: R0=inv(id=0,umax_value=127,var_off=(0x0; 0x7f)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
> > 
> > One possible solution for this problem is to relax the arg4 type
> > from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO.
> 
> Yeah, I know, that's what I mentioned earlier in this thread to resolve it,
> but do we really want to add this hack everywhere? :( Potentially any function
> having ARG_CONST_SIZE would need to handle size 0 and bail out again in their
> helper implementation and it ends up that progs start relying on this runtime
> check where we won't be able to get rid of it later on anymore.
> 
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index a5580c6..a68d8bd 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -393,6 +393,9 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
> >                 },
> >         };
> > 
> > +       if (unlikely(size == 0))
> > +               return 0;
> > +
> >         if (unlikely(flags & ~(BPF_F_INDEX_MASK)))
> >                 return -EINVAL;
> > 
> > @@ -407,7 +410,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto = {
> >         .arg2_type      = ARG_CONST_MAP_PTR,
> >         .arg3_type      = ARG_ANYTHING,
> >         .arg4_type      = ARG_PTR_TO_MEM,
> > -       .arg5_type      = ARG_CONST_SIZE,
> > +       .arg5_type      = ARG_CONST_SIZE_OR_ZERO,
> >  };

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-20 13:31                   ` Arnaldo Carvalho de Melo
@ 2017-11-20 16:47                     ` Yonghong Song
  0 siblings, 0 replies; 20+ messages in thread
From: Yonghong Song @ 2017-11-20 16:47 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Daniel Borkmann
  Cc: Gianluca Borello, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List



On 11/20/17 5:31 AM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Nov 14, 2017 at 09:25:17PM +0100, Daniel Borkmann escreveu:
>> On 11/14/2017 07:15 PM, Yonghong Song wrote:
>>> On 11/14/17 6:19 AM, Daniel Borkmann wrote:
>>>> On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote:
>>>>> Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu:
>>>>>> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
>>>>>>> Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu:
>>>>>>>> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
>>>>>>>>> libbpf: -- BEGIN DUMP LOG ---
>>>>>>>>> libbpf:
>>>>>>>>> 0: (79) r3 = *(u64 *)(r1 +104)
>>>>>>>>> 1: (b7) r2 = 0
>>>>>>>>> 2: (bf) r6 = r1
>>>>>>>>> 3: (bf) r1 = r10
>>>>>>>>> 4: (07) r1 += -128
>>>>>>>>> 5: (b7) r2 = 128
>>>>>>>>> 6: (85) call bpf_probe_read_str#45
>>>>>>>>> 7: (bf) r1 = r0
>>>>>>>>> 8: (07) r1 += -1
>>>>>>>>> 9: (67) r1 <<= 32
>>>>>>>>> 10: (77) r1 >>= 32
>>>>>>>>> 11: (25) if r1 > 0x7f goto pc+11
>>>>>>>>
>>>>>>>> Right, so the compiler is optimizing the two tests into a single one above,
>>>>>>>> which means lower bound cannot properly be derived again by the verifier due
>>>>>>>> to this and thus you'll get the error. Similar issue was seen recently [1].
>>>>>>>>
>>>>>>>> Does the below hack work for you?
>>>>>>>>
>>>>>>>> int prog([...])
>>>>>>>> {
>>>>>>>>           char filename[128];
>>>>>>>>           int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>>>>>>>           if (ret > 0)
>>>>>>>>                   bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
>>>>>>>>                                         ret & (sizeof(filename) - 1));
>>>>>>>>           return 1;
>>>>>>>> }
>>>>>>>>
>>>>>>>> r0 should keep on tracking bounds here at least:
>>>>>>>>
>>>>>>>> prog:
>>>>>>>>          0:    bf 16 00 00 00 00 00 00     r6 = r1
>>>>>>>>          1:    bf a1 00 00 00 00 00 00     r1 = r10
>>>>>>>>          2:    07 01 00 00 80 ff ff ff     r1 += -128
>>>>>>>>          3:    b7 02 00 00 80 00 00 00     r2 = 128
>>>>>>>>          4:    85 00 00 00 2d 00 00 00     call 45
>>>>>>>>          5:    67 00 00 00 20 00 00 00     r0 <<= 32
>>>>>>>>          6:    c7 00 00 00 20 00 00 00     r0 s>>= 32
>>>>>>>>          7:    b7 01 00 00 01 00 00 00     r1 = 1
>>>>>>>>          8:    6d 01 0a 00 00 00 00 00     if r1 s> r0 goto 10
>>>>>>>>          9:    57 00 00 00 7f 00 00 00     r0 &= 127
>>>>>>>>         10:    bf a4 00 00 00 00 00 00     r4 = r10
>>>>>>>>         11:    07 04 00 00 80 ff ff ff     r4 += -128
>>>>>>>>         12:    bf 61 00 00 00 00 00 00     r1 = r6
>>>>>>>>         13:    18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00     r2 = 0ll
>>>>>>>>         15:    18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00     r3 = 4294967295ll
>>>>>>>>         17:    bf 05 00 00 00 00 00 00     r5 = r0
>>>>>>>>         18:    85 00 00 00 19 00 00 00     call 25
>>>>>>>>
>>>>>>>>     [1] https://urldefense.proofpoint.com/v2/url?u=http-3A__patchwork.ozlabs.org_project_netdev_list_-3Fseries-3D13211&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=z0d6b_hxStA845Kh7epJ-JiFwkiWqUH_z3fEadwqAQY&e=
>>>>>>>
>>>>>>> Not yet:
>>>>>>>
>>>>>>> 6: (85) call bpf_probe_read_str#45
>>>>>>> 7: (bf) r1 = r0
>>>>>>> 8: (67) r1 <<= 32
>>>>>>> 9: (77) r1 >>= 32
>>>>>>> 10: (15) if r1 == 0x0 goto pc+10
>>>>>>>    R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>>>>>>> 11: (57) r0 &= 127
>>>>>>> 12: (bf) r4 = r10
>>>>>>> 13: (07) r4 += -128
>>>>>>> 14: (bf) r1 = r6
>>>>>>> 15: (18) r2 = 0xffff92bfc2aba840u
>>>>>>> 17: (18) r3 = 0xffffffff
>>>>>>> 19: (bf) r5 = r0
>>>>>>> 20: (85) call bpf_perf_event_output#25
>>>>>>> invalid stack type R4 off=-128 access_size=0
>>>>>>>
>>>>>>> I'll try updating clang/llvm...
>>>>>>>
>>>>>>> Full details:
>>>>>>>
>>>>>>> [root@jouet bpf]# cat open.c
>>>>>>> #include "bpf.h"
>>>>>>>
>>>>>>> SEC("prog=do_sys_open filename")
>>>>>>> int prog(void *ctx, int err, const char __user *filename_ptr)
>>>>>>> {
>>>>>>>      char filename[128];
>>>>>>>      const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>>>>>
>>>>>> Btw, I was using 'int' here above instead of 'unsigned' as strncpy_from_unsafe()
>>>>>> could potentially return errors like -EFAULT.
>>>>>
>>>>> I changed to int, didn't help
>>>>>   
>>>>>> Currently having a version compiled from the git tree:
>>>>>>
>>>>>> # llc --version
>>>>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
>>>>>>     LLVM version 6.0.0git-2d810c2
>>>>>>     Optimized build.
>>>>>>     Default target: x86_64-unknown-linux-gnu
>>>>>>     Host CPU: skylake
>>>>>
>>>>> [root@jouet bpf]# llc --version
>>>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
>>>>>     LLVM version 4.0.0svn
>>>>>
>>>>> Old stuff! ;-) Will change, but improving these messages should be on
>>>>> the radar, I think :-)
>>>>
>>>> Yep, agree, I think we need a generic, better solution for this type of
>>>> issue instead of converting individual helpers to handle 0 min bound and
>>>> then only bailing out in such case; need to brainstorm a bit on that.
>>>>
>>>> I think for the above in your case ...
>>>>
>>>>    [...]
>>>>     6: (85) call bpf_probe_read_str#45
>>>>     7: (bf) r1 = r0
>>>>     8: (67) r1 <<= 32
>>>>     9: (77) r1 >>= 32
>>>>    10: (15) if r1 == 0x0 goto pc+10
>>>>     R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>>>>    11: (57) r0 &= 127
>>>>    [...]
>>>>
>>>> ... the shifts on r1 might be due to using 32 bit type, so if you find
> 
> Where is it using a 32 bit type?
> 
> Here is it again, now using clang 6:
> 
> [root@jouet bpf]# trace -v -e *open,open.c  usleep 2
> bpf: builtin compilation failed: -95, try external compiler
> Kernel build dir is set to /lib/modules/4.14.0+/build
> set env: KBUILD_DIR=/lib/modules/4.14.0+/build
> unset env: KBUILD_OPTS
> include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
> set env: NR_CPUS=4
> set env: LINUX_VERSION_CODE=0x40e00
> set env: CLANG_EXEC=/usr/local/bin/clang
> unset env: CLANG_OPTIONS
> set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
> set env: WORKING_DIR=/lib/modules/4.14.0+/build
> set env: CLANG_SOURCE=/home/acme/bpf/open.c
> llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
> libbpf: loading object 'open.c' from buffer
> libbpf: section .strtab, size 103, link 0, flags 0, type=3
> libbpf: section .text, size 0, link 0, flags 6, type=1
> libbpf: section prog=do_sys_open filename, size 168, link 0, flags 6, type=1
> libbpf: found program prog=do_sys_open filename
> libbpf: section .relprog=do_sys_open filename, size 16, link 8, flags 0, type=9
> libbpf: section maps, size 16, link 0, flags 3, type=1
> libbpf: section license, size 4, link 0, flags 3, type=1
> libbpf: license of open.c is GPL
> libbpf: section version, size 4, link 0, flags 3, type=1
> libbpf: kernel version of open.c is 40e00
> libbpf: section .symtab, size 144, link 1, flags 0, type=2
> libbpf: maps in open.c: 1 maps in 16 bytes
> libbpf: map 0 is "__bpf_stdout__"
> libbpf: collecting relocating info for: 'prog=do_sys_open filename'
> libbpf: relocation: insn_idx=13
> libbpf: relocation: find map 0 (__bpf_stdout__) for insn 13
> bpf: config program 'prog=do_sys_open filename'
> symbol:do_sys_open file:(null) line:0 offset:0 return:0 lazy:(null)
> parsing arg: filename into filename
> bpf: config 'prog=do_sys_open filename' is ok
> Looking at the vmlinux_path (8 entries long)
> Using /lib/modules/4.14.0+/build/vmlinux for symbols
> Open Debuginfo file: /lib/modules/4.14.0+/build/vmlinux
> Try to find probe point from debuginfo.
> Matched function: do_sys_open [2a2e5f8]
> Probe point found: do_sys_open+0
> Searching 'filename' variable in context.
> Converting variable filename into trace event.
> filename type is (null).
> Opening /sys/kernel/debug/tracing//README write=0
> Found 1 probe_trace_events.
> Opening /sys/kernel/debug/tracing//kprobe_events write=1
> Writing event: p:perf_bpf_probe/prog _text+2493856 filename=%si:x64
> In map_prologue, ntevs=1
> mapping[0]=0
> libbpf: create map __bpf_stdout__: fd=3
> prologue: pass validation
> prologue: fast path
> libbpf: load bpf program failed: Permission denied
> libbpf: -- BEGIN DUMP LOG ---
> libbpf:
> 0: (79) r3 = *(u64 *)(r1 +104)
> 1: (b7) r2 = 0
> 2: (bf) r6 = r1
> 3: (bf) r1 = r10
> 4: (07) r1 += -128
> 5: (b7) r2 = 128
> 6: (85) call bpf_probe_read_str#45
> 7: (bf) r1 = r0
> 8: (67) r1 <<= 32
> 9: (77) r1 >>= 32

This shift is due to the return type of bpf_probe_read_str which is int.
The compiler does a sign extension to get it to u64 type.

> 10: (15) if r1 == 0x0 goto pc+10

Since it is u64 type, the compiler uses "==" instead of ">".

>   R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
> 11: (57) r0 &= 127
> 12: (bf) r4 = r10
> 13: (07) r4 += -128
> 14: (bf) r1 = r6
> 15: (18) r2 = 0xffff9b56ca5a2a80
> 17: (18) r3 = 0xffffffff
> 19: (bf) r5 = r0

r5 = r0, and r0 is still possbily 0.
    r0 = bpf_probe_read_str
    ...
    r0 &= 127
    ...

The same symptom as "int len" type.

> 20: (85) call bpf_perf_event_output#25
> invalid stack type R4 off=-128 access_size=0
> 
> libbpf: -- END LOG --
> libbpf: Loading the 0th instance of program 'prog=do_sys_open filename' failed
> libbpf: failed to load program 'prog=do_sys_open filename'
> libbpf: failed to load object 'open.c'
> bpf: load objects failed
> event syntax error: 'open.c'
>                       \___ Kernel verifier blocks program loading
> 
> (add -v to see detail)
> Run 'perf list' for a list of valid events
> 
>   Usage: perf trace [<options>] [<command>]
>      or: perf trace [<options>] -- <command> [<options>]
>      or: perf trace record [<options>] [<command>]
>      or: perf trace record [<options>] -- <command> [<options>]
> 
>      -e, --event <event>   event/syscall selector. use 'perf list' to list available events
> Opening /sys/kernel/debug/tracing//kprobe_events write=1
> Opening /sys/kernel/debug/tracing//uprobe_events write=1
> Parsing probe_events: p:perf_bpf_probe/prog _text+2493856 filename=%si:x64
> Group:perf_bpf_probe Event:prog probe:p
> Writing event: -:perf_bpf_probe/prog
> [root@jouet bpf]# /usr/local/bin/clang --version
> clang version 6.0.0 (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_git_clang.git&d=DwIDAw&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=lALOhwxSDXp7fA6co6UyLBnLfqv0e2pf97GCuu5hGSw&s=FGoodXMmAdx4dqZJtvq6Vosu03to1-GFIRbagBiz0fg&e= 56cc8f8880db2ebc433eeb6b6a707c101467a186) (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_git_llvm.git&d=DwIDAw&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=lALOhwxSDXp7fA6co6UyLBnLfqv0e2pf97GCuu5hGSw&s=_pQdnamFeu6n_um7WGrLwaCMtXqIVGLLEmYbcss_0aU&e= 3656d83960a4f3fedf6d8f19043abf52379f78c3)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
> InstalledDir: /usr/local/bin
> [root@jouet bpf]# cat open.c
> #include "bpf.h"
> 
> SEC("prog=do_sys_open filename")
> int prog(void *ctx, int err, const char __user *filename_ptr)
> {
> 	char filename[128];
> 	u64 len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> 	if (len > 0)
>         		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
> 				  len & (sizeof(filename) - 1));
> 	return 1;
> }
> [root@jouet bpf]#
> 
>>>> a way to avoid these and have the test on r0 directly, we might get there.
>>>> Perhaps keep using a 64 bit type to avoid them. It would be useful to
>>>> propagate the deduced bound information back to r0 when we know that
>>>> neither r0 nor r1 has changed in the meantime.
>>>
>>> It is tricky to do in the bpf_program. Compiler tries hard to optimize :-).
>>>
>>> The issue is at "r0 &= 127".
>>>
>>> 9: (6d) if r1 s> r0 goto pc+10
>>>   R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
>>> 10: R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
>>> 10: (57) r0 &= 127
>>> 11: R0=inv(id=0,umax_value=127,var_off=(0x0; 0x7f)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
>>>
>>> One possible solution for this problem is to relax the arg4 type
>>> from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO.
>>
>> Yeah, I know, that's what I mentioned earlier in this thread to resolve it,
>> but do we really want to add this hack everywhere? :( Potentially any function
>> having ARG_CONST_SIZE would need to handle size 0 and bail out again in their
>> helper implementation and it ends up that progs start relying on this runtime
>> check where we won't be able to get rid of it later on anymore.
>>
>>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>>> index a5580c6..a68d8bd 100644
>>> --- a/kernel/trace/bpf_trace.c
>>> +++ b/kernel/trace/bpf_trace.c
>>> @@ -393,6 +393,9 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
>>>                  },
>>>          };
>>>
>>> +       if (unlikely(size == 0))
>>> +               return 0;
>>> +
>>>          if (unlikely(flags & ~(BPF_F_INDEX_MASK)))
>>>                  return -EINVAL;
>>>
>>> @@ -407,7 +410,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto = {
>>>          .arg2_type      = ARG_CONST_MAP_PTR,
>>>          .arg3_type      = ARG_ANYTHING,
>>>          .arg4_type      = ARG_PTR_TO_MEM,
>>> -       .arg5_type      = ARG_CONST_SIZE,
>>> +       .arg5_type      = ARG_CONST_SIZE_OR_ZERO,
>>>   };

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-14 22:58                   ` Yonghong Song
@ 2017-11-21 14:29                     ` Arnaldo Carvalho de Melo
  2017-11-21 22:31                       ` Alexei Starovoitov
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-21 14:29 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Daniel Borkmann, Gianluca Borello, Alexei Starovoitov,
	David Miller, Linux Networking Development Mailing List

Em Tue, Nov 14, 2017 at 02:58:24PM -0800, Yonghong Song escreveu:
> On 11/14/17 12:25 PM, Daniel Borkmann wrote:
> > Yeah, I know, that's what I mentioned earlier in this thread to resolve it,
> > but do we really want to add this hack everywhere? :( Potentially any function
> > having ARG_CONST_SIZE would need to handle size 0 and bail out again in their
> > helper implementation and it ends up that progs start relying on this runtime
> > check where we won't be able to get rid of it later on anymore.
 
> The compiler actually does the right thing for the below code:
>          int ret = bpf_probe_read_str(filename, sizeof(filename),
>                                       filename_ptr);
>          if (ret > 0)
>            bpf_perf_event_output(ctx, &__bpf_stdout__,BPF_F_CURRENT_CPU,
>                            filename, ret & (sizeof(filename) - 1));
 
> Just from the above code without consulting bpf_probe_read_str internals, it
> is totally possible that ret = 128, then
> ret & (sizeof(filename) - 1) = 0.

> The issue is that the verifier did not set the "ret" initial range as (-inf,
> sizeof(filename) - 1). We could have this information associated with helper
> and feed back to verifier.
 
> If we have this range, later for ret & (sizeof(filename) - 1) with ret >= 1,
> the verifier should be able to conclude
>  ret & (sizeof(filename) - 1) >= 1.
 
> To workaround the immediate problem, I tested the following hack
> with bcc and it works fine.
 
> BPF_PERF_OUTPUT(events);
> int trace(struct pt_regs *ctx) {
>   char filename[128];
>   int ret = bpf_probe_read_str(filename, sizeof(filename), 0);
>   if (ret > 0) {
>     if (ret == 1)
>           events.perf_submit(ctx, filename, ret);
>     else if (ret < 128)
>           events.perf_submit(ctx, filename, ret);
>   }
>   return 1;
> }
 
> The idea is to make control flow more complex to prevent llvm
> do certain optimizations.

So, the hack makes it work for me, using clang 6.0:

set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x40e00
set env: CLANG_EXEC=/usr/local/bin/clang
unset env: CLANG_OPTIONS
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
set env: WORKING_DIR=/lib/modules/4.14.0+/build
set env: CLANG_SOURCE=/home/acme/bpf/open.c
llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -

[root@jouet bpf]# perf probe -V do_sys_open
Available variables at do_sys_open
        @<do_sys_open+0>
                char*   filename
                int     dfd
                int     flags
                struct open_flags       op
                umode_t mode
[root@jouet bpf]# cat open.c 
#include "bpf.h"

SEC("prog=do_sys_open filename")
int prog(void *ctx, int err, char *filename_ptr)
{
	char filename[128];
	int len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr); 
	if (len > 0) {
		if (len == 1)
       			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
		else if (len < 128)
       			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
	}
	return 1;
}
[root@jouet bpf]#
[root@jouet bpf]# perf trace -e *open,open.c touch /tmp/Thanks.Yonghong.Song\!
LLVM: dumping open.o
     0.000 ( 0.009 ms): touch/9034 open(filename: 0x5b678e37, flags: CLOEXEC                             ) ...
     0.009 (         ): __bpf_stdout__:/etc/ld.so.cache....)
     0.011 (         ): perf_bpf_probe:prog:(ffffffff8f260da0) filename=0x7f805b678e37)
     0.000 ( 0.016 ms): touch/9034  ... [continued]: open()) = 3
     0.034 ( 0.002 ms): touch/9034 open(filename: 0x5b87c640, flags: CLOEXEC                             ) ...
     0.036 (         ): __bpf_stdout__:/lib64/libc.so.6....)
     0.037 (         ): perf_bpf_probe:prog:(ffffffff8f260da0) filename=0x7f805b87c640)
     0.034 ( 0.009 ms): touch/9034  ... [continued]: open()) = 3
     0.251 ( 0.002 ms): touch/9034 open(filename: 0x5b422c70, flags: CLOEXEC                             ) ...
     0.253 (         ): __bpf_stdout__:/usr/lib/locale/locale-archive......)
     0.254 (         ): perf_bpf_probe:prog:(ffffffff8f260da0) filename=0x7f805b422c70)
     0.251 ( 0.009 ms): touch/9034  ... [continued]: open()) = 3
     0.296 ( 0.002 ms): touch/9034 open(filename: 0x1d3a00f1, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) ...
     0.298 (         ): __bpf_stdout__:/tmp/Thanks.Yonghong.Song!..)
     0.299 (         ): perf_bpf_probe:prog:(ffffffff8f260da0) filename=0x7ffd1d3a00f1)
     0.296 ( 0.009 ms): touch/9034  ... [continued]: open()) = 3
[root@jouet bpf]#

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-21 14:29                     ` Arnaldo Carvalho de Melo
@ 2017-11-21 22:31                       ` Alexei Starovoitov
  2017-11-22 18:42                         ` Gianluca Borello
  0 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2017-11-21 22:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Yonghong Song, Daniel Borkmann, Gianluca Borello,
	Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List

On Tue, Nov 21, 2017 at 11:29:05AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Nov 14, 2017 at 02:58:24PM -0800, Yonghong Song escreveu:
> > On 11/14/17 12:25 PM, Daniel Borkmann wrote:
> > > Yeah, I know, that's what I mentioned earlier in this thread to resolve it,
> > > but do we really want to add this hack everywhere? :( Potentially any function
> > > having ARG_CONST_SIZE would need to handle size 0 and bail out again in their
> > > helper implementation and it ends up that progs start relying on this runtime
> > > check where we won't be able to get rid of it later on anymore.
>  
> > The compiler actually does the right thing for the below code:
> >          int ret = bpf_probe_read_str(filename, sizeof(filename),
> >                                       filename_ptr);
> >          if (ret > 0)
> >            bpf_perf_event_output(ctx, &__bpf_stdout__,BPF_F_CURRENT_CPU,
> >                            filename, ret & (sizeof(filename) - 1));
>  
> > Just from the above code without consulting bpf_probe_read_str internals, it
> > is totally possible that ret = 128, then
> > ret & (sizeof(filename) - 1) = 0.
> 
> > The issue is that the verifier did not set the "ret" initial range as (-inf,
> > sizeof(filename) - 1). We could have this information associated with helper
> > and feed back to verifier.
>  
> > If we have this range, later for ret & (sizeof(filename) - 1) with ret >= 1,
> > the verifier should be able to conclude
> >  ret & (sizeof(filename) - 1) >= 1.
>  
> > To workaround the immediate problem, I tested the following hack
> > with bcc and it works fine.
>  
> > BPF_PERF_OUTPUT(events);
> > int trace(struct pt_regs *ctx) {
> >   char filename[128];
> >   int ret = bpf_probe_read_str(filename, sizeof(filename), 0);
> >   if (ret > 0) {
> >     if (ret == 1)
> >           events.perf_submit(ctx, filename, ret);
> >     else if (ret < 128)
> >           events.perf_submit(ctx, filename, ret);
...
> 
> SEC("prog=do_sys_open filename")
> int prog(void *ctx, int err, char *filename_ptr)
> {
> 	char filename[128];
> 	int len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr); 
> 	if (len > 0) {
> 		if (len == 1)
>        			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
> 		else if (len < 128)
>        			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);

yeah sorry about this hack. Gianluca reported this issue as well.
Yonghong fixed it for bpf_probe_read only. We will extend
the fix to bpf_probe_read_str() and bpf_perf_event_output() asap.
The above workaround gets too much into llvm and verifier details
we should strive to make bpf program writing as easy as possible.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-21 22:31                       ` Alexei Starovoitov
@ 2017-11-22 18:42                         ` Gianluca Borello
  2018-01-22 15:06                           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Gianluca Borello @ 2017-11-22 18:42 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Yonghong Song, Daniel Borkmann, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List

On Tue, Nov 21, 2017 at 2:31 PM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> yeah sorry about this hack. Gianluca reported this issue as well.
> Yonghong fixed it for bpf_probe_read only. We will extend
> the fix to bpf_probe_read_str() and bpf_perf_event_output() asap.
> The above workaround gets too much into llvm and verifier details
> we should strive to make bpf program writing as easy as possible.
>

Hi Arnaldo

With the help of Alexei, Daniel and Yonghong I just submitted a new
series ("bpf: fix semantics issues with helpers receiving NULL
arguments") that includes a fix in bpf_perf_event_output. This should
simplify the way you write your bpf programs, so you shouldn't be
required to write those convoluted branches anymore (there are a few
usage examples in the commit log).

In my case it made writing the code much easier, after applying it I
haven't been surprised by the compiler output in a while, and I hope
your experience will be improved as well.

Thanks

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2017-11-22 18:42                         ` Gianluca Borello
@ 2018-01-22 15:06                           ` Arnaldo Carvalho de Melo
  2018-01-22 18:28                             ` Yonghong Song
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-22 15:06 UTC (permalink / raw)
  To: Gianluca Borello
  Cc: Yonghong Song, Daniel Borkmann, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List

Em Wed, Nov 22, 2017 at 10:42:22AM -0800, Gianluca Borello escreveu:
> On Tue, Nov 21, 2017 at 2:31 PM, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > yeah sorry about this hack. Gianluca reported this issue as well.
> > Yonghong fixed it for bpf_probe_read only. We will extend
> > the fix to bpf_probe_read_str() and bpf_perf_event_output() asap.
> > The above workaround gets too much into llvm and verifier details
> > we should strive to make bpf program writing as easy as possible.
> >
> 
> Hi Arnaldo
> 
> With the help of Alexei, Daniel and Yonghong I just submitted a new
> series ("bpf: fix semantics issues with helpers receiving NULL
> arguments") that includes a fix in bpf_perf_event_output. This should
> simplify the way you write your bpf programs, so you shouldn't be
> required to write those convoluted branches anymore (there are a few
> usage examples in the commit log).
> 
> In my case it made writing the code much easier, after applying it I
> haven't been surprised by the compiler output in a while, and I hope
> your experience will be improved as well.

Trying to work with this again, and I still need to trick clang into not
doing some optimizations that end up getting the resulting eBPF object
rejected by the kernel verifier:

[root@jouet bpf]# uname -a
Linux jouet 4.15.0-rc8+ #1 SMP Wed Jan 17 11:01:34 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@jouet bpf]# grep -i bpf /lib/modules/`uname -r`/build/.config
CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT_ALWAYS_ON=y
# CONFIG_NETFILTER_XT_MATCH_BPF is not set
# CONFIG_NET_CLS_BPF is not set
# CONFIG_NET_ACT_BPF is not set
CONFIG_BPF_JIT=y
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_EVENTS=y
# CONFIG_TEST_BPF is not set
[root@jouet bpf]# cat sys_enter_open.c
#include "bpf.h"

SEC("syscalls:sys_enter_open")
int func(void *ctx)
{
	struct {
		char *ptr;
		char path[256];
	} filename = {
	/*
	 * /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format:
	 * 
	 * ...
	 * field:const char * filename;	offset:16;	size:8;	signed:0;
	 * ...
	 * ctx + 16 selects 'filename'
	 */
		.ptr = *((char **)(ctx + 16)),
	};
	int len = bpf_probe_read_str(filename.path, sizeof(filename.path), filename.ptr);
	if (len > 0 && len < 256)
		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, len + sizeof(filename.ptr));
	return 0;
}
[root@jouet bpf]# 
[root@jouet bpf]# perf trace -v -e open,sys_enter_open.c
bpf: builtin compilation failed: -95, try external compiler
Kernel build dir is set to /lib/modules/4.15.0-rc8+/build
set env: KBUILD_DIR=/lib/modules/4.15.0-rc8+/build
unset env: KBUILD_OPTS
include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x40f00
set env: CLANG_EXEC=/usr/local/bin/clang
unset env: CLANG_OPTIONS
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
set env: WORKING_DIR=/lib/modules/4.15.0-rc8+/build
set env: CLANG_SOURCE=/home/acme/bpf/sys_enter_open.c
llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
libbpf: loading object 'sys_enter_open.c' from buffer
libbpf: section .strtab, size 101, link 0, flags 0, type=3
libbpf: section .text, size 0, link 0, flags 6, type=1
libbpf: section syscalls:sys_enter_open, size 472, link 0, flags 6, type=1
libbpf: found program syscalls:sys_enter_open
libbpf: section .relsyscalls:sys_enter_open, size 16, link 8, flags 0, type=9
libbpf: section maps, size 16, link 0, flags 3, type=1
libbpf: section license, size 4, link 0, flags 3, type=1
libbpf: license of sys_enter_open.c is GPL
libbpf: section version, size 4, link 0, flags 3, type=1
libbpf: kernel version of sys_enter_open.c is 40f00
libbpf: section .symtab, size 144, link 1, flags 0, type=2
libbpf: maps in sys_enter_open.c: 1 maps in 16 bytes
libbpf: map 0 is "__bpf_stdout__"
libbpf: collecting relocating info for: 'syscalls:sys_enter_open'
libbpf: relocation: insn_idx=51
libbpf: relocation: find map 0 (__bpf_stdout__) for insn 51
LLVM: dumping sys_enter_open.o
bpf: config program 'syscalls:sys_enter_open'
libbpf: create map __bpf_stdout__: fd=3
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf: 
0: (bf) r6 = r1
1: (b7) r1 = 0
2: (7b) *(u64 *)(r10 -8) = r1
3: (7b) *(u64 *)(r10 -16) = r1
4: (7b) *(u64 *)(r10 -24) = r1
5: (7b) *(u64 *)(r10 -32) = r1
6: (7b) *(u64 *)(r10 -40) = r1
7: (7b) *(u64 *)(r10 -48) = r1
8: (7b) *(u64 *)(r10 -56) = r1
9: (7b) *(u64 *)(r10 -64) = r1
10: (7b) *(u64 *)(r10 -72) = r1
11: (7b) *(u64 *)(r10 -80) = r1
12: (7b) *(u64 *)(r10 -88) = r1
13: (7b) *(u64 *)(r10 -96) = r1
14: (7b) *(u64 *)(r10 -104) = r1
15: (7b) *(u64 *)(r10 -112) = r1
16: (7b) *(u64 *)(r10 -120) = r1
17: (7b) *(u64 *)(r10 -128) = r1
18: (7b) *(u64 *)(r10 -136) = r1
19: (7b) *(u64 *)(r10 -144) = r1
20: (7b) *(u64 *)(r10 -152) = r1
21: (7b) *(u64 *)(r10 -160) = r1
22: (7b) *(u64 *)(r10 -168) = r1
23: (7b) *(u64 *)(r10 -176) = r1
24: (7b) *(u64 *)(r10 -184) = r1
25: (7b) *(u64 *)(r10 -192) = r1
26: (7b) *(u64 *)(r10 -200) = r1
27: (7b) *(u64 *)(r10 -208) = r1
28: (7b) *(u64 *)(r10 -216) = r1
29: (7b) *(u64 *)(r10 -224) = r1
30: (7b) *(u64 *)(r10 -232) = r1
31: (7b) *(u64 *)(r10 -240) = r1
32: (7b) *(u64 *)(r10 -248) = r1
33: (7b) *(u64 *)(r10 -256) = r1
34: (79) r3 = *(u64 *)(r6 +16)
35: (7b) *(u64 *)(r10 -264) = r3
36: (bf) r1 = r10
37: (07) r1 += -256
38: (b7) r2 = 256
39: (85) call bpf_probe_read_str#45
40: (bf) r1 = r0
41: (07) r1 += -1
42: (67) r1 <<= 32
43: (77) r1 >>= 32
44: (25) if r1 > 0xfe goto pc+12
 R0=inv(id=0) R1=inv(id=0,umax_value=254,var_off=(0x0; 0xff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
45: (67) r0 <<= 32
46: (c7) r0 s>>= 32
47: (07) r0 += 8
48: (bf) r4 = r10
49: (07) r4 += -264
50: (bf) r1 = r6
51: (18) r2 = 0xffff957fc90e1000
53: (18) r3 = 0xffffffff
55: (bf) r5 = r0
56: (85) call bpf_perf_event_output#25
R5 min value is negative, either use unsigned or 'var &= const'

libbpf: -- END LOG --
libbpf: failed to load program 'syscalls:sys_enter_open'
libbpf: failed to load object 'sys_enter_open.c'
bpf: load objects failed
event syntax error: 'sys_enter_open.c'
                     \___ Kernel verifier blocks program loading

(add -v to see detail)
Run 'perf list' for a list of valid events

 Usage: perf trace [<options>] [<command>]
    or: perf trace [<options>] -- <command> [<options>]
    or: perf trace record [<options>] [<command>]
    or: perf trace record [<options>] -- <command> [<options>]

    -e, --event <event>   event/syscall selector. use 'perf list' to list available events
[root@jouet bpf]#

If I use the version we came up before, it works:

[root@jouet bpf]# trace -e open,sys_enter_open.c
LLVM: dumping sys_enter_open.o
  2171.820 (         ): __bpf_stdout__:@......./proc/self/task/14683/comm..)
  2171.845 ( 0.339 ms): qemu-system-x8/6721 open(filename: /proc/self/task/14683/comm, flags: RDWR                ) = 91
^C[root@jouet bpf]
[root@jouet bpf]# trace -e open,sys_enter_open.o
  2485.416 (         ): __bpf_stdout__:Vm..u.../proc/loadavg.......)
  2485.434 ( 0.210 ms): lighttpd/25120 open(filename: /proc/loadavg, mode: ISGID|IXOTH                       ) = 8
^C[root@jouet bpf]# 

[root@jouet bpf]cat sys_enter_open.c
#include "bpf.h"

SEC("syscalls:sys_enter_open")
int func(void *ctx)
{
	struct {
		char *ptr;
		char path[256];
	} filename = {
	/*
	 * /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format:
	 * 
	 * ...
	 * field:const char * filename;	offset:16;	size:8;	signed:0;
	 * ...
	 * ctx + 16 selects 'filename'
	 */
		.ptr = *((char **)(ctx + 16)),
	};
	int len = bpf_probe_read_str(filename.path, sizeof(filename.path), filename.ptr);
	if (len > 0) {
	       	if (len == 1)
			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, len + sizeof(filename.ptr));
		else if (len < 256)
			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, len + sizeof(filename.ptr));
	}
	return 0;
}
[root@jouet bpf]#

[acme@jouet perf]$ llc --version | head -16
LLVM (http://llvm.org/):
  LLVM version 6.0.0svn
  DEBUG build with assertions.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: broadwell

  Registered Targets:
    aarch64    - AArch64 (little endian)
    aarch64_be - AArch64 (big endian)
    amdgcn     - AMD GCN GPUs
    arm        - ARM
    arm64      - ARM64 (little endian)
    armeb      - ARM (big endian)
    bpf        - BPF (host endian)
    bpfeb      - BPF (big endian)
    bpfel      - BPF (little endian)
[acme@jouet perf]$ 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2018-01-22 15:06                           ` Arnaldo Carvalho de Melo
@ 2018-01-22 18:28                             ` Yonghong Song
  2018-01-22 20:52                               ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Yonghong Song @ 2018-01-22 18:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Gianluca Borello
  Cc: Daniel Borkmann, Alexei Starovoitov, David Miller,
	Linux Networking Development Mailing List



On 1/22/18 7:06 AM, Arnaldo Carvalho de Melo wrote:
> Em Wed, Nov 22, 2017 at 10:42:22AM -0800, Gianluca Borello escreveu:
>> On Tue, Nov 21, 2017 at 2:31 PM, Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>>>
>>> yeah sorry about this hack. Gianluca reported this issue as well.
>>> Yonghong fixed it for bpf_probe_read only. We will extend
>>> the fix to bpf_probe_read_str() and bpf_perf_event_output() asap.
>>> The above workaround gets too much into llvm and verifier details
>>> we should strive to make bpf program writing as easy as possible.
>>>
>>
>> Hi Arnaldo
>>
>> With the help of Alexei, Daniel and Yonghong I just submitted a new
>> series ("bpf: fix semantics issues with helpers receiving NULL
>> arguments") that includes a fix in bpf_perf_event_output. This should
>> simplify the way you write your bpf programs, so you shouldn't be
>> required to write those convoluted branches anymore (there are a few
>> usage examples in the commit log).
>>
>> In my case it made writing the code much easier, after applying it I
>> haven't been surprised by the compiler output in a while, and I hope
>> your experience will be improved as well.
> 
> Trying to work with this again, and I still need to trick clang into not
> doing some optimizations that end up getting the resulting eBPF object
> rejected by the kernel verifier:
> 
> [root@jouet bpf]# uname -a
> Linux jouet 4.15.0-rc8+ #1 SMP Wed Jan 17 11:01:34 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
> [root@jouet bpf]# grep -i bpf /lib/modules/`uname -r`/build/.config
> CONFIG_CGROUP_BPF=y
> CONFIG_BPF=y
> CONFIG_BPF_SYSCALL=y
> CONFIG_BPF_JIT_ALWAYS_ON=y
> # CONFIG_NETFILTER_XT_MATCH_BPF is not set
> # CONFIG_NET_CLS_BPF is not set
> # CONFIG_NET_ACT_BPF is not set
> CONFIG_BPF_JIT=y
> CONFIG_BPF_STREAM_PARSER=y
> CONFIG_LWTUNNEL_BPF=y
> CONFIG_HAVE_EBPF_JIT=y
> CONFIG_BPF_EVENTS=y
> # CONFIG_TEST_BPF is not set
> [root@jouet bpf]# cat sys_enter_open.c
> #include "bpf.h"
> 
> SEC("syscalls:sys_enter_open")
> int func(void *ctx)
> {
> 	struct {
> 		char *ptr;
> 		char path[256];
> 	} filename = {
> 	/*
> 	 * /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format:
> 	 *
> 	 * ...
> 	 * field:const char * filename;	offset:16;	size:8;	signed:0;
> 	 * ...
> 	 * ctx + 16 selects 'filename'
> 	 */
> 		.ptr = *((char **)(ctx + 16)),
> 	};
> 	int len = bpf_probe_read_str(filename.path, sizeof(filename.path), filename.ptr);
> 	if (len > 0 && len < 256)
> 		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, len + sizeof(filename.ptr));
> 	return 0;
> }
> [root@jouet bpf]#
> [root@jouet bpf]# perf trace -v -e open,sys_enter_open.c
> bpf: builtin compilation failed: -95, try external compiler
> Kernel build dir is set to /lib/modules/4.15.0-rc8+/build
> set env: KBUILD_DIR=/lib/modules/4.15.0-rc8+/build
> unset env: KBUILD_OPTS
> include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
> set env: NR_CPUS=4
> set env: LINUX_VERSION_CODE=0x40f00
> set env: CLANG_EXEC=/usr/local/bin/clang
> unset env: CLANG_OPTIONS
> set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
> set env: WORKING_DIR=/lib/modules/4.15.0-rc8+/build
> set env: CLANG_SOURCE=/home/acme/bpf/sys_enter_open.c
> llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
> libbpf: loading object 'sys_enter_open.c' from buffer
> libbpf: section .strtab, size 101, link 0, flags 0, type=3
> libbpf: section .text, size 0, link 0, flags 6, type=1
> libbpf: section syscalls:sys_enter_open, size 472, link 0, flags 6, type=1
> libbpf: found program syscalls:sys_enter_open
> libbpf: section .relsyscalls:sys_enter_open, size 16, link 8, flags 0, type=9
> libbpf: section maps, size 16, link 0, flags 3, type=1
> libbpf: section license, size 4, link 0, flags 3, type=1
> libbpf: license of sys_enter_open.c is GPL
> libbpf: section version, size 4, link 0, flags 3, type=1
> libbpf: kernel version of sys_enter_open.c is 40f00
> libbpf: section .symtab, size 144, link 1, flags 0, type=2
> libbpf: maps in sys_enter_open.c: 1 maps in 16 bytes
> libbpf: map 0 is "__bpf_stdout__"
> libbpf: collecting relocating info for: 'syscalls:sys_enter_open'
> libbpf: relocation: insn_idx=51
> libbpf: relocation: find map 0 (__bpf_stdout__) for insn 51
> LLVM: dumping sys_enter_open.o
> bpf: config program 'syscalls:sys_enter_open'
> libbpf: create map __bpf_stdout__: fd=3
> libbpf: load bpf program failed: Permission denied
> libbpf: -- BEGIN DUMP LOG ---
> libbpf:
> 0: (bf) r6 = r1
> 1: (b7) r1 = 0
> 2: (7b) *(u64 *)(r10 -8) = r1
> 3: (7b) *(u64 *)(r10 -16) = r1
> 4: (7b) *(u64 *)(r10 -24) = r1
> 5: (7b) *(u64 *)(r10 -32) = r1
> 6: (7b) *(u64 *)(r10 -40) = r1
> 7: (7b) *(u64 *)(r10 -48) = r1
> 8: (7b) *(u64 *)(r10 -56) = r1
> 9: (7b) *(u64 *)(r10 -64) = r1
> 10: (7b) *(u64 *)(r10 -72) = r1
> 11: (7b) *(u64 *)(r10 -80) = r1
> 12: (7b) *(u64 *)(r10 -88) = r1
> 13: (7b) *(u64 *)(r10 -96) = r1
> 14: (7b) *(u64 *)(r10 -104) = r1
> 15: (7b) *(u64 *)(r10 -112) = r1
> 16: (7b) *(u64 *)(r10 -120) = r1
> 17: (7b) *(u64 *)(r10 -128) = r1
> 18: (7b) *(u64 *)(r10 -136) = r1
> 19: (7b) *(u64 *)(r10 -144) = r1
> 20: (7b) *(u64 *)(r10 -152) = r1
> 21: (7b) *(u64 *)(r10 -160) = r1
> 22: (7b) *(u64 *)(r10 -168) = r1
> 23: (7b) *(u64 *)(r10 -176) = r1
> 24: (7b) *(u64 *)(r10 -184) = r1
> 25: (7b) *(u64 *)(r10 -192) = r1
> 26: (7b) *(u64 *)(r10 -200) = r1
> 27: (7b) *(u64 *)(r10 -208) = r1
> 28: (7b) *(u64 *)(r10 -216) = r1
> 29: (7b) *(u64 *)(r10 -224) = r1
> 30: (7b) *(u64 *)(r10 -232) = r1
> 31: (7b) *(u64 *)(r10 -240) = r1
> 32: (7b) *(u64 *)(r10 -248) = r1
> 33: (7b) *(u64 *)(r10 -256) = r1
> 34: (79) r3 = *(u64 *)(r6 +16)
> 35: (7b) *(u64 *)(r10 -264) = r3
> 36: (bf) r1 = r10
> 37: (07) r1 += -256
> 38: (b7) r2 = 256
> 39: (85) call bpf_probe_read_str#45
> 40: (bf) r1 = r0
> 41: (07) r1 += -1
> 42: (67) r1 <<= 32
> 43: (77) r1 >>= 32
> 44: (25) if r1 > 0xfe goto pc+12
>   R0=inv(id=0) R1=inv(id=0,umax_value=254,var_off=(0x0; 0xff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
> 45: (67) r0 <<= 32
> 46: (c7) r0 s>>= 32
> 47: (07) r0 += 8
> 48: (bf) r4 = r10
> 49: (07) r4 += -264
> 50: (bf) r1 = r6
> 51: (18) r2 = 0xffff957fc90e1000
> 53: (18) r3 = 0xffffffff
> 55: (bf) r5 = r0
> 56: (85) call bpf_perf_event_output#25
> R5 min value is negative, either use unsigned or 'var &= const'

Yes, I can reproduce the issue as well. The reason is again due to
compiler optimization,
    > 39: (85) call bpf_probe_read_str#45
    > 40: (bf) r1 = r0
    > 41: (07) r1 += -1
    > 42: (67) r1 <<= 32
    > 43: (77) r1 >>= 32
    > 44: (25) if r1 > 0xfe goto pc+12
    >   R0=inv(id=0) R1=inv(id=0,umax_value=254,var_off=(0x0; 0xff)) 
R6=ctx(id=0,off=0,imm=0) R10=fp0
    > 45: (67) r0 <<= 32
    > 46: (c7) r0 s>>= 32
    > 47: (07) r0 += 8
    > 48: (bf) r4 = r10
    > 49: (07) r4 += -264
    > 50: (bf) r1 = r6
    > 51: (18) r2 = 0xffff957fc90e1000
    > 53: (18) r3 = 0xffffffff
    > 55: (bf) r5 = r0
    > 56: (85) call bpf_perf_event_output#25
    > R5 min value is negative, either use unsigned or 'var &= const'

The compiler did "40: (bf) r1 = r0" and then uses "r1" for branch 
comparison, the original "r0" is left with complete unknown integer 
value and later used to calculate the buffer size "55: (bf) r5 = r0"
where "r5" could be negative value and the verifier rightfully
complains.

There is no easy way to fix this in verifier unless verifier starts to 
track correlations between registers which is a big task. So your below 
workaround is okay. The below workaround should also work:

         int len = bpf_probe_read_str(filename.path, 
sizeof(filename.path), filename.ptr);
         if (len > 0 && len < 256)
                 bpf_perf_event_output(ctx, &my_map, BPF_F_CURRENT_CPU, 
&filename, (len & 0xff) + sizeof(filename.ptr));
         return 0;

Thanks,

> 
> libbpf: -- END LOG --
> libbpf: failed to load program 'syscalls:sys_enter_open'
> libbpf: failed to load object 'sys_enter_open.c'
> bpf: load objects failed
> event syntax error: 'sys_enter_open.c'
>                       \___ Kernel verifier blocks program loading
> 
> (add -v to see detail)
> Run 'perf list' for a list of valid events
> 
>   Usage: perf trace [<options>] [<command>]
>      or: perf trace [<options>] -- <command> [<options>]
>      or: perf trace record [<options>] [<command>]
>      or: perf trace record [<options>] -- <command> [<options>]
> 
>      -e, --event <event>   event/syscall selector. use 'perf list' to list available events
> [root@jouet bpf]#
> 
> If I use the version we came up before, it works:
> 
> [root@jouet bpf]# trace -e open,sys_enter_open.c
> LLVM: dumping sys_enter_open.o
>    2171.820 (         ): __bpf_stdout__:@......./proc/self/task/14683/comm..)
>    2171.845 ( 0.339 ms): qemu-system-x8/6721 open(filename: /proc/self/task/14683/comm, flags: RDWR                ) = 91
> ^C[root@jouet bpf]
> [root@jouet bpf]# trace -e open,sys_enter_open.o
>    2485.416 (         ): __bpf_stdout__:Vm..u.../proc/loadavg.......)
>    2485.434 ( 0.210 ms): lighttpd/25120 open(filename: /proc/loadavg, mode: ISGID|IXOTH                       ) = 8
> ^C[root@jouet bpf]#
> 
> [root@jouet bpf]cat sys_enter_open.c
> #include "bpf.h"
> 
> SEC("syscalls:sys_enter_open")
> int func(void *ctx)
> {
> 	struct {
> 		char *ptr;
> 		char path[256];
> 	} filename = {
> 	/*
> 	 * /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format:
> 	 *
> 	 * ...
> 	 * field:const char * filename;	offset:16;	size:8;	signed:0;
> 	 * ...
> 	 * ctx + 16 selects 'filename'
> 	 */
> 		.ptr = *((char **)(ctx + 16)),
> 	};
> 	int len = bpf_probe_read_str(filename.path, sizeof(filename.path), filename.ptr);
> 	if (len > 0) {
> 	       	if (len == 1)
> 			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, len + sizeof(filename.ptr));
> 		else if (len < 256)
> 			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, len + sizeof(filename.ptr));
> 	}
> 	return 0;
> }
> [root@jouet bpf]#
> 
> [acme@jouet perf]$ llc --version | head -16
> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=kbmGig2gAUdNIT0zFKDollApvFW0R95_U5Ntz4dvRbM&s=zQgAZ9vJnfkoNy-c53Y3H_2nMVwMzV-LyMA2f6bSFrw&e=):
>    LLVM version 6.0.0svn
>    DEBUG build with assertions.
>    Default target: x86_64-unknown-linux-gnu
>    Host CPU: broadwell
> 
>    Registered Targets:
>      aarch64    - AArch64 (little endian)
>      aarch64_be - AArch64 (big endian)
>      amdgcn     - AMD GCN GPUs
>      arm        - ARM
>      arm64      - ARM64 (little endian)
>      armeb      - ARM (big endian)
>      bpf        - BPF (host endian)
>      bpfeb      - BPF (big endian)
>      bpfel      - BPF (little endian)
> [acme@jouet perf]$
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
  2018-01-22 18:28                             ` Yonghong Song
@ 2018-01-22 20:52                               ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-22 20:52 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Gianluca Borello, Daniel Borkmann, Alexei Starovoitov,
	David Miller, Linux Networking Development Mailing List

Em Mon, Jan 22, 2018 at 10:28:11AM -0800, Yonghong Song escreveu:
> The compiler did "40: (bf) r1 = r0" and then uses "r1" for branch
> comparison, the original "r0" is left with complete unknown integer value
> and later used to calculate the buffer size "55: (bf) r5 = r0"
> where "r5" could be negative value and the verifier rightfully
> complains.
 
> There is no easy way to fix this in verifier unless verifier starts to track
> correlations between registers which is a big task. So your below workaround
> is okay. The below workaround should also work:
 
>         int len = bpf_probe_read_str(filename.path, sizeof(filename.path),
> filename.ptr);
>         if (len > 0 && len < 256)
>                 bpf_perf_event_output(ctx, &my_map, BPF_F_CURRENT_CPU,
> &filename, (len & 0xff) + sizeof(filename.ptr));
>         return 0;

Ok, thanks for one more time doing the analysis of the optimizations
emitted and suggesting something more compact, that I can confirm works:

[root@jouet bpf]# perf trace -a -e open,sys_enter_open.c sleep 0.1
LLVM: dumping sys_enter_open.o
     1.212 (         ): __bpf_stdout__:......../usr/lib/locale/locale-archive......)
     1.218 ( 0.021 ms): sleep/9872 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
     2.905 (         ): __bpf_stdout__:..:.F.../usr/lib/locale/locale-archive......)
     2.910 ( 0.013 ms): rm/9873 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
     7.562 (         ): __bpf_stdout__:..ul..../usr/lib/locale/locale-archive......)
     7.564 ( 0.013 ms): mv/9874 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
    11.275 (         ): __bpf_stdout__:...d..../usr/lib/locale/locale-archive......)
    11.278 ( 0.012 ms): sh/9875 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
    11.945 (         ): __bpf_stdout__:...d..../usr/lib64/gconv/gconv-modules.cache........)
    11.953 ( 0.018 ms): sh/9875 open(filename: /usr/lib64/gconv/gconv-modules.cache) = 3
    17.906 (         ): __bpf_stdout__:..T.p.../usr/lib/locale/locale-archive......)
    17.913 ( 0.319 ms): gcc/9877 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 4
    18.389 (         ): __bpf_stdout__:...l..../usr/share/locale/locale.alias......)
    18.394 ( 0.266 ms): gcc/9877 open(filename: /usr/share/locale/locale.alias, flags: CLOEXEC) = 4
    18.777 (         ): __bpf_stdout__:@......./usr/share/locale/en_US.UTF-8/LC_MESSAGES/gcc.mo....)
    18.782 ( 0.318 ms): gcc/9877 open(filename: /usr/share/locale/en_US.UTF-8/LC_MESSAGES/gcc.mo, mode: IFBLK|IFIFO|ISGID|ISVTX|IRUSR|IXUSR|0xb5cc0000) = -1 ENOENT No such file or directory

[root@jouet bpf]# cat sys_enter_open.c
#include "bpf.h"

SEC("syscalls:sys_enter_open")
int func(void *ctx)
{
	struct {
		char *ptr;
		char path[256];
	} filename = {
		.ptr = *((char **)(ctx + 16)),
	};
	int len = bpf_probe_read_str(filename.path, sizeof(filename.path), filename.ptr);
	if (len > 0 && len < 256)
                perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, (len & 0xff) + sizeof(filename.ptr));
	return 0;
}
[root@jouet bpf]# 

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2018-01-22 20:52 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-13 14:30 len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL Arnaldo Carvalho de Melo
2017-11-13 14:56 ` Daniel Borkmann
2017-11-13 15:08   ` Arnaldo Carvalho de Melo
2017-11-14  0:09     ` Daniel Borkmann
2017-11-14 12:58       ` Arnaldo Carvalho de Melo
2017-11-14 13:09         ` Daniel Borkmann
2017-11-14 13:42           ` Arnaldo Carvalho de Melo
2017-11-14 14:19             ` Daniel Borkmann
2017-11-14 14:58               ` Arnaldo Carvalho de Melo
2017-11-14 18:15               ` Yonghong Song
2017-11-14 20:25                 ` Daniel Borkmann
2017-11-14 22:58                   ` Yonghong Song
2017-11-21 14:29                     ` Arnaldo Carvalho de Melo
2017-11-21 22:31                       ` Alexei Starovoitov
2017-11-22 18:42                         ` Gianluca Borello
2018-01-22 15:06                           ` Arnaldo Carvalho de Melo
2018-01-22 18:28                             ` Yonghong Song
2018-01-22 20:52                               ` Arnaldo Carvalho de Melo
2017-11-20 13:31                   ` Arnaldo Carvalho de Melo
2017-11-20 16:47                     ` Yonghong Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).