linux-kernel.vger.kernel.org archive mirror
* 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
@ 2017-11-28 19:05 Arnaldo Carvalho de Melo
  2017-11-29 21:07 ` Martin KaFai Lau
  0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-28 19:05 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Matthias Kaehlcke, Josh Poimboeuf, Yonghong Song,
	Alexei Starovoitov, David S. Miller, Daniel Borkmann, Wang Nan,
	Alexei Starovoitov, Adrian Hunter, David Ahern, Jiri Olsa,
	Ingo Molnar, Namhyung Kim, Linux Kernel Mailing List,
	Andrey Ryabinin

So, I had 'perf test BPF' failing in perf/core. At first I thought it was
due to an update I had made to clang, but then I noticed that it works with
perf/urgent, so it had to be something else. I did a bisection and got to:

[acme@jouet linux]$ git bisect bad
88cda1c9da02c8aa31e1d5dcf22e8a35cc8c19f2 is the first bad commit
commit 88cda1c9da02c8aa31e1d5dcf22e8a35cc8c19f2
Author: Martin KaFai Lau <kafai@fb.com>
Date:   Wed Sep 27 14:37:54 2017 -0700

    bpf: libbpf: Provide basic API support to specify BPF obj name
    
    This patch extends the libbpf to provide API support to
    allow specifying BPF object name.
    
    In tools/lib/bpf/libbpf, the C symbol of the function
    and the map is used.  Regarding section name, all maps are
    under the same section named "maps".  Hence, section name
    is not a good choice for map's name.  To be consistent with
    map, bpf_prog also follows and uses its function symbol as
    the prog's name.
    
    This patch adds logic to collect function's symbols in libbpf.
    There is existing codes to collect the map's symbols and no change
    is needed.
    
    The bpf_load_program_name() and bpf_map_create_name() are
    added to take the name argument.  For the other bpf_map_create_xxx()
    variants, a name argument is directly added to them.
    
    In samples/bpf, bpf_load.c in particular, the symbol is also
    used as the map's name and the map symbols has already been
    collected in the existing code.  For bpf_prog, bpf_load.c does
    not collect the function symbol name.  We can consider to collect
    them later if there is a need to continue supporting the bpf_load.c.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    Acked-by: Alexei Starovoitov <ast@fb.com>
    Acked-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 9082e2747e39319fcd1555506bc24f36fc85ec43 433c27b777924b188028d79d2c5648a917135f39 M	samples
:040000 040000 6113e47f92559d3b4ac4eda399d2c980ac407603 9774db0bb47eab0ffc642623bd7d7a6f46b820c6 M	tools
[acme@jouet linux]$ 
[acme@jouet linux]$ 

	Please CC me and Wang Nan when making changes to tools/lib/bpf/,
as it started having tools/perf/ as its sole user, i.e. there is code
there that uses it and we have to make sure it continues working :-\

	What fails?

[root@jouet ~]# perf test bpf
39: BPF filter                                            :
39.1: Basic BPF filtering                                 : FAILED!
39.2: BPF pinning                                         : Skip
39.3: BPF prologue generation                             : Skip
39.4: BPF relocation checker                              : Skip
[root@jouet ~]#

With -v we can see that it is map setup that fails, with the error being:

libbpf: failed to create map (name: 'flip_table'): Invalid argument
libbpf: failed to load object '[basic_bpf_test]'
bpf: load objects failed
Failed to add events selected by BPF

Full logs follow; further below is the info I had collected before doing the bisect:

[root@jouet ~]# perf test -v bpf
39: BPF filter                                            :
39.1: Basic BPF filtering                                 :
--- start ---
test child forked, pid 12198
Kernel build dir is set to /lib/modules/4.14.0+/build
set env: KBUILD_DIR=/lib/modules/4.14.0+/build
unset env: KBUILD_OPTS
include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x40e00
set env: CLANG_EXEC=/usr/local/bin/clang
set env: CLANG_OPTIONS=-xc 
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h 
set env: WORKING_DIR=/lib/modules/4.14.0+/build
set env: CLANG_SOURCE=-
llvm compiling command template: echo '/*
 * bpf-script-example.c
 * Test basic LLVM building
 */
#ifndef LINUX_VERSION_CODE
# error Need LINUX_VERSION_CODE
# error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
#endif
#define BPF_ANY 0
#define BPF_MAP_TYPE_ARRAY 2
#define BPF_FUNC_map_lookup_elem 1
#define BPF_FUNC_map_update_elem 2

static void *(*bpf_map_lookup_elem)(void *map, void *key) =
	(void *) BPF_FUNC_map_lookup_elem;
static void *(*bpf_map_update_elem)(void *map, void *key, void *value, int flags) =
	(void *) BPF_FUNC_map_update_elem;

struct bpf_map_def {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
};

#define SEC(NAME) __attribute__((section(NAME), used))
struct bpf_map_def SEC("maps") flip_table = {
	.type = BPF_MAP_TYPE_ARRAY,
	.key_size = sizeof(int),
	.value_size = sizeof(int),
	.max_entries = 1,
};

SEC("func=SyS_epoll_wait")
int bpf_func__SyS_epoll_wait(void *ctx)
{
	int ind =0;
	int *flag = bpf_map_lookup_elem(&flip_table, &ind);
	int new_flag;
	if (!flag)
		return 0;
	/* flip flag and store back */
	new_flag = !*flag;
	bpf_map_update_elem(&flip_table, &ind, &new_flag, BPF_ANY);
	return new_flag;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
libbpf: loading object '[basic_bpf_test]' from buffer
libbpf: section .strtab, size 120, link 0, flags 0, type=3
libbpf: section .text, size 0, link 0, flags 6, type=1
libbpf: section func=SyS_epoll_wait, size 192, link 0, flags 6, type=1
libbpf: found program func=SyS_epoll_wait
libbpf: section .relfunc=SyS_epoll_wait, size 32, link 8, flags 0, type=9
libbpf: section maps, size 16, link 0, flags 3, type=1
libbpf: section license, size 4, link 0, flags 3, type=1
libbpf: license of [basic_bpf_test] is GPL
libbpf: section version, size 4, link 0, flags 3, type=1
libbpf: kernel version of [basic_bpf_test] is 40e00
libbpf: section .symtab, size 168, link 1, flags 0, type=2
libbpf: maps in [basic_bpf_test]: 1 maps in 16 bytes
libbpf: map 0 is "flip_table"
libbpf: collecting relocating info for: 'func=SyS_epoll_wait'
libbpf: relocation: insn_idx=4
libbpf: relocation: find map 0 (flip_table) for insn 4
libbpf: relocation: insn_idx=17
libbpf: relocation: find map 0 (flip_table) for insn 17
bpf: config program 'func=SyS_epoll_wait'
symbol:SyS_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
bpf: config 'func=SyS_epoll_wait' is ok
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.14.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.14.0+/build/vmlinux
Try to find probe point from debuginfo.
Matched function: SyS_epoll_wait [2e5a9be]
found inline addr: 0xffffffff812af256
Probe point found: compat_SyS_epoll_pwait+150
found inline addr: 0xffffffff812af037
Probe point found: SyS_epoll_pwait+135
found inline addr: 0xffffffff812aeed0
Probe point found: SyS_epoll_wait+0
Found 3 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval
Group:probe Event:check_stack_object probe:r
Writing event: p:perf_bpf_probe/func _text+2814550
Writing event: p:perf_bpf_probe/func_1 _text+2814007
Writing event: p:perf_bpf_probe/func_2 _text+2813648
libbpf: failed to create map (name: 'flip_table'): Invalid argument
libbpf: failed to load object '[basic_bpf_test]'
bpf: load objects failed
Failed to add events selected by BPF
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Opening /sys/kernel/debug/tracing//uprobe_events write=1
Parsing probe_events: p:perf_bpf_probe/func _text+2814550
Group:perf_bpf_probe Event:func probe:p
Parsing probe_events: p:perf_bpf_probe/func_1 _text+2814007
Group:perf_bpf_probe Event:func_1 probe:p
Parsing probe_events: p:perf_bpf_probe/func_2 _text+2813648
Group:perf_bpf_probe Event:func_2 probe:p
Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval
Group:probe Event:check_stack_object probe:r
Writing event: -:perf_bpf_probe/func
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Opening /sys/kernel/debug/tracing//uprobe_events write=1
Parsing probe_events: p:perf_bpf_probe/func_1 _text+2814007
Group:perf_bpf_probe Event:func_1 probe:p
Parsing probe_events: p:perf_bpf_probe/func_2 _text+2813648
Group:perf_bpf_probe Event:func_2 probe:p
Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval
Group:probe Event:check_stack_object probe:r
Writing event: -:perf_bpf_probe/func_1
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Opening /sys/kernel/debug/tracing//uprobe_events write=1
Parsing probe_events: p:perf_bpf_probe/func_2 _text+2813648
Group:perf_bpf_probe Event:func_2 probe:p
Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval
Group:probe Event:check_stack_object probe:r
Writing event: -:perf_bpf_probe/func_2
test child finished with -1
---- end ----
BPF filter subtest 0: FAILED!
39.2: BPF pinning                                         :
--- force skipped ---
BPF filter subtest 1: Skip
39.3: BPF prologue generation                             :
--- force skipped ---
BPF filter subtest 2: Skip
39.4: BPF relocation checker                              :
--- force skipped ---
BPF filter subtest 3: Skip
[root@jouet ~]#

Em Mon, Nov 27, 2017 at 04:34:25PM -0300, Arnaldo Carvalho de Melo escreveu:
> So, I noticed that any maps are failing, I'll go dig, but may be some
> new security tightening, even running as root, this was working
> recently, was even part of our discussion on the bpf_probe_read_str()
> trouble with clang's optimizer:
> 
> [root@jouet bpf]# cat open.c
> #include "bpf.h"
> 
> SEC("prog=do_sys_open filename")
> int prog(void *ctx, int err, char *filename_ptr)
> {
> 	char filename[128];
> 	int len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr); 
> 	if (len > 0) {
> 		if (len == 1)
> 			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
> 		else if (len < 128)
> 			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
>         }
> 	return 1;
> }
> [root@jouet bpf]#
> <SNIP>
> Found 1 probe_trace_events.
> Opening /sys/kernel/debug/tracing//kprobe_events write=1
> Writing event: p:perf_bpf_probe/prog _text+2493856 filename=%si:x64
> In map_prologue, ntevs=1
> mapping[0]=0
> libbpf: failed to create map (name: '__bpf_stdout__'): Invalid argument
> libbpf: failed to load object 'open.c'
> bpf: load objects failed
> event syntax error: 'open.c'
>                      \___ Operation not permitted
> <SNIP>
> 
> Using 'perf ftrace' to trace just 'perf trace':
> 
> [root@jouet bpf]# perf ftrace -G SyS_bpf perf trace -v -e open.c,open cat /tmp/somefile 2> /dev/null
>  0)               |  SyS_bpf() {
>  0)               |    capable() {
>  0)               |      ns_capable_common() {
>  0)               |        security_capable() {
>  0)   0.045 us    |          cap_capable();
>  0)               |          selinux_capable() {
>  0)   0.274 us    |            cred_has_capability();
>  0)   0.518 us    |          }
>  0)   1.464 us    |        }
>  0)   1.783 us    |      }
>  0)   2.130 us    |    }
>  0)   0.458 us    |    check_uarg_tail_zero();
>  0)               |    __check_object_size() {
>  0)   0.046 us    |      __virt_addr_valid();
>  0)   0.040 us    |      check_stack_object();
>  0)   0.510 us    |    }
>  0)   4.161 us    |  }
> [root@jouet bpf]#
> 
> /me goes to look at SyS_bpf() in this kernel... (4.14.0+).

Tracing 'perf trace' with 'perf trace' we see:

# perf trace -e bpf perf trace -e open.c,open cat /tmp/somefile
<SNIP traced 'perf trace' error messages>
  0.000 ( 0.015 ms): perf/16767 bpf(cmd: MAP_CREATE, uattr: 0x7ffc3c8c7ac0, size: 72) = -1 EINVAL Invalid argument
#

Humm,

# perf probe check_stack_object%return 'ret=$retval'
Added new event:
  probe:check_stack_object (on check_stack_object%return with ret=$retval)

You can now use it in all perf tools, such as:

	perf record -e probe:check_stack_object -aR sleep 1

#

# perf trace -e bpf,probe:check* perf trace -e open.c,open cat /tmp/somefile
<SNIP lots of check_stack_object calls returning 0x0>
  4626.779 ( 0.004 ms): perf/31498 bpf(cmd: MAP_CREATE, uattr: 0x7fff7dbbaab0, size: 72                  ) ...
  4626.784 (         ): probe:check_stack_object:(ffffffffb625ec30 <- ffffffffb625ed1f) ret=0x2)
  4626.779 ( 0.006 ms): perf/31498  ... [continued]: bpf()) = -1 EINVAL Invalid argument
<SNIP lots of check_stack_object calls returning 0x0>

check_stack_object returning 0x2 means GOOD_STACK, 0x0 means
NOT_STACK...

- Arnaldo


* Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
  2017-11-28 19:05 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name" Arnaldo Carvalho de Melo
@ 2017-11-29 21:07 ` Martin KaFai Lau
  2017-11-29 21:15   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 12+ messages in thread
From: Martin KaFai Lau @ 2017-11-29 21:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Matthias Kaehlcke, Josh Poimboeuf, Yonghong Song,
	Alexei Starovoitov, David S. Miller, Daniel Borkmann, Wang Nan,
	Alexei Starovoitov, Adrian Hunter, David Ahern, Jiri Olsa,
	Ingo Molnar, Namhyung Kim, Linux Kernel Mailing List,
	Andrey Ryabinin

On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> 
> [root@jouet ~]# perf test -v bpf
> 39: BPF filter                                            :
> 39.1: Basic BPF filtering                                 :
> --- start ---
> test child forked, pid 12198
> Kernel build dir is set to /lib/modules/4.14.0+/build
> set env: KBUILD_DIR=/lib/modules/4.14.0+/build
[ ... ]
> libbpf: failed to create map (name: 'flip_table'): Invalid argument
> libbpf: failed to load object '[basic_bpf_test]'
> bpf: load objects failed
88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
was introduced in 4.15.

I think perf from the 4.15 tree broke on older kernels like 4.14 because
the new bpf prog/map name was only introduced in 4.15.

Does the newer perf need to be compatible with an older kernel?
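
A minimal sketch of why an older kernel would answer EINVAL here, assuming
it requires every byte after the last attribute field it knows about to be
zero. The struct below is a simplified, hypothetical stand-in for the
BPF_MAP_CREATE part of union bpf_attr, and old_kernel_accepts() and
make_attr() are illustrative names, not kernel or libbpf functions:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the BPF_MAP_CREATE attributes;
 * the field list is an assumption made for illustration only. */
struct map_create_attr {
	unsigned int map_type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
	unsigned int map_flags;
	unsigned int inner_map_fd;
	unsigned int numa_node;	/* last field the "old" kernel knows */
	char map_name[16];	/* new field, unknown to the "old" kernel */
};

/* Model of the old kernel's check: all bytes after the last known field
 * must be zero, otherwise the request is rejected. Returns 1 if accepted. */
static int old_kernel_accepts(const struct map_create_attr *attr)
{
	const size_t known = offsetof(struct map_create_attr, numa_node) +
			     sizeof(attr->numa_node);
	const unsigned char *p = (const unsigned char *)attr;
	size_t i;

	for (i = known; i < sizeof(*attr); i++)
		if (p[i] != 0)
			return 0;
	return 1;
}

/* Build a request like the one for the test map; name == NULL leaves
 * map_name zeroed, which is what a pre-name libbpf would have sent. */
static struct map_create_attr make_attr(const char *name)
{
	struct map_create_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.map_type = 2;	/* BPF_MAP_TYPE_ARRAY, as in the test program */
	attr.key_size = sizeof(int);
	attr.value_size = sizeof(int);
	attr.max_entries = 1;
	if (name)
		strncpy(attr.map_name, name, sizeof(attr.map_name) - 1);
	return attr;
}
```

With this model, a request built with make_attr("flip_table") is rejected
while one built with make_attr(NULL) is accepted, which matches the
behaviour seen in the log.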


* Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
  2017-11-29 21:07 ` Martin KaFai Lau
@ 2017-11-29 21:15   ` Arnaldo Carvalho de Melo
  2017-11-29 22:31     ` Martin KaFai Lau
  0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-29 21:15 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Matthias Kaehlcke, Josh Poimboeuf, Yonghong Song,
	Alexei Starovoitov, David S. Miller, Daniel Borkmann, Wang Nan,
	Alexei Starovoitov, Adrian Hunter, David Ahern, Jiri Olsa,
	Ingo Molnar, Namhyung Kim, Linux Kernel Mailing List,
	Andrey Ryabinin

Em Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau escreveu:
> On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > 
> > [root@jouet ~]# perf test -v bpf
> > 39: BPF filter                                            :
> > 39.1: Basic BPF filtering                                 :
> > --- start ---
> > test child forked, pid 12198
> > Kernel build dir is set to /lib/modules/4.14.0+/build
> > set env: KBUILD_DIR=/lib/modules/4.14.0+/build
> [ ... ]
> > libbpf: failed to create map (name: 'flip_table'): Invalid argument
> > libbpf: failed to load object '[basic_bpf_test]'
> > bpf: load objects failed
> 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
> is introduced in 4.15.
 
> I think the perf@kernel-4.15 broke on older kernels like 4.14 because
> the new bpf prog/map name is only introduced since 4.15.
 
> The newer perf needs to be compatible with an older kernel?

Sure :-)

If some ABI breaks, it should detect that and adapt, and older perf
tools should also fail gracefully in such a case. I'm not sure that
will be the case here; I haven't checked how perf's BPF integration
behaves in such a case, but I will.

- Arnaldo


* Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
  2017-11-29 21:15   ` Arnaldo Carvalho de Melo
@ 2017-11-29 22:31     ` Martin KaFai Lau
  2017-11-30  3:01       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 12+ messages in thread
From: Martin KaFai Lau @ 2017-11-29 22:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Matthias Kaehlcke, Josh Poimboeuf, Yonghong Song,
	Alexei Starovoitov, David S. Miller, Daniel Borkmann, Wang Nan,
	Alexei Starovoitov, Adrian Hunter, David Ahern, Jiri Olsa,
	Ingo Molnar, Namhyung Kim, Linux Kernel Mailing List,
	Andrey Ryabinin

On Wed, Nov 29, 2017 at 06:15:43PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau escreveu:
> > On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > > 
> > > [root@jouet ~]# perf test -v bpf
> > > 39: BPF filter                                            :
> > > 39.1: Basic BPF filtering                                 :
> > > --- start ---
> > > test child forked, pid 12198
> > > Kernel build dir is set to /lib/modules/4.14.0+/build
> > > set env: KBUILD_DIR=/lib/modules/4.14.0+/build
> > [ ... ]
> > > libbpf: failed to create map (name: 'flip_table'): Invalid argument
> > > libbpf: failed to load object '[basic_bpf_test]'
> > > bpf: load objects failed
> > 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
> > is introduced in 4.15.
>  
> > I think the perf@kernel-4.15 broke on older kernels like 4.14 because
> > the new bpf prog/map name is only introduced since 4.15.
>  
> > The newer perf needs to be compatible with an older kernel?
> 
> Sure :-)
Are the latest features introduced in perf/libbpf supposed to be
available with the latest kernel only?  What would be the reason for
installing the latest perf on an older kernel when it does not gain new
functionality?

> 
> If some ABI breaks it should detect that and adapt, and older perf
> tools should also gracefully fail in such a case, which I'm not sure
> will be the case here, haven't checked perf's BPF integration to see how
> it behaves in such a case, but I will.
> 
> - Arnaldo


* Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
  2017-11-29 22:31     ` Martin KaFai Lau
@ 2017-11-30  3:01       ` Arnaldo Carvalho de Melo
  2017-11-30 16:53         ` [PATCH/RFC] " Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-30  3:01 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Matthias Kaehlcke, Josh Poimboeuf, Yonghong Song,
	Alexei Starovoitov, David S. Miller, Daniel Borkmann, Wang Nan,
	Alexei Starovoitov, Adrian Hunter, David Ahern, Jiri Olsa,
	Ingo Molnar, Namhyung Kim, Linux Kernel Mailing List,
	Andrey Ryabinin

Em Wed, Nov 29, 2017 at 02:31:36PM -0800, Martin KaFai Lau escreveu:
> On Wed, Nov 29, 2017 at 06:15:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau escreveu:
> > > On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > [root@jouet ~]# perf test -v bpf
> > > > 39: BPF filter                                            :
> > > > 39.1: Basic BPF filtering                                 :
> > > > --- start ---
> > > > test child forked, pid 12198
> > > > Kernel build dir is set to /lib/modules/4.14.0+/build
> > > > set env: KBUILD_DIR=/lib/modules/4.14.0+/build
> > > [ ... ]
> > > > libbpf: failed to create map (name: 'flip_table'): Invalid argument
> > > > libbpf: failed to load object '[basic_bpf_test]'
> > > > bpf: load objects failed
> > > 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
> > > is introduced in 4.15.

> > > I think the perf@kernel-4.15 broke on older kernels like 4.14 because
> > > the new bpf prog/map name is only introduced since 4.15.

> > > The newer perf needs to be compatible with an older kernel?

> > Sure :-)

> Would the latest features introduced in perf/libbpf supposed to be
> available in the latest kernel only?  What may be the reason that the

Yes; the new perf binary should then try to use the new stuff and, if
that fails, use the old one. There is no requirement that one uses perf
4.14 in lockstep with the kernel 4.14 (or any other version); perf 4.15
should work with the 4.14 kernel as well as with 4.15 (or any future
kernel), limited only by what it could grok when it was released.

I'll test with a 4.15 kernel to confirm this, thanks for the
explanation!

> latest perf is installed with an older kernel while it does not gain new
> functionality? 

See, for instance tools/perf/util/evsel.c, function perf_evsel__open(),
and all those fallbacks for features a new tool doesn't find on an older
kernel.

When the user asks for some feature that is not present in the older
kernel, then it fails and the tool calling perf_evsel__open() will use
perf_evlist__strerror_open() to provide the best possible explanation of
what took place.
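
The "try the new stuff, fall back to the old one" approach described above
can be sketched roughly as follows. This is not the perf_evsel__open()
code: struct open_request, mock_kernel_open() and open_with_fallback() are
hypothetical names, and the "kernel" is a stand-in that only knows
features up to a given level:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical optional features a newer tool would like to request. */
struct open_request {
	bool use_feature_b;	/* newest feature */
	bool use_feature_a;	/* older feature */
};

/* Hypothetical stand-in for the kernel: 'level' is the newest feature it
 * supports (0 = none, 1 = A only, 2 = A and B). Returns 0 or -EINVAL. */
static int mock_kernel_open(const struct open_request *req, int level)
{
	if (req->use_feature_b && level < 2)
		return -EINVAL;
	if (req->use_feature_a && level < 1)
		return -EINVAL;
	return 0;
}

/* Ask for everything first; on EINVAL drop the newest optional feature
 * still set and retry. On success *req describes what was accepted. */
static int open_with_fallback(struct open_request *req, int level)
{
	int err;

	for (;;) {
		err = mock_kernel_open(req, level);
		if (err != -EINVAL)
			return err;
		if (req->use_feature_b)
			req->use_feature_b = false;
		else if (req->use_feature_a)
			req->use_feature_a = false;
		else
			return err;	/* nothing left to drop */
	}
}
```

The sketch only covers optional features the tool enables on its own; a
feature the user asked for explicitly would not be dropped silently but
reported as an error, as described above.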

- Arnaldo


* [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
  2017-11-30  3:01       ` Arnaldo Carvalho de Melo
@ 2017-11-30 16:53         ` Arnaldo Carvalho de Melo
  2017-11-30 18:28           ` Martin KaFai Lau
  0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-30 16:53 UTC (permalink / raw)
  To: Martin KaFai Lau, Alexei Starovoitov, Wang Nan, Daniel Borkmann
  Cc: Matthias Kaehlcke, Josh Poimboeuf, Yonghong Song,
	David S. Miller, Alexei Starovoitov, Adrian Hunter, David Ahern,
	Jiri Olsa, Ingo Molnar, Namhyung Kim, Linux Kernel Mailing List,
	Andrey Ryabinin

Em Thu, Nov 30, 2017 at 12:01:10AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Nov 29, 2017 at 02:31:36PM -0800, Martin KaFai Lau escreveu:
> > On Wed, Nov 29, 2017 at 06:15:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau escreveu:
> > > > On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > [root@jouet ~]# perf test -v bpf
> > > > > 39: BPF filter                                            :
> > > > > 39.1: Basic BPF filtering                                 :
> > > > > Kernel build dir is set to /lib/modules/4.14.0+/build
> > > > [ ... ]
> > > > > libbpf: failed to create map (name: 'flip_table'): Invalid argument
> > > > > libbpf: failed to load object '[basic_bpf_test]'
> > > > > bpf: load objects failed
> > > > 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
> > > > is introduced in 4.15.
> 
> > > > I think the perf@kernel-4.15 broke on older kernels like 4.14 because
> > > > the new bpf prog/map name is only introduced since 4.15.
> 
> > > > The newer perf needs to be compatible with an older kernel?
> 
> > > Sure :-)
 
> > Would the latest features introduced in perf/libbpf supposed to be
> > available in the latest kernel only?  What may be the reason that the
> 
> Yes, then the new perf binary should try to use the new stuff, if it
> fails, use the old one, there is no requirement that one uses perf 4.14
> in lockstep with the kernel 4.14 (or any other version), perf 4.15
> should work with the 4.14 kernel as well as with 4.15 (or any other
> future kernel), only limited by what it can grok up to when it was
> released.

So, see the patch below, which makes 'perf test bpf' and my other test
cases, including the one for probe_read_str(), work again; it just
falls back to a behaviour the older kernels can accept.

We can improve it so that the EINVAL fallback happens only for
MAP_CREATE, and probably we don't need to change the size arg, just zero
the unused fields, but I haven't checked that.

- Arnaldo

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 5128677e4117..3084f07c7c33 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -19,6 +19,7 @@
  * License along with this program; if not,  see <http://www.gnu.org/licenses>
  */
 
+#include <errno.h>
 #include <stdlib.h>
 #include <memory.h>
 #include <unistd.h>
@@ -53,10 +54,26 @@ static inline __u64 ptr_to_u64(const void *ptr)
 	return (__u64) (unsigned long) ptr;
 }
 
-static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
-			  unsigned int size)
+static int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int size)
 {
-	return syscall(__NR_bpf, cmd, attr, size);
+	int err = syscall(__NR_bpf, cmd, attr, size);
+	if (err == -1 && (errno == EINVAL || errno == E2BIG)) {
+		const unsigned int old_union_size = offsetof(union bpf_attr, prog_name);
+		/*
+		 * These were the ones that added fields after the old bpf_attr
+		 * layout in commit 88cda1c9da02 ("bpf: libbpf: Provide basic
+		 * API support to specify BPF obj name") so zero that out to
+		 * pass the CHECK_ATTR() test in kernel/bpf/syscall.c in older
+		 * kernels.
+		 */
+		if (cmd == BPF_MAP_CREATE)
+			memset(&attr->map_name, 0, size - offsetof(union bpf_attr, map_name));
+		else
+			memset(&attr->prog_name, 0, size - old_union_size);
+
+		err = syscall(__NR_bpf, cmd, attr, old_union_size);
+	}
+	return err;
 }
 
 int bpf_create_map_node(enum bpf_map_type map_type, const char *name,


* Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
  2017-11-30 16:53         ` [PATCH/RFC] " Arnaldo Carvalho de Melo
@ 2017-11-30 18:28           ` Martin KaFai Lau
  2017-11-30 19:00             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 12+ messages in thread
From: Martin KaFai Lau @ 2017-11-30 18:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alexei Starovoitov, Wang Nan, Daniel Borkmann, Matthias Kaehlcke,
	Josh Poimboeuf, Yonghong Song, David S. Miller,
	Alexei Starovoitov, Adrian Hunter, David Ahern, Jiri Olsa,
	Ingo Molnar, Namhyung Kim, Linux Kernel Mailing List,
	Andrey Ryabinin

On Thu, Nov 30, 2017 at 01:53:58PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Nov 30, 2017 at 12:01:10AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Wed, Nov 29, 2017 at 02:31:36PM -0800, Martin KaFai Lau escreveu:
> > > On Wed, Nov 29, 2017 at 06:15:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > Em Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau escreveu:
> > > > > On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > [root@jouet ~]# perf test -v bpf
> > > > > > 39: BPF filter                                            :
> > > > > > 39.1: Basic BPF filtering                                 :
> > > > > > Kernel build dir is set to /lib/modules/4.14.0+/build
> > > > > [ ... ]
> > > > > > libbpf: failed to create map (name: 'flip_table'): Invalid argument
> > > > > > libbpf: failed to load object '[basic_bpf_test]'
> > > > > > bpf: load objects failed
> > > > > 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
> > > > > is introduced in 4.15.
> > 
> > > > > I think the perf@kernel-4.15 broke on older kernels like 4.14 because
> > > > > the new bpf prog/map name is only introduced since 4.15.
> > 
> > > > > The newer perf needs to be compatible with an older kernel?
> > 
> > > > Sure :-)
>  
> > > Would the latest features introduced in perf/libbpf supposed to be
> > > available in the latest kernel only?  What may be the reason that the
> > 
> > Yes, then the new perf binary should try to use the new stuff, if it
> > fails, use the old one, there is no requirement that one uses perf 4.14
> > in lockstep with the kernel 4.14 (or any other version), perf 4.15
> > should work with the 4.14 kernel as well as with 4.15 (or any other
> > future kernel), only limited by what it can grok up to when it was
> > released.
> 
> So, see the patch below, that makes a 'perf test bpf' and my other test
> cases, including that one for probe_read_str() work again, it just
> fallbacks to a behaviour the older kernels can accept.
Thanks for the patch.

> 
> We can improve it so that that EINVAL fallback happens only for
> MAP_CREATE, and probably we don't need to change the size arg, just zero
> the unused fields, but I haven't checked that.
> 
> - Arnaldo
> 
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 5128677e4117..3084f07c7c33 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -19,6 +19,7 @@
>   * License along with this program; if not,  see <http://www.gnu.org/licenses>
>   */
>  
> +#include <errno.h>
>  #include <stdlib.h>
>  #include <memory.h>
>  #include <unistd.h>
> @@ -53,10 +54,26 @@ static inline __u64 ptr_to_u64(const void *ptr)
>  	return (__u64) (unsigned long) ptr;
>  }
>  
> -static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
> -			  unsigned int size)
> +static int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int size)
>  {
> -	return syscall(__NR_bpf, cmd, attr, size);
> +	int err = syscall(__NR_bpf, cmd, attr, size);
> +	if (err == -1 && (errno == EINVAL || errno == E2BIG)) {
I would add a check to the length of map_name/prog_name.

> +		const unsigned int old_union_size = offsetof(union bpf_attr, prog_name);
> +		/*
> +		 * These were the ones that added fields after the old bpf_attr
> +		 * layout in commit 88cda1c9da02 ("bpf: libbpf: Provide basic
> +		 * API support to specify BPF obj name") so zero that out to
> +		 * pass the CHECK_ATTR() test in kernel/bpf/syscall.c in older
> +		 * kernels.
> +		 */
> +		if (cmd == BPF_MAP_CREATE)
> +			memset(&attr->map_name, 0, size - offsetof(union bpf_attr, map_name));
> +		else
> +			memset(&attr->prog_name, 0, size - old_union_size);
If bpf_attr is extended in the future, will map_name/prog_name still be
used as the anchor for backward compatibility, instead of trial and error
attribute by attribute?

Instead of sinking all future bpf_attr backward compatibility
requirements into sys_bpf(), I would push them up to each BPF_* command
helper, which has a better sense of its own bpf_attr, i.e. push it up
to bpf_create_map_node() and bpf_load_program_name() in this case.
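
A rough sketch of what handling the fallback in a per-command helper could
look like, assuming the name is the only new attribute and that the retry
is attempted only when a name was actually set. struct map_attr,
map_create_fn, old_kernel_map_create() and create_map_named() are
hypothetical names used so the sketch can run without a kernel; they are
not the libbpf API:

```c
#include <errno.h>
#include <string.h>

#define OBJ_NAME_LEN 16	/* assumed size of the name field */

/* Hypothetical, simplified map creation attributes. */
struct map_attr {
	unsigned int map_type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
	char map_name[OBJ_NAME_LEN];
};

/* Stand-in for the syscall: returns an fd >= 0, or -1 with errno set. */
typedef int (*map_create_fn)(const struct map_attr *attr);

/* Mimics an older kernel: rejects a request whose name field is set. */
static int old_kernel_map_create(const struct map_attr *attr)
{
	if (attr->map_name[0] != '\0') {
		errno = EINVAL;
		return -1;
	}
	return 3;	/* arbitrary fake fd */
}

/* Per-command helper: try with the name first; if that fails with EINVAL
 * and a name was set, clear the name and retry once. */
static int create_map_named(map_create_fn do_create, unsigned int type,
			    unsigned int key_size, unsigned int value_size,
			    unsigned int max_entries, const char *name)
{
	struct map_attr attr;
	size_t name_len = name ? strlen(name) : 0;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.map_type = type;
	attr.key_size = key_size;
	attr.value_size = value_size;
	attr.max_entries = max_entries;
	if (name_len > OBJ_NAME_LEN - 1)
		name_len = OBJ_NAME_LEN - 1;	/* keep the NUL terminator */
	if (name_len > 0)
		memcpy(attr.map_name, name, name_len);

	fd = do_create(&attr);
	if (fd < 0 && errno == EINVAL && name_len > 0) {
		/* The name may be what the kernel rejected: retry without it. */
		memset(attr.map_name, 0, sizeof(attr.map_name));
		fd = do_create(&attr);
	}
	return fd;
}
```

Keeping the retry in the helper means an EINVAL that cannot have been
caused by the name (no name was passed) is reported right away instead of
triggering a second call.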

> +
> +		err = syscall(__NR_bpf, cmd, attr, old_union_size);
> +	}
> +	return err;
>  }
>  
>  int bpf_create_map_node(enum bpf_map_type map_type, const char *name,


* Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
  2017-11-30 18:28           ` Martin KaFai Lau
@ 2017-11-30 19:00             ` Arnaldo Carvalho de Melo
  2017-11-30 21:51               ` Alexei Starovoitov
  2017-11-30 22:09               ` Martin KaFai Lau
  0 siblings, 2 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-11-30 19:00 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Alexei Starovoitov, Wang Nan, Daniel Borkmann, Matthias Kaehlcke,
	Josh Poimboeuf, Yonghong Song, David S. Miller,
	Alexei Starovoitov, Adrian Hunter, David Ahern, Jiri Olsa,
	Ingo Molnar, Namhyung Kim, Linux Kernel Mailing List,
	Andrey Ryabinin

Em Thu, Nov 30, 2017 at 10:28:08AM -0800, Martin KaFai Lau escreveu:
> On Thu, Nov 30, 2017 at 01:53:58PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Nov 30, 2017 at 12:01:10AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Wed, Nov 29, 2017 at 02:31:36PM -0800, Martin KaFai Lau escreveu:
> > > > On Wed, Nov 29, 2017 at 06:15:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > Em Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau escreveu:
> > > > > > On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > > [root@jouet ~]# perf test -v bpf
> > > > > > > 39: BPF filter                                            :
> > > > > > > 39.1: Basic BPF filtering                                 :
> > > > > > > Kernel build dir is set to /lib/modules/4.14.0+/build
> > > > > > [ ... ]
> > > > > > > libbpf: failed to create map (name: 'flip_table'): Invalid argument
> > > > > > > libbpf: failed to load object '[basic_bpf_test]'
> > > > > > > bpf: load objects failed
> > > > > > 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
> > > > > > is introduced in 4.15.
> > > 
> > > > > > I think the perf@kernel-4.15 broke on older kernels like 4.14 because
> > > > > > the new bpf prog/map name is only introduced since 4.15.

> > > > > > The newer perf needs to be compatible with an older kernel?

> > > > > Sure :-)

> > > > Would the latest features introduced in perf/libbpf supposed to be
> > > > available in the latest kernel only?  What may be the reason that the

> > > Yes, then the new perf binary should try to use the new stuff, if it
> > > fails, use the old one, there is no requirement that one uses perf 4.14
> > > in lockstep with the kernel 4.14 (or any other version), perf 4.15
> > > should work with the 4.14 kernel as well as with 4.15 (or any other
> > > future kernel), only limited by what it can grok up to when it was
> > > released.

> > So, see the patch below, that makes a 'perf test bpf' and my other test
> > cases, including that one for probe_read_str() work again, it just
> > fallbacks to a behaviour the older kernels can accept.

> Thanks for the patch.
 
> > We can improve it so that that EINVAL fallback happens only for
> > MAP_CREATE, and probably we don't need to change the size arg, just zero
> > the unused fields, but I haven't checked that.
> > 
> > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > index 5128677e4117..3084f07c7c33 100644
> > --- a/tools/lib/bpf/bpf.c
> > +++ b/tools/lib/bpf/bpf.c
> > @@ -19,6 +19,7 @@
> >   * License along with this program; if not,  see <http://www.gnu.org/licenses>
> >   */
> >  
> > +#include <errno.h>
> >  #include <stdlib.h>
> >  #include <memory.h>
> >  #include <unistd.h>
> > @@ -53,10 +54,26 @@ static inline __u64 ptr_to_u64(const void *ptr)
> >  	return (__u64) (unsigned long) ptr;
> >  }
> >  
> > -static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
> > -			  unsigned int size)
> > +static int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int size)
> >  {
> > -	return syscall(__NR_bpf, cmd, attr, size);
> > +	int err = syscall(__NR_bpf, cmd, attr, size);
> > +	if (err == -1 && (errno == EINVAL || errno == E2BIG)) {
> I would add a check to the length of map_name/prog_name.

Can you elaborate? What kind of check?
 
> > +		const unsigned int old_union_size = offsetof(union bpf_attr, prog_name);
> > +		/*
> > +		 * These were the ones that added fields after the old bpf_attr
> > +		 * layout in commit 88cda1c9da02 ("bpf: libbpf: Provide basic
> > +		 * API support to specify BPF obj name") so zero that out to
> > +		 * pass the CHECK_ATTR() test in kernel/bpf/syscall.c in older
> > +		 * kernels.
> > +		 */
> > +		if (cmd == BPF_MAP_CREATE)
> > +			memset(&attr->map_name, 0, size - offsetof(union bpf_attr, map_name));
> > +		else
> > +			memset(&attr->prog_name, 0, size - old_union_size);

> If bpf_attr is extended in the future,  map_name/prog_name will still be
> used as the anchor for backward compatibility instead of trial and error
> attribute by attribute?

Then you first try the latest and greatest; if it fails, go to the
previous one (like here); if that fails too, go one further back, etc.

That or some sort of versioning to make sure the kernel and the tools
can agree on a common set of functionality supported by both.

Again, this is how perf_evsel__open() works in the perf case: try the
latest and keep falling back to the most recent set of features that
could somehow service what is needed, or disable some feature and warn
the user, i.e. do the best you can with what you have.

With this patch in place I was able to have what was working before
88cda1c9da02 working again.
 
> Instead of sinking all future bpf_attr's backward compatibility
> requirements to sys_bpf,  I would push it up to its own BPF_* command
> helper which has a better sense of its bpf_attr, i.e. push it up
> to bpf_create_map_node() and bpf_load_program_name() in this case.

Humm, we could try that approach, but the one in this patch seemed good
enough.

And after all, the first syscall() invocation will work with the latest
kernel and latest tooling, right?

I think that we also need a bpf__strerror_METHOD() that can make sense
of error returns per method (aka "command helper"), to tell the user why
something can't be done: "you need a kernel >= 4.X", "this needs root
privs", "too many maps", etc.

Right now we have a generic, libbpf-wide one:

static const char *libbpf_strerror_table[NR_ERRNO] = {
        [ERRCODE_OFFSET(LIBELF)]        = "Something wrong in libelf",
        [ERRCODE_OFFSET(FORMAT)]        = "BPF object format invalid",
        [ERRCODE_OFFSET(KVERSION)]      = "'version' section incorrect or lost",
        [ERRCODE_OFFSET(ENDIAN)]        = "Endian mismatch",
        [ERRCODE_OFFSET(INTERNAL)]      = "Internal error in libbpf",
        [ERRCODE_OFFSET(RELOC)]         = "Relocation failed",
        [ERRCODE_OFFSET(VERIFY)]        = "Kernel verifier blocks program loading",
        [ERRCODE_OFFSET(PROG2BIG)]      = "Program too big",
        [ERRCODE_OFFSET(KVER)]          = "Incorrect kernel version",
        [ERRCODE_OFFSET(PROGTYPE)]      = "Kernel doesn't support this program type",
};

int libbpf_strerror(int err, char *buf, size_t size)
{

	/* use the above or return strerror_r() for the errno usual range */
}

But as time goes by, features get added, ABI gets extended, and without
a better kernel/user error reporting mechanism... Making sense of errno
in the context of a specific method and looking at the system state
looks like the best we can do to provide not-so-cryptic messages to the
poor users, and we're all users :-)

- Arnaldo
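
A rough sketch of what such a per-command error formatter might look
like. Everything here is hypothetical: sketch_strerror_cmd(), the
sketch_cmd enum and the errno-to-hint mapping are illustrative names and
guesses, not existing libbpf API, and a real version would likely also
look at system state before picking a message.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical command identifiers; the real ones live in enum bpf_cmd. */
enum sketch_cmd { SKETCH_MAP_CREATE, SKETCH_PROG_LOAD };

/*
 * Hypothetical per-command formatter: interprets an errno value in the
 * context of the command that failed and writes a hint into buf.
 * Returns 0 on success, -1 if the message did not fit in buf.
 */
static int sketch_strerror_cmd(enum sketch_cmd cmd, int err,
			       char *buf, size_t size)
{
	const char *what = cmd == SKETCH_MAP_CREATE ? "map creation"
						    : "program load";
	const char *hint;
	int n;

	assert(buf != NULL);

	switch (err) {
	case EPERM:
		hint = "this needs root privileges";
		break;
	case EINVAL:
	case E2BIG:
		/* Assumed mapping: an attribute the running kernel rejects. */
		hint = "the kernel may be too old for an attribute that was passed";
		break;
	default:
		/* Fall back to the generic errno text. */
		hint = strerror(err);
		break;
	}

	n = snprintf(buf, size, "%s failed: %s", what, hint);
	return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

A caller would pass the errno saved right after the failing bpf()
wrapper returned, so the hint is tied to the command that was attempted.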
 
> > +		err = syscall(__NR_bpf, cmd, attr, old_union_size);
> > +	}
> > +	return err;
> >  }
> >  
> >  int bpf_create_map_node(enum bpf_map_type map_type, const char *name,


* Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
  2017-11-30 19:00             ` Arnaldo Carvalho de Melo
@ 2017-11-30 21:51               ` Alexei Starovoitov
  2017-12-01 17:51                 ` Arnaldo Carvalho de Melo
  2017-11-30 22:09               ` Martin KaFai Lau
  1 sibling, 1 reply; 12+ messages in thread
From: Alexei Starovoitov @ 2017-11-30 21:51 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Martin KaFai Lau
  Cc: Wang Nan, Daniel Borkmann, David S. Miller, David Ahern,
	Ingo Molnar, Linux Kernel Mailing List, netdev

On 11/30/17 11:00 AM, Arnaldo Carvalho de Melo wrote:
>> Instead of sinking all future bpf_attr's backward compatibility
>> requirements to sys_bpf,  I would push it up to its own BPF_* command
>> helper which has a better sense of its bpf_attr, i.e. push it up
>> to bpf_create_map_node() and bpf_load_program_name() in this case.
> Humm, we could try that approach, but the one in this patch seemed good
> enough.
>
> And after all, the first syscall() invocation will work with the latest
> kernel and latest tooling, right?

I agree with Martin, and I also don't think it will work to push the
logic of all bpf commands into a single sys_bpf syscall wrapper.
This logic will become more and more complex over time.
This case really belongs in bpf_create_map(), which is a wrapper
on top of the single BPF_MAP_CREATE command.

Note that it's the first time we're facing this
'new libbpf.a running on top of old kernel' issue, and we should be
very careful adding such fallback code to the generic bpf library,
since all the selftests/bpf/ are using this lib and relying on
expected behavior. We don't want tests that are meant to exercise the
latest kernel feature to suddenly pass on an old kernel that doesn't
have it.

To some degree perf and selftests/bpf needs are diverging here,
so adding #ifdef to libbpf.a to match testcase expectations may be
necessary.


* Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
  2017-11-30 19:00             ` Arnaldo Carvalho de Melo
  2017-11-30 21:51               ` Alexei Starovoitov
@ 2017-11-30 22:09               ` Martin KaFai Lau
  1 sibling, 0 replies; 12+ messages in thread
From: Martin KaFai Lau @ 2017-11-30 22:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alexei Starovoitov, Wang Nan, Daniel Borkmann, Matthias Kaehlcke,
	Josh Poimboeuf, Yonghong Song, David S. Miller,
	Alexei Starovoitov, Adrian Hunter, David Ahern, Jiri Olsa,
	Ingo Molnar, Namhyung Kim, Linux Kernel Mailing List,
	Andrey Ryabinin

On Thu, Nov 30, 2017 at 04:00:42PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Nov 30, 2017 at 10:28:08AM -0800, Martin KaFai Lau escreveu:
> > On Thu, Nov 30, 2017 at 01:53:58PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Nov 30, 2017 at 12:01:10AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > Em Wed, Nov 29, 2017 at 02:31:36PM -0800, Martin KaFai Lau escreveu:
> > > > > On Wed, Nov 29, 2017 at 06:15:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > Em Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau escreveu:
> > > > > > > On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > > > [root@jouet ~]# perf test -v bpf
> > > > > > > > 39: BPF filter                                            :
> > > > > > > > 39.1: Basic BPF filtering                                 :
> > > > > > > > Kernel build dir is set to /lib/modules/4.14.0+/build
> > > > > > > [ ... ]
> > > > > > > > libbpf: failed to create map (name: 'flip_table'): Invalid argument
> > > > > > > > libbpf: failed to load object '[basic_bpf_test]'
> > > > > > > > bpf: load objects failed
> > > > > > > 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
> > > > > > > is introduced in 4.15.
> > > > 
> > > > > > > I think the perf@kernel-4.15 broke on older kernels like 4.14 because
> > > > > > > the new bpf prog/map name is only introduced since 4.15.
> 
> > > > > > > The newer perf needs to be compatible with an older kernel?
> 
> > > > > > Sure :-)
> 
> > > > > Would the latest features introduced in perf/libbpf supposed to be
> > > > > available in the latest kernel only?  What may be the reason that the
> 
> > > > Yes, then the new perf binary should try to use the new stuff, if it
> > > > fails, use the old one, there is no requirement that one uses perf 4.14
> > > > in lockstep with the kernel 4.14 (or any other version), perf 4.15
> > > > should work with the 4.14 kernel as well as with 4.15 (or any other
> > > > future kernel), only limited by what it can grok up to when it was
> > > > released.
> 
> > > So, see the patch below, that makes a 'perf test bpf' and my other test
> > > cases, including that one for probe_read_str() work again, it just
> > > fallbacks to a behaviour the older kernels can accept.
> 
> > Thanks for the patch.
>  
> > > We can improve it so that that EINVAL fallback happens only for
> > > MAP_CREATE, and probably we don't need to change the size arg, just zero
> > > the unused fields, but I haven't checked that.
> > > 
> > > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > > index 5128677e4117..3084f07c7c33 100644
> > > --- a/tools/lib/bpf/bpf.c
> > > +++ b/tools/lib/bpf/bpf.c
> > > @@ -19,6 +19,7 @@
> > >   * License along with this program; if not,  see <http://www.gnu.org/licenses>
> > >   */
> > >  
> > > +#include <errno.h>
> > >  #include <stdlib.h>
> > >  #include <memory.h>
> > >  #include <unistd.h>
> > > @@ -53,10 +54,26 @@ static inline __u64 ptr_to_u64(const void *ptr)
> > >  	return (__u64) (unsigned long) ptr;
> > >  }
> > >  
> > > -static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
> > > -			  unsigned int size)
> > > +static int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int size)
> > >  {
> > > -	return syscall(__NR_bpf, cmd, attr, size);
> > > +	int err = syscall(__NR_bpf, cmd, attr, size);
> > > +	if (err == -1 && (errno == EINVAL || errno == E2BIG)) {
> > I would add a check to the length of map_name/prog_name.
> 
> Can you elaborate? What kind of check?
For example, if map_name is not set (i.e. strlen is 0), there is no
need to retry.

>  
> > > +		const unsigned int old_union_size = offsetof(union bpf_attr, prog_name);
> > > +		/*
> > > +		 * These were the ones that added fields after the old bpf_attr
> > > +		 * layout in commit 88cda1c9da02 ("bpf: libbpf: Provide basic
> > > +		 * API support to specify BPF obj name") so zero that out to
> > > +		 * pass the CHECK_ATTR() test in kernel/bpf/syscall.c in older
> > > +		 * kernels.
> > > +		 */
> > > +		if (cmd == BPF_MAP_CREATE)
> > > +			memset(&attr->map_name, 0, size - offsetof(union bpf_attr, map_name));
> > > +		else
> > > +			memset(&attr->prog_name, 0, size - old_union_size);
> 
> > If bpf_attr is extended in the future,  map_name/prog_name will still be
> > used as the anchor for backward compatibility instead of trial and error
> > attribute by attribute?
> 
> Then you will first try the latest and greatest, if it fails, go the
> previous (like here), if it fails, etc.
> 
> That or some sort of versioning to make sure the kernel and the tools
> can agree on a common set of functionality supported by both.
> 
> Again, this is how perf_evsel__open() works in the perf case, try the
> latest and go fallbacking to the most recent set of features that could
> somehow service what is needed or disable some feature and warn the
> user, i.e. do the best you can with what you have.
> 
> With this patch in place I was able to have what was working before
> 88cda1c9da02 working again.
>  
> > Instead of sinking all future bpf_attr's backward compatibility
> > requirements to sys_bpf,  I would push it up to its own BPF_* command
> > helper which has a better sense of its bpf_attr, i.e. push it up
> > to bpf_create_map_node() and bpf_load_program_name() in this case.
> 
> Humm, we could try that approach, but the one in this patch seemed good
> enough.
> 
> > And after all, the first syscall() invocation will work with the latest
> > kernel and latest tooling, right?
> 
> I think that we need as well bpf__strerror_METHOD() that can make sense
> of error returns per method (aka "command helper"), to tell the user
> when something can't be done why is that so: "you need a kernel >= 4.X",
> "this needs root privs", "too many maps, etc"
> 
> Right now we have a generic, libbpf wide one:
> 
> static const char *libbpf_strerror_table[NR_ERRNO] = {
>         [ERRCODE_OFFSET(LIBELF)]        = "Something wrong in libelf",
>         [ERRCODE_OFFSET(FORMAT)]        = "BPF object format invalid",
>         [ERRCODE_OFFSET(KVERSION)]      = "'version' section incorrect or lost",
>         [ERRCODE_OFFSET(ENDIAN)]        = "Endian mismatch",
>         [ERRCODE_OFFSET(INTERNAL)]      = "Internal error in libbpf",
>         [ERRCODE_OFFSET(RELOC)]         = "Relocation failed",
>         [ERRCODE_OFFSET(VERIFY)]        = "Kernel verifier blocks program loading",
>         [ERRCODE_OFFSET(PROG2BIG)]      = "Program too big",
>         [ERRCODE_OFFSET(KVER)]          = "Incorrect kernel version",
>         [ERRCODE_OFFSET(PROGTYPE)]      = "Kernel doesn't support this program type",
> };
> 
> int libbpf_strerror(int err, char *buf, size_t size)
> {
> 
> 	/* use the above or return strerror_r() for the errno usual range */
> }
> 
> But as time goes by, features get added, ABI gets extended, and without
> a better kernel/user error reporting mechanism... Making sense of errno
> in the context of a specific method and looking at the system state
> looks like the best we can do to provide not-so-cryptic messages to the
> poor users, and we're all users :-)
> 
> - Arnaldo
>  
> > > +		err = syscall(__NR_bpf, cmd, attr, old_union_size);
> > > +	}
> > > +	return err;
> > >  }
> > >  
> > >  int bpf_create_map_node(enum bpf_map_type map_type, const char *name,


* Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
  2017-11-30 21:51               ` Alexei Starovoitov
@ 2017-12-01 17:51                 ` Arnaldo Carvalho de Melo
  2017-12-02  1:15                   ` Alexei Starovoitov
  0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-01 17:51 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Martin KaFai Lau, Wang Nan, Daniel Borkmann, David S. Miller,
	David Ahern, Ingo Molnar, Linux Kernel Mailing List, netdev

Em Thu, Nov 30, 2017 at 01:51:15PM -0800, Alexei Starovoitov escreveu:
> On 11/30/17 11:00 AM, Arnaldo Carvalho de Melo wrote:
> > > Instead of sinking all future bpf_attr's backward compatibility
> > > requirements to sys_bpf,  I would push it up to its own BPF_* command
> > > helper which has a better sense of its bpf_attr, i.e. push it up
> > > to bpf_create_map_node() and bpf_load_program_name() in this case.
> > Humm, we could try that approach, but the one in this patch seemed good
> > enough.
> > 
> > And after all, the first syscall() invocation will work with the latest
> > kernel and latest tooling, right?
> 
> I agree with Martin and I also don't think it will work to push
> logic of all bpf commands into single sys_bpf syscall wrapper.

Sure, that was just a POC, I'll work on something that takes into
account what you guys pointed out.

> This logic will become more and more complex over time.
> Like this case really belongs in bpf_create_map() which is a wrapper
> on top of single BPF_MAP_CREATE command.
 
> Note it's the first time we're facing this 'new libbpf.a running on
> top of old kernel' issue and should be very careful adding such
> fallback code to the generic bpf library, since all the selftests/bpf/
> are using this lib and relying on expected behavior.

Right, tools/perf/ uses it as well and relies on its continued
functioning.

> We don't want tests that want to test the latest kernel feature all of
> a sudden pass on old kernel that doesn't have it.

Sure, neither do I :-)
 
> To some degree perf and selftests/bpf needs are diverging here,
> so adding #ifdef to libbpf.a to match testcase expectations may be
> necessary.

But this is not just testcase expectations, the use case is someone
wanting to use a newer tool, with perhaps some new features of interest
that don't depend on changes in the kernel, on an older kernel on a
system where updating it is not possible or desirable.

- Arnaldo


* Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
  2017-12-01 17:51                 ` Arnaldo Carvalho de Melo
@ 2017-12-02  1:15                   ` Alexei Starovoitov
  0 siblings, 0 replies; 12+ messages in thread
From: Alexei Starovoitov @ 2017-12-02  1:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Martin KaFai Lau, Wang Nan, Daniel Borkmann, David S. Miller,
	David Ahern, Ingo Molnar, Linux Kernel Mailing List, netdev

On 12/1/17 9:51 AM, Arnaldo Carvalho de Melo wrote:
>
> But this is not just testcase expectations, the usecase is someone
> wanting to use a newer tool, with perhaps some new features of interest
> that don't depend on changes in the kernel, in an older kernel on a
> system where updating it is not possible or desirable.

I think it's also dangerous for a core library like libbpf to
be smarter than the tool that is using it.
In this case we added prog and map names by default into the loader and
create_map functions to make sure that all tools pick them up
automatically and we can see somewhat more human-readable bpf names
in kernel stack traces and in debug tools like bpftool, bcc/bps.
When the kernel is older and doesn't support prog/map names, it's
perfectly reasonable to fall back to map creation without the name, but
the library shouldn't be doing it in all cases.
For example, the prog_load command recently got a new prog_ifindex
field. It would be incorrect to fall back to loading without it.
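
A minimal sketch of the per-command fallback discussed in this thread,
under stated assumptions: struct map_create_attr is a simplified
stand-in for the BPF_MAP_CREATE part of union bpf_attr, and
create_map_with_fallback(), map_create_fn and fake_old_kernel() are
hypothetical names introduced only so the logic can be exercised without
a kernel. It retries without the name only when a name was actually set
and the failure is EINVAL or E2BIG, and it never drops any other
attribute.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define OBJ_NAME_LEN 16	/* assumed to mirror BPF_OBJ_NAME_LEN */

/* Simplified, hypothetical stand-in for the map-create attributes. */
struct map_create_attr {
	unsigned int map_type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
	char map_name[OBJ_NAME_LEN];	/* unknown to older kernels */
};

/* Hypothetical hook standing in for syscall(__NR_bpf, BPF_MAP_CREATE, ...). */
typedef int (*map_create_fn)(const struct map_create_attr *attr, size_t size);

/*
 * Per-command fallback: only the optional name is dropped on retry.
 * Attributes that change semantics must never be silently discarded.
 */
static int create_map_with_fallback(map_create_fn do_create,
				    struct map_create_attr *attr)
{
	int fd;

	assert(do_create != NULL);

	fd = do_create(attr, sizeof(*attr));
	if (fd >= 0 || attr->map_name[0] == '\0')
		return fd;	/* success, or no name set: nothing to retry */
	if (errno != EINVAL && errno != E2BIG)
		return fd;	/* unrelated failure: report it as is */

	/* Clear the name and pass only the size older kernels know about. */
	memset(attr->map_name, 0, sizeof(attr->map_name));
	return do_create(attr, offsetof(struct map_create_attr, map_name));
}

static int fake_calls;	/* how many times the fake kernel was entered */

/* Fake "old kernel": rejects non-zero bytes past the fields it knows. */
static int fake_old_kernel(const struct map_create_attr *attr, size_t size)
{
	const size_t known = offsetof(struct map_create_attr, map_name);
	const unsigned char *p = (const unsigned char *)attr;
	size_t i;

	fake_calls++;
	for (i = known; i < size; i++) {
		if (p[i]) {
			errno = EINVAL;
			return -1;
		}
	}
	return 3;	/* pretend file descriptor */
}
```

Keeping the retry inside the map-creation helper means a command such as
program load can decide for itself that a field like prog_ifindex is not
optional and must not be dropped.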


end of thread, other threads:[~2017-12-02  1:16 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
2017-11-28 19:05 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name" Arnaldo Carvalho de Melo
2017-11-29 21:07 ` Martin KaFai Lau
2017-11-29 21:15   ` Arnaldo Carvalho de Melo
2017-11-29 22:31     ` Martin KaFai Lau
2017-11-30  3:01       ` Arnaldo Carvalho de Melo
2017-11-30 16:53         ` [PATCH/RFC] " Arnaldo Carvalho de Melo
2017-11-30 18:28           ` Martin KaFai Lau
2017-11-30 19:00             ` Arnaldo Carvalho de Melo
2017-11-30 21:51               ` Alexei Starovoitov
2017-12-01 17:51                 ` Arnaldo Carvalho de Melo
2017-12-02  1:15                   ` Alexei Starovoitov
2017-11-30 22:09               ` Martin KaFai Lau
