From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751319AbdK3WKF (ORCPT ); Thu, 30 Nov 2017 17:10:05 -0500 Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:38162 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1750923AbdK3WKE (ORCPT ); Thu, 30 Nov 2017 17:10:04 -0500 Date: Thu, 30 Nov 2017 14:09:13 -0800 From: Martin KaFai Lau To: Arnaldo Carvalho de Melo CC: Alexei Starovoitov , Wang Nan , "Daniel Borkmann" , Matthias Kaehlcke , "Josh Poimboeuf" , Yonghong Song , "David S. Miller" , Alexei Starovoitov , Adrian Hunter , David Ahern , Jiri Olsa , Ingo Molnar , Namhyung Kim , "Linux Kernel Mailing List" , Andrey Ryabinin Subject: Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name" Message-ID: <20171130220912.jqngrv2kt7vjxl3s@kafai-mbp.dhcp.thefacebook.com> References: <20171128190519.GM3298@kernel.org> <20171129210734.lqs23q65ac6avlwr@kafai-mbp> <20171129211543.GC31403@kernel.org> <20171129223135.6iqvj6ho4ojxmhu6@kafai-mbp> <20171130030110.GA18880@kernel.org> <20171130165358.GN3298@kernel.org> <20171130182807.sjhapvmnimtlsmpo@kafai-mbp.dhcp.thefacebook.com> <20171130190042.GQ3298@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline In-Reply-To: <20171130190042.GQ3298@kernel.org> User-Agent: NeoMutt/20171027 X-Originating-IP: [192.168.52.123] X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-11-30_06:,, signatures=0 X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Nov 30, 2017 at 04:00:42PM -0300, Arnaldo Carvalho de Melo wrote: > Em Thu, Nov 30, 2017 at 10:28:08AM -0800, Martin KaFai Lau escreveu: > > On Thu, Nov 30, 2017 at 01:53:58PM -0300, Arnaldo Carvalho de Melo wrote: > > > Em Thu, Nov 30, 2017 at 12:01:10AM -0300, Arnaldo Carvalho de Melo escreveu: > > > > Em Wed, Nov 29, 2017 at 02:31:36PM -0800, Martin KaFai Lau escreveu: > > > > > On Wed, Nov 29, 2017 at 06:15:43PM -0300, Arnaldo Carvalho de Melo wrote: > > > > > > Em Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau escreveu: > > > > > > > On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote: > > > > > > > > [root@jouet ~]# perf test -v bpf > > > > > > > > 39: BPF filter : > > > > > > > > 39.1: Basic BPF filtering : > > > > > > > > Kernel build dir is set to /lib/modules/4.14.0+/build > > > > > > > [ ... ] > > > > > > > > libbpf: failed to create map (name: 'flip_table'): Invalid argument > > > > > > > > libbpf: failed to load object '[basic_bpf_test]' > > > > > > > > bpf: load objects failed > > > > > > > 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name") > > > > > > > is introduced in 4.15. > > > > > > > > > > > I think the perf@kernel-4.15 broke on older kernels like 4.14 because > > > > > > > the new bpf prog/map name is only introduced since 4.15. > > > > > > > > The newer perf needs to be compatible with an older kernel? > > > > > > > Sure :-) > > > > > > Would the latest features introduced in perf/libbpf supposed to be > > > > > available in the latest kernel only? What may be the reason that the > > > > > Yes, then the new perf binary should try to use the new stuff, if it > > > > fails, use the old one, there is no requirement that one uses perf 4.14 > > > > in lockstep with the kernel 4.14 (or any other version), perf 4.15 > > > > should work with the 4.14 kernel as well as with 4.15 (or any other > > > > future kernel), only limited by what it can grok up to when it was > > > > released. > > > > So, see the patch below, that makes a 'perf test bpf' and my other test > > > cases, including that one for probe_read_str() work again, it just > > > fallbacks to a behaviour the older kernels can accept. > > > Thanks for the patch. > > > > We can improve it so that that EINVAL fallback happens only for > > > MAP_CREATE, and probably we don't need to change the size arg, just zero > > > the unused fields, but I haven't checked that. > > > > > > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c > > > index 5128677e4117..3084f07c7c33 100644 > > > --- a/tools/lib/bpf/bpf.c > > > +++ b/tools/lib/bpf/bpf.c > > > @@ -19,6 +19,7 @@ > > > * License along with this program; if not, see > > > */ > > > > > > +#include > > > #include > > > #include > > > #include > > > @@ -53,10 +54,26 @@ static inline __u64 ptr_to_u64(const void *ptr) > > > return (__u64) (unsigned long) ptr; > > > } > > > > > > -static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, > > > - unsigned int size) > > > +static int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int size) > > > { > > > - return syscall(__NR_bpf, cmd, attr, size); > > > + int err = syscall(__NR_bpf, cmd, attr, size); > > > + if (err == -1 && (errno == EINVAL || errno == E2BIG)) { > > I would add a check to the length of map_name/prog_name. > > Can you elaborate? What kind of check? F.e. if map_name is not set (i.e. strlen is 0), there is no need to retry. > > > > + const unsigned int old_union_size = offsetof(union bpf_attr, prog_name); > > > + /* > > > + * These were the ones that added fields after the old bpf_attr > > > + * layout in commit 88cda1c9da02 ("bpf: libbpf: Provide basic > > > + * API support to specify BPF obj name") so zero that out to > > > + * pass the CHECK_ATTR() test in kernel/bpf/syscall.c in older > > > + * kernels. > > > + */ > > > + if (cmd == BPF_MAP_CREATE) > > > + memset(&attr->map_name, 0, size - offsetof(union bpf_attr, map_name)); > > > + else > > > + memset(&attr->prog_name, 0, size - old_union_size); > > > If bpf_attr is extended in the future, map_name/prog_name will still be > > used as the anchor for backward compatibility instead of trial and error > > attribute by attribute? > > Then you will first try the latest and greatest, if it fails, go the > previous (like here), if it fails, etc. > > That or some sort of versioning to make sure the kernel and the tools > can agree on a common set of functionality supported by both. > > Again, this is how perf_evsel__open() works in the perf case, try the > latest and go fallbacking to the most recent set of features that could > somehow service what is needed or disable some feature and warn the > user, i.e. do the best you can with what you have. > > With this patch in place I was able to have what was working before > 88cda1c9da02 working again. > > > Instead of sinking all future bpf_attr's backward compatibility > > requirements to sys_bpf, I would push it up to its own BPF_* command > > helper which has a better sense of its bpf_attr, i.e. push it up > > to bpf_create_map_node() and bpf_load_program_name() in this case. > > Humm, we could try that approach, but the one in this patch seemed good > enough. > > And after all if the first syscall() invokation, with the latest kernel > and latest tooling will work, right? > > I think that we need as well bpf__strerror_METHOD() that can make sense > of error returns per method (aka "command helper"), to tell the user > when something can't be done why is that so: "you need a kernel >= 4.X", > "this needs root privs", "too many maps, etc" > > Right now we have a generic, libbpf wide one: > > static const char *libbpf_strerror_table[NR_ERRNO] = { > [ERRCODE_OFFSET(LIBELF)] = "Something wrong in libelf", > [ERRCODE_OFFSET(FORMAT)] = "BPF object format invalid", > [ERRCODE_OFFSET(KVERSION)] = "'version' section incorrect or lost", > [ERRCODE_OFFSET(ENDIAN)] = "Endian mismatch", > [ERRCODE_OFFSET(INTERNAL)] = "Internal error in libbpf", > [ERRCODE_OFFSET(RELOC)] = "Relocation failed", > [ERRCODE_OFFSET(VERIFY)] = "Kernel verifier blocks program loading", > [ERRCODE_OFFSET(PROG2BIG)] = "Program too big", > [ERRCODE_OFFSET(KVER)] = "Incorrect kernel version", > [ERRCODE_OFFSET(PROGTYPE)] = "Kernel doesn't support this program type", > }; > > int libbpf_strerror(int err, char *buf, size_t size) > { > > /* use the above or return strerror_r() for the errno usual range */ > } > > But as time goes by, features get added, ABI gets extended, and without > a better kernel/user error reporting mechanism... Making sense of errno > in the context of a specific method and looking at the system state > looks like the best we can do to provide not-so-cryptic messages to the > poor users, and we're all users :-) > > - Arnaldo > > > > + err = syscall(__NR_bpf, cmd, attr, old_union_size); > > > + } > > > + return err; > > > } > > > > > > int bpf_create_map_node(enum bpf_map_type map_type, const char *name,