All of lore.kernel.org
 help / color / mirror / Atom feed
* net-next libbpf broken on prev kernel release
@ 2017-12-14  9:16 Eric Leblond
  2017-12-14  9:52 ` Daniel Borkmann
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Leblond @ 2017-12-14  9:16 UTC (permalink / raw)
  To: netdev, Martin KaFai Lau

Hello,

It seems that the following patch did break libbpf (in net-next
version) which is not able to load anymore a program on a 4.14:

tree 5096ddd73981e33a2164606461a45b56a189889c
parent ad5b177bd73f5107d97c36f56395c4281fb6f089
author Martin KaFai Lau <kafai@fb.com> Wed Sep 27 14:37:54 2017 -0700
committer David S. Miller <davem@davemloft.net> Fri Sep 29 06:17:05 2017 +0100

bpf: libbpf: Provide basic API support to specify BPF obj name

The problem comes from

-int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
-		     size_t insns_cnt, const char *license,
-		     __u32 kern_version, char *log_buf, size_t log_buf_sz)
+int bpf_load_program_name(enum bpf_prog_type type, const char *name,
+			  const struct bpf_insn *insns,
+			  size_t insns_cnt, const char *license,
+			  __u32 kern_version, char *log_buf,
+			  size_t log_buf_sz)
 {
 	int fd;
 	union bpf_attr attr;
+	__u32 name_len = name ? strlen(name) : 0;
 
 	bzero(&attr, sizeof(attr));
 	attr.prog_type = type;
@@ -130,6 +151,7 @@ int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
 	attr.log_size = 0;
 	attr.log_level = 0;
 	attr.kern_version = kern_version;
+	memcpy(attr.prog_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));

If I comment the memcpy then the eBPF program is loading correctly.

Is this a wanted behavior to have libbpf that needs to be in sync with
kernel ? or should it be fixed ?

BR,
-- 
Eric Leblond <eric@regit.org>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: net-next libbpf broken on prev kernel release
  2017-12-14  9:16 net-next libbpf broken on prev kernel release Eric Leblond
@ 2017-12-14  9:52 ` Daniel Borkmann
  2017-12-14 14:28   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel Borkmann @ 2017-12-14  9:52 UTC (permalink / raw)
  To: Eric Leblond, Martin KaFai Lau, acme, ast; +Cc: netdev

[ +acme, +ast ]

On 12/14/2017 10:16 AM, Eric Leblond wrote:
> Hello,
> 
> It seems that the following patch did break libbpf (in net-next
> version) which is not able to load anymore a program on a 4.14:
> 
> tree 5096ddd73981e33a2164606461a45b56a189889c
> parent ad5b177bd73f5107d97c36f56395c4281fb6f089
> author Martin KaFai Lau <kafai@fb.com> Wed Sep 27 14:37:54 2017 -0700
> committer David S. Miller <davem@davemloft.net> Fri Sep 29 06:17:05 2017 +0100
> 
> bpf: libbpf: Provide basic API support to specify BPF obj name
> 
> The problem comes from
> 
> -int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
> -		     size_t insns_cnt, const char *license,
> -		     __u32 kern_version, char *log_buf, size_t log_buf_sz)
> +int bpf_load_program_name(enum bpf_prog_type type, const char *name,
> +			  const struct bpf_insn *insns,
> +			  size_t insns_cnt, const char *license,
> +			  __u32 kern_version, char *log_buf,
> +			  size_t log_buf_sz)
>  {
>  	int fd;
>  	union bpf_attr attr;
> +	__u32 name_len = name ? strlen(name) : 0;
>  
>  	bzero(&attr, sizeof(attr));
>  	attr.prog_type = type;
> @@ -130,6 +151,7 @@ int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
>  	attr.log_size = 0;
>  	attr.log_level = 0;
>  	attr.kern_version = kern_version;
> +	memcpy(attr.prog_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));
> 
> If I comment the memcpy then the eBPF program is loading correctly.
> 
> Is this a wanted behavior to have libbpf that needs to be in sync with
> kernel ? or should it be fixed ?

Yeah, this was reported recently here: https://lkml.org/lkml/2017/11/28/1246

I agree that given the policy of perf tool is to try to use new features
but if they fail on older kernels, then we should try to fallback whenever
that is feasible. I think for this specific case, we should in-fact fallback
and try w/o map/prog name in order to fix this regression for perf (or
other lib users).

Also agree that this cannot be done for every possible case like the mentioned
prog_ifindex field for offloading to NIC in the thread above, but I imho
the prog_ifindex is a slightly different situation given that a user needs
to specifically ask to offload via some provided API.

I think the fix should be: if a user *specifically* calls bpf_load_program_name()
or bpf_create_map_name() API from the lib, then the intention is very clear
that the bpf object should be created *with* name and otherwise fail. So a
fallback for these APIs to load w/o name would be inappropriate! But for the
existing code that used to load objects before, e.g. bpf_object__create_maps()
or bpf_program__load() it should try to use either mentioned bpf_*_name() APIs
and *iff* they fail, fall-back to the normal ones w/o name attribute. Meaning,
this kind of fall-back should be done, but not on a sys_bpf() layer but from
a higher PoV in the lib instead. I guess it would make sense to probe the
underlying kernel at startup and then based on its capabilities use one out
of the two APIs when we get there, such that we don't need to uselessly retry
APIs for each prog load.

Arnaldo, will there be a rework of your fix that we could route to bpf tree?

Thanks a lot,
Daniel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: net-next libbpf broken on prev kernel release
  2017-12-14  9:52 ` Daniel Borkmann
@ 2017-12-14 14:28   ` Arnaldo Carvalho de Melo
  2017-12-14 20:29     ` Daniel Borkmann
  0 siblings, 1 reply; 4+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-14 14:28 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Eric Leblond, Martin KaFai Lau, ast, netdev

Em Thu, Dec 14, 2017 at 10:52:19AM +0100, Daniel Borkmann escreveu:
> [ +acme, +ast ]
> 
> On 12/14/2017 10:16 AM, Eric Leblond wrote:
> > Hello,
> > 
> > It seems that the following patch did break libbpf (in net-next
> > version) which is not able to load anymore a program on a 4.14:
> > 
> > tree 5096ddd73981e33a2164606461a45b56a189889c
> > parent ad5b177bd73f5107d97c36f56395c4281fb6f089
> > author Martin KaFai Lau <kafai@fb.com> Wed Sep 27 14:37:54 2017 -0700
> > committer David S. Miller <davem@davemloft.net> Fri Sep 29 06:17:05 2017 +0100
> > 
> > bpf: libbpf: Provide basic API support to specify BPF obj name
> > 
> > The problem comes from
> > 
> > -int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
> > -		     size_t insns_cnt, const char *license,
> > -		     __u32 kern_version, char *log_buf, size_t log_buf_sz)
> > +int bpf_load_program_name(enum bpf_prog_type type, const char *name,
> > +			  const struct bpf_insn *insns,
> > +			  size_t insns_cnt, const char *license,
> > +			  __u32 kern_version, char *log_buf,
> > +			  size_t log_buf_sz)
> >  {
> >  	int fd;
> >  	union bpf_attr attr;
> > +	__u32 name_len = name ? strlen(name) : 0;
> >  
> >  	bzero(&attr, sizeof(attr));
> >  	attr.prog_type = type;
> > @@ -130,6 +151,7 @@ int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
> >  	attr.log_size = 0;
> >  	attr.log_level = 0;
> >  	attr.kern_version = kern_version;
> > +	memcpy(attr.prog_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));
> > 
> > If I comment the memcpy then the eBPF program is loading correctly.
> > 
> > Is this a wanted behavior to have libbpf that needs to be in sync with
> > kernel ? or should it be fixed ?
 
> Yeah, this was reported recently here: https://lkml.org/lkml/2017/11/28/1246
 
> I agree that given the policy of perf tool is to try to use new features
> but if they fail on older kernels, then we should try to fallback whenever
> that is feasible. I think for this specific case, we should in-fact fallback
> and try w/o map/prog name in order to fix this regression for perf (or
> other lib users).
 
> Also agree that this cannot be done for every possible case like the mentioned
> prog_ifindex field for offloading to NIC in the thread above, but I imho
> the prog_ifindex is a slightly different situation given that a user needs
> to specifically ask to offload via some provided API.
 
> I think the fix should be: if a user *specifically* calls bpf_load_program_name()
> or bpf_create_map_name() API from the lib, then the intention is very clear
> that the bpf object should be created *with* name and otherwise fail. So a
> fallback for these APIs to load w/o name would be inappropriate! But for the
> existing code that used to load objects before, e.g. bpf_object__create_maps()
> or bpf_program__load() it should try to use either mentioned bpf_*_name() APIs
> and *iff* they fail, fall-back to the normal ones w/o name attribute. Meaning,
> this kind of fall-back should be done, but not on a sys_bpf() layer but from
> a higher PoV in the lib instead. I guess it would make sense to probe the
> underlying kernel at startup and then based on its capabilities use one out
> of the two APIs when we get there, such that we don't need to uselessly retry
> APIs for each prog load.

tools/perf/ has:

static struct {
        bool sample_id_all;
        bool exclude_guest;
        bool mmap2;
        bool cloexec;
        bool clockid;
        bool clockid_wrong;
        bool lbr_flags;
        bool write_backward;
        bool group_read;
} perf_missing_features;

When the user request something that needs some of these features we try
using it, failing it will mark it as missing and then other events will
not needlessly try using it, i.e. we don't do it at program start, we
leave that to when we actually need it, to avoid uselessly probing at
startup.

> Arnaldo, will there be a rework of your fix that we could route to bpf tree?

I'm resuming work on it after I get my current batch tested and
submitted, will reboot with an older kernel and follow your suggestions,
that seems to match Alexei's and Martin's, my patch was just a RFC to
show that we need a fallback for older kernels.

I needed to move on, so I updated my machine to a kernel where interlock
of tools/ with the kernel happens and it worked, so I left this to see
if someone else complained or if I was being too picky. :-)

- Arnaldo

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: net-next libbpf broken on prev kernel release
  2017-12-14 14:28   ` Arnaldo Carvalho de Melo
@ 2017-12-14 20:29     ` Daniel Borkmann
  0 siblings, 0 replies; 4+ messages in thread
From: Daniel Borkmann @ 2017-12-14 20:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Eric Leblond, Martin KaFai Lau, ast, netdev

On 12/14/2017 03:28 PM, Arnaldo Carvalho de Melo wrote:
[...]
>> Arnaldo, will there be a rework of your fix that we could route to bpf tree?
> 
> I'm resuming work on it after I get my current batch tested and
> submitted, will reboot with an older kernel and follow your suggestions,
> that seems to match Alexei's and Martin's, my patch was just a RFC to
> show that we need a fallback for older kernels.

Okay, sounds good, thanks a lot!

> I needed to move on, so I updated my machine to a kernel where interlock
> of tools/ with the kernel happens and it worked, so I left this to see
> if someone else complained or if I was being too picky. :-)

:)

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-12-14 20:29 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-14  9:16 net-next libbpf broken on prev kernel release Eric Leblond
2017-12-14  9:52 ` Daniel Borkmann
2017-12-14 14:28   ` Arnaldo Carvalho de Melo
2017-12-14 20:29     ` Daniel Borkmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.