* bpf: missed fentry/fexit invocations due to implicit recursion
@ 2023-03-21 17:13 Davide Miola
  2023-03-21 22:55 ` Jiri Olsa
  0 siblings, 1 reply; 10+ messages in thread
From: Davide Miola @ 2023-03-21 17:13 UTC (permalink / raw)
  To: bpf

Hello,

I've been trying to measure the per-task CPU time spent in the kernel
function ip_queue_xmit by attaching an entry and exit probe to said
function in the form of fentry/fexit programs, but I've noticed that
under heavy traffic conditions this solution breaks. I first observed
this as the exit program was occasionally being called more times
than the entry, which of course shouldn't be possible.
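
The setup is conceptually along the lines of the sketch below (map
layout and names are illustrative, not my exact code):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

/* illustrative per-task start-timestamp map */
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, __u32);   /* tid */
    __type(value, __u64); /* entry timestamp (ns) */
} start SEC(".maps");

SEC("fentry/ip_queue_xmit")
int BPF_PROG(entry_prog)
{
    __u32 tid = (__u32)bpf_get_current_pid_tgid();
    __u64 ts = bpf_ktime_get_ns();

    bpf_map_update_elem(&start, &tid, &ts, BPF_ANY);
    return 0;
}

SEC("fexit/ip_queue_xmit")
int BPF_PROG(exit_prog)
{
    __u32 tid = (__u32)bpf_get_current_pid_tgid();
    __u64 *ts = bpf_map_lookup_elem(&start, &tid);

    if (ts) {
        /* accumulate bpf_ktime_get_ns() - *ts into a per-task total */
        bpf_map_delete_elem(&start, &tid);
    }
    return 0;
}

char LICENSE[] SEC("license") = "GPL";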

Below is the stack trace (most recent call last) of one such event:

0xffffffffb8800099 entry_SYSCALL_64_after_hwframe
0xffffffffb879fb49 do_syscall_64
0xffffffffb7d8ff49 __x64_sys_read
0xffffffffb7d8fef5 ksys_read
0xffffffffb7d8d3d3 vfs_read
0xffffffffb7d8caae new_sync_read
0xffffffffb84c0f1f sock_read_iter
0xffffffffb84c0e81 sock_recvmsg
0xffffffffb85ef8bc inet_recvmsg
0xffffffffb85b13f4 tcp_recvmsg
0xffffffffb84ca5b0 release_sock
0xffffffffb84ca535 __release_sock
0xffffffffb85ce0a7 tcp_v4_do_rcv
0xffffffffb85bfd2a tcp_rcv_established
0xffffffffb85b5d32 __tcp_ack_snd_check
0xffffffffb85c8dcc tcp_send_ack
0xffffffffb85c5d0f __tcp_send_ack.part.0
0xffffffffb85a0e75 ip_queue_xmit
0xffffffffc0f3b044 rapl_msr_priv        [intel_rapl_msr]
0xffffffffc0c769aa bpf_prog_6deef7357e7b4530    [bpf]
0xffffffffb7c6707e __htab_map_lookup_elem
0xffffffffb7c66fe0 lookup_nulls_elem_raw
0xffffffffb8800da7 asm_common_interrupt
0xffffffffb87a11ae common_interrupt
0xffffffffb7ac57b4 irq_exit_rcu
0xffffffffb8a000d6 __softirqentry_text_start
0xffffffffb84f1fd6 net_rx_action
0xffffffffb84f1bf0 __napi_poll
0xffffffffc02af547 i40e_napi_poll       [i40e]
0xffffffffb84f1a7a napi_complete_done
0xffffffffb84f158e netif_receive_skb_list_internal
0xffffffffb84f12ba __netif_receive_skb_list_core
0xffffffffb84f04fa __netif_receive_skb_core.constprop.0
0xffffffffc079e192 br_handle_frame      [bridge]
0xffffffffc03f9ce5 br_nf_pre_routing    [br_netfilter]
0xffffffffc03f979c br_nf_pre_routing_finish     [br_netfilter]
0xffffffffc03f95db br_nf_hook_thresh    [br_netfilter]
0xffffffffc079dc07 br_handle_frame_finish       [bridge]
0xffffffffc079da28 br_pass_frame_up     [bridge]
0xffffffffb84f2fa3 netif_receive_skb
0xffffffffb84f2f15 __netif_receive_skb
0xffffffffb84f2eba __netif_receive_skb_one_core
0xffffffffb859aefa ip_rcv
0xffffffffb858eb61 nf_hook_slow
0xffffffffc03f88ec ip_sabotage_in       [br_netfilter]
0xffffffffb859ae5e ip_rcv_finish
0xffffffffb859ab0b ip_local_deliver
0xffffffffb859a9f8 ip_local_deliver_finish
0xffffffffb859a79c ip_protocol_deliver_rcu
0xffffffffb85d137e tcp_v4_rcv
0xffffffffb85ce0a7 tcp_v4_do_rcv
0xffffffffb85bf9cb tcp_rcv_established
0xffffffffb85c6fe7 __tcp_push_pending_frames
0xffffffffb85c6859 tcp_write_xmit
0xffffffffb85a0e75 ip_queue_xmit
0xffffffffc0f3b090 rapl_msr_priv        [intel_rapl_msr]
0xffffffffc00f0fb5 __this_module        [linear]
0xffffffffc00f0fb5 __this_module        [linear]

As I interpret this, it appears that after the first invocation of
ip_queue_xmit the entry program runs, but it is then interrupted
by an IRQ which eventually leads back to ip_queue_xmit, where the
entry program's per-CPU bpf_prog->active counter is still 1, so its
nested invocation is skipped (whereas the exit program can still run).
Inspecting bpftool seems to confirm this, as I can see an unbalanced,
slowly increasing recursion_misses counter for both programs.
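
For reference, the per-program guard in the trampoline is roughly the
following (a simplified paraphrase of the enter path in
kernel/bpf/trampoline.c; the _sketch name is mine and details vary
across kernel versions):

/* Simplified paraphrase of the trampoline's enter-side recursion
 * guard. Each bpf_prog has its own per-CPU 'active' counter, which is
 * why the fentry and fexit programs can miss independently. */
static u64 bpf_prog_enter_sketch(struct bpf_prog *prog)
{
    if (this_cpu_inc_return(*(prog->active)) != 1) {
        /* nested on this CPU (e.g. via an IRQ that leads back to the
         * attach point): skip the program, bump recursion_misses */
        bpf_prog_inc_misses_counter(prog);
        return 0;
    }
    return sched_clock(); /* start timestamp for runtime stats */
}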

I'm not even sure this would qualify as a bug, but - even though I've
only been able to reproduce this consistently with a 40G
iperf3 --bidir - the chance of it happening renders fentry/fexit
unreliable for the job.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: bpf: missed fentry/fexit invocations due to implicit recursion
  2023-03-21 17:13 bpf: missed fentry/fexit invocations due to implicit recursion Davide Miola
@ 2023-03-21 22:55 ` Jiri Olsa
  2023-03-22  7:33   ` Davide Miola
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2023-03-21 22:55 UTC (permalink / raw)
  To: Davide Miola; +Cc: bpf

On Tue, Mar 21, 2023 at 06:13:29PM +0100, Davide Miola wrote:
> Hello,
> 
> I've been trying to measure the per-task CPU time spent in the kernel
> function ip_queue_xmit by attaching an entry and exit probe to said
> function in the form of fentry/fexit programs, but I've noticed that
> under heavy traffic conditions this solution breaks. I first observed
> this as the exit program was occasionally being called more times
> than the entry, which of course shouldn't be possible.
> 
> Below is the stack trace (most recent call last) of one such event:
> 
> 0xffffffffb8800099 entry_SYSCALL_64_after_hwframe
> 0xffffffffb879fb49 do_syscall_64
> 0xffffffffb7d8ff49 __x64_sys_read
> 0xffffffffb7d8fef5 ksys_read
> 0xffffffffb7d8d3d3 vfs_read
> 0xffffffffb7d8caae new_sync_read
> 0xffffffffb84c0f1f sock_read_iter
> 0xffffffffb84c0e81 sock_recvmsg
> 0xffffffffb85ef8bc inet_recvmsg
> 0xffffffffb85b13f4 tcp_recvmsg
> 0xffffffffb84ca5b0 release_sock
> 0xffffffffb84ca535 __release_sock
> 0xffffffffb85ce0a7 tcp_v4_do_rcv
> 0xffffffffb85bfd2a tcp_rcv_established
> 0xffffffffb85b5d32 __tcp_ack_snd_check
> 0xffffffffb85c8dcc tcp_send_ack
> 0xffffffffb85c5d0f __tcp_send_ack.part.0
> 0xffffffffb85a0e75 ip_queue_xmit
> 0xffffffffc0f3b044 rapl_msr_priv        [intel_rapl_msr]
> 0xffffffffc0c769aa bpf_prog_6deef7357e7b4530    [bpf]
> 0xffffffffb7c6707e __htab_map_lookup_elem
> 0xffffffffb7c66fe0 lookup_nulls_elem_raw
> 0xffffffffb8800da7 asm_common_interrupt
> 0xffffffffb87a11ae common_interrupt
> 0xffffffffb7ac57b4 irq_exit_rcu
> 0xffffffffb8a000d6 __softirqentry_text_start
> 0xffffffffb84f1fd6 net_rx_action
> 0xffffffffb84f1bf0 __napi_poll
> 0xffffffffc02af547 i40e_napi_poll       [i40e]
> 0xffffffffb84f1a7a napi_complete_done
> 0xffffffffb84f158e netif_receive_skb_list_internal
> 0xffffffffb84f12ba __netif_receive_skb_list_core
> 0xffffffffb84f04fa __netif_receive_skb_core.constprop.0
> 0xffffffffc079e192 br_handle_frame      [bridge]
> 0xffffffffc03f9ce5 br_nf_pre_routing    [br_netfilter]
> 0xffffffffc03f979c br_nf_pre_routing_finish     [br_netfilter]
> 0xffffffffc03f95db br_nf_hook_thresh    [br_netfilter]
> 0xffffffffc079dc07 br_handle_frame_finish       [bridge]
> 0xffffffffc079da28 br_pass_frame_up     [bridge]
> 0xffffffffb84f2fa3 netif_receive_skb
> 0xffffffffb84f2f15 __netif_receive_skb
> 0xffffffffb84f2eba __netif_receive_skb_one_core
> 0xffffffffb859aefa ip_rcv
> 0xffffffffb858eb61 nf_hook_slow
> 0xffffffffc03f88ec ip_sabotage_in       [br_netfilter]
> 0xffffffffb859ae5e ip_rcv_finish
> 0xffffffffb859ab0b ip_local_deliver
> 0xffffffffb859a9f8 ip_local_deliver_finish
> 0xffffffffb859a79c ip_protocol_deliver_rcu
> 0xffffffffb85d137e tcp_v4_rcv
> 0xffffffffb85ce0a7 tcp_v4_do_rcv
> 0xffffffffb85bf9cb tcp_rcv_established
> 0xffffffffb85c6fe7 __tcp_push_pending_frames
> 0xffffffffb85c6859 tcp_write_xmit
> 0xffffffffb85a0e75 ip_queue_xmit
> 0xffffffffc0f3b090 rapl_msr_priv        [intel_rapl_msr]
> 0xffffffffc00f0fb5 __this_module        [linear]
> 0xffffffffc00f0fb5 __this_module        [linear]
> 
> As I interpret this, it appears that after the first invocation of
> ip_queue_xmit the entry program runs, but it is then interrupted
> by an IRQ which eventually leads back to ip_queue_xmit, where the
> entry program's per-CPU bpf_prog->active counter is still 1, so its
> nested invocation is skipped (whereas the exit program can still run).
> Inspecting bpftool seems to confirm this, as I can see an unbalanced,
> slowly increasing recursion_misses counter for both programs.

seems correct to me, can you see the recursion_misses counter in
bpftool prog output for the entry tracing program?

jirka

> 
> I'm not even sure this would qualify as a bug, but - even though I've
> only been able to reproduce this consistently with a 40G
> iperf3 --bidir - the chance of it happening renders fentry/fexit
> unreliable for the job.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: bpf: missed fentry/fexit invocations due to implicit recursion
  2023-03-21 22:55 ` Jiri Olsa
@ 2023-03-22  7:33   ` Davide Miola
  2023-03-22 12:57     ` Jiri Olsa
  0 siblings, 1 reply; 10+ messages in thread
From: Davide Miola @ 2023-03-22  7:33 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: bpf

> seems correct to me, can you see the recursion_misses counter in
> bpftool prog output for the entry tracing program?

Indeed I can. The problem here is that the recursion is not triggered
by my programs; from my point of view any miss is basically a random
event, and the fact that entry and exit progs can miss independently
means that, at any point, I can count two exits for one entry or
(worse) just one exit for two entries, making the whole mechanism
wildly unreliable.

Would using kprobes/kretprobes instead of fentry/fexit here be my
best compromise? It is my understanding (please correct me if I'm
wrong) that kprobes' recursion prevention is per-cpu rather than
per-program, so in this case there would be no imbalance in the
number of misses between the entry and exit probes.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: bpf: missed fentry/fexit invocations due to implicit recursion
  2023-03-22  7:33   ` Davide Miola
@ 2023-03-22 12:57     ` Jiri Olsa
  2023-03-22 16:06       ` Alexei Starovoitov
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2023-03-22 12:57 UTC (permalink / raw)
  To: Davide Miola; +Cc: Jiri Olsa, bpf

On Wed, Mar 22, 2023 at 08:33:18AM +0100, Davide Miola wrote:
> > seems correct to me, can you see the recursion_misses counter in
> > bpftool prog output for the entry tracing program?
> 
> Indeed I can. The problem here is that the recursion is not triggered
> by my programs; from my point of view any miss is basically a random
> event, and the fact that entry and exit progs can miss independently
> means that, at any point, I can count two exits for one entry or
> (worse) just one exit for two entries, making the whole mechanism
> wildly unreliable.
> 
> Would using kprobes/kretprobes instead of fentry/fexit here be my
> best compromise? It is my understanding (please correct me if I'm
> wrong) that kprobes' recursion prevention is per-cpu rather than
> per-program, so in this case there would be no imbalance in the
> number of misses between the entry and exit probes.

kprobes are guarded with the per-cpu bpf_prog_active variable, which will
block nested kprobe calls while a kprobe program is running, so you will
get balanced counts but likely miss some executions
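
simplified, that guard is roughly the following (paraphrasing
kernel/trace/bpf_trace.c, with the program-run step elided; exact code
differs across kernel versions):

/* simplified paraphrase of the kprobe dispatch guard: one per-CPU
 * counter shared by all kprobe bpf programs on that CPU, so a nested
 * entry/exit pair is skipped together and the counts stay balanced */
unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
{
    unsigned int ret = 1;

    if (__this_cpu_inc_return(bpf_prog_active) != 1) {
        /* some kprobe bpf program is already running on this CPU */
        ret = 0;
        goto out;
    }
    /* ... run the bpf programs attached to this kprobe ... */
out:
    __this_cpu_dec(bpf_prog_active);
    return ret;
}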

there was discussion about this some time ago:
  https://lore.kernel.org/bpf/CAEf4BzZ-xe-zSjbBpKLHfQKPnTRTBMA2Eg382+_4kQoTLnj4eQ@mail.gmail.com/

seems the 'active' problem andrii described fits to your case as well

also retsnoop is probably using some custom per-cpu logic in each program
to prevent this issue; you can check it here:
  git@github.com:anakryiko/retsnoop.git

jirka

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: bpf: missed fentry/fexit invocations due to implicit recursion
  2023-03-22 12:57     ` Jiri Olsa
@ 2023-03-22 16:06       ` Alexei Starovoitov
  2023-03-22 21:39         ` Davide Miola
  0 siblings, 1 reply; 10+ messages in thread
From: Alexei Starovoitov @ 2023-03-22 16:06 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Davide Miola, bpf

On Wed, Mar 22, 2023 at 6:10 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Wed, Mar 22, 2023 at 08:33:18AM +0100, Davide Miola wrote:
> > > seems correct to me, can you see the recursion_misses counter in
> > > bpftool prog output for the entry tracing program?
> >
> > Indeed I can. The problem here is that the recursion is not triggered
> > by my programs; from my point of view any miss is basically a random
> > event, and the fact that entry and exit progs can miss independently
> > means that, at any point, I can count two exits for one entry or
> > (worse) just one exit for two entries, making the whole mechanism
> > wildly unreliable.
> >
> > Would using kprobes/kretprobes instead of fentry/fexit here be my
> > best compromise? It is my understanding (please correct me if I'm
> > wrong) that kprobes' recursion prevention is per-cpu rather than
> > per-program, so in this case there would be no imbalance in the
> > number of misses between the entry and exit probes.
>
> kprobes are guarded with the per-cpu bpf_prog_active variable, which will
> block nested kprobe calls while a kprobe program is running, so you will
> get balanced counts but likely miss some executions
>
> there was discussion about this some time ago:
>   https://lore.kernel.org/bpf/CAEf4BzZ-xe-zSjbBpKLHfQKPnTRTBMA2Eg382+_4kQoTLnj4eQ@mail.gmail.com/
>
> seems the 'active' problem andrii described fits to your case as well
>
> also retsnoop is probably using some custom per-cpu logic in each program
> to prevent this issue; you can check it here:
>   git@github.com:anakryiko/retsnoop.git

I suspect per-cpu recursion counter will miss more events in this case,
since _any_ kprobe on that cpu will be blocked.
If missing events is not an issue you probably want a per-cpu counter
that is specific to your single ip_queue_xmit attach point.

But I'd explore proper networking hooks first, which don't have such a counter.
There are quite a few in the TC and cgroup-bpf layers that can be attached to.
netfilter-bpf hooks are wip.
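
For example, a minimal TC egress program (illustrative sketch only; it
sees packets rather than ip_queue_xmit runtime, so the accounting model
is different):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

#define TC_ACT_OK 0

SEC("tc")
int egress_prog(struct __sk_buff *skb)
{
    /* illustrative: account the packet against its socket cookie (or
     * another key of your choice) instead of timing the function */
    __u64 cookie = bpf_get_socket_cookie(skb);

    /* ... update a map keyed by 'cookie' with skb->len ... */
    return TC_ACT_OK;
}

char LICENSE[] SEC("license") = "GPL";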

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: bpf: missed fentry/fexit invocations due to implicit recursion
  2023-03-22 16:06       ` Alexei Starovoitov
@ 2023-03-22 21:39         ` Davide Miola
  2023-03-22 22:21           ` Alexei Starovoitov
  0 siblings, 1 reply; 10+ messages in thread
From: Davide Miola @ 2023-03-22 21:39 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Jiri Olsa, bpf

On Wed, 22 Mar 2023 at 17:06, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Wed, Mar 22, 2023 at 6:10 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > there was discussion about this some time ago:
> >   https://lore.kernel.org/bpf/CAEf4BzZ-xe-zSjbBpKLHfQKPnTRTBMA2Eg382+_4kQoTLnj4eQ@mail.gmail.com/
> >
> > seems the 'active' problem andrii described fits to your case as well
>
> I suspect per-cpu recursion counter will miss more events in this case,
> since _any_ kprobe on that cpu will be blocked.
> If missing events is not an issue you probably want a per-cpu counter
> that is specific to your single ip_queue_xmit attach point.

The difference between the scenario described in the linked thread
and mine is also the reason why I think in-bpf solutions like a
per-cpu guard can't work here: my programs are recursing due to irqs
interrupting them and invoking ip_queue_xmit, not because some helper
I'm using ends up calling ip_queue_xmit. Recursion can happen
anywhere in my programs, even before they get the chance to set a
flag or increment a counter in a per-cpu map, since there is no
atomic "bpf_map_lookup_and_increment" (or is there?)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: bpf: missed fentry/fexit invocations due to implicit recursion
  2023-03-22 21:39         ` Davide Miola
@ 2023-03-22 22:21           ` Alexei Starovoitov
  2023-03-22 22:45             ` Davide Miola
  0 siblings, 1 reply; 10+ messages in thread
From: Alexei Starovoitov @ 2023-03-22 22:21 UTC (permalink / raw)
  To: Davide Miola; +Cc: Jiri Olsa, bpf

On Wed, Mar 22, 2023 at 2:39 PM Davide Miola <davide.miola99@gmail.com> wrote:
>
> On Wed, 22 Mar 2023 at 17:06, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> > On Wed, Mar 22, 2023 at 6:10 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > there was discussion about this some time ago:
> > >   https://lore.kernel.org/bpf/CAEf4BzZ-xe-zSjbBpKLHfQKPnTRTBMA2Eg382+_4kQoTLnj4eQ@mail.gmail.com/
> > >
> > > seems the 'active' problem andrii described fits to your case as well
> >
> > I suspect per-cpu recursion counter will miss more events in this case,
> > since _any_ kprobe on that cpu will be blocked.
> > If missing events is not an issue you probably want a per-cpu counter
> > that is specific to your single ip_queue_xmit attach point.
>
> The difference between the scenario described in the linked thread
> and mine is also the reason why I think in-bpf solutions like a
> per-cpu guard can't work here: my programs are recursing due to irqs
> interrupting them and invoking ip_queue_xmit, not because some helper
> I'm using ends up calling ip_queue_xmit. Recursion can happen
> anywhere in my programs, even before they get the chance to set a
> flag or increment a counter in a per-cpu map, since there is no
> atomic "bpf_map_lookup_and_increment" (or is there?)

__sync_fetch_and_add() is supported. A bunch of selftests are using it.
Or you can use bpf_spin_lock.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: bpf: missed fentry/fexit invocations due to implicit recursion
  2023-03-22 22:21           ` Alexei Starovoitov
@ 2023-03-22 22:45             ` Davide Miola
  2023-03-22 22:59               ` Alexei Starovoitov
  0 siblings, 1 reply; 10+ messages in thread
From: Davide Miola @ 2023-03-22 22:45 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Jiri Olsa, bpf

On Wed, 22 Mar 2023 at 23:21, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Mar 22, 2023 at 2:39 PM Davide Miola <davide.miola99@gmail.com> wrote:
> >
> > On Wed, 22 Mar 2023 at 17:06, Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > > On Wed, Mar 22, 2023 at 6:10 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > >
> > > > there was discussion about this some time ago:
> > > >   https://lore.kernel.org/bpf/CAEf4BzZ-xe-zSjbBpKLHfQKPnTRTBMA2Eg382+_4kQoTLnj4eQ@mail.gmail.com/
> > > >
> > > > seems the 'active' problem andrii described fits to your case as well
> > >
> > > I suspect per-cpu recursion counter will miss more events in this case,
> > > since _any_ kprobe on that cpu will be blocked.
> > > If missing events is not an issue you probably want a per-cpu counter
> > > that is specific to your single ip_queue_xmit attach point.
> >
> > The difference between the scenario described in the linked thread
> > and mine is also the reason why I think in-bpf solutions like a
> > per-cpu guard can't work here: my programs are recursing due to irqs
> > interrupting them and invoking ip_queue_xmit, not because some helper
> > I'm using ends up calling ip_queue_xmit. Recursion can happen
> > anywhere in my programs, even before they get the chance to set a
> > flag or increment a counter in a per-cpu map, since there is no
> > atomic "bpf_map_lookup_and_increment" (or is there?)
>
> __sync_fetch_and_add() is supported. A bunch of selftests are using it.
> Or you can use bpf_spin_lock.

Sure, but I'd still have to lookup the element from the map first.
At a minimum it would look something like:

SEC("fentry/ip_queue_xmit")
int BPF_PROG(entry_prog) {
    int key = 0;
    int64_t *guard = bpf_map_lookup_elem(&per_cpu, &key);
    if (guard) {
        if (__sync_fetch_and_add(guard, 1) == 0) {
            ...
        }
    }
    return 0;
}

The program could be interrupted before it reaches
__sync_fetch_and_add (just tested this and it does not solve the
problem)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: bpf: missed fentry/fexit invocations due to implicit recursion
  2023-03-22 22:45             ` Davide Miola
@ 2023-03-22 22:59               ` Alexei Starovoitov
  2023-03-23  8:17                 ` Davide Miola
  0 siblings, 1 reply; 10+ messages in thread
From: Alexei Starovoitov @ 2023-03-22 22:59 UTC (permalink / raw)
  To: Davide Miola; +Cc: Jiri Olsa, bpf

On Wed, Mar 22, 2023 at 3:45 PM Davide Miola <davide.miola99@gmail.com> wrote:
>
> On Wed, 22 Mar 2023 at 23:21, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Mar 22, 2023 at 2:39 PM Davide Miola <davide.miola99@gmail.com> wrote:
> > >
> > > On Wed, 22 Mar 2023 at 17:06, Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > > On Wed, Mar 22, 2023 at 6:10 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > > >
> > > > > there was discussion about this some time ago:
> > > > >   https://lore.kernel.org/bpf/CAEf4BzZ-xe-zSjbBpKLHfQKPnTRTBMA2Eg382+_4kQoTLnj4eQ@mail.gmail.com/
> > > > >
> > > > > seems the 'active' problem andrii described fits to your case as well
> > > >
> > > > I suspect per-cpu recursion counter will miss more events in this case,
> > > > since _any_ kprobe on that cpu will be blocked.
> > > > If missing events is not an issue you probably want a per-cpu counter
> > > > that is specific to your single ip_queue_xmit attach point.
> > >
> > > The difference between the scenario described in the linked thread
> > > and mine is also the reason why I think in-bpf solutions like a
> > > per-cpu guard can't work here: my programs are recursing due to irqs
> > > interrupting them and invoking ip_queue_xmit, not because some helper
> > > I'm using ends up calling ip_queue_xmit. Recursion can happen
> > > anywhere in my programs, even before they get the chance to set a
> > > flag or increment a counter in a per-cpu map, since there is no
> > > atomic "bpf_map_lookup_and_increment" (or is there?)
> >
> > __sync_fetch_and_add() is supported. A bunch of selftests are using it.
> > Or you can use bpf_spin_lock.
>
> Sure, but I'd still have to lookup the element from the map first.
> At a minimum it would look something like:
>
> SEC("fentry/ip_queue_xmit")
> int BPF_PROG(entry_prog) {
>     int key = 0;
>     int64_t *guard = bpf_map_lookup_elem(&per_cpu, &key);
>     if (guard) {
>         if (__sync_fetch_and_add(guard, 1) == 0) {
>             ...
>         }
>     }
>     return 0;
> }
>
> The program could be interrupted before it reaches
> __sync_fetch_and_add (just tested this and it does not solve the
> problem)

Ahh. I got confused by your bpf_map_lookup_and_increment idea.
It won't help here either if you're concerned about an IRQ
after the prog starts and before the first lookup.
You can use global data. In that case there is no lookup function call.
It reduces the race window, but it's theoretically still there.
Try kprobes, but they're slower and I suspect you'll miss more events,
because all kprobe progs are in one bucket.

Better approach is to switch to networking hooks instead of tracing.
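
The global data variant mentioned above would be roughly the sketch
below (illustrative; a plain global is shared across CPUs, so this
assumes an array indexed by CPU id with a made-up MAX_CPUS bound, and
the window before the first increment is of course still there):

/* illustrative: guard lives in global data (.bss), so no map lookup
 * call is needed; indexed per CPU because a plain global would be
 * shared across CPUs (BPF progs run with migration disabled, so the
 * CPU id stays stable for the duration of the program) */
#define MAX_CPUS 128

__u64 guard[MAX_CPUS];

SEC("fentry/ip_queue_xmit")
int BPF_PROG(entry_prog)
{
    __u32 cpu = bpf_get_smp_processor_id();

    if (cpu >= MAX_CPUS)
        return 0;
    if (__sync_fetch_and_add(&guard[cpu], 1) == 0) {
        /* ... not nested: record the entry timestamp ... */
    } else {
        /* nested invocation: undo the increment and bail out */
        __sync_fetch_and_add(&guard[cpu], -1);
    }
    /* the matching fexit program would decrement guard[cpu] when done */
    return 0;
}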

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: bpf: missed fentry/fexit invocations due to implicit recursion
  2023-03-22 22:59               ` Alexei Starovoitov
@ 2023-03-23  8:17                 ` Davide Miola
  0 siblings, 0 replies; 10+ messages in thread
From: Davide Miola @ 2023-03-23  8:17 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Jiri Olsa, bpf

On Thu, 23 Mar 2023 at 00:00, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Mar 22, 2023 at 3:45 PM Davide Miola <davide.miola99@gmail.com> wrote:
> >
> > On Wed, 22 Mar 2023 at 23:21, Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Wed, Mar 22, 2023 at 2:39 PM Davide Miola <davide.miola99@gmail.com> wrote:
> > > >
> > > > On Wed, 22 Mar 2023 at 17:06, Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > On Wed, Mar 22, 2023 at 6:10 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > > > >
> > > > > > there was discussion about this some time ago:
> > > > > >   https://lore.kernel.org/bpf/CAEf4BzZ-xe-zSjbBpKLHfQKPnTRTBMA2Eg382+_4kQoTLnj4eQ@mail.gmail.com/
> > > > > >
> > > > > > seems the 'active' problem andrii described fits to your case as well
> > > > >
> > > > > I suspect per-cpu recursion counter will miss more events in this case,
> > > > > since _any_ kprobe on that cpu will be blocked.
> > > > > If missing events is not an issue you probably want a per-cpu counter
> > > > > that is specific to your single ip_queue_xmit attach point.
> > > >
> > > > The difference between the scenario described in the linked thread
> > > > and mine is also the reason why I think in-bpf solutions like a
> > > > per-cpu guard can't work here: my programs are recursing due to irqs
> > > > interrupting them and invoking ip_queue_xmit, not because some helper
> > > > I'm using ends up calling ip_queue_xmit. Recursion can happen
> > > > anywhere in my programs, even before they get the chance to set a
> > > > flag or increment a counter in a per-cpu map, since there is no
> > > > atomic "bpf_map_lookup_and_increment" (or is there?)
> > >
> > > __sync_fetch_and_add() is supported. A bunch of selftests are using it.
> > > Or you can use bpf_spin_lock.
> >
> > Sure, but I'd still have to lookup the element from the map first.
> > At a minimum it would look something like:
> >
> > SEC("fentry/ip_queue_xmit")
> > int BPF_PROG(entry_prog) {
> >     int key = 0;
> >     int64_t *guard = bpf_map_lookup_elem(&per_cpu, &key);
> >     if (guard) {
> >         if (__sync_fetch_and_add(guard, 1) == 0) {
> >             ...
> >         }
> >     }
> >     return 0;
> > }
> >
> > The program could be interrupted before it reaches
> > __sync_fetch_and_add (just tested this and it does not solve the
> > problem)
>
> Ahh. I got confused by your bpf_map_lookup_and_increment idea.
> It won't help here either if you're concerned about an IRQ
> after the prog starts and before the first lookup.
> You can use global data. In that case there is no lookup function call.
> It reduces the race window, but it's theoretically still there.

Per-cpu global variables? Is that a thing?

> Try kprobes, but they're slower and I suspect you'll miss more events,
> because all kprobe progs are in one bucket.
>
> Better approach is to switch to networking hooks instead of tracing.

Yeah, worst case kprobes would be acceptable; the number of
problematic interruptions is small enough that losing more
events shouldn't be an issue, but I agree I should be looking
for a better hook.

Thanks!

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread

Thread overview: 10+ messages
2023-03-21 17:13 bpf: missed fentry/fexit invocations due to implicit recursion Davide Miola
2023-03-21 22:55 ` Jiri Olsa
2023-03-22  7:33   ` Davide Miola
2023-03-22 12:57     ` Jiri Olsa
2023-03-22 16:06       ` Alexei Starovoitov
2023-03-22 21:39         ` Davide Miola
2023-03-22 22:21           ` Alexei Starovoitov
2023-03-22 22:45             ` Davide Miola
2023-03-22 22:59               ` Alexei Starovoitov
2023-03-23  8:17                 ` Davide Miola
