bpf.vger.kernel.org archive mirror
* [RFC PATCH bpf-next] xdp: Add tracepoint on XDP program return
@ 2019-12-16 15:27 Toke Høiland-Jørgensen
  2019-12-16 18:17 ` Björn Töpel
  0 siblings, 1 reply; 5+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-16 15:27 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: Toke Høiland-Jørgensen, netdev, bpf,
	Jesper Dangaard Brouer, Ido Schimmel

This adds a new tracepoint, xdp_prog_return, which is triggered at every
XDP program return. This was first discussed back in August[0] as a way to
hook XDP into the kernel drop_monitor framework, to have a one-stop place
to find all packet drops in the system.

Because trace/events/xdp.h includes filter.h, some ifdef guarding is needed
to be able to use the tracepoint from bpf_prog_run_xdp(). If anyone has any
ideas for how to improve on this, please speak up. Sending this RFC
because of this issue, and to get some feedback from Ido on whether this
tracepoint has enough data for drop_monitor usage.

[0] https://lore.kernel.org/netdev/20190809125418.GB2931@splinter/

Cc: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 include/linux/filter.h     | 22 +++++++++++++++++--
 include/trace/events/xdp.h | 45 ++++++++++++++++++++++++++++++++++++++
 kernel/bpf/core.c          |  2 ++
 3 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 37ac7025031d..f5e79171902f 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -704,19 +704,37 @@ static inline u32 bpf_prog_run_clear_cb(const struct bpf_prog *prog,
 
 DECLARE_BPF_DISPATCHER(bpf_dispatcher_xdp)
 
+#if defined(_XDP_TRACE_DEF) || defined(_TRACE_XDP_H)
+static void call_trace_xdp_prog_return(const struct xdp_buff *xdp,
+				       const struct bpf_prog *prog,
+				       u32 act);
+#else
+#ifndef _CALL_TRACE_XDP
+#define _CALL_TRACE_XDP
+static inline void call_trace_xdp_prog_return(const struct xdp_buff *xdp,
+					      const struct bpf_prog *prog,
+					      u32 act) {}
+#endif
+#endif
+
 static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog,
 					    struct xdp_buff *xdp)
 {
+	u32 ret;
+
 	/* Caller needs to hold rcu_read_lock() (!), otherwise program
 	 * can be released while still running, or map elements could be
 	 * freed early while still having concurrent users. XDP fastpath
 	 * already takes rcu_read_lock() when fetching the program, so
 	 * it's not necessary here anymore.
 	 */
-	return __BPF_PROG_RUN(prog, xdp,
-			      BPF_DISPATCHER_FUNC(bpf_dispatcher_xdp));
+	ret = __BPF_PROG_RUN(prog, xdp,
+			     BPF_DISPATCHER_FUNC(bpf_dispatcher_xdp));
+	call_trace_xdp_prog_return(xdp, prog, ret);
+	return ret;
 }
 
+
 void bpf_prog_change_xdp(struct bpf_prog *prev_prog, struct bpf_prog *prog);
 
 static inline u32 bpf_prog_insn_size(const struct bpf_prog *prog)
diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h
index a7378bcd9928..e64f4221bd2e 100644
--- a/include/trace/events/xdp.h
+++ b/include/trace/events/xdp.h
@@ -50,6 +50,51 @@ TRACE_EVENT(xdp_exception,
 		  __entry->ifindex)
 );
 
+TRACE_EVENT(xdp_prog_return,
+
+	TP_PROTO(const struct xdp_buff *xdp,
+		 const struct bpf_prog *pr, u32 act),
+
+	TP_ARGS(xdp, pr, act),
+
+	TP_STRUCT__entry(
+		__field(int, prog_id)
+		__field(u32, act)
+		__field(int, ifindex)
+		__field(int, queue_index)
+		__field(const void *, data_addr)
+		__field(unsigned int, data_len)
+	),
+
+	TP_fast_assign(
+		__entry->prog_id	= pr->aux->id;
+		__entry->act		= act;
+		__entry->ifindex	= xdp->rxq->dev->ifindex;
+		__entry->queue_index	= xdp->rxq->queue_index;
+		__entry->data_addr	= xdp->data;
+		__entry->data_len	= (unsigned int)(xdp->data_end - xdp->data);
+	),
+
+	TP_printk("prog_id=%d action=%s ifindex=%d queue_index=%d data_addr=%p data_len=%u",
+		  __entry->prog_id,
+		  __print_symbolic(__entry->act, __XDP_ACT_SYM_TAB),
+		  __entry->ifindex,
+		  __entry->queue_index,
+		  __entry->data_addr,
+		  __entry->data_len)
+);
+
+#ifndef _CALL_TRACE_XDP
+#define _CALL_TRACE_XDP
+static inline void call_trace_xdp_prog_return(const struct xdp_buff *xdp,
+					      const struct bpf_prog *prog,
+					      u32 act)
+{
+	trace_xdp_prog_return(xdp, prog, act);
+}
+#endif
+
+
 TRACE_EVENT(xdp_bulk_tx,
 
 	TP_PROTO(const struct net_device *dev,
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 2ff01a716128..a81d3b8d8e5c 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -17,6 +17,8 @@
  * Kris Katterjohn - Added many additional checks in bpf_check_classic()
  */
 
+#define _XDP_TRACE_DEF
+
 #include <uapi/linux/btf.h>
 #include <linux/filter.h>
 #include <linux/skbuff.h>
-- 
2.24.1
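
For context, once a tracepoint like this is in place it can be consumed with
the standard tracing tools. A hypothetical example (assuming the event ends up
registered as xdp:xdp_prog_return, as in the patch above) would be:

# perf record -e xdp:xdp_prog_return -a -- sleep 10
# bpftrace -e 'tracepoint:xdp:xdp_prog_return { @acts[args->act] = count(); }'

The first command records every XDP program return system-wide for ten
seconds; the second counts returns per XDP action code.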



* Re: [RFC PATCH bpf-next] xdp: Add tracepoint on XDP program return
  2019-12-16 15:27 [RFC PATCH bpf-next] xdp: Add tracepoint on XDP program return Toke Høiland-Jørgensen
@ 2019-12-16 18:17 ` Björn Töpel
  2019-12-17  0:59   ` Alexei Starovoitov
  0 siblings, 1 reply; 5+ messages in thread
From: Björn Töpel @ 2019-12-16 18:17 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Alexei Starovoitov, Daniel Borkmann, Netdev, bpf,
	Jesper Dangaard Brouer, Ido Schimmel

On Mon, 16 Dec 2019 at 16:28, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> This adds a new tracepoint, xdp_prog_return, which is triggered at every
> XDP program return. This was first discussed back in August[0] as a way to
> hook XDP into the kernel drop_monitor framework, to have a one-stop place
> to find all packet drops in the system.
>
> Because trace/events/xdp.h includes filter.h, some ifdef guarding is needed
> to be able to use the tracepoint from bpf_prog_run_xdp(). If anyone has any
> ideas for how to improve on this, please speak up. Sending this RFC
> because of this issue, and to get some feedback from Ido on whether this
> tracepoint has enough data for drop_monitor usage.
>

I get that it would be useful, but can it be solved with BPF tracing
(i.e. tracing BPF with BPF)? It would be neat not adding another
tracepoint in the fast-path...


* Re: [RFC PATCH bpf-next] xdp: Add tracepoint on XDP program return
  2019-12-16 18:17 ` Björn Töpel
@ 2019-12-17  0:59   ` Alexei Starovoitov
  2019-12-17  8:52     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 5+ messages in thread
From: Alexei Starovoitov @ 2019-12-17  0:59 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Toke Høiland-Jørgensen, Alexei Starovoitov,
	Daniel Borkmann, Netdev, bpf, Jesper Dangaard Brouer,
	Ido Schimmel

On Mon, Dec 16, 2019 at 07:17:59PM +0100, Björn Töpel wrote:
> On Mon, 16 Dec 2019 at 16:28, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >
> > This adds a new tracepoint, xdp_prog_return, which is triggered at every
> > XDP program return. This was first discussed back in August[0] as a way to
> > hook XDP into the kernel drop_monitor framework, to have a one-stop place
> > to find all packet drops in the system.
> >
> > Because trace/events/xdp.h includes filter.h, some ifdef guarding is needed
> > to be able to use the tracepoint from bpf_prog_run_xdp(). If anyone has any
> > ideas for how to improve on this, please speak up. Sending this RFC
> > because of this issue, and to get some feedback from Ido on whether this
> > tracepoint has enough data for drop_monitor usage.
> >
> 
> I get that it would be useful, but can it be solved with BPF tracing
> (i.e. tracing BPF with BPF)? It would be neat not adding another
> tracepoint in the fast-path...

That was my question as well.
Here is an example from Eelco:
https://lore.kernel.org/bpf/78D7857B-82E4-42BC-85E1-E3D7C97BF840@redhat.com/
BPF_TRACE_2("fexit/xdp_prog_simple", trace_on_exit,
             struct xdp_buff*, xdp, int, ret)
{
     bpf_debug("fexit: [ifindex = %u, queue =  %u, ret = %d]\n",
               xdp->rxq->dev->ifindex, xdp->rxq->queue_index, ret);

     return 0;
}
'ret' is return code from xdp program.
Such an approach is per XDP program, but it is cheaper when not enabled
and faster when it triggers, compared to a static tracepoint.
Anything missing there that you'd like to see?
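
For reference, the loading side of such an fexit program might look roughly
like the sketch below. This is a hypothetical illustration, not code from the
linked thread; it assumes a libbpf that provides
bpf_program__set_attach_target() and follows the libbpf 1.0 error conventions,
and the program and target names ("trace_on_exit", "xdp_prog_simple") are
simply taken from the example above.

/* Attach an fexit tracing program to an already-loaded XDP program by id. */
#include <bpf/libbpf.h>
#include <bpf/bpf.h>

int attach_fexit_to_xdp(const char *obj_path, __u32 xdp_prog_id)
{
	struct bpf_program *prog;
	struct bpf_object *obj;
	int target_fd;

	/* fd of the XDP program we want to trace */
	target_fd = bpf_prog_get_fd_by_id(xdp_prog_id);
	if (target_fd < 0)
		return target_fd;

	obj = bpf_object__open_file(obj_path, NULL);
	if (!obj)
		return -1;

	/* "trace_on_exit" is the fexit program from the example above */
	prog = bpf_object__find_program_by_name(obj, "trace_on_exit");
	if (!prog)
		return -1;

	/* point the fexit program at the target XDP program before loading */
	if (bpf_program__set_attach_target(prog, target_fd, "xdp_prog_simple"))
		return -1;

	if (bpf_object__load(obj))
		return -1;

	return bpf_program__attach(prog) ? 0 : -1;
}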



* Re: [RFC PATCH bpf-next] xdp: Add tracepoint on XDP program return
  2019-12-17  0:59   ` Alexei Starovoitov
@ 2019-12-17  8:52     ` Toke Høiland-Jørgensen
  2019-12-23  9:25       ` Ido Schimmel
  0 siblings, 1 reply; 5+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-17  8:52 UTC (permalink / raw)
  To: Alexei Starovoitov, Björn Töpel
  Cc: Alexei Starovoitov, Daniel Borkmann, Netdev, bpf,
	Jesper Dangaard Brouer, Ido Schimmel

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Mon, Dec 16, 2019 at 07:17:59PM +0100, Björn Töpel wrote:
>> On Mon, 16 Dec 2019 at 16:28, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >
>> > This adds a new tracepoint, xdp_prog_return, which is triggered at every
>> > XDP program return. This was first discussed back in August[0] as a way to
>> > hook XDP into the kernel drop_monitor framework, to have a one-stop place
>> > to find all packet drops in the system.
>> >
>> > Because trace/events/xdp.h includes filter.h, some ifdef guarding is needed
>> > to be able to use the tracepoint from bpf_prog_run_xdp(). If anyone has any
>> > ideas for how to improve on this, please speak up. Sending this RFC
>> > because of this issue, and to get some feedback from Ido on whether this
>> > tracepoint has enough data for drop_monitor usage.
>> >
>> 
>> I get that it would be useful, but can it be solved with BPF tracing
>> (i.e. tracing BPF with BPF)? It would be neat not adding another
>> tracepoint in the fast-path...
>
> That was my question as well.
> Here is an example from Eelco:
> https://lore.kernel.org/bpf/78D7857B-82E4-42BC-85E1-E3D7C97BF840@redhat.com/
> BPF_TRACE_2("fexit/xdp_prog_simple", trace_on_exit,
>              struct xdp_buff*, xdp, int, ret)
> {
>      bpf_debug("fexit: [ifindex = %u, queue =  %u, ret = %d]\n",
>                xdp->rxq->dev->ifindex, xdp->rxq->queue_index, ret);
>
>      return 0;
> }
> 'ret' is return code from xdp program.
> Such an approach is per XDP program, but it is cheaper when not enabled
> and faster when it triggers, compared to a static tracepoint.
> Anything missing there that you'd like to see?

For userspace, sure, the fentry/fexit stuff is fine. The main use case
for this new tracepoint is to hook into the (in-kernel) drop monitor.
Dunno if that can be convinced to hook into the BPF tracing
infrastructure instead of tracepoints. Ido, WDYT?

-Toke



* Re: [RFC PATCH bpf-next] xdp: Add tracepoint on XDP program return
  2019-12-17  8:52     ` Toke Høiland-Jørgensen
@ 2019-12-23  9:25       ` Ido Schimmel
  0 siblings, 0 replies; 5+ messages in thread
From: Ido Schimmel @ 2019-12-23  9:25 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Alexei Starovoitov, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Netdev, bpf, Jesper Dangaard Brouer

On Tue, Dec 17, 2019 at 09:52:02AM +0100, Toke Høiland-Jørgensen wrote:
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
> 
> > On Mon, Dec 16, 2019 at 07:17:59PM +0100, Björn Töpel wrote:
> >> On Mon, 16 Dec 2019 at 16:28, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >
> >> > This adds a new tracepoint, xdp_prog_return, which is triggered at every
> >> > XDP program return. This was first discussed back in August[0] as a way to
> >> > hook XDP into the kernel drop_monitor framework, to have a one-stop place
> >> > to find all packet drops in the system.
> >> >
> >> > Because trace/events/xdp.h includes filter.h, some ifdef guarding is needed
> >> > to be able to use the tracepoint from bpf_prog_run_xdp(). If anyone has any
> >> > ideas for how to improve on this, please speak up. Sending this RFC
> >> > because of this issue, and to get some feedback from Ido on whether this
> >> > tracepoint has enough data for drop_monitor usage.
> >> >
> >> 
> >> I get that it would be useful, but can it be solved with BPF tracing
> >> (i.e. tracing BPF with BPF)? It would be neat not adding another
> >> tracepoint in the fast-path...
> >
> > That was my question as well.
> > Here is an example from Eelco:
> > https://lore.kernel.org/bpf/78D7857B-82E4-42BC-85E1-E3D7C97BF840@redhat.com/
> > BPF_TRACE_2("fexit/xdp_prog_simple", trace_on_exit,
> >              struct xdp_buff*, xdp, int, ret)
> > {
> >      bpf_debug("fexit: [ifindex = %u, queue =  %u, ret = %d]\n",
> >                xdp->rxq->dev->ifindex, xdp->rxq->queue_index, ret);
> >
> >      return 0;
> > }
> > 'ret' is return code from xdp program.
> > Such an approach is per XDP program, but it is cheaper when not enabled
> > and faster when it triggers, compared to a static tracepoint.
> > Anything missing there that you'd like to see?
> 
> For userspace, sure, the fentry/fexit stuff is fine. The main use case
> for this new tracepoint is to hook into the (in-kernel) drop monitor.
> Dunno if that can be convinced to hook into the BPF tracing
> infrastructure instead of tracepoints. Ido, WDYT?

Hi Toke,

Sorry for the delay. I wasn't available most of last week.

Regarding the tracepoint, the data it provides seems sufficient to me.
Regarding the fentry/fexit stuff, it would be great to hook it into drop
monitor, but I'm not sure how to do that at this point. It seems that at
minimum the user would need to pass the XDP programs that need to be traced?

FYI, I'm not too happy with the current way of capturing the events via
nlmon, so I started creating a utility to directly output the events to
pcap [1] (inspired by Florian's nfqdump). Will send a pull request to
Neil when it's ready. You can do:

# dwdump -w /dev/stdout | tshark -V -r -

A recent enough Wireshark will correctly dissect these events. My next
step is to add '--unique', which will load an eBPF program on the socket
and only allow unique events to be enqueued. The program will store
{5-tuple, IP/drop reason} in an LRU hash with a corresponding count. I can
then instrument the application for Prometheus so that it will export
the contents of the map as metrics.
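
For illustration, such a dedup filter could be sketched roughly as below. This
is hypothetical code, not dwdump's actual implementation; the key layout, map
size and field names are assumptions, and parsing of the drop-monitor netlink
payload is omitted.

/* Socket filter that deduplicates drop-monitor events via an LRU hash. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct drop_key {
	__u32 saddr;
	__u32 daddr;
	__u16 sport;
	__u16 dport;
	__u8  proto;
	__u8  reason;		/* drop reason / trap id */
};

struct {
	__uint(type, BPF_MAP_TYPE_LRU_HASH);
	__uint(max_entries, 8192);
	__type(key, struct drop_key);
	__type(value, __u64);
} seen SEC(".maps");

SEC("socket")
int dedup_drops(struct __sk_buff *skb)
{
	struct drop_key key = {};
	__u64 one = 1, *count;

	/* ... fill 'key' from the drop-monitor netlink message ... */

	count = bpf_map_lookup_elem(&seen, &key);
	if (count) {
		__sync_fetch_and_add(count, 1);
		return 0;	/* duplicate: drop from the socket queue */
	}
	bpf_map_update_elem(&seen, &key, &one, BPF_ANY);
	return skb->len;	/* first occurrence: enqueue the full event */
}

char _license[] SEC("license") = "GPL";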

Please let me know if you have more suggestions.

[1] https://github.com/idosch/dropwatch/blob/dwdump/src/dwdump.c

