linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] udp: Add tracepoints to monitor skbs going in and out of a UDP socket
@ 2018-10-04  8:30 David Howells
  2018-10-04 16:36 ` David Ahern
  2018-10-04 16:54 ` David Howells
  0 siblings, 2 replies; 7+ messages in thread
From: David Howells @ 2018-10-04  8:30 UTC (permalink / raw)
  To: netdev; +Cc: dhowells, linux-afs, linux-kernel

Add three tracepoints to the UDP/UDP6 code to trace three aspects of
sk_buff handling:

 (1) UDP packets that are about to be handed off to the IP layer to be
     transmitted.

 (2) UDP packets that have just arrived at the socket.

 (3) UDP packets that are about to be queued on the UDP socket's receive
     queue, and displaying the sk_rmem_alloc counter.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 include/trace/events/udp.h |   53 ++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/udp.c             |    3 ++
 net/ipv6/udp.c             |    3 ++
 3 files changed, 59 insertions(+)

diff --git a/include/trace/events/udp.h b/include/trace/events/udp.h
index 336fe272889f..ff6b3ab5a65e 100644
--- a/include/trace/events/udp.h
+++ b/include/trace/events/udp.h
@@ -27,6 +27,59 @@ TRACE_EVENT(udp_fail_queue_rcv_skb,
 	TP_printk("rc=%d port=%hu", __entry->rc, __entry->lport)
 );
 
+TRACE_EVENT(udp_tx,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(struct sk_buff *, skb)
+	),
+
+	TP_fast_assign(
+		__entry->skb = skb;
+	),
+
+	TP_printk("skb=%p", __entry->skb)
+);
+
+TRACE_EVENT(udp_rcv,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(struct sk_buff *, skb)
+	),
+
+	TP_fast_assign(
+		__entry->skb = skb;
+	),
+
+	TP_printk("skb=%p", __entry->skb)
+);
+
+TRACE_EVENT(udp_enqueue,
+
+	    TP_PROTO(struct sk_buff *skb, int rmem),
+
+	    TP_ARGS(skb, rmem),
+
+	    TP_STRUCT__entry(
+		    __field(struct sk_buff *, skb)
+		    __field(int, rmem)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->skb = skb;
+		    __entry->rmem = rmem;
+			   ),
+
+	    TP_printk("skb=%p rm=%d", __entry->skb, __entry->rmem)
+	    );
+
 #endif /* _TRACE_UDP_H */
 
 /* This part must be outside protection */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index f93bc444f631..67afbd0ac0a7 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -826,6 +826,7 @@ static int udp_send_skb(struct sk_buff *skb, struct flowi4 *fl4,
 		uh->check = CSUM_MANGLED_0;
 
 send:
+	trace_udp_tx(skb);
 	err = ip_send_skb(sock_net(sk), skb);
 	if (err) {
 		if (err == -ENOBUFS && !inet->recverr) {
@@ -1363,6 +1364,7 @@ static int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
 	rmem = atomic_read(&sk->sk_rmem_alloc);
 	if (rmem > sk->sk_rcvbuf)
 		goto drop;
+	trace_udp_enqueue(skb, rmem);
 
 	/* Under mem pressure, it might be helpful to help udp_recvmsg()
 	 * having linear skbs :
@@ -2395,6 +2397,7 @@ int udp_v4_early_demux(struct sk_buff *skb)
 
 int udp_rcv(struct sk_buff *skb)
 {
+	trace_udp_rcv(skb);
 	return __udp4_lib_rcv(skb, &udp_table, IPPROTO_UDP);
 }
 
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 4421fbf0a7a4..baec8abc1409 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -54,6 +54,7 @@
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
 #include <trace/events/skb.h>
+#include <trace/events/udp.h>
 #include "udp_impl.h"
 
 static bool udp6_lib_exact_dif_match(struct net *net, struct sk_buff *skb)
@@ -950,6 +951,7 @@ static void udp_v6_early_demux(struct sk_buff *skb)
 
 static __inline__ int udpv6_rcv(struct sk_buff *skb)
 {
+	trace_udp_rcv(skb);
 	return __udp6_lib_rcv(skb, &udp_table, IPPROTO_UDP);
 }
 
@@ -1093,6 +1095,7 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6,
 		uh->check = CSUM_MANGLED_0;
 
 send:
+	trace_udp_tx(skb);
 	err = ip6_send_skb(skb);
 	if (err) {
 		if (err == -ENOBUFS && !inet6_sk(sk)->recverr) {


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] udp: Add tracepoints to monitor skbs going in and out of a UDP socket
  2018-10-04  8:30 [PATCH net-next] udp: Add tracepoints to monitor skbs going in and out of a UDP socket David Howells
@ 2018-10-04 16:36 ` David Ahern
  2018-10-04 16:54 ` David Howells
  1 sibling, 0 replies; 7+ messages in thread
From: David Ahern @ 2018-10-04 16:36 UTC (permalink / raw)
  To: David Howells, netdev; +Cc: linux-afs, linux-kernel

On 10/4/18 2:30 AM, David Howells wrote:
> Add three tracepoints to the UDP/UDP6 code to trace three aspects of
> sk_buff handling:
> 
>  (1) UDP packets that are about to be handed off to the IP layer to be
>      transmitted.
> 
>  (2) UDP packets that have just arrived at the socket.

These do not add any real value. You are only dumping the skb address,
and you can get the same effect with a probe on the function and dumping
the value of the register with the skb argument.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] udp: Add tracepoints to monitor skbs going in and out of a UDP socket
  2018-10-04  8:30 [PATCH net-next] udp: Add tracepoints to monitor skbs going in and out of a UDP socket David Howells
  2018-10-04 16:36 ` David Ahern
@ 2018-10-04 16:54 ` David Howells
  2018-10-04 17:19   ` Alexei Starovoitov
  2018-10-04 23:24   ` David Howells
  1 sibling, 2 replies; 7+ messages in thread
From: David Howells @ 2018-10-04 16:54 UTC (permalink / raw)
  To: David Ahern; +Cc: dhowells, netdev, linux-afs, linux-kernel

David Ahern <dsahern@gmail.com> wrote:

> These do not add any real value. You are only dumping the skb address,

Not in and of themselves, no - but in combination with tracepoints elsewhere,
where you can use the skb address as a key to follow the lifetime of a
particular skb.

> and you can get the same effect with a probe on the function and dumping
> the value of the register with the skb argument.

Installing a probe requires various preconditions, including your kernel not
being locked down and having the source around AIUI.

Anyway, I've no especially strong feelings about this patch - it's been useful
for me and it might be useful to others.

David

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] udp: Add tracepoints to monitor skbs going in and out of a UDP socket
  2018-10-04 16:54 ` David Howells
@ 2018-10-04 17:19   ` Alexei Starovoitov
  2018-10-04 23:24   ` David Howells
  1 sibling, 0 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2018-10-04 17:19 UTC (permalink / raw)
  To: David Howells; +Cc: David Ahern, netdev, linux-afs, linux-kernel

On Thu, Oct 04, 2018 at 05:54:33PM +0100, David Howells wrote:
> David Ahern <dsahern@gmail.com> wrote:
> 
> > These do not add any real value. You are only dumping the skb address,
> 
> Not in and of themselves, no - but in combination with tracepoints elsewhere,
> where you can use the skb address as a key to follow the lifetime of a
> particular skb.
> 
> > and you can get the same effect with a probe on the function and dumping
> > the value of the register with the skb argument.
> 
> Installing a probe requires various preconditions, including your kernel not
> being locked down and having the source around AIUI.
> 
> Anyway, I've no especially strong feelings about this patch - it's been useful
> for me and it might be useful to others.

I don't quite see what value this tracepoint adds.
Just use kprobe in the same spot? Or tcpdump?
How was it useful for you? and why it couldn't have been achieved by other means?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] udp: Add tracepoints to monitor skbs going in and out of a UDP socket
  2018-10-04 16:54 ` David Howells
  2018-10-04 17:19   ` Alexei Starovoitov
@ 2018-10-04 23:24   ` David Howells
  2018-10-05  2:03     ` Alexei Starovoitov
  2018-10-05  6:46     ` David Howells
  1 sibling, 2 replies; 7+ messages in thread
From: David Howells @ 2018-10-04 23:24 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: dhowells, David Ahern, netdev, linux-afs, linux-kernel

Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> I don't quite see what value this tracepoint adds.  Just use kprobe in the
> same spot?  Or tcpdump?

The point was to measure the elapsed time between a packet being received in
the r8169 driver (see the patch I posted for that), getting into udp driver,
being queued by the udp driver and and thence being passed to the af_rxrpc
driver's data_ready function.  You cannot measure this with tcpdump.

Further, packets were crossing each other, so the skb pointer had to be
included so that non-consecutive trace lines could be correlated.  Note that I
have seen up to 4s of delay between a packet being received in r8169 and
arriving in af_rxrpc, so the correlation is necessary.

As for kprobes: spots, plural.  Using kprobes involves installing breakpoint
instructions and taking traps and would seem to require using printk for the
output, which would affect the timings and potentially affect the situation,
especially given the sheer amount of traces produced.  Note that one of these
was in an interrupt handler, which is possibly not the best place to be taking
an illegal instruction trap.

Using kprobes gets worse too.  In my case, the source code and the build tree
aren't on the test machine, which I understand would be a requirement - and
they change regularly (every few minutes).  I boot a monolithic kernel by PXE
directly out of the build tree rather than messing about with building kernel
packages.

Further, I already have tracepoints in rxrpc that produce what I needed that I
could just turn on.

> How was it useful for you?

I could get traces like the following:

<idle>-0     [001] d.h2    95.751176: net_rtl8169_interrupt: enp3s0 st=1
<idle>-0     [001] ..s2    95.751181: net_rtl8169_poll: enp3s0 st=1
<idle>-0     [001] ..s2    95.751185: net_rtl8169_rx: enp3s0 skb=00000000efddbee8
<idle>-0     [001] ..s4    95.751204: udp_rcv: skb=00000000efddbee8
<idle>-0     [001] ..s4    95.751207: rxrpc_skb: s=00000000efddbee8 Rx RCV u=1 m=1 p=rxrpc_rcv_enqueue_skb+0xae/0x18f8
<idle>-0     [001] ..s4    95.751208: rxrpc_rx_packet: 911beb59:b241ccb4:00000006:0001 0000000e 00000001 01 04 DATA
<idle>-0     [001] ..s5    95.751214: rxrpc_skb: s=00000000efddbee8 Rx GOT u=1 m=2 p=rxrpc_input_data+0x557/0x96f
<idle>-0     [001] .Ns4    95.751234: rxrpc_skb: s=00000000efddbee8 Rx FRE u=2 m=1 p=rxrpc_rcv_enqueue_skb+0x15b0/0x18f8
    ls-3183  [001] ...1    95.751252: rxrpc_skb: s=00000000efddbee8 Rx SEE u=1 m=1 p=rxrpc_recvmsg_data.isra.0+0x1bf/0xdc1
    ls-3183  [001] ...1    95.751255: rxrpc_skb: s=00000000efddbee8 Rx ROT u=1 m=1 p=rxrpc_recvmsg_data.isra.0+0x7eb/0xdc1
    ls-3183  [001] ...1    95.751255: rxrpc_skb: s=00000000efddbee8 Rx FRE u=1 m=0 p=rxrpc_recvmsg_data.isra.0+0x840/0xdc1

I can see the interrupt where the NIC pokes the machine to say that there's a
packet, and then I can see the path that the received skbuff passes through
the r8169, udp and af_rxrpc drivers.

For its warts, tracepoints are pretty nice for this sort of thing - but they
have to already exist for you to make use of them, and changing them does
require a recompile.  The main problem I have with tracepoints is during boot,
module load or module removal where the buffer never becomes accessible or
gets wiped.

So, in and of themselves, the udp tracepoints in my patch aren't that useful -
but in combination with tracepoints in the nic driver and in the kernel
service that's using udp, they're actually quite useful.

> and why it couldn't have been achieved by other means?

It probably could, but why should I?

I'm changing my test kernel at very regular intervals.  Stacking on extra
patches to add more tracepoints is much less of an inconvenience than trying
to use kprobes.

David

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] udp: Add tracepoints to monitor skbs going in and out of a UDP socket
  2018-10-04 23:24   ` David Howells
@ 2018-10-05  2:03     ` Alexei Starovoitov
  2018-10-05  6:46     ` David Howells
  1 sibling, 0 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2018-10-05  2:03 UTC (permalink / raw)
  To: David Howells; +Cc: David Ahern, netdev, linux-afs, linux-kernel

On Fri, Oct 05, 2018 at 12:24:42AM +0100, David Howells wrote:
> 
> As for kprobes: spots, plural.  Using kprobes involves installing breakpoint
> instructions and taking traps and would seem to require using printk for the
> output, which would affect the timings and potentially affect the situation,
> especially given the sheer amount of traces produced.  Note that one of these
> was in an interrupt handler, which is possibly not the best place to be taking
> an illegal instruction trap.
> 
> Using kprobes gets worse too.  In my case, the source code and the build tree
> aren't on the test machine, which I understand would be a requirement - and

there is quite a bit of misunderstanding in the above two paragraphs about kprobes.
Sounds like it influences odd design choices with tracepoints.

- kprobes at the top of the function don't use traps and they've been
optimized over the years to have very low overhead
- trace_printk is using the same mechanism as tracepoint prints
- both trace_printk and tracepoints can affect timing
- kprobes and trace_printk work well out of irq handler. They work out of NMI too
- build tree doesn't need to be present to use kprobes

perf, bcc and others tools have user friendly (arguable of course) interfaces
for kprobes.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] udp: Add tracepoints to monitor skbs going in and out of a UDP socket
  2018-10-04 23:24   ` David Howells
  2018-10-05  2:03     ` Alexei Starovoitov
@ 2018-10-05  6:46     ` David Howells
  1 sibling, 0 replies; 7+ messages in thread
From: David Howells @ 2018-10-05  6:46 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: dhowells, David Ahern, netdev, linux-afs, linux-kernel

Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> - kprobes at the top of the function don't use traps and they've been
> optimized over the years to have very low overhead

To quote kprobes.txt:

	How Does a Kprobe Work?
	-----------------------

	When a kprobe is registered, Kprobes makes a copy of the probed
	instruction and replaces the first byte(s) of the probed instruction
	with a breakpoint instruction (e.g., int3 on i386 and x86_64).

Perhaps the docs need updating.

However, for my purposes, tracepoints are easier.

Anyway, as I said, I don't feel that strongly about the patch.  It was useful
for me, and I thought it might be useful for other people.

David

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-10-05  6:46 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-04  8:30 [PATCH net-next] udp: Add tracepoints to monitor skbs going in and out of a UDP socket David Howells
2018-10-04 16:36 ` David Ahern
2018-10-04 16:54 ` David Howells
2018-10-04 17:19   ` Alexei Starovoitov
2018-10-04 23:24   ` David Howells
2018-10-05  2:03     ` Alexei Starovoitov
2018-10-05  6:46     ` David Howells

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).