linux-sctp.vger.kernel.org archive mirror
* [RFC PATCH net-next] net: use a dedicated tracepoint for kfree_skb_list()
@ 2020-10-23 20:52 Davide Caratti
  2020-10-27 22:35 ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 2+ messages in thread
From: Davide Caratti @ 2020-10-23 20:52 UTC (permalink / raw)
  To: netdev; +Cc: Marcelo Ricardo Leitner, Xin Long, linux-sctp, Jakub Kicinski

kfree_skb_list() calls kfree_skb(), thus triggering as many dropwatch
events as the number of skbs in the list. This can disturb the analysis
of packet drops, e.g. with fragmented echo requests generated by ICMP
sockets, or with regular SCTP packets: when consume_skb() frees them,
the kernel's drop monitor may wrongly account for several packet drops:

 consume_skb()
   skb_release_data()
     kfree_skb_list()
       kfree_skb() <-- false dropwatch event

Don't call kfree_skb() when freeing an skb list; use a dedicated
tracepoint instead. By printing "cur" and "next", it also becomes
possible to reconstruct the skb list from its members.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/trace/events/skb.h | 19 +++++++++++++++++++
 net/core/skbuff.c          |  6 +++++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index 9e92f22eb086..b16e3544bbbe 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -51,6 +51,25 @@ TRACE_EVENT(consume_skb,
 	TP_printk("skbaddr=%p", __entry->skbaddr)
 );
 
+TRACE_EVENT(kfree_skb_list,
+
+	TP_PROTO(struct sk_buff *cur, struct sk_buff *next),
+
+	TP_ARGS(cur, next),
+
+	TP_STRUCT__entry(
+		__field(	void *,	cur_addr	)
+		__field(	void *,	next_addr	)
+	),
+
+	TP_fast_assign(
+		__entry->cur_addr = cur;
+		__entry->next_addr = next;
+	),
+
+	TP_printk("cur=%p next=%p", __entry->cur_addr, __entry->next_addr)
+);
+
 TRACE_EVENT(skb_copy_datagram_iovec,
 
 	TP_PROTO(const struct sk_buff *skb, int len),
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 1ba8f0163744..7ed6bfc5dfd0 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -702,7 +702,11 @@ void kfree_skb_list(struct sk_buff *segs)
 	while (segs) {
 		struct sk_buff *next = segs->next;
 
-		kfree_skb(segs);
+		if (skb_unref(segs)) {
+			trace_kfree_skb_list(segs, next);
+			__kfree_skb(segs);
+		}
+
 		segs = next;
 	}
 }
-- 
2.26.2

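The commit message says that printing "cur" and "next" makes it possible
to reconstruct the skb list from its members. A minimal userspace sketch
of that reconstruction follows. It assumes the trace lines have already
been parsed into (cur, next) pairs, that a NULL "next" is represented as 0,
and that all pairs belong to one list. The names list_event, find_event(),
find_head() and rebuild_list() are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One record per parsed "cur=%p next=%p" trace line (hypothetical type). */
struct list_event {
	uintptr_t cur;
	uintptr_t next;
};

/* Return the index of the event whose cur equals addr, or -1 if none. */
static int find_event(const struct list_event *ev, size_t n, uintptr_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (ev[i].cur == addr)
			return (int)i;
	return -1;
}

/* The head is the event whose cur is not the next of any other event. */
static int find_head(const struct list_event *ev, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int referenced = 0;

		for (size_t j = 0; j < n; j++)
			if (ev[j].next == ev[i].cur)
				referenced = 1;
		if (!referenced)
			return (int)i;
	}
	return -1;
}

/*
 * Follow the next pointers from the head and store the addresses in list
 * order. Returns the number of entries written to out. The walk is bounded
 * by n, so malformed input with a cycle cannot loop forever.
 */
static size_t rebuild_list(const struct list_event *ev, size_t n,
			   uintptr_t *out, size_t out_cap)
{
	size_t len = 0;
	int i;

	assert(ev != NULL && out != NULL);

	i = find_head(ev, n);
	while (i >= 0 && len < out_cap && len < n) {
		out[len++] = ev[i].cur;
		if (ev[i].next == 0)
			break;
		i = find_event(ev, n, ev[i].next);
	}
	return len;
}
```

Events do not need to arrive in list order: the head is found first and the
chain is then followed through "next". Addresses can be reused after a free,
so a real tool would probably also need to group events by timestamp or CPU.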


* Re: [RFC PATCH net-next] net: use a dedicated tracepoint for kfree_skb_list()
  2020-10-23 20:52 [RFC PATCH net-next] net: use a dedicated tracepoint for kfree_skb_list() Davide Caratti
@ 2020-10-27 22:35 ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 2+ messages in thread
From: Marcelo Ricardo Leitner @ 2020-10-27 22:35 UTC (permalink / raw)
  To: Davide Caratti; +Cc: netdev, Xin Long, linux-sctp, Jakub Kicinski

On Fri, Oct 23, 2020 at 10:52:14PM +0200, Davide Caratti wrote:
> kfree_skb_list() calls kfree_skb(), thus triggering as many dropwatch
> events as the number of skbs in the list. This can disturb the analysis
> of packet drops, e.g. with fragmented echo requests generated by ICMP
> sockets, or with regular SCTP packets: when consume_skb() frees them,
> the kernel's drop monitor may wrongly account for several packet drops:
> 
>  consume_skb()
>    skb_release_data()
>      kfree_skb_list()
>        kfree_skb() <-- false dropwatch event

Seems the problem lies with skb_release_data() calling
kfree_skb_list() while it should have been a, say, consume_skb_list(),
and not generate further kfree_skb calls.

Maybe a bool parameter on skb_release_data to signal that it should
call consume_skb_list (which doesn't exist) instead?
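
A rough userspace model of that idea, as a sketch only: toy_skb, toy_trace
and all toy_*() functions are hypothetical stand-ins, not kernel code, and
consume_skb_list() does not exist in the tree. Counters take the place of
the kfree_skb and consume_skb tracepoints, and nothing is actually freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct sk_buff: only the links the sketch needs. */
struct toy_skb {
	struct toy_skb *next;		/* next skb in a list */
	struct toy_skb *frag_list;	/* list hanging off this skb */
};

/* Counters standing in for the kfree_skb and consume_skb tracepoints. */
struct toy_trace {
	unsigned int kfree_events;
	unsigned int consume_events;
};

static void toy_release_data(struct toy_skb *skb, struct toy_trace *t,
			     bool consumed);

/* Models kfree_skb(): counts as a drop, then releases the data. */
static void toy_kfree_skb(struct toy_skb *skb, struct toy_trace *t)
{
	assert(skb != NULL && t != NULL);
	t->kfree_events++;
	toy_release_data(skb, t, false);
}

/* Models consume_skb(): counts as a normal free, then releases the data. */
static void toy_consume_skb(struct toy_skb *skb, struct toy_trace *t)
{
	assert(skb != NULL && t != NULL);
	t->consume_events++;
	toy_release_data(skb, t, true);
}

/* Models kfree_skb_list(): every member is reported as a drop. */
static void toy_kfree_skb_list(struct toy_skb *segs, struct toy_trace *t)
{
	while (segs) {
		struct toy_skb *next = segs->next;

		toy_kfree_skb(segs, t);
		segs = next;
	}
}

/* Models the hypothetical consume_skb_list(): no drop events at all. */
static void toy_consume_skb_list(struct toy_skb *segs, struct toy_trace *t)
{
	while (segs) {
		struct toy_skb *next = segs->next;

		toy_consume_skb(segs, t);
		segs = next;
	}
}

/* Models skb_release_data() with the suggested bool parameter. */
static void toy_release_data(struct toy_skb *skb, struct toy_trace *t,
			     bool consumed)
{
	struct toy_skb *frags = skb->frag_list;

	skb->frag_list = NULL;
	if (consumed)
		toy_consume_skb_list(frags, t);
	else
		toy_kfree_skb_list(frags, t);
}
```

In this model, consuming an skb that carries a three-element frag_list
bumps only consume_events, while dropping the same skb bumps only
kfree_events, so a drop monitor listening to kfree_skb would see no false
events on the consume path.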

> 
> Don't call kfree_skb() when freeing an skb list; use a dedicated
> tracepoint instead. By printing "cur" and "next", it also becomes
> possible to reconstruct the skb list from its members.

I like the new probe alone. It helps to have more visibility on drops
such as those from __dev_xmit_skb() and how they happen.

But as a solution to the problem stated, it seems it can be confusing.
Say one is debugging a tx drop issue. AFAICT one would have to watch
both probe points anyway, as the drop could be on a layer below
SCTP. So I'm not seeing how it helps much, other than possibly causing
drop_watch to miss drops (by not listening to the new trace point).

  Marcelo
