All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] tun/tap: use kfree_skb_reason() to trace dropped skb
@ 2022-02-19 19:12 Dongli Zhang
  2022-02-19 19:12 ` [PATCH v2 1/3] net: tap: track dropped skb via kfree_skb_reason() Dongli Zhang
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Dongli Zhang @ 2022-02-19 19:12 UTC (permalink / raw)
  To: netdev, bpf
  Cc: linux-kernel, davem, kuba, rostedt, mingo, ast, daniel, andrii,
	imagedong, joao.m.martins, joe.jin, dsahern, edumazet

The commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()") has
introduced the kfree_skb_reason() to help track the reason.

The tun and tap are commonly used as virtio-net/vhost-net backend. This is to
use kfree_skb_reason() to trace the dropped skb for those two drivers. 

Changed since v1:

I have renamed many of the reasons since v1. I make them as generic as possible
so that they can be re-used by core networking and drivers.

- SKB_DROP_REASON_SKB_CSUM

This is used whenever there is checksum error with sk_buff.

- SKB_DROP_REASON_SKB_COPY_DATA

The kernel may (zero) copy the data to or from sk_buff, e.g.,
zerocopy_sg_from_iter(), skb_copy_datagram_from_iter() and
skb_orphan_frags_rx(). This reason is for the copy related error.

- SKB_DROP_REASON_SKB_GSO_SEG

Any error reported when GSO processing the sk_buff. It is frequent to process
sk_buff gso data and we introduce a new reason to handle that.
	
- SKB_DROP_REASON_SKB_PULL
- SKB_DROP_REASON_SKB_TRIM

It is frequent to pull to sk_buff data or trim the sk_buff data.

- SKB_DROP_REASON_DEV_HDR

Any driver may report error if there is any error in the metadata on the DMA
ring buffer.

- SKB_DROP_REASON_DEV_READY

The device is not ready/online or initialized to receive data.

- SKB_DROP_REASON_DEV_FILTER

David Ahern suggested SKB_DROP_REASON_TAP_FILTER. I changed from 'TAP' to 'DEV'
to make it more generic.

- SKB_DROP_REASON_FULL_RING

Suggested by Eric Dumazet.

- SKB_DROP_REASON_BPF_FILTER

Dropped by ebpf filter


This is the output for TUN device.

# cat /sys/kernel/debug/tracing/trace_pipe
          <idle>-0       [020] b.s3.  2317.035738: kfree_skb: skbaddr=00000000feed21ec protocol=4 location=000000003e0b7d3d reason: FULL_RING
          arping-7078    [028] b..1.  2317.065261: kfree_skb: skbaddr=000000003409d3c3 protocol=2054 location=000000003e0b7d3d reason: FULL_RING
      vhost-5409-5422    [001] b..1.  2317.818647: kfree_skb: skbaddr=0000000004b8c03a protocol=2054 location=000000003e0b7d3d reason: FULL_RING
          arping-7078    [028] b..1.  2318.065336: kfree_skb: skbaddr=000000003409d3c3 protocol=2054 location=000000003e0b7d3d reason: FULL_RING
      vhost-5409-5422    [001] b..1.  2318.845315: kfree_skb: skbaddr=0000000004b8c03a protocol=2054 location=000000003e0b7d3d reason: FULL_RING

This is the output for TAP device.

# cat /sys/kernel/debug/tracing/trace_pipe 
    kworker/19:1-243     [019] ..s1.   922.639393: kfree_skb: skbaddr=0000000045a155b0 protocol=2054 location=00000000f88b4996 reason: FULL_RING
    kworker/19:1-243     [019] ..s1.   923.663398: kfree_skb: skbaddr=00000000ff4e054b protocol=2054 location=00000000f88b4996 reason: FULL_RING
    kworker/19:1-243     [019] ..s1.   924.687351: kfree_skb: skbaddr=00000000ff4e054b protocol=2054 location=00000000f88b4996 reason: FULL_RING
    kworker/19:1-243     [019] ..s1.   925.711294: kfree_skb: skbaddr=00000000fb75a48b protocol=2054 location=00000000f88b4996 reason: FULL_RING



 drivers/net/tap.c          | 30 ++++++++++++++++++++++--------
 drivers/net/tun.c          | 38 ++++++++++++++++++++++++++++++--------
 include/linux/skbuff.h     | 16 ++++++++++++++++
 include/trace/events/skb.h | 10 ++++++++++


Please let me know if there is any suggestion on the definition of reasons.

Thank you very much!

Dongli Zhang



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/3] net: tap: track dropped skb via kfree_skb_reason()
  2022-02-19 19:12 [PATCH v2 0/3] tun/tap: use kfree_skb_reason() to trace dropped skb Dongli Zhang
@ 2022-02-19 19:12 ` Dongli Zhang
  2022-02-19 22:46   ` David Ahern
  2022-02-19 19:12 ` [PATCH v2 2/3] net: tun: split run_ebpf_filter() and pskb_trim() into different "if statement" Dongli Zhang
  2022-02-19 19:12 ` [PATCH v2 3/3] net: tun: track dropped skb via kfree_skb_reason() Dongli Zhang
  2 siblings, 1 reply; 8+ messages in thread
From: Dongli Zhang @ 2022-02-19 19:12 UTC (permalink / raw)
  To: netdev, bpf
  Cc: linux-kernel, davem, kuba, rostedt, mingo, ast, daniel, andrii,
	imagedong, joao.m.martins, joe.jin, dsahern, edumazet

The TAP can be used as vhost-net backend. E.g., the tap_handle_frame() is
the interface to forward the skb from TAP to vhost-net/virtio-net.

However, there are many "goto drop" in the TAP driver. Therefore, the
kfree_skb_reason() is involved at each "goto drop" to help userspace
ftrace/ebpf to track the reason for the loss of packets.

The below reasons are introduced:

- SKB_DROP_REASON_SKB_CSUM
- SKB_DROP_REASON_SKB_COPY_DATA
- SKB_DROP_REASON_SKB_GSO_SEG
- SKB_DROP_REASON_DEV_HDR
- SKB_DROP_REASON_FULL_RING

Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 drivers/net/tap.c          | 30 ++++++++++++++++++++++--------
 include/linux/skbuff.h     |  9 +++++++++
 include/trace/events/skb.h |  5 +++++
 3 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 8e3a28ba6b28..ab3592506ef8 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -322,6 +322,7 @@ rx_handler_result_t tap_handle_frame(struct sk_buff **pskb)
 	struct tap_dev *tap;
 	struct tap_queue *q;
 	netdev_features_t features = TAP_FEATURES;
+	int drop_reason;
 
 	tap = tap_dev_get_rcu(dev);
 	if (!tap)
@@ -343,12 +344,16 @@ rx_handler_result_t tap_handle_frame(struct sk_buff **pskb)
 		struct sk_buff *segs = __skb_gso_segment(skb, features, false);
 		struct sk_buff *next;
 
-		if (IS_ERR(segs))
+		if (IS_ERR(segs)) {
+			drop_reason = SKB_DROP_REASON_SKB_GSO_SEG;
 			goto drop;
+		}
 
 		if (!segs) {
-			if (ptr_ring_produce(&q->ring, skb))
+			if (ptr_ring_produce(&q->ring, skb)) {
+				drop_reason = SKB_DROP_REASON_FULL_RING;
 				goto drop;
+			}
 			goto wake_up;
 		}
 
@@ -369,10 +374,14 @@ rx_handler_result_t tap_handle_frame(struct sk_buff **pskb)
 		 */
 		if (skb->ip_summed == CHECKSUM_PARTIAL &&
 		    !(features & NETIF_F_CSUM_MASK) &&
-		    skb_checksum_help(skb))
+		    skb_checksum_help(skb)) {
+			drop_reason = SKB_DROP_REASON_SKB_CSUM;
 			goto drop;
-		if (ptr_ring_produce(&q->ring, skb))
+		}
+		if (ptr_ring_produce(&q->ring, skb)) {
+			drop_reason = SKB_DROP_REASON_FULL_RING;
 			goto drop;
+		}
 	}
 
 wake_up:
@@ -383,7 +392,7 @@ rx_handler_result_t tap_handle_frame(struct sk_buff **pskb)
 	/* Count errors/drops only here, thus don't care about args. */
 	if (tap->count_rx_dropped)
 		tap->count_rx_dropped(tap);
-	kfree_skb(skb);
+	kfree_skb_reason(skb, drop_reason);
 	return RX_HANDLER_CONSUMED;
 }
 EXPORT_SYMBOL_GPL(tap_handle_frame);
@@ -632,6 +641,7 @@ static ssize_t tap_get_user(struct tap_queue *q, void *msg_control,
 	int depth;
 	bool zerocopy = false;
 	size_t linear;
+	int drop_reason;
 
 	if (q->flags & IFF_VNET_HDR) {
 		vnet_hdr_len = READ_ONCE(q->vnet_hdr_sz);
@@ -696,8 +706,10 @@ static ssize_t tap_get_user(struct tap_queue *q, void *msg_control,
 	else
 		err = skb_copy_datagram_from_iter(skb, 0, from, len);
 
-	if (err)
+	if (err) {
+		drop_reason = SKB_DROP_REASON_SKB_COPY_DATA;
 		goto err_kfree;
+	}
 
 	skb_set_network_header(skb, ETH_HLEN);
 	skb_reset_mac_header(skb);
@@ -706,8 +718,10 @@ static ssize_t tap_get_user(struct tap_queue *q, void *msg_control,
 	if (vnet_hdr_len) {
 		err = virtio_net_hdr_to_skb(skb, &vnet_hdr,
 					    tap_is_little_endian(q));
-		if (err)
+		if (err) {
+			drop_reason = SKB_DROP_REASON_DEV_HDR;
 			goto err_kfree;
+		}
 	}
 
 	skb_probe_transport_header(skb);
@@ -738,7 +752,7 @@ static ssize_t tap_get_user(struct tap_queue *q, void *msg_control,
 	return total_len;
 
 err_kfree:
-	kfree_skb(skb);
+	kfree_skb_reason(skb, drop_reason);
 
 err:
 	rcu_read_lock();
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a5adbf6b51e8..218f7ba753e7 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -346,6 +346,15 @@ enum skb_drop_reason {
 					 * udp packet drop out of
 					 * udp_memory_allocated.
 					 */
+	SKB_DROP_REASON_SKB_CSUM,	/* sk_buff checksum error */
+	SKB_DROP_REASON_SKB_COPY_DATA,	/* failed to copy data from or to
+					 * sk_buff
+					 */
+	SKB_DROP_REASON_SKB_GSO_SEG,	/* gso segmentation error */
+	SKB_DROP_REASON_DEV_HDR,	/* there is something wrong with
+					 * device driver specific header
+					 */
+	SKB_DROP_REASON_FULL_RING,	/* ring buffer is full */
 	SKB_DROP_REASON_MAX,
 };
 
diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index cfcfd26399f7..842020d532f2 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -27,6 +27,11 @@
 	EM(SKB_DROP_REASON_IP_NOPROTO, IP_NOPROTO)		\
 	EM(SKB_DROP_REASON_SOCKET_RCVBUFF, SOCKET_RCVBUFF)	\
 	EM(SKB_DROP_REASON_PROTO_MEM, PROTO_MEM)		\
+	EM(SKB_DROP_REASON_SKB_CSUM, SKB_CSUM)			\
+	EM(SKB_DROP_REASON_SKB_COPY_DATA, SKB_COPY_DATA)	\
+	EM(SKB_DROP_REASON_SKB_GSO_SEG, SKB_GSO_SEG)		\
+	EM(SKB_DROP_REASON_DEV_HDR, DEV_HDR)			\
+	EM(SKB_DROP_REASON_FULL_RING, FULL_RING)		\
 	EMe(SKB_DROP_REASON_MAX, MAX)
 
 #undef EM
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/3] net: tun: split run_ebpf_filter() and pskb_trim() into different "if statement"
  2022-02-19 19:12 [PATCH v2 0/3] tun/tap: use kfree_skb_reason() to trace dropped skb Dongli Zhang
  2022-02-19 19:12 ` [PATCH v2 1/3] net: tap: track dropped skb via kfree_skb_reason() Dongli Zhang
@ 2022-02-19 19:12 ` Dongli Zhang
  2022-02-19 19:12 ` [PATCH v2 3/3] net: tun: track dropped skb via kfree_skb_reason() Dongli Zhang
  2 siblings, 0 replies; 8+ messages in thread
From: Dongli Zhang @ 2022-02-19 19:12 UTC (permalink / raw)
  To: netdev, bpf
  Cc: linux-kernel, davem, kuba, rostedt, mingo, ast, daniel, andrii,
	imagedong, joao.m.martins, joe.jin, dsahern, edumazet

No functional change.

Just to split the if statement into different conditions to use
kfree_skb_reason() to trace the reason later.

Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 drivers/net/tun.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index fed85447701a..aa27268edc5f 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1086,7 +1086,10 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 		goto drop;
 
 	len = run_ebpf_filter(tun, skb, len);
-	if (len == 0 || pskb_trim(skb, len))
+	if (len == 0)
+		goto drop;
+
+	if (pskb_trim(skb, len))
 		goto drop;
 
 	if (unlikely(skb_orphan_frags_rx(skb, GFP_ATOMIC)))
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/3] net: tun: track dropped skb via kfree_skb_reason()
  2022-02-19 19:12 [PATCH v2 0/3] tun/tap: use kfree_skb_reason() to trace dropped skb Dongli Zhang
  2022-02-19 19:12 ` [PATCH v2 1/3] net: tap: track dropped skb via kfree_skb_reason() Dongli Zhang
  2022-02-19 19:12 ` [PATCH v2 2/3] net: tun: split run_ebpf_filter() and pskb_trim() into different "if statement" Dongli Zhang
@ 2022-02-19 19:12 ` Dongli Zhang
  2022-02-19 22:54   ` David Ahern
  2 siblings, 1 reply; 8+ messages in thread
From: Dongli Zhang @ 2022-02-19 19:12 UTC (permalink / raw)
  To: netdev, bpf
  Cc: linux-kernel, davem, kuba, rostedt, mingo, ast, daniel, andrii,
	imagedong, joao.m.martins, joe.jin, dsahern, edumazet

The TUN can be used as vhost-net backend. E.g, the tun_net_xmit() is the
interface to forward the skb from TUN to vhost-net/virtio-net.

However, there are many "goto drop" in the TUN driver. Therefore, the
kfree_skb_reason() is involved at each "goto drop" to help userspace
ftrace/ebpf to track the reason for the loss of packets.

The below reasons are introduced:

- SKB_DROP_REASON_SKB_PULL
- SKB_DROP_REASON_SKB_TRIM
- SKB_DROP_REASON_DEV_READY
- SKB_DROP_REASON_DEV_FILTER
- SKB_DROP_REASON_BPF_FILTER

Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 drivers/net/tun.c          | 37 ++++++++++++++++++++++++++++---------
 include/linux/skbuff.h     |  7 +++++++
 include/trace/events/skb.h |  5 +++++
 3 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index aa27268edc5f..ab47a66deb7f 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1062,13 +1062,16 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct netdev_queue *queue;
 	struct tun_file *tfile;
 	int len = skb->len;
+	int drop_reason;
 
 	rcu_read_lock();
 	tfile = rcu_dereference(tun->tfiles[txq]);
 
 	/* Drop packet if interface is not attached */
-	if (!tfile)
+	if (!tfile) {
+		drop_reason = SKB_DROP_REASON_DEV_READY;
 		goto drop;
+	}
 
 	if (!rcu_dereference(tun->steering_prog))
 		tun_automq_xmit(tun, skb);
@@ -1078,22 +1081,32 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* Drop if the filter does not like it.
 	 * This is a noop if the filter is disabled.
 	 * Filter can be enabled only for the TAP devices. */
-	if (!check_filter(&tun->txflt, skb))
+	if (!check_filter(&tun->txflt, skb)) {
+		drop_reason = SKB_DROP_REASON_DEV_FILTER;
 		goto drop;
+	}
 
 	if (tfile->socket.sk->sk_filter &&
-	    sk_filter(tfile->socket.sk, skb))
+	    sk_filter(tfile->socket.sk, skb)) {
+		drop_reason = SKB_DROP_REASON_SOCKET_FILTER;
 		goto drop;
+	}
 
 	len = run_ebpf_filter(tun, skb, len);
-	if (len == 0)
+	if (len == 0) {
+		drop_reason = SKB_DROP_REASON_BPF_FILTER;
 		goto drop;
+	}
 
-	if (pskb_trim(skb, len))
+	if (pskb_trim(skb, len)) {
+		drop_reason = SKB_DROP_REASON_SKB_TRIM;
 		goto drop;
+	}
 
-	if (unlikely(skb_orphan_frags_rx(skb, GFP_ATOMIC)))
+	if (unlikely(skb_orphan_frags_rx(skb, GFP_ATOMIC))) {
+		drop_reason = SKB_DROP_REASON_SKB_COPY_DATA;
 		goto drop;
+	}
 
 	skb_tx_timestamp(skb);
 
@@ -1104,8 +1117,10 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	nf_reset_ct(skb);
 
-	if (ptr_ring_produce(&tfile->tx_ring, skb))
+	if (ptr_ring_produce(&tfile->tx_ring, skb)) {
+		drop_reason = SKB_DROP_REASON_FULL_RING;
 		goto drop;
+	}
 
 	/* NETIF_F_LLTX requires to do our own update of trans_start */
 	queue = netdev_get_tx_queue(dev, txq);
@@ -1122,7 +1137,7 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 drop:
 	atomic_long_inc(&dev->tx_dropped);
 	skb_tx_error(skb);
-	kfree_skb(skb);
+	kfree_skb_reason(skb, drop_reason);
 	rcu_read_unlock();
 	return NET_XMIT_DROP;
 }
@@ -1720,6 +1735,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	u32 rxhash = 0;
 	int skb_xdp = 1;
 	bool frags = tun_napi_frags_enabled(tfile);
+	int drop_reason;
 
 	if (!(tun->flags & IFF_NO_PI)) {
 		if (len < sizeof(pi))
@@ -1822,10 +1838,11 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 			err = skb_copy_datagram_from_iter(skb, 0, from, len);
 
 		if (err) {
+			drop_reason = SKB_DROP_REASON_SKB_COPY_DATA;
 			err = -EFAULT;
 drop:
 			atomic_long_inc(&tun->dev->rx_dropped);
-			kfree_skb(skb);
+			kfree_skb_reason(skb, drop_reason);
 			if (frags) {
 				tfile->napi.skb = NULL;
 				mutex_unlock(&tfile->napi_mutex);
@@ -1872,6 +1889,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	case IFF_TAP:
 		if (frags && !pskb_may_pull(skb, ETH_HLEN)) {
 			err = -ENOMEM;
+			drop_reason = SKB_DROP_REASON_SKB_PULL;
 			goto drop;
 		}
 		skb->protocol = eth_type_trans(skb, tun->dev);
@@ -1925,6 +1943,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	if (unlikely(!(tun->dev->flags & IFF_UP))) {
 		err = -EIO;
 		rcu_read_unlock();
+		drop_reason = SKB_DROP_REASON_DEV_READY;
 		goto drop;
 	}
 
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 218f7ba753e7..9370778b428d 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -351,10 +351,17 @@ enum skb_drop_reason {
 					 * sk_buff
 					 */
 	SKB_DROP_REASON_SKB_GSO_SEG,	/* gso segmentation error */
+	SKB_DROP_REASON_SKB_PULL,	/* failed to pull sk_buff data */
+	SKB_DROP_REASON_SKB_TRIM,	/* failed to trim sk_buff data */
 	SKB_DROP_REASON_DEV_HDR,	/* there is something wrong with
 					 * device driver specific header
 					 */
+	SKB_DROP_REASON_DEV_READY,	/* device is not ready */
+	SKB_DROP_REASON_DEV_FILTER,	/* dropped by device driver
+					 * specific filter
+					 */
 	SKB_DROP_REASON_FULL_RING,	/* ring buffer is full */
+	SKB_DROP_REASON_BPF_FILTER,	/* dropped by ebpf filter */
 	SKB_DROP_REASON_MAX,
 };
 
diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index 842020d532f2..62704851062c 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -30,8 +30,13 @@
 	EM(SKB_DROP_REASON_SKB_CSUM, SKB_CSUM)			\
 	EM(SKB_DROP_REASON_SKB_COPY_DATA, SKB_COPY_DATA)	\
 	EM(SKB_DROP_REASON_SKB_GSO_SEG, SKB_GSO_SEG)		\
+	EM(SKB_DROP_REASON_SKB_PULL, SKB_PULL)			\
+	EM(SKB_DROP_REASON_SKB_TRIM, SKB_TRIM)			\
 	EM(SKB_DROP_REASON_DEV_HDR, DEV_HDR)			\
+	EM(SKB_DROP_REASON_DEV_READY, DEV_READY)		\
+	EM(SKB_DROP_REASON_DEV_FILTER, DEV_FILTER)		\
 	EM(SKB_DROP_REASON_FULL_RING, FULL_RING)		\
+	EM(SKB_DROP_REASON_BPF_FILTER, BPF_FILTER)		\
 	EMe(SKB_DROP_REASON_MAX, MAX)
 
 #undef EM
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] net: tap: track dropped skb via kfree_skb_reason()
  2022-02-19 19:12 ` [PATCH v2 1/3] net: tap: track dropped skb via kfree_skb_reason() Dongli Zhang
@ 2022-02-19 22:46   ` David Ahern
  2022-02-20  5:39     ` Dongli Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: David Ahern @ 2022-02-19 22:46 UTC (permalink / raw)
  To: Dongli Zhang, netdev, bpf
  Cc: linux-kernel, davem, kuba, rostedt, mingo, ast, daniel, andrii,
	imagedong, joao.m.martins, joe.jin, edumazet

On 2/19/22 12:12 PM, Dongli Zhang wrote:
> The TAP can be used as vhost-net backend. E.g., the tap_handle_frame() is
> the interface to forward the skb from TAP to vhost-net/virtio-net.
> 
> However, there are many "goto drop" in the TAP driver. Therefore, the
> kfree_skb_reason() is involved at each "goto drop" to help userspace
> ftrace/ebpf to track the reason for the loss of packets.
> 
> The below reasons are introduced:
> 
> - SKB_DROP_REASON_SKB_CSUM
> - SKB_DROP_REASON_SKB_COPY_DATA
> - SKB_DROP_REASON_SKB_GSO_SEG
> - SKB_DROP_REASON_DEV_HDR
> - SKB_DROP_REASON_FULL_RING
> 
> Cc: Joao Martins <joao.m.martins@oracle.com>
> Cc: Joe Jin <joe.jin@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
>  drivers/net/tap.c          | 30 ++++++++++++++++++++++--------
>  include/linux/skbuff.h     |  9 +++++++++
>  include/trace/events/skb.h |  5 +++++
>  3 files changed, 36 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/tap.c b/drivers/net/tap.c
> index 8e3a28ba6b28..ab3592506ef8 100644
> --- a/drivers/net/tap.c
> +++ b/drivers/net/tap.c
> @@ -322,6 +322,7 @@ rx_handler_result_t tap_handle_frame(struct sk_buff **pskb)
>  	struct tap_dev *tap;
>  	struct tap_queue *q;
>  	netdev_features_t features = TAP_FEATURES;
> +	int drop_reason;

enum skb_drop_reason drop_reason;

>  
>  	tap = tap_dev_get_rcu(dev);
>  	if (!tap)
> @@ -343,12 +344,16 @@ rx_handler_result_t tap_handle_frame(struct sk_buff **pskb)
>  		struct sk_buff *segs = __skb_gso_segment(skb, features, false);
>  		struct sk_buff *next;
>  
> -		if (IS_ERR(segs))
> +		if (IS_ERR(segs)) {
> +			drop_reason = SKB_DROP_REASON_SKB_GSO_SEG;
>  			goto drop;
> +		}
>  
>  		if (!segs) {
> -			if (ptr_ring_produce(&q->ring, skb))
> +			if (ptr_ring_produce(&q->ring, skb)) {
> +				drop_reason = SKB_DROP_REASON_FULL_RING;
>  				goto drop;
> +			}
>  			goto wake_up;
>  		}

you missed the walk of segs and calling ptr_ring_produce.

>  

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/3] net: tun: track dropped skb via kfree_skb_reason()
  2022-02-19 19:12 ` [PATCH v2 3/3] net: tun: track dropped skb via kfree_skb_reason() Dongli Zhang
@ 2022-02-19 22:54   ` David Ahern
  2022-02-20  5:40     ` Dongli Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: David Ahern @ 2022-02-19 22:54 UTC (permalink / raw)
  To: Dongli Zhang, netdev, bpf
  Cc: linux-kernel, davem, kuba, rostedt, mingo, ast, daniel, andrii,
	imagedong, joao.m.martins, joe.jin, edumazet

On 2/19/22 12:12 PM, Dongli Zhang wrote:
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index aa27268edc5f..ab47a66deb7f 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1062,13 +1062,16 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
>  	struct netdev_queue *queue;
>  	struct tun_file *tfile;
>  	int len = skb->len;
> +	int drop_reason;

enum skb_drop_reason


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] net: tap: track dropped skb via kfree_skb_reason()
  2022-02-19 22:46   ` David Ahern
@ 2022-02-20  5:39     ` Dongli Zhang
  0 siblings, 0 replies; 8+ messages in thread
From: Dongli Zhang @ 2022-02-20  5:39 UTC (permalink / raw)
  To: David Ahern, netdev, bpf
  Cc: linux-kernel, davem, kuba, rostedt, mingo, ast, daniel, andrii,
	imagedong, joao.m.martins, joe.jin, edumazet

Hi David,

On 2/19/22 2:46 PM, David Ahern wrote:
> On 2/19/22 12:12 PM, Dongli Zhang wrote:
>> The TAP can be used as vhost-net backend. E.g., the tap_handle_frame() is
>> the interface to forward the skb from TAP to vhost-net/virtio-net.
>>
>> However, there are many "goto drop" in the TAP driver. Therefore, the
>> kfree_skb_reason() is involved at each "goto drop" to help userspace
>> ftrace/ebpf to track the reason for the loss of packets.
>>
>> The below reasons are introduced:
>>
>> - SKB_DROP_REASON_SKB_CSUM
>> - SKB_DROP_REASON_SKB_COPY_DATA
>> - SKB_DROP_REASON_SKB_GSO_SEG
>> - SKB_DROP_REASON_DEV_HDR
>> - SKB_DROP_REASON_FULL_RING
>>
>> Cc: Joao Martins <joao.m.martins@oracle.com>
>> Cc: Joe Jin <joe.jin@oracle.com>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>>  drivers/net/tap.c          | 30 ++++++++++++++++++++++--------
>>  include/linux/skbuff.h     |  9 +++++++++
>>  include/trace/events/skb.h |  5 +++++
>>  3 files changed, 36 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/net/tap.c b/drivers/net/tap.c
>> index 8e3a28ba6b28..ab3592506ef8 100644
>> --- a/drivers/net/tap.c
>> +++ b/drivers/net/tap.c
>> @@ -322,6 +322,7 @@ rx_handler_result_t tap_handle_frame(struct sk_buff **pskb)
>>  	struct tap_dev *tap;
>>  	struct tap_queue *q;
>>  	netdev_features_t features = TAP_FEATURES;
>> +	int drop_reason;
> 
> enum skb_drop_reason drop_reason;

According to cscope, so far all 'drop_reason' are declared in type 'int' (e.g.,
ip_rcv_finish_core()).

I will change above to enum.

> 
>>  
>>  	tap = tap_dev_get_rcu(dev);
>>  	if (!tap)
>> @@ -343,12 +344,16 @@ rx_handler_result_t tap_handle_frame(struct sk_buff **pskb)
>>  		struct sk_buff *segs = __skb_gso_segment(skb, features, false);
>>  		struct sk_buff *next;
>>  
>> -		if (IS_ERR(segs))
>> +		if (IS_ERR(segs)) {
>> +			drop_reason = SKB_DROP_REASON_SKB_GSO_SEG;
>>  			goto drop;
>> +		}
>>  
>>  		if (!segs) {
>> -			if (ptr_ring_produce(&q->ring, skb))
>> +			if (ptr_ring_produce(&q->ring, skb)) {
>> +				drop_reason = SKB_DROP_REASON_FULL_RING;
>>  				goto drop;
>> +			}
>>  			goto wake_up;
>>  		}
> 
> you missed the walk of segs and calling ptr_ring_produce.

This call site of kfree_skb() and kfree_skb_list() is unique and there is not
any "goto drop" involved. From developers' view, we will be able to tell if the
'drop' is at here according to the instruction pointers in callstack.

 360                 consume_skb(skb);
 361                 skb_list_walk_safe(segs, skb, next) {
 362                         skb_mark_not_on_list(skb);
 363                         if (ptr_ring_produce(&q->ring, skb)) {
 364                                 kfree_skb(skb);
 365                                 kfree_skb_list(next);
 366                                 break;
 367                         }
 368                 }

I will add the reason to "walk of segs" case as well. About kfree_skb_list(), I
will introduce a kfree_skb_list_reason().

Thank you very much!

Dongli Zhang

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/3] net: tun: track dropped skb via kfree_skb_reason()
  2022-02-19 22:54   ` David Ahern
@ 2022-02-20  5:40     ` Dongli Zhang
  0 siblings, 0 replies; 8+ messages in thread
From: Dongli Zhang @ 2022-02-20  5:40 UTC (permalink / raw)
  To: David Ahern, netdev, bpf
  Cc: linux-kernel, davem, kuba, rostedt, mingo, ast, daniel, andrii,
	imagedong, joao.m.martins, joe.jin, edumazet

Hi David,

On 2/19/22 2:54 PM, David Ahern wrote:
> On 2/19/22 12:12 PM, Dongli Zhang wrote:
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index aa27268edc5f..ab47a66deb7f 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -1062,13 +1062,16 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
>>  	struct netdev_queue *queue;
>>  	struct tun_file *tfile;
>>  	int len = skb->len;
>> +	int drop_reason;
> 
> enum skb_drop_reason
> 

As mentioned in previous email ...

According to cscope, so far all 'drop_reason' are declared in type 'int' (e.g.,
ip_rcv_finish_core()).

I will change above to enum.

Thank you very much!

Dongli Zhang

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-02-20  5:42 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-19 19:12 [PATCH v2 0/3] tun/tap: use kfree_skb_reason() to trace dropped skb Dongli Zhang
2022-02-19 19:12 ` [PATCH v2 1/3] net: tap: track dropped skb via kfree_skb_reason() Dongli Zhang
2022-02-19 22:46   ` David Ahern
2022-02-20  5:39     ` Dongli Zhang
2022-02-19 19:12 ` [PATCH v2 2/3] net: tun: split run_ebpf_filter() and pskb_trim() into different "if statement" Dongli Zhang
2022-02-19 19:12 ` [PATCH v2 3/3] net: tun: track dropped skb via kfree_skb_reason() Dongli Zhang
2022-02-19 22:54   ` David Ahern
2022-02-20  5:40     ` Dongli Zhang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.