linux-kernel.vger.kernel.org archive mirror
* [PATCH net-next v2 0/9] net: tcp: add skb drop reasons to tcp state change
@ 2022-05-17  8:09 menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 1/9] net: skb: introduce __DEFINE_SKB_DROP_REASON() to simply the code menglong8.dong
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: menglong8.dong @ 2022-05-17  8:09 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev

From: Menglong Dong <imagedong@tencent.com>

In this patch series, skb drop reasons are added to the code path of TCP
state change, which has not been covered before. It is hard to pass these
reasons from a function to its caller, where the skb is dropped. To do
this, we either make the functions return skb drop reasons, or pass a
pointer to 'reason' to them as a new function argument.

=============================
We change the return type of tcp_rcv_synsent_state_process() and
tcp_rcv_state_process() to 'enum skb_drop_reason' and make them return
skb drop reasons in the 5th and 6th patches.
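
To illustrate the "return the reason" approach, here is a minimal
user-space sketch. It is not the kernel code: 'enum drop_reason',
'struct pkt', rcv_state_process() and do_rcv() are hypothetical
stand-ins, and the flag checks are made up for the example.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for 'enum skb_drop_reason'. */
enum drop_reason {
	NOT_DROPPED_YET = 0,	/* packet was accepted */
	DROP_NOT_SPECIFIED,
	DROP_TCP_FLAGS,
	DROP_TCP_RESET,
};

/* Hypothetical packet: only the flags this sketch looks at. */
struct pkt {
	bool syn;
	bool ack;
	bool rst;
};

/* Model of a state-processing function that reports why the packet
 * has to be dropped instead of a plain zero/non-zero int.
 */
static enum drop_reason rcv_state_process(const struct pkt *p)
{
	if (p->rst)
		return DROP_TCP_RESET;
	if (p->syn && !p->ack)
		return DROP_TCP_FLAGS;
	return NOT_DROPPED_YET;
}

/* Model of the caller, which is where the drop really happens.
 * Returns true if the packet was dropped and stores the reason in *out.
 */
static bool do_rcv(const struct pkt *p, enum drop_reason *out)
{
	enum drop_reason reason = rcv_state_process(p);

	*out = reason;
	return reason != NOT_DROPPED_YET;
}
```

Because zero means "not dropped", a caller that only tested the old
return value for non-zero keeps working, while a caller that cares can
forward the value to its free routine.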

=============================
To get skb drop reasons in the TCP connection request code path, we
have to pass a pointer to 'reason' as a new function argument of
conn_request() in 'struct inet_connection_sock_af_ops'. As the return
value of conn_request() can be positive, negative or 0, it can't easily
be reused to return drop reasons. This work is done in the 7th patch, and
the functions that are used as conn_request() are modified as well:

  dccp_v4_conn_request()
  dccp_v6_conn_request()
  tcp_v4_conn_request()
  tcp_v6_conn_request()
  subflow_v4_conn_request()
  subflow_v6_conn_request()

As our target is TCP, dccp and mptcp are not handled any further.
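
The out-parameter approach can be sketched in user space as follows. This
is only an illustration under assumptions: 'struct listener',
'struct af_ops' and demo_conn_request() are hypothetical names, and the
backlog check is a simplified stand-in for the real accept-queue logic.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for 'enum skb_drop_reason'. */
enum drop_reason {
	NOT_DROPPED_YET = 0,
	DROP_NOT_SPECIFIED,
	DROP_LISTENOVERFLOWS,
};

/* Hypothetical listener with a bounded accept backlog. */
struct listener {
	int backlog;
	int max_backlog;
};

/* Model of an ops table whose callback already has a meaningful int
 * return value, so the drop reason travels through a pointer instead.
 */
struct af_ops {
	int (*conn_request)(struct listener *l, enum drop_reason *reason);
};

/* In this sketch the callback returns 0 both when the request is
 * queued and when it is dropped, so only *reason tells them apart.
 */
static int demo_conn_request(struct listener *l, enum drop_reason *reason)
{
	if (l->backlog >= l->max_backlog) {
		if (reason)
			*reason = DROP_LISTENOVERFLOWS;
		return 0;
	}
	l->backlog++;
	return 0;
}
```

The caller initializes its 'reason' variable before the call, so an
implementation that never writes through the pointer leaves the caller
with whatever default it chose.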

=============================
In the 8th patch, skb drop reasons are added to
tcp_timewait_state_process() by adding a function argument to it. In the
original code, every skb is dropped for a tw socket. To reduce noise,
consume_skb() is now used for the 'good' skbs. The caller of
tcp_timewait_state_process() can tell them apart by the value of the drop
reason: if it is SKB_NOT_DROPPED_YET, the skb should not be dropped.
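
As a rough user-space model of that caller-side decision (all names
below are hypothetical, and the counters merely stand in for
consume_skb() and kfree_skb_reason()):

```c
/* Hypothetical, trimmed-down stand-in for 'enum skb_drop_reason'. */
enum drop_reason {
	NOT_DROPPED_YET = 0,
	DROP_NOT_SPECIFIED,
	DROP_TIMEWAIT,
};

/* Counts how packets were released, in place of real freeing. */
struct counters {
	unsigned int consumed;
	unsigned int dropped;
};

/* Model of the timewait handler: it reports through *reason whether
 * the packet was fine or has to be counted as a drop.
 */
static void timewait_process(int pkt_is_good, enum drop_reason *reason)
{
	*reason = pkt_is_good ? NOT_DROPPED_YET : DROP_TIMEWAIT;
}

/* Model of the caller: pick the release routine from the reason. */
static void handle_tw_pkt(struct counters *c, int pkt_is_good)
{
	enum drop_reason reason = DROP_NOT_SPECIFIED;

	timewait_process(pkt_is_good, &reason);
	if (reason == NOT_DROPPED_YET)
		c->consumed++;	/* consume_skb() in the kernel */
	else
		c->dropped++;	/* kfree_skb_reason(skb, reason) */
}
```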

=============================
In the 9th patch, skb drop reasons are added to route_req() in struct
tcp_request_sock_ops. The following functions are involved:

  tcp_v4_route_req()
  tcp_v6_route_req()
  subflow_v4_route_req()
  subflow_v6_route_req()

In this patch series, the following new drop reasons are added:

  SOCKET_DESTROYED
  TCP_PAWSACTIVEREJECTED
  TCP_ABORTONDATA
  LISTENOVERFLOWS
  TCP_REQQFULLDROP
  TIMEWAIT
  LSM

Changes since v1:
7/9 - fix the compile errors of dccp and mptcp (kernel test robot)
8/9 - skb is not freed on TCP_TW_ACK and 'ret' is not initialized, fix
      it (Eric Dumazet)

Menglong Dong (9):
  net: skb: introduce __DEFINE_SKB_DROP_REASON() to simply the code
  net: skb: introduce __skb_queue_purge_reason()
  net: sock: introduce sk_stream_kill_queues_reason()
  net: inet: add skb drop reason to inet_csk_destroy_sock()
  net: tcp: make tcp_rcv_synsent_state_process() return drop reasons
  net: tcp: make tcp_rcv_state_process() return drop reason
  net: tcp: add skb drop reasons to tcp connect requesting
  net: tcp: add skb drop reasons to tcp tw code path
  net: tcp: add skb drop reasons to route_req()

 include/linux/skbuff.h             | 482 +++++++++++++++++++----------
 include/net/inet_connection_sock.h |   3 +-
 include/net/sock.h                 |   8 +-
 include/net/tcp.h                  |  27 +-
 include/trace/events/skb.h         |  89 +-----
 net/core/drop_monitor.c            |  13 -
 net/core/skbuff.c                  |  10 +
 net/core/stream.c                  |   7 +-
 net/dccp/dccp.h                    |   3 +-
 net/dccp/input.c                   |   3 +-
 net/dccp/ipv4.c                    |   3 +-
 net/dccp/ipv6.c                    |   5 +-
 net/ipv4/inet_connection_sock.c    |   2 +-
 net/ipv4/tcp_input.c               |  56 ++--
 net/ipv4/tcp_ipv4.c                |  54 +++-
 net/ipv4/tcp_minisocks.c           |  35 ++-
 net/ipv6/tcp_ipv6.c                |  56 +++-
 net/mptcp/subflow.c                |  18 +-
 18 files changed, 522 insertions(+), 352 deletions(-)

-- 
2.36.1



* [PATCH net-next v2 1/9] net: skb: introduce __DEFINE_SKB_DROP_REASON() to simply the code
  2022-05-17  8:09 [PATCH net-next v2 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
@ 2022-05-17  8:10 ` menglong8.dong
  2022-05-18  1:14   ` Jakub Kicinski
  2022-05-17  8:10 ` [PATCH net-next v2 2/9] net: skb: introduce __skb_queue_purge_reason() menglong8.dong
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 12+ messages in thread
From: menglong8.dong @ 2022-05-17  8:10 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

It is annoying to add new skb drop reasons to both 'enum skb_drop_reason'
and TRACE_SKB_DROP_REASON in trace/events/skb.h, and it's easy to forget
to add a new reason to TRACE_SKB_DROP_REASON.

TRACE_SKB_DROP_REASON is used to convert a numeric drop reason to a
string. For now, the string passed to user space is exactly the name in
'enum skb_drop_reason' minus the 'SKB_DROP_REASON_' prefix. So why not
keep them together by defining a macro?

Therefore, introduce __DEFINE_SKB_DROP_REASON() and use it for both the
'enum skb_drop_reason' definition and the string conversion.

Now, what should we do with the documentation for the reasons? How about
following __BPF_FUNC_MAPPER() and keeping the documentation together?
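
For readers unfamiliar with the pattern, here is a small stand-alone
sketch of the same list-macro idea. The names DEFINE_REASON(),
'enum reason', reason_names[] and reason_name() are hypothetical and the
list is cut down to a few entries; only the mechanism mirrors the patch.

```c
/* Single source of truth: every reason is listed exactly once. */
#define DEFINE_REASON(FN)	\
	FN(NOT_SPECIFIED)	\
	FN(NO_SOCKET)		\
	FN(TCP_CSUM)		\
	FN(MAX)

/* First expansion: generate the enumerators. */
enum reason {
	NOT_DROPPED_YET = 0,
#define FN(name) REASON_##name,
	DEFINE_REASON(FN)
#undef FN
};

/* Second expansion: generate the matching strings, indexed by the
 * enumerators produced above.
 */
static const char * const reason_names[] = {
#define FN(name) [REASON_##name] = #name,
	DEFINE_REASON(FN)
#undef FN
};

/* Bounds-checked lookup. Slots that the list does not name, such as
 * index 0 in this sketch, stay NULL and are reported as "UNKNOWN".
 */
static const char *reason_name(enum reason r)
{
	if ((unsigned int)r > REASON_MAX || !reason_names[r])
		return "UNKNOWN";
	return reason_names[r];
}
```

Adding a reason then means adding one FN() line, and the enum and the
string table can no longer drift apart. The guard in reason_name() is a
choice of this sketch, not something taken from the patch.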

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/linux/skbuff.h     | 444 ++++++++++++++++++++++++-------------
 include/trace/events/skb.h |  89 +-------
 net/core/drop_monitor.c    |  13 --
 net/core/skbuff.c          |  10 +
 4 files changed, 297 insertions(+), 259 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4db3f4a33580..dfc568844df2 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -337,169 +337,295 @@ struct sk_buff_head {
 
 struct sk_buff;
 
+/*
+ * SKB_DROP_REASON_NOT_SPECIFIED
+ *	drop reason is not specified
+ *
+ * SKB_DROP_REASON_NO_SOCKET
+ *	socket not found
+ *
+ * SKB_DROP_REASON_PKT_TOO_SMALL
+ *	packet size is too small
+ *
+ * SKB_DROP_REASON_TCP_CSUM
+ *	TCP checksum error
+ *
+ * SKB_DROP_REASON_SOCKET_FILTER
+ *	dropped by socket filter
+ *
+ * SKB_DROP_REASON_UDP_CSUM
+ *	UDP checksum error
+ *
+ * SKB_DROP_REASON_NETFILTER_DROP
+ *	dropped by netfilter
+ *
+ * SKB_DROP_REASON_OTHERHOST
+ *	packet don't belong to current host (interface is in promisc
+ *	mode)
+ *
+ * SKB_DROP_REASON_IP_CSUM
+ *	IP checksum error
+ *
+ * SKB_DROP_REASON_IP_INHDR
+ *	there is something wrong with IP header (see
+ *	IPSTATS_MIB_INHDRERRORS)
+ *
+ * SKB_DROP_REASON_IP_RPFILTER
+ *	IP rpfilter validate failed. see the document for rp_filter
+ *	in ip-sysctl.rst for more information
+ *
+ * SKB_DROP_REASON_UNICAST_IN_L2_MULTICAST
+ *	destination address of L2 is multicast, but L3 is unicast.
+ *
+ * SKB_DROP_REASON_XFRM_POLICY
+ *	xfrm policy check failed
+ *
+ * SKB_DROP_REASON_IP_NOPROTO
+ *	no support for IP protocol
+ *
+ * SKB_DROP_REASON_SOCKET_RCVBUFF
+ *	socket receive buff is full
+ *
+ * SKB_DROP_REASON_PROTO_MEM
+ *	proto memory limition, such as udp packet drop out of
+ *	udp_memory_allocated.
+ *
+ * SKB_DROP_REASON_TCP_MD5NOTFOUND
+ *	no MD5 hash and one expected, corresponding to
+ *	LINUX_MIB_TCPMD5NOTFOUND
+ *
+ * SKB_DROP_REASON_TCP_MD5UNEXPECTED
+ *	MD5 hash and we're not expecting one, corresponding to
+ *	LINUX_MIB_TCPMD5UNEXPECTED
+ *
+ * SKB_DROP_REASON_TCP_MD5FAILURE
+ *	MD5 hash and its wrong, corresponding to LINUX_MIB_TCPMD5FAILURE
+ *
+ * SKB_DROP_REASON_SOCKET_BACKLOG
+ *	failed to add skb to socket backlog (see LINUX_MIB_TCPBACKLOGDROP)
+ *
+ * SKB_DROP_REASON_TCP_FLAGS
+ *	TCP flags invalid
+ *
+ * SKB_DROP_REASON_TCP_ZEROWINDOW
+ *	TCP receive window size is zero, see LINUX_MIB_TCPZEROWINDOWDROP
+ *
+ * SKB_DROP_REASON_TCP_OLD_DATA
+ *	the TCP data reveived is already received before (spurious
+ *	retrans may happened), see LINUX_MIB_DELAYEDACKLOST
+ *
+ * SKB_DROP_REASON_TCP_OVERWINDOW
+ *	the TCP data is out of window, the seq of the first byte exceed
+ *	the right edges of receive window
+ *
+ * SKB_DROP_REASON_TCP_OFOMERGE
+ *	the data of skb is already in the ofo queue, corresponding to
+ *	LINUX_MIB_TCPOFOMERGE
+ *
+ * SKB_DROP_REASON_TCP_RFC7323_PAWS
+ *	PAWS check, corresponding to LINUX_MIB_PAWSESTABREJECTED
+ *
+ * SKB_DROP_REASON_TCP_INVALID_SEQUENCE
+ *	Not acceptable SEQ field
+ *
+ * SKB_DROP_REASON_TCP_RESET
+ *	Invalid RST packet
+ *
+ * SKB_DROP_REASON_TCP_INVALID_SYN
+ *	Incoming packet has unexpected SYN flag
+ *
+ * SKB_DROP_REASON_TCP_CLOSE
+ *	TCP socket in CLOSE state
+ *
+ * SKB_DROP_REASON_TCP_FASTOPEN
+ *	dropped by FASTOPEN request socket
+ *
+ * SKB_DROP_REASON_TCP_OLD_ACK
+ *	TCP ACK is old, but in window
+ *
+ * SKB_DROP_REASON_TCP_TOO_OLD_ACK
+ *	TCP ACK is too old
+ *
+ * SKB_DROP_REASON_TCP_ACK_UNSENT_DATA
+ *	TCP ACK for data we haven't sent yet
+ *
+ * SKB_DROP_REASON_TCP_OFO_QUEUE_PRUNE
+ *	pruned from TCP OFO queue
+ *
+ * SKB_DROP_REASON_TCP_OFO_DROP
+ *	data already in receive queue
+ *
+ * SKB_DROP_REASON_IP_OUTNOROUTES
+ *	route lookup failed
+ *
+ * SKB_DROP_REASON_BPF_CGROUP_EGRESS
+ *	dropped by BPF_PROG_TYPE_CGROUP_SKB eBPF program
+ *
+ * SKB_DROP_REASON_IPV6DISABLED
+ *	IPv6 is disabled on the device
+ *
+ * SKB_DROP_REASON_NEIGH_CREATEFAIL
+ *	failed to create neigh entry
+ *
+ * SKB_DROP_REASON_NEIGH_FAILED
+ *	neigh entry in failed state
+ *
+ * SKB_DROP_REASON_NEIGH_QUEUEFULL
+ *	arp_queue for neigh entry is full
+ *
+ * SKB_DROP_REASON_NEIGH_DEAD
+ *	neigh entry is dead
+ *
+ * SKB_DROP_REASON_TC_EGRESS
+ *	dropped in TC egress HOOK
+ *
+ * SKB_DROP_REASON_QDISC_DROP
+ *	dropped by qdisc when packet outputting (failed to enqueue to
+ *	current qdisc)
+ *
+ * SKB_DROP_REASON_CPU_BACKLOG
+ *	failed to enqueue the skb to the per CPU backlog queue. This
+ *	can be caused by backlog queue full (see netdev_max_backlog in
+ *	net.rst) or RPS flow limit
+ *
+ * SKB_DROP_REASON_XDP
+ *	dropped by XDP in input path
+ *
+ * SKB_DROP_REASON_TC_INGRESS
+ *	dropped in TC ingress HOOK
+ *
+ * SKB_DROP_REASON_UNHANDLED_PROTO
+ *	protocol not implemented or not supported
+ *
+ * SKB_DROP_REASON_SKB_CSUM
+ *	sk_buff checksum computation error
+ *
+ * SKB_DROP_REASON_SKB_GSO_SEG
+ *	gso segmentation error
+ *
+ * SKB_DROP_REASON_SKB_UCOPY_FAULT
+ *	failed to copy data from user space, e.g., via
+ *	zerocopy_sg_from_iter() or skb_orphan_frags_rx()
+ *
+ * SKB_DROP_REASON_DEV_HDR
+ *	device driver specific header/metadata is invalid
+ *
+ * SKB_DROP_REASON_DEV_READY
+ *	the device is not ready to xmit/recv due to any of its data
+ *	structure that is not up/ready/initialized, e.g., the IFF_UP is
+ *	not set, or driver specific tun->tfiles[txq] is not initialized
+ *
+ * SKB_DROP_REASON_FULL_RING
+ *	ring buffer is full
+ *
+ * SKB_DROP_REASON_NOMEM
+ *	error due to OOM
+ *
+ * SKB_DROP_REASON_HDR_TRUNC
+ *	failed to trunc/extract the header from networking data, e.g.,
+ *	failed to pull the protocol header from frags via
+ *	pskb_may_pull()
+ *
+ * SKB_DROP_REASON_TAP_FILTER
+ *	dropped by (ebpf) filter directly attached to tun/tap, e.g., via
+ *	TUNSETFILTEREBPF
+ *
+ * SKB_DROP_REASON_TAP_TXFILTER
+ *	dropped by tx filter implemented at tun/tap, e.g., check_filter()
+ *
+ * SKB_DROP_REASON_ICMP_CSUM
+ *	ICMP checksum error
+ *
+ * SKB_DROP_REASON_INVALID_PROTO
+ *	the packet doesn't follow RFC 2211, such as a broadcasts
+ *	ICMP_TIMESTAMP
+ *
+ * SKB_DROP_REASON_IP_INADDRERRORS
+ *	host unreachable, corresponding to IPSTATS_MIB_INADDRERRORS
+ *
+ * SKB_DROP_REASON_IP_INNOROUTES
+ *	network unreachable, corresponding to IPSTATS_MIB_INADDRERRORS
+ *
+ * SKB_DROP_REASON_PKT_TOO_BIG
+ *	packet size is too big (maybe exceed the MTU)
+ */
+#define __DEFINE_SKB_DROP_REASON(FN)	\
+	FN(NOT_SPECIFIED)		\
+	FN(NO_SOCKET)			\
+	FN(PKT_TOO_SMALL)		\
+	FN(TCP_CSUM)			\
+	FN(SOCKET_FILTER)		\
+	FN(UDP_CSUM)			\
+	FN(NETFILTER_DROP)		\
+	FN(OTHERHOST)			\
+	FN(IP_CSUM)			\
+	FN(IP_INHDR)			\
+	FN(IP_RPFILTER)			\
+	FN(UNICAST_IN_L2_MULTICAST)	\
+	FN(XFRM_POLICY)			\
+	FN(IP_NOPROTO)			\
+	FN(SOCKET_RCVBUFF)		\
+	FN(PROTO_MEM)			\
+	FN(TCP_MD5NOTFOUND)		\
+	FN(TCP_MD5UNEXPECTED)		\
+	FN(TCP_MD5FAILURE)		\
+	FN(SOCKET_BACKLOG)		\
+	FN(TCP_FLAGS)			\
+	FN(TCP_ZEROWINDOW)		\
+	FN(TCP_OLD_DATA)		\
+	FN(TCP_OVERWINDOW)		\
+	FN(TCP_OFOMERGE)		\
+	FN(TCP_RFC7323_PAWS)		\
+	FN(TCP_INVALID_SEQUENCE)	\
+	FN(TCP_RESET)			\
+	FN(TCP_INVALID_SYN)		\
+	FN(TCP_CLOSE)			\
+	FN(TCP_FASTOPEN)		\
+	FN(TCP_OLD_ACK)			\
+	FN(TCP_TOO_OLD_ACK)		\
+	FN(TCP_ACK_UNSENT_DATA)		\
+	FN(TCP_OFO_QUEUE_PRUNE)		\
+	FN(TCP_OFO_DROP)		\
+	FN(IP_OUTNOROUTES)		\
+	FN(BPF_CGROUP_EGRESS)		\
+	FN(IPV6DISABLED)		\
+	FN(NEIGH_CREATEFAIL)		\
+	FN(NEIGH_FAILED)		\
+	FN(NEIGH_QUEUEFULL)		\
+	FN(NEIGH_DEAD)			\
+	FN(TC_EGRESS)			\
+	FN(QDISC_DROP)			\
+	FN(CPU_BACKLOG)			\
+	FN(XDP)				\
+	FN(TC_INGRESS)			\
+	FN(UNHANDLED_PROTO)		\
+	FN(SKB_CSUM)			\
+	FN(SKB_GSO_SEG)			\
+	FN(SKB_UCOPY_FAULT)		\
+	FN(DEV_HDR)			\
+	FN(DEV_READY)			\
+	FN(FULL_RING)			\
+	FN(NOMEM)			\
+	FN(HDR_TRUNC)			\
+	FN(TAP_FILTER)			\
+	FN(TAP_TXFILTER)		\
+	FN(ICMP_CSUM)			\
+	FN(INVALID_PROTO)		\
+	FN(IP_INADDRERRORS)		\
+	FN(IP_INNOROUTES)		\
+	FN(PKT_TOO_BIG)			\
+	FN(MAX)
+
 /* The reason of skb drop, which is used in kfree_skb_reason().
  * en...maybe they should be splited by group?
- *
- * Each item here should also be in 'TRACE_SKB_DROP_REASON', which is
- * used to translate the reason to string.
  */
 enum skb_drop_reason {
 	SKB_NOT_DROPPED_YET = 0,
-	SKB_DROP_REASON_NOT_SPECIFIED,	/* drop reason is not specified */
-	SKB_DROP_REASON_NO_SOCKET,	/* socket not found */
-	SKB_DROP_REASON_PKT_TOO_SMALL,	/* packet size is too small */
-	SKB_DROP_REASON_TCP_CSUM,	/* TCP checksum error */
-	SKB_DROP_REASON_SOCKET_FILTER,	/* dropped by socket filter */
-	SKB_DROP_REASON_UDP_CSUM,	/* UDP checksum error */
-	SKB_DROP_REASON_NETFILTER_DROP,	/* dropped by netfilter */
-	SKB_DROP_REASON_OTHERHOST,	/* packet don't belong to current
-					 * host (interface is in promisc
-					 * mode)
-					 */
-	SKB_DROP_REASON_IP_CSUM,	/* IP checksum error */
-	SKB_DROP_REASON_IP_INHDR,	/* there is something wrong with
-					 * IP header (see
-					 * IPSTATS_MIB_INHDRERRORS)
-					 */
-	SKB_DROP_REASON_IP_RPFILTER,	/* IP rpfilter validate failed.
-					 * see the document for rp_filter
-					 * in ip-sysctl.rst for more
-					 * information
-					 */
-	SKB_DROP_REASON_UNICAST_IN_L2_MULTICAST, /* destination address of L2
-						  * is multicast, but L3 is
-						  * unicast.
-						  */
-	SKB_DROP_REASON_XFRM_POLICY,	/* xfrm policy check failed */
-	SKB_DROP_REASON_IP_NOPROTO,	/* no support for IP protocol */
-	SKB_DROP_REASON_SOCKET_RCVBUFF,	/* socket receive buff is full */
-	SKB_DROP_REASON_PROTO_MEM,	/* proto memory limition, such as
-					 * udp packet drop out of
-					 * udp_memory_allocated.
-					 */
-	SKB_DROP_REASON_TCP_MD5NOTFOUND,	/* no MD5 hash and one
-						 * expected, corresponding
-						 * to LINUX_MIB_TCPMD5NOTFOUND
-						 */
-	SKB_DROP_REASON_TCP_MD5UNEXPECTED,	/* MD5 hash and we're not
-						 * expecting one, corresponding
-						 * to LINUX_MIB_TCPMD5UNEXPECTED
-						 */
-	SKB_DROP_REASON_TCP_MD5FAILURE,	/* MD5 hash and its wrong,
-					 * corresponding to
-					 * LINUX_MIB_TCPMD5FAILURE
-					 */
-	SKB_DROP_REASON_SOCKET_BACKLOG,	/* failed to add skb to socket
-					 * backlog (see
-					 * LINUX_MIB_TCPBACKLOGDROP)
-					 */
-	SKB_DROP_REASON_TCP_FLAGS,	/* TCP flags invalid */
-	SKB_DROP_REASON_TCP_ZEROWINDOW,	/* TCP receive window size is zero,
-					 * see LINUX_MIB_TCPZEROWINDOWDROP
-					 */
-	SKB_DROP_REASON_TCP_OLD_DATA,	/* the TCP data reveived is already
-					 * received before (spurious retrans
-					 * may happened), see
-					 * LINUX_MIB_DELAYEDACKLOST
-					 */
-	SKB_DROP_REASON_TCP_OVERWINDOW,	/* the TCP data is out of window,
-					 * the seq of the first byte exceed
-					 * the right edges of receive
-					 * window
-					 */
-	SKB_DROP_REASON_TCP_OFOMERGE,	/* the data of skb is already in
-					 * the ofo queue, corresponding to
-					 * LINUX_MIB_TCPOFOMERGE
-					 */
-	SKB_DROP_REASON_TCP_RFC7323_PAWS, /* PAWS check, corresponding to
-					   * LINUX_MIB_PAWSESTABREJECTED
-					   */
-	SKB_DROP_REASON_TCP_INVALID_SEQUENCE, /* Not acceptable SEQ field */
-	SKB_DROP_REASON_TCP_RESET,	/* Invalid RST packet */
-	SKB_DROP_REASON_TCP_INVALID_SYN, /* Incoming packet has unexpected SYN flag */
-	SKB_DROP_REASON_TCP_CLOSE,	/* TCP socket in CLOSE state */
-	SKB_DROP_REASON_TCP_FASTOPEN,	/* dropped by FASTOPEN request socket */
-	SKB_DROP_REASON_TCP_OLD_ACK,	/* TCP ACK is old, but in window */
-	SKB_DROP_REASON_TCP_TOO_OLD_ACK, /* TCP ACK is too old */
-	SKB_DROP_REASON_TCP_ACK_UNSENT_DATA, /* TCP ACK for data we haven't sent yet */
-	SKB_DROP_REASON_TCP_OFO_QUEUE_PRUNE, /* pruned from TCP OFO queue */
-	SKB_DROP_REASON_TCP_OFO_DROP,	/* data already in receive queue */
-	SKB_DROP_REASON_IP_OUTNOROUTES,	/* route lookup failed */
-	SKB_DROP_REASON_BPF_CGROUP_EGRESS,	/* dropped by
-						 * BPF_PROG_TYPE_CGROUP_SKB
-						 * eBPF program
-						 */
-	SKB_DROP_REASON_IPV6DISABLED,	/* IPv6 is disabled on the device */
-	SKB_DROP_REASON_NEIGH_CREATEFAIL,	/* failed to create neigh
-						 * entry
-						 */
-	SKB_DROP_REASON_NEIGH_FAILED,	/* neigh entry in failed state */
-	SKB_DROP_REASON_NEIGH_QUEUEFULL,	/* arp_queue for neigh
-						 * entry is full
-						 */
-	SKB_DROP_REASON_NEIGH_DEAD,	/* neigh entry is dead */
-	SKB_DROP_REASON_TC_EGRESS,	/* dropped in TC egress HOOK */
-	SKB_DROP_REASON_QDISC_DROP,	/* dropped by qdisc when packet
-					 * outputting (failed to enqueue to
-					 * current qdisc)
-					 */
-	SKB_DROP_REASON_CPU_BACKLOG,	/* failed to enqueue the skb to
-					 * the per CPU backlog queue. This
-					 * can be caused by backlog queue
-					 * full (see netdev_max_backlog in
-					 * net.rst) or RPS flow limit
-					 */
-	SKB_DROP_REASON_XDP,		/* dropped by XDP in input path */
-	SKB_DROP_REASON_TC_INGRESS,	/* dropped in TC ingress HOOK */
-	SKB_DROP_REASON_UNHANDLED_PROTO,	/* protocol not implemented
-						 * or not supported
-						 */
-	SKB_DROP_REASON_SKB_CSUM,	/* sk_buff checksum computation
-					 * error
-					 */
-	SKB_DROP_REASON_SKB_GSO_SEG,	/* gso segmentation error */
-	SKB_DROP_REASON_SKB_UCOPY_FAULT,	/* failed to copy data from
-						 * user space, e.g., via
-						 * zerocopy_sg_from_iter()
-						 * or skb_orphan_frags_rx()
-						 */
-	SKB_DROP_REASON_DEV_HDR,	/* device driver specific
-					 * header/metadata is invalid
-					 */
-	/* the device is not ready to xmit/recv due to any of its data
-	 * structure that is not up/ready/initialized, e.g., the IFF_UP is
-	 * not set, or driver specific tun->tfiles[txq] is not initialized
-	 */
-	SKB_DROP_REASON_DEV_READY,
-	SKB_DROP_REASON_FULL_RING,	/* ring buffer is full */
-	SKB_DROP_REASON_NOMEM,		/* error due to OOM */
-	SKB_DROP_REASON_HDR_TRUNC,      /* failed to trunc/extract the header
-					 * from networking data, e.g., failed
-					 * to pull the protocol header from
-					 * frags via pskb_may_pull()
-					 */
-	SKB_DROP_REASON_TAP_FILTER,     /* dropped by (ebpf) filter directly
-					 * attached to tun/tap, e.g., via
-					 * TUNSETFILTEREBPF
-					 */
-	SKB_DROP_REASON_TAP_TXFILTER,	/* dropped by tx filter implemented
-					 * at tun/tap, e.g., check_filter()
-					 */
-	SKB_DROP_REASON_ICMP_CSUM,	/* ICMP checksum error */
-	SKB_DROP_REASON_INVALID_PROTO,	/* the packet doesn't follow RFC
-					 * 2211, such as a broadcasts
-					 * ICMP_TIMESTAMP
-					 */
-	SKB_DROP_REASON_IP_INADDRERRORS,	/* host unreachable, corresponding
-						 * to IPSTATS_MIB_INADDRERRORS
-						 */
-	SKB_DROP_REASON_IP_INNOROUTES,	/* network unreachable, corresponding
-					 * to IPSTATS_MIB_INADDRERRORS
-					 */
-	SKB_DROP_REASON_PKT_TOO_BIG,	/* packet size is too big (maybe exceed
-					 * the MTU)
-					 */
-	SKB_DROP_REASON_MAX,
+
+#undef FN
+#define FN(name) SKB_DROP_REASON_##name,
+	__DEFINE_SKB_DROP_REASON(FN)
+#undef FN
 };
 
 #define SKB_DR_INIT(name, reason)				\
@@ -515,6 +641,8 @@ enum skb_drop_reason {
 			SKB_DR_SET(name, reason);		\
 	} while (0)
 
+extern const char * const drop_reasons[];
+
 /* To allow 64K frame to be packed as single skb without frag_list we
  * require 64K/PAGE_SIZE pages plus 1 additional page to allow for
  * buffers which do not start on a page boundary.
diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index a477bf907498..45264e4bb254 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -9,92 +9,6 @@
 #include <linux/netdevice.h>
 #include <linux/tracepoint.h>
 
-#define TRACE_SKB_DROP_REASON					\
-	EM(SKB_DROP_REASON_NOT_SPECIFIED, NOT_SPECIFIED)	\
-	EM(SKB_DROP_REASON_NO_SOCKET, NO_SOCKET)		\
-	EM(SKB_DROP_REASON_PKT_TOO_SMALL, PKT_TOO_SMALL)	\
-	EM(SKB_DROP_REASON_TCP_CSUM, TCP_CSUM)			\
-	EM(SKB_DROP_REASON_SOCKET_FILTER, SOCKET_FILTER)	\
-	EM(SKB_DROP_REASON_UDP_CSUM, UDP_CSUM)			\
-	EM(SKB_DROP_REASON_NETFILTER_DROP, NETFILTER_DROP)	\
-	EM(SKB_DROP_REASON_OTHERHOST, OTHERHOST)		\
-	EM(SKB_DROP_REASON_IP_CSUM, IP_CSUM)			\
-	EM(SKB_DROP_REASON_IP_INHDR, IP_INHDR)			\
-	EM(SKB_DROP_REASON_IP_RPFILTER, IP_RPFILTER)		\
-	EM(SKB_DROP_REASON_UNICAST_IN_L2_MULTICAST,		\
-	   UNICAST_IN_L2_MULTICAST)				\
-	EM(SKB_DROP_REASON_XFRM_POLICY, XFRM_POLICY)		\
-	EM(SKB_DROP_REASON_IP_NOPROTO, IP_NOPROTO)		\
-	EM(SKB_DROP_REASON_SOCKET_RCVBUFF, SOCKET_RCVBUFF)	\
-	EM(SKB_DROP_REASON_PROTO_MEM, PROTO_MEM)		\
-	EM(SKB_DROP_REASON_TCP_MD5NOTFOUND, TCP_MD5NOTFOUND)	\
-	EM(SKB_DROP_REASON_TCP_MD5UNEXPECTED,			\
-	   TCP_MD5UNEXPECTED)					\
-	EM(SKB_DROP_REASON_TCP_MD5FAILURE, TCP_MD5FAILURE)	\
-	EM(SKB_DROP_REASON_SOCKET_BACKLOG, SOCKET_BACKLOG)	\
-	EM(SKB_DROP_REASON_TCP_FLAGS, TCP_FLAGS)		\
-	EM(SKB_DROP_REASON_TCP_ZEROWINDOW, TCP_ZEROWINDOW)	\
-	EM(SKB_DROP_REASON_TCP_OLD_DATA, TCP_OLD_DATA)		\
-	EM(SKB_DROP_REASON_TCP_OVERWINDOW, TCP_OVERWINDOW)	\
-	EM(SKB_DROP_REASON_TCP_OFOMERGE, TCP_OFOMERGE)		\
-	EM(SKB_DROP_REASON_TCP_OFO_DROP, TCP_OFO_DROP)		\
-	EM(SKB_DROP_REASON_TCP_RFC7323_PAWS, TCP_RFC7323_PAWS)	\
-	EM(SKB_DROP_REASON_TCP_INVALID_SEQUENCE,		\
-	   TCP_INVALID_SEQUENCE)				\
-	EM(SKB_DROP_REASON_TCP_RESET, TCP_RESET)		\
-	EM(SKB_DROP_REASON_TCP_INVALID_SYN, TCP_INVALID_SYN)	\
-	EM(SKB_DROP_REASON_TCP_CLOSE, TCP_CLOSE)		\
-	EM(SKB_DROP_REASON_TCP_FASTOPEN, TCP_FASTOPEN)		\
-	EM(SKB_DROP_REASON_TCP_OLD_ACK, TCP_OLD_ACK)		\
-	EM(SKB_DROP_REASON_TCP_TOO_OLD_ACK, TCP_TOO_OLD_ACK)	\
-	EM(SKB_DROP_REASON_TCP_ACK_UNSENT_DATA,			\
-	   TCP_ACK_UNSENT_DATA)					\
-	EM(SKB_DROP_REASON_TCP_OFO_QUEUE_PRUNE,			\
-	  TCP_OFO_QUEUE_PRUNE)					\
-	EM(SKB_DROP_REASON_IP_OUTNOROUTES, IP_OUTNOROUTES)	\
-	EM(SKB_DROP_REASON_BPF_CGROUP_EGRESS,			\
-	   BPF_CGROUP_EGRESS)					\
-	EM(SKB_DROP_REASON_IPV6DISABLED, IPV6DISABLED)		\
-	EM(SKB_DROP_REASON_NEIGH_CREATEFAIL, NEIGH_CREATEFAIL)	\
-	EM(SKB_DROP_REASON_NEIGH_FAILED, NEIGH_FAILED)		\
-	EM(SKB_DROP_REASON_NEIGH_QUEUEFULL, NEIGH_QUEUEFULL)	\
-	EM(SKB_DROP_REASON_NEIGH_DEAD, NEIGH_DEAD)		\
-	EM(SKB_DROP_REASON_TC_EGRESS, TC_EGRESS)		\
-	EM(SKB_DROP_REASON_QDISC_DROP, QDISC_DROP)		\
-	EM(SKB_DROP_REASON_CPU_BACKLOG, CPU_BACKLOG)		\
-	EM(SKB_DROP_REASON_XDP, XDP)				\
-	EM(SKB_DROP_REASON_TC_INGRESS, TC_INGRESS)		\
-	EM(SKB_DROP_REASON_UNHANDLED_PROTO, UNHANDLED_PROTO)	\
-	EM(SKB_DROP_REASON_SKB_CSUM, SKB_CSUM)			\
-	EM(SKB_DROP_REASON_SKB_GSO_SEG, SKB_GSO_SEG)		\
-	EM(SKB_DROP_REASON_SKB_UCOPY_FAULT, SKB_UCOPY_FAULT)	\
-	EM(SKB_DROP_REASON_DEV_HDR, DEV_HDR)			\
-	EM(SKB_DROP_REASON_DEV_READY, DEV_READY)		\
-	EM(SKB_DROP_REASON_FULL_RING, FULL_RING)		\
-	EM(SKB_DROP_REASON_NOMEM, NOMEM)			\
-	EM(SKB_DROP_REASON_HDR_TRUNC, HDR_TRUNC)		\
-	EM(SKB_DROP_REASON_TAP_FILTER, TAP_FILTER)		\
-	EM(SKB_DROP_REASON_TAP_TXFILTER, TAP_TXFILTER)		\
-	EM(SKB_DROP_REASON_ICMP_CSUM, ICMP_CSUM)		\
-	EM(SKB_DROP_REASON_INVALID_PROTO, INVALID_PROTO)	\
-	EM(SKB_DROP_REASON_IP_INADDRERRORS, IP_INADDRERRORS)	\
-	EM(SKB_DROP_REASON_IP_INNOROUTES, IP_INNOROUTES)	\
-	EM(SKB_DROP_REASON_PKT_TOO_BIG, PKT_TOO_BIG)		\
-	EMe(SKB_DROP_REASON_MAX, MAX)
-
-#undef EM
-#undef EMe
-
-#define EM(a, b)	TRACE_DEFINE_ENUM(a);
-#define EMe(a, b)	TRACE_DEFINE_ENUM(a);
-
-TRACE_SKB_DROP_REASON
-
-#undef EM
-#undef EMe
-#define EM(a, b)	{ a, #b },
-#define EMe(a, b)	{ a, #b }
-
 /*
  * Tracepoint for free an sk_buff:
  */
@@ -121,8 +35,7 @@ TRACE_EVENT(kfree_skb,
 
 	TP_printk("skbaddr=%p protocol=%u location=%p reason: %s",
 		  __entry->skbaddr, __entry->protocol, __entry->location,
-		  __print_symbolic(__entry->reason,
-				   TRACE_SKB_DROP_REASON))
+		  drop_reasons[__entry->reason])
 );
 
 TRACE_EVENT(consume_skb,
diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index 41cac0e4834e..4ad1decce724 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -48,19 +48,6 @@
 static int trace_state = TRACE_OFF;
 static bool monitor_hw;
 
-#undef EM
-#undef EMe
-
-#define EM(a, b)	[a] = #b,
-#define EMe(a, b)	[a] = #b
-
-/* drop_reasons is used to translate 'enum skb_drop_reason' to string,
- * which is reported to user space.
- */
-static const char * const drop_reasons[] = {
-	TRACE_SKB_DROP_REASON
-};
-
 /* net_dm_mutex
  *
  * An overall lock guarding every operation coming from userspace.
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index fab791b0c59e..e677c052b459 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -90,6 +90,16 @@ static struct kmem_cache *skbuff_ext_cache __ro_after_init;
 int sysctl_max_skb_frags __read_mostly = MAX_SKB_FRAGS;
 EXPORT_SYMBOL(sysctl_max_skb_frags);
 
+/* drop_reasons is used to translate 'enum skb_drop_reason' to string,
+ * which is reported to user space.
+ */
+const char * const drop_reasons[] = {
+#undef FN
+#define FN(name) [SKB_DROP_REASON_##name] = #name,
+	__DEFINE_SKB_DROP_REASON(FN)
+#undef FN
+};
+
 /**
  *	skb_panic - private function for out-of-line support
  *	@skb:	buffer
-- 
2.36.1



* [PATCH net-next v2 2/9] net: skb: introduce __skb_queue_purge_reason()
  2022-05-17  8:09 [PATCH net-next v2 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 1/9] net: skb: introduce __DEFINE_SKB_DROP_REASON() to simply the code menglong8.dong
@ 2022-05-17  8:10 ` menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 3/9] net: sock: introduce sk_stream_kill_queues_reason() menglong8.dong
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: menglong8.dong @ 2022-05-17  8:10 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

Introduce __skb_queue_purge_reason() to empty an skb list with a given
drop reason, and make __skb_queue_purge() an inline call to it.
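
The wrapper pattern, modelled in user space (a sketch only: the queue
type, the per-reason counters and all function names are hypothetical,
and free() stands in for kfree_skb_reason()):

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for 'enum skb_drop_reason'. */
enum drop_reason {
	NOT_DROPPED_YET = 0,
	DROP_NOT_SPECIFIED,
	DROP_SOCKET_DESTROYED,
	DROP_MAX,
};

struct buf {
	struct buf *next;
};

/* Singly linked queue plus per-reason drop counters for the demo. */
struct buf_queue {
	struct buf *head;
	unsigned int drops[DROP_MAX];
};

/* Push a freshly allocated buffer; returns 0 on success, -1 on OOM. */
static int queue_push(struct buf_queue *q)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return -1;
	b->next = q->head;
	q->head = b;
	return 0;
}

static struct buf *queue_dequeue(struct buf_queue *q)
{
	struct buf *b = q->head;

	if (b)
		q->head = b->next;
	return b;
}

/* New variant: every buffer is released with the caller's reason. */
static void queue_purge_reason(struct buf_queue *q, enum drop_reason reason)
{
	struct buf *b;

	while ((b = queue_dequeue(q)) != NULL) {
		q->drops[reason]++;
		free(b);
	}
}

/* Old entry point, kept as a thin wrapper so existing callers need
 * no change and still get the generic reason.
 */
static void queue_purge(struct buf_queue *q)
{
	queue_purge_reason(q, DROP_NOT_SPECIFIED);
}
```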

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/linux/skbuff.h | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index dfc568844df2..e9659a63961a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3323,18 +3323,24 @@ static inline int skb_orphan_frags_rx(struct sk_buff *skb, gfp_t gfp_mask)
 }
 
 /**
- *	__skb_queue_purge - empty a list
+ *	__skb_queue_purge_reason - empty a list with specific drop reason
  *	@list: list to empty
+ *	@reason: drop reason
  *
  *	Delete all buffers on an &sk_buff list. Each buffer is removed from
  *	the list and one reference dropped. This function does not take the
  *	list lock and the caller must hold the relevant locks to use it.
  */
-static inline void __skb_queue_purge(struct sk_buff_head *list)
+static inline void __skb_queue_purge_reason(struct sk_buff_head *list,
+					    enum skb_drop_reason reason)
 {
 	struct sk_buff *skb;
 	while ((skb = __skb_dequeue(list)) != NULL)
-		kfree_skb(skb);
+		kfree_skb_reason(skb, reason);
+}
+static inline void __skb_queue_purge(struct sk_buff_head *list)
+{
+	__skb_queue_purge_reason(list, SKB_DROP_REASON_NOT_SPECIFIED);
 }
 void skb_queue_purge(struct sk_buff_head *list);
 
-- 
2.36.1



* [PATCH net-next v2 3/9] net: sock: introduce sk_stream_kill_queues_reason()
  2022-05-17  8:09 [PATCH net-next v2 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 1/9] net: skb: introduce __DEFINE_SKB_DROP_REASON() to simply the code menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 2/9] net: skb: introduce __skb_queue_purge_reason() menglong8.dong
@ 2022-05-17  8:10 ` menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 4/9] net: inet: add skb drop reason to inet_csk_destroy_sock() menglong8.dong
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: menglong8.dong @ 2022-05-17  8:10 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

Introduce the function sk_stream_kill_queues_reason() and make the
original sk_stream_kill_queues() an inline call to it.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/net/sock.h | 8 +++++++-
 net/core/stream.c  | 7 ++++---
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 73063c88a249..085838ce70d5 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1128,12 +1128,18 @@ int sk_stream_wait_connect(struct sock *sk, long *timeo_p);
 int sk_stream_wait_memory(struct sock *sk, long *timeo_p);
 void sk_stream_wait_close(struct sock *sk, long timeo_p);
 int sk_stream_error(struct sock *sk, int flags, int err);
-void sk_stream_kill_queues(struct sock *sk);
+void sk_stream_kill_queues_reason(struct sock *sk,
+				  enum skb_drop_reason reason);
 void sk_set_memalloc(struct sock *sk);
 void sk_clear_memalloc(struct sock *sk);
 
 void __sk_flush_backlog(struct sock *sk);
 
+static inline void sk_stream_kill_queues(struct sock *sk)
+{
+	sk_stream_kill_queues_reason(sk, SKB_DROP_REASON_NOT_SPECIFIED);
+}
+
 static inline bool sk_flush_backlog(struct sock *sk)
 {
 	if (unlikely(READ_ONCE(sk->sk_backlog.tail))) {
diff --git a/net/core/stream.c b/net/core/stream.c
index 06b36c730ce8..a562b23a1a6e 100644
--- a/net/core/stream.c
+++ b/net/core/stream.c
@@ -190,10 +190,11 @@ int sk_stream_error(struct sock *sk, int flags, int err)
 }
 EXPORT_SYMBOL(sk_stream_error);
 
-void sk_stream_kill_queues(struct sock *sk)
+void sk_stream_kill_queues_reason(struct sock *sk,
+				  enum skb_drop_reason reason)
 {
 	/* First the read buffer. */
-	__skb_queue_purge(&sk->sk_receive_queue);
+	__skb_queue_purge_reason(&sk->sk_receive_queue, reason);
 
 	/* Next, the write queue. */
 	WARN_ON(!skb_queue_empty(&sk->sk_write_queue));
@@ -209,4 +210,4 @@ void sk_stream_kill_queues(struct sock *sk)
 	 * have gone away, only the net layer knows can touch it.
 	 */
 }
-EXPORT_SYMBOL(sk_stream_kill_queues);
+EXPORT_SYMBOL(sk_stream_kill_queues_reason);
-- 
2.36.1



* [PATCH net-next v2 4/9] net: inet: add skb drop reason to inet_csk_destroy_sock()
  2022-05-17  8:09 [PATCH net-next v2 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (2 preceding siblings ...)
  2022-05-17  8:10 ` [PATCH net-next v2 3/9] net: sock: introduce sk_stream_kill_queues_reason() menglong8.dong
@ 2022-05-17  8:10 ` menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 5/9] net: tcp: make tcp_rcv_synsent_state_process() return drop reasons menglong8.dong
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: menglong8.dong @ 2022-05-17  8:10 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

skb dropping in inet_csk_destroy_sock() seems to be a common case. Add
the new drop reason 'SKB_DROP_REASON_SOCKET_DESTROYED' and apply it to
inet_csk_destroy_sock() to stop confusing users with 'NOT_SPECIFIED'.
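
Adding a reason is a one-line change because the list is an X-macro. The
following userspace sketch shows the general technique under the
assumption that one FN() list is expanded twice, once into enum constants
and once into printable names. All DEMO_* and demo_* identifiers are
hypothetical, and the kernel's real expansion may differ in detail.

```c
/* One list of names, expanded by whatever macro is passed in as FN. */
#define DEMO_DEFINE_REASONS(FN)	\
	FN(NOT_SPECIFIED)	\
	FN(PKT_TOO_BIG)		\
	FN(SOCKET_DESTROYED)	\
	FN(MAX)

/* First expansion: enum constants with a common prefix. */
#define DEMO_AS_ENUM(name) DEMO_REASON_##name,
enum demo_reason { DEMO_DEFINE_REASONS(DEMO_AS_ENUM) };

/* Second expansion: printable names, indexed by the enum value. */
#define DEMO_AS_STRING(name) [DEMO_REASON_##name] = #name,
static const char * const demo_reason_str[] = {
	DEMO_DEFINE_REASONS(DEMO_AS_STRING)
};

/* Map a reason to its name; anything out of range is "UNKNOWN". */
const char *demo_reason_name(enum demo_reason r)
{
	return r < DEMO_REASON_MAX ? demo_reason_str[r] : "UNKNOWN";
}
```

With this layout, appending FN(SOCKET_DESTROYED) before FN(MAX) is enough
to create both the constant and its string, so the two cannot drift apart.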

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/linux/skbuff.h          | 5 +++++
 net/ipv4/inet_connection_sock.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index e9659a63961a..3c7b1e9aabbb 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -548,6 +548,10 @@ struct sk_buff;
  *
  * SKB_DROP_REASON_PKT_TOO_BIG
  *	packet size is too big (maybe exceed the MTU)
+ *
+ * SKB_DROP_REASON_SOCKET_DESTROYED
+ *	socket is destroyed and the skb in its receive or send queue
+ *	are all dropped
  */
 #define __DEFINE_SKB_DROP_REASON(FN)	\
 	FN(NOT_SPECIFIED)		\
@@ -614,6 +618,7 @@ struct sk_buff;
 	FN(IP_INADDRERRORS)		\
 	FN(IP_INNOROUTES)		\
 	FN(PKT_TOO_BIG)			\
+	FN(SOCKET_DESTROYED)		\
 	FN(MAX)
 
 /* The reason of skb drop, which is used in kfree_skb_reason().
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 1e5b53c2bb26..6775cc8c42e1 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1006,7 +1006,7 @@ void inet_csk_destroy_sock(struct sock *sk)
 
 	sk->sk_prot->destroy(sk);
 
-	sk_stream_kill_queues(sk);
+	sk_stream_kill_queues_reason(sk, SKB_DROP_REASON_SOCKET_DESTROYED);
 
 	xfrm_sk_free_policy(sk);
 
-- 
2.36.1



* [PATCH net-next v2 5/9] net: tcp: make tcp_rcv_synsent_state_process() return drop reasons
  2022-05-17  8:09 [PATCH net-next v2 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (3 preceding siblings ...)
  2022-05-17  8:10 ` [PATCH net-next v2 4/9] net: inet: add skb drop reason to inet_csk_destroy_sock() menglong8.dong
@ 2022-05-17  8:10 ` menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 6/9] net: tcp: make tcp_rcv_state_process() return drop reason menglong8.dong
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: menglong8.dong @ 2022-05-17  8:10 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

The return value of tcp_rcv_synsent_state_process() can be -1, 0 or 1:

- -1: free skb silently
- 0: success and skb is already freed
- 1: drop packet and send a RST

Therefore, we can make it return skb drop reasons on the 'reset_and_undo'
path. Drop reasons are positive values, so the caller still sees a
non-negative, non-zero result and is not impacted.

The new reason 'TCP_PAWSACTIVEREJECTED' is added, which corresponds to
LINUX_MIB_PAWSACTIVEREJECTED.
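
A small userspace sketch of why swapping 'return 1' for a drop reason is
safe for the caller. The syn_* names and the 'outcome' parameter are
hypothetical and only model the three return classes listed above; the
sketch assumes, as in the kernel enum, that every drop reason is greater
than zero.

```c
/* Hypothetical reasons; 0 is reserved for "not dropped". */
enum syn_reason {
	SYN_NOT_DROPPED_YET = 0,
	SYN_NOT_SPECIFIED,
	SYN_PAWSACTIVEREJECTED,
};

/* outcome: 0 = accepted, 1 = free silently, anything else = PAWS failure */
int syn_synsent_process(int outcome)
{
	switch (outcome) {
	case 0:
		return 0;			/* success, skb already freed */
	case 1:
		return -1;			/* free skb silently */
	default:
		return SYN_PAWSACTIVEREJECTED;	/* used to be "return 1" */
	}
}

/* Mirrors the caller's test "if (queued >= 0) return queued;".
 * Returns non-zero when a RST would be sent.
 */
int syn_caller_sends_reset(int outcome)
{
	int queued = syn_synsent_process(outcome);

	if (queued >= 0)
		return queued != 0;
	return 0;	/* -1: handled by hand, no reset */
}
```

The caller only distinguishes negative, zero and positive, so any positive
reason behaves exactly like the old literal 1.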

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/linux/skbuff.h | 1 +
 net/ipv4/tcp_input.c   | 7 ++++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 3c7b1e9aabbb..36e0971f4cc9 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -619,6 +619,7 @@ struct sk_buff;
 	FN(IP_INNOROUTES)		\
 	FN(PKT_TOO_BIG)			\
 	FN(SOCKET_DESTROYED)		\
+	FN(TCP_PAWSACTIVEREJECTED)	\
 	FN(MAX)
 
 /* The reason of skb drop, which is used in kfree_skb_reason().
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 97cfcd85f84e..e8d26a68bc45 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6174,6 +6174,10 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 				inet_csk_reset_xmit_timer(sk,
 						ICSK_TIME_RETRANS,
 						TCP_TIMEOUT_MIN, TCP_RTO_MAX);
+			if (after(TCP_SKB_CB(skb)->ack_seq, tp->snd_nxt))
+				SKB_DR_SET(reason, TCP_ACK_UNSENT_DATA);
+			else
+				SKB_DR_SET(reason, TCP_TOO_OLD_ACK);
 			goto reset_and_undo;
 		}
 
@@ -6182,6 +6186,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 			     tcp_time_stamp(tp))) {
 			NET_INC_STATS(sock_net(sk),
 					LINUX_MIB_PAWSACTIVEREJECTED);
+			SKB_DR_SET(reason, TCP_PAWSACTIVEREJECTED);
 			goto reset_and_undo;
 		}
 
@@ -6375,7 +6380,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 reset_and_undo:
 	tcp_clear_options(&tp->rx_opt);
 	tp->rx_opt.mss_clamp = saved_clamp;
-	return 1;
+	return reason;
 }
 
 static void tcp_rcv_synrecv_state_fastopen(struct sock *sk)
-- 
2.36.1



* [PATCH net-next v2 6/9] net: tcp: make tcp_rcv_state_process() return drop reason
  2022-05-17  8:09 [PATCH net-next v2 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (4 preceding siblings ...)
  2022-05-17  8:10 ` [PATCH net-next v2 5/9] net: tcp: make tcp_rcv_synsent_state_process() return drop reasons menglong8.dong
@ 2022-05-17  8:10 ` menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 7/9] net: tcp: add skb drop reasons to tcp connect requesting menglong8.dong
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: menglong8.dong @ 2022-05-17  8:10 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

For now, the return value of tcp_rcv_state_process() is treated as a bool
by its callers. Therefore, we can make it return the reason of the skb
drop instead.

The return value of tcp_child_process() comes from
tcp_rcv_state_process(), so make it return drop reasons as well.
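
The caller side can be sketched as follows. This is a userspace model, not
the kernel code: the rcv_* names are hypothetical, the reset and free
steps are reduced to comments, and the fallback at the 'discard' label is
a simplified take on what SKB_DR_OR() is assumed to do.

```c
/* Hypothetical reasons; 0 means the skb was consumed, not dropped. */
enum rcv_reason {
	RCV_NOT_DROPPED_YET = 0,
	RCV_NOT_SPECIFIED,
	RCV_TCP_ABORTONDATA,
};

/* 'no_child' models an early "goto discard" taken before any reason is
 * set; 'process_result' models the value returned by the state machine.
 * The function returns the reason the skb would be freed with.
 */
enum rcv_reason rcv_do_rcv(int no_child, enum rcv_reason process_result)
{
	enum rcv_reason reason = RCV_NOT_DROPPED_YET;

	if (no_child)
		goto discard;

	reason = process_result;
	if (!reason)
		return RCV_NOT_DROPPED_YET;	/* still usable as a bool */

	/* reset: a RST would be sent here */
discard:
	/* Fall back to NOT_SPECIFIED only when nobody set a reason. */
	if (reason == RCV_NOT_DROPPED_YET)
		reason = RCV_NOT_SPECIFIED;
	return reason;
}
```

Because "not dropped" is zero, the old 'if (ret)' tests keep working while
the non-zero value now carries the specific reason down to the free site.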

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/linux/skbuff.h   |  1 +
 include/net/tcp.h        |  8 +++++---
 net/ipv4/tcp_input.c     | 22 +++++++++++-----------
 net/ipv4/tcp_ipv4.c      | 20 +++++++++++++-------
 net/ipv4/tcp_minisocks.c | 11 ++++++-----
 net/ipv6/tcp_ipv6.c      | 19 ++++++++++++-------
 6 files changed, 48 insertions(+), 33 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 36e0971f4cc9..736899cc6a13 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -620,6 +620,7 @@ struct sk_buff;
 	FN(PKT_TOO_BIG)			\
 	FN(SOCKET_DESTROYED)		\
 	FN(TCP_PAWSACTIVEREJECTED)	\
+	FN(TCP_ABORTONDATA)		\
 	FN(MAX)
 
 /* The reason of skb drop, which is used in kfree_skb_reason().
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 1e99f5c61f84..ea0eb2d4a743 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -339,7 +339,8 @@ void tcp_wfree(struct sk_buff *skb);
 void tcp_write_timer_handler(struct sock *sk);
 void tcp_delack_timer_handler(struct sock *sk);
 int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg);
-int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb);
+enum skb_drop_reason tcp_rcv_state_process(struct sock *sk,
+					   struct sk_buff *skb);
 void tcp_rcv_established(struct sock *sk, struct sk_buff *skb);
 void tcp_rcv_space_adjust(struct sock *sk);
 int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp);
@@ -385,8 +386,9 @@ enum tcp_tw_status tcp_timewait_state_process(struct inet_timewait_sock *tw,
 struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 			   struct request_sock *req, bool fastopen,
 			   bool *lost_race);
-int tcp_child_process(struct sock *parent, struct sock *child,
-		      struct sk_buff *skb);
+enum skb_drop_reason tcp_child_process(struct sock *parent,
+				       struct sock *child,
+				       struct sk_buff *skb);
 void tcp_enter_loss(struct sock *sk);
 void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int newly_lost, int flag);
 void tcp_clear_retrans(struct tcp_sock *tp);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e8d26a68bc45..4717af0eaea7 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6422,7 +6422,7 @@ static void tcp_rcv_synrecv_state_fastopen(struct sock *sk)
  *	address independent.
  */
 
-int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
+enum skb_drop_reason tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct inet_connection_sock *icsk = inet_csk(sk);
@@ -6439,7 +6439,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 
 	case TCP_LISTEN:
 		if (th->ack)
-			return 1;
+			return SKB_DROP_REASON_TCP_FLAGS;
 
 		if (th->rst) {
 			SKB_DR_SET(reason, TCP_RESET);
@@ -6460,9 +6460,9 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 			rcu_read_unlock();
 
 			if (!acceptable)
-				return 1;
+				return SKB_DROP_REASON_NOT_SPECIFIED;
 			consume_skb(skb);
-			return 0;
+			return SKB_NOT_DROPPED_YET;
 		}
 		SKB_DR_SET(reason, TCP_FLAGS);
 		goto discard;
@@ -6472,13 +6472,13 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 		tcp_mstamp_refresh(tp);
 		queued = tcp_rcv_synsent_state_process(sk, skb, th);
 		if (queued >= 0)
-			return queued;
+			return (enum skb_drop_reason)queued;
 
 		/* Do step6 onward by hand. */
 		tcp_urg(sk, skb, th);
 		__kfree_skb(skb);
 		tcp_data_snd_check(sk);
-		return 0;
+		return SKB_NOT_DROPPED_YET;
 	}
 
 	tcp_mstamp_refresh(tp);
@@ -6582,7 +6582,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 		if (tp->linger2 < 0) {
 			tcp_done(sk);
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
-			return 1;
+			return SKB_DROP_REASON_TCP_ABORTONDATA;
 		}
 		if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
 		    after(TCP_SKB_CB(skb)->end_seq - th->fin, tp->rcv_nxt)) {
@@ -6591,7 +6591,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 				tcp_fastopen_active_disable(sk);
 			tcp_done(sk);
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
-			return 1;
+			return SKB_DROP_REASON_TCP_ABORTONDATA;
 		}
 
 		tmo = tcp_fin_time(sk);
@@ -6656,7 +6656,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 			    after(TCP_SKB_CB(skb)->end_seq - th->fin, tp->rcv_nxt)) {
 				NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
 				tcp_reset(sk, skb);
-				return 1;
+				return SKB_DROP_REASON_TCP_ABORTONDATA;
 			}
 		}
 		fallthrough;
@@ -6676,11 +6676,11 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 discard:
 		tcp_drop_reason(sk, skb, reason);
 	}
-	return 0;
+	return SKB_NOT_DROPPED_YET;
 
 consume:
 	__kfree_skb(skb);
-	return 0;
+	return SKB_NOT_DROPPED_YET;
 }
 EXPORT_SYMBOL(tcp_rcv_state_process);
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 24eb42497a71..12a18c5035f4 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1670,7 +1670,8 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 		if (!nsk)
 			goto discard;
 		if (nsk != sk) {
-			if (tcp_child_process(sk, nsk, skb)) {
+			reason = tcp_child_process(sk, nsk, skb);
+			if (reason) {
 				rsk = nsk;
 				goto reset;
 			}
@@ -1679,7 +1680,8 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 	} else
 		sock_rps_save_rxhash(sk, skb);
 
-	if (tcp_rcv_state_process(sk, skb)) {
+	reason = tcp_rcv_state_process(sk, skb);
+	if (reason) {
 		rsk = sk;
 		goto reset;
 	}
@@ -1688,6 +1690,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 reset:
 	tcp_v4_send_reset(rsk, skb);
 discard:
+	SKB_DR_OR(reason, NOT_SPECIFIED);
 	kfree_skb_reason(skb, reason);
 	/* Be careful here. If this function gets more complicated and
 	 * gcc suffers from register pressure on the x86, sk (in %ebx)
@@ -2019,12 +2022,15 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		if (nsk == sk) {
 			reqsk_put(req);
 			tcp_v4_restore_cb(skb);
-		} else if (tcp_child_process(sk, nsk, skb)) {
-			tcp_v4_send_reset(nsk, skb);
-			goto discard_and_relse;
 		} else {
-			sock_put(sk);
-			return 0;
+			drop_reason = tcp_child_process(sk, nsk, skb);
+			if (drop_reason) {
+				tcp_v4_send_reset(nsk, skb);
+				goto discard_and_relse;
+			} else {
+				sock_put(sk);
+				return 0;
+			}
 		}
 	}
 
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 6854bb1fb32b..1a21018f6f64 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -821,11 +821,12 @@ EXPORT_SYMBOL(tcp_check_req);
  * be created.
  */
 
-int tcp_child_process(struct sock *parent, struct sock *child,
-		      struct sk_buff *skb)
+enum skb_drop_reason tcp_child_process(struct sock *parent,
+				       struct sock *child,
+				       struct sk_buff *skb)
 	__releases(&((child)->sk_lock.slock))
 {
-	int ret = 0;
+	enum skb_drop_reason reason = SKB_NOT_DROPPED_YET;
 	int state = child->sk_state;
 
 	/* record sk_napi_id and sk_rx_queue_mapping of child. */
@@ -833,7 +834,7 @@ int tcp_child_process(struct sock *parent, struct sock *child,
 
 	tcp_segs_in(tcp_sk(child), skb);
 	if (!sock_owned_by_user(child)) {
-		ret = tcp_rcv_state_process(child, skb);
+		reason = tcp_rcv_state_process(child, skb);
 		/* Wakeup parent, send SIGIO */
 		if (state == TCP_SYN_RECV && child->sk_state != state)
 			parent->sk_data_ready(parent);
@@ -847,6 +848,6 @@ int tcp_child_process(struct sock *parent, struct sock *child,
 
 	bh_unlock_sock(child);
 	sock_put(child);
-	return ret;
+	return reason;
 }
 EXPORT_SYMBOL(tcp_child_process);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 636ed23d9af0..d8236d51dd47 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1489,7 +1489,8 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 			goto discard;
 
 		if (nsk != sk) {
-			if (tcp_child_process(sk, nsk, skb))
+			reason = tcp_child_process(sk, nsk, skb);
+			if (reason)
 				goto reset;
 			if (opt_skb)
 				__kfree_skb(opt_skb);
@@ -1498,7 +1499,8 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 	} else
 		sock_rps_save_rxhash(sk, skb);
 
-	if (tcp_rcv_state_process(sk, skb))
+	reason = tcp_rcv_state_process(sk, skb);
+	if (reason)
 		goto reset;
 	if (opt_skb)
 		goto ipv6_pktoptions;
@@ -1685,12 +1687,15 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 		if (nsk == sk) {
 			reqsk_put(req);
 			tcp_v6_restore_cb(skb);
-		} else if (tcp_child_process(sk, nsk, skb)) {
-			tcp_v6_send_reset(nsk, skb);
-			goto discard_and_relse;
 		} else {
-			sock_put(sk);
-			return 0;
+			drop_reason = tcp_child_process(sk, nsk, skb);
+			if (drop_reason) {
+				tcp_v6_send_reset(nsk, skb);
+				goto discard_and_relse;
+			} else {
+				sock_put(sk);
+				return 0;
+			}
 		}
 	}
 
-- 
2.36.1



* [PATCH net-next v2 7/9] net: tcp: add skb drop reasons to tcp connect requesting
  2022-05-17  8:09 [PATCH net-next v2 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (5 preceding siblings ...)
  2022-05-17  8:10 ` [PATCH net-next v2 6/9] net: tcp: make tcp_rcv_state_process() return drop reason menglong8.dong
@ 2022-05-17  8:10 ` menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 8/9] net: tcp: add skb drop reasons to tcp tw code path menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 9/9] net: tcp: add skb drop reasons to route_req() menglong8.dong
  8 siblings, 0 replies; 12+ messages in thread
From: menglong8.dong @ 2022-05-17  8:10 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

In order to get skb drop reasons in the TCP connection request code path,
we have to pass a pointer to 'reason' as a new function argument of
conn_request() in 'struct inet_connection_sock_af_ops'. As the return
value of conn_request() can be positive, negative or 0, it is not
flexible enough to carry drop reasons.

As the return value of tcp_conn_request() is always 0, we can treat it as
a bool and make it return the skb drop reasons.

The new drop reasons 'LISTENOVERFLOWS' and 'TCP_REQQFULLDROP' are added,
which are used when the 'accept queue' and the 'request queue' are full,
respectively.
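
The out-parameter approach can be sketched in userspace C as below. The
rq_* names are hypothetical, the mock callback always returns 0 (the real
handlers may return other values), and "consume" versus "drop" is modeled
as an enum instead of calls to consume_skb() and kfree_skb_reason().

```c
/* Hypothetical reasons; 0 means "not dropped". */
enum rq_reason {
	RQ_NOT_DROPPED_YET = 0,
	RQ_NOT_SPECIFIED,
	RQ_LISTENOVERFLOWS,
};

enum rq_verdict { RQ_CONSUMED, RQ_DROPPED };

/* The int return value keeps its old meaning (>= 0 is "acceptable");
 * the drop reason travels separately through the pointer.
 */
int rq_conn_request(int acceptq_full, enum rq_reason *reason)
{
	if (acceptq_full)
		*reason = RQ_LISTENOVERFLOWS;
	return 0;
}

/* Listener side: an acceptable request is consumed unless the callback
 * reported a reason, in which case the skb counts as a drop.
 */
enum rq_verdict rq_listen_rcv(int acceptq_full, enum rq_reason *seen)
{
	enum rq_reason reason = RQ_NOT_DROPPED_YET;
	int acceptable = rq_conn_request(acceptq_full, &reason) >= 0;

	*seen = reason;
	if (!acceptable)
		return RQ_DROPPED;
	return reason ? RQ_DROPPED : RQ_CONSUMED;
}
```

Keeping the reason out of the return value means that handlers whose
return codes already have a meaning do not need to be reinterpreted.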

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/linux/skbuff.h             | 10 ++++++++++
 include/net/inet_connection_sock.h |  3 ++-
 include/net/tcp.h                  |  9 +++++----
 net/dccp/dccp.h                    |  3 ++-
 net/dccp/input.c                   |  3 ++-
 net/dccp/ipv4.c                    |  3 ++-
 net/dccp/ipv6.c                    |  5 +++--
 net/ipv4/tcp_input.c               | 27 ++++++++++++++++++---------
 net/ipv4/tcp_ipv4.c                |  9 ++++++---
 net/ipv6/tcp_ipv6.c                | 12 ++++++++----
 net/mptcp/subflow.c                |  8 +++++---
 11 files changed, 63 insertions(+), 29 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 736899cc6a13..4578bbab5a3e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -552,6 +552,14 @@ struct sk_buff;
  * SKB_DROP_REASON_SOCKET_DESTROYED
  *	socket is destroyed and the skb in its receive or send queue
  *	are all dropped
+ *
+ * SKB_DROP_REASON_LISTENOVERFLOWS
+ *	accept queue of the listen socket is full, corresponding to
+ *	LINUX_MIB_LISTENOVERFLOWS
+ *
+ * SKB_DROP_REASON_TCP_REQQFULLDROP
+ *	request queue of the listen socket is full, corresponding to
+ *	LINUX_MIB_TCPREQQFULLDROP
  */
 #define __DEFINE_SKB_DROP_REASON(FN)	\
 	FN(NOT_SPECIFIED)		\
@@ -621,6 +629,8 @@ struct sk_buff;
 	FN(SOCKET_DESTROYED)		\
 	FN(TCP_PAWSACTIVEREJECTED)	\
 	FN(TCP_ABORTONDATA)		\
+	FN(LISTENOVERFLOWS)		\
+	FN(TCP_REQQFULLDROP)		\
 	FN(MAX)
 
 /* The reason of skb drop, which is used in kfree_skb_reason().
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 3908296d103f..0600280f308e 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -36,7 +36,8 @@ struct inet_connection_sock_af_ops {
 	void	    (*send_check)(struct sock *sk, struct sk_buff *skb);
 	int	    (*rebuild_header)(struct sock *sk);
 	void	    (*sk_rx_dst_set)(struct sock *sk, const struct sk_buff *skb);
-	int	    (*conn_request)(struct sock *sk, struct sk_buff *skb);
+	int	    (*conn_request)(struct sock *sk, struct sk_buff *skb,
+				    enum skb_drop_reason *reason);
 	struct sock *(*syn_recv_sock)(const struct sock *sk, struct sk_buff *skb,
 				      struct request_sock *req,
 				      struct dst_entry *dst,
diff --git a/include/net/tcp.h b/include/net/tcp.h
index ea0eb2d4a743..082dd0627e2e 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -445,7 +445,8 @@ void tcp_v4_send_check(struct sock *sk, struct sk_buff *skb);
 void tcp_v4_mtu_reduced(struct sock *sk);
 void tcp_req_err(struct sock *sk, u32 seq, bool abort);
 void tcp_ld_RTO_revert(struct sock *sk, u32 seq);
-int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb);
+int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb,
+			enum skb_drop_reason *reason);
 struct sock *tcp_create_openreq_child(const struct sock *sk,
 				      struct request_sock *req,
 				      struct sk_buff *skb);
@@ -2036,9 +2037,9 @@ void tcp4_proc_exit(void);
 #endif
 
 int tcp_rtx_synack(const struct sock *sk, struct request_sock *req);
-int tcp_conn_request(struct request_sock_ops *rsk_ops,
-		     const struct tcp_request_sock_ops *af_ops,
-		     struct sock *sk, struct sk_buff *skb);
+enum skb_drop_reason tcp_conn_request(struct request_sock_ops *rsk_ops,
+				      const struct tcp_request_sock_ops *af_ops,
+				      struct sock *sk, struct sk_buff *skb);
 
 /* TCP af-specific functions */
 struct tcp_sock_af_ops {
diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h
index 7dfc00c9fb32..8c1241ae8449 100644
--- a/net/dccp/dccp.h
+++ b/net/dccp/dccp.h
@@ -255,7 +255,8 @@ void dccp_done(struct sock *sk);
 int dccp_reqsk_init(struct request_sock *rq, struct dccp_sock const *dp,
 		    struct sk_buff const *skb);
 
-int dccp_v4_conn_request(struct sock *sk, struct sk_buff *skb);
+int dccp_v4_conn_request(struct sock *sk, struct sk_buff *skb,
+			 enum skb_drop_reason *reason);
 
 struct sock *dccp_create_openreq_child(const struct sock *sk,
 				       const struct request_sock *req,
diff --git a/net/dccp/input.c b/net/dccp/input.c
index 2cbb757a894f..e12baa56ca59 100644
--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -576,6 +576,7 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
 	const int old_state = sk->sk_state;
 	bool acceptable;
 	int queued = 0;
+	SKB_DR(reason);
 
 	/*
 	 *  Step 3: Process LISTEN state
@@ -606,7 +607,7 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
 			 */
 			rcu_read_lock();
 			local_bh_disable();
-			acceptable = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb) >= 0;
+			acceptable = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb, &reason) >= 0;
 			local_bh_enable();
 			rcu_read_unlock();
 			if (!acceptable)
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 82696ab86f74..c689385229f0 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -581,7 +581,8 @@ static struct request_sock_ops dccp_request_sock_ops __read_mostly = {
 	.syn_ack_timeout = dccp_syn_ack_timeout,
 };
 
-int dccp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
+int dccp_v4_conn_request(struct sock *sk, struct sk_buff *skb,
+			 enum skb_drop_reason *reason)
 {
 	struct inet_request_sock *ireq;
 	struct request_sock *req;
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 4d95b6400915..d32fbdf45012 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -314,7 +314,8 @@ static struct request_sock_ops dccp6_request_sock_ops = {
 	.syn_ack_timeout = dccp_syn_ack_timeout,
 };
 
-static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
+static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb,
+				enum skb_drop_reason *reason)
 {
 	struct request_sock *req;
 	struct dccp_request_sock *dreq;
@@ -324,7 +325,7 @@ static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
 	struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb);
 
 	if (skb->protocol == htons(ETH_P_IP))
-		return dccp_v4_conn_request(sk, skb);
+		return dccp_v4_conn_request(sk, skb, reason);
 
 	if (!ipv6_unicast_destination(skb))
 		return 0;	/* discard, don't send a reset here */
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4717af0eaea7..be6275c56b59 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6455,13 +6455,17 @@ enum skb_drop_reason tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 			 */
 			rcu_read_lock();
 			local_bh_disable();
-			acceptable = icsk->icsk_af_ops->conn_request(sk, skb) >= 0;
+			reason = SKB_NOT_DROPPED_YET;
+			acceptable = icsk->icsk_af_ops->conn_request(sk, skb, &reason) >= 0;
 			local_bh_enable();
 			rcu_read_unlock();
 
 			if (!acceptable)
-				return SKB_DROP_REASON_NOT_SPECIFIED;
-			consume_skb(skb);
+				return reason ?: SKB_DROP_REASON_NOT_SPECIFIED;
+			if (reason)
+				kfree_skb_reason(skb, reason);
+			else
+				consume_skb(skb);
 			return SKB_NOT_DROPPED_YET;
 		}
 		SKB_DR_SET(reason, TCP_FLAGS);
@@ -6881,9 +6885,9 @@ u16 tcp_get_syncookie_mss(struct request_sock_ops *rsk_ops,
 }
 EXPORT_SYMBOL_GPL(tcp_get_syncookie_mss);
 
-int tcp_conn_request(struct request_sock_ops *rsk_ops,
-		     const struct tcp_request_sock_ops *af_ops,
-		     struct sock *sk, struct sk_buff *skb)
+enum skb_drop_reason tcp_conn_request(struct request_sock_ops *rsk_ops,
+				      const struct tcp_request_sock_ops *af_ops,
+				      struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_fastopen_cookie foc = { .len = -1 };
 	__u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn;
@@ -6895,6 +6899,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	bool want_cookie = false;
 	struct dst_entry *dst;
 	struct flowi fl;
+	SKB_DR(reason);
 
 	/* TW buckets are converted to open requests without
 	 * limitations, they conserve resources and peer is
@@ -6903,12 +6908,15 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	if ((net->ipv4.sysctl_tcp_syncookies == 2 ||
 	     inet_csk_reqsk_queue_is_full(sk)) && !isn) {
 		want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name);
-		if (!want_cookie)
+		if (!want_cookie) {
+			SKB_DR_SET(reason, TCP_REQQFULLDROP);
 			goto drop;
+		}
 	}
 
 	if (sk_acceptq_is_full(sk)) {
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
+		SKB_DR_SET(reason, LISTENOVERFLOWS);
 		goto drop;
 	}
 
@@ -6964,6 +6972,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 			 */
 			pr_drop_req(req, ntohs(tcp_hdr(skb)->source),
 				    rsk_ops->family);
+			SKB_DR_SET(reason, TCP_REQQFULLDROP);
 			goto drop_and_release;
 		}
 
@@ -7016,7 +7025,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 		}
 	}
 	reqsk_put(req);
-	return 0;
+	return SKB_NOT_DROPPED_YET;
 
 drop_and_release:
 	dst_release(dst);
@@ -7024,6 +7033,6 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	__reqsk_free(req);
 drop:
 	tcp_listendrop(sk);
-	return 0;
+	return reason;
 }
 EXPORT_SYMBOL(tcp_conn_request);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 12a18c5035f4..708f92b03f42 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1458,17 +1458,20 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
 	.send_synack	=	tcp_v4_send_synack,
 };
 
-int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
+int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb,
+			enum skb_drop_reason *reason)
 {
 	/* Never answer to SYNs send to broadcast or multicast */
 	if (skb_rtable(skb)->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))
 		goto drop;
 
-	return tcp_conn_request(&tcp_request_sock_ops,
-				&tcp_request_sock_ipv4_ops, sk, skb);
+	*reason = tcp_conn_request(&tcp_request_sock_ops,
+				   &tcp_request_sock_ipv4_ops, sk, skb);
+	return *reason;
 
 drop:
 	tcp_listendrop(sk);
+	*reason = SKB_DROP_REASON_IP_INADDRERRORS;
 	return 0;
 }
 EXPORT_SYMBOL(tcp_v4_conn_request);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index d8236d51dd47..27c51991bd54 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1148,24 +1148,28 @@ u16 tcp_v6_get_syncookie(struct sock *sk, struct ipv6hdr *iph,
 	return mss;
 }
 
-static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
+static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb,
+			       enum skb_drop_reason *reason)
 {
 	if (skb->protocol == htons(ETH_P_IP))
-		return tcp_v4_conn_request(sk, skb);
+		return tcp_v4_conn_request(sk, skb, reason);
 
 	if (!ipv6_unicast_destination(skb))
 		goto drop;
 
 	if (ipv6_addr_v4mapped(&ipv6_hdr(skb)->saddr)) {
 		__IP6_INC_STATS(sock_net(sk), NULL, IPSTATS_MIB_INHDRERRORS);
+		*reason = SKB_DROP_REASON_IP_INADDRERRORS;
 		return 0;
 	}
 
-	return tcp_conn_request(&tcp6_request_sock_ops,
-				&tcp_request_sock_ipv6_ops, sk, skb);
+	*reason = tcp_conn_request(&tcp6_request_sock_ops,
+				   &tcp_request_sock_ipv6_ops, sk, skb);
+	return *reason;
 
 drop:
 	tcp_listendrop(sk);
+	*reason = SKB_DROP_REASON_IP_INADDRERRORS;
 	return 0; /* don't send reset */
 }
 
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 6d59336a8e1e..58c1f056213b 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -532,7 +532,8 @@ static int subflow_v6_rebuild_header(struct sock *sk)
 struct request_sock_ops mptcp_subflow_request_sock_ops;
 static struct tcp_request_sock_ops subflow_request_sock_ipv4_ops __ro_after_init;
 
-static int subflow_v4_conn_request(struct sock *sk, struct sk_buff *skb)
+static int subflow_v4_conn_request(struct sock *sk, struct sk_buff *skb,
+				   enum skb_drop_reason *reason)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
 
@@ -556,14 +557,15 @@ static struct inet_connection_sock_af_ops subflow_v6_specific __ro_after_init;
 static struct inet_connection_sock_af_ops subflow_v6m_specific __ro_after_init;
 static struct proto tcpv6_prot_override;
 
-static int subflow_v6_conn_request(struct sock *sk, struct sk_buff *skb)
+static int subflow_v6_conn_request(struct sock *sk, struct sk_buff *skb,
+				   enum skb_drop_reason *reason)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
 
 	pr_debug("subflow=%p", subflow);
 
 	if (skb->protocol == htons(ETH_P_IP))
-		return subflow_v4_conn_request(sk, skb);
+		return subflow_v4_conn_request(sk, skb, reason);
 
 	if (!ipv6_unicast_destination(skb))
 		goto drop;
-- 
2.36.1



* [PATCH net-next v2 8/9] net: tcp: add skb drop reasons to tcp tw code path
  2022-05-17  8:09 [PATCH net-next v2 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (6 preceding siblings ...)
  2022-05-17  8:10 ` [PATCH net-next v2 7/9] net: tcp: add skb drop reasons to tcp connect requesting menglong8.dong
@ 2022-05-17  8:10 ` menglong8.dong
  2022-05-17  8:10 ` [PATCH net-next v2 9/9] net: tcp: add skb drop reasons to route_req() menglong8.dong
  8 siblings, 0 replies; 12+ messages in thread
From: menglong8.dong @ 2022-05-17  8:10 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

In order to get the reasons of skb drops, add a function argument of
type 'enum skb_drop_reason *reason' to tcp_timewait_state_process().

In the original code, all packets sent to a time-wait socket are treated
as drops and freed with kfree_skb(), which can confuse users. Therefore,
we use consume_skb() for the skbs that are 'good'. We can check the
value of 'reason' to decide whether to use kfree_skb() or consume_skb().

The new reason 'TIMEWAIT' is added for the case that the skb is dropped
because the socket is in time-wait state.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
v2:
- skb is not freed on TCP_TW_ACK and 'ret' is not initialized, fix
  it (Eric Dumazet)
---
 include/linux/skbuff.h   |  5 +++++
 include/net/tcp.h        |  7 ++++---
 net/ipv4/tcp_ipv4.c      |  9 ++++++++-
 net/ipv4/tcp_minisocks.c | 24 ++++++++++++++++++++----
 net/ipv6/tcp_ipv6.c      |  9 ++++++++-
 5 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4578bbab5a3e..8d18fc5a5af6 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -560,6 +560,10 @@ struct sk_buff;
  * SKB_DROP_REASON_TCP_REQQFULLDROP
  *	request queue of the listen socket is full, corresponding to
  *	LINUX_MIB_TCPREQQFULLDROP
+ *
+ * SKB_DROP_REASON_TIMEWAIT
+ *	socket is in time-wait state and all packet that received will
+ *	be treated as 'drop', except a good 'SYN' packet
  */
 #define __DEFINE_SKB_DROP_REASON(FN)	\
 	FN(NOT_SPECIFIED)		\
@@ -631,6 +635,7 @@ struct sk_buff;
 	FN(TCP_ABORTONDATA)		\
 	FN(LISTENOVERFLOWS)		\
 	FN(TCP_REQQFULLDROP)		\
+	FN(TIMEWAIT)			\
 	FN(MAX)
 
 /* The reason of skb drop, which is used in kfree_skb_reason().
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 082dd0627e2e..88217b8d95ac 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -380,9 +380,10 @@ enum tcp_tw_status {
 };
 
 
-enum tcp_tw_status tcp_timewait_state_process(struct inet_timewait_sock *tw,
-					      struct sk_buff *skb,
-					      const struct tcphdr *th);
+enum tcp_tw_status
+tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
+			   const struct tcphdr *th,
+			   enum skb_drop_reason *reason);
 struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 			   struct request_sock *req, bool fastopen,
 			   bool *lost_race);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 708f92b03f42..3c163b54b0f8 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2134,7 +2134,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		inet_twsk_put(inet_twsk(sk));
 		goto csum_error;
 	}
-	switch (tcp_timewait_state_process(inet_twsk(sk), skb, th)) {
+	switch (tcp_timewait_state_process(inet_twsk(sk), skb, th,
+					   &drop_reason)) {
 	case TCP_TW_SYN: {
 		struct sock *sk2 = inet_lookup_listener(dev_net(skb->dev),
 							&tcp_hashinfo, skb,
@@ -2150,11 +2151,17 @@ int tcp_v4_rcv(struct sk_buff *skb)
 			refcounted = false;
 			goto process;
 		}
+		/* TCP_FLAGS or NO_SOCKET? */
+		SKB_DR_SET(drop_reason, TCP_FLAGS);
 	}
 		/* to ACK */
 		fallthrough;
 	case TCP_TW_ACK:
 		tcp_v4_timewait_ack(sk, skb);
+		if (!drop_reason) {
+			consume_skb(skb);
+			return 0;
+		}
 		break;
 	case TCP_TW_RST:
 		tcp_v4_send_reset(sk, skb);
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 1a21018f6f64..329724118b7f 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -83,13 +83,15 @@ tcp_timewait_check_oow_rate_limit(struct inet_timewait_sock *tw,
  */
 enum tcp_tw_status
 tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
-			   const struct tcphdr *th)
+			   const struct tcphdr *th,
+			   enum skb_drop_reason *reason)
 {
 	struct tcp_options_received tmp_opt;
 	struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw);
 	bool paws_reject = false;
 
 	tmp_opt.saw_tstamp = 0;
+	*reason = SKB_DROP_REASON_NOT_SPECIFIED;
 	if (th->doff > (sizeof(*th) >> 2) && tcptw->tw_ts_recent_stamp) {
 		tcp_parse_options(twsk_net(tw), skb, &tmp_opt, 0, NULL);
 
@@ -113,11 +115,16 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 			return tcp_timewait_check_oow_rate_limit(
 				tw, skb, LINUX_MIB_TCPACKSKIPPEDFINWAIT2);
 
-		if (th->rst)
+		if (th->rst) {
+			SKB_DR_SET(*reason, TCP_RESET);
 			goto kill;
+		}
 
-		if (th->syn && !before(TCP_SKB_CB(skb)->seq, tcptw->tw_rcv_nxt))
+		if (th->syn && !before(TCP_SKB_CB(skb)->seq,
+				       tcptw->tw_rcv_nxt)) {
+			SKB_DR_SET(*reason, TCP_FLAGS);
 			return TCP_TW_RST;
+		}
 
 		/* Dup ACK? */
 		if (!th->ack ||
@@ -143,6 +150,9 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 		}
 
 		inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN);
+
+		/* skb should be freed normally in this case. */
+		*reason = SKB_NOT_DROPPED_YET;
 		return TCP_TW_ACK;
 	}
 
@@ -174,6 +184,7 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 			 * protocol bug yet.
 			 */
 			if (twsk_net(tw)->ipv4.sysctl_tcp_rfc1337 == 0) {
+				SKB_DR_SET(*reason, TCP_RESET);
 kill:
 				inet_twsk_deschedule_put(tw);
 				return TCP_TW_SUCCESS;
@@ -216,11 +227,14 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 		if (isn == 0)
 			isn++;
 		TCP_SKB_CB(skb)->tcp_tw_isn = isn;
+		*reason = SKB_NOT_DROPPED_YET;
 		return TCP_TW_SYN;
 	}
 
-	if (paws_reject)
+	if (paws_reject) {
+		SKB_DR_SET(*reason, TCP_RFC7323_PAWS);
 		__NET_INC_STATS(twsk_net(tw), LINUX_MIB_PAWSESTABREJECTED);
+	}
 
 	if (!th->rst) {
 		/* In this case we must reset the TIMEWAIT timer.
@@ -232,9 +246,11 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 		if (paws_reject || th->ack)
 			inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN);
 
+		SKB_DR_OR(*reason, TIMEWAIT);
 		return tcp_timewait_check_oow_rate_limit(
 			tw, skb, LINUX_MIB_TCPACKSKIPPEDTIMEWAIT);
 	}
+	SKB_DR_SET(*reason, TCP_RESET);
 	inet_twsk_put(tw);
 	return TCP_TW_SUCCESS;
 }
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 27c51991bd54..132b27763229 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1772,6 +1772,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 	}
 
 discard_it:
+	SKB_DR_OR(drop_reason, NOT_SPECIFIED);
 	kfree_skb_reason(skb, drop_reason);
 	return 0;
 
@@ -1795,7 +1796,8 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 		goto csum_error;
 	}
 
-	switch (tcp_timewait_state_process(inet_twsk(sk), skb, th)) {
+	switch (tcp_timewait_state_process(inet_twsk(sk), skb, th,
+					   &drop_reason)) {
 	case TCP_TW_SYN:
 	{
 		struct sock *sk2;
@@ -1815,11 +1817,16 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 			refcounted = false;
 			goto process;
 		}
+		SKB_DR_SET(drop_reason, TCP_FLAGS);
 	}
 		/* to ACK */
 		fallthrough;
 	case TCP_TW_ACK:
 		tcp_v6_timewait_ack(sk, skb);
+		if (!drop_reason) {
+			consume_skb(skb);
+			return 0;
+		}
 		break;
 	case TCP_TW_RST:
 		tcp_v6_send_reset(sk, skb);
-- 
2.36.1



* [PATCH net-next v2 9/9] net: tcp: add skb drop reasons to route_req()
  2022-05-17  8:09 [PATCH net-next v2 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (7 preceding siblings ...)
  2022-05-17  8:10 ` [PATCH net-next v2 8/9] net: tcp: add skb drop reasons to tcp tw code path menglong8.dong
@ 2022-05-17  8:10 ` menglong8.dong
  8 siblings, 0 replies; 12+ messages in thread
From: menglong8.dong @ 2022-05-17  8:10 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

Add skb drop reasons to route_req() in struct tcp_request_sock_ops.
The following functions are involved:

  tcp_v4_route_req()
  tcp_v6_route_req()
  subflow_v4_route_req()
  subflow_v6_route_req()

The new reason SKB_DROP_REASON_LSM is also added. It is used when the
skb is dropped by an LSM (Linux Security Module).
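
A minimal user-space sketch of this out-parameter pattern follows. It is
an illustration under stated assumptions, not the patch itself: the
mock_* functions, 'struct mock_dst', and the boolean inputs are all
hypothetical stand-ins for the LSM hook and the route lookup. The point
it shows is that a NULL return alone cannot say why routing failed, so
the reason travels through the extra pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for the kernel enum; only the values used here. */
enum skb_drop_reason {
	SKB_NOT_DROPPED_YET = 0,
	SKB_DROP_REASON_NOT_SPECIFIED,
	SKB_DROP_REASON_IP_OUTNOROUTES,
	SKB_DROP_REASON_LSM,
};

/* Hypothetical placeholder for a routing result. */
struct mock_dst {
	int id;
};

static struct mock_dst mock_route = { 1 };

/* Returns a route or NULL; on NULL, *reason says which check failed. */
static struct mock_dst *mock_route_req(int lsm_denies, int have_route,
				       enum skb_drop_reason *reason)
{
	assert(reason != NULL);
	if (lsm_denies) {
		*reason = SKB_DROP_REASON_LSM;
		return NULL;
	}
	if (!have_route) {
		*reason = SKB_DROP_REASON_IP_OUTNOROUTES;
		return NULL;
	}
	return &mock_route;
}

/* Hypothetical caller: forwards the reason when no route is returned. */
static enum skb_drop_reason mock_conn_request(int lsm_denies, int have_route)
{
	enum skb_drop_reason reason = SKB_DROP_REASON_NOT_SPECIFIED;
	struct mock_dst *dst;

	dst = mock_route_req(lsm_denies, have_route, &reason);
	if (!dst)
		return reason;
	return SKB_NOT_DROPPED_YET;
}
```

With these mocks, a denied security check yields the LSM reason, a
failed lookup yields IP_OUTNOROUTES, and a successful lookup leaves the
skb marked as not dropped.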

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/linux/skbuff.h |  4 ++++
 include/net/tcp.h      |  3 ++-
 net/ipv4/tcp_input.c   |  2 +-
 net/ipv4/tcp_ipv4.c    | 14 +++++++++++---
 net/ipv6/tcp_ipv6.c    | 14 +++++++++++---
 net/mptcp/subflow.c    | 10 ++++++----
 6 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8d18fc5a5af6..fdfe54dc5ae4 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -564,6 +564,9 @@ struct sk_buff;
  * SKB_DROP_REASON_TIMEWAIT
  *	socket is in time-wait state and all packet that received will
  *	be treated as 'drop', except a good 'SYN' packet
+ *
+ * SKB_DROP_REASON_LSM
+ *	dropped by LSM
  */
 #define __DEFINE_SKB_DROP_REASON(FN)	\
 	FN(NOT_SPECIFIED)		\
@@ -636,6 +639,7 @@ struct sk_buff;
 	FN(LISTENOVERFLOWS)		\
 	FN(TCP_REQQFULLDROP)		\
 	FN(TIMEWAIT)			\
+	FN(LSM)				\
 	FN(MAX)
 
 /* The reason of skb drop, which is used in kfree_skb_reason().
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 88217b8d95ac..ed57c331fdeb 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2075,7 +2075,8 @@ struct tcp_request_sock_ops {
 	struct dst_entry *(*route_req)(const struct sock *sk,
 				       struct sk_buff *skb,
 				       struct flowi *fl,
-				       struct request_sock *req);
+				       struct request_sock *req,
+				       enum skb_drop_reason *reason);
 	u32 (*init_seq)(const struct sk_buff *skb);
 	u32 (*init_ts_off)(const struct net *net, const struct sk_buff *skb);
 	int (*send_synack)(const struct sock *sk, struct dst_entry *dst,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index be6275c56b59..146d22b05186 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6950,7 +6950,7 @@ enum skb_drop_reason tcp_conn_request(struct request_sock_ops *rsk_ops,
 	/* Note: tcp_v6_init_req() might override ir_iif for link locals */
 	inet_rsk(req)->ir_iif = inet_request_bound_dev_if(sk, skb);
 
-	dst = af_ops->route_req(sk, skb, &fl, req);
+	dst = af_ops->route_req(sk, skb, &fl, req, &reason);
 	if (!dst)
 		goto drop_and_free;
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 3c163b54b0f8..026a36f1598b 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1423,14 +1423,22 @@ static void tcp_v4_init_req(struct request_sock *req,
 static struct dst_entry *tcp_v4_route_req(const struct sock *sk,
 					  struct sk_buff *skb,
 					  struct flowi *fl,
-					  struct request_sock *req)
+					  struct request_sock *req,
+					  enum skb_drop_reason *reason)
 {
+	struct dst_entry *dst;
+
 	tcp_v4_init_req(req, sk, skb);
 
-	if (security_inet_conn_request(sk, skb, req))
+	if (security_inet_conn_request(sk, skb, req)) {
+		SKB_DR_SET(*reason, LSM);
 		return NULL;
+	}
 
-	return inet_csk_route_req(sk, &fl->u.ip4, req);
+	dst = inet_csk_route_req(sk, &fl->u.ip4, req);
+	if (!dst)
+		SKB_DR_SET(*reason, IP_OUTNOROUTES);
+	return dst;
 }
 
 struct request_sock_ops tcp_request_sock_ops __read_mostly = {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 132b27763229..b859adcde756 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -802,14 +802,22 @@ static void tcp_v6_init_req(struct request_sock *req,
 static struct dst_entry *tcp_v6_route_req(const struct sock *sk,
 					  struct sk_buff *skb,
 					  struct flowi *fl,
-					  struct request_sock *req)
+					  struct request_sock *req,
+					  enum skb_drop_reason *reason)
 {
+	struct dst_entry *dst;
+
 	tcp_v6_init_req(req, sk, skb);
 
-	if (security_inet_conn_request(sk, skb, req))
+	if (security_inet_conn_request(sk, skb, req)) {
+		SKB_DR_SET(*reason, LSM);
 		return NULL;
+	}
 
-	return inet6_csk_route_req(sk, &fl->u.ip6, req, IPPROTO_TCP);
+	dst = inet6_csk_route_req(sk, &fl->u.ip6, req, IPPROTO_TCP);
+	if (!dst)
+		SKB_DR_SET(*reason, IP_OUTNOROUTES);
+	return dst;
 }
 
 struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 58c1f056213b..db7e6cd96d44 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -285,7 +285,8 @@ EXPORT_SYMBOL_GPL(mptcp_subflow_init_cookie_req);
 static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 					      struct sk_buff *skb,
 					      struct flowi *fl,
-					      struct request_sock *req)
+					      struct request_sock *req,
+					      enum skb_drop_reason *reason)
 {
 	struct dst_entry *dst;
 	int err;
@@ -293,7 +294,7 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 	tcp_rsk(req)->is_mptcp = 1;
 	subflow_init_req(req, sk);
 
-	dst = tcp_request_sock_ipv4_ops.route_req(sk, skb, fl, req);
+	dst = tcp_request_sock_ipv4_ops.route_req(sk, skb, fl, req, reason);
 	if (!dst)
 		return NULL;
 
@@ -311,7 +312,8 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 static struct dst_entry *subflow_v6_route_req(const struct sock *sk,
 					      struct sk_buff *skb,
 					      struct flowi *fl,
-					      struct request_sock *req)
+					      struct request_sock *req,
+					      enum skb_drop_reason *reason)
 {
 	struct dst_entry *dst;
 	int err;
@@ -319,7 +321,7 @@ static struct dst_entry *subflow_v6_route_req(const struct sock *sk,
 	tcp_rsk(req)->is_mptcp = 1;
 	subflow_init_req(req, sk);
 
-	dst = tcp_request_sock_ipv6_ops.route_req(sk, skb, fl, req);
+	dst = tcp_request_sock_ipv6_ops.route_req(sk, skb, fl, req, reason);
 	if (!dst)
 		return NULL;
 
-- 
2.36.1



* Re: [PATCH net-next v2 1/9] net: skb: introduce __DEFINE_SKB_DROP_REASON() to simply the code
  2022-05-17  8:10 ` [PATCH net-next v2 1/9] net: skb: introduce __DEFINE_SKB_DROP_REASON() to simply the code menglong8.dong
@ 2022-05-18  1:14   ` Jakub Kicinski
  2022-05-18  2:15     ` Menglong Dong
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2022-05-18  1:14 UTC (permalink / raw)
  To: menglong8.dong
  Cc: edumazet, rostedt, mingo, davem, yoshfuji, dsahern, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

On Tue, 17 May 2022 16:10:00 +0800 menglong8.dong@gmail.com wrote:
> From: Menglong Dong <imagedong@tencent.com>
> 
> It is tedious to add each new skb drop reason to both
> 'enum skb_drop_reason' and TRACE_SKB_DROP_REASON in trace/events/skb.h,
> and it's easy to forget the TRACE_SKB_DROP_REASON half.
>
> TRACE_SKB_DROP_REASON converts a numeric drop reason to a string. For
> now, the string passed to user space is exactly the name in
> 'enum skb_drop_reason' without the 'SKB_DROP_REASON_' prefix. So why
> not define them together with a single macro?
>
> Therefore, introduce __DEFINE_SKB_DROP_REASON() and use it for both the
> 'enum skb_drop_reason' definition and the string conversion.
>
> Now, what should we do with the documentation for the reasons? How about
> following __BPF_FUNC_MAPPER() and keeping the documentation together?

Hi, I know BPF does this but I really find the definition-by-macro
counterproductive :(

kdoc will no longer work right because the parser will not see
the real values. cscope and other code indexers will struggle
to find definitions.

Did you investigate using auto-generation? The kernel already generates
a handful of headers. Maybe with a little script we could convert
the enum into the string table at build time?

Also let's use this opportunity to move the enum to a standalone
header, it's getting huge.

Probably worth keeping this rework separate from the TCP patches.
Up to you which one you'd like to get done first.
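
For readers unfamiliar with the definition-by-macro pattern under
discussion, here is a minimal sketch. The DEMO_* names are hypothetical
and are not the macros from the patch. One list macro expands twice,
once into enum members and once into a string table. Note that the
identifier DEMO_DROP_REASON_TIMEWAIT is never spelled out in the
definition, which is why kdoc and code indexers have trouble finding it.

```c
#include <assert.h>
#include <string.h>

/* Single list of reasons; FN is applied to every entry. */
#define DEMO_DROP_REASONS(FN)	\
	FN(NOT_SPECIFIED)	\
	FN(TIMEWAIT)		\
	FN(LSM)

/* First expansion: enum members built by token pasting. */
#define DEMO_ENUM(name)	DEMO_DROP_REASON_##name,
enum demo_drop_reason {
	DEMO_DROP_REASONS(DEMO_ENUM)
	DEMO_DROP_REASON_MAX
};

/* Second expansion: string table indexed by the enum values. */
#define DEMO_STR(name)	[DEMO_DROP_REASON_##name] = #name,
static const char * const demo_reason_str[DEMO_DROP_REASON_MAX] = {
	DEMO_DROP_REASONS(DEMO_STR)
};

static_assert(DEMO_DROP_REASON_NOT_SPECIFIED == 0,
	      "first list entry must index slot 0");

/* Map a reason to its name; out-of-range values give "UNKNOWN". */
static const char *demo_reason_name(enum demo_drop_reason r)
{
	if ((unsigned int)r >= DEMO_DROP_REASON_MAX)
		return "UNKNOWN";
	return demo_reason_str[r];
}
```

Adding a reason then means touching only the list, at the cost of the
tooling problems described above. A build-time generator would instead
keep a plain enum and emit the string table from it.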


* Re: [PATCH net-next v2 1/9] net: skb: introduce __DEFINE_SKB_DROP_REASON() to simply the code
  2022-05-18  1:14   ` Jakub Kicinski
@ 2022-05-18  2:15     ` Menglong Dong
  0 siblings, 0 replies; 12+ messages in thread
From: Menglong Dong @ 2022-05-18  2:15 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Eric Dumazet, Steven Rostedt, Ingo Molnar, David Miller,
	Hideaki YOSHIFUJI, David Ahern, Paolo Abeni, Menglong Dong,
	Martin Lau, Talal Ahmad, Kees Cook, Dongli Zhang, LKML, netdev,
	Jiang Biao, Hao Peng

On Wed, May 18, 2022 at 9:15 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 17 May 2022 16:10:00 +0800 menglong8.dong@gmail.com wrote:
> > From: Menglong Dong <imagedong@tencent.com>
> >
> > It is tedious to add each new skb drop reason to both
> > 'enum skb_drop_reason' and TRACE_SKB_DROP_REASON in
> > trace/events/skb.h, and it's easy to forget the
> > TRACE_SKB_DROP_REASON half.
> >
> > TRACE_SKB_DROP_REASON converts a numeric drop reason to a string. For
> > now, the string passed to user space is exactly the name in
> > 'enum skb_drop_reason' without the 'SKB_DROP_REASON_' prefix. So why
> > not define them together with a single macro?
> >
> > Therefore, introduce __DEFINE_SKB_DROP_REASON() and use it for both
> > the 'enum skb_drop_reason' definition and the string conversion.
> >
> > Now, what should we do with the documentation for the reasons? How
> > about following __BPF_FUNC_MAPPER() and keeping the documentation
> > together?
>
> Hi, I know BPF does this but I really find the definition-by-macro
> counterproductive :(
>
> kdoc will no longer work right because the parser will not see
> the real values. cscope and other code indexers will struggle
> to find definitions.
>

Yeah, I ran into this problem too. Autocomplete in VS Code stopped
working for these names once I started using this macro.

> Did you investigate using auto-generation? The kernel already generates
> a handful of headers. Maybe with a little script we could convert
> the enum into the string table at build time?
>

Oh, I forgot about auto-generation; it seems like the better choice.
I'll try it.

> Also let's use this opportunity to move the enum to a standalone
> header, it's getting huge.
>
> Probably worth keeping this rework separate from the TCP patches.
> Up to you which one you'd like to get done first.

OK, I'll move the enum to a standalone header in a separate
series.

Thanks!
Menglong Dong


Thread overview: 12+ messages
2022-05-17  8:09 [PATCH net-next v2 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
2022-05-17  8:10 ` [PATCH net-next v2 1/9] net: skb: introduce __DEFINE_SKB_DROP_REASON() to simply the code menglong8.dong
2022-05-18  1:14   ` Jakub Kicinski
2022-05-18  2:15     ` Menglong Dong
2022-05-17  8:10 ` [PATCH net-next v2 2/9] net: skb: introduce __skb_queue_purge_reason() menglong8.dong
2022-05-17  8:10 ` [PATCH net-next v2 3/9] net: sock: introduce sk_stream_kill_queues_reason() menglong8.dong
2022-05-17  8:10 ` [PATCH net-next v2 4/9] net: inet: add skb drop reason to inet_csk_destroy_sock() menglong8.dong
2022-05-17  8:10 ` [PATCH net-next v2 5/9] net: tcp: make tcp_rcv_synsent_state_process() return drop reasons menglong8.dong
2022-05-17  8:10 ` [PATCH net-next v2 6/9] net: tcp: make tcp_rcv_state_process() return drop reason menglong8.dong
2022-05-17  8:10 ` [PATCH net-next v2 7/9] net: tcp: add skb drop reasons to tcp connect requesting menglong8.dong
2022-05-17  8:10 ` [PATCH net-next v2 8/9] net: tcp: add skb drop reasons to tcp tw code path menglong8.dong
2022-05-17  8:10 ` [PATCH net-next v2 9/9] net: tcp: add skb drop reasons to route_req() menglong8.dong
