linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v3 0/9] net: tcp: add skb drop reasons to tcp state change
@ 2022-06-10  3:41 menglong8.dong
  2022-06-10  3:41 ` [PATCH net-next v3 1/9] net: skb: introduce __skb_queue_purge_reason() menglong8.dong
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: menglong8.dong @ 2022-06-10  3:41 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev

From: Menglong Dong <imagedong@tencent.com>

In this series patches, skb drop reasons are add to code path of TCP
state change, which we have not done before. It is hard to pass these
reasons from the function to its caller, where skb is dropped. In order
to do this, we have to make some functions return skb drop reasons, or
pass the pointer of 'reason' to these function as an new function
argument.

=============================
We change the type of the return value of tcp_rcv_synsent_state_process()
and tcp_rcv_state_process() to 'enum skb_drop_reason' and make them
return skb drop reasons in 5th and 6th patch.

=============================
In order to get skb drop reasons during tcp connect requesting code path,
we have to pass the pointer of the 'reason' as a new function argument of
conn_request() in 'struct inet_connection_sock_af_ops'. As the return
value of conn_request() can be positive or negative or 0, it's not
flexible to make it return drop reasons. This work is done in the 7th
patch, and functions that used as conn_request() is also modified:

  dccp_v4_conn_request()
  dccp_v6_conn_request()
  tcp_v4_conn_request()
  tcp_v6_conn_request()
  subflow_v4_conn_request()
  subflow_v6_conn_request()

As our target is TCP, dccp and mptcp are not handled more.

=============================
In the 7th patch, skb drop reasons are add to
tcp_timewait_state_process() by adding a function argument to it. In the
origin code, all skb are dropped for tw socket. In order to make less
noise, use consume_skb() for the 'good' skb. This can be checked by the
caller of tcp_timewait_state_process() from the value of drop reason.
If the drop reason is SKB_NOT_DROPPED_YET, it means this skb should not
be dropped.

=============================
In the 8th patch, skb drop reasons are add to the route_req() in struct
tcp_request_sock_ops. Following functions are involved:

  tcp_v4_route_req()
  tcp_v6_route_req()
  subflow_v4_route_req()
  subflow_v6_route_req()

In this series patches, following new drop reasons are added:

- SOCKET_DESTROYED
- TCP_PAWSACTIVEREJECTED
- TCP_LINGER
- LISTENOVERFLOWS
- TCP_REQQFULLDROP
- TIMEWAIT
- LSM

Changes since v2:
- move drop reasons to standalone header in another series

Changes since v1:
6/9 - fix the compile errors of dccp and mptcp (kernel test robot)
7/9 - skb is not freed on TCP_TW_ACK and 'ret' is not initizalized, fix
      it (Eric Dumazet)

Menglong Dong (9):
  net: skb: introduce __skb_queue_purge_reason()
  net: sock: introduce sk_stream_kill_queues_reason()
  net: inet: add skb drop reason to inet_csk_destroy_sock()
  net: tcp: make tcp_rcv_synsent_state_process() return drop reasons
  net: tcp: make tcp_rcv_state_process() return drop reason
  net: tcp: add skb drop reasons to tcp connect requesting
  net: tcp: add skb drop reasons to tcp tw code path
  net: tcp: add skb drop reasons to route_req()
  net: tcp: use LINUX_MIB_TCPABORTONLINGER in tcp_rcv_state_process()

 include/linux/skbuff.h             | 12 +++--
 include/net/dropreason.h           | 35 +++++++++++++++
 include/net/inet_connection_sock.h |  3 +-
 include/net/sock.h                 |  8 +++-
 include/net/tcp.h                  | 27 ++++++-----
 net/core/stream.c                  |  7 +--
 net/dccp/dccp.h                    |  3 +-
 net/dccp/input.c                   |  3 +-
 net/dccp/ipv4.c                    |  3 +-
 net/dccp/ipv6.c                    |  5 ++-
 net/ipv4/inet_connection_sock.c    |  2 +-
 net/ipv4/tcp_input.c               | 72 +++++++++++++++++++-----------
 net/ipv4/tcp_ipv4.c                | 52 +++++++++++++++------
 net/ipv4/tcp_minisocks.c           | 35 +++++++++++----
 net/ipv6/tcp_ipv6.c                | 53 +++++++++++++++-------
 net/mptcp/subflow.c                | 18 +++++---
 16 files changed, 241 insertions(+), 97 deletions(-)

-- 
2.36.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH net-next v3 1/9] net: skb: introduce __skb_queue_purge_reason()
  2022-06-10  3:41 [PATCH net-next v3 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
@ 2022-06-10  3:41 ` menglong8.dong
  2022-06-10  3:41 ` [PATCH net-next v3 2/9] net: sock: introduce sk_stream_kill_queues_reason() menglong8.dong
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: menglong8.dong @ 2022-06-10  3:41 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

Introduce __skb_queue_purge_reason() to empty a skb list with drop
reason and make __skb_queue_purge() an inline call to it.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/linux/skbuff.h | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 82edf0359ab3..8d5445c3d3e7 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3020,18 +3020,24 @@ static inline int skb_orphan_frags_rx(struct sk_buff *skb, gfp_t gfp_mask)
 }
 
 /**
- *	__skb_queue_purge - empty a list
+ *	__skb_queue_purge_reason - empty a list with specific drop reason
  *	@list: list to empty
+ *	@reason: drop reason
  *
  *	Delete all buffers on an &sk_buff list. Each buffer is removed from
  *	the list and one reference dropped. This function does not take the
  *	list lock and the caller must hold the relevant locks to use it.
  */
-static inline void __skb_queue_purge(struct sk_buff_head *list)
+static inline void __skb_queue_purge_reason(struct sk_buff_head *list,
+					    enum skb_drop_reason reason)
 {
 	struct sk_buff *skb;
 	while ((skb = __skb_dequeue(list)) != NULL)
-		kfree_skb(skb);
+		kfree_skb_reason(skb, reason);
+}
+static inline void __skb_queue_purge(struct sk_buff_head *list)
+{
+	__skb_queue_purge_reason(list, SKB_DROP_REASON_NOT_SPECIFIED);
 }
 void skb_queue_purge(struct sk_buff_head *list);
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next v3 2/9] net: sock: introduce sk_stream_kill_queues_reason()
  2022-06-10  3:41 [PATCH net-next v3 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
  2022-06-10  3:41 ` [PATCH net-next v3 1/9] net: skb: introduce __skb_queue_purge_reason() menglong8.dong
@ 2022-06-10  3:41 ` menglong8.dong
  2022-06-10  3:41 ` [PATCH net-next v3 3/9] net: inet: add skb drop reason to inet_csk_destroy_sock() menglong8.dong
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: menglong8.dong @ 2022-06-10  3:41 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

Introduce the function sk_stream_kill_queues_reason() and make the
origin sk_stream_kill_queues() an inline call to it.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/net/sock.h | 8 +++++++-
 net/core/stream.c  | 7 ++++---
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 657873e2d90f..208c87807f23 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1139,12 +1139,18 @@ int sk_stream_wait_connect(struct sock *sk, long *timeo_p);
 int sk_stream_wait_memory(struct sock *sk, long *timeo_p);
 void sk_stream_wait_close(struct sock *sk, long timeo_p);
 int sk_stream_error(struct sock *sk, int flags, int err);
-void sk_stream_kill_queues(struct sock *sk);
+void sk_stream_kill_queues_reason(struct sock *sk,
+				  enum skb_drop_reason reason);
 void sk_set_memalloc(struct sock *sk);
 void sk_clear_memalloc(struct sock *sk);
 
 void __sk_flush_backlog(struct sock *sk);
 
+static inline void sk_stream_kill_queues(struct sock *sk)
+{
+	sk_stream_kill_queues_reason(sk, SKB_DROP_REASON_NOT_SPECIFIED);
+}
+
 static inline bool sk_flush_backlog(struct sock *sk)
 {
 	if (unlikely(READ_ONCE(sk->sk_backlog.tail))) {
diff --git a/net/core/stream.c b/net/core/stream.c
index 06b36c730ce8..a562b23a1a6e 100644
--- a/net/core/stream.c
+++ b/net/core/stream.c
@@ -190,10 +190,11 @@ int sk_stream_error(struct sock *sk, int flags, int err)
 }
 EXPORT_SYMBOL(sk_stream_error);
 
-void sk_stream_kill_queues(struct sock *sk)
+void sk_stream_kill_queues_reason(struct sock *sk,
+				  enum skb_drop_reason reason)
 {
 	/* First the read buffer. */
-	__skb_queue_purge(&sk->sk_receive_queue);
+	__skb_queue_purge_reason(&sk->sk_receive_queue, reason);
 
 	/* Next, the write queue. */
 	WARN_ON(!skb_queue_empty(&sk->sk_write_queue));
@@ -209,4 +210,4 @@ void sk_stream_kill_queues(struct sock *sk)
 	 * have gone away, only the net layer knows can touch it.
 	 */
 }
-EXPORT_SYMBOL(sk_stream_kill_queues);
+EXPORT_SYMBOL(sk_stream_kill_queues_reason);
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next v3 3/9] net: inet: add skb drop reason to inet_csk_destroy_sock()
  2022-06-10  3:41 [PATCH net-next v3 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
  2022-06-10  3:41 ` [PATCH net-next v3 1/9] net: skb: introduce __skb_queue_purge_reason() menglong8.dong
  2022-06-10  3:41 ` [PATCH net-next v3 2/9] net: sock: introduce sk_stream_kill_queues_reason() menglong8.dong
@ 2022-06-10  3:41 ` menglong8.dong
  2022-06-10  3:41 ` [PATCH net-next v3 4/9] net: tcp: make tcp_rcv_synsent_state_process() return drop reasons menglong8.dong
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: menglong8.dong @ 2022-06-10  3:41 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

skb dropping in inet_csk_destroy_sock() seems to be a common case. Add
the new drop reason 'SKB_DROP_REASON_SOCKET_DESTROIED' and apply it to
inet_csk_destroy_sock() to stop confusing users with 'NOT_SPECIFIED'.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/net/dropreason.h        | 5 +++++
 net/ipv4/inet_connection_sock.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index fae9b40e54fa..3c6f1e299c35 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -231,6 +231,11 @@ enum skb_drop_reason {
 	 * MTU)
 	 */
 	SKB_DROP_REASON_PKT_TOO_BIG,
+	/**
+	 * @SKB_DROP_REASON_SOCKET_DESTROYED: socket is destroyed and the
+	 * skb in its receive or send queue are all dropped
+	 */
+	SKB_DROP_REASON_SOCKET_DESTROYED,
 	/**
 	 * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
 	 * used as a real 'reason'
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index c0b7e6c21360..1812060f24cb 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1129,7 +1129,7 @@ void inet_csk_destroy_sock(struct sock *sk)
 
 	sk->sk_prot->destroy(sk);
 
-	sk_stream_kill_queues(sk);
+	sk_stream_kill_queues_reason(sk, SKB_DROP_REASON_SOCKET_DESTROYED);
 
 	xfrm_sk_free_policy(sk);
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next v3 4/9] net: tcp: make tcp_rcv_synsent_state_process() return drop reasons
  2022-06-10  3:41 [PATCH net-next v3 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (2 preceding siblings ...)
  2022-06-10  3:41 ` [PATCH net-next v3 3/9] net: inet: add skb drop reason to inet_csk_destroy_sock() menglong8.dong
@ 2022-06-10  3:41 ` menglong8.dong
  2022-06-10  3:42 ` [PATCH net-next v3 5/9] net: tcp: make tcp_rcv_state_process() return drop reason menglong8.dong
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: menglong8.dong @ 2022-06-10  3:41 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

The return value of tcp_rcv_synsent_state_process() can be -1, 0 or 1:

- -1: free skb silently
- 0: success and skb is already freed
- 1: drop packet and send a RST

Therefore, we can make it return skb drop reasons instead of '1' on
'reset_and_undo' path, which will not impact the caller.

The new reason 'TCP_PAWSACTIVEREJECTED' is added, which is corresponding
to LINUX_MIB_PAWSACTIVEREJECTED.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/net/dropreason.h | 6 ++++++
 net/ipv4/tcp_input.c     | 7 ++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index 3c6f1e299c35..c60913aba0e9 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -236,6 +236,12 @@ enum skb_drop_reason {
 	 * skb in its receive or send queue are all dropped
 	 */
 	SKB_DROP_REASON_SOCKET_DESTROYED,
+	/**
+	 * @SKB_DROP_REASON_TCP_PAWSACTIVEREJECTED: PAWS check failed for
+	 * active TCP connection, corresponding to
+	 * LINUX_MIB_PAWSACTIVEREJECTED
+	 */
+	SKB_DROP_REASON_TCP_PAWSACTIVEREJECTED,
 	/**
 	 * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
 	 * used as a real 'reason'
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2e2a9ece9af2..9254f14def43 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6177,6 +6177,10 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 				inet_csk_reset_xmit_timer(sk,
 						ICSK_TIME_RETRANS,
 						TCP_TIMEOUT_MIN, TCP_RTO_MAX);
+			if (after(TCP_SKB_CB(skb)->ack_seq, tp->snd_nxt))
+				SKB_DR_SET(reason, TCP_ACK_UNSENT_DATA);
+			else
+				SKB_DR_SET(reason, TCP_TOO_OLD_ACK);
 			goto reset_and_undo;
 		}
 
@@ -6185,6 +6189,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 			     tcp_time_stamp(tp))) {
 			NET_INC_STATS(sock_net(sk),
 					LINUX_MIB_PAWSACTIVEREJECTED);
+			SKB_DR_SET(reason, TCP_PAWSACTIVEREJECTED);
 			goto reset_and_undo;
 		}
 
@@ -6378,7 +6383,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 reset_and_undo:
 	tcp_clear_options(&tp->rx_opt);
 	tp->rx_opt.mss_clamp = saved_clamp;
-	return 1;
+	return reason;
 }
 
 static void tcp_rcv_synrecv_state_fastopen(struct sock *sk)
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next v3 5/9] net: tcp: make tcp_rcv_state_process() return drop reason
  2022-06-10  3:41 [PATCH net-next v3 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (3 preceding siblings ...)
  2022-06-10  3:41 ` [PATCH net-next v3 4/9] net: tcp: make tcp_rcv_synsent_state_process() return drop reasons menglong8.dong
@ 2022-06-10  3:42 ` menglong8.dong
  2022-06-10  8:56   ` Eric Dumazet
  2022-06-10  3:42 ` [PATCH net-next v3 6/9] net: tcp: add skb drop reasons to tcp connect requesting menglong8.dong
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 14+ messages in thread
From: menglong8.dong @ 2022-06-10  3:42 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

For now, the return value of tcp_rcv_state_process() is treated as bool.
Therefore, we can make it return the reasons of the skb drops.

Meanwhile, the return value of tcp_child_process() comes from
tcp_rcv_state_process(), make it drop reasons by the way.

The new drop reason SKB_DROP_REASON_TCP_LINGER is added for skb dropping
out of TCP linger.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
v3:
- instead SKB_DROP_REASON_TCP_ABORTONDATA with SKB_DROP_REASON_TCP_LINGER
---
 include/net/dropreason.h |  6 ++++++
 include/net/tcp.h        |  8 +++++---
 net/ipv4/tcp_input.c     | 36 ++++++++++++++++++++----------------
 net/ipv4/tcp_ipv4.c      | 20 +++++++++++++-------
 net/ipv4/tcp_minisocks.c | 11 ++++++-----
 net/ipv6/tcp_ipv6.c      | 19 ++++++++++++-------
 6 files changed, 62 insertions(+), 38 deletions(-)

diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index c60913aba0e9..bbbf70ce207d 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -242,6 +242,12 @@ enum skb_drop_reason {
 	 * LINUX_MIB_PAWSACTIVEREJECTED
 	 */
 	SKB_DROP_REASON_TCP_PAWSACTIVEREJECTED,
+	/**
+	 * @SKB_DROP_REASON_TCP_LINGER: dropped because of the setting of
+	 * TCP socket option TCP_LINGER2, corresponding to
+	 * LINUX_MIB_TCPABORTONLINGER
+	 */
+	SKB_DROP_REASON_TCP_LINGER,
 	/**
 	 * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
 	 * used as a real 'reason'
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 1e99f5c61f84..ea0eb2d4a743 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -339,7 +339,8 @@ void tcp_wfree(struct sk_buff *skb);
 void tcp_write_timer_handler(struct sock *sk);
 void tcp_delack_timer_handler(struct sock *sk);
 int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg);
-int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb);
+enum skb_drop_reason tcp_rcv_state_process(struct sock *sk,
+					   struct sk_buff *skb);
 void tcp_rcv_established(struct sock *sk, struct sk_buff *skb);
 void tcp_rcv_space_adjust(struct sock *sk);
 int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp);
@@ -385,8 +386,9 @@ enum tcp_tw_status tcp_timewait_state_process(struct inet_timewait_sock *tw,
 struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 			   struct request_sock *req, bool fastopen,
 			   bool *lost_race);
-int tcp_child_process(struct sock *parent, struct sock *child,
-		      struct sk_buff *skb);
+enum skb_drop_reason tcp_child_process(struct sock *parent,
+				       struct sock *child,
+				       struct sk_buff *skb);
 void tcp_enter_loss(struct sock *sk);
 void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int newly_lost, int flag);
 void tcp_clear_retrans(struct tcp_sock *tp);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9254f14def43..4a6a93d83866 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6425,13 +6425,13 @@ static void tcp_rcv_synrecv_state_fastopen(struct sock *sk)
  *	address independent.
  */
 
-int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
+enum skb_drop_reason tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	const struct tcphdr *th = tcp_hdr(skb);
 	struct request_sock *req;
-	int queued = 0;
+	int queued = 0, ret;
 	bool acceptable;
 	SKB_DR(reason);
 
@@ -6442,7 +6442,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 
 	case TCP_LISTEN:
 		if (th->ack)
-			return 1;
+			return SKB_DROP_REASON_TCP_FLAGS;
 
 		if (th->rst) {
 			SKB_DR_SET(reason, TCP_RESET);
@@ -6463,9 +6463,9 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 			rcu_read_unlock();
 
 			if (!acceptable)
-				return 1;
+				return SKB_DROP_REASON_NOT_SPECIFIED;
 			consume_skb(skb);
-			return 0;
+			return SKB_NOT_DROPPED_YET;
 		}
 		SKB_DR_SET(reason, TCP_FLAGS);
 		goto discard;
@@ -6475,13 +6475,13 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 		tcp_mstamp_refresh(tp);
 		queued = tcp_rcv_synsent_state_process(sk, skb, th);
 		if (queued >= 0)
-			return queued;
+			return (enum skb_drop_reason)queued;
 
 		/* Do step6 onward by hand. */
 		tcp_urg(sk, skb, th);
 		__kfree_skb(skb);
 		tcp_data_snd_check(sk);
-		return 0;
+		return SKB_NOT_DROPPED_YET;
 	}
 
 	tcp_mstamp_refresh(tp);
@@ -6508,15 +6508,19 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 		return 0;
 
 	/* step 5: check the ACK field */
-	acceptable = tcp_ack(sk, skb, FLAG_SLOWPATH |
-				      FLAG_UPDATE_TS_RECENT |
-				      FLAG_NO_CHALLENGE_ACK) > 0;
+	ret = tcp_ack(sk, skb, FLAG_SLOWPATH |
+			       FLAG_UPDATE_TS_RECENT |
+			       FLAG_NO_CHALLENGE_ACK);
+	acceptable = ret > 0;
 
 	if (!acceptable) {
 		if (sk->sk_state == TCP_SYN_RECV)
 			return 1;	/* send one RST */
 		tcp_send_challenge_ack(sk);
-		SKB_DR_SET(reason, TCP_OLD_ACK);
+		if (ret == 0)
+			SKB_DR_SET(reason, TCP_OLD_ACK);
+		else
+			reason = -ret;
 		goto discard;
 	}
 	switch (sk->sk_state) {
@@ -6585,7 +6589,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 		if (tp->linger2 < 0) {
 			tcp_done(sk);
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
-			return 1;
+			return SKB_DROP_REASON_TCP_LINGER;
 		}
 		if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
 		    after(TCP_SKB_CB(skb)->end_seq - th->fin, tp->rcv_nxt)) {
@@ -6594,7 +6598,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 				tcp_fastopen_active_disable(sk);
 			tcp_done(sk);
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
-			return 1;
+			return SKB_DROP_REASON_TCP_INVALID_SEQUENCE;
 		}
 
 		tmo = tcp_fin_time(sk);
@@ -6659,7 +6663,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 			    after(TCP_SKB_CB(skb)->end_seq - th->fin, tp->rcv_nxt)) {
 				NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
 				tcp_reset(sk, skb);
-				return 1;
+				return SKB_DROP_REASON_TCP_INVALID_SEQUENCE;
 			}
 		}
 		fallthrough;
@@ -6679,11 +6683,11 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 discard:
 		tcp_drop_reason(sk, skb, reason);
 	}
-	return 0;
+	return SKB_NOT_DROPPED_YET;
 
 consume:
 	__kfree_skb(skb);
-	return 0;
+	return SKB_NOT_DROPPED_YET;
 }
 EXPORT_SYMBOL(tcp_rcv_state_process);
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fe8f23b95d32..7bd35ce48b01 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1670,7 +1670,8 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 		if (!nsk)
 			goto discard;
 		if (nsk != sk) {
-			if (tcp_child_process(sk, nsk, skb)) {
+			reason = tcp_child_process(sk, nsk, skb);
+			if (reason) {
 				rsk = nsk;
 				goto reset;
 			}
@@ -1679,7 +1680,8 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 	} else
 		sock_rps_save_rxhash(sk, skb);
 
-	if (tcp_rcv_state_process(sk, skb)) {
+	reason = tcp_rcv_state_process(sk, skb);
+	if (reason) {
 		rsk = sk;
 		goto reset;
 	}
@@ -1688,6 +1690,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 reset:
 	tcp_v4_send_reset(rsk, skb);
 discard:
+	SKB_DR_OR(reason, NOT_SPECIFIED);
 	kfree_skb_reason(skb, reason);
 	/* Be careful here. If this function gets more complicated and
 	 * gcc suffers from register pressure on the x86, sk (in %ebx)
@@ -2019,12 +2022,15 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		if (nsk == sk) {
 			reqsk_put(req);
 			tcp_v4_restore_cb(skb);
-		} else if (tcp_child_process(sk, nsk, skb)) {
-			tcp_v4_send_reset(nsk, skb);
-			goto discard_and_relse;
 		} else {
-			sock_put(sk);
-			return 0;
+			drop_reason = tcp_child_process(sk, nsk, skb);
+			if (drop_reason) {
+				tcp_v4_send_reset(nsk, skb);
+				goto discard_and_relse;
+			} else {
+				sock_put(sk);
+				return 0;
+			}
 		}
 	}
 
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 6854bb1fb32b..1a21018f6f64 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -821,11 +821,12 @@ EXPORT_SYMBOL(tcp_check_req);
  * be created.
  */
 
-int tcp_child_process(struct sock *parent, struct sock *child,
-		      struct sk_buff *skb)
+enum skb_drop_reason tcp_child_process(struct sock *parent,
+				       struct sock *child,
+				       struct sk_buff *skb)
 	__releases(&((child)->sk_lock.slock))
 {
-	int ret = 0;
+	enum skb_drop_reason reason = SKB_NOT_DROPPED_YET;
 	int state = child->sk_state;
 
 	/* record sk_napi_id and sk_rx_queue_mapping of child. */
@@ -833,7 +834,7 @@ int tcp_child_process(struct sock *parent, struct sock *child,
 
 	tcp_segs_in(tcp_sk(child), skb);
 	if (!sock_owned_by_user(child)) {
-		ret = tcp_rcv_state_process(child, skb);
+		reason = tcp_rcv_state_process(child, skb);
 		/* Wakeup parent, send SIGIO */
 		if (state == TCP_SYN_RECV && child->sk_state != state)
 			parent->sk_data_ready(parent);
@@ -847,6 +848,6 @@ int tcp_child_process(struct sock *parent, struct sock *child,
 
 	bh_unlock_sock(child);
 	sock_put(child);
-	return ret;
+	return reason;
 }
 EXPORT_SYMBOL(tcp_child_process);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index f37dd4aa91c6..49c640b0cea3 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1489,7 +1489,8 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 			goto discard;
 
 		if (nsk != sk) {
-			if (tcp_child_process(sk, nsk, skb))
+			reason = tcp_child_process(sk, nsk, skb);
+			if (reason)
 				goto reset;
 			if (opt_skb)
 				__kfree_skb(opt_skb);
@@ -1498,7 +1499,8 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 	} else
 		sock_rps_save_rxhash(sk, skb);
 
-	if (tcp_rcv_state_process(sk, skb))
+	reason = tcp_rcv_state_process(sk, skb);
+	if (reason)
 		goto reset;
 	if (opt_skb)
 		goto ipv6_pktoptions;
@@ -1684,12 +1686,15 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 		if (nsk == sk) {
 			reqsk_put(req);
 			tcp_v6_restore_cb(skb);
-		} else if (tcp_child_process(sk, nsk, skb)) {
-			tcp_v6_send_reset(nsk, skb);
-			goto discard_and_relse;
 		} else {
-			sock_put(sk);
-			return 0;
+			drop_reason = tcp_child_process(sk, nsk, skb);
+			if (drop_reason) {
+				tcp_v6_send_reset(nsk, skb);
+				goto discard_and_relse;
+			} else {
+				sock_put(sk);
+				return 0;
+			}
 		}
 	}
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next v3 6/9] net: tcp: add skb drop reasons to tcp connect requesting
  2022-06-10  3:41 [PATCH net-next v3 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (4 preceding siblings ...)
  2022-06-10  3:42 ` [PATCH net-next v3 5/9] net: tcp: make tcp_rcv_state_process() return drop reason menglong8.dong
@ 2022-06-10  3:42 ` menglong8.dong
  2022-06-10  3:42 ` [PATCH net-next v3 7/9] net: tcp: add skb drop reasons to tcp tw code path menglong8.dong
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: menglong8.dong @ 2022-06-10  3:42 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng, kernel test robot

From: Menglong Dong <imagedong@tencent.com>

In order to get skb drop reasons during tcp connect requesting code path,
we have to pass the pointer of the 'reason' as a new function argument of
conn_request() in 'struct inet_connection_sock_af_ops'. As the return
value of conn_request() can be positive or negative or 0, it's not
flexible to make it return drop reasons.

As the return value of tcp_conn_request() is 0, so we can treat it as bool
and make it return the skb drop reasons.

The new drop reasons 'LISTENOVERFLOWS' and 'TCP_REQQFULLDROP' are added,
which are used for 'accept queue' and 'request queue' full.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reported-by: kernel test robot <lkp@intel.com>
--
v2:
- fix compile error reported by kernel test rebot
---
 include/net/dropreason.h           | 10 ++++++++++
 include/net/inet_connection_sock.h |  3 ++-
 include/net/tcp.h                  |  9 +++++----
 net/dccp/dccp.h                    |  3 ++-
 net/dccp/input.c                   |  3 ++-
 net/dccp/ipv4.c                    |  3 ++-
 net/dccp/ipv6.c                    |  5 +++--
 net/ipv4/tcp_input.c               | 27 ++++++++++++++++++---------
 net/ipv4/tcp_ipv4.c                |  9 ++++++---
 net/ipv6/tcp_ipv6.c                | 12 ++++++++----
 net/mptcp/subflow.c                |  8 +++++---
 11 files changed, 63 insertions(+), 29 deletions(-)

diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index bbbf70ce207d..86e89d3022b5 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -248,6 +248,16 @@ enum skb_drop_reason {
 	 * LINUX_MIB_TCPABORTONLINGER
 	 */
 	SKB_DROP_REASON_TCP_LINGER,
+	/**
+	 * @SKB_DROP_REASON_LISTENOVERFLOWS: accept queue of the listen
+	 * socket is full, corresponding to LINUX_MIB_LISTENOVERFLOWS
+	 */
+	SKB_DROP_REASON_LISTENOVERFLOWS,
+	/**
+	 * @SKB_DROP_REASON_TCP_REQQFULLDROP: request queue of the listen
+	 * socket is full, corresponding to LINUX_MIB_TCPREQQFULLDROP
+	 */
+	SKB_DROP_REASON_TCP_REQQFULLDROP,
 	/**
 	 * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
 	 * used as a real 'reason'
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 077cd730ce2f..7717c59ab3d7 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -37,7 +37,8 @@ struct inet_connection_sock_af_ops {
 	void	    (*send_check)(struct sock *sk, struct sk_buff *skb);
 	int	    (*rebuild_header)(struct sock *sk);
 	void	    (*sk_rx_dst_set)(struct sock *sk, const struct sk_buff *skb);
-	int	    (*conn_request)(struct sock *sk, struct sk_buff *skb);
+	int	    (*conn_request)(struct sock *sk, struct sk_buff *skb,
+				    enum skb_drop_reason *reason);
 	struct sock *(*syn_recv_sock)(const struct sock *sk, struct sk_buff *skb,
 				      struct request_sock *req,
 				      struct dst_entry *dst,
diff --git a/include/net/tcp.h b/include/net/tcp.h
index ea0eb2d4a743..082dd0627e2e 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -445,7 +445,8 @@ void tcp_v4_send_check(struct sock *sk, struct sk_buff *skb);
 void tcp_v4_mtu_reduced(struct sock *sk);
 void tcp_req_err(struct sock *sk, u32 seq, bool abort);
 void tcp_ld_RTO_revert(struct sock *sk, u32 seq);
-int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb);
+int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb,
+			enum skb_drop_reason *reason);
 struct sock *tcp_create_openreq_child(const struct sock *sk,
 				      struct request_sock *req,
 				      struct sk_buff *skb);
@@ -2036,9 +2037,9 @@ void tcp4_proc_exit(void);
 #endif
 
 int tcp_rtx_synack(const struct sock *sk, struct request_sock *req);
-int tcp_conn_request(struct request_sock_ops *rsk_ops,
-		     const struct tcp_request_sock_ops *af_ops,
-		     struct sock *sk, struct sk_buff *skb);
+enum skb_drop_reason tcp_conn_request(struct request_sock_ops *rsk_ops,
+				      const struct tcp_request_sock_ops *af_ops,
+				      struct sock *sk, struct sk_buff *skb);
 
 /* TCP af-specific functions */
 struct tcp_sock_af_ops {
diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h
index 7dfc00c9fb32..8c1241ae8449 100644
--- a/net/dccp/dccp.h
+++ b/net/dccp/dccp.h
@@ -255,7 +255,8 @@ void dccp_done(struct sock *sk);
 int dccp_reqsk_init(struct request_sock *rq, struct dccp_sock const *dp,
 		    struct sk_buff const *skb);
 
-int dccp_v4_conn_request(struct sock *sk, struct sk_buff *skb);
+int dccp_v4_conn_request(struct sock *sk, struct sk_buff *skb,
+			 enum skb_drop_reason *reason);
 
 struct sock *dccp_create_openreq_child(const struct sock *sk,
 				       const struct request_sock *req,
diff --git a/net/dccp/input.c b/net/dccp/input.c
index 2cbb757a894f..e12baa56ca59 100644
--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -576,6 +576,7 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
 	const int old_state = sk->sk_state;
 	bool acceptable;
 	int queued = 0;
+	SKB_DR(reason);
 
 	/*
 	 *  Step 3: Process LISTEN state
@@ -606,7 +607,7 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
 			 */
 			rcu_read_lock();
 			local_bh_disable();
-			acceptable = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb) >= 0;
+			acceptable = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb, &reason) >= 0;
 			local_bh_enable();
 			rcu_read_unlock();
 			if (!acceptable)
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index da6e3b20cd75..b0e8fcabf93b 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -581,7 +581,8 @@ static struct request_sock_ops dccp_request_sock_ops __read_mostly = {
 	.syn_ack_timeout = dccp_syn_ack_timeout,
 };
 
-int dccp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
+int dccp_v4_conn_request(struct sock *sk, struct sk_buff *skb,
+			 enum skb_drop_reason *reason)
 {
 	struct inet_request_sock *ireq;
 	struct request_sock *req;
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index fd44638ec16b..f01d6a05c7bf 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -314,7 +314,8 @@ static struct request_sock_ops dccp6_request_sock_ops = {
 	.syn_ack_timeout = dccp_syn_ack_timeout,
 };
 
-static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
+static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb,
+				enum skb_drop_reason *reason)
 {
 	struct request_sock *req;
 	struct dccp_request_sock *dreq;
@@ -324,7 +325,7 @@ static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
 	struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb);
 
 	if (skb->protocol == htons(ETH_P_IP))
-		return dccp_v4_conn_request(sk, skb);
+		return dccp_v4_conn_request(sk, skb, reason);
 
 	if (!ipv6_unicast_destination(skb))
 		return 0;	/* discard, don't send a reset here */
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4a6a93d83866..6f7dd2564b9b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6458,13 +6458,17 @@ enum skb_drop_reason tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 			 */
 			rcu_read_lock();
 			local_bh_disable();
-			acceptable = icsk->icsk_af_ops->conn_request(sk, skb) >= 0;
+			reason = SKB_NOT_DROPPED_YET;
+			acceptable = icsk->icsk_af_ops->conn_request(sk, skb, &reason) >= 0;
 			local_bh_enable();
 			rcu_read_unlock();
 
 			if (!acceptable)
-				return SKB_DROP_REASON_NOT_SPECIFIED;
-			consume_skb(skb);
+				return reason ?: SKB_DROP_REASON_NOT_SPECIFIED;
+			if (reason)
+				kfree_skb_reason(skb, reason);
+			else
+				consume_skb(skb);
 			return SKB_NOT_DROPPED_YET;
 		}
 		SKB_DR_SET(reason, TCP_FLAGS);
@@ -6888,9 +6892,9 @@ u16 tcp_get_syncookie_mss(struct request_sock_ops *rsk_ops,
 }
 EXPORT_SYMBOL_GPL(tcp_get_syncookie_mss);
 
-int tcp_conn_request(struct request_sock_ops *rsk_ops,
-		     const struct tcp_request_sock_ops *af_ops,
-		     struct sock *sk, struct sk_buff *skb)
+enum skb_drop_reason tcp_conn_request(struct request_sock_ops *rsk_ops,
+				      const struct tcp_request_sock_ops *af_ops,
+				      struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_fastopen_cookie foc = { .len = -1 };
 	__u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn;
@@ -6902,6 +6906,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	bool want_cookie = false;
 	struct dst_entry *dst;
 	struct flowi fl;
+	SKB_DR(reason);
 
 	/* TW buckets are converted to open requests without
 	 * limitations, they conserve resources and peer is
@@ -6910,12 +6915,15 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	if ((net->ipv4.sysctl_tcp_syncookies == 2 ||
 	     inet_csk_reqsk_queue_is_full(sk)) && !isn) {
 		want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name);
-		if (!want_cookie)
+		if (!want_cookie) {
+			SKB_DR_SET(reason, TCP_REQQFULLDROP);
 			goto drop;
+		}
 	}
 
 	if (sk_acceptq_is_full(sk)) {
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
+		SKB_DR_SET(reason, LISTENOVERFLOWS);
 		goto drop;
 	}
 
@@ -6971,6 +6979,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 			 */
 			pr_drop_req(req, ntohs(tcp_hdr(skb)->source),
 				    rsk_ops->family);
+			SKB_DR_SET(reason, TCP_REQQFULLDROP);
 			goto drop_and_release;
 		}
 
@@ -7023,7 +7032,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 		}
 	}
 	reqsk_put(req);
-	return 0;
+	return SKB_NOT_DROPPED_YET;
 
 drop_and_release:
 	dst_release(dst);
@@ -7031,6 +7040,6 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	__reqsk_free(req);
 drop:
 	tcp_listendrop(sk);
-	return 0;
+	return reason;
 }
 EXPORT_SYMBOL(tcp_conn_request);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 7bd35ce48b01..804fc5e0203e 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1458,17 +1458,20 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
 	.send_synack	=	tcp_v4_send_synack,
 };
 
-int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
+int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb,
+			enum skb_drop_reason *reason)
 {
 	/* Never answer to SYNs send to broadcast or multicast */
 	if (skb_rtable(skb)->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))
 		goto drop;
 
-	return tcp_conn_request(&tcp_request_sock_ops,
-				&tcp_request_sock_ipv4_ops, sk, skb);
+	*reason = tcp_conn_request(&tcp_request_sock_ops,
+				   &tcp_request_sock_ipv4_ops, sk, skb);
+	return *reason;
 
 drop:
 	tcp_listendrop(sk);
+	*reason = SKB_DROP_REASON_IP_INADDRERRORS;
 	return 0;
 }
 EXPORT_SYMBOL(tcp_v4_conn_request);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 49c640b0cea3..0d2ba9d90529 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1148,24 +1148,28 @@ u16 tcp_v6_get_syncookie(struct sock *sk, struct ipv6hdr *iph,
 	return mss;
 }
 
-static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
+static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb,
+			       enum skb_drop_reason *reason)
 {
 	if (skb->protocol == htons(ETH_P_IP))
-		return tcp_v4_conn_request(sk, skb);
+		return tcp_v4_conn_request(sk, skb, reason);
 
 	if (!ipv6_unicast_destination(skb))
 		goto drop;
 
 	if (ipv6_addr_v4mapped(&ipv6_hdr(skb)->saddr)) {
 		__IP6_INC_STATS(sock_net(sk), NULL, IPSTATS_MIB_INHDRERRORS);
+		*reason = SKB_DROP_REASON_IP_INADDRERRORS;
 		return 0;
 	}
 
-	return tcp_conn_request(&tcp6_request_sock_ops,
-				&tcp_request_sock_ipv6_ops, sk, skb);
+	*reason = tcp_conn_request(&tcp6_request_sock_ops,
+				   &tcp_request_sock_ipv6_ops, sk, skb);
+	return *reason;
 
 drop:
 	tcp_listendrop(sk);
+	*reason = SKB_DROP_REASON_IP_INADDRERRORS;
 	return 0; /* don't send reset */
 }
 
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 8841e8cd9ad8..5a1d05c3a1ef 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -532,7 +532,8 @@ static int subflow_v6_rebuild_header(struct sock *sk)
 struct request_sock_ops mptcp_subflow_request_sock_ops;
 static struct tcp_request_sock_ops subflow_request_sock_ipv4_ops __ro_after_init;
 
-static int subflow_v4_conn_request(struct sock *sk, struct sk_buff *skb)
+static int subflow_v4_conn_request(struct sock *sk, struct sk_buff *skb,
+				   enum skb_drop_reason *reason)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
 
@@ -556,14 +557,15 @@ static struct inet_connection_sock_af_ops subflow_v6_specific __ro_after_init;
 static struct inet_connection_sock_af_ops subflow_v6m_specific __ro_after_init;
 static struct proto tcpv6_prot_override;
 
-static int subflow_v6_conn_request(struct sock *sk, struct sk_buff *skb)
+static int subflow_v6_conn_request(struct sock *sk, struct sk_buff *skb,
+				   enum skb_drop_reason *reason)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
 
 	pr_debug("subflow=%p", subflow);
 
 	if (skb->protocol == htons(ETH_P_IP))
-		return subflow_v4_conn_request(sk, skb);
+		return subflow_v4_conn_request(sk, skb, reason);
 
 	if (!ipv6_unicast_destination(skb))
 		goto drop;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next v3 7/9] net: tcp: add skb drop reasons to tcp tw code path
  2022-06-10  3:41 [PATCH net-next v3 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (5 preceding siblings ...)
  2022-06-10  3:42 ` [PATCH net-next v3 6/9] net: tcp: add skb drop reasons to tcp connect requesting menglong8.dong
@ 2022-06-10  3:42 ` menglong8.dong
  2022-06-10  9:04   ` Eric Dumazet
  2022-06-10  3:42 ` [PATCH net-next v3 8/9] net: tcp: add skb drop reasons to route_req() menglong8.dong
  2022-06-10  3:42 ` [PATCH net-next v3 9/9] net: tcp: use LINUX_MIB_TCPABORTONLINGER in tcp_rcv_state_process() menglong8.dong
  8 siblings, 1 reply; 14+ messages in thread
From: menglong8.dong @ 2022-06-10  3:42 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

In order to get the reasons of skb drops, add a function argument of
type 'enum skb_drop_reason *reason' to tcp_timewait_state_process().

In the origin code, all packets to time-wait socket are treated as
dropping with kfree_skb(), which can make users confused. Therefore,
we use consume_skb() for the skbs that are 'good'. We can check the
value of 'reason' to decide use kfree_skb() or consume_skb().

The new reason 'TIMEWAIT' is added for the case that the skb is dropped
as the socket in time-wait state.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reported-by: Eric Dumazet <edumazet@google.com>
---
v2:
- skb is not freed on TCP_TW_ACK and 'ret' is not initizalized, fix
  it (Eric Dumazet)
---
 include/net/dropreason.h |  6 ++++++
 include/net/tcp.h        |  7 ++++---
 net/ipv4/tcp_ipv4.c      |  9 ++++++++-
 net/ipv4/tcp_minisocks.c | 24 ++++++++++++++++++++----
 net/ipv6/tcp_ipv6.c      |  8 +++++++-
 5 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index 86e89d3022b5..3c6f8e0f7f16 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -258,6 +258,12 @@ enum skb_drop_reason {
 	 * socket is full, corresponding to LINUX_MIB_TCPREQQFULLDROP
 	 */
 	SKB_DROP_REASON_TCP_REQQFULLDROP,
+	/**
+	 * @SKB_DROP_REASON_TIMEWAIT: socket is in time-wait state and all
+	 * packet that received will be treated as 'drop', except a good
+	 * 'SYN' packet
+	 */
+	SKB_DROP_REASON_TIMEWAIT,
 	/**
 	 * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
 	 * used as a real 'reason'
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 082dd0627e2e..88217b8d95ac 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -380,9 +380,10 @@ enum tcp_tw_status {
 };
 
 
-enum tcp_tw_status tcp_timewait_state_process(struct inet_timewait_sock *tw,
-					      struct sk_buff *skb,
-					      const struct tcphdr *th);
+enum tcp_tw_status
+tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
+			   const struct tcphdr *th,
+			   enum skb_drop_reason *reason);
 struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 			   struct request_sock *req, bool fastopen,
 			   bool *lost_race);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 804fc5e0203e..e7bd2f410a4a 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2134,7 +2134,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		inet_twsk_put(inet_twsk(sk));
 		goto csum_error;
 	}
-	switch (tcp_timewait_state_process(inet_twsk(sk), skb, th)) {
+	switch (tcp_timewait_state_process(inet_twsk(sk), skb, th,
+					   &drop_reason)) {
 	case TCP_TW_SYN: {
 		struct sock *sk2 = inet_lookup_listener(dev_net(skb->dev),
 							&tcp_hashinfo, skb,
@@ -2150,11 +2151,17 @@ int tcp_v4_rcv(struct sk_buff *skb)
 			refcounted = false;
 			goto process;
 		}
+		/* TCP_FLAGS or NO_SOCKET? */
+		SKB_DR_SET(drop_reason, TCP_FLAGS);
 	}
 		/* to ACK */
 		fallthrough;
 	case TCP_TW_ACK:
 		tcp_v4_timewait_ack(sk, skb);
+		if (!drop_reason) {
+			consume_skb(skb);
+			return 0;
+		}
 		break;
 	case TCP_TW_RST:
 		tcp_v4_send_reset(sk, skb);
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 1a21018f6f64..329724118b7f 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -83,13 +83,15 @@ tcp_timewait_check_oow_rate_limit(struct inet_timewait_sock *tw,
  */
 enum tcp_tw_status
 tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
-			   const struct tcphdr *th)
+			   const struct tcphdr *th,
+			   enum skb_drop_reason *reason)
 {
 	struct tcp_options_received tmp_opt;
 	struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw);
 	bool paws_reject = false;
 
 	tmp_opt.saw_tstamp = 0;
+	*reason = SKB_DROP_REASON_NOT_SPECIFIED;
 	if (th->doff > (sizeof(*th) >> 2) && tcptw->tw_ts_recent_stamp) {
 		tcp_parse_options(twsk_net(tw), skb, &tmp_opt, 0, NULL);
 
@@ -113,11 +115,16 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 			return tcp_timewait_check_oow_rate_limit(
 				tw, skb, LINUX_MIB_TCPACKSKIPPEDFINWAIT2);
 
-		if (th->rst)
+		if (th->rst) {
+			SKB_DR_SET(*reason, TCP_RESET);
 			goto kill;
+		}
 
-		if (th->syn && !before(TCP_SKB_CB(skb)->seq, tcptw->tw_rcv_nxt))
+		if (th->syn && !before(TCP_SKB_CB(skb)->seq,
+				       tcptw->tw_rcv_nxt)) {
+			SKB_DR_SET(*reason, TCP_FLAGS);
 			return TCP_TW_RST;
+		}
 
 		/* Dup ACK? */
 		if (!th->ack ||
@@ -143,6 +150,9 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 		}
 
 		inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN);
+
+		/* skb should be free normally on this case. */
+		*reason = SKB_NOT_DROPPED_YET;
 		return TCP_TW_ACK;
 	}
 
@@ -174,6 +184,7 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 			 * protocol bug yet.
 			 */
 			if (twsk_net(tw)->ipv4.sysctl_tcp_rfc1337 == 0) {
+				SKB_DR_SET(*reason, TCP_RESET);
 kill:
 				inet_twsk_deschedule_put(tw);
 				return TCP_TW_SUCCESS;
@@ -216,11 +227,14 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 		if (isn == 0)
 			isn++;
 		TCP_SKB_CB(skb)->tcp_tw_isn = isn;
+		*reason = SKB_NOT_DROPPED_YET;
 		return TCP_TW_SYN;
 	}
 
-	if (paws_reject)
+	if (paws_reject) {
+		SKB_DR_SET(*reason, TCP_RFC7323_PAWS);
 		__NET_INC_STATS(twsk_net(tw), LINUX_MIB_PAWSESTABREJECTED);
+	}
 
 	if (!th->rst) {
 		/* In this case we must reset the TIMEWAIT timer.
@@ -232,9 +246,11 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 		if (paws_reject || th->ack)
 			inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN);
 
+		SKB_DR_OR(*reason, TIMEWAIT);
 		return tcp_timewait_check_oow_rate_limit(
 			tw, skb, LINUX_MIB_TCPACKSKIPPEDTIMEWAIT);
 	}
+	SKB_DR_SET(*reason, TCP_RESET);
 	inet_twsk_put(tw);
 	return TCP_TW_SUCCESS;
 }
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 0d2ba9d90529..41551a3b679b 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1795,7 +1795,8 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 		goto csum_error;
 	}
 
-	switch (tcp_timewait_state_process(inet_twsk(sk), skb, th)) {
+	switch (tcp_timewait_state_process(inet_twsk(sk), skb, th,
+					   &drop_reason)) {
 	case TCP_TW_SYN:
 	{
 		struct sock *sk2;
@@ -1815,11 +1816,16 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 			refcounted = false;
 			goto process;
 		}
+		SKB_DR_SET(drop_reason, TCP_FLAGS);
 	}
 		/* to ACK */
 		fallthrough;
 	case TCP_TW_ACK:
 		tcp_v6_timewait_ack(sk, skb);
+		if (!drop_reason) {
+			consume_skb(skb);
+			return 0;
+		}
 		break;
 	case TCP_TW_RST:
 		tcp_v6_send_reset(sk, skb);
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next v3 8/9] net: tcp: add skb drop reasons to route_req()
  2022-06-10  3:41 [PATCH net-next v3 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (6 preceding siblings ...)
  2022-06-10  3:42 ` [PATCH net-next v3 7/9] net: tcp: add skb drop reasons to tcp tw code path menglong8.dong
@ 2022-06-10  3:42 ` menglong8.dong
  2022-06-10  3:42 ` [PATCH net-next v3 9/9] net: tcp: use LINUX_MIB_TCPABORTONLINGER in tcp_rcv_state_process() menglong8.dong
  8 siblings, 0 replies; 14+ messages in thread
From: menglong8.dong @ 2022-06-10  3:42 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev, Jiang Biao, Hao Peng

From: Menglong Dong <imagedong@tencent.com>

Add skb drop reasons to the route_req() in struct tcp_request_sock_ops.
Following functions are involved:

  tcp_v4_route_req()
  tcp_v6_route_req()
  subflow_v4_route_req()
  subflow_v6_route_req()

And the new reason SKB_DROP_REASON_LSM is added, which is used when
skb is dropped by LSM.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/net/dropreason.h |  2 ++
 include/net/tcp.h        |  3 ++-
 net/ipv4/tcp_input.c     |  2 +-
 net/ipv4/tcp_ipv4.c      | 14 +++++++++++---
 net/ipv6/tcp_ipv6.c      | 14 +++++++++++---
 net/mptcp/subflow.c      | 10 ++++++----
 6 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index 3c6f8e0f7f16..22e9299e4bfe 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -264,6 +264,8 @@ enum skb_drop_reason {
 	 * 'SYN' packet
 	 */
 	SKB_DROP_REASON_TIMEWAIT,
+	/** @SKB_DROP_REASON_LSM: dropped by LSM */
+	SKB_DROP_REASON_LSM,
 	/**
 	 * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
 	 * used as a real 'reason'
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 88217b8d95ac..ed57c331fdeb 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2075,7 +2075,8 @@ struct tcp_request_sock_ops {
 	struct dst_entry *(*route_req)(const struct sock *sk,
 				       struct sk_buff *skb,
 				       struct flowi *fl,
-				       struct request_sock *req);
+				       struct request_sock *req,
+				       enum skb_drop_reason *reason);
 	u32 (*init_seq)(const struct sk_buff *skb);
 	u32 (*init_ts_off)(const struct net *net, const struct sk_buff *skb);
 	int (*send_synack)(const struct sock *sk, struct dst_entry *dst,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 6f7dd2564b9b..9b9247b34a6c 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6957,7 +6957,7 @@ enum skb_drop_reason tcp_conn_request(struct request_sock_ops *rsk_ops,
 	/* Note: tcp_v6_init_req() might override ir_iif for link locals */
 	inet_rsk(req)->ir_iif = inet_request_bound_dev_if(sk, skb);
 
-	dst = af_ops->route_req(sk, skb, &fl, req);
+	dst = af_ops->route_req(sk, skb, &fl, req, &reason);
 	if (!dst)
 		goto drop_and_free;
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index e7bd2f410a4a..dcaf2bac4f55 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1423,14 +1423,22 @@ static void tcp_v4_init_req(struct request_sock *req,
 static struct dst_entry *tcp_v4_route_req(const struct sock *sk,
 					  struct sk_buff *skb,
 					  struct flowi *fl,
-					  struct request_sock *req)
+					  struct request_sock *req,
+					  enum skb_drop_reason *reason)
 {
+	struct dst_entry *dst;
+
 	tcp_v4_init_req(req, sk, skb);
 
-	if (security_inet_conn_request(sk, skb, req))
+	if (security_inet_conn_request(sk, skb, req)) {
+		SKB_DR_SET(*reason, LSM);
 		return NULL;
+	}
 
-	return inet_csk_route_req(sk, &fl->u.ip4, req);
+	dst = inet_csk_route_req(sk, &fl->u.ip4, req);
+	if (!dst)
+		SKB_DR_SET(*reason, IP_OUTNOROUTES);
+	return dst;
 }
 
 struct request_sock_ops tcp_request_sock_ops __read_mostly = {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 41551a3b679b..6e2d67238e11 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -802,14 +802,22 @@ static void tcp_v6_init_req(struct request_sock *req,
 static struct dst_entry *tcp_v6_route_req(const struct sock *sk,
 					  struct sk_buff *skb,
 					  struct flowi *fl,
-					  struct request_sock *req)
+					  struct request_sock *req,
+					  enum skb_drop_reason *reason)
 {
+	struct dst_entry *dst;
+
 	tcp_v6_init_req(req, sk, skb);
 
-	if (security_inet_conn_request(sk, skb, req))
+	if (security_inet_conn_request(sk, skb, req)) {
+		SKB_DR_SET(*reason, LSM);
 		return NULL;
+	}
 
-	return inet6_csk_route_req(sk, &fl->u.ip6, req, IPPROTO_TCP);
+	dst = inet6_csk_route_req(sk, &fl->u.ip6, req, IPPROTO_TCP);
+	if (!dst)
+		SKB_DR_SET(*reason, IP_OUTNOROUTES);
+	return dst;
 }
 
 struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 5a1d05c3a1ef..654cc602ff2c 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -285,7 +285,8 @@ EXPORT_SYMBOL_GPL(mptcp_subflow_init_cookie_req);
 static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 					      struct sk_buff *skb,
 					      struct flowi *fl,
-					      struct request_sock *req)
+					      struct request_sock *req,
+					      enum skb_drop_reason *reason)
 {
 	struct dst_entry *dst;
 	int err;
@@ -293,7 +294,7 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 	tcp_rsk(req)->is_mptcp = 1;
 	subflow_init_req(req, sk);
 
-	dst = tcp_request_sock_ipv4_ops.route_req(sk, skb, fl, req);
+	dst = tcp_request_sock_ipv4_ops.route_req(sk, skb, fl, req, reason);
 	if (!dst)
 		return NULL;
 
@@ -311,7 +312,8 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 static struct dst_entry *subflow_v6_route_req(const struct sock *sk,
 					      struct sk_buff *skb,
 					      struct flowi *fl,
-					      struct request_sock *req)
+					      struct request_sock *req,
+					      enum skb_drop_reason *reason)
 {
 	struct dst_entry *dst;
 	int err;
@@ -319,7 +321,7 @@ static struct dst_entry *subflow_v6_route_req(const struct sock *sk,
 	tcp_rsk(req)->is_mptcp = 1;
 	subflow_init_req(req, sk);
 
-	dst = tcp_request_sock_ipv6_ops.route_req(sk, skb, fl, req);
+	dst = tcp_request_sock_ipv6_ops.route_req(sk, skb, fl, req, reason);
 	if (!dst)
 		return NULL;
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next v3 9/9] net: tcp: use LINUX_MIB_TCPABORTONLINGER in tcp_rcv_state_process()
  2022-06-10  3:41 [PATCH net-next v3 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
                   ` (7 preceding siblings ...)
  2022-06-10  3:42 ` [PATCH net-next v3 8/9] net: tcp: add skb drop reasons to route_req() menglong8.dong
@ 2022-06-10  3:42 ` menglong8.dong
  8 siblings, 0 replies; 14+ messages in thread
From: menglong8.dong @ 2022-06-10  3:42 UTC (permalink / raw)
  To: edumazet
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, kuba, pabeni,
	imagedong, kafai, talalahmad, keescook, dongli.zhang,
	linux-kernel, netdev

From: Menglong Dong <imagedong@tencent.com>

The statistics for 'tp->linger2 < 0' in tcp_rcv_state_process() seems
more accurate to be LINUX_MIB_TCPABORTONLINGER.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 net/ipv4/tcp_input.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9b9247b34a6c..07b06c12fe87 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6592,7 +6592,7 @@ enum skb_drop_reason tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 
 		if (tp->linger2 < 0) {
 			tcp_done(sk);
-			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
+			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONLINGER);
 			return SKB_DROP_REASON_TCP_LINGER;
 		}
 		if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v3 5/9] net: tcp: make tcp_rcv_state_process() return drop reason
  2022-06-10  3:42 ` [PATCH net-next v3 5/9] net: tcp: make tcp_rcv_state_process() return drop reason menglong8.dong
@ 2022-06-10  8:56   ` Eric Dumazet
  2022-06-13  2:43     ` Menglong Dong
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2022-06-10  8:56 UTC (permalink / raw)
  To: Menglong Dong
  Cc: Steven Rostedt, Ingo Molnar, David Miller, Hideaki YOSHIFUJI,
	David Ahern, Jakub Kicinski, Paolo Abeni, Menglong Dong,
	Martin KaFai Lau, Talal Ahmad, Kees Cook, Dongli Zhang, LKML,
	netdev, Jiang Biao, Hao Peng

On Thu, Jun 9, 2022 at 8:45 PM <menglong8.dong@gmail.com> wrote:
>
> From: Menglong Dong <imagedong@tencent.com>
>
> For now, the return value of tcp_rcv_state_process() is treated as bool.
> Therefore, we can make it return the reasons of the skb drops.
>
> Meanwhile, the return value of tcp_child_process() comes from
> tcp_rcv_state_process(), make it drop reasons by the way.
>
> The new drop reason SKB_DROP_REASON_TCP_LINGER is added for skb dropping
> out of TCP linger.
>
> Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> Reviewed-by: Hao Peng <flyingpeng@tencent.com>
> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> v3:
> - instead SKB_DROP_REASON_TCP_ABORTONDATA with SKB_DROP_REASON_TCP_LINGER
> ---
>  include/net/dropreason.h |  6 ++++++
>  include/net/tcp.h        |  8 +++++---
>  net/ipv4/tcp_input.c     | 36 ++++++++++++++++++++----------------
>  net/ipv4/tcp_ipv4.c      | 20 +++++++++++++-------
>  net/ipv4/tcp_minisocks.c | 11 ++++++-----
>  net/ipv6/tcp_ipv6.c      | 19 ++++++++++++-------
>  6 files changed, 62 insertions(+), 38 deletions(-)
>

I am sorry, this patch is too invasive, and will make future bug fix
backports a real nightmare.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v3 7/9] net: tcp: add skb drop reasons to tcp tw code path
  2022-06-10  3:42 ` [PATCH net-next v3 7/9] net: tcp: add skb drop reasons to tcp tw code path menglong8.dong
@ 2022-06-10  9:04   ` Eric Dumazet
  2022-06-13  2:41     ` Menglong Dong
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2022-06-10  9:04 UTC (permalink / raw)
  To: Menglong Dong
  Cc: Steven Rostedt, Ingo Molnar, David Miller, Hideaki YOSHIFUJI,
	David Ahern, Jakub Kicinski, Paolo Abeni, Menglong Dong,
	Martin KaFai Lau, Talal Ahmad, Kees Cook, Dongli Zhang, LKML,
	netdev, Jiang Biao, Hao Peng

On Thu, Jun 9, 2022 at 8:45 PM <menglong8.dong@gmail.com> wrote:
>
> From: Menglong Dong <imagedong@tencent.com>
>
> In order to get the reasons of skb drops, add a function argument of
> type 'enum skb_drop_reason *reason' to tcp_timewait_state_process().
>
> In the origin code, all packets to time-wait socket are treated as
> dropping with kfree_skb(), which can make users confused. Therefore,
> we use consume_skb() for the skbs that are 'good'. We can check the
> value of 'reason' to decide use kfree_skb() or consume_skb().
>
> The new reason 'TIMEWAIT' is added for the case that the skb is dropped
> as the socket in time-wait state.
>
> Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> Reviewed-by: Hao Peng <flyingpeng@tencent.com>
> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> Reported-by: Eric Dumazet <edumazet@google.com>

I have suggested that consume_skb() could be an alias of
kfree_skb_reason(skb, SKB_NOT_DROPPED),
or something like that.

In order to avoid silly constructs all over the places :

if (reason)
   kfree_skb_reason(skb, reason);
else
  consume_skb(skb);

->

kfree_skb_reason(skb, reason);

> ---
> v2:
> - skb is not freed on TCP_TW_ACK and 'ret' is not initizalized, fix
>   it (Eric Dumazet)
> ---
>  include/net/dropreason.h |  6 ++++++
>  include/net/tcp.h        |  7 ++++---
>  net/ipv4/tcp_ipv4.c      |  9 ++++++++-
>  net/ipv4/tcp_minisocks.c | 24 ++++++++++++++++++++----
>  net/ipv6/tcp_ipv6.c      |  8 +++++++-
>  5 files changed, 45 insertions(+), 9 deletions(-)
>
> diff --git a/include/net/dropreason.h b/include/net/dropreason.h
> index 86e89d3022b5..3c6f8e0f7f16 100644
> --- a/include/net/dropreason.h
> +++ b/include/net/dropreason.h
> @@ -258,6 +258,12 @@ enum skb_drop_reason {
>          * socket is full, corresponding to LINUX_MIB_TCPREQQFULLDROP
>          */
>         SKB_DROP_REASON_TCP_REQQFULLDROP,
> +       /**
> +        * @SKB_DROP_REASON_TIMEWAIT: socket is in time-wait state and all
> +        * packet that received will be treated as 'drop', except a good
> +        * 'SYN' packet
> +        */
> +       SKB_DROP_REASON_TIMEWAIT,
>         /**
>          * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
>          * used as a real 'reason'
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 082dd0627e2e..88217b8d95ac 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -380,9 +380,10 @@ enum tcp_tw_status {
>  };
>
>
> -enum tcp_tw_status tcp_timewait_state_process(struct inet_timewait_sock *tw,
> -                                             struct sk_buff *skb,
> -                                             const struct tcphdr *th);
> +enum tcp_tw_status
> +tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
> +                          const struct tcphdr *th,
> +                          enum skb_drop_reason *reason);
>  struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
>                            struct request_sock *req, bool fastopen,
>                            bool *lost_race);
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 804fc5e0203e..e7bd2f410a4a 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -2134,7 +2134,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
>                 inet_twsk_put(inet_twsk(sk));
>                 goto csum_error;
>         }
> -       switch (tcp_timewait_state_process(inet_twsk(sk), skb, th)) {
> +       switch (tcp_timewait_state_process(inet_twsk(sk), skb, th,
> +                                          &drop_reason)) {
>         case TCP_TW_SYN: {
>                 struct sock *sk2 = inet_lookup_listener(dev_net(skb->dev),
>                                                         &tcp_hashinfo, skb,
> @@ -2150,11 +2151,17 @@ int tcp_v4_rcv(struct sk_buff *skb)
>                         refcounted = false;
>                         goto process;
>                 }
> +               /* TCP_FLAGS or NO_SOCKET? */
> +               SKB_DR_SET(drop_reason, TCP_FLAGS);
>         }
>                 /* to ACK */
>                 fallthrough;
>         case TCP_TW_ACK:
>                 tcp_v4_timewait_ack(sk, skb);
> +               if (!drop_reason) {
> +                       consume_skb(skb);
> +                       return 0;
> +               }

Sorry, this is the kind of change which makes the whole exercise
source of numerous problems later
when backports are needed.

You are sending patches today, but we are not sure you will be willing
to spend days of work and tests to validate
future backports with conflicts.

>                 break;
>         case TCP_TW_RST:
>                 tcp_v4_send_reset(sk, skb);
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index 1a21018f6f64..329724118b7f 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -83,13 +83,15 @@ tcp_timewait_check_oow_rate_limit(struct inet_timewait_sock *tw,
>   */
>  enum tcp_tw_status
>  tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
> -                          const struct tcphdr *th)
> +                          const struct tcphdr *th,
> +                          enum skb_drop_reason *reason)
>  {
>         struct tcp_options_received tmp_opt;
>         struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw);
>         bool paws_reject = false;
>
>         tmp_opt.saw_tstamp = 0;
> +       *reason = SKB_DROP_REASON_NOT_SPECIFIED;
>         if (th->doff > (sizeof(*th) >> 2) && tcptw->tw_ts_recent_stamp) {
>                 tcp_parse_options(twsk_net(tw), skb, &tmp_opt, 0, NULL);
>
> @@ -113,11 +115,16 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
>                         return tcp_timewait_check_oow_rate_limit(
>                                 tw, skb, LINUX_MIB_TCPACKSKIPPEDFINWAIT2);
>
> -               if (th->rst)
> +               if (th->rst) {
> +                       SKB_DR_SET(*reason, TCP_RESET);
>                         goto kill;
> +               }
>
> -               if (th->syn && !before(TCP_SKB_CB(skb)->seq, tcptw->tw_rcv_nxt))
> +               if (th->syn && !before(TCP_SKB_CB(skb)->seq,
> +                                      tcptw->tw_rcv_nxt)) {
> +                       SKB_DR_SET(*reason, TCP_FLAGS);
>                         return TCP_TW_RST;
> +               }
>
>                 /* Dup ACK? */
>                 if (!th->ack ||
> @@ -143,6 +150,9 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
>                 }
>
>                 inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN);
> +
> +               /* skb should be free normally on this case. */
> +               *reason = SKB_NOT_DROPPED_YET;
>                 return TCP_TW_ACK;
>         }
>
> @@ -174,6 +184,7 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
>                          * protocol bug yet.
>                          */
>                         if (twsk_net(tw)->ipv4.sysctl_tcp_rfc1337 == 0) {
> +                               SKB_DR_SET(*reason, TCP_RESET);
>  kill:
>                                 inet_twsk_deschedule_put(tw);
>                                 return TCP_TW_SUCCESS;
> @@ -216,11 +227,14 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
>                 if (isn == 0)
>                         isn++;
>                 TCP_SKB_CB(skb)->tcp_tw_isn = isn;
> +               *reason = SKB_NOT_DROPPED_YET;
>                 return TCP_TW_SYN;
>         }
>
> -       if (paws_reject)
> +       if (paws_reject) {
> +               SKB_DR_SET(*reason, TCP_RFC7323_PAWS);
>                 __NET_INC_STATS(twsk_net(tw), LINUX_MIB_PAWSESTABREJECTED);
> +       }
>
>         if (!th->rst) {
>                 /* In this case we must reset the TIMEWAIT timer.
> @@ -232,9 +246,11 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
>                 if (paws_reject || th->ack)
>                         inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN);
>
> +               SKB_DR_OR(*reason, TIMEWAIT);
>                 return tcp_timewait_check_oow_rate_limit(
>                         tw, skb, LINUX_MIB_TCPACKSKIPPEDTIMEWAIT);
>         }
> +       SKB_DR_SET(*reason, TCP_RESET);
>         inet_twsk_put(tw);
>         return TCP_TW_SUCCESS;
>  }
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 0d2ba9d90529..41551a3b679b 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -1795,7 +1795,8 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
>                 goto csum_error;
>         }
>
> -       switch (tcp_timewait_state_process(inet_twsk(sk), skb, th)) {
> +       switch (tcp_timewait_state_process(inet_twsk(sk), skb, th,
> +                                          &drop_reason)) {
>         case TCP_TW_SYN:
>         {
>                 struct sock *sk2;
> @@ -1815,11 +1816,16 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
>                         refcounted = false;
>                         goto process;
>                 }
> +               SKB_DR_SET(drop_reason, TCP_FLAGS);
>         }
>                 /* to ACK */
>                 fallthrough;
>         case TCP_TW_ACK:
>                 tcp_v6_timewait_ack(sk, skb);
> +               if (!drop_reason) {
> +                       consume_skb(skb);
> +                       return 0;
> +               }
>                 break;
>         case TCP_TW_RST:
>                 tcp_v6_send_reset(sk, skb);
> --
> 2.36.1
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v3 7/9] net: tcp: add skb drop reasons to tcp tw code path
  2022-06-10  9:04   ` Eric Dumazet
@ 2022-06-13  2:41     ` Menglong Dong
  0 siblings, 0 replies; 14+ messages in thread
From: Menglong Dong @ 2022-06-13  2:41 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Steven Rostedt, Ingo Molnar, David Miller, Hideaki YOSHIFUJI,
	David Ahern, Jakub Kicinski, Paolo Abeni, Menglong Dong,
	Martin KaFai Lau, Talal Ahmad, Kees Cook, Dongli Zhang, LKML,
	netdev, Jiang Biao, Hao Peng

On Fri, Jun 10, 2022 at 5:04 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Jun 9, 2022 at 8:45 PM <menglong8.dong@gmail.com> wrote:
> >
> > From: Menglong Dong <imagedong@tencent.com>
> >
> > In order to get the reasons of skb drops, add a function argument of
> > type 'enum skb_drop_reason *reason' to tcp_timewait_state_process().
> >
> > In the origin code, all packets to time-wait socket are treated as
> > dropping with kfree_skb(), which can make users confused. Therefore,
> > we use consume_skb() for the skbs that are 'good'. We can check the
> > value of 'reason' to decide use kfree_skb() or consume_skb().
> >
> > The new reason 'TIMEWAIT' is added for the case that the skb is dropped
> > as the socket in time-wait state.
> >
> > Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> > Reviewed-by: Hao Peng <flyingpeng@tencent.com>
> > Signed-off-by: Menglong Dong <imagedong@tencent.com>
> > Reported-by: Eric Dumazet <edumazet@google.com>
>
> I have suggested that consume_skb() could be an alias of
> kfree_skb_reason(skb, SKB_NOT_DROPPED),
> or something like that.
>
> In order to avoid silly constructs all over the places :
>
> if (reason)
>    kfree_skb_reason(skb, reason);
> else
>   consume_skb(skb);
>
> ->
>
> kfree_skb_reason(skb, reason);
>

This seems to be a good idea in some cases. However, I
am a little worried about that SKB_NOT_DROPPED can
be passed to kfree_skb_reason() by mistake, which happened
before. (Maybe I'm just overly worried, as this can be avoided
during we coding)

> > ---
> > v2:
> > - skb is not freed on TCP_TW_ACK and 'ret' is not initizalized, fix
> >   it (Eric Dumazet)
> > ---
> >  include/net/dropreason.h |  6 ++++++
> >  include/net/tcp.h        |  7 ++++---
> >  net/ipv4/tcp_ipv4.c      |  9 ++++++++-
> >  net/ipv4/tcp_minisocks.c | 24 ++++++++++++++++++++----
> >  net/ipv6/tcp_ipv6.c      |  8 +++++++-
> >  5 files changed, 45 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/net/dropreason.h b/include/net/dropreason.h
> > index 86e89d3022b5..3c6f8e0f7f16 100644
> > --- a/include/net/dropreason.h
> > +++ b/include/net/dropreason.h
> > @@ -258,6 +258,12 @@ enum skb_drop_reason {
> >          * socket is full, corresponding to LINUX_MIB_TCPREQQFULLDROP
> >          */
> >         SKB_DROP_REASON_TCP_REQQFULLDROP,
> > +       /**
> > +        * @SKB_DROP_REASON_TIMEWAIT: socket is in time-wait state and all
> > +        * packet that received will be treated as 'drop', except a good
> > +        * 'SYN' packet
> > +        */
> > +       SKB_DROP_REASON_TIMEWAIT,
> >         /**
> >          * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
> >          * used as a real 'reason'
> > diff --git a/include/net/tcp.h b/include/net/tcp.h
> > index 082dd0627e2e..88217b8d95ac 100644
> > --- a/include/net/tcp.h
> > +++ b/include/net/tcp.h
> > @@ -380,9 +380,10 @@ enum tcp_tw_status {
> >  };
> >
> >
> > -enum tcp_tw_status tcp_timewait_state_process(struct inet_timewait_sock *tw,
> > -                                             struct sk_buff *skb,
> > -                                             const struct tcphdr *th);
> > +enum tcp_tw_status
> > +tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
> > +                          const struct tcphdr *th,
> > +                          enum skb_drop_reason *reason);
> >  struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
> >                            struct request_sock *req, bool fastopen,
> >                            bool *lost_race);
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 804fc5e0203e..e7bd2f410a4a 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -2134,7 +2134,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
> >                 inet_twsk_put(inet_twsk(sk));
> >                 goto csum_error;
> >         }
> > -       switch (tcp_timewait_state_process(inet_twsk(sk), skb, th)) {
> > +       switch (tcp_timewait_state_process(inet_twsk(sk), skb, th,
> > +                                          &drop_reason)) {
> >         case TCP_TW_SYN: {
> >                 struct sock *sk2 = inet_lookup_listener(dev_net(skb->dev),
> >                                                         &tcp_hashinfo, skb,
> > @@ -2150,11 +2151,17 @@ int tcp_v4_rcv(struct sk_buff *skb)
> >                         refcounted = false;
> >                         goto process;
> >                 }
> > +               /* TCP_FLAGS or NO_SOCKET? */
> > +               SKB_DR_SET(drop_reason, TCP_FLAGS);
> >         }
> >                 /* to ACK */
> >                 fallthrough;
> >         case TCP_TW_ACK:
> >                 tcp_v4_timewait_ack(sk, skb);
> > +               if (!drop_reason) {
> > +                       consume_skb(skb);
> > +                       return 0;
> > +               }
>
> Sorry, this is the kind of change which makes the whole exercise
> source of numerous problems later
> when backports are needed.
>
> You are sending patches today, but we are not sure you will be willing
> to spend days of work and tests to validate
> future backports with conflicts.

With making kfree_skb_reason(skb, SKB_NOT_DROPPED) consume_skb(skb),
it seems that the code here doesn't need to be changed anymore.


>
> >                 break;
> >         case TCP_TW_RST:
> >                 tcp_v4_send_reset(sk, skb);
> > diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> > index 1a21018f6f64..329724118b7f 100644
> > --- a/net/ipv4/tcp_minisocks.c
> > +++ b/net/ipv4/tcp_minisocks.c
> > @@ -83,13 +83,15 @@ tcp_timewait_check_oow_rate_limit(struct inet_timewait_sock *tw,
> >   */
> >  enum tcp_tw_status
> >  tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
> > -                          const struct tcphdr *th)
> > +                          const struct tcphdr *th,
> > +                          enum skb_drop_reason *reason)
> >  {
> >         struct tcp_options_received tmp_opt;
> >         struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw);
> >         bool paws_reject = false;
> >
> >         tmp_opt.saw_tstamp = 0;
> > +       *reason = SKB_DROP_REASON_NOT_SPECIFIED;
> >         if (th->doff > (sizeof(*th) >> 2) && tcptw->tw_ts_recent_stamp) {
> >                 tcp_parse_options(twsk_net(tw), skb, &tmp_opt, 0, NULL);
> >
> > @@ -113,11 +115,16 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
> >                         return tcp_timewait_check_oow_rate_limit(
> >                                 tw, skb, LINUX_MIB_TCPACKSKIPPEDFINWAIT2);
> >
> > -               if (th->rst)
> > +               if (th->rst) {
> > +                       SKB_DR_SET(*reason, TCP_RESET);
> >                         goto kill;
> > +               }
> >
> > -               if (th->syn && !before(TCP_SKB_CB(skb)->seq, tcptw->tw_rcv_nxt))
> > +               if (th->syn && !before(TCP_SKB_CB(skb)->seq,
> > +                                      tcptw->tw_rcv_nxt)) {
> > +                       SKB_DR_SET(*reason, TCP_FLAGS);
> >                         return TCP_TW_RST;
> > +               }
> >
> >                 /* Dup ACK? */
> >                 if (!th->ack ||
> > @@ -143,6 +150,9 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
> >                 }
> >
> >                 inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN);
> > +
> > +               /* skb should be free normally on this case. */
> > +               *reason = SKB_NOT_DROPPED_YET;
> >                 return TCP_TW_ACK;
> >         }
> >
> > @@ -174,6 +184,7 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
> >                          * protocol bug yet.
> >                          */
> >                         if (twsk_net(tw)->ipv4.sysctl_tcp_rfc1337 == 0) {
> > +                               SKB_DR_SET(*reason, TCP_RESET);
> >  kill:
> >                                 inet_twsk_deschedule_put(tw);
> >                                 return TCP_TW_SUCCESS;
> > @@ -216,11 +227,14 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
> >                 if (isn == 0)
> >                         isn++;
> >                 TCP_SKB_CB(skb)->tcp_tw_isn = isn;
> > +               *reason = SKB_NOT_DROPPED_YET;
> >                 return TCP_TW_SYN;
> >         }
> >
> > -       if (paws_reject)
> > +       if (paws_reject) {
> > +               SKB_DR_SET(*reason, TCP_RFC7323_PAWS);
> >                 __NET_INC_STATS(twsk_net(tw), LINUX_MIB_PAWSESTABREJECTED);
> > +       }
> >
> >         if (!th->rst) {
> >                 /* In this case we must reset the TIMEWAIT timer.
> > @@ -232,9 +246,11 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
> >                 if (paws_reject || th->ack)
> >                         inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN);
> >
> > +               SKB_DR_OR(*reason, TIMEWAIT);
> >                 return tcp_timewait_check_oow_rate_limit(
> >                         tw, skb, LINUX_MIB_TCPACKSKIPPEDTIMEWAIT);
> >         }
> > +       SKB_DR_SET(*reason, TCP_RESET);
> >         inet_twsk_put(tw);
> >         return TCP_TW_SUCCESS;
> >  }
> > diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> > index 0d2ba9d90529..41551a3b679b 100644
> > --- a/net/ipv6/tcp_ipv6.c
> > +++ b/net/ipv6/tcp_ipv6.c
> > @@ -1795,7 +1795,8 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
> >                 goto csum_error;
> >         }
> >
> > -       switch (tcp_timewait_state_process(inet_twsk(sk), skb, th)) {
> > +       switch (tcp_timewait_state_process(inet_twsk(sk), skb, th,
> > +                                          &drop_reason)) {
> >         case TCP_TW_SYN:
> >         {
> >                 struct sock *sk2;
> > @@ -1815,11 +1816,16 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
> >                         refcounted = false;
> >                         goto process;
> >                 }
> > +               SKB_DR_SET(drop_reason, TCP_FLAGS);
> >         }
> >                 /* to ACK */
> >                 fallthrough;
> >         case TCP_TW_ACK:
> >                 tcp_v6_timewait_ack(sk, skb);
> > +               if (!drop_reason) {
> > +                       consume_skb(skb);
> > +                       return 0;
> > +               }
> >                 break;
> >         case TCP_TW_RST:
> >                 tcp_v6_send_reset(sk, skb);
> > --
> > 2.36.1
> >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v3 5/9] net: tcp: make tcp_rcv_state_process() return drop reason
  2022-06-10  8:56   ` Eric Dumazet
@ 2022-06-13  2:43     ` Menglong Dong
  0 siblings, 0 replies; 14+ messages in thread
From: Menglong Dong @ 2022-06-13  2:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Steven Rostedt, Ingo Molnar, David Miller, Hideaki YOSHIFUJI,
	David Ahern, Jakub Kicinski, Paolo Abeni, Menglong Dong,
	Martin KaFai Lau, Talal Ahmad, Kees Cook, Dongli Zhang, LKML,
	netdev, Jiang Biao, Hao Peng

On Fri, Jun 10, 2022 at 4:56 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Jun 9, 2022 at 8:45 PM <menglong8.dong@gmail.com> wrote:
> >
> > From: Menglong Dong <imagedong@tencent.com>
> >
> > For now, the return value of tcp_rcv_state_process() is treated as bool.
> > Therefore, we can make it return the reasons of the skb drops.
> >
> > Meanwhile, the return value of tcp_child_process() comes from
> > tcp_rcv_state_process(), make it drop reasons by the way.
> >
> > The new drop reason SKB_DROP_REASON_TCP_LINGER is added for skb dropping
> > out of TCP linger.
> >
> > Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> > Reviewed-by: Hao Peng <flyingpeng@tencent.com>
> > Signed-off-by: Menglong Dong <imagedong@tencent.com>
> > v3:
> > - instead SKB_DROP_REASON_TCP_ABORTONDATA with SKB_DROP_REASON_TCP_LINGER
> > ---
> >  include/net/dropreason.h |  6 ++++++
> >  include/net/tcp.h        |  8 +++++---
> >  net/ipv4/tcp_input.c     | 36 ++++++++++++++++++++----------------
> >  net/ipv4/tcp_ipv4.c      | 20 +++++++++++++-------
> >  net/ipv4/tcp_minisocks.c | 11 ++++++-----
> >  net/ipv6/tcp_ipv6.c      | 19 ++++++++++++-------
> >  6 files changed, 62 insertions(+), 38 deletions(-)
> >
>
> I am sorry, this patch is too invasive, and will make future bug fix
> backports a real nightmare.

Is there any advice to save this patch? Or should we just skip this
part (for now) ?

Thanks!
Menglong Dong

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-06-13  2:43 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-10  3:41 [PATCH net-next v3 0/9] net: tcp: add skb drop reasons to tcp state change menglong8.dong
2022-06-10  3:41 ` [PATCH net-next v3 1/9] net: skb: introduce __skb_queue_purge_reason() menglong8.dong
2022-06-10  3:41 ` [PATCH net-next v3 2/9] net: sock: introduce sk_stream_kill_queues_reason() menglong8.dong
2022-06-10  3:41 ` [PATCH net-next v3 3/9] net: inet: add skb drop reason to inet_csk_destroy_sock() menglong8.dong
2022-06-10  3:41 ` [PATCH net-next v3 4/9] net: tcp: make tcp_rcv_synsent_state_process() return drop reasons menglong8.dong
2022-06-10  3:42 ` [PATCH net-next v3 5/9] net: tcp: make tcp_rcv_state_process() return drop reason menglong8.dong
2022-06-10  8:56   ` Eric Dumazet
2022-06-13  2:43     ` Menglong Dong
2022-06-10  3:42 ` [PATCH net-next v3 6/9] net: tcp: add skb drop reasons to tcp connect requesting menglong8.dong
2022-06-10  3:42 ` [PATCH net-next v3 7/9] net: tcp: add skb drop reasons to tcp tw code path menglong8.dong
2022-06-10  9:04   ` Eric Dumazet
2022-06-13  2:41     ` Menglong Dong
2022-06-10  3:42 ` [PATCH net-next v3 8/9] net: tcp: add skb drop reasons to route_req() menglong8.dong
2022-06-10  3:42 ` [PATCH net-next v3 9/9] net: tcp: use LINUX_MIB_TCPABORTONLINGER in tcp_rcv_state_process() menglong8.dong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).