mptcp.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
@ 2024-04-17  8:51 Jason Xing
  2024-04-17  8:51 ` [PATCH net-next v6 1/7] net: introduce rstreason to detect why the RST is sent Jason Xing
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-17  8:51 UTC (permalink / raw)
  To: edumazet, dsahern, matttbe, martineau, geliang, kuba, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart
  Cc: mptcp, netdev, linux-trace-kernel, kerneljasonxing, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

In production, there are so many cases about why the RST skb is sent but
we don't have a very convenient/fast method to detect the exact underlying
reasons.

RST is implemented in two kinds: passive kind (like tcp_v4_send_reset())
and active kind (like tcp_send_active_reset()). The former can be traced
carefully 1) in TCP, with the help of drop reasons, which is based on
Eric's idea[1], 2) in MPTCP, with the help of reset options defined in
RFC 8684. The latter is relatively independent, which should be
implemented on our own.

In this series, I focus on the fundamental implement mostly about how
the rstreason mechnism works and give the detailed passive part as an
example, not including the active reset part. In future, we can go
further and refine those NOT_SPECIFIED reasons.

Here are some examples when tracing:
<idle>-0       [002] ..s1.  1830.262425: tcp_send_reset: skbaddr=x
        skaddr=x src=x dest=x state=x reason=NOT_SPECIFIED
<idle>-0       [002] ..s1.  1830.262425: tcp_send_reset: skbaddr=x
        skaddr=x src=x dest=x state=x reason=NO_SOCKET

[1]
Link: https://lore.kernel.org/all/CANn89iJw8x-LqgsWOeJQQvgVg6DnL5aBRLi10QN2WBdr+X4k=w@mail.gmail.com/

v6
1. add back casts, or else they are treated as error.

v5
Link: https://lore.kernel.org/all/20240411115630.38420-1-kerneljasonxing@gmail.com/
1. address format issue (like reverse xmas tree) (Eric, Paolo)
2. remove unnecessary casts. (Eric)
3. introduce a helper used in mptcp active reset. See patch 6. (Paolo)

v4
Link: https://lore.kernel.org/all/20240409100934.37725-1-kerneljasonxing@gmail.com/
1. passing 'enum sk_rst_reason' for readability when tracing (Antoine)

v3
Link: https://lore.kernel.org/all/20240404072047.11490-1-kerneljasonxing@gmail.com/
1. rebase (mptcp part) and address what Mat suggested.

v2
Link: https://lore.kernel.org/all/20240403185033.47ebc6a9@kernel.org/
1. rebase against the latest net-next tree



Jason Xing (7):
  net: introduce rstreason to detect why the RST is sent
  rstreason: prepare for passive reset
  rstreason: prepare for active reset
  tcp: support rstreason for passive reset
  mptcp: support rstreason for passive reset
  mptcp: introducing a helper into active reset logic
  rstreason: make it work in trace world

 include/net/request_sock.h |  4 +-
 include/net/rstreason.h    | 93 ++++++++++++++++++++++++++++++++++++++
 include/net/tcp.h          |  3 +-
 include/trace/events/tcp.h | 37 +++++++++++++--
 net/dccp/ipv4.c            | 10 ++--
 net/dccp/ipv6.c            | 10 ++--
 net/dccp/minisocks.c       |  3 +-
 net/ipv4/tcp.c             | 15 ++++--
 net/ipv4/tcp_ipv4.c        | 14 +++---
 net/ipv4/tcp_minisocks.c   |  3 +-
 net/ipv4/tcp_output.c      |  5 +-
 net/ipv4/tcp_timer.c       |  9 ++--
 net/ipv6/tcp_ipv6.c        | 17 ++++---
 net/mptcp/protocol.c       |  2 +-
 net/mptcp/protocol.h       | 11 +++++
 net/mptcp/subflow.c        | 27 ++++++++---
 16 files changed, 216 insertions(+), 47 deletions(-)
 create mode 100644 include/net/rstreason.h

-- 
2.37.3


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH net-next v6 1/7] net: introduce rstreason to detect why the RST is sent
  2024-04-17  8:51 [PATCH net-next v6 0/7] Implement reset reason mechanism to detect Jason Xing
@ 2024-04-17  8:51 ` Jason Xing
  2024-04-17  9:01   ` Eric Dumazet
  2024-04-17  8:51 ` [PATCH net-next v6 2/7] rstreason: prepare for passive reset Jason Xing
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Jason Xing @ 2024-04-17  8:51 UTC (permalink / raw)
  To: edumazet, dsahern, matttbe, martineau, geliang, kuba, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart
  Cc: mptcp, netdev, linux-trace-kernel, kerneljasonxing, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

Add a new standalone file for the easy future extension to support
both active reset and passive reset in the TCP/DCCP/MPTCP protocols.

This patch only does the preparations for reset reason mechanism,
nothing else changes.

The reset reasons are divided into three parts:
1) reuse drop reasons for passive reset in TCP
2) reuse MP_TCPRST option for MPTCP
3) our own reasons

I will implement the basic codes of active/passive reset reason in
those three protocols, which is not complete for this moment. But
it provides a new chance to let other people add more reasons into
it:)

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 include/net/rstreason.h | 93 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)
 create mode 100644 include/net/rstreason.h

diff --git a/include/net/rstreason.h b/include/net/rstreason.h
new file mode 100644
index 000000000000..0c3fa55fa62f
--- /dev/null
+++ b/include/net/rstreason.h
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#ifndef _LINUX_RSTREASON_H
+#define _LINUX_RSTREASON_H
+#include <net/dropreason-core.h>
+
+#define DEFINE_RST_REASON(FN, FNe)	\
+	FN(MPTCP_RST_EUNSPEC)		\
+	FN(MPTCP_RST_EMPTCP)		\
+	FN(MPTCP_RST_ERESOURCE)		\
+	FN(MPTCP_RST_EPROHIBIT)		\
+	FN(MPTCP_RST_EWQ2BIG)		\
+	FN(MPTCP_RST_EBADPERF)		\
+	FN(MPTCP_RST_EMIDDLEBOX)	\
+	FN(NOT_SPECIFIED)		\
+	FNe(MAX)
+
+#define RST_REASON_START (SKB_DROP_REASON_MAX + 1)
+
+/* There are three parts in order:
+ * 1) 0 - SKB_DROP_REASON_MAX: rely on drop reasons for passive reset in TCP
+ * 2) SKB_DROP_REASON_MAX + 1 - MPTCP_RST_EMIDDLEBOX: for MPTCP use
+ * 3) MPTCP_RST_EMIDDLEBOX - SK_RST_REASON_MAX: independent reset reason
+ */
+enum sk_rst_reason {
+	/* Leave this 'blank' part (0-SKB_DROP_REASON_MAX) for the reuse
+	 * of skb drop reason because rst reason relies on what drop reason
+	 * indicates exactly why it could happen.
+	 */
+
+	/* Copy from include/uapi/linux/mptcp.h.
+	 * These reset fields will not be changed since they adhere to
+	 * RFC 8684. So do not touch them. I'm going to list each definition
+	 * of them respectively.
+	 */
+	/* Unspecified error.
+	 * This is the default error; it implies that the subflow is no
+	 * longer available. The presence of this option shows that the
+	 * RST was generated by an MPTCP-aware device.
+	 */
+	SK_RST_REASON_MPTCP_RST_EUNSPEC = RST_REASON_START,
+	/* MPTCP-specific error.
+	 * An error has been detected in the processing of MPTCP options.
+	 * This is the usual reason code to return in the cases where a RST
+	 * is being sent to close a subflow because of an invalid response.
+	 */
+	SK_RST_REASON_MPTCP_RST_EMPTCP,
+	/* Lack of resources.
+	 * This code indicates that the sending host does not have enough
+	 * resources to support the terminated subflow.
+	 */
+	SK_RST_REASON_MPTCP_RST_ERESOURCE,
+	/* Administratively prohibited.
+	 * This code indicates that the requested subflow is prohibited by
+	 * the policies of the sending host.
+	 */
+	SK_RST_REASON_MPTCP_RST_EPROHIBIT,
+	/* Too much outstanding data.
+	 * This code indicates that there is an excessive amount of data
+	 * that needs to be transmitted over the terminated subflow while
+	 * having already been acknowledged over one or more other subflows.
+	 * This may occur if a path has been unavailable for a short period
+	 * and it is more efficient to reset and start again than it is to
+	 * retransmit the queued data.
+	 */
+	SK_RST_REASON_MPTCP_RST_EWQ2BIG,
+	/* Unacceptable performance.
+	 * This code indicates that the performance of this subflow was
+	 * too low compared to the other subflows of this Multipath TCP
+	 * connection.
+	 */
+	SK_RST_REASON_MPTCP_RST_EBADPERF,
+	/* Middlebox interference.
+	 * Middlebox interference has been detected over this subflow,
+	 * making MPTCP signaling invalid. For example, this may be sent
+	 * if the checksum does not validate.
+	 */
+	SK_RST_REASON_MPTCP_RST_EMIDDLEBOX,
+
+	/* For the real standalone socket reset reason, we start from here */
+	SK_RST_REASON_NOT_SPECIFIED,
+
+	/* Maximum of socket reset reasons.
+	 * It shouldn't be used as a real 'reason'.
+	 */
+	SK_RST_REASON_MAX,
+};
+
+static inline enum sk_rst_reason convert_mptcp_reason(u32 reason)
+{
+	return reason += RST_REASON_START;
+}
+#endif
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v6 2/7] rstreason: prepare for passive reset
  2024-04-17  8:51 [PATCH net-next v6 0/7] Implement reset reason mechanism to detect Jason Xing
  2024-04-17  8:51 ` [PATCH net-next v6 1/7] net: introduce rstreason to detect why the RST is sent Jason Xing
@ 2024-04-17  8:51 ` Jason Xing
  2024-04-17  8:51 ` [PATCH net-next v6 3/7] rstreason: prepare for active reset Jason Xing
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-17  8:51 UTC (permalink / raw)
  To: edumazet, dsahern, matttbe, martineau, geliang, kuba, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart
  Cc: mptcp, netdev, linux-trace-kernel, kerneljasonxing, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

Adjust the parameter and support passing reason of reset which
is for now NOT_SPECIFIED. No functional changes.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 include/net/request_sock.h |  4 +++-
 net/dccp/ipv4.c            | 10 ++++++----
 net/dccp/ipv6.c            | 10 ++++++----
 net/dccp/minisocks.c       |  3 ++-
 net/ipv4/tcp_ipv4.c        | 12 +++++++-----
 net/ipv4/tcp_minisocks.c   |  3 ++-
 net/ipv6/tcp_ipv6.c        | 15 +++++++++------
 net/mptcp/subflow.c        |  8 +++++---
 8 files changed, 40 insertions(+), 25 deletions(-)

diff --git a/include/net/request_sock.h b/include/net/request_sock.h
index 004e651e6067..bdc737832da6 100644
--- a/include/net/request_sock.h
+++ b/include/net/request_sock.h
@@ -18,6 +18,7 @@
 #include <linux/refcount.h>
 
 #include <net/sock.h>
+#include <net/rstreason.h>
 
 struct request_sock;
 struct sk_buff;
@@ -34,7 +35,8 @@ struct request_sock_ops {
 	void		(*send_ack)(const struct sock *sk, struct sk_buff *skb,
 				    struct request_sock *req);
 	void		(*send_reset)(const struct sock *sk,
-				      struct sk_buff *skb);
+				      struct sk_buff *skb,
+				      enum sk_rst_reason reason);
 	void		(*destructor)(struct request_sock *req);
 	void		(*syn_ack_timeout)(const struct request_sock *req);
 };
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 9fc9cea4c251..ff41bd6f99c3 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -24,6 +24,7 @@
 #include <net/xfrm.h>
 #include <net/secure_seq.h>
 #include <net/netns/generic.h>
+#include <net/rstreason.h>
 
 #include "ackvec.h"
 #include "ccid.h"
@@ -521,7 +522,8 @@ static int dccp_v4_send_response(const struct sock *sk, struct request_sock *req
 	return err;
 }
 
-static void dccp_v4_ctl_send_reset(const struct sock *sk, struct sk_buff *rxskb)
+static void dccp_v4_ctl_send_reset(const struct sock *sk, struct sk_buff *rxskb,
+				   enum sk_rst_reason reason)
 {
 	int err;
 	const struct iphdr *rxiph;
@@ -706,7 +708,7 @@ int dccp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 	return 0;
 
 reset:
-	dccp_v4_ctl_send_reset(sk, skb);
+	dccp_v4_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 	kfree_skb(skb);
 	return 0;
 }
@@ -869,7 +871,7 @@ static int dccp_v4_rcv(struct sk_buff *skb)
 		if (nsk == sk) {
 			reqsk_put(req);
 		} else if (dccp_child_process(sk, nsk, skb)) {
-			dccp_v4_ctl_send_reset(sk, skb);
+			dccp_v4_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 			goto discard_and_relse;
 		} else {
 			sock_put(sk);
@@ -909,7 +911,7 @@ static int dccp_v4_rcv(struct sk_buff *skb)
 	if (dh->dccph_type != DCCP_PKT_RESET) {
 		DCCP_SKB_CB(skb)->dccpd_reset_code =
 					DCCP_RESET_CODE_NO_CONNECTION;
-		dccp_v4_ctl_send_reset(sk, skb);
+		dccp_v4_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 	}
 
 discard_it:
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index c8ca703dc331..85f4b8fdbe5e 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -29,6 +29,7 @@
 #include <net/secure_seq.h>
 #include <net/netns/generic.h>
 #include <net/sock.h>
+#include <net/rstreason.h>
 
 #include "dccp.h"
 #include "ipv6.h"
@@ -256,7 +257,8 @@ static void dccp_v6_reqsk_destructor(struct request_sock *req)
 	kfree_skb(inet_rsk(req)->pktopts);
 }
 
-static void dccp_v6_ctl_send_reset(const struct sock *sk, struct sk_buff *rxskb)
+static void dccp_v6_ctl_send_reset(const struct sock *sk, struct sk_buff *rxskb,
+				   enum sk_rst_reason reason)
 {
 	const struct ipv6hdr *rxip6h;
 	struct sk_buff *skb;
@@ -656,7 +658,7 @@ static int dccp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 	return 0;
 
 reset:
-	dccp_v6_ctl_send_reset(sk, skb);
+	dccp_v6_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 discard:
 	if (opt_skb != NULL)
 		__kfree_skb(opt_skb);
@@ -762,7 +764,7 @@ static int dccp_v6_rcv(struct sk_buff *skb)
 		if (nsk == sk) {
 			reqsk_put(req);
 		} else if (dccp_child_process(sk, nsk, skb)) {
-			dccp_v6_ctl_send_reset(sk, skb);
+			dccp_v6_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 			goto discard_and_relse;
 		} else {
 			sock_put(sk);
@@ -801,7 +803,7 @@ static int dccp_v6_rcv(struct sk_buff *skb)
 	if (dh->dccph_type != DCCP_PKT_RESET) {
 		DCCP_SKB_CB(skb)->dccpd_reset_code =
 					DCCP_RESET_CODE_NO_CONNECTION;
-		dccp_v6_ctl_send_reset(sk, skb);
+		dccp_v6_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 	}
 
 discard_it:
diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c
index 64d805b27add..251a57cf5822 100644
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -15,6 +15,7 @@
 #include <net/sock.h>
 #include <net/xfrm.h>
 #include <net/inet_timewait_sock.h>
+#include <net/rstreason.h>
 
 #include "ackvec.h"
 #include "ccid.h"
@@ -202,7 +203,7 @@ struct sock *dccp_check_req(struct sock *sk, struct sk_buff *skb,
 	DCCP_SKB_CB(skb)->dccpd_reset_code = DCCP_RESET_CODE_TOO_BUSY;
 drop:
 	if (dccp_hdr(skb)->dccph_type != DCCP_PKT_RESET)
-		req->rsk_ops->send_reset(sk, skb);
+		req->rsk_ops->send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 
 	inet_csk_reqsk_queue_drop(sk, req);
 out:
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 88c83ac42129..418d11902fa7 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -70,6 +70,7 @@
 #include <net/xfrm.h>
 #include <net/secure_seq.h>
 #include <net/busy_poll.h>
+#include <net/rstreason.h>
 
 #include <linux/inet.h>
 #include <linux/ipv6.h>
@@ -723,7 +724,8 @@ static bool tcp_v4_ao_sign_reset(const struct sock *sk, struct sk_buff *skb,
  *	Exception: precedence violation. We do not implement it in any case.
  */
 
-static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
+static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb,
+			      enum sk_rst_reason reason)
 {
 	const struct tcphdr *th = tcp_hdr(skb);
 	struct {
@@ -1934,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 	return 0;
 
 reset:
-	tcp_v4_send_reset(rsk, skb);
+	tcp_v4_send_reset(rsk, skb, SK_RST_REASON_NOT_SPECIFIED);
 discard:
 	kfree_skb_reason(skb, reason);
 	/* Be careful here. If this function gets more complicated and
@@ -2276,7 +2278,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		} else {
 			drop_reason = tcp_child_process(sk, nsk, skb);
 			if (drop_reason) {
-				tcp_v4_send_reset(nsk, skb);
+				tcp_v4_send_reset(nsk, skb, SK_RST_REASON_NOT_SPECIFIED);
 				goto discard_and_relse;
 			}
 			sock_put(sk);
@@ -2355,7 +2357,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 bad_packet:
 		__TCP_INC_STATS(net, TCP_MIB_INERRS);
 	} else {
-		tcp_v4_send_reset(NULL, skb);
+		tcp_v4_send_reset(NULL, skb, SK_RST_REASON_NOT_SPECIFIED);
 	}
 
 discard_it:
@@ -2407,7 +2409,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		tcp_v4_timewait_ack(sk, skb);
 		break;
 	case TCP_TW_RST:
-		tcp_v4_send_reset(sk, skb);
+		tcp_v4_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 		inet_twsk_deschedule_put(inet_twsk(sk));
 		goto discard_it;
 	case TCP_TW_SUCCESS:;
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index f53c7ada2ace..0bc19aca2759 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -22,6 +22,7 @@
 #include <net/tcp.h>
 #include <net/xfrm.h>
 #include <net/busy_poll.h>
+#include <net/rstreason.h>
 
 static bool tcp_in_window(u32 seq, u32 end_seq, u32 s_win, u32 e_win)
 {
@@ -879,7 +880,7 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 		 * avoid becoming vulnerable to outside attack aiming at
 		 * resetting legit local connections.
 		 */
-		req->rsk_ops->send_reset(sk, skb);
+		req->rsk_ops->send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 	} else if (fastopen) { /* received a valid RST pkt */
 		reqsk_fastopen_remove(sk, req, true);
 		tcp_reset(sk, skb);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index bb7c3caf4f85..017f6293b5f4 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -60,6 +60,7 @@
 #include <net/secure_seq.h>
 #include <net/hotdata.h>
 #include <net/busy_poll.h>
+#include <net/rstreason.h>
 
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
@@ -69,7 +70,8 @@
 
 #include <trace/events/tcp.h>
 
-static void	tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb);
+static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb,
+			      enum sk_rst_reason reason);
 static void	tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
 				      struct request_sock *req);
 
@@ -1008,7 +1010,8 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
 	kfree_skb(buff);
 }
 
-static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb)
+static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb,
+			      enum sk_rst_reason reason)
 {
 	const struct tcphdr *th = tcp_hdr(skb);
 	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
@@ -1677,7 +1680,7 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 	return 0;
 
 reset:
-	tcp_v6_send_reset(sk, skb);
+	tcp_v6_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 discard:
 	if (opt_skb)
 		__kfree_skb(opt_skb);
@@ -1862,7 +1865,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 		} else {
 			drop_reason = tcp_child_process(sk, nsk, skb);
 			if (drop_reason) {
-				tcp_v6_send_reset(nsk, skb);
+				tcp_v6_send_reset(nsk, skb, SK_RST_REASON_NOT_SPECIFIED);
 				goto discard_and_relse;
 			}
 			sock_put(sk);
@@ -1939,7 +1942,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 bad_packet:
 		__TCP_INC_STATS(net, TCP_MIB_INERRS);
 	} else {
-		tcp_v6_send_reset(NULL, skb);
+		tcp_v6_send_reset(NULL, skb, SK_RST_REASON_NOT_SPECIFIED);
 	}
 
 discard_it:
@@ -1995,7 +1998,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 		tcp_v6_timewait_ack(sk, skb);
 		break;
 	case TCP_TW_RST:
-		tcp_v6_send_reset(sk, skb);
+		tcp_v6_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 		inet_twsk_deschedule_put(inet_twsk(sk));
 		goto discard_it;
 	case TCP_TW_SUCCESS:
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index b94d1dca1094..32fe2ef36d56 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -20,6 +20,8 @@
 #include <net/transp_v6.h>
 #endif
 #include <net/mptcp.h>
+#include <net/rstreason.h>
+
 #include "protocol.h"
 #include "mib.h"
 
@@ -308,7 +310,7 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 
 	dst_release(dst);
 	if (!req->syncookie)
-		tcp_request_sock_ops.send_reset(sk, skb);
+		tcp_request_sock_ops.send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 	return NULL;
 }
 
@@ -376,7 +378,7 @@ static struct dst_entry *subflow_v6_route_req(const struct sock *sk,
 
 	dst_release(dst);
 	if (!req->syncookie)
-		tcp6_request_sock_ops.send_reset(sk, skb);
+		tcp6_request_sock_ops.send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 	return NULL;
 }
 #endif
@@ -911,7 +913,7 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 	tcp_rsk(req)->drop_req = true;
 	inet_csk_prepare_for_destroy_sock(child);
 	tcp_done(child);
-	req->rsk_ops->send_reset(sk, skb);
+	req->rsk_ops->send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
 
 	/* The last child reference will be released by the caller */
 	return child;
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v6 3/7] rstreason: prepare for active reset
  2024-04-17  8:51 [PATCH net-next v6 0/7] Implement reset reason mechanism to detect Jason Xing
  2024-04-17  8:51 ` [PATCH net-next v6 1/7] net: introduce rstreason to detect why the RST is sent Jason Xing
  2024-04-17  8:51 ` [PATCH net-next v6 2/7] rstreason: prepare for passive reset Jason Xing
@ 2024-04-17  8:51 ` Jason Xing
  2024-04-17  8:51 ` [PATCH net-next v6 4/7] tcp: support rstreason for passive reset Jason Xing
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-17  8:51 UTC (permalink / raw)
  To: edumazet, dsahern, matttbe, martineau, geliang, kuba, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart
  Cc: mptcp, netdev, linux-trace-kernel, kerneljasonxing, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

Like what we did to passive reset:
only passing possible reset reason in each active reset path.

No functional changes.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 include/net/tcp.h     |  3 ++-
 net/ipv4/tcp.c        | 15 ++++++++++-----
 net/ipv4/tcp_output.c |  3 ++-
 net/ipv4/tcp_timer.c  |  9 ++++++---
 net/mptcp/protocol.c  |  4 +++-
 net/mptcp/subflow.c   |  5 +++--
 6 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index b935e1ae4caf..adeacc9aa28a 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -670,7 +670,8 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
 void tcp_send_probe0(struct sock *);
 int tcp_write_wakeup(struct sock *, int mib);
 void tcp_send_fin(struct sock *sk);
-void tcp_send_active_reset(struct sock *sk, gfp_t priority);
+void tcp_send_active_reset(struct sock *sk, gfp_t priority,
+			   enum sk_rst_reason reason);
 int tcp_send_synack(struct sock *);
 void tcp_push_one(struct sock *, unsigned int mss_now);
 void __tcp_send_ack(struct sock *sk, u32 rcv_nxt);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f23b97777ea5..4ec0f4feee00 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -275,6 +275,7 @@
 #include <net/xfrm.h>
 #include <net/ip.h>
 #include <net/sock.h>
+#include <net/rstreason.h>
 
 #include <linux/uaccess.h>
 #include <asm/ioctls.h>
@@ -2811,7 +2812,8 @@ void __tcp_close(struct sock *sk, long timeout)
 		/* Unread data was tossed, zap the connection. */
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONCLOSE);
 		tcp_set_state(sk, TCP_CLOSE);
-		tcp_send_active_reset(sk, sk->sk_allocation);
+		tcp_send_active_reset(sk, sk->sk_allocation,
+				      SK_RST_REASON_NOT_SPECIFIED);
 	} else if (sock_flag(sk, SOCK_LINGER) && !sk->sk_lingertime) {
 		/* Check zero linger _after_ checking for unread data. */
 		sk->sk_prot->disconnect(sk, 0);
@@ -2885,7 +2887,8 @@ void __tcp_close(struct sock *sk, long timeout)
 		struct tcp_sock *tp = tcp_sk(sk);
 		if (READ_ONCE(tp->linger2) < 0) {
 			tcp_set_state(sk, TCP_CLOSE);
-			tcp_send_active_reset(sk, GFP_ATOMIC);
+			tcp_send_active_reset(sk, GFP_ATOMIC,
+					      SK_RST_REASON_NOT_SPECIFIED);
 			__NET_INC_STATS(sock_net(sk),
 					LINUX_MIB_TCPABORTONLINGER);
 		} else {
@@ -2903,7 +2906,8 @@ void __tcp_close(struct sock *sk, long timeout)
 	if (sk->sk_state != TCP_CLOSE) {
 		if (tcp_check_oom(sk, 0)) {
 			tcp_set_state(sk, TCP_CLOSE);
-			tcp_send_active_reset(sk, GFP_ATOMIC);
+			tcp_send_active_reset(sk, GFP_ATOMIC,
+					      SK_RST_REASON_NOT_SPECIFIED);
 			__NET_INC_STATS(sock_net(sk),
 					LINUX_MIB_TCPABORTONMEMORY);
 		} else if (!check_net(sock_net(sk))) {
@@ -3007,7 +3011,7 @@ int tcp_disconnect(struct sock *sk, int flags)
 		/* The last check adjusts for discrepancy of Linux wrt. RFC
 		 * states
 		 */
-		tcp_send_active_reset(sk, gfp_any());
+		tcp_send_active_reset(sk, gfp_any(), SK_RST_REASON_NOT_SPECIFIED);
 		WRITE_ONCE(sk->sk_err, ECONNRESET);
 	} else if (old_state == TCP_SYN_SENT)
 		WRITE_ONCE(sk->sk_err, ECONNRESET);
@@ -4564,7 +4568,8 @@ int tcp_abort(struct sock *sk, int err)
 		smp_wmb();
 		sk_error_report(sk);
 		if (tcp_need_reset(sk->sk_state))
-			tcp_send_active_reset(sk, GFP_ATOMIC);
+			tcp_send_active_reset(sk, GFP_ATOMIC,
+					      SK_RST_REASON_NOT_SPECIFIED);
 		tcp_done(sk);
 	}
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 61119d42b0fd..276d9d541b01 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3586,7 +3586,8 @@ void tcp_send_fin(struct sock *sk)
  * was unread data in the receive queue.  This behavior is recommended
  * by RFC 2525, section 2.17.  -DaveM
  */
-void tcp_send_active_reset(struct sock *sk, gfp_t priority)
+void tcp_send_active_reset(struct sock *sk, gfp_t priority,
+			   enum sk_rst_reason reason)
 {
 	struct sk_buff *skb;
 
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 976db57b95d4..83fe7f62f7f1 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -22,6 +22,7 @@
 #include <linux/module.h>
 #include <linux/gfp.h>
 #include <net/tcp.h>
+#include <net/rstreason.h>
 
 static u32 tcp_clamp_rto_to_user_timeout(const struct sock *sk)
 {
@@ -127,7 +128,8 @@ static int tcp_out_of_resources(struct sock *sk, bool do_reset)
 		    (!tp->snd_wnd && !tp->packets_out))
 			do_reset = true;
 		if (do_reset)
-			tcp_send_active_reset(sk, GFP_ATOMIC);
+			tcp_send_active_reset(sk, GFP_ATOMIC,
+					      SK_RST_REASON_NOT_SPECIFIED);
 		tcp_done(sk);
 		__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONMEMORY);
 		return 1;
@@ -768,7 +770,7 @@ static void tcp_keepalive_timer (struct timer_list *t)
 				goto out;
 			}
 		}
-		tcp_send_active_reset(sk, GFP_ATOMIC);
+		tcp_send_active_reset(sk, GFP_ATOMIC, SK_RST_REASON_NOT_SPECIFIED);
 		goto death;
 	}
 
@@ -795,7 +797,8 @@ static void tcp_keepalive_timer (struct timer_list *t)
 		    icsk->icsk_probes_out > 0) ||
 		    (user_timeout == 0 &&
 		    icsk->icsk_probes_out >= keepalive_probes(tp))) {
-			tcp_send_active_reset(sk, GFP_ATOMIC);
+			tcp_send_active_reset(sk, GFP_ATOMIC,
+					      SK_RST_REASON_NOT_SPECIFIED);
 			tcp_write_err(sk);
 			goto out;
 		}
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index f8bc34f0d973..065967086492 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -21,6 +21,7 @@
 #endif
 #include <net/mptcp.h>
 #include <net/xfrm.h>
+#include <net/rstreason.h>
 #include <asm/ioctls.h>
 #include "protocol.h"
 #include "mib.h"
@@ -2569,7 +2570,8 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
 
 		slow = lock_sock_fast(tcp_sk);
 		if (tcp_sk->sk_state != TCP_CLOSE) {
-			tcp_send_active_reset(tcp_sk, GFP_ATOMIC);
+			tcp_send_active_reset(tcp_sk, GFP_ATOMIC,
+					      SK_RST_REASON_NOT_SPECIFIED);
 			tcp_set_state(tcp_sk, TCP_CLOSE);
 		}
 		unlock_sock_fast(tcp_sk, slow);
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 32fe2ef36d56..ac867d277860 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -414,7 +414,7 @@ void mptcp_subflow_reset(struct sock *ssk)
 	/* must hold: tcp_done() could drop last reference on parent */
 	sock_hold(sk);
 
-	tcp_send_active_reset(ssk, GFP_ATOMIC);
+	tcp_send_active_reset(ssk, GFP_ATOMIC, SK_RST_REASON_NOT_SPECIFIED);
 	tcp_done(ssk);
 	if (!test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &mptcp_sk(sk)->flags))
 		mptcp_schedule_work(sk);
@@ -1350,7 +1350,8 @@ static bool subflow_check_data_avail(struct sock *ssk)
 			tcp_set_state(ssk, TCP_CLOSE);
 			while ((skb = skb_peek(&ssk->sk_receive_queue)))
 				sk_eat_skb(ssk, skb);
-			tcp_send_active_reset(ssk, GFP_ATOMIC);
+			tcp_send_active_reset(ssk, GFP_ATOMIC,
+					      SK_RST_REASON_NOT_SPECIFIED);
 			WRITE_ONCE(subflow->data_avail, false);
 			return false;
 		}
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v6 4/7] tcp: support rstreason for passive reset
  2024-04-17  8:51 [PATCH net-next v6 0/7] Implement reset reason mechanism to detect Jason Xing
                   ` (2 preceding siblings ...)
  2024-04-17  8:51 ` [PATCH net-next v6 3/7] rstreason: prepare for active reset Jason Xing
@ 2024-04-17  8:51 ` Jason Xing
  2024-04-17  8:51 ` [PATCH net-next v6 5/7] mptcp: " Jason Xing
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-17  8:51 UTC (permalink / raw)
  To: edumazet, dsahern, matttbe, martineau, geliang, kuba, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart
  Cc: mptcp, netdev, linux-trace-kernel, kerneljasonxing, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

Reuse the dropreason logic to show the exact reason of tcp reset,
so we don't need to implement those duplicated reset reasons.
This patch replaces all the prior NOT_SPECIFIED reasons.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 net/ipv4/tcp_ipv4.c | 8 ++++----
 net/ipv6/tcp_ipv6.c | 8 ++++----
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 418d11902fa7..d78412cf8566 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 	return 0;
 
 reset:
-	tcp_v4_send_reset(rsk, skb, SK_RST_REASON_NOT_SPECIFIED);
+	tcp_v4_send_reset(rsk, skb, (u32)reason);
 discard:
 	kfree_skb_reason(skb, reason);
 	/* Be careful here. If this function gets more complicated and
@@ -2278,7 +2278,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		} else {
 			drop_reason = tcp_child_process(sk, nsk, skb);
 			if (drop_reason) {
-				tcp_v4_send_reset(nsk, skb, SK_RST_REASON_NOT_SPECIFIED);
+				tcp_v4_send_reset(nsk, skb, (u32)drop_reason);
 				goto discard_and_relse;
 			}
 			sock_put(sk);
@@ -2357,7 +2357,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 bad_packet:
 		__TCP_INC_STATS(net, TCP_MIB_INERRS);
 	} else {
-		tcp_v4_send_reset(NULL, skb, SK_RST_REASON_NOT_SPECIFIED);
+		tcp_v4_send_reset(NULL, skb, (u32)drop_reason);
 	}
 
 discard_it:
@@ -2409,7 +2409,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		tcp_v4_timewait_ack(sk, skb);
 		break;
 	case TCP_TW_RST:
-		tcp_v4_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
+		tcp_v4_send_reset(sk, skb, (u32)drop_reason);
 		inet_twsk_deschedule_put(inet_twsk(sk));
 		goto discard_it;
 	case TCP_TW_SUCCESS:;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 017f6293b5f4..c46095fb596c 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1680,7 +1680,7 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 	return 0;
 
 reset:
-	tcp_v6_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
+	tcp_v6_send_reset(sk, skb, (u32)reason);
 discard:
 	if (opt_skb)
 		__kfree_skb(opt_skb);
@@ -1865,7 +1865,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 		} else {
 			drop_reason = tcp_child_process(sk, nsk, skb);
 			if (drop_reason) {
-				tcp_v6_send_reset(nsk, skb, SK_RST_REASON_NOT_SPECIFIED);
+				tcp_v6_send_reset(nsk, skb, (u32)drop_reason);
 				goto discard_and_relse;
 			}
 			sock_put(sk);
@@ -1942,7 +1942,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 bad_packet:
 		__TCP_INC_STATS(net, TCP_MIB_INERRS);
 	} else {
-		tcp_v6_send_reset(NULL, skb, SK_RST_REASON_NOT_SPECIFIED);
+		tcp_v6_send_reset(NULL, skb, (u32)drop_reason);
 	}
 
 discard_it:
@@ -1998,7 +1998,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 		tcp_v6_timewait_ack(sk, skb);
 		break;
 	case TCP_TW_RST:
-		tcp_v6_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
+		tcp_v6_send_reset(sk, skb, (u32)drop_reason);
 		inet_twsk_deschedule_put(inet_twsk(sk));
 		goto discard_it;
 	case TCP_TW_SUCCESS:
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v6 5/7] mptcp: support rstreason for passive reset
  2024-04-17  8:51 [PATCH net-next v6 0/7] Implement reset reason mechanism to detect Jason Xing
                   ` (3 preceding siblings ...)
  2024-04-17  8:51 ` [PATCH net-next v6 4/7] tcp: support rstreason for passive reset Jason Xing
@ 2024-04-17  8:51 ` Jason Xing
  2024-04-17  8:51 ` [PATCH net-next v6 6/7] mptcp: introducing a helper into active reset logic Jason Xing
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-17  8:51 UTC (permalink / raw)
  To: edumazet, dsahern, matttbe, martineau, geliang, kuba, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart
  Cc: mptcp, netdev, linux-trace-kernel, kerneljasonxing, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

It relys on what reset options in the skb are as rfc8684 says. Reusing
this logic can save us much energy. This patch replaces most of the prior
NOT_SPECIFIED reasons.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 net/mptcp/subflow.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index ac867d277860..bde4a7fdee82 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -309,8 +309,13 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 		return dst;
 
 	dst_release(dst);
-	if (!req->syncookie)
-		tcp_request_sock_ops.send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
+	if (!req->syncookie) {
+		struct mptcp_ext *mpext = mptcp_get_ext(skb);
+		enum sk_rst_reason reason;
+
+		reason = convert_mptcp_reason(mpext->reset_reason);
+		tcp_request_sock_ops.send_reset(sk, skb, reason);
+	}
 	return NULL;
 }
 
@@ -377,8 +382,13 @@ static struct dst_entry *subflow_v6_route_req(const struct sock *sk,
 		return dst;
 
 	dst_release(dst);
-	if (!req->syncookie)
-		tcp6_request_sock_ops.send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
+	if (!req->syncookie) {
+		struct mptcp_ext *mpext = mptcp_get_ext(skb);
+		enum sk_rst_reason reason;
+
+		reason = convert_mptcp_reason(mpext->reset_reason);
+		tcp6_request_sock_ops.send_reset(sk, skb, reason);
+	}
 	return NULL;
 }
 #endif
@@ -783,6 +793,7 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 	struct mptcp_subflow_request_sock *subflow_req;
 	struct mptcp_options_received mp_opt;
 	bool fallback, fallback_is_fatal;
+	enum sk_rst_reason reason;
 	struct mptcp_sock *owner;
 	struct sock *child;
 
@@ -913,7 +924,8 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 	tcp_rsk(req)->drop_req = true;
 	inet_csk_prepare_for_destroy_sock(child);
 	tcp_done(child);
-	req->rsk_ops->send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
+	reason = convert_mptcp_reason(mptcp_get_ext(skb)->reset_reason);
+	req->rsk_ops->send_reset(sk, skb, reason);
 
 	/* The last child reference will be released by the caller */
 	return child;
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v6 6/7] mptcp: introducing a helper into active reset logic
  2024-04-17  8:51 [PATCH net-next v6 0/7] Implement reset reason mechanism to detect Jason Xing
                   ` (4 preceding siblings ...)
  2024-04-17  8:51 ` [PATCH net-next v6 5/7] mptcp: " Jason Xing
@ 2024-04-17  8:51 ` Jason Xing
  2024-04-17  8:51 ` [PATCH net-next v6 7/7] rstreason: make it work in trace world Jason Xing
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-17  8:51 UTC (permalink / raw)
  To: edumazet, dsahern, matttbe, martineau, geliang, kuba, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart
  Cc: mptcp, netdev, linux-trace-kernel, kerneljasonxing, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

Since we have mapped every mptcp reset reason definition
in enum sk_rst_reason, introducing a new helper can cover
some missing places where we have already set the
subflow->reset_reason.

Note: using SK_RST_REASON_NOT_SPECIFIED is the same as
SK_RST_REASON_MPTCP_RST_EUNSPEC. They are both unknown.
So we can convert it directly.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
Link: https://lore.kernel.org/all/2d3ea199eef53cf6a0c48e21abdee0eefbdee927.camel@redhat.com/
---
 net/mptcp/protocol.c |  4 +---
 net/mptcp/protocol.h | 11 +++++++++++
 net/mptcp/subflow.c  |  6 ++----
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 065967086492..4b13ca362efa 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -21,7 +21,6 @@
 #endif
 #include <net/mptcp.h>
 #include <net/xfrm.h>
-#include <net/rstreason.h>
 #include <asm/ioctls.h>
 #include "protocol.h"
 #include "mib.h"
@@ -2570,8 +2569,7 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
 
 		slow = lock_sock_fast(tcp_sk);
 		if (tcp_sk->sk_state != TCP_CLOSE) {
-			tcp_send_active_reset(tcp_sk, GFP_ATOMIC,
-					      SK_RST_REASON_NOT_SPECIFIED);
+			mptcp_send_active_reset_reason(tcp_sk);
 			tcp_set_state(tcp_sk, TCP_CLOSE);
 		}
 		unlock_sock_fast(tcp_sk, slow);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index fdfa843e2d88..82ef2f42a1bc 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -12,6 +12,7 @@
 #include <net/inet_connection_sock.h>
 #include <uapi/linux/mptcp.h>
 #include <net/genetlink.h>
+#include <net/rstreason.h>
 
 #include "mptcp_pm_gen.h"
 
@@ -581,6 +582,16 @@ mptcp_subflow_ctx_reset(struct mptcp_subflow_context *subflow)
 	WRITE_ONCE(subflow->local_id, -1);
 }
 
+static inline void
+mptcp_send_active_reset_reason(struct sock *sk)
+{
+	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
+	enum sk_rst_reason reason;
+
+	reason = convert_mptcp_reason(subflow->reset_reason);
+	tcp_send_active_reset(sk, GFP_ATOMIC, reason);
+}
+
 static inline u64
 mptcp_subflow_get_map_offset(const struct mptcp_subflow_context *subflow)
 {
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index bde4a7fdee82..4783d558863c 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -20,7 +20,6 @@
 #include <net/transp_v6.h>
 #endif
 #include <net/mptcp.h>
-#include <net/rstreason.h>
 
 #include "protocol.h"
 #include "mib.h"
@@ -424,7 +423,7 @@ void mptcp_subflow_reset(struct sock *ssk)
 	/* must hold: tcp_done() could drop last reference on parent */
 	sock_hold(sk);
 
-	tcp_send_active_reset(ssk, GFP_ATOMIC, SK_RST_REASON_NOT_SPECIFIED);
+	mptcp_send_active_reset_reason(ssk);
 	tcp_done(ssk);
 	if (!test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &mptcp_sk(sk)->flags))
 		mptcp_schedule_work(sk);
@@ -1362,8 +1361,7 @@ static bool subflow_check_data_avail(struct sock *ssk)
 			tcp_set_state(ssk, TCP_CLOSE);
 			while ((skb = skb_peek(&ssk->sk_receive_queue)))
 				sk_eat_skb(ssk, skb);
-			tcp_send_active_reset(ssk, GFP_ATOMIC,
-					      SK_RST_REASON_NOT_SPECIFIED);
+			mptcp_send_active_reset_reason(ssk);
 			WRITE_ONCE(subflow->data_avail, false);
 			return false;
 		}
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v6 7/7] rstreason: make it work in trace world
  2024-04-17  8:51 [PATCH net-next v6 0/7] Implement reset reason mechanism to detect Jason Xing
                   ` (5 preceding siblings ...)
  2024-04-17  8:51 ` [PATCH net-next v6 6/7] mptcp: introducing a helper into active reset logic Jason Xing
@ 2024-04-17  8:51 ` Jason Xing
  2024-04-17  9:41 ` [PATCH net-next v6 0/7] Implement reset reason mechanism to detect MPTCP CI
  2024-04-18  3:30 ` Jason Xing
  8 siblings, 0 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-17  8:51 UTC (permalink / raw)
  To: edumazet, dsahern, matttbe, martineau, geliang, kuba, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart
  Cc: mptcp, netdev, linux-trace-kernel, kerneljasonxing, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

At last, we should let it work by introducing this reset reason in
trace world.

One of the possible expected outputs is:
... tcp_send_reset: skbaddr=xxx skaddr=xxx src=xxx dest=xxx
state=TCP_ESTABLISHED reason=NOT_SPECIFIED

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 include/trace/events/tcp.h | 37 +++++++++++++++++++++++++++++++++----
 net/ipv4/tcp_ipv4.c        |  2 +-
 net/ipv4/tcp_output.c      |  2 +-
 net/ipv6/tcp_ipv6.c        |  2 +-
 4 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 5c04a61a11c2..b1455cbc0634 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -11,6 +11,7 @@
 #include <net/ipv6.h>
 #include <net/tcp.h>
 #include <linux/sock_diag.h>
+#include <net/rstreason.h>
 
 /*
  * tcp event with arguments sk and skb
@@ -74,20 +75,38 @@ DEFINE_EVENT(tcp_event_sk_skb, tcp_retransmit_skb,
 	TP_ARGS(sk, skb)
 );
 
+#undef FN1
+#define FN1(reason)	TRACE_DEFINE_ENUM(SK_RST_REASON_##reason);
+#undef FN2
+#define FN2(reason)	TRACE_DEFINE_ENUM(SKB_DROP_REASON_##reason);
+DEFINE_RST_REASON(FN1, FN1)
+
+#undef FN1
+#undef FNe1
+#define FN1(reason)	{ SK_RST_REASON_##reason, #reason },
+#define FNe1(reason)	{ SK_RST_REASON_##reason, #reason }
+
+#undef FN2
+#undef FNe2
+#define FN2(reason)	{ SKB_DROP_REASON_##reason, #reason },
+#define FNe2(reason)	{ SKB_DROP_REASON_##reason, #reason }
 /*
  * skb of trace_tcp_send_reset is the skb that caused RST. In case of
  * active reset, skb should be NULL
  */
 TRACE_EVENT(tcp_send_reset,
 
-	TP_PROTO(const struct sock *sk, const struct sk_buff *skb),
+	TP_PROTO(const struct sock *sk,
+		 const struct sk_buff *skb,
+		 const enum sk_rst_reason reason),
 
-	TP_ARGS(sk, skb),
+	TP_ARGS(sk, skb, reason),
 
 	TP_STRUCT__entry(
 		__field(const void *, skbaddr)
 		__field(const void *, skaddr)
 		__field(int, state)
+		__field(enum sk_rst_reason, reason)
 		__array(__u8, saddr, sizeof(struct sockaddr_in6))
 		__array(__u8, daddr, sizeof(struct sockaddr_in6))
 	),
@@ -113,14 +132,24 @@ TRACE_EVENT(tcp_send_reset,
 			 */
 			TP_STORE_ADDR_PORTS_SKB(skb, th, entry->daddr, entry->saddr);
 		}
+		__entry->reason = reason;
 	),
 
-	TP_printk("skbaddr=%p skaddr=%p src=%pISpc dest=%pISpc state=%s",
+	TP_printk("skbaddr=%p skaddr=%p src=%pISpc dest=%pISpc state=%s reason=%s",
 		  __entry->skbaddr, __entry->skaddr,
 		  __entry->saddr, __entry->daddr,
-		  __entry->state ? show_tcp_state_name(__entry->state) : "UNKNOWN")
+		  __entry->state ? show_tcp_state_name(__entry->state) : "UNKNOWN",
+		  __entry->reason < RST_REASON_START ?
+			__print_symbolic(__entry->reason, DEFINE_DROP_REASON(FN2, FNe2)) :
+			__print_symbolic(__entry->reason, DEFINE_RST_REASON(FN1, FNe1)))
 );
 
+#undef FN1
+#undef FNe1
+
+#undef FN2
+#undef FNe2
+
 /*
  * tcp event with arguments sk
  *
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index d78412cf8566..461b4d2b7cfe 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -871,7 +871,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb,
 	if (sk)
 		arg.bound_dev_if = sk->sk_bound_dev_if;
 
-	trace_tcp_send_reset(sk, skb);
+	trace_tcp_send_reset(sk, skb, reason);
 
 	BUILD_BUG_ON(offsetof(struct sock, sk_bound_dev_if) !=
 		     offsetof(struct inet_timewait_sock, tw_bound_dev_if));
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 276d9d541b01..b08ffb17d5a0 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3612,7 +3612,7 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority,
 	/* skb of trace_tcp_send_reset() keeps the skb that caused RST,
 	 * skb here is different to the troublesome skb, so use NULL
 	 */
-	trace_tcp_send_reset(sk, NULL);
+	trace_tcp_send_reset(sk, NULL, SK_RST_REASON_NOT_SPECIFIED);
 }
 
 /* Send a crossed SYN-ACK during socket establishment.
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index c46095fb596c..6a4736ec3df0 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1133,7 +1133,7 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb,
 			label = ip6_flowlabel(ipv6h);
 	}
 
-	trace_tcp_send_reset(sk, skb);
+	trace_tcp_send_reset(sk, skb, reason);
 
 	tcp_v6_send_response(sk, skb, seq, ack_seq, 0, 0, 0, oif, 1,
 			     ipv6_get_dsfield(ipv6h), label, priority, txhash,
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 1/7] net: introduce rstreason to detect why the RST is sent
  2024-04-17  8:51 ` [PATCH net-next v6 1/7] net: introduce rstreason to detect why the RST is sent Jason Xing
@ 2024-04-17  9:01   ` Eric Dumazet
  2024-04-17  9:22     ` Jason Xing
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Dumazet @ 2024-04-17  9:01 UTC (permalink / raw)
  To: Jason Xing
  Cc: dsahern, matttbe, martineau, geliang, kuba, pabeni, davem,
	rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp, netdev,
	linux-trace-kernel, Jason Xing

On Wed, Apr 17, 2024 at 10:51 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> From: Jason Xing <kernelxing@tencent.com>
>
> Add a new standalone file for the easy future extension to support
> both active reset and passive reset in the TCP/DCCP/MPTCP protocols.
>
> This patch only does the preparations for reset reason mechanism,
> nothing else changes.
>
> The reset reasons are divided into three parts:
> 1) reuse drop reasons for passive reset in TCP
> 2) reuse MP_TCPRST option for MPTCP
> 3) our own reasons
>
> I will implement the basic codes of active/passive reset reason in
> those three protocols, which is not complete for this moment. But
> it provides a new chance to let other people add more reasons into
> it:)
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>

My original suggestion was to use normal values in  'enum
skb_drop_reason', even if there was not necessarily a 'drop'
in the common sense.

https://lore.kernel.org/all/CANn89iJw8x-LqgsWOeJQQvgVg6DnL5aBRLi10QN2WBdr+X4k=w@mail.gmail.com/

This would avoid these ugly casts later, even casting an enum to other
ones is not very logical.
Going through an u32 pivot is quite a hack.

If you feel the need to put them in a special group, this is fine by me.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 1/7] net: introduce rstreason to detect why the RST is sent
  2024-04-17  9:01   ` Eric Dumazet
@ 2024-04-17  9:22     ` Jason Xing
  0 siblings, 0 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-17  9:22 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: dsahern, matttbe, martineau, geliang, kuba, pabeni, davem,
	rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp, netdev,
	linux-trace-kernel, Jason Xing

Hello Eric,

On Wed, Apr 17, 2024 at 5:02 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Apr 17, 2024 at 10:51 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > Add a new standalone file for the easy future extension to support
> > both active reset and passive reset in the TCP/DCCP/MPTCP protocols.
> >
> > This patch only does the preparations for reset reason mechanism,
> > nothing else changes.
> >
> > The reset reasons are divided into three parts:
> > 1) reuse drop reasons for passive reset in TCP
> > 2) reuse MP_TCPRST option for MPTCP
> > 3) our own reasons
> >
> > I will implement the basic codes of active/passive reset reason in
> > those three protocols, which is not complete for this moment. But
> > it provides a new chance to let other people add more reasons into
> > it:)
> >
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
>
> My original suggestion was to use normal values in  'enum
> skb_drop_reason', even if there was not necessarily a 'drop'
> in the common sense.
>
> https://lore.kernel.org/all/CANn89iJw8x-LqgsWOeJQQvgVg6DnL5aBRLi10QN2WBdr+X4k=w@mail.gmail.com/
>
> This would avoid these ugly casts later, even casting an enum to other
> ones is not very logical.

Thanks for your comment.

It's a little bit tricky. That's the reason I documented and commented
on this in the rstreason.h file. I hope it's not that hard to
understand.

> Going through an u32 pivot is quite a hack.
>
> If you feel the need to put them in a special group, this is fine by me.

Yes, rst reasons only partially rely on the drop reason mechanism to
support passive rst for TCP well, but not supporting other cases. My
final goal is to cover all the cases for the future, so I wish I can
put it into a separate group, then people like me who find it useful
can introduce more reasons into it.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-17  8:51 [PATCH net-next v6 0/7] Implement reset reason mechanism to detect Jason Xing
                   ` (6 preceding siblings ...)
  2024-04-17  8:51 ` [PATCH net-next v6 7/7] rstreason: make it work in trace world Jason Xing
@ 2024-04-17  9:41 ` MPTCP CI
  2024-04-18  3:30 ` Jason Xing
  8 siblings, 0 replies; 27+ messages in thread
From: MPTCP CI @ 2024-04-17  9:41 UTC (permalink / raw)
  To: Jason Xing; +Cc: mptcp

Hi Jason,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Unstable: 2 failed test(s): packetdrill_dss packetdrill_fastopen 🔴
- KVM Validation: btf (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/8719220506

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/a79c83426c9f
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=845351


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-17  8:51 [PATCH net-next v6 0/7] Implement reset reason mechanism to detect Jason Xing
                   ` (7 preceding siblings ...)
  2024-04-17  9:41 ` [PATCH net-next v6 0/7] Implement reset reason mechanism to detect MPTCP CI
@ 2024-04-18  3:30 ` Jason Xing
  2024-04-18 15:46   ` Jakub Kicinski
  8 siblings, 1 reply; 27+ messages in thread
From: Jason Xing @ 2024-04-18  3:30 UTC (permalink / raw)
  To: edumazet, dsahern, matttbe, martineau, geliang, kuba, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart
  Cc: mptcp, netdev, linux-trace-kernel, Jason Xing

On Wed, Apr 17, 2024 at 4:51 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> From: Jason Xing <kernelxing@tencent.com>
>
> In production, there are so many cases about why the RST skb is sent but
> we don't have a very convenient/fast method to detect the exact underlying
> reasons.
>
> RST is implemented in two kinds: passive kind (like tcp_v4_send_reset())
> and active kind (like tcp_send_active_reset()). The former can be traced
> carefully 1) in TCP, with the help of drop reasons, which is based on
> Eric's idea[1], 2) in MPTCP, with the help of reset options defined in
> RFC 8684. The latter is relatively independent, which should be
> implemented on our own.
>
> In this series, I focus on the fundamental implement mostly about how
> the rstreason mechnism works and give the detailed passive part as an
> example, not including the active reset part. In future, we can go
> further and refine those NOT_SPECIFIED reasons.
>
> Here are some examples when tracing:
> <idle>-0       [002] ..s1.  1830.262425: tcp_send_reset: skbaddr=x
>         skaddr=x src=x dest=x state=x reason=NOT_SPECIFIED
> <idle>-0       [002] ..s1.  1830.262425: tcp_send_reset: skbaddr=x
>         skaddr=x src=x dest=x state=x reason=NO_SOCKET
>
> [1]
> Link: https://lore.kernel.org/all/CANn89iJw8x-LqgsWOeJQQvgVg6DnL5aBRLi10QN2WBdr+X4k=w@mail.gmail.com/
>
> v6
> 1. add back casts, or else they are treated as error.
>
> v5
> Link: https://lore.kernel.org/all/20240411115630.38420-1-kerneljasonxing@gmail.com/
> 1. address format issue (like reverse xmas tree) (Eric, Paolo)
> 2. remove unnecessary casts. (Eric)
> 3. introduce a helper used in mptcp active reset. See patch 6. (Paolo)
>
> v4
> Link: https://lore.kernel.org/all/20240409100934.37725-1-kerneljasonxing@gmail.com/
> 1. passing 'enum sk_rst_reason' for readability when tracing (Antoine)
>
> v3
> Link: https://lore.kernel.org/all/20240404072047.11490-1-kerneljasonxing@gmail.com/
> 1. rebase (mptcp part) and address what Mat suggested.
>
> v2
> Link: https://lore.kernel.org/all/20240403185033.47ebc6a9@kernel.org/
> 1. rebase against the latest net-next tree
>
>
>
> Jason Xing (7):
>   net: introduce rstreason to detect why the RST is sent
>   rstreason: prepare for passive reset
>   rstreason: prepare for active reset
>   tcp: support rstreason for passive reset
>   mptcp: support rstreason for passive reset
>   mptcp: introducing a helper into active reset logic
>   rstreason: make it work in trace world
>
>  include/net/request_sock.h |  4 +-
>  include/net/rstreason.h    | 93 ++++++++++++++++++++++++++++++++++++++
>  include/net/tcp.h          |  3 +-
>  include/trace/events/tcp.h | 37 +++++++++++++--
>  net/dccp/ipv4.c            | 10 ++--
>  net/dccp/ipv6.c            | 10 ++--
>  net/dccp/minisocks.c       |  3 +-
>  net/ipv4/tcp.c             | 15 ++++--
>  net/ipv4/tcp_ipv4.c        | 14 +++---
>  net/ipv4/tcp_minisocks.c   |  3 +-
>  net/ipv4/tcp_output.c      |  5 +-
>  net/ipv4/tcp_timer.c       |  9 ++--
>  net/ipv6/tcp_ipv6.c        | 17 ++++---
>  net/mptcp/protocol.c       |  2 +-
>  net/mptcp/protocol.h       | 11 +++++
>  net/mptcp/subflow.c        | 27 ++++++++---
>  16 files changed, 216 insertions(+), 47 deletions(-)
>  create mode 100644 include/net/rstreason.h
>
> --
> 2.37.3
>

Hello maintainers,

I'm not sure why the patch series has been changed to 'Changes
Requested', until now I don't think I need to change something.

Should I repost this series (keeping the v6 tag) and then wait for
more comments?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-18  3:30 ` Jason Xing
@ 2024-04-18 15:46   ` Jakub Kicinski
  2024-04-18 16:23     ` Jason Xing
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Kicinski @ 2024-04-18 15:46 UTC (permalink / raw)
  To: Jason Xing
  Cc: edumazet, dsahern, matttbe, martineau, geliang, pabeni, davem,
	rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp, netdev,
	linux-trace-kernel, Jason Xing

On Thu, 18 Apr 2024 11:30:02 +0800 Jason Xing wrote:
> I'm not sure why the patch series has been changed to 'Changes
> Requested', until now I don't think I need to change something.
> 
> Should I repost this series (keeping the v6 tag) and then wait for
> more comments?

If Eric doesn't like it - it's not getting merged.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-18 15:46   ` Jakub Kicinski
@ 2024-04-18 16:23     ` Jason Xing
  2024-04-18 18:51       ` Eric Dumazet
  0 siblings, 1 reply; 27+ messages in thread
From: Jason Xing @ 2024-04-18 16:23 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: edumazet, dsahern, matttbe, martineau, geliang, pabeni, davem,
	rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp, netdev,
	linux-trace-kernel, Jason Xing

On Thu, Apr 18, 2024 at 11:46 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 18 Apr 2024 11:30:02 +0800 Jason Xing wrote:
> > I'm not sure why the patch series has been changed to 'Changes
> > Requested', until now I don't think I need to change something.
> >
> > Should I repost this series (keeping the v6 tag) and then wait for
> > more comments?
>
> If Eric doesn't like it - it's not getting merged.

I'm not a English native speaker. If I understand correctly, it seems
that Eric doesn't object to the patch series. Here is the quotation
[1]:
"If you feel the need to put them in a special group, this is fine by me."

This rst reason mechanism can cover all the possible reasons for both
TCP and MPTCP. We don't need to reinvent some definitions of reset
reasons which are totally the same as drop reasons. Also, we don't
need to reinvent something to cover MPTCP. If people are willing to
contribute more rst reasons, they can find a good place.

Reset is one big complicated 'issue' in production. I spent a lot of
time handling all kinds of reset reasons daily. I'm apparently not the
only one. I just want to make admins' lives easier, including me. This
special/separate reason group is important because we can extend it in
the future, which will not get confused.

I hope it can have a chance to get merged :) Thank you.

[1]: https://lore.kernel.org/all/CANn89i+aLO_aGYC8dr8dkFyi+6wpzCGrogysvgR8FrfRvaa-Vg@mail.gmail.com/

Thanks,
Jason

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-18 16:23     ` Jason Xing
@ 2024-04-18 18:51       ` Eric Dumazet
  2024-04-18 22:29         ` Jason Xing
  2024-04-18 23:26         ` Jason Xing
  0 siblings, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2024-04-18 18:51 UTC (permalink / raw)
  To: Jason Xing
  Cc: Jakub Kicinski, dsahern, matttbe, martineau, geliang, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp,
	netdev, linux-trace-kernel, Jason Xing

On Thu, Apr 18, 2024 at 6:24 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> On Thu, Apr 18, 2024 at 11:46 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Thu, 18 Apr 2024 11:30:02 +0800 Jason Xing wrote:
> > > I'm not sure why the patch series has been changed to 'Changes
> > > Requested', until now I don't think I need to change something.
> > >
> > > Should I repost this series (keeping the v6 tag) and then wait for
> > > more comments?
> >
> > If Eric doesn't like it - it's not getting merged.
>
> I'm not a English native speaker. If I understand correctly, it seems
> that Eric doesn't object to the patch series. Here is the quotation
> [1]:
> "If you feel the need to put them in a special group, this is fine by me."
>
> This rst reason mechanism can cover all the possible reasons for both
> TCP and MPTCP. We don't need to reinvent some definitions of reset
> reasons which are totally the same as drop reasons. Also, we don't
> need to reinvent something to cover MPTCP. If people are willing to
> contribute more rst reasons, they can find a good place.
>
> Reset is one big complicated 'issue' in production. I spent a lot of
> time handling all kinds of reset reasons daily. I'm apparently not the
> only one. I just want to make admins' lives easier, including me. This
> special/separate reason group is important because we can extend it in
> the future, which will not get confused.
>
> I hope it can have a chance to get merged :) Thank you.
>
> [1]: https://lore.kernel.org/all/CANn89i+aLO_aGYC8dr8dkFyi+6wpzCGrogysvgR8FrfRvaa-Vg@mail.gmail.com/
>
> Thanks,
> Jason

My objection was these casts between enums. Especially if hiding with (u32)

I see no reason for adding these casts in TCP stack.

When I said "If you feel the need to put them in a special group, this
is fine by me.",
this was really about partitioning the existing enum into groups, if
you prefer having a group of 'RES reasons'

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-18 18:51       ` Eric Dumazet
@ 2024-04-18 22:29         ` Jason Xing
  2024-04-18 22:40           ` Jason Xing
  2024-04-18 23:26         ` Jason Xing
  1 sibling, 1 reply; 27+ messages in thread
From: Jason Xing @ 2024-04-18 22:29 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jakub Kicinski, dsahern, matttbe, martineau, geliang, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp,
	netdev, linux-trace-kernel, Jason Xing

On Fri, Apr 19, 2024 at 2:51 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Apr 18, 2024 at 6:24 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > On Thu, Apr 18, 2024 at 11:46 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Thu, 18 Apr 2024 11:30:02 +0800 Jason Xing wrote:
> > > > I'm not sure why the patch series has been changed to 'Changes
> > > > Requested', until now I don't think I need to change something.
> > > >
> > > > Should I repost this series (keeping the v6 tag) and then wait for
> > > > more comments?
> > >
> > > If Eric doesn't like it - it's not getting merged.
> >
> > I'm not a English native speaker. If I understand correctly, it seems
> > that Eric doesn't object to the patch series. Here is the quotation
> > [1]:
> > "If you feel the need to put them in a special group, this is fine by me."
> >
> > This rst reason mechanism can cover all the possible reasons for both
> > TCP and MPTCP. We don't need to reinvent some definitions of reset
> > reasons which are totally the same as drop reasons. Also, we don't
> > need to reinvent something to cover MPTCP. If people are willing to
> > contribute more rst reasons, they can find a good place.
> >
> > Reset is one big complicated 'issue' in production. I spent a lot of
> > time handling all kinds of reset reasons daily. I'm apparently not the
> > only one. I just want to make admins' lives easier, including me. This
> > special/separate reason group is important because we can extend it in
> > the future, which will not get confused.
> >
> > I hope it can have a chance to get merged :) Thank you.
> >
> > [1]: https://lore.kernel.org/all/CANn89i+aLO_aGYC8dr8dkFyi+6wpzCGrogysvgR8FrfRvaa-Vg@mail.gmail.com/
> >
> > Thanks,
> > Jason
>
> My objection was these casts between enums. Especially if hiding with (u32)

So I should explicitly cast it like this:
    tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason);
?

>
> I see no reason for adding these casts in TCP stack.

Sorry, I don't know why the casts really make you so annoyed. But I
still think it's not a bad way to handle all the cases for RST.

Supposing not to add a enum sk_rst_reason{}, passing drop reasons only
works well in TCP for passive rests. For active reset cases (in the
tcp_send_active_reset()), it's meaningless/confusing to insist on
reusing the drop reason because I have to add some reset reasons (that
are only used in RST cases) in the enum skb_drop_reason{}, which is
really weird, in my view. The same problem exists in how to handle
MPTCP. So I prefer putting them in a separate group like now. What do
you think about it, right now?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-18 22:29         ` Jason Xing
@ 2024-04-18 22:40           ` Jason Xing
  0 siblings, 0 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-18 22:40 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jakub Kicinski, dsahern, matttbe, martineau, geliang, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp,
	netdev, linux-trace-kernel, Jason Xing

On Fri, Apr 19, 2024 at 6:29 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> On Fri, Apr 19, 2024 at 2:51 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Thu, Apr 18, 2024 at 6:24 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > >
> > > On Thu, Apr 18, 2024 at 11:46 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > >
> > > > On Thu, 18 Apr 2024 11:30:02 +0800 Jason Xing wrote:
> > > > > I'm not sure why the patch series has been changed to 'Changes
> > > > > Requested', until now I don't think I need to change something.
> > > > >
> > > > > Should I repost this series (keeping the v6 tag) and then wait for
> > > > > more comments?
> > > >
> > > > If Eric doesn't like it - it's not getting merged.
> > >
> > > I'm not a English native speaker. If I understand correctly, it seems
> > > that Eric doesn't object to the patch series. Here is the quotation
> > > [1]:
> > > "If you feel the need to put them in a special group, this is fine by me."
> > >
> > > This rst reason mechanism can cover all the possible reasons for both
> > > TCP and MPTCP. We don't need to reinvent some definitions of reset
> > > reasons which are totally the same as drop reasons. Also, we don't
> > > need to reinvent something to cover MPTCP. If people are willing to
> > > contribute more rst reasons, they can find a good place.
> > >
> > > Reset is one big complicated 'issue' in production. I spent a lot of
> > > time handling all kinds of reset reasons daily. I'm apparently not the
> > > only one. I just want to make admins' lives easier, including me. This
> > > special/separate reason group is important because we can extend it in
> > > the future, which will not get confused.
> > >
> > > I hope it can have a chance to get merged :) Thank you.
> > >
> > > [1]: https://lore.kernel.org/all/CANn89i+aLO_aGYC8dr8dkFyi+6wpzCGrogysvgR8FrfRvaa-Vg@mail.gmail.com/
> > >
> > > Thanks,
> > > Jason
> >
> > My objection was these casts between enums. Especially if hiding with (u32)
>
> So I should explicitly cast it like this:
>     tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason);
> ?
>
> >
> > I see no reason for adding these casts in TCP stack.
>
> Sorry, I don't know why the casts really make you so annoyed. But I
> still think it's not a bad way to handle all the cases for RST.
>
> Supposing not to add a enum sk_rst_reason{}, passing drop reasons only
> works well in TCP for passive rests. For active reset cases (in the
> tcp_send_active_reset()), it's meaningless/confusing to insist on
> reusing the drop reason because I have to add some reset reasons (that
> are only used in RST cases) in the enum skb_drop_reason{}, which is
> really weird, in my view. The same problem exists in how to handle
> MPTCP. So I prefer putting them in a separate group like now. What do
> you think about it, right now?

The description in the previous email may be too long. The key point
is that we're not supporting only for passive resets, right?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-18 18:51       ` Eric Dumazet
  2024-04-18 22:29         ` Jason Xing
@ 2024-04-18 23:26         ` Jason Xing
  2024-04-19  2:30           ` Jason Xing
  1 sibling, 1 reply; 27+ messages in thread
From: Jason Xing @ 2024-04-18 23:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jakub Kicinski, dsahern, matttbe, martineau, geliang, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp,
	netdev, linux-trace-kernel, Jason Xing

> When I said "If you feel the need to put them in a special group, this
> is fine by me.",
> this was really about partitioning the existing enum into groups, if
> you prefer having a group of 'RES reasons'

Are you suggesting copying what we need from enum skb_drop_reason{} to
enum sk_rst_reason{}? Why not reusing them directly. I have no idea
what the side effect of cast conversion itself is?

If __not__ doing so (copying reasons one by one), for passive rests,
we can totally rely on the drop reason, which means if we implement
more reasons for skb drop happening in reset cases, we don't need to
handle reset cases over and over again (like adding rst reasons just
after newly added drop reasons if without cast conversions). It's
easier to maintain the reset reason part if we can apply the current
patch series.

Thank you.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-18 23:26         ` Jason Xing
@ 2024-04-19  2:30           ` Jason Xing
  2024-04-19  7:02             ` Eric Dumazet
  0 siblings, 1 reply; 27+ messages in thread
From: Jason Xing @ 2024-04-19  2:30 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jakub Kicinski, dsahern, matttbe, martineau, geliang, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp,
	netdev, linux-trace-kernel, Jason Xing

On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> > When I said "If you feel the need to put them in a special group, this
> > is fine by me.",
> > this was really about partitioning the existing enum into groups, if
> > you prefer having a group of 'RES reasons'
>
> Are you suggesting copying what we need from enum skb_drop_reason{} to
> enum sk_rst_reason{}? Why not reusing them directly. I have no idea
> what the side effect of cast conversion itself is?

Sorry that I'm writing this email. I'm worried my statement is not
that clear, so I write one simple snippet which can help me explain
well :)

Allow me give NO_SOCKET as an example:
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index e63a3bf99617..2c9f7364de45 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type,
int code, __be32 info,
        if (!fl4.saddr)
                fl4.saddr = htonl(INADDR_DUMMY);

+       trace_icmp_send(skb_in, type, code);
        icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt);
 ende:
        ip_rt_put(rt);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 1e650ec71d2f..d5963831280f 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 {
        struct net *net = dev_net(skb->dev);
        enum skb_drop_reason drop_reason;
+       enum sk_rst_reason rst_reason;
        int sdif = inet_sdif(skb);
        int dif = inet_iif(skb);
        const struct iphdr *iph;
@@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
 bad_packet:
                __TCP_INC_STATS(net, TCP_MIB_INERRS);
        } else {
-               tcp_v4_send_reset(NULL, skb);
+               rst_reason = RST_REASON_NO_SOCKET;
+               tcp_v4_send_reset(NULL, skb, rst_reason);
        }

 discard_it:

As you can see, we need to add a new 'rst_reason' variable which
actually is the same as drop reason. They are the same except for the
enum type... Such rst_reasons/drop_reasons are all over the place.

Eric, if you have a strong preference, I can do it as you said.

Well, how about explicitly casting them like this based on the current
series. It looks better and clearer and more helpful to people who is
reading codes to understand:
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 461b4d2b7cfe..eb125163d819 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
        return 0;

 reset:
-       tcp_v4_send_reset(rsk, skb, (u32)reason);
+       tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason);
 discard:
        kfree_skb_reason(skb, reason);
        /* Be careful here. If this function gets more complicated and

Thanks for your patience again.

Jason

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-19  2:30           ` Jason Xing
@ 2024-04-19  7:02             ` Eric Dumazet
  2024-04-19  7:28               ` Jason Xing
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Dumazet @ 2024-04-19  7:02 UTC (permalink / raw)
  To: Jason Xing
  Cc: Jakub Kicinski, dsahern, matttbe, martineau, geliang, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp,
	netdev, linux-trace-kernel, Jason Xing

On Fri, Apr 19, 2024 at 4:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > > When I said "If you feel the need to put them in a special group, this
> > > is fine by me.",
> > > this was really about partitioning the existing enum into groups, if
> > > you prefer having a group of 'RES reasons'
> >
> > Are you suggesting copying what we need from enum skb_drop_reason{} to
> > enum sk_rst_reason{}? Why not reusing them directly. I have no idea
> > what the side effect of cast conversion itself is?
>
> Sorry that I'm writing this email. I'm worried my statement is not
> that clear, so I write one simple snippet which can help me explain
> well :)
>
> Allow me give NO_SOCKET as an example:
> diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> index e63a3bf99617..2c9f7364de45 100644
> --- a/net/ipv4/icmp.c
> +++ b/net/ipv4/icmp.c
> @@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type,
> int code, __be32 info,
>         if (!fl4.saddr)
>                 fl4.saddr = htonl(INADDR_DUMMY);
>
> +       trace_icmp_send(skb_in, type, code);
>         icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt);
>  ende:
>         ip_rt_put(rt);
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 1e650ec71d2f..d5963831280f 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  {
>         struct net *net = dev_net(skb->dev);
>         enum skb_drop_reason drop_reason;
> +       enum sk_rst_reason rst_reason;
>         int sdif = inet_sdif(skb);
>         int dif = inet_iif(skb);
>         const struct iphdr *iph;
> @@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  bad_packet:
>                 __TCP_INC_STATS(net, TCP_MIB_INERRS);
>         } else {
> -               tcp_v4_send_reset(NULL, skb);
> +               rst_reason = RST_REASON_NO_SOCKET;
> +               tcp_v4_send_reset(NULL, skb, rst_reason);
>         }
>
>  discard_it:
>
> As you can see, we need to add a new 'rst_reason' variable which
> actually is the same as drop reason. They are the same except for the
> enum type... Such rst_reasons/drop_reasons are all over the place.
>
> Eric, if you have a strong preference, I can do it as you said.
>
> Well, how about explicitly casting them like this based on the current
> series. It looks better and clearer and more helpful to people who is
> reading codes to understand:
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 461b4d2b7cfe..eb125163d819 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
>         return 0;
>
>  reset:
> -       tcp_v4_send_reset(rsk, skb, (u32)reason);
> +       tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason);
>  discard:
>         kfree_skb_reason(skb, reason);
>         /* Be careful here. If this function gets more complicated and

It makes no sense to declare an enum sk_rst_reason and then convert it
to drop_reason
or vice versa.

Next thing you know, compiler guys will add a new -Woption that will
forbid such conversions.

Please add to tcp_v4_send_reset() an skb_drop_reason, not a new enum.

skb_drop_reason are simply values that are later converted to strings...

So : Do not declare a new enum.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-19  7:02             ` Eric Dumazet
@ 2024-04-19  7:28               ` Jason Xing
  2024-04-19  7:44                 ` Eric Dumazet
  0 siblings, 1 reply; 27+ messages in thread
From: Jason Xing @ 2024-04-19  7:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jakub Kicinski, dsahern, matttbe, martineau, geliang, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp,
	netdev, linux-trace-kernel, Jason Xing

On Fri, Apr 19, 2024 at 3:02 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Fri, Apr 19, 2024 at 4:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > >
> > > > When I said "If you feel the need to put them in a special group, this
> > > > is fine by me.",
> > > > this was really about partitioning the existing enum into groups, if
> > > > you prefer having a group of 'RES reasons'
> > >
> > > Are you suggesting copying what we need from enum skb_drop_reason{} to
> > > enum sk_rst_reason{}? Why not reusing them directly. I have no idea
> > > what the side effect of cast conversion itself is?
> >
> > Sorry that I'm writing this email. I'm worried my statement is not
> > that clear, so I write one simple snippet which can help me explain
> > well :)
> >
> > Allow me give NO_SOCKET as an example:
> > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> > index e63a3bf99617..2c9f7364de45 100644
> > --- a/net/ipv4/icmp.c
> > +++ b/net/ipv4/icmp.c
> > @@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type,
> > int code, __be32 info,
> >         if (!fl4.saddr)
> >                 fl4.saddr = htonl(INADDR_DUMMY);
> >
> > +       trace_icmp_send(skb_in, type, code);
> >         icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt);
> >  ende:
> >         ip_rt_put(rt);
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 1e650ec71d2f..d5963831280f 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
> >  {
> >         struct net *net = dev_net(skb->dev);
> >         enum skb_drop_reason drop_reason;
> > +       enum sk_rst_reason rst_reason;
> >         int sdif = inet_sdif(skb);
> >         int dif = inet_iif(skb);
> >         const struct iphdr *iph;
> > @@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
> >  bad_packet:
> >                 __TCP_INC_STATS(net, TCP_MIB_INERRS);
> >         } else {
> > -               tcp_v4_send_reset(NULL, skb);
> > +               rst_reason = RST_REASON_NO_SOCKET;
> > +               tcp_v4_send_reset(NULL, skb, rst_reason);
> >         }
> >
> >  discard_it:
> >
> > As you can see, we need to add a new 'rst_reason' variable which
> > actually is the same as drop reason. They are the same except for the
> > enum type... Such rst_reasons/drop_reasons are all over the place.
> >
> > Eric, if you have a strong preference, I can do it as you said.
> >
> > Well, how about explicitly casting them like this based on the current
> > series. It looks better and clearer and more helpful to people who is
> > reading codes to understand:
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 461b4d2b7cfe..eb125163d819 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
> >         return 0;
> >
> >  reset:
> > -       tcp_v4_send_reset(rsk, skb, (u32)reason);
> > +       tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason);
> >  discard:
> >         kfree_skb_reason(skb, reason);
> >         /* Be careful here. If this function gets more complicated and
>
> It makes no sense to declare an enum sk_rst_reason and then convert it
> to drop_reason
> or vice versa.
>
> Next thing you know, compiler guys will add a new -Woption that will
> forbid such conversions.
>
> Please add to tcp_v4_send_reset() an skb_drop_reason, not a new enum.

Ah... It looks like I didn't make myself clear again. Sorry...
Actually I wrote this part many times. My conclusion is that It's not
feasible to do this.

REASONS:
If we __only__ need to deal with this passive reset in TCP, it's fine.
We pass a skb_drop_reason which should also be converted to u32 type
in tcp_v4_send_reset() as you said, it can work. People who use the
trace will see the reason with the 'SKB_DROP_REASON' prefix stripped.

But we have to deal with other cases. A few questions are listed here:
1) What about tcp_send_active_reset() in TCP/MPTCP? Passing weird drop
reasons? There is no drop reason at all. I think people will get
confused. So I believe we should invent new definitions to cope with
it.
2) What about the .send_reset callback in the subflow_syn_recv_sock()
in MPTCP? The reasons in MPTCP are only definitions (such as
MPTCP_RST_EUNSPEC). I don't think we can convert them into the enum
skb_drop_reason type.

So where should we group those various reasons?

Introducing a new enum is for extension and compatibility for all
kinds of reset reasons.

What do you think?

Thanks,
Jason

>
> skb_drop_reason are simply values that are later converted to strings...
>
> So : Do not declare a new enum.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-19  7:28               ` Jason Xing
@ 2024-04-19  7:44                 ` Eric Dumazet
  2024-04-19  8:00                   ` Jason Xing
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Dumazet @ 2024-04-19  7:44 UTC (permalink / raw)
  To: Jason Xing
  Cc: Jakub Kicinski, dsahern, matttbe, martineau, geliang, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp,
	netdev, linux-trace-kernel, Jason Xing

On Fri, Apr 19, 2024 at 9:29 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> On Fri, Apr 19, 2024 at 3:02 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Fri, Apr 19, 2024 at 4:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > >
> > > On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > > >
> > > > > When I said "If you feel the need to put them in a special group, this
> > > > > is fine by me.",
> > > > > this was really about partitioning the existing enum into groups, if
> > > > > you prefer having a group of 'RES reasons'
> > > >
> > > > Are you suggesting copying what we need from enum skb_drop_reason{} to
> > > > enum sk_rst_reason{}? Why not reusing them directly. I have no idea
> > > > what the side effect of cast conversion itself is?
> > >
> > > Sorry that I'm writing this email. I'm worried my statement is not
> > > that clear, so I write one simple snippet which can help me explain
> > > well :)
> > >
> > > Allow me give NO_SOCKET as an example:
> > > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> > > index e63a3bf99617..2c9f7364de45 100644
> > > --- a/net/ipv4/icmp.c
> > > +++ b/net/ipv4/icmp.c
> > > @@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type,
> > > int code, __be32 info,
> > >         if (!fl4.saddr)
> > >                 fl4.saddr = htonl(INADDR_DUMMY);
> > >
> > > +       trace_icmp_send(skb_in, type, code);
> > >         icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt);
> > >  ende:
> > >         ip_rt_put(rt);
> > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > index 1e650ec71d2f..d5963831280f 100644
> > > --- a/net/ipv4/tcp_ipv4.c
> > > +++ b/net/ipv4/tcp_ipv4.c
> > > @@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > >  {
> > >         struct net *net = dev_net(skb->dev);
> > >         enum skb_drop_reason drop_reason;
> > > +       enum sk_rst_reason rst_reason;
> > >         int sdif = inet_sdif(skb);
> > >         int dif = inet_iif(skb);
> > >         const struct iphdr *iph;
> > > @@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > >  bad_packet:
> > >                 __TCP_INC_STATS(net, TCP_MIB_INERRS);
> > >         } else {
> > > -               tcp_v4_send_reset(NULL, skb);
> > > +               rst_reason = RST_REASON_NO_SOCKET;
> > > +               tcp_v4_send_reset(NULL, skb, rst_reason);
> > >         }
> > >
> > >  discard_it:
> > >
> > > As you can see, we need to add a new 'rst_reason' variable which
> > > actually is the same as drop reason. They are the same except for the
> > > enum type... Such rst_reasons/drop_reasons are all over the place.
> > >
> > > Eric, if you have a strong preference, I can do it as you said.
> > >
> > > Well, how about explicitly casting them like this based on the current
> > > series. It looks better and clearer and more helpful to people who is
> > > reading codes to understand:
> > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > index 461b4d2b7cfe..eb125163d819 100644
> > > --- a/net/ipv4/tcp_ipv4.c
> > > +++ b/net/ipv4/tcp_ipv4.c
> > > @@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
> > >         return 0;
> > >
> > >  reset:
> > > -       tcp_v4_send_reset(rsk, skb, (u32)reason);
> > > +       tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason);
> > >  discard:
> > >         kfree_skb_reason(skb, reason);
> > >         /* Be careful here. If this function gets more complicated and
> >
> > It makes no sense to declare an enum sk_rst_reason and then convert it
> > to drop_reason
> > or vice versa.
> >
> > Next thing you know, compiler guys will add a new -Woption that will
> > forbid such conversions.
> >
> > Please add to tcp_v4_send_reset() an skb_drop_reason, not a new enum.
>
> Ah... It looks like I didn't make myself clear again. Sorry...
> Actually I wrote this part many times. My conclusion is that It's not
> feasible to do this.
>
> REASONS:
> If we __only__ need to deal with this passive reset in TCP, it's fine.
> We pass a skb_drop_reason which should also be converted to u32 type
> in tcp_v4_send_reset() as you said, it can work. People who use the
> trace will see the reason with the 'SKB_DROP_REASON' prefix stripped.
>
> But we have to deal with other cases. A few questions are listed here:
> 1) What about tcp_send_active_reset() in TCP/MPTCP? Passing weird drop
> reasons? There is no drop reason at all. I think people will get
> confused. So I believe we should invent new definitions to cope with
> it.
> 2) What about the .send_reset callback in the subflow_syn_recv_sock()
> in MPTCP? The reasons in MPTCP are only definitions (such as
> MPTCP_RST_EUNSPEC). I don't think we can convert them into the enum
> skb_drop_reason type.
>
> So where should we group those various reasons?
>
> Introducing a new enum is for extension and compatibility for all
> kinds of reset reasons.
>
> What do you think?

I will stop repeating myself.

enums are not what you think.

type safety is there for a reason.

Can you show me another place in networking stacks where we cast enums
to others ?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-19  7:44                 ` Eric Dumazet
@ 2024-04-19  8:00                   ` Jason Xing
  2024-04-19  8:06                     ` Jason Xing
  2024-04-20  2:35                     ` Steven Rostedt
  0 siblings, 2 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-19  8:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jakub Kicinski, dsahern, matttbe, martineau, geliang, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp,
	netdev, linux-trace-kernel, Jason Xing

On Fri, Apr 19, 2024 at 3:44 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Fri, Apr 19, 2024 at 9:29 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > On Fri, Apr 19, 2024 at 3:02 PM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Fri, Apr 19, 2024 at 4:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > > >
> > > > On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > > > >
> > > > > > When I said "If you feel the need to put them in a special group, this
> > > > > > is fine by me.",
> > > > > > this was really about partitioning the existing enum into groups, if
> > > > > > you prefer having a group of 'RES reasons'
> > > > >
> > > > > Are you suggesting copying what we need from enum skb_drop_reason{} to
> > > > > enum sk_rst_reason{}? Why not reusing them directly. I have no idea
> > > > > what the side effect of cast conversion itself is?
> > > >
> > > > Sorry that I'm writing this email. I'm worried my statement is not
> > > > that clear, so I write one simple snippet which can help me explain
> > > > well :)
> > > >
> > > > Allow me give NO_SOCKET as an example:
> > > > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> > > > index e63a3bf99617..2c9f7364de45 100644
> > > > --- a/net/ipv4/icmp.c
> > > > +++ b/net/ipv4/icmp.c
> > > > @@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type,
> > > > int code, __be32 info,
> > > >         if (!fl4.saddr)
> > > >                 fl4.saddr = htonl(INADDR_DUMMY);
> > > >
> > > > +       trace_icmp_send(skb_in, type, code);
> > > >         icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt);
> > > >  ende:
> > > >         ip_rt_put(rt);
> > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > > index 1e650ec71d2f..d5963831280f 100644
> > > > --- a/net/ipv4/tcp_ipv4.c
> > > > +++ b/net/ipv4/tcp_ipv4.c
> > > > @@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > > >  {
> > > >         struct net *net = dev_net(skb->dev);
> > > >         enum skb_drop_reason drop_reason;
> > > > +       enum sk_rst_reason rst_reason;
> > > >         int sdif = inet_sdif(skb);
> > > >         int dif = inet_iif(skb);
> > > >         const struct iphdr *iph;
> > > > @@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > > >  bad_packet:
> > > >                 __TCP_INC_STATS(net, TCP_MIB_INERRS);
> > > >         } else {
> > > > -               tcp_v4_send_reset(NULL, skb);
> > > > +               rst_reason = RST_REASON_NO_SOCKET;
> > > > +               tcp_v4_send_reset(NULL, skb, rst_reason);
> > > >         }
> > > >
> > > >  discard_it:
> > > >
> > > > As you can see, we need to add a new 'rst_reason' variable which
> > > > actually is the same as drop reason. They are the same except for the
> > > > enum type... Such rst_reasons/drop_reasons are all over the place.
> > > >
> > > > Eric, if you have a strong preference, I can do it as you said.
> > > >
> > > > Well, how about explicitly casting them like this based on the current
> > > > series. It looks better and clearer and more helpful to people who is
> > > > reading codes to understand:
> > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > > index 461b4d2b7cfe..eb125163d819 100644
> > > > --- a/net/ipv4/tcp_ipv4.c
> > > > +++ b/net/ipv4/tcp_ipv4.c
> > > > @@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
> > > >         return 0;
> > > >
> > > >  reset:
> > > > -       tcp_v4_send_reset(rsk, skb, (u32)reason);
> > > > +       tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason);
> > > >  discard:
> > > >         kfree_skb_reason(skb, reason);
> > > >         /* Be careful here. If this function gets more complicated and
> > >
> > > It makes no sense to declare an enum sk_rst_reason and then convert it
> > > to drop_reason
> > > or vice versa.
> > >
> > > Next thing you know, compiler guys will add a new -Woption that will
> > > forbid such conversions.
> > >
> > > Please add to tcp_v4_send_reset() an skb_drop_reason, not a new enum.
> >
> > Ah... It looks like I didn't make myself clear again. Sorry...
> > Actually I wrote this part many times. My conclusion is that It's not
> > feasible to do this.
> >
> > REASONS:
> > If we __only__ need to deal with this passive reset in TCP, it's fine.
> > We pass a skb_drop_reason which should also be converted to u32 type
> > in tcp_v4_send_reset() as you said, it can work. People who use the
> > trace will see the reason with the 'SKB_DROP_REASON' prefix stripped.
> >
> > But we have to deal with other cases. A few questions are listed here:
> > 1) What about tcp_send_active_reset() in TCP/MPTCP? Passing weird drop
> > reasons? There is no drop reason at all. I think people will get
> > confused. So I believe we should invent new definitions to cope with
> > it.
> > 2) What about the .send_reset callback in the subflow_syn_recv_sock()
> > in MPTCP? The reasons in MPTCP are only definitions (such as
> > MPTCP_RST_EUNSPEC). I don't think we can convert them into the enum
> > skb_drop_reason type.
> >
> > So where should we group those various reasons?
> >
> > Introducing a new enum is for extension and compatibility for all
> > kinds of reset reasons.
> >
> > What do you think?
>
> I will stop repeating myself.
>
> enums are not what you think.
>
> type safety is there for a reason.
>
> Can you show me another place in networking stacks where we cast enums
> to others ?

No, I've checked this a month ago.

BTW, I don't know the dangers of casting enum types. I know you will
answer it, but I still insist on asking, hoping someone seeing this
will help me.

Using skb_drop_reason can only deal with the passive reset in TCP. It
is just one of all kinds of reset cases :(

Forgive me, I cannot come up with a good way to cover all the cases TT

I've tried..Sorry...

If other experts see this thread, please help me. I would appreciate
it. I have strong interests and feel strong responsibility to
implement something like this patch series. It can be very useful!!

Thanks again.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-19  8:00                   ` Jason Xing
@ 2024-04-19  8:06                     ` Jason Xing
  2024-04-20  2:35                     ` Steven Rostedt
  1 sibling, 0 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-19  8:06 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jakub Kicinski, dsahern, matttbe, martineau, geliang, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart, mptcp,
	netdev, linux-trace-kernel, Jason Xing

On Fri, Apr 19, 2024 at 4:00 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> On Fri, Apr 19, 2024 at 3:44 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Fri, Apr 19, 2024 at 9:29 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > >
> > > On Fri, Apr 19, 2024 at 3:02 PM Eric Dumazet <edumazet@google.com> wrote:
> > > >
> > > > On Fri, Apr 19, 2024 at 4:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > > > >
> > > > > On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > > > > >
> > > > > > > When I said "If you feel the need to put them in a special group, this
> > > > > > > is fine by me.",
> > > > > > > this was really about partitioning the existing enum into groups, if
> > > > > > > you prefer having a group of 'RES reasons'
> > > > > >
> > > > > > Are you suggesting copying what we need from enum skb_drop_reason{} to
> > > > > > enum sk_rst_reason{}? Why not reusing them directly. I have no idea
> > > > > > what the side effect of cast conversion itself is?
> > > > >
> > > > > Sorry that I'm writing this email. I'm worried my statement is not
> > > > > that clear, so I write one simple snippet which can help me explain
> > > > > well :)
> > > > >
> > > > > Allow me give NO_SOCKET as an example:
> > > > > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> > > > > index e63a3bf99617..2c9f7364de45 100644
> > > > > --- a/net/ipv4/icmp.c
> > > > > +++ b/net/ipv4/icmp.c
> > > > > @@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type,
> > > > > int code, __be32 info,
> > > > >         if (!fl4.saddr)
> > > > >                 fl4.saddr = htonl(INADDR_DUMMY);
> > > > >
> > > > > +       trace_icmp_send(skb_in, type, code);
> > > > >         icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt);
> > > > >  ende:
> > > > >         ip_rt_put(rt);
> > > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > > > index 1e650ec71d2f..d5963831280f 100644
> > > > > --- a/net/ipv4/tcp_ipv4.c
> > > > > +++ b/net/ipv4/tcp_ipv4.c
> > > > > @@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > > > >  {
> > > > >         struct net *net = dev_net(skb->dev);
> > > > >         enum skb_drop_reason drop_reason;
> > > > > +       enum sk_rst_reason rst_reason;
> > > > >         int sdif = inet_sdif(skb);
> > > > >         int dif = inet_iif(skb);
> > > > >         const struct iphdr *iph;
> > > > > @@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > > > >  bad_packet:
> > > > >                 __TCP_INC_STATS(net, TCP_MIB_INERRS);
> > > > >         } else {
> > > > > -               tcp_v4_send_reset(NULL, skb);
> > > > > +               rst_reason = RST_REASON_NO_SOCKET;
> > > > > +               tcp_v4_send_reset(NULL, skb, rst_reason);
> > > > >         }
> > > > >
> > > > >  discard_it:
> > > > >
> > > > > As you can see, we need to add a new 'rst_reason' variable which
> > > > > actually is the same as drop reason. They are the same except for the
> > > > > enum type... Such rst_reasons/drop_reasons are all over the place.
> > > > >
> > > > > Eric, if you have a strong preference, I can do it as you said.
> > > > >
> > > > > Well, how about explicitly casting them like this based on the current
> > > > > series. It looks better and clearer and more helpful to people who is
> > > > > reading codes to understand:
> > > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > > > index 461b4d2b7cfe..eb125163d819 100644
> > > > > --- a/net/ipv4/tcp_ipv4.c
> > > > > +++ b/net/ipv4/tcp_ipv4.c
> > > > > @@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
> > > > >         return 0;
> > > > >
> > > > >  reset:
> > > > > -       tcp_v4_send_reset(rsk, skb, (u32)reason);
> > > > > +       tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason);
> > > > >  discard:
> > > > >         kfree_skb_reason(skb, reason);
> > > > >         /* Be careful here. If this function gets more complicated and
> > > >
> > > > It makes no sense to declare an enum sk_rst_reason and then convert it
> > > > to drop_reason
> > > > or vice versa.
> > > >
> > > > Next thing you know, compiler guys will add a new -Woption that will
> > > > forbid such conversions.
> > > >
> > > > Please add to tcp_v4_send_reset() an skb_drop_reason, not a new enum.
> > >
> > > Ah... It looks like I didn't make myself clear again. Sorry...
> > > Actually I wrote this part many times. My conclusion is that It's not
> > > feasible to do this.
> > >
> > > REASONS:
> > > If we __only__ need to deal with this passive reset in TCP, it's fine.
> > > We pass a skb_drop_reason which should also be converted to u32 type
> > > in tcp_v4_send_reset() as you said, it can work. People who use the
> > > trace will see the reason with the 'SKB_DROP_REASON' prefix stripped.
> > >
> > > But we have to deal with other cases. A few questions are listed here:
> > > 1) What about tcp_send_active_reset() in TCP/MPTCP? Passing weird drop
> > > reasons? There is no drop reason at all. I think people will get
> > > confused. So I believe we should invent new definitions to cope with
> > > it.
> > > 2) What about the .send_reset callback in the subflow_syn_recv_sock()
> > > in MPTCP? The reasons in MPTCP are only definitions (such as
> > > MPTCP_RST_EUNSPEC). I don't think we can convert them into the enum
> > > skb_drop_reason type.
> > >
> > > So where should we group those various reasons?
> > >
> > > Introducing a new enum is for extension and compatibility for all
> > > kinds of reset reasons.
> > >
> > > What do you think?
> >
> > I will stop repeating myself.
> >
> > enums are not what you think.
> >
> > type safety is there for a reason.
> >
> > Can you show me another place in networking stacks where we cast enums
> > to others ?
>
> No, I've checked this a month ago.
>
> BTW, I don't know the dangers of casting enum types. I know you will

s/will/won't/

> answer it, but I still insist on asking, hoping someone seeing this
> will help me.
>
> Using skb_drop_reason can only deal with the passive reset in TCP. It
> is just one of all kinds of reset cases :(
>
> Forgive me, I cannot come up with a good way to cover all the cases TT
>
> I've tried..Sorry...
>
> If other experts see this thread, please help me. I would appreciate
> it. I have strong interests and feel strong responsibility to
> implement something like this patch series. It can be very useful!!
>
> Thanks again.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-19  8:00                   ` Jason Xing
  2024-04-19  8:06                     ` Jason Xing
@ 2024-04-20  2:35                     ` Steven Rostedt
  2024-04-20  3:22                       ` Jason Xing
  1 sibling, 1 reply; 27+ messages in thread
From: Steven Rostedt @ 2024-04-20  2:35 UTC (permalink / raw)
  To: Jason Xing
  Cc: Eric Dumazet, Jakub Kicinski, dsahern, matttbe, martineau,
	geliang, pabeni, davem, mhiramat, mathieu.desnoyers, atenart,
	mptcp, netdev, linux-trace-kernel, Jason Xing

On Fri, 19 Apr 2024 16:00:20 +0800
Jason Xing <kerneljasonxing@gmail.com> wrote:

> If other experts see this thread, please help me. I would appreciate
> it. I have strong interests and feel strong responsibility to
> implement something like this patch series. It can be very useful!!

I'm not a networking expert, but as I'm Cc'd and this is about tracing,
I'll jump in to see if I can help. Honestly, reading the thread, it
appears that you and Eric are talking past each other.

I believe Eric is concerned about losing the value of the enum. Enums
are types, and if you typecast them to another type, they lose the
previous type, and all the safety that goes with it.

Now, I do not really understand the problem trying to be solved here. I
understand how TCP works but I never looked into the implementation of
MPTCP.

You added this:

+static inline enum sk_rst_reason convert_mptcp_reason(u32 reason)
+{
+	return reason += RST_REASON_START;
+}

And used it for places like this:

@@ -309,8 +309,13 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 		return dst;
 
 	dst_release(dst);
-	if (!req->syncookie)
-		tcp_request_sock_ops.send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
+	if (!req->syncookie) {
+		struct mptcp_ext *mpext = mptcp_get_ext(skb);
+		enum sk_rst_reason reason;
+
+		reason = convert_mptcp_reason(mpext->reset_reason);
+		tcp_request_sock_ops.send_reset(sk, skb, reason);
+	}
 	return NULL;
 }

As I don't know this code or how MPTCP works, I do not understand the
above. It use to pass to send_reset() SK_RST_REASON_NOT_SPECIFIED. But
now it takes a "reset_reason" calls the "convert_mptcp_reason()" to get
back a enum value.

If you are mapping the reset_reason to enum sk_rst_reason, why not do
it via a real conversion instead of this fragile arithmetic between the two
values?

static inline enum sk_rst_reason convert_mptcp_reason(u32 reason)
{
	switch(reason) {
	case 0: return SK_RST_REASON_MPTCP_RST_EUNSPEC;
	case 1: return SK_RST_REASON_MPTCP_RST_EMPTCP;
	case 2: return SK_RST_REASON_MPTCP_RST_ERESOURCE;
	[..]
	default: return SK_RST_REASON_MAX; // or some other error value
	]
}

I'm not sure if this is any better, but it's not doing any casting and
it's easier to understand. It's a simple mapping between the reason and
the enum and there's no inherit dependency between the values. Could
possibly create enums for the reason numbers and replace the hard coded
values with them.

That way that helper function is at least doing a real conversion of
one type to another.

But like I said from the beginning. I don't understand the details here
and have not spent the time to dig deeper. I just read the thread and I
agree with Eric that the arithmetic conversion of reason to an enum
looks fragile at best and buggy at worst.

-- Steve

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect
  2024-04-20  2:35                     ` Steven Rostedt
@ 2024-04-20  3:22                       ` Jason Xing
  0 siblings, 0 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-20  3:22 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Eric Dumazet, Jakub Kicinski, dsahern, matttbe, martineau,
	geliang, pabeni, davem, mhiramat, mathieu.desnoyers, atenart,
	mptcp, netdev, linux-trace-kernel, Jason Xing

Hello Steven,

On Sat, Apr 20, 2024 at 10:36 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 19 Apr 2024 16:00:20 +0800
> Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> > If other experts see this thread, please help me. I would appreciate
> > it. I have strong interests and feel strong responsibility to
> > implement something like this patch series. It can be very useful!!
>
> I'm not a networking expert, but as I'm Cc'd and this is about tracing,
> I'll jump in to see if I can help. Honestly, reading the thread, it
> appears that you and Eric are talking past each other.
>
> I believe Eric is concerned about losing the value of the enum. Enums
> are types, and if you typecast them to another type, they lose the
> previous type, and all the safety that goes with it.

Ah, I see. Possible lost value in another enum could cause a problem.

>
> Now, I do not really understand the problem trying to be solved here. I
> understand how TCP works but I never looked into the implementation of
> MPTCP.
>
> You added this:
>
> +static inline enum sk_rst_reason convert_mptcp_reason(u32 reason)
> +{
> +       return reason += RST_REASON_START;
> +}
>
> And used it for places like this:
>
> @@ -309,8 +309,13 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
>                 return dst;
>
>         dst_release(dst);
> -       if (!req->syncookie)
> -               tcp_request_sock_ops.send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED);
> +       if (!req->syncookie) {
> +               struct mptcp_ext *mpext = mptcp_get_ext(skb);
> +               enum sk_rst_reason reason;
> +
> +               reason = convert_mptcp_reason(mpext->reset_reason);
> +               tcp_request_sock_ops.send_reset(sk, skb, reason);
> +       }
>         return NULL;
>  }
>
> As I don't know this code or how MPTCP works, I do not understand the
> above. It use to pass to send_reset() SK_RST_REASON_NOT_SPECIFIED. But
> now it takes a "reset_reason" calls the "convert_mptcp_reason()" to get
> back a enum value.
>
> If you are mapping the reset_reason to enum sk_rst_reason, why not do
> it via a real conversion instead of this fragile arithmetic between the two
> values?
>
> static inline enum sk_rst_reason convert_mptcp_reason(u32 reason)
> {
>         switch(reason) {
>         case 0: return SK_RST_REASON_MPTCP_RST_EUNSPEC;
>         case 1: return SK_RST_REASON_MPTCP_RST_EMPTCP;
>         case 2: return SK_RST_REASON_MPTCP_RST_ERESOURCE;
>         [..]
>         default: return SK_RST_REASON_MAX; // or some other error value
>         ]
> }

This code snippet looks much better than mine.

>
> I'm not sure if this is any better, but it's not doing any casting and
> it's easier to understand. It's a simple mapping between the reason and
> the enum and there's no inherit dependency between the values. Could
> possibly create enums for the reason numbers and replace the hard coded
> values with them.

Right.

I also need to handle many drop reasons cases like what you do. Due to
too many of them, I will try the key-value map instead of switch-case
and then see if it works.

>
> That way that helper function is at least doing a real conversion of
> one type to another.
>
> But like I said from the beginning. I don't understand the details here
> and have not spent the time to dig deeper. I just read the thread and I
> agree with Eric that the arithmetic conversion of reason to an enum
> looks fragile at best and buggy at worst.

Thanks so much for your help, which I didn't even imagine.

Sure, after one night of investigation, I figured it out. I will drop
enum casts without any doubt as Eric and you suggested. But I believe
a new enum is needed, grouping various reasons together which help
ftrace print the valid string to userspace.

Thanks,
Jason

>
> -- Steve

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH net-next v6 7/7] rstreason: make it work in trace world
  2024-04-18 13:32 [PATCH net-next RESEND " Jason Xing
@ 2024-04-18 13:32 ` Jason Xing
  0 siblings, 0 replies; 27+ messages in thread
From: Jason Xing @ 2024-04-18 13:32 UTC (permalink / raw)
  To: edumazet, dsahern, matttbe, martineau, geliang, kuba, pabeni,
	davem, rostedt, mhiramat, mathieu.desnoyers, atenart
  Cc: mptcp, netdev, linux-trace-kernel, kerneljasonxing, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

At last, we should let it work by introducing this reset reason in
trace world.

One of the possible expected outputs is:
... tcp_send_reset: skbaddr=xxx skaddr=xxx src=xxx dest=xxx
state=TCP_ESTABLISHED reason=NOT_SPECIFIED

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 include/trace/events/tcp.h | 37 +++++++++++++++++++++++++++++++++----
 net/ipv4/tcp_ipv4.c        |  2 +-
 net/ipv4/tcp_output.c      |  2 +-
 net/ipv6/tcp_ipv6.c        |  2 +-
 4 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 5c04a61a11c2..b1455cbc0634 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -11,6 +11,7 @@
 #include <net/ipv6.h>
 #include <net/tcp.h>
 #include <linux/sock_diag.h>
+#include <net/rstreason.h>
 
 /*
  * tcp event with arguments sk and skb
@@ -74,20 +75,38 @@ DEFINE_EVENT(tcp_event_sk_skb, tcp_retransmit_skb,
 	TP_ARGS(sk, skb)
 );
 
+#undef FN1
+#define FN1(reason)	TRACE_DEFINE_ENUM(SK_RST_REASON_##reason);
+#undef FN2
+#define FN2(reason)	TRACE_DEFINE_ENUM(SKB_DROP_REASON_##reason);
+DEFINE_RST_REASON(FN1, FN1)
+
+#undef FN1
+#undef FNe1
+#define FN1(reason)	{ SK_RST_REASON_##reason, #reason },
+#define FNe1(reason)	{ SK_RST_REASON_##reason, #reason }
+
+#undef FN2
+#undef FNe2
+#define FN2(reason)	{ SKB_DROP_REASON_##reason, #reason },
+#define FNe2(reason)	{ SKB_DROP_REASON_##reason, #reason }
 /*
  * skb of trace_tcp_send_reset is the skb that caused RST. In case of
  * active reset, skb should be NULL
  */
 TRACE_EVENT(tcp_send_reset,
 
-	TP_PROTO(const struct sock *sk, const struct sk_buff *skb),
+	TP_PROTO(const struct sock *sk,
+		 const struct sk_buff *skb,
+		 const enum sk_rst_reason reason),
 
-	TP_ARGS(sk, skb),
+	TP_ARGS(sk, skb, reason),
 
 	TP_STRUCT__entry(
 		__field(const void *, skbaddr)
 		__field(const void *, skaddr)
 		__field(int, state)
+		__field(enum sk_rst_reason, reason)
 		__array(__u8, saddr, sizeof(struct sockaddr_in6))
 		__array(__u8, daddr, sizeof(struct sockaddr_in6))
 	),
@@ -113,14 +132,24 @@ TRACE_EVENT(tcp_send_reset,
 			 */
 			TP_STORE_ADDR_PORTS_SKB(skb, th, entry->daddr, entry->saddr);
 		}
+		__entry->reason = reason;
 	),
 
-	TP_printk("skbaddr=%p skaddr=%p src=%pISpc dest=%pISpc state=%s",
+	TP_printk("skbaddr=%p skaddr=%p src=%pISpc dest=%pISpc state=%s reason=%s",
 		  __entry->skbaddr, __entry->skaddr,
 		  __entry->saddr, __entry->daddr,
-		  __entry->state ? show_tcp_state_name(__entry->state) : "UNKNOWN")
+		  __entry->state ? show_tcp_state_name(__entry->state) : "UNKNOWN",
+		  __entry->reason < RST_REASON_START ?
+			__print_symbolic(__entry->reason, DEFINE_DROP_REASON(FN2, FNe2)) :
+			__print_symbolic(__entry->reason, DEFINE_RST_REASON(FN1, FNe1)))
 );
 
+#undef FN1
+#undef FNe1
+
+#undef FN2
+#undef FNe2
+
 /*
  * tcp event with arguments sk
  *
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index d78412cf8566..461b4d2b7cfe 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -871,7 +871,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb,
 	if (sk)
 		arg.bound_dev_if = sk->sk_bound_dev_if;
 
-	trace_tcp_send_reset(sk, skb);
+	trace_tcp_send_reset(sk, skb, reason);
 
 	BUILD_BUG_ON(offsetof(struct sock, sk_bound_dev_if) !=
 		     offsetof(struct inet_timewait_sock, tw_bound_dev_if));
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 276d9d541b01..b08ffb17d5a0 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3612,7 +3612,7 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority,
 	/* skb of trace_tcp_send_reset() keeps the skb that caused RST,
 	 * skb here is different to the troublesome skb, so use NULL
 	 */
-	trace_tcp_send_reset(sk, NULL);
+	trace_tcp_send_reset(sk, NULL, SK_RST_REASON_NOT_SPECIFIED);
 }
 
 /* Send a crossed SYN-ACK during socket establishment.
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index c46095fb596c..6a4736ec3df0 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1133,7 +1133,7 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb,
 			label = ip6_flowlabel(ipv6h);
 	}
 
-	trace_tcp_send_reset(sk, skb);
+	trace_tcp_send_reset(sk, skb, reason);
 
 	tcp_v6_send_response(sk, skb, seq, ack_seq, 0, 0, 0, oif, 1,
 			     ipv6_get_dsfield(ipv6h), label, priority, txhash,
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2024-04-20  3:23 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-17  8:51 [PATCH net-next v6 0/7] Implement reset reason mechanism to detect Jason Xing
2024-04-17  8:51 ` [PATCH net-next v6 1/7] net: introduce rstreason to detect why the RST is sent Jason Xing
2024-04-17  9:01   ` Eric Dumazet
2024-04-17  9:22     ` Jason Xing
2024-04-17  8:51 ` [PATCH net-next v6 2/7] rstreason: prepare for passive reset Jason Xing
2024-04-17  8:51 ` [PATCH net-next v6 3/7] rstreason: prepare for active reset Jason Xing
2024-04-17  8:51 ` [PATCH net-next v6 4/7] tcp: support rstreason for passive reset Jason Xing
2024-04-17  8:51 ` [PATCH net-next v6 5/7] mptcp: " Jason Xing
2024-04-17  8:51 ` [PATCH net-next v6 6/7] mptcp: introducing a helper into active reset logic Jason Xing
2024-04-17  8:51 ` [PATCH net-next v6 7/7] rstreason: make it work in trace world Jason Xing
2024-04-17  9:41 ` [PATCH net-next v6 0/7] Implement reset reason mechanism to detect MPTCP CI
2024-04-18  3:30 ` Jason Xing
2024-04-18 15:46   ` Jakub Kicinski
2024-04-18 16:23     ` Jason Xing
2024-04-18 18:51       ` Eric Dumazet
2024-04-18 22:29         ` Jason Xing
2024-04-18 22:40           ` Jason Xing
2024-04-18 23:26         ` Jason Xing
2024-04-19  2:30           ` Jason Xing
2024-04-19  7:02             ` Eric Dumazet
2024-04-19  7:28               ` Jason Xing
2024-04-19  7:44                 ` Eric Dumazet
2024-04-19  8:00                   ` Jason Xing
2024-04-19  8:06                     ` Jason Xing
2024-04-20  2:35                     ` Steven Rostedt
2024-04-20  3:22                       ` Jason Xing
2024-04-18 13:32 [PATCH net-next RESEND " Jason Xing
2024-04-18 13:32 ` [PATCH net-next v6 7/7] rstreason: make it work in trace world Jason Xing

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).