[PATCH mptcp-next v7 0/8] The infinite mapping support

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH mptcp-next v7 0/8] The infinite mapping support
@ 2021-10-18  4:33 Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 1/8] mptcp: don't send RST for single subflow Geliang Tang
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-18  4:33 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

v7:
 - drop the patch "mptcp: add last_ack_dss_start in the msk" in v6
 - set allow_infinite_fallback to false in __mptcp_subflow_connect and
   mptcp_finish_join.

v6:
 - use allow_infinite_fallback instead of last_retrans_seq.
 - rename mptcp_is_data_contiguous to mptcp_allow_infinite_fallback.
 - rename last_fully_acked_dss_start_seq to last_ack_dss_start.

v5:
 - move last_retrans_seq from msk to mptcp_subflow_context

v4:
 - update patch 1 and patch 2

v3:
 - drop MPTCP_INFINITE_DONE flag
 - drop MAPPING_INFINITE
 - add mptcp_is_data_contiguous helper
 - add the fallback check
 - The u32 target testcase has not been completed yet.

v2:
 - add MPTCP_INFINITE_DONE flag
 - add MAPPING_INFINITE mapping status
 - add start_seq in the msk

v1:
 - add noncontiguous flag
 - add the mibs check
 - tag: export/20210904T080009

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/216

Geliang Tang (8):
  mptcp: don't send RST for single subflow
  mptcp: add the fallback check
  mptcp: track and update contiguous data status
  mptcp: infinite mapping sending
  mptcp: infinite mapping receiving
  mptcp: add mib for infinite map sending
  selftests: mptcp: add infinite map mibs check
  DO-NOT-MERGE: mptcp: mp_fail test

 include/net/mptcp.h                           |  3 +-
 net/mptcp/mib.c                               |  1 +
 net/mptcp/mib.h                               |  1 +
 net/mptcp/options.c                           |  2 +-
 net/mptcp/pm.c                                |  6 ++
 net/mptcp/protocol.c                          | 31 ++++++++++
 net/mptcp/protocol.h                          | 13 +++++
 net/mptcp/subflow.c                           | 56 ++++++++++---------
 .../testing/selftests/net/mptcp/mptcp_join.sh | 56 +++++++++++++++++++
 9 files changed, 142 insertions(+), 27 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH mptcp-next v7 1/8] mptcp: don't send RST for single subflow
  2021-10-18  4:33 [PATCH mptcp-next v7 0/8] The infinite mapping support Geliang Tang
@ 2021-10-18  4:33 ` Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 2/8] mptcp: add the fallback check Geliang Tang
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-18  4:33 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

When a bad checksum is detected and a single subflow is in use, don't
send RST + MP_FAIL, send data_ack + MP_FAIL instead.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/subflow.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 6172f380dfb7..92b45a7c997e 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1167,14 +1167,14 @@ static bool subflow_check_data_avail(struct sock *ssk)
 	/* RFC 8684 section 3.7. */
 	if (subflow->send_mp_fail) {
 		if (mptcp_has_another_subflow(ssk)) {
+			ssk->sk_err = EBADMSG;
+			tcp_set_state(ssk, TCP_CLOSE);
+			subflow->reset_transient = 0;
+			subflow->reset_reason = MPTCP_RST_EMIDDLEBOX;
+			tcp_send_active_reset(ssk, GFP_ATOMIC);
 			while ((skb = skb_peek(&ssk->sk_receive_queue)))
 				sk_eat_skb(ssk, skb);
 		}
-		ssk->sk_err = EBADMSG;
-		tcp_set_state(ssk, TCP_CLOSE);
-		subflow->reset_transient = 0;
-		subflow->reset_reason = MPTCP_RST_EMIDDLEBOX;
-		tcp_send_active_reset(ssk, GFP_ATOMIC);
 		WRITE_ONCE(subflow->data_avail, 0);
 		return true;
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-next v7 2/8] mptcp: add the fallback check
  2021-10-18  4:33 [PATCH mptcp-next v7 0/8] The infinite mapping support Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 1/8] mptcp: don't send RST for single subflow Geliang Tang
@ 2021-10-18  4:33 ` Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 3/8] mptcp: track and update contiguous data status Geliang Tang
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-18  4:33 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang, Paolo Abeni

This patch added the fallback check in subflow_check_data_avail(). Only
do the fallback when the msk isn't fallen back yet.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/subflow.c | 45 ++++++++++++++++++++++++---------------------
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 92b45a7c997e..87a9ffebcc42 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1164,35 +1164,38 @@ static bool subflow_check_data_avail(struct sock *ssk)
 	return false;
 
 fallback:
-	/* RFC 8684 section 3.7. */
-	if (subflow->send_mp_fail) {
-		if (mptcp_has_another_subflow(ssk)) {
+	if (!__mptcp_check_fallback(msk)) {
+		/* RFC 8684 section 3.7. */
+		if (subflow->send_mp_fail) {
+			if (mptcp_has_another_subflow(ssk)) {
+				ssk->sk_err = EBADMSG;
+				tcp_set_state(ssk, TCP_CLOSE);
+				subflow->reset_transient = 0;
+				subflow->reset_reason = MPTCP_RST_EMIDDLEBOX;
+				tcp_send_active_reset(ssk, GFP_ATOMIC);
+				while ((skb = skb_peek(&ssk->sk_receive_queue)))
+					sk_eat_skb(ssk, skb);
+			}
+			WRITE_ONCE(subflow->data_avail, 0);
+			return true;
+		}
+
+		if (subflow->mp_join || subflow->fully_established) {
+			/* fatal protocol error, close the socket.
+			 * subflow_error_report() will introduce the appropriate barriers
+			 */
 			ssk->sk_err = EBADMSG;
 			tcp_set_state(ssk, TCP_CLOSE);
 			subflow->reset_transient = 0;
-			subflow->reset_reason = MPTCP_RST_EMIDDLEBOX;
+			subflow->reset_reason = MPTCP_RST_EMPTCP;
 			tcp_send_active_reset(ssk, GFP_ATOMIC);
-			while ((skb = skb_peek(&ssk->sk_receive_queue)))
-				sk_eat_skb(ssk, skb);
+			WRITE_ONCE(subflow->data_avail, 0);
+			return false;
 		}
-		WRITE_ONCE(subflow->data_avail, 0);
-		return true;
-	}
 
-	if (subflow->mp_join || subflow->fully_established) {
-		/* fatal protocol error, close the socket.
-		 * subflow_error_report() will introduce the appropriate barriers
-		 */
-		ssk->sk_err = EBADMSG;
-		tcp_set_state(ssk, TCP_CLOSE);
-		subflow->reset_transient = 0;
-		subflow->reset_reason = MPTCP_RST_EMPTCP;
-		tcp_send_active_reset(ssk, GFP_ATOMIC);
-		WRITE_ONCE(subflow->data_avail, 0);
-		return false;
+		__mptcp_do_fallback(msk);
 	}
 
-	__mptcp_do_fallback(msk);
 	skb = skb_peek(&ssk->sk_receive_queue);
 	subflow->map_valid = 1;
 	subflow->map_seq = READ_ONCE(msk->ack_seq);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-next v7 3/8] mptcp: track and update contiguous data status
  2021-10-18  4:33 [PATCH mptcp-next v7 0/8] The infinite mapping support Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 1/8] mptcp: don't send RST for single subflow Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 2/8] mptcp: add the fallback check Geliang Tang
@ 2021-10-18  4:33 ` Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 4/8] mptcp: infinite mapping sending Geliang Tang
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-18  4:33 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang, Paolo Abeni

This patch added a new member allow_infinite_fallback in mptcp_sock,
which gets initialized to 'true' when the connection begins and is set
to 'false' on any retransmit or successful MP_JOIN. Only do infinite
mapping fallback if there is a single subflow AND there have been no
retransmissions AND there have never been any MP_JOINs.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 3 +++
 net/mptcp/protocol.h | 1 +
 net/mptcp/subflow.c  | 4 +++-
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 7803b0dbb1be..f3163647c501 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2387,6 +2387,7 @@ static void __mptcp_retrans(struct sock *sk)
 		dfrag->already_sent = max(dfrag->already_sent, info.sent);
 		tcp_push(ssk, 0, info.mss_now, tcp_sk(ssk)->nonagle,
 			 info.size_goal);
+		WRITE_ONCE(msk->allow_infinite_fallback, false);
 	}
 
 	release_sock(ssk);
@@ -2465,6 +2466,7 @@ static int __mptcp_init_sock(struct sock *sk)
 	msk->first = NULL;
 	inet_csk(sk)->icsk_sync_mss = mptcp_sync_mss;
 	WRITE_ONCE(msk->csum_enabled, mptcp_is_checksum_enabled(sock_net(sk)));
+	WRITE_ONCE(msk->allow_infinite_fallback, true);
 	msk->recovery = false;
 
 	mptcp_pm_data_init(msk);
@@ -3133,6 +3135,7 @@ bool mptcp_finish_join(struct sock *ssk)
 	if (parent_sock && !ssk->sk_socket)
 		mptcp_sock_graft(ssk, parent_sock);
 	subflow->map_seq = READ_ONCE(msk->ack_seq);
+	WRITE_ONCE(msk->allow_infinite_fallback, false);
 out:
 	mptcp_event(MPTCP_EVENT_SUB_ESTABLISHED, msk, ssk, GFP_ATOMIC);
 	return true;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 67a61ac48b20..4b9fe56bd572 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -249,6 +249,7 @@ struct mptcp_sock {
 	bool		rcv_fastclose;
 	bool		use_64bit_ack; /* Set when we received a 64-bit DSN */
 	bool		csum_enabled;
+	bool		allow_infinite_fallback;
 	spinlock_t	join_list_lock;
 	struct work_struct work;
 	struct sk_buff  *ooo_last_skb;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 87a9ffebcc42..93bc298bd41d 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1167,7 +1167,8 @@ static bool subflow_check_data_avail(struct sock *ssk)
 	if (!__mptcp_check_fallback(msk)) {
 		/* RFC 8684 section 3.7. */
 		if (subflow->send_mp_fail) {
-			if (mptcp_has_another_subflow(ssk)) {
+			if (mptcp_has_another_subflow(ssk) ||
+			    !READ_ONCE(msk->allow_infinite_fallback)) {
 				ssk->sk_err = EBADMSG;
 				tcp_set_state(ssk, TCP_CLOSE);
 				subflow->reset_transient = 0;
@@ -1452,6 +1453,7 @@ int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_addr_info *loc,
 	/* discard the subflow socket */
 	mptcp_sock_graft(ssk, sk->sk_socket);
 	iput(SOCK_INODE(sf));
+	WRITE_ONCE(msk->allow_infinite_fallback, false);
 	return err;
 
 failed_unlink:
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-next v7 4/8] mptcp: infinite mapping sending
  2021-10-18  4:33 [PATCH mptcp-next v7 0/8] The infinite mapping support Geliang Tang
                   ` (2 preceding siblings ...)
  2021-10-18  4:33 ` [PATCH mptcp-next v7 3/8] mptcp: track and update contiguous data status Geliang Tang
@ 2021-10-18  4:33 ` Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 5/8] mptcp: infinite mapping receiving Geliang Tang
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-18  4:33 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang, Paolo Abeni

This patch added the infinite mapping sending logic.

Added a new flag send_infinite_map in struct mptcp_subflow_context. Set
it true when a single contiguous subflow is in use and the
allow_infinite_fallback flag is true in mptcp_pm_mp_fail_received().

In mptcp_sendmsg_frag(), if this flag is true, call the new function
mptcp_update_infinite_map() to set the infinite mapping.

Added a new flag infinite_map in struct mptcp_ext, set it true in
mptcp_update_infinite_map(), and check this flag in a new helper
mptcp_check_infinite_map().

In mptcp_update_infinite_map(), set data_len and csum to 0, and clear
the send_infinite_map flag, then do fallback.

In mptcp_established_options(), use the helper mptcp_check_infinite_map()
to let the infinite mapping DSS can be sent out in the fallback mode.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 include/net/mptcp.h  |  3 ++-
 net/mptcp/options.c  |  2 +-
 net/mptcp/pm.c       |  6 ++++++
 net/mptcp/protocol.c | 18 ++++++++++++++++++
 net/mptcp/protocol.h | 12 ++++++++++++
 5 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index f83fa48408b3..29e930540ea2 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -35,7 +35,8 @@ struct mptcp_ext {
 			frozen:1,
 			reset_transient:1;
 	u8		reset_reason:4,
-			csum_reqd:1;
+			csum_reqd:1,
+			infinite_map:1;
 };
 
 #define MPTCP_RM_IDS_MAX	8
diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 422f4acfb3e6..f4591421ed22 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -816,7 +816,7 @@ bool mptcp_established_options(struct sock *sk, struct sk_buff *skb,
 
 	opts->suboptions = 0;
 
-	if (unlikely(__mptcp_check_fallback(msk)))
+	if (unlikely(__mptcp_check_fallback(msk) && !mptcp_check_infinite_map(skb)))
 		return false;
 
 	if (unlikely(skb && TCP_SKB_CB(skb)->tcp_flags & TCPHDR_RST)) {
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 6ab386ff3294..86b38a830b4c 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -251,7 +251,13 @@ void mptcp_pm_mp_prio_received(struct sock *sk, u8 bkup)
 
 void mptcp_pm_mp_fail_received(struct sock *sk, u64 fail_seq)
 {
+	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
+	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
+
 	pr_debug("fail_seq=%llu", fail_seq);
+
+	if (!mptcp_has_another_subflow(sk) && READ_ONCE(msk->allow_infinite_fallback))
+		subflow->send_infinite_map = 1;
 }
 
 /* path manager helpers */
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index f3163647c501..70f2e0e212fa 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1226,6 +1226,22 @@ static void mptcp_update_data_checksum(struct sk_buff *skb, int added)
 	mpext->csum = csum_fold(csum_block_add(csum, skb_checksum(skb, offset, added, 0), offset));
 }
 
+static void mptcp_update_infinite_map(struct mptcp_sock *msk,
+				      struct sock *ssk,
+				      struct mptcp_ext *mpext)
+{
+	if (!mpext)
+		return;
+
+	mpext->infinite_map = 1;
+	mpext->data_len = 0;
+	mpext->csum = 0;
+
+	mptcp_subflow_ctx(ssk)->send_infinite_map = 0;
+	pr_fallback(msk);
+	__mptcp_do_fallback(msk);
+}
+
 static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 			      struct mptcp_data_frag *dfrag,
 			      struct mptcp_sendmsg_info *info)
@@ -1358,6 +1374,8 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 out:
 	if (READ_ONCE(msk->csum_enabled))
 		mptcp_update_data_checksum(skb, copy);
+	if (mptcp_subflow_ctx(ssk)->send_infinite_map)
+		mptcp_update_infinite_map(msk, ssk, mpext);
 	mptcp_subflow_ctx(ssk)->rel_write_seq += copy;
 	return copy;
 }
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 4b9fe56bd572..906c65fca04f 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -419,6 +419,7 @@ struct mptcp_subflow_context {
 		backup : 1,
 		send_mp_prio : 1,
 		send_mp_fail : 1,
+		send_infinite_map : 1,
 		rx_eof : 1,
 		can_ack : 1,        /* only after processing the remote a key */
 		disposable : 1,	    /* ctx can be free at ulp release time */
@@ -852,6 +853,17 @@ static inline void mptcp_do_fallback(struct sock *sk)
 
 #define pr_fallback(a) pr_debug("%s:fallback to TCP (msk=%p)", __func__, a)
 
+static inline bool mptcp_check_infinite_map(struct sk_buff *skb)
+{
+	struct mptcp_ext *mpext;
+
+	mpext = skb ? mptcp_get_ext(skb) : NULL;
+	if (mpext && mpext->infinite_map)
+		return true;
+
+	return false;
+}
+
 static inline bool subflow_simultaneous_connect(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-next v7 5/8] mptcp: infinite mapping receiving
  2021-10-18  4:33 [PATCH mptcp-next v7 0/8] The infinite mapping support Geliang Tang
                   ` (3 preceding siblings ...)
  2021-10-18  4:33 ` [PATCH mptcp-next v7 4/8] mptcp: infinite mapping sending Geliang Tang
@ 2021-10-18  4:33 ` Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 6/8] mptcp: add mib for infinite map sending Geliang Tang
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-18  4:33 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang, Paolo Abeni

This patch added the infinite mapping receiving logic. When the infinite
mapping is received, set the map_data_len of the subflow to 0.

In subflow_check_data_avail(), only reset the subflow when the map_data_len
of the subflow is non-zero.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/subflow.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 93bc298bd41d..9e54122f18f4 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -968,6 +968,7 @@ static enum mapping_status get_mapping_status(struct sock *ssk,
 	data_len = mpext->data_len;
 	if (data_len == 0) {
 		MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_INFINITEMAPRX);
+		subflow->map_data_len = 0;
 		return MAPPING_INVALID;
 	}
 
@@ -1181,7 +1182,7 @@ static bool subflow_check_data_avail(struct sock *ssk)
 			return true;
 		}
 
-		if (subflow->mp_join || subflow->fully_established) {
+		if ((subflow->mp_join || subflow->fully_established) && subflow->map_data_len) {
 			/* fatal protocol error, close the socket.
 			 * subflow_error_report() will introduce the appropriate barriers
 			 */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-next v7 6/8] mptcp: add mib for infinite map sending
  2021-10-18  4:33 [PATCH mptcp-next v7 0/8] The infinite mapping support Geliang Tang
                   ` (4 preceding siblings ...)
  2021-10-18  4:33 ` [PATCH mptcp-next v7 5/8] mptcp: infinite mapping receiving Geliang Tang
@ 2021-10-18  4:33 ` Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 7/8] selftests: mptcp: add infinite map mibs check Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test Geliang Tang
  7 siblings, 0 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-18  4:33 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

This patch added a new mib named MPTCP_MIB_INFINITEMAPTX, increase it
when a infinite mapping has been sent out.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/mib.c      | 1 +
 net/mptcp/mib.h      | 1 +
 net/mptcp/protocol.c | 1 +
 3 files changed, 3 insertions(+)

diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c
index 3240b72271a7..c12251cb0d44 100644
--- a/net/mptcp/mib.c
+++ b/net/mptcp/mib.c
@@ -24,6 +24,7 @@ static const struct snmp_mib mptcp_snmp_list[] = {
 	SNMP_MIB_ITEM("MPJoinAckRx", MPTCP_MIB_JOINACKRX),
 	SNMP_MIB_ITEM("MPJoinAckHMacFailure", MPTCP_MIB_JOINACKMAC),
 	SNMP_MIB_ITEM("DSSNotMatching", MPTCP_MIB_DSSNOMATCH),
+	SNMP_MIB_ITEM("InfiniteMapTx", MPTCP_MIB_INFINITEMAPTX),
 	SNMP_MIB_ITEM("InfiniteMapRx", MPTCP_MIB_INFINITEMAPRX),
 	SNMP_MIB_ITEM("DSSNoMatchTCP", MPTCP_MIB_DSSTCPMISMATCH),
 	SNMP_MIB_ITEM("DataCsumErr", MPTCP_MIB_DATACSUMERR),
diff --git a/net/mptcp/mib.h b/net/mptcp/mib.h
index ecd3d8b117e0..7901f1338d15 100644
--- a/net/mptcp/mib.h
+++ b/net/mptcp/mib.h
@@ -17,6 +17,7 @@ enum linux_mptcp_mib_field {
 	MPTCP_MIB_JOINACKRX,		/* Received an ACK + MP_JOIN */
 	MPTCP_MIB_JOINACKMAC,		/* HMAC was wrong on ACK + MP_JOIN */
 	MPTCP_MIB_DSSNOMATCH,		/* Received a new mapping that did not match the previous one */
+	MPTCP_MIB_INFINITEMAPTX,	/* Sent an infinite mapping */
 	MPTCP_MIB_INFINITEMAPRX,	/* Received an infinite mapping */
 	MPTCP_MIB_DSSTCPMISMATCH,	/* DSS-mapping did not map with TCP's sequence numbers */
 	MPTCP_MIB_DATACSUMERR,		/* The data checksum fail */
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 70f2e0e212fa..2283efda1bc9 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1237,6 +1237,7 @@ static void mptcp_update_infinite_map(struct mptcp_sock *msk,
 	mpext->data_len = 0;
 	mpext->csum = 0;
 
+	MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_INFINITEMAPTX);
 	mptcp_subflow_ctx(ssk)->send_infinite_map = 0;
 	pr_fallback(msk);
 	__mptcp_do_fallback(msk);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-next v7 7/8] selftests: mptcp: add infinite map mibs check
  2021-10-18  4:33 [PATCH mptcp-next v7 0/8] The infinite mapping support Geliang Tang
                   ` (5 preceding siblings ...)
  2021-10-18  4:33 ` [PATCH mptcp-next v7 6/8] mptcp: add mib for infinite map sending Geliang Tang
@ 2021-10-18  4:33 ` Geliang Tang
  2021-10-18  4:33 ` [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test Geliang Tang
  7 siblings, 0 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-18  4:33 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

This patch added a function chk_infi_nr() to check the mibs for the
infinite mapping.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 .../testing/selftests/net/mptcp/mptcp_join.sh | 38 +++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 293d349e21fe..b3b351b868f9 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -615,6 +615,43 @@ chk_fail_nr()
 	fi
 }
 
+chk_infi_nr()
+{
+	local mp_infi_nr_tx=$1
+	local mp_infi_nr_rx=$2
+	local count
+	local dump_stats
+
+	printf "%-39s %s" " " "itx"
+	count=`ip netns exec $ns1 nstat -as | grep InfiniteMapTx | awk '{print $2}'`
+	[ -z "$count" ] && count=0
+	if [ "$count" != "$mp_infi_nr_tx" ]; then
+		echo "[fail] got $count infinite map[s] TX expected $mp_infi_nr_tx"
+		ret=1
+		dump_stats=1
+	else
+		echo -n "[ ok ]"
+	fi
+
+	echo -n " - irx   "
+	count=`ip netns exec $ns2 nstat -as | grep InfiniteMapRx | awk '{print $2}'`
+	[ -z "$count" ] && count=0
+	if [ "$count" != "$mp_infi_nr_rx" ]; then
+		echo "[fail] got $count infinite map[s] RX expected $mp_infi_nr_rx"
+		ret=1
+		dump_stats=1
+	else
+		echo "[ ok ]"
+	fi
+
+	if [ "${dump_stats}" = 1 ]; then
+		echo Server ns stats
+		ip netns exec $ns1 nstat -as | grep MPTcp
+		echo Client ns stats
+		ip netns exec $ns2 nstat -as | grep MPTcp
+	fi
+}
+
 chk_join_nr()
 {
 	local msg="$1"
@@ -665,6 +702,7 @@ chk_join_nr()
 	if [ $checksum -eq 1 ]; then
 		chk_csum_nr
 		chk_fail_nr 0 0
+		chk_infi_nr 0 0
 	fi
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-18  4:33 [PATCH mptcp-next v7 0/8] The infinite mapping support Geliang Tang
                   ` (6 preceding siblings ...)
  2021-10-18  4:33 ` [PATCH mptcp-next v7 7/8] selftests: mptcp: add infinite map mibs check Geliang Tang
@ 2021-10-18  4:33 ` Geliang Tang
  2021-10-19  0:50   ` Mat Martineau
  7 siblings, 1 reply; 23+ messages in thread
From: Geliang Tang @ 2021-10-18  4:33 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

./mptcp_join.sh -Cf

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c                           |  9 +++++++++
 .../testing/selftests/net/mptcp/mptcp_join.sh  | 18 ++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2283efda1bc9..fc3a4fb81fbb 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1243,6 +1243,8 @@ static void mptcp_update_infinite_map(struct mptcp_sock *msk,
 	__mptcp_do_fallback(msk);
 }
 
+static int j;
+
 static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 			      struct mptcp_data_frag *dfrag,
 			      struct mptcp_sendmsg_info *info)
@@ -1377,6 +1379,13 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 		mptcp_update_data_checksum(skb, copy);
 	if (mptcp_subflow_ctx(ssk)->send_infinite_map)
 		mptcp_update_infinite_map(msk, ssk, mpext);
+
+	pr_debug("%s j=%d", __func__, j++);
+	if (j == 20)
+		skb->data_len = 1;
+	if (j > 40)
+		j = 0;
+
 	mptcp_subflow_ctx(ssk)->rel_write_seq += copy;
 	return copy;
 }
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index b3b351b868f9..3339633e87d4 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -977,6 +977,24 @@ chk_link_usage()
 
 subflows_tests()
 {
+	# 1 subflow
+	reset
+	ip netns exec $ns1 ./pm_nl_ctl limits 0 2
+	ip netns exec $ns2 ./pm_nl_ctl limits 0 2
+	run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow
+	chk_join_nr "1 subflow" 0 0 0
+
+	exit
+
+	# multiple subflows
+	reset
+	ip netns exec $ns1 ./pm_nl_ctl limits 0 2
+	ip netns exec $ns2 ./pm_nl_ctl limits 0 2
+	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
+	ip netns exec $ns2 ./pm_nl_ctl add 10.0.2.2 flags subflow
+	run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow
+	chk_join_nr "multiple subflows" 2 2 2
+
 	reset
 	run_tests $ns1 $ns2 10.0.1.1
 	chk_join_nr "no JOIN" "0" "0" "0"
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-18  4:33 ` [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test Geliang Tang
@ 2021-10-19  0:50   ` Mat Martineau
  2021-10-19  3:03     ` Geliang Tang
  0 siblings, 1 reply; 23+ messages in thread
From: Mat Martineau @ 2021-10-19  0:50 UTC (permalink / raw)
  To: Geliang Tang; +Cc: mptcp

On Mon, 18 Oct 2021, Geliang Tang wrote:

> ./mptcp_join.sh -Cf

When I tried running this as './mptcp_join.sh -Cfc', I'm not seeing 
fallback in the packet trace (no data level len == 0, and the packets all 
have DSS headers until disconnect). Is the test working on your system?

Also, I know that you had asked about other ways to intentionally corrupt 
the checksum that would be ok to merge. Do you still need more information 
on that?

Mat

>
> Signed-off-by: Geliang Tang <geliang.tang@suse.com>
> ---
> net/mptcp/protocol.c                           |  9 +++++++++
> .../testing/selftests/net/mptcp/mptcp_join.sh  | 18 ++++++++++++++++++
> 2 files changed, 27 insertions(+)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 2283efda1bc9..fc3a4fb81fbb 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1243,6 +1243,8 @@ static void mptcp_update_infinite_map(struct mptcp_sock *msk,
> 	__mptcp_do_fallback(msk);
> }
>
> +static int j;
> +
> static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> 			      struct mptcp_data_frag *dfrag,
> 			      struct mptcp_sendmsg_info *info)
> @@ -1377,6 +1379,13 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> 		mptcp_update_data_checksum(skb, copy);
> 	if (mptcp_subflow_ctx(ssk)->send_infinite_map)
> 		mptcp_update_infinite_map(msk, ssk, mpext);
> +
> +	pr_debug("%s j=%d", __func__, j++);
> +	if (j == 20)
> +		skb->data_len = 1;
> +	if (j > 40)
> +		j = 0;
> +
> 	mptcp_subflow_ctx(ssk)->rel_write_seq += copy;
> 	return copy;
> }
> diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> index b3b351b868f9..3339633e87d4 100755
> --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> @@ -977,6 +977,24 @@ chk_link_usage()
>
> subflows_tests()
> {
> +	# 1 subflow
> +	reset
> +	ip netns exec $ns1 ./pm_nl_ctl limits 0 2
> +	ip netns exec $ns2 ./pm_nl_ctl limits 0 2
> +	run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow
> +	chk_join_nr "1 subflow" 0 0 0
> +
> +	exit
> +
> +	# multiple subflows
> +	reset
> +	ip netns exec $ns1 ./pm_nl_ctl limits 0 2
> +	ip netns exec $ns2 ./pm_nl_ctl limits 0 2
> +	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
> +	ip netns exec $ns2 ./pm_nl_ctl add 10.0.2.2 flags subflow
> +	run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow
> +	chk_join_nr "multiple subflows" 2 2 2
> +
> 	reset
> 	run_tests $ns1 $ns2 10.0.1.1
> 	chk_join_nr "no JOIN" "0" "0" "0"
> -- 
> 2.31.1
>
>
>

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-19  0:50   ` Mat Martineau
@ 2021-10-19  3:03     ` Geliang Tang
  2021-10-19  8:29       ` Davide Caratti
  2021-10-19 23:44       ` Mat Martineau
  0 siblings, 2 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-19  3:03 UTC (permalink / raw)
  To: Mat Martineau, Davide Caratti; +Cc: Geliang Tang, MPTCP Upstream

[-- Attachment #1: Type: text/plain, Size: 7820 bytes --]

Mat Martineau <mathew.j.martineau@linux.intel.com> 于2021年10月18日周一 下午8:50写道：
>
> On Mon, 18 Oct 2021, Geliang Tang wrote:
>
> > ./mptcp_join.sh -Cf
>
> When I tried running this as './mptcp_join.sh -Cfc', I'm not seeing
> fallback in the packet trace (no data level len == 0, and the packets all
> have DSS headers until disconnect). Is the test working on your system?

Hi Mat,

This test can't trigger the checksum failure every time, so you need
to run it again.

If the checksum failure didn't happen, the output is something like this:

$ sudo ./mptcp_join.sh -Cfc
Created /tmp/tmp.9k1HNPyrMW (size 1 KB) containing data sent by client
Created /tmp/tmp.0ehIGSgCdT (size 1 KB) containing data sent by server
Capturing traffic for test 1 into mp_join-01-ns1-0-8ZkVHI.pcap
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked v1),
capture size 65535 bytes
155 packets captured
155 packets received by filter
0 packets dropped by kernel
01 1 subflow                    syn[ ok ] - synack[ ok ] - ack[ ok ]
                                        sum[ ok ] - csum  [ ok ]
                                        ftx[ ok ] - frx   [ ok ]
                                        itx[ ok ] - irx   [ ok ]

And if the checksum failure happened, the output is something like this:

$ sudo ./mptcp_join.sh -Cfc
Created /tmp/tmp.la0tjEnFqI (size 1 KB) containing data sent by client
Created /tmp/tmp.LKtqF2tbL3 (size 1 KB) containing data sent by server
Capturing traffic for test 1 into mp_join-01-ns1-0-U5nzEu.pcap
[ FAIL ] file received by client does not match (in, out):
-rw------- 1 root root 1052 Oct 19 01:56 /tmp/tmp.LKtqF2tbL3
Trailing bytes are:
MPTCP_TEST_FILE_END_MARKER
-rw------- 1 root root 1052 Oct 19 01:56 /tmp/tmp.ZiiaNUe3yw
Trailing bytes are:
MPTCP_TEST_FILE_END_MARKER
[ FAIL ] file received by server does not match (in, out):
-rw------- 1 root root 1052 Oct 19 01:56 /tmp/tmp.la0tjEnFqI
Trailing bytes are:
MPTCP_TEST_FILE_END_MARKER
-rw------- 1 root root 1052 Oct 19 01:56 /tmp/tmp.fN9uFcVnnb
Trailing bytes are:
MPTCP_TEST_FILE_END_MARKER
dropped privs to tgl
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked v1),
capture size 65535 bytes
152 packets captured
152 packets received by filter
0 packets dropped by kernel
01 1 subflow                            syn[ ok ] - synack[ ok ] - ack[ ok ]
                                        sum[fail] got 1 data checksum
error[s] expected 0
 - csum  [fail] got 1 data checksum error[s] expected 0
Server ns stats
MPTcpExtMPCapableSYNRX          1                  0.0
MPTcpExtMPCapableACKRX          1                  0.0
MPTcpExtInfiniteMapTx           1                  0.0
MPTcpExtDataCsumErr             1                  0.0
MPTcpExtMPFailTx                1                  0.0
MPTcpExtMPFailRx                1                  0.0
Client ns stats
MPTcpExtMPCapableSYNTX          1                  0.0
MPTcpExtMPCapableSYNACKRX       1                  0.0
MPTcpExtInfiniteMapTx           1                  0.0
MPTcpExtDataCsumErr             1                  0.0
MPTcpExtMPFailTx                1                  0.0
MPTcpExtMPFailRx                1                  0.0
                                        ftx[fail] got 1 MP_FAIL[s] TX expected 0
 - frx   [fail] got 1 MP_FAIL[s] RX expected 0
Server ns stats
MPTcpExtMPCapableSYNRX          1                  0.0
MPTcpExtMPCapableACKRX          1                  0.0
MPTcpExtInfiniteMapTx           1                  0.0
MPTcpExtDataCsumErr             1                  0.0
MPTcpExtMPFailTx                1                  0.0
MPTcpExtMPFailRx                1                  0.0
Client ns stats
MPTcpExtMPCapableSYNTX          1                  0.0
MPTcpExtMPCapableSYNACKRX       1                  0.0
MPTcpExtInfiniteMapTx           1                  0.0
MPTcpExtDataCsumErr             1                  0.0
MPTcpExtMPFailTx                1                  0.0
MPTcpExtMPFailRx                1                  0.0
                                        itx[fail] got 1 infinite
map[s] TX expected 0
 - irx   [ ok ]
Server ns stats
MPTcpExtMPCapableSYNRX          1                  0.0
MPTcpExtMPCapableACKRX          1                  0.0
MPTcpExtInfiniteMapTx           1                  0.0
MPTcpExtDataCsumErr             1                  0.0
MPTcpExtMPFailTx                1                  0.0
MPTcpExtMPFailRx                1                  0.0
Client ns stats
MPTcpExtMPCapableSYNTX          1                  0.0
MPTcpExtMPCapableSYNACKRX       1                  0.0
MPTcpExtInfiniteMapTx           1                  0.0
MPTcpExtDataCsumErr             1                  0.0
MPTcpExtMPFailTx                1                  0.0
MPTcpExtMPFailRx                1                  0.0

My pcap file (mp_join-01-ns1-0-U5nzEu.pcap) is attached.

>
> Also, I know that you had asked about other ways to intentionally corrupt
> the checksum that would be ok to merge. Do you still need more information
> on that?

Hi Davide,

Could you please share your test script to me? I'll try to use it to
write a new selttest.

Thanks,
-Geliang

>
> Mat
>
> >
> > Signed-off-by: Geliang Tang <geliang.tang@suse.com>
> > ---
> > net/mptcp/protocol.c                           |  9 +++++++++
> > .../testing/selftests/net/mptcp/mptcp_join.sh  | 18 ++++++++++++++++++
> > 2 files changed, 27 insertions(+)
> >
> > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > index 2283efda1bc9..fc3a4fb81fbb 100644
> > --- a/net/mptcp/protocol.c
> > +++ b/net/mptcp/protocol.c
> > @@ -1243,6 +1243,8 @@ static void mptcp_update_infinite_map(struct mptcp_sock *msk,
> >       __mptcp_do_fallback(msk);
> > }
> >
> > +static int j;
> > +
> > static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> >                             struct mptcp_data_frag *dfrag,
> >                             struct mptcp_sendmsg_info *info)
> > @@ -1377,6 +1379,13 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> >               mptcp_update_data_checksum(skb, copy);
> >       if (mptcp_subflow_ctx(ssk)->send_infinite_map)
> >               mptcp_update_infinite_map(msk, ssk, mpext);
> > +
> > +     pr_debug("%s j=%d", __func__, j++);
> > +     if (j == 20)
> > +             skb->data_len = 1;
> > +     if (j > 40)
> > +             j = 0;
> > +
> >       mptcp_subflow_ctx(ssk)->rel_write_seq += copy;
> >       return copy;
> > }
> > diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > index b3b351b868f9..3339633e87d4 100755
> > --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > @@ -977,6 +977,24 @@ chk_link_usage()
> >
> > subflows_tests()
> > {
> > +     # 1 subflow
> > +     reset
> > +     ip netns exec $ns1 ./pm_nl_ctl limits 0 2
> > +     ip netns exec $ns2 ./pm_nl_ctl limits 0 2
> > +     run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow
> > +     chk_join_nr "1 subflow" 0 0 0
> > +
> > +     exit
> > +
> > +     # multiple subflows
> > +     reset
> > +     ip netns exec $ns1 ./pm_nl_ctl limits 0 2
> > +     ip netns exec $ns2 ./pm_nl_ctl limits 0 2
> > +     ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
> > +     ip netns exec $ns2 ./pm_nl_ctl add 10.0.2.2 flags subflow
> > +     run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow
> > +     chk_join_nr "multiple subflows" 2 2 2
> > +
> >       reset
> >       run_tests $ns1 $ns2 10.0.1.1
> >       chk_join_nr "no JOIN" "0" "0" "0"
> > --
> > 2.31.1
> >
> >
> >
>
> --
> Mat Martineau
> Intel
>

[-- Attachment #2: mp_join-01-ns1-0-U5nzEu.pcap --]
[-- Type: application/vnd.tcpdump.pcap, Size: 17420 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-19  3:03     ` Geliang Tang
@ 2021-10-19  8:29       ` Davide Caratti
  2021-10-20 10:39         ` Matthieu Baerts
  2021-10-19 23:44       ` Mat Martineau
  1 sibling, 1 reply; 23+ messages in thread
From: Davide Caratti @ 2021-10-19  8:29 UTC (permalink / raw)
  To: Geliang Tang; +Cc: Mat Martineau, Geliang Tang, MPTCP Upstream

On Mon, Oct 18, 2021 at 11:03:13PM -0400, Geliang Tang wrote:
> Mat Martineau <mathew.j.martineau@linux.intel.com> 于2021年10月18日周一 下午8:50写道：
> >
> > On Mon, 18 Oct 2021, Geliang Tang wrote:
> >

> Hi Davide,
> 
> Could you please share your test script to me? I'll try to use it to
> write a new selttest.

I had a script that used TC flower and gact + pedit + csum actions to flip the 'PSH'
bit in the TCP header, I promised myself to change it to corrupt the DSS option.
The problem with this approach is, we can't assume that the DSS option has a fixed
offset from the IP header. So, the test might start corrupting other options that
are not MPTCP. Maybe using eBPF, using something like what Matteo did in the
past [1], we have more luck ? 

[1] https://developers.redhat.com/blog/2018/12/03/network-debugging-with-ebpf#how_ebpf_can_help

-- 
davide

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-19  3:03     ` Geliang Tang
  2021-10-19  8:29       ` Davide Caratti
@ 2021-10-19 23:44       ` Mat Martineau
  2021-10-21  6:10         ` Geliang Tang
  1 sibling, 1 reply; 23+ messages in thread
From: Mat Martineau @ 2021-10-19 23:44 UTC (permalink / raw)
  To: Geliang Tang; +Cc: Davide Caratti, Geliang Tang, MPTCP Upstream

[-- Attachment #1: Type: text/plain, Size: 1780 bytes --]

On Mon, 18 Oct 2021, Geliang Tang wrote:

> Mat Martineau <mathew.j.martineau@linux.intel.com> 于2021年10月18日周一 下午8:50写道：
>>
>> On Mon, 18 Oct 2021, Geliang Tang wrote:
>>
>>> ./mptcp_join.sh -Cf
>>
>> When I tried running this as './mptcp_join.sh -Cfc', I'm not seeing
>> fallback in the packet trace (no data level len == 0, and the packets all
>> have DSS headers until disconnect). Is the test working on your system?
>
> Hi Mat,
>
> This test can't trigger the checksum failure every time, so you need
> to run it again.
>

I ran it four or five times yesterday, with no failures. When I tried 
again today, it triggered the checksum failure the first time :)

>
> My pcap file (mp_join-01-ns1-0-U5nzEu.pcap) is attached.
>

Thanks for sharing the pcap. When I capture failures on my system, I'm 
seeing slightly different results. The MP_FAIL is only sent in one 
direction, and there is only one infinite mapping sent (in reply to the 
MP_FAIL), but the connection still acts like fallback has happened. I 
uploaded that pcap to the infinite mapping github issue:

https://github.com/multipath-tcp/mptcp_net-next/issues/216#issuecomment-947172574

I noticed in your pcap that the infinite mapping is only sent in one 
direction, but MP_FAIL is sent by both sides.

Section 3.7 of the RFC (after Figure 16) says the MP_FAIL must be sent 
both directions, and I think that also implies the infinite mapping should 
be sent in both directions. So, should make sure that the last mapping 
sent by both peers is an infinite mapping. And if other people have 
differing interpretations of the RFC (especially the "infinite mapping in 
both directions" part that is not completely spelled out), please discuss!

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-19  8:29       ` Davide Caratti
@ 2021-10-20 10:39         ` Matthieu Baerts
  2021-10-20 11:33           ` Davide Caratti
  0 siblings, 1 reply; 23+ messages in thread
From: Matthieu Baerts @ 2021-10-20 10:39 UTC (permalink / raw)
  To: Davide Caratti, Geliang Tang; +Cc: Mat Martineau, Geliang Tang, MPTCP Upstream

Hi Davide, Geliang,

On 19/10/2021 10:29, Davide Caratti wrote:
> On Mon, Oct 18, 2021 at 11:03:13PM -0400, Geliang Tang wrote:
>> Mat Martineau <mathew.j.martineau@linux.intel.com> 于2021年10月18日周一 下午8:50写道：
>>>
>>> On Mon, 18 Oct 2021, Geliang Tang wrote:
>>>
>  
>> Hi Davide,
>>
>> Could you please share your test script to me? I'll try to use it to
>> write a new selttest.
> 
> I had a script that used TC flower and gact + pedit + csum actions to flip the 'PSH'
> bit in the TCP header, I promised myself to change it to corrupt the DSS option.
> The problem with this approach is, we can't assume that the DSS option has a fixed
> offset from the IP header. So, the test might start corrupting other options that
> are not MPTCP.

If we corrupt any data in TCP Options or data and update the TCP
Checksum, that's enough to corrupt the DSS. For example, we could flip
the last bit of the TCP payload (if tcp.len > 0). Would it be possible
to do this by adapting your script?

For mptcp.org, I told someone to use netfilterqueue + scapy. He had some
issues but also shared a working version:

https://github.com/multipath-tcp/mptcp/issues/420#issuecomment-825771587

I guess we cannot depend on Python and Scapy for selftests. I don't know
if we can also easily depend on eBPF either :)
And using simple solutions like "tc netem corrupt" will be detectable by
TCP csum.

So maybe your way is better if we can corrupt anything, e.g. using a
fixed offset like 120 or the last bit (if it is not an ACK). WDYT?

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-20 10:39         ` Matthieu Baerts
@ 2021-10-20 11:33           ` Davide Caratti
  2021-10-20 14:49             ` Matthieu Baerts
  0 siblings, 1 reply; 23+ messages in thread
From: Davide Caratti @ 2021-10-20 11:33 UTC (permalink / raw)
  To: Matthieu Baerts; +Cc: Geliang Tang, Mat Martineau, Geliang Tang, MPTCP Upstream

hi,

On Wed, Oct 20, 2021 at 12:40 PM Matthieu Baerts
<matthieu.baerts@tessares.net> wrote:
>
[...]

> >
> > I had a script that used TC flower and gact + pedit + csum actions to flip the 'PSH'
> > bit in the TCP header, I promised myself to change it to corrupt the DSS option.

FTR, that script was:

# convert PSH,ACK into ACK
# tc filter add dev vetha egress protocol ip prio 1000 flower ip_proto
tcp tcp_flags 0x18/0xff action pedit ex munge tcp flags set 0x10 pipe
csum tcp reclassify
# convert ACK into PSH,ACK, 1 each 10 packets
# tc filter add dev vetha egress protocol ip prio 1001 flower ip_proto
tcp tcp_flags 0x10/0xff action pass random determ pipe 10 pedit ex
munge tcp flags add 0x8 pipe csum tcp

> > The problem with this approach is, we can't assume that the DSS option has a fixed
> > offset from the IP header. So, the test might start corrupting other options that
> > are not MPTCP.
>
> If we corrupt any data in TCP Options or data and update the TCP
> Checksum, that's enough to corrupt the DSS. For example, we could flip
> the last bit of the TCP payload (if tcp.len > 0). Would it be possible
> to do this by adapting your script?

it's easier if it's the first bit, then...

> For mptcp.org, I told someone to use netfilterqueue + scapy. He had some
> issues but also shared a working version:
>
> https://github.com/multipath-tcp/mptcp/issues/420#issuecomment-825771587
>
> I guess we cannot depend on Python and Scapy for selftests. I don't know
> if we can also easily depend on eBPF either :)

'bpf' kselftests (and also tc-testing)  already depend on eBPF, so for
linux this is a no-problem.

> And using simple solutions like "tc netem corrupt" will be detectable by
> TCP csum.

this can be easily workarounded using the 'csum' TC action after the
netem dequeue.

> So maybe your way is better if we can corrupt anything, e.g. using a
> fixed offset like 120 or the last bit (if it is not an ACK). WDYT?

I redirected this question to some friends working on eBPF, and they
told me it's doable and cool at the same time :)
There is an eBPF program that parses the "timestamp" option [1], we
can change it easily to corrupt the DSS and recompute the TCP checksum
like in Matteo's RHD article.

WDYT?
--
davide

[1] https://github.com/xdp-project/bpf-examples/blob/master/pping/pping_kern.c#L83


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-20 11:33           ` Davide Caratti
@ 2021-10-20 14:49             ` Matthieu Baerts
  2021-10-21  9:17               ` Geliang Tang
  2021-10-21  9:19               ` Davide Caratti
  0 siblings, 2 replies; 23+ messages in thread
From: Matthieu Baerts @ 2021-10-20 14:49 UTC (permalink / raw)
  To: Davide Caratti, Geliang Tang; +Cc: Geliang Tang, Mat Martineau, MPTCP Upstream

Hi Davide,

On 20/10/2021 13:33, Davide Caratti wrote:
> hi,
> 
> On Wed, Oct 20, 2021 at 12:40 PM Matthieu Baerts
> <matthieu.baerts@tessares.net> wrote:
>>
> [...]
> 
>>>
>>> I had a script that used TC flower and gact + pedit + csum actions to flip the 'PSH'
>>> bit in the TCP header, I promised myself to change it to corrupt the DSS option.
> 
> FTR, that script was:
> 
> # convert PSH,ACK into ACK
> # tc filter add dev vetha egress protocol ip prio 1000 flower ip_proto
> tcp tcp_flags 0x18/0xff action pedit ex munge tcp flags set 0x10 pipe
> csum tcp reclassify
> # convert ACK into PSH,ACK, 1 each 10 packets
> # tc filter add dev vetha egress protocol ip prio 1001 flower ip_proto
> tcp tcp_flags 0x10/0xff action pass random determ pipe 10 pedit ex
> munge tcp flags add 0x8 pipe csum tcp

Thank you for sharing that.

I managed to change the content of a TCP packet using:

  $ tc -n <ns> qdisc add dev <iface> root handle 1: htb

  $ tc -n <ns> filter add dev <iface> parent 1: protocol ip prio 1000
flower ip_proto tcp action pedit munge offset 148 u32 invert pipe csum tcp

It looks like nothing happen for packets smaller than 148 bytes.

Note that I had to add these to the config file:

CONFIG_NET_SCH_HTB=m

CONFIG_NET_ACT_CSUM=m

CONFIG_NET_ACT_GACT=m

CONFIG_GACT_PROB=y

CONFIG_NET_ACT_PEDIT=m

CONFIG_NET_CLS_FLOWER=m


@Geliang: do you think you could use this command?

>>> The problem with this approach is, we can't assume that the DSS option has a fixed
>>> offset from the IP header. So, the test might start corrupting other options that
>>> are not MPTCP.
>>
>> If we corrupt any data in TCP Options or data and update the TCP
>> Checksum, that's enough to corrupt the DSS. For example, we could flip
>> the last bit of the TCP payload (if tcp.len > 0). Would it be possible
>> to do this by adapting your script?
> 
> it's easier if it's the first bit, then...
> 
>> For mptcp.org, I told someone to use netfilterqueue + scapy. He had some
>> issues but also shared a working version:
>>
>> https://github.com/multipath-tcp/mptcp/issues/420#issuecomment-825771587
>>
>> I guess we cannot depend on Python and Scapy for selftests. I don't know
>> if we can also easily depend on eBPF either :)
> 
> 'bpf' kselftests (and also tc-testing)  already depend on eBPF, so for
> linux this is a no-problem.

Yes indeed but that increases the complexity to launch selftests. I mean
you need to add a bit of dependences (and "recent" ones) for 'bpf'
selftests.
It could be optional but it still needs to be supported by our CI.

>> And using simple solutions like "tc netem corrupt" will be detectable by
>> TCP csum.
> 
> this can be easily workarounded using the 'csum' TC action after the
> netem dequeue.

Good point! So we would need to create a tree with first "netem" then
"csum" action?

The only thing is that netem will introduce errors at random positions,
maybe in IP/TCP headers but it is maybe not an issue. If the TC filter
command is enough, it might help to produce more controlled tests.

> 
>> So maybe your way is better if we can corrupt anything, e.g. using a
>> fixed offset like 120 or the last bit (if it is not an ACK). WDYT?
> 
> I redirected this question to some friends working on eBPF, and they
> told me it's doable and cool at the same time :)
> There is an eBPF program that parses the "timestamp" option [1], we
> can change it easily to corrupt the DSS and recompute the TCP checksum
> like in Matteo's RHD article.
> 
> WDYT?

Indeed it looks like it is doable but if we could not add new
dependences, I think that would be easier for us :)

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-19 23:44       ` Mat Martineau
@ 2021-10-21  6:10         ` Geliang Tang
  0 siblings, 0 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-21  6:10 UTC (permalink / raw)
  To: Mat Martineau; +Cc: Davide Caratti, Geliang Tang, MPTCP Upstream

Hi Mat,

Mat Martineau <mathew.j.martineau@linux.intel.com> 于2021年10月20日周三 上午7:44写道：
>
> On Mon, 18 Oct 2021, Geliang Tang wrote:
>
> > Mat Martineau <mathew.j.martineau@linux.intel.com> 于2021年10月18日周一 下午8:50写道：
> >>
> >> On Mon, 18 Oct 2021, Geliang Tang wrote:
> >>
> >>> ./mptcp_join.sh -Cf
> >>
> >> When I tried running this as './mptcp_join.sh -Cfc', I'm not seeing
> >> fallback in the packet trace (no data level len == 0, and the packets all
> >> have DSS headers until disconnect). Is the test working on your system?
> >
> > Hi Mat,
> >
> > This test can't trigger the checksum failure every time, so you need
> > to run it again.
> >
>
> I ran it four or five times yesterday, with no failures. When I tried
> again today, it triggered the checksum failure the first time :)
>
> >
> > My pcap file (mp_join-01-ns1-0-U5nzEu.pcap) is attached.
> >
>
> Thanks for sharing the pcap. When I capture failures on my system, I'm
> seeing slightly different results. The MP_FAIL is only sent in one
> direction, and there is only one infinite mapping sent (in reply to the
> MP_FAIL), but the connection still acts like fallback has happened. I
> uploaded that pcap to the infinite mapping github issue:
>
> https://github.com/multipath-tcp/mptcp_net-next/issues/216#issuecomment-947172574
>
> I noticed in your pcap that the infinite mapping is only sent in one
> direction, but MP_FAIL is sent by both sides.
>
> Section 3.7 of the RFC (after Figure 16) says the MP_FAIL must be sent
> both directions, and I think that also implies the infinite mapping should
> be sent in both directions. So, should make sure that the last mapping
> sent by both peers is an infinite mapping. And if other people have
> differing interpretations of the RFC (especially the "infinite mapping in
> both directions" part that is not completely spelled out), please discuss!

I just sent a squash-to patch to fix this.

Thanks,
-Geliang

>
>
> --
> Mat Martineau
> Intel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-20 14:49             ` Matthieu Baerts
@ 2021-10-21  9:17               ` Geliang Tang
  2021-10-21  9:19               ` Davide Caratti
  1 sibling, 0 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-21  9:17 UTC (permalink / raw)
  To: Matthieu Baerts
  Cc: Davide Caratti, Geliang Tang, Mat Martineau, MPTCP Upstream

Matthieu Baerts <matthieu.baerts@tessares.net> 于2021年10月20日周三 下午10:49写道：
>
> Hi Davide,
>
> On 20/10/2021 13:33, Davide Caratti wrote:
> > hi,
> >
> > On Wed, Oct 20, 2021 at 12:40 PM Matthieu Baerts
> > <matthieu.baerts@tessares.net> wrote:
> >>
> > [...]
> >
> >>>
> >>> I had a script that used TC flower and gact + pedit + csum actions to flip the 'PSH'
> >>> bit in the TCP header, I promised myself to change it to corrupt the DSS option.
> >
> > FTR, that script was:
> >
> > # convert PSH,ACK into ACK
> > # tc filter add dev vetha egress protocol ip prio 1000 flower ip_proto
> > tcp tcp_flags 0x18/0xff action pedit ex munge tcp flags set 0x10 pipe
> > csum tcp reclassify
> > # convert ACK into PSH,ACK, 1 each 10 packets
> > # tc filter add dev vetha egress protocol ip prio 1001 flower ip_proto
> > tcp tcp_flags 0x10/0xff action pass random determ pipe 10 pedit ex
> > munge tcp flags add 0x8 pipe csum tcp
>
> Thank you for sharing that.
>
> I managed to change the content of a TCP packet using:
>
>   $ tc -n <ns> qdisc add dev <iface> root handle 1: htb
>
>   $ tc -n <ns> filter add dev <iface> parent 1: protocol ip prio 1000
> flower ip_proto tcp action pedit munge offset 148 u32 invert pipe csum tcp
>
> It looks like nothing happen for packets smaller than 148 bytes.
>
> Note that I had to add these to the config file:
>
> CONFIG_NET_SCH_HTB=m
>
> CONFIG_NET_ACT_CSUM=m
>
> CONFIG_NET_ACT_GACT=m
>
> CONFIG_GACT_PROB=y
>
> CONFIG_NET_ACT_PEDIT=m
>
> CONFIG_NET_CLS_FLOWER=m
>
>
> @Geliang: do you think you could use this command?

Hi Matt, Davide,

Thanks for your help. It works on my system. I'll write new MP_FAIL
selftest cases using tc to trigger the checksum failure as you suggested.

Thanks,
-Geliang

>
> >>> The problem with this approach is, we can't assume that the DSS option has a fixed
> >>> offset from the IP header. So, the test might start corrupting other options that
> >>> are not MPTCP.
> >>
> >> If we corrupt any data in TCP Options or data and update the TCP
> >> Checksum, that's enough to corrupt the DSS. For example, we could flip
> >> the last bit of the TCP payload (if tcp.len > 0). Would it be possible
> >> to do this by adapting your script?
> >
> > it's easier if it's the first bit, then...
> >
> >> For mptcp.org, I told someone to use netfilterqueue + scapy. He had some
> >> issues but also shared a working version:
> >>
> >> https://github.com/multipath-tcp/mptcp/issues/420#issuecomment-825771587
> >>
> >> I guess we cannot depend on Python and Scapy for selftests. I don't know
> >> if we can also easily depend on eBPF either :)
> >
> > 'bpf' kselftests (and also tc-testing)  already depend on eBPF, so for
> > linux this is a no-problem.
>
> Yes indeed but that increases the complexity to launch selftests. I mean
> you need to add a bit of dependences (and "recent" ones) for 'bpf'
> selftests.
> It could be optional but it still needs to be supported by our CI.
>
> >> And using simple solutions like "tc netem corrupt" will be detectable by
> >> TCP csum.
> >
> > this can be easily workarounded using the 'csum' TC action after the
> > netem dequeue.
>
> Good point! So we would need to create a tree with first "netem" then
> "csum" action?
>
> The only thing is that netem will introduce errors at random positions,
> maybe in IP/TCP headers but it is maybe not an issue. If the TC filter
> command is enough, it might help to produce more controlled tests.
>
> >
> >> So maybe your way is better if we can corrupt anything, e.g. using a
> >> fixed offset like 120 or the last bit (if it is not an ACK). WDYT?
> >
> > I redirected this question to some friends working on eBPF, and they
> > told me it's doable and cool at the same time :)
> > There is an eBPF program that parses the "timestamp" option [1], we
> > can change it easily to corrupt the DSS and recompute the TCP checksum
> > like in Matteo's RHD article.
> >
> > WDYT?
>
> Indeed it looks like it is doable but if we could not add new
> dependences, I think that would be easier for us :)
>
> Cheers,
> Matt
> --
> Tessares | Belgium | Hybrid Access Solutions
> www.tessares.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-20 14:49             ` Matthieu Baerts
  2021-10-21  9:17               ` Geliang Tang
@ 2021-10-21  9:19               ` Davide Caratti
  2021-10-21 12:51                 ` Matthieu Baerts
  1 sibling, 1 reply; 23+ messages in thread
From: Davide Caratti @ 2021-10-21  9:19 UTC (permalink / raw)
  To: Matthieu Baerts; +Cc: Geliang Tang, Geliang Tang, Mat Martineau, MPTCP Upstream

On Wed, Oct 20, 2021 at 04:49:35PM +0200, Matthieu Baerts wrote:
> Hi Davide,
> 
> On 20/10/2021 13:33, Davide Caratti wrote:
> > hi,
> > 
> > On Wed, Oct 20, 2021 at 12:40 PM Matthieu Baerts
> > <matthieu.baerts@tessares.net> wrote:
> >>
> > [...]
> > 
> >>>
> >>> I had a script that used TC flower and gact + pedit + csum actions to flip the 'PSH'
> >>> bit in the TCP header, I promised myself to change it to corrupt the DSS option.
> > 
> > FTR, that script was:
> > 
> > # convert PSH,ACK into ACK
> > # tc filter add dev vetha egress protocol ip prio 1000 flower ip_proto
> > tcp tcp_flags 0x18/0xff action pedit ex munge tcp flags set 0x10 pipe
> > csum tcp reclassify
> > # convert ACK into PSH,ACK, 1 each 10 packets
> > # tc filter add dev vetha egress protocol ip prio 1001 flower ip_proto
> > tcp tcp_flags 0x10/0xff action pass random determ pipe 10 pedit ex
> > munge tcp flags add 0x8 pipe csum tcp
> 
> Thank you for sharing that.
> 
> I managed to change the content of a TCP packet using:
> 
>   $ tc -n <ns> qdisc add dev <iface> root handle 1: htb
> 
>   $ tc -n <ns> filter add dev <iface> parent 1: protocol ip prio 1000
> flower ip_proto tcp action pedit munge offset 148 u32 invert pipe csum tcp
> 
> It looks like nothing happen for packets smaller than 148 bytes.
> 
> Note that I had to add these to the config file:
> 

> CONFIG_NET_SCH_HTB=m

I assume you are using HTB because it's a classful qdisc and it's used to
attach the flower classifier. You don't need it, just use the clsact
qdisc (CONFIG_NET_CLS_ACT=y + CONFIG_NET_SCH_INGRESS=y), it will allow
you inserting filters+actions even if no root qdisc is in place.

# tc qdisc add dev eth0 clsact
# tc filter add dev eth0 egress matchall action simple sdata "hello"

> 
> CONFIG_NET_ACT_CSUM=m
> 
> CONFIG_NET_ACT_GACT=m
> 
> CONFIG_GACT_PROB=y
> 
> CONFIG_NET_ACT_PEDIT=m
> 
> CONFIG_NET_CLS_FLOWER=m
> 
> 
> @Geliang: do you think you could use this command?
> 
> >>> The problem with this approach is, we can't assume that the DSS option has a fixed
> >>> offset from the IP header. So, the test might start corrupting other options that
> >>> are not MPTCP.
> >>
> >> If we corrupt any data in TCP Options or data and update the TCP
> >> Checksum, that's enough to corrupt the DSS. For example, we could flip
> >> the last bit of the TCP payload (if tcp.len > 0). Would it be possible
> >> to do this by adapting your script?
> > 
> > it's easier if it's the first bit, then...
> >
[...] 
 
> >> And using simple solutions like "tc netem corrupt" will be detectable by
> >> TCP csum.
> > 
> > this can be easily workarounded using the 'csum' TC action after the
> > netem dequeue.
> 
> Good point! So we would need to create a tree with first "netem" then
> "csum" action?

the problem is, netem acts *after* TC actions. But probably in your
setup you can overcome this by making netem corrupt when xmitting, and
install the TCP csum  at  the receiver ingress (also here clsact is
helpful):

# ip link add name vetha type veth peer name vethb
# tc qdisc add dev vetha root netem <...>
# tc qdisc a dev vethb clsact
# tc filter add dev vethb ingress protocol ip flower ip_proto tcp action csum tcp

> 
> The only thing is that netem will introduce errors at random positions,
> maybe in IP/TCP headers but it is maybe not an issue. If the TC filter
> command is enough, it might help to produce more controlled tests.

maybe TC flower + action is even more accurate then netem: because it leaves
you the freedom to "corrupt" only packets after the 3WHS (or only the 3WHS,
depending on the match arguments of flower).
You can obtain the same randomization as netem by using TC "act_gact" random
feature ("CONFIG_GACT_PROB=y")

 # tc filter add dev vetha egress protocol ip prio 1001 flower \
  > ip_proto tcp \
  > tcp_flags 0x10/0xff \ <-- only the ACK BIT is set 
  > action pass random determ pipe 10 \    <-- 1 packet every 10 goes to pedit, others go out unmodified
  > pedit munge offset 148 u32 invert pipe csum tcp <-- corrupt the payload and recompute checksum

 
> Indeed it looks like it is doable but if we could not add new
> dependences, I think that would be easier for us :)

Whatever works, it's good also for me :)

-- 
davide
 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-21  9:19               ` Davide Caratti
@ 2021-10-21 12:51                 ` Matthieu Baerts
  2021-10-22  7:37                   ` Geliang Tang
  0 siblings, 1 reply; 23+ messages in thread
From: Matthieu Baerts @ 2021-10-21 12:51 UTC (permalink / raw)
  To: Davide Caratti; +Cc: Geliang Tang, Geliang Tang, Mat Martineau, MPTCP Upstream

Hi Davide,

On 21/10/2021 11:19, Davide Caratti wrote:
> On Wed, Oct 20, 2021 at 04:49:35PM +0200, Matthieu Baerts wrote:
>> Hi Davide,
>>
>> On 20/10/2021 13:33, Davide Caratti wrote:
>>> hi,
>>>
>>> On Wed, Oct 20, 2021 at 12:40 PM Matthieu Baerts
>>> <matthieu.baerts@tessares.net> wrote:
>>>>
>>> [...]
>>>
>>>>>
>>>>> I had a script that used TC flower and gact + pedit + csum actions to flip the 'PSH'
>>>>> bit in the TCP header, I promised myself to change it to corrupt the DSS option.
>>>
>>> FTR, that script was:
>>>
>>> # convert PSH,ACK into ACK
>>> # tc filter add dev vetha egress protocol ip prio 1000 flower ip_proto
>>> tcp tcp_flags 0x18/0xff action pedit ex munge tcp flags set 0x10 pipe
>>> csum tcp reclassify
>>> # convert ACK into PSH,ACK, 1 each 10 packets
>>> # tc filter add dev vetha egress protocol ip prio 1001 flower ip_proto
>>> tcp tcp_flags 0x10/0xff action pass random determ pipe 10 pedit ex
>>> munge tcp flags add 0x8 pipe csum tcp
>>
>> Thank you for sharing that.
>>
>> I managed to change the content of a TCP packet using:
>>
>>   $ tc -n <ns> qdisc add dev <iface> root handle 1: htb
>>
>>   $ tc -n <ns> filter add dev <iface> parent 1: protocol ip prio 1000
>> flower ip_proto tcp action pedit munge offset 148 u32 invert pipe csum tcp
>>
>> It looks like nothing happen for packets smaller than 148 bytes.
>>
>> Note that I had to add these to the config file:
>>
> 
>> CONFIG_NET_SCH_HTB=m
> 
> I assume you are using HTB because it's a classful qdisc and it's used to
> attach the flower classifier. You don't need it, just use the clsact
> qdisc (CONFIG_NET_CLS_ACT=y + CONFIG_NET_SCH_INGRESS=y), it will allow
> you inserting filters+actions even if no root qdisc is in place.
> 
> # tc qdisc add dev eth0 clsact
> # tc filter add dev eth0 egress matchall action simple sdata "hello"

Great, I don't know well this obscure tc :-)

So we should do instead:

$ tc -n <ns> qdisc add dev <iface> clsact
$ tc -n <ns> filter add dev <iface> egress \
  protocol ip prio 1000 \
  flower ip_proto tcp \
  action pedit munge offset 148 u32 invert \
  pipe csum tcp

Right?

> 
>>
>> CONFIG_NET_ACT_CSUM=m
>>
>> CONFIG_NET_ACT_GACT=m
>>
>> CONFIG_GACT_PROB=y
>>
>> CONFIG_NET_ACT_PEDIT=m
>>
>> CONFIG_NET_CLS_FLOWER=m
>>
>>
>> @Geliang: do you think you could use this command?
>>
>>>>> The problem with this approach is, we can't assume that the DSS option has a fixed
>>>>> offset from the IP header. So, the test might start corrupting other options that
>>>>> are not MPTCP.
>>>>
>>>> If we corrupt any data in TCP Options or data and update the TCP
>>>> Checksum, that's enough to corrupt the DSS. For example, we could flip
>>>> the last bit of the TCP payload (if tcp.len > 0). Would it be possible
>>>> to do this by adapting your script?
>>>
>>> it's easier if it's the first bit, then...
>>>
> [...] 
>  
>>>> And using simple solutions like "tc netem corrupt" will be detectable by
>>>> TCP csum.
>>>
>>> this can be easily workarounded using the 'csum' TC action after the
>>> netem dequeue.
>>
>> Good point! So we would need to create a tree with first "netem" then
>> "csum" action?
> 
> the problem is, netem acts *after* TC actions. But probably in your
> setup you can overcome this by making netem corrupt when xmitting, and
> install the TCP csum  at  the receiver ingress (also here clsact is
> helpful):
> 
> # ip link add name vetha type veth peer name vethb
> # tc qdisc add dev vetha root netem <...>
> # tc qdisc a dev vethb clsact
> # tc filter add dev vethb ingress protocol ip flower ip_proto tcp action csum tcp

Thank you for the explanations and the commands :-)

>> The only thing is that netem will introduce errors at random positions,
>> maybe in IP/TCP headers but it is maybe not an issue. If the TC filter
>> command is enough, it might help to produce more controlled tests.
> 
> maybe TC flower + action is even more accurate then netem: because it leaves
> you the freedom to "corrupt" only packets after the 3WHS (or only the 3WHS,
> depending on the match arguments of flower).
> You can obtain the same randomization as netem by using TC "act_gact" random
> feature ("CONFIG_GACT_PROB=y")
> 
>  # tc filter add dev vetha egress protocol ip prio 1001 flower \
>   > ip_proto tcp \
>   > tcp_flags 0x10/0xff \ <-- only the ACK BIT is set 
>   > action pass random determ pipe 10 \    <-- 1 packet every 10 goes to pedit, others go out unmodified
>   > pedit munge offset 148 u32 invert pipe csum tcp <-- corrupt the payload and recompute checksum

Good point!
I don't know if Geliang wants to have all packets transporting data that
are being corrupted or a random one out of X packets or only a specific one.

>> Indeed it looks like it is doable but if we could not add new
>> dependences, I think that would be easier for us :)
> 
> Whatever works, it's good also for me :)

:)

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-21 12:51                 ` Matthieu Baerts
@ 2021-10-22  7:37                   ` Geliang Tang
  2021-10-22 11:25                     ` Matthieu Baerts
  2021-10-25  7:59                     ` Davide Caratti
  0 siblings, 2 replies; 23+ messages in thread
From: Geliang Tang @ 2021-10-22  7:37 UTC (permalink / raw)
  To: Matthieu Baerts
  Cc: Davide Caratti, Geliang Tang, Mat Martineau, MPTCP Upstream

Hi Matt, Davide,

Matthieu Baerts <matthieu.baerts@tessares.net> 于2021年10月21日周四 下午8:51写道：
>
> Hi Davide,
>
> On 21/10/2021 11:19, Davide Caratti wrote:
> > On Wed, Oct 20, 2021 at 04:49:35PM +0200, Matthieu Baerts wrote:
> >> Hi Davide,
> >>
> >> On 20/10/2021 13:33, Davide Caratti wrote:
> >>> hi,
> >>>
> >>> On Wed, Oct 20, 2021 at 12:40 PM Matthieu Baerts
> >>> <matthieu.baerts@tessares.net> wrote:
> >>>>
> >>> [...]
> >>>
> >>>>>
> >>>>> I had a script that used TC flower and gact + pedit + csum actions to flip the 'PSH'
> >>>>> bit in the TCP header, I promised myself to change it to corrupt the DSS option.
> >>>
> >>> FTR, that script was:
> >>>
> >>> # convert PSH,ACK into ACK
> >>> # tc filter add dev vetha egress protocol ip prio 1000 flower ip_proto
> >>> tcp tcp_flags 0x18/0xff action pedit ex munge tcp flags set 0x10 pipe
> >>> csum tcp reclassify
> >>> # convert ACK into PSH,ACK, 1 each 10 packets
> >>> # tc filter add dev vetha egress protocol ip prio 1001 flower ip_proto
> >>> tcp tcp_flags 0x10/0xff action pass random determ pipe 10 pedit ex
> >>> munge tcp flags add 0x8 pipe csum tcp
> >>
> >> Thank you for sharing that.
> >>
> >> I managed to change the content of a TCP packet using:
> >>
> >>   $ tc -n <ns> qdisc add dev <iface> root handle 1: htb
> >>
> >>   $ tc -n <ns> filter add dev <iface> parent 1: protocol ip prio 1000
> >> flower ip_proto tcp action pedit munge offset 148 u32 invert pipe csum tcp
> >>
> >> It looks like nothing happen for packets smaller than 148 bytes.
> >>
> >> Note that I had to add these to the config file:
> >>
> >
> >> CONFIG_NET_SCH_HTB=m
> >
> > I assume you are using HTB because it's a classful qdisc and it's used to
> > attach the flower classifier. You don't need it, just use the clsact
> > qdisc (CONFIG_NET_CLS_ACT=y + CONFIG_NET_SCH_INGRESS=y), it will allow
> > you inserting filters+actions even if no root qdisc is in place.
> >
> > # tc qdisc add dev eth0 clsact
> > # tc filter add dev eth0 egress matchall action simple sdata "hello"
>
> Great, I don't know well this obscure tc :-)
>
> So we should do instead:
>
> $ tc -n <ns> qdisc add dev <iface> clsact
> $ tc -n <ns> filter add dev <iface> egress \
>   protocol ip prio 1000 \
>   flower ip_proto tcp \
>   action pedit munge offset 148 u32 invert \
>   pipe csum tcp
>
> Right?
>
> >
> >>
> >> CONFIG_NET_ACT_CSUM=m
> >>
> >> CONFIG_NET_ACT_GACT=m
> >>
> >> CONFIG_GACT_PROB=y
> >>
> >> CONFIG_NET_ACT_PEDIT=m
> >>
> >> CONFIG_NET_CLS_FLOWER=m
> >>
> >>
> >> @Geliang: do you think you could use this command?
> >>
> >>>>> The problem with this approach is, we can't assume that the DSS option has a fixed
> >>>>> offset from the IP header. So, the test might start corrupting other options that
> >>>>> are not MPTCP.
> >>>>
> >>>> If we corrupt any data in TCP Options or data and update the TCP
> >>>> Checksum, that's enough to corrupt the DSS. For example, we could flip
> >>>> the last bit of the TCP payload (if tcp.len > 0). Would it be possible
> >>>> to do this by adapting your script?
> >>>
> >>> it's easier if it's the first bit, then...
> >>>
> > [...]
> >
> >>>> And using simple solutions like "tc netem corrupt" will be detectable by
> >>>> TCP csum.
> >>>
> >>> this can be easily workarounded using the 'csum' TC action after the
> >>> netem dequeue.
> >>
> >> Good point! So we would need to create a tree with first "netem" then
> >> "csum" action?
> >
> > the problem is, netem acts *after* TC actions. But probably in your
> > setup you can overcome this by making netem corrupt when xmitting, and
> > install the TCP csum  at  the receiver ingress (also here clsact is
> > helpful):
> >
> > # ip link add name vetha type veth peer name vethb
> > # tc qdisc add dev vetha root netem <...>
> > # tc qdisc a dev vethb clsact
> > # tc filter add dev vethb ingress protocol ip flower ip_proto tcp action csum tcp
>
> Thank you for the explanations and the commands :-)
>
> >> The only thing is that netem will introduce errors at random positions,
> >> maybe in IP/TCP headers but it is maybe not an issue. If the TC filter
> >> command is enough, it might help to produce more controlled tests.
> >
> > maybe TC flower + action is even more accurate then netem: because it leaves
> > you the freedom to "corrupt" only packets after the 3WHS (or only the 3WHS,
> > depending on the match arguments of flower).
> > You can obtain the same randomization as netem by using TC "act_gact" random
> > feature ("CONFIG_GACT_PROB=y")
> >
> >  # tc filter add dev vetha egress protocol ip prio 1001 flower \
> >   > ip_proto tcp \
> >   > tcp_flags 0x10/0xff \ <-- only the ACK BIT is set
> >   > action pass random determ pipe 10 \    <-- 1 packet every 10 goes to pedit, others go out unmodified
> >   > pedit munge offset 148 u32 invert pipe csum tcp <-- corrupt the payload and recompute checksum
>
> Good point!
> I don't know if Geliang wants to have all packets transporting data that
> are being corrupted or a random one out of X packets or only a specific one.

I want to have only a specific packet to be corrupted, like sending 10
unmodified packets, then sending 1 corrupted packet, then sending 10
unmodified packets. How can I do this?

Thanks,
-Geliang


>
> >> Indeed it looks like it is doable but if we could not add new
> >> dependences, I think that would be easier for us :)
> >
> > Whatever works, it's good also for me :)
>
> :)
>
> Cheers,
> Matt
> --
> Tessares | Belgium | Hybrid Access Solutions
> www.tessares.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-22  7:37                   ` Geliang Tang
@ 2021-10-22 11:25                     ` Matthieu Baerts
  2021-10-25  7:59                     ` Davide Caratti
  1 sibling, 0 replies; 23+ messages in thread
From: Matthieu Baerts @ 2021-10-22 11:25 UTC (permalink / raw)
  To: Geliang Tang, Davide Caratti; +Cc: Geliang Tang, Mat Martineau, MPTCP Upstream

Hi Geliang, Davide,

On 22/10/2021 09:37, Geliang Tang wrote:
> Matthieu Baerts <matthieu.baerts@tessares.net> 于2021年10月21日周四 下午8:51写道：
>> On 21/10/2021 11:19, Davide Caratti wrote:
>>> On Wed, Oct 20, 2021 at 04:49:35PM +0200, Matthieu Baerts wrote:

(...)

>>>> The only thing is that netem will introduce errors at random positions,
>>>> maybe in IP/TCP headers but it is maybe not an issue. If the TC filter
>>>> command is enough, it might help to produce more controlled tests.
>>>
>>> maybe TC flower + action is even more accurate then netem: because it leaves
>>> you the freedom to "corrupt" only packets after the 3WHS (or only the 3WHS,
>>> depending on the match arguments of flower).
>>> You can obtain the same randomization as netem by using TC "act_gact" random
>>> feature ("CONFIG_GACT_PROB=y")
>>>
>>>  # tc filter add dev vetha egress protocol ip prio 1001 flower \
>>>   > ip_proto tcp \
>>>   > tcp_flags 0x10/0xff \ <-- only the ACK BIT is set
>>>   > action pass random determ pipe 10 \    <-- 1 packet every 10 goes to pedit, others go out unmodified
>>>   > pedit munge offset 148 u32 invert pipe csum tcp <-- corrupt the payload and recompute checksum
>>
>> Good point!
>> I don't know if Geliang wants to have all packets transporting data that
>> are being corrupted or a random one out of X packets or only a specific one.
> 
> I want to have only a specific packet to be corrupted, like sending 10
> unmodified packets, then sending 1 corrupted packet, then sending 10
> unmodified packets. How can I do this?

I didn't find any man page about TC GAct, I guess we would need to look
at that code to have a "statistic way" like you want.

A workaround could be to insert the rule after the creation of the
connection, no? In this case, you don't need "pass random determ pipe
10": all packets carrying data would be corrupted.
Or maybe "one every 10 packets" is enough?

Then you can remove the rule if you need to send unmodified packets but
I don't think you need this, do you?

Or maybe Davide has a better idea or knows how TC GAct works? :)

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test
  2021-10-22  7:37                   ` Geliang Tang
  2021-10-22 11:25                     ` Matthieu Baerts
@ 2021-10-25  7:59                     ` Davide Caratti
  1 sibling, 0 replies; 23+ messages in thread
From: Davide Caratti @ 2021-10-25  7:59 UTC (permalink / raw)
  To: Geliang Tang; +Cc: Matthieu Baerts, Geliang Tang, Mat Martineau, MPTCP Upstream

On Fri, Oct 22, 2021 at 03:37:38PM +0800, Geliang Tang wrote:
> Hi Matt, Davide,
> 

[...] 

> > >> The only thing is that netem will introduce errors at random positions,
> > >> maybe in IP/TCP headers but it is maybe not an issue. If the TC filter
> > >> command is enough, it might help to produce more controlled tests.
> > >
> > > maybe TC flower + action is even more accurate then netem: because it leaves
> > > you the freedom to "corrupt" only packets after the 3WHS (or only the 3WHS,
> > > depending on the match arguments of flower).
> > > You can obtain the same randomization as netem by using TC "act_gact" random
> > > feature ("CONFIG_GACT_PROB=y")
> > >
> > >  # tc filter add dev vetha egress protocol ip prio 1001 flower \
> > >   > ip_proto tcp \
> > >   > tcp_flags 0x10/0xff \ <-- only the ACK BIT is set
> > >   > action pass random determ pipe 10 \    <-- 1 packet every 10 goes to pedit, others go out unmodified
> > >   > pedit munge offset 148 u32 invert pipe csum tcp <-- corrupt the payload and recompute checksum
> >
> > Good point!
> > I don't know if Geliang wants to have all packets transporting data that
> > are being corrupted or a random one out of X packets or only a specific one.
> 
> I want to have only a specific packet to be corrupted, like sending 10
> unmodified packets, then sending 1 corrupted packet, then sending 10
> unmodified packets. How can I do this?

you can use TC act_gact "random determ" :

 # tc filter add dev eth0 egress protocol ip <filter params> \
 > action pass          random determ pipe 10  pedit munge ....

          ^^^            ^^^^^^^^^^^^^^^^^^^
        control          alternate control action:
        action:          pipe to pedit 1 packet 
        pass             every 10 packets



then check counters of the 2 actions:

 # for act in gact pedit; do \
 > tc -s action show action $act; \
 > done

and verify that gact packet counters are 10 times bigger than pedit.
Let me know if that works!

-- 
davide
 


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-10-25  7:59 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-18  4:33 [PATCH mptcp-next v7 0/8] The infinite mapping support Geliang Tang
2021-10-18  4:33 ` [PATCH mptcp-next v7 1/8] mptcp: don't send RST for single subflow Geliang Tang
2021-10-18  4:33 ` [PATCH mptcp-next v7 2/8] mptcp: add the fallback check Geliang Tang
2021-10-18  4:33 ` [PATCH mptcp-next v7 3/8] mptcp: track and update contiguous data status Geliang Tang
2021-10-18  4:33 ` [PATCH mptcp-next v7 4/8] mptcp: infinite mapping sending Geliang Tang
2021-10-18  4:33 ` [PATCH mptcp-next v7 5/8] mptcp: infinite mapping receiving Geliang Tang
2021-10-18  4:33 ` [PATCH mptcp-next v7 6/8] mptcp: add mib for infinite map sending Geliang Tang
2021-10-18  4:33 ` [PATCH mptcp-next v7 7/8] selftests: mptcp: add infinite map mibs check Geliang Tang
2021-10-18  4:33 ` [PATCH mptcp-next v7 8/8] DO-NOT-MERGE: mptcp: mp_fail test Geliang Tang
2021-10-19  0:50   ` Mat Martineau
2021-10-19  3:03     ` Geliang Tang
2021-10-19  8:29       ` Davide Caratti
2021-10-20 10:39         ` Matthieu Baerts
2021-10-20 11:33           ` Davide Caratti
2021-10-20 14:49             ` Matthieu Baerts
2021-10-21  9:17               ` Geliang Tang
2021-10-21  9:19               ` Davide Caratti
2021-10-21 12:51                 ` Matthieu Baerts
2021-10-22  7:37                   ` Geliang Tang
2021-10-22 11:25                     ` Matthieu Baerts
2021-10-25  7:59                     ` Davide Caratti
2021-10-19 23:44       ` Mat Martineau
2021-10-21  6:10         ` Geliang Tang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.