All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 mptcp-next 00/22] mptcp: data checksum support
@ 2021-04-23 14:49 Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 01/22] mptcp: add csum_enabled in mptcp_sock Paolo Abeni
                   ` (21 more replies)
  0 siblings, 22 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

Changes from v3:
 - included Geliang's feedback
 - hopefully fixed wrong csum computation in the rx path
   for DSS+data_fin 

Changes from v2:
 - implement RFC behavior WRT csum failure and csum mismatch
 - cope with zero win probe and re-injection
 - fix csum computation for DSS with data_fin flag set

Only some Squash-to patches has been changes WRT the previous
iteration.

There is still at least a ough edge:
- the csum computation in the tx path is very suboptimal.

Anyhow, I propose to address the above with additional squash-to patches
to avoid complex/extra long review cycles.

Geliang Tang (14):
  mptcp: add csum_enabled in mptcp_sock
  mptcp: generate the data checksum
  mptcp: add csum_reqd in mptcp_out_options
  mptcp: send out checksum for MP_CAPABLE with data
  mptcp: send out checksum for DSS
  mptcp: add sk parameter for mptcp_parse_option
  mptcp: add csum_reqd in mptcp_options_received
  mptcp: receive checksum for MP_CAPABLE with data
  mptcp: receive checksum for DSS
  mptcp: validate the data checksum
  mptcp: add the mib for data checksum
  mptcp: add a new sysctl checksum_enabled
  selftests: mptcp: enable checksum in mptcp_connect.sh
  selftests: mptcp: enable checksum in mptcp_join.sh

Paolo Abeni (8):
  Squash-to: "mptcp: generate the data checksum"
  Squash-to: "mptcp: send out checksum for MP_CAPABLE with data"
  Squash-to "mptcp: send out checksum for DSS"
  Squash-to: "mptcp: receive checksum for MP_CAPABLE with data"
  Squash-to: "mptcp: receive checksum for DSS"
  Squash-to: mptcp: validate the data checksum
  mptcp: cleanup error path in subflow_check_data_avail()
  mptcp: tune re-injections for csum enabled mode.

 Documentation/networking/mptcp-sysctl.rst     |   8 +
 include/net/mptcp.h                           |   7 +-
 include/uapi/linux/mptcp.h                    |   1 +
 net/mptcp/ctrl.c                              |  14 ++
 net/mptcp/mib.c                               |   1 +
 net/mptcp/mib.h                               |   1 +
 net/mptcp/mptcp_diag.c                        |   1 +
 net/mptcp/options.c                           | 142 +++++++++++++-----
 net/mptcp/protocol.c                          |  39 ++++-
 net/mptcp/protocol.h                          |  23 ++-
 net/mptcp/subflow.c                           | 142 +++++++++++++++---
 .../selftests/net/mptcp/mptcp_connect.sh      |  13 +-
 .../testing/selftests/net/mptcp/mptcp_join.sh | 107 ++++++++++++-
 13 files changed, 423 insertions(+), 76 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 01/22] mptcp: add csum_enabled in mptcp_sock
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 02/22] mptcp: generate the data checksum Paolo Abeni
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

This patch added a new member named csum_enabled in struct mptcp_sock,
used a dummy mptcp_is_checksum_enabled() helper to initialize it.

Also added a new member named mptcpi_csum_enabled in struct mptcp_info
to expose the csum_enabled flag.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 include/uapi/linux/mptcp.h | 1 +
 net/mptcp/mptcp_diag.c     | 1 +
 net/mptcp/protocol.c       | 1 +
 net/mptcp/protocol.h       | 2 ++
 4 files changed, 5 insertions(+)

diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h
index 8eb3c0844bff..7b05f7102321 100644
--- a/include/uapi/linux/mptcp.h
+++ b/include/uapi/linux/mptcp.h
@@ -105,6 +105,7 @@ struct mptcp_info {
 	__u64	mptcpi_rcv_nxt;
 	__u8	mptcpi_local_addr_used;
 	__u8	mptcpi_local_addr_max;
+	__u8	mptcpi_csum_enabled;
 };
 
 /*
diff --git a/net/mptcp/mptcp_diag.c b/net/mptcp/mptcp_diag.c
index f16d9b5ee978..8f88ddeab6a2 100644
--- a/net/mptcp/mptcp_diag.c
+++ b/net/mptcp/mptcp_diag.c
@@ -144,6 +144,7 @@ static void mptcp_diag_get_info(struct sock *sk, struct inet_diag_msg *r,
 	info->mptcpi_write_seq = READ_ONCE(msk->write_seq);
 	info->mptcpi_snd_una = READ_ONCE(msk->snd_una);
 	info->mptcpi_rcv_nxt = READ_ONCE(msk->ack_seq);
+	info->mptcpi_csum_enabled = READ_ONCE(msk->csum_enabled);
 	unlock_sock_fast(sk, slow);
 }
 
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index a8180a917649..5b6bb1d27af5 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2392,6 +2392,7 @@ static int __mptcp_init_sock(struct sock *sk)
 	msk->ack_hint = NULL;
 	msk->first = NULL;
 	inet_csk(sk)->icsk_sync_mss = mptcp_sync_mss;
+	WRITE_ONCE(msk->csum_enabled, mptcp_is_checksum_enabled(sock_net(sk)));
 
 	mptcp_pm_data_init(msk);
 
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index edc0128730df..240a5d9bc7da 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -234,6 +234,7 @@ struct mptcp_sock {
 	bool		snd_data_fin_enable;
 	bool		rcv_fastclose;
 	bool		use_64bit_ack; /* Set when we received a 64-bit DSN */
+	bool		csum_enabled;
 	spinlock_t	join_list_lock;
 	struct sock	*ack_hint;
 	struct work_struct work;
@@ -524,6 +525,7 @@ static inline void mptcp_subflow_delegated_done(struct mptcp_subflow_context *su
 
 int mptcp_is_enabled(struct net *net);
 unsigned int mptcp_get_add_addr_timeout(struct net *net);
+static inline int mptcp_is_checksum_enabled(struct net *net) { return false; }
 void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
 				     struct mptcp_options_received *mp_opt);
 bool mptcp_subflow_data_available(struct sock *sk);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 02/22] mptcp: generate the data checksum
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 01/22] mptcp: add csum_enabled in mptcp_sock Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 03/22] Squash-to: "mptcp: generate the data checksum" Paolo Abeni
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

This patch added a new member named csum in struct mptcp_ext, implemented
a new function named mptcp_generate_data_checksum().

Generate the data checksum in mptcp_sendmsg_frag, save it in mpext->csum.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 include/net/mptcp.h  |  1 +
 net/mptcp/protocol.c | 21 +++++++++++++++++++++
 net/mptcp/protocol.h |  7 +++++++
 3 files changed, 29 insertions(+)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 83f23774b908..23bbd439e115 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -23,6 +23,7 @@ struct mptcp_ext {
 	u64		data_seq;
 	u32		subflow_seq;
 	u16		data_len;
+	__sum16		csum;
 	u8		use_map:1,
 			dsn64:1,
 			data_fin:1,
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 5b6bb1d27af5..abcb22848a5c 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1271,6 +1271,25 @@ static bool mptcp_alloc_tx_skb(struct sock *sk, struct sock *ssk)
 	return __mptcp_alloc_tx_skb(sk, ssk, sk->sk_allocation);
 }
 
+static __sum16 mptcp_generate_data_checksum(struct sk_buff *skb)
+{
+	struct csum_pseudo_header header;
+	struct mptcp_ext *mpext;
+	__wsum csum;
+
+	mpext = mptcp_get_ext(skb);
+
+	header.data_seq = mpext->data_seq;
+	header.subflow_seq = mpext->subflow_seq;
+	header.data_len = mpext->data_len;
+	header.csum = 0;
+
+	csum = skb_checksum(skb, 0, skb->len, 0);
+	csum = csum_partial(&header, sizeof(header), csum);
+
+	return csum_fold(csum);
+}
+
 static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 			      struct mptcp_data_frag *dfrag,
 			      struct mptcp_sendmsg_info *info)
@@ -1369,6 +1388,8 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 		tcp_push_pending_frames(ssk);
 	}
 out:
+	if (READ_ONCE(msk->csum_enabled))
+		mpext->csum = mptcp_generate_data_checksum(tail);
 	mptcp_subflow_ctx(ssk)->rel_write_seq += ret;
 	return ret;
 }
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 240a5d9bc7da..e9e2d3c1a142 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -335,6 +335,13 @@ static inline struct mptcp_data_frag *mptcp_rtx_head(const struct sock *sk)
 	return list_first_entry_or_null(&msk->rtx_queue, struct mptcp_data_frag, list);
 }
 
+struct csum_pseudo_header {
+	u64 data_seq;
+	u32 subflow_seq;
+	u16 data_len;
+	__sum16 csum;
+};
+
 struct mptcp_subflow_request_sock {
 	struct	tcp_request_sock sk;
 	u16	mp_capable : 1,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 03/22] Squash-to: "mptcp: generate the data checksum"
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 01/22] mptcp: add csum_enabled in mptcp_sock Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 02/22] mptcp: generate the data checksum Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 04/22] mptcp: add csum_reqd in mptcp_out_options Paolo Abeni
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

We must generate the csum for zero window probe, too.

Additionally we must update the checksum when setting the data_fin
on a data packet.

Note that in a later patch we will skip unneeded csum related
operation. Changes not included here to keep the delta small.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/options.c  | 3 +++
 net/mptcp/protocol.c | 7 ++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 99fc21406168..ebce0326e3b6 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -519,6 +519,9 @@ static void mptcp_write_data_fin(struct mptcp_subflow_context *subflow,
 		 */
 		ext->data_fin = 1;
 		ext->data_len++;
+
+		/* the pseudo header has changed, update the csum accordingly */
+		csum_replace2(&ext->csum, ext->data_len - 1, ext->data_len);
 	}
 }
 
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index abcb22848a5c..d012b9998fdd 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1271,6 +1271,9 @@ static bool mptcp_alloc_tx_skb(struct sock *sk, struct sock *ssk)
 	return __mptcp_alloc_tx_skb(sk, ssk, sk->sk_allocation);
 }
 
+/* note: this always recompute the csum on the whole skb, even
+ * if we just appended a single frag. More status info needed
+ */
 static __sum16 mptcp_generate_data_checksum(struct sk_buff *skb)
 {
 	struct csum_pseudo_header header;
@@ -1384,8 +1387,10 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 	if (zero_window_probe) {
 		mptcp_subflow_ctx(ssk)->rel_write_seq += ret;
 		mpext->frozen = 1;
-		ret = 0;
+		if (READ_ONCE(msk->csum_enabled))
+			mpext->csum = mptcp_generate_data_checksum(tail);
 		tcp_push_pending_frames(ssk);
+		return 0;
 	}
 out:
 	if (READ_ONCE(msk->csum_enabled))
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 04/22] mptcp: add csum_reqd in mptcp_out_options
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (2 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 03/22] Squash-to: "mptcp: generate the data checksum" Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 05/22] mptcp: send out checksum for MP_CAPABLE with data Paolo Abeni
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

This patch added a new member csum_reqd in struct mptcp_out_options and
struct mptcp_subflow_request_sock. Initialized it with the helper
function mptcp_is_checksum_enabled().

In mptcp_write_options, if this field is enabled, send out the MP_CAPABLE
suboption with the MPTCP_CAP_CHECKSUM_REQD flag.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 include/net/mptcp.h  |  3 ++-
 net/mptcp/options.c  | 11 +++++++++--
 net/mptcp/protocol.h |  3 ++-
 net/mptcp/subflow.c  |  1 +
 4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 23bbd439e115..8f86c05ddbfd 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -65,7 +65,8 @@ struct mptcp_out_options {
 	u8 join_id;
 	u8 backup;
 	u8 reset_reason:4;
-	u8 reset_transient:1;
+	u8 reset_transient:1,
+	   csum_reqd:1;
 	u32 nonce;
 	u64 thmac;
 	u32 token;
diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index ebce0326e3b6..67bb08ce8ed7 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -381,6 +381,7 @@ bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
 	subflow->snd_isn = TCP_SKB_CB(skb)->end_seq;
 	if (subflow->request_mptcp) {
 		opts->suboptions = OPTION_MPTCP_MPC_SYN;
+		opts->csum_reqd = mptcp_is_checksum_enabled(sock_net(sk));
 		*size = TCPOLEN_MPTCP_MPC_SYN;
 		return true;
 	} else if (subflow->request_join) {
@@ -436,6 +437,7 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 					 struct mptcp_out_options *opts)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
+	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
 	struct mptcp_ext *mpext;
 	unsigned int data_len;
 
@@ -466,6 +468,7 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 		opts->suboptions = OPTION_MPTCP_MPC_ACK;
 		opts->sndr_key = subflow->local_key;
 		opts->rcvr_key = subflow->remote_key;
+		opts->csum_reqd = READ_ONCE(msk->csum_enabled);
 
 		/* Section 3.1.
 		 * The MP_CAPABLE option is carried on the SYN, SYN/ACK, and ACK
@@ -793,6 +796,7 @@ bool mptcp_synack_options(const struct request_sock *req, unsigned int *size,
 	if (subflow_req->mp_capable) {
 		opts->suboptions = OPTION_MPTCP_MPC_SYNACK;
 		opts->sndr_key = subflow_req->local_key;
+		opts->csum_reqd = subflow_req->csum_reqd;
 		*size = TCPOLEN_MPTCP_MPC_SYNACK;
 		pr_debug("subflow_req=%p, local_key=%llu",
 			 subflow_req, subflow_req->local_key);
@@ -1127,7 +1131,7 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 {
 	if ((OPTION_MPTCP_MPC_SYN | OPTION_MPTCP_MPC_SYNACK |
 	     OPTION_MPTCP_MPC_ACK) & opts->suboptions) {
-		u8 len;
+		u8 len, flag = MPTCP_CAP_HMAC_SHA256;
 
 		if (OPTION_MPTCP_MPC_SYN & opts->suboptions)
 			len = TCPOLEN_MPTCP_MPC_SYN;
@@ -1138,9 +1142,12 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 		else
 			len = TCPOLEN_MPTCP_MPC_ACK;
 
+		if (opts->csum_reqd)
+			flag |= MPTCP_CAP_CHECKSUM_REQD;
+
 		*ptr++ = mptcp_option(MPTCPOPT_MP_CAPABLE, len,
 				      MPTCP_SUPPORTED_VERSION,
-				      MPTCP_CAP_HMAC_SHA256);
+				      flag);
 
 		if (!((OPTION_MPTCP_MPC_SYNACK | OPTION_MPTCP_MPC_ACK) &
 		    opts->suboptions))
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index e9e2d3c1a142..6955f507609e 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -346,7 +346,8 @@ struct mptcp_subflow_request_sock {
 	struct	tcp_request_sock sk;
 	u16	mp_capable : 1,
 		mp_join : 1,
-		backup : 1;
+		backup : 1,
+		csum_reqd : 1;
 	u8	local_id;
 	u8	remote_id;
 	u64	local_key;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 82e91b00ad39..79fd8fbbd846 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -108,6 +108,7 @@ static void subflow_init_req(struct request_sock *req, const struct sock *sk_lis
 
 	subflow_req->mp_capable = 0;
 	subflow_req->mp_join = 0;
+	subflow_req->csum_reqd = mptcp_is_checksum_enabled(sock_net(sk_listener));
 	subflow_req->msk = NULL;
 	mptcp_token_init_request(req);
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 05/22] mptcp: send out checksum for MP_CAPABLE with data
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (3 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 04/22] mptcp: add csum_reqd in mptcp_out_options Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 06/22] Squash-to: "mptcp: send out checksum for MP_CAPABLE with data" Paolo Abeni
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

If the checksum is enabled, send out the data checksum with the
MP_CAPABLE suboption with data.

In mptcp_established_options_mp, save the data checksum in
opts->ext_copy.csum. In mptcp_write_options, adjust the option length and
send it out with the MP_CAPABLE suboption.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 net/mptcp/options.c | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 67bb08ce8ed7..710e3de8c910 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -440,6 +440,8 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
 	struct mptcp_ext *mpext;
 	unsigned int data_len;
+	__sum16 csum;
+	u8 len;
 
 	/* When skb is not available, we better over-estimate the emitted
 	 * options len. A full DSS option (28 bytes) is longer than
@@ -459,6 +461,7 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 	if (subflow->mp_capable) {
 		mpext = mptcp_get_ext(skb);
 		data_len = mpext ? mpext->data_len : 0;
+		csum = mpext ? mpext->csum : 0;
 
 		/* we will check ext_copy.data_len in mptcp_write_options() to
 		 * discriminate between TCPOLEN_MPTCP_MPC_ACK_DATA and
@@ -469,16 +472,22 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 		opts->sndr_key = subflow->local_key;
 		opts->rcvr_key = subflow->remote_key;
 		opts->csum_reqd = READ_ONCE(msk->csum_enabled);
+		if (opts->csum_reqd)
+			opts->ext_copy.csum = csum;
 
 		/* Section 3.1.
 		 * The MP_CAPABLE option is carried on the SYN, SYN/ACK, and ACK
 		 * packets that start the first subflow of an MPTCP connection,
 		 * as well as the first packet that carries data
 		 */
-		if (data_len > 0)
-			*size = ALIGN(TCPOLEN_MPTCP_MPC_ACK_DATA, 4);
-		else
+		if (data_len > 0) {
+			len = TCPOLEN_MPTCP_MPC_ACK_DATA;
+			if (opts->csum_reqd)
+				len += TCPOLEN_MPTCP_DSS_CHECKSUM;
+			*size = ALIGN(len, 4);
+		} else {
 			*size = TCPOLEN_MPTCP_MPC_ACK;
+		}
 
 		pr_debug("subflow=%p, local_key=%llu, remote_key=%llu map_len=%d",
 			 subflow, subflow->local_key, subflow->remote_key,
@@ -1133,14 +1142,17 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 	     OPTION_MPTCP_MPC_ACK) & opts->suboptions) {
 		u8 len, flag = MPTCP_CAP_HMAC_SHA256;
 
-		if (OPTION_MPTCP_MPC_SYN & opts->suboptions)
+		if (OPTION_MPTCP_MPC_SYN & opts->suboptions) {
 			len = TCPOLEN_MPTCP_MPC_SYN;
-		else if (OPTION_MPTCP_MPC_SYNACK & opts->suboptions)
+		} else if (OPTION_MPTCP_MPC_SYNACK & opts->suboptions) {
 			len = TCPOLEN_MPTCP_MPC_SYNACK;
-		else if (opts->ext_copy.data_len)
+		} else if (opts->ext_copy.data_len) {
 			len = TCPOLEN_MPTCP_MPC_ACK_DATA;
-		else
+			if (opts->csum_reqd)
+				len += TCPOLEN_MPTCP_DSS_CHECKSUM;
+		} else {
 			len = TCPOLEN_MPTCP_MPC_ACK;
+		}
 
 		if (opts->csum_reqd)
 			flag |= MPTCP_CAP_CHECKSUM_REQD;
@@ -1163,8 +1175,13 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 		if (!opts->ext_copy.data_len)
 			goto mp_capable_done;
 
-		put_unaligned_be32(opts->ext_copy.data_len << 16 |
-				   TCPOPT_NOP << 8 | TCPOPT_NOP, ptr);
+		if (opts->csum_reqd) {
+			put_unaligned_be32(opts->ext_copy.data_len << 16 |
+					   (__force u16)opts->ext_copy.csum, ptr);
+		} else {
+			put_unaligned_be32(opts->ext_copy.data_len << 16 |
+					   TCPOPT_NOP << 8 | TCPOPT_NOP, ptr);
+		}
 		ptr += 1;
 	}
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 06/22] Squash-to: "mptcp: send out checksum for MP_CAPABLE with data"
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (4 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 05/22] mptcp: send out checksum for MP_CAPABLE with data Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 07/22] mptcp: send out checksum for DSS Paolo Abeni
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

Copy the checksum into the opts only after computing it.

This is necessary after:
Squash-to: "mptcp: generate the data checksum"

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/options.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 710e3de8c910..814fc43ee97b 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -440,7 +440,6 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
 	struct mptcp_ext *mpext;
 	unsigned int data_len;
-	__sum16 csum;
 	u8 len;
 
 	/* When skb is not available, we better over-estimate the emitted
@@ -461,7 +460,6 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 	if (subflow->mp_capable) {
 		mpext = mptcp_get_ext(skb);
 		data_len = mpext ? mpext->data_len : 0;
-		csum = mpext ? mpext->csum : 0;
 
 		/* we will check ext_copy.data_len in mptcp_write_options() to
 		 * discriminate between TCPOLEN_MPTCP_MPC_ACK_DATA and
@@ -472,8 +470,6 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 		opts->sndr_key = subflow->local_key;
 		opts->rcvr_key = subflow->remote_key;
 		opts->csum_reqd = READ_ONCE(msk->csum_enabled);
-		if (opts->csum_reqd)
-			opts->ext_copy.csum = csum;
 
 		/* Section 3.1.
 		 * The MP_CAPABLE option is carried on the SYN, SYN/ACK, and ACK
@@ -482,8 +478,10 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 		 */
 		if (data_len > 0) {
 			len = TCPOLEN_MPTCP_MPC_ACK_DATA;
-			if (opts->csum_reqd)
+			if (opts->csum_reqd) {
+				opts->ext_copy.csum = mpext->csum;
 				len += TCPOLEN_MPTCP_DSS_CHECKSUM;
+			}
 			*size = ALIGN(len, 4);
 		} else {
 			*size = TCPOLEN_MPTCP_MPC_ACK;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 07/22] mptcp: send out checksum for DSS
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (5 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 06/22] Squash-to: "mptcp: send out checksum for MP_CAPABLE with data" Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 08/22] Squash-to "mptcp: send out checksum for DSS" Paolo Abeni
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

In mptcp_write_options, If the checksum is enabled, adjust the option
length and send out the data checksum with DSS suboption.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 net/mptcp/options.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 814fc43ee97b..01caff4cdd25 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -549,6 +549,7 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 	bool ret = false;
 	u64 ack_seq;
 
+	opts->csum_reqd = READ_ONCE(msk->csum_enabled);
 	mpext = skb ? mptcp_get_ext(skb) : NULL;
 
 	if (!skb || (mpext && mpext->use_map) || snd_data_fin_enable) {
@@ -1331,6 +1332,9 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 			flags |= MPTCP_DSS_HAS_MAP | MPTCP_DSS_DSN64;
 			if (mpext->data_fin)
 				flags |= MPTCP_DSS_DATA_FIN;
+
+			if (opts->csum_reqd)
+				len += TCPOLEN_MPTCP_DSS_CHECKSUM;
 		}
 
 		*ptr++ = mptcp_option(MPTCPOPT_DSS, len, 0, flags);
@@ -1350,8 +1354,13 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 			ptr += 2;
 			put_unaligned_be32(mpext->subflow_seq, ptr);
 			ptr += 1;
-			put_unaligned_be32(mpext->data_len << 16 |
-					   TCPOPT_NOP << 8 | TCPOPT_NOP, ptr);
+			if (opts->csum_reqd) {
+				put_unaligned_be32(mpext->data_len << 16 |
+						   (__force u16)mpext->csum, ptr);
+			} else {
+				put_unaligned_be32(mpext->data_len << 16 |
+						   TCPOPT_NOP << 8 | TCPOPT_NOP, ptr);
+			}
 		}
 	}
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 08/22] Squash-to "mptcp: send out checksum for DSS"
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (6 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 07/22] mptcp: send out checksum for DSS Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 09/22] mptcp: add sk parameter for mptcp_parse_option Paolo Abeni
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

Account for csum len in mptcp_established_options_dss()

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v3 -> v4
 - use opts->csum_reqd (Geliang)
 - reoder remaining, dss_size update (Geliang)
---
 net/mptcp/options.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 01caff4cdd25..f8430415a954 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -553,15 +553,17 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 	mpext = skb ? mptcp_get_ext(skb) : NULL;
 
 	if (!skb || (mpext && mpext->use_map) || snd_data_fin_enable) {
-		unsigned int map_size;
+		unsigned int map_size = TCPOLEN_MPTCP_DSS_BASE + TCPOLEN_MPTCP_DSS_MAP64;
 
-		map_size = TCPOLEN_MPTCP_DSS_BASE + TCPOLEN_MPTCP_DSS_MAP64;
+		if (mpext) {
+			if (READ_ONCE(msk->csum_enabled))
+				map_size += TCPOLEN_MPTCP_DSS_CHECKSUM;
 
-		remaining -= map_size;
-		dss_size = map_size;
-		if (mpext)
 			opts->ext_copy = *mpext;
+		}
 
+		dss_size = map_size;
+		remaining -= map_size;
 		if (skb && snd_data_fin_enable)
 			mptcp_write_data_fin(subflow, skb, &opts->ext_copy);
 		ret = true;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 09/22] mptcp: add sk parameter for mptcp_parse_option
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (7 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 08/22] Squash-to "mptcp: send out checksum for DSS" Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 10/22] mptcp: add csum_reqd in mptcp_options_received Paolo Abeni
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

This patch added a new parameter name sk in mptcp_parse_option() and
mptcp_get_options().

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 net/mptcp/options.c  | 10 ++++++----
 net/mptcp/protocol.h |  3 ++-
 net/mptcp/subflow.c  | 10 +++++-----
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index f8430415a954..cf465b62e4b4 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -20,7 +20,8 @@ static bool mptcp_cap_flag_sha256(u8 flags)
 	return (flags & MPTCP_CAP_FLAG_MASK) == MPTCP_CAP_HMAC_SHA256;
 }
 
-static void mptcp_parse_option(const struct sk_buff *skb,
+static void mptcp_parse_option(const struct sock *sk,
+			       const struct sk_buff *skb,
 			       const unsigned char *ptr, int opsize,
 			       struct mptcp_options_received *mp_opt)
 {
@@ -324,7 +325,8 @@ static void mptcp_parse_option(const struct sk_buff *skb,
 	}
 }
 
-void mptcp_get_options(const struct sk_buff *skb,
+void mptcp_get_options(const struct sock *sk,
+		       const struct sk_buff *skb,
 		       struct mptcp_options_received *mp_opt)
 {
 	const struct tcphdr *th = tcp_hdr(skb);
@@ -363,7 +365,7 @@ void mptcp_get_options(const struct sk_buff *skb,
 			if (opsize > length)
 				return;	/* don't parse partial options */
 			if (opcode == TCPOPT_MPTCP)
-				mptcp_parse_option(skb, ptr, opsize, mp_opt);
+				mptcp_parse_option(sk, skb, ptr, opsize, mp_opt);
 			ptr += opsize - 2;
 			length -= opsize;
 		}
@@ -1025,7 +1027,7 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
 		return;
 	}
 
-	mptcp_get_options(skb, &mp_opt);
+	mptcp_get_options(sk, skb, &mp_opt);
 	if (!check_fully_established(msk, sk, subflow, skb, &mp_opt))
 		return;
 
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 6955f507609e..87d78819a545 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -585,7 +585,8 @@ int __init mptcp_proto_v6_init(void);
 struct sock *mptcp_sk_clone(const struct sock *sk,
 			    const struct mptcp_options_received *mp_opt,
 			    struct request_sock *req);
-void mptcp_get_options(const struct sk_buff *skb,
+void mptcp_get_options(const struct sock *sk,
+		       const struct sk_buff *skb,
 		       struct mptcp_options_received *mp_opt);
 
 void mptcp_finish_connect(struct sock *sk);
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 79fd8fbbd846..39ae86d12f06 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -151,7 +151,7 @@ static int subflow_check_req(struct request_sock *req,
 		return -EINVAL;
 #endif
 
-	mptcp_get_options(skb, &mp_opt);
+	mptcp_get_options(sk_listener, skb, &mp_opt);
 
 	if (mp_opt.mp_capable) {
 		SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_MPCAPABLEPASSIVE);
@@ -248,7 +248,7 @@ int mptcp_subflow_init_cookie_req(struct request_sock *req,
 	int err;
 
 	subflow_init_req(req, sk_listener);
-	mptcp_get_options(skb, &mp_opt);
+	mptcp_get_options(sk_listener, skb, &mp_opt);
 
 	if (mp_opt.mp_capable && mp_opt.mp_join)
 		return -EINVAL;
@@ -395,7 +395,7 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb)
 	subflow->ssn_offset = TCP_SKB_CB(skb)->seq;
 	pr_debug("subflow=%p synack seq=%x", subflow, subflow->ssn_offset);
 
-	mptcp_get_options(skb, &mp_opt);
+	mptcp_get_options(sk, skb, &mp_opt);
 	if (subflow->request_mptcp) {
 		if (!mp_opt.mp_capable) {
 			MPTCP_INC_STATS(sock_net(sk),
@@ -640,7 +640,7 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 			goto create_msk;
 		}
 
-		mptcp_get_options(skb, &mp_opt);
+		mptcp_get_options(sk, skb, &mp_opt);
 		if (!mp_opt.mp_capable) {
 			fallback = true;
 			goto create_child;
@@ -651,7 +651,7 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 		if (!new_msk)
 			fallback = true;
 	} else if (subflow_req->mp_join) {
-		mptcp_get_options(skb, &mp_opt);
+		mptcp_get_options(sk, skb, &mp_opt);
 		if (!mp_opt.mp_join || !subflow_hmac_valid(req, &mp_opt) ||
 		    !mptcp_can_accept_new_subflow(subflow_req->msk)) {
 			SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_JOINACKMAC);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 10/22] mptcp: add csum_reqd in mptcp_options_received
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (8 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 09/22] mptcp: add sk parameter for mptcp_parse_option Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 11/22] mptcp: receive checksum for MP_CAPABLE with data Paolo Abeni
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

This patch added a new flag csum_reqd in struct mptcp_options_received, if
the flag MPTCP_CAP_CHECKSUM_REQD is set in the receiving MP_CAPABLE
suboption, set this flag.

In mptcp_sk_clone and subflow_finish_connect, if the csum_reqd flag is set,
enable the msk->csum_enabled flag.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 net/mptcp/options.c  | 8 +++++---
 net/mptcp/protocol.c | 2 ++
 net/mptcp/protocol.h | 1 +
 net/mptcp/subflow.c  | 2 ++
 4 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index cf465b62e4b4..2d6b0158e19b 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -25,6 +25,8 @@ static void mptcp_parse_option(const struct sock *sk,
 			       const unsigned char *ptr, int opsize,
 			       struct mptcp_options_received *mp_opt)
 {
+	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
+	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
 	u8 subtype = *ptr >> 4;
 	int expected_opsize;
 	u8 version;
@@ -72,11 +74,10 @@ static void mptcp_parse_option(const struct sock *sk,
 		 * "If a checksum is not present when its use has been
 		 * negotiated, the receiver MUST close the subflow with a RST as
 		 * it is considered broken."
-		 *
-		 * We don't implement DSS checksum - fall back to TCP.
 		 */
+		mp_opt->csum_reqd = READ_ONCE(msk->csum_enabled);
 		if (flags & MPTCP_CAP_CHECKSUM_REQD)
-			break;
+			mp_opt->csum_reqd = 1;
 
 		mp_opt->mp_capable = 1;
 		if (opsize >= TCPOLEN_MPTCP_MPC_SYNACK) {
@@ -344,6 +345,7 @@ void mptcp_get_options(const struct sock *sk,
 	mp_opt->dss = 0;
 	mp_opt->mp_prio = 0;
 	mp_opt->reset = 0;
+	mp_opt->csum_reqd = 0;
 
 	length = (th->doff * 4) - sizeof(struct tcphdr);
 	ptr = (const unsigned char *)(th + 1);
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index d012b9998fdd..0d8005b480ab 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2753,6 +2753,8 @@ struct sock *mptcp_sk_clone(const struct sock *sk,
 	msk->token = subflow_req->token;
 	msk->subflow = NULL;
 	WRITE_ONCE(msk->fully_established, false);
+	if (mp_opt->csum_reqd)
+		WRITE_ONCE(msk->csum_enabled, true);
 
 	msk->write_seq = subflow_req->idsn + 1;
 	msk->snd_nxt = msk->write_seq;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 87d78819a545..51f4bbe1703a 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -133,6 +133,7 @@ struct mptcp_options_received {
 		rm_addr : 1,
 		mp_prio : 1,
 		echo : 1,
+		csum_reqd : 1,
 		backup : 1;
 	u32	token;
 	u32	nonce;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 39ae86d12f06..fc39107b25ec 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -405,6 +405,8 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb)
 			goto fallback;
 		}
 
+		if (mp_opt.csum_reqd)
+			WRITE_ONCE(mptcp_sk(parent)->csum_enabled, true);
 		subflow->mp_capable = 1;
 		subflow->can_ack = 1;
 		subflow->remote_key = mp_opt.sndr_key;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 11/22] mptcp: receive checksum for MP_CAPABLE with data
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (9 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 10/22] mptcp: add csum_reqd in mptcp_options_received Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 12/22] Squash-to: "mptcp: receive checksum for MP_CAPABLE with data" Paolo Abeni
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

This patch added a new member named csum in struct mptcp_options_received.

When parsing the MP_CAPABLE with data, if the checksum is enabled,
adjust the expected_opsize. If the receiving option length matches the
length with the data checksum, get the checksum value and save it in
mp_opt->csum. And in mptcp_incoming_options, pass it to mpext->csum.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 net/mptcp/options.c  | 21 ++++++++++++++++-----
 net/mptcp/protocol.h |  1 +
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 2d6b0158e19b..cd205ad09c00 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -37,10 +37,13 @@ static void mptcp_parse_option(const struct sock *sk,
 	case MPTCPOPT_MP_CAPABLE:
 		/* strict size checking */
 		if (!(TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)) {
-			if (skb->len > tcp_hdr(skb)->doff << 2)
+			if (skb->len > tcp_hdr(skb)->doff << 2) {
 				expected_opsize = TCPOLEN_MPTCP_MPC_ACK_DATA;
-			else
+				if (READ_ONCE(msk->csum_enabled))
+					expected_opsize += TCPOLEN_MPTCP_DSS_CHECKSUM;
+			} else {
 				expected_opsize = TCPOLEN_MPTCP_MPC_ACK;
+			}
 		} else {
 			if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_ACK)
 				expected_opsize = TCPOLEN_MPTCP_MPC_SYNACK;
@@ -88,7 +91,7 @@ static void mptcp_parse_option(const struct sock *sk,
 			mp_opt->rcvr_key = get_unaligned_be64(ptr);
 			ptr += 8;
 		}
-		if (opsize == TCPOLEN_MPTCP_MPC_ACK_DATA) {
+		if (opsize >= TCPOLEN_MPTCP_MPC_ACK_DATA) {
 			/* Section 3.1.:
 			 * "the data parameters in a MP_CAPABLE are semantically
 			 * equivalent to those in a DSS option and can be used
@@ -100,9 +103,13 @@ static void mptcp_parse_option(const struct sock *sk,
 			mp_opt->data_len = get_unaligned_be16(ptr);
 			ptr += 2;
 		}
-		pr_debug("MP_CAPABLE version=%x, flags=%x, optlen=%d sndr=%llu, rcvr=%llu len=%d",
+		if (opsize == TCPOLEN_MPTCP_MPC_ACK_DATA + TCPOLEN_MPTCP_DSS_CHECKSUM) {
+			mp_opt->csum = (__force __sum16)get_unaligned_be16(ptr);
+			ptr += 2;
+		}
+		pr_debug("MP_CAPABLE version=%x, flags=%x, optlen=%d sndr=%llu, rcvr=%llu len=%d csum=%u",
 			 version, flags, opsize, mp_opt->sndr_key,
-			 mp_opt->rcvr_key, mp_opt->data_len);
+			 mp_opt->rcvr_key, mp_opt->data_len, mp_opt->csum);
 		break;
 
 	case MPTCPOPT_MP_JOIN:
@@ -346,6 +353,7 @@ void mptcp_get_options(const struct sock *sk,
 	mp_opt->mp_prio = 0;
 	mp_opt->reset = 0;
 	mp_opt->csum_reqd = 0;
+	mp_opt->csum = 0;
 
 	length = (th->doff * 4) - sizeof(struct tcphdr);
 	ptr = (const unsigned char *)(th + 1);
@@ -1121,6 +1129,9 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
 		}
 		mpext->data_len = mp_opt.data_len;
 		mpext->use_map = 1;
+
+		if (READ_ONCE(msk->csum_enabled))
+			mpext->csum = mp_opt.csum;
 	}
 }
 
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 51f4bbe1703a..d2b00f8c4040 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -124,6 +124,7 @@ struct mptcp_options_received {
 	u64	data_seq;
 	u32	subflow_seq;
 	u16	data_len;
+	__sum16	csum;
 	u16	mp_capable : 1,
 		mp_join : 1,
 		fastclose : 1,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 12/22] Squash-to: "mptcp: receive checksum for MP_CAPABLE with data"
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (10 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 11/22] mptcp: receive checksum for MP_CAPABLE with data Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 13/22] mptcp: receive checksum for DSS Paolo Abeni
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

always parse any csum/nocsum combination and delay the
presence check to later code, to allow reset if missing.

Additionally, in the TX path, use the newly introduce ext
field to avoid MPTCP csum recomputation on tcp retransmission
and unneeded csum update on when setting the data fin_flag.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/mptcp.h  |  3 ++-
 net/mptcp/options.c  | 32 ++++++++++++++++++++------------
 net/mptcp/protocol.h |  2 ++
 3 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 8f86c05ddbfd..bd272c34b53c 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -32,7 +32,8 @@ struct mptcp_ext {
 			mpc_map:1,
 			frozen:1,
 			reset_transient:1;
-	u8		reset_reason:4;
+	u8		reset_reason:4,
+			csum_reqd:1;
 };
 
 #define MPTCP_RM_IDS_MAX	8
diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index cd205ad09c00..83a981684bbf 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -39,8 +39,6 @@ static void mptcp_parse_option(const struct sock *sk,
 		if (!(TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)) {
 			if (skb->len > tcp_hdr(skb)->doff << 2) {
 				expected_opsize = TCPOLEN_MPTCP_MPC_ACK_DATA;
-				if (READ_ONCE(msk->csum_enabled))
-					expected_opsize += TCPOLEN_MPTCP_DSS_CHECKSUM;
 			} else {
 				expected_opsize = TCPOLEN_MPTCP_MPC_ACK;
 			}
@@ -50,7 +48,20 @@ static void mptcp_parse_option(const struct sock *sk,
 			else
 				expected_opsize = TCPOLEN_MPTCP_MPC_SYN;
 		}
-		if (opsize != expected_opsize)
+
+		/* Cfr RFC 8684 Section 3.3.0:
+		 * If a checksum is present but its use had
+		 * not been negotiated in the MP_CAPABLE handshake, the receiver MUST
+		 * close the subflow with a RST, as it is not behaving as negotiated.
+		 * If a checksum is not present when its use has been negotiated, the
+		 * receiver MUST close the subflow with a RST, as it is considered
+		 * broken
+		 * We parse even option with mismatching csum presence, so that
+		 * later in subflow_data_ready we can trigger the reset.
+		 */
+		if ((opsize != expected_opsize) &&
+		    (expected_opsize != TCPOLEN_MPTCP_MPC_ACK_DATA ||
+		     opsize != TCPOLEN_MPTCP_MPC_ACK_DATA_CSUM))
 			break;
 
 		/* try to be gentle vs future versions on the initial syn */
@@ -72,11 +83,6 @@ static void mptcp_parse_option(const struct sock *sk,
 		 * host requires the use of checksums, checksums MUST be used.
 		 * In other words, the only way for checksums not to be used
 		 * is if both hosts in their SYNs set A=0."
-		 *
-		 * Section 3.3.0:
-		 * "If a checksum is not present when its use has been
-		 * negotiated, the receiver MUST close the subflow with a RST as
-		 * it is considered broken."
 		 */
 		mp_opt->csum_reqd = READ_ONCE(msk->csum_enabled);
 		if (flags & MPTCP_CAP_CHECKSUM_REQD)
@@ -103,8 +109,9 @@ static void mptcp_parse_option(const struct sock *sk,
 			mp_opt->data_len = get_unaligned_be16(ptr);
 			ptr += 2;
 		}
-		if (opsize == TCPOLEN_MPTCP_MPC_ACK_DATA + TCPOLEN_MPTCP_DSS_CHECKSUM) {
+		if (opsize == TCPOLEN_MPTCP_MPC_ACK_DATA_CSUM) {
 			mp_opt->csum = (__force __sum16)get_unaligned_be16(ptr);
+			mp_opt->csum_reqd = 1;
 			ptr += 2;
 		}
 		pr_debug("MP_CAPABLE version=%x, flags=%x, optlen=%d sndr=%llu, rcvr=%llu len=%d csum=%u",
@@ -353,7 +360,6 @@ void mptcp_get_options(const struct sock *sk,
 	mp_opt->mp_prio = 0;
 	mp_opt->reset = 0;
 	mp_opt->csum_reqd = 0;
-	mp_opt->csum = 0;
 
 	length = (th->doff * 4) - sizeof(struct tcphdr);
 	ptr = (const unsigned char *)(th + 1);
@@ -543,7 +549,8 @@ static void mptcp_write_data_fin(struct mptcp_subflow_context *subflow,
 		ext->data_len++;
 
 		/* the pseudo header has changed, update the csum accordingly */
-		csum_replace2(&ext->csum, ext->data_len - 1, ext->data_len);
+		if (ext->csum_reqd)
+			csum_replace2(&ext->csum, ext->data_len - 1, ext->data_len);
 	}
 }
 
@@ -1129,8 +1136,9 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
 		}
 		mpext->data_len = mp_opt.data_len;
 		mpext->use_map = 1;
+		mpext->csum_reqd = mp_opt.csum_reqd;
 
-		if (READ_ONCE(msk->csum_enabled))
+		if (mpext->csum_reqd)
 			mpext->csum = mp_opt.csum;
 	}
 }
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index d2b00f8c4040..3ac61b8178fd 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -68,6 +68,8 @@
 #define TCPOLEN_MPTCP_FASTCLOSE		12
 #define TCPOLEN_MPTCP_RST		4
 
+#define TCPOLEN_MPTCP_MPC_ACK_DATA_CSUM	(TCPOLEN_MPTCP_DSS_CHECKSUM + TCPOLEN_MPTCP_MPC_ACK_DATA)
+
 /* MPTCP MP_JOIN flags */
 #define MPTCPOPT_BACKUP		BIT(0)
 #define MPTCPOPT_HMAC_LEN	20
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 13/22] mptcp: receive checksum for DSS
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (11 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 12/22] Squash-to: "mptcp: receive checksum for MP_CAPABLE with data" Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 14/22] Squash-to: "mptcp: receive checksum for DSS" Paolo Abeni
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

In mptcp_parse_option, adjust the expected_opsize, and parse the data
checksum value from the receiving DSS, then save it in mp_opt->csum.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
v3 -> v4:
 - drop unneeded brackets (geliang)
---
 net/mptcp/options.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 83a981684bbf..91c36d06ca73 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -37,11 +37,10 @@ static void mptcp_parse_option(const struct sock *sk,
 	case MPTCPOPT_MP_CAPABLE:
 		/* strict size checking */
 		if (!(TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)) {
-			if (skb->len > tcp_hdr(skb)->doff << 2) {
+			if (skb->len > tcp_hdr(skb)->doff << 2)
 				expected_opsize = TCPOLEN_MPTCP_MPC_ACK_DATA;
-			} else {
+			else
 				expected_opsize = TCPOLEN_MPTCP_MPC_ACK;
-			}
 		} else {
 			if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_ACK)
 				expected_opsize = TCPOLEN_MPTCP_MPC_SYNACK;
@@ -166,6 +165,7 @@ static void mptcp_parse_option(const struct sock *sk,
 		mp_opt->use_map = (flags & MPTCP_DSS_HAS_MAP) != 0;
 		mp_opt->ack64 = (flags & MPTCP_DSS_ACK64) != 0;
 		mp_opt->use_ack = (flags & MPTCP_DSS_HAS_ACK);
+		mp_opt->csum_reqd = READ_ONCE(msk->csum_enabled);
 
 		pr_debug("data_fin=%d dsn64=%d use_map=%d ack64=%d use_ack=%d",
 			 mp_opt->data_fin, mp_opt->dsn64,
@@ -186,6 +186,9 @@ static void mptcp_parse_option(const struct sock *sk,
 				expected_opsize += TCPOLEN_MPTCP_DSS_MAP64;
 			else
 				expected_opsize += TCPOLEN_MPTCP_DSS_MAP32;
+
+			if (mp_opt->csum_reqd)
+				expected_opsize += TCPOLEN_MPTCP_DSS_CHECKSUM;
 		}
 
 		/* RFC 6824, Section 3.3:
@@ -193,8 +196,7 @@ static void mptcp_parse_option(const struct sock *sk,
 		 * not been negotiated in the MP_CAPABLE handshake,
 		 * the checksum field MUST be ignored.
 		 */
-		if (opsize != expected_opsize &&
-		    opsize != expected_opsize + TCPOLEN_MPTCP_DSS_CHECKSUM)
+		if (opsize != expected_opsize)
 			break;
 
 		mp_opt->dss = 1;
@@ -226,9 +228,14 @@ static void mptcp_parse_option(const struct sock *sk,
 			mp_opt->data_len = get_unaligned_be16(ptr);
 			ptr += 2;
 
-			pr_debug("data_seq=%llu subflow_seq=%u data_len=%u",
+			if (mp_opt->csum_reqd) {
+				mp_opt->csum = (__force __sum16)get_unaligned_be16(ptr);
+				ptr += 2;
+			}
+
+			pr_debug("data_seq=%llu subflow_seq=%u data_len=%u csum=%u",
 				 mp_opt->data_seq, mp_opt->subflow_seq,
-				 mp_opt->data_len);
+				 mp_opt->data_len, mp_opt->csum);
 		}
 
 		break;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 14/22] Squash-to: "mptcp: receive checksum for DSS"
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (12 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 13/22] mptcp: receive checksum for DSS Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 15/22] mptcp: validate the data checksum Paolo Abeni
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

Similar to the previous squash to patch, always parse
the DSS option regardless of csum presence.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/options.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 91c36d06ca73..9af54a1a3040 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -165,7 +165,6 @@ static void mptcp_parse_option(const struct sock *sk,
 		mp_opt->use_map = (flags & MPTCP_DSS_HAS_MAP) != 0;
 		mp_opt->ack64 = (flags & MPTCP_DSS_ACK64) != 0;
 		mp_opt->use_ack = (flags & MPTCP_DSS_HAS_ACK);
-		mp_opt->csum_reqd = READ_ONCE(msk->csum_enabled);
 
 		pr_debug("data_fin=%d dsn64=%d use_map=%d ack64=%d use_ack=%d",
 			 mp_opt->data_fin, mp_opt->dsn64,
@@ -186,17 +185,13 @@ static void mptcp_parse_option(const struct sock *sk,
 				expected_opsize += TCPOLEN_MPTCP_DSS_MAP64;
 			else
 				expected_opsize += TCPOLEN_MPTCP_DSS_MAP32;
-
-			if (mp_opt->csum_reqd)
-				expected_opsize += TCPOLEN_MPTCP_DSS_CHECKSUM;
 		}
 
-		/* RFC 6824, Section 3.3:
-		 * If a checksum is present, but its use had
-		 * not been negotiated in the MP_CAPABLE handshake,
-		 * the checksum field MUST be ignored.
+		/* Always parse any csum presence combination, we will enforce
+		 * RFC 8684 Section 3.3.0 checks later in subflow_data_ready
 		 */
-		if (opsize != expected_opsize)
+		if (opsize != expected_opsize &&
+		    opsize != expected_opsize + TCPOLEN_MPTCP_DSS_CHECKSUM)
 			break;
 
 		mp_opt->dss = 1;
@@ -228,14 +223,15 @@ static void mptcp_parse_option(const struct sock *sk,
 			mp_opt->data_len = get_unaligned_be16(ptr);
 			ptr += 2;
 
-			if (mp_opt->csum_reqd) {
+			if (opsize == expected_opsize + TCPOLEN_MPTCP_DSS_CHECKSUM) {
+				mp_opt->csum_reqd = 1;
 				mp_opt->csum = (__force __sum16)get_unaligned_be16(ptr);
 				ptr += 2;
 			}
 
-			pr_debug("data_seq=%llu subflow_seq=%u data_len=%u csum=%u",
+			pr_debug("data_seq=%llu subflow_seq=%u data_len=%u csum=%d:%u",
 				 mp_opt->data_seq, mp_opt->subflow_seq,
-				 mp_opt->data_len, mp_opt->csum);
+				 mp_opt->data_len, mp_opt->csum_reqd, mp_opt->csum);
 		}
 
 		break;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 15/22] mptcp: validate the data checksum
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (13 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 14/22] Squash-to: "mptcp: receive checksum for DSS" Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 16/22] Squash-to: " Paolo Abeni
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

This patch added three new members named data_csum, csum_len and map_csum
in struct mptcp_subflow_context, implemented a new function named
mptcp_validate_data_checksum(). Validate the data checksum in the function
__mptcp_move_skbs_from_subflow.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 net/mptcp/protocol.c | 35 +++++++++++++++++++++++++++++++++++
 net/mptcp/protocol.h |  3 +++
 net/mptcp/subflow.c  |  7 +++++--
 3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 0d8005b480ab..5160256de731 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -520,6 +520,35 @@ static bool mptcp_check_data_fin(struct sock *sk)
 	return ret;
 }
 
+static bool mptcp_validate_data_checksum(struct sock *ssk)
+{
+	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
+	struct csum_pseudo_header header;
+	__wsum csum;
+
+	if (__mptcp_check_fallback(msk))
+		goto out;
+
+	if (subflow->csum_len < subflow->map_data_len)
+		goto out;
+
+	header.data_seq = subflow->map_seq;
+	header.subflow_seq = subflow->map_subflow_seq;
+	header.data_len = subflow->map_data_len;
+	header.csum = subflow->map_csum;
+
+	csum = csum_partial(&header, sizeof(header), subflow->data_csum);
+
+	if (csum_fold(csum))
+		return false;
+	subflow->data_csum = 0;
+	subflow->csum_len = 0;
+
+out:
+	return true;
+}
+
 static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 					   struct sock *ssk,
 					   unsigned int *bytes)
@@ -588,6 +617,12 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 			if (tp->urg_data)
 				done = true;
 
+			if (READ_ONCE(msk->csum_enabled)) {
+				subflow->data_csum = skb_checksum(skb, offset, len,
+								  subflow->data_csum);
+				subflow->csum_len += len;
+				mptcp_validate_data_checksum(ssk);
+			}
 			if (__mptcp_move_skb(msk, ssk, skb, offset, len))
 				moved += len;
 			seq += len;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 3ac61b8178fd..176f175a00bd 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -399,6 +399,9 @@ struct mptcp_subflow_context {
 	u32	map_subflow_seq;
 	u32	ssn_offset;
 	u32	map_data_len;
+	__wsum	data_csum;
+	u32	csum_len;
+	__sum16	map_csum;
 	u32	request_mptcp : 1,  /* send MP_CAPABLE */
 		request_join : 1,   /* send MP_JOIN */
 		request_bkup : 1,
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index fc39107b25ec..68efc81eaf2c 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -948,9 +948,12 @@ static enum mapping_status get_mapping_status(struct sock *ssk,
 	subflow->map_data_len = data_len;
 	subflow->map_valid = 1;
 	subflow->mpc_map = mpext->mpc_map;
-	pr_debug("new map seq=%llu subflow_seq=%u data_len=%u",
+	subflow->data_csum = 0;
+	subflow->csum_len = 0;
+	subflow->map_csum = mpext->csum;
+	pr_debug("new map seq=%llu subflow_seq=%u data_len=%u csum=%u",
 		 subflow->map_seq, subflow->map_subflow_seq,
-		 subflow->map_data_len);
+		 subflow->map_data_len, subflow->map_csum);
 
 validate_seq:
 	/* we revalidate valid mapping on new skb, because we must ensure
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 16/22] Squash-to: mptcp: validate the data checksum
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (14 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 15/22] mptcp: validate the data checksum Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 23:42   ` Mat Martineau
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 17/22] mptcp: cleanup error path in subflow_check_data_avail() Paolo Abeni
                   ` (5 subsequent siblings)
  21 siblings, 1 reply; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

Move the RX csum validation into get_mapping_status().
If the current mapping is valid and csum is enabled traverse the
later pending skbs and compute csum incrementally till the whole
mapping has been covered. If not enough data is available in the
rx queue, return MAPPING_EMPTY - that is, no data.

Next subflow_data_ready invocation will trigger again csum computation.

When the full DSS is available, validate the csum and return to the
caller an appropriate error code, to trigger subflow reset of fallback
as required by the RFC.

Additionally:
- if the csum prevence in the DSS don't match the negotiated value
e.g. csum present, but not requested, return invalid mapping to trigger
subflow reset.
- keep some csum state, to avoid re-compute the csum on the
same data when multiple rx queue traversal are required.
- clean-up the uncompleted mapping from the receive queue on close,
to allow proper subflow disposal

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v3 -> v4:
 - drop unrelated 'fallback' changes (geliang)
 - fix csum computation for data segment with data fin bit set
---
 net/mptcp/protocol.c |  35 ---------------
 net/mptcp/protocol.h |   7 +--
 net/mptcp/subflow.c  | 104 +++++++++++++++++++++++++++++++++++++++----
 3 files changed, 99 insertions(+), 47 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 5160256de731..0d8005b480ab 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -520,35 +520,6 @@ static bool mptcp_check_data_fin(struct sock *sk)
 	return ret;
 }
 
-static bool mptcp_validate_data_checksum(struct sock *ssk)
-{
-	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
-	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
-	struct csum_pseudo_header header;
-	__wsum csum;
-
-	if (__mptcp_check_fallback(msk))
-		goto out;
-
-	if (subflow->csum_len < subflow->map_data_len)
-		goto out;
-
-	header.data_seq = subflow->map_seq;
-	header.subflow_seq = subflow->map_subflow_seq;
-	header.data_len = subflow->map_data_len;
-	header.csum = subflow->map_csum;
-
-	csum = csum_partial(&header, sizeof(header), subflow->data_csum);
-
-	if (csum_fold(csum))
-		return false;
-	subflow->data_csum = 0;
-	subflow->csum_len = 0;
-
-out:
-	return true;
-}
-
 static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 					   struct sock *ssk,
 					   unsigned int *bytes)
@@ -617,12 +588,6 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 			if (tp->urg_data)
 				done = true;
 
-			if (READ_ONCE(msk->csum_enabled)) {
-				subflow->data_csum = skb_checksum(skb, offset, len,
-								  subflow->data_csum);
-				subflow->csum_len += len;
-				mptcp_validate_data_checksum(ssk);
-			}
 			if (__mptcp_move_skb(msk, ssk, skb, offset, len))
 				moved += len;
 			seq += len;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 176f175a00bd..747cd94da78b 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -399,9 +399,8 @@ struct mptcp_subflow_context {
 	u32	map_subflow_seq;
 	u32	ssn_offset;
 	u32	map_data_len;
-	__wsum	data_csum;
-	u32	csum_len;
-	__sum16	map_csum;
+	__wsum	map_data_csum;
+	u32	map_csum_len;
 	u32	request_mptcp : 1,  /* send MP_CAPABLE */
 		request_join : 1,   /* send MP_JOIN */
 		request_bkup : 1,
@@ -411,6 +410,8 @@ struct mptcp_subflow_context {
 		pm_notified : 1,    /* PM hook called for established status */
 		conn_finished : 1,
 		map_valid : 1,
+		map_csum_reqd : 1,
+		map_data_fin : 1,
 		mpc_map : 1,
 		backup : 1,
 		send_mp_prio : 1,
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 68efc81eaf2c..0ee568a99da1 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -829,10 +829,86 @@ static bool validate_mapping(struct sock *ssk, struct sk_buff *skb)
 	return true;
 }
 
+static enum mapping_status validate_data_csum(struct sock *ssk, struct sk_buff *skb,
+					      bool csum_reqd)
+{
+	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+	struct csum_pseudo_header header;
+	u32 offset, seq, delta;
+	__wsum csum;
+	int len;
+
+	if (!csum_reqd)
+		return MAPPING_OK;
+
+	/* mapping already validated on previous traversal */
+	if (subflow->map_csum_len == subflow->map_data_len)
+		return MAPPING_OK;
+
+	/* traverse the receive queue, ensuring it contains a full
+	 * DSS mapping and accumulating the related csum.
+	 * Preserve the accoumlate csum across multiple calls, to compute
+	 * the csum only once
+	 */
+	delta = subflow->map_data_len - subflow->map_csum_len;
+	for (;;) {
+		seq = tcp_sk(ssk)->copied_seq + subflow->map_csum_len;
+		offset = seq - TCP_SKB_CB(skb)->seq;
+
+		/* if the current skb has not been accounted yet, csum its contents
+		 * up to the amount covered by the current DSS
+		 */
+		if (offset < skb->len) {
+			len = min(skb->len - offset, delta);
+			delta -= len;
+			subflow->map_csum_len += len;
+			subflow->map_data_csum = skb_checksum(skb, offset, len,
+							      subflow->map_data_csum);
+		}
+		if (delta == 0)
+			break;
+
+		if (skb_queue_is_last(&ssk->sk_receive_queue, skb)) {
+			/* if this subflow is closed, the partial mapping
+			 * will be never completed; flush the pending skbs, so
+			 * that subflow_sched_work_if_closed() can kick in
+			 */
+			if (unlikely(ssk->sk_state == TCP_CLOSE))
+				while ((skb = skb_peek(&ssk->sk_receive_queue)))
+					sk_eat_skb(ssk, skb);
+
+			/* not enough data to validate the csum */
+			return MAPPING_EMPTY;
+		}
+
+		/* the DSS mapping for next skbs will be validated later,
+		 * when a get_mapping_status call will process such skb
+		 */
+		skb = skb->next;
+	}
+
+	/* note that 'map_data_len' accounts only for the carried data, does
+	 * not include the eventual seq increment due to the data fin,
+	 * while the pseudo header requires the original DSS data len,
+	 * including that
+	 */
+	header.data_seq = subflow->map_seq;
+	header.subflow_seq = subflow->map_subflow_seq;
+	header.data_len = subflow->map_data_len + subflow->map_data_fin;
+	header.csum = 0;
+
+	csum = csum_partial(&header, sizeof(header), subflow->map_data_csum);
+	if (unlikely(csum_fold(csum)))
+		return subflow->mp_join ? MAPPING_INVALID : MAPPING_DUMMY;
+
+	return MAPPING_OK;
+}
+
 static enum mapping_status get_mapping_status(struct sock *ssk,
 					      struct mptcp_sock *msk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+	bool csum_reqd = READ_ONCE(msk->csum_enabled);
 	struct mptcp_ext *mpext;
 	struct sk_buff *skb;
 	u16 data_len;
@@ -926,9 +1002,10 @@ static enum mapping_status get_mapping_status(struct sock *ssk,
 		/* Allow replacing only with an identical map */
 		if (subflow->map_seq == map_seq &&
 		    subflow->map_subflow_seq == mpext->subflow_seq &&
-		    subflow->map_data_len == data_len) {
+		    subflow->map_data_len == data_len &&
+		    subflow->map_csum_reqd == mpext->csum_reqd) {
 			skb_ext_del(skb, SKB_EXT_MPTCP);
-			return MAPPING_OK;
+			goto validate_csum;
 		}
 
 		/* If this skb data are fully covered by the current mapping,
@@ -940,20 +1017,27 @@ static enum mapping_status get_mapping_status(struct sock *ssk,
 		}
 
 		/* will validate the next map after consuming the current one */
-		return MAPPING_OK;
+		goto validate_csum;
 	}
 
 	subflow->map_seq = map_seq;
 	subflow->map_subflow_seq = mpext->subflow_seq;
 	subflow->map_data_len = data_len;
 	subflow->map_valid = 1;
+	subflow->map_data_fin = mpext->data_fin;
 	subflow->mpc_map = mpext->mpc_map;
-	subflow->data_csum = 0;
-	subflow->csum_len = 0;
-	subflow->map_csum = mpext->csum;
-	pr_debug("new map seq=%llu subflow_seq=%u data_len=%u csum=%u",
+	subflow->map_csum_reqd = mpext->csum_reqd;
+	subflow->map_csum_len = 0;
+	subflow->map_data_csum = csum_unfold(mpext->csum);
+
+	/* Cfr RFC 8684 Section 3.3.0 */
+	if (unlikely(subflow->map_csum_reqd != csum_reqd))
+		return MAPPING_INVALID;
+
+	pr_debug("new map seq=%llu subflow_seq=%u data_len=%u csum=%d:%u",
 		 subflow->map_seq, subflow->map_subflow_seq,
-		 subflow->map_data_len, subflow->map_csum);
+		 subflow->map_data_len, subflow->map_csum_reqd,
+		 subflow->map_data_csum);
 
 validate_seq:
 	/* we revalidate valid mapping on new skb, because we must ensure
@@ -963,7 +1047,9 @@ static enum mapping_status get_mapping_status(struct sock *ssk,
 		return MAPPING_INVALID;
 
 	skb_ext_del(skb, SKB_EXT_MPTCP);
-	return MAPPING_OK;
+
+validate_csum:
+	return validate_data_csum(ssk, skb, csum_reqd);
 }
 
 static void mptcp_subflow_discard_data(struct sock *ssk, struct sk_buff *skb,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 17/22] mptcp: cleanup error path in subflow_check_data_avail()
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (15 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 16/22] Squash-to: " Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 18/22] mptcp: tune re-injections for csum enabled mode Paolo Abeni
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

Move fallback code out of the main loop and adds some
brench prediction annotations.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v3 -> v4:
 - moved here from previous patch
 - added unlikely annotations

possibly not the best time ever for this kind of cleanup...
---
 net/mptcp/subflow.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 0ee568a99da1..b61ebadd796c 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1106,21 +1106,12 @@ static bool subflow_check_data_avail(struct sock *ssk)
 
 		status = get_mapping_status(ssk, msk);
 		trace_subflow_check_data_avail(status, skb_peek(&ssk->sk_receive_queue));
-		if (status == MAPPING_INVALID) {
+		if (unlikely(status == MAPPING_INVALID)) {
 			ssk->sk_err = EBADMSG;
 			goto fatal;
 		}
-		if (status == MAPPING_DUMMY) {
-			__mptcp_do_fallback(msk);
-			skb = skb_peek(&ssk->sk_receive_queue);
-			subflow->map_valid = 1;
-			subflow->map_seq = READ_ONCE(msk->ack_seq);
-			subflow->map_data_len = skb->len;
-			subflow->map_subflow_seq = tcp_sk(ssk)->copied_seq -
-						   subflow->ssn_offset;
-			subflow->data_avail = MPTCP_SUBFLOW_DATA_AVAIL;
-			return true;
-		}
+		if (unlikely(status == MAPPING_DUMMY))
+			goto fallback;
 
 		if (status != MAPPING_OK)
 			goto no_data;
@@ -1164,6 +1155,7 @@ static bool subflow_check_data_avail(struct sock *ssk)
 no_data:
 	subflow_sched_work_if_closed(msk, ssk);
 	return false;
+
 fatal:
 	/* fatal protocol error, close the socket */
 	/* This barrier is coupled with smp_rmb() in tcp_poll() */
@@ -1175,6 +1167,16 @@ static bool subflow_check_data_avail(struct sock *ssk)
 	tcp_send_active_reset(ssk, GFP_ATOMIC);
 	subflow->data_avail = 0;
 	return false;
+
+fallback:
+	__mptcp_do_fallback(msk);
+	skb = skb_peek(&ssk->sk_receive_queue);
+	subflow->map_valid = 1;
+	subflow->map_seq = READ_ONCE(msk->ack_seq);
+	subflow->map_data_len = skb->len;
+	subflow->map_subflow_seq = tcp_sk(ssk)->copied_seq -subflow->ssn_offset;
+	subflow->data_avail = MPTCP_SUBFLOW_DATA_AVAIL;
+	return true;
 }
 
 bool mptcp_subflow_data_available(struct sock *sk)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 18/22] mptcp: tune re-injections for csum enabled mode.
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (16 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 17/22] mptcp: cleanup error path in subflow_check_data_avail() Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 19/22] mptcp: add the mib for data checksum Paolo Abeni
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

If the MPTCP-level checksum is enabled, on re-injections we
must spool a complete DSS, or the receive side will not be
able to compute the csum and process any data.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/protocol.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 0d8005b480ab..91e25fc91951 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2324,8 +2324,8 @@ static void __mptcp_retrans(struct sock *sk)
 
 	/* limit retransmission to the bytes already sent on some subflows */
 	info.sent = 0;
-	info.limit = dfrag->already_sent;
-	while (info.sent < dfrag->already_sent) {
+	info.limit = READ_ONCE(msk->csum_enabled) ? dfrag->data_len : dfrag->already_sent;
+	while (info.sent < info.limit) {
 		if (!mptcp_alloc_tx_skb(sk, ssk))
 			break;
 
@@ -2337,9 +2337,11 @@ static void __mptcp_retrans(struct sock *sk)
 		copied += ret;
 		info.sent += ret;
 	}
-	if (copied)
+	if (copied) {
+		dfrag->already_sent = max(dfrag->already_sent, info.sent);
 		tcp_push(ssk, 0, info.mss_now, tcp_sk(ssk)->nonagle,
 			 info.size_goal);
+	}
 
 	mptcp_set_timeout(sk, ssk);
 	release_sock(ssk);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 19/22] mptcp: add the mib for data checksum
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (17 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 18/22] mptcp: tune re-injections for csum enabled mode Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 20/22] mptcp: add a new sysctl checksum_enabled Paolo Abeni
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

This patch added the mib for the data checksum, MPTCP_MIB_DSSCSUMERR.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 net/mptcp/mib.c     | 1 +
 net/mptcp/mib.h     | 1 +
 net/mptcp/subflow.c | 4 +++-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c
index eb2dc6dbe212..c7042e3317b5 100644
--- a/net/mptcp/mib.c
+++ b/net/mptcp/mib.c
@@ -25,6 +25,7 @@ static const struct snmp_mib mptcp_snmp_list[] = {
 	SNMP_MIB_ITEM("MPJoinAckHMacFailure", MPTCP_MIB_JOINACKMAC),
 	SNMP_MIB_ITEM("DSSNotMatching", MPTCP_MIB_DSSNOMATCH),
 	SNMP_MIB_ITEM("InfiniteMapRx", MPTCP_MIB_INFINITEMAPRX),
+	SNMP_MIB_ITEM("DSSCsumErr", MPTCP_MIB_DSSCSUMERR),
 	SNMP_MIB_ITEM("OFOQueueTail", MPTCP_MIB_OFOQUEUETAIL),
 	SNMP_MIB_ITEM("OFOQueue", MPTCP_MIB_OFOQUEUE),
 	SNMP_MIB_ITEM("OFOMerge", MPTCP_MIB_OFOMERGE),
diff --git a/net/mptcp/mib.h b/net/mptcp/mib.h
index f0da4f060fe1..c407358eced8 100644
--- a/net/mptcp/mib.h
+++ b/net/mptcp/mib.h
@@ -18,6 +18,7 @@ enum linux_mptcp_mib_field {
 	MPTCP_MIB_JOINACKMAC,		/* HMAC was wrong on ACK + MP_JOIN */
 	MPTCP_MIB_DSSNOMATCH,		/* Received a new mapping that did not match the previous one */
 	MPTCP_MIB_INFINITEMAPRX,	/* Received an infinite mapping */
+	MPTCP_MIB_DSSCSUMERR,		/* The DSS checksum fail */
 	MPTCP_MIB_OFOQUEUETAIL,	/* Segments inserted into OoO queue tail */
 	MPTCP_MIB_OFOQUEUE,		/* Segments inserted into OoO queue */
 	MPTCP_MIB_OFOMERGE,		/* Segments merged in OoO queue */
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index b61ebadd796c..4182dff3d5d4 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -898,8 +898,10 @@ static enum mapping_status validate_data_csum(struct sock *ssk, struct sk_buff *
 	header.csum = 0;
 
 	csum = csum_partial(&header, sizeof(header), subflow->map_data_csum);
-	if (unlikely(csum_fold(csum)))
+	if (unlikely(csum_fold(csum))) {
+		MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DSSCSUMERR);
 		return subflow->mp_join ? MAPPING_INVALID : MAPPING_DUMMY;
+	}
 
 	return MAPPING_OK;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 20/22] mptcp: add a new sysctl checksum_enabled
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (18 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 19/22] mptcp: add the mib for data checksum Paolo Abeni
@ 2021-04-23 14:49 ` Paolo Abeni
  2021-04-23 14:50 ` [PATCH v4 mptcp-next 21/22] selftests: mptcp: enable checksum in mptcp_connect.sh Paolo Abeni
  2021-04-23 14:50 ` [PATCH v4 mptcp-next 22/22] selftests: mptcp: enable checksum in mptcp_join.sh Paolo Abeni
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:49 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

This patch added a new sysctl, named checksum_enabled, to control
whether DSS checksum can be enabled.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 Documentation/networking/mptcp-sysctl.rst |  8 ++++++++
 net/mptcp/ctrl.c                          | 14 ++++++++++++++
 net/mptcp/protocol.h                      |  2 +-
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
index 6af0196c4297..566ec0e4a3da 100644
--- a/Documentation/networking/mptcp-sysctl.rst
+++ b/Documentation/networking/mptcp-sysctl.rst
@@ -24,3 +24,11 @@ add_addr_timeout - INTEGER (seconds)
 	sysctl.
 
 	Default: 120
+
+checksum_enabled - INTEGER
+	Control whether DSS checksum can be enabled.
+
+	DSS checksum can be enabled if the value is nonzero. This is a
+	per-namespace sysctl.
+
+	Default: 0
diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
index 96ba616f59bf..857e371c108c 100644
--- a/net/mptcp/ctrl.c
+++ b/net/mptcp/ctrl.c
@@ -19,6 +19,7 @@ struct mptcp_pernet {
 
 	int mptcp_enabled;
 	unsigned int add_addr_timeout;
+	int checksum_enabled;
 };
 
 static struct mptcp_pernet *mptcp_get_pernet(struct net *net)
@@ -36,6 +37,11 @@ unsigned int mptcp_get_add_addr_timeout(struct net *net)
 	return mptcp_get_pernet(net)->add_addr_timeout;
 }
 
+int mptcp_is_checksum_enabled(struct net *net)
+{
+	return mptcp_get_pernet(net)->checksum_enabled;
+}
+
 static struct ctl_table mptcp_sysctl_table[] = {
 	{
 		.procname = "enabled",
@@ -52,6 +58,12 @@ static struct ctl_table mptcp_sysctl_table[] = {
 		.mode = 0644,
 		.proc_handler = proc_dointvec_jiffies,
 	},
+	{
+		.procname = "checksum_enabled",
+		.maxlen = sizeof(int),
+		.mode = 0644,
+		.proc_handler = proc_dointvec,
+	},
 	{}
 };
 
@@ -59,6 +71,7 @@ static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
 {
 	pernet->mptcp_enabled = 1;
 	pernet->add_addr_timeout = TCP_RTO_MAX;
+	pernet->checksum_enabled = 0;
 }
 
 static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
@@ -75,6 +88,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
 
 	table[0].data = &pernet->mptcp_enabled;
 	table[1].data = &pernet->add_addr_timeout;
+	table[2].data = &pernet->checksum_enabled;
 
 	hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
 	if (!hdr)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 747cd94da78b..51d54343e7cb 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -541,7 +541,7 @@ static inline void mptcp_subflow_delegated_done(struct mptcp_subflow_context *su
 
 int mptcp_is_enabled(struct net *net);
 unsigned int mptcp_get_add_addr_timeout(struct net *net);
-static inline int mptcp_is_checksum_enabled(struct net *net) { return false; }
+int mptcp_is_checksum_enabled(struct net *net);
 void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
 				     struct mptcp_options_received *mp_opt);
 bool mptcp_subflow_data_available(struct sock *sk);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 21/22] selftests: mptcp: enable checksum in mptcp_connect.sh
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (19 preceding siblings ...)
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 20/22] mptcp: add a new sysctl checksum_enabled Paolo Abeni
@ 2021-04-23 14:50 ` Paolo Abeni
  2021-04-23 14:50 ` [PATCH v4 mptcp-next 22/22] selftests: mptcp: enable checksum in mptcp_join.sh Paolo Abeni
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:50 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

This patch added a new argument "-C" for the mptcp_connect.sh script to
set the sysctl checksum_enabled to 1 in ns1, ns2, ns3 and ns4 to enable
the data checksum.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 tools/testing/selftests/net/mptcp/mptcp_connect.sh | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
index 9236609731b1..96cec8c2d975 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
@@ -3,7 +3,7 @@
 
 time_start=$(date +%s)
 
-optstring="S:R:d:e:l:r:h4cm:f:t"
+optstring="S:R:d:e:l:r:h4cm:f:tC"
 ret=0
 sin=""
 sout=""
@@ -22,6 +22,7 @@ sndbuf=0
 rcvbuf=0
 options_log=true
 do_tcp=0
+checksum=false
 filesize=0
 
 if [ $tc_loss -eq 100 ];then
@@ -47,6 +48,7 @@ usage() {
 	echo -e "\t-R: set rcvbuf value (default: use kernel default)"
 	echo -e "\t-m: test mode (poll, sendfile; default: poll)"
 	echo -e "\t-t: also run tests with TCP (use twice to non-fallback tcp)"
+	echo -e "\t-C: enable the MPTCP data checksum"
 }
 
 while getopts "$optstring" option;do
@@ -104,6 +106,9 @@ while getopts "$optstring" option;do
 	"t")
 		do_tcp=$((do_tcp+1))
 		;;
+	"C")
+		checksum=true
+		;;
 	"?")
 		usage $0
 		exit 1
@@ -200,6 +205,12 @@ ip -net "$ns4" route add default via dead:beef:3::2
 # use TCP syn cookies, even if no flooding was detected.
 ip netns exec "$ns2" sysctl -q net.ipv4.tcp_syncookies=2
 
+if $checksum; then
+	for i in "$ns1" "$ns2" "$ns3" "$ns4";do
+		ip netns exec $i sysctl -q net.mptcp.checksum_enabled=1
+	done
+fi
+
 set_ethtool_flags() {
 	local ns="$1"
 	local dev="$2"
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 mptcp-next 22/22] selftests: mptcp: enable checksum in mptcp_join.sh
  2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
                   ` (20 preceding siblings ...)
  2021-04-23 14:50 ` [PATCH v4 mptcp-next 21/22] selftests: mptcp: enable checksum in mptcp_connect.sh Paolo Abeni
@ 2021-04-23 14:50 ` Paolo Abeni
  21 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-04-23 14:50 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@gmail.com>

This patch added a new argument "-C" for the mptcp_join.sh script to set
the sysctl checksum_enabled to 1 in ns1 and ns2 to enable the data
checksum.

In chk_join_nr, check the counter of the mib for the data checksum.

Also added a new argument "-S" for the mptcp_join.sh script to start the
test cases that verify the checksum handshake:

  * Sender and listener both have checksums off
  * Sender and listener both have checksums on
  * Sender checksums off, listener checksums on
  * Sender checksums on, listener checksums off

The output looks like this:

 01 checksum test 0 0                  sum[ ok ] - csum  [ ok ]
 02 checksum test 1 1                  sum[ ok ] - csum  [ ok ]
 03 checksum test 0 1                  sum[ ok ] - csum  [ ok ]
 04 checksum test 1 0                  sum[ ok ] - csum  [ ok ]
 05 no JOIN                            syn[ ok ] - synack[ ok ] - ack[ ok ]
                                       sum[ ok ] - csum  [ ok ]
 06 single subflow, limited by client  syn[ ok ] - synack[ ok ] - ack[ ok ]
                                       sum[ ok ] - csum  [ ok ]

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 .../testing/selftests/net/mptcp/mptcp_join.sh | 107 +++++++++++++++++-
 1 file changed, 103 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index fd99485cf2a4..ef8341c851f7 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -12,6 +12,7 @@ timeout_poll=30
 timeout_test=$((timeout_poll * 2 + 1))
 mptcp_connect=""
 capture=0
+checksum=0
 do_all_tests=1
 
 TEST_COUNT=0
@@ -49,6 +50,9 @@ init()
 		ip netns exec $netns sysctl -q net.mptcp.enabled=1
 		ip netns exec $netns sysctl -q net.ipv4.conf.all.rp_filter=0
 		ip netns exec $netns sysctl -q net.ipv4.conf.default.rp_filter=0
+		if [ $checksum -eq 1 ]; then
+			ip netns exec $netns sysctl -q net.mptcp.checksum_enabled=1
+		fi
 	done
 
 	#  ns1              ns2
@@ -124,6 +128,17 @@ reset_with_add_addr_timeout()
 		-j DROP
 }
 
+reset_with_checksum()
+{
+	local ns1_enable=$1
+	local ns2_enable=$2
+
+	reset
+
+	ip netns exec $ns1 sysctl -q net.mptcp.checksum_enabled=$ns1_enable
+	ip netns exec $ns2 sysctl -q net.mptcp.checksum_enabled=$ns2_enable
+}
+
 ip -Version > /dev/null 2>&1
 if [ $? -ne 0 ];then
 	echo "SKIP: Could not run test without ip tool"
@@ -476,6 +491,45 @@ run_tests()
 	fi
 }
 
+chk_csum_nr()
+{
+	local msg=${1:-""}
+	local count
+	local dump_stats
+
+	if [ ! -z "$msg" ]; then
+		printf "%02u" "$TEST_COUNT"
+	else
+		echo -n "  "
+	fi
+	printf " %-36s %s" "$msg" "sum"
+	count=`ip netns exec $ns1 nstat -as | grep MPTcpExtDSSCsumErr | awk '{print $2}'`
+	[ -z "$count" ] && count=0
+	if [ "$count" != 0 ]; then
+		echo "[fail] got $count data checksum error[s] expected 0"
+		ret=1
+		dump_stats=1
+	else
+		echo -n "[ ok ]"
+	fi
+	echo -n " - csum  "
+	count=`ip netns exec $ns2 nstat -as | grep MPTcpExtDSSCsumErr | awk '{print $2}'`
+	[ -z "$count" ] && count=0
+	if [ "$count" != 0 ]; then
+		echo "[fail] got $count data checksum error[s] expected 0"
+		ret=1
+		dump_stats=1
+	else
+		echo "[ ok ]"
+	fi
+	if [ "${dump_stats}" = 1 ]; then
+		echo Server ns stats
+		ip netns exec $ns1 nstat -as | grep MPTcp
+		echo Client ns stats
+		ip netns exec $ns2 nstat -as | grep MPTcp
+	fi
+}
+
 chk_join_nr()
 {
 	local msg="$1"
@@ -523,6 +577,9 @@ chk_join_nr()
 		echo Client ns stats
 		ip netns exec $ns2 nstat -as | grep MPTcp
 	fi
+	if [ $checksum -eq 1 ]; then
+		chk_csum_nr
+	fi
 }
 
 chk_add_nr()
@@ -1374,6 +1431,37 @@ syncookies_tests()
 	chk_add_nr 1 1
 }
 
+checksum_tests()
+{
+	# checksum test 0 0
+	reset_with_checksum 0 0
+	ip netns exec $ns1 ./pm_nl_ctl limits 0 1
+	ip netns exec $ns2 ./pm_nl_ctl limits 0 1
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_csum_nr "checksum test 0 0"
+
+	# checksum test 1 1
+	reset_with_checksum 1 1
+	ip netns exec $ns1 ./pm_nl_ctl limits 0 1
+	ip netns exec $ns2 ./pm_nl_ctl limits 0 1
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_csum_nr "checksum test 1 1"
+
+	# checksum test 0 1
+	reset_with_checksum 0 1
+	ip netns exec $ns1 ./pm_nl_ctl limits 0 1
+	ip netns exec $ns2 ./pm_nl_ctl limits 0 1
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_csum_nr "checksum test 0 1"
+
+	# checksum test 1 0
+	reset_with_checksum 1 0
+	ip netns exec $ns1 ./pm_nl_ctl limits 0 1
+	ip netns exec $ns2 ./pm_nl_ctl limits 0 1
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_csum_nr "checksum test 1 0"
+}
+
 all_tests()
 {
 	subflows_tests
@@ -1387,6 +1475,7 @@ all_tests()
 	backup_tests
 	add_addr_ports_tests
 	syncookies_tests
+	checksum_tests
 }
 
 usage()
@@ -1403,7 +1492,9 @@ usage()
 	echo "  -b backup_tests"
 	echo "  -p add_addr_ports_tests"
 	echo "  -k syncookies_tests"
+	echo "  -S checksum_tests"
 	echo "  -c capture pcap files"
+	echo "  -C enable data checksum"
 	echo "  -h help"
 }
 
@@ -1418,13 +1509,16 @@ make_file "$sin" "server" 1
 trap cleanup EXIT
 
 for arg in "$@"; do
-	# check for "capture" arg before launching tests
+	# check for "capture/checksum" args before launching tests
 	if [[ "${arg}" =~ ^"-"[0-9a-zA-Z]*"c"[0-9a-zA-Z]*$ ]]; then
 		capture=1
 	fi
+	if [[ "${arg}" =~ ^"-"[0-9a-zA-Z]*"C"[0-9a-zA-Z]*$ ]]; then
+		checksum=1
+	fi
 
-	# exception for the capture option, the rest means: a part of the tests
-	if [ "${arg}" != "-c" ]; then
+	# exception for the capture/checksum options, the rest means: a part of the tests
+	if [ "${arg}" != "-c" ] && [ "${arg}" != "-C" ]; then
 		do_all_tests=0
 	fi
 done
@@ -1434,7 +1528,7 @@ if [ $do_all_tests -eq 1 ]; then
 	exit $ret
 fi
 
-while getopts 'fsltra64bpkch' opt; do
+while getopts 'fsltra64bpkchCS' opt; do
 	case $opt in
 		f)
 			subflows_tests
@@ -1469,8 +1563,13 @@ while getopts 'fsltra64bpkch' opt; do
 		k)
 			syncookies_tests
 			;;
+		S)
+			checksum_tests
+			;;
 		c)
 			;;
+		C)
+			;;
 		h | *)
 			usage
 			;;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 mptcp-next 16/22] Squash-to: mptcp: validate the data checksum
  2021-04-23 14:49 ` [PATCH v4 mptcp-next 16/22] Squash-to: " Paolo Abeni
@ 2021-04-23 23:42   ` Mat Martineau
       [not found]     ` <CAF6piCJ5Dv+5_LN-yaz0kdY_EM6LtyM7JmHY43ie+cFE_nf-=A@mail.gmail.com>
  0 siblings, 1 reply; 25+ messages in thread
From: Mat Martineau @ 2021-04-23 23:42 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp, Geliang Tang

On Fri, 23 Apr 2021, Paolo Abeni wrote:

> Move the RX csum validation into get_mapping_status().
> If the current mapping is valid and csum is enabled traverse the
> later pending skbs and compute csum incrementally till the whole
> mapping has been covered. If not enough data is available in the
> rx queue, return MAPPING_EMPTY - that is, no data.
>
> Next subflow_data_ready invocation will trigger again csum computation.
>
> When the full DSS is available, validate the csum and return to the
> caller an appropriate error code, to trigger subflow reset of fallback
> as required by the RFC.
>
> Additionally:
> - if the csum prevence in the DSS don't match the negotiated value
> e.g. csum present, but not requested, return invalid mapping to trigger
> subflow reset.
> - keep some csum state, to avoid re-compute the csum on the
> same data when multiple rx queue traversal are required.
> - clean-up the uncompleted mapping from the receive queue on close,
> to allow proper subflow disposal
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> v3 -> v4:
> - drop unrelated 'fallback' changes (geliang)
> - fix csum computation for data segment with data fin bit set
> ---
> net/mptcp/protocol.c |  35 ---------------
> net/mptcp/protocol.h |   7 +--
> net/mptcp/subflow.c  | 104 +++++++++++++++++++++++++++++++++++++++----
> 3 files changed, 99 insertions(+), 47 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 5160256de731..0d8005b480ab 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -520,35 +520,6 @@ static bool mptcp_check_data_fin(struct sock *sk)
> 	return ret;
> }
>
> -static bool mptcp_validate_data_checksum(struct sock *ssk)
> -{
> -	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
> -	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
> -	struct csum_pseudo_header header;
> -	__wsum csum;
> -
> -	if (__mptcp_check_fallback(msk))
> -		goto out;
> -
> -	if (subflow->csum_len < subflow->map_data_len)
> -		goto out;
> -
> -	header.data_seq = subflow->map_seq;
> -	header.subflow_seq = subflow->map_subflow_seq;
> -	header.data_len = subflow->map_data_len;
> -	header.csum = subflow->map_csum;
> -
> -	csum = csum_partial(&header, sizeof(header), subflow->data_csum);
> -
> -	if (csum_fold(csum))
> -		return false;
> -	subflow->data_csum = 0;
> -	subflow->csum_len = 0;
> -
> -out:
> -	return true;
> -}
> -
> static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
> 					   struct sock *ssk,
> 					   unsigned int *bytes)
> @@ -617,12 +588,6 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
> 			if (tp->urg_data)
> 				done = true;
>
> -			if (READ_ONCE(msk->csum_enabled)) {
> -				subflow->data_csum = skb_checksum(skb, offset, len,
> -								  subflow->data_csum);
> -				subflow->csum_len += len;
> -				mptcp_validate_data_checksum(ssk);
> -			}
> 			if (__mptcp_move_skb(msk, ssk, skb, offset, len))
> 				moved += len;
> 			seq += len;
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 176f175a00bd..747cd94da78b 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -399,9 +399,8 @@ struct mptcp_subflow_context {
> 	u32	map_subflow_seq;
> 	u32	ssn_offset;
> 	u32	map_data_len;
> -	__wsum	data_csum;
> -	u32	csum_len;
> -	__sum16	map_csum;
> +	__wsum	map_data_csum;
> +	u32	map_csum_len;
> 	u32	request_mptcp : 1,  /* send MP_CAPABLE */
> 		request_join : 1,   /* send MP_JOIN */
> 		request_bkup : 1,
> @@ -411,6 +410,8 @@ struct mptcp_subflow_context {
> 		pm_notified : 1,    /* PM hook called for established status */
> 		conn_finished : 1,
> 		map_valid : 1,
> +		map_csum_reqd : 1,
> +		map_data_fin : 1,
> 		mpc_map : 1,
> 		backup : 1,
> 		send_mp_prio : 1,
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 68efc81eaf2c..0ee568a99da1 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -829,10 +829,86 @@ static bool validate_mapping(struct sock *ssk, struct sk_buff *skb)
> 	return true;
> }
>
> +static enum mapping_status validate_data_csum(struct sock *ssk, struct sk_buff *skb,
> +					      bool csum_reqd)
> +{
> +	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
> +	struct csum_pseudo_header header;
> +	u32 offset, seq, delta;
> +	__wsum csum;
> +	int len;
> +
> +	if (!csum_reqd)
> +		return MAPPING_OK;
> +
> +	/* mapping already validated on previous traversal */
> +	if (subflow->map_csum_len == subflow->map_data_len)
> +		return MAPPING_OK;
> +
> +	/* traverse the receive queue, ensuring it contains a full
> +	 * DSS mapping and accumulating the related csum.
> +	 * Preserve the accoumlate csum across multiple calls, to compute
> +	 * the csum only once
> +	 */
> +	delta = subflow->map_data_len - subflow->map_csum_len;
> +	for (;;) {
> +		seq = tcp_sk(ssk)->copied_seq + subflow->map_csum_len;
> +		offset = seq - TCP_SKB_CB(skb)->seq;
> +
> +		/* if the current skb has not been accounted yet, csum its contents
> +		 * up to the amount covered by the current DSS
> +		 */
> +		if (offset < skb->len) {
> +			len = min(skb->len - offset, delta);
> +			delta -= len;
> +			subflow->map_csum_len += len;
> +			subflow->map_data_csum = skb_checksum(skb, offset, len,
> +							      subflow->map_data_csum);

skb_checksum() can only be used here if the data values are properly 
aligned for the 16-bit checksum. Think about the pathological case of all 
skbs with 1-byte payloads - you can't update the checksum until you have 
two bytes to feed to the checksum calculation (unless it's the last 
byte).

Also see the comment with checksum_partial() that ends up doing the work 
for skb_checksum() - "this function must be called with even lengths, 
except for the last fragment, which may be odd".

Unfortunately even more complexity is needed to handle partial checksums 
of odd-length data - if it's not the end of a mapping (where the odd 
length is ok), would need to exclude the trailing 1 byte and then combine 
that byte with the next chunk of data that arrives.

mptcp_verif_dss_csum() in the multipath-tcp.org kernel has a bunch of 
logic to deal with this.

I was starting to think that it might be easier to just calculate the csum 
once when the full mapping is available, but the code would still have to 
handle arbitrary odd-length or odd-aligned data in each skb.


Overall, I think the changes to the series are moving the code in the 
right direction. Thanks for working on this!

Mat

> +		}
> +		if (delta == 0)
> +			break;
> +
> +		if (skb_queue_is_last(&ssk->sk_receive_queue, skb)) {
> +			/* if this subflow is closed, the partial mapping
> +			 * will be never completed; flush the pending skbs, so
> +			 * that subflow_sched_work_if_closed() can kick in
> +			 */
> +			if (unlikely(ssk->sk_state == TCP_CLOSE))
> +				while ((skb = skb_peek(&ssk->sk_receive_queue)))
> +					sk_eat_skb(ssk, skb);
> +
> +			/* not enough data to validate the csum */
> +			return MAPPING_EMPTY;
> +		}
> +
> +		/* the DSS mapping for next skbs will be validated later,
> +		 * when a get_mapping_status call will process such skb
> +		 */
> +		skb = skb->next;
> +	}
> +
> +	/* note that 'map_data_len' accounts only for the carried data, does
> +	 * not include the eventual seq increment due to the data fin,
> +	 * while the pseudo header requires the original DSS data len,
> +	 * including that
> +	 */
> +	header.data_seq = subflow->map_seq;
> +	header.subflow_seq = subflow->map_subflow_seq;
> +	header.data_len = subflow->map_data_len + subflow->map_data_fin;
> +	header.csum = 0;
> +
> +	csum = csum_partial(&header, sizeof(header), subflow->map_data_csum);
> +	if (unlikely(csum_fold(csum)))
> +		return subflow->mp_join ? MAPPING_INVALID : MAPPING_DUMMY;
> +
> +	return MAPPING_OK;
> +}
> +
> static enum mapping_status get_mapping_status(struct sock *ssk,
> 					      struct mptcp_sock *msk)
> {
> 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
> +	bool csum_reqd = READ_ONCE(msk->csum_enabled);
> 	struct mptcp_ext *mpext;
> 	struct sk_buff *skb;
> 	u16 data_len;
> @@ -926,9 +1002,10 @@ static enum mapping_status get_mapping_status(struct sock *ssk,
> 		/* Allow replacing only with an identical map */
> 		if (subflow->map_seq == map_seq &&
> 		    subflow->map_subflow_seq == mpext->subflow_seq &&
> -		    subflow->map_data_len == data_len) {
> +		    subflow->map_data_len == data_len &&
> +		    subflow->map_csum_reqd == mpext->csum_reqd) {
> 			skb_ext_del(skb, SKB_EXT_MPTCP);
> -			return MAPPING_OK;
> +			goto validate_csum;
> 		}
>
> 		/* If this skb data are fully covered by the current mapping,
> @@ -940,20 +1017,27 @@ static enum mapping_status get_mapping_status(struct sock *ssk,
> 		}
>
> 		/* will validate the next map after consuming the current one */
> -		return MAPPING_OK;
> +		goto validate_csum;
> 	}
>
> 	subflow->map_seq = map_seq;
> 	subflow->map_subflow_seq = mpext->subflow_seq;
> 	subflow->map_data_len = data_len;
> 	subflow->map_valid = 1;
> +	subflow->map_data_fin = mpext->data_fin;
> 	subflow->mpc_map = mpext->mpc_map;
> -	subflow->data_csum = 0;
> -	subflow->csum_len = 0;
> -	subflow->map_csum = mpext->csum;
> -	pr_debug("new map seq=%llu subflow_seq=%u data_len=%u csum=%u",
> +	subflow->map_csum_reqd = mpext->csum_reqd;
> +	subflow->map_csum_len = 0;
> +	subflow->map_data_csum = csum_unfold(mpext->csum);
> +
> +	/* Cfr RFC 8684 Section 3.3.0 */
> +	if (unlikely(subflow->map_csum_reqd != csum_reqd))
> +		return MAPPING_INVALID;
> +
> +	pr_debug("new map seq=%llu subflow_seq=%u data_len=%u csum=%d:%u",
> 		 subflow->map_seq, subflow->map_subflow_seq,
> -		 subflow->map_data_len, subflow->map_csum);
> +		 subflow->map_data_len, subflow->map_csum_reqd,
> +		 subflow->map_data_csum);
>
> validate_seq:
> 	/* we revalidate valid mapping on new skb, because we must ensure
> @@ -963,7 +1047,9 @@ static enum mapping_status get_mapping_status(struct sock *ssk,
> 		return MAPPING_INVALID;
>
> 	skb_ext_del(skb, SKB_EXT_MPTCP);
> -	return MAPPING_OK;
> +
> +validate_csum:
> +	return validate_data_csum(ssk, skb, csum_reqd);
> }
>
> static void mptcp_subflow_discard_data(struct sock *ssk, struct sk_buff *skb,
> -- 
> 2.26.2
>
>
>

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 mptcp-next 16/22] Squash-to: mptcp: validate the data checksum
       [not found]       ` <24c5410-3b1-5efe-57b6-e1ab374ca52a@linux.intel.com>
@ 2021-05-04  8:09         ` Paolo Abeni
  0 siblings, 0 replies; 25+ messages in thread
From: Paolo Abeni @ 2021-05-04  8:09 UTC (permalink / raw)
  To: Mat Martineau; +Cc: mptcp

Hello,

On Mon, 2021-05-03 at 13:10 -0700, Mat Martineau wrote:
> On Mon, 3 May 2021, Paolo Abeni wrote:
> > On Sat, Apr 24, 2021 at 1:42 AM Mat Martineau
> > <mathew.j.martineau@linux.intel.com> wrote:
> > > On Fri, 23 Apr 2021, Paolo Abeni wrote:
> > > > diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> > > > index 68efc81eaf2c..0ee568a99da1 100644
> > > > --- a/net/mptcp/subflow.c
> > > > +++ b/net/mptcp/subflow.c
> > > > @@ -829,10 +829,86 @@ static bool validate_mapping(struct sock *ssk, struct sk_buff *skb)
> > > >       return true;
> > > > }
> > > > 
> > > > +static enum mapping_status validate_data_csum(struct sock *ssk, struct sk_buff *skb,
> > > > +                                           bool csum_reqd)
> > > > +{
> > > > +     struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
> > > > +     struct csum_pseudo_header header;
> > > > +     u32 offset, seq, delta;
> > > > +     __wsum csum;
> > > > +     int len;
> > > > +
> > > > +     if (!csum_reqd)
> > > > +             return MAPPING_OK;
> > > > +
> > > > +     /* mapping already validated on previous traversal */
> > > > +     if (subflow->map_csum_len == subflow->map_data_len)
> > > > +             return MAPPING_OK;
> > > > +
> > > > +     /* traverse the receive queue, ensuring it contains a full
> > > > +      * DSS mapping and accumulating the related csum.
> > > > +      * Preserve the accoumlate csum across multiple calls, to compute
> > > > +      * the csum only once
> > > > +      */
> > > > +     delta = subflow->map_data_len - subflow->map_csum_len;
> > > > +     for (;;) {
> > > > +             seq = tcp_sk(ssk)->copied_seq + subflow->map_csum_len;
> > > > +             offset = seq - TCP_SKB_CB(skb)->seq;
> > > > +
> > > > +             /* if the current skb has not been accounted yet, csum its contents
> > > > +              * up to the amount covered by the current DSS
> > > > +              */
> > > > +             if (offset < skb->len) {
> > > > +                     len = min(skb->len - offset, delta);
> > > > +                     delta -= len;
> > > > +                     subflow->map_csum_len += len;
> > > > +                     subflow->map_data_csum = skb_checksum(skb, offset, len,
> > > > +                                                           subflow->map_data_csum);
> > > 
> > > skb_checksum() can only be used here if the data values are properly
> > > aligned for the 16-bit checksum. Think about the pathological case of all
> > > skbs with 1-byte payloads - you can't update the checksum until you have
> > > two bytes to feed to the checksum calculation (unless it's the last
> > > byte).
> > > 
> > > Also see the comment with checksum_partial() that ends up doing the work
> > > for skb_checksum() - "this function must be called with even lengths,
> > > except for the last fragment, which may be odd".
> > 
> > Thanks for pointing that out! I did not notice the comments (nor the
> > previous discussion on this topic, if any).
> > 
> > I'm wondering how e.g. UDP + cork cope with that ?!? in
> > ip_append_data() I don't see any special care for odd fragment lengths
> > and a after quick inspection to the x86 and the generic csum code I
> > *think* it should cope correctly even with odd, non terminal fragments
> > ?!?
> > 
> > Do you see any specific reason why do_csum() should produce uncorrect
> > results in the above scenario ?
> 
> I think the special handling is kind of buried - both ip_append_data() and 
> __ip_append_data() have a getfrag function pointer as a parameter. And 
> getfrag functions have an 'odd' parameter that gets propagated to 
> csum_block_add() by functions like ip_generic_getfrag() and 
> ip_reply_glue_bits(). Then csum_block_add() ends up doing a rotate to 
> handle the alignment.
> 
> Within skb_checksum(), csum_block_add() is also used, and it looks like it 
> handles odd/even lengths of fragments. So I think the checksums for 
> MPTCP-level skbs would need to be updated/combined similar to the way the 
> checksums for skb fragments are updated/combined inside __skb_checksum() 
> in order to correctly handle alignment.
> 
> 
> If you have data like this:
> 
> 0011223344556677
> 
> the checksum calculated all at once would look like:
> 
>    0011
> + 2233
> + 4455
> + 6677
> 
> But if you slice it up with odd-lengths like this:
> 
> 001122 | 3344 | 55 | 6677
> 
> then the checksum algorithm sees some padding (the 'p's):
> 
>    0011
> + 22pp
> + 3344
> + 55pp
> + 6677
> 
> which throws off the alignment and introduces artifacts from the padding.
> 
> 
> I think my statement earlier in this thread about the problems with 
> odd-length data didn't have all the details correct, but it seems like the 
> good news is that csum_block_add() could be the tool to handle it.

Thank you for the pointers and the detailed analisys. My understanding
is that combining the csum_partial() results via csum_block_add(),
tacking in account the current offset inside the DSS should address the
situation. I'll update the patches with that and hopefully post soon.

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2021-05-04  8:09 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-23 14:49 [PATCH v4 mptcp-next 00/22] mptcp: data checksum support Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 01/22] mptcp: add csum_enabled in mptcp_sock Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 02/22] mptcp: generate the data checksum Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 03/22] Squash-to: "mptcp: generate the data checksum" Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 04/22] mptcp: add csum_reqd in mptcp_out_options Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 05/22] mptcp: send out checksum for MP_CAPABLE with data Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 06/22] Squash-to: "mptcp: send out checksum for MP_CAPABLE with data" Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 07/22] mptcp: send out checksum for DSS Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 08/22] Squash-to "mptcp: send out checksum for DSS" Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 09/22] mptcp: add sk parameter for mptcp_parse_option Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 10/22] mptcp: add csum_reqd in mptcp_options_received Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 11/22] mptcp: receive checksum for MP_CAPABLE with data Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 12/22] Squash-to: "mptcp: receive checksum for MP_CAPABLE with data" Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 13/22] mptcp: receive checksum for DSS Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 14/22] Squash-to: "mptcp: receive checksum for DSS" Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 15/22] mptcp: validate the data checksum Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 16/22] Squash-to: " Paolo Abeni
2021-04-23 23:42   ` Mat Martineau
     [not found]     ` <CAF6piCJ5Dv+5_LN-yaz0kdY_EM6LtyM7JmHY43ie+cFE_nf-=A@mail.gmail.com>
     [not found]       ` <24c5410-3b1-5efe-57b6-e1ab374ca52a@linux.intel.com>
2021-05-04  8:09         ` Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 17/22] mptcp: cleanup error path in subflow_check_data_avail() Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 18/22] mptcp: tune re-injections for csum enabled mode Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 19/22] mptcp: add the mib for data checksum Paolo Abeni
2021-04-23 14:49 ` [PATCH v4 mptcp-next 20/22] mptcp: add a new sysctl checksum_enabled Paolo Abeni
2021-04-23 14:50 ` [PATCH v4 mptcp-next 21/22] selftests: mptcp: enable checksum in mptcp_connect.sh Paolo Abeni
2021-04-23 14:50 ` [PATCH v4 mptcp-next 22/22] selftests: mptcp: enable checksum in mptcp_join.sh Paolo Abeni

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.