All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [MPTCP] [PATCH] mptcp: mptcp option len must be skb independent
@ 2019-04-19 22:28 Mat Martineau
  0 siblings, 0 replies; 4+ messages in thread
From: Mat Martineau @ 2019-04-19 22:28 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 5195 bytes --]


On Fri, 19 Apr 2019, Paolo Abeni wrote:

> Otherwise the TCP stack can't compute correctly the mss and
> that will break segmentation. Note that we can't observe the
> issue with the self-test until we allow skb collapsing -
> loopback default MTU is pretty big and we dont build big enogh
> skb otherwise.
>
> Fix the issue caching the use_map/use_checksum flags also at the
> substream ctx level, and using them in tcp_established_options().
> While at it, replace a bunch of magic numbers with the appropriate
> constants.

Hi Paolo -

"use_checksum" isn't going to differ across packets, and is a subflow-wide 
setting that can be in the subflow context.

"use_map" is different. The MPTCP DSS option has two parts, with one or 
both of the MPTCP Data ACK and the MPTCP data sequence mapping. This 
varies by the packet. After the connection gets started (handshake plus a 
DSS ACK indicating the DSS options are getting passed through the 
network), the DSS option is not required on every packet.

ACK packets have the DSS option with the MPTCP Data ACK but no mapping.

Not every data packet needs to have DSS mapping information - our MPTCP 
socket could get a sendmsg with 2800 bytes, and send one skb with the 
first 1400 data bytes and the DSS mapping for the 2800 byte range, and a 
second skb with the last 1400 data bytes and no MPTCP option header.

One option is to have the MPTCP socket only send skbs with DSS mapping 
information, which might use TCP option bytes in more packets than is 
strictly necessary. Another approach would be to assume the larger MPTCP 
DSS option size (both ack and map) when calculating the mss. Do you think 
one of those approaches would allow proper segmentation? Or is something 
else required?


By the way, I do have changes queued up for RFCv10 that refactor the MPTCP 
option writing code and replace the magic numbers.


Mat


>
> Fixes: a0e41f8a56d6 ("mptcp: Write MPTCP DSS headers to outgoing data packets")
> Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
> --
> Notes:
> - I'm ok with swashing this commit into the referenced one
> - this fixes the data corruption I was observing on the xmit path
>  refactor patchset
>
> ---
> include/net/mptcp.h   |  2 ++
> net/ipv4/tcp_output.c | 18 ++++++++----------
> net/mptcp/subflow.c   |  4 ++++
> 3 files changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/include/net/mptcp.h b/include/net/mptcp.h
> index 0c0839d9ab38..5658b85da7ca 100644
> --- a/include/net/mptcp.h
> +++ b/include/net/mptcp.h
> @@ -128,6 +128,8 @@ struct subflow_context {
> 	bool	fourth_ack;	// send initial DSS
> 	bool	conn_finished;
> 	bool	map_valid;
> +	bool	use_map;
> +	bool	use_checksum;
> 	struct  sock *sk;       /* underlying tcp_sock */
> 	struct  sock *conn;     /* parent mptcp_sock */
> 	void	(*tcp_sk_data_ready)(struct sock *sk);
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 6798b1e91140..4e5dab801bed 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -822,19 +822,17 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
> 				opts->mptcp.rcvr_key = remote_key;
> 				size += TCPOLEN_MPTCP_MPC_ACK;
> 			}
> -		} else if (subflow_ctx(sk)->mp_capable && skb) {
> +		} else if (subflow_ctx(sk)->mp_capable) {
> +			unsigned int ack_size = TCPOLEN_MPTCP_DSS_ACK64;
> 			unsigned int dss_size = 0;
> -			unsigned int ack_size = 8;
> -			struct mptcp_ext *mpext;
> 			u16 suboptions = 0;
>
> -			mpext = mptcp_get_ext(skb);
> +			if (subflow_ctx(sk)->use_map) {
> +				unsigned int map_size = TCPOLEN_MPTCP_DSS_BASE +
> +							TCPOLEN_MPTCP_DSS_MAP64;
>
> -			if (mpext && mpext->use_map) {
> -				unsigned int map_size = 18;
> -
> -				if (mpext->use_checksum)
> -					map_size += 2;
> +				if (subflow_ctx(sk)->use_checksum)
> +					map_size += TCPOLEN_MPTCP_DSS_CHECKSUM;
>
> 				if (map_size <= remaining) {
> 					remaining -= map_size;
> @@ -849,7 +847,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
> 			 * overhead if mapping not populated
> 			 */
> 			if (dss_size == 0)
> -				ack_size += 4;
> +				ack_size += TCPOLEN_MPTCP_DSS_BASE;
>
> 			if (ack_size <= remaining) {
> 				dss_size += ack_size;
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 077d5baae270..3c6988cb3126 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -130,6 +130,8 @@ static struct subflow_context *clone_ctx(const struct sock *sk, struct sock *chi
> 	child_ctx->sk = child;
> 	child_ctx->conn = NULL;
> 	child_ctx->tcp_sk_data_ready = ctx->tcp_sk_data_ready;
> +	child_ctx->use_map = ctx->use_map;
> +	child_ctx->use_checksum = ctx->use_checksum;
>
> 	return child_ctx;
> }
> @@ -237,6 +239,8 @@ static int subflow_init(struct sock *sk)
> 	tsk->is_mptcp = 1;
> 	icsk->icsk_af_ops = &subflow_specific;
> 	ctx->tcp_sk_data_ready = sk->sk_data_ready;
> +	ctx->use_map = true;
> +	ctx->use_checksum = false;
> 	sk->sk_data_ready = subflow_data_ready;
> out:
> 	return err;
> -- 
> 2.20.1
>
>

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [MPTCP] [PATCH] mptcp: mptcp option len must be skb independent
@ 2019-04-24 17:51 Mat Martineau
  0 siblings, 0 replies; 4+ messages in thread
From: Mat Martineau @ 2019-04-24 17:51 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 3098 bytes --]


On Wed, 24 Apr 2019, Paolo Abeni wrote:

> On Fri, 2019-04-19 at 15:28 -0700, Mat Martineau wrote:
>> On Fri, 19 Apr 2019, Paolo Abeni wrote:
>>
>>> Otherwise the TCP stack can't compute correctly the mss and
>>> that will break segmentation. Note that we can't observe the
>>> issue with the self-test until we allow skb collapsing -
>>> loopback default MTU is pretty big and we dont build big enogh
>>> skb otherwise.
>>>
>>> Fix the issue caching the use_map/use_checksum flags also at the
>>> substream ctx level, and using them in tcp_established_options().
>>> While at it, replace a bunch of magic numbers with the appropriate
>>> constants.
>>
>> Hi Paolo -
>>
>> "use_checksum" isn't going to differ across packets, and is a subflow-wide
>> setting that can be in the subflow context.
>>
>> "use_map" is different. The MPTCP DSS option has two parts, with one or
>> both of the MPTCP Data ACK and the MPTCP data sequence mapping. This
>> varies by the packet. After the connection gets started (handshake plus a
>> DSS ACK indicating the DSS options are getting passed through the
>> network), the DSS option is not required on every packet.
>>
>> ACK packets have the DSS option with the MPTCP Data ACK but no mapping.
>>
>> Not every data packet needs to have DSS mapping information - our MPTCP
>> socket could get a sendmsg with 2800 bytes, and send one skb with the
>> first 1400 data bytes and the DSS mapping for the 2800 byte range, and a
>> second skb with the last 1400 data bytes and no MPTCP option header.
>>
>> One option is to have the MPTCP socket only send skbs with DSS mapping
>> information, which might use TCP option bytes in more packets than is
>> strictly necessary.
>
> I'm unsure I read correctly the above?!? no mptcp ACK?!?

That's not what I intended, let me rephrase the last paragraph: "One 
option is to have the MPTCP socket only send skbs with the MPTCP extension 
and the DSS information contained in that extension. This can populate the 
DSS header with mapping information in more packets than is strictly 
necessary, increasing header overhead."

>
>> Another approach would be to assume the larger MPTCP
>> DSS option size (both ack and map) when calculating the mss.
>
> This is basically what the proposed patch does (as subflow->use_map
> can't be changed). AFACS it fixes the issue.

Ok, I had incorrectly thought that subflow->use_map and 
subflow->use_checksum were intended to replace the per-skb flags in 
mptcp_ext, but that's not what the patch does.

I think subflow->use_map is not necessary. If it's always true, we can 
remove the conditional in tcp_established_options instead of creating a 
new flag.

subflow->use_checksum should replace mptcp_ext->use_checksum, since it is 
the same for every packet in a given connection. However, note that 
use_checksum doesn't change the number of option bytes required by MPTCP 
due to use of ALIGN() in the size calculation. If the checksum is unused, 
those bytes are filled with TCPOPT_NOP.

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [MPTCP] [PATCH] mptcp: mptcp option len must be skb independent
@ 2019-04-24  8:35 Paolo Abeni
  0 siblings, 0 replies; 4+ messages in thread
From: Paolo Abeni @ 2019-04-24  8:35 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 2012 bytes --]

On Fri, 2019-04-19 at 15:28 -0700, Mat Martineau wrote:
> On Fri, 19 Apr 2019, Paolo Abeni wrote:
> 
> > Otherwise the TCP stack can't compute correctly the mss and
> > that will break segmentation. Note that we can't observe the
> > issue with the self-test until we allow skb collapsing -
> > loopback default MTU is pretty big and we dont build big enogh
> > skb otherwise.
> > 
> > Fix the issue caching the use_map/use_checksum flags also at the
> > substream ctx level, and using them in tcp_established_options().
> > While at it, replace a bunch of magic numbers with the appropriate
> > constants.
> 
> Hi Paolo -
> 
> "use_checksum" isn't going to differ across packets, and is a subflow-wide 
> setting that can be in the subflow context.
> 
> "use_map" is different. The MPTCP DSS option has two parts, with one or 
> both of the MPTCP Data ACK and the MPTCP data sequence mapping. This 
> varies by the packet. After the connection gets started (handshake plus a 
> DSS ACK indicating the DSS options are getting passed through the 
> network), the DSS option is not required on every packet.
> 
> ACK packets have the DSS option with the MPTCP Data ACK but no mapping.
> 
> Not every data packet needs to have DSS mapping information - our MPTCP 
> socket could get a sendmsg with 2800 bytes, and send one skb with the 
> first 1400 data bytes and the DSS mapping for the 2800 byte range, and a 
> second skb with the last 1400 data bytes and no MPTCP option header.
> 
> One option is to have the MPTCP socket only send skbs with DSS mapping 
> information, which might use TCP option bytes in more packets than is 
> strictly necessary. 

I'm unsure I read correctly the above?!? no mptcp ACK?!?

> Another approach would be to assume the larger MPTCP 
> DSS option size (both ack and map) when calculating the mss. 

This is basically what the proposed patch does (as subflow->use_map
can't be changed). AFACS it fixes the issue.

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [MPTCP] [PATCH] mptcp: mptcp option len must be skb independent
@ 2019-04-19 21:12 Paolo Abeni
  0 siblings, 0 replies; 4+ messages in thread
From: Paolo Abeni @ 2019-04-19 21:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 3571 bytes --]

Otherwise the TCP stack can't compute correctly the mss and
that will break segmentation. Note that we can't observe the
issue with the self-test until we allow skb collapsing -
loopback default MTU is pretty big and we dont build big enogh
skb otherwise.

Fix the issue caching the use_map/use_checksum flags also at the
substream ctx level, and using them in tcp_established_options().
While at it, replace a bunch of magic numbers with the appropriate
constants.

Fixes: a0e41f8a56d6 ("mptcp: Write MPTCP DSS headers to outgoing data packets")
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
--
Notes:
- I'm ok with swashing this commit into the referenced one
- this fixes the data corruption I was observing on the xmit path
  refactor patchset

---
 include/net/mptcp.h   |  2 ++
 net/ipv4/tcp_output.c | 18 ++++++++----------
 net/mptcp/subflow.c   |  4 ++++
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 0c0839d9ab38..5658b85da7ca 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -128,6 +128,8 @@ struct subflow_context {
 	bool	fourth_ack;	// send initial DSS
 	bool	conn_finished;
 	bool	map_valid;
+	bool	use_map;
+	bool	use_checksum;
 	struct  sock *sk;       /* underlying tcp_sock */
 	struct  sock *conn;     /* parent mptcp_sock */
 	void	(*tcp_sk_data_ready)(struct sock *sk);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6798b1e91140..4e5dab801bed 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -822,19 +822,17 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 				opts->mptcp.rcvr_key = remote_key;
 				size += TCPOLEN_MPTCP_MPC_ACK;
 			}
-		} else if (subflow_ctx(sk)->mp_capable && skb) {
+		} else if (subflow_ctx(sk)->mp_capable) {
+			unsigned int ack_size = TCPOLEN_MPTCP_DSS_ACK64;
 			unsigned int dss_size = 0;
-			unsigned int ack_size = 8;
-			struct mptcp_ext *mpext;
 			u16 suboptions = 0;
 
-			mpext = mptcp_get_ext(skb);
+			if (subflow_ctx(sk)->use_map) {
+				unsigned int map_size = TCPOLEN_MPTCP_DSS_BASE +
+							TCPOLEN_MPTCP_DSS_MAP64;
 
-			if (mpext && mpext->use_map) {
-				unsigned int map_size = 18;
-
-				if (mpext->use_checksum)
-					map_size += 2;
+				if (subflow_ctx(sk)->use_checksum)
+					map_size += TCPOLEN_MPTCP_DSS_CHECKSUM;
 
 				if (map_size <= remaining) {
 					remaining -= map_size;
@@ -849,7 +847,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 			 * overhead if mapping not populated
 			 */
 			if (dss_size == 0)
-				ack_size += 4;
+				ack_size += TCPOLEN_MPTCP_DSS_BASE;
 
 			if (ack_size <= remaining) {
 				dss_size += ack_size;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 077d5baae270..3c6988cb3126 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -130,6 +130,8 @@ static struct subflow_context *clone_ctx(const struct sock *sk, struct sock *chi
 	child_ctx->sk = child;
 	child_ctx->conn = NULL;
 	child_ctx->tcp_sk_data_ready = ctx->tcp_sk_data_ready;
+	child_ctx->use_map = ctx->use_map;
+	child_ctx->use_checksum = ctx->use_checksum;
 
 	return child_ctx;
 }
@@ -237,6 +239,8 @@ static int subflow_init(struct sock *sk)
 	tsk->is_mptcp = 1;
 	icsk->icsk_af_ops = &subflow_specific;
 	ctx->tcp_sk_data_ready = sk->sk_data_ready;
+	ctx->use_map = true;
+	ctx->use_checksum = false;
 	sk->sk_data_ready = subflow_data_ready;
 out:
 	return err;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-04-24 17:51 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-19 22:28 [MPTCP] [PATCH] mptcp: mptcp option len must be skb independent Mat Martineau
  -- strict thread matches above, loose matches on Subject: below --
2019-04-24 17:51 Mat Martineau
2019-04-24  8:35 Paolo Abeni
2019-04-19 21:12 Paolo Abeni

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.