* Re: [MPTCP] [RFC PATCH v2 2/8] mptcp: update per subflow unacked sequence on pkt reception
@ 2019-08-23 17:40 Mat Martineau
  0 siblings, 0 replies; 4+ messages in thread
From: Mat Martineau @ 2019-08-23 17:40 UTC (permalink / raw)
  To: mptcp



On Fri, 23 Aug 2019, Paolo Abeni wrote:

> Hi,
>
> Thank you for the feedback!
>
> On Thu, 2019-08-22 at 17:34 -0700, Mat Martineau wrote:
>> Rather than trying to sync subflow->snd_una values for the purpose of
>> expanding them to 64 bits, the subflow could record the 32-bit value and
>> let the MPTCP-level socket do the expansion? Then the subflows only need
>> to track their most recent data ack: the ack value, size (32 or 64), and
>> whether it's been updated since the last access by the MPTCP-level socket.
>
> I think that tracking the most recent data ack can become almost as
> tricky if the peer interleaves 32-bit and 64-bit acks (IIRC, the RFC
> allows that).

Yes, the sender can interleave.

At the subflow, it's still simple (any new ack replaces the previous one 
regardless of bit width). Since the ack always moves forward, we don't 
care about the old value even if it was "better" (64-bit). And the 
expansion of a 32-bit ack on one subflow doesn't depend on the previous 
value seen on that subflow; it depends on msk->snd_una.

>
>> The MPTCP-level socket has more information to know whether to reject an
>> ambiguous 32-bit ack. Latency between the update of subflow->snd_una and
>> the check of that value by the MPTCP socket is still an issue, but could
>> be more manageable.
>
> Expanding to 64 bits looks safer with msk->snd_una, and moving the
> msk->snd_una updates into mptcp_incoming_options() should be quite
> doable. What about that option?
>
> Yep, we need an additional lock/atomic operation, but <quote> Premature
> optimization is the root of all evil </quote>

Sounds good to me.

>
>> Side note: I know we have to handle what compliant peers send to us, but
>> maybe we should keep sending only 64-bit acks!
>
> I think we already do that ?!?

Yes, that is what we currently do. My suggestion is that we continue 
with only 64-bit acks in the future.

--
Mat Martineau
Intel


* Re: [MPTCP] [RFC PATCH v2 2/8] mptcp: update per subflow unacked sequence on pkt reception
@ 2019-08-23 16:37 Paolo Abeni
From: Paolo Abeni @ 2019-08-23 16:37 UTC (permalink / raw)
  To: mptcp


Hi,

Thank you for the feedback!

On Thu, 2019-08-22 at 17:34 -0700, Mat Martineau wrote:
> Rather than trying to sync subflow->snd_una values for the purpose of 
> expanding them to 64 bits, the subflow could record the 32-bit value and 
> let the MPTCP-level socket do the expansion? Then the subflows only need 
> to track their most recent data ack: the ack value, size (32 or 64), and 
> whether it's been updated since the last access by the MPTCP-level socket. 

I think that tracking the most recent data ack can become almost as
tricky if the peer interleaves 32-bit and 64-bit acks (IIRC, the RFC
allows that).

> The MPTCP-level socket has more information to know whether to reject an 
> ambiguous 32-bit ack. Latency between the update of subflow->snd_una and 
> the check of that value by the MPTCP socket is still an issue, but could 
> be more manageable.

Expanding to 64 bits looks safer with msk->snd_una, and moving the
msk->snd_una updates into mptcp_incoming_options() should be quite
doable. What about that option?

Yep, we need an additional lock/atomic operation, but <quote> Premature
optimization is the root of all evil </quote>

> Side note: I know we have to handle what compliant peers send to us, but 
> maybe we should keep sending only 64-bit acks!

I think we already do that ?!?

Cheers,

Paolo



* Re: [MPTCP] [RFC PATCH v2 2/8] mptcp: update per subflow unacked sequence on pkt reception
@ 2019-08-23  0:34 Mat Martineau
From: Mat Martineau @ 2019-08-23  0:34 UTC (permalink / raw)
  To: mptcp



On Thu, 22 Aug 2019, Paolo Abeni wrote:

> Keep the per-subflow unacked sequence number consistent; since we
> update only per-socket data, no additional locking is required.
>
> Subflow and msk unacked sequence numbers are updated at different
> stages to keep the locking scheme simple.
>
> Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
> ---
> rfc v1 -> rfc v2:
> - initialize subflow snd_una on subflow connect()/accept()
> ---
> net/mptcp/options.c  | 31 +++++++++++++++++++++++++------
> net/mptcp/protocol.c |  2 ++
> net/mptcp/protocol.h |  1 +
> 3 files changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/net/mptcp/options.c b/net/mptcp/options.c
> index 0e7abcea5bae..e0895834ff82 100644
> --- a/net/mptcp/options.c
> +++ b/net/mptcp/options.c
> @@ -524,6 +524,21 @@ bool mptcp_synack_options(const struct request_sock *req, unsigned int *size,
> 	return false;
> }
>
> +static u64 expand_ack(u64 old_ack, u64 cur_ack, bool use_64bit)
> +{
> +	u32 old_ack32, cur_ack32;
> +
> +	if (use_64bit)
> +		return cur_ack;
> +
> +	old_ack32 = (u32)old_ack;
> +	cur_ack32 = (u32)cur_ack;
> +	cur_ack = (old_ack & GENMASK_ULL(63, 32)) + cur_ack32;
> +	if (unlikely(before(cur_ack32, old_ack32)))
> +		return cur_ack + (1LL<<32);
> +	return cur_ack;
> +}

Expanding the 32-bit sequence numbers correctly and consistently has been 
a concern of mine for a while, and trying to think through all the issues 
in the context of ack processing has only made me realize it's even 
trickier!

With the 32-bit data acks, we can encounter delayed acks on any subflow. 
Even timely acks seen on one subflow might have large changes in value due 
to the peer's transmit scheduling choices (acks might only show up on 
other subflows for a while). The penalty for incorrectly accepting a data 
ack is significant: we would discard data from our MPTCP-level 
retransmission buffer that the peer does not actually have, leading to 
an unrecoverable stall. Incorrectly discarding a data ack would be 
recoverable and would likely fix itself through additional acks, though 
it might lead to an unnecessary reinjection by the peer.


I point this out in order to say that expanding 32-bit acks seems like a 
hard problem, and by decoupling the subflow->snd_una values it becomes 
even harder :) . I also understand that the possible locking problems from 
trying to share snd_una across subflows would be difficult too.


Rather than trying to sync subflow->snd_una values for the purpose of 
expanding them to 64 bits, the subflow could record the 32-bit value and 
let the MPTCP-level socket do the expansion? Then the subflows only need 
to track their most recent data ack: the ack value, size (32 or 64), and 
whether it's been updated since the last access by the MPTCP-level socket. 
The MPTCP-level socket has more information to know whether to reject an 
ambiguous 32-bit ack. Latency between the update of subflow->snd_una and 
the check of that value by the MPTCP socket is still an issue, but could 
be more manageable.


Side note: I know we have to handle what compliant peers send to us, but 
maybe we should keep sending only 64-bit acks!


Mat


> +
> void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
> 			    struct tcp_options_received *opt_rx)
> {
> @@ -537,6 +552,16 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
> 	if (!mp_opt->dss)
> 		return;
>
> +	/* we can't wait for recvmsg() to update the ack_seq, otherwise
> +	 * monodirectional flows will stuck
> +	 */
> +	if (mp_opt->use_ack) {
> +		struct subflow_context *subflow = subflow_ctx(sk);
> +
> +		subflow->snd_una = expand_ack(subflow->snd_una,
> +					      mp_opt->data_ack, mp_opt->ack64);
> +	}
> +
> 	mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
> 	if (!mpext)
> 		return;
> @@ -553,12 +578,6 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
> 		mpext->use_checksum = mp_opt->use_checksum;
> 	}
>
> -	if (mp_opt->use_ack) {
> -		mpext->data_ack = mp_opt->data_ack;
> -		mpext->use_ack = 1;
> -		mpext->ack64 = mp_opt->ack64;
> -	}
> -
> 	mpext->data_fin = mp_opt->data_fin;
>
> 	if (msk)
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 5b7ddbe64e4b..65f7cc4712ef 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -674,6 +674,7 @@ static struct sock *mptcp_accept(struct sock *sk, int flags, int *err,
> 		subflow->map_subflow_seq = 1;
> 		subflow->rel_write_seq = 1;
> 		subflow->tcp_sock = new_sock;
> +		subflow->snd_una = msk->write_seq;
> 		newsk = new_mptcp_sock;
> 		subflow->conn = new_mptcp_sock;
> 		list_add(&subflow->node, &msk->conn_list);
> @@ -809,6 +810,7 @@ void mptcp_finish_connect(struct sock *sk, int mp_capable)
> 		subflow->map_seq = ack_seq;
> 		subflow->map_subflow_seq = 1;
> 		subflow->rel_write_seq = 1;
> +		subflow->snd_una = msk->write_seq;
>
> 		list_add(&subflow->node, &msk->conn_list);
> 		msk->subflow = NULL;
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 9f261ee31917..e37fd4bcd904 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -165,6 +165,7 @@ struct subflow_context {
> 	u32	token;
> 	u32     rel_write_seq;
> 	u64     idsn;
> +	u64	snd_una;
> 	u64	map_seq;
> 	u32	map_subflow_seq;
> 	u32	ssn_offset;
> -- 
> 2.21.0
>
> _______________________________________________
> mptcp mailing list
> mptcp(a)lists.01.org
> https://lists.01.org/mailman/listinfo/mptcp
>

--
Mat Martineau
Intel


* [MPTCP] [RFC PATCH v2 2/8] mptcp: update per subflow unacked sequence on pkt reception
@ 2019-08-22 13:47 Paolo Abeni
From: Paolo Abeni @ 2019-08-22 13:47 UTC (permalink / raw)
  To: mptcp


Keep the per-subflow unacked sequence number consistent; since we
update only per-socket data, no additional locking is required.

Subflow and msk unacked sequence numbers are updated at different
stages to keep the locking scheme simple.

Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
rfc v1 -> rfc v2:
 - initialize subflow snd_una on subflow connect()/accept()
---
 net/mptcp/options.c  | 31 +++++++++++++++++++++++++------
 net/mptcp/protocol.c |  2 ++
 net/mptcp/protocol.h |  1 +
 3 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 0e7abcea5bae..e0895834ff82 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -524,6 +524,21 @@ bool mptcp_synack_options(const struct request_sock *req, unsigned int *size,
 	return false;
 }
 
+static u64 expand_ack(u64 old_ack, u64 cur_ack, bool use_64bit)
+{
+	u32 old_ack32, cur_ack32;
+
+	if (use_64bit)
+		return cur_ack;
+
+	old_ack32 = (u32)old_ack;
+	cur_ack32 = (u32)cur_ack;
+	cur_ack = (old_ack & GENMASK_ULL(63, 32)) + cur_ack32;
+	if (unlikely(before(cur_ack32, old_ack32)))
+		return cur_ack + (1LL<<32);
+	return cur_ack;
+}
+
 void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
 			    struct tcp_options_received *opt_rx)
 {
@@ -537,6 +552,16 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
 	if (!mp_opt->dss)
 		return;
 
+	/* we can't wait for recvmsg() to update the ack_seq, otherwise
+	 * monodirectional flows will stuck
+	 */
+	if (mp_opt->use_ack) {
+		struct subflow_context *subflow = subflow_ctx(sk);
+
+		subflow->snd_una = expand_ack(subflow->snd_una,
+					      mp_opt->data_ack, mp_opt->ack64);
+	}
+
 	mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
 	if (!mpext)
 		return;
@@ -553,12 +578,6 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
 		mpext->use_checksum = mp_opt->use_checksum;
 	}
 
-	if (mp_opt->use_ack) {
-		mpext->data_ack = mp_opt->data_ack;
-		mpext->use_ack = 1;
-		mpext->ack64 = mp_opt->ack64;
-	}
-
 	mpext->data_fin = mp_opt->data_fin;
 
 	if (msk)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 5b7ddbe64e4b..65f7cc4712ef 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -674,6 +674,7 @@ static struct sock *mptcp_accept(struct sock *sk, int flags, int *err,
 		subflow->map_subflow_seq = 1;
 		subflow->rel_write_seq = 1;
 		subflow->tcp_sock = new_sock;
+		subflow->snd_una = msk->write_seq;
 		newsk = new_mptcp_sock;
 		subflow->conn = new_mptcp_sock;
 		list_add(&subflow->node, &msk->conn_list);
@@ -809,6 +810,7 @@ void mptcp_finish_connect(struct sock *sk, int mp_capable)
 		subflow->map_seq = ack_seq;
 		subflow->map_subflow_seq = 1;
 		subflow->rel_write_seq = 1;
+		subflow->snd_una = msk->write_seq;
 
 		list_add(&subflow->node, &msk->conn_list);
 		msk->subflow = NULL;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 9f261ee31917..e37fd4bcd904 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -165,6 +165,7 @@ struct subflow_context {
 	u32	token;
 	u32     rel_write_seq;
 	u64     idsn;
+	u64	snd_una;
 	u64	map_seq;
 	u32	map_subflow_seq;
 	u32	ssn_offset;
-- 
2.21.0


