* [MPTCP] [RFC PATCH v2 2/8] mptcp: update per subflow unacked sequence on pkt reception
@ 2019-08-22 13:47 Paolo Abeni
0 siblings, 0 replies; 4+ messages in thread
From: Paolo Abeni @ 2019-08-22 13:47 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 3279 bytes --]
So that we keep per subflow unacked sequence number consistent;
since we update only per socket data, no additional locking is
required.
Subflow and msk unacked seq are updated at different stages
to keep the locking schema simple.
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
rfc v1 -> rfc v2:
- initialize subflow snd_una on subflow connect()/accept()
---
net/mptcp/options.c | 31 +++++++++++++++++++++++++------
net/mptcp/protocol.c | 2 ++
net/mptcp/protocol.h | 1 +
3 files changed, 28 insertions(+), 6 deletions(-)
diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 0e7abcea5bae..e0895834ff82 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -524,6 +524,21 @@ bool mptcp_synack_options(const struct request_sock *req, unsigned int *size,
return false;
}
+static u64 expand_ack(u64 old_ack, u64 cur_ack, bool use_64bit)
+{
+ u32 old_ack32, cur_ack32;
+
+ if (use_64bit)
+ return cur_ack;
+
+ old_ack32 = (u32)old_ack;
+ cur_ack32 = (u32)cur_ack;
+ cur_ack = (old_ack & GENMASK_ULL(63, 32)) + cur_ack32;
+ if (unlikely(before(cur_ack32, old_ack32)))
+ return cur_ack + (1LL<<32);
+ return cur_ack;
+}
+
void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
struct tcp_options_received *opt_rx)
{
@@ -537,6 +552,16 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
if (!mp_opt->dss)
return;
+ /* we can't wait for recvmsg() to update the ack_seq, otherwise
+ * monodirectional flows will stuck
+ */
+ if (mp_opt->use_ack) {
+ struct subflow_context *subflow = subflow_ctx(sk);
+
+ subflow->snd_una = expand_ack(subflow->snd_una,
+ mp_opt->data_ack, mp_opt->ack64);
+ }
+
mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
if (!mpext)
return;
@@ -553,12 +578,6 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
mpext->use_checksum = mp_opt->use_checksum;
}
- if (mp_opt->use_ack) {
- mpext->data_ack = mp_opt->data_ack;
- mpext->use_ack = 1;
- mpext->ack64 = mp_opt->ack64;
- }
-
mpext->data_fin = mp_opt->data_fin;
if (msk)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 5b7ddbe64e4b..65f7cc4712ef 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -674,6 +674,7 @@ static struct sock *mptcp_accept(struct sock *sk, int flags, int *err,
subflow->map_subflow_seq = 1;
subflow->rel_write_seq = 1;
subflow->tcp_sock = new_sock;
+ subflow->snd_una = msk->write_seq;
newsk = new_mptcp_sock;
subflow->conn = new_mptcp_sock;
list_add(&subflow->node, &msk->conn_list);
@@ -809,6 +810,7 @@ void mptcp_finish_connect(struct sock *sk, int mp_capable)
subflow->map_seq = ack_seq;
subflow->map_subflow_seq = 1;
subflow->rel_write_seq = 1;
+ subflow->snd_una = msk->write_seq;
list_add(&subflow->node, &msk->conn_list);
msk->subflow = NULL;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 9f261ee31917..e37fd4bcd904 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -165,6 +165,7 @@ struct subflow_context {
u32 token;
u32 rel_write_seq;
u64 idsn;
+ u64 snd_una;
u64 map_seq;
u32 map_subflow_seq;
u32 ssn_offset;
--
2.21.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [MPTCP] [RFC PATCH v2 2/8] mptcp: update per subflow unacked sequence on pkt reception
@ 2019-08-23 17:40 Mat Martineau
0 siblings, 0 replies; 4+ messages in thread
From: Mat Martineau @ 2019-08-23 17:40 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 1974 bytes --]
On Fri, 23 Aug 2019, Paolo Abeni wrote:
> Hi,
>
> Thank you for the feedback!
>
> On Thu, 2019-08-22 at 17:34 -0700, Mat Martineau wrote:
>> Rather than trying to sync subflow->snd_una values for the purpose of
>> expanding them to 64 bits, the subflow could record the 32-bit value and
>> let the MPTCP-level socket do the expansion? Then the subflows only need
>> to track their most recent data ack: the ack value, size (32 or 64), and
>> whether it's been updated since the last access by the MPTCP-level socket.
>
> I think that tracking the most recent data ack can become almost as
> tricky, if the peer interleaves 32 and 64 bits ack (IIRC the RFC, it
> can).
Yes, the sender can interleave.
At the subflow, it's still simple (any new ack replaces the previous one
regardless of bit width). Since the ack always moves forward, we don't
card about the old value even if it was "better" (64-bit). And the
expansion of a 32-bit ack on one subflow doesn't depend on the previous
value seen on that subflow, it depends on msk->snd_una.
>
>> The MPTCP-level socket has more information to know whether to reject an
>> ambiguous 32-bit ack. Latency between the update of subflow->snd_una and
>> the check of that value by the MPTCP socket is still an issue, but could
>> be more manageable.
>
> Since expanding to 64 bits looks safer with msk->snd_una, and moving
> the msk->snd_una updates into mptcp_incoming_options() should be quite
> doable. What about that option?
>
> Yep, we need an additional lock/atomic operation, but <quote> Premature
> optimization is the root of all evil </quote>
Sounds good to me.
>
>> Side note: I know we have to handle what compliant peers send to us, but
>> maybe we should keep sending only 64-bit acks!
>
> I think we already do that ?!?
Yes, that is what we currently do. My suggestion is that we continue
with only 64-bit acks in the future.
--
Mat Martineau
Intel
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [MPTCP] [RFC PATCH v2 2/8] mptcp: update per subflow unacked sequence on pkt reception
@ 2019-08-23 16:37 Paolo Abeni
0 siblings, 0 replies; 4+ messages in thread
From: Paolo Abeni @ 2019-08-23 16:37 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 1367 bytes --]
Hi,
Thank you for the feedback!
On Thu, 2019-08-22 at 17:34 -0700, Mat Martineau wrote:
> Rather than trying to sync subflow->snd_una values for the purpose of
> expanding them to 64 bits, the subflow could record the 32-bit value and
> let the MPTCP-level socket do the expansion? Then the subflows only need
> to track their most recent data ack: the ack value, size (32 or 64), and
> whether it's been updated since the last access by the MPTCP-level socket.
I think that tracking the most recent data ack can become almost as
tricky, if the peer interleaves 32 and 64 bits ack (IIRC the RFC, it
can).
> The MPTCP-level socket has more information to know whether to reject an
> ambiguous 32-bit ack. Latency between the update of subflow->snd_una and
> the check of that value by the MPTCP socket is still an issue, but could
> be more manageable.
Since expanding to 64 bits looks safer with msk->snd_una, and moving
the msk->snd_una updates into mptcp_incoming_options() should be quite
doable. What about that option?
Yep, we need an additional lock/atomic operation, but <quote> Premature
optimization is the root of all evil </quote>
> Side note: I know we have to handle what compliant peers send to us, but
> maybe we should keep sending only 64-bit acks!
I think we already do that ?!?
Cheers,
Paolo
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [MPTCP] [RFC PATCH v2 2/8] mptcp: update per subflow unacked sequence on pkt reception
@ 2019-08-23 0:34 Mat Martineau
0 siblings, 0 replies; 4+ messages in thread
From: Mat Martineau @ 2019-08-23 0:34 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 5596 bytes --]
On Thu, 22 Aug 2019, Paolo Abeni wrote:
> So that we keep per subflow unacked sequence number consistent;
> since we update only per socket data, no additional locking is
> required.
>
> Subflow and msk unacked seq are updated at different stages
> to keep the locking schema simple.
>
> Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
> ---
> rfc v1 -> rfc v2:
> - initialize subflow snd_una on subflow connect()/accept()
> ---
> net/mptcp/options.c | 31 +++++++++++++++++++++++++------
> net/mptcp/protocol.c | 2 ++
> net/mptcp/protocol.h | 1 +
> 3 files changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/net/mptcp/options.c b/net/mptcp/options.c
> index 0e7abcea5bae..e0895834ff82 100644
> --- a/net/mptcp/options.c
> +++ b/net/mptcp/options.c
> @@ -524,6 +524,21 @@ bool mptcp_synack_options(const struct request_sock *req, unsigned int *size,
> return false;
> }
>
> +static u64 expand_ack(u64 old_ack, u64 cur_ack, bool use_64bit)
> +{
> + u32 old_ack32, cur_ack32;
> +
> + if (use_64bit)
> + return cur_ack;
> +
> + old_ack32 = (u32)old_ack;
> + cur_ack32 = (u32)cur_ack;
> + cur_ack = (old_ack & GENMASK_ULL(63, 32)) + cur_ack32;
> + if (unlikely(before(cur_ack32, old_ack32)))
> + return cur_ack + (1LL<<32);
> + return cur_ack;
> +}
Expanding the 32-bit sequence numbers correctly and consistently has been
a concern of mine for a while, and trying to think through all the issues
in the context of ack processing has only made me realize it's even
trickier!
With the 32-bit data acks, we can encounter delayed acks on any subflow.
Even timely acks seen on one subflow might have large changes in value due
to the peer's transmit scheduling choices (acks might only show up on
other subflows for a while). The penalty for incorrectly accepting a data
ack is significant, since we would discard data from our MPTCP-level
retransmission buffer that the peer will know it doesn't have (leading to
an unrecoverable stall). Incorrectly discarding a data ack would be
recoverable and likely fix itself through additional acks but maybe lead
to an unnecessary reinjection by the peer.
I point this out in order to say that expanding 32-bit acks seems like a
hard problem, and by decoupling the subflow->snd_una values it becomes
even harder :) . I also understand that the possible locking problems from
trying to share snd_una across subflows would be difficult too.
Rather than trying to sync subflow->snd_una values for the purpose of
expanding them to 64 bits, the subflow could record the 32-bit value and
let the MPTCP-level socket do the expansion? Then the subflows only need
to track their most recent data ack: the ack value, size (32 or 64), and
whether it's been updated since the last access by the MPTCP-level socket.
The MPTCP-level socket has more information to know whether to reject an
ambiguous 32-bit ack. Latency between the update of subflow->snd_una and
the check of that value by the MPTCP socket is still an issue, but could
be more manageable.
Side note: I know we have to handle what compliant peers send to us, but
maybe we should keep sending only 64-bit acks!
Mat
> +
> void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
> struct tcp_options_received *opt_rx)
> {
> @@ -537,6 +552,16 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
> if (!mp_opt->dss)
> return;
>
> + /* we can't wait for recvmsg() to update the ack_seq, otherwise
> + * monodirectional flows will stuck
> + */
> + if (mp_opt->use_ack) {
> + struct subflow_context *subflow = subflow_ctx(sk);
> +
> + subflow->snd_una = expand_ack(subflow->snd_una,
> + mp_opt->data_ack, mp_opt->ack64);
> + }
> +
> mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
> if (!mpext)
> return;
> @@ -553,12 +578,6 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
> mpext->use_checksum = mp_opt->use_checksum;
> }
>
> - if (mp_opt->use_ack) {
> - mpext->data_ack = mp_opt->data_ack;
> - mpext->use_ack = 1;
> - mpext->ack64 = mp_opt->ack64;
> - }
> -
> mpext->data_fin = mp_opt->data_fin;
>
> if (msk)
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 5b7ddbe64e4b..65f7cc4712ef 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -674,6 +674,7 @@ static struct sock *mptcp_accept(struct sock *sk, int flags, int *err,
> subflow->map_subflow_seq = 1;
> subflow->rel_write_seq = 1;
> subflow->tcp_sock = new_sock;
> + subflow->snd_una = msk->write_seq;
> newsk = new_mptcp_sock;
> subflow->conn = new_mptcp_sock;
> list_add(&subflow->node, &msk->conn_list);
> @@ -809,6 +810,7 @@ void mptcp_finish_connect(struct sock *sk, int mp_capable)
> subflow->map_seq = ack_seq;
> subflow->map_subflow_seq = 1;
> subflow->rel_write_seq = 1;
> + subflow->snd_una = msk->write_seq;
>
> list_add(&subflow->node, &msk->conn_list);
> msk->subflow = NULL;
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 9f261ee31917..e37fd4bcd904 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -165,6 +165,7 @@ struct subflow_context {
> u32 token;
> u32 rel_write_seq;
> u64 idsn;
> + u64 snd_una;
> u64 map_seq;
> u32 map_subflow_seq;
> u32 ssn_offset;
> --
> 2.21.0
>
> _______________________________________________
> mptcp mailing list
> mptcp(a)lists.01.org
> https://lists.01.org/mailman/listinfo/mptcp
>
--
Mat Martineau
Intel
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2019-08-23 17:40 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-22 13:47 [MPTCP] [RFC PATCH v2 2/8] mptcp: update per subflow unacked sequence on pkt reception Paolo Abeni
2019-08-23 0:34 Mat Martineau
2019-08-23 16:37 Paolo Abeni
2019-08-23 17:40 Mat Martineau
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.