* Re: [MPTCP] protocol questions
@ 2019-09-25 19:06 Mat Martineau
  0 siblings, 0 replies; 7+ messages in thread
From: Mat Martineau @ 2019-09-25 19:06 UTC (permalink / raw)
  To: mptcp



On Tue, 24 Sep 2019, Christoph Paasch wrote:

> Hello,
>
>> On Sep 24, 2019, at 4:30 PM, Mat Martineau <mathew.j.martineau(a)linux.intel.com> wrote:
>>
>> On Tue, 24 Sep 2019, Matthieu Baerts wrote:
>>
>>> On 24/09/2019 15:13, Paolo Abeni wrote:
>>>> On Tue, 2019-09-24 at 13:57 +0200, Matthieu Baerts wrote:
>>>>> On 24/09/2019 09:03, Paolo Abeni wrote:
>>>
>>> [...]
>>>
>>>>>> * are out-of-order DSS allowed? I mean pkt 1 contains a DSS for pkt 2
>>>>>> and vice-versa? If I read correctly, the RFC does not explicitly forbid
>>>>>> them. Can we consider such a scenario evil (and eventually close the
>>>>>> subflow if we hit it)?
>>>>>
>>>>> I am not sure I understand how you could get into this situation where a
>>>>> packet contains a DSS for another one. Could you give more details about
>>>>> that?
>>>>
>>>> AFAICS our export branch can't produce that result, but theoretically
>>>> it's possible, I think - at least with something like packetdrill.
>>>> I'm trying to redesign the recvmsg() path as per the last meeting, and I
>>>> would like to understand the possible scenarios - and explicitly bail on
>>>> anything unsupported.
>>>
>>> I guess we can then say we don't want to support this case :)
>>
>> It's within the MPTCP spec for unmapped packets to arrive and get stored
>> for MPTCP-level reassembly when the DSS arrives on a later packet. But we
>> are not required to keep unmapped packets; it's optional to keep them for
>> some period of time. If the DSS shows up later, then we can rely on
>> MPTCP-level reinjection by the peer to fill in the data we thought was
>> unmapped and we discarded.
>>
>> It's also within the MPTCP spec for a DSS to arrive "early", while still
>> reassembling data mapped by an earlier DSS.
>
> Indeed - the spec is unfortunately a bit too permissive here. It allows
> mappings to arrive early, late, or mixed in any imaginable way.
>
> Implementation-wise, out-of-tree Linux MPTCP only supports a DSS mapping
> that covers the TCP sequence space of the packet it is sent on. If the
> mapping covers a sequence in the future or in the "past", then we reset
> the flow.
>
>> While we can discard packets when we don't have a mapping for them, I
>> think we should store "early" DSS mappings.
>
> Storing early mappings is kind of tricky because we need to limit the
> number of mappings we store (otherwise, an attacker could simply fill our
> memory with mappings :-))
>
>> What I don't know is if any existing MPTCP implementations send these
>> early mappings. If no one is sending them then it does seem kind of
>> pointless.
>
> In practice, I don't know of an implementation that sends early or late
> mappings. The only thing that happens is a missing mapping because of the
> scenario that Paolo described (i.e., tcp_fragment, ...). As long as the
> mapping is on one of the segments, it's fine.

Thanks Christoph. It sounds like handling all those corner cases is 
overkill.

>> Seems like these two things could happen on consecutive packets; I don't
>> remember a requirement that DSS mappings must be in-order. But I can't
>> think of a reason an MPTCP implementation would send them out-of-order.
>>
>>>>>> * what if we receive a different DSS before the old one is completed?
>>>>>
>>>>> Do you mean:
>>>>> - We receive the DSS A covering packets 1, 2, 3; then DSS B covering 4,
>>>>> 5, 6 but packet 3 is lost. (packets 1 → 6 are following each other when
>>>>> looking at the TCP seq num)
>>>>> - Or: we receive the DSS A covering packets 1, 2, 3, 7; then DSS B
>>>>> covering 4, 5, 6; packet 4 follows 3 regarding the TCP seq num, 7 will
>>>>> arrive later.
>>>>>
>>>>> I guess you are talking about the second one, right?
>>>>
>>>> I think the 2nd scenario is not possible ?!? DSS mappings should be
>>>> continuous inside the TCP sequence number space, right?
>>>
>>> I guess indeed it is not possible. Maybe in this case pkt 4 will be
>>> associated with DSS A.
>>
>> I agree that the second scenario isn't possible.
>>
>> What I do think is possible is this:
>>
>> Packet 1 (Data, and DSS A that maps packets 1,2,3)
>> Packet 2 (Data)
>> Packet 3 (Data, and DSS B that maps packets 4,5,6)
>> Packet 4 (Data)
>> Packet 5 (Data)
>> Packet 6 (Data)
>>
>> This is the "early DSS" situation I describe above.
>
> I think it is fine to kill the TCP-subflow for early mappings.

Ok.

> One interesting scenario (currently not supported by any known
> implementation) is the following though:
>
> Packet 1, Seq 1:1501 (Data, DSS A MPTCP-seq 1 maps TCP-sequence 1 -> 2001)
> Packet 2, Seq 1501:2001 (Data, DSS A MPTCP-seq 1 maps TCP-sequence 1 -> 2001)
> Packet 3, Seq 2001:2501 (Data, DSS B MPTCP-seq 2001 maps TCP-sequence 2001 -> 2501)
>
> Now, let's imagine Packets 2 & 3 are lost.
>
> Ideally, I would be able to retransmit a single packet:
>
> Packet 4 (retransmission), Seq 1501:2501 (Data, DSS C MPTCP-seq 1501 maps TCP-sequence 1501:2001)
>
> Packet 1 should be pushed up to the MPTCP-layer (even though the mapping
> is incomplete) and Packet 4 will also be pushed higher.
>
> If that is not supported, the retransmission needs to maintain the
> Packet 2 & Packet 3 mappings and thus actually retransmit 2 packets
> instead of 1.

Ah, right. I had been thinking in terms of the reassembled TCP-level 
stream that's passed up to MPTCP, but TCP-level retransmissions could lead 
to this kind of situation with the DSS. For our retransmission code, it 
seems safer and simpler to keep the same mappings when retransmitting 
(packets could have been delayed rather than lost?).
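
[Editor's sketch] That "keep the same mappings when retransmitting" policy can be sketched as follows. This is a toy model, not the kernel code; all type and function names (`DssMap`, `Segment`, `fragment`) are hypothetical. The point is that when a mapped segment is split for (re)transmission, the original mapping is copied onto every fragment, so no segment leaves the host unmapped:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class DssMap:
    data_seq: int   # MPTCP-level data sequence number
    tcp_seq: int    # first subflow (TCP) sequence number covered
    length: int     # number of bytes covered

@dataclass
class Segment:
    tcp_seq: int
    length: int
    dss: Optional[DssMap] = None

def fragment(seg: Segment, at: int) -> Tuple[Segment, Segment]:
    """Split 'seg' after 'at' bytes, copying the DSS mapping to both
    halves (unlike tcp_fragment()/tso_fragment(), which drop the skb
    extension on the newly allocated half)."""
    assert 0 < at < seg.length
    left = Segment(seg.tcp_seq, at, seg.dss)
    right = Segment(seg.tcp_seq + at, seg.length - at, seg.dss)
    return left, right

m = DssMap(data_seq=1, tcp_seq=1, length=2000)
left, right = fragment(Segment(tcp_seq=1, length=2000, dss=m), at=1000)
assert left.dss == right.dss == m   # both fragments stay mapped
assert right.tcp_seq == 1001        # TCP sequence space is preserved
```
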

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [MPTCP] protocol questions
@ 2019-09-25  0:05 Christoph Paasch
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Paasch @ 2019-09-25  0:05 UTC (permalink / raw)
  To: mptcp


Hello,

> On Sep 24, 2019, at 4:30 PM, Mat Martineau <mathew.j.martineau(a)linux.intel.com> wrote:
> 
> 
> On Tue, 24 Sep 2019, Matthieu Baerts wrote:
> 
>> On 24/09/2019 15:13, Paolo Abeni wrote:
>>> On Tue, 2019-09-24 at 13:57 +0200, Matthieu Baerts wrote:
>>>> On 24/09/2019 09:03, Paolo Abeni wrote:
>> 
>> [...]
>> 
>>>>> * are out-of-order DSS allowed? I mean pkt 1 contains a DSS for pkt 2
>>>>> and vice-versa? If I read correctly, the RFC does not explicitly forbid
>>>>> them. Can we consider such a scenario evil (and eventually close the
>>>>> subflow if we hit it)?
>>>> 
>>>> I am not sure I understand how you could get into this situation where a
>>>> packet contains a DSS for another one. Could you give more details about that?
>>> AFAICS our export branch can't produce that result, but theoretically
>>> it's possible, I think - at least with something like packetdrill.
>>> I'm trying to redesign the recvmsg() path as per the last meeting, and I
>>> would like to understand the possible scenarios - and explicitly bail on
>>> anything unsupported.
>> 
>> I guess we can then say we don't want to support this case :)
>> 
> 
> It's within the MPTCP spec for unmapped packets to arrive and get stored for MPTCP-level reassembly when the DSS arrives on a later packet. But we are not required to keep unmapped packets, it's optional to keep them for some period of time. If the DSS shows up later, then we can rely on MPTCP-level reinjection by the peer to fill in the data we thought was unmapped and we discarded.
> 
> It's also within the MPTCP spec for a DSS to arrive "early", while still reassembling data mapped by an earlier DSS.

Indeed - the spec is unfortunately a bit too permissive here. It allows mappings to arrive early, late, or mixed in any imaginable way.

Implementation-wise, out-of-tree Linux MPTCP only supports a DSS mapping that covers the TCP sequence space of the packet it is sent on. If the mapping covers a sequence in the future or in the "past", then we reset the flow.
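
[Editor's sketch] The out-of-tree policy just described reduces to an interval check. This is an illustrative helper (the name `mapping_covers_segment` is made up, not kernel API): accept a DSS mapping only if it covers the TCP sequence range of the segment it arrives on; a mapping for a future or past range would trigger a subflow reset.

```python
def mapping_covers_segment(map_tcp_seq: int, map_len: int,
                           seg_seq: int, seg_len: int) -> bool:
    """True iff [seg_seq, seg_seq + seg_len) lies entirely inside the
    TCP sequence range covered by the mapping."""
    return (map_tcp_seq <= seg_seq
            and seg_seq + seg_len <= map_tcp_seq + map_len)

# Mapping covers TCP sequence 1..2001; a segment 1:1501 is accepted.
assert mapping_covers_segment(1, 2000, 1, 1500)
# An "early" mapping for future sequence space would reset the flow.
assert not mapping_covers_segment(4001, 1500, 1, 1500)
```
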

> While we can discard packets when we don't have a mapping for them, I think we should store "early" DSS mappings.

Storing early mappings is kind of tricky because we need to limit the number of mappings we store (otherwise, an attacker could simply fill our memory with mappings :-))

> What I don't know is if any existing MPTCP implementations send these early mappings. If no one is sending them then it does seem kind of pointless.

In practice, I don't know of an implementation that sends early or late mappings. The only thing that happens is a missing mapping because of the scenario that Paolo described (i.e., tcp_fragment, ...). As long as the mapping is on one of the segments, it's fine.

> Seems like these two things could happen on consecutive packets; I don't remember a requirement that DSS mappings must be in-order. But I can't think of a reason an MPTCP implementation would send them out-of-order.
> 
> 
>>>>> * what if we receive a different DSS before the old one is completed?
>>>> 
>>>> Do you mean:
>>>> - We receive the DSS A covering packets 1, 2, 3; then DSS B covering 4,
>>>> 5, 6 but packet 3 is lost. (packets 1 → 6 are following each other when
>>>> looking at the TCP seq num)
>>>> - Or: we receive the DSS A covering packets 1, 2, 3, 7; then DSS B
>>>> covering 4, 5, 6; packet 4 follows 3 regarding the TCP seq num, 7 will
>>>> arrive later.
>>>> 
>>>> I guess you are talking about the second one, right?
>>> I think the 2nd scenario is not possible ?!? DSS mappings should be
>>> continuous inside the TCP sequence number space, right?
>> 
>> I guess indeed it is not possible. Maybe in this case pkt 4 will be
>> associated with DSS A.
> 
> I agree that the second scenario isn't possible.
> 
> What I do think is possible is this:
> 
> Packet 1 (Data, and DSS A that maps packets 1,2,3)
> Packet 2 (Data)
> Packet 3 (Data, and DSS B that maps packets 4,5,6)
> Packet 4 (Data)
> Packet 5 (Data)
> Packet 6 (Data)
> 
> This is the "early DSS" situation I describe above.

I think it is fine to kill the TCP-subflow for early mappings.


One interesting scenario (currently not supported by any known implementation) is the following though:

Packet 1, Seq 1:1501 (Data, DSS A MPTCP-seq 1 maps TCP-sequence 1 -> 2001)
Packet 2, Seq 1501:2001 (Data, DSS A MPTCP-seq 1 maps TCP-sequence 1 -> 2001)
Packet 3, Seq 2001:2501 (Data, DSS B MPTCP-seq 2001 maps TCP-sequence 2001 -> 2501)

> Now, let's imagine Packets 2 & 3 are lost.

Ideally, I would be able to retransmit a single packet:

Packet 4 (retransmission), Seq 1501:2501 (Data, DSS C MPTCP-seq 1501 maps TCP-sequence 1501:2001)

Packet 1 should be pushed up to the MPTCP-layer (even though the mapping is incomplete) and Packet 4 will also be pushed higher. 

If that is not supported, the retransmission needs to maintain the Packet 2 & Packet 3 mappings and thus actually retransmit 2 packets instead of 1.
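
[Editor's sketch] The sequence arithmetic of Christoph's scenario can be checked with a small worked example (illustrative only; intervals are written half-open as [start, end)). The retransmitted segment 1501:2501 spans two of the original mappings, so a receiver that insists on one covering mapping per segment must see both A and B again, i.e. two retransmitted packets:

```python
A = (1, 2001)              # DSS A covers TCP sequence [1, 2001)
B = (2001, 2501)           # DSS B covers TCP sequence [2001, 2501)
retransmit = (1501, 2501)  # Packet 4, the single-packet retransmission

def within(rng, mapping):
    """True iff the TCP range 'rng' lies inside 'mapping'."""
    return mapping[0] <= rng[0] and rng[1] <= mapping[1]

# No single original mapping covers the retransmitted segment:
assert not within(retransmit, A) and not within(retransmit, B)
# Retransmitting packets 2 and 3 with their original mappings does work:
assert within((1501, 2001), A) and within((2001, 2501), B)
```
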



Christoph


> 
> 
>> 
>>> I was wondering something like this:
>>> pkt 1 carries DSS A which covers also pkt 2.
>>> pkt 2 carries DSS B (with B != A, for some field at least).
>>> Can we accept pkt2/DSS B? The current recvmsg() 'export branch' code
>>> just emits a warning in such scenario. Reading the RFC I think it
>>> allows that if the only difference between DSS B and DSS A is
>>> B.data_len > A.data_len
>> 
>> To which section are you referring? :)
>> 
>> I think that it is an acceptable situation. Not sure we have to support it.
>> 
>>>>> If I read the RFC correctly, that should be allowed only if the new DSS
>>>>> doesn't change the existing mapping - e.g. the DSS length grows and
>>>>> anything else is unchanged. How should we handle the other scenarios?
>>>>> ignore the DSS? close the subflow? as MPTCP processing happens after
>>>>> TCP validation, I suppose TCP reset should fit here.
>>>> 
>>>> Can we not fallback? (and send an MP_FAIL)
>>>> 
>>>> From my point of view, I think the only "tricky" case we should support
>>>> is to have the same DSS for a few packets (due to TSO). I don't think we
>>>> should support the other corner cases you mentioned for now.
>>> Easier than what I feared, then ;)
>> 
>> The other corner cases are there to support multiple kinds of
>> middleboxes. I don't know if this is common on the Internet. Maybe
>> Christoph can help us with that :)
>> 
>>>> I think the case where there is no DSS in some packets (but only a
>>>> DATA-ACK?) can happen if the sender is doing some optimisation to reduce
>>>> the size of MPTCP options. For the rest, if it happens, it is more due
>>>> to some middleboxes and we can fallback to TCP.
>>>> 
>>>> Regarding the case where the DSS is "missing" in some packets, my
>>>> colleague Benjamin pointed me to this text from the RFC:
>>>> 
>>>>   A sender MUST include a DSS option with data sequence mapping in
>>>>   every segment *until* one of the sent segments has been acknowledged
>>>>   with a DSS option containing a Data ACK.  Upon reception of the
>>>>   acknowledgment, the sender has the confirmation that the DSS option
>>>>   passes in both directions and *may* choose to send fewer DSS options
>>>>   than once per segment.
>>> Note that the 'export branch' can (and actually does) send data pkts
>>> without DSS due to the internal TCP machinery and without taking into
>>> account any MPTCP-related logic.
>>> e.g. we enqueue a pkt with a proper DSS mapping, but later it's split up
>>> by tso_fragment() and/or tcp_fragment() on retransmit or even normal
>>> tcp_push(). Neither tcp_fragment() nor tso_fragment() copies the skb ext
>>> to the newly allocated skb header.
>>> Such a packet will be xmitted without a mapping.
>> 
>> Thank you for this note!
>> 
>> I think that in the out-of-tree kernel, the same header is copied in all
>> packets to allow the receiver (or rather the NIC) to aggregate packets
>> (GRO). But that's clearly an optimisation!
>> 
> 
> I have been assuming people would want to omit the mapping where possible to reduce the header overhead, but hadn't thought of the aggregation benefits of identical headers. It depends on whether someone wants to favor wire speed or CPU load, I guess. I added a note in the wiki for future work.
> 
> --
> Mat Martineau
> Intel



* Re: [MPTCP] protocol questions
@ 2019-09-24 23:30 Mat Martineau
  0 siblings, 0 replies; 7+ messages in thread
From: Mat Martineau @ 2019-09-24 23:30 UTC (permalink / raw)
  To: mptcp



On Tue, 24 Sep 2019, Matthieu Baerts wrote:

> On 24/09/2019 15:13, Paolo Abeni wrote:
>> On Tue, 2019-09-24 at 13:57 +0200, Matthieu Baerts wrote:
>>> On 24/09/2019 09:03, Paolo Abeni wrote:
>
> [...]
>
>>>> * are out-of-order DSS allowed? I mean pkt 1 contains a DSS for pkt 2
>>>> and vice-versa? If I read correctly, the RFC does not explicitly forbid
>>>> them. Can we consider such a scenario evil (and eventually close the
>>>> subflow if we hit it)?
>>>
>>> I am not sure I understand how you could get into this situation where a
>>> packet contains a DSS for another one. Could you give more details about that?
>> 
>> AFAICS our export branch can't produce that result, but theoretically
>> it's possible, I think - at least with something like packetdrill.
>> 
>> I'm trying to redesign the recvmsg() path as per the last meeting, and I
>> would like to understand the possible scenarios - and explicitly bail on
>> anything unsupported.
>
> I guess we can then say we don't want to support this case :)
>

It's within the MPTCP spec for unmapped packets to arrive and get stored 
for MPTCP-level reassembly when the DSS arrives on a later packet. But we 
are not required to keep unmapped packets, it's optional to keep them for 
some period of time. If the DSS shows up later, then we can rely on 
MPTCP-level reinjection by the peer to fill in the data we thought was 
unmapped and we discarded.

It's also within the MPTCP spec for a DSS to arrive "early", while still 
reassembling data mapped by an earlier DSS.

While we can discard packets when we don't have a mapping for them, I 
think we should store "early" DSS mappings. What I don't know is if any 
existing MPTCP implementations send these early mappings. If no one is 
sending them then it does seem kind of pointless.

Seems like these two things could happen on consecutive packets; I don't 
remember a requirement that DSS mappings must be in-order. But I can't 
think of a reason an MPTCP implementation would send them out-of-order.


>>>> * what if we receive a different DSS before the old one is completed?
>>>
>>> Do you mean:
>>> - We receive the DSS A covering packets 1, 2, 3; then DSS B covering 4,
>>> 5, 6 but packet 3 is lost. (packets 1 → 6 are following each other when
>>> looking at the TCP seq num)
>>> - Or: we receive the DSS A covering packets 1, 2, 3, 7; then DSS B
>>> covering 4, 5, 6; packet 4 follows 3 regarding the TCP seq num, 7 will
>>> arrive later.
>>>
>>> I guess you are talking about the second one, right?
>> 
>> I think the 2nd scenario is not possible ?!? DSS mappings should be
>> continuous inside the TCP sequence number space, right?
>
> I guess indeed it is not possible. Maybe in this case pkt 4 will be
> associated with DSS A.

I agree that the second scenario isn't possible.

What I do think is possible is this:

Packet 1 (Data, and DSS A that maps packets 1,2,3)
Packet 2 (Data)
Packet 3 (Data, and DSS B that maps packets 4,5,6)
Packet 4 (Data)
Packet 5 (Data)
Packet 6 (Data)

This is the "early DSS" situation I describe above.
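
[Editor's sketch] One way to picture the "early DSS" case is as a classification against the subflow's currently active mapping. This is a toy model with made-up names (`classify_dss` is not a real helper), using packet numbers as a stand-in for TCP sequence numbers:

```python
def classify_dss(active_end: int, dss_tcp_seq: int, seg_seq: int) -> str:
    """Classify an incoming DSS relative to the active mapping.

    active_end: first TCP seq NOT covered by the mapping being reassembled.
    dss_tcp_seq: first TCP seq covered by the newly arrived DSS.
    seg_seq: TCP seq of the segment carrying the new DSS.
    """
    if dss_tcp_seq <= seg_seq:
        return "current"   # maps the segment it rides on
    if dss_tcp_seq >= active_end:
        return "early"     # maps only data beyond the active mapping
    return "invalid"       # overlaps the mapping still being reassembled

# DSS A arrives on packet 1 and maps packets 1-3: a normal mapping.
assert classify_dss(active_end=4, dss_tcp_seq=1, seg_seq=1) == "current"
# DSS B arrives on packet 3 but maps packets 4-6, while DSS A is still
# being reassembled: an early mapping.
assert classify_dss(active_end=4, dss_tcp_seq=4, seg_seq=3) == "early"
```
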


>
>> I was wondering something like this:
>> 
>> pkt 1 carries DSS A which covers also pkt 2.
>> pkt 2 carries DSS B (with B != A, for some field at least).
>> 
>> Can we accept pkt2/DSS B? The current recvmsg() 'export branch' code
>> just emits a warning in such scenario. Reading the RFC I think it
>> allows that if the only difference between DSS B and DSS A is
>> B.data_len > A.data_len
>
> To which section are you referring? :)
>
> I think that it is an acceptable situation. Not sure we have to support it.
>
>>>> If I read the RFC correctly, that should be allowed only if the new DSS
>>>> doesn't change the existing mapping - e.g. the DSS length grows and
>>>> anything else is unchanged. How should we handle the other scenarios?
>>>> ignore the DSS? close the subflow? as MPTCP processing happens after
>>>> TCP validation, I suppose TCP reset should fit here.
>>>
>>> Can we not fallback? (and send an MP_FAIL)
>>>
>>> From my point of view, I think the only "tricky" case we should support
>>> is to have the same DSS for a few packets (due to TSO). I don't think we
>>> should support the other corner cases you mentioned for now.
>> 
>> Easier than what I feared, then ;)
>
> The other corner cases are there to support multiple kinds of
> middleboxes. I don't know if this is common on the Internet. Maybe
> Christoph can help us with that :)
>
>>> I think the case where there is no DSS in some packets (but only a
>>> DATA-ACK?) can happen if the sender is doing some optimisation to reduce
>>> the size of MPTCP options. For the rest, if it happens, it is more due
>>> to some middleboxes and we can fallback to TCP.
>>>
>>> Regarding the case where the DSS is "missing" in some packets, my
>>> colleague Benjamin pointed me to this text from the RFC:
>>>
>>>    A sender MUST include a DSS option with data sequence mapping in
>>>    every segment *until* one of the sent segments has been acknowledged
>>>    with a DSS option containing a Data ACK.  Upon reception of the
>>>    acknowledgment, the sender has the confirmation that the DSS option
>>>    passes in both directions and *may* choose to send fewer DSS options
>>>    than once per segment.
>> 
>> Note that the 'export branch' can (and actually does) send data pkts
>> without DSS due to the internal TCP machinery and without taking into
>> account any MPTCP-related logic.
>> 
>> e.g. we enqueue a pkt with a proper DSS mapping, but later it's split up
>> by tso_fragment() and/or tcp_fragment() on retransmit or even normal
>> tcp_push(). 
>> 
>> Neither tcp_fragment() nor tso_fragment() copies the skb ext to the newly
>> allocated skb header.
>> 
>> Such a packet will be xmitted without a mapping.
>
> Thank you for this note!
>
> I think that in the out-of-tree kernel, the same header is copied in all
> packets to allow the receiver (or rather the NIC) to aggregate packets
> (GRO). But that's clearly an optimisation!
>

I have been assuming people would want to omit the mapping where possible 
to reduce the header overhead, but hadn't thought of the aggregation 
benefits of identical headers. It depends on whether someone wants to 
favor wire speed or CPU load, I guess. I added a note in the wiki for 
future work.

--
Mat Martineau
Intel


* Re: [MPTCP] protocol questions
@ 2019-09-24 14:56 Matthieu Baerts
  0 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2019-09-24 14:56 UTC (permalink / raw)
  To: mptcp


On 24/09/2019 15:13, Paolo Abeni wrote:
> On Tue, 2019-09-24 at 13:57 +0200, Matthieu Baerts wrote:
>> On 24/09/2019 09:03, Paolo Abeni wrote:

[...]

>>> * are out-of-order DSS allowed? I mean pkt 1 contains a DSS for pkt 2
>>> and vice-versa? If I read correctly, the RFC does not explicitly forbid
>>> them. Can we consider such a scenario evil (and eventually close the
>>> subflow if we hit it)?
>>
>> I am not sure I understand how you could get into this situation where a
>> packet contains a DSS for another one. Could you give more details about that?
> 
> AFAICS our export branch can't produce that result, but theoretically
> it's possible, I think - at least with something like packetdrill.
> 
> I'm trying to redesign the recvmsg() path as per the last meeting, and I
> would like to understand the possible scenarios - and explicitly bail on
> anything unsupported.

I guess we can then say we don't want to support this case :)

>>> * what if we receive a different DSS before the old one is completed?
>>
>> Do you mean:
>> - We receive the DSS A covering packets 1, 2, 3; then DSS B covering 4,
>> 5, 6 but packet 3 is lost. (packets 1 → 6 are following each other when
>> looking at the TCP seq num)
>> - Or: we receive the DSS A covering packets 1, 2, 3, 7; then DSS B
>> covering 4, 5, 6; packet 4 follows 3 regarding the TCP seq num, 7 will
>> arrive later.
>>
>> I guess you are talking about the second one, right?
> 
> I think the 2nd scenario is not possible ?!? DSS mappings should be
> continuous inside the TCP sequence number space, right?

I guess indeed it is not possible. Maybe in this case pkt 4 will be
associated with DSS A.

> I was wondering something like this:
> 
> pkt 1 carries DSS A which covers also pkt 2.
> pkt 2 carries DSS B (with B != A, for some field at least).
> 
> Can we accept pkt2/DSS B? The current recvmsg() 'export branch' code
> just emits a warning in such scenario. Reading the RFC I think it
> allows that if the only difference between DSS B and DSS A is
> B.data_len > A.data_len

To which section are you referring? :)

I think that it is an acceptable situation. Not sure we have to support it.

>>> If I read the RFC correctly, that should be allowed only if the new DSS
>>> doesn't change the existing mapping - e.g. the DSS length grows and
>>> anything else is unchanged. How should we handle the other scenarios?
>>> ignore the DSS? close the subflow? as MPTCP processing happens after
>>> TCP validation, I suppose TCP reset should fit here.
>>
>> Can we not fallback? (and send an MP_FAIL)
>>
>> From my point of view, I think the only "tricky" case we should support
>> is to have the same DSS for a few packets (due to TSO). I don't think we
>> should support the other corner cases you mentioned for now.
> 
> Easier than what I feared, then ;)

The other corner cases are there to support multiple kinds of
middleboxes. I don't know if this is common on the Internet. Maybe
Christoph can help us with that :)

>> I think the case where there is no DSS in some packets (but only a
>> DATA-ACK?) can happen if the sender is doing some optimisation to reduce
>> the size of MPTCP options. For the rest, if it happens, it is more due
>> to some middleboxes and we can fallback to TCP.
>>
>> Regarding the case where the DSS is "missing" in some packets, my
>> colleague Benjamin pointed me to this text from the RFC:
>>
>>    A sender MUST include a DSS option with data sequence mapping in
>>    every segment *until* one of the sent segments has been acknowledged
>>    with a DSS option containing a Data ACK.  Upon reception of the
>>    acknowledgment, the sender has the confirmation that the DSS option
>>    passes in both directions and *may* choose to send fewer DSS options
>>    than once per segment.
> 
> Note that the 'export branch' can (and actually does) send data pkts
> without DSS due to the internal TCP machinery and without taking into
> account any MPTCP-related logic.
> 
> e.g. we enqueue a pkt with a proper DSS mapping, but later it's split up
> by tso_fragment() and/or tcp_fragment() on retransmit or even normal
> tcp_push(). 
> 
> Neither tcp_fragment() nor tso_fragment() copies the skb ext to the newly
> allocated skb header.
> 
> Such a packet will be xmitted without a mapping.

Thank you for this note!

I think that in the out-of-tree kernel, the same header is copied in all
packets to allow the receiver (or rather the NIC) to aggregate packets
(GRO). But that's clearly an optimisation!

Cheers,
Matt
-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium


* Re: [MPTCP] protocol questions
@ 2019-09-24 13:13 Paolo Abeni
  0 siblings, 0 replies; 7+ messages in thread
From: Paolo Abeni @ 2019-09-24 13:13 UTC (permalink / raw)
  To: mptcp


On Tue, 2019-09-24 at 13:57 +0200, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 24/09/2019 09:03, Paolo Abeni wrote:
> > there are a few DSS related corner cases I don't understand well:
> > 
> > * is support for 'delayed DSS' optional? e.g. pkt 1 with no DSS, pkt 2
> > with DSS for pkt 1 and 2. Can we ignore/consider such a scenario evil
> > (and eventually close the subflow if we hit it)?
> 
> It is possible to have this scenario but we don't have to support it
> now. I think the out-of-tree kernel supports[1] it but I am sure some
> smaller stacks don't support it.
> 
> [1]
> https://github.com/multipath-tcp/mptcp/blob/mptcp_trunk/net/mptcp/mptcp_input.c#L642

good! :)

> > * are out-of-order DSS allowed? I mean pkt 1 contains a DSS for pkt 2
> > and vice-versa? If I read correctly, the RFC does not explicitly forbid
> > them. Can we consider such a scenario evil (and eventually close the
> > subflow if we hit it)?
> 
> I am not sure I understand how you could get into this situation where a
> packet contains a DSS for another one. Could you give more details about that?

AFAICS our export branch can't produce that result, but theoretically
it's possible, I think - at least with something like packetdrill.

I'm trying to redesign the recvmsg() path as per the last meeting, and I
would like to understand the possible scenarios - and explicitly bail on
anything unsupported.

> > * what if we receive a different DSS before the old one is completed?
> 
> Do you mean:
> - We receive the DSS A covering packets 1, 2, 3; then DSS B covering 4,
> 5, 6 but packet 3 is lost. (packets 1 → 6 are following each other when
> looking at the TCP seq num)
> - Or: we receive the DSS A covering packets 1, 2, 3, 7; then DSS B
> covering 4, 5, 6; packet 4 follows 3 regarding the TCP seq num, 7 will
> arrive later.
> 
> I guess you are talking about the second one, right?

I think the 2nd scenario is not possible ?!? DSS mappings should be
continuous inside the TCP sequence number space, right?

I was wondering something like this:

pkt 1 carries DSS A which covers also pkt 2.
pkt 2 carries DSS B (with B != A, for some field at least).

Can we accept pkt2/DSS B? The current recvmsg() 'export branch' code
just emits a warning in such scenario. Reading the RFC I think it
allows that if the only difference between DSS B and DSS A is
B.data_len > A.data_len
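
[Editor's sketch] That reading of the RFC can be expressed as a small predicate. This is an illustrative model, not an implementation (the `Dss` fields shown are a simplified subset of the real option): a replacement mapping B is acceptable only if it matches the pending mapping A in every field and merely grows the length.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Dss:
    data_seq: int     # MPTCP-level data sequence number
    subflow_seq: int  # relative subflow (TCP) sequence number
    data_len: int     # length of the mapping

def acceptable_update(a: Dss, b: Dss) -> bool:
    """True iff B differs from A only by a strictly larger data_len."""
    return b == replace(a, data_len=b.data_len) and b.data_len > a.data_len

a = Dss(data_seq=1, subflow_seq=1, data_len=1000)
assert acceptable_update(a, replace(a, data_len=1500))   # grows: ok
assert not acceptable_update(a, replace(a, data_len=500))  # shrinks: reject
assert not acceptable_update(a, Dss(2, 1, 1500))         # other field changed
```
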

> > If I read the RFC correctly, that should be allowed only if the new DSS
> > doesn't change the existing mapping - e.g. the DSS length grows and
> > anything else is unchanged. How should we handle the other scenarios?
> > ignore the DSS? close the subflow? as MPTCP processing happens after
> > TCP validation, I suppose TCP reset should fit here.
> 
> Can we not fallback? (and send an MP_FAIL)
> 
> From my point of view, I think the only "tricky" case we should support
> is to have the same DSS for a few packets (due to TSO). I don't think we
> should support the other corner cases you mentioned for now.

Easier than what I feared, then ;)

> I think the case where there is no DSS in some packets (but only a
> DATA-ACK?) can happen if the sender is doing some optimisation to reduce
> the size of MPTCP options. For the rest, if it happens, it is more due
> to some middleboxes and we can fallback to TCP.
> 
> Regarding the case where the DSS is "missing" in some packets, my
> colleague Benjamin pointed me to this text from the RFC:
> 
>    A sender MUST include a DSS option with data sequence mapping in
>    every segment *until* one of the sent segments has been acknowledged
>    with a DSS option containing a Data ACK.  Upon reception of the
>    acknowledgment, the sender has the confirmation that the DSS option
>    passes in both directions and *may* choose to send fewer DSS options
>    than once per segment.

Note that the 'export branch' can (and actually does) send data pkts
without DSS due to the internal TCP machinery and without taking into
account any MPTCP-related logic.

e.g. we enqueue a pkt with a proper DSS mapping, but later it's split up
by tso_fragment() and/or tcp_fragment() on retransmit or even normal
tcp_push(). 

Neither tcp_fragment() nor tso_fragment() copies the skb ext to the newly
allocated skb header.

Such a packet will be xmitted without a mapping.
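
[Editor's sketch] On the receive side, such an unmapped segment is still usable as long as a mapping received on an earlier segment covers its TCP sequence range. A toy illustration (the helper name `find_mapping` is made up), mirroring the tso_fragment()/tcp_fragment() case:

```python
def find_mapping(mappings, seg_seq, seg_len):
    """Return the stored (tcp_seq, length) mapping covering the segment,
    or None if the segment is genuinely unmapped."""
    for tcp_seq, length in mappings:
        if tcp_seq <= seg_seq and seg_seq + seg_len <= tcp_seq + length:
            return (tcp_seq, length)
    return None

stored = [(1, 2000)]   # DSS received on an earlier segment covers 1..2001

# A split-off fragment that arrived without its own DSS is still covered:
assert find_mapping(stored, 1001, 500) == (1, 2000)
# A segment outside every stored mapping is genuinely unmapped:
assert find_mapping(stored, 2501, 500) is None
```
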

Thank you,

Paolo



* Re: [MPTCP] protocol questions
@ 2019-09-24 11:57 Matthieu Baerts
  0 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2019-09-24 11:57 UTC (permalink / raw)
  To: mptcp


Hi Paolo,

On 24/09/2019 09:03, Paolo Abeni wrote:
> there are a few DSS related corner cases I don't understand well:
> 
> * is support for 'delayed DSS' optional? e.g. pkt 1 with no DSS, pkt 2
> with DSS for pkt 1 and 2. Can we ignore/consider such a scenario evil
> (and eventually close the subflow if we hit it)?

It is possible to have this scenario but we don't have to support it
now. I think the out-of-tree kernel supports[1] it but I am sure some
smaller stacks don't support it.

[1]
https://github.com/multipath-tcp/mptcp/blob/mptcp_trunk/net/mptcp/mptcp_input.c#L642

> * are out-of-order DSS allowed? I mean pkt 1 contains a DSS for pkt 2
> and vice-versa? If I read correctly, the RFC does not explicitly forbid
> them. Can we consider such a scenario evil (and eventually close the
> subflow if we hit it)?

I am not sure I understand how you could get into this situation where
a packet contains a DSS for another one. Could you give more details
about that?

> * what if we receive a different DSS before the old one is completed?

Do you mean:
- We receive DSS A covering packets 1, 2, 3; then DSS B covering 4,
5, 6, but packet 3 is lost. (Packets 1 → 6 follow each other when
looking at the TCP seq num.)
- Or: we receive DSS A covering packets 1, 2, 3, 7; then DSS B
covering 4, 5, 6; packet 4 follows 3 regarding the TCP seq num, and 7
will arrive later.

I guess you are talking about the second one, right?

> If I read the RFC correctly, that should be allowed only if the new DSS
> doesn't change the existing mapping - e.g. the DSS length grows and
> everything else is unchanged. How should we handle the other scenarios?
> Ignore the DSS? Close the subflow? Since MPTCP processing happens after
> TCP validation, I suppose a TCP reset should fit here.

Can we not fall back (and send an MP_FAIL)?

From my point of view, I think the only "tricky" case we should support
is having the same DSS for a few packets (due to TSO). I don't think we
should support the other corner cases you mentioned for now.
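The TSO case just mentioned could look like this on the receive side - a hedged sketch, assuming a receiver that remembers the current mapping as (map_ssn, map_len) in relative TCP sequence space; the names are illustrative, not any stack's actual API:

```c
#include <assert.h>
#include <stdbool.h>

/* With TSO, several TCP segments may share one DSS mapping, so a
 * segment carrying no DSS of its own is acceptable when its TCP range
 * lies entirely inside the current mapping [map_ssn, map_ssn+map_len). */
bool segment_covered(unsigned int map_ssn, unsigned int map_len,
		     unsigned int seq, unsigned int len)
{
	return seq >= map_ssn && seq + len <= map_ssn + map_len;
}
```

A segment falling outside (or straddling) the mapped range would be one of the corner cases discussed above, where closing the subflow or falling back seems reasonable.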

I think the case where there is no DSS in some packets (but only a
DATA-ACK?) can happen if the sender is doing some optimisation to reduce
the size of MPTCP options. For the rest, if it happens, it is more due
to some middleboxes, and we can fall back to TCP.

Regarding the case where the DSS is "missing" in some packets, my
colleague Benjamin pointed me to this text from the RFC:

   A sender MUST include a DSS option with data sequence mapping in
   every segment *until* one of the sent segments has been acknowledged
   with a DSS option containing a Data ACK.  Upon reception of the
   acknowledgment, the sender has the confirmation that the DSS option
   passes in both directions and *may* choose to send fewer DSS options
   than once per segment.

And the link with the code (fully_established):
-
https://github.com/multipath-tcp/mptcp/blob/mptcp_trunk/net/mptcp/mptcp_input.c#L582
-
https://github.com/multipath-tcp/mptcp/blob/mptcp_trunk/net/mptcp/mptcp_input.c#L1430
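The rule quoted above reduces to a per-subflow flag, similar in spirit to the fully_established state the links point at - a sketch with made-up names, not the actual mptcp_input.c code:

```c
#include <assert.h>
#include <stdbool.h>

/* Per the RFC text: every segment must carry a DSS mapping until a
 * Data ACK confirms the option passes in both directions. */
struct subflow_state {
	bool fully_established; /* set once a DSS Data ACK came back */
};

/* Called when a DSS option containing a Data ACK is received. */
void on_data_ack(struct subflow_state *sf)
{
	sf->fully_established = true;
}

/* May the sender now omit the DSS mapping on outgoing segments? */
bool may_omit_dss(const struct subflow_state *sf)
{
	return sf->fully_established;
}
```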

I am sure Christoph will soon share some useful notes about all of this :-)

Cheers,
Matt
-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [MPTCP] protocol questions
@ 2019-09-24  7:03 Paolo Abeni
  0 siblings, 0 replies; 7+ messages in thread
From: Paolo Abeni @ 2019-09-24  7:03 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 912 bytes --]

Hi,

there are a few DSS-related corner cases I don't understand well:

* is support for 'delayed DSS' optional? e.g. pkt 1 with no DSS, pkt 2
with a DSS for pkts 1 and 2. Can we ignore such a scenario or consider
it evil (and eventually close the subflow if we hit it)?

* are out-of-order DSS allowed? I mean pkt 1 contains a DSS for pkt 2
and vice-versa? If I read correctly, the RFC does not explicitly forbid
them. Can we consider such a scenario evil (and eventually close the
subflow if we hit it)?

* what if we receive a different DSS before the old one is completed?
If I read the RFC correctly, that should be allowed only if the new DSS
doesn't change the existing mapping - e.g. the DSS length grows and
everything else is unchanged. How should we handle the other scenarios?
Ignore the DSS? Close the subflow? Since MPTCP processing happens after
TCP validation, I suppose a TCP reset should fit here.
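The "only if the new DSS doesn't change the existing mapping" reading could be written as a simple check - a sketch under the assumption that a mapping is the triple (data seq, subflow seq, length); all names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>

/* A new DSS received while the old mapping is still incomplete is
 * acceptable only if it keeps the same data/subflow sequence base and
 * only the length grows (or stays equal); anything else rewrites an
 * established mapping and would warrant closing the subflow. */
bool dss_is_extension(unsigned long long old_dsn, unsigned int old_ssn,
		      unsigned int old_len,
		      unsigned long long new_dsn, unsigned int new_ssn,
		      unsigned int new_len)
{
	return new_dsn == old_dsn && new_ssn == old_ssn && new_len >= old_len;
}
```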

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-09-25 19:06 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-25 19:06 [MPTCP] protocol questions Mat Martineau
  -- strict thread matches above, loose matches on Subject: below --
2019-09-25  0:05 Christoph Paasch
2019-09-24 23:30 Mat Martineau
2019-09-24 14:56 Matthieu Baerts
2019-09-24 13:13 Paolo Abeni
2019-09-24 11:57 Matthieu Baerts
2019-09-24  7:03 Paolo Abeni
