On Tue, 19 May 2020, Davide Caratti wrote:

> hello Mat, thanks for looking at this.
>
> On Mon, 2020-05-18 at 17:14 -0700, Mat Martineau wrote:
>> On Mon, 18 May 2020, Davide Caratti wrote:
>>
>>> current MPTCP resets the connection in case the DSS option is not received
>>> after a successful three-way handshake, while RFC8684 Â§3.7 suggests a
>>> fall-back to regular TCP when only the first subflow has been estabilshed.
>>> Introduce support for "infinite mapping" to allow continuing without DSS in
>>> these cases.
>>
>> If I'm understanding the intent of the changes, the idea in this patch is
>> to keep using the MPTCP segmentation & reassembly after fallback, but to
>> short-circuit most of it by (1) not writing the DSS option to the TCP
>> header on the tx side and (2) generating dummy mappings on the rx side.
>
> on a second thought, it would have been better to split this and the
> support for infinite maps into two separate patches.
>

That would help!

> Initially I cleared "is_mptcp" on reception of an infinite map, but then
> those "goto fallback" were creating so many troubles that the approach was
> flipped to always use msk, also for fallback sockets (unless the ones
> passing through subflow_ulp_fallback()).
>

Not depending on is_mptcp for fallback behavior is good. Subflow sockets 
with is_mptcp == false do lead to some confusing situations.

>
>> I'd also like to avoid overloading the term "infinite mapping", which has
>> specific meaning in the RFC: a DSS mapping with the data-level length set
>> to 0 that is sent under specific circumstances. If that specific DSS
>> mapping isn't present on the wire, maybe "dummy mapping" or "fake mapping"
>> is a useful term to use instead.
>
> good point. Looking at the specifications again, not only the kernel must
> be able to process an incoming infinite map; it must also emit an infinite
> map, tentatively, in case it detects a middlebox that filters/corrupts the
> DSS sub-option.
>
> [...]
>>> 	}
>>> @@ -639,8 +645,12 @@ static enum mapping_status get_mapping_status(struct sock *ssk)
>>>
>>> 	data_len = mpext->data_len;
>>> 	if (data_len == 0) {
>>> -		pr_err("Infinite mapping not handled");
>>> +		/* XXX TODO this should be allowed only if we didn't
>>> +		 * receive a DACK yet. Paolo suggests to use a dedicated
>>> +		 * flag in msk, but can't we use msk->can_ack? */
>>
>> The RFC has a lot of requirements for allowing an infinite mapping, and I
>> don't think they align perfectly with what msk->can_ack means. A dedicated
>> "allow_fallback" flag could be set to 'false' when fallback is no longer
>> allowed: a subflow is added, out-of-order MPTCP mappings are encountered,
>> etc.
>
> sure, and also talking with Paolo "can_ack" didn't really fit, as it means
> "can send ack because have both local and remote keys". I was for using
> subflow->conn_finished (and for removing all XXX stuff before posting to a
> ML :) ), but also this is not fitting well.

It has been a challenge to keep track of what each flag means and how they 
all interact. A fallback state machine is a possible way to approach the 
problem, but I'm not sure how well that works with our locking 
arrangement. Hard to narrow down what to recommend here, the tradeoffs are 
complicated!


>
>> The way I read the RFC (and I could be misreading), there are cases where
>> an infinite mapping can be used after DATA_ACK if other conditions are
>> met (like the checksum error section in 3.7).
>
> let's see if I can add a scenario for this (in addition to those at
> https://github.com/dcaratti/packetdrill/tree/fallback-after3whs )

Great, thank you.

>
>>> 		MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_INFINITEMAPRX);
>>> +		if (!subflow->local_id && !READ_ONCE(msk->pm.subflows))
>>> +			return MAPPING_INFINITE;
>>> 		return MAPPING_INVALID;
>>> 	}

Thanks for sharing the patch!

--
Mat Martineau
Intel