* problems with MP_FAIL timer
@ 2022-05-11 11:28 Paolo Abeni
2022-05-11 14:16 ` Geliang Tang
2022-05-11 15:32 ` Mat Martineau
0 siblings, 2 replies; 5+ messages in thread
From: Paolo Abeni @ 2022-05-11 11:28 UTC (permalink / raw)
To: mptcp, Geliang Tang
Hello,
re-reviewing again the commit:
mptcp: reset subflow when MP_FAIL doesn't respond
There are a few issue with sk_timer handling, specifically:
- no additional mptcp_data_lock is needed to stop/reset it
- I think overlapping infinite mapping reception and close may lead to
some inconsistent timer usage
More importantly I could not find the RFC reference for such
timer?!? @Geliang, @Mat could you please point that out?!?
Thanks,
Paolo
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: problems with MP_FAIL timer
2022-05-11 11:28 problems with MP_FAIL timer Paolo Abeni
@ 2022-05-11 14:16 ` Geliang Tang
2022-05-11 15:32 ` Mat Martineau
1 sibling, 0 replies; 5+ messages in thread
From: Geliang Tang @ 2022-05-11 14:16 UTC (permalink / raw)
To: Paolo Abeni; +Cc: MPTCP Upstream, Geliang Tang
Hi Paolo,
Paolo Abeni <pabeni@redhat.com> 于2022年5月11日周三 19:28写道:
>
> Hello,
>
> re-reviewing again the commit:
>
> mptcp: reset subflow when MP_FAIL doesn't respond
>
> There are a few issue with sk_timer handling, specifically:
> - no additional mptcp_data_lock is needed to stop/reset it
> - I think overlapping infinite mapping reception and close may lead to
> some inconsistent timer usage
>
> More importantly I could not find the RFC reference for such
> timer?!? @Geliang, @Mat could you please point that out?!?
The timer isn't mentioned in the RFC 8684. It only mentions MP_FAIL
should be sent back here:
'''
3.7. Fallback
In this case, if a receiver identifies a checksum failure when there
is only one path, it will send back an MP_FAIL option on the
subflow-level ACK, referring to the data-level sequence number of the
start of the segment on which the checksum error was detected.
'''
Thanks,
-Geliang
>
> Thanks,
>
> Paolo
>
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: problems with MP_FAIL timer
2022-05-11 11:28 problems with MP_FAIL timer Paolo Abeni
2022-05-11 14:16 ` Geliang Tang
@ 2022-05-11 15:32 ` Mat Martineau
2022-05-11 16:10 ` Paolo Abeni
1 sibling, 1 reply; 5+ messages in thread
From: Mat Martineau @ 2022-05-11 15:32 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp, Geliang Tang
[-- Attachment #1: Type: text/plain, Size: 1234 bytes --]
On Wed, 11 May 2022, Paolo Abeni wrote:
> Hello,
>
> re-reviewing again the commit:
>
> mptcp: reset subflow when MP_FAIL doesn't respond
>
> There are a few issue with sk_timer handling, specifically:
> - no additional mptcp_data_lock is needed to stop/reset it
> - I think overlapping infinite mapping reception and close may lead to
> some inconsistent timer usage
>
I'm working on a patch for this.
The issues you describe are what I was trying to comment on in
https://lore.kernel.org/mptcp/bdde3a40-1471-c8b9-65bd-f53ddebacdb4@linux.intel.com/,
but my proposed solution (deferred events for coordinating with
msk->sk_state) isn't the way to fix it.
I think checking the subflows for SOCK_DEAD before changing the timer
should prevent bad interaction with msk close, since the subflows are
always orphaned before the msk resets sk_timer.
> More importantly I could not find the RFC reference for such
> timer?!? @Geliang, @Mat could you please point that out?!?
>
The RFC doesn't specify what to do if the peer ignores the MP_FAIL.
Resending seemed to be hard to resolve correctly (what if it turned out to
be a duplicate?), so the choice was to revert to the reset behavior after
a timeout.
--
Mat Martineau
Intel
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: problems with MP_FAIL timer
2022-05-11 15:32 ` Mat Martineau
@ 2022-05-11 16:10 ` Paolo Abeni
2022-05-11 18:24 ` Mat Martineau
0 siblings, 1 reply; 5+ messages in thread
From: Paolo Abeni @ 2022-05-11 16:10 UTC (permalink / raw)
To: Mat Martineau; +Cc: mptcp, Geliang Tang
On Wed, 2022-05-11 at 08:32 -0700, Mat Martineau wrote:
> On Wed, 11 May 2022, Paolo Abeni wrote:
>
> > Hello,
> >
> > re-reviewing again the commit:
> >
> > mptcp: reset subflow when MP_FAIL doesn't respond
> >
> > There are a few issue with sk_timer handling, specifically:
> > - no additional mptcp_data_lock is needed to stop/reset it
> > - I think overlapping infinite mapping reception and close may lead to
> > some inconsistent timer usage
> >
>
> I'm working on a patch for this.
>
> The issues you describe are what I was trying to comment on in
> https://lore.kernel.org/mptcp/bdde3a40-1471-c8b9-65bd-f53ddebacdb4@linux.intel.com/,
> but my proposed solution (deferred events for coordinating with
> msk->sk_state) isn't the way to fix it.
I think there is anoter issue: in mptcp_timeout_timer(), via
mptcp_check_mp_fail_response() -> mp_fail_response_expect_subflow() ->
mp_fail_response_expect_subflow(), the code traverse the conn_list
under the msk socket spin_lock. That is not enough: it should instead
acquire the full msk socket lock.
We can use the is_not_owned || defer to release_cb schema, but I think
we are simply better off reverting such patch. Any strong opinion
against that?
Paolo
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: problems with MP_FAIL timer
2022-05-11 16:10 ` Paolo Abeni
@ 2022-05-11 18:24 ` Mat Martineau
0 siblings, 0 replies; 5+ messages in thread
From: Mat Martineau @ 2022-05-11 18:24 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp, Geliang Tang
On Wed, 11 May 2022, Paolo Abeni wrote:
> On Wed, 2022-05-11 at 08:32 -0700, Mat Martineau wrote:
>> On Wed, 11 May 2022, Paolo Abeni wrote:
>>
>>> Hello,
>>>
>>> re-reviewing again the commit:
>>>
>>> mptcp: reset subflow when MP_FAIL doesn't respond
>>>
>>> There are a few issue with sk_timer handling, specifically:
>>> - no additional mptcp_data_lock is needed to stop/reset it
>>> - I think overlapping infinite mapping reception and close may lead to
>>> some inconsistent timer usage
>>>
>>
>> I'm working on a patch for this.
>>
>> The issues you describe are what I was trying to comment on in
>> https://lore.kernel.org/mptcp/bdde3a40-1471-c8b9-65bd-f53ddebacdb4@linux.intel.com/,
>> but my proposed solution (deferred events for coordinating with
>> msk->sk_state) isn't the way to fix it.
>
> I think there is anoter issue: in mptcp_timeout_timer(), via
> mptcp_check_mp_fail_response() -> mp_fail_response_expect_subflow() ->
> mp_fail_response_expect_subflow(), the code traverse the conn_list
> under the msk socket spin_lock. That is not enough: it should instead
> acquire the full msk socket lock.
Agreed.
> We can use the is_not_owned || defer to release_cb schema, but I think
> we are simply better off reverting such patch. Any strong opinion
> against that?
I don't think we can just revert: it's important to either fall back or
reset, and not leave the connection in some indefinite state after sending
a MP_FAIL that's ignored (the RFC is specific that the peer must respond
with MP_FAIL). So we need to use some timer to implement that behavior, or
use other events (rx/tx/etc) to check for a timely MP_FAIL reply and reset
if it hasn't arrived. The timer approach seems simpler.
--
Mat Martineau
Intel
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-05-11 18:24 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-11 11:28 problems with MP_FAIL timer Paolo Abeni
2022-05-11 14:16 ` Geliang Tang
2022-05-11 15:32 ` Mat Martineau
2022-05-11 16:10 ` Paolo Abeni
2022-05-11 18:24 ` Mat Martineau
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.