netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* tcp: rearm RTO timer does not comply with RFC6298
@ 2021-01-20 10:49 Pengcheng Yang
  2021-01-20 14:59 ` Neal Cardwell
  0 siblings, 1 reply; 6+ messages in thread
From: Pengcheng Yang @ 2021-01-20 10:49 UTC (permalink / raw)
  To: ncardwell, ycheng; +Cc: netdev, yangpc

hi,

I have a doubt about tcp_rearm_rto().

Early TCP always rearm the RTO timer to NOW+RTO when it receives
an ACK that acknowledges new data.

Referring to RFC6298 SECTION 5.3: "When an ACK is received that
acknowledges new data, restart the retransmission timer so that
it will expire after RTO seconds (for the current value of RTO)."

After ER and TLP, we rearm the RTO timer to *tstamp_of_head+RTO*
when switching from ER/TLP/RACK to original RTO in tcp_rearm_rto(),
in this case the RTO timer is triggered earlier than described in 
RFC6298, otherwise the same.

Is this planned? Or can we always rearm the RTO timer to 
tstamp_of_head+RTO?

Thanks.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: tcp: rearm RTO timer does not comply with RFC6298
  2021-01-20 10:49 tcp: rearm RTO timer does not comply with RFC6298 Pengcheng Yang
@ 2021-01-20 14:59 ` Neal Cardwell
  2021-01-20 18:58   ` Yuchung Cheng
  0 siblings, 1 reply; 6+ messages in thread
From: Neal Cardwell @ 2021-01-20 14:59 UTC (permalink / raw)
  To: Pengcheng Yang; +Cc: Yuchung Cheng, Netdev, Eric Dumazet

On Wed, Jan 20, 2021 at 5:50 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
>
> hi,
>
> I have a doubt about tcp_rearm_rto().
>
> Early TCP always rearm the RTO timer to NOW+RTO when it receives
> an ACK that acknowledges new data.
>
> Referring to RFC6298 SECTION 5.3: "When an ACK is received that
> acknowledges new data, restart the retransmission timer so that
> it will expire after RTO seconds (for the current value of RTO)."
>
> After ER and TLP, we rearm the RTO timer to *tstamp_of_head+RTO*
> when switching from ER/TLP/RACK to original RTO in tcp_rearm_rto(),
> in this case the RTO timer is triggered earlier than described in
> RFC6298, otherwise the same.
>
> Is this planned? Or can we always rearm the RTO timer to
> tstamp_of_head+RTO?
>
> Thanks.
>

This is a good question. As far as I can tell, this difference in
behavior would only come into play in a few corner cases, like:

(1) The TLP timer fires and the connection is unable to transmit a TLP
probe packet. This could happen due to memory allocation failure  or
the local qdisc being full.

(2) The RACK reorder timer fires but the connection does not take the
normal course of action and mark some packets lost and retransmit at
least one of them. I'm not sure how this would happen. Maybe someone
can think of a case.

My sense would be that given how relatively rare (1)/(2) are, it is
probably not worth changing the current behavior, given that it seems
it would require extra state (an extra u32 snd_una_advanced_tstamp? )
to save the time at which snd_una advanced (a cumulative ACK covered
some data) in order to rearm the RTO timer for snd_una_advanced_tstamp
+ rto.

neal

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: tcp: rearm RTO timer does not comply with RFC6298
  2021-01-20 14:59 ` Neal Cardwell
@ 2021-01-20 18:58   ` Yuchung Cheng
  2021-01-21 13:48     ` Pengcheng Yang
  0 siblings, 1 reply; 6+ messages in thread
From: Yuchung Cheng @ 2021-01-20 18:58 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: Pengcheng Yang, Netdev, Eric Dumazet

On Wed, Jan 20, 2021 at 6:59 AM Neal Cardwell <ncardwell@google.com> wrote:
>
> On Wed, Jan 20, 2021 at 5:50 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> >
> > hi,
> >
> > I have a doubt about tcp_rearm_rto().
> >
> > Early TCP always rearm the RTO timer to NOW+RTO when it receives
> > an ACK that acknowledges new data.
> >
> > Referring to RFC6298 SECTION 5.3: "When an ACK is received that
> > acknowledges new data, restart the retransmission timer so that
> > it will expire after RTO seconds (for the current value of RTO)."
> >
> > After ER and TLP, we rearm the RTO timer to *tstamp_of_head+RTO*
> > when switching from ER/TLP/RACK to original RTO in tcp_rearm_rto(),
> > in this case the RTO timer is triggered earlier than described in
> > RFC6298, otherwise the same.
> >
> > Is this planned? Or can we always rearm the RTO timer to
> > tstamp_of_head+RTO?
> >
> > Thanks.
> >
>
> This is a good question. As far as I can tell, this difference in
> behavior would only come into play in a few corner cases, like:
>
> (1) The TLP timer fires and the connection is unable to transmit a TLP
> probe packet. This could happen due to memory allocation failure  or
> the local qdisc being full.
>
> (2) The RACK reorder timer fires but the connection does not take the
> normal course of action and mark some packets lost and retransmit at
> least one of them. I'm not sure how this would happen. Maybe someone
> can think of a case.
>
> My sense would be that given how relatively rare (1)/(2) are, it is
> probably not worth changing the current behavior, given that it seems
> it would require extra state (an extra u32 snd_una_advanced_tstamp? )
> to save the time at which snd_una advanced (a cumulative ACK covered
> some data) in order to rearm the RTO timer for snd_una_advanced_tstamp
> + rto.

also there's an experimental proposal
https://tools.ietf.org/html/rfc7765

so Linux actually implements that in a limited way that only applies
in specific scenarios.

>
> neal

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: tcp: rearm RTO timer does not comply with RFC6298
  2021-01-20 18:58   ` Yuchung Cheng
@ 2021-01-21 13:48     ` Pengcheng Yang
  2021-01-21 15:50       ` Neal Cardwell
  0 siblings, 1 reply; 6+ messages in thread
From: Pengcheng Yang @ 2021-01-21 13:48 UTC (permalink / raw)
  To: 'Yuchung Cheng', 'Neal Cardwell'
  Cc: 'Netdev', 'Eric Dumazet'

On Thu, Jan 21, 2021 at 2:59 AM Yuchung Cheng <ycheng@google.com> wrote:
> 
> On Wed, Jan 20, 2021 at 6:59 AM Neal Cardwell <ncardwell@google.com> wrote:
> >
> > On Wed, Jan 20, 2021 at 5:50 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> > >
> > > hi,
> > >
> > > I have a doubt about tcp_rearm_rto().
> > >
> > > Early TCP always rearm the RTO timer to NOW+RTO when it receives
> > > an ACK that acknowledges new data.
> > >
> > > Referring to RFC6298 SECTION 5.3: "When an ACK is received that
> > > acknowledges new data, restart the retransmission timer so that
> > > it will expire after RTO seconds (for the current value of RTO)."
> > >
> > > After ER and TLP, we rearm the RTO timer to *tstamp_of_head+RTO*
> > > when switching from ER/TLP/RACK to original RTO in tcp_rearm_rto(),
> > > in this case the RTO timer is triggered earlier than described in
> > > RFC6298, otherwise the same.
> > >
> > > Is this planned? Or can we always rearm the RTO timer to
> > > tstamp_of_head+RTO?
> > >
> > > Thanks.
> > >
> >
> > This is a good question. As far as I can tell, this difference in
> > behavior would only come into play in a few corner cases, like:
> >
> > (1) The TLP timer fires and the connection is unable to transmit a TLP
> > probe packet. This could happen due to memory allocation failure  or
> > the local qdisc being full.
> >
> > (2) The RACK reorder timer fires but the connection does not take the
> > normal course of action and mark some packets lost and retransmit at
> > least one of them. I'm not sure how this would happen. Maybe someone
> > can think of a case.

Yes, and it also happens when an ACK (a cumulative ACK covered out-of-order data)
is received that makes ca_state change from DISORDER to OPEN, by calling tcp_set_xmit_timer().
Because TLP is not triggered under DISORDER and tcp_rearm_rto() is called before the
ca_state changes.

> >
> > My sense would be that given how relatively rare (1)/(2) are, it is
> > probably not worth changing the current behavior, given that it seems
> > it would require extra state (an extra u32 snd_una_advanced_tstamp? )
> > to save the time at which snd_una advanced (a cumulative ACK covered
> > some data) in order to rearm the RTO timer for snd_una_advanced_tstamp
> > + rto.
> 
> also there's an experimental proposal
> https://tools.ietf.org/html/rfc7765
> 
> so Linux actually implements that in a limited way that only applies
> in specific scenarios.
> 
> >
> > neal

Thank you for answering my questions.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: tcp: rearm RTO timer does not comply with RFC6298
  2021-01-21 13:48     ` Pengcheng Yang
@ 2021-01-21 15:50       ` Neal Cardwell
  2021-01-22 10:01         ` Pengcheng Yang
  0 siblings, 1 reply; 6+ messages in thread
From: Neal Cardwell @ 2021-01-21 15:50 UTC (permalink / raw)
  To: Pengcheng Yang; +Cc: Yuchung Cheng, Netdev, Eric Dumazet

On Thu, Jan 21, 2021 at 9:05 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
>
> On Thu, Jan 21, 2021 at 2:59 AM Yuchung Cheng <ycheng@google.com> wrote:
> >
> > On Wed, Jan 20, 2021 at 6:59 AM Neal Cardwell <ncardwell@google.com> wrote:
> > >
> > > On Wed, Jan 20, 2021 at 5:50 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> > > >
> > > > hi,
> > > >
> > > > I have a doubt about tcp_rearm_rto().
> > > >
> > > > Early TCP always rearm the RTO timer to NOW+RTO when it receives
> > > > an ACK that acknowledges new data.
> > > >
> > > > Referring to RFC6298 SECTION 5.3: "When an ACK is received that
> > > > acknowledges new data, restart the retransmission timer so that
> > > > it will expire after RTO seconds (for the current value of RTO)."
> > > >
> > > > After ER and TLP, we rearm the RTO timer to *tstamp_of_head+RTO*
> > > > when switching from ER/TLP/RACK to original RTO in tcp_rearm_rto(),
> > > > in this case the RTO timer is triggered earlier than described in
> > > > RFC6298, otherwise the same.
> > > >
> > > > Is this planned? Or can we always rearm the RTO timer to
> > > > tstamp_of_head+RTO?
> > > >
> > > > Thanks.
> > > >
> > >
> > > This is a good question. As far as I can tell, this difference in
> > > behavior would only come into play in a few corner cases, like:
> > >
> > > (1) The TLP timer fires and the connection is unable to transmit a TLP
> > > probe packet. This could happen due to memory allocation failure  or
> > > the local qdisc being full.
> > >
> > > (2) The RACK reorder timer fires but the connection does not take the
> > > normal course of action and mark some packets lost and retransmit at
> > > least one of them. I'm not sure how this would happen. Maybe someone
> > > can think of a case.
>
> Yes, and it also happens when an ACK (a cumulative ACK covered out-of-order data)
> is received that makes ca_state change from DISORDER to OPEN, by calling tcp_set_xmit_timer().
> Because TLP is not triggered under DISORDER and tcp_rearm_rto() is called before the
> ca_state changes.

Hmm, that sounds like a good catch, and potentially a significant bug.
Re-reading the code, it seems that you correctly identify that on an
ACK when reordering is resolved (ca_state change from DISORDER to
OPEN) we will not set a TLP timer for now+TLP_interval, but instead
will set an RTO timer for rtx_head_tx_time+RTO (which could be very
soon indeed, if RTTVAR is very low). Seems like that could cause
spurious RTOs with connections that experience reordering with low RTT
variance.

It seems like we should try to fix this. Perhaps by calling
tcp_set_xmit_timer() only after we have settled on a final ca_state
implied by this ACK (in this case, to allow DISORDER to be resolved to
OPEN). Though that would require some careful surgery, since that
would move the tcp_set_xmit_timer() call *after* the point at which
the RACK reorder timer would be set.

Other thoughts?

neal

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: tcp: rearm RTO timer does not comply with RFC6298
  2021-01-21 15:50       ` Neal Cardwell
@ 2021-01-22 10:01         ` Pengcheng Yang
  0 siblings, 0 replies; 6+ messages in thread
From: Pengcheng Yang @ 2021-01-22 10:01 UTC (permalink / raw)
  To: 'Neal Cardwell'
  Cc: 'Yuchung Cheng', 'Netdev', 'Eric Dumazet'

On Thu, Jan 21, 2021 at 11:51 PM Neal Cardwell <ncardwell@google.com> wrote:
> 
> On Thu, Jan 21, 2021 at 9:05 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> >
> > On Thu, Jan 21, 2021 at 2:59 AM Yuchung Cheng <ycheng@google.com> wrote:
> > >
> > > On Wed, Jan 20, 2021 at 6:59 AM Neal Cardwell <ncardwell@google.com> wrote:
> > > >
> > > > On Wed, Jan 20, 2021 at 5:50 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> > > > >
> > > > > hi,
> > > > >
> > > > > I have a doubt about tcp_rearm_rto().
> > > > >
> > > > > Early TCP always rearm the RTO timer to NOW+RTO when it receives
> > > > > an ACK that acknowledges new data.
> > > > >
> > > > > Referring to RFC6298 SECTION 5.3: "When an ACK is received that
> > > > > acknowledges new data, restart the retransmission timer so that
> > > > > it will expire after RTO seconds (for the current value of RTO)."
> > > > >
> > > > > After ER and TLP, we rearm the RTO timer to *tstamp_of_head+RTO*
> > > > > when switching from ER/TLP/RACK to original RTO in tcp_rearm_rto(),
> > > > > in this case the RTO timer is triggered earlier than described in
> > > > > RFC6298, otherwise the same.
> > > > >
> > > > > Is this planned? Or can we always rearm the RTO timer to
> > > > > tstamp_of_head+RTO?
> > > > >
> > > > > Thanks.
> > > > >
> > > >
> > > > This is a good question. As far as I can tell, this difference in
> > > > behavior would only come into play in a few corner cases, like:
> > > >
> > > > (1) The TLP timer fires and the connection is unable to transmit a TLP
> > > > probe packet. This could happen due to memory allocation failure  or
> > > > the local qdisc being full.
> > > >
> > > > (2) The RACK reorder timer fires but the connection does not take the
> > > > normal course of action and mark some packets lost and retransmit at
> > > > least one of them. I'm not sure how this would happen. Maybe someone
> > > > can think of a case.
> >
> > Yes, and it also happens when an ACK (a cumulative ACK covered out-of-order data)
> > is received that makes ca_state change from DISORDER to OPEN, by calling tcp_set_xmit_timer().
> > Because TLP is not triggered under DISORDER and tcp_rearm_rto() is called before the
> > ca_state changes.
> 
> Hmm, that sounds like a good catch, and potentially a significant bug.
> Re-reading the code, it seems that you correctly identify that on an
> ACK when reordering is resolved (ca_state change from DISORDER to
> OPEN) we will not set a TLP timer for now+TLP_interval, but instead
> will set an RTO timer for rtx_head_tx_time+RTO (which could be very
> soon indeed, if RTTVAR is very low). Seems like that could cause
> spurious RTOs with connections that experience reordering with low RTT
> variance.
> 
> It seems like we should try to fix this. Perhaps by calling
> tcp_set_xmit_timer() only after we have settled on a final ca_state
> implied by this ACK (in this case, to allow DISORDER to be resolved to
> OPEN). Though that would require some careful surgery, since that
> would move the tcp_set_xmit_timer() call *after* the point at which
> the RACK reorder timer would be set.
> 
> Other thoughts?
> 
> neal

I think this fix is necessary. Actually, we also fixed this issue on our branch recently
when we fixed an issue where the TLP timer might not fire(Point 1 below),
by calling tcp_set_xmit_timer() after tcp_fastretrans_alert() and 
then removing FLAG_SET_XMIT_TIMER from ack_flag when the RACK 
reorder timer is set. 

This repair has two additional benefits according to my understanding:
(1) When ca_state changes from DISORDER to OPEN, the TLP timer can be activated,
otherwise, the sender can only wait for the RTO timer when it is app-limited.
(2) Reduce the xmit timer reschedule once per ACK when the RACK reorder timer is set.

I'll send this fix later, I'm not sure whether there is a more elegant way. : )


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-01-22 10:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-20 10:49 tcp: rearm RTO timer does not comply with RFC6298 Pengcheng Yang
2021-01-20 14:59 ` Neal Cardwell
2021-01-20 18:58   ` Yuchung Cheng
2021-01-21 13:48     ` Pengcheng Yang
2021-01-21 15:50       ` Neal Cardwell
2021-01-22 10:01         ` Pengcheng Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).