All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net 0/5] tcp: fixes to non-SACK TCP
@ 2018-03-07 12:59 Ilpo Järvinen
  2018-03-07 12:59 ` [PATCH net 1/5] tcp: feed correct number of pkts acked to cc modules also in recovery Ilpo Järvinen
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-07 12:59 UTC (permalink / raw)
  To: netdev

Here is a series of fixes to issues that occur when SACK is not
enabled for a TCP connection:

tcp: feed correct number of pkts acked to cc modules
tcp: prevent bogus FRTO undos with non-SACK flows
tcp: move false FR condition into
tcp: prevent bogus undos when SACK is not enabled
tcp: send real dupACKs by locking advertized window

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net 1/5] tcp: feed correct number of pkts acked to cc modules also in recovery
  2018-03-07 12:59 [PATCH net 0/5] tcp: fixes to non-SACK TCP Ilpo Järvinen
@ 2018-03-07 12:59 ` Ilpo Järvinen
  2018-03-07 12:59 ` [PATCH net 2/5] tcp: prevent bogus FRTO undos with non-SACK flows Ilpo Järvinen
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-07 12:59 UTC (permalink / raw)
  To: netdev

The pkts_acked is clearly not the correct value for slow start
after RTO as it may include segments that were not lost and
therefore did not need retransmissions in the slow start following
the RTO. Then tcp_slow_start will add the excess into cwnd bloating
it and triggering a burst.

Instead, we want to pass only the number of retransmitted segments
that were covered by the cumulative ACK (and potentially newly sent
data segments too if the cumulative ACK covers that far).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 net/ipv4/tcp_input.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9a1b3c1..0305f6d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3027,6 +3027,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 	long seq_rtt_us = -1L;
 	long ca_rtt_us = -1L;
 	u32 pkts_acked = 0;
+	u32 rexmit_acked = 0;
+	u32 newdata_acked = 0;
 	u32 last_in_flight = 0;
 	bool rtt_update;
 	int flag = 0;
@@ -3056,8 +3058,10 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 		}
 
 		if (unlikely(sacked & TCPCB_RETRANS)) {
-			if (sacked & TCPCB_SACKED_RETRANS)
+			if (sacked & TCPCB_SACKED_RETRANS) {
 				tp->retrans_out -= acked_pcount;
+				rexmit_acked += acked_pcount;
+			}
 			flag |= FLAG_RETRANS_DATA_ACKED;
 		} else if (!(sacked & TCPCB_SACKED_ACKED)) {
 			last_ackt = skb->skb_mstamp;
@@ -3068,8 +3072,11 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 			last_in_flight = TCP_SKB_CB(skb)->tx.in_flight;
 			if (before(start_seq, reord))
 				reord = start_seq;
-			if (!after(scb->end_seq, tp->high_seq))
+			if (!after(scb->end_seq, tp->high_seq)) {
 				flag |= FLAG_ORIG_SACK_ACKED;
+			} else {
+				newdata_acked += acked_pcount;
+			}
 		}
 
 		if (sacked & TCPCB_SACKED_ACKED) {
@@ -3151,6 +3158,14 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 		}
 
 		if (tcp_is_reno(tp)) {
+			/* Due to discontinuity on RTO in the artificial
+			 * sacked_out calculations, TCP must restrict
+			 * pkts_acked without SACK to rexmits and new data
+			 * segments
+			 */
+			if (icsk->icsk_ca_state == TCP_CA_Loss)
+				pkts_acked = rexmit_acked + newdata_acked;
+
 			tcp_remove_reno_sacks(sk, pkts_acked);
 		} else {
 			int delta;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 2/5] tcp: prevent bogus FRTO undos with non-SACK flows
  2018-03-07 12:59 [PATCH net 0/5] tcp: fixes to non-SACK TCP Ilpo Järvinen
  2018-03-07 12:59 ` [PATCH net 1/5] tcp: feed correct number of pkts acked to cc modules also in recovery Ilpo Järvinen
@ 2018-03-07 12:59 ` Ilpo Järvinen
  2018-03-07 19:24   ` Neal Cardwell
  2018-03-07 12:59 ` [PATCH net 3/5] tcp: move false FR condition into tcp_false_fast_retrans_possible() Ilpo Järvinen
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-07 12:59 UTC (permalink / raw)
  To: netdev

In a non-SACK case, any non-retransmitted segment acknowledged will
set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is
no indication that it would have been delivered for real (the
scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK
case). This causes bogus undos in ordinary RTO recoveries where
segments are lost here and there, with a few delivered segments in
between losses. A cumulative ACKs will cover retransmitted ones at
the bottom and the non-retransmitted ones following that causing
FLAG_ORIG_SACK_ACKED to be set in tcp_clean_rtx_queue and results
in a spurious FRTO undo.

We need to make the check more strict for non-SACK case and check
that none of the cumulatively ACKed segments were retransmitted,
which would be the case for the last step of FRTO algorithm as we
sent out only new segments previously. Only then, allow FRTO undo
to proceed in non-SACK case.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 net/ipv4/tcp_input.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 0305f6d..1a33752 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2629,8 +2629,13 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack,
 	if (tp->frto) { /* F-RTO RFC5682 sec 3.1 (sack enhanced version). */
 		/* Step 3.b. A timeout is spurious if not all data are
 		 * lost, i.e., never-retransmitted data are (s)acked.
+		 *
+		 * As the non-SACK case does not keep track of TCPCB_SACKED_ACKED,
+		 * we need to ensure that none of the cumulative ACKed segments
+		 * was retransmitted to confirm the validity of FLAG_ORIG_SACK_ACKED.
 		 */
 		if ((flag & FLAG_ORIG_SACK_ACKED) &&
+		    (tcp_is_sack(tp) || !(flag & FLAG_RETRANS_DATA_ACKED)) &&
 		    tcp_try_undo_loss(sk, true))
 			return;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 3/5] tcp: move false FR condition into tcp_false_fast_retrans_possible()
  2018-03-07 12:59 [PATCH net 0/5] tcp: fixes to non-SACK TCP Ilpo Järvinen
  2018-03-07 12:59 ` [PATCH net 1/5] tcp: feed correct number of pkts acked to cc modules also in recovery Ilpo Järvinen
  2018-03-07 12:59 ` [PATCH net 2/5] tcp: prevent bogus FRTO undos with non-SACK flows Ilpo Järvinen
@ 2018-03-07 12:59 ` Ilpo Järvinen
  2018-03-07 15:50   ` Eric Dumazet
  2018-03-07 12:59 ` [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled Ilpo Järvinen
  2018-03-07 12:59 ` [PATCH net 5/5] tcp: send real dupACKs by locking advertized window for non-SACK flows Ilpo Järvinen
  4 siblings, 1 reply; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-07 12:59 UTC (permalink / raw)
  To: netdev

No functional changes.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 net/ipv4/tcp_input.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1a33752..e20f9ad 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2211,6 +2211,19 @@ static void tcp_update_scoreboard(struct sock *sk, int fast_rexmit)
 	}
 }
 
+/* False fast retransmits may occur when SACK is not in use under certain
+ * conditions (RFC6582). The sender MUST hold old state until something
+ * *above* high_seq is ACKed to prevent triggering such false fast
+ * retransmits. SACK TCP is safe.
+ */
+static bool tcp_false_fast_retrans_possible(const struct sock *sk,
+					    const u32 snd_una)
+{
+	const struct tcp_sock *tp = tcp_sk(sk);
+
+	return ((snd_una == tp->high_seq) && tcp_is_reno(tp));
+}
+
 static bool tcp_tsopt_ecr_before(const struct tcp_sock *tp, u32 when)
 {
 	return tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
@@ -2350,10 +2363,10 @@ static bool tcp_try_undo_recovery(struct sock *sk)
 	} else if (tp->rack.reo_wnd_persist) {
 		tp->rack.reo_wnd_persist--;
 	}
-	if (tp->snd_una == tp->high_seq && tcp_is_reno(tp)) {
-		/* Hold old state until something *above* high_seq
-		 * is ACKed. For Reno it is MUST to prevent false
-		 * fast retransmits (RFC2582). SACK TCP is safe. */
+	if (tcp_false_fast_retrans_possible(sk, tp->snd_una)) {
+		/* Hold old state until something *above* high_seq is ACKed
+		 * if false fast retransmit is possible.
+		 */
 		if (!tcp_any_retrans_done(sk))
 			tp->retrans_stamp = 0;
 		return true;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled
  2018-03-07 12:59 [PATCH net 0/5] tcp: fixes to non-SACK TCP Ilpo Järvinen
                   ` (2 preceding siblings ...)
  2018-03-07 12:59 ` [PATCH net 3/5] tcp: move false FR condition into tcp_false_fast_retrans_possible() Ilpo Järvinen
@ 2018-03-07 12:59 ` Ilpo Järvinen
  2018-03-07 20:19   ` Neal Cardwell
  2018-03-07 12:59 ` [PATCH net 5/5] tcp: send real dupACKs by locking advertized window for non-SACK flows Ilpo Järvinen
  4 siblings, 1 reply; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-07 12:59 UTC (permalink / raw)
  To: netdev

A bogus undo may/will trigger when the loss recovery state is
kept until snd_una is above high_seq. If tcp_any_retrans_done
is zero, retrans_stamp is cleared in this transient state. On
the next ACK, tcp_try_undo_recovery again executes and
tcp_may_undo will always return true because tcp_packet_delayed
has this condition:
    return !tp->retrans_stamp || ...

Check for the false fast retransmit transient condition in
tcp_packet_delayed to avoid bogus undos. Since snd_una may have
advanced on this ACK but CA state still remains unchanged,
prior_snd_una needs to be passed instead of tp->snd_una.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 net/ipv4/tcp_input.c | 50 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 34 insertions(+), 16 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e20f9ad..b689915 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -98,6 +98,7 @@ int sysctl_tcp_max_orphans __read_mostly = NR_FILE;
 #define FLAG_UPDATE_TS_RECENT	0x4000 /* tcp_replace_ts_recent() */
 #define FLAG_NO_CHALLENGE_ACK	0x8000 /* do not call tcp_send_challenge_ack()	*/
 #define FLAG_ACK_MAYBE_DELAYED	0x10000 /* Likely a delayed ACK */
+#define FLAG_PACKET_DELAYED	0x20000 /* 0 rexmits or tstamps reveal delayed pkt */
 
 #define FLAG_ACKED		(FLAG_DATA_ACKED|FLAG_SYN_ACKED)
 #define FLAG_NOT_DUP		(FLAG_DATA|FLAG_WIN_UPDATE|FLAG_ACKED)
@@ -2243,10 +2244,19 @@ static bool tcp_skb_spurious_retrans(const struct tcp_sock *tp,
 /* Nothing was retransmitted or returned timestamp is less
  * than timestamp of the first retransmission.
  */
-static inline bool tcp_packet_delayed(const struct tcp_sock *tp)
+static inline bool tcp_packet_delayed(const struct sock *sk,
+				      const u32 prior_snd_una)
 {
-	return !tp->retrans_stamp ||
-	       tcp_tsopt_ecr_before(tp, tp->retrans_stamp);
+	const struct tcp_sock *tp = tcp_sk(sk);
+
+	if (!tp->retrans_stamp) {
+		/* Sender will be in a transient state with cleared
+		 * retrans_stamp during false fast retransmit prevention
+		 * mechanism
+		 */
+		return !tcp_false_fast_retrans_possible(sk, prior_snd_una);
+	}
+	return tcp_tsopt_ecr_before(tp, tp->retrans_stamp);
 }
 
 /* Undo procedures. */
@@ -2336,17 +2346,20 @@ static void tcp_undo_cwnd_reduction(struct sock *sk, bool unmark_loss)
 	tp->rack.advanced = 1; /* Force RACK to re-exam losses */
 }
 
-static inline bool tcp_may_undo(const struct tcp_sock *tp)
+static inline bool tcp_may_undo(const struct sock *sk, const int flag)
 {
-	return tp->undo_marker && (!tp->undo_retrans || tcp_packet_delayed(tp));
+	const struct tcp_sock *tp = tcp_sk(sk);
+
+	return tp->undo_marker &&
+	       (!tp->undo_retrans || (flag & FLAG_PACKET_DELAYED));
 }
 
 /* People celebrate: "We love our President!" */
-static bool tcp_try_undo_recovery(struct sock *sk)
+static bool tcp_try_undo_recovery(struct sock *sk, const int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 
-	if (tcp_may_undo(tp)) {
+	if (tcp_may_undo(sk, flag)) {
 		int mib_idx;
 
 		/* Happy end! We did not retransmit anything
@@ -2393,11 +2406,11 @@ static bool tcp_try_undo_dsack(struct sock *sk)
 }
 
 /* Undo during loss recovery after partial ACK or using F-RTO. */
-static bool tcp_try_undo_loss(struct sock *sk, bool frto_undo)
+static bool tcp_try_undo_loss(struct sock *sk, bool frto_undo, const int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 
-	if (frto_undo || tcp_may_undo(tp)) {
+	if (frto_undo || tcp_may_undo(sk, flag)) {
 		tcp_undo_cwnd_reduction(sk, true);
 
 		DBGUNDO(sk, "partial loss");
@@ -2636,7 +2649,7 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack,
 	bool recovered = !before(tp->snd_una, tp->high_seq);
 
 	if ((flag & FLAG_SND_UNA_ADVANCED) &&
-	    tcp_try_undo_loss(sk, false))
+	    tcp_try_undo_loss(sk, false, flag))
 		return;
 
 	if (tp->frto) { /* F-RTO RFC5682 sec 3.1 (sack enhanced version). */
@@ -2649,7 +2662,7 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack,
 		 */
 		if ((flag & FLAG_ORIG_SACK_ACKED) &&
 		    (tcp_is_sack(tp) || !(flag & FLAG_RETRANS_DATA_ACKED)) &&
-		    tcp_try_undo_loss(sk, true))
+		    tcp_try_undo_loss(sk, true, flag))
 			return;
 
 		if (after(tp->snd_nxt, tp->high_seq)) {
@@ -2672,7 +2685,7 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack,
 
 	if (recovered) {
 		/* F-RTO RFC5682 sec 3.1 step 2.a and 1st part of step 3.a */
-		tcp_try_undo_recovery(sk);
+		tcp_try_undo_recovery(sk, flag);
 		return;
 	}
 	if (tcp_is_reno(tp)) {
@@ -2688,11 +2701,12 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack,
 }
 
 /* Undo during fast recovery after partial ACK. */
-static bool tcp_try_undo_partial(struct sock *sk, u32 prior_snd_una)
+static bool tcp_try_undo_partial(struct sock *sk, u32 prior_snd_una,
+				 const int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 
-	if (tp->undo_marker && tcp_packet_delayed(tp)) {
+	if (tp->undo_marker && (flag & FLAG_PACKET_DELAYED)) {
 		/* Plain luck! Hole if filled with delayed
 		 * packet, rather than with a retransmit. Check reordering.
 		 */
@@ -2795,13 +2809,17 @@ static void tcp_fastretrans_alert(struct sock *sk, const u32 prior_snd_una,
 		case TCP_CA_Recovery:
 			if (tcp_is_reno(tp))
 				tcp_reset_reno_sack(tp);
-			if (tcp_try_undo_recovery(sk))
+			if (tcp_try_undo_recovery(sk, flag))
 				return;
 			tcp_end_cwnd_reduction(sk);
 			break;
 		}
 	}
 
+	if (icsk->icsk_ca_state >= TCP_CA_Recovery &&
+	    tcp_packet_delayed(sk, prior_snd_una))
+		flag |= FLAG_PACKET_DELAYED;
+
 	/* E. Process state. */
 	switch (icsk->icsk_ca_state) {
 	case TCP_CA_Recovery:
@@ -2809,7 +2827,7 @@ static void tcp_fastretrans_alert(struct sock *sk, const u32 prior_snd_una,
 			if (tcp_is_reno(tp) && is_dupack)
 				tcp_add_reno_sack(sk);
 		} else {
-			if (tcp_try_undo_partial(sk, prior_snd_una))
+			if (tcp_try_undo_partial(sk, prior_snd_una, flag))
 				return;
 			/* Partial ACK arrived. Force fast retransmit. */
 			do_lost = tcp_is_reno(tp) ||
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 5/5] tcp: send real dupACKs by locking advertized window for non-SACK flows
  2018-03-07 12:59 [PATCH net 0/5] tcp: fixes to non-SACK TCP Ilpo Järvinen
                   ` (3 preceding siblings ...)
  2018-03-07 12:59 ` [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled Ilpo Järvinen
@ 2018-03-07 12:59 ` Ilpo Järvinen
  2018-03-07 15:58   ` Eric Dumazet
  4 siblings, 1 reply; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-07 12:59 UTC (permalink / raw)
  To: netdev

Currently, the TCP code is overly eager to update window on
every ACK. It makes some of the ACKs that the receiver should
sent as dupACKs look like they update window update that are
not considered real dupACKs by the non-SACK sender-side code.

Make sure that when an ofo segment is received, no change to
window is applied if we are going to send a dupACK. It's ok
to change the window for non-dupACKs (such as the first ACK
after ofo arrivals start if that ACK was using delayed ACKs).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 include/linux/tcp.h   |  3 ++-
 net/ipv4/tcp_input.c  |  5 ++++-
 net/ipv4/tcp_output.c | 43 +++++++++++++++++++++++++------------------
 3 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 8f4c549..e239662 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -225,7 +225,8 @@ struct tcp_sock {
 		fastopen_connect:1, /* FASTOPEN_CONNECT sockopt */
 		fastopen_no_cookie:1, /* Allow send/recv SYN+data without a cookie */
 		is_sack_reneg:1,    /* in recovery from loss with SACK reneg? */
-		unused:2;
+		dupack_wnd_lock :1, /* Non-SACK constant rwnd dupacks needed? */
+		unused:1;
 	u8	nonagle     : 4,/* Disable Nagle algorithm?             */
 		thin_lto    : 1,/* Use linear timeouts for thin streams */
 		unused1	    : 1,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index b689915..77a289f 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4633,6 +4633,7 @@ int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
 static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
+	struct inet_connection_sock *icsk = inet_csk(sk);
 	bool fragstolen;
 	int eaten;
 
@@ -4676,7 +4677,7 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 			 * gap in queue is filled.
 			 */
 			if (RB_EMPTY_ROOT(&tp->out_of_order_queue))
-				inet_csk(sk)->icsk_ack.pingpong = 0;
+				icsk->icsk_ack.pingpong = 0;
 		}
 
 		if (tp->rx_opt.num_sacks)
@@ -4726,6 +4727,8 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 		goto queue_and_out;
 	}
 
+	if (tcp_is_reno(tp) && !(icsk->icsk_ack.pending & ICSK_ACK_TIMER))
+		tp->dupack_wnd_lock = 1;
 	tcp_data_queue_ofo(sk, skb);
 }
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6818042..45fbe92 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -249,25 +249,32 @@ static u16 tcp_select_window(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	u32 old_win = tp->rcv_wnd;
-	u32 cur_win = tcp_receive_window(tp);
-	u32 new_win = __tcp_select_window(sk);
-
-	/* Never shrink the offered window */
-	if (new_win < cur_win) {
-		/* Danger Will Robinson!
-		 * Don't update rcv_wup/rcv_wnd here or else
-		 * we will not be able to advertise a zero
-		 * window in time.  --DaveM
-		 *
-		 * Relax Will Robinson.
-		 */
-		if (new_win == 0)
-			NET_INC_STATS(sock_net(sk),
-				      LINUX_MIB_TCPWANTZEROWINDOWADV);
-		new_win = ALIGN(cur_win, 1 << tp->rx_opt.rcv_wscale);
+	u32 cur_win;
+	u32 new_win;
+
+	if (tp->dupack_wnd_lock) {
+		new_win = old_win;
+		tp->dupack_wnd_lock = 0;
+	} else {
+		cur_win = tcp_receive_window(tp);
+		new_win = __tcp_select_window(sk);
+		/* Never shrink the offered window */
+		if (new_win < cur_win) {
+			/* Danger Will Robinson!
+			 * Don't update rcv_wup/rcv_wnd here or else
+			 * we will not be able to advertise a zero
+			 * window in time.  --DaveM
+			 *
+			 * Relax Will Robinson.
+			 */
+			if (new_win == 0)
+				NET_INC_STATS(sock_net(sk),
+					      LINUX_MIB_TCPWANTZEROWINDOWADV);
+			new_win = ALIGN(cur_win, 1 << tp->rx_opt.rcv_wscale);
+		}
+		tp->rcv_wnd = new_win;
+		tp->rcv_wup = tp->rcv_nxt;
 	}
-	tp->rcv_wnd = new_win;
-	tp->rcv_wup = tp->rcv_nxt;
 
 	/* Make sure we do not exceed the maximum possible
 	 * scaled window.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH net 3/5] tcp: move false FR condition into tcp_false_fast_retrans_possible()
  2018-03-07 12:59 ` [PATCH net 3/5] tcp: move false FR condition into tcp_false_fast_retrans_possible() Ilpo Järvinen
@ 2018-03-07 15:50   ` Eric Dumazet
  0 siblings, 0 replies; 23+ messages in thread
From: Eric Dumazet @ 2018-03-07 15:50 UTC (permalink / raw)
  To: Ilpo Järvinen, netdev

On Wed, 2018-03-07 at 14:59 +0200, Ilpo Järvinen wrote:
> No functional changes.
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> ---
>  net/ipv4/tcp_input.c | 21 +++++++++++++++++----
>  1 file changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 1a33752..e20f9ad 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2211,6 +2211,19 @@ static void tcp_update_scoreboard(struct sock
> *sk, int fast_rexmit)
>  	}
>  }
>  
> +/* False fast retransmits may occur when SACK is not in use under
> certain
> + * conditions (RFC6582). The sender MUST hold old state until
> something
> + * *above* high_seq is ACKed to prevent triggering such false fast
> + * retransmits. SACK TCP is safe.
> + */
> +static bool tcp_false_fast_retrans_possible(const struct sock *sk,
> +					    const u32 snd_una)
> +{
> +	const struct tcp_sock *tp = tcp_sk(sk);
> +
> +	return ((snd_una == tp->high_seq) && tcp_is_reno(tp));

return (EXPR);

should use instead :

return EXPR;

> +}
> +
>  static bool tcp_tsopt_ecr_before(const struct tcp_sock *tp, u32
> when)
>  {
>  	return tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
> @@ -2350,10 +2363,10 @@ static bool tcp_try_undo_recovery(struct sock
> *sk)
>  	} else if (tp->rack.reo_wnd_persist) {
>  		tp->rack.reo_wnd_persist--;
>  	}
> -	if (tp->snd_una == tp->high_seq && tcp_is_reno(tp)) {
> -		/* Hold old state until something *above* high_seq
> -		 * is ACKed. For Reno it is MUST to prevent false
> -		 * fast retransmits (RFC2582). SACK TCP is safe. */
> +	if (tcp_false_fast_retrans_possible(sk, tp->snd_una)) {
> +		/* Hold old state until something *above* high_seq
> is ACKed
> +		 * if false fast retransmit is possible.
> +		 */
>  		if (!tcp_any_retrans_done(sk))
>  			tp->retrans_stamp = 0;
>  		return true;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 5/5] tcp: send real dupACKs by locking advertized window for non-SACK flows
  2018-03-07 12:59 ` [PATCH net 5/5] tcp: send real dupACKs by locking advertized window for non-SACK flows Ilpo Järvinen
@ 2018-03-07 15:58   ` Eric Dumazet
  2018-03-07 20:09     ` Ilpo Järvinen
  0 siblings, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2018-03-07 15:58 UTC (permalink / raw)
  To: Ilpo Järvinen, netdev

On Wed, 2018-03-07 at 14:59 +0200, Ilpo Järvinen wrote:
> Currently, the TCP code is overly eager to update window on
> every ACK. It makes some of the ACKs that the receiver should
> sent as dupACKs look like they update window update that are
> not considered real dupACKs by the non-SACK sender-side code.
> 
> Make sure that when an ofo segment is received, no change to
> window is applied if we are going to send a dupACK. It's ok
> to change the window for non-dupACKs (such as the first ACK
> after ofo arrivals start if that ACK was using delayed ACKs).

This looks dangerous to me.

We certainly want to lower the window at some point, or risk expensive
pruning and/or drops.

It is not clear by reading your changelog/patch, but it looks like some
malicious peers could hurt us.

By current standards, non TCP sack flows are not worth any potential
issues.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 2/5] tcp: prevent bogus FRTO undos with non-SACK flows
  2018-03-07 12:59 ` [PATCH net 2/5] tcp: prevent bogus FRTO undos with non-SACK flows Ilpo Järvinen
@ 2018-03-07 19:24   ` Neal Cardwell
  2018-03-07 19:54     ` Yuchung Cheng
  0 siblings, 1 reply; 23+ messages in thread
From: Neal Cardwell @ 2018-03-07 19:24 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Netdev, Yuchung Cheng

On Wed, Mar 7, 2018 at 7:59 AM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
>
> In a non-SACK case, any non-retransmitted segment acknowledged will
> set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is
> no indication that it would have been delivered for real (the
> scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK
> case). This causes bogus undos in ordinary RTO recoveries where
> segments are lost here and there, with a few delivered segments in
> between losses. A cumulative ACKs will cover retransmitted ones at
> the bottom and the non-retransmitted ones following that causing
> FLAG_ORIG_SACK_ACKED to be set in tcp_clean_rtx_queue and results
> in a spurious FRTO undo.
>
> We need to make the check more strict for non-SACK case and check
> that none of the cumulatively ACKed segments were retransmitted,
> which would be the case for the last step of FRTO algorithm as we
> sent out only new segments previously. Only then, allow FRTO undo
> to proceed in non-SACK case.

Hi Ilpo - Do you have a packet trace or (even better) packetdrill
script illustrating this issue? It would be nice to have a test case
or at least concrete example of this.

Thanks!
neal

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 2/5] tcp: prevent bogus FRTO undos with non-SACK flows
  2018-03-07 19:24   ` Neal Cardwell
@ 2018-03-07 19:54     ` Yuchung Cheng
  2018-03-07 22:19       ` Ilpo Järvinen
  0 siblings, 1 reply; 23+ messages in thread
From: Yuchung Cheng @ 2018-03-07 19:54 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: Ilpo Järvinen, Netdev

On Wed, Mar 7, 2018 at 11:24 AM, Neal Cardwell <ncardwell@google.com> wrote:
>
> On Wed, Mar 7, 2018 at 7:59 AM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
> >
> > In a non-SACK case, any non-retransmitted segment acknowledged will
> > set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is
> > no indication that it would have been delivered for real (the
> > scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK
> > case). This causes bogus undos in ordinary RTO recoveries where
> > segments are lost here and there, with a few delivered segments in
> > between losses. A cumulative ACKs will cover retransmitted ones at
> > the bottom and the non-retransmitted ones following that causing
> > FLAG_ORIG_SACK_ACKED to be set in tcp_clean_rtx_queue and results
> > in a spurious FRTO undo.
> >
> > We need to make the check more strict for non-SACK case and check
> > that none of the cumulatively ACKed segments were retransmitted,
> > which would be the case for the last step of FRTO algorithm as we
> > sent out only new segments previously. Only then, allow FRTO undo
> > to proceed in non-SACK case.
>
> Hi Ilpo - Do you have a packet trace or (even better) packetdrill
> script illustrating this issue? It would be nice to have a test case
> or at least concrete example of this.
a packetdrill or even a contrived example would be good ... also why
not just avoid setting FLAG_ORIG_SACK_ACKED on non-sack? seems a much
clean fix.

>
> Thanks!
> neal

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 5/5] tcp: send real dupACKs by locking advertized window for non-SACK flows
  2018-03-07 15:58   ` Eric Dumazet
@ 2018-03-07 20:09     ` Ilpo Järvinen
  2018-03-07 20:13       ` Eric Dumazet
  0 siblings, 1 reply; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-07 20:09 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Netdev

[-- Attachment #1: Type: text/plain, Size: 1987 bytes --]

Thanks for taking a look.

On Wed, 7 Mar 2018, Eric Dumazet wrote:
> On Wed, 2018-03-07 at 14:59 +0200, Ilpo Jï¿œrvinen wrote:
> > Currently, the TCP code is overly eager to update window on
> > every ACK. It makes some of the ACKs that the receiver should
> > sent as dupACKs look like they update window update that are
> > not considered real dupACKs by the non-SACK sender-side code.
> > 
> > Make sure that when an ofo segment is received, no change to
> > window is applied if we are going to send a dupACK. It's ok
> > to change the window for non-dupACKs (such as the first ACK
> > after ofo arrivals start if that ACK was using delayed ACKs).
> 
> This looks dangerous to me.
> 
> We certainly want to lower the window at some point, or risk expensive
> pruning and/or drops.

I see you're conspiring for "treason" (if you recall those charmy
times) ;-).

> It is not clear by reading your changelog/patch, but it looks like some
> malicious peers could hurt us.

The malicious peers can certainly send out-of-window data already at will 
(with all of such packets being dropped regardless of that being 
expensive or not) so I don't see there's a big change for malicious case 
really. The window is only locked for what we've already promised for in 
an earlier ACK! ...Previously, reneging that promise (advertising smaller 
than the previous value) was called "treason" by us (unfortunately, the 
message is no longer as charmy).

Even with this change, the window is free to change when the ack field is 
updated which for legimate senders occurs typically once per RTT.

> By current standards, non TCP sack flows are not worth any potential
> issues.

Currently non-SACK senders cannot identify almost any duplicate ACKs 
because the window keeps updating for almost all ACKs. As a result, 
non-SACK senders end up doing loss recovery only with RTO. RTO recovery 
without SACK is quite annoying because it potentially sends 
large number of unnecessary rexmits.


-- 
 i.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 5/5] tcp: send real dupACKs by locking advertized window for non-SACK flows
  2018-03-07 20:09     ` Ilpo Järvinen
@ 2018-03-07 20:13       ` Eric Dumazet
  2018-03-07 21:39         ` Ilpo Järvinen
  0 siblings, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2018-03-07 20:13 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Netdev

On Wed, 2018-03-07 at 22:09 +0200, Ilpo Järvinen wrote:
> Thanks for taking a look.
> 
> On Wed, 7 Mar 2018, Eric Dumazet wrote:
> > On Wed, 2018-03-07 at 14:59 +0200, Ilpo Järvinen wrote:
> > > Currently, the TCP code is overly eager to update window on
> > > every ACK. It makes some of the ACKs that the receiver should
> > > sent as dupACKs look like they update window update that are
> > > not considered real dupACKs by the non-SACK sender-side code.
> > > 
> > > Make sure that when an ofo segment is received, no change to
> > > window is applied if we are going to send a dupACK. It's ok
> > > to change the window for non-dupACKs (such as the first ACK
> > > after ofo arrivals start if that ACK was using delayed ACKs).
> > 
> > This looks dangerous to me.
> > 
> > We certainly want to lower the window at some point, or risk
> > expensive
> > pruning and/or drops.
> 
> I see you're conspiring for "treason" (if you recall those charmy
> times) ;-).
> 
> > It is not clear by reading your changelog/patch, but it looks like
> > some
> > malicious peers could hurt us.
> 
> The malicious peers can certainly send out-of-window data already at
> will 
> (with all of such packets being dropped regardless of that being 
> expensive or not) so I don't see there's a big change for malicious
> case 
> really. The window is only locked for what we've already promised for
> in 
> an earlier ACK! ...Previously, reneging that promise (advertising
> smaller 
> than the previous value) was called "treason" by us (unfortunately,
> the 
> message is no longer as charmy).
> 
> Even with this change, the window is free to change when the ack
> field is 
> updated which for legimate senders occurs typically once per RTT.
> 
> > By current standards, non TCP sack flows are not worth any
> > potential
> > issues.
> 
> Currently non-SACK senders cannot identify almost any duplicate ACKs 
> because the window keeps updating for almost all ACKs. As a result, 
> non-SACK senders end up doing loss recovery only with RTO. RTO
> recovery 
> without SACK is quite annoying because it potentially sends 
> large number of unnecessary rexmits.

I get that, but switching from "always" to "never" sounds dangerous.

May I suggest you refine your patch, to maybe allow win reductions, in
a logarithmic fashion maybe ?

This way, when you send 1000 DUPACK, only few of them would actually
have to lower the window, and 99% of them would be considered as DUPACK
by these prehistoric TCP stacks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled
  2018-03-07 12:59 ` [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled Ilpo Järvinen
@ 2018-03-07 20:19   ` Neal Cardwell
  2018-03-07 23:48     ` Yuchung Cheng
  0 siblings, 1 reply; 23+ messages in thread
From: Neal Cardwell @ 2018-03-07 20:19 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Netdev, Yuchung Cheng, Eric Dumazet

On Wed, Mar 7, 2018 at 7:59 AM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
> A bogus undo may/will trigger when the loss recovery state is
> kept until snd_una is above high_seq. If tcp_any_retrans_done
> is zero, retrans_stamp is cleared in this transient state. On
> the next ACK, tcp_try_undo_recovery again executes and
> tcp_may_undo will always return true because tcp_packet_delayed
> has this condition:
>     return !tp->retrans_stamp || ...
>
> Check for the false fast retransmit transient condition in
> tcp_packet_delayed to avoid bogus undos. Since snd_una may have
> advanced on this ACK but CA state still remains unchanged,
> prior_snd_una needs to be passed instead of tp->snd_una.

This one also seems like a case where it would be nice to have a
specific packet-by-packet example, or trace, or packetdrill scenario.
Something that we might be able to translate into a test, or at least
to document the issue more explicitly.

Thanks!
neal

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 5/5] tcp: send real dupACKs by locking advertized window for non-SACK flows
  2018-03-07 20:13       ` Eric Dumazet
@ 2018-03-07 21:39         ` Ilpo Järvinen
  2018-03-07 22:01           ` Eric Dumazet
  0 siblings, 1 reply; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-07 21:39 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Netdev

[-- Attachment #1: Type: text/plain, Size: 1416 bytes --]

On Wed, 7 Mar 2018, Eric Dumazet wrote:

> > Currently non-SACK senders cannot identify almost any duplicate ACKsᅵ
> > because the window keeps updating for almost all ACKs. As a result,ᅵ
> > non-SACK senders end up doing loss recovery only with RTO. RTO
> > recoveryᅵ
> > without SACK is quite annoying because it potentially sendsᅵ
> > large number of unnecessary rexmits.
> 
> I get that, but switching from "always" to "never" sounds dangerous.
> 
> May I suggest you refine your patch, to maybe allow win reductions, in
> a logarithmic fashion maybe ?
> 
> This way, when you send 1000 DUPACK, only few of them would actually
> have to lower the window, and 99% of them would be considered as DUPACK
> by these prehistoric TCP stacks.

The problem I'm trying to fix is due to window increasing (of course, 
dupacks could also be masked because of decrease but that's something
I've never seen to occur in practice) so I tried to make you happy and 
change my fix to do "never increase, always decrease". However, it turns 
out that even pre-fixed code implements "never decrease" not "always 
decrease" like you thought:

static u16 tcp_select_window(struct sock *sk)
{
	...
	/* Never shrink the offered window */
        if (new_win < cur_win) {
		...
		new_win = ALIGN(cur_win, 1 << tp->rx_opt.rcv_wscale);

...Thus, I don't see my fix making the case you're worried about any
worse, AFAICT.


-- 
 i.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 5/5] tcp: send real dupACKs by locking advertized window for non-SACK flows
  2018-03-07 21:39         ` Ilpo Järvinen
@ 2018-03-07 22:01           ` Eric Dumazet
  0 siblings, 0 replies; 23+ messages in thread
From: Eric Dumazet @ 2018-03-07 22:01 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Netdev

On Wed, 2018-03-07 at 23:39 +0200, Ilpo Järvinen wrote:
> 
> The problem I'm trying to fix is due to window increasing (of
> course, 
> dupacks could also be masked because of decrease but that's something
> I've never seen to occur in practice) so I tried to make you happy
> and 
> change my fix to do "never increase, always decrease". However, it
> turns 
> out that even pre-fixed code implements "never decrease" not "always 
> decrease" like you thought:

If this was the case, your patch would not be needed.

So really try to add more information in the changelog so that your
patch makes sense to me, and really explains why it is safe.

This kind of information should be recorded in git history, not in mail
archives.

Thanks !

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 2/5] tcp: prevent bogus FRTO undos with non-SACK flows
  2018-03-07 19:54     ` Yuchung Cheng
@ 2018-03-07 22:19       ` Ilpo Järvinen
  0 siblings, 0 replies; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-07 22:19 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: Neal Cardwell, Netdev

[-- Attachment #1: Type: text/plain, Size: 2859 bytes --]

On Wed, 7 Mar 2018, Yuchung Cheng wrote:
> On Wed, Mar 7, 2018 at 11:24 AM, Neal Cardwell <ncardwell@google.com> wrote:
> > On Wed, Mar 7, 2018 at 7:59 AM, Ilpo Jï¿œrvinen <ilpo.jarvinen@helsinki.fi> wrote:
> > >
> > > In a non-SACK case, any non-retransmitted segment acknowledged will
> > > set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is
> > > no indication that it would have been delivered for real (the
> > > scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK
> > > case). This causes bogus undos in ordinary RTO recoveries where
> > > segments are lost here and there, with a few delivered segments in
> > > between losses. A cumulative ACKs will cover retransmitted ones at
> > > the bottom and the non-retransmitted ones following that causing
> > > FLAG_ORIG_SACK_ACKED to be set in tcp_clean_rtx_queue and results
> > > in a spurious FRTO undo.
> > >
> > > We need to make the check more strict for non-SACK case and check
> > > that none of the cumulatively ACKed segments were retransmitted,
> > > which would be the case for the last step of FRTO algorithm as we
> > > sent out only new segments previously. Only then, allow FRTO undo
> > > to proceed in non-SACK case.
> >
> > Hi Ilpo - Do you have a packet trace or (even better) packetdrill
> > script illustrating this issue? It would be nice to have a test case
> > or at least concrete example of this.
>
> a packetdrill or even a contrived example would be good ...

I've seen all but this for sure in packet traces. But I'm somewhat 
old-school that while looking for the burst issue I discovered this 
issue by reading the code only (making it more than _one_ issue).
However, I think that I later on saw also this issue from the traces
(as it seemed to not match to any of the other burst issues this whole 
series is trying to fix). But finding that dump afterwards would take 
really long time, I've more than enough of them from our recent
tests ;-)).

But anyway, that was before the recent moving for the condition into 
tp->frto block so it might no longer be triggerable. It clearly was 
triggerable beforehand without tp->frto guard (and I just forward-ported 
past that recent change without thinking it much).

To trigger it, ever-R and !ever-R skb would need to be cumulatively 
ACKed when tp->frto is non-zero. Do you think that is still possible
with FRTO? E.g., after some undo leaving some ever-R and then RTO 
resulting in FRTO procedure?

> also why not just avoid setting FLAG_ORIG_SACK_ACKED on non-sack? seems 
> a much clean fix.

I guess that would work now that the relevant FRTO condition got moved
into the tp->frto block. It wouldn't have been that simple earlier
as SACK wanted FLAG_ORIG_SACK_ACKED while non-SACK wants
FLAG_ONLY_ORIG_ACKED (that was already available through a combination
of the existing FLAGs).


-- 
 i.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled
  2018-03-07 20:19   ` Neal Cardwell
@ 2018-03-07 23:48     ` Yuchung Cheng
  2018-03-09 14:11       ` Ilpo Järvinen
  0 siblings, 1 reply; 23+ messages in thread
From: Yuchung Cheng @ 2018-03-07 23:48 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: Ilpo Järvinen, Netdev, Eric Dumazet

On Wed, Mar 7, 2018 at 12:19 PM, Neal Cardwell <ncardwell@google.com> wrote:
>
> On Wed, Mar 7, 2018 at 7:59 AM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
> > A bogus undo may/will trigger when the loss recovery state is
> > kept until snd_una is above high_seq. If tcp_any_retrans_done
> > is zero, retrans_stamp is cleared in this transient state. On
> > the next ACK, tcp_try_undo_recovery again executes and
> > tcp_may_undo will always return true because tcp_packet_delayed
> > has this condition:
> >     return !tp->retrans_stamp || ...
> >
> > Check for the false fast retransmit transient condition in
> > tcp_packet_delayed to avoid bogus undos. Since snd_una may have
> > advanced on this ACK but CA state still remains unchanged,
> > prior_snd_una needs to be passed instead of tp->snd_una.
>
> This one also seems like a case where it would be nice to have a
> specific packet-by-packet example, or trace, or packetdrill scenario.
> Something that we might be able to translate into a test, or at least
> to document the issue more explicitly.
I am hesitate for further logic to make undo "perfect" on non-sack
cases b/c undo is very complicated and SACK is extremely
well-supported today. so a trace to demonstrate how severe this issue
is appreciated.

>
> Thanks!
> neal

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled
  2018-03-07 23:48     ` Yuchung Cheng
@ 2018-03-09 14:11       ` Ilpo Järvinen
  2018-03-09 14:32         ` Eric Dumazet
  2018-03-09 15:23         ` David Miller
  0 siblings, 2 replies; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-09 14:11 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: Neal Cardwell, Netdev, Eric Dumazet

[-- Attachment #1: Type: text/plain, Size: 2073 bytes --]

On Wed, 7 Mar 2018, Yuchung Cheng wrote:

> On Wed, Mar 7, 2018 at 12:19 PM, Neal Cardwell <ncardwell@google.com> wrote:
> >
> > On Wed, Mar 7, 2018 at 7:59 AM, Ilpo Jï¿œrvinen <ilpo.jarvinen@helsinki.fi> wrote:
> > > A bogus undo may/will trigger when the loss recovery state is
> > > kept until snd_una is above high_seq. If tcp_any_retrans_done
> > > is zero, retrans_stamp is cleared in this transient state. On
> > > the next ACK, tcp_try_undo_recovery again executes and
> > > tcp_may_undo will always return true because tcp_packet_delayed
> > > has this condition:
> > >     return !tp->retrans_stamp || ...
> > >
> > > Check for the false fast retransmit transient condition in
> > > tcp_packet_delayed to avoid bogus undos. Since snd_una may have
> > > advanced on this ACK but CA state still remains unchanged,
> > > prior_snd_una needs to be passed instead of tp->snd_una.
> >
> > This one also seems like a case where it would be nice to have a
> > specific packet-by-packet example, or trace, or packetdrill scenario.
> > Something that we might be able to translate into a test, or at least
> > to document the issue more explicitly.
>
> I am hesitate for further logic to make undo "perfect" on non-sack
> cases b/c undo is very complicated and SACK is extremely
> well-supported today. so a trace to demonstrate how severe this issue
> is appreciated.

This is not just some remote corner cases to which I perhaps would 
understand your "making undo perfect" comment. Those undos result in
a burst that, at worst, triggers additional buffer overflow because the 
correct CC action is cancelled. Unfortunately I don't have now permission
to publish the time-seq graph about it but I've tried to improve the
changelog messages so that you can better understand under which
conditions the problem occurs.

SACK case remains the same even after this change. I did rework the
logic a bit though (pass prior_snd_una rather than flag around) to
make it more obvious but the change is unfortunately lengthy (no matter
what I pass through the call-chain).


-- 
 i.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled
  2018-03-09 14:11       ` Ilpo Järvinen
@ 2018-03-09 14:32         ` Eric Dumazet
  2018-03-09 15:28           ` David Miller
  2018-03-09 15:23         ` David Miller
  1 sibling, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2018-03-09 14:32 UTC (permalink / raw)
  To: Ilpo Järvinen, Yuchung Cheng
  Cc: Neal Cardwell, Netdev, Eric Dumazet, Willem de Bruijn



On 03/09/2018 06:11 AM, Ilpo Jï¿œrvinen wrote:
> On Wed, 7 Mar 2018, Yuchung Cheng wrote:
> 
>> On Wed, Mar 7, 2018 at 12:19 PM, Neal Cardwell <ncardwell@google.com> wrote:
>>>
>>> On Wed, Mar 7, 2018 at 7:59 AM, Ilpo Jï¿œrvinen <ilpo.jarvinen@helsinki.fi> wrote:
>>>> A bogus undo may/will trigger when the loss recovery state is
>>>> kept until snd_una is above high_seq. If tcp_any_retrans_done
>>>> is zero, retrans_stamp is cleared in this transient state. On
>>>> the next ACK, tcp_try_undo_recovery again executes and
>>>> tcp_may_undo will always return true because tcp_packet_delayed
>>>> has this condition:
>>>>      return !tp->retrans_stamp || ...
>>>>
>>>> Check for the false fast retransmit transient condition in
>>>> tcp_packet_delayed to avoid bogus undos. Since snd_una may have
>>>> advanced on this ACK but CA state still remains unchanged,
>>>> prior_snd_una needs to be passed instead of tp->snd_una.
>>>
>>> This one also seems like a case where it would be nice to have a
>>> specific packet-by-packet example, or trace, or packetdrill scenario.
>>> Something that we might be able to translate into a test, or at least
>>> to document the issue more explicitly.
>>
>> I am hesitate for further logic to make undo "perfect" on non-sack
>> cases b/c undo is very complicated and SACK is extremely
>> well-supported today. so a trace to demonstrate how severe this issue
>> is appreciated.
> 
> This is not just some remote corner cases to which I perhaps would
> understand your "making undo perfect" comment. Those undos result in
> a burst that, at worst, triggers additional buffer overflow because the
> correct CC action is cancelled. Unfortunately I don't have now permission
> to publish the time-seq graph about it but I've tried to improve the
> changelog messages so that you can better understand under which
> conditions the problem occurs.
> 
> SACK case remains the same even after this change. I did rework the
> logic a bit though (pass prior_snd_una rather than flag around) to
> make it more obvious but the change is unfortunately lengthy (no matter
> what I pass through the call-chain).
> 
> 

Can you provide packetdrill test(s) then if you can not provide traces ?

If we can not have tests, we will likely have future regressions, since
most developments are using SACK in mind.

Fact that these bugs have been unnoticed for years is concerning.

We have the goal of updating packetdrill and publish soon our regression 
suite (about 1500 TCP tests)

CC Willem who is working on that.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled
  2018-03-09 14:11       ` Ilpo Järvinen
  2018-03-09 14:32         ` Eric Dumazet
@ 2018-03-09 15:23         ` David Miller
  2018-03-09 19:23           ` Ilpo Järvinen
  2018-03-13 10:24           ` Ilpo Järvinen
  1 sibling, 2 replies; 23+ messages in thread
From: David Miller @ 2018-03-09 15:23 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: ycheng, ncardwell, netdev, edumazet

From: Ilpo J�rvinen <ilpo.jarvinen@helsinki.fi>
Date: Fri, 9 Mar 2018 16:11:47 +0200 (EET)

> Unfortunately I don't have now permission to publish the time-seq
> graph about it but I've tried to improve the changelog messages so
> that you can better understand under which conditions the problem
> occurs.

It is indeed extremely unfortunate that you wish to justify a change
for which you cannot provide the supporting data at all.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled
  2018-03-09 14:32         ` Eric Dumazet
@ 2018-03-09 15:28           ` David Miller
  0 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2018-03-09 15:28 UTC (permalink / raw)
  To: eric.dumazet; +Cc: ilpo.jarvinen, ycheng, ncardwell, netdev, edumazet, willemb

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 9 Mar 2018 06:32:21 -0800

> We have the goal of updating packetdrill and publish soon our
> regression suite (about 1500 TCP tests)
> 
> CC Willem who is working on that.

Excellent news!

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled
  2018-03-09 15:23         ` David Miller
@ 2018-03-09 19:23           ` Ilpo Järvinen
  2018-03-13 10:24           ` Ilpo Järvinen
  1 sibling, 0 replies; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-09 19:23 UTC (permalink / raw)
  To: David Miller; +Cc: Yuchung Cheng, ncardwell, Netdev, edumazet

[-- Attachment #1: Type: text/plain, Size: 1181 bytes --]

On Fri, 9 Mar 2018, David Miller wrote:

> From: Ilpo Jï¿œrvinen <ilpo.jarvinen@helsinki.fi>
> Date: Fri, 9 Mar 2018 16:11:47 +0200 (EET)
> 
> > Unfortunately I don't have now permission to publish the time-seq
> > graph about it but I've tried to improve the changelog messages so
> > that you can better understand under which conditions the problem
> > occurs.
> 
> It is indeed extremely unfortunate that you wish to justify a change
> for which you cannot provide the supporting data at all.

Well, the permission for time-seq graps was/is still simply just in 
pending state and then my patience on sending v2 out ran out :-).

Given that I perceived right from the start that the main purpose for
that "data" was to create a packetdrill test for that testsuite (that
is still in the soon-to-be-published state after all these years ;-)),
I didn't find significant benefit from a time-seq graph compared with
more verbose description that is now place in the changelog. In fact,
I think that time-seq graph well into the flow will be less useful than
the current explination in the changelog if one wants to come up a
packetdrill test but I'm no packetdrill expert.


-- 
 i.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled
  2018-03-09 15:23         ` David Miller
  2018-03-09 19:23           ` Ilpo Järvinen
@ 2018-03-13 10:24           ` Ilpo Järvinen
  1 sibling, 0 replies; 23+ messages in thread
From: Ilpo Järvinen @ 2018-03-13 10:24 UTC (permalink / raw)
  To: David Miller; +Cc: Yuchung Cheng, ncardwell, Netdev, edumazet

[-- Attachment #1: Type: text/plain, Size: 1067 bytes --]

On Fri, 9 Mar 2018, David Miller wrote:

> From: Ilpo Jï¿œrvinen <ilpo.jarvinen@helsinki.fi>
> Date: Fri, 9 Mar 2018 16:11:47 +0200 (EET)
> 
> > Unfortunately I don't have now permission to publish the time-seq
> > graph about it but I've tried to improve the changelog messages so
> > that you can better understand under which conditions the problem
> > occurs.
> 
> It is indeed extremely unfortunate that you wish to justify a change
> for which you cannot provide the supporting data at all.

Here is the time-seqno graph about the issue:

https://www.cs.helsinki.fi/u/ijjarvin/linux/nonsackbugs/recovery_undo_bug.pdf

First the correct CC action (wnd reduction) occurs; then bogus undo 
causes bursting back to the window with which the congestion losses 
occurred earlier; because of the burst, some packets get lost due to 
congestion again.

The sender is actually somewhat lucky here: If only one packet would get 
lost instead of three, the same process would repeat for the next recovery 
(as cumulative ACK to high_seq condition would reoccur).


-- 
 i.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2018-03-13 10:25 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-07 12:59 [PATCH net 0/5] tcp: fixes to non-SACK TCP Ilpo Järvinen
2018-03-07 12:59 ` [PATCH net 1/5] tcp: feed correct number of pkts acked to cc modules also in recovery Ilpo Järvinen
2018-03-07 12:59 ` [PATCH net 2/5] tcp: prevent bogus FRTO undos with non-SACK flows Ilpo Järvinen
2018-03-07 19:24   ` Neal Cardwell
2018-03-07 19:54     ` Yuchung Cheng
2018-03-07 22:19       ` Ilpo Järvinen
2018-03-07 12:59 ` [PATCH net 3/5] tcp: move false FR condition into tcp_false_fast_retrans_possible() Ilpo Järvinen
2018-03-07 15:50   ` Eric Dumazet
2018-03-07 12:59 ` [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled Ilpo Järvinen
2018-03-07 20:19   ` Neal Cardwell
2018-03-07 23:48     ` Yuchung Cheng
2018-03-09 14:11       ` Ilpo Järvinen
2018-03-09 14:32         ` Eric Dumazet
2018-03-09 15:28           ` David Miller
2018-03-09 15:23         ` David Miller
2018-03-09 19:23           ` Ilpo Järvinen
2018-03-13 10:24           ` Ilpo Järvinen
2018-03-07 12:59 ` [PATCH net 5/5] tcp: send real dupACKs by locking advertized window for non-SACK flows Ilpo Järvinen
2018-03-07 15:58   ` Eric Dumazet
2018-03-07 20:09     ` Ilpo Järvinen
2018-03-07 20:13       ` Eric Dumazet
2018-03-07 21:39         ` Ilpo Järvinen
2018-03-07 22:01           ` Eric Dumazet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.