* [PATCH net-next 0/4] tcp: implement SACK compression
@ 2018-05-17 12:12 Eric Dumazet
  2018-05-17 12:12 ` [PATCH net-next 1/4] tcp: use __sock_put() instead of sock_put() in tcp_clear_xmit_timers() Eric Dumazet
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Eric Dumazet @ 2018-05-17 12:12 UTC
  To: David S . Miller
  Cc: netdev, Toke Høiland-Jørgensen, Neal Cardwell,
	Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet

When TCP receives an out-of-order packet, it immediately sends
a SACK packet, generating network load but also forcing the
receiver of those SACKs (the data sender) to send pathological
1-MSS packets, increasing its RTX (retransmit) queue length/depth,
and thus its processing time.

Wifi networks suffer from this aggressive behavior; more generally,
all these SACK packets add fuel to the fire when networks are
already congested.

This patch series adds SACK compression, but the infrastructure
could be leveraged to also compress plain ACKs in the future.

Eric Dumazet (4):
  tcp: use __sock_put() instead of sock_put() in tcp_clear_xmit_timers()
  tcp: do not force quickack when receiving out-of-order packets
  tcp: add SACK compression
  tcp: add TCPAckCompressed SNMP counter

 include/linux/tcp.h       |  2 ++
 include/net/tcp.h         |  5 ++++-
 include/uapi/linux/snmp.h |  1 +
 net/ipv4/proc.c           |  1 +
 net/ipv4/tcp.c            |  1 +
 net/ipv4/tcp_input.c      | 33 +++++++++++++++++++++++++--------
 net/ipv4/tcp_output.c     |  9 +++++++++
 net/ipv4/tcp_timer.c      | 25 +++++++++++++++++++++++++
 8 files changed, 68 insertions(+), 9 deletions(-)

-- 
2.17.0.441.gb46fe60e1d-goog


* [PATCH net-next 1/4] tcp: use __sock_put() instead of sock_put() in tcp_clear_xmit_timers()
  2018-05-17 12:12 [PATCH net-next 0/4] tcp: implement SACK compression Eric Dumazet
@ 2018-05-17 12:12 ` Eric Dumazet
  2018-05-17 14:53   ` Neal Cardwell
  2018-05-17 12:12 ` [PATCH net-next 2/4] tcp: do not force quickack when receiving out-of-order packets Eric Dumazet
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-05-17 12:12 UTC
  To: David S . Miller
  Cc: netdev, Toke Høiland-Jørgensen, Neal Cardwell,
	Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet

The socket cannot disappear from under us: the caller still holds a
reference, so the refcount cannot drop to zero here and the cheaper
__sock_put() is sufficient.
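
For context, here is the difference between the two helpers (simplified
from include/net/sock.h):

	/* __sock_put(): drop a reference that is known not to be the last one. */
	static inline void __sock_put(struct sock *sk)
	{
		refcount_dec(&sk->sk_refcnt);
	}

	/* sock_put(): drop a reference, freeing the socket if it was the last. */
	static inline void sock_put(struct sock *sk)
	{
		if (refcount_dec_and_test(&sk->sk_refcnt))
			sk_free(sk);
	}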

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/tcp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index a08eab58ef7001b3e141e3722fd8a3875e5c5d7d..6ffc8bd894876ad23407f5ec4994350139af85e7 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -558,7 +558,7 @@ void tcp_init_xmit_timers(struct sock *);
 static inline void tcp_clear_xmit_timers(struct sock *sk)
 {
 	if (hrtimer_try_to_cancel(&tcp_sk(sk)->pacing_timer) == 1)
-		sock_put(sk);
+		__sock_put(sk);
 
 	inet_csk_clear_xmit_timers(sk);
 }
-- 
2.17.0.441.gb46fe60e1d-goog


* [PATCH net-next 2/4] tcp: do not force quickack when receiving out-of-order packets
  2018-05-17 12:12 [PATCH net-next 0/4] tcp: implement SACK compression Eric Dumazet
  2018-05-17 12:12 ` [PATCH net-next 1/4] tcp: use __sock_put() instead of sock_put() in tcp_clear_xmit_timers() Eric Dumazet
@ 2018-05-17 12:12 ` Eric Dumazet
  2018-05-17 14:55   ` Neal Cardwell
  2018-05-17 17:14   ` Soheil Hassas Yeganeh
  2018-05-17 12:12 ` [PATCH net-next 3/4] tcp: add SACK compression Eric Dumazet
  2018-05-17 12:12 ` [PATCH net-next 4/4] tcp: add TCPAckCompressed SNMP counter Eric Dumazet
  3 siblings, 2 replies; 16+ messages in thread
From: Eric Dumazet @ 2018-05-17 12:12 UTC
  To: David S . Miller
  Cc: netdev, Toke Høiland-Jørgensen, Neal Cardwell,
	Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet

As explained in commit 9f9843a751d0 ("tcp: properly handle stretch
acks in slow start"), TCP stacks have to consider how many packets
are acknowledged in one single ACK, because of GRO, but also
because of ACK compression or losses.

We plan to add SACK compression in a following patch; we must
therefore not call tcp_enter_quickack_mode(), which would force an
immediate ACK for every out-of-order packet and defeat the compression.
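
As a reminder, "handling stretch ACKs properly" means crediting cwnd by
the number of newly acknowledged packets rather than by one per received
ACK; a schematic of that accounting (not the exact tcp_slow_start() source):

	/* Grow cwnd by how many packets this ACK newly covered, so that one
	 * compressed/stretched ACK counts like many individual ACKs.
	 */
	static u32 slow_start_cwnd(u32 cwnd, u32 cwnd_clamp, u32 acked)
	{
		return min(cwnd + acked, cwnd_clamp);
	}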

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_input.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index b188e0d75edd9e5e1c9f0355818caa932fef7416..99fcab7e6570c8b8758ea4b15cdd26df29fb4fd6 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4708,8 +4708,6 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 	if (!before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt + tcp_receive_window(tp)))
 		goto out_of_window;
 
-	tcp_enter_quickack_mode(sk);
-
 	if (before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
 		/* Partial packet, seq < rcv_next < end_seq */
 		SOCK_DEBUG(sk, "partial packet: rcv_next %X seq %X - %X\n",
-- 
2.17.0.441.gb46fe60e1d-goog


* [PATCH net-next 3/4] tcp: add SACK compression
  2018-05-17 12:12 [PATCH net-next 0/4] tcp: implement SACK compression Eric Dumazet
  2018-05-17 12:12 ` [PATCH net-next 1/4] tcp: use __sock_put() instead of sock_put() in tcp_clear_xmit_timers() Eric Dumazet
  2018-05-17 12:12 ` [PATCH net-next 2/4] tcp: do not force quickack when receiving out-of-order packets Eric Dumazet
@ 2018-05-17 12:12 ` Eric Dumazet
  2018-05-17 15:14   ` Neal Cardwell
  2018-05-17 12:12 ` [PATCH net-next 4/4] tcp: add TCPAckCompressed SNMP counter Eric Dumazet
  3 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-05-17 12:12 UTC
  To: David S . Miller
  Cc: netdev, Toke Høiland-Jørgensen, Neal Cardwell,
	Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet

When TCP receives an out-of-order packet, it immediately sends
a SACK packet, generating network load but also forcing the
receiver of those SACKs (the data sender) to send pathological
1-MSS packets, increasing its RTX (retransmit) queue length/depth,
and thus its processing time.

Wifi networks suffer from this aggressive behavior; more generally,
all these SACK packets add fuel to the fire when networks are
already congested.

This patch adds a high-resolution timer and a tp->compressed_ack counter.

Instead of sending a SACK immediately, we program this timer with a small
delay, based on SRTT and capped at 2.5 ms: delay = min(5% of SRTT, 2.5 ms).

If subsequent SACKs need to be sent while the timer has not yet expired,
we simply increment tp->compressed_ack.

When the timer expires, a SACK is sent with the latest information.

Note that tcp_sack_new_ofo_skb() is able to force a SACK to be sent
if the SACK blocks need to be shuffled, even if the timer has not
expired.

A new SNMP counter is added in the following patch.
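
For readers checking the arithmetic in the patch below: tp->rcv_rtt_est.rtt_us
is stored left-shifted by 3 (microseconds << 3), which is why the code
multiplies by (NSEC_PER_USEC >> 3). A standalone sketch of the computation
(illustrative only, not the exact kernel expression):

	/* delay = min(5% of SRTT, 2.5 ms), returned in nanoseconds.
	 * rtt_shifted is the receive RTT estimate as the kernel stores it:
	 * microseconds left-shifted by 3.
	 */
	static unsigned long compressed_ack_delay_ns(unsigned long rtt_shifted)
	{
		unsigned long rtt_ns = (rtt_shifted >> 3) * 1000UL; /* usec -> nsec */
		unsigned long delay = rtt_ns / 20;                  /* 5% of SRTT  */

		return delay < 2500000UL ? delay : 2500000UL;       /* cap: 2.5 ms */
	}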

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/tcp.h   |  2 ++
 include/net/tcp.h     |  3 +++
 net/ipv4/tcp.c        |  1 +
 net/ipv4/tcp_input.c  | 31 +++++++++++++++++++++++++------
 net/ipv4/tcp_output.c |  7 +++++++
 net/ipv4/tcp_timer.c  | 25 +++++++++++++++++++++++++
 6 files changed, 63 insertions(+), 6 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 807776928cb8610fe97121fbc3c600b08d5d2991..72705eaf4b84060a45bf04d5170f389a18010eac 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -218,6 +218,7 @@ struct tcp_sock {
 		   reord:1;	 /* reordering detected */
 	} rack;
 	u16	advmss;		/* Advertised MSS			*/
+	u8	compressed_ack;
 	u32	chrono_start;	/* Start time in jiffies of a TCP chrono */
 	u32	chrono_stat[3];	/* Time in jiffies for chrono_stat stats */
 	u8	chrono_type:2,	/* current chronograph type */
@@ -297,6 +298,7 @@ struct tcp_sock {
 	u32	sacked_out;	/* SACK'd packets			*/
 
 	struct hrtimer	pacing_timer;
+	struct hrtimer	compressed_ack_timer;
 
 	/* from STCP, retrans queue hinting */
 	struct sk_buff* lost_skb_hint;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 6ffc8bd894876ad23407f5ec4994350139af85e7..c8c65ae62955eb12a9a6489fa8e008fd89f89f16 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -560,6 +560,9 @@ static inline void tcp_clear_xmit_timers(struct sock *sk)
 	if (hrtimer_try_to_cancel(&tcp_sk(sk)->pacing_timer) == 1)
 		__sock_put(sk);
 
+	if (hrtimer_try_to_cancel(&tcp_sk(sk)->compressed_ack_timer) == 1)
+		__sock_put(sk);
+
 	inet_csk_clear_xmit_timers(sk);
 }
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 62b776f9003798eaf06992a4eb0914d17646aa61..0a2ea0bbf867271db05aedd7d48b193677664321 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2595,6 +2595,7 @@ int tcp_disconnect(struct sock *sk, int flags)
 	dst_release(sk->sk_rx_dst);
 	sk->sk_rx_dst = NULL;
 	tcp_saved_syn_free(tp);
+	tp->compressed_ack = 0;
 
 	/* Clean up fastopen related fields */
 	tcp_free_fastopen_req(tp);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 99fcab7e6570c8b8758ea4b15cdd26df29fb4fd6..58feea67b6bb147fa9e75b8b514a9a41576b512b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4242,6 +4242,8 @@ static void tcp_sack_new_ofo_skb(struct sock *sk, u32 seq, u32 end_seq)
 	 * If the sack array is full, forget about the last one.
 	 */
 	if (this_sack >= TCP_NUM_SACKS) {
+		if (tp->compressed_ack)
+			tcp_send_ack(sk);
 		this_sack--;
 		tp->rx_opt.num_sacks--;
 		sp--;
@@ -5074,6 +5076,7 @@ static inline void tcp_data_snd_check(struct sock *sk)
 static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
+	unsigned long delay;
 
 	    /* More than one full frame received... */
 	if (((tp->rcv_nxt - tp->rcv_wup) > inet_csk(sk)->icsk_ack.rcv_mss &&
@@ -5085,15 +5088,31 @@ static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
 	    (tp->rcv_nxt - tp->copied_seq < sk->sk_rcvlowat ||
 	     __tcp_select_window(sk) >= tp->rcv_wnd)) ||
 	    /* We ACK each frame or... */
-	    tcp_in_quickack_mode(sk) ||
-	    /* We have out of order data. */
-	    (ofo_possible && !RB_EMPTY_ROOT(&tp->out_of_order_queue))) {
-		/* Then ack it now */
+	    tcp_in_quickack_mode(sk)) {
+send_now:
 		tcp_send_ack(sk);
-	} else {
-		/* Else, send delayed ack. */
+		return;
+	}
+
+	if (!ofo_possible || RB_EMPTY_ROOT(&tp->out_of_order_queue)) {
 		tcp_send_delayed_ack(sk);
+		return;
 	}
+
+	if (!tcp_is_sack(tp) || tp->compressed_ack >= 127)
+		goto send_now;
+	tp->compressed_ack++;
+
+	if (hrtimer_is_queued(&tp->compressed_ack_timer))
+		return;
+
+	/* compress ack timer : 5 % of srtt, but no more than 2.5 ms */
+
+	delay = min_t(unsigned long, 2500 * NSEC_PER_USEC,
+		      tp->rcv_rtt_est.rtt_us * (NSEC_PER_USEC >> 3)/20);
+	sock_hold(sk);
+	hrtimer_start(&tp->compressed_ack_timer, ns_to_ktime(delay),
+		      HRTIMER_MODE_REL_PINNED_SOFT);
 }
 
 static inline void tcp_ack_snd_check(struct sock *sk)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 0d8f950a9006598c70dbf51e281a3fe32dfaa234..7ee98aad82b758674ca7f3e90bd3fc165e8fcd45 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -162,6 +162,13 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
 /* Account for an ACK we sent. */
 static inline void tcp_event_ack_sent(struct sock *sk, unsigned int pkts)
 {
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (unlikely(tp->compressed_ack)) {
+		tp->compressed_ack = 0;
+		if (hrtimer_try_to_cancel(&tp->compressed_ack_timer) == 1)
+			__sock_put(sk);
+	}
 	tcp_dec_quickack_mode(sk, pkts);
 	inet_csk_clear_xmit_timer(sk, ICSK_TIME_DACK);
 }
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 92bdf64fffae3a5be291ca419eb21276b4c8cbae..3b3611729928f77934e0298bb248e55c7a7c5def 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -708,6 +708,27 @@ static void tcp_keepalive_timer (struct timer_list *t)
 	sock_put(sk);
 }
 
+static enum hrtimer_restart tcp_compressed_ack_kick(struct hrtimer *timer)
+{
+	struct tcp_sock *tp = container_of(timer, struct tcp_sock, compressed_ack_timer);
+	struct sock *sk = (struct sock *)tp;
+
+	bh_lock_sock(sk);
+	if (!sock_owned_by_user(sk)) {
+		if (tp->compressed_ack)
+			tcp_send_ack(sk);
+	} else {
+		if (!test_and_set_bit(TCP_DELACK_TIMER_DEFERRED,
+				      &sk->sk_tsq_flags))
+			sock_hold(sk);
+	}
+	bh_unlock_sock(sk);
+
+	sock_put(sk);
+
+	return HRTIMER_NORESTART;
+}
+
 void tcp_init_xmit_timers(struct sock *sk)
 {
 	inet_csk_init_xmit_timers(sk, &tcp_write_timer, &tcp_delack_timer,
@@ -715,4 +736,8 @@ void tcp_init_xmit_timers(struct sock *sk)
 	hrtimer_init(&tcp_sk(sk)->pacing_timer, CLOCK_MONOTONIC,
 		     HRTIMER_MODE_ABS_PINNED_SOFT);
 	tcp_sk(sk)->pacing_timer.function = tcp_pace_kick;
+
+	hrtimer_init(&tcp_sk(sk)->compressed_ack_timer, CLOCK_MONOTONIC,
+		     HRTIMER_MODE_REL_PINNED_SOFT);
+	tcp_sk(sk)->compressed_ack_timer.function = tcp_compressed_ack_kick;
 }
-- 
2.17.0.441.gb46fe60e1d-goog


* [PATCH net-next 4/4] tcp: add TCPAckCompressed SNMP counter
  2018-05-17 12:12 [PATCH net-next 0/4] tcp: implement SACK compression Eric Dumazet
                   ` (2 preceding siblings ...)
  2018-05-17 12:12 ` [PATCH net-next 3/4] tcp: add SACK compression Eric Dumazet
@ 2018-05-17 12:12 ` Eric Dumazet
  2018-05-17 14:55   ` Neal Cardwell
  3 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-05-17 12:12 UTC
  To: David S . Miller
  Cc: netdev, Toke Høiland-Jørgensen, Neal Cardwell,
	Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet

This counter tracks the number of ACK packets that the host has not sent,
thanks to ACK compression.

Sample output:

$ nstat -n;sleep 1;nstat|egrep "IpInReceives|IpOutRequests|TcpInSegs|TcpOutSegs|TcpExtTCPAckCompressed"
IpInReceives                    123250             0.0
IpOutRequests                   3684               0.0
TcpInSegs                       123251             0.0
TcpOutSegs                      3684               0.0
TcpExtTCPAckCompressed          119252             0.0
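
In this sample, 119252 ACKs were suppressed while only 3684 segments were
actually sent; 3684 + 119252 = 122936, close to the 123251 inbound segments,
i.e. almost every received segment would otherwise have triggered its own ACK.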

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/uapi/linux/snmp.h | 1 +
 net/ipv4/proc.c           | 1 +
 net/ipv4/tcp_output.c     | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index d02e859301ff499dd72a1c0e1b56bed10a9397a6..750d89120335eb489f698191edb6c5110969fa8c 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -278,6 +278,7 @@ enum
 	LINUX_MIB_TCPMTUPSUCCESS,		/* TCPMTUPSuccess */
 	LINUX_MIB_TCPDELIVERED,			/* TCPDelivered */
 	LINUX_MIB_TCPDELIVEREDCE,		/* TCPDeliveredCE */
+	LINUX_MIB_TCPACKCOMPRESSED,		/* TCPAckCompressed */
 	__LINUX_MIB_MAX
 };
 
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 261b71d0ccc5c17c6032bf67eb8f842006766e64..6c1ff89a60fa0a3485dcc71fafc799e798d5dc11 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -298,6 +298,7 @@ static const struct snmp_mib snmp4_net_list[] = {
 	SNMP_MIB_ITEM("TCPMTUPSuccess", LINUX_MIB_TCPMTUPSUCCESS),
 	SNMP_MIB_ITEM("TCPDelivered", LINUX_MIB_TCPDELIVERED),
 	SNMP_MIB_ITEM("TCPDeliveredCE", LINUX_MIB_TCPDELIVEREDCE),
+	SNMP_MIB_ITEM("TCPAckCompressed", LINUX_MIB_TCPACKCOMPRESSED),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 7ee98aad82b758674ca7f3e90bd3fc165e8fcd45..437bb7ceba7fd388abac1c12f2920b02be77bad9 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -165,6 +165,8 @@ static inline void tcp_event_ack_sent(struct sock *sk, unsigned int pkts)
 	struct tcp_sock *tp = tcp_sk(sk);
 
 	if (unlikely(tp->compressed_ack)) {
+		NET_ADD_STATS(sock_net(sk), LINUX_MIB_TCPACKCOMPRESSED,
+			      tp->compressed_ack);
 		tp->compressed_ack = 0;
 		if (hrtimer_try_to_cancel(&tp->compressed_ack_timer) == 1)
 			__sock_put(sk);
-- 
2.17.0.441.gb46fe60e1d-goog


* Re: [PATCH net-next 1/4] tcp: use __sock_put() instead of sock_put() in tcp_clear_xmit_timers()
  2018-05-17 12:12 ` [PATCH net-next 1/4] tcp: use __sock_put() instead of sock_put() in tcp_clear_xmit_timers() Eric Dumazet
@ 2018-05-17 14:53   ` Neal Cardwell
  0 siblings, 0 replies; 16+ messages in thread
From: Neal Cardwell @ 2018-05-17 14:53 UTC
  To: Eric Dumazet
  Cc: David Miller, Netdev, Toke Høiland-Jørgensen,
	Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet

On Thu, May 17, 2018 at 8:12 AM Eric Dumazet <edumazet@google.com> wrote:

> Socket can not disappear under us.

> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---

Acked-by: Neal Cardwell <ncardwell@google.com>

Thanks!

neal


* Re: [PATCH net-next 2/4] tcp: do not force quickack when receiving out-of-order packets
  2018-05-17 12:12 ` [PATCH net-next 2/4] tcp: do not force quickack when receiving out-of-order packets Eric Dumazet
@ 2018-05-17 14:55   ` Neal Cardwell
  2018-05-17 17:14   ` Soheil Hassas Yeganeh
  1 sibling, 0 replies; 16+ messages in thread
From: Neal Cardwell @ 2018-05-17 14:55 UTC
  To: Eric Dumazet
  Cc: David Miller, Netdev, Toke Høiland-Jørgensen,
	Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet

On Thu, May 17, 2018 at 8:12 AM Eric Dumazet <edumazet@google.com> wrote:

> As explained in commit 9f9843a751d0 ("tcp: properly handle stretch
> acks in slow start"), TCP stacks have to consider how many packets
> are acknowledged in one single ACK, because of GRO, but also
> because of ACK compression or losses.

> We plan to add SACK compression in the following patch, we
> must therefore not call tcp_enter_quickack_mode()

> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---

Acked-by: Neal Cardwell <ncardwell@google.com>

Thanks!

neal


* Re: [PATCH net-next 4/4] tcp: add TCPAckCompressed SNMP counter
  2018-05-17 12:12 ` [PATCH net-next 4/4] tcp: add TCPAckCompressed SNMP counter Eric Dumazet
@ 2018-05-17 14:55   ` Neal Cardwell
  0 siblings, 0 replies; 16+ messages in thread
From: Neal Cardwell @ 2018-05-17 14:55 UTC
  To: Eric Dumazet
  Cc: David Miller, Netdev, Toke Høiland-Jørgensen,
	Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet

On Thu, May 17, 2018 at 8:12 AM Eric Dumazet <edumazet@google.com> wrote:

> This counter tracks number of ACK packets that the host has not sent,
> thanks to ACK compression.

> Sample output :

> $ nstat -n;sleep 1;nstat|egrep "IpInReceives|IpOutRequests|TcpInSegs|TcpOutSegs|TcpExtTCPAckCompressed"
> IpInReceives                    123250             0.0
> IpOutRequests                   3684               0.0
> TcpInSegs                       123251             0.0
> TcpOutSegs                      3684               0.0
> TcpExtTCPAckCompressed          119252             0.0

> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---

Acked-by: Neal Cardwell <ncardwell@google.com>

Thanks!

neal


* Re: [PATCH net-next 3/4] tcp: add SACK compression
  2018-05-17 12:12 ` [PATCH net-next 3/4] tcp: add SACK compression Eric Dumazet
@ 2018-05-17 15:14   ` Neal Cardwell
  2018-05-17 15:40     ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Neal Cardwell @ 2018-05-17 15:14 UTC
  To: Eric Dumazet
  Cc: David Miller, Netdev, Toke Høiland-Jørgensen,
	Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet

On Thu, May 17, 2018 at 8:12 AM Eric Dumazet <edumazet@google.com> wrote:

> When TCP receives an out-of-order packet, it immediately sends
> a SACK packet, generating network load but also forcing the
> receiver to send 1-MSS pathological packets, increasing its
> RTX queue length/depth, and thus processing time.

> Wifi networks suffer from this aggressive behavior, but generally
> speaking, all these SACK packets add fuel to the fire when networks
> are under congestion.

> This patch adds a high resolution timer and tp->compressed_ack counter.

> Instead of sending a SACK, we program this timer with a small delay,
> based on SRTT and capped to 2.5 ms : delay = min ( 5 % of SRTT, 2.5 ms)
...

Very nice. Thanks for implementing this, Eric! I was wondering if the
constants here might be worth some discussion/elaboration.

> @@ -5074,6 +5076,7 @@ static inline void tcp_data_snd_check(struct sock *sk)
>   static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
>   {
>          struct tcp_sock *tp = tcp_sk(sk);
> +       unsigned long delay;

>              /* More than one full frame received... */
>          if (((tp->rcv_nxt - tp->rcv_wup) > inet_csk(sk)->icsk_ack.rcv_mss &&
> @@ -5085,15 +5088,31 @@ static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
>              (tp->rcv_nxt - tp->copied_seq < sk->sk_rcvlowat ||
>               __tcp_select_window(sk) >= tp->rcv_wnd)) ||
>              /* We ACK each frame or... */
> -           tcp_in_quickack_mode(sk) ||
> -           /* We have out of order data. */
> -           (ofo_possible && !RB_EMPTY_ROOT(&tp->out_of_order_queue))) {
> -               /* Then ack it now */
> +           tcp_in_quickack_mode(sk)) {
> +send_now:
>                  tcp_send_ack(sk);
> -       } else {
> -               /* Else, send delayed ack. */
> +               return;
> +       }
> +
> +       if (!ofo_possible || RB_EMPTY_ROOT(&tp->out_of_order_queue)) {
>                  tcp_send_delayed_ack(sk);
> +               return;
>          }
> +
> +       if (!tcp_is_sack(tp) || tp->compressed_ack >= 127)
> +               goto send_now;
> +       tp->compressed_ack++;

Is there a particular motivation for the cap of 127? IMHO 127 ACKs is quite
a few to compress. Experience seems to show that it works well to have one
GRO ACK for ~64KBytes that triggers a single TSO skb of ~64KBytes. It might
be nice to try to match those dynamics in this SACK compression case, so it
might be nice to cap the number of compressed ACKs at something like 44?
(0xffff / 1448 - 1).  That way for high-speed paths we could try to keep
the ACK clock going with ACKs for ~64KBytes that trigger a single TSO skb
of ~64KBytes, no matter whether we are sending SACKs or cumulative ACKs.
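
(For reference: 0xffff = 65535 bytes is the classic 64KByte limit of a single
TSO skb, and 65535 / 1448 ≈ 45.2 full-sized 1448-byte segments, hence the
suggested cap of 44.)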

> +
> +       if (hrtimer_is_queued(&tp->compressed_ack_timer))
> +               return;
> +
> +       /* compress ack timer : 5 % of srtt, but no more than 2.5 ms */
> +
> +       delay = min_t(unsigned long, 2500 * NSEC_PER_USEC,
> +                     tp->rcv_rtt_est.rtt_us * (NSEC_PER_USEC >> 3)/20);

Any particular motivation for the 2.5ms here? It might be nice to match the
existing TSO autosizing dynamics and use 1ms here instead of having a
separate new constant of 2.5ms. Smaller time scales here should lead to
less burstiness and queue pressure from data packets in the network, and we
know from experience that the CPU overhead of 1ms chunks is acceptable.

thanks,
neal


* Re: [PATCH net-next 3/4] tcp: add SACK compression
  2018-05-17 15:14   ` Neal Cardwell
@ 2018-05-17 15:40     ` Eric Dumazet
  2018-05-17 15:46       ` Eric Dumazet
  2018-05-17 16:41       ` Neal Cardwell
  0 siblings, 2 replies; 16+ messages in thread
From: Eric Dumazet @ 2018-05-17 15:40 UTC
  To: Neal Cardwell, Eric Dumazet
  Cc: David Miller, Netdev, Toke Høiland-Jørgensen,
	Yuchung Cheng, Soheil Hassas Yeganeh



On 05/17/2018 08:14 AM, Neal Cardwell wrote:
> On Thu, May 17, 2018 at 8:12 AM Eric Dumazet <edumazet@google.com> wrote:
> 
>> When TCP receives an out-of-order packet, it immediately sends
>> a SACK packet, generating network load but also forcing the
>> receiver to send 1-MSS pathological packets, increasing its
>> RTX queue length/depth, and thus processing time.
> 
>> Wifi networks suffer from this aggressive behavior, but generally
>> speaking, all these SACK packets add fuel to the fire when networks
>> are under congestion.
> 
>> This patch adds a high resolution timer and tp->compressed_ack counter.
> 
>> Instead of sending a SACK, we program this timer with a small delay,
>> based on SRTT and capped to 2.5 ms : delay = min ( 5 % of SRTT, 2.5 ms)
> ...
> 
> Very nice. Thanks for implementing this, Eric! I was wondering if the
> constants here might be worth some discussion/elaboration.
> 
>> @@ -5074,6 +5076,7 @@ static inline void tcp_data_snd_check(struct sock *sk)
>>   static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
>>   {
>>          struct tcp_sock *tp = tcp_sk(sk);
>> +       unsigned long delay;
> 
>>              /* More than one full frame received... */
>>          if (((tp->rcv_nxt - tp->rcv_wup) > inet_csk(sk)->icsk_ack.rcv_mss &&
>> @@ -5085,15 +5088,31 @@ static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
>>              (tp->rcv_nxt - tp->copied_seq < sk->sk_rcvlowat ||
>>               __tcp_select_window(sk) >= tp->rcv_wnd)) ||
>>              /* We ACK each frame or... */
>> -           tcp_in_quickack_mode(sk) ||
>> -           /* We have out of order data. */
>> -           (ofo_possible && !RB_EMPTY_ROOT(&tp->out_of_order_queue))) {
>> -               /* Then ack it now */
>> +           tcp_in_quickack_mode(sk)) {
>> +send_now:
>>                  tcp_send_ack(sk);
>> -       } else {
>> -               /* Else, send delayed ack. */
>> +               return;
>> +       }
>> +
>> +       if (!ofo_possible || RB_EMPTY_ROOT(&tp->out_of_order_queue)) {
>>                  tcp_send_delayed_ack(sk);
>> +               return;
>>          }
>> +
>> +       if (!tcp_is_sack(tp) || tp->compressed_ack >= 127)
>> +               goto send_now;
>> +       tp->compressed_ack++;
> 
> Is there a particular motivation for the cap of 127? IMHO 127 ACKs is quite
> a few to compress. Experience seems to show that it works well to have one
> GRO ACK for ~64KBytes that triggers a single TSO skb of ~64KBytes. It might
> be nice to try to match those dynamics in this SACK compression case, so it
> might be nice to cap the number of compressed ACKs at something like 44?
> (0xffff / 1448 - 1).  That way for high-speed paths we could try to keep
> the ACK clock going with ACKs for ~64KBytes that trigger a single TSO skb
> of ~64KBytes, no matter whether we are sending SACKs or cumulative ACKs.

127 was chosen because the field is u8, and since skb allocation for the ACK
can fail, we could have cases where the field goes above 127.

Ultimately, I believe a followup patch would add a sysctl, so that we can fine-tune
this, and eventually disable ACK compression if this sysctl is set to 0.

> 
>> +
>> +       if (hrtimer_is_queued(&tp->compressed_ack_timer))
>> +               return;
>> +
>> +       /* compress ack timer : 5 % of srtt, but no more than 2.5 ms */
>> +
>> +       delay = min_t(unsigned long, 2500 * NSEC_PER_USEC,
>> +                     tp->rcv_rtt_est.rtt_us * (NSEC_PER_USEC >> 3)/20);
> 
> Any particular motivation for the 2.5ms here? It might be nice to match the
> existing TSO autosizing dynamics and use 1ms here instead of having a
> separate new constant of 2.5ms. Smaller time scales here should lead to
> less burstiness and queue pressure from data packets in the network, and we
> know from experience that the CPU overhead of 1ms chunks is acceptable.

This came from my tests on wifi really :)

I also had the idea to make this threshold adjustable for wifi, like we did for sk_pacing_shift.

(On wifi, we might want to increase the max delay between ACK)

So maybe use 1ms delay, when sk_pacing_shift == 10, but increase it if sk_pacing_shift has been lowered.


* Re: [PATCH net-next 3/4] tcp: add SACK compression
  2018-05-17 15:40     ` Eric Dumazet
@ 2018-05-17 15:46       ` Eric Dumazet
  2018-05-17 16:41       ` Neal Cardwell
  1 sibling, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2018-05-17 15:46 UTC
  To: Neal Cardwell, Eric Dumazet
  Cc: David Miller, Netdev, Toke Høiland-Jørgensen,
	Yuchung Cheng, Soheil Hassas Yeganeh



On 05/17/2018 08:40 AM, Eric Dumazet wrote:
> 
> 
> On 05/17/2018 08:14 AM, Neal Cardwell wrote:

>> Any particular motivation for the 2.5ms here? It might be nice to match the
>> existing TSO autosizing dynamics and use 1ms here instead of having a
>> separate new constant of 2.5ms. Smaller time scales here should lead to
>> less burstiness and queue pressure from data packets in the network, and we
>> know from experience that the CPU overhead of 1ms chunks is acceptable.
> 
> This came from my tests on wifi really :)
> 
> I also had the idea to make this threshold adjustable for wifi, like we did for sk_pacing_shift.
> 
> (On wifi, we might want to increase the max delay between ACK)
> 
> So maybe use 1ms delay, when sk_pacing_shift == 10, but increase it if sk_pacing_shift has been lowered.
> 
>

BTW, maybe my changelog or patch is not clear enough:

As soon as some packets are received in order, we send an ACK, even if the timer was armed.
(This is the beginning of __tcp_ack_snd_check().)

When this ACK is sent, the timer is canceled (in tcp_event_ack_sent()).
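
A hypothetical timeline (assuming SRTT = 40 ms, so the timer is armed for
min(5% * 40 ms, 2.5 ms) = 2 ms):

 t = 0      : OOO segment arrives; no SACK is sent; timer armed for 2 ms,
              tp->compressed_ack = 1
 t = 0.5 ms : more OOO segments arrive; tp->compressed_ack is incremented,
              nothing is sent
 t = 1.2 ms : an in-order segment fills the hole; an ACK carrying up-to-date
              SACK blocks is sent at once, and the timer is canceled in
              tcp_event_ack_sent()
 t = 2 ms   : (the timer would have fired here and sent the compressed SACK,
              had the hole not been filled)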


* Re: [PATCH net-next 3/4] tcp: add SACK compression
  2018-05-17 15:40     ` Eric Dumazet
  2018-05-17 15:46       ` Eric Dumazet
@ 2018-05-17 16:41       ` Neal Cardwell
  2018-05-17 16:59         ` Yuchung Cheng
  1 sibling, 1 reply; 16+ messages in thread
From: Neal Cardwell @ 2018-05-17 16:41 UTC
  To: Eric Dumazet
  Cc: Eric Dumazet, David Miller, Netdev,
	Toke Høiland-Jørgensen, Yuchung Cheng,
	Soheil Hassas Yeganeh

On Thu, May 17, 2018 at 11:40 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On 05/17/2018 08:14 AM, Neal Cardwell wrote:
> > Is there a particular motivation for the cap of 127? IMHO 127 ACKs is quite
> > a few to compress. Experience seems to show that it works well to have one
> > GRO ACK for ~64KBytes that triggers a single TSO skb of ~64KBytes. It might
> > be nice to try to match those dynamics in this SACK compression case, so it
> > might be nice to cap the number of compressed ACKs at something like 44?
> > (0xffff / 1448 - 1).  That way for high-speed paths we could try to keep
> > the ACK clock going with ACKs for ~64KBytes that trigger a single TSO skb
> > of ~64KBytes, no matter whether we are sending SACKs or cumulative ACKs.

> 127 was chosen because the field is u8, and since skb allocation for the ACK
> can fail, we could have cases where the field goes above 127.

> Ultimately, I believe a followup patch would add a sysctl, so that we can
> fine-tune this, and eventually disable ACK compression if this sysctl is set to 0

OK, a sysctl sounds good. I would still vote for a default of 44.  :-)


> >> +       if (hrtimer_is_queued(&tp->compressed_ack_timer))
> >> +               return;
> >> +
> >> +       /* compress ack timer : 5 % of srtt, but no more than 2.5 ms */
> >> +
> >> +       delay = min_t(unsigned long, 2500 * NSEC_PER_USEC,
> >> +                     tp->rcv_rtt_est.rtt_us * (NSEC_PER_USEC >> 3)/20);
> >
> > Any particular motivation for the 2.5ms here? It might be nice to match the
> > existing TSO autosizing dynamics and use 1ms here instead of having a
> > separate new constant of 2.5ms. Smaller time scales here should lead to
> > less burstiness and queue pressure from data packets in the network, and we
> > know from experience that the CPU overhead of 1ms chunks is acceptable.

> This came from my tests on wifi really :)

> I also had the idea to make this threshold adjustable for wifi, like we
> did for sk_pacing_shift.

> (On wifi, we might want to increase the max delay between ACK)

> So maybe use 1ms delay, when sk_pacing_shift == 10, but increase it if
> sk_pacing_shift has been lowered.

Sounds good to me.

Thanks for implementing this! Overall this patch seems nice to me.

Acked-by: Neal Cardwell <ncardwell@google.com>

BTW, I guess we should spread the word to maintainers of other major TCP
stacks that they need to be prepared for what may be a much higher degree
of compression/aggregation in the SACK stream. Linux stacks going back many
years should be fine with this, but I'm not sure about the other major OSes
(they may only allow sending one MSS per ACK-with-SACKs received).

neal


* Re: [PATCH net-next 3/4] tcp: add SACK compression
  2018-05-17 16:41       ` Neal Cardwell
@ 2018-05-17 16:59         ` Yuchung Cheng
  2018-05-17 17:15           ` Yuchung Cheng
  2018-05-17 17:15           ` Eric Dumazet
  0 siblings, 2 replies; 16+ messages in thread
From: Yuchung Cheng @ 2018-05-17 16:59 UTC
  To: Neal Cardwell
  Cc: Eric Dumazet, Eric Dumazet, David Miller, Netdev,
	Toke Høiland-Jørgensen, Soheil Hassas Yeganeh,
	Christoph Paasch

On Thu, May 17, 2018 at 9:41 AM, Neal Cardwell <ncardwell@google.com> wrote:
>
> On Thu, May 17, 2018 at 11:40 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On 05/17/2018 08:14 AM, Neal Cardwell wrote:
> > > Is there a particular motivation for the cap of 127? IMHO 127 ACKs is quite
> > > a few to compress. Experience seems to show that it works well to have one
> > > GRO ACK for ~64KBytes that triggers a single TSO skb of ~64KBytes. It might
> > > be nice to try to match those dynamics in this SACK compression case, so it
> > > might be nice to cap the number of compressed ACKs at something like 44?
> > > (0xffff / 1448 - 1).  That way for high-speed paths we could try to keep
> > > the ACK clock going with ACKs for ~64KBytes that trigger a single TSO skb
> > > of ~64KBytes, no matter whether we are sending SACKs or cumulative ACKs.
>
> > 127 was chosen because the field is u8, and since skb allocation for the ACK
> > can fail, we could have cases where the field goes above 127.
>
> > Ultimately, I believe a followup patch would add a sysctl, so that we can
> > fine-tune this, and eventually disable ACK compression if this sysctl is set to 0
>
> OK, a sysctl sounds good. I would still vote for a default of 44.  :-)
>
>
> > >> +       if (hrtimer_is_queued(&tp->compressed_ack_timer))
> > >> +               return;
> > >> +
> > >> +       /* compress ack timer : 5 % of srtt, but no more than 2.5 ms */
> > >> +
> > >> +       delay = min_t(unsigned long, 2500 * NSEC_PER_USEC,
> > >> +                     tp->rcv_rtt_est.rtt_us * (NSEC_PER_USEC >> 3)/20);
> > >
> > > Any particular motivation for the 2.5ms here? It might be nice to match the
> > > existing TSO autosizing dynamics and use 1ms here instead of having a
> > > separate new constant of 2.5ms. Smaller time scales here should lead to
> > > less burstiness and queue pressure from data packets in the network, and we
> > > know from experience that the CPU overhead of 1ms chunks is acceptable.
>
> > This came from my tests on wifi really :)
>
> > I also had the idea to make this threshold adjustable for wifi, like we
> > did for sk_pacing_shift.
>
> > (On wifi, we might want to increase the max delay between ACK)
>
> > So maybe use 1ms delay, when sk_pacing_shift == 10, but increase it if
> > sk_pacing_shift has been lowered.
>
> Sounds good to me.
>
> Thanks for implementing this! Overall this patch seems nice to me.
>
> Acked-by: Neal Cardwell <ncardwell@google.com>
>
> BTW, I guess we should spread the word to maintainers of other major TCP
> stacks that they need to be prepared for what may be a much higher degree
> of compression/aggregation in the SACK stream. Linux stacks going back many
> years should be fine with this, but I'm not sure about the other major OSes
> (they may only allow sending one MSS per ACK-with-SACKs received).
Patch looks really good, but Neal's comment just reminds me of a potential
legacy issue.

I recall at least the Apple and Windows TCP stacks still need 3+ DUPACKs
(!= a single SACK covering 3+ packets) to trigger fast recovery. Will we
have an issue there interacting w/ these stacks?

>
> neal


* Re: [PATCH net-next 2/4] tcp: do not force quickack when receiving out-of-order packets
  2018-05-17 12:12 ` [PATCH net-next 2/4] tcp: do not force quickack when receiving out-of-order packets Eric Dumazet
  2018-05-17 14:55   ` Neal Cardwell
@ 2018-05-17 17:14   ` Soheil Hassas Yeganeh
  1 sibling, 0 replies; 16+ messages in thread
From: Soheil Hassas Yeganeh @ 2018-05-17 17:14 UTC
  To: Eric Dumazet
  Cc: David S . Miller, netdev, Toke Høiland-Jørgensen,
	Neal Cardwell, Yuchung Cheng, Eric Dumazet

On Thu, May 17, 2018 at 8:12 AM, Eric Dumazet <edumazet@google.com> wrote:
> As explained in commit 9f9843a751d0 ("tcp: properly handle stretch
> acks in slow start"), TCP stacks have to consider how many packets
> are acknowledged in one single ACK, because of GRO, but also
> because of ACK compression or losses.
>
> We plan to add SACK compression in the following patch, we
> must therefore not call tcp_enter_quickack_mode()
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

Thank you, Eric!


* Re: [PATCH net-next 3/4] tcp: add SACK compression
  2018-05-17 16:59         ` Yuchung Cheng
@ 2018-05-17 17:15           ` Yuchung Cheng
  2018-05-17 17:15           ` Eric Dumazet
  1 sibling, 0 replies; 16+ messages in thread
From: Yuchung Cheng @ 2018-05-17 17:15 UTC
  To: Neal Cardwell
  Cc: Eric Dumazet, Eric Dumazet, David Miller, Netdev,
	Toke Høiland-Jørgensen, Soheil Hassas Yeganeh,
	Christoph Paasch

On Thu, May 17, 2018 at 9:59 AM, Yuchung Cheng <ycheng@google.com> wrote:
> On Thu, May 17, 2018 at 9:41 AM, Neal Cardwell <ncardwell@google.com> wrote:
>>
>> On Thu, May 17, 2018 at 11:40 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > On 05/17/2018 08:14 AM, Neal Cardwell wrote:
>> > > Is there a particular motivation for the cap of 127? IMHO 127 ACKs is quite
>> > > a few to compress. Experience seems to show that it works well to have one
>> > > GRO ACK for ~64KBytes that triggers a single TSO skb of ~64KBytes. It might
>> > > be nice to try to match those dynamics in this SACK compression case, so it
>> > > might be nice to cap the number of compressed ACKs at something like 44?
>> > > (0xffff / 1448 - 1).  That way for high-speed paths we could try to keep
>> > > the ACK clock going with ACKs for ~64KBytes that trigger a single TSO skb
>> > > of ~64KBytes, no matter whether we are sending SACKs or cumulative ACKs.
>>
>> > 127 was chosen because the field is u8, and since skb allocation for the ACK
>> > can fail, we could have cases where the field goes above 127.
>>
>> > Ultimately, I believe a followup patch would add a sysctl, so that we can
>> > fine-tune this, and eventually disable ACK compression if this sysctl is set to 0
>>
>> OK, a sysctl sounds good. I would still vote for a default of 44.  :-)
>>
>>
>> > >> +       if (hrtimer_is_queued(&tp->compressed_ack_timer))
>> > >> +               return;
>> > >> +
>> > >> +       /* compress ack timer : 5 % of srtt, but no more than 2.5 ms */
>> > >> +
>> > >> +       delay = min_t(unsigned long, 2500 * NSEC_PER_USEC,
>> > >> +                     tp->rcv_rtt_est.rtt_us * (NSEC_PER_USEC >> 3)/20);
>> > >
>> > > Any particular motivation for the 2.5ms here? It might be nice to match the
>> > > existing TSO autosizing dynamics and use 1ms here instead of having a
>> > > separate new constant of 2.5ms. Smaller time scales here should lead to
>> > > less burstiness and queue pressure from data packets in the network, and we
>> > > know from experience that the CPU overhead of 1ms chunks is acceptable.
>>
>> > This came from my tests on wifi really :)
>>
>> > I also had the idea to make this threshold adjustable for wifi, like we
>> > did for sk_pacing_shift.
>>
>> > (On wifi, we might want to increase the max delay between ACK)
>>
>> > So maybe use 1ms delay, when sk_pacing_shift == 10, but increase it if
>> > sk_pacing_shift has been lowered.
>>
>> Sounds good to me.
>>
>> Thanks for implementing this! Overall this patch seems nice to me.
>>
>> Acked-by: Neal Cardwell <ncardwell@google.com>
>>
>> BTW, I guess we should spread the word to maintainers of other major TCP
>> stacks that they need to be prepared for what may be a much higher degree
>> of compression/aggregation in the SACK stream. Linux stacks going back many
>> years should be fine with this, but I'm not sure about the other major OSes
>> (they may only allow sending one MSS per ACK-with-SACKs received).
> Patch looks really good, but Neal's comment just reminds me of a potential
> legacy issue.
>
> I recall at least the Apple and Windows TCP stacks still need 3+ DUPACKs
> (!= a single SACK covering 3+ packets) to trigger fast recovery. Will we
> have an issue there interacting w/ these stacks?
Offline chat w/ Eric: actually the problem already exists with GRO: a
Linux receiver could receive an OOO skb of say 5 pkts and return one
(DUP)ACK w/ sack option covering 5 pkts.

Since no issues have been reported, my concern is probably not a big
deal. Hopefully other stacks can improve their SACK / recovery
handling there.

>
>>
>> neal


* Re: [PATCH net-next 3/4] tcp: add SACK compression
  2018-05-17 16:59         ` Yuchung Cheng
  2018-05-17 17:15           ` Yuchung Cheng
@ 2018-05-17 17:15           ` Eric Dumazet
  1 sibling, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2018-05-17 17:15 UTC
  To: Yuchung Cheng
  Cc: Neal Cardwell, Eric Dumazet, David Miller, netdev,
	Toke Høiland-Jørgensen, Soheil Hassas Yeganeh,
	Christoph Paasch

On Thu, May 17, 2018 at 9:59 AM Yuchung Cheng <ycheng@google.com> wrote:
> >
> > Thanks for implementing this! Overall this patch seems nice to me.
> >
> > Acked-by: Neal Cardwell <ncardwell@google.com>
> >
> > BTW, I guess we should spread the word to maintainers of other major TCP
> > stacks that they need to be prepared for what may be a much higher degree
> > of compression/aggregation in the SACK stream. Linux stacks going back many
> > years should be fine with this, but I'm not sure about the other major OSes
> > (they may only allow sending one MSS per ACK-with-SACKs received).
> Patch looks really good, but Neal's comment just reminds me of a potential
> legacy issue.

> I recall at least Apple and Windows TCP stacks still need 3+ DUPACKs
> (!= a SACK covering 3+ packets) to trigger fast recovery. Will we have
> an issue there interacting w/ these stacks?


Then we should revert GRO :)

Really it is time for these stacks to catch up, or give up to QUIC :/

