netdev.vger.kernel.org archive mirror
* [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock
@ 2017-05-16 20:59 Eric Dumazet
  2017-05-16 21:00 ` [PATCH net-next 01/15] tcp: use tp->tcp_mstamp in output path Eric Dumazet
                   ` (15 more replies)
  0 siblings, 16 replies; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 20:59 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

The TCP Timestamps option is defined in RFC 7323.

Traditionally on Linux, it has been tied to the internal
'jiffies' variable, because that was a cheap and good enough
generator.

Unfortunately some distros use HZ=250 or even HZ=100, so the
timestamp only advances every 4 ms or 10 ms, which makes TCP
timestamps not very useful.

For TCP flows in the DC, Google has used usec resolution for more
than two years with great success [1].
RCVBUF autotuning is more precise.

This series converts tp->tcp_mstamp to a plain u64 value storing
a 1 usec TCP clock.

This choice will allow us to upstream the 1 usec TS option as
discussed in IETF 97.

Kathleen Nichols [2] and others advocate for 1 ms TS clocks for
network analysis (1 ms being the finest tick period allowed by RFC 7323).

[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
[2] http://netseminar.stanford.edu/seminars/02_02_17.pdf
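
To make the granularity argument concrete, here is a minimal sketch in
plain userspace C (not kernel code) of how long one jiffies-based
timestamp tick lasts for a given HZ, and of how a 1 usec clock could be
reduced to 1 ms units for the TS option. The helper names
ts_tick_usec() and usec_to_ts_ms() are hypothetical and do not appear
in this series.

```c
#include <stdint.h>

/* Duration of one jiffies-based timestamp tick, in microseconds.
 * Illustrative helper only; 'hz' stands for the kernel's HZ value.
 */
static inline uint32_t ts_tick_usec(uint32_t hz)
{
	return 1000000u / hz;
}

/* Reduce a 64-bit microsecond clock to a 32-bit millisecond value.
 * The cast keeps only the low 32 bits, so the result wraps every
 * 2^32 ms.
 */
static inline uint32_t usec_to_ts_ms(uint64_t usec)
{
	return (uint32_t)(usec / 1000u);
}
```

With HZ=100 one tick lasts 10000 usec (10 ms), and with HZ=250 it lasts
4000 usec (4 ms), whereas the millisecond value derived from a usec
clock does not depend on HZ at all.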

Eric Dumazet (15):
  tcp: use tp->tcp_mstamp in output path
  tcp: introduce tcp_jiffies32
  dccp: do not use tcp_time_stamp
  tcp: use tcp_jiffies32 to feed tp->lsndtime
  tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stamp
  tcp_bbr: use tcp_jiffies32 instead of tcp_time_stamp
  tcp: bic,cubic: use tcp_jiffies32 instead of tcp_time_stamp
  tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtime
  tcp: use tcp_jiffies32 to feed probe_timestamp
  tcp: uses jiffies_32 to feed tp->chrono_start
  tcp: use tcp_jiffies32 in __tcp_oow_rate_limited()
  tcp_westwood: use tcp_jiffies32 instead of tcp_time_stamp
  tcp_lp: cache tcp_time_stamp
  tcp: replace misc tcp_time_stamp to tcp_jiffies32
  tcp: switch TCP TS option (RFC 7323) to 1ms clock

 include/linux/skbuff.h           |  62 +------------------
 include/linux/tcp.h              |  22 +++----
 include/net/tcp.h                |  74 ++++++++++++++++++-----
 net/dccp/ccids/ccid2.c           |   8 +--
 net/dccp/ccids/ccid2.h           |   2 +-
 net/ipv4/syncookies.c            |   8 +--
 net/ipv4/tcp.c                   |  10 ++--
 net/ipv4/tcp_bbr.c               |  34 +++++------
 net/ipv4/tcp_bic.c               |   6 +-
 net/ipv4/tcp_cubic.c             |  14 ++---
 net/ipv4/tcp_htcp.c              |   2 +-
 net/ipv4/tcp_input.c             | 126 +++++++++++++++++++--------------------
 net/ipv4/tcp_ipv4.c              |  16 ++---
 net/ipv4/tcp_lp.c                |  17 +++---
 net/ipv4/tcp_metrics.c           |   2 +-
 net/ipv4/tcp_minisocks.c         |   8 +--
 net/ipv4/tcp_output.c            |  51 ++++++++--------
 net/ipv4/tcp_rate.c              |  16 ++---
 net/ipv4/tcp_recovery.c          |  24 ++++----
 net/ipv4/tcp_timer.c             |  17 +++---
 net/ipv4/tcp_westwood.c          |   6 +-
 net/ipv6/syncookies.c            |   2 +-
 net/ipv6/tcp_ipv6.c              |   4 +-
 net/netfilter/nf_synproxy_core.c |   2 +-
 24 files changed, 259 insertions(+), 274 deletions(-)

-- 
2.13.0.303.g4ebf302169-goog

* [PATCH net-next 01/15] tcp: use tp->tcp_mstamp in output path
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:42   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 02/15] tcp: introduce tcp_jiffies32 Eric Dumazet
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

The idea is to later convert tp->tcp_mstamp to a full u64 counter
using usec resolution, so that we can later have a fine-grained
TCP TS clock (RFC 7323), regardless of the HZ value.

We try to refresh tp->tcp_mstamp only when necessary.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
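
The pattern used here can be sketched outside the kernel as follows:
the clock is read once per event into a per-connection field, and every
packet in the batch is stamped from that cached value instead of
reading the clock per packet. All names below (struct conn, struct pkt,
conn_refresh_clock(), conn_xmit_batch()) are hypothetical; the caller
is assumed to supply the current time in microseconds.

```c
#include <stddef.h>
#include <stdint.h>

struct pkt {
	uint64_t tx_stamp_us;	/* time the packet was (re)sent */
};

struct conn {
	uint64_t cached_now_us;	/* plays the role of tp->tcp_mstamp */
};

/* Refresh the cached clock; meant to be called once per event. */
static inline void conn_refresh_clock(struct conn *c, uint64_t now_us)
{
	c->cached_now_us = now_us;
}

/* Stamp a whole batch of packets from a single clock sample. */
static inline void conn_xmit_batch(struct conn *c, struct pkt *pkts,
				   size_t n, uint64_t now_us)
{
	size_t i;

	conn_refresh_clock(c, now_us);
	for (i = 0; i < n; i++)
		pkts[i].tx_stamp_us = c->cached_now_us;
}
```

The trade-off is that packets sent in the same batch share one
timestamp, in exchange for fewer clock reads.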
 net/ipv4/tcp_ipv4.c     |  1 +
 net/ipv4/tcp_output.c   | 21 +++++++++++----------
 net/ipv4/tcp_recovery.c |  1 -
 net/ipv4/tcp_timer.c    |  3 ++-
 4 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 5ab2aac5ca191075383fc75214da816873bb222c..d8fe25db79f223e3fde85882effd2ac6ec15f8ca 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -483,6 +483,7 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
 		skb = tcp_write_queue_head(sk);
 		BUG_ON(!skb);
 
+		skb_mstamp_get(&tp->tcp_mstamp);
 		remaining = icsk->icsk_rto -
 			    min(icsk->icsk_rto,
 				tcp_time_stamp - tcp_skb_timestamp(skb));
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index a32172d69a03cbe76b45ec3094222f6c3a73e27d..4c8a6eaba6b39a2aea061dd6857ed8df954c5ca2 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -997,8 +997,8 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 	BUG_ON(!skb || !tcp_skb_pcount(skb));
 	tp = tcp_sk(sk);
 
+	skb->skb_mstamp = tp->tcp_mstamp;
 	if (clone_it) {
-		skb_mstamp_get(&skb->skb_mstamp);
 		TCP_SKB_CB(skb)->tx.in_flight = TCP_SKB_CB(skb)->end_seq
 			- tp->snd_una;
 		tcp_rate_skb_sent(sk, skb);
@@ -1906,7 +1906,6 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 	u32 age, send_win, cong_win, limit, in_flight;
 	struct tcp_sock *tp = tcp_sk(sk);
-	struct skb_mstamp now;
 	struct sk_buff *head;
 	int win_divisor;
 
@@ -1962,8 +1961,8 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
 	}
 
 	head = tcp_write_queue_head(sk);
-	skb_mstamp_get(&now);
-	age = skb_mstamp_us_delta(&now, &head->skb_mstamp);
+
+	age = skb_mstamp_us_delta(&tp->tcp_mstamp, &head->skb_mstamp);
 	/* If next ACK is likely to come too late (half srtt), do not defer */
 	if (age < (tp->srtt_us >> 4))
 		goto send_now;
@@ -2280,6 +2279,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 	}
 
 	max_segs = tcp_tso_segs(sk, mss_now);
+	skb_mstamp_get(&tp->tcp_mstamp);
 	while ((skb = tcp_send_head(sk))) {
 		unsigned int limit;
 
@@ -2291,7 +2291,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 
 		if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE) {
 			/* "skb_mstamp" is used as a start point for the retransmit timer */
-			skb_mstamp_get(&skb->skb_mstamp);
+			skb->skb_mstamp = tp->tcp_mstamp;
 			goto repair; /* Skip network transmission */
 		}
 
@@ -2879,7 +2879,7 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
 		     skb_headroom(skb) >= 0xFFFF)) {
 		struct sk_buff *nskb;
 
-		skb_mstamp_get(&skb->skb_mstamp);
+		skb->skb_mstamp = tp->tcp_mstamp;
 		nskb = __pskb_copy(skb, MAX_TCP_HEADER, GFP_ATOMIC);
 		err = nskb ? tcp_transmit_skb(sk, nskb, 0, GFP_ATOMIC) :
 			     -ENOBUFS;
@@ -3095,7 +3095,7 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority)
 	skb_reserve(skb, MAX_TCP_HEADER);
 	tcp_init_nondata_skb(skb, tcp_acceptable_seq(sk),
 			     TCPHDR_ACK | TCPHDR_RST);
-	skb_mstamp_get(&skb->skb_mstamp);
+	skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
 	/* Send it off. */
 	if (tcp_transmit_skb(sk, skb, 0, priority))
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTFAILED);
@@ -3453,7 +3453,8 @@ int tcp_connect(struct sock *sk)
 		return -ENOBUFS;
 
 	tcp_init_nondata_skb(buff, tp->write_seq++, TCPHDR_SYN);
-	tp->retrans_stamp = tcp_time_stamp;
+	skb_mstamp_get(&tp->tcp_mstamp);
+	tp->retrans_stamp = tp->tcp_mstamp.stamp_jiffies;
 	tcp_connect_queue_skb(sk, buff);
 	tcp_ecn_send_syn(sk, buff);
 
@@ -3572,7 +3573,6 @@ void tcp_send_ack(struct sock *sk)
 	skb_set_tcp_pure_ack(buff);
 
 	/* Send it off, this clears delayed acks for us. */
-	skb_mstamp_get(&buff->skb_mstamp);
 	tcp_transmit_skb(sk, buff, 0, (__force gfp_t)0);
 }
 EXPORT_SYMBOL_GPL(tcp_send_ack);
@@ -3606,15 +3606,16 @@ static int tcp_xmit_probe_skb(struct sock *sk, int urgent, int mib)
 	 * send it.
 	 */
 	tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TCPHDR_ACK);
-	skb_mstamp_get(&skb->skb_mstamp);
 	NET_INC_STATS(sock_net(sk), mib);
 	return tcp_transmit_skb(sk, skb, 0, (__force gfp_t)0);
 }
 
+/* Called from setsockopt( ... TCP_REPAIR ) */
 void tcp_send_window_probe(struct sock *sk)
 {
 	if (sk->sk_state == TCP_ESTABLISHED) {
 		tcp_sk(sk)->snd_wl1 = tcp_sk(sk)->rcv_nxt - 1;
+		skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
 		tcp_xmit_probe_skb(sk, 0, LINUX_MIB_TCPWINPROBE);
 	}
 }
diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
index 362b8c75bfab44cf87c2a01398a146a271bc1119..cd72b3d3879e88181c8a4639f0334a24e4cda852 100644
--- a/net/ipv4/tcp_recovery.c
+++ b/net/ipv4/tcp_recovery.c
@@ -166,7 +166,6 @@ void tcp_rack_reo_timeout(struct sock *sk)
 	u32 timeout, prior_inflight;
 
 	prior_inflight = tcp_packets_in_flight(tp);
-	skb_mstamp_get(&tp->tcp_mstamp);
 	tcp_rack_detect_loss(sk, &timeout);
 	if (prior_inflight != tcp_packets_in_flight(tp)) {
 		if (inet_csk(sk)->icsk_ca_state != TCP_CA_Recovery) {
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 86934bcf685a65ec3af3d22f1801ffa33eea76e2..ec7c5473c788d77ae459b38492f2f2606d00d1ba 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -339,7 +339,7 @@ static void tcp_probe_timer(struct sock *sk)
 	 */
 	start_ts = tcp_skb_timestamp(tcp_send_head(sk));
 	if (!start_ts)
-		skb_mstamp_get(&tcp_send_head(sk)->skb_mstamp);
+		tcp_send_head(sk)->skb_mstamp = tp->tcp_mstamp;
 	else if (icsk->icsk_user_timeout &&
 		 (s32)(tcp_time_stamp - start_ts) > icsk->icsk_user_timeout)
 		goto abort;
@@ -561,6 +561,7 @@ void tcp_write_timer_handler(struct sock *sk)
 		goto out;
 	}
 
+	skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
 	event = icsk->icsk_pending;
 
 	switch (event) {
-- 
2.13.0.303.g4ebf302169-goog

* [PATCH net-next 02/15] tcp: introduce tcp_jiffies32
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
  2017-05-16 21:00 ` [PATCH net-next 01/15] tcp: use tp->tcp_mstamp in output path Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:43   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 03/15] dccp: do not use tcp_time_stamp Eric Dumazet
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

We abuse tcp_time_stamp for two different cases:

1) base to generate TCP Timestamp options (RFC 7323)

2) A 32bit version of jiffies since some TCP fields
   are 32bit wide to save memory.

Since we want a 1 ms TCP TS clock in the future,
regardless of the HZ value, we want to clean things up.

tcp_jiffies32 is the truncated jiffies value,
which will be used only in places where we want a 'host'
timestamp.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
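
As a rough userspace illustration of the second use case, the sketch
below truncates a wide tick counter to 32 bits and computes a
wrap-safe elapsed time from two such values. The names ticks32() and
ticks32_delta() are hypothetical; only the (u32) truncation itself is
taken from the patch.

```c
#include <stdint.h>

/* Truncate a wide tick counter to 32 bits, as tcp_jiffies32 does
 * with 'jiffies'.
 */
static inline uint32_t ticks32(uint64_t ticks)
{
	return (uint32_t)ticks;
}

/* Signed number of ticks from 'then' to 'now'. Unsigned subtraction
 * is modulo 2^32, so the result stays correct across a counter wrap
 * as long as the two values are less than 2^31 ticks apart. The cast
 * relies on two's complement conversion, as kernel code does.
 */
static inline int32_t ticks32_delta(uint32_t now, uint32_t then)
{
	return (int32_t)(now - then);
}
```

For instance, a 'then' value taken 5 ticks before the wrap and a 'now'
value taken 5 ticks after it still yield a delta of 10.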
 include/net/tcp.h | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index b4dc93dae98c2d175ccadce150083705d237555e..4b45be5708215bae4551a5430b63ab2777baf447 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -700,11 +700,14 @@ u32 __tcp_select_window(struct sock *sk);
 
 void tcp_send_window_probe(struct sock *sk);
 
-/* TCP timestamps are only 32-bits, this causes a slight
- * complication on 64-bit systems since we store a snapshot
- * of jiffies in the buffer control blocks below.  We decided
- * to use only the low 32-bits of jiffies and hide the ugly
- * casts with the following macro.
+/* TCP uses 32bit jiffies to save some space.
+ * Note that this is different from tcp_time_stamp, which
+ * historically has been the same until linux-4.13.
+ */
+#define tcp_jiffies32 ((u32)jiffies)
+
+/* Generator for TCP TS option (RFC 7323)
+ * Currently tied to 'jiffies' but will soon be driven by 1 ms clock.
  */
 #define tcp_time_stamp		((__u32)(jiffies))
 
-- 
2.13.0.303.g4ebf302169-goog

* [PATCH net-next 03/15] dccp: do not use tcp_time_stamp
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
  2017-05-16 21:00 ` [PATCH net-next 01/15] tcp: use tp->tcp_mstamp in output path Eric Dumazet
  2017-05-16 21:00 ` [PATCH net-next 02/15] tcp: introduce tcp_jiffies32 Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:43   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 04/15] tcp: use tcp_jiffies32 to feed tp->lsndtime Eric Dumazet
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

Use our own macro instead of abusing tcp_time_stamp.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/dccp/ccids/ccid2.c | 8 ++++----
 net/dccp/ccids/ccid2.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/dccp/ccids/ccid2.c b/net/dccp/ccids/ccid2.c
index 5e3a7302f7747e4c4f3134eacab2f2c65b13402f..e1295d5f2c562e8785f59a0f5bd7064f471e85ab 100644
--- a/net/dccp/ccids/ccid2.c
+++ b/net/dccp/ccids/ccid2.c
@@ -233,7 +233,7 @@ static void ccid2_hc_tx_packet_sent(struct sock *sk, unsigned int len)
 {
 	struct dccp_sock *dp = dccp_sk(sk);
 	struct ccid2_hc_tx_sock *hc = ccid2_hc_tx_sk(sk);
-	const u32 now = ccid2_time_stamp;
+	const u32 now = ccid2_jiffies32;
 	struct ccid2_seq *next;
 
 	/* slow-start after idle periods (RFC 2581, RFC 2861) */
@@ -466,7 +466,7 @@ static void ccid2_new_ack(struct sock *sk, struct ccid2_seq *seqp,
 	 * The cleanest solution is to not use the ccid2s_sent field at all
 	 * and instead use DCCP timestamps: requires changes in other places.
 	 */
-	ccid2_rtt_estimator(sk, ccid2_time_stamp - seqp->ccid2s_sent);
+	ccid2_rtt_estimator(sk, ccid2_jiffies32 - seqp->ccid2s_sent);
 }
 
 static void ccid2_congestion_event(struct sock *sk, struct ccid2_seq *seqp)
@@ -478,7 +478,7 @@ static void ccid2_congestion_event(struct sock *sk, struct ccid2_seq *seqp)
 		return;
 	}
 
-	hc->tx_last_cong = ccid2_time_stamp;
+	hc->tx_last_cong = ccid2_jiffies32;
 
 	hc->tx_cwnd      = hc->tx_cwnd / 2 ? : 1U;
 	hc->tx_ssthresh  = max(hc->tx_cwnd, 2U);
@@ -731,7 +731,7 @@ static int ccid2_hc_tx_init(struct ccid *ccid, struct sock *sk)
 
 	hc->tx_rto	 = DCCP_TIMEOUT_INIT;
 	hc->tx_rpdupack  = -1;
-	hc->tx_last_cong = hc->tx_lsndtime = hc->tx_cwnd_stamp = ccid2_time_stamp;
+	hc->tx_last_cong = hc->tx_lsndtime = hc->tx_cwnd_stamp = ccid2_jiffies32;
 	hc->tx_cwnd_used = 0;
 	setup_timer(&hc->tx_rtotimer, ccid2_hc_tx_rto_expire,
 			(unsigned long)sk);
diff --git a/net/dccp/ccids/ccid2.h b/net/dccp/ccids/ccid2.h
index 18c97543e522a6b9a5c8a3c817d4b40224adde48..6e50ef2898fb9dd9080217cc167defea6a2e9021 100644
--- a/net/dccp/ccids/ccid2.h
+++ b/net/dccp/ccids/ccid2.h
@@ -27,7 +27,7 @@
  * CCID-2 timestamping faces the same issues as TCP timestamping.
  * Hence we reuse/share as much of the code as possible.
  */
-#define ccid2_time_stamp	tcp_time_stamp
+#define ccid2_jiffies32	((u32)jiffies)
 
 /* NUMDUPACK parameter from RFC 4341, p. 6 */
 #define NUMDUPACK	3
-- 
2.13.0.303.g4ebf302169-goog

* [PATCH net-next 04/15] tcp: use tcp_jiffies32 to feed tp->lsndtime
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (2 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 03/15] dccp: do not use tcp_time_stamp Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:43   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 05/15] tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stamp Eric Dumazet
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

Use tcp_jiffies32 instead of tcp_time_stamp to feed
tp->lsndtime.

tcp_time_stamp will soon be a little bit more expensive
than simply reading 'jiffies'.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
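
For readers unfamiliar with how tp->lsndtime is consumed, here is a
simplified userspace sketch of the slow-start-after-idle idea: if the
connection has been idle for more than one RTO, cwnd is halved once per
additional RTO of idle time, down to a restart value. It is modeled
loosely on tcp_slow_start_after_idle_check() and tcp_cwnd_restart();
the function name cwnd_after_idle() and its flat parameter list are
hypothetical, and all times are in the same 32-bit tick unit.

```c
#include <stdint.h>

static inline uint32_t cwnd_after_idle(uint32_t cwnd, uint32_t restart_cwnd,
				       uint32_t now, uint32_t lsndtime,
				       uint32_t rto)
{
	/* Wrap-safe idle time since the last data transmission. */
	int32_t delta = (int32_t)(now - lsndtime);

	if (delta <= (int32_t)rto)
		return cwnd;		/* not idle for longer than one RTO */

	if (restart_cwnd > cwnd)
		restart_cwnd = cwnd;	/* never raise cwnd here */

	/* Halve cwnd for each further RTO spent idle. */
	while ((delta -= (int32_t)rto) > 0 && cwnd > restart_cwnd)
		cwnd >>= 1;

	return cwnd > restart_cwnd ? cwnd : restart_cwnd;
}
```

Because both timestamps come from the same 32-bit jiffies snapshot,
the subtraction behaves the same whether or not the counter wrapped
between them.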
 include/net/tcp.h     | 2 +-
 net/ipv4/tcp.c        | 2 +-
 net/ipv4/tcp_cubic.c  | 2 +-
 net/ipv4/tcp_input.c  | 4 ++--
 net/ipv4/tcp_output.c | 4 ++--
 net/ipv4/tcp_timer.c  | 4 ++--
 6 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 4b45be5708215bae4551a5430b63ab2777baf447..feba4c0406e551d7e57da3411476735731b4d817 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1245,7 +1245,7 @@ static inline void tcp_slow_start_after_idle_check(struct sock *sk)
 	if (!sysctl_tcp_slow_start_after_idle || tp->packets_out ||
 	    ca_ops->cong_control)
 		return;
-	delta = tcp_time_stamp - tp->lsndtime;
+	delta = tcp_jiffies32 - tp->lsndtime;
 	if (delta > inet_csk(sk)->icsk_rto)
 		tcp_cwnd_restart(sk, delta);
 }
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 1e4c76d2b8278ba71d6cc2cf7ebfe483e241f76e..d0bb61ee28bbceff8f2e27416ce87fec94935973 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2841,7 +2841,7 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
 	info->tcpi_retrans = tp->retrans_out;
 	info->tcpi_fackets = tp->fackets_out;
 
-	now = tcp_time_stamp;
+	now = tcp_jiffies32;
 	info->tcpi_last_data_sent = jiffies_to_msecs(now - tp->lsndtime);
 	info->tcpi_last_data_recv = jiffies_to_msecs(now - icsk->icsk_ack.lrcvtime);
 	info->tcpi_last_ack_recv = jiffies_to_msecs(now - tp->rcv_tstamp);
diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index 0683ba447d775b6101a929a6aca3eb255cff8932..2052ca740916d0872a41125ab61b769b334a314b 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -155,7 +155,7 @@ static void bictcp_cwnd_event(struct sock *sk, enum tcp_ca_event event)
 {
 	if (event == CA_EVENT_TX_START) {
 		struct bictcp *ca = inet_csk_ca(sk);
-		u32 now = tcp_time_stamp;
+		u32 now = tcp_jiffies32;
 		s32 delta;
 
 		delta = now - tcp_sk(sk)->lsndtime;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 06e2dbc2b4a212a054fd88e57bb902c55a171b11..c0b3f909df394214785749704f2760171fe9d160 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5571,7 +5571,7 @@ void tcp_finish_connect(struct sock *sk, struct sk_buff *skb)
 	/* Prevent spurious tcp_cwnd_restart() on first data
 	 * packet.
 	 */
-	tp->lsndtime = tcp_time_stamp;
+	tp->lsndtime = tcp_jiffies32;
 
 	tcp_init_buffer_space(sk);
 
@@ -6008,7 +6008,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 			tcp_update_pacing_rate(sk);
 
 		/* Prevent spurious tcp_cwnd_restart() on first data packet */
-		tp->lsndtime = tcp_time_stamp;
+		tp->lsndtime = tcp_jiffies32;
 
 		tcp_initialize_rcv_mss(sk);
 		tcp_fast_path_on(tp);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 4c8a6eaba6b39a2aea061dd6857ed8df954c5ca2..be9f8f483e21bdbb4d944fcdae8560f3ae11ee64 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -160,7 +160,7 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
 				struct sock *sk)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
-	const u32 now = tcp_time_stamp;
+	const u32 now = tcp_jiffies32;
 
 	if (tcp_packets_in_flight(tp) == 0)
 		tcp_ca_event(sk, CA_EVENT_TX_START);
@@ -1918,7 +1918,7 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
 	/* Avoid bursty behavior by allowing defer
 	 * only if the last write was recent.
 	 */
-	if ((s32)(tcp_time_stamp - tp->lsndtime) > 0)
+	if ((s32)(tcp_jiffies32 - tp->lsndtime) > 0)
 		goto send_now;
 
 	in_flight = tcp_packets_in_flight(tp);
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index ec7c5473c788d77ae459b38492f2f2606d00d1ba..5f6f219a431e41a90b3c5d667a1a22b50f4464cf 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -63,7 +63,7 @@ static int tcp_out_of_resources(struct sock *sk, bool do_reset)
 
 	/* If peer does not open window for long time, or did not transmit
 	 * anything for long time, penalize it. */
-	if ((s32)(tcp_time_stamp - tp->lsndtime) > 2*TCP_RTO_MAX || !do_reset)
+	if ((s32)(tcp_jiffies32 - tp->lsndtime) > 2*TCP_RTO_MAX || !do_reset)
 		shift++;
 
 	/* If some dubious ICMP arrived, penalize even more. */
@@ -73,7 +73,7 @@ static int tcp_out_of_resources(struct sock *sk, bool do_reset)
 	if (tcp_check_oom(sk, shift)) {
 		/* Catch exceptional cases, when connection requires reset.
 		 *      1. Last segment was sent recently. */
-		if ((s32)(tcp_time_stamp - tp->lsndtime) <= TCP_TIMEWAIT_LEN ||
+		if ((s32)(tcp_jiffies32 - tp->lsndtime) <= TCP_TIMEWAIT_LEN ||
 		    /*  2. Window is closed. */
 		    (!tp->snd_wnd && !tp->packets_out))
 			do_reset = true;
-- 
2.13.0.303.g4ebf302169-goog

* [PATCH net-next 05/15] tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stamp
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (3 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 04/15] tcp: use tcp_jiffies32 to feed tp->lsndtime Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:45   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 06/15] tcp_bbr: use tcp_jiffies32 instead of tcp_time_stamp Eric Dumazet
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

Use tcp_jiffies32 instead of tcp_time_stamp to feed
tp->snd_cwnd_stamp.

tcp_time_stamp will soon be a little bit more expensive
than simply reading 'jiffies'.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_input.c   | 14 +++++++-------
 net/ipv4/tcp_metrics.c |  2 +-
 net/ipv4/tcp_output.c  |  8 ++++----
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c0b3f909df394214785749704f2760171fe9d160..6a15c9b80b09829799dc37d89ecdbf11ec9ff904 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -463,7 +463,7 @@ void tcp_init_buffer_space(struct sock *sk)
 		tp->window_clamp = max(2 * tp->advmss, maxwin - tp->advmss);
 
 	tp->rcv_ssthresh = min(tp->rcv_ssthresh, tp->window_clamp);
-	tp->snd_cwnd_stamp = tcp_time_stamp;
+	tp->snd_cwnd_stamp = tcp_jiffies32;
 }
 
 /* 5. Recalculate window clamp after socket hit its memory bounds. */
@@ -1954,7 +1954,7 @@ void tcp_enter_loss(struct sock *sk)
 	}
 	tp->snd_cwnd	   = 1;
 	tp->snd_cwnd_cnt   = 0;
-	tp->snd_cwnd_stamp = tcp_time_stamp;
+	tp->snd_cwnd_stamp = tcp_jiffies32;
 
 	tp->retrans_out = 0;
 	tp->lost_out = 0;
@@ -2383,7 +2383,7 @@ static void tcp_undo_cwnd_reduction(struct sock *sk, bool unmark_loss)
 			tcp_ecn_withdraw_cwr(tp);
 		}
 	}
-	tp->snd_cwnd_stamp = tcp_time_stamp;
+	tp->snd_cwnd_stamp = tcp_jiffies32;
 	tp->undo_marker = 0;
 }
 
@@ -2520,7 +2520,7 @@ static inline void tcp_end_cwnd_reduction(struct sock *sk)
 	if (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR ||
 	    (tp->undo_marker && tp->snd_ssthresh < TCP_INFINITE_SSTHRESH)) {
 		tp->snd_cwnd = tp->snd_ssthresh;
-		tp->snd_cwnd_stamp = tcp_time_stamp;
+		tp->snd_cwnd_stamp = tcp_jiffies32;
 	}
 	tcp_ca_event(sk, CA_EVENT_COMPLETE_CWR);
 }
@@ -2590,7 +2590,7 @@ static void tcp_mtup_probe_success(struct sock *sk)
 		       tcp_mss_to_mtu(sk, tp->mss_cache) /
 		       icsk->icsk_mtup.probe_size;
 	tp->snd_cwnd_cnt = 0;
-	tp->snd_cwnd_stamp = tcp_time_stamp;
+	tp->snd_cwnd_stamp = tcp_jiffies32;
 	tp->snd_ssthresh = tcp_current_ssthresh(sk);
 
 	icsk->icsk_mtup.search_low = icsk->icsk_mtup.probe_size;
@@ -2976,7 +2976,7 @@ static void tcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 
 	icsk->icsk_ca_ops->cong_avoid(sk, ack, acked);
-	tcp_sk(sk)->snd_cwnd_stamp = tcp_time_stamp;
+	tcp_sk(sk)->snd_cwnd_stamp = tcp_jiffies32;
 }
 
 /* Restart timer after forward progress on connection.
@@ -5019,7 +5019,7 @@ static void tcp_new_space(struct sock *sk)
 
 	if (tcp_should_expand_sndbuf(sk)) {
 		tcp_sndbuf_expand(sk);
-		tp->snd_cwnd_stamp = tcp_time_stamp;
+		tp->snd_cwnd_stamp = tcp_jiffies32;
 	}
 
 	sk->sk_write_space(sk);
diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
index 653bbd67e3a39b68d27d26d17571c00ce2854bfd..102b2c90bb807d3a88d31b59324baf72cf901cdf 100644
--- a/net/ipv4/tcp_metrics.c
+++ b/net/ipv4/tcp_metrics.c
@@ -524,7 +524,7 @@ void tcp_init_metrics(struct sock *sk)
 		tp->snd_cwnd = 1;
 	else
 		tp->snd_cwnd = tcp_init_cwnd(tp, dst);
-	tp->snd_cwnd_stamp = tcp_time_stamp;
+	tp->snd_cwnd_stamp = tcp_jiffies32;
 }
 
 bool tcp_peer_is_proven(struct request_sock *req, struct dst_entry *dst)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index be9f8f483e21bdbb4d944fcdae8560f3ae11ee64..4bd50f0b236ba23fe521a76dd9d35ee16acb061f 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -151,7 +151,7 @@ void tcp_cwnd_restart(struct sock *sk, s32 delta)
 	while ((delta -= inet_csk(sk)->icsk_rto) > 0 && cwnd > restart_cwnd)
 		cwnd >>= 1;
 	tp->snd_cwnd = max(cwnd, restart_cwnd);
-	tp->snd_cwnd_stamp = tcp_time_stamp;
+	tp->snd_cwnd_stamp = tcp_jiffies32;
 	tp->snd_cwnd_used = 0;
 }
 
@@ -1576,7 +1576,7 @@ static void tcp_cwnd_application_limited(struct sock *sk)
 		}
 		tp->snd_cwnd_used = 0;
 	}
-	tp->snd_cwnd_stamp = tcp_time_stamp;
+	tp->snd_cwnd_stamp = tcp_jiffies32;
 }
 
 static void tcp_cwnd_validate(struct sock *sk, bool is_cwnd_limited)
@@ -1597,14 +1597,14 @@ static void tcp_cwnd_validate(struct sock *sk, bool is_cwnd_limited)
 	if (tcp_is_cwnd_limited(sk)) {
 		/* Network is feed fully. */
 		tp->snd_cwnd_used = 0;
-		tp->snd_cwnd_stamp = tcp_time_stamp;
+		tp->snd_cwnd_stamp = tcp_jiffies32;
 	} else {
 		/* Network starves. */
 		if (tp->packets_out > tp->snd_cwnd_used)
 			tp->snd_cwnd_used = tp->packets_out;
 
 		if (sysctl_tcp_slow_start_after_idle &&
-		    (s32)(tcp_time_stamp - tp->snd_cwnd_stamp) >= inet_csk(sk)->icsk_rto &&
+		    (s32)(tcp_jiffies32 - tp->snd_cwnd_stamp) >= inet_csk(sk)->icsk_rto &&
 		    !ca_ops->cong_control)
 			tcp_cwnd_application_limited(sk);
 
-- 
2.13.0.303.g4ebf302169-goog

* [PATCH net-next 06/15] tcp_bbr: use tcp_jiffies32 instead of tcp_time_stamp
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (4 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 05/15] tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stamp Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:45   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 07/15] tcp: bic,cubic: " Eric Dumazet
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be used only for the TCP TS option.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
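
The expiry test that BBR performs on these jiffies values can be
sketched in userspace as below. tick_after() is written in the spirit
of the kernel's after() helper, and min_rtt_filter_expired() is a
hypothetical name for the comparison made in bbr_update_min_rtt(); the
window length and HZ are passed in as plain parameters for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a is strictly later than b" for 32-bit counters. */
static inline bool tick_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Has the min-RTT window of 'win_sec' seconds, started at 'stamp',
 * expired at time 'now'? All times are in jiffies-like ticks.
 */
static inline bool min_rtt_filter_expired(uint32_t now, uint32_t stamp,
					  uint32_t win_sec, uint32_t hz)
{
	return tick_after(now, stamp + win_sec * hz);
}
```

Since the deadline is expressed as a number of HZ ticks, the stamp and
the current time must both come from a jiffies-based source, which is
why the switch to tcp_jiffies32 matters once tcp_time_stamp stops
being jiffies.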
 net/ipv4/tcp_bbr.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 92b045c72163def1c1d6aa0f2002760186aa5dc3..40dc4fc5f6acba91634290e1cacde69a3584248f 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -730,12 +730,12 @@ static void bbr_update_min_rtt(struct sock *sk, const struct rate_sample *rs)
 	bool filter_expired;
 
 	/* Track min RTT seen in the min_rtt_win_sec filter window: */
-	filter_expired = after(tcp_time_stamp,
+	filter_expired = after(tcp_jiffies32,
 			       bbr->min_rtt_stamp + bbr_min_rtt_win_sec * HZ);
 	if (rs->rtt_us >= 0 &&
 	    (rs->rtt_us <= bbr->min_rtt_us || filter_expired)) {
 		bbr->min_rtt_us = rs->rtt_us;
-		bbr->min_rtt_stamp = tcp_time_stamp;
+		bbr->min_rtt_stamp = tcp_jiffies32;
 	}
 
 	if (bbr_probe_rtt_mode_ms > 0 && filter_expired &&
@@ -754,7 +754,7 @@ static void bbr_update_min_rtt(struct sock *sk, const struct rate_sample *rs)
 		/* Maintain min packets in flight for max(200 ms, 1 round). */
 		if (!bbr->probe_rtt_done_stamp &&
 		    tcp_packets_in_flight(tp) <= bbr_cwnd_min_target) {
-			bbr->probe_rtt_done_stamp = tcp_time_stamp +
+			bbr->probe_rtt_done_stamp = tcp_jiffies32 +
 				msecs_to_jiffies(bbr_probe_rtt_mode_ms);
 			bbr->probe_rtt_round_done = 0;
 			bbr->next_rtt_delivered = tp->delivered;
@@ -762,8 +762,8 @@ static void bbr_update_min_rtt(struct sock *sk, const struct rate_sample *rs)
 			if (bbr->round_start)
 				bbr->probe_rtt_round_done = 1;
 			if (bbr->probe_rtt_round_done &&
-			    after(tcp_time_stamp, bbr->probe_rtt_done_stamp)) {
-				bbr->min_rtt_stamp = tcp_time_stamp;
+			    after(tcp_jiffies32, bbr->probe_rtt_done_stamp)) {
+				bbr->min_rtt_stamp = tcp_jiffies32;
 				bbr->restore_cwnd = 1;  /* snap to prior_cwnd */
 				bbr_reset_mode(sk);
 			}
@@ -810,7 +810,7 @@ static void bbr_init(struct sock *sk)
 	bbr->probe_rtt_done_stamp = 0;
 	bbr->probe_rtt_round_done = 0;
 	bbr->min_rtt_us = tcp_min_rtt(tp);
-	bbr->min_rtt_stamp = tcp_time_stamp;
+	bbr->min_rtt_stamp = tcp_jiffies32;
 
 	minmax_reset(&bbr->bw, bbr->rtt_cnt, 0);  /* init max bw to 0 */
 
-- 
2.13.0.303.g4ebf302169-goog

* [PATCH net-next 07/15] tcp: bic,cubic: use tcp_jiffies32 instead of tcp_time_stamp
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (5 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 06/15] tcp_bbr: use tcp_jiffies32 instead of tcp_time_stamp Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:46   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 08/15] tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtime Eric Dumazet
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be used only for the TCP TS option.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
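
CUBIC converts elapsed jiffies into its own time unit ("change the
unit from HZ to bictcp_HZ"), which only works if the elapsed time is
measured in jiffies. A minimal userspace sketch of that conversion
follows; SKETCH_BICTCP_HZ and jiffies_to_bictcp_units() are
hypothetical names, and the shift of 10 (units of 1/1024 second) is an
assumption about the value of BICTCP_HZ.

```c
#include <stdint.h>

#define SKETCH_BICTCP_HZ 10	/* assumed: time unit is 1/2^10 second */

/* Convert elapsed jiffies into 1/1024-second units for a given HZ. */
static inline uint64_t jiffies_to_bictcp_units(uint32_t elapsed_jiffies,
					       uint32_t hz)
{
	uint64_t t = elapsed_jiffies;

	t <<= SKETCH_BICTCP_HZ;	/* scale by 1024 before dividing */
	return t / hz;
}
```

One second of elapsed time maps to 1024 units whatever the HZ value,
provided the input really is a jiffies delta.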
 net/ipv4/tcp_bic.c   |  6 +++---
 net/ipv4/tcp_cubic.c | 12 ++++++------
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/net/ipv4/tcp_bic.c b/net/ipv4/tcp_bic.c
index 36087bca9f489646c2ca5aae3111449a956dd33b..609965f0e29836ed95605a2c7f3170e67c641058 100644
--- a/net/ipv4/tcp_bic.c
+++ b/net/ipv4/tcp_bic.c
@@ -84,14 +84,14 @@ static void bictcp_init(struct sock *sk)
 static inline void bictcp_update(struct bictcp *ca, u32 cwnd)
 {
 	if (ca->last_cwnd == cwnd &&
-	    (s32)(tcp_time_stamp - ca->last_time) <= HZ / 32)
+	    (s32)(tcp_jiffies32 - ca->last_time) <= HZ / 32)
 		return;
 
 	ca->last_cwnd = cwnd;
-	ca->last_time = tcp_time_stamp;
+	ca->last_time = tcp_jiffies32;
 
 	if (ca->epoch_start == 0) /* record the beginning of an epoch */
-		ca->epoch_start = tcp_time_stamp;
+		ca->epoch_start = tcp_jiffies32;
 
 	/* start off normal */
 	if (cwnd <= low_window) {
diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index 2052ca740916d0872a41125ab61b769b334a314b..57ae5b5ae643efad106f5d6ac224ca54a52f9689 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -231,21 +231,21 @@ static inline void bictcp_update(struct bictcp *ca, u32 cwnd, u32 acked)
 	ca->ack_cnt += acked;	/* count the number of ACKed packets */
 
 	if (ca->last_cwnd == cwnd &&
-	    (s32)(tcp_time_stamp - ca->last_time) <= HZ / 32)
+	    (s32)(tcp_jiffies32 - ca->last_time) <= HZ / 32)
 		return;
 
 	/* The CUBIC function can update ca->cnt at most once per jiffy.
 	 * On all cwnd reduction events, ca->epoch_start is set to 0,
 	 * which will force a recalculation of ca->cnt.
 	 */
-	if (ca->epoch_start && tcp_time_stamp == ca->last_time)
+	if (ca->epoch_start && tcp_jiffies32 == ca->last_time)
 		goto tcp_friendliness;
 
 	ca->last_cwnd = cwnd;
-	ca->last_time = tcp_time_stamp;
+	ca->last_time = tcp_jiffies32;
 
 	if (ca->epoch_start == 0) {
-		ca->epoch_start = tcp_time_stamp;	/* record beginning */
+		ca->epoch_start = tcp_jiffies32;	/* record beginning */
 		ca->ack_cnt = acked;			/* start counting */
 		ca->tcp_cwnd = cwnd;			/* syn with cubic */
 
@@ -276,7 +276,7 @@ static inline void bictcp_update(struct bictcp *ca, u32 cwnd, u32 acked)
 	 * if the cwnd < 1 million packets !!!
 	 */
 
-	t = (s32)(tcp_time_stamp - ca->epoch_start);
+	t = (s32)(tcp_jiffies32 - ca->epoch_start);
 	t += msecs_to_jiffies(ca->delay_min >> 3);
 	/* change the unit from HZ to bictcp_HZ */
 	t <<= BICTCP_HZ;
@@ -448,7 +448,7 @@ static void bictcp_acked(struct sock *sk, const struct ack_sample *sample)
 		return;
 
 	/* Discard delay samples right after fast recovery */
-	if (ca->epoch_start && (s32)(tcp_time_stamp - ca->epoch_start) < HZ)
+	if (ca->epoch_start && (s32)(tcp_jiffies32 - ca->epoch_start) < HZ)
 		return;
 
 	delay = (sample->rtt_us << 3) / USEC_PER_MSEC;
-- 
2.13.0.303.g4ebf302169-goog


* [PATCH net-next 08/15] tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtime
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (6 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 07/15] tcp: bic,cubic: " Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:46   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 09/15] tcp: use tcp_jiffies32 to feed probe_timestamp Eric Dumazet
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be only used for TCP TS option.
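
As an illustration of what keepalive_time_elapsed() computes, the sketch
below takes the smaller of two unsigned tick deltas, so whichever event
(last data received or last ACK received) is more recent decides the idle
time. The function and parameter names are hypothetical; only the min of
two u32 differences is taken from the patch.

```c
#include <stdint.h>

/* Hypothetical userspace analogue of keepalive_time_elapsed():
 * idle time in ticks since the most recent of two receive events.
 * Unsigned subtraction keeps each delta correct across counter wrap.
 */
static inline uint32_t idle_ticks(uint32_t now, uint32_t last_data_rx,
				  uint32_t last_ack_rx)
{
	uint32_t since_data = now - last_data_rx;
	uint32_t since_ack = now - last_ack_rx;

	return since_data < since_ack ? since_data : since_ack;
}
```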

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/tcp.h        | 4 ++--
 net/ipv4/tcp_input.c     | 6 +++---
 net/ipv4/tcp_minisocks.c | 2 +-
 net/ipv4/tcp_output.c    | 2 +-
 net/ipv4/tcp_timer.c     | 2 +-
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index feba4c0406e551d7e57da3411476735731b4d817..5b2932b8363fb8546322ebff7c74663139b3371d 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1307,8 +1307,8 @@ static inline u32 keepalive_time_elapsed(const struct tcp_sock *tp)
 {
 	const struct inet_connection_sock *icsk = &tp->inet_conn;
 
-	return min_t(u32, tcp_time_stamp - icsk->icsk_ack.lrcvtime,
-			  tcp_time_stamp - tp->rcv_tstamp);
+	return min_t(u32, tcp_jiffies32 - icsk->icsk_ack.lrcvtime,
+			  tcp_jiffies32 - tp->rcv_tstamp);
 }
 
 static inline int tcp_fin_time(const struct sock *sk)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 6a15c9b80b09829799dc37d89ecdbf11ec9ff904..eeb4967df25a8dc35128d0a0848b5ae7ee6d63e3 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -672,7 +672,7 @@ static void tcp_event_data_recv(struct sock *sk, struct sk_buff *skb)
 
 	tcp_rcv_rtt_measure(tp);
 
-	now = tcp_time_stamp;
+	now = tcp_jiffies32;
 
 	if (!icsk->icsk_ack.ato) {
 		/* The _first_ data packet received, initialize
@@ -3636,7 +3636,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 	 */
 	sk->sk_err_soft = 0;
 	icsk->icsk_probes_out = 0;
-	tp->rcv_tstamp = tcp_time_stamp;
+	tp->rcv_tstamp = tcp_jiffies32;
 	if (!prior_packets)
 		goto no_queue;
 
@@ -5554,7 +5554,7 @@ void tcp_finish_connect(struct sock *sk, struct sk_buff *skb)
 	struct inet_connection_sock *icsk = inet_csk(sk);
 
 	tcp_set_state(sk, TCP_ESTABLISHED);
-	icsk->icsk_ack.lrcvtime = tcp_time_stamp;
+	icsk->icsk_ack.lrcvtime = tcp_jiffies32;
 
 	if (skb) {
 		icsk->icsk_af_ops->sk_rx_dst_set(sk, skb);
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 717be4de53248352c758b50557987d898340dd4f..59c32e0086c0e46d7955dffe211ec03bb18dcb12 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -447,7 +447,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 		newtp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
 		minmax_reset(&newtp->rtt_min, tcp_time_stamp, ~0U);
 		newicsk->icsk_rto = TCP_TIMEOUT_INIT;
-		newicsk->icsk_ack.lrcvtime = tcp_time_stamp;
+		newicsk->icsk_ack.lrcvtime = tcp_jiffies32;
 
 		newtp->packets_out = 0;
 		newtp->retrans_out = 0;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 4bd50f0b236ba23fe521a76dd9d35ee16acb061f..cbda5de164495cf318960489bd8edf98fe3a5033 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3324,7 +3324,7 @@ static void tcp_connect_init(struct sock *sk)
 	if (likely(!tp->repair))
 		tp->rcv_nxt = 0;
 	else
-		tp->rcv_tstamp = tcp_time_stamp;
+		tp->rcv_tstamp = tcp_jiffies32;
 	tp->rcv_wup = tp->rcv_nxt;
 	tp->copied_seq = tp->rcv_nxt;
 
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 5f6f219a431e41a90b3c5d667a1a22b50f4464cf..9e0616cb8c17a6385ac97fc0cd657ef9413a1749 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -451,7 +451,7 @@ void tcp_retransmit_timer(struct sock *sk)
 					    tp->snd_una, tp->snd_nxt);
 		}
 #endif
-		if (tcp_time_stamp - tp->rcv_tstamp > TCP_RTO_MAX) {
+		if (tcp_jiffies32 - tp->rcv_tstamp > TCP_RTO_MAX) {
 			tcp_write_err(sk);
 			goto out;
 		}
-- 
2.13.0.303.g4ebf302169-goog


* [PATCH net-next 09/15] tcp: use tcp_jiffies32 to feed probe_timestamp
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (7 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 08/15] tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtime Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:46   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 10/15] tcp: uses jiffies_32 to feed tp->chrono_start Eric Dumazet
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be only used for TCP TS option.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_output.c | 6 +++---
 net/ipv4/tcp_timer.c  | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index cbda5de164495cf318960489bd8edf98fe3a5033..f0fd1b4fdb3291638fcdca613d826db2cd27f517 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1475,7 +1475,7 @@ void tcp_mtup_init(struct sock *sk)
 	icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, net->ipv4.sysctl_tcp_base_mss);
 	icsk->icsk_mtup.probe_size = 0;
 	if (icsk->icsk_mtup.enabled)
-		icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
+		icsk->icsk_mtup.probe_timestamp = tcp_jiffies32;
 }
 EXPORT_SYMBOL(tcp_mtup_init);
 
@@ -1987,7 +1987,7 @@ static inline void tcp_mtu_check_reprobe(struct sock *sk)
 	s32 delta;
 
 	interval = net->ipv4.sysctl_tcp_probe_interval;
-	delta = tcp_time_stamp - icsk->icsk_mtup.probe_timestamp;
+	delta = tcp_jiffies32 - icsk->icsk_mtup.probe_timestamp;
 	if (unlikely(delta >= interval * HZ)) {
 		int mss = tcp_current_mss(sk);
 
@@ -1999,7 +1999,7 @@ static inline void tcp_mtu_check_reprobe(struct sock *sk)
 		icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss);
 
 		/* Update probe time stamp */
-		icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
+		icsk->icsk_mtup.probe_timestamp = tcp_jiffies32;
 	}
 }
 
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 9e0616cb8c17a6385ac97fc0cd657ef9413a1749..6629f47aa7f0182ece7873afcc3daa6f0019e228 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -115,7 +115,7 @@ static void tcp_mtu_probing(struct inet_connection_sock *icsk, struct sock *sk)
 	if (net->ipv4.sysctl_tcp_mtu_probing) {
 		if (!icsk->icsk_mtup.enabled) {
 			icsk->icsk_mtup.enabled = 1;
-			icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
+			icsk->icsk_mtup.probe_timestamp = tcp_jiffies32;
 			tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
 		} else {
 			struct net *net = sock_net(sk);
-- 
2.13.0.303.g4ebf302169-goog


* [PATCH net-next 10/15] tcp: uses jiffies_32 to feed tp->chrono_start
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (8 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 09/15] tcp: use tcp_jiffies32 to feed probe_timestamp Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:46   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 11/15] tcp: use tcp_jiffies32 in __tcp_oow_rate_limited() Eric Dumazet
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

tcp_time_stamp will no longer be tied to jiffies.
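
The chrono counters stay in jiffies and are scaled to microseconds only
when reported, via USEC_PER_SEC / HZ. A rough userspace sketch of that
arithmetic follows; the function name is hypothetical and HZ is passed
as a parameter (assumed to divide 1,000,000 evenly) instead of being a
build-time constant.

```c
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/* Hypothetical analogue of the per-state computation in
 * tcp_get_info_chrono_stats(): add the still-running interval to the
 * accumulated tick count, then convert ticks to microseconds.
 */
static inline uint64_t chrono_ticks_to_usecs(uint32_t now, uint32_t start,
					     uint32_t accumulated,
					     unsigned int hz)
{
	uint64_t ticks = (uint64_t)accumulated + (uint32_t)(now - start);

	return ticks * (USEC_PER_SEC / hz);
}
```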

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp.c        | 2 +-
 net/ipv4/tcp_output.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index d0bb61ee28bbceff8f2e27416ce87fec94935973..b85bfe7cb11dca68952cc4be19b169d893963fef 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2757,7 +2757,7 @@ static void tcp_get_info_chrono_stats(const struct tcp_sock *tp,
 	for (i = TCP_CHRONO_BUSY; i < __TCP_CHRONO_MAX; ++i) {
 		stats[i] = tp->chrono_stat[i - 1];
 		if (i == tp->chrono_type)
-			stats[i] += tcp_time_stamp - tp->chrono_start;
+			stats[i] += tcp_jiffies32 - tp->chrono_start;
 		stats[i] *= USEC_PER_SEC / HZ;
 		total += stats[i];
 	}
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index f0fd1b4fdb3291638fcdca613d826db2cd27f517..1011ea40c2ba4c12cce21149cab176e1fa4db583 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2202,7 +2202,7 @@ static bool tcp_small_queue_check(struct sock *sk, const struct sk_buff *skb,
 
 static void tcp_chrono_set(struct tcp_sock *tp, const enum tcp_chrono new)
 {
-	const u32 now = tcp_time_stamp;
+	const u32 now = tcp_jiffies32;
 
 	if (tp->chrono_type > TCP_CHRONO_UNSPEC)
 		tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start;
-- 
2.13.0.303.g4ebf302169-goog


* [PATCH net-next 11/15] tcp: use tcp_jiffies32 in __tcp_oow_rate_limited()
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (9 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 10/15] tcp: uses jiffies_32 to feed tp->chrono_start Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:47   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 12/15] tcp_westwood: use tcp_jiffies32 instead of tcp_time_stamp Eric Dumazet
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

This code only needs jiffies resolution, so tcp_jiffies32 is good enough.
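
A minimal userspace sketch of the rate-limit check, for illustration
only: the function name and the min_interval parameter (standing in for
sysctl_tcp_invalid_ratelimit) are assumptions, and the MIB counter update
is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical analogue of __tcp_oow_rate_limited().
 * Returns true when the caller should suppress its response because the
 * previous one was sent fewer than min_interval ticks ago. A stored value
 * of 0 means "no previous response recorded".
 */
static bool oow_rate_limited(uint32_t now, uint32_t *last, int32_t min_interval)
{
	if (*last) {
		int32_t elapsed = (int32_t)(now - *last);

		if (0 <= elapsed && elapsed < min_interval)
			return true;	/* rate-limited: do not update *last */
	}

	*last = now;
	return false;	/* not rate-limited */
}
```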

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_input.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index eeb4967df25a8dc35128d0a0848b5ae7ee6d63e3..85575888365a10643e096f9e019adaa3eda87d40 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3390,7 +3390,7 @@ static bool __tcp_oow_rate_limited(struct net *net, int mib_idx,
 				   u32 *last_oow_ack_time)
 {
 	if (*last_oow_ack_time) {
-		s32 elapsed = (s32)(tcp_time_stamp - *last_oow_ack_time);
+		s32 elapsed = (s32)(tcp_jiffies32 - *last_oow_ack_time);
 
 		if (0 <= elapsed && elapsed < sysctl_tcp_invalid_ratelimit) {
 			NET_INC_STATS(net, mib_idx);
@@ -3398,7 +3398,7 @@ static bool __tcp_oow_rate_limited(struct net *net, int mib_idx,
 		}
 	}
 
-	*last_oow_ack_time = tcp_time_stamp;
+	*last_oow_ack_time = tcp_jiffies32;
 
 	return false;	/* not rate-limited: go ahead, send dupack now! */
 }
-- 
2.13.0.303.g4ebf302169-goog


* [PATCH net-next 12/15] tcp_westwood: use tcp_jiffies32 instead of tcp_time_stamp
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (10 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 11/15] tcp: use tcp_jiffies32 in __tcp_oow_rate_limited() Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:47   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 13/15] tcp_lp: cache tcp_time_stamp Eric Dumazet
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

This CC does not need the 1 ms tcp_time_stamp and can use
the jiffy-based 'timestamp'.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_westwood.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_westwood.c b/net/ipv4/tcp_westwood.c
index 9775453b8d174c848dc09df83d1fa185422cd8cc..bec9cafbe3f92938e5d79d743d629b2f33464418 100644
--- a/net/ipv4/tcp_westwood.c
+++ b/net/ipv4/tcp_westwood.c
@@ -68,7 +68,7 @@ static void tcp_westwood_init(struct sock *sk)
 	w->cumul_ack = 0;
 	w->reset_rtt_min = 1;
 	w->rtt_min = w->rtt = TCP_WESTWOOD_INIT_RTT;
-	w->rtt_win_sx = tcp_time_stamp;
+	w->rtt_win_sx = tcp_jiffies32;
 	w->snd_una = tcp_sk(sk)->snd_una;
 	w->first_ack = 1;
 }
@@ -116,7 +116,7 @@ static void tcp_westwood_pkts_acked(struct sock *sk,
 static void westwood_update_window(struct sock *sk)
 {
 	struct westwood *w = inet_csk_ca(sk);
-	s32 delta = tcp_time_stamp - w->rtt_win_sx;
+	s32 delta = tcp_jiffies32 - w->rtt_win_sx;
 
 	/* Initialize w->snd_una with the first acked sequence number in order
 	 * to fix mismatch between tp->snd_una and w->snd_una for the first
@@ -140,7 +140,7 @@ static void westwood_update_window(struct sock *sk)
 		westwood_filter(w, delta);
 
 		w->bk = 0;
-		w->rtt_win_sx = tcp_time_stamp;
+		w->rtt_win_sx = tcp_jiffies32;
 	}
 }
 
-- 
2.13.0.303.g4ebf302169-goog


* [PATCH net-next 13/15] tcp_lp: cache tcp_time_stamp
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (11 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 12/15] tcp_westwood: use tcp_jiffies32 instead of tcp_time_stamp Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:47   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 14/15] tcp: replace misc tcp_time_stamp to tcp_jiffies32 Eric Dumazet
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

tcp_time_stamp will become slightly more expensive soon,
cache its value.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_lp.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_lp.c b/net/ipv4/tcp_lp.c
index d6fb6c067af4641f232b94e7c590c212648e8173..ef3122abb3734a63011fba035f7a7aae431da8de 100644
--- a/net/ipv4/tcp_lp.c
+++ b/net/ipv4/tcp_lp.c
@@ -264,18 +264,19 @@ static void tcp_lp_pkts_acked(struct sock *sk, const struct ack_sample *sample)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct lp *lp = inet_csk_ca(sk);
+	u32 now = tcp_time_stamp;
 	u32 delta;
 
 	if (sample->rtt_us > 0)
 		tcp_lp_rtt_sample(sk, sample->rtt_us);
 
 	/* calc inference */
-	delta = tcp_time_stamp - tp->rx_opt.rcv_tsecr;
+	delta = now - tp->rx_opt.rcv_tsecr;
 	if ((s32)delta > 0)
 		lp->inference = 3 * delta;
 
 	/* test if within inference */
-	if (lp->last_drop && (tcp_time_stamp - lp->last_drop < lp->inference))
+	if (lp->last_drop && (now - lp->last_drop < lp->inference))
 		lp->flag |= LP_WITHIN_INF;
 	else
 		lp->flag &= ~LP_WITHIN_INF;
@@ -312,7 +313,7 @@ static void tcp_lp_pkts_acked(struct sock *sk, const struct ack_sample *sample)
 		tp->snd_cwnd = max(tp->snd_cwnd >> 1U, 1U);
 
 	/* record this drop time */
-	lp->last_drop = tcp_time_stamp;
+	lp->last_drop = now;
 }
 
 static struct tcp_congestion_ops tcp_lp __read_mostly = {
-- 
2.13.0.303.g4ebf302169-goog


* [PATCH net-next 14/15] tcp: replace misc tcp_time_stamp to tcp_jiffies32
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (12 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 13/15] tcp_lp: cache tcp_time_stamp Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:47   ` Soheil Hassas Yeganeh
  2017-05-16 21:00 ` [PATCH net-next 15/15] tcp: switch TCP TS option (RFC 7323) to 1ms clock Eric Dumazet
  2017-05-17 20:06 ` [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock David Miller
  15 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

After this patch, all remaining uses of tcp_time_stamp will require
a change when we introduce the 1 ms and/or 1 us TCP TS option.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp.c           | 2 +-
 net/ipv4/tcp_htcp.c      | 2 +-
 net/ipv4/tcp_input.c     | 2 +-
 net/ipv4/tcp_minisocks.c | 2 +-
 net/ipv4/tcp_output.c    | 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b85bfe7cb11dca68952cc4be19b169d893963fef..85005480052626c5769ef100a868c88fad803f75 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -386,7 +386,7 @@ void tcp_init_sock(struct sock *sk)
 
 	icsk->icsk_rto = TCP_TIMEOUT_INIT;
 	tp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
-	minmax_reset(&tp->rtt_min, tcp_time_stamp, ~0U);
+	minmax_reset(&tp->rtt_min, tcp_jiffies32, ~0U);
 
 	/* So many TCP implementations out there (incorrectly) count the
 	 * initial SYN frame in their delayed-ACK and congestion control
diff --git a/net/ipv4/tcp_htcp.c b/net/ipv4/tcp_htcp.c
index 4a4d8e76738fa2831dcc3ecec5924dd3dfb7bf58..3eb78cde6ff0a22b7b411f0ae4258b6ef74ffe73 100644
--- a/net/ipv4/tcp_htcp.c
+++ b/net/ipv4/tcp_htcp.c
@@ -104,7 +104,7 @@ static void measure_achieved_throughput(struct sock *sk,
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 	const struct tcp_sock *tp = tcp_sk(sk);
 	struct htcp *ca = inet_csk_ca(sk);
-	u32 now = tcp_time_stamp;
+	u32 now = tcp_jiffies32;
 
 	if (icsk->icsk_ca_state == TCP_CA_Open)
 		ca->pkts_acked = sample->pkts_acked;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 85575888365a10643e096f9e019adaa3eda87d40..10e6775464f647a65ea0d19c10b421f9cd38923d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2911,7 +2911,7 @@ static void tcp_update_rtt_min(struct sock *sk, u32 rtt_us)
 	struct tcp_sock *tp = tcp_sk(sk);
 	u32 wlen = sysctl_tcp_min_rtt_wlen * HZ;
 
-	minmax_running_min(&tp->rtt_min, wlen, tcp_time_stamp,
+	minmax_running_min(&tp->rtt_min, wlen, tcp_jiffies32,
 			   rtt_us ? : jiffies_to_usecs(1));
 }
 
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 59c32e0086c0e46d7955dffe211ec03bb18dcb12..6504f1082bdfda77bfc1b53d0d85928e5083a24e 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -445,7 +445,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 
 		newtp->srtt_us = 0;
 		newtp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
-		minmax_reset(&newtp->rtt_min, tcp_time_stamp, ~0U);
+		minmax_reset(&newtp->rtt_min, tcp_jiffies32, ~0U);
 		newicsk->icsk_rto = TCP_TIMEOUT_INIT;
 		newicsk->icsk_ack.lrcvtime = tcp_jiffies32;
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 1011ea40c2ba4c12cce21149cab176e1fa4db583..65472e931a0b79f7078a4da7db802dfcc32c7621 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2418,10 +2418,10 @@ bool tcp_schedule_loss_probe(struct sock *sk)
 	timeout = max_t(u32, timeout, msecs_to_jiffies(10));
 
 	/* If RTO is shorter, just schedule TLP in its place. */
-	tlp_time_stamp = tcp_time_stamp + timeout;
+	tlp_time_stamp = tcp_jiffies32 + timeout;
 	rto_time_stamp = (u32)inet_csk(sk)->icsk_timeout;
 	if ((s32)(tlp_time_stamp - rto_time_stamp) > 0) {
-		s32 delta = rto_time_stamp - tcp_time_stamp;
+		s32 delta = rto_time_stamp - tcp_jiffies32;
 		if (delta > 0)
 			timeout = delta;
 	}
-- 
2.13.0.303.g4ebf302169-goog


* [PATCH net-next 15/15] tcp: switch TCP TS option (RFC 7323) to 1ms clock
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (13 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 14/15] tcp: replace misc tcp_time_stamp to tcp_jiffies32 Eric Dumazet
@ 2017-05-16 21:00 ` Eric Dumazet
  2017-05-17 13:51   ` Soheil Hassas Yeganeh
  2017-05-18 12:33   ` Eric Dumazet
  2017-05-17 20:06 ` [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock David Miller
  15 siblings, 2 replies; 36+ messages in thread
From: Eric Dumazet @ 2017-05-16 21:00 UTC (permalink / raw)
  To: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang
  Cc: netdev, Eric Dumazet, Eric Dumazet

The TCP Timestamps option is defined in RFC 7323.

Traditionally on Linux, it has been tied to the internal
'jiffies' variable, because that was a cheap and good enough
generator.

For TCP flows on the Internet, 1 ms resolution would be much better
than 4 ms or 10 ms (HZ=250 or HZ=100 respectively).
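
To make the resolution argument concrete, the sketch below quantizes a
microsecond time to a timestamp value at a given tick rate. Two events
3 ms apart get the same value at HZ=100 but distinct values at 1000 Hz.
The function name is hypothetical, and hz is assumed to divide 1,000,000
evenly.

```c
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/* Hypothetical helper: timestamp value seen by a clock ticking hz times
 * per second, for an event at t_us microseconds.
 */
static inline uint32_t ts_value(uint64_t t_us, unsigned int hz)
{
	return (uint32_t)(t_us / (USEC_PER_SEC / hz));
}
```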

For TCP flows in the DC, Google has used usec resolution for more
than two years with great success [1].

Receive size autotuning (DRS) is indeed more precise and converges
faster to optimal window size.

This patch converts tp->tcp_mstamp to a plain u64 value storing
a 1 usec TCP clock.

This choice will allow us to upstream the 1 usec TS option as
discussed in IETF 97.
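
As a rough userspace illustration of the new helpers in include/net/tcp.h
(not the kernel code itself), the sketch below keeps a per-flow 1 usec
clock that never moves backwards, derives the 1 ms TS value from it, and
clamps negative intervals to zero. The struct and function names, and the
now_us parameter standing in for the clock read, are assumptions.

```c
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL
#define TS_HZ        1000ULL	/* 1 ms timestamp clock */

/* Hypothetical per-flow clock, in microseconds. */
struct flow_clock {
	uint64_t mstamp_us;
};

/* Advance the flow clock, ignoring samples that would move it backwards. */
static void flow_clock_refresh(struct flow_clock *fc, uint64_t now_us)
{
	if (now_us > fc->mstamp_us)
		fc->mstamp_us = now_us;
}

/* 32-bit timestamp value in TS_HZ units, derived from the usec clock. */
static uint32_t flow_ts_value(const struct flow_clock *fc)
{
	return (uint32_t)(fc->mstamp_us / (USEC_PER_SEC / TS_HZ));
}

/* Interval t1 - t0 in usec, clamped to 0 when t1 is not after t0. */
static uint32_t stamp_us_delta(uint64_t t1, uint64_t t0)
{
	int64_t delta = (int64_t)(t1 - t0);

	return delta > 0 ? (uint32_t)delta : 0;
}
```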

[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/skbuff.h           | 62 +-------------------------
 include/linux/tcp.h              | 22 ++++-----
 include/net/tcp.h                | 59 ++++++++++++++++++++----
 net/ipv4/syncookies.c            |  8 ++--
 net/ipv4/tcp.c                   |  4 +-
 net/ipv4/tcp_bbr.c               | 22 ++++-----
 net/ipv4/tcp_input.c             | 96 ++++++++++++++++++++--------------------
 net/ipv4/tcp_ipv4.c              | 17 +++----
 net/ipv4/tcp_lp.c                | 12 ++---
 net/ipv4/tcp_minisocks.c         |  4 +-
 net/ipv4/tcp_output.c            | 16 +++----
 net/ipv4/tcp_rate.c              | 16 +++----
 net/ipv4/tcp_recovery.c          | 23 +++++-----
 net/ipv4/tcp_timer.c             |  8 ++--
 net/ipv6/syncookies.c            |  2 +-
 net/ipv6/tcp_ipv6.c              |  4 +-
 net/netfilter/nf_synproxy_core.c |  2 +-
 17 files changed, 178 insertions(+), 199 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index bfc7892f6c33c9fdfb7c0d8110f80cfb12d1ae61..7c0cb2ce8b01a9be366d8cdb7e3661f65ebff3c9 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -506,66 +506,6 @@ typedef unsigned int sk_buff_data_t;
 typedef unsigned char *sk_buff_data_t;
 #endif
 
-/**
- * struct skb_mstamp - multi resolution time stamps
- * @stamp_us: timestamp in us resolution
- * @stamp_jiffies: timestamp in jiffies
- */
-struct skb_mstamp {
-	union {
-		u64		v64;
-		struct {
-			u32	stamp_us;
-			u32	stamp_jiffies;
-		};
-	};
-};
-
-/**
- * skb_mstamp_get - get current timestamp
- * @cl: place to store timestamps
- */
-static inline void skb_mstamp_get(struct skb_mstamp *cl)
-{
-	u64 val = local_clock();
-
-	do_div(val, NSEC_PER_USEC);
-	cl->stamp_us = (u32)val;
-	cl->stamp_jiffies = (u32)jiffies;
-}
-
-/**
- * skb_mstamp_delta - compute the difference in usec between two skb_mstamp
- * @t1: pointer to newest sample
- * @t0: pointer to oldest sample
- */
-static inline u32 skb_mstamp_us_delta(const struct skb_mstamp *t1,
-				      const struct skb_mstamp *t0)
-{
-	s32 delta_us = t1->stamp_us - t0->stamp_us;
-	u32 delta_jiffies = t1->stamp_jiffies - t0->stamp_jiffies;
-
-	/* If delta_us is negative, this might be because interval is too big,
-	 * or local_clock() drift is too big : fallback using jiffies.
-	 */
-	if (delta_us <= 0 ||
-	    delta_jiffies >= (INT_MAX / (USEC_PER_SEC / HZ)))
-
-		delta_us = jiffies_to_usecs(delta_jiffies);
-
-	return delta_us;
-}
-
-static inline bool skb_mstamp_after(const struct skb_mstamp *t1,
-				    const struct skb_mstamp *t0)
-{
-	s32 diff = t1->stamp_jiffies - t0->stamp_jiffies;
-
-	if (!diff)
-		diff = t1->stamp_us - t0->stamp_us;
-	return diff > 0;
-}
-
 /** 
  *	struct sk_buff - socket buffer
  *	@next: Next buffer in list
@@ -646,7 +586,7 @@ struct sk_buff {
 
 			union {
 				ktime_t		tstamp;
-				struct skb_mstamp skb_mstamp;
+				u64		skb_mstamp;
 			};
 		};
 		struct rb_node	rbnode; /* used in netem & tcp stack */
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 22854f0284347a3bb047709478525ee5a9dd9b36..542ca1ae02c4f64833b287c0fd744283ee518909 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -123,7 +123,7 @@ struct tcp_request_sock_ops;
 struct tcp_request_sock {
 	struct inet_request_sock 	req;
 	const struct tcp_request_sock_ops *af_specific;
-	struct skb_mstamp		snt_synack; /* first SYNACK sent time */
+	u64				snt_synack; /* first SYNACK sent time */
 	bool				tfo_listener;
 	u32				txhash;
 	u32				rcv_isn;
@@ -211,7 +211,7 @@ struct tcp_sock {
 
 	/* Information of the most recently (s)acked skb */
 	struct tcp_rack {
-		struct skb_mstamp mstamp; /* (Re)sent time of the skb */
+		u64 mstamp; /* (Re)sent time of the skb */
 		u32 rtt_us;  /* Associated RTT */
 		u32 end_seq; /* Ending TCP sequence of the skb */
 		u8 advanced; /* mstamp advanced since last lost marking */
@@ -240,7 +240,7 @@ struct tcp_sock {
 	u32	tlp_high_seq;	/* snd_nxt at the time of TLP retransmit. */
 
 /* RTT measurement */
-	struct skb_mstamp tcp_mstamp; /* most recent packet received/sent */
+	u64	tcp_mstamp;	/* most recent packet received/sent */
 	u32	srtt_us;	/* smoothed round trip time << 3 in usecs */
 	u32	mdev_us;	/* medium deviation			*/
 	u32	mdev_max_us;	/* maximal mdev for the last rtt period	*/
@@ -280,8 +280,8 @@ struct tcp_sock {
 	u32	delivered;	/* Total data packets delivered incl. rexmits */
 	u32	lost;		/* Total data packets lost incl. rexmits */
 	u32	app_limited;	/* limited until "delivered" reaches this val */
-	struct skb_mstamp first_tx_mstamp;  /* start of window send phase */
-	struct skb_mstamp delivered_mstamp; /* time we reached "delivered" */
+	u64	first_tx_mstamp;  /* start of window send phase */
+	u64	delivered_mstamp; /* time we reached "delivered" */
 	u32	rate_delivered;    /* saved rate sample: packets delivered */
 	u32	rate_interval_us;  /* saved rate sample: time elapsed */
 
@@ -335,16 +335,16 @@ struct tcp_sock {
 
 /* Receiver side RTT estimation */
 	struct {
-		u32		rtt_us;
-		u32		seq;
-		struct skb_mstamp time;
+		u32	rtt_us;
+		u32	seq;
+		u64	time;
 	} rcv_rtt_est;
 
 /* Receiver queue space */
 	struct {
-		int		space;
-		u32		seq;
-		struct skb_mstamp time;
+		int	space;
+		u32	seq;
+		u64	time;
 	} rcvq_space;
 
 /* TCP-specific MTU probe information. */
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 5b2932b8363fb8546322ebff7c74663139b3371d..82462db97183abebb33628eb5e04a5c5f04ea873 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -519,7 +519,7 @@ static inline u32 tcp_cookie_time(void)
 u32 __cookie_v4_init_sequence(const struct iphdr *iph, const struct tcphdr *th,
 			      u16 *mssp);
 __u32 cookie_v4_init_sequence(const struct sk_buff *skb, __u16 *mss);
-__u32 cookie_init_timestamp(struct request_sock *req);
+u64 cookie_init_timestamp(struct request_sock *req);
 bool cookie_timestamp_decode(struct tcp_options_received *opt);
 bool cookie_ecn_ok(const struct tcp_options_received *opt,
 		   const struct net *net, const struct dst_entry *dst);
@@ -706,14 +706,55 @@ void tcp_send_window_probe(struct sock *sk);
  */
 #define tcp_jiffies32 ((u32)jiffies)
 
-/* Generator for TCP TS option (RFC 7323)
- * Currently tied to 'jiffies' but will soon be driven by 1 ms clock.
+/*
+ * Deliver a 32bit value for TCP timestamp option (RFC 7323)
+ * It is no longer tied to jiffies, but to 1 ms clock.
+ * Note: double check if you want to use tcp_jiffies32 instead of this.
+ */
+#define TCP_TS_HZ	1000
+
+static inline u64 tcp_clock_ns(void)
+{
+	return local_clock();
+}
+
+static inline u64 tcp_clock_us(void)
+{
+	return div_u64(tcp_clock_ns(), NSEC_PER_USEC);
+}
+
+/* This should only be used in contexts where tp->tcp_mstamp is up to date */
+static inline u32 tcp_time_stamp(const struct tcp_sock *tp)
+{
+	return div_u64(tp->tcp_mstamp, USEC_PER_SEC / TCP_TS_HZ);
+}
+
+/* Could use tcp_clock_us() / 1000, but this version uses a single divide */
+static inline u32 tcp_time_stamp_raw(void)
+{
+	return div_u64(tcp_clock_ns(), NSEC_PER_SEC / TCP_TS_HZ);
+}
+
+
+/* Refresh 1us clock of a TCP socket,
+ * ensuring monotonically increasing values.
  */
-#define tcp_time_stamp		((__u32)(jiffies))
+static inline void tcp_mstamp_refresh(struct tcp_sock *tp)
+{
+	u64 val = tcp_clock_us();
+
+	if (val > tp->tcp_mstamp)
+		tp->tcp_mstamp = val;
+}
+
+static inline u32 tcp_stamp_us_delta(u64 t1, u64 t0)
+{
+	return max_t(s64, t1 - t0, 0);
+}
 
 static inline u32 tcp_skb_timestamp(const struct sk_buff *skb)
 {
-	return skb->skb_mstamp.stamp_jiffies;
+	return div_u64(skb->skb_mstamp, USEC_PER_SEC / TCP_TS_HZ);
 }
 
 
@@ -778,9 +819,9 @@ struct tcp_skb_cb {
 			/* pkts S/ACKed so far upon tx of skb, incl retrans: */
 			__u32 delivered;
 			/* start of send pipeline phase */
-			struct skb_mstamp first_tx_mstamp;
+			u64 first_tx_mstamp;
 			/* when we reached the "delivered" count */
-			struct skb_mstamp delivered_mstamp;
+			u64 delivered_mstamp;
 		} tx;   /* only used for outgoing skbs */
 		union {
 			struct inet_skb_parm	h4;
@@ -896,7 +937,7 @@ struct ack_sample {
  * A sample is invalid if "delivered" or "interval_us" is negative.
  */
 struct rate_sample {
-	struct	skb_mstamp prior_mstamp; /* starting timestamp for interval */
+	u64  prior_mstamp; /* starting timestamp for interval */
 	u32  prior_delivered;	/* tp->delivered at "prior_mstamp" */
 	s32  delivered;		/* number of packets delivered over interval */
 	long interval_us;	/* time for tp->delivered to incr "delivered" */
@@ -1862,7 +1903,7 @@ void tcp_init(void);
 /* tcp_recovery.c */
 extern void tcp_rack_mark_lost(struct sock *sk);
 extern void tcp_rack_advance(struct tcp_sock *tp, u8 sacked, u32 end_seq,
-			     const struct skb_mstamp *xmit_time);
+			     u64 xmit_time);
 extern void tcp_rack_reo_timeout(struct sock *sk);
 
 /*
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 0257d965f11119acf8c55888d6e672d171ef5f08..6426250a58ea1afb29b673c00bb9d58bd3d21122 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -66,10 +66,10 @@ static u32 cookie_hash(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport,
  * Since subsequent timestamps use the normal tcp_time_stamp value, we
  * must make sure that the resulting initial timestamp is <= tcp_time_stamp.
  */
-__u32 cookie_init_timestamp(struct request_sock *req)
+u64 cookie_init_timestamp(struct request_sock *req)
 {
 	struct inet_request_sock *ireq;
-	u32 ts, ts_now = tcp_time_stamp;
+	u32 ts, ts_now = tcp_time_stamp_raw();
 	u32 options = 0;
 
 	ireq = inet_rsk(req);
@@ -88,7 +88,7 @@ __u32 cookie_init_timestamp(struct request_sock *req)
 		ts <<= TSBITS;
 		ts |= options;
 	}
-	return ts;
+	return (u64)ts * (USEC_PER_SEC / TCP_TS_HZ);
 }
 
 
@@ -343,7 +343,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
 	ireq->wscale_ok		= tcp_opt.wscale_ok;
 	ireq->tstamp_ok		= tcp_opt.saw_tstamp;
 	req->ts_recent		= tcp_opt.saw_tstamp ? tcp_opt.rcv_tsval : 0;
-	treq->snt_synack.v64	= 0;
+	treq->snt_synack	= 0;
 	treq->tfo_listener	= false;
 
 	ireq->ir_iif = inet_request_bound_dev_if(sk, skb);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 85005480052626c5769ef100a868c88fad803f75..b5d18484746daa9189ade316fa9ffc17be30cb60 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2706,7 +2706,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 		if (!tp->repair)
 			err = -EPERM;
 		else
-			tp->tsoffset = val - tcp_time_stamp;
+			tp->tsoffset = val - tcp_time_stamp_raw();
 		break;
 	case TCP_REPAIR_WINDOW:
 		err = tcp_repair_set_window(tp, optval, optlen);
@@ -3072,7 +3072,7 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
 		break;
 
 	case TCP_TIMESTAMP:
-		val = tcp_time_stamp + tp->tsoffset;
+		val = tcp_time_stamp_raw() + tp->tsoffset;
 		break;
 	case TCP_NOTSENT_LOWAT:
 		val = tp->notsent_lowat;
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 40dc4fc5f6acba91634290e1cacde69a3584248f..dbcc9352a48f07a12484e45f3baf0a733e244f75 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -91,7 +91,7 @@ struct bbr {
 	struct minmax bw;	/* Max recent delivery rate in pkts/uS << 24 */
 	u32	rtt_cnt;	    /* count of packet-timed rounds elapsed */
 	u32     next_rtt_delivered; /* scb->tx.delivered at end of round */
-	struct skb_mstamp cycle_mstamp;  /* time of this cycle phase start */
+	u64	cycle_mstamp;	     /* time of this cycle phase start */
 	u32     mode:3,		     /* current bbr_mode in state machine */
 		prev_ca_state:3,     /* CA state on previous ACK */
 		packet_conservation:1,  /* use packet conservation? */
@@ -411,7 +411,7 @@ static bool bbr_is_next_cycle_phase(struct sock *sk,
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct bbr *bbr = inet_csk_ca(sk);
 	bool is_full_length =
-		skb_mstamp_us_delta(&tp->delivered_mstamp, &bbr->cycle_mstamp) >
+		tcp_stamp_us_delta(tp->delivered_mstamp, bbr->cycle_mstamp) >
 		bbr->min_rtt_us;
 	u32 inflight, bw;
 
@@ -497,7 +497,7 @@ static void bbr_reset_lt_bw_sampling_interval(struct sock *sk)
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct bbr *bbr = inet_csk_ca(sk);
 
-	bbr->lt_last_stamp = tp->delivered_mstamp.stamp_jiffies;
+	bbr->lt_last_stamp = div_u64(tp->delivered_mstamp, USEC_PER_MSEC);
 	bbr->lt_last_delivered = tp->delivered;
 	bbr->lt_last_lost = tp->lost;
 	bbr->lt_rtt_cnt = 0;
@@ -551,7 +551,7 @@ static void bbr_lt_bw_sampling(struct sock *sk, const struct rate_sample *rs)
 	struct bbr *bbr = inet_csk_ca(sk);
 	u32 lost, delivered;
 	u64 bw;
-	s32 t;
+	u32 t;
 
 	if (bbr->lt_use_bw) {	/* already using long-term rate, lt_bw? */
 		if (bbr->mode == BBR_PROBE_BW && bbr->round_start &&
@@ -603,15 +603,15 @@ static void bbr_lt_bw_sampling(struct sock *sk, const struct rate_sample *rs)
 		return;
 
 	/* Find average delivery rate in this sampling interval. */
-	t = (s32)(tp->delivered_mstamp.stamp_jiffies - bbr->lt_last_stamp);
-	if (t < 1)
-		return;		/* interval is less than one jiffy, so wait */
-	t = jiffies_to_usecs(t);
-	/* Interval long enough for jiffies_to_usecs() to return a bogus 0? */
-	if (t < 1) {
+	t = div_u64(tp->delivered_mstamp, USEC_PER_MSEC) - bbr->lt_last_stamp;
+	if ((s32)t < 1)
+		return;		/* interval is less than one ms, so wait */
+	/* Check if can multiply without overflow */
+	if (t >= ~0U / USEC_PER_MSEC) {
 		bbr_reset_lt_bw_sampling(sk);  /* interval too long; reset */
 		return;
 	}
+	t *= USEC_PER_MSEC;
 	bw = (u64)delivered * BW_UNIT;
 	do_div(bw, t);
 	bbr_lt_bw_interval_done(sk, bw);
@@ -825,7 +825,7 @@ static void bbr_init(struct sock *sk)
 	bbr->idle_restart = 0;
 	bbr->full_bw = 0;
 	bbr->full_bw_cnt = 0;
-	bbr->cycle_mstamp.v64 = 0;
+	bbr->cycle_mstamp = 0;
 	bbr->cycle_idx = 0;
 	bbr_reset_lt_bw_sampling(sk);
 	bbr_reset_startup_mode(sk);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 10e6775464f647a65ea0d19c10b421f9cd38923d..9a5a9e8eda899666501cca06b37948ab64ae79b2 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -441,7 +441,7 @@ void tcp_init_buffer_space(struct sock *sk)
 		tcp_sndbuf_expand(sk);
 
 	tp->rcvq_space.space = tp->rcv_wnd;
-	skb_mstamp_get(&tp->tcp_mstamp);
+	tcp_mstamp_refresh(tp);
 	tp->rcvq_space.time = tp->tcp_mstamp;
 	tp->rcvq_space.seq = tp->copied_seq;
 
@@ -555,11 +555,11 @@ static inline void tcp_rcv_rtt_measure(struct tcp_sock *tp)
 {
 	u32 delta_us;
 
-	if (tp->rcv_rtt_est.time.v64 == 0)
+	if (tp->rcv_rtt_est.time == 0)
 		goto new_measure;
 	if (before(tp->rcv_nxt, tp->rcv_rtt_est.seq))
 		return;
-	delta_us = skb_mstamp_us_delta(&tp->tcp_mstamp, &tp->rcv_rtt_est.time);
+	delta_us = tcp_stamp_us_delta(tp->tcp_mstamp, tp->rcv_rtt_est.time);
 	tcp_rcv_rtt_update(tp, delta_us, 1);
 
 new_measure:
@@ -571,13 +571,15 @@ static inline void tcp_rcv_rtt_measure_ts(struct sock *sk,
 					  const struct sk_buff *skb)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
+
 	if (tp->rx_opt.rcv_tsecr &&
 	    (TCP_SKB_CB(skb)->end_seq -
-	     TCP_SKB_CB(skb)->seq >= inet_csk(sk)->icsk_ack.rcv_mss))
-		tcp_rcv_rtt_update(tp,
-				   jiffies_to_usecs(tcp_time_stamp -
-						    tp->rx_opt.rcv_tsecr),
-				   0);
+	     TCP_SKB_CB(skb)->seq >= inet_csk(sk)->icsk_ack.rcv_mss)) {
+		u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr;
+		u32 delta_us = delta * (USEC_PER_SEC / TCP_TS_HZ);
+
+		tcp_rcv_rtt_update(tp, delta_us, 0);
+	}
 }
 
 /*
@@ -590,7 +592,7 @@ void tcp_rcv_space_adjust(struct sock *sk)
 	int time;
 	int copied;
 
-	time = skb_mstamp_us_delta(&tp->tcp_mstamp, &tp->rcvq_space.time);
+	time = tcp_stamp_us_delta(tp->tcp_mstamp, tp->rcvq_space.time);
 	if (time < (tp->rcv_rtt_est.rtt_us >> 3) || tp->rcv_rtt_est.rtt_us == 0)
 		return;
 
@@ -1134,8 +1136,8 @@ struct tcp_sacktag_state {
 	 * that was SACKed. RTO needs the earliest RTT to stay conservative,
 	 * but congestion control should still get an accurate delay signal.
 	 */
-	struct skb_mstamp first_sackt;
-	struct skb_mstamp last_sackt;
+	u64	first_sackt;
+	u64	last_sackt;
 	struct rate_sample *rate;
 	int	flag;
 };
@@ -1200,7 +1202,7 @@ static u8 tcp_sacktag_one(struct sock *sk,
 			  struct tcp_sacktag_state *state, u8 sacked,
 			  u32 start_seq, u32 end_seq,
 			  int dup_sack, int pcount,
-			  const struct skb_mstamp *xmit_time)
+			  u64 xmit_time)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	int fack_count = state->fack_count;
@@ -1242,9 +1244,9 @@ static u8 tcp_sacktag_one(struct sock *sk,
 							   state->reord);
 				if (!after(end_seq, tp->high_seq))
 					state->flag |= FLAG_ORIG_SACK_ACKED;
-				if (state->first_sackt.v64 == 0)
-					state->first_sackt = *xmit_time;
-				state->last_sackt = *xmit_time;
+				if (state->first_sackt == 0)
+					state->first_sackt = xmit_time;
+				state->last_sackt = xmit_time;
 			}
 
 			if (sacked & TCPCB_LOST) {
@@ -1304,7 +1306,7 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
 	 */
 	tcp_sacktag_one(sk, state, TCP_SKB_CB(skb)->sacked,
 			start_seq, end_seq, dup_sack, pcount,
-			&skb->skb_mstamp);
+			skb->skb_mstamp);
 	tcp_rate_skb_delivered(sk, skb, state->rate);
 
 	if (skb == tp->lost_skb_hint)
@@ -1356,8 +1358,8 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
 		tcp_advance_highest_sack(sk, skb);
 
 	tcp_skb_collapse_tstamp(prev, skb);
-	if (unlikely(TCP_SKB_CB(prev)->tx.delivered_mstamp.v64))
-		TCP_SKB_CB(prev)->tx.delivered_mstamp.v64 = 0;
+	if (unlikely(TCP_SKB_CB(prev)->tx.delivered_mstamp))
+		TCP_SKB_CB(prev)->tx.delivered_mstamp = 0;
 
 	tcp_unlink_write_queue(skb, sk);
 	sk_wmem_free_skb(sk, skb);
@@ -1587,7 +1589,7 @@ static struct sk_buff *tcp_sacktag_walk(struct sk_buff *skb, struct sock *sk,
 						TCP_SKB_CB(skb)->end_seq,
 						dup_sack,
 						tcp_skb_pcount(skb),
-						&skb->skb_mstamp);
+						skb->skb_mstamp);
 			tcp_rate_skb_delivered(sk, skb, state->rate);
 
 			if (!before(TCP_SKB_CB(skb)->seq,
@@ -2936,9 +2938,12 @@ static inline bool tcp_ack_update_rtt(struct sock *sk, const int flag,
 	 * See draft-ietf-tcplw-high-performance-00, section 3.3.
 	 */
 	if (seq_rtt_us < 0 && tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
-	    flag & FLAG_ACKED)
-		seq_rtt_us = ca_rtt_us = jiffies_to_usecs(tcp_time_stamp -
-							  tp->rx_opt.rcv_tsecr);
+	    flag & FLAG_ACKED) {
+		u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr;
+		u32 delta_us = delta * (USEC_PER_SEC / TCP_TS_HZ);
+
+		seq_rtt_us = ca_rtt_us = delta_us;
+	}
 	if (seq_rtt_us < 0)
 		return false;
 
@@ -2960,12 +2965,8 @@ void tcp_synack_rtt_meas(struct sock *sk, struct request_sock *req)
 {
 	long rtt_us = -1L;
 
-	if (req && !req->num_retrans && tcp_rsk(req)->snt_synack.v64) {
-		struct skb_mstamp now;
-
-		skb_mstamp_get(&now);
-		rtt_us = skb_mstamp_us_delta(&now, &tcp_rsk(req)->snt_synack);
-	}
+	if (req && !req->num_retrans && tcp_rsk(req)->snt_synack)
+		rtt_us = tcp_stamp_us_delta(tcp_clock_us(), tcp_rsk(req)->snt_synack);
 
 	tcp_ack_update_rtt(sk, FLAG_SYN_ACKED, rtt_us, -1L, rtt_us);
 }
@@ -3003,7 +3004,7 @@ void tcp_rearm_rto(struct sock *sk)
 			struct sk_buff *skb = tcp_write_queue_head(sk);
 			const u32 rto_time_stamp =
 				tcp_skb_timestamp(skb) + rto;
-			s32 delta = (s32)(rto_time_stamp - tcp_time_stamp);
+			s32 delta = (s32)(rto_time_stamp - tcp_jiffies32);
 			/* delta may not be positive if the socket is locked
 			 * when the retrans timer fires and is rescheduled.
 			 */
@@ -3060,9 +3061,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
 			       struct tcp_sacktag_state *sack)
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
-	struct skb_mstamp first_ackt, last_ackt;
+	u64 first_ackt, last_ackt;
 	struct tcp_sock *tp = tcp_sk(sk);
-	struct skb_mstamp *now = &tp->tcp_mstamp;
 	u32 prior_sacked = tp->sacked_out;
 	u32 reord = tp->packets_out;
 	bool fully_acked = true;
@@ -3075,7 +3075,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
 	bool rtt_update;
 	int flag = 0;
 
-	first_ackt.v64 = 0;
+	first_ackt = 0;
 
 	while ((skb = tcp_write_queue_head(sk)) && skb != tcp_send_head(sk)) {
 		struct tcp_skb_cb *scb = TCP_SKB_CB(skb);
@@ -3106,8 +3106,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
 			flag |= FLAG_RETRANS_DATA_ACKED;
 		} else if (!(sacked & TCPCB_SACKED_ACKED)) {
 			last_ackt = skb->skb_mstamp;
-			WARN_ON_ONCE(last_ackt.v64 == 0);
-			if (!first_ackt.v64)
+			WARN_ON_ONCE(last_ackt == 0);
+			if (!first_ackt)
 				first_ackt = last_ackt;
 
 			last_in_flight = TCP_SKB_CB(skb)->tx.in_flight;
@@ -3122,7 +3122,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
 			tp->delivered += acked_pcount;
 			if (!tcp_skb_spurious_retrans(tp, skb))
 				tcp_rack_advance(tp, sacked, scb->end_seq,
-						 &skb->skb_mstamp);
+						 skb->skb_mstamp);
 		}
 		if (sacked & TCPCB_LOST)
 			tp->lost_out -= acked_pcount;
@@ -3165,13 +3165,13 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
 	if (skb && (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED))
 		flag |= FLAG_SACK_RENEGING;
 
-	if (likely(first_ackt.v64) && !(flag & FLAG_RETRANS_DATA_ACKED)) {
-		seq_rtt_us = skb_mstamp_us_delta(now, &first_ackt);
-		ca_rtt_us = skb_mstamp_us_delta(now, &last_ackt);
+	if (likely(first_ackt) && !(flag & FLAG_RETRANS_DATA_ACKED)) {
+		seq_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, first_ackt);
+		ca_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, last_ackt);
 	}
-	if (sack->first_sackt.v64) {
-		sack_rtt_us = skb_mstamp_us_delta(now, &sack->first_sackt);
-		ca_rtt_us = skb_mstamp_us_delta(now, &sack->last_sackt);
+	if (sack->first_sackt) {
+		sack_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, sack->first_sackt);
+		ca_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, sack->last_sackt);
 	}
 	sack->rate->rtt_us = ca_rtt_us; /* RTT of last (S)ACKed packet, or -1 */
 	rtt_update = tcp_ack_update_rtt(sk, flag, seq_rtt_us, sack_rtt_us,
@@ -3201,7 +3201,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
 		tp->fackets_out -= min(pkts_acked, tp->fackets_out);
 
 	} else if (skb && rtt_update && sack_rtt_us >= 0 &&
-		   sack_rtt_us > skb_mstamp_us_delta(now, &skb->skb_mstamp)) {
+		   sack_rtt_us > tcp_stamp_us_delta(tp->tcp_mstamp, skb->skb_mstamp)) {
 		/* Do not re-arm RTO if the sack RTT is measured from data sent
 		 * after when the head was last (re)transmitted. Otherwise the
 		 * timeout may continue to extend in loss recovery.
@@ -3553,7 +3553,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 	int acked = 0; /* Number of packets newly acked */
 	int rexmit = REXMIT_NONE; /* Flag to (re)transmit to recover losses */
 
-	sack_state.first_sackt.v64 = 0;
+	sack_state.first_sackt = 0;
 	sack_state.rate = &rs;
 
 	/* We very likely will need to access write queue head. */
@@ -5356,7 +5356,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 
-	skb_mstamp_get(&tp->tcp_mstamp);
+	tcp_mstamp_refresh(tp);
 	if (unlikely(!sk->sk_rx_dst))
 		inet_csk(sk)->icsk_af_ops->sk_rx_dst_set(sk, skb);
 	/*
@@ -5672,7 +5672,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 
 		if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
 		    !between(tp->rx_opt.rcv_tsecr, tp->retrans_stamp,
-			     tcp_time_stamp)) {
+			     tcp_time_stamp(tp))) {
 			NET_INC_STATS(sock_net(sk),
 					LINUX_MIB_PAWSACTIVEREJECTED);
 			goto reset_and_undo;
@@ -5917,7 +5917,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 
 	case TCP_SYN_SENT:
 		tp->rx_opt.saw_tstamp = 0;
-		skb_mstamp_get(&tp->tcp_mstamp);
+		tcp_mstamp_refresh(tp);
 		queued = tcp_rcv_synsent_state_process(sk, skb, th);
 		if (queued >= 0)
 			return queued;
@@ -5929,7 +5929,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 		return 0;
 	}
 
-	skb_mstamp_get(&tp->tcp_mstamp);
+	tcp_mstamp_refresh(tp);
 	tp->rx_opt.saw_tstamp = 0;
 	req = tp->fastopen_rsk;
 	if (req) {
@@ -6202,7 +6202,7 @@ static void tcp_openreq_init(struct request_sock *req,
 	req->cookie_ts = 0;
 	tcp_rsk(req)->rcv_isn = TCP_SKB_CB(skb)->seq;
 	tcp_rsk(req)->rcv_nxt = TCP_SKB_CB(skb)->seq + 1;
-	skb_mstamp_get(&tcp_rsk(req)->snt_synack);
+	tcp_rsk(req)->snt_synack = tcp_clock_us();
 	tcp_rsk(req)->last_oow_ack_time = 0;
 	req->mss = rx_opt->mss_clamp;
 	req->ts_recent = rx_opt->saw_tstamp ? rx_opt->rcv_tsval : 0;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index d8fe25db79f223e3fde85882effd2ac6ec15f8ca..191b2f78b19d2c8d62c59cc046bd608687679619 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -376,8 +376,9 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
 	struct sock *sk;
 	struct sk_buff *skb;
 	struct request_sock *fastopen;
-	__u32 seq, snd_una;
-	__u32 remaining;
+	u32 seq, snd_una;
+	s32 remaining;
+	u32 delta_us;
 	int err;
 	struct net *net = dev_net(icmp_skb->dev);
 
@@ -483,12 +484,12 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
 		skb = tcp_write_queue_head(sk);
 		BUG_ON(!skb);
 
-		skb_mstamp_get(&tp->tcp_mstamp);
+		tcp_mstamp_refresh(tp);
+		delta_us = (u32)(tp->tcp_mstamp - skb->skb_mstamp);
 		remaining = icsk->icsk_rto -
-			    min(icsk->icsk_rto,
-				tcp_time_stamp - tcp_skb_timestamp(skb));
+			    usecs_to_jiffies(delta_us);
 
-		if (remaining) {
+		if (remaining > 0) {
 			inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
 						  remaining, TCP_RTO_MAX);
 		} else {
@@ -812,7 +813,7 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
 	tcp_v4_send_ack(sk, skb,
 			tcptw->tw_snd_nxt, tcptw->tw_rcv_nxt,
 			tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale,
-			tcp_time_stamp + tcptw->tw_ts_offset,
+			tcp_time_stamp_raw() + tcptw->tw_ts_offset,
 			tcptw->tw_ts_recent,
 			tw->tw_bound_dev_if,
 			tcp_twsk_md5_key(tcptw),
@@ -840,7 +841,7 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
 	tcp_v4_send_ack(sk, skb, seq,
 			tcp_rsk(req)->rcv_nxt,
 			req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
-			tcp_time_stamp + tcp_rsk(req)->ts_off,
+			tcp_time_stamp_raw() + tcp_rsk(req)->ts_off,
 			req->ts_recent,
 			0,
 			tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->daddr,
diff --git a/net/ipv4/tcp_lp.c b/net/ipv4/tcp_lp.c
index ef3122abb3734a63011fba035f7a7aae431da8de..ae10ed64fe13c5278161f92ccecb51653c87db5e 100644
--- a/net/ipv4/tcp_lp.c
+++ b/net/ipv4/tcp_lp.c
@@ -37,7 +37,7 @@
 #include <net/tcp.h>
 
 /* resolution of owd */
-#define LP_RESOL       1000
+#define LP_RESOL       TCP_TS_HZ
 
 /**
  * enum tcp_lp_state
@@ -147,9 +147,9 @@ static u32 tcp_lp_remote_hz_estimator(struct sock *sk)
 	    tp->rx_opt.rcv_tsecr == lp->local_ref_time)
 		goto out;
 
-	m = HZ * (tp->rx_opt.rcv_tsval -
-		  lp->remote_ref_time) / (tp->rx_opt.rcv_tsecr -
-					  lp->local_ref_time);
+	m = TCP_TS_HZ *
+	    (tp->rx_opt.rcv_tsval - lp->remote_ref_time) /
+	    (tp->rx_opt.rcv_tsecr - lp->local_ref_time);
 	if (m < 0)
 		m = -m;
 
@@ -194,7 +194,7 @@ static u32 tcp_lp_owd_calculator(struct sock *sk)
 	if (lp->flag & LP_VALID_RHZ) {
 		owd =
 		    tp->rx_opt.rcv_tsval * (LP_RESOL / lp->remote_hz) -
-		    tp->rx_opt.rcv_tsecr * (LP_RESOL / HZ);
+		    tp->rx_opt.rcv_tsecr * (LP_RESOL / TCP_TS_HZ);
 		if (owd < 0)
 			owd = -owd;
 	}
@@ -264,7 +264,7 @@ static void tcp_lp_pkts_acked(struct sock *sk, const struct ack_sample *sample)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct lp *lp = inet_csk_ca(sk);
-	u32 now = tcp_time_stamp;
+	u32 now = tcp_time_stamp(tp);
 	u32 delta;
 
 	if (sample->rtt_us > 0)
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 6504f1082bdfda77bfc1b53d0d85928e5083a24e..d0642df7304452b57d2bc7f92a0a0c6d821553d3 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -455,7 +455,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 		newtp->fackets_out = 0;
 		newtp->snd_ssthresh = TCP_INFINITE_SSTHRESH;
 		newtp->tlp_high_seq = 0;
-		newtp->lsndtime = treq->snt_synack.stamp_jiffies;
+		newtp->lsndtime = tcp_jiffies32;
 		newsk->sk_txhash = treq->txhash;
 		newtp->last_oow_ack_time = 0;
 		newtp->total_retrans = req->num_retrans;
@@ -526,7 +526,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 		newtp->fastopen_req = NULL;
 		newtp->fastopen_rsk = NULL;
 		newtp->syn_data_acked = 0;
-		newtp->rack.mstamp.v64 = 0;
+		newtp->rack.mstamp = 0;
 		newtp->rack.advanced = 0;
 
 		__TCP_INC_STATS(sock_net(sk), TCP_MIB_PASSIVEOPENS);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 65472e931a0b79f7078a4da7db802dfcc32c7621..478f75baee31d28b4e3122f7635cd1addf20cb98 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1962,7 +1962,7 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
 
 	head = tcp_write_queue_head(sk);
 
-	age = skb_mstamp_us_delta(&tp->tcp_mstamp, &head->skb_mstamp);
+	age = tcp_stamp_us_delta(tp->tcp_mstamp, head->skb_mstamp);
 	/* If next ACK is likely to come too late (half srtt), do not defer */
 	if (age < (tp->srtt_us >> 4))
 		goto send_now;
@@ -2279,7 +2279,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 	}
 
 	max_segs = tcp_tso_segs(sk, mss_now);
-	skb_mstamp_get(&tp->tcp_mstamp);
+	tcp_mstamp_refresh(tp);
 	while ((skb = tcp_send_head(sk))) {
 		unsigned int limit;
 
@@ -3095,7 +3095,7 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority)
 	skb_reserve(skb, MAX_TCP_HEADER);
 	tcp_init_nondata_skb(skb, tcp_acceptable_seq(sk),
 			     TCPHDR_ACK | TCPHDR_RST);
-	skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
+	tcp_mstamp_refresh(tcp_sk(sk));
 	/* Send it off. */
 	if (tcp_transmit_skb(sk, skb, 0, priority))
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTFAILED);
@@ -3191,10 +3191,10 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 	memset(&opts, 0, sizeof(opts));
 #ifdef CONFIG_SYN_COOKIES
 	if (unlikely(req->cookie_ts))
-		skb->skb_mstamp.stamp_jiffies = cookie_init_timestamp(req);
+		skb->skb_mstamp = cookie_init_timestamp(req);
 	else
 #endif
-	skb_mstamp_get(&skb->skb_mstamp);
+		skb->skb_mstamp = tcp_clock_us();
 
 #ifdef CONFIG_TCP_MD5SIG
 	rcu_read_lock();
@@ -3453,8 +3453,8 @@ int tcp_connect(struct sock *sk)
 		return -ENOBUFS;
 
 	tcp_init_nondata_skb(buff, tp->write_seq++, TCPHDR_SYN);
-	skb_mstamp_get(&tp->tcp_mstamp);
-	tp->retrans_stamp = tp->tcp_mstamp.stamp_jiffies;
+	tcp_mstamp_refresh(tp);
+	tp->retrans_stamp = tcp_time_stamp(tp);
 	tcp_connect_queue_skb(sk, buff);
 	tcp_ecn_send_syn(sk, buff);
 
@@ -3615,7 +3615,7 @@ void tcp_send_window_probe(struct sock *sk)
 {
 	if (sk->sk_state == TCP_ESTABLISHED) {
 		tcp_sk(sk)->snd_wl1 = tcp_sk(sk)->rcv_nxt - 1;
-		skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
+		tcp_mstamp_refresh(tcp_sk(sk));
 		tcp_xmit_probe_skb(sk, 0, LINUX_MIB_TCPWINPROBE);
 	}
 }
diff --git a/net/ipv4/tcp_rate.c b/net/ipv4/tcp_rate.c
index c6a9fa8946462100947ab62d86464ff8f99565c2..ad99569d4c1e2c7f0522645217a6f42e0c4155d6 100644
--- a/net/ipv4/tcp_rate.c
+++ b/net/ipv4/tcp_rate.c
@@ -78,7 +78,7 @@ void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct tcp_skb_cb *scb = TCP_SKB_CB(skb);
 
-	if (!scb->tx.delivered_mstamp.v64)
+	if (!scb->tx.delivered_mstamp)
 		return;
 
 	if (!rs->prior_delivered ||
@@ -89,9 +89,9 @@ void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
 		rs->is_retrans	     = scb->sacked & TCPCB_RETRANS;
 
 		/* Find the duration of the "send phase" of this window: */
-		rs->interval_us      = skb_mstamp_us_delta(
-						&skb->skb_mstamp,
-						&scb->tx.first_tx_mstamp);
+		rs->interval_us      = tcp_stamp_us_delta(
+						skb->skb_mstamp,
+						scb->tx.first_tx_mstamp);
 
 		/* Record send time of most recently ACKed packet: */
 		tp->first_tx_mstamp  = skb->skb_mstamp;
@@ -101,7 +101,7 @@ void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
 	 * we don't need to reset since it'll be freed soon.
 	 */
 	if (scb->sacked & TCPCB_SACKED_ACKED)
-		scb->tx.delivered_mstamp.v64 = 0;
+		scb->tx.delivered_mstamp = 0;
 }
 
 /* Update the connection delivery information and generate a rate sample. */
@@ -125,7 +125,7 @@ void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
 	rs->acked_sacked = delivered;	/* freshly ACKed or SACKed */
 	rs->losses = lost;		/* freshly marked lost */
 	/* Return an invalid sample if no timing information is available. */
-	if (!rs->prior_mstamp.v64) {
+	if (!rs->prior_mstamp) {
 		rs->delivered = -1;
 		rs->interval_us = -1;
 		return;
@@ -138,8 +138,8 @@ void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
 	 * longer phase.
 	 */
 	snd_us = rs->interval_us;				/* send phase */
-	ack_us = skb_mstamp_us_delta(&tp->tcp_mstamp,
-				     &rs->prior_mstamp); /* ack phase */
+	ack_us = tcp_stamp_us_delta(tp->tcp_mstamp,
+				    rs->prior_mstamp); /* ack phase */
 	rs->interval_us = max(snd_us, ack_us);
 
 	/* Normally we expect interval_us >= min-rtt.
diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
index cd72b3d3879e88181c8a4639f0334a24e4cda852..fe9a493d02082d3830f37854d5f169f769844ffb 100644
--- a/net/ipv4/tcp_recovery.c
+++ b/net/ipv4/tcp_recovery.c
@@ -17,12 +17,9 @@ static void tcp_rack_mark_skb_lost(struct sock *sk, struct sk_buff *skb)
 	}
 }
 
-static bool tcp_rack_sent_after(const struct skb_mstamp *t1,
-				const struct skb_mstamp *t2,
-				u32 seq1, u32 seq2)
+static bool tcp_rack_sent_after(u64 t1, u64 t2, u32 seq1, u32 seq2)
 {
-	return skb_mstamp_after(t1, t2) ||
-	       (t1->v64 == t2->v64 && after(seq1, seq2));
+	return t1 > t2 || (t1 == t2 && after(seq1, seq2));
 }
 
 /* RACK loss detection (IETF draft draft-ietf-tcpm-rack-01):
@@ -72,14 +69,14 @@ static void tcp_rack_detect_loss(struct sock *sk, u32 *reo_timeout)
 		    scb->sacked & TCPCB_SACKED_ACKED)
 			continue;
 
-		if (tcp_rack_sent_after(&tp->rack.mstamp, &skb->skb_mstamp,
+		if (tcp_rack_sent_after(tp->rack.mstamp, skb->skb_mstamp,
 					tp->rack.end_seq, scb->end_seq)) {
 			/* Step 3 in draft-cheng-tcpm-rack-00.txt:
 			 * A packet is lost if its elapsed time is beyond
 			 * the recent RTT plus the reordering window.
 			 */
-			u32 elapsed = skb_mstamp_us_delta(&tp->tcp_mstamp,
-							  &skb->skb_mstamp);
+			u32 elapsed = tcp_stamp_us_delta(tp->tcp_mstamp,
+							 skb->skb_mstamp);
 			s32 remaining = tp->rack.rtt_us + reo_wnd - elapsed;
 
 			if (remaining < 0) {
@@ -127,16 +124,16 @@ void tcp_rack_mark_lost(struct sock *sk)
  * draft-cheng-tcpm-rack-00.txt
  */
 void tcp_rack_advance(struct tcp_sock *tp, u8 sacked, u32 end_seq,
-		      const struct skb_mstamp *xmit_time)
+		      u64 xmit_time)
 {
 	u32 rtt_us;
 
-	if (tp->rack.mstamp.v64 &&
-	    !tcp_rack_sent_after(xmit_time, &tp->rack.mstamp,
+	if (tp->rack.mstamp &&
+	    !tcp_rack_sent_after(xmit_time, tp->rack.mstamp,
 				 end_seq, tp->rack.end_seq))
 		return;
 
-	rtt_us = skb_mstamp_us_delta(&tp->tcp_mstamp, xmit_time);
+	rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, xmit_time);
 	if (sacked & TCPCB_RETRANS) {
 		/* If the sacked packet was retransmitted, it's ambiguous
 		 * whether the retransmission or the original (or the prior
@@ -152,7 +149,7 @@ void tcp_rack_advance(struct tcp_sock *tp, u8 sacked, u32 end_seq,
 			return;
 	}
 	tp->rack.rtt_us = rtt_us;
-	tp->rack.mstamp = *xmit_time;
+	tp->rack.mstamp = xmit_time;
 	tp->rack.end_seq = end_seq;
 	tp->rack.advanced = 1;
 }
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 6629f47aa7f0182ece7873afcc3daa6f0019e228..27a667bce8060e6b2290fe636c27a79d0d593b48 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -153,8 +153,8 @@ static bool retransmits_timed_out(struct sock *sk,
 				  unsigned int timeout,
 				  bool syn_set)
 {
-	unsigned int linear_backoff_thresh, start_ts;
 	unsigned int rto_base = syn_set ? TCP_TIMEOUT_INIT : TCP_RTO_MIN;
+	unsigned int linear_backoff_thresh, start_ts;
 
 	if (!inet_csk(sk)->icsk_retransmits)
 		return false;
@@ -172,7 +172,7 @@ static bool retransmits_timed_out(struct sock *sk,
 			timeout = ((2 << linear_backoff_thresh) - 1) * rto_base +
 				(boundary - linear_backoff_thresh) * TCP_RTO_MAX;
 	}
-	return (tcp_time_stamp - start_ts) >= timeout;
+	return (tcp_time_stamp(tcp_sk(sk)) - start_ts) >= jiffies_to_msecs(timeout);
 }
 
 /* A write timeout has occurred. Process the after effects. */
@@ -341,7 +341,7 @@ static void tcp_probe_timer(struct sock *sk)
 	if (!start_ts)
 		tcp_send_head(sk)->skb_mstamp = tp->tcp_mstamp;
 	else if (icsk->icsk_user_timeout &&
-		 (s32)(tcp_time_stamp - start_ts) > icsk->icsk_user_timeout)
+		 (s32)(tcp_time_stamp(tp) - start_ts) > icsk->icsk_user_timeout)
 		goto abort;
 
 	max_probes = sock_net(sk)->ipv4.sysctl_tcp_retries2;
@@ -561,7 +561,7 @@ void tcp_write_timer_handler(struct sock *sk)
 		goto out;
 	}
 
-	skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
+	tcp_mstamp_refresh(tcp_sk(sk));
 	event = icsk->icsk_pending;
 
 	switch (event) {
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index 5abc3692b9011b140816dc4ce6223e79e5defddb..971823359f5b98da46c39b86c9ddcefd14df8559 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -211,7 +211,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
 	ireq->wscale_ok		= tcp_opt.wscale_ok;
 	ireq->tstamp_ok		= tcp_opt.saw_tstamp;
 	req->ts_recent		= tcp_opt.saw_tstamp ? tcp_opt.rcv_tsval : 0;
-	treq->snt_synack.v64	= 0;
+	treq->snt_synack	= 0;
 	treq->rcv_isn = ntohl(th->seq) - 1;
 	treq->snt_isn = cookie;
 	treq->ts_off = 0;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 4f4310a36a0481e2bd068e39285011ff28377ea5..233edfabe1dbceaeb6cdd42a2bb379072aeee361 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -949,7 +949,7 @@ static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb)
 
 	tcp_v6_send_ack(sk, skb, tcptw->tw_snd_nxt, tcptw->tw_rcv_nxt,
 			tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale,
-			tcp_time_stamp + tcptw->tw_ts_offset,
+			tcp_time_stamp_raw() + tcptw->tw_ts_offset,
 			tcptw->tw_ts_recent, tw->tw_bound_dev_if, tcp_twsk_md5_key(tcptw),
 			tw->tw_tclass, cpu_to_be32(tw->tw_flowlabel));
 
@@ -971,7 +971,7 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
 			tcp_rsk(req)->snt_isn + 1 : tcp_sk(sk)->snd_nxt,
 			tcp_rsk(req)->rcv_nxt,
 			req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
-			tcp_time_stamp + tcp_rsk(req)->ts_off,
+			tcp_time_stamp_raw() + tcp_rsk(req)->ts_off,
 			req->ts_recent, sk->sk_bound_dev_if,
 			tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->daddr),
 			0, 0);
diff --git a/net/netfilter/nf_synproxy_core.c b/net/netfilter/nf_synproxy_core.c
index a504e87c6ddff1b1266a901549256f29dc1973d1..49bd8bb16b1817e9e06ee64c889e78a923bfd375 100644
--- a/net/netfilter/nf_synproxy_core.c
+++ b/net/netfilter/nf_synproxy_core.c
@@ -152,7 +152,7 @@ void synproxy_init_timestamp_cookie(const struct xt_synproxy_info *info,
 				    struct synproxy_options *opts)
 {
 	opts->tsecr = opts->tsval;
-	opts->tsval = tcp_time_stamp & ~0x3f;
+	opts->tsval = tcp_time_stamp_raw() & ~0x3f;
 
 	if (opts->options & XT_SYNPROXY_OPT_WSCALE) {
 		opts->tsval |= opts->wscale;
-- 
2.13.0.303.g4ebf302169-goog

^ permalink raw reply related	[flat|nested] 36+ messages in thread
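
Editorial illustration, not part of the posted patch: the clock helpers
added to include/net/tcp.h above, and the overflow guard added to
bbr_lt_bw_sampling(), can be sketched in userspace C roughly as follows.
All demo_* names and struct demo_sock are hypothetical stand-ins, the
caller passes the current time in instead of reading local_clock(), and
plain division replaces div_u64(). The 64-bit to signed conversion in
demo_stamp_us_delta() assumes a two's complement platform.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_TS_HZ     1000u     /* TS option ticks per second, as in the patch */
#define USEC_PER_SEC  1000000u
#define USEC_PER_MSEC 1000u

/* Hypothetical stand-in for the single tcp_sock field used here. */
struct demo_sock {
	uint64_t tcp_mstamp;    /* cached per-socket clock, in usec */
};

/* Like tcp_mstamp_refresh(): the cached clock only moves forward. */
static void demo_mstamp_refresh(struct demo_sock *tp, uint64_t now_us)
{
	if (now_us > tp->tcp_mstamp)
		tp->tcp_mstamp = now_us;
}

/* Like tcp_time_stamp(): usec clock scaled to TCP_TS_HZ (1 ms) ticks,
 * then truncated to the 32 bits carried in the TS option.
 */
static uint32_t demo_time_stamp(const struct demo_sock *tp)
{
	return (uint32_t)(tp->tcp_mstamp / (USEC_PER_SEC / TCP_TS_HZ));
}

/* Like tcp_stamp_us_delta(): a negative difference is clamped to 0. */
static uint32_t demo_stamp_us_delta(uint64_t t1, uint64_t t0)
{
	int64_t d = (int64_t)(t1 - t0);

	return (uint32_t)(d > 0 ? d : 0);
}

/* Like the interval check in bbr_lt_bw_sampling(): convert a ms interval
 * to usec. Returns 0 on success, -1 if the interval is shorter than 1 ms
 * (or negative), -2 if multiplying by 1000 would not fit in 32 bits.
 */
static int demo_interval_ms_to_us(uint32_t t_ms, uint32_t *out_us)
{
	if ((int32_t)t_ms < 1)
		return -1;
	if (t_ms >= UINT32_MAX / USEC_PER_MSEC)
		return -2;
	*out_us = t_ms * USEC_PER_MSEC;
	return 0;
}
```

In this sketch a refresh with an older clock value leaves tcp_mstamp
unchanged, which is why the TS value derived from it never goes
backwards, and the -2 case corresponds to the "interval too long; reset"
branch in the BBR hunk.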

* Re: [PATCH net-next 01/15] tcp: use tp->tcp_mstamp in output path
  2017-05-16 21:00 ` [PATCH net-next 01/15] tcp: use tp->tcp_mstamp in output path Eric Dumazet
@ 2017-05-17 13:42   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:42 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> Idea is to later convert tp->tcp_mstamp to a full u64 counter
> using usec resolution, so that we can later have fine
> grained TCP TS clock (RFC 7323), regardless of HZ value.
>
> We try to refresh tp->tcp_mstamp only when necessary.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  net/ipv4/tcp_ipv4.c     |  1 +
>  net/ipv4/tcp_output.c   | 21 +++++++++++----------
>  net/ipv4/tcp_recovery.c |  1 -
>  net/ipv4/tcp_timer.c    |  3 ++-
>  4 files changed, 14 insertions(+), 12 deletions(-)
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 5ab2aac5ca191075383fc75214da816873bb222c..d8fe25db79f223e3fde85882effd2ac6ec15f8ca 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -483,6 +483,7 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
>                 skb = tcp_write_queue_head(sk);
>                 BUG_ON(!skb);
>
> +               skb_mstamp_get(&tp->tcp_mstamp);
>                 remaining = icsk->icsk_rto -
>                             min(icsk->icsk_rto,
>                                 tcp_time_stamp - tcp_skb_timestamp(skb));
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index a32172d69a03cbe76b45ec3094222f6c3a73e27d..4c8a6eaba6b39a2aea061dd6857ed8df954c5ca2 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -997,8 +997,8 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
>         BUG_ON(!skb || !tcp_skb_pcount(skb));
>         tp = tcp_sk(sk);
>
> +       skb->skb_mstamp = tp->tcp_mstamp;
>         if (clone_it) {
> -               skb_mstamp_get(&skb->skb_mstamp);
>                 TCP_SKB_CB(skb)->tx.in_flight = TCP_SKB_CB(skb)->end_seq
>                         - tp->snd_una;
>                 tcp_rate_skb_sent(sk, skb);
> @@ -1906,7 +1906,6 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
>         const struct inet_connection_sock *icsk = inet_csk(sk);
>         u32 age, send_win, cong_win, limit, in_flight;
>         struct tcp_sock *tp = tcp_sk(sk);
> -       struct skb_mstamp now;
>         struct sk_buff *head;
>         int win_divisor;
>
> @@ -1962,8 +1961,8 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
>         }
>
>         head = tcp_write_queue_head(sk);
> -       skb_mstamp_get(&now);
> -       age = skb_mstamp_us_delta(&now, &head->skb_mstamp);
> +
> +       age = skb_mstamp_us_delta(&tp->tcp_mstamp, &head->skb_mstamp);
>         /* If next ACK is likely to come too late (half srtt), do not defer */
>         if (age < (tp->srtt_us >> 4))
>                 goto send_now;
> @@ -2280,6 +2279,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
>         }
>
>         max_segs = tcp_tso_segs(sk, mss_now);
> +       skb_mstamp_get(&tp->tcp_mstamp);
>         while ((skb = tcp_send_head(sk))) {
>                 unsigned int limit;
>
> @@ -2291,7 +2291,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
>
>                 if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE) {
>                         /* "skb_mstamp" is used as a start point for the retransmit timer */
> -                       skb_mstamp_get(&skb->skb_mstamp);
> +                       skb->skb_mstamp = tp->tcp_mstamp;
>                         goto repair; /* Skip network transmission */
>                 }
>
> @@ -2879,7 +2879,7 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
>                      skb_headroom(skb) >= 0xFFFF)) {
>                 struct sk_buff *nskb;
>
> -               skb_mstamp_get(&skb->skb_mstamp);
> +               skb->skb_mstamp = tp->tcp_mstamp;
>                 nskb = __pskb_copy(skb, MAX_TCP_HEADER, GFP_ATOMIC);
>                 err = nskb ? tcp_transmit_skb(sk, nskb, 0, GFP_ATOMIC) :
>                              -ENOBUFS;
> @@ -3095,7 +3095,7 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority)
>         skb_reserve(skb, MAX_TCP_HEADER);
>         tcp_init_nondata_skb(skb, tcp_acceptable_seq(sk),
>                              TCPHDR_ACK | TCPHDR_RST);
> -       skb_mstamp_get(&skb->skb_mstamp);
> +       skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
>         /* Send it off. */
>         if (tcp_transmit_skb(sk, skb, 0, priority))
>                 NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTFAILED);
> @@ -3453,7 +3453,8 @@ int tcp_connect(struct sock *sk)
>                 return -ENOBUFS;
>
>         tcp_init_nondata_skb(buff, tp->write_seq++, TCPHDR_SYN);
> -       tp->retrans_stamp = tcp_time_stamp;
> +       skb_mstamp_get(&tp->tcp_mstamp);
> +       tp->retrans_stamp = tp->tcp_mstamp.stamp_jiffies;
>         tcp_connect_queue_skb(sk, buff);
>         tcp_ecn_send_syn(sk, buff);
>
> @@ -3572,7 +3573,6 @@ void tcp_send_ack(struct sock *sk)
>         skb_set_tcp_pure_ack(buff);
>
>         /* Send it off, this clears delayed acks for us. */
> -       skb_mstamp_get(&buff->skb_mstamp);
>         tcp_transmit_skb(sk, buff, 0, (__force gfp_t)0);
>  }
>  EXPORT_SYMBOL_GPL(tcp_send_ack);
> @@ -3606,15 +3606,16 @@ static int tcp_xmit_probe_skb(struct sock *sk, int urgent, int mib)
>          * send it.
>          */
>         tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TCPHDR_ACK);
> -       skb_mstamp_get(&skb->skb_mstamp);
>         NET_INC_STATS(sock_net(sk), mib);
>         return tcp_transmit_skb(sk, skb, 0, (__force gfp_t)0);
>  }
>
> +/* Called from setsockopt( ... TCP_REPAIR ) */
>  void tcp_send_window_probe(struct sock *sk)
>  {
>         if (sk->sk_state == TCP_ESTABLISHED) {
>                 tcp_sk(sk)->snd_wl1 = tcp_sk(sk)->rcv_nxt - 1;
> +               skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
>                 tcp_xmit_probe_skb(sk, 0, LINUX_MIB_TCPWINPROBE);
>         }
>  }
> diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
> index 362b8c75bfab44cf87c2a01398a146a271bc1119..cd72b3d3879e88181c8a4639f0334a24e4cda852 100644
> --- a/net/ipv4/tcp_recovery.c
> +++ b/net/ipv4/tcp_recovery.c
> @@ -166,7 +166,6 @@ void tcp_rack_reo_timeout(struct sock *sk)
>         u32 timeout, prior_inflight;
>
>         prior_inflight = tcp_packets_in_flight(tp);
> -       skb_mstamp_get(&tp->tcp_mstamp);
>         tcp_rack_detect_loss(sk, &timeout);
>         if (prior_inflight != tcp_packets_in_flight(tp)) {
>                 if (inet_csk(sk)->icsk_ca_state != TCP_CA_Recovery) {
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 86934bcf685a65ec3af3d22f1801ffa33eea76e2..ec7c5473c788d77ae459b38492f2f2606d00d1ba 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -339,7 +339,7 @@ static void tcp_probe_timer(struct sock *sk)
>          */
>         start_ts = tcp_skb_timestamp(tcp_send_head(sk));
>         if (!start_ts)
> -               skb_mstamp_get(&tcp_send_head(sk)->skb_mstamp);
> +               tcp_send_head(sk)->skb_mstamp = tp->tcp_mstamp;
>         else if (icsk->icsk_user_timeout &&
>                  (s32)(tcp_time_stamp - start_ts) > icsk->icsk_user_timeout)
>                 goto abort;
> @@ -561,6 +561,7 @@ void tcp_write_timer_handler(struct sock *sk)
>                 goto out;
>         }
>
> +       skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
>         event = icsk->icsk_pending;
>
>         switch (event) {
> --
> 2.13.0.303.g4ebf302169-goog
>

* Re: [PATCH net-next 02/15] tcp: introduce tcp_jiffies32
  2017-05-16 21:00 ` [PATCH net-next 02/15] tcp: introduce tcp_jiffies32 Eric Dumazet
@ 2017-05-17 13:43   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> We abuse tcp_time_stamp for two different cases:
>
> 1) base to generate TCP Timestamp options (RFC 7323)
>
> 2) A 32bit version of jiffies since some TCP fields
>    are 32bit wide to save memory.
>
> Since we eventually want a 1 ms TCP TS clock,
> regardless of the HZ value, we want to clean things up.
>
> tcp_jiffies32 is the truncated jiffies value,
> which will be used only in places where we want a 'host'
> timestamp.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
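
As an aside on why a truncated 32-bit jiffies value is good enough for
'host' timestamps: intervals are computed with unsigned subtraction and
then interpreted as signed, so the result stays correct across a counter
wrap as long as the real interval is under 2^31 ticks. The sketch below
is plain userspace C; jiffies32() and elapsed32() are hypothetical names
mirroring the ((u32)jiffies) macro and the (s32)(now - then) idiom used
throughout this series.

```c
#include <stdint.h>

/* Mirrors ((u32)jiffies): keep only the low 32 bits of the tick counter. */
static uint32_t jiffies32(uint64_t jiffies)
{
	return (uint32_t)jiffies;
}

/* Signed distance from 'then' to 'now', valid for gaps under 2^31 ticks.
 * The subtraction wraps modulo 2^32; the cast to int32_t assumes the
 * usual two's complement conversion.
 */
static int32_t elapsed32(uint32_t now, uint32_t then)
{
	return (int32_t)(now - then);
}
```

For example, with then = 0xfffffffb and now = 5 (the counter wrapped in
between), elapsed32() yields 10 ticks rather than a huge negative number.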

> ---
>  include/net/tcp.h | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index b4dc93dae98c2d175ccadce150083705d237555e..4b45be5708215bae4551a5430b63ab2777baf447 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -700,11 +700,14 @@ u32 __tcp_select_window(struct sock *sk);
>
>  void tcp_send_window_probe(struct sock *sk);
>
> -/* TCP timestamps are only 32-bits, this causes a slight
> - * complication on 64-bit systems since we store a snapshot
> - * of jiffies in the buffer control blocks below.  We decided
> - * to use only the low 32-bits of jiffies and hide the ugly
> - * casts with the following macro.
> +/* TCP uses 32bit jiffies to save some space.
> + * Note that this is different from tcp_time_stamp, which
> + * historically has been the same until linux-4.13.
> + */
> +#define tcp_jiffies32 ((u32)jiffies)
> +
> +/* Generator for TCP TS option (RFC 7323)
> + * Currently tied to 'jiffies' but will soon be driven by 1 ms clock.
>   */
>  #define tcp_time_stamp         ((__u32)(jiffies))
>
> --
> 2.13.0.303.g4ebf302169-goog
>

* Re: [PATCH net-next 03/15] dccp: do not use tcp_time_stamp
  2017-05-16 21:00 ` [PATCH net-next 03/15] dccp: do not use tcp_time_stamp Eric Dumazet
@ 2017-05-17 13:43   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> Use our own macro instead of abusing tcp_time_stamp
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  net/dccp/ccids/ccid2.c | 8 ++++----
>  net/dccp/ccids/ccid2.h | 2 +-
>  2 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/net/dccp/ccids/ccid2.c b/net/dccp/ccids/ccid2.c
> index 5e3a7302f7747e4c4f3134eacab2f2c65b13402f..e1295d5f2c562e8785f59a0f5bd7064f471e85ab 100644
> --- a/net/dccp/ccids/ccid2.c
> +++ b/net/dccp/ccids/ccid2.c
> @@ -233,7 +233,7 @@ static void ccid2_hc_tx_packet_sent(struct sock *sk, unsigned int len)
>  {
>         struct dccp_sock *dp = dccp_sk(sk);
>         struct ccid2_hc_tx_sock *hc = ccid2_hc_tx_sk(sk);
> -       const u32 now = ccid2_time_stamp;
> +       const u32 now = ccid2_jiffies32;
>         struct ccid2_seq *next;
>
>         /* slow-start after idle periods (RFC 2581, RFC 2861) */
> @@ -466,7 +466,7 @@ static void ccid2_new_ack(struct sock *sk, struct ccid2_seq *seqp,
>          * The cleanest solution is to not use the ccid2s_sent field at all
>          * and instead use DCCP timestamps: requires changes in other places.
>          */
> -       ccid2_rtt_estimator(sk, ccid2_time_stamp - seqp->ccid2s_sent);
> +       ccid2_rtt_estimator(sk, ccid2_jiffies32 - seqp->ccid2s_sent);
>  }
>
>  static void ccid2_congestion_event(struct sock *sk, struct ccid2_seq *seqp)
> @@ -478,7 +478,7 @@ static void ccid2_congestion_event(struct sock *sk, struct ccid2_seq *seqp)
>                 return;
>         }
>
> -       hc->tx_last_cong = ccid2_time_stamp;
> +       hc->tx_last_cong = ccid2_jiffies32;
>
>         hc->tx_cwnd      = hc->tx_cwnd / 2 ? : 1U;
>         hc->tx_ssthresh  = max(hc->tx_cwnd, 2U);
> @@ -731,7 +731,7 @@ static int ccid2_hc_tx_init(struct ccid *ccid, struct sock *sk)
>
>         hc->tx_rto       = DCCP_TIMEOUT_INIT;
>         hc->tx_rpdupack  = -1;
> -       hc->tx_last_cong = hc->tx_lsndtime = hc->tx_cwnd_stamp = ccid2_time_stamp;
> +       hc->tx_last_cong = hc->tx_lsndtime = hc->tx_cwnd_stamp = ccid2_jiffies32;
>         hc->tx_cwnd_used = 0;
>         setup_timer(&hc->tx_rtotimer, ccid2_hc_tx_rto_expire,
>                         (unsigned long)sk);
> diff --git a/net/dccp/ccids/ccid2.h b/net/dccp/ccids/ccid2.h
> index 18c97543e522a6b9a5c8a3c817d4b40224adde48..6e50ef2898fb9dd9080217cc167defea6a2e9021 100644
> --- a/net/dccp/ccids/ccid2.h
> +++ b/net/dccp/ccids/ccid2.h
> @@ -27,7 +27,7 @@
>   * CCID-2 timestamping faces the same issues as TCP timestamping.
>   * Hence we reuse/share as much of the code as possible.
>   */
> -#define ccid2_time_stamp       tcp_time_stamp
> +#define ccid2_jiffies32        ((u32)jiffies)
>
>  /* NUMDUPACK parameter from RFC 4341, p. 6 */
>  #define NUMDUPACK      3
> --
> 2.13.0.303.g4ebf302169-goog
>

* Re: [PATCH net-next 04/15] tcp: use tcp_jiffies32 to feed tp->lsndtime
  2017-05-16 21:00 ` [PATCH net-next 04/15] tcp: use tcp_jiffies32 to feed tp->lsndtime Eric Dumazet
@ 2017-05-17 13:43   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> Use tcp_jiffies32 instead of tcp_time_stamp to feed
> tp->lsndtime.
>
> tcp_time_stamp will soon be a little bit more expensive
> than simply reading 'jiffies'.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
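
To make the role of tp->lsndtime concrete: the idle check in the diff
below computes delta = now - lsndtime in jiffies, and if delta exceeds
the RTO the congestion window is restarted, halving it once per RTO of
idle time down to a floor. The sketch below combines that check with the
halving loop visible in tcp_cwnd_restart(). It is a simplified userspace
model, not the kernel code: cwnd_after_idle() is a hypothetical name,
restart_cwnd is passed in rather than derived, and both times are plain
signed tick counts.

```c
#include <stdint.h>

/* Simplified model: halve cwnd once per full RTO spent idle,
 * never going below restart_cwnd. 'idle' and 'rto' are in ticks.
 */
static uint32_t cwnd_after_idle(uint32_t cwnd, uint32_t restart_cwnd,
				int32_t idle, int32_t rto)
{
	if (idle <= rto)	/* not idle long enough: leave cwnd alone */
		return cwnd;

	while ((idle -= rto) > 0 && cwnd > restart_cwnd)
		cwnd >>= 1;

	return cwnd > restart_cwnd ? cwnd : restart_cwnd;
}
```

With cwnd = 80, restart_cwnd = 10 and rto = 100 ticks, an idle period of
250 ticks gives 20, and 350 ticks gives 10.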

> ---
>  include/net/tcp.h     | 2 +-
>  net/ipv4/tcp.c        | 2 +-
>  net/ipv4/tcp_cubic.c  | 2 +-
>  net/ipv4/tcp_input.c  | 4 ++--
>  net/ipv4/tcp_output.c | 4 ++--
>  net/ipv4/tcp_timer.c  | 4 ++--
>  6 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 4b45be5708215bae4551a5430b63ab2777baf447..feba4c0406e551d7e57da3411476735731b4d817 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -1245,7 +1245,7 @@ static inline void tcp_slow_start_after_idle_check(struct sock *sk)
>         if (!sysctl_tcp_slow_start_after_idle || tp->packets_out ||
>             ca_ops->cong_control)
>                 return;
> -       delta = tcp_time_stamp - tp->lsndtime;
> +       delta = tcp_jiffies32 - tp->lsndtime;
>         if (delta > inet_csk(sk)->icsk_rto)
>                 tcp_cwnd_restart(sk, delta);
>  }
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 1e4c76d2b8278ba71d6cc2cf7ebfe483e241f76e..d0bb61ee28bbceff8f2e27416ce87fec94935973 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2841,7 +2841,7 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
>         info->tcpi_retrans = tp->retrans_out;
>         info->tcpi_fackets = tp->fackets_out;
>
> -       now = tcp_time_stamp;
> +       now = tcp_jiffies32;
>         info->tcpi_last_data_sent = jiffies_to_msecs(now - tp->lsndtime);
>         info->tcpi_last_data_recv = jiffies_to_msecs(now - icsk->icsk_ack.lrcvtime);
>         info->tcpi_last_ack_recv = jiffies_to_msecs(now - tp->rcv_tstamp);
> diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
> index 0683ba447d775b6101a929a6aca3eb255cff8932..2052ca740916d0872a41125ab61b769b334a314b 100644
> --- a/net/ipv4/tcp_cubic.c
> +++ b/net/ipv4/tcp_cubic.c
> @@ -155,7 +155,7 @@ static void bictcp_cwnd_event(struct sock *sk, enum tcp_ca_event event)
>  {
>         if (event == CA_EVENT_TX_START) {
>                 struct bictcp *ca = inet_csk_ca(sk);
> -               u32 now = tcp_time_stamp;
> +               u32 now = tcp_jiffies32;
>                 s32 delta;
>
>                 delta = now - tcp_sk(sk)->lsndtime;
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 06e2dbc2b4a212a054fd88e57bb902c55a171b11..c0b3f909df394214785749704f2760171fe9d160 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5571,7 +5571,7 @@ void tcp_finish_connect(struct sock *sk, struct sk_buff *skb)
>         /* Prevent spurious tcp_cwnd_restart() on first data
>          * packet.
>          */
> -       tp->lsndtime = tcp_time_stamp;
> +       tp->lsndtime = tcp_jiffies32;
>
>         tcp_init_buffer_space(sk);
>
> @@ -6008,7 +6008,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
>                         tcp_update_pacing_rate(sk);
>
>                 /* Prevent spurious tcp_cwnd_restart() on first data packet */
> -               tp->lsndtime = tcp_time_stamp;
> +               tp->lsndtime = tcp_jiffies32;
>
>                 tcp_initialize_rcv_mss(sk);
>                 tcp_fast_path_on(tp);
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 4c8a6eaba6b39a2aea061dd6857ed8df954c5ca2..be9f8f483e21bdbb4d944fcdae8560f3ae11ee64 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -160,7 +160,7 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
>                                 struct sock *sk)
>  {
>         struct inet_connection_sock *icsk = inet_csk(sk);
> -       const u32 now = tcp_time_stamp;
> +       const u32 now = tcp_jiffies32;
>
>         if (tcp_packets_in_flight(tp) == 0)
>                 tcp_ca_event(sk, CA_EVENT_TX_START);
> @@ -1918,7 +1918,7 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
>         /* Avoid bursty behavior by allowing defer
>          * only if the last write was recent.
>          */
> -       if ((s32)(tcp_time_stamp - tp->lsndtime) > 0)
> +       if ((s32)(tcp_jiffies32 - tp->lsndtime) > 0)
>                 goto send_now;
>
>         in_flight = tcp_packets_in_flight(tp);
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index ec7c5473c788d77ae459b38492f2f2606d00d1ba..5f6f219a431e41a90b3c5d667a1a22b50f4464cf 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -63,7 +63,7 @@ static int tcp_out_of_resources(struct sock *sk, bool do_reset)
>
>         /* If peer does not open window for long time, or did not transmit
>          * anything for long time, penalize it. */
> -       if ((s32)(tcp_time_stamp - tp->lsndtime) > 2*TCP_RTO_MAX || !do_reset)
> +       if ((s32)(tcp_jiffies32 - tp->lsndtime) > 2*TCP_RTO_MAX || !do_reset)
>                 shift++;
>
>         /* If some dubious ICMP arrived, penalize even more. */
> @@ -73,7 +73,7 @@ static int tcp_out_of_resources(struct sock *sk, bool do_reset)
>         if (tcp_check_oom(sk, shift)) {
>                 /* Catch exceptional cases, when connection requires reset.
>                  *      1. Last segment was sent recently. */
> -               if ((s32)(tcp_time_stamp - tp->lsndtime) <= TCP_TIMEWAIT_LEN ||
> +               if ((s32)(tcp_jiffies32 - tp->lsndtime) <= TCP_TIMEWAIT_LEN ||
>                     /*  2. Window is closed. */
>                     (!tp->snd_wnd && !tp->packets_out))
>                         do_reset = true;
> --
> 2.13.0.303.g4ebf302169-goog
>

* Re: [PATCH net-next 05/15] tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stamp
  2017-05-16 21:00 ` [PATCH net-next 05/15] tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stamp Eric Dumazet
@ 2017-05-17 13:45   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:45 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> Use tcp_jiffies32 instead of tcp_time_stamp to feed
> tp->snd_cwnd_stamp.
>
> tcp_time_stamp will soon be a little bit more expensive
> than simply reading 'jiffies'.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  net/ipv4/tcp_input.c   | 14 +++++++-------
>  net/ipv4/tcp_metrics.c |  2 +-
>  net/ipv4/tcp_output.c  |  8 ++++----
>  3 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index c0b3f909df394214785749704f2760171fe9d160..6a15c9b80b09829799dc37d89ecdbf11ec9ff904 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -463,7 +463,7 @@ void tcp_init_buffer_space(struct sock *sk)
>                 tp->window_clamp = max(2 * tp->advmss, maxwin - tp->advmss);
>
>         tp->rcv_ssthresh = min(tp->rcv_ssthresh, tp->window_clamp);
> -       tp->snd_cwnd_stamp = tcp_time_stamp;
> +       tp->snd_cwnd_stamp = tcp_jiffies32;
>  }
>
>  /* 5. Recalculate window clamp after socket hit its memory bounds. */
> @@ -1954,7 +1954,7 @@ void tcp_enter_loss(struct sock *sk)
>         }
>         tp->snd_cwnd       = 1;
>         tp->snd_cwnd_cnt   = 0;
> -       tp->snd_cwnd_stamp = tcp_time_stamp;
> +       tp->snd_cwnd_stamp = tcp_jiffies32;
>
>         tp->retrans_out = 0;
>         tp->lost_out = 0;
> @@ -2383,7 +2383,7 @@ static void tcp_undo_cwnd_reduction(struct sock *sk, bool unmark_loss)
>                         tcp_ecn_withdraw_cwr(tp);
>                 }
>         }
> -       tp->snd_cwnd_stamp = tcp_time_stamp;
> +       tp->snd_cwnd_stamp = tcp_jiffies32;
>         tp->undo_marker = 0;
>  }
>
> @@ -2520,7 +2520,7 @@ static inline void tcp_end_cwnd_reduction(struct sock *sk)
>         if (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR ||
>             (tp->undo_marker && tp->snd_ssthresh < TCP_INFINITE_SSTHRESH)) {
>                 tp->snd_cwnd = tp->snd_ssthresh;
> -               tp->snd_cwnd_stamp = tcp_time_stamp;
> +               tp->snd_cwnd_stamp = tcp_jiffies32;
>         }
>         tcp_ca_event(sk, CA_EVENT_COMPLETE_CWR);
>  }
> @@ -2590,7 +2590,7 @@ static void tcp_mtup_probe_success(struct sock *sk)
>                        tcp_mss_to_mtu(sk, tp->mss_cache) /
>                        icsk->icsk_mtup.probe_size;
>         tp->snd_cwnd_cnt = 0;
> -       tp->snd_cwnd_stamp = tcp_time_stamp;
> +       tp->snd_cwnd_stamp = tcp_jiffies32;
>         tp->snd_ssthresh = tcp_current_ssthresh(sk);
>
>         icsk->icsk_mtup.search_low = icsk->icsk_mtup.probe_size;
> @@ -2976,7 +2976,7 @@ static void tcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
>         const struct inet_connection_sock *icsk = inet_csk(sk);
>
>         icsk->icsk_ca_ops->cong_avoid(sk, ack, acked);
> -       tcp_sk(sk)->snd_cwnd_stamp = tcp_time_stamp;
> +       tcp_sk(sk)->snd_cwnd_stamp = tcp_jiffies32;
>  }
>
>  /* Restart timer after forward progress on connection.
> @@ -5019,7 +5019,7 @@ static void tcp_new_space(struct sock *sk)
>
>         if (tcp_should_expand_sndbuf(sk)) {
>                 tcp_sndbuf_expand(sk);
> -               tp->snd_cwnd_stamp = tcp_time_stamp;
> +               tp->snd_cwnd_stamp = tcp_jiffies32;
>         }
>
>         sk->sk_write_space(sk);
> diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
> index 653bbd67e3a39b68d27d26d17571c00ce2854bfd..102b2c90bb807d3a88d31b59324baf72cf901cdf 100644
> --- a/net/ipv4/tcp_metrics.c
> +++ b/net/ipv4/tcp_metrics.c
> @@ -524,7 +524,7 @@ void tcp_init_metrics(struct sock *sk)
>                 tp->snd_cwnd = 1;
>         else
>                 tp->snd_cwnd = tcp_init_cwnd(tp, dst);
> -       tp->snd_cwnd_stamp = tcp_time_stamp;
> +       tp->snd_cwnd_stamp = tcp_jiffies32;
>  }
>
>  bool tcp_peer_is_proven(struct request_sock *req, struct dst_entry *dst)
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index be9f8f483e21bdbb4d944fcdae8560f3ae11ee64..4bd50f0b236ba23fe521a76dd9d35ee16acb061f 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -151,7 +151,7 @@ void tcp_cwnd_restart(struct sock *sk, s32 delta)
>         while ((delta -= inet_csk(sk)->icsk_rto) > 0 && cwnd > restart_cwnd)
>                 cwnd >>= 1;
>         tp->snd_cwnd = max(cwnd, restart_cwnd);
> -       tp->snd_cwnd_stamp = tcp_time_stamp;
> +       tp->snd_cwnd_stamp = tcp_jiffies32;
>         tp->snd_cwnd_used = 0;
>  }
>
> @@ -1576,7 +1576,7 @@ static void tcp_cwnd_application_limited(struct sock *sk)
>                 }
>                 tp->snd_cwnd_used = 0;
>         }
> -       tp->snd_cwnd_stamp = tcp_time_stamp;
> +       tp->snd_cwnd_stamp = tcp_jiffies32;
>  }
>
>  static void tcp_cwnd_validate(struct sock *sk, bool is_cwnd_limited)
> @@ -1597,14 +1597,14 @@ static void tcp_cwnd_validate(struct sock *sk, bool is_cwnd_limited)
>         if (tcp_is_cwnd_limited(sk)) {
>                 /* Network is feed fully. */
>                 tp->snd_cwnd_used = 0;
> -               tp->snd_cwnd_stamp = tcp_time_stamp;
> +               tp->snd_cwnd_stamp = tcp_jiffies32;
>         } else {
>                 /* Network starves. */
>                 if (tp->packets_out > tp->snd_cwnd_used)
>                         tp->snd_cwnd_used = tp->packets_out;
>
>                 if (sysctl_tcp_slow_start_after_idle &&
> -                   (s32)(tcp_time_stamp - tp->snd_cwnd_stamp) >= inet_csk(sk)->icsk_rto &&
> +                   (s32)(tcp_jiffies32 - tp->snd_cwnd_stamp) >= inet_csk(sk)->icsk_rto &&
>                     !ca_ops->cong_control)
>                         tcp_cwnd_application_limited(sk);
>
> --
> 2.13.0.303.g4ebf302169-goog
>

* Re: [PATCH net-next 06/15] tcp_bbr: use tcp_jiffies32 instead of tcp_time_stamp
  2017-05-16 21:00 ` [PATCH net-next 06/15] tcp_bbr: use tcp_jiffies32 instead of tcp_time_stamp Eric Dumazet
@ 2017-05-17 13:45   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:45 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> Use tcp_jiffies32 instead of tcp_time_stamp, since
> tcp_time_stamp will soon be used only for the TCP TS option.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
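
The BBR hunks below compare jiffies-based deadlines with after(), which
is a wrap-safe "strictly later than" test on 32-bit values. A userspace
sketch of the min_rtt filter expiry check follows. ts_after() is a
hypothetical stand-in for the kernel's after() helper, and
filter_expired() takes HZ and the window length as parameters (an
assumption for the sake of a self-contained example) instead of using
the kernel constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a is strictly later than b, modulo 2^32 (gap under 2^31). */
static bool ts_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Has the filter window, started at 'stamp', run out at time 'now'?
 * All times are in ticks; hz is the number of ticks per second.
 */
static bool filter_expired(uint32_t now, uint32_t stamp,
			   uint32_t win_sec, uint32_t hz)
{
	return ts_after(now, stamp + win_sec * hz);
}
```

The deadline stamp + win_sec * hz may itself wrap past 2^32; the signed
comparison keeps the test correct in that case too.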

> ---
>  net/ipv4/tcp_bbr.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
> index 92b045c72163def1c1d6aa0f2002760186aa5dc3..40dc4fc5f6acba91634290e1cacde69a3584248f 100644
> --- a/net/ipv4/tcp_bbr.c
> +++ b/net/ipv4/tcp_bbr.c
> @@ -730,12 +730,12 @@ static void bbr_update_min_rtt(struct sock *sk, const struct rate_sample *rs)
>         bool filter_expired;
>
>         /* Track min RTT seen in the min_rtt_win_sec filter window: */
> -       filter_expired = after(tcp_time_stamp,
> +       filter_expired = after(tcp_jiffies32,
>                                bbr->min_rtt_stamp + bbr_min_rtt_win_sec * HZ);
>         if (rs->rtt_us >= 0 &&
>             (rs->rtt_us <= bbr->min_rtt_us || filter_expired)) {
>                 bbr->min_rtt_us = rs->rtt_us;
> -               bbr->min_rtt_stamp = tcp_time_stamp;
> +               bbr->min_rtt_stamp = tcp_jiffies32;
>         }
>
>         if (bbr_probe_rtt_mode_ms > 0 && filter_expired &&
> @@ -754,7 +754,7 @@ static void bbr_update_min_rtt(struct sock *sk, const struct rate_sample *rs)
>                 /* Maintain min packets in flight for max(200 ms, 1 round). */
>                 if (!bbr->probe_rtt_done_stamp &&
>                     tcp_packets_in_flight(tp) <= bbr_cwnd_min_target) {
> -                       bbr->probe_rtt_done_stamp = tcp_time_stamp +
> +                       bbr->probe_rtt_done_stamp = tcp_jiffies32 +
>                                 msecs_to_jiffies(bbr_probe_rtt_mode_ms);
>                         bbr->probe_rtt_round_done = 0;
>                         bbr->next_rtt_delivered = tp->delivered;
> @@ -762,8 +762,8 @@ static void bbr_update_min_rtt(struct sock *sk, const struct rate_sample *rs)
>                         if (bbr->round_start)
>                                 bbr->probe_rtt_round_done = 1;
>                         if (bbr->probe_rtt_round_done &&
> -                           after(tcp_time_stamp, bbr->probe_rtt_done_stamp)) {
> -                               bbr->min_rtt_stamp = tcp_time_stamp;
> +                           after(tcp_jiffies32, bbr->probe_rtt_done_stamp)) {
> +                               bbr->min_rtt_stamp = tcp_jiffies32;
>                                 bbr->restore_cwnd = 1;  /* snap to prior_cwnd */
>                                 bbr_reset_mode(sk);
>                         }
> @@ -810,7 +810,7 @@ static void bbr_init(struct sock *sk)
>         bbr->probe_rtt_done_stamp = 0;
>         bbr->probe_rtt_round_done = 0;
>         bbr->min_rtt_us = tcp_min_rtt(tp);
> -       bbr->min_rtt_stamp = tcp_time_stamp;
> +       bbr->min_rtt_stamp = tcp_jiffies32;
>
>         minmax_reset(&bbr->bw, bbr->rtt_cnt, 0);  /* init max bw to 0 */
>
> --
> 2.13.0.303.g4ebf302169-goog
>

* Re: [PATCH net-next 07/15] tcp: bic,cubic: use tcp_jiffies32 instead of tcp_time_stamp
  2017-05-16 21:00 ` [PATCH net-next 07/15] tcp: bic,cubic: " Eric Dumazet
@ 2017-05-17 13:46   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> Use tcp_jiffies32 instead of tcp_time_stamp, since
> tcp_time_stamp will soon be used only for the TCP TS option.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  net/ipv4/tcp_bic.c   |  6 +++---
>  net/ipv4/tcp_cubic.c | 12 ++++++------
>  2 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/net/ipv4/tcp_bic.c b/net/ipv4/tcp_bic.c
> index 36087bca9f489646c2ca5aae3111449a956dd33b..609965f0e29836ed95605a2c7f3170e67c641058 100644
> --- a/net/ipv4/tcp_bic.c
> +++ b/net/ipv4/tcp_bic.c
> @@ -84,14 +84,14 @@ static void bictcp_init(struct sock *sk)
>  static inline void bictcp_update(struct bictcp *ca, u32 cwnd)
>  {
>         if (ca->last_cwnd == cwnd &&
> -           (s32)(tcp_time_stamp - ca->last_time) <= HZ / 32)
> +           (s32)(tcp_jiffies32 - ca->last_time) <= HZ / 32)
>                 return;
>
>         ca->last_cwnd = cwnd;
> -       ca->last_time = tcp_time_stamp;
> +       ca->last_time = tcp_jiffies32;
>
>         if (ca->epoch_start == 0) /* record the beginning of an epoch */
> -               ca->epoch_start = tcp_time_stamp;
> +               ca->epoch_start = tcp_jiffies32;
>
>         /* start off normal */
>         if (cwnd <= low_window) {
> diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
> index 2052ca740916d0872a41125ab61b769b334a314b..57ae5b5ae643efad106f5d6ac224ca54a52f9689 100644
> --- a/net/ipv4/tcp_cubic.c
> +++ b/net/ipv4/tcp_cubic.c
> @@ -231,21 +231,21 @@ static inline void bictcp_update(struct bictcp *ca, u32 cwnd, u32 acked)
>         ca->ack_cnt += acked;   /* count the number of ACKed packets */
>
>         if (ca->last_cwnd == cwnd &&
> -           (s32)(tcp_time_stamp - ca->last_time) <= HZ / 32)
> +           (s32)(tcp_jiffies32 - ca->last_time) <= HZ / 32)
>                 return;
>
>         /* The CUBIC function can update ca->cnt at most once per jiffy.
>          * On all cwnd reduction events, ca->epoch_start is set to 0,
>          * which will force a recalculation of ca->cnt.
>          */
> -       if (ca->epoch_start && tcp_time_stamp == ca->last_time)
> +       if (ca->epoch_start && tcp_jiffies32 == ca->last_time)
>                 goto tcp_friendliness;
>
>         ca->last_cwnd = cwnd;
> -       ca->last_time = tcp_time_stamp;
> +       ca->last_time = tcp_jiffies32;
>
>         if (ca->epoch_start == 0) {
> -               ca->epoch_start = tcp_time_stamp;       /* record beginning */
> +               ca->epoch_start = tcp_jiffies32;        /* record beginning */
>                 ca->ack_cnt = acked;                    /* start counting */
>                 ca->tcp_cwnd = cwnd;                    /* syn with cubic */
>
> @@ -276,7 +276,7 @@ static inline void bictcp_update(struct bictcp *ca, u32 cwnd, u32 acked)
>          * if the cwnd < 1 million packets !!!
>          */
>
> -       t = (s32)(tcp_time_stamp - ca->epoch_start);
> +       t = (s32)(tcp_jiffies32 - ca->epoch_start);
>         t += msecs_to_jiffies(ca->delay_min >> 3);
>         /* change the unit from HZ to bictcp_HZ */
>         t <<= BICTCP_HZ;
> @@ -448,7 +448,7 @@ static void bictcp_acked(struct sock *sk, const struct ack_sample *sample)
>                 return;
>
>         /* Discard delay samples right after fast recovery */
> -       if (ca->epoch_start && (s32)(tcp_time_stamp - ca->epoch_start) < HZ)
> +       if (ca->epoch_start && (s32)(tcp_jiffies32 - ca->epoch_start) < HZ)
>                 return;
>
>         delay = (sample->rtt_us << 3) / USEC_PER_MSEC;
> --
> 2.13.0.303.g4ebf302169-goog
>
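
A minimal userspace sketch of the wraparound-safe comparison that the
quoted bictcp_update() hunks rely on. SKETCH_HZ and
sketch_recently_updated() are hypothetical names chosen for this
illustration; they are not kernel symbols, and 250 is only an example
tick rate.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's HZ; 250 is only an example. */
#define SKETCH_HZ 250

/* Returns 1 when at most HZ/32 ticks have elapsed since 'last'.
 * The unsigned subtraction wraps modulo 2^32, and the cast to a signed
 * type turns the result into a small signed distance, so the test keeps
 * working when the 32-bit counter rolls over between 'last' and 'now'.
 */
static int sketch_recently_updated(uint32_t now, uint32_t last)
{
	return (int32_t)(now - last) <= SKETCH_HZ / 32;
}
```

With SKETCH_HZ at 250 the threshold is 7 ticks, so a 'last' value of
0xFFFFFFFE and a 'now' value of 2 (four ticks later, across the wrap)
still count as recent.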

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next 08/15] tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtime
  2017-05-16 21:00 ` [PATCH net-next 08/15] tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtime Eric Dumazet
@ 2017-05-17 13:46   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> Use tcp_jiffies32 instead of tcp_time_stamp, since
> tcp_time_stamp will soon be only used for TCP TS option.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  include/net/tcp.h        | 4 ++--
>  net/ipv4/tcp_input.c     | 6 +++---
>  net/ipv4/tcp_minisocks.c | 2 +-
>  net/ipv4/tcp_output.c    | 2 +-
>  net/ipv4/tcp_timer.c     | 2 +-
>  5 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index feba4c0406e551d7e57da3411476735731b4d817..5b2932b8363fb8546322ebff7c74663139b3371d 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -1307,8 +1307,8 @@ static inline u32 keepalive_time_elapsed(const struct tcp_sock *tp)
>  {
>         const struct inet_connection_sock *icsk = &tp->inet_conn;
>
> -       return min_t(u32, tcp_time_stamp - icsk->icsk_ack.lrcvtime,
> -                         tcp_time_stamp - tp->rcv_tstamp);
> +       return min_t(u32, tcp_jiffies32 - icsk->icsk_ack.lrcvtime,
> +                         tcp_jiffies32 - tp->rcv_tstamp);
>  }
>
>  static inline int tcp_fin_time(const struct sock *sk)
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 6a15c9b80b09829799dc37d89ecdbf11ec9ff904..eeb4967df25a8dc35128d0a0848b5ae7ee6d63e3 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -672,7 +672,7 @@ static void tcp_event_data_recv(struct sock *sk, struct sk_buff *skb)
>
>         tcp_rcv_rtt_measure(tp);
>
> -       now = tcp_time_stamp;
> +       now = tcp_jiffies32;
>
>         if (!icsk->icsk_ack.ato) {
>                 /* The _first_ data packet received, initialize
> @@ -3636,7 +3636,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
>          */
>         sk->sk_err_soft = 0;
>         icsk->icsk_probes_out = 0;
> -       tp->rcv_tstamp = tcp_time_stamp;
> +       tp->rcv_tstamp = tcp_jiffies32;
>         if (!prior_packets)
>                 goto no_queue;
>
> @@ -5554,7 +5554,7 @@ void tcp_finish_connect(struct sock *sk, struct sk_buff *skb)
>         struct inet_connection_sock *icsk = inet_csk(sk);
>
>         tcp_set_state(sk, TCP_ESTABLISHED);
> -       icsk->icsk_ack.lrcvtime = tcp_time_stamp;
> +       icsk->icsk_ack.lrcvtime = tcp_jiffies32;
>
>         if (skb) {
>                 icsk->icsk_af_ops->sk_rx_dst_set(sk, skb);
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index 717be4de53248352c758b50557987d898340dd4f..59c32e0086c0e46d7955dffe211ec03bb18dcb12 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -447,7 +447,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
>                 newtp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
>                 minmax_reset(&newtp->rtt_min, tcp_time_stamp, ~0U);
>                 newicsk->icsk_rto = TCP_TIMEOUT_INIT;
> -               newicsk->icsk_ack.lrcvtime = tcp_time_stamp;
> +               newicsk->icsk_ack.lrcvtime = tcp_jiffies32;
>
>                 newtp->packets_out = 0;
>                 newtp->retrans_out = 0;
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 4bd50f0b236ba23fe521a76dd9d35ee16acb061f..cbda5de164495cf318960489bd8edf98fe3a5033 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -3324,7 +3324,7 @@ static void tcp_connect_init(struct sock *sk)
>         if (likely(!tp->repair))
>                 tp->rcv_nxt = 0;
>         else
> -               tp->rcv_tstamp = tcp_time_stamp;
> +               tp->rcv_tstamp = tcp_jiffies32;
>         tp->rcv_wup = tp->rcv_nxt;
>         tp->copied_seq = tp->rcv_nxt;
>
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 5f6f219a431e41a90b3c5d667a1a22b50f4464cf..9e0616cb8c17a6385ac97fc0cd657ef9413a1749 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -451,7 +451,7 @@ void tcp_retransmit_timer(struct sock *sk)
>                                             tp->snd_una, tp->snd_nxt);
>                 }
>  #endif
> -               if (tcp_time_stamp - tp->rcv_tstamp > TCP_RTO_MAX) {
> +               if (tcp_jiffies32 - tp->rcv_tstamp > TCP_RTO_MAX) {
>                         tcp_write_err(sk);
>                         goto out;
>                 }
> --
> 2.13.0.303.g4ebf302169-goog
>
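
As an illustration of the keepalive_time_elapsed() hunk quoted above,
here is a small userspace sketch, assuming 32-bit tick counters. The
function name and parameters are hypothetical; only the "minimum of two
unsigned deltas" idea is taken from the patch.

```c
#include <stdint.h>

/* Ticks since the most recent of two events, each stored as a 32-bit
 * tick stamp. Both subtractions are done in unsigned 32-bit arithmetic,
 * so they stay correct across a single counter wrap.
 */
static uint32_t sketch_keepalive_elapsed(uint32_t now,
					 uint32_t lrcvtime,
					 uint32_t rcv_tstamp)
{
	uint32_t since_data = now - lrcvtime;   /* last data received */
	uint32_t since_ack = now - rcv_tstamp;  /* last ACK received  */

	return since_data < since_ack ? since_data : since_ack;
}
```

Both stamps must come from the same clock as 'now', which is why the
patch switches the readers and the writers of rcv_tstamp and lrcvtime
together.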

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next 09/15] tcp: use tcp_jiffies32 to feed probe_timestamp
  2017-05-16 21:00 ` [PATCH net-next 09/15] tcp: use tcp_jiffies32 to feed probe_timestamp Eric Dumazet
@ 2017-05-17 13:46   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> Use tcp_jiffies32 instead of tcp_time_stamp, since
> tcp_time_stamp will soon be only used for TCP TS option.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  net/ipv4/tcp_output.c | 6 +++---
>  net/ipv4/tcp_timer.c  | 2 +-
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index cbda5de164495cf318960489bd8edf98fe3a5033..f0fd1b4fdb3291638fcdca613d826db2cd27f517 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1475,7 +1475,7 @@ void tcp_mtup_init(struct sock *sk)
>         icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, net->ipv4.sysctl_tcp_base_mss);
>         icsk->icsk_mtup.probe_size = 0;
>         if (icsk->icsk_mtup.enabled)
> -               icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
> +               icsk->icsk_mtup.probe_timestamp = tcp_jiffies32;
>  }
>  EXPORT_SYMBOL(tcp_mtup_init);
>
> @@ -1987,7 +1987,7 @@ static inline void tcp_mtu_check_reprobe(struct sock *sk)
>         s32 delta;
>
>         interval = net->ipv4.sysctl_tcp_probe_interval;
> -       delta = tcp_time_stamp - icsk->icsk_mtup.probe_timestamp;
> +       delta = tcp_jiffies32 - icsk->icsk_mtup.probe_timestamp;
>         if (unlikely(delta >= interval * HZ)) {
>                 int mss = tcp_current_mss(sk);
>
> @@ -1999,7 +1999,7 @@ static inline void tcp_mtu_check_reprobe(struct sock *sk)
>                 icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss);
>
>                 /* Update probe time stamp */
> -               icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
> +               icsk->icsk_mtup.probe_timestamp = tcp_jiffies32;
>         }
>  }
>
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 9e0616cb8c17a6385ac97fc0cd657ef9413a1749..6629f47aa7f0182ece7873afcc3daa6f0019e228 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -115,7 +115,7 @@ static void tcp_mtu_probing(struct inet_connection_sock *icsk, struct sock *sk)
>         if (net->ipv4.sysctl_tcp_mtu_probing) {
>                 if (!icsk->icsk_mtup.enabled) {
>                         icsk->icsk_mtup.enabled = 1;
> -                       icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
> +                       icsk->icsk_mtup.probe_timestamp = tcp_jiffies32;
>                         tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
>                 } else {
>                         struct net *net = sock_net(sk);
> --
> 2.13.0.303.g4ebf302169-goog
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next 10/15] tcp: uses jiffies_32 to feed tp->chrono_start
  2017-05-16 21:00 ` [PATCH net-next 10/15] tcp: uses jiffies_32 to feed tp->chrono_start Eric Dumazet
@ 2017-05-17 13:46   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> tcp_time_stamp will no longer be tied to jiffies.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  net/ipv4/tcp.c        | 2 +-
>  net/ipv4/tcp_output.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index d0bb61ee28bbceff8f2e27416ce87fec94935973..b85bfe7cb11dca68952cc4be19b169d893963fef 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2757,7 +2757,7 @@ static void tcp_get_info_chrono_stats(const struct tcp_sock *tp,
>         for (i = TCP_CHRONO_BUSY; i < __TCP_CHRONO_MAX; ++i) {
>                 stats[i] = tp->chrono_stat[i - 1];
>                 if (i == tp->chrono_type)
> -                       stats[i] += tcp_time_stamp - tp->chrono_start;
> +                       stats[i] += tcp_jiffies32 - tp->chrono_start;
>                 stats[i] *= USEC_PER_SEC / HZ;
>                 total += stats[i];
>         }
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index f0fd1b4fdb3291638fcdca613d826db2cd27f517..1011ea40c2ba4c12cce21149cab176e1fa4db583 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2202,7 +2202,7 @@ static bool tcp_small_queue_check(struct sock *sk, const struct sk_buff *skb,
>
>  static void tcp_chrono_set(struct tcp_sock *tp, const enum tcp_chrono new)
>  {
> -       const u32 now = tcp_time_stamp;
> +       const u32 now = tcp_jiffies32;
>
>         if (tp->chrono_type > TCP_CHRONO_UNSPEC)
>                 tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start;
> --
> 2.13.0.303.g4ebf302169-goog
>
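
A rough sketch of the accounting in the tcp_get_info_chrono_stats()
hunk, assuming a jiffies-style tick counter. SKETCH_HZ,
SKETCH_USEC_PER_SEC and sketch_chrono_usec() are hypothetical names
used only for this example.

```c
#include <stdint.h>

/* Hypothetical stand-ins for HZ and USEC_PER_SEC. */
#define SKETCH_HZ 250
#define SKETCH_USEC_PER_SEC 1000000ULL

/* Converts an accumulated tick count to microseconds. If the chrono is
 * still running, the ticks elapsed since 'start' are added first; the
 * 32-bit subtraction tolerates one wrap of the tick counter.
 */
static uint64_t sketch_chrono_usec(uint32_t accumulated, int running,
				   uint32_t now, uint32_t start)
{
	uint64_t ticks = accumulated;

	if (running)
		ticks += (uint32_t)(now - start);
	return ticks * (SKETCH_USEC_PER_SEC / SKETCH_HZ);
}
```

Because the result is scaled by USEC_PER_SEC / HZ, chrono_start has to
be expressed in jiffies, which is what the switch to tcp_jiffies32
preserves.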

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next 11/15] tcp: use tcp_jiffies32 in __tcp_oow_rate_limited()
  2017-05-16 21:00 ` [PATCH net-next 11/15] tcp: use tcp_jiffies32 in __tcp_oow_rate_limited() Eric Dumazet
@ 2017-05-17 13:47   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> This place wants to use tcp_jiffies32; this is good enough.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  net/ipv4/tcp_input.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index eeb4967df25a8dc35128d0a0848b5ae7ee6d63e3..85575888365a10643e096f9e019adaa3eda87d40 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -3390,7 +3390,7 @@ static bool __tcp_oow_rate_limited(struct net *net, int mib_idx,
>                                    u32 *last_oow_ack_time)
>  {
>         if (*last_oow_ack_time) {
> -               s32 elapsed = (s32)(tcp_time_stamp - *last_oow_ack_time);
> +               s32 elapsed = (s32)(tcp_jiffies32 - *last_oow_ack_time);
>
>                 if (0 <= elapsed && elapsed < sysctl_tcp_invalid_ratelimit) {
>                         NET_INC_STATS(net, mib_idx);
> @@ -3398,7 +3398,7 @@ static bool __tcp_oow_rate_limited(struct net *net, int mib_idx,
>                 }
>         }
>
> -       *last_oow_ack_time = tcp_time_stamp;
> +       *last_oow_ack_time = tcp_jiffies32;
>
>         return false;   /* not rate-limited: go ahead, send dupack now! */
>  }
> --
> 2.13.0.303.g4ebf302169-goog
>
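
A userspace sketch of the rate-limit check in the quoted
__tcp_oow_rate_limited() hunks. The function name and parameters are
hypothetical, the statistics counter is left out, and the "return true"
in the rate-limited branch is an assumption, since that line falls
outside the quoted context.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the previous event is less than 'ratelimit' ticks
 * old, in which case '*last' is left untouched. Otherwise the current
 * time is recorded and false is returned. A '*last' of 0 means that no
 * earlier event has been recorded.
 */
static bool sketch_oow_rate_limited(uint32_t now, uint32_t *last,
				    int32_t ratelimit)
{
	if (*last) {
		int32_t elapsed = (int32_t)(now - *last);

		if (0 <= elapsed && elapsed < ratelimit)
			return true;	/* assumed: rate-limited */
	}

	*last = now;
	return false;
}
```

The explicit "0 <= elapsed" test rejects a stored stamp that appears to
lie in the future, so a stale or wrapped value cannot suppress replies
indefinitely.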

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next 12/15] tcp_westwood: use tcp_jiffies32 instead of tcp_time_stamp
  2017-05-16 21:00 ` [PATCH net-next 12/15] tcp_westwood: use tcp_jiffies32 instead of tcp_time_stamp Eric Dumazet
@ 2017-05-17 13:47   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> This CC does not need 1 ms tcp_time_stamp and can use
> the jiffy based 'timestamp'.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  net/ipv4/tcp_westwood.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/tcp_westwood.c b/net/ipv4/tcp_westwood.c
> index 9775453b8d174c848dc09df83d1fa185422cd8cc..bec9cafbe3f92938e5d79d743d629b2f33464418 100644
> --- a/net/ipv4/tcp_westwood.c
> +++ b/net/ipv4/tcp_westwood.c
> @@ -68,7 +68,7 @@ static void tcp_westwood_init(struct sock *sk)
>         w->cumul_ack = 0;
>         w->reset_rtt_min = 1;
>         w->rtt_min = w->rtt = TCP_WESTWOOD_INIT_RTT;
> -       w->rtt_win_sx = tcp_time_stamp;
> +       w->rtt_win_sx = tcp_jiffies32;
>         w->snd_una = tcp_sk(sk)->snd_una;
>         w->first_ack = 1;
>  }
> @@ -116,7 +116,7 @@ static void tcp_westwood_pkts_acked(struct sock *sk,
>  static void westwood_update_window(struct sock *sk)
>  {
>         struct westwood *w = inet_csk_ca(sk);
> -       s32 delta = tcp_time_stamp - w->rtt_win_sx;
> +       s32 delta = tcp_jiffies32 - w->rtt_win_sx;
>
>         /* Initialize w->snd_una with the first acked sequence number in order
>          * to fix mismatch between tp->snd_una and w->snd_una for the first
> @@ -140,7 +140,7 @@ static void westwood_update_window(struct sock *sk)
>                 westwood_filter(w, delta);
>
>                 w->bk = 0;
> -               w->rtt_win_sx = tcp_time_stamp;
> +               w->rtt_win_sx = tcp_jiffies32;
>         }
>  }
>
> --
> 2.13.0.303.g4ebf302169-goog
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next 13/15] tcp_lp: cache tcp_time_stamp
  2017-05-16 21:00 ` [PATCH net-next 13/15] tcp_lp: cache tcp_time_stamp Eric Dumazet
@ 2017-05-17 13:47   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> tcp_time_stamp will become slightly more expensive soon,
> so cache its value.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  net/ipv4/tcp_lp.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/tcp_lp.c b/net/ipv4/tcp_lp.c
> index d6fb6c067af4641f232b94e7c590c212648e8173..ef3122abb3734a63011fba035f7a7aae431da8de 100644
> --- a/net/ipv4/tcp_lp.c
> +++ b/net/ipv4/tcp_lp.c
> @@ -264,18 +264,19 @@ static void tcp_lp_pkts_acked(struct sock *sk, const struct ack_sample *sample)
>  {
>         struct tcp_sock *tp = tcp_sk(sk);
>         struct lp *lp = inet_csk_ca(sk);
> +       u32 now = tcp_time_stamp;
>         u32 delta;
>
>         if (sample->rtt_us > 0)
>                 tcp_lp_rtt_sample(sk, sample->rtt_us);
>
>         /* calc inference */
> -       delta = tcp_time_stamp - tp->rx_opt.rcv_tsecr;
> +       delta = now - tp->rx_opt.rcv_tsecr;
>         if ((s32)delta > 0)
>                 lp->inference = 3 * delta;
>
>         /* test if within inference */
> -       if (lp->last_drop && (tcp_time_stamp - lp->last_drop < lp->inference))
> +       if (lp->last_drop && (now - lp->last_drop < lp->inference))
>                 lp->flag |= LP_WITHIN_INF;
>         else
>                 lp->flag &= ~LP_WITHIN_INF;
> @@ -312,7 +313,7 @@ static void tcp_lp_pkts_acked(struct sock *sk, const struct ack_sample *sample)
>                 tp->snd_cwnd = max(tp->snd_cwnd >> 1U, 1U);
>
>         /* record this drop time */
> -       lp->last_drop = tcp_time_stamp;
> +       lp->last_drop = now;
>  }
>
>  static struct tcp_congestion_ops tcp_lp __read_mostly = {
> --
> 2.13.0.303.g4ebf302169-goog
>
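
To illustrate the caching idea in the quoted tcp_lp_pkts_acked() hunk,
here is a sketch with a hypothetical clock that counts how often it is
read. All sketch_* names are invented for this example; only the
"sample once, reuse the value" pattern comes from the patch.

```c
#include <stdint.h>

/* Hypothetical clock: the test sets the value, the counter records reads. */
static uint32_t sketch_clock_value;
static uint32_t sketch_clock_reads;

static uint32_t sketch_read_clock(void)
{
	sketch_clock_reads++;
	return sketch_clock_value;
}

struct sketch_lp {
	uint32_t inference;
	uint32_t last_drop;
	int within_inf;
};

/* Samples the clock once and reuses the value for both computations,
 * so they see the same instant and pay for a single clock read.
 */
static void sketch_lp_update(struct sketch_lp *lp, uint32_t tsecr)
{
	uint32_t now = sketch_read_clock();
	uint32_t delta = now - tsecr;

	if ((int32_t)delta > 0)
		lp->inference = 3 * delta;

	lp->within_inf = lp->last_drop &&
			 (now - lp->last_drop < lp->inference);
}
```

Besides saving repeated reads, the single sample guarantees that the
inference window and the last-drop comparison are evaluated against the
same timestamp.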

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next 14/15] tcp: replace misc tcp_time_stamp to tcp_jiffies32
  2017-05-16 21:00 ` [PATCH net-next 14/15] tcp: replace misc tcp_time_stamp to tcp_jiffies32 Eric Dumazet
@ 2017-05-17 13:47   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> After this patch, all uses of tcp_time_stamp will require
> a change when we introduce 1 ms and/or 1 us TCP TS option.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  net/ipv4/tcp.c           | 2 +-
>  net/ipv4/tcp_htcp.c      | 2 +-
>  net/ipv4/tcp_input.c     | 2 +-
>  net/ipv4/tcp_minisocks.c | 2 +-
>  net/ipv4/tcp_output.c    | 4 ++--
>  5 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index b85bfe7cb11dca68952cc4be19b169d893963fef..85005480052626c5769ef100a868c88fad803f75 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -386,7 +386,7 @@ void tcp_init_sock(struct sock *sk)
>
>         icsk->icsk_rto = TCP_TIMEOUT_INIT;
>         tp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
> -       minmax_reset(&tp->rtt_min, tcp_time_stamp, ~0U);
> +       minmax_reset(&tp->rtt_min, tcp_jiffies32, ~0U);
>
>         /* So many TCP implementations out there (incorrectly) count the
>          * initial SYN frame in their delayed-ACK and congestion control
> diff --git a/net/ipv4/tcp_htcp.c b/net/ipv4/tcp_htcp.c
> index 4a4d8e76738fa2831dcc3ecec5924dd3dfb7bf58..3eb78cde6ff0a22b7b411f0ae4258b6ef74ffe73 100644
> --- a/net/ipv4/tcp_htcp.c
> +++ b/net/ipv4/tcp_htcp.c
> @@ -104,7 +104,7 @@ static void measure_achieved_throughput(struct sock *sk,
>         const struct inet_connection_sock *icsk = inet_csk(sk);
>         const struct tcp_sock *tp = tcp_sk(sk);
>         struct htcp *ca = inet_csk_ca(sk);
> -       u32 now = tcp_time_stamp;
> +       u32 now = tcp_jiffies32;
>
>         if (icsk->icsk_ca_state == TCP_CA_Open)
>                 ca->pkts_acked = sample->pkts_acked;
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 85575888365a10643e096f9e019adaa3eda87d40..10e6775464f647a65ea0d19c10b421f9cd38923d 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2911,7 +2911,7 @@ static void tcp_update_rtt_min(struct sock *sk, u32 rtt_us)
>         struct tcp_sock *tp = tcp_sk(sk);
>         u32 wlen = sysctl_tcp_min_rtt_wlen * HZ;
>
> -       minmax_running_min(&tp->rtt_min, wlen, tcp_time_stamp,
> +       minmax_running_min(&tp->rtt_min, wlen, tcp_jiffies32,
>                            rtt_us ? : jiffies_to_usecs(1));
>  }
>
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index 59c32e0086c0e46d7955dffe211ec03bb18dcb12..6504f1082bdfda77bfc1b53d0d85928e5083a24e 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -445,7 +445,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
>
>                 newtp->srtt_us = 0;
>                 newtp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
> -               minmax_reset(&newtp->rtt_min, tcp_time_stamp, ~0U);
> +               minmax_reset(&newtp->rtt_min, tcp_jiffies32, ~0U);
>                 newicsk->icsk_rto = TCP_TIMEOUT_INIT;
>                 newicsk->icsk_ack.lrcvtime = tcp_jiffies32;
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 1011ea40c2ba4c12cce21149cab176e1fa4db583..65472e931a0b79f7078a4da7db802dfcc32c7621 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2418,10 +2418,10 @@ bool tcp_schedule_loss_probe(struct sock *sk)
>         timeout = max_t(u32, timeout, msecs_to_jiffies(10));
>
>         /* If RTO is shorter, just schedule TLP in its place. */
> -       tlp_time_stamp = tcp_time_stamp + timeout;
> +       tlp_time_stamp = tcp_jiffies32 + timeout;
>         rto_time_stamp = (u32)inet_csk(sk)->icsk_timeout;
>         if ((s32)(tlp_time_stamp - rto_time_stamp) > 0) {
> -               s32 delta = rto_time_stamp - tcp_time_stamp;
> +               s32 delta = rto_time_stamp - tcp_jiffies32;
>                 if (delta > 0)
>                         timeout = delta;
>         }
> --
> 2.13.0.303.g4ebf302169-goog
>
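
A minimal sketch of the deadline comparison in the quoted
tcp_schedule_loss_probe() hunk, assuming all values are 32-bit tick
counts from the same clock. sketch_tlp_timeout() and its parameters are
hypothetical names.

```c
#include <stdint.h>

/* Picks the timeout for the loss probe. If the probe would fire after
 * the already scheduled RTO, the timeout is shortened to the remaining
 * time until the RTO, provided that time is still positive.
 */
static uint32_t sketch_tlp_timeout(uint32_t now, uint32_t timeout,
				   uint32_t rto_time)
{
	uint32_t tlp_time = now + timeout;

	if ((int32_t)(tlp_time - rto_time) > 0) {
		int32_t delta = (int32_t)(rto_time - now);

		if (delta > 0)
			timeout = (uint32_t)delta;
	}
	return timeout;
}
```

Since icsk_timeout is a jiffies value, 'now' has to be a jiffies value
as well; mixing in a 1 ms clock would make the two deadlines
incomparable whenever HZ differs from 1000.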

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next 15/15] tcp: switch TCP TS option (RFC 7323) to 1ms clock
  2017-05-16 21:00 ` [PATCH net-next 15/15] tcp: switch TCP TS option (RFC 7323) to 1ms clock Eric Dumazet
@ 2017-05-17 13:51   ` Soheil Hassas Yeganeh
  2017-05-18 12:33   ` Eric Dumazet
  1 sibling, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-17 13:51 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng, Wei Wang, netdev,
	Eric Dumazet

On Tue, May 16, 2017 at 5:00 PM, Eric Dumazet <edumazet@google.com> wrote:
> The TCP Timestamps option is defined in RFC 7323.
>
> Traditionally on Linux, it has been tied to the internal
> 'jiffies' variable, because that was a cheap and good enough
> generator.
>
> For TCP flows on the Internet, 1 ms resolution would be much better
> than 4 ms or 10 ms (HZ=250 or HZ=100 respectively).
>
> For TCP flows in the DC, Google has used usec resolution for more
> than two years with great success [1].
>
> Receive size autotuning (DRS) is indeed more precise and converges
> faster to optimal window size.
>
> This patch converts tp->tcp_mstamp to a plain u64 value storing
> a 1 usec TCP clock.
>
> This choice will allow us to upstream the 1 usec TS option as
> discussed in IETF 97.
>
> [1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
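
The arithmetic behind the commit message can be sketched in userspace C
as follows. The sketch_* names and SKETCH_* constants are hypothetical;
the sketch only assumes a 64-bit microsecond clock and a 1000 Hz TS
clock, as the message describes.

```c
#include <stdint.h>

/* Hypothetical stand-ins for USEC_PER_SEC and the 1000 Hz TS clock rate. */
#define SKETCH_USEC_PER_SEC 1000000u
#define SKETCH_TCP_TS_HZ 1000u

/* Derives a 32-bit, 1 ms TS value from a 64-bit microsecond clock.
 * The result is truncated to 32 bits, so it wraps like any TS value.
 */
static uint32_t sketch_ts_from_usec(uint64_t mstamp_us)
{
	return (uint32_t)(mstamp_us /
			  (SKETCH_USEC_PER_SEC / SKETCH_TCP_TS_HZ));
}

/* Granularity, in milliseconds, of a jiffies-based TS clock. */
static uint32_t sketch_jiffy_ms(uint32_t hz)
{
	return 1000u / hz;
}
```

For example, sketch_jiffy_ms() yields 4 for HZ=250 and 10 for HZ=100,
the two granularities the message calls out, while the value derived
from the microsecond clock advances once per millisecond whatever HZ is.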

> ---
>  include/linux/skbuff.h           | 62 +-------------------------
>  include/linux/tcp.h              | 22 ++++-----
>  include/net/tcp.h                | 59 ++++++++++++++++++++----
>  net/ipv4/syncookies.c            |  8 ++--
>  net/ipv4/tcp.c                   |  4 +-
>  net/ipv4/tcp_bbr.c               | 22 ++++-----
>  net/ipv4/tcp_input.c             | 96 ++++++++++++++++++++--------------------
>  net/ipv4/tcp_ipv4.c              | 17 +++----
>  net/ipv4/tcp_lp.c                | 12 ++---
>  net/ipv4/tcp_minisocks.c         |  4 +-
>  net/ipv4/tcp_output.c            | 16 +++----
>  net/ipv4/tcp_rate.c              | 16 +++----
>  net/ipv4/tcp_recovery.c          | 23 +++++-----
>  net/ipv4/tcp_timer.c             |  8 ++--
>  net/ipv6/syncookies.c            |  2 +-
>  net/ipv6/tcp_ipv6.c              |  4 +-
>  net/netfilter/nf_synproxy_core.c |  2 +-
>  17 files changed, 178 insertions(+), 199 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index bfc7892f6c33c9fdfb7c0d8110f80cfb12d1ae61..7c0cb2ce8b01a9be366d8cdb7e3661f65ebff3c9 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -506,66 +506,6 @@ typedef unsigned int sk_buff_data_t;
>  typedef unsigned char *sk_buff_data_t;
>  #endif
>
> -/**
> - * struct skb_mstamp - multi resolution time stamps
> - * @stamp_us: timestamp in us resolution
> - * @stamp_jiffies: timestamp in jiffies
> - */
> -struct skb_mstamp {
> -       union {
> -               u64             v64;
> -               struct {
> -                       u32     stamp_us;
> -                       u32     stamp_jiffies;
> -               };
> -       };
> -};
> -
> -/**
> - * skb_mstamp_get - get current timestamp
> - * @cl: place to store timestamps
> - */
> -static inline void skb_mstamp_get(struct skb_mstamp *cl)
> -{
> -       u64 val = local_clock();
> -
> -       do_div(val, NSEC_PER_USEC);
> -       cl->stamp_us = (u32)val;
> -       cl->stamp_jiffies = (u32)jiffies;
> -}
> -
> -/**
> - * skb_mstamp_delta - compute the difference in usec between two skb_mstamp
> - * @t1: pointer to newest sample
> - * @t0: pointer to oldest sample
> - */
> -static inline u32 skb_mstamp_us_delta(const struct skb_mstamp *t1,
> -                                     const struct skb_mstamp *t0)
> -{
> -       s32 delta_us = t1->stamp_us - t0->stamp_us;
> -       u32 delta_jiffies = t1->stamp_jiffies - t0->stamp_jiffies;
> -
> -       /* If delta_us is negative, this might be because interval is too big,
> -        * or local_clock() drift is too big : fallback using jiffies.
> -        */
> -       if (delta_us <= 0 ||
> -           delta_jiffies >= (INT_MAX / (USEC_PER_SEC / HZ)))
> -
> -               delta_us = jiffies_to_usecs(delta_jiffies);
> -
> -       return delta_us;
> -}
> -
> -static inline bool skb_mstamp_after(const struct skb_mstamp *t1,
> -                                   const struct skb_mstamp *t0)
> -{
> -       s32 diff = t1->stamp_jiffies - t0->stamp_jiffies;
> -
> -       if (!diff)
> -               diff = t1->stamp_us - t0->stamp_us;
> -       return diff > 0;
> -}
> -
>  /**
>   *     struct sk_buff - socket buffer
>   *     @next: Next buffer in list
> @@ -646,7 +586,7 @@ struct sk_buff {
>
>                         union {
>                                 ktime_t         tstamp;
> -                               struct skb_mstamp skb_mstamp;
> +                               u64             skb_mstamp;
>                         };
>                 };
>                 struct rb_node  rbnode; /* used in netem & tcp stack */
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index 22854f0284347a3bb047709478525ee5a9dd9b36..542ca1ae02c4f64833b287c0fd744283ee518909 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -123,7 +123,7 @@ struct tcp_request_sock_ops;
>  struct tcp_request_sock {
>         struct inet_request_sock        req;
>         const struct tcp_request_sock_ops *af_specific;
> -       struct skb_mstamp               snt_synack; /* first SYNACK sent time */
> +       u64                             snt_synack; /* first SYNACK sent time */
>         bool                            tfo_listener;
>         u32                             txhash;
>         u32                             rcv_isn;
> @@ -211,7 +211,7 @@ struct tcp_sock {
>
>         /* Information of the most recently (s)acked skb */
>         struct tcp_rack {
> -               struct skb_mstamp mstamp; /* (Re)sent time of the skb */
> +               u64 mstamp; /* (Re)sent time of the skb */
>                 u32 rtt_us;  /* Associated RTT */
>                 u32 end_seq; /* Ending TCP sequence of the skb */
>                 u8 advanced; /* mstamp advanced since last lost marking */
> @@ -240,7 +240,7 @@ struct tcp_sock {
>         u32     tlp_high_seq;   /* snd_nxt at the time of TLP retransmit. */
>
>  /* RTT measurement */
> -       struct skb_mstamp tcp_mstamp; /* most recent packet received/sent */
> +       u64     tcp_mstamp;     /* most recent packet received/sent */
>         u32     srtt_us;        /* smoothed round trip time << 3 in usecs */
>         u32     mdev_us;        /* medium deviation                     */
>         u32     mdev_max_us;    /* maximal mdev for the last rtt period */
> @@ -280,8 +280,8 @@ struct tcp_sock {
>         u32     delivered;      /* Total data packets delivered incl. rexmits */
>         u32     lost;           /* Total data packets lost incl. rexmits */
>         u32     app_limited;    /* limited until "delivered" reaches this val */
> -       struct skb_mstamp first_tx_mstamp;  /* start of window send phase */
> -       struct skb_mstamp delivered_mstamp; /* time we reached "delivered" */
> +       u64     first_tx_mstamp;  /* start of window send phase */
> +       u64     delivered_mstamp; /* time we reached "delivered" */
>         u32     rate_delivered;    /* saved rate sample: packets delivered */
>         u32     rate_interval_us;  /* saved rate sample: time elapsed */
>
> @@ -335,16 +335,16 @@ struct tcp_sock {
>
>  /* Receiver side RTT estimation */
>         struct {
> -               u32             rtt_us;
> -               u32             seq;
> -               struct skb_mstamp time;
> +               u32     rtt_us;
> +               u32     seq;
> +               u64     time;
>         } rcv_rtt_est;
>
>  /* Receiver queue space */
>         struct {
> -               int             space;
> -               u32             seq;
> -               struct skb_mstamp time;
> +               int     space;
> +               u32     seq;
> +               u64     time;
>         } rcvq_space;
>
>  /* TCP-specific MTU probe information. */
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 5b2932b8363fb8546322ebff7c74663139b3371d..82462db97183abebb33628eb5e04a5c5f04ea873 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -519,7 +519,7 @@ static inline u32 tcp_cookie_time(void)
>  u32 __cookie_v4_init_sequence(const struct iphdr *iph, const struct tcphdr *th,
>                               u16 *mssp);
>  __u32 cookie_v4_init_sequence(const struct sk_buff *skb, __u16 *mss);
> -__u32 cookie_init_timestamp(struct request_sock *req);
> +u64 cookie_init_timestamp(struct request_sock *req);
>  bool cookie_timestamp_decode(struct tcp_options_received *opt);
>  bool cookie_ecn_ok(const struct tcp_options_received *opt,
>                    const struct net *net, const struct dst_entry *dst);
> @@ -706,14 +706,55 @@ void tcp_send_window_probe(struct sock *sk);
>   */
>  #define tcp_jiffies32 ((u32)jiffies)
>
> -/* Generator for TCP TS option (RFC 7323)
> - * Currently tied to 'jiffies' but will soon be driven by 1 ms clock.
> +/*
> + * Deliver a 32bit value for TCP timestamp option (RFC 7323)
> + * It is no longer tied to jiffies, but to 1 ms clock.
> + * Note: double check if you want to use tcp_jiffies32 instead of this.
> + */
> +#define TCP_TS_HZ      1000
> +
> +static inline u64 tcp_clock_ns(void)
> +{
> +       return local_clock();
> +}
> +
> +static inline u64 tcp_clock_us(void)
> +{
> +       return div_u64(tcp_clock_ns(), NSEC_PER_USEC);
> +}
> +
> +/* This should only be used in contexts where tp->tcp_mstamp is up to date */
> +static inline u32 tcp_time_stamp(const struct tcp_sock *tp)
> +{
> +       return div_u64(tp->tcp_mstamp, USEC_PER_SEC / TCP_TS_HZ);
> +}
> +
> +/* Could use tcp_clock_us() / 1000, but this version uses a single divide */
> +static inline u32 tcp_time_stamp_raw(void)
> +{
> +       return div_u64(tcp_clock_ns(), NSEC_PER_SEC / TCP_TS_HZ);
> +}
> +
> +
> +/* Refresh 1us clock of a TCP socket,
> + * ensuring monotonically increasing values.
>   */
> -#define tcp_time_stamp         ((__u32)(jiffies))
> +static inline void tcp_mstamp_refresh(struct tcp_sock *tp)
> +{
> +       u64 val = tcp_clock_us();
> +
> +       if (val > tp->tcp_mstamp)
> +               tp->tcp_mstamp = val;
> +}
> +
> +static inline u32 tcp_stamp_us_delta(u64 t1, u64 t0)
> +{
> +       return max_t(s64, t1 - t0, 0);
> +}
>
>  static inline u32 tcp_skb_timestamp(const struct sk_buff *skb)
>  {
> -       return skb->skb_mstamp.stamp_jiffies;
> +       return div_u64(skb->skb_mstamp, USEC_PER_SEC / TCP_TS_HZ);
>  }
>
>
> @@ -778,9 +819,9 @@ struct tcp_skb_cb {
>                         /* pkts S/ACKed so far upon tx of skb, incl retrans: */
>                         __u32 delivered;
>                         /* start of send pipeline phase */
> -                       struct skb_mstamp first_tx_mstamp;
> +                       u64 first_tx_mstamp;
>                         /* when we reached the "delivered" count */
> -                       struct skb_mstamp delivered_mstamp;
> +                       u64 delivered_mstamp;
>                 } tx;   /* only used for outgoing skbs */
>                 union {
>                         struct inet_skb_parm    h4;
> @@ -896,7 +937,7 @@ struct ack_sample {
>   * A sample is invalid if "delivered" or "interval_us" is negative.
>   */
>  struct rate_sample {
> -       struct  skb_mstamp prior_mstamp; /* starting timestamp for interval */
> +       u64  prior_mstamp; /* starting timestamp for interval */
>         u32  prior_delivered;   /* tp->delivered at "prior_mstamp" */
>         s32  delivered;         /* number of packets delivered over interval */
>         long interval_us;       /* time for tp->delivered to incr "delivered" */
> @@ -1862,7 +1903,7 @@ void tcp_init(void);
>  /* tcp_recovery.c */
>  extern void tcp_rack_mark_lost(struct sock *sk);
>  extern void tcp_rack_advance(struct tcp_sock *tp, u8 sacked, u32 end_seq,
> -                            const struct skb_mstamp *xmit_time);
> +                            u64 xmit_time);
>  extern void tcp_rack_reo_timeout(struct sock *sk);
>
>  /*
> diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
> index 0257d965f11119acf8c55888d6e672d171ef5f08..6426250a58ea1afb29b673c00bb9d58bd3d21122 100644
> --- a/net/ipv4/syncookies.c
> +++ b/net/ipv4/syncookies.c
> @@ -66,10 +66,10 @@ static u32 cookie_hash(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport,
>   * Since subsequent timestamps use the normal tcp_time_stamp value, we
>   * must make sure that the resulting initial timestamp is <= tcp_time_stamp.
>   */
> -__u32 cookie_init_timestamp(struct request_sock *req)
> +u64 cookie_init_timestamp(struct request_sock *req)
>  {
>         struct inet_request_sock *ireq;
> -       u32 ts, ts_now = tcp_time_stamp;
> +       u32 ts, ts_now = tcp_time_stamp_raw();
>         u32 options = 0;
>
>         ireq = inet_rsk(req);
> @@ -88,7 +88,7 @@ __u32 cookie_init_timestamp(struct request_sock *req)
>                 ts <<= TSBITS;
>                 ts |= options;
>         }
> -       return ts;
> +       return (u64)ts * (USEC_PER_SEC / TCP_TS_HZ);
>  }
>
>
> @@ -343,7 +343,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
>         ireq->wscale_ok         = tcp_opt.wscale_ok;
>         ireq->tstamp_ok         = tcp_opt.saw_tstamp;
>         req->ts_recent          = tcp_opt.saw_tstamp ? tcp_opt.rcv_tsval : 0;
> -       treq->snt_synack.v64    = 0;
> +       treq->snt_synack        = 0;
>         treq->tfo_listener      = false;
>
>         ireq->ir_iif = inet_request_bound_dev_if(sk, skb);
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 85005480052626c5769ef100a868c88fad803f75..b5d18484746daa9189ade316fa9ffc17be30cb60 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2706,7 +2706,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
>                 if (!tp->repair)
>                         err = -EPERM;
>                 else
> -                       tp->tsoffset = val - tcp_time_stamp;
> +                       tp->tsoffset = val - tcp_time_stamp_raw();
>                 break;
>         case TCP_REPAIR_WINDOW:
>                 err = tcp_repair_set_window(tp, optval, optlen);
> @@ -3072,7 +3072,7 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
>                 break;
>
>         case TCP_TIMESTAMP:
> -               val = tcp_time_stamp + tp->tsoffset;
> +               val = tcp_time_stamp_raw() + tp->tsoffset;
>                 break;
>         case TCP_NOTSENT_LOWAT:
>                 val = tp->notsent_lowat;
> diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
> index 40dc4fc5f6acba91634290e1cacde69a3584248f..dbcc9352a48f07a12484e45f3baf0a733e244f75 100644
> --- a/net/ipv4/tcp_bbr.c
> +++ b/net/ipv4/tcp_bbr.c
> @@ -91,7 +91,7 @@ struct bbr {
>         struct minmax bw;       /* Max recent delivery rate in pkts/uS << 24 */
>         u32     rtt_cnt;            /* count of packet-timed rounds elapsed */
>         u32     next_rtt_delivered; /* scb->tx.delivered at end of round */
> -       struct skb_mstamp cycle_mstamp;  /* time of this cycle phase start */
> +       u64     cycle_mstamp;        /* time of this cycle phase start */
>         u32     mode:3,              /* current bbr_mode in state machine */
>                 prev_ca_state:3,     /* CA state on previous ACK */
>                 packet_conservation:1,  /* use packet conservation? */
> @@ -411,7 +411,7 @@ static bool bbr_is_next_cycle_phase(struct sock *sk,
>         struct tcp_sock *tp = tcp_sk(sk);
>         struct bbr *bbr = inet_csk_ca(sk);
>         bool is_full_length =
> -               skb_mstamp_us_delta(&tp->delivered_mstamp, &bbr->cycle_mstamp) >
> +               tcp_stamp_us_delta(tp->delivered_mstamp, bbr->cycle_mstamp) >
>                 bbr->min_rtt_us;
>         u32 inflight, bw;
>
> @@ -497,7 +497,7 @@ static void bbr_reset_lt_bw_sampling_interval(struct sock *sk)
>         struct tcp_sock *tp = tcp_sk(sk);
>         struct bbr *bbr = inet_csk_ca(sk);
>
> -       bbr->lt_last_stamp = tp->delivered_mstamp.stamp_jiffies;
> +       bbr->lt_last_stamp = div_u64(tp->delivered_mstamp, USEC_PER_MSEC);
>         bbr->lt_last_delivered = tp->delivered;
>         bbr->lt_last_lost = tp->lost;
>         bbr->lt_rtt_cnt = 0;
> @@ -551,7 +551,7 @@ static void bbr_lt_bw_sampling(struct sock *sk, const struct rate_sample *rs)
>         struct bbr *bbr = inet_csk_ca(sk);
>         u32 lost, delivered;
>         u64 bw;
> -       s32 t;
> +       u32 t;
>
>         if (bbr->lt_use_bw) {   /* already using long-term rate, lt_bw? */
>                 if (bbr->mode == BBR_PROBE_BW && bbr->round_start &&
> @@ -603,15 +603,15 @@ static void bbr_lt_bw_sampling(struct sock *sk, const struct rate_sample *rs)
>                 return;
>
>         /* Find average delivery rate in this sampling interval. */
> -       t = (s32)(tp->delivered_mstamp.stamp_jiffies - bbr->lt_last_stamp);
> -       if (t < 1)
> -               return;         /* interval is less than one jiffy, so wait */
> -       t = jiffies_to_usecs(t);
> -       /* Interval long enough for jiffies_to_usecs() to return a bogus 0? */
> -       if (t < 1) {
> +       t = div_u64(tp->delivered_mstamp, USEC_PER_MSEC) - bbr->lt_last_stamp;
> +       if ((s32)t < 1)
> +               return;         /* interval is less than one ms, so wait */
> +       /* Check if we can multiply without overflow */
> +       if (t >= ~0U / USEC_PER_MSEC) {
>                 bbr_reset_lt_bw_sampling(sk);  /* interval too long; reset */
>                 return;
>         }
> +       t *= USEC_PER_MSEC;
>         bw = (u64)delivered * BW_UNIT;
>         do_div(bw, t);
>         bbr_lt_bw_interval_done(sk, bw);
> @@ -825,7 +825,7 @@ static void bbr_init(struct sock *sk)
>         bbr->idle_restart = 0;
>         bbr->full_bw = 0;
>         bbr->full_bw_cnt = 0;
> -       bbr->cycle_mstamp.v64 = 0;
> +       bbr->cycle_mstamp = 0;
>         bbr->cycle_idx = 0;
>         bbr_reset_lt_bw_sampling(sk);
>         bbr_reset_startup_mode(sk);
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 10e6775464f647a65ea0d19c10b421f9cd38923d..9a5a9e8eda899666501cca06b37948ab64ae79b2 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -441,7 +441,7 @@ void tcp_init_buffer_space(struct sock *sk)
>                 tcp_sndbuf_expand(sk);
>
>         tp->rcvq_space.space = tp->rcv_wnd;
> -       skb_mstamp_get(&tp->tcp_mstamp);
> +       tcp_mstamp_refresh(tp);
>         tp->rcvq_space.time = tp->tcp_mstamp;
>         tp->rcvq_space.seq = tp->copied_seq;
>
> @@ -555,11 +555,11 @@ static inline void tcp_rcv_rtt_measure(struct tcp_sock *tp)
>  {
>         u32 delta_us;
>
> -       if (tp->rcv_rtt_est.time.v64 == 0)
> +       if (tp->rcv_rtt_est.time == 0)
>                 goto new_measure;
>         if (before(tp->rcv_nxt, tp->rcv_rtt_est.seq))
>                 return;
> -       delta_us = skb_mstamp_us_delta(&tp->tcp_mstamp, &tp->rcv_rtt_est.time);
> +       delta_us = tcp_stamp_us_delta(tp->tcp_mstamp, tp->rcv_rtt_est.time);
>         tcp_rcv_rtt_update(tp, delta_us, 1);
>
>  new_measure:
> @@ -571,13 +571,15 @@ static inline void tcp_rcv_rtt_measure_ts(struct sock *sk,
>                                           const struct sk_buff *skb)
>  {
>         struct tcp_sock *tp = tcp_sk(sk);
> +
>         if (tp->rx_opt.rcv_tsecr &&
>             (TCP_SKB_CB(skb)->end_seq -
> -            TCP_SKB_CB(skb)->seq >= inet_csk(sk)->icsk_ack.rcv_mss))
> -               tcp_rcv_rtt_update(tp,
> -                                  jiffies_to_usecs(tcp_time_stamp -
> -                                                   tp->rx_opt.rcv_tsecr),
> -                                  0);
> +            TCP_SKB_CB(skb)->seq >= inet_csk(sk)->icsk_ack.rcv_mss)) {
> +               u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr;
> +               u32 delta_us = delta * (USEC_PER_SEC / TCP_TS_HZ);
> +
> +               tcp_rcv_rtt_update(tp, delta_us, 0);
> +       }
>  }
>
>  /*
> @@ -590,7 +592,7 @@ void tcp_rcv_space_adjust(struct sock *sk)
>         int time;
>         int copied;
>
> -       time = skb_mstamp_us_delta(&tp->tcp_mstamp, &tp->rcvq_space.time);
> +       time = tcp_stamp_us_delta(tp->tcp_mstamp, tp->rcvq_space.time);
>         if (time < (tp->rcv_rtt_est.rtt_us >> 3) || tp->rcv_rtt_est.rtt_us == 0)
>                 return;
>
> @@ -1134,8 +1136,8 @@ struct tcp_sacktag_state {
>          * that was SACKed. RTO needs the earliest RTT to stay conservative,
>          * but congestion control should still get an accurate delay signal.
>          */
> -       struct skb_mstamp first_sackt;
> -       struct skb_mstamp last_sackt;
> +       u64     first_sackt;
> +       u64     last_sackt;
>         struct rate_sample *rate;
>         int     flag;
>  };
> @@ -1200,7 +1202,7 @@ static u8 tcp_sacktag_one(struct sock *sk,
>                           struct tcp_sacktag_state *state, u8 sacked,
>                           u32 start_seq, u32 end_seq,
>                           int dup_sack, int pcount,
> -                         const struct skb_mstamp *xmit_time)
> +                         u64 xmit_time)
>  {
>         struct tcp_sock *tp = tcp_sk(sk);
>         int fack_count = state->fack_count;
> @@ -1242,9 +1244,9 @@ static u8 tcp_sacktag_one(struct sock *sk,
>                                                            state->reord);
>                                 if (!after(end_seq, tp->high_seq))
>                                         state->flag |= FLAG_ORIG_SACK_ACKED;
> -                               if (state->first_sackt.v64 == 0)
> -                                       state->first_sackt = *xmit_time;
> -                               state->last_sackt = *xmit_time;
> +                               if (state->first_sackt == 0)
> +                                       state->first_sackt = xmit_time;
> +                               state->last_sackt = xmit_time;
>                         }
>
>                         if (sacked & TCPCB_LOST) {
> @@ -1304,7 +1306,7 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
>          */
>         tcp_sacktag_one(sk, state, TCP_SKB_CB(skb)->sacked,
>                         start_seq, end_seq, dup_sack, pcount,
> -                       &skb->skb_mstamp);
> +                       skb->skb_mstamp);
>         tcp_rate_skb_delivered(sk, skb, state->rate);
>
>         if (skb == tp->lost_skb_hint)
> @@ -1356,8 +1358,8 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
>                 tcp_advance_highest_sack(sk, skb);
>
>         tcp_skb_collapse_tstamp(prev, skb);
> -       if (unlikely(TCP_SKB_CB(prev)->tx.delivered_mstamp.v64))
> -               TCP_SKB_CB(prev)->tx.delivered_mstamp.v64 = 0;
> +       if (unlikely(TCP_SKB_CB(prev)->tx.delivered_mstamp))
> +               TCP_SKB_CB(prev)->tx.delivered_mstamp = 0;
>
>         tcp_unlink_write_queue(skb, sk);
>         sk_wmem_free_skb(sk, skb);
> @@ -1587,7 +1589,7 @@ static struct sk_buff *tcp_sacktag_walk(struct sk_buff *skb, struct sock *sk,
>                                                 TCP_SKB_CB(skb)->end_seq,
>                                                 dup_sack,
>                                                 tcp_skb_pcount(skb),
> -                                               &skb->skb_mstamp);
> +                                               skb->skb_mstamp);
>                         tcp_rate_skb_delivered(sk, skb, state->rate);
>
>                         if (!before(TCP_SKB_CB(skb)->seq,
> @@ -2936,9 +2938,12 @@ static inline bool tcp_ack_update_rtt(struct sock *sk, const int flag,
>          * See draft-ietf-tcplw-high-performance-00, section 3.3.
>          */
>         if (seq_rtt_us < 0 && tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
> -           flag & FLAG_ACKED)
> -               seq_rtt_us = ca_rtt_us = jiffies_to_usecs(tcp_time_stamp -
> -                                                         tp->rx_opt.rcv_tsecr);
> +           flag & FLAG_ACKED) {
> +               u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr;
> +               u32 delta_us = delta * (USEC_PER_SEC / TCP_TS_HZ);
> +
> +               seq_rtt_us = ca_rtt_us = delta_us;
> +       }
>         if (seq_rtt_us < 0)
>                 return false;
>
> @@ -2960,12 +2965,8 @@ void tcp_synack_rtt_meas(struct sock *sk, struct request_sock *req)
>  {
>         long rtt_us = -1L;
>
> -       if (req && !req->num_retrans && tcp_rsk(req)->snt_synack.v64) {
> -               struct skb_mstamp now;
> -
> -               skb_mstamp_get(&now);
> -               rtt_us = skb_mstamp_us_delta(&now, &tcp_rsk(req)->snt_synack);
> -       }
> +       if (req && !req->num_retrans && tcp_rsk(req)->snt_synack)
> +               rtt_us = tcp_stamp_us_delta(tcp_clock_us(), tcp_rsk(req)->snt_synack);
>
>         tcp_ack_update_rtt(sk, FLAG_SYN_ACKED, rtt_us, -1L, rtt_us);
>  }
> @@ -3003,7 +3004,7 @@ void tcp_rearm_rto(struct sock *sk)
>                         struct sk_buff *skb = tcp_write_queue_head(sk);
>                         const u32 rto_time_stamp =
>                                 tcp_skb_timestamp(skb) + rto;
> -                       s32 delta = (s32)(rto_time_stamp - tcp_time_stamp);
> +                       s32 delta = (s32)(rto_time_stamp - tcp_jiffies32);
>                         /* delta may not be positive if the socket is locked
>                          * when the retrans timer fires and is rescheduled.
>                          */
> @@ -3060,9 +3061,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
>                                struct tcp_sacktag_state *sack)
>  {
>         const struct inet_connection_sock *icsk = inet_csk(sk);
> -       struct skb_mstamp first_ackt, last_ackt;
> +       u64 first_ackt, last_ackt;
>         struct tcp_sock *tp = tcp_sk(sk);
> -       struct skb_mstamp *now = &tp->tcp_mstamp;
>         u32 prior_sacked = tp->sacked_out;
>         u32 reord = tp->packets_out;
>         bool fully_acked = true;
> @@ -3075,7 +3075,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
>         bool rtt_update;
>         int flag = 0;
>
> -       first_ackt.v64 = 0;
> +       first_ackt = 0;
>
>         while ((skb = tcp_write_queue_head(sk)) && skb != tcp_send_head(sk)) {
>                 struct tcp_skb_cb *scb = TCP_SKB_CB(skb);
> @@ -3106,8 +3106,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
>                         flag |= FLAG_RETRANS_DATA_ACKED;
>                 } else if (!(sacked & TCPCB_SACKED_ACKED)) {
>                         last_ackt = skb->skb_mstamp;
> -                       WARN_ON_ONCE(last_ackt.v64 == 0);
> -                       if (!first_ackt.v64)
> +                       WARN_ON_ONCE(last_ackt == 0);
> +                       if (!first_ackt)
>                                 first_ackt = last_ackt;
>
>                         last_in_flight = TCP_SKB_CB(skb)->tx.in_flight;
> @@ -3122,7 +3122,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
>                         tp->delivered += acked_pcount;
>                         if (!tcp_skb_spurious_retrans(tp, skb))
>                                 tcp_rack_advance(tp, sacked, scb->end_seq,
> -                                                &skb->skb_mstamp);
> +                                                skb->skb_mstamp);
>                 }
>                 if (sacked & TCPCB_LOST)
>                         tp->lost_out -= acked_pcount;
> @@ -3165,13 +3165,13 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
>         if (skb && (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED))
>                 flag |= FLAG_SACK_RENEGING;
>
> -       if (likely(first_ackt.v64) && !(flag & FLAG_RETRANS_DATA_ACKED)) {
> -               seq_rtt_us = skb_mstamp_us_delta(now, &first_ackt);
> -               ca_rtt_us = skb_mstamp_us_delta(now, &last_ackt);
> +       if (likely(first_ackt) && !(flag & FLAG_RETRANS_DATA_ACKED)) {
> +               seq_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, first_ackt);
> +               ca_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, last_ackt);
>         }
> -       if (sack->first_sackt.v64) {
> -               sack_rtt_us = skb_mstamp_us_delta(now, &sack->first_sackt);
> -               ca_rtt_us = skb_mstamp_us_delta(now, &sack->last_sackt);
> +       if (sack->first_sackt) {
> +               sack_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, sack->first_sackt);
> +               ca_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, sack->last_sackt);
>         }
>         sack->rate->rtt_us = ca_rtt_us; /* RTT of last (S)ACKed packet, or -1 */
>         rtt_update = tcp_ack_update_rtt(sk, flag, seq_rtt_us, sack_rtt_us,
> @@ -3201,7 +3201,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
>                 tp->fackets_out -= min(pkts_acked, tp->fackets_out);
>
>         } else if (skb && rtt_update && sack_rtt_us >= 0 &&
> -                  sack_rtt_us > skb_mstamp_us_delta(now, &skb->skb_mstamp)) {
> +                  sack_rtt_us > tcp_stamp_us_delta(tp->tcp_mstamp, skb->skb_mstamp)) {
>                 /* Do not re-arm RTO if the sack RTT is measured from data sent
>                  * after when the head was last (re)transmitted. Otherwise the
>                  * timeout may continue to extend in loss recovery.
> @@ -3553,7 +3553,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
>         int acked = 0; /* Number of packets newly acked */
>         int rexmit = REXMIT_NONE; /* Flag to (re)transmit to recover losses */
>
> -       sack_state.first_sackt.v64 = 0;
> +       sack_state.first_sackt = 0;
>         sack_state.rate = &rs;
>
>         /* We very likely will need to access write queue head. */
> @@ -5356,7 +5356,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
>  {
>         struct tcp_sock *tp = tcp_sk(sk);
>
> -       skb_mstamp_get(&tp->tcp_mstamp);
> +       tcp_mstamp_refresh(tp);
>         if (unlikely(!sk->sk_rx_dst))
>                 inet_csk(sk)->icsk_af_ops->sk_rx_dst_set(sk, skb);
>         /*
> @@ -5672,7 +5672,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
>
>                 if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
>                     !between(tp->rx_opt.rcv_tsecr, tp->retrans_stamp,
> -                            tcp_time_stamp)) {
> +                            tcp_time_stamp(tp))) {
>                         NET_INC_STATS(sock_net(sk),
>                                         LINUX_MIB_PAWSACTIVEREJECTED);
>                         goto reset_and_undo;
> @@ -5917,7 +5917,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
>
>         case TCP_SYN_SENT:
>                 tp->rx_opt.saw_tstamp = 0;
> -               skb_mstamp_get(&tp->tcp_mstamp);
> +               tcp_mstamp_refresh(tp);
>                 queued = tcp_rcv_synsent_state_process(sk, skb, th);
>                 if (queued >= 0)
>                         return queued;
> @@ -5929,7 +5929,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
>                 return 0;
>         }
>
> -       skb_mstamp_get(&tp->tcp_mstamp);
> +       tcp_mstamp_refresh(tp);
>         tp->rx_opt.saw_tstamp = 0;
>         req = tp->fastopen_rsk;
>         if (req) {
> @@ -6202,7 +6202,7 @@ static void tcp_openreq_init(struct request_sock *req,
>         req->cookie_ts = 0;
>         tcp_rsk(req)->rcv_isn = TCP_SKB_CB(skb)->seq;
>         tcp_rsk(req)->rcv_nxt = TCP_SKB_CB(skb)->seq + 1;
> -       skb_mstamp_get(&tcp_rsk(req)->snt_synack);
> +       tcp_rsk(req)->snt_synack = tcp_clock_us();
>         tcp_rsk(req)->last_oow_ack_time = 0;
>         req->mss = rx_opt->mss_clamp;
>         req->ts_recent = rx_opt->saw_tstamp ? rx_opt->rcv_tsval : 0;
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index d8fe25db79f223e3fde85882effd2ac6ec15f8ca..191b2f78b19d2c8d62c59cc046bd608687679619 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -376,8 +376,9 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
>         struct sock *sk;
>         struct sk_buff *skb;
>         struct request_sock *fastopen;
> -       __u32 seq, snd_una;
> -       __u32 remaining;
> +       u32 seq, snd_una;
> +       s32 remaining;
> +       u32 delta_us;
>         int err;
>         struct net *net = dev_net(icmp_skb->dev);
>
> @@ -483,12 +484,12 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
>                 skb = tcp_write_queue_head(sk);
>                 BUG_ON(!skb);
>
> -               skb_mstamp_get(&tp->tcp_mstamp);
> +               tcp_mstamp_refresh(tp);
> +               delta_us = (u32)(tp->tcp_mstamp - skb->skb_mstamp);
>                 remaining = icsk->icsk_rto -
> -                           min(icsk->icsk_rto,
> -                               tcp_time_stamp - tcp_skb_timestamp(skb));
> +                           usecs_to_jiffies(delta_us);
>
> -               if (remaining) {
> +               if (remaining > 0) {
>                         inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
>                                                   remaining, TCP_RTO_MAX);
>                 } else {
> @@ -812,7 +813,7 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
>         tcp_v4_send_ack(sk, skb,
>                         tcptw->tw_snd_nxt, tcptw->tw_rcv_nxt,
>                         tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale,
> -                       tcp_time_stamp + tcptw->tw_ts_offset,
> +                       tcp_time_stamp_raw() + tcptw->tw_ts_offset,
>                         tcptw->tw_ts_recent,
>                         tw->tw_bound_dev_if,
>                         tcp_twsk_md5_key(tcptw),
> @@ -840,7 +841,7 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
>         tcp_v4_send_ack(sk, skb, seq,
>                         tcp_rsk(req)->rcv_nxt,
>                         req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
> -                       tcp_time_stamp + tcp_rsk(req)->ts_off,
> +                       tcp_time_stamp_raw() + tcp_rsk(req)->ts_off,
>                         req->ts_recent,
>                         0,
>                         tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->daddr,
> diff --git a/net/ipv4/tcp_lp.c b/net/ipv4/tcp_lp.c
> index ef3122abb3734a63011fba035f7a7aae431da8de..ae10ed64fe13c5278161f92ccecb51653c87db5e 100644
> --- a/net/ipv4/tcp_lp.c
> +++ b/net/ipv4/tcp_lp.c
> @@ -37,7 +37,7 @@
>  #include <net/tcp.h>
>
>  /* resolution of owd */
> -#define LP_RESOL       1000
> +#define LP_RESOL       TCP_TS_HZ
>
>  /**
>   * enum tcp_lp_state
> @@ -147,9 +147,9 @@ static u32 tcp_lp_remote_hz_estimator(struct sock *sk)
>             tp->rx_opt.rcv_tsecr == lp->local_ref_time)
>                 goto out;
>
> -       m = HZ * (tp->rx_opt.rcv_tsval -
> -                 lp->remote_ref_time) / (tp->rx_opt.rcv_tsecr -
> -                                         lp->local_ref_time);
> +       m = TCP_TS_HZ *
> +           (tp->rx_opt.rcv_tsval - lp->remote_ref_time) /
> +           (tp->rx_opt.rcv_tsecr - lp->local_ref_time);
>         if (m < 0)
>                 m = -m;
>
> @@ -194,7 +194,7 @@ static u32 tcp_lp_owd_calculator(struct sock *sk)
>         if (lp->flag & LP_VALID_RHZ) {
>                 owd =
>                     tp->rx_opt.rcv_tsval * (LP_RESOL / lp->remote_hz) -
> -                   tp->rx_opt.rcv_tsecr * (LP_RESOL / HZ);
> +                   tp->rx_opt.rcv_tsecr * (LP_RESOL / TCP_TS_HZ);
>                 if (owd < 0)
>                         owd = -owd;
>         }
> @@ -264,7 +264,7 @@ static void tcp_lp_pkts_acked(struct sock *sk, const struct ack_sample *sample)
>  {
>         struct tcp_sock *tp = tcp_sk(sk);
>         struct lp *lp = inet_csk_ca(sk);
> -       u32 now = tcp_time_stamp;
> +       u32 now = tcp_time_stamp(tp);
>         u32 delta;
>
>         if (sample->rtt_us > 0)
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index 6504f1082bdfda77bfc1b53d0d85928e5083a24e..d0642df7304452b57d2bc7f92a0a0c6d821553d3 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -455,7 +455,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
>                 newtp->fackets_out = 0;
>                 newtp->snd_ssthresh = TCP_INFINITE_SSTHRESH;
>                 newtp->tlp_high_seq = 0;
> -               newtp->lsndtime = treq->snt_synack.stamp_jiffies;
> +               newtp->lsndtime = tcp_jiffies32;
>                 newsk->sk_txhash = treq->txhash;
>                 newtp->last_oow_ack_time = 0;
>                 newtp->total_retrans = req->num_retrans;
> @@ -526,7 +526,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
>                 newtp->fastopen_req = NULL;
>                 newtp->fastopen_rsk = NULL;
>                 newtp->syn_data_acked = 0;
> -               newtp->rack.mstamp.v64 = 0;
> +               newtp->rack.mstamp = 0;
>                 newtp->rack.advanced = 0;
>
>                 __TCP_INC_STATS(sock_net(sk), TCP_MIB_PASSIVEOPENS);
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 65472e931a0b79f7078a4da7db802dfcc32c7621..478f75baee31d28b4e3122f7635cd1addf20cb98 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1962,7 +1962,7 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
>
>         head = tcp_write_queue_head(sk);
>
> -       age = skb_mstamp_us_delta(&tp->tcp_mstamp, &head->skb_mstamp);
> +       age = tcp_stamp_us_delta(tp->tcp_mstamp, head->skb_mstamp);
>         /* If next ACK is likely to come too late (half srtt), do not defer */
>         if (age < (tp->srtt_us >> 4))
>                 goto send_now;
> @@ -2279,7 +2279,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
>         }
>
>         max_segs = tcp_tso_segs(sk, mss_now);
> -       skb_mstamp_get(&tp->tcp_mstamp);
> +       tcp_mstamp_refresh(tp);
>         while ((skb = tcp_send_head(sk))) {
>                 unsigned int limit;
>
> @@ -3095,7 +3095,7 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority)
>         skb_reserve(skb, MAX_TCP_HEADER);
>         tcp_init_nondata_skb(skb, tcp_acceptable_seq(sk),
>                              TCPHDR_ACK | TCPHDR_RST);
> -       skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
> +       tcp_mstamp_refresh(tcp_sk(sk));
>         /* Send it off. */
>         if (tcp_transmit_skb(sk, skb, 0, priority))
>                 NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTFAILED);
> @@ -3191,10 +3191,10 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
>         memset(&opts, 0, sizeof(opts));
>  #ifdef CONFIG_SYN_COOKIES
>         if (unlikely(req->cookie_ts))
> -               skb->skb_mstamp.stamp_jiffies = cookie_init_timestamp(req);
> +               skb->skb_mstamp = cookie_init_timestamp(req);
>         else
>  #endif
> -       skb_mstamp_get(&skb->skb_mstamp);
> +               skb->skb_mstamp = tcp_clock_us();
>
>  #ifdef CONFIG_TCP_MD5SIG
>         rcu_read_lock();
> @@ -3453,8 +3453,8 @@ int tcp_connect(struct sock *sk)
>                 return -ENOBUFS;
>
>         tcp_init_nondata_skb(buff, tp->write_seq++, TCPHDR_SYN);
> -       skb_mstamp_get(&tp->tcp_mstamp);
> -       tp->retrans_stamp = tp->tcp_mstamp.stamp_jiffies;
> +       tcp_mstamp_refresh(tp);
> +       tp->retrans_stamp = tcp_time_stamp(tp);
>         tcp_connect_queue_skb(sk, buff);
>         tcp_ecn_send_syn(sk, buff);
>
> @@ -3615,7 +3615,7 @@ void tcp_send_window_probe(struct sock *sk)
>  {
>         if (sk->sk_state == TCP_ESTABLISHED) {
>                 tcp_sk(sk)->snd_wl1 = tcp_sk(sk)->rcv_nxt - 1;
> -               skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
> +               tcp_mstamp_refresh(tcp_sk(sk));
>                 tcp_xmit_probe_skb(sk, 0, LINUX_MIB_TCPWINPROBE);
>         }
>  }
> diff --git a/net/ipv4/tcp_rate.c b/net/ipv4/tcp_rate.c
> index c6a9fa8946462100947ab62d86464ff8f99565c2..ad99569d4c1e2c7f0522645217a6f42e0c4155d6 100644
> --- a/net/ipv4/tcp_rate.c
> +++ b/net/ipv4/tcp_rate.c
> @@ -78,7 +78,7 @@ void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
>         struct tcp_sock *tp = tcp_sk(sk);
>         struct tcp_skb_cb *scb = TCP_SKB_CB(skb);
>
> -       if (!scb->tx.delivered_mstamp.v64)
> +       if (!scb->tx.delivered_mstamp)
>                 return;
>
>         if (!rs->prior_delivered ||
> @@ -89,9 +89,9 @@ void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
>                 rs->is_retrans       = scb->sacked & TCPCB_RETRANS;
>
>                 /* Find the duration of the "send phase" of this window: */
> -               rs->interval_us      = skb_mstamp_us_delta(
> -                                               &skb->skb_mstamp,
> -                                               &scb->tx.first_tx_mstamp);
> +               rs->interval_us      = tcp_stamp_us_delta(
> +                                               skb->skb_mstamp,
> +                                               scb->tx.first_tx_mstamp);
>
>                 /* Record send time of most recently ACKed packet: */
>                 tp->first_tx_mstamp  = skb->skb_mstamp;
> @@ -101,7 +101,7 @@ void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
>          * we don't need to reset since it'll be freed soon.
>          */
>         if (scb->sacked & TCPCB_SACKED_ACKED)
> -               scb->tx.delivered_mstamp.v64 = 0;
> +               scb->tx.delivered_mstamp = 0;
>  }
>
>  /* Update the connection delivery information and generate a rate sample. */
> @@ -125,7 +125,7 @@ void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
>         rs->acked_sacked = delivered;   /* freshly ACKed or SACKed */
>         rs->losses = lost;              /* freshly marked lost */
>         /* Return an invalid sample if no timing information is available. */
> -       if (!rs->prior_mstamp.v64) {
> +       if (!rs->prior_mstamp) {
>                 rs->delivered = -1;
>                 rs->interval_us = -1;
>                 return;
> @@ -138,8 +138,8 @@ void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
>          * longer phase.
>          */
>         snd_us = rs->interval_us;                               /* send phase */
> -       ack_us = skb_mstamp_us_delta(&tp->tcp_mstamp,
> -                                    &rs->prior_mstamp); /* ack phase */
> +       ack_us = tcp_stamp_us_delta(tp->tcp_mstamp,
> +                                   rs->prior_mstamp); /* ack phase */
>         rs->interval_us = max(snd_us, ack_us);
>
>         /* Normally we expect interval_us >= min-rtt.
> diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
> index cd72b3d3879e88181c8a4639f0334a24e4cda852..fe9a493d02082d3830f37854d5f169f769844ffb 100644
> --- a/net/ipv4/tcp_recovery.c
> +++ b/net/ipv4/tcp_recovery.c
> @@ -17,12 +17,9 @@ static void tcp_rack_mark_skb_lost(struct sock *sk, struct sk_buff *skb)
>         }
>  }
>
> -static bool tcp_rack_sent_after(const struct skb_mstamp *t1,
> -                               const struct skb_mstamp *t2,
> -                               u32 seq1, u32 seq2)
> +static bool tcp_rack_sent_after(u64 t1, u64 t2, u32 seq1, u32 seq2)
>  {
> -       return skb_mstamp_after(t1, t2) ||
> -              (t1->v64 == t2->v64 && after(seq1, seq2));
> +       return t1 > t2 || (t1 == t2 && after(seq1, seq2));
>  }
>
>  /* RACK loss detection (IETF draft draft-ietf-tcpm-rack-01):
> @@ -72,14 +69,14 @@ static void tcp_rack_detect_loss(struct sock *sk, u32 *reo_timeout)
>                     scb->sacked & TCPCB_SACKED_ACKED)
>                         continue;
>
> -               if (tcp_rack_sent_after(&tp->rack.mstamp, &skb->skb_mstamp,
> +               if (tcp_rack_sent_after(tp->rack.mstamp, skb->skb_mstamp,
>                                         tp->rack.end_seq, scb->end_seq)) {
>                         /* Step 3 in draft-cheng-tcpm-rack-00.txt:
>                          * A packet is lost if its elapsed time is beyond
>                          * the recent RTT plus the reordering window.
>                          */
> -                       u32 elapsed = skb_mstamp_us_delta(&tp->tcp_mstamp,
> -                                                         &skb->skb_mstamp);
> +                       u32 elapsed = tcp_stamp_us_delta(tp->tcp_mstamp,
> +                                                        skb->skb_mstamp);
>                         s32 remaining = tp->rack.rtt_us + reo_wnd - elapsed;
>
>                         if (remaining < 0) {
> @@ -127,16 +124,16 @@ void tcp_rack_mark_lost(struct sock *sk)
>   * draft-cheng-tcpm-rack-00.txt
>   */
>  void tcp_rack_advance(struct tcp_sock *tp, u8 sacked, u32 end_seq,
> -                     const struct skb_mstamp *xmit_time)
> +                     u64 xmit_time)
>  {
>         u32 rtt_us;
>
> -       if (tp->rack.mstamp.v64 &&
> -           !tcp_rack_sent_after(xmit_time, &tp->rack.mstamp,
> +       if (tp->rack.mstamp &&
> +           !tcp_rack_sent_after(xmit_time, tp->rack.mstamp,
>                                  end_seq, tp->rack.end_seq))
>                 return;
>
> -       rtt_us = skb_mstamp_us_delta(&tp->tcp_mstamp, xmit_time);
> +       rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, xmit_time);
>         if (sacked & TCPCB_RETRANS) {
>                 /* If the sacked packet was retransmitted, it's ambiguous
>                  * whether the retransmission or the original (or the prior
> @@ -152,7 +149,7 @@ void tcp_rack_advance(struct tcp_sock *tp, u8 sacked, u32 end_seq,
>                         return;
>         }
>         tp->rack.rtt_us = rtt_us;
> -       tp->rack.mstamp = *xmit_time;
> +       tp->rack.mstamp = xmit_time;
>         tp->rack.end_seq = end_seq;
>         tp->rack.advanced = 1;
>  }
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 6629f47aa7f0182ece7873afcc3daa6f0019e228..27a667bce8060e6b2290fe636c27a79d0d593b48 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -153,8 +153,8 @@ static bool retransmits_timed_out(struct sock *sk,
>                                   unsigned int timeout,
>                                   bool syn_set)
>  {
> -       unsigned int linear_backoff_thresh, start_ts;
>         unsigned int rto_base = syn_set ? TCP_TIMEOUT_INIT : TCP_RTO_MIN;
> +       unsigned int linear_backoff_thresh, start_ts;
>
>         if (!inet_csk(sk)->icsk_retransmits)
>                 return false;
> @@ -172,7 +172,7 @@ static bool retransmits_timed_out(struct sock *sk,
>                         timeout = ((2 << linear_backoff_thresh) - 1) * rto_base +
>                                 (boundary - linear_backoff_thresh) * TCP_RTO_MAX;
>         }
> -       return (tcp_time_stamp - start_ts) >= timeout;
> +       return (tcp_time_stamp(tcp_sk(sk)) - start_ts) >= jiffies_to_msecs(timeout);
>  }
>
>  /* A write timeout has occurred. Process the after effects. */
> @@ -341,7 +341,7 @@ static void tcp_probe_timer(struct sock *sk)
>         if (!start_ts)
>                 tcp_send_head(sk)->skb_mstamp = tp->tcp_mstamp;
>         else if (icsk->icsk_user_timeout &&
> -                (s32)(tcp_time_stamp - start_ts) > icsk->icsk_user_timeout)
> +                (s32)(tcp_time_stamp(tp) - start_ts) > icsk->icsk_user_timeout)
>                 goto abort;
>
>         max_probes = sock_net(sk)->ipv4.sysctl_tcp_retries2;
> @@ -561,7 +561,7 @@ void tcp_write_timer_handler(struct sock *sk)
>                 goto out;
>         }
>
> -       skb_mstamp_get(&tcp_sk(sk)->tcp_mstamp);
> +       tcp_mstamp_refresh(tcp_sk(sk));
>         event = icsk->icsk_pending;
>
>         switch (event) {
> diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
> index 5abc3692b9011b140816dc4ce6223e79e5defddb..971823359f5b98da46c39b86c9ddcefd14df8559 100644
> --- a/net/ipv6/syncookies.c
> +++ b/net/ipv6/syncookies.c
> @@ -211,7 +211,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
>         ireq->wscale_ok         = tcp_opt.wscale_ok;
>         ireq->tstamp_ok         = tcp_opt.saw_tstamp;
>         req->ts_recent          = tcp_opt.saw_tstamp ? tcp_opt.rcv_tsval : 0;
> -       treq->snt_synack.v64    = 0;
> +       treq->snt_synack        = 0;
>         treq->rcv_isn = ntohl(th->seq) - 1;
>         treq->snt_isn = cookie;
>         treq->ts_off = 0;
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 4f4310a36a0481e2bd068e39285011ff28377ea5..233edfabe1dbceaeb6cdd42a2bb379072aeee361 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -949,7 +949,7 @@ static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb)
>
>         tcp_v6_send_ack(sk, skb, tcptw->tw_snd_nxt, tcptw->tw_rcv_nxt,
>                         tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale,
> -                       tcp_time_stamp + tcptw->tw_ts_offset,
> +                       tcp_time_stamp_raw() + tcptw->tw_ts_offset,
>                         tcptw->tw_ts_recent, tw->tw_bound_dev_if, tcp_twsk_md5_key(tcptw),
>                         tw->tw_tclass, cpu_to_be32(tw->tw_flowlabel));
>
> @@ -971,7 +971,7 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
>                         tcp_rsk(req)->snt_isn + 1 : tcp_sk(sk)->snd_nxt,
>                         tcp_rsk(req)->rcv_nxt,
>                         req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
> -                       tcp_time_stamp + tcp_rsk(req)->ts_off,
> +                       tcp_time_stamp_raw() + tcp_rsk(req)->ts_off,
>                         req->ts_recent, sk->sk_bound_dev_if,
>                         tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->daddr),
>                         0, 0);
> diff --git a/net/netfilter/nf_synproxy_core.c b/net/netfilter/nf_synproxy_core.c
> index a504e87c6ddff1b1266a901549256f29dc1973d1..49bd8bb16b1817e9e06ee64c889e78a923bfd375 100644
> --- a/net/netfilter/nf_synproxy_core.c
> +++ b/net/netfilter/nf_synproxy_core.c
> @@ -152,7 +152,7 @@ void synproxy_init_timestamp_cookie(const struct xt_synproxy_info *info,
>                                     struct synproxy_options *opts)
>  {
>         opts->tsecr = opts->tsval;
> -       opts->tsval = tcp_time_stamp & ~0x3f;
> +       opts->tsval = tcp_time_stamp_raw() & ~0x3f;
>
>         if (opts->options & XT_SYNPROXY_OPT_WSCALE) {
>                 opts->tsval |= opts->wscale;
> --
> 2.13.0.303.g4ebf302169-goog
>
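
The quoted diff above replaces struct skb_mstamp pointers with plain
u64 microsecond values. A minimal user-space sketch of the two resulting
comparisons follows. The names seq_after(), stamp_us_delta() and
rack_sent_after() are stand-ins chosen for illustration; clamping a
negative delta to zero is an assumption about tcp_stamp_us_delta(),
not something shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe 32-bit sequence comparison, same idea as the kernel's after(). */
static bool seq_after(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq2 - seq1) < 0;
}

/* Stand-in for tcp_stamp_us_delta(): t1 - t0 in usec.
 * Assumption: a negative difference is clamped to 0.
 */
static uint32_t stamp_us_delta(uint64_t t1, uint64_t t0)
{
	int64_t delta = (int64_t)(t1 - t0);

	return delta > 0 ? (uint32_t)delta : 0;
}

/* Mirrors the new tcp_rack_sent_after(): order by send time first,
 * and fall back to the sequence number when the timestamps are equal.
 */
static bool rack_sent_after(uint64_t t1, uint64_t t2,
			    uint32_t seq1, uint32_t seq2)
{
	return t1 > t2 || (t1 == t2 && seq_after(seq1, seq2));
}

/* Small usage example; values are made up for illustration. */
void rack_order_examples(void)
{
	assert(rack_sent_after(200, 100, 1, 2));   /* later timestamp wins */
	assert(rack_sent_after(100, 100, 5, 4));   /* tie broken by seq */
	assert(stamp_us_delta(1500, 1000) == 500); /* 500 usec elapsed */
}
```

With 64-bit microsecond stamps a plain ">" is enough for the time
comparison, while the 32-bit sequence numbers still need the wrap-safe
helper.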


* Re: [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock
  2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
                   ` (14 preceding siblings ...)
  2017-05-16 21:00 ` [PATCH net-next 15/15] tcp: switch TCP TS option (RFC 7323) to 1ms clock Eric Dumazet
@ 2017-05-17 20:06 ` David Miller
  15 siblings, 0 replies; 36+ messages in thread
From: David Miller @ 2017-05-17 20:06 UTC (permalink / raw)
  To: edumazet; +Cc: ncardwell, ycheng, soheil, weiwan, netdev, eric.dumazet

From: Eric Dumazet <edumazet@google.com>
Date: Tue, 16 May 2017 13:59:59 -0700

> TCP Timestamps option is defined in RFC 7323
> 
> Traditionally on linux, it has been tied to the internal
> 'jiffy' variable, because it had been a cheap and good enough
> generator.
> 
> Unfortunately some distros use HZ=250 or even HZ=100 leading
> to not very useful TCP timestamps.
> 
> For TCP flows in the DC, Google has used usec resolution for more
> than two years with great success [1].
> RCVBUF autotuning is more precise.
> 
> This series converts tp->tcp_mstamp to a plain u64 value storing
> a 1 usec TCP clock.
> 
> This choice will allow us to upstream the 1 usec TS option as
> discussed in IETF 97.
> 
> Kathleen Nichols [2] and others advocate for 1ms TS clocks for
> network analysis. (1ms being the lowest value supported by RFC 7323.)
> 
> [1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
> [2] http://netseminar.stanford.edu/seminars/02_02_17.pdf

Series applied, thanks Eric.
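
The granularity argument in the quoted cover letter can be sketched in
user space as below. This is an illustration only: the helper names
and the TS_HZ constant are assumptions, and the division simply models
"one tick every 1/HZ seconds" for HZ values that divide 1,000,000.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL
#define TS_HZ        1000ULL	/* assumed 1 ms TS clock */

/* TSval as it would look if taken from a jiffies counter at rate hz. */
static uint32_t tsval_from_jiffies(uint64_t now_us, unsigned int hz)
{
	assert(hz > 0 && USEC_PER_SEC % hz == 0);
	return (uint32_t)(now_us / (USEC_PER_SEC / hz));
}

/* TSval derived from a 1 usec clock, scaled down to 1 ms units. */
static uint32_t tsval_from_usec_clock(uint64_t now_us)
{
	return (uint32_t)(now_us / (USEC_PER_SEC / TS_HZ));
}
```

For two packets sent 3 ms apart, at 12000 usec and 15000 usec, the
jiffies-based value is identical for both with HZ=100 (10 ms ticks) and
with HZ=250 (4 ms ticks), while the 1 ms clock yields values that differ
by 3.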


* Re: [PATCH net-next 15/15] tcp: switch TCP TS option (RFC 7323) to 1ms clock
  2017-05-16 21:00 ` [PATCH net-next 15/15] tcp: switch TCP TS option (RFC 7323) to 1ms clock Eric Dumazet
  2017-05-17 13:51   ` Soheil Hassas Yeganeh
@ 2017-05-18 12:33   ` Eric Dumazet
  2017-05-18 16:15     ` [PATCH net-next] tcp: fix tcp_rearm_rto() Eric Dumazet
  1 sibling, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2017-05-18 12:33 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Neal Cardwell, Yuchung Cheng,
	Soheil Hassas Yeganeh, Wei Wang, netdev

On Tue, 2017-05-16 at 14:00 -0700, Eric Dumazet wrote:
> TCP Timestamps option is defined in RFC 7323
> 
> Traditionally on linux, it has been tied to the internal
> 'jiffies' variable, because it had been a cheap and good enough
> generator.

...

> @@ -3003,7 +3004,7 @@ void tcp_rearm_rto(struct sock *sk)
>  			struct sk_buff *skb = tcp_write_queue_head(sk);
>  			const u32 rto_time_stamp =
>  				tcp_skb_timestamp(skb) + rto;
> -			s32 delta = (s32)(rto_time_stamp - tcp_time_stamp);
> +			s32 delta = (s32)(rto_time_stamp - tcp_jiffies32);
>  			/* delta may not be positive if the socket is locked
>  			 * when the retrans timer fires and is rescheduled.
>  			 */

RTO is broken, I will send a fix after tests.


* [PATCH net-next] tcp: fix tcp_rearm_rto()
  2017-05-18 12:33   ` Eric Dumazet
@ 2017-05-18 16:15     ` Eric Dumazet
  2017-05-18 16:55       ` Soheil Hassas Yeganeh
  2017-05-18 17:20       ` David Miller
  0 siblings, 2 replies; 36+ messages in thread
From: Eric Dumazet @ 2017-05-18 16:15 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller
  Cc: Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh, Wei Wang, netdev

From: Eric Dumazet <edumazet@google.com>

skbs in the (re)transmit queue no longer carry a copy of jiffies
taken at transmit time: skb->skb_mstamp is now in usec units,
with no correlation to tcp_jiffies32.

We have to convert rto from jiffies to usec, compute the time difference
in usec, then convert the delta back to jiffies (HZ units).

Fixes: 9a568de4818d ("tcp: switch TCP TS option (RFC 7323) to 1ms clock")
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_input.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9a5a9e8eda899666501cca06b37948ab64ae79b2..6db6b47e2bbc09aae2627a109e5a1ee9a3f4fe4e 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3002,14 +3002,14 @@ void tcp_rearm_rto(struct sock *sk)
 		if (icsk->icsk_pending == ICSK_TIME_REO_TIMEOUT ||
 		    icsk->icsk_pending == ICSK_TIME_LOSS_PROBE) {
 			struct sk_buff *skb = tcp_write_queue_head(sk);
-			const u32 rto_time_stamp =
-				tcp_skb_timestamp(skb) + rto;
-			s32 delta = (s32)(rto_time_stamp - tcp_jiffies32);
-			/* delta may not be positive if the socket is locked
+			u64 rto_time_stamp = skb->skb_mstamp +
+					     jiffies_to_usecs(rto);
+			s64 delta_us = rto_time_stamp - tp->tcp_mstamp;
+			/* delta_us may not be positive if the socket is locked
 			 * when the retrans timer fires and is rescheduled.
 			 */
-			if (delta > 0)
-				rto = delta;
+			if (delta_us > 0)
+				rto = usecs_to_jiffies(delta_us);
 		}
 		inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, rto,
 					  TCP_RTO_MAX);

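The unit handling in the fix above can be sketched in user space as
follows. This is not the kernel code: the *_sketch helpers take HZ as
a parameter and are hypothetical stand-ins for jiffies_to_usecs() and
usecs_to_jiffies(), and rounding up in the usec-to-jiffies direction is
an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/* Stand-in for jiffies_to_usecs(), with HZ passed explicitly. */
static uint64_t jiffies_to_usecs_sketch(uint32_t j, unsigned int hz)
{
	assert(hz > 0 && USEC_PER_SEC % hz == 0);
	return (uint64_t)j * (USEC_PER_SEC / hz);
}

/* Stand-in for usecs_to_jiffies(); assumption: rounds up to a full tick. */
static uint32_t usecs_to_jiffies_sketch(uint64_t us, unsigned int hz)
{
	uint64_t tick_us;

	assert(hz > 0 && USEC_PER_SEC % hz == 0);
	tick_us = USEC_PER_SEC / hz;
	return (uint32_t)((us + tick_us - 1) / tick_us);
}

/* Remaining RTO in jiffies, given the skb send time and "now" in usec.
 * Both timestamps are on the same usec clock, so the subtraction is
 * meaningful; only the final result goes back to jiffies.
 */
static uint32_t remaining_rto_jiffies(uint64_t skb_mstamp_us, uint64_t now_us,
				      uint32_t rto_jiffies, unsigned int hz)
{
	uint64_t deadline_us = skb_mstamp_us +
			       jiffies_to_usecs_sketch(rto_jiffies, hz);
	int64_t delta_us = (int64_t)(deadline_us - now_us);

	/* delta_us may not be positive; keep the full rto in that case. */
	if (delta_us > 0)
		return usecs_to_jiffies_sketch((uint64_t)delta_us, hz);
	return rto_jiffies;
}
```

With HZ=250, rto = 50 jiffies (200 ms), a send time of 1000000 usec and
a current time of 1150000 usec, 50000 usec remain, which this sketch
rounds up to 13 jiffies. Subtracting a jiffies value from the usec
deadline, as the earlier code effectively did, mixes two unrelated
clocks.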

* Re: [PATCH net-next] tcp: fix tcp_rearm_rto()
  2017-05-18 16:15     ` [PATCH net-next] tcp: fix tcp_rearm_rto() Eric Dumazet
@ 2017-05-18 16:55       ` Soheil Hassas Yeganeh
  2017-05-18 17:20       ` David Miller
  1 sibling, 0 replies; 36+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-05-18 16:55 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, David S . Miller, Neal Cardwell, Yuchung Cheng,
	Wei Wang, netdev

On Thu, May 18, 2017 at 12:15 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> skbs in (re)transmit queue no longer have a copy of jiffies
> at the time of the transmit : skb->skb_mstamp is now in usec unit,
> with no correlation to tcp_jiffies32.
>
> We have to convert rto from jiffies to usec, compute a time difference
> in usec, then convert the delta to HZ units.
>
> Fixes: 9a568de4818d ("tcp: switch TCP TS option (RFC 7323) to 1ms clock")
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

Thank you for the quick fix, Eric!


* Re: [PATCH net-next] tcp: fix tcp_rearm_rto()
  2017-05-18 16:15     ` [PATCH net-next] tcp: fix tcp_rearm_rto() Eric Dumazet
  2017-05-18 16:55       ` Soheil Hassas Yeganeh
@ 2017-05-18 17:20       ` David Miller
  1 sibling, 0 replies; 36+ messages in thread
From: David Miller @ 2017-05-18 17:20 UTC (permalink / raw)
  To: eric.dumazet; +Cc: edumazet, ncardwell, ycheng, soheil, weiwan, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 18 May 2017 09:15:58 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> skbs in (re)transmit queue no longer have a copy of jiffies
> at the time of the transmit : skb->skb_mstamp is now in usec unit,
> with no correlation to tcp_jiffies32.
> 
> We have to convert rto from jiffies to usec, compute a time difference
> in usec, then convert the delta to HZ units.
> 
> Fixes: 9a568de4818d ("tcp: switch TCP TS option (RFC 7323) to 1ms clock")
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.


end of thread, other threads:[~2017-05-18 17:21 UTC | newest]

Thread overview: 36+ messages
2017-05-16 20:59 [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock Eric Dumazet
2017-05-16 21:00 ` [PATCH net-next 01/15] tcp: use tp->tcp_mstamp in output path Eric Dumazet
2017-05-17 13:42   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 02/15] tcp: introduce tcp_jiffies32 Eric Dumazet
2017-05-17 13:43   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 03/15] dccp: do not use tcp_time_stamp Eric Dumazet
2017-05-17 13:43   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 04/15] tcp: use tcp_jiffies32 to feed tp->lsndtime Eric Dumazet
2017-05-17 13:43   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 05/15] tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stamp Eric Dumazet
2017-05-17 13:45   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 06/15] tcp_bbr: use tcp_jiffies32 instead of tcp_time_stamp Eric Dumazet
2017-05-17 13:45   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 07/15] tcp: bic,cubic: " Eric Dumazet
2017-05-17 13:46   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 08/15] tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtime Eric Dumazet
2017-05-17 13:46   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 09/15] tcp: use tcp_jiffies32 to feed probe_timestamp Eric Dumazet
2017-05-17 13:46   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 10/15] tcp: uses jiffies_32 to feed tp->chrono_start Eric Dumazet
2017-05-17 13:46   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 11/15] tcp: use tcp_jiffies32 in __tcp_oow_rate_limited() Eric Dumazet
2017-05-17 13:47   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 12/15] tcp_westwood: use tcp_jiffies32 instead of tcp_time_stamp Eric Dumazet
2017-05-17 13:47   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 13/15] tcp_lp: cache tcp_time_stamp Eric Dumazet
2017-05-17 13:47   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 14/15] tcp: replace misc tcp_time_stamp to tcp_jiffies32 Eric Dumazet
2017-05-17 13:47   ` Soheil Hassas Yeganeh
2017-05-16 21:00 ` [PATCH net-next 15/15] tcp: switch TCP TS option (RFC 7323) to 1ms clock Eric Dumazet
2017-05-17 13:51   ` Soheil Hassas Yeganeh
2017-05-18 12:33   ` Eric Dumazet
2017-05-18 16:15     ` [PATCH net-next] tcp: fix tcp_rearm_rto() Eric Dumazet
2017-05-18 16:55       ` Soheil Hassas Yeganeh
2017-05-18 17:20       ` David Miller
2017-05-17 20:06 ` [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock David Miller
