* [PATCH net-next 0/2] tcp: deduplicate TCP_SKB_CB(skb)->when
@ 2014-09-05 22:33 Eric Dumazet
  2014-09-05 22:33 ` [PATCH net-next 1/2] tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Eric Dumazet @ 2014-09-05 22:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Yuchung Cheng, Neal Cardwell, Eric Dumazet

TCP_SKB_CB(skb)->when has different meanings in the output and input paths.

In the output path, it contains a timestamp.
In the input path, it contains an ISN chosen by tcp_timewait_state_process().

Its usage in the output path is obsolete after usec timestamping.
Let's simplify and clean this up.

Eric Dumazet (2):
  tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn
  tcp: remove TCP_SKB_CB(skb)->when

 include/net/tcp.h        |  8 +++++++-
 net/ipv4/tcp_input.c     |  5 +++--
 net/ipv4/tcp_ipv4.c      |  7 ++++---
 net/ipv4/tcp_minisocks.c |  2 +-
 net/ipv4/tcp_output.c    | 34 +++++++++++++---------------------
 net/ipv4/tcp_timer.c     |  7 +++----
 net/ipv6/tcp_ipv6.c      |  4 ++--
 7 files changed, 33 insertions(+), 34 deletions(-)

-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH net-next 1/2] tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn
  2014-09-05 22:33 [PATCH net-next 0/2] tcp: deduplicate TCP_SKB_CB(skb)->when Eric Dumazet
@ 2014-09-05 22:33 ` Eric Dumazet
  2014-09-05 22:50   ` Yuchung Cheng
  2014-09-05 22:33 ` [PATCH net-next 2/2] tcp: remove TCP_SKB_CB(skb)->when Eric Dumazet
  2014-09-06  0:49 ` [PATCH net-next 0/2] tcp: deduplicate TCP_SKB_CB(skb)->when David Miller
  2 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2014-09-05 22:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Yuchung Cheng, Neal Cardwell, Eric Dumazet

TCP_SKB_CB(skb)->when has different meanings in the output and input paths.

In the output path, it contains a timestamp.
In the input path, it contains an ISN chosen by tcp_timewait_state_process().

Let's add a different name to ease code comprehension.

Note that the 'when' field will disappear in the following patch,
as skb_mstamp already contains the timestamp; the anonymous
union will promptly disappear as well.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/tcp.h        | 7 ++++++-
 net/ipv4/tcp_input.c     | 2 +-
 net/ipv4/tcp_ipv4.c      | 2 +-
 net/ipv4/tcp_minisocks.c | 2 +-
 net/ipv6/tcp_ipv6.c      | 4 ++--
 5 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 590e01a476ac..0cd7d2c65dc0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -698,7 +698,12 @@ struct tcp_skb_cb {
 	} header;	/* For incoming frames		*/
 	__u32		seq;		/* Starting sequence number	*/
 	__u32		end_seq;	/* SEQ + FIN + SYN + datalen	*/
-	__u32		when;		/* used to compute rtt's	*/
+	union {
+		/* used in output path */
+		__u32		when;	/* used to compute rtt's	*/
+		/* used in input path */
+		__u32		tcp_tw_isn; /* isn chosen by tcp_timewait_state_process() */
+	};
 	__u8		tcp_flags;	/* TCP header flags. (tcp[13])	*/
 
 	__u8		sacked;		/* State flags for SACK/FACK.	*/
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index aba4926ca095..9c8b9f1dcf69 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5906,7 +5906,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	struct request_sock *req;
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct dst_entry *dst = NULL;
-	__u32 isn = TCP_SKB_CB(skb)->when;
+	__u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn;
 	bool want_cookie = false, fastopen;
 	struct flowi fl;
 	struct tcp_fastopen_cookie foc = { .len = -1 };
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 487e2a41667f..02e6cd29ebf1 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1627,7 +1627,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
 				    skb->len - th->doff * 4);
 	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
-	TCP_SKB_CB(skb)->when	 = 0;
+	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
 	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
 	TCP_SKB_CB(skb)->sacked	 = 0;
 
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 1649988bd1b6..a058f411d3a6 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -232,7 +232,7 @@ kill:
 		u32 isn = tcptw->tw_snd_nxt + 65535 + 2;
 		if (isn == 0)
 			isn++;
-		TCP_SKB_CB(skb)->when = isn;
+		TCP_SKB_CB(skb)->tcp_tw_isn = isn;
 		return TCP_TW_SYN;
 	}
 
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 29964c3d363c..5b3c70ff7a72 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -738,7 +738,7 @@ static void tcp_v6_init_req(struct request_sock *req, struct sock *sk,
 	    ipv6_addr_type(&ireq->ir_v6_rmt_addr) & IPV6_ADDR_LINKLOCAL)
 		ireq->ir_iif = inet6_iif(skb);
 
-	if (!TCP_SKB_CB(skb)->when &&
+	if (!TCP_SKB_CB(skb)->tcp_tw_isn &&
 	    (ipv6_opt_accepted(sk, skb) || np->rxopt.bits.rxinfo ||
 	     np->rxopt.bits.rxoinfo || np->rxopt.bits.rxhlim ||
 	     np->rxopt.bits.rxohlim || np->repflow)) {
@@ -1412,7 +1412,7 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
 				    skb->len - th->doff*4);
 	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
-	TCP_SKB_CB(skb)->when = 0;
+	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
 	TCP_SKB_CB(skb)->ip_dsfield = ipv6_get_dsfield(hdr);
 	TCP_SKB_CB(skb)->sacked = 0;
 
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related	[flat|nested] 6+ messages in thread
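
The renaming in patch 1/2 can be sketched in userspace C as follows. This is a minimal illustration, not kernel code: `struct demo_cb` and `demo_pick_tw_isn()` are hypothetical names, and only the union members and the ISN arithmetic are taken from the hunks above. It shows that the two names alias one 32-bit slot, and that 0 stays reserved to mean "no ISN chosen".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct tcp_skb_cb. */
struct demo_cb {
	uint32_t seq;
	uint32_t end_seq;
	union {
		uint32_t when;		/* output path: timestamp */
		uint32_t tcp_tw_isn;	/* input path: ISN from time-wait */
	};
	uint8_t tcp_flags;
};

/* Both names must share storage, so the rename costs no extra bytes. */
_Static_assert(offsetof(struct demo_cb, when) ==
	       offsetof(struct demo_cb, tcp_tw_isn),
	       "when and tcp_tw_isn must alias the same slot");

/* Same arithmetic as the tcp_minisocks.c hunk: the result is bumped to 1
 * if it wraps to 0, because the input path treats 0 as "no ISN chosen".
 */
static uint32_t demo_pick_tw_isn(uint32_t tw_snd_nxt)
{
	uint32_t isn = tw_snd_nxt + 65535 + 2;

	if (isn == 0)
		isn++;
	return isn;
}
```

The zero sentinel is why tcp_v4_rcv() and tcp_v6_rcv() clear the field on every incoming segment, and why the IPv6 hunk can test `!TCP_SKB_CB(skb)->tcp_tw_isn`.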

* [PATCH net-next 2/2] tcp: remove TCP_SKB_CB(skb)->when
  2014-09-05 22:33 [PATCH net-next 0/2] tcp: deduplicate TCP_SKB_CB(skb)->when Eric Dumazet
  2014-09-05 22:33 ` [PATCH net-next 1/2] tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn Eric Dumazet
@ 2014-09-05 22:33 ` Eric Dumazet
  2014-09-05 22:49   ` Yuchung Cheng
  2014-09-06  0:49 ` [PATCH net-next 0/2] tcp: deduplicate TCP_SKB_CB(skb)->when David Miller
  2 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2014-09-05 22:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Yuchung Cheng, Neal Cardwell, Eric Dumazet

After commit 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution"),
we no longer need to maintain timestamps in two different fields.

TCP_SKB_CB(skb)->when can be removed, as the same information sits in
skb_mstamp.stamp_jiffies.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/tcp.h     | 13 +++++++------
 net/ipv4/tcp_input.c  |  3 ++-
 net/ipv4/tcp_ipv4.c   |  5 +++--
 net/ipv4/tcp_output.c | 39 ++++++++++++++++-----------------------
 net/ipv4/tcp_timer.c  |  7 +++----
 5 files changed, 31 insertions(+), 36 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 0cd7d2c65dc0..a4201ef216e8 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -672,6 +672,12 @@ void tcp_send_window_probe(struct sock *sk);
  */
 #define tcp_time_stamp		((__u32)(jiffies))
 
+static inline u32 tcp_skb_timestamp(const struct sk_buff *skb)
+{
+	return skb->skb_mstamp.stamp_jiffies;
+}
+
+
 #define tcp_flag_byte(th) (((u_int8_t *)th)[13])
 
 #define TCPHDR_FIN 0x01
@@ -698,12 +704,7 @@ struct tcp_skb_cb {
 	} header;	/* For incoming frames		*/
 	__u32		seq;		/* Starting sequence number	*/
 	__u32		end_seq;	/* SEQ + FIN + SYN + datalen	*/
-	union {
-		/* used in output path */
-		__u32		when;	/* used to compute rtt's	*/
-		/* used in input path */
-		__u32		tcp_tw_isn; /* isn chosen by tcp_timewait_state_process() */
-	};
+	__u32		tcp_tw_isn;	/* isn chosen by tcp_timewait_state_process() */
 	__u8		tcp_flags;	/* TCP header flags. (tcp[13])	*/
 
 	__u8		sacked;		/* State flags for SACK/FACK.	*/
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9c8b9f1dcf69..f97003ad0af5 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2967,7 +2967,8 @@ void tcp_rearm_rto(struct sock *sk)
 		if (icsk->icsk_pending == ICSK_TIME_EARLY_RETRANS ||
 		    icsk->icsk_pending == ICSK_TIME_LOSS_PROBE) {
 			struct sk_buff *skb = tcp_write_queue_head(sk);
-			const u32 rto_time_stamp = TCP_SKB_CB(skb)->when + rto;
+			const u32 rto_time_stamp =
+				tcp_skb_timestamp(skb) + rto;
 			s32 delta = (s32)(rto_time_stamp - tcp_time_stamp);
 			/* delta may not be positive if the socket is locked
 			 * when the retrans timer fires and is rescheduled.
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 02e6cd29ebf1..3f9bc3f0bba0 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -437,8 +437,9 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
 		skb = tcp_write_queue_head(sk);
 		BUG_ON(!skb);
 
-		remaining = icsk->icsk_rto - min(icsk->icsk_rto,
-				tcp_time_stamp - TCP_SKB_CB(skb)->when);
+		remaining = icsk->icsk_rto -
+			    min(icsk->icsk_rto,
+				tcp_time_stamp - tcp_skb_timestamp(skb));
 
 		if (remaining) {
 			inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 5a7c41fbc6d3..3b22dcb7bb5c 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -550,7 +550,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
 
 	if (likely(sysctl_tcp_timestamps && *md5 == NULL)) {
 		opts->options |= OPTION_TS;
-		opts->tsval = TCP_SKB_CB(skb)->when + tp->tsoffset;
+		opts->tsval = tcp_skb_timestamp(skb) + tp->tsoffset;
 		opts->tsecr = tp->rx_opt.ts_recent;
 		remaining -= TCPOLEN_TSTAMP_ALIGNED;
 	}
@@ -618,7 +618,7 @@ static unsigned int tcp_synack_options(struct sock *sk,
 	}
 	if (likely(ireq->tstamp_ok)) {
 		opts->options |= OPTION_TS;
-		opts->tsval = TCP_SKB_CB(skb)->when;
+		opts->tsval = tcp_skb_timestamp(skb);
 		opts->tsecr = req->ts_recent;
 		remaining -= TCPOLEN_TSTAMP_ALIGNED;
 	}
@@ -647,7 +647,6 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 					struct tcp_out_options *opts,
 					struct tcp_md5sig_key **md5)
 {
-	struct tcp_skb_cb *tcb = skb ? TCP_SKB_CB(skb) : NULL;
 	struct tcp_sock *tp = tcp_sk(sk);
 	unsigned int size = 0;
 	unsigned int eff_sacks;
@@ -666,7 +665,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 
 	if (likely(tp->rx_opt.tstamp_ok)) {
 		opts->options |= OPTION_TS;
-		opts->tsval = tcb ? tcb->when + tp->tsoffset : 0;
+		opts->tsval = skb ? tcp_skb_timestamp(skb) + tp->tsoffset : 0;
 		opts->tsecr = tp->rx_opt.ts_recent;
 		size += TCPOLEN_TSTAMP_ALIGNED;
 	}
@@ -886,8 +885,6 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 			skb = skb_clone(skb, gfp_mask);
 		if (unlikely(!skb))
 			return -ENOBUFS;
-		/* Our usage of tstamp should remain private */
-		skb->tstamp.tv64 = 0;
 	}
 
 	inet = inet_sk(sk);
@@ -975,7 +972,10 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		TCP_ADD_STATS(sock_net(sk), TCP_MIB_OUTSEGS,
 			      tcp_skb_pcount(skb));
 
+	/* Our usage of tstamp should remain private */
+	skb->tstamp.tv64 = 0;
 	err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
+
 	if (likely(err <= 0))
 		return err;
 
@@ -1149,7 +1149,6 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
 	/* Looks stupid, but our code really uses when of
 	 * skbs, which it never sent before. --ANK
 	 */
-	TCP_SKB_CB(buff)->when = TCP_SKB_CB(skb)->when;
 	buff->tstamp = skb->tstamp;
 	tcp_fragment_tstamp(skb, buff);
 
@@ -1874,8 +1873,8 @@ static int tcp_mtu_probe(struct sock *sk)
 	tcp_init_tso_segs(sk, nskb, nskb->len);
 
 	/* We're ready to send.  If this fails, the probe will
-	 * be resegmented into mss-sized pieces by tcp_write_xmit(). */
-	TCP_SKB_CB(nskb)->when = tcp_time_stamp;
+	 * be resegmented into mss-sized pieces by tcp_write_xmit().
+	 */
 	if (!tcp_transmit_skb(sk, nskb, 1, GFP_ATOMIC)) {
 		/* Decrement cwnd here because we are sending
 		 * effectively two packets. */
@@ -1935,8 +1934,8 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 		BUG_ON(!tso_segs);
 
 		if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE) {
-			/* "when" is used as a start point for the retransmit timer */
-			TCP_SKB_CB(skb)->when = tcp_time_stamp;
+			/* "skb_mstamp" is used as a start point for the retransmit timer */
+			skb_mstamp_get(&skb->skb_mstamp);
 			goto repair; /* Skip network transmission */
 		}
 
@@ -2000,8 +1999,6 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 		    unlikely(tso_fragment(sk, skb, limit, mss_now, gfp)))
 			break;
 
-		TCP_SKB_CB(skb)->when = tcp_time_stamp;
-
 		if (unlikely(tcp_transmit_skb(sk, skb, 1, gfp)))
 			break;
 
@@ -2499,7 +2496,6 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 	/* Make a copy, if the first transmission SKB clone we made
 	 * is still in somebody's hands, else make a clone.
 	 */
-	TCP_SKB_CB(skb)->when = tcp_time_stamp;
 
 	/* make sure skb->data is aligned on arches that require it
 	 * and check if ack-trimming & collapsing extended the headroom
@@ -2544,7 +2540,7 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 
 		/* Save stamp of the first retransmit. */
 		if (!tp->retrans_stamp)
-			tp->retrans_stamp = TCP_SKB_CB(skb)->when;
+			tp->retrans_stamp = tcp_skb_timestamp(skb);
 
 		/* snd_nxt is stored to detect loss of retransmitted segment,
 		 * see tcp_input.c tcp_sacktag_write_queue().
@@ -2752,7 +2748,6 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority)
 	tcp_init_nondata_skb(skb, tcp_acceptable_seq(sk),
 			     TCPHDR_ACK | TCPHDR_RST);
 	/* Send it off. */
-	TCP_SKB_CB(skb)->when = tcp_time_stamp;
 	if (tcp_transmit_skb(sk, skb, 0, priority))
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTFAILED);
 
@@ -2791,7 +2786,6 @@ int tcp_send_synack(struct sock *sk)
 		TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_ACK;
 		TCP_ECN_send_synack(tcp_sk(sk), skb);
 	}
-	TCP_SKB_CB(skb)->when = tcp_time_stamp;
 	return tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC);
 }
 
@@ -2835,10 +2829,10 @@ struct sk_buff *tcp_make_synack(struct sock *sk, struct dst_entry *dst,
 	memset(&opts, 0, sizeof(opts));
 #ifdef CONFIG_SYN_COOKIES
 	if (unlikely(req->cookie_ts))
-		TCP_SKB_CB(skb)->when = cookie_init_timestamp(req);
+		skb->skb_mstamp.stamp_jiffies = cookie_init_timestamp(req);
 	else
 #endif
-	TCP_SKB_CB(skb)->when = tcp_time_stamp;
+	skb_mstamp_get(&skb->skb_mstamp);
 	tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, &md5,
 					     foc) + sizeof(*th);
 
@@ -3086,7 +3080,7 @@ int tcp_connect(struct sock *sk)
 	skb_reserve(buff, MAX_TCP_HEADER);
 
 	tcp_init_nondata_skb(buff, tp->write_seq++, TCPHDR_SYN);
-	tp->retrans_stamp = TCP_SKB_CB(buff)->when = tcp_time_stamp;
+	tp->retrans_stamp = tcp_time_stamp;
 	tcp_connect_queue_skb(sk, buff);
 	TCP_ECN_send_syn(sk, buff);
 
@@ -3194,7 +3188,7 @@ void tcp_send_ack(struct sock *sk)
 	tcp_init_nondata_skb(buff, tcp_acceptable_seq(sk), TCPHDR_ACK);
 
 	/* Send it off, this clears delayed acks for us. */
-	TCP_SKB_CB(buff)->when = tcp_time_stamp;
+	skb_mstamp_get(&buff->skb_mstamp);
 	tcp_transmit_skb(sk, buff, 0, sk_gfp_atomic(sk, GFP_ATOMIC));
 }
 
@@ -3226,7 +3220,7 @@ static int tcp_xmit_probe_skb(struct sock *sk, int urgent)
 	 * send it.
 	 */
 	tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TCPHDR_ACK);
-	TCP_SKB_CB(skb)->when = tcp_time_stamp;
+	skb_mstamp_get(&skb->skb_mstamp);
 	return tcp_transmit_skb(sk, skb, 0, GFP_ATOMIC);
 }
 
@@ -3270,7 +3264,6 @@ int tcp_write_wakeup(struct sock *sk)
 			tcp_set_skb_tso_segs(sk, skb, mss);
 
 		TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_PSH;
-		TCP_SKB_CB(skb)->when = tcp_time_stamp;
 		err = tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC);
 		if (!err)
 			tcp_event_new_data_sent(sk, skb);
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index df90cd1ce37f..a339e7ba05a4 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -135,10 +135,9 @@ static bool retransmits_timed_out(struct sock *sk,
 	if (!inet_csk(sk)->icsk_retransmits)
 		return false;
 
-	if (unlikely(!tcp_sk(sk)->retrans_stamp))
-		start_ts = TCP_SKB_CB(tcp_write_queue_head(sk))->when;
-	else
-		start_ts = tcp_sk(sk)->retrans_stamp;
+	start_ts = tcp_sk(sk)->retrans_stamp;
+	if (unlikely(!start_ts))
+		start_ts = tcp_skb_timestamp(tcp_write_queue_head(sk));
 
 	if (likely(timeout == 0)) {
 		linear_backoff_thresh = ilog2(TCP_RTO_MAX/rto_base);
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related	[flat|nested] 6+ messages in thread
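
The idea behind patch 2/2 can be modeled in userspace C roughly as below. This is a sketch under stated assumptions: `struct demo_mstamp`, `struct demo_skb`, the two global clock variables, and the `demo_*` functions are hypothetical stand-ins, and the field layout is simplified. Only the field name `stamp_jiffies` and the shape of the accessor come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the per-skb stamp: a microsecond value and a
 * jiffies value captured together. The real layout is not reproduced.
 */
struct demo_mstamp {
	uint32_t stamp_us;
	uint32_t stamp_jiffies;
};

struct demo_skb {
	struct demo_mstamp skb_mstamp;
};

/* Hypothetical clock sources standing in for the kernel's clocks. */
static uint64_t demo_clock_ns;
static uint32_t demo_jiffies;

/* Plays the role of skb_mstamp_get(): fill both stamps in one call. */
static void demo_mstamp_get(struct demo_mstamp *m)
{
	assert(m);
	m->stamp_us = (uint32_t)(demo_clock_ns / 1000);
	m->stamp_jiffies = demo_jiffies;
}

/* Plays the role of tcp_skb_timestamp(): the jiffies half is what the
 * removed 'when' field used to hold, so no second copy is needed.
 */
static uint32_t demo_skb_timestamp(const struct demo_skb *skb)
{
	assert(skb);
	return skb->skb_mstamp.stamp_jiffies;
}
```

With a single stamp per skb, callers that previously wrote `when = tcp_time_stamp` before transmitting either drop the assignment or, for skbs that are never stamped elsewhere, call the stamping helper directly, as the tcp_send_ack() and tcp_xmit_probe_skb() hunks do.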

* Re: [PATCH net-next 2/2] tcp: remove TCP_SKB_CB(skb)->when
  2014-09-05 22:33 ` [PATCH net-next 2/2] tcp: remove TCP_SKB_CB(skb)->when Eric Dumazet
@ 2014-09-05 22:49   ` Yuchung Cheng
  0 siblings, 0 replies; 6+ messages in thread
From: Yuchung Cheng @ 2014-09-05 22:49 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, netdev, Neal Cardwell

On Fri, Sep 5, 2014 at 3:33 PM, Eric Dumazet <edumazet@google.com> wrote:
> After commit 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution"),
> we no longer need to maintain timestamps in two different fields.
>
> TCP_SKB_CB(skb)->when can be removed, as the same information sits in
> skb_mstamp.stamp_jiffies.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>

> ---
>  include/net/tcp.h     | 13 +++++++------
>  net/ipv4/tcp_input.c  |  3 ++-
>  net/ipv4/tcp_ipv4.c   |  5 +++--
>  net/ipv4/tcp_output.c | 39 ++++++++++++++++-----------------------
>  net/ipv4/tcp_timer.c  |  7 +++----
>  5 files changed, 31 insertions(+), 36 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 0cd7d2c65dc0..a4201ef216e8 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -672,6 +672,12 @@ void tcp_send_window_probe(struct sock *sk);
>   */
>  #define tcp_time_stamp         ((__u32)(jiffies))
>
> +static inline u32 tcp_skb_timestamp(const struct sk_buff *skb)
> +{
> +       return skb->skb_mstamp.stamp_jiffies;
> +}
> +
> +
>  #define tcp_flag_byte(th) (((u_int8_t *)th)[13])
>
>  #define TCPHDR_FIN 0x01
> @@ -698,12 +704,7 @@ struct tcp_skb_cb {
>         } header;       /* For incoming frames          */
>         __u32           seq;            /* Starting sequence number     */
>         __u32           end_seq;        /* SEQ + FIN + SYN + datalen    */
> -       union {
> -               /* used in output path */
> -               __u32           when;   /* used to compute rtt's        */
> -               /* used in input path */
> -               __u32           tcp_tw_isn; /* isn chosen by tcp_timewait_state_process() */
> -       };
> +       __u32           tcp_tw_isn;     /* isn chosen by tcp_timewait_state_process() */
>         __u8            tcp_flags;      /* TCP header flags. (tcp[13])  */
>
>         __u8            sacked;         /* State flags for SACK/FACK.   */
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 9c8b9f1dcf69..f97003ad0af5 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2967,7 +2967,8 @@ void tcp_rearm_rto(struct sock *sk)
>                 if (icsk->icsk_pending == ICSK_TIME_EARLY_RETRANS ||
>                     icsk->icsk_pending == ICSK_TIME_LOSS_PROBE) {
>                         struct sk_buff *skb = tcp_write_queue_head(sk);
> -                       const u32 rto_time_stamp = TCP_SKB_CB(skb)->when + rto;
> +                       const u32 rto_time_stamp =
> +                               tcp_skb_timestamp(skb) + rto;
>                         s32 delta = (s32)(rto_time_stamp - tcp_time_stamp);
>                         /* delta may not be positive if the socket is locked
>                          * when the retrans timer fires and is rescheduled.
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 02e6cd29ebf1..3f9bc3f0bba0 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -437,8 +437,9 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
>                 skb = tcp_write_queue_head(sk);
>                 BUG_ON(!skb);
>
> -               remaining = icsk->icsk_rto - min(icsk->icsk_rto,
> -                               tcp_time_stamp - TCP_SKB_CB(skb)->when);
> +               remaining = icsk->icsk_rto -
> +                           min(icsk->icsk_rto,
> +                               tcp_time_stamp - tcp_skb_timestamp(skb));
>
>                 if (remaining) {
>                         inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 5a7c41fbc6d3..3b22dcb7bb5c 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -550,7 +550,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
>
>         if (likely(sysctl_tcp_timestamps && *md5 == NULL)) {
>                 opts->options |= OPTION_TS;
> -               opts->tsval = TCP_SKB_CB(skb)->when + tp->tsoffset;
> +               opts->tsval = tcp_skb_timestamp(skb) + tp->tsoffset;
>                 opts->tsecr = tp->rx_opt.ts_recent;
>                 remaining -= TCPOLEN_TSTAMP_ALIGNED;
>         }
> @@ -618,7 +618,7 @@ static unsigned int tcp_synack_options(struct sock *sk,
>         }
>         if (likely(ireq->tstamp_ok)) {
>                 opts->options |= OPTION_TS;
> -               opts->tsval = TCP_SKB_CB(skb)->when;
> +               opts->tsval = tcp_skb_timestamp(skb);
>                 opts->tsecr = req->ts_recent;
>                 remaining -= TCPOLEN_TSTAMP_ALIGNED;
>         }
> @@ -647,7 +647,6 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
>                                         struct tcp_out_options *opts,
>                                         struct tcp_md5sig_key **md5)
>  {
> -       struct tcp_skb_cb *tcb = skb ? TCP_SKB_CB(skb) : NULL;
>         struct tcp_sock *tp = tcp_sk(sk);
>         unsigned int size = 0;
>         unsigned int eff_sacks;
> @@ -666,7 +665,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
>
>         if (likely(tp->rx_opt.tstamp_ok)) {
>                 opts->options |= OPTION_TS;
> -               opts->tsval = tcb ? tcb->when + tp->tsoffset : 0;
> +               opts->tsval = skb ? tcp_skb_timestamp(skb) + tp->tsoffset : 0;
>                 opts->tsecr = tp->rx_opt.ts_recent;
>                 size += TCPOLEN_TSTAMP_ALIGNED;
>         }
> @@ -886,8 +885,6 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
>                         skb = skb_clone(skb, gfp_mask);
>                 if (unlikely(!skb))
>                         return -ENOBUFS;
> -               /* Our usage of tstamp should remain private */
> -               skb->tstamp.tv64 = 0;
>         }
>
>         inet = inet_sk(sk);
> @@ -975,7 +972,10 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
>                 TCP_ADD_STATS(sock_net(sk), TCP_MIB_OUTSEGS,
>                               tcp_skb_pcount(skb));
>
> +       /* Our usage of tstamp should remain private */
> +       skb->tstamp.tv64 = 0;
>         err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
> +
>         if (likely(err <= 0))
>                 return err;
>
> @@ -1149,7 +1149,6 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
>         /* Looks stupid, but our code really uses when of
>          * skbs, which it never sent before. --ANK
>          */
Remove the comment above too?


> -       TCP_SKB_CB(buff)->when = TCP_SKB_CB(skb)->when;
>         buff->tstamp = skb->tstamp;
>         tcp_fragment_tstamp(skb, buff);
>
> @@ -1874,8 +1873,8 @@ static int tcp_mtu_probe(struct sock *sk)
>         tcp_init_tso_segs(sk, nskb, nskb->len);
>
>         /* We're ready to send.  If this fails, the probe will
> -        * be resegmented into mss-sized pieces by tcp_write_xmit(). */
> -       TCP_SKB_CB(nskb)->when = tcp_time_stamp;
> +        * be resegmented into mss-sized pieces by tcp_write_xmit().
> +        */
>         if (!tcp_transmit_skb(sk, nskb, 1, GFP_ATOMIC)) {
>                 /* Decrement cwnd here because we are sending
>                  * effectively two packets. */
> @@ -1935,8 +1934,8 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
>                 BUG_ON(!tso_segs);
>
>                 if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE) {
> -                       /* "when" is used as a start point for the retransmit timer */
> -                       TCP_SKB_CB(skb)->when = tcp_time_stamp;
> +                       /* "skb_mstamp" is used as a start point for the retransmit timer */
> +                       skb_mstamp_get(&skb->skb_mstamp);
>                         goto repair; /* Skip network transmission */
>                 }
>
> @@ -2000,8 +1999,6 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
>                     unlikely(tso_fragment(sk, skb, limit, mss_now, gfp)))
>                         break;
>
> -               TCP_SKB_CB(skb)->when = tcp_time_stamp;
> -
>                 if (unlikely(tcp_transmit_skb(sk, skb, 1, gfp)))
>                         break;
>
> @@ -2499,7 +2496,6 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
>         /* Make a copy, if the first transmission SKB clone we made
>          * is still in somebody's hands, else make a clone.
>          */
> -       TCP_SKB_CB(skb)->when = tcp_time_stamp;
>
>         /* make sure skb->data is aligned on arches that require it
>          * and check if ack-trimming & collapsing extended the headroom
> @@ -2544,7 +2540,7 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
>
>                 /* Save stamp of the first retransmit. */
>                 if (!tp->retrans_stamp)
> -                       tp->retrans_stamp = TCP_SKB_CB(skb)->when;
> +                       tp->retrans_stamp = tcp_skb_timestamp(skb);
>
>                 /* snd_nxt is stored to detect loss of retransmitted segment,
>                  * see tcp_input.c tcp_sacktag_write_queue().
> @@ -2752,7 +2748,6 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority)
>         tcp_init_nondata_skb(skb, tcp_acceptable_seq(sk),
>                              TCPHDR_ACK | TCPHDR_RST);
>         /* Send it off. */
> -       TCP_SKB_CB(skb)->when = tcp_time_stamp;
>         if (tcp_transmit_skb(sk, skb, 0, priority))
>                 NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTFAILED);
>
> @@ -2791,7 +2786,6 @@ int tcp_send_synack(struct sock *sk)
>                 TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_ACK;
>                 TCP_ECN_send_synack(tcp_sk(sk), skb);
>         }
> -       TCP_SKB_CB(skb)->when = tcp_time_stamp;
>         return tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC);
>  }
>
> @@ -2835,10 +2829,10 @@ struct sk_buff *tcp_make_synack(struct sock *sk, struct dst_entry *dst,
>         memset(&opts, 0, sizeof(opts));
>  #ifdef CONFIG_SYN_COOKIES
>         if (unlikely(req->cookie_ts))
> -               TCP_SKB_CB(skb)->when = cookie_init_timestamp(req);
> +               skb->skb_mstamp.stamp_jiffies = cookie_init_timestamp(req);
>         else
>  #endif
> -       TCP_SKB_CB(skb)->when = tcp_time_stamp;
> +       skb_mstamp_get(&skb->skb_mstamp);
>         tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, &md5,
>                                              foc) + sizeof(*th);
>
> @@ -3086,7 +3080,7 @@ int tcp_connect(struct sock *sk)
>         skb_reserve(buff, MAX_TCP_HEADER);
>
>         tcp_init_nondata_skb(buff, tp->write_seq++, TCPHDR_SYN);
> -       tp->retrans_stamp = TCP_SKB_CB(buff)->when = tcp_time_stamp;
> +       tp->retrans_stamp = tcp_time_stamp;
>         tcp_connect_queue_skb(sk, buff);
>         TCP_ECN_send_syn(sk, buff);
>
> @@ -3194,7 +3188,7 @@ void tcp_send_ack(struct sock *sk)
>         tcp_init_nondata_skb(buff, tcp_acceptable_seq(sk), TCPHDR_ACK);
>
>         /* Send it off, this clears delayed acks for us. */
> -       TCP_SKB_CB(buff)->when = tcp_time_stamp;
> +       skb_mstamp_get(&buff->skb_mstamp);
>         tcp_transmit_skb(sk, buff, 0, sk_gfp_atomic(sk, GFP_ATOMIC));
>  }
>
> @@ -3226,7 +3220,7 @@ static int tcp_xmit_probe_skb(struct sock *sk, int urgent)
>          * send it.
>          */
>         tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TCPHDR_ACK);
> -       TCP_SKB_CB(skb)->when = tcp_time_stamp;
> +       skb_mstamp_get(&skb->skb_mstamp);
>         return tcp_transmit_skb(sk, skb, 0, GFP_ATOMIC);
>  }
>
> @@ -3270,7 +3264,6 @@ int tcp_write_wakeup(struct sock *sk)
>                         tcp_set_skb_tso_segs(sk, skb, mss);
>
>                 TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_PSH;
> -               TCP_SKB_CB(skb)->when = tcp_time_stamp;
>                 err = tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC);
>                 if (!err)
>                         tcp_event_new_data_sent(sk, skb);
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index df90cd1ce37f..a339e7ba05a4 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -135,10 +135,9 @@ static bool retransmits_timed_out(struct sock *sk,
>         if (!inet_csk(sk)->icsk_retransmits)
>                 return false;
>
> -       if (unlikely(!tcp_sk(sk)->retrans_stamp))
> -               start_ts = TCP_SKB_CB(tcp_write_queue_head(sk))->when;
> -       else
> -               start_ts = tcp_sk(sk)->retrans_stamp;
> +       start_ts = tcp_sk(sk)->retrans_stamp;
> +       if (unlikely(!start_ts))
> +               start_ts = tcp_skb_timestamp(tcp_write_queue_head(sk));
>
>         if (likely(timeout == 0)) {
>                 linear_backoff_thresh = ilog2(TCP_RTO_MAX/rto_base);
> --
> 2.1.0.rc2.206.gedb03e5
>

^ permalink raw reply	[flat|nested] 6+ messages in thread
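
The timer arithmetic touched by the tcp_v4_err() and tcp_rearm_rto() hunks relies on 32-bit wrap-around, which the following userspace sketch illustrates. The function names are hypothetical; the expressions mirror the ones in the diff, with plain `uint32_t` tick counts standing in for jiffies.

```c
#include <assert.h>
#include <stdint.h>

/* Remaining RTO as in the tcp_v4_err() hunk:
 *   remaining = rto - min(rto, now - sent)
 * The subtraction is done modulo 2^32, so it stays correct when the
 * tick counter wraps between 'sent' and 'now'.
 */
static uint32_t demo_rto_remaining(uint32_t rto, uint32_t now, uint32_t sent)
{
	uint32_t elapsed = now - sent;

	return rto - (elapsed < rto ? elapsed : rto);
}

/* Signed distance to the deadline as in the tcp_rearm_rto() hunk:
 *   delta = (s32)(sent + rto - now)
 * A non-positive result means the deadline has already passed.
 */
static int32_t demo_rto_delta(uint32_t sent, uint32_t rto, uint32_t now)
{
	return (int32_t)((sent + rto) - now);
}
```

Because both helpers only ever look at differences between stamps, it does not matter whether the send time comes from the old `when` field or from the jiffies half of skb_mstamp, as long as both are sampled from the same counter.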

* Re: [PATCH net-next 1/2] tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn
  2014-09-05 22:33 ` [PATCH net-next 1/2] tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn Eric Dumazet
@ 2014-09-05 22:50   ` Yuchung Cheng
  0 siblings, 0 replies; 6+ messages in thread
From: Yuchung Cheng @ 2014-09-05 22:50 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, netdev, Neal Cardwell

On Fri, Sep 5, 2014 at 3:33 PM, Eric Dumazet <edumazet@google.com> wrote:
> TCP_SKB_CB(skb)->when has different meanings in the output and input paths.
>
> In the output path, it contains a timestamp.
> In the input path, it contains an ISN chosen by tcp_timewait_state_process().
>
> Let's add a different name to ease code comprehension.
>
> Note that the 'when' field will disappear in the following patch,
> as skb_mstamp already contains the timestamp; the anonymous
> union will promptly disappear as well.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>

> ---
>  include/net/tcp.h        | 7 ++++++-
>  net/ipv4/tcp_input.c     | 2 +-
>  net/ipv4/tcp_ipv4.c      | 2 +-
>  net/ipv4/tcp_minisocks.c | 2 +-
>  net/ipv6/tcp_ipv6.c      | 4 ++--
>  5 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 590e01a476ac..0cd7d2c65dc0 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -698,7 +698,12 @@ struct tcp_skb_cb {
>         } header;       /* For incoming frames          */
>         __u32           seq;            /* Starting sequence number     */
>         __u32           end_seq;        /* SEQ + FIN + SYN + datalen    */
> -       __u32           when;           /* used to compute rtt's        */
> +       union {
> +               /* used in output path */
> +               __u32           when;   /* used to compute rtt's        */
> +               /* used in input path */
> +               __u32           tcp_tw_isn; /* isn chosen by tcp_timewait_state_process() */
> +       };
>         __u8            tcp_flags;      /* TCP header flags. (tcp[13])  */
>
>         __u8            sacked;         /* State flags for SACK/FACK.   */
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index aba4926ca095..9c8b9f1dcf69 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5906,7 +5906,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
>         struct request_sock *req;
>         struct tcp_sock *tp = tcp_sk(sk);
>         struct dst_entry *dst = NULL;
> -       __u32 isn = TCP_SKB_CB(skb)->when;
> +       __u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn;
>         bool want_cookie = false, fastopen;
>         struct flowi fl;
>         struct tcp_fastopen_cookie foc = { .len = -1 };
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 487e2a41667f..02e6cd29ebf1 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1627,7 +1627,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
>         TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
>                                     skb->len - th->doff * 4);
>         TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> -       TCP_SKB_CB(skb)->when    = 0;
> +       TCP_SKB_CB(skb)->tcp_tw_isn = 0;
>         TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
>         TCP_SKB_CB(skb)->sacked  = 0;
>
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index 1649988bd1b6..a058f411d3a6 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -232,7 +232,7 @@ kill:
>                 u32 isn = tcptw->tw_snd_nxt + 65535 + 2;
>                 if (isn == 0)
>                         isn++;
> -               TCP_SKB_CB(skb)->when = isn;
> +               TCP_SKB_CB(skb)->tcp_tw_isn = isn;
>                 return TCP_TW_SYN;
>         }
>
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 29964c3d363c..5b3c70ff7a72 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -738,7 +738,7 @@ static void tcp_v6_init_req(struct request_sock *req, struct sock *sk,
>             ipv6_addr_type(&ireq->ir_v6_rmt_addr) & IPV6_ADDR_LINKLOCAL)
>                 ireq->ir_iif = inet6_iif(skb);
>
> -       if (!TCP_SKB_CB(skb)->when &&
> +       if (!TCP_SKB_CB(skb)->tcp_tw_isn &&
>             (ipv6_opt_accepted(sk, skb) || np->rxopt.bits.rxinfo ||
>              np->rxopt.bits.rxoinfo || np->rxopt.bits.rxhlim ||
>              np->rxopt.bits.rxohlim || np->repflow)) {
> @@ -1412,7 +1412,7 @@ static int tcp_v6_rcv(struct sk_buff *skb)
>         TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
>                                     skb->len - th->doff*4);
>         TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> -       TCP_SKB_CB(skb)->when = 0;
> +       TCP_SKB_CB(skb)->tcp_tw_isn = 0;
>         TCP_SKB_CB(skb)->ip_dsfield = ipv6_get_dsfield(hdr);
>         TCP_SKB_CB(skb)->sacked = 0;
>
> --
> 2.1.0.rc2.206.gedb03e5
>

* Re: [PATCH net-next 0/2] tcp: deduplicate TCP_SKB_CB(skb)->when
  2014-09-05 22:33 [PATCH net-next 0/2] tcp: deduplicate TCP_SKB_CB(skb)->when Eric Dumazet
  2014-09-05 22:33 ` [PATCH net-next 1/2] tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn Eric Dumazet
  2014-09-05 22:33 ` [PATCH net-next 2/2] tcp: remove TCP_SKB_CB(skb)->when Eric Dumazet
@ 2014-09-06  0:49 ` David Miller
  2 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2014-09-06  0:49 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, ycheng, ncardwell

From: Eric Dumazet <edumazet@google.com>
Date: Fri,  5 Sep 2014 15:33:31 -0700

> TCP_SKB_CB(skb)->when has different meaning in output and input paths.
> 
> In output path, it contains a timestamp.
> In input path, it contains an ISN, chosen by tcp_timewait_state_process()
> 
> Its usage in output path is obsolete after usec timestamping.
> Let's simplify and clean this up.

Nice cleanup, applied, thanks Eric.
