* [PATCH net-next 0/2] tcp: deduplicate TCP_SKB_CB(skb)->when
@ 2014-09-05 22:33 Eric Dumazet

From: Eric Dumazet @ 2014-09-05 22:33 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Yuchung Cheng, Neal Cardwell, Eric Dumazet

TCP_SKB_CB(skb)->when has a different meaning in the output and input paths.

In the output path, it contains a timestamp.
In the input path, it contains an ISN, chosen by tcp_timewait_state_process().

Its usage in the output path is obsolete after usec timestamping.
Let's simplify and clean this up.

Eric Dumazet (2):
  tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn
  tcp: remove TCP_SKB_CB(skb)->when

 include/net/tcp.h        |  8 +++++++-
 net/ipv4/tcp_input.c     |  5 +++--
 net/ipv4/tcp_ipv4.c      |  7 ++++---
 net/ipv4/tcp_minisocks.c |  2 +-
 net/ipv4/tcp_output.c    | 34 +++++++++++++---------------------
 net/ipv4/tcp_timer.c     |  7 +++----
 net/ipv6/tcp_ipv6.c      |  4 ++--
 7 files changed, 33 insertions(+), 34 deletions(-)

-- 
2.1.0.rc2.206.gedb03e5
* [PATCH net-next 1/2] tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn
@ 2014-09-05 22:33 ` Eric Dumazet

From: Eric Dumazet @ 2014-09-05 22:33 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Yuchung Cheng, Neal Cardwell, Eric Dumazet

TCP_SKB_CB(skb)->when has a different meaning in the output and input paths.

In the output path, it contains a timestamp.
In the input path, it contains an ISN, chosen by tcp_timewait_state_process().

Let's add a different name to ease code comprehension.

Note that the 'when' field will disappear in the following patch, as
skb_mstamp already contains the timestamp; the anonymous union will
promptly disappear as well.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/tcp.h        | 7 ++++++-
 net/ipv4/tcp_input.c     | 2 +-
 net/ipv4/tcp_ipv4.c      | 2 +-
 net/ipv4/tcp_minisocks.c | 2 +-
 net/ipv6/tcp_ipv6.c      | 4 ++--
 5 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 590e01a476ac..0cd7d2c65dc0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -698,7 +698,12 @@ struct tcp_skb_cb {
 	} header;	/* For incoming frames		*/
 	__u32		seq;		/* Starting sequence number	*/
 	__u32		end_seq;	/* SEQ + FIN + SYN + datalen	*/
-	__u32		when;		/* used to compute rtt's	*/
+	union {
+		/* used in output path */
+		__u32		when;	/* used to compute rtt's	*/
+		/* used in input path */
+		__u32		tcp_tw_isn;	/* isn chosen by tcp_timewait_state_process() */
+	};
 	__u8		tcp_flags;	/* TCP header flags. (tcp[13])	*/

 	__u8		sacked;		/* State flags for SACK/FACK.	*/
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index aba4926ca095..9c8b9f1dcf69 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5906,7 +5906,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	struct request_sock *req;
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct dst_entry *dst = NULL;
-	__u32 isn = TCP_SKB_CB(skb)->when;
+	__u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn;
 	bool want_cookie = false, fastopen;
 	struct flowi fl;
 	struct tcp_fastopen_cookie foc = { .len = -1 };
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 487e2a41667f..02e6cd29ebf1 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1627,7 +1627,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
 				    skb->len - th->doff * 4);
 	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
-	TCP_SKB_CB(skb)->when = 0;
+	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
 	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
 	TCP_SKB_CB(skb)->sacked = 0;

diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 1649988bd1b6..a058f411d3a6 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -232,7 +232,7 @@ kill:
 		u32 isn = tcptw->tw_snd_nxt + 65535 + 2;
 		if (isn == 0)
 			isn++;
-		TCP_SKB_CB(skb)->when = isn;
+		TCP_SKB_CB(skb)->tcp_tw_isn = isn;
 		return TCP_TW_SYN;
 	}

diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 29964c3d363c..5b3c70ff7a72 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -738,7 +738,7 @@ static void tcp_v6_init_req(struct request_sock *req, struct sock *sk,
 	    ipv6_addr_type(&ireq->ir_v6_rmt_addr) & IPV6_ADDR_LINKLOCAL)
 		ireq->ir_iif = inet6_iif(skb);

-	if (!TCP_SKB_CB(skb)->when &&
+	if (!TCP_SKB_CB(skb)->tcp_tw_isn &&
 	    (ipv6_opt_accepted(sk, skb) || np->rxopt.bits.rxinfo ||
 	     np->rxopt.bits.rxoinfo || np->rxopt.bits.rxhlim ||
 	     np->rxopt.bits.rxohlim || np->repflow)) {
@@ -1412,7 +1412,7 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
 				    skb->len - th->doff*4);
 	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
-	TCP_SKB_CB(skb)->when = 0;
+	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
 	TCP_SKB_CB(skb)->ip_dsfield = ipv6_get_dsfield(hdr);
 	TCP_SKB_CB(skb)->sacked = 0;

-- 
2.1.0.rc2.206.gedb03e5
* Re: [PATCH net-next 1/2] tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn
@ 2014-09-05 22:50 ` Yuchung Cheng

From: Yuchung Cheng @ 2014-09-05 22:50 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David S. Miller, netdev, Neal Cardwell

On Fri, Sep 5, 2014 at 3:33 PM, Eric Dumazet <edumazet@google.com> wrote:
> TCP_SKB_CB(skb)->when has a different meaning in the output and input paths.
>
> In the output path, it contains a timestamp.
> In the input path, it contains an ISN, chosen by tcp_timewait_state_process().
>
> Let's add a different name to ease code comprehension.
>
> Note that the 'when' field will disappear in the following patch, as
> skb_mstamp already contains the timestamp; the anonymous union will
> promptly disappear as well.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Yuchung Cheng <ycheng@google.com>

> [...]
* [PATCH net-next 2/2] tcp: remove TCP_SKB_CB(skb)->when
@ 2014-09-05 22:33 ` Eric Dumazet

From: Eric Dumazet @ 2014-09-05 22:33 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Yuchung Cheng, Neal Cardwell, Eric Dumazet

After commit 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution"),
we no longer need to maintain timestamps in two different fields.

TCP_SKB_CB(skb)->when can be removed, as the same information sits in
skb_mstamp.stamp_jiffies.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/tcp.h     | 13 +++++++------
 net/ipv4/tcp_input.c  |  3 ++-
 net/ipv4/tcp_ipv4.c   |  5 +++--
 net/ipv4/tcp_output.c | 39 ++++++++++++++++-----------------------
 net/ipv4/tcp_timer.c  |  7 +++----
 5 files changed, 31 insertions(+), 36 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 0cd7d2c65dc0..a4201ef216e8 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -672,6 +672,12 @@ void tcp_send_window_probe(struct sock *sk);
  */
 #define tcp_time_stamp		((__u32)(jiffies))

+static inline u32 tcp_skb_timestamp(const struct sk_buff *skb)
+{
+	return skb->skb_mstamp.stamp_jiffies;
+}
+
+
 #define tcp_flag_byte(th) (((u_int8_t *)th)[13])

 #define TCPHDR_FIN 0x01
@@ -698,12 +704,7 @@ struct tcp_skb_cb {
 	} header;	/* For incoming frames		*/
 	__u32		seq;		/* Starting sequence number	*/
 	__u32		end_seq;	/* SEQ + FIN + SYN + datalen	*/
-	union {
-		/* used in output path */
-		__u32		when;	/* used to compute rtt's	*/
-		/* used in input path */
-		__u32		tcp_tw_isn;	/* isn chosen by tcp_timewait_state_process() */
-	};
+	__u32		tcp_tw_isn;	/* isn chosen by tcp_timewait_state_process() */
 	__u8		tcp_flags;	/* TCP header flags. (tcp[13])	*/

 	__u8		sacked;		/* State flags for SACK/FACK.	*/
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9c8b9f1dcf69..f97003ad0af5 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2967,7 +2967,8 @@ void tcp_rearm_rto(struct sock *sk)
 		if (icsk->icsk_pending == ICSK_TIME_EARLY_RETRANS ||
 		    icsk->icsk_pending == ICSK_TIME_LOSS_PROBE) {
 			struct sk_buff *skb = tcp_write_queue_head(sk);
-			const u32 rto_time_stamp = TCP_SKB_CB(skb)->when + rto;
+			const u32 rto_time_stamp =
+				tcp_skb_timestamp(skb) + rto;
 			s32 delta = (s32)(rto_time_stamp - tcp_time_stamp);
 			/* delta may not be positive if the socket is locked
 			 * when the retrans timer fires and is rescheduled.
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 02e6cd29ebf1..3f9bc3f0bba0 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -437,8 +437,9 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
 		skb = tcp_write_queue_head(sk);
 		BUG_ON(!skb);

-		remaining = icsk->icsk_rto - min(icsk->icsk_rto,
-				tcp_time_stamp - TCP_SKB_CB(skb)->when);
+		remaining = icsk->icsk_rto -
+			    min(icsk->icsk_rto,
+				tcp_time_stamp - tcp_skb_timestamp(skb));

 		if (remaining) {
 			inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 5a7c41fbc6d3..3b22dcb7bb5c 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -550,7 +550,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,

 	if (likely(sysctl_tcp_timestamps && *md5 == NULL)) {
 		opts->options |= OPTION_TS;
-		opts->tsval = TCP_SKB_CB(skb)->when + tp->tsoffset;
+		opts->tsval = tcp_skb_timestamp(skb) + tp->tsoffset;
 		opts->tsecr = tp->rx_opt.ts_recent;
 		remaining -= TCPOLEN_TSTAMP_ALIGNED;
 	}
@@ -618,7 +618,7 @@ static unsigned int tcp_synack_options(struct sock *sk,
 	}
 	if (likely(ireq->tstamp_ok)) {
 		opts->options |= OPTION_TS;
-		opts->tsval = TCP_SKB_CB(skb)->when;
+		opts->tsval = tcp_skb_timestamp(skb);
 		opts->tsecr = req->ts_recent;
 		remaining -= TCPOLEN_TSTAMP_ALIGNED;
 	}
@@ -647,7 +647,6 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 					struct tcp_out_options *opts,
 					struct tcp_md5sig_key **md5)
 {
-	struct tcp_skb_cb *tcb = skb ? TCP_SKB_CB(skb) : NULL;
 	struct tcp_sock *tp = tcp_sk(sk);
 	unsigned int size = 0;
 	unsigned int eff_sacks;
@@ -666,7 +665,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb

 	if (likely(tp->rx_opt.tstamp_ok)) {
 		opts->options |= OPTION_TS;
-		opts->tsval = tcb ? tcb->when + tp->tsoffset : 0;
+		opts->tsval = skb ? tcp_skb_timestamp(skb) + tp->tsoffset : 0;
 		opts->tsecr = tp->rx_opt.ts_recent;
 		size += TCPOLEN_TSTAMP_ALIGNED;
 	}
@@ -886,8 +885,6 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		skb = skb_clone(skb, gfp_mask);
 		if (unlikely(!skb))
 			return -ENOBUFS;
-		/* Our usage of tstamp should remain private */
-		skb->tstamp.tv64 = 0;
 	}

 	inet = inet_sk(sk);
@@ -975,7 +972,10 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		TCP_ADD_STATS(sock_net(sk), TCP_MIB_OUTSEGS,
 			      tcp_skb_pcount(skb));

+	/* Our usage of tstamp should remain private */
+	skb->tstamp.tv64 = 0;
 	err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
+
 	if (likely(err <= 0))
 		return err;

@@ -1149,7 +1149,6 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
 	/* Looks stupid, but our code really uses when of
 	 * skbs, which it never sent before. --ANK
 	 */
-	TCP_SKB_CB(buff)->when = TCP_SKB_CB(skb)->when;
 	buff->tstamp = skb->tstamp;
 	tcp_fragment_tstamp(skb, buff);

@@ -1874,8 +1873,8 @@ static int tcp_mtu_probe(struct sock *sk)
 	tcp_init_tso_segs(sk, nskb, nskb->len);

 	/* We're ready to send.  If this fails, the probe will
-	 * be resegmented into mss-sized pieces by tcp_write_xmit(). */
-	TCP_SKB_CB(nskb)->when = tcp_time_stamp;
+	 * be resegmented into mss-sized pieces by tcp_write_xmit().
+	 */
 	if (!tcp_transmit_skb(sk, nskb, 1, GFP_ATOMIC)) {
 		/* Decrement cwnd here because we are sending
 		 * effectively two packets. */
@@ -1935,8 +1934,8 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 		BUG_ON(!tso_segs);

 		if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE) {
-			/* "when" is used as a start point for the retransmit timer */
-			TCP_SKB_CB(skb)->when = tcp_time_stamp;
+			/* "skb_mstamp" is used as a start point for the retransmit timer */
+			skb_mstamp_get(&skb->skb_mstamp);
 			goto repair; /* Skip network transmission */
 		}

@@ -2000,8 +1999,6 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 		    unlikely(tso_fragment(sk, skb, limit, mss_now, gfp)))
 			break;

-		TCP_SKB_CB(skb)->when = tcp_time_stamp;
-
 		if (unlikely(tcp_transmit_skb(sk, skb, 1, gfp)))
 			break;

@@ -2499,7 +2496,6 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 	/* Make a copy, if the first transmission SKB clone we made
 	 * is still in somebody's hands, else make a clone.
 	 */
-	TCP_SKB_CB(skb)->when = tcp_time_stamp;

 	/* make sure skb->data is aligned on arches that require it
 	 * and check if ack-trimming & collapsing extended the headroom
@@ -2544,7 +2540,7 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)

 		/* Save stamp of the first retransmit. */
 		if (!tp->retrans_stamp)
-			tp->retrans_stamp = TCP_SKB_CB(skb)->when;
+			tp->retrans_stamp = tcp_skb_timestamp(skb);

 		/* snd_nxt is stored to detect loss of retransmitted segment,
 		 * see tcp_input.c tcp_sacktag_write_queue().
@@ -2752,7 +2748,6 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority)
 	tcp_init_nondata_skb(skb, tcp_acceptable_seq(sk),
 			     TCPHDR_ACK | TCPHDR_RST);
 	/* Send it off. */
-	TCP_SKB_CB(skb)->when = tcp_time_stamp;
 	if (tcp_transmit_skb(sk, skb, 0, priority))
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTFAILED);

@@ -2791,7 +2786,6 @@ int tcp_send_synack(struct sock *sk)
 		TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_ACK;
 		TCP_ECN_send_synack(tcp_sk(sk), skb);
 	}
-	TCP_SKB_CB(skb)->when = tcp_time_stamp;
 	return tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC);
 }

@@ -2835,10 +2829,10 @@ struct sk_buff *tcp_make_synack(struct sock *sk, struct dst_entry *dst,
 	memset(&opts, 0, sizeof(opts));
 #ifdef CONFIG_SYN_COOKIES
 	if (unlikely(req->cookie_ts))
-		TCP_SKB_CB(skb)->when = cookie_init_timestamp(req);
+		skb->skb_mstamp.stamp_jiffies = cookie_init_timestamp(req);
 	else
 #endif
-		TCP_SKB_CB(skb)->when = tcp_time_stamp;
+		skb_mstamp_get(&skb->skb_mstamp);
 	tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, &md5,
 					     foc) + sizeof(*th);

@@ -3086,7 +3080,7 @@ int tcp_connect(struct sock *sk)
 	skb_reserve(buff, MAX_TCP_HEADER);

 	tcp_init_nondata_skb(buff, tp->write_seq++, TCPHDR_SYN);
-	tp->retrans_stamp = TCP_SKB_CB(buff)->when = tcp_time_stamp;
+	tp->retrans_stamp = tcp_time_stamp;
 	tcp_connect_queue_skb(sk, buff);
 	TCP_ECN_send_syn(sk, buff);

@@ -3194,7 +3188,7 @@ void tcp_send_ack(struct sock *sk)
 	tcp_init_nondata_skb(buff, tcp_acceptable_seq(sk), TCPHDR_ACK);

 	/* Send it off, this clears delayed acks for us. */
-	TCP_SKB_CB(buff)->when = tcp_time_stamp;
+	skb_mstamp_get(&buff->skb_mstamp);
 	tcp_transmit_skb(sk, buff, 0, sk_gfp_atomic(sk, GFP_ATOMIC));
 }

@@ -3226,7 +3220,7 @@ static int tcp_xmit_probe_skb(struct sock *sk, int urgent)
 	 * send it.
 	 */
 	tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TCPHDR_ACK);
-	TCP_SKB_CB(skb)->when = tcp_time_stamp;
+	skb_mstamp_get(&skb->skb_mstamp);
 	return tcp_transmit_skb(sk, skb, 0, GFP_ATOMIC);
 }

@@ -3270,7 +3264,6 @@ int tcp_write_wakeup(struct sock *sk)
 			tcp_set_skb_tso_segs(sk, skb, mss);

 			TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_PSH;
-			TCP_SKB_CB(skb)->when = tcp_time_stamp;
 			err = tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC);
 			if (!err)
 				tcp_event_new_data_sent(sk, skb);
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index df90cd1ce37f..a339e7ba05a4 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -135,10 +135,9 @@ static bool retransmits_timed_out(struct sock *sk,
 	if (!inet_csk(sk)->icsk_retransmits)
 		return false;

-	if (unlikely(!tcp_sk(sk)->retrans_stamp))
-		start_ts = TCP_SKB_CB(tcp_write_queue_head(sk))->when;
-	else
-		start_ts = tcp_sk(sk)->retrans_stamp;
+	start_ts = tcp_sk(sk)->retrans_stamp;
+	if (unlikely(!start_ts))
+		start_ts = tcp_skb_timestamp(tcp_write_queue_head(sk));

 	if (likely(timeout == 0)) {
 		linear_backoff_thresh = ilog2(TCP_RTO_MAX/rto_base);

-- 
2.1.0.rc2.206.gedb03e5
* Re: [PATCH net-next 2/2] tcp: remove TCP_SKB_CB(skb)->when
@ 2014-09-05 22:49 ` Yuchung Cheng

From: Yuchung Cheng @ 2014-09-05 22:49 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David S. Miller, netdev, Neal Cardwell

On Fri, Sep 5, 2014 at 3:33 PM, Eric Dumazet <edumazet@google.com> wrote:
> After commit 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution"),
> we no longer need to maintain timestamps in two different fields.
>
> TCP_SKB_CB(skb)->when can be removed, as the same information sits in
> skb_mstamp.stamp_jiffies.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Yuchung Cheng <ycheng@google.com>

> [...]
>
> @@ -1149,7 +1149,6 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
> 	/* Looks stupid, but our code really uses when of
> 	 * skbs, which it never sent before. --ANK
> 	 */

Remove the comment above too?

> -	TCP_SKB_CB(buff)->when = TCP_SKB_CB(skb)->when;
> 	buff->tstamp = skb->tstamp;
> 	tcp_fragment_tstamp(skb, buff);
>
> [...]
* Re: [PATCH net-next 0/2] tcp: deduplicate TCP_SKB_CB(skb)->when
@ 2014-09-06  0:49 ` David Miller

From: David Miller @ 2014-09-06 0:49 UTC (permalink / raw)
To: edumazet; +Cc: netdev, ycheng, ncardwell

From: Eric Dumazet <edumazet@google.com>
Date: Fri, 5 Sep 2014 15:33:31 -0700

> TCP_SKB_CB(skb)->when has a different meaning in the output and input paths.
>
> In the output path, it contains a timestamp.
> In the input path, it contains an ISN, chosen by tcp_timewait_state_process().
>
> Its usage in the output path is obsolete after usec timestamping.
> Let's simplify and clean this up.

Nice cleanup, applied, thanks Eric.