* [PATCH net-next] tcp: add ability to set a timestamp offset
@ 2013-01-22 20:52 Andrey Vagin
  2013-01-22 21:16 ` Rick Jones
  2013-01-22 21:18 ` Neal Cardwell
  0 siblings, 2 replies; 6+ messages in thread
From: Andrey Vagin @ 2013-01-22 20:52 UTC
  To: netdev
  Cc: criu, linux-kernel, Andrey Vagin, David S. Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, Eric Dumazet, Yuchung Cheng, Neal Cardwell,
	Pavel Emelyanov, Dave Jones, Michael Kerrisk

If a TCP socket is live-migrated from one box to another, the
timestamps (which are typically ON) get screwed up -- the new
kernel generates TS values that have nothing to do with what they
were at dump time. The solution is to yet again fix the kernel and
put a "timestamp offset" on a socket.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
---
 include/linux/tcp.h      | 2 ++
 include/uapi/linux/tcp.h | 1 +
 net/ipv4/tcp.c           | 6 ++++++
 net/ipv4/tcp_output.c    | 7 ++++---
 4 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 4e1d228..746dad5 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -162,6 +162,8 @@ struct tcp_sock {
 	u32	rcv_tstamp;	/* timestamp of last received ACK (for keepalives) */
 	u32	lsndtime;	/* timestamp of last sent data packet (for restart window) */
 
+	u32	snd_tsval_offset; /* offset for snd_tsval */
+
 	struct list_head tsq_node; /* anchor in tsq_tasklet.head list */
 	unsigned long	tsq_flags;
 
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index e962faa..6b1ead0 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -111,6 +111,7 @@ enum {
 #define TCP_QUEUE_SEQ		21
 #define TCP_REPAIR_OPTIONS	22
 #define TCP_FASTOPEN		23	/* Enable FastOpen on listeners */
+#define TCP_TIMESTAMP		24
 
 struct tcp_repair_opt {
 	__u32	opt_code;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 1ca2536..72dee28 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2704,6 +2704,9 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 		else
 			err = -EINVAL;
 		break;
+	case TCP_TIMESTAMP:
+		tp->snd_tsval_offset = val - tcp_time_stamp;
+		break;
 	default:
 		err = -ENOPROTOOPT;
 		break;
@@ -2952,6 +2955,9 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
 	case TCP_USER_TIMEOUT:
 		val = jiffies_to_msecs(icsk->icsk_user_timeout);
 		break;
+	case TCP_TIMESTAMP:
+		val = tcp_time_stamp + tp->snd_tsval_offset;
+		break;
 	default:
 		return -ENOPROTOOPT;
 	}
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 5d45159..9b6d485 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -622,7 +622,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
 
 	if (likely(sysctl_tcp_timestamps && *md5 == NULL)) {
 		opts->options |= OPTION_TS;
-		opts->tsval = TCP_SKB_CB(skb)->when;
+		opts->tsval = TCP_SKB_CB(skb)->when + tp->snd_tsval_offset;
 		opts->tsecr = tp->rx_opt.ts_recent;
 		remaining -= TCPOLEN_TSTAMP_ALIGNED;
 	}
@@ -705,6 +705,7 @@ static unsigned int tcp_synack_options(struct sock *sk,
 				   struct tcp_extend_values *xvp,
 				   struct tcp_fastopen_cookie *foc)
 {
+	struct tcp_sock *tp = tcp_sk(sk);
 	struct inet_request_sock *ireq = inet_rsk(req);
 	unsigned int remaining = MAX_TCP_OPTION_SPACE;
 	u8 cookie_plus = (xvp != NULL && !xvp->cookie_out_never) ?
@@ -739,7 +740,7 @@ static unsigned int tcp_synack_options(struct sock *sk,
 	}
 	if (likely(ireq->tstamp_ok)) {
 		opts->options |= OPTION_TS;
-		opts->tsval = TCP_SKB_CB(skb)->when;
+		opts->tsval = TCP_SKB_CB(skb)->when + tp->snd_tsval_offset;
 		opts->tsecr = req->ts_recent;
 		remaining -= TCPOLEN_TSTAMP_ALIGNED;
 	}
@@ -806,7 +807,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 
 	if (likely(tp->rx_opt.tstamp_ok)) {
 		opts->options |= OPTION_TS;
-		opts->tsval = tcb ? tcb->when : 0;
+		opts->tsval = tcb ? tcb->when + tp->snd_tsval_offset : 0;
 		opts->tsecr = tp->rx_opt.ts_recent;
 		size += TCPOLEN_TSTAMP_ALIGNED;
 	}
-- 
1.7.11.7
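
As a rough illustration of the set/get arithmetic in the patch above (a
standalone model, not kernel code): setsockopt stores
val - tcp_time_stamp, and getsockopt returns
tcp_time_stamp + snd_tsval_offset. Both are unsigned 32-bit operations,
so the stored offset stays valid when the clock wraps. The function
names and the `now` parameter, which stands in for tcp_time_stamp, are
hypothetical.

```c
#include <stdint.h>

/* Model of the TCP_TIMESTAMP setsockopt path: the offset that makes the
 * socket's timestamp equal `target` when the local clock reads `now`.
 * `now` is a stand-in for the kernel's tcp_time_stamp. */
static uint32_t ts_offset_for(uint32_t target, uint32_t now)
{
	return target - now;	/* wraps modulo 2^32 by design */
}

/* Model of the getsockopt/transmit path: the value a socket with the
 * given offset would report or send when the clock reads `now`. */
static uint32_t socket_tsval(uint32_t now, uint32_t offset)
{
	return now + offset;	/* wraps modulo 2^32 by design */
}
```

For example, with the clock at 4294967000 and a target of 100, the
offset is 396. After 500 more ticks the clock has wrapped to 204 and the
socket timestamp is 600, so it advanced by exactly 500.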



* Re: [PATCH net-next] tcp: add ability to set a timestamp offset
  2013-01-22 20:52 [PATCH net-next] tcp: add ability to set a timestamp offset Andrey Vagin
@ 2013-01-22 21:16 ` Rick Jones
  2013-01-22 21:18 ` Neal Cardwell
  1 sibling, 0 replies; 6+ messages in thread
From: Rick Jones @ 2013-01-22 21:16 UTC
  To: Andrey Vagin
  Cc: netdev, criu, linux-kernel, David S. Miller, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Eric Dumazet,
	Yuchung Cheng, Neal Cardwell, Pavel Emelyanov, Dave Jones,
	Michael Kerrisk

On 01/22/2013 12:52 PM, Andrey Vagin wrote:
> If a TCP socket is live-migrated from one box to another, the
> timestamps (which are typically ON) get screwed up -- the new
> kernel generates TS values that have nothing to do with what they
> were at dump time. The solution is to yet again fix the kernel and
> put a "timestamp offset" on a socket.

Is there a chance a connection can be moved more than once within the 
"lifetime" of a given timestamp value?

rick jones



* Re: [PATCH net-next] tcp: add ability to set a timestamp offset
  2013-01-22 20:52 [PATCH net-next] tcp: add ability to set a timestamp offset Andrey Vagin
  2013-01-22 21:16 ` Rick Jones
@ 2013-01-22 21:18 ` Neal Cardwell
  2013-01-22 21:24   ` David Miller
  1 sibling, 1 reply; 6+ messages in thread
From: Neal Cardwell @ 2013-01-22 21:18 UTC
  To: Andrey Vagin
  Cc: Netdev, criu, LKML, David S. Miller, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Eric Dumazet,
	Yuchung Cheng, Pavel Emelyanov, Dave Jones, Michael Kerrisk

On Tue, Jan 22, 2013 at 3:52 PM, Andrey Vagin <avagin@openvz.org> wrote:
> If a TCP socket is live-migrated from one box to another, the
> timestamps (which are typically ON) get screwed up -- the new
> kernel generates TS values that have nothing to do with what they
> were at dump time. The solution is to yet again fix the kernel and
> put a "timestamp offset" on a socket.

One serious issue with this patch is that outgoing timestamp values
will no longer correspond to tcp_time_stamp, so echoed timestamp
values will also no longer have a meaningful relationship to
tcp_time_stamp. That violates assumptions made in several places in
the code, which assume that we can compare echoed timestamp values to
tcp_time_stamp; for example, there are several places where we do
things like subtracting:
   tcp_time_stamp - tp->rx_opt.rcv_tsecr
to find the estimated RTT for a segment.

neal
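
To make the objection concrete, here is a minimal standalone sketch (not
kernel code) of the subtraction quoted in the reply above. It assumes
the peer echoes the sender's TSval unchanged. The function names, the
`now` parameter standing in for tcp_time_stamp, and the idea of
subtracting the offset from the echoed value before use are
illustrative assumptions, not taken from the posted patch.

```c
#include <stdint.h>

/* RTT estimate as the existing code computes it:
 * tcp_time_stamp - tp->rx_opt.rcv_tsecr, in 32-bit unsigned arithmetic.
 * `now` is a stand-in for tcp_time_stamp. */
static uint32_t rtt_naive(uint32_t now, uint32_t rcv_tsecr)
{
	return now - rcv_tsecr;
}

/* Hypothetical corrected form: undo the per-socket offset on the echoed
 * value first, so it is back in the local clock's domain. */
static uint32_t rtt_offset_aware(uint32_t now, uint32_t rcv_tsecr,
				 uint32_t offset)
{
	return now - (rcv_tsecr - offset);
}
```

With an offset of 1000000, a segment sent at local time 5000 carries
TSval 1005000. If the echo arrives at local time 5040, the
offset-aware form yields 40 ticks, while the naive subtraction wraps
around to a huge value that is useless as an RTT sample.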


* Re: [PATCH net-next] tcp: add ability to set a timestamp offset
  2013-01-22 21:18 ` Neal Cardwell
@ 2013-01-22 21:24   ` David Miller
  0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2013-01-22 21:24 UTC
  To: ncardwell
  Cc: avagin, netdev, criu, linux-kernel, kuznet, jmorris, yoshfuji,
	kaber, edumazet, ycheng, xemul, davej, mtk.manpages

From: Neal Cardwell <ncardwell@google.com>
Date: Tue, 22 Jan 2013 16:18:04 -0500

> On Tue, Jan 22, 2013 at 3:52 PM, Andrey Vagin <avagin@openvz.org> wrote:
>> If a TCP socket is live-migrated from one box to another, the
>> timestamps (which are typically ON) get screwed up -- the new
>> kernel generates TS values that have nothing to do with what they
>> were at dump time. The solution is to yet again fix the kernel and
>> put a "timestamp offset" on a socket.
> 
> One serious issue with this patch is that outgoing timestamp values
> will no longer correspond to tcp_time_stamp, so echoed timestamp
> values will also no longer have a meaningful relationship to
> tcp_time_stamp. That violates assumptions made in several places in
> the code, which assume that we can compare echoed timestamp values to
> tcp_time_stamp; for example, there are several places where we do
> things like subtracting:
>    tcp_time_stamp - tp->rx_opt.rcv_tsecr
> to find the estimated RTT for a segment.

Right, this change seems pretty bogus as-is.


* Re: [PATCH net-next] tcp: add ability to set a timestamp offset
  2013-02-11 15:50 Andrey Vagin
@ 2013-02-13 18:22 ` David Miller
  0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2013-02-13 18:22 UTC
  To: avagin
  Cc: netdev, criu, linux-kernel, kuznet, jmorris, yoshfuji, kaber,
	edumazet, xemul

From: Andrey Vagin <avagin@openvz.org>
Date: Mon, 11 Feb 2013 19:50:16 +0400

> If a TCP socket is live-migrated from one box to another, the
> timestamps (which are typically ON) get screwed up -- the new
> kernel generates TS values that have nothing to do with what they
> were at dump time. The solution is to yet again fix the kernel and
> put a "timestamp offset" on a socket.
> 
> The socket offset is applied in the places where the externally
> visible TCP timestamp option is parsed or initialized.
> 
> Connections in the SYN_RECV state are not supported; the global
> tcp_time_stamp is used for them, because repair mode doesn't support
> this state. In the future this can be implemented in a similar way as
> for TIME_WAIT sockets.
> 
> A time-wait socket inherits the offset from the corresponding tcp_sock.
> 
> A per-socket offset can be set only for sockets in repair mode.

Series applied, thanks.


* [PATCH net-next] tcp: add ability to set a timestamp offset
@ 2013-02-11 15:50 Andrey Vagin
  2013-02-13 18:22 ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Andrey Vagin @ 2013-02-11 15:50 UTC
  To: netdev
  Cc: criu, linux-kernel, Andrey Vagin, David S. Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, Eric Dumazet, Pavel Emelyanov

If a TCP socket is live-migrated from one box to another, the
timestamps (which are typically ON) get screwed up -- the new
kernel generates TS values that have nothing to do with what they
were at dump time. The solution is to yet again fix the kernel and
put a "timestamp offset" on a socket.

The socket offset is applied in the places where the externally
visible TCP timestamp option is parsed or initialized.

Connections in the SYN_RECV state are not supported; the global
tcp_time_stamp is used for them, because repair mode doesn't support
this state. In the future this can be implemented in a similar way as
for TIME_WAIT sockets.

A time-wait socket inherits the offset from the corresponding tcp_sock.

A per-socket offset can be set only for sockets in repair mode.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>

Andrey Vagin (3):
  tcp: adding a per-socket timestamp offset
  tcp: set and get per-socket timestamp
  tcp: send packets with a socket timestamp

 include/linux/tcp.h      |  3 +++
 include/uapi/linux/tcp.h |  1 +
 net/ipv4/tcp.c           | 11 +++++++++++
 net/ipv4/tcp_input.c     |  8 +++++++-
 net/ipv4/tcp_ipv4.c      | 12 +++++++-----
 net/ipv4/tcp_minisocks.c |  3 +++
 net/ipv4/tcp_output.c    |  4 ++--
 net/ipv6/tcp_ipv6.c      | 22 +++++++++++++---------
 8 files changed, 47 insertions(+), 17 deletions(-)
-- 
1.7.11.7
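
The cover letter above says the per-socket offset can be set only in
repair mode. A userspace sketch of how a checkpoint/restore tool might
use that, assuming the applied series keeps the TCP_TIMESTAMP option
name and value (24) from the first posting and that the caller has the
privileges TCP_REPAIR requires. The helper names are hypothetical, and
the fallback #defines are only for older libc headers.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

#ifndef TCP_REPAIR
#define TCP_REPAIR	19	/* value from the Linux uapi header */
#endif
#ifndef TCP_TIMESTAMP
#define TCP_TIMESTAMP	24	/* value from the first posting of the patch */
#endif

/* Dump side: read the timestamp value the socket would currently send.
 * Returns 0 on success, -1 on error (errno is set by getsockopt). */
static int dump_tcp_timestamp(int fd, uint32_t *tsval)
{
	socklen_t len = sizeof(*tsval);

	return getsockopt(fd, IPPROTO_TCP, TCP_TIMESTAMP, tsval, &len) < 0 ? -1 : 0;
}

/* Restore side: enter repair mode, then set the dumped value.
 * The socket is left in repair mode; the caller is expected to restore
 * any remaining state and switch repair mode off afterwards.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
static int restore_tcp_timestamp(int fd, uint32_t tsval)
{
	int on = 1;

	if (setsockopt(fd, IPPROTO_TCP, TCP_REPAIR, &on, sizeof(on)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_TIMESTAMP, &tsval, sizeof(tsval)) < 0)
		return -1;
	return 0;
}
```

A tool would call dump_tcp_timestamp() on the source host and pass the
result to restore_tcp_timestamp() on the destination. Time elapsed
between dump and restore is not accounted for in this sketch.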

