All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/3] tcp: increase resilence vs. blind data injection
@ 2016-08-18 12:48 Florian Westphal
  2016-08-18 12:48 ` [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection Florian Westphal
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-18 12:48 UTC (permalink / raw)
  To: netdev


This series introduces a new tcp_timestamps=2 mode.

When enabled, out-of-order packets that carry data are also subject
to a range check of TSecr (timestamp echo).

Current kernels use a global timestamp echo, i.e.  the timestamp values
are known.

Therefore first patch adds per connection randomzation of timestamp.
The ISN generator is re-used for this purpose to get same offset
for all connections with identical connection quadruple.

Second patch adds the new timestamp mode, default is not changed.

Third patch adds a MIB counter to track when we drop skb because
of the new facility.

Syncookies are not yet converted to use tsoff.
Depending on how discussion about this set will turn out I will add
this for the v1 submit.

 Documentation/networking/ip-sysctl.txt |    8 ++++--
 include/linux/tcp.h                    |    1 
 include/net/secure_seq.h               |   13 ++++++---
 include/net/tcp.h                      |    2 -
 include/uapi/linux/snmp.h              |    1 
 net/core/secure_seq.c                  |   19 +++++++++-----
 net/ipv4/proc.c                        |    1 
 net/ipv4/syncookies.c                  |    1 
 net/ipv4/tcp_input.c                   |   43 ++++++++++++++++++++++++++++++++-
 net/ipv4/tcp_ipv4.c                    |   30 +++++++++++++++--------
 net/ipv4/tcp_minisocks.c               |    4 ++-
 net/ipv4/tcp_output.c                  |    2 -
 net/ipv6/syncookies.c                  |    1 
 net/ipv6/tcp_ipv6.c                    |   28 +++++++++++++--------
 14 files changed, 118 insertions(+), 36 deletions(-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-18 12:48 [RFC 0/3] tcp: increase resilence vs. blind data injection Florian Westphal
@ 2016-08-18 12:48 ` Florian Westphal
  2016-08-18 16:18   ` Eric Dumazet
                     ` (2 more replies)
  2016-08-18 12:48 ` [RFC 2/3] tcp: add tcp_timestamps=2 mode to force tsecr validation on ofo segments Florian Westphal
  2016-08-18 12:48 ` [RFC 3/3] tcp: add mib counter to track ts tsecr validation failures Florian Westphal
  2 siblings, 3 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-18 12:48 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
added the main infrastructure that is needed for per-connection
randomization, in particular writing/reading the on-wire tcp header
format takes the offset into account so rest of stack can use normal
tcp_time_stamp (jiffies).

So only two items are left:
 - add a tsoffset for request sockets
 - extend the tcp isn generator to also return another 32bit number
 in addition to the ISN.

Re-use of ISN generator also means timestamps are still monotonically
increasing for same connection quadruple.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/linux/tcp.h      |  1 +
 include/net/secure_seq.h | 13 +++++++++----
 include/net/tcp.h        |  2 +-
 net/core/secure_seq.c    | 19 +++++++++++++------
 net/ipv4/syncookies.c    |  1 +
 net/ipv4/tcp_input.c     |  7 ++++++-
 net/ipv4/tcp_ipv4.c      | 30 ++++++++++++++++++++----------
 net/ipv4/tcp_minisocks.c |  4 +++-
 net/ipv4/tcp_output.c    |  2 +-
 net/ipv6/syncookies.c    |  1 +
 net/ipv6/tcp_ipv6.c      | 28 ++++++++++++++++++----------
 11 files changed, 74 insertions(+), 34 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 7be9b12..edad1a1 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -122,6 +122,7 @@ struct tcp_request_sock {
 	u32				txhash;
 	u32				rcv_isn;
 	u32				snt_isn;
+	u32				ts_off;
 	u32				last_oow_ack_time; /* last SYNACK */
 	u32				rcv_nxt; /* the ack # by SYNACK. For
 						  * FastOpen it's the seq#
diff --git a/include/net/secure_seq.h b/include/net/secure_seq.h
index 3f36d45..3d576f7 100644
--- a/include/net/secure_seq.h
+++ b/include/net/secure_seq.h
@@ -3,13 +3,18 @@
 
 #include <linux/types.h>
 
+struct secure_tcp_seq {
+	u32 seq;
+	u32 tsoff;
+};
+
 u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport);
 u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
 			       __be16 dport);
-__u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
-				 __be16 sport, __be16 dport);
-__u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
-				   __be16 sport, __be16 dport);
+struct secure_tcp_seq secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
+						 __be16 sport, __be16 dport);
+struct secure_tcp_seq secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
+						    __be16 sport, __be16 dport);
 u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr,
 				__be16 sport, __be16 dport);
 u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr,
diff --git a/include/net/tcp.h b/include/net/tcp.h
index c00e7d5..5707361 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1749,7 +1749,7 @@ struct tcp_request_sock_ops {
 	struct dst_entry *(*route_req)(const struct sock *sk, struct flowi *fl,
 				       const struct request_sock *req,
 				       bool *strict);
-	__u32 (*init_seq)(const struct sk_buff *skb);
+	__u32 (*init_seq)(const struct sk_buff *skb, u32 *tsoff);
 	int (*send_synack)(const struct sock *sk, struct dst_entry *dst,
 			   struct flowi *fl, struct request_sock *req,
 			   struct tcp_fastopen_cookie *foc,
diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index fd3ce46..613f2fd 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -40,11 +40,13 @@ static u32 seq_scale(u32 seq)
 #endif
 
 #if IS_ENABLED(CONFIG_IPV6)
-__u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
-				   __be16 sport, __be16 dport)
+struct secure_tcp_seq
+secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
+			     __be16 sport, __be16 dport)
 {
 	u32 secret[MD5_MESSAGE_BYTES / 4];
 	u32 hash[MD5_DIGEST_WORDS];
+	struct secure_tcp_seq seq;
 	u32 i;
 
 	net_secret_init();
@@ -58,7 +60,9 @@ __u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
 
 	md5_transform(hash, secret);
 
-	return seq_scale(hash[0]);
+	seq.seq = seq_scale(hash[0]);
+	seq.tsoff = hash[1];
+	return seq;
 }
 EXPORT_SYMBOL(secure_tcpv6_sequence_number);
 
@@ -86,10 +90,11 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 
 #ifdef CONFIG_INET
 
-__u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
-				 __be16 sport, __be16 dport)
+struct secure_tcp_seq secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
+						 __be16 sport, __be16 dport)
 {
 	u32 hash[MD5_DIGEST_WORDS];
+	struct secure_tcp_seq seq;
 
 	net_secret_init();
 	hash[0] = (__force u32)saddr;
@@ -99,7 +104,9 @@ __u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
 
 	md5_transform(hash, net_secret);
 
-	return seq_scale(hash[0]);
+	seq.seq = seq_scale(hash[0]);
+	seq.tsoff = hash[1];
+	return seq;
 }
 
 u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport)
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index e3c4043..5db13f7 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -334,6 +334,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
 	treq = tcp_rsk(req);
 	treq->rcv_isn		= ntohl(th->seq) - 1;
 	treq->snt_isn		= cookie;
+	treq->ts_off		= 0;
 	req->mss		= mss;
 	ireq->ir_num		= ntohs(th->dest);
 	ireq->ir_rmt_port	= th->source;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 3ebf45b..092f2dd 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6249,6 +6249,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 		goto drop;
 
 	tcp_rsk(req)->af_specific = af_ops;
+	tcp_rsk(req)->ts_off = 0;
 
 	tcp_clear_options(&tmp_opt);
 	tmp_opt.mss_clamp = af_ops->mss_clamp;
@@ -6269,6 +6270,9 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	if (security_inet_conn_request(sk, skb, req))
 		goto drop_and_free;
 
+	if (isn && tmp_opt.tstamp_ok) /* retrans? recompute tsoff */
+		af_ops->init_seq(skb, &tcp_rsk(req)->ts_off);
+
 	if (!want_cookie && !isn) {
 		/* VJ's idea. We save last timestamp seen
 		 * from the destination in peer table, when entering
@@ -6309,7 +6313,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 			goto drop_and_release;
 		}
 
-		isn = af_ops->init_seq(skb);
+		isn = af_ops->init_seq(skb, &tcp_rsk(req)->ts_off);
 	}
 	if (!dst) {
 		dst = af_ops->route_req(sk, &fl, req, NULL);
@@ -6321,6 +6325,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 
 	if (want_cookie) {
 		isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
+		tcp_rsk(req)->ts_off = 0;
 		req->cookie_ts = tmp_opt.tstamp_ok;
 		if (!tmp_opt.tstamp_ok)
 			inet_rsk(req)->ecn_ok = 0;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 32b048e..47683c7 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -66,6 +66,7 @@
 #include <net/net_namespace.h>
 #include <net/icmp.h>
 #include <net/inet_hashtables.h>
+#include <net/secure_seq.h>
 #include <net/tcp.h>
 #include <net/transp_v6.h>
 #include <net/ipv6.h>
@@ -96,12 +97,16 @@ static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
 struct inet_hashinfo tcp_hashinfo;
 EXPORT_SYMBOL(tcp_hashinfo);
 
-static  __u32 tcp_v4_init_sequence(const struct sk_buff *skb)
+static u32 tcp_v4_init_sequence(const struct sk_buff *skb, u32 *tsoff)
 {
-	return secure_tcp_sequence_number(ip_hdr(skb)->daddr,
-					  ip_hdr(skb)->saddr,
-					  tcp_hdr(skb)->dest,
-					  tcp_hdr(skb)->source);
+	struct secure_tcp_seq s;
+
+	s = secure_tcp_sequence_number(ip_hdr(skb)->daddr,
+				       ip_hdr(skb)->saddr,
+				       tcp_hdr(skb)->dest,
+				       tcp_hdr(skb)->source);
+	*tsoff = s.tsoff;
+	return s.seq;
 }
 
 int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp)
@@ -234,11 +239,16 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	sk->sk_gso_type = SKB_GSO_TCPV4;
 	sk_setup_caps(sk, &rt->dst);
 
-	if (!tp->write_seq && likely(!tp->repair))
-		tp->write_seq = secure_tcp_sequence_number(inet->inet_saddr,
-							   inet->inet_daddr,
-							   inet->inet_sport,
-							   usin->sin_port);
+	if (!tp->write_seq && likely(!tp->repair)) {
+		struct secure_tcp_seq seq;
+
+		seq = secure_tcp_sequence_number(inet->inet_saddr,
+						 inet->inet_daddr,
+						 inet->inet_sport,
+						 usin->sin_port);
+		tp->write_seq = seq.seq;
+		tp->tsoffset = seq.tsoff;
+	}
 
 	inet->inet_id = tp->write_seq ^ jiffies;
 
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 4b95ec4..b43ca7e 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -530,7 +530,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 			newtp->rx_opt.ts_recent_stamp = 0;
 			newtp->tcp_header_len = sizeof(struct tcphdr);
 		}
-		newtp->tsoffset = 0;
+		newtp->tsoffset = treq->ts_off;
 #ifdef CONFIG_TCP_MD5SIG
 		newtp->md5sig_info = NULL;	/*XXX*/
 		if (newtp->af_specific->md5_lookup(sk, newsk))
@@ -579,6 +579,8 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 
 		if (tmp_opt.saw_tstamp) {
 			tmp_opt.ts_recent = req->ts_recent;
+			if (tmp_opt.rcv_tsecr)
+				tmp_opt.rcv_tsecr -= tcp_rsk(req)->ts_off;
 			/* We do not store true stamp, but it is not required,
 			 * it can be estimated (approximately)
 			 * from another data.
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index bdaef7f..e0e478e 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -640,7 +640,7 @@ static unsigned int tcp_synack_options(struct request_sock *req,
 	}
 	if (likely(ireq->tstamp_ok)) {
 		opts->options |= OPTION_TS;
-		opts->tsval = tcp_skb_timestamp(skb);
+		opts->tsval = tcp_skb_timestamp(skb) + tcp_rsk(req)->ts_off;
 		opts->tsecr = req->ts_recent;
 		remaining -= TCPOLEN_TSTAMP_ALIGNED;
 	}
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index 59c4839..241d40e 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -209,6 +209,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
 	treq->snt_synack.v64	= 0;
 	treq->rcv_isn = ntohl(th->seq) - 1;
 	treq->snt_isn = cookie;
+	treq->ts_off = 0;
 
 	/*
 	 * We need to lookup the dst_entry to get the correct window size.
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 33df8b8..5cf7aa9 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -101,12 +101,16 @@ static void inet6_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb)
 	}
 }
 
-static __u32 tcp_v6_init_sequence(const struct sk_buff *skb)
+static u32 tcp_v6_init_sequence(const struct sk_buff *skb, u32 *tsoff)
 {
-	return secure_tcpv6_sequence_number(ipv6_hdr(skb)->daddr.s6_addr32,
-					    ipv6_hdr(skb)->saddr.s6_addr32,
-					    tcp_hdr(skb)->dest,
-					    tcp_hdr(skb)->source);
+	struct secure_tcp_seq s;
+
+	s = secure_tcpv6_sequence_number(ipv6_hdr(skb)->daddr.s6_addr32,
+					 ipv6_hdr(skb)->saddr.s6_addr32,
+					 tcp_hdr(skb)->dest,
+					 tcp_hdr(skb)->source);
+	*tsoff = s.tsoff;
+	return s.seq;
 }
 
 static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
@@ -278,12 +282,16 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 
 	sk_set_txhash(sk);
 
-	if (!tp->write_seq && likely(!tp->repair))
-		tp->write_seq = secure_tcpv6_sequence_number(np->saddr.s6_addr32,
-							     sk->sk_v6_daddr.s6_addr32,
-							     inet->inet_sport,
-							     inet->inet_dport);
+	if (!tp->write_seq && likely(!tp->repair)) {
+		struct secure_tcp_seq seq;
 
+		seq = secure_tcpv6_sequence_number(np->saddr.s6_addr32,
+						   sk->sk_v6_daddr.s6_addr32,
+						   inet->inet_sport,
+						   inet->inet_dport);
+		tp->write_seq = seq.seq;
+		tp->tsoffset = seq.tsoff;
+	}
 	err = tcp_connect(sk);
 	if (err)
 		goto late_failure;
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC 2/3] tcp: add tcp_timestamps=2 mode to force tsecr validation on ofo segments
  2016-08-18 12:48 [RFC 0/3] tcp: increase resilence vs. blind data injection Florian Westphal
  2016-08-18 12:48 ` [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection Florian Westphal
@ 2016-08-18 12:48 ` Florian Westphal
  2016-08-18 12:48 ` [RFC 3/3] tcp: add mib counter to track ts tsecr validation failures Florian Westphal
  2 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-18 12:48 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

Adds new tcp_timestamps=2 mode.  If enabled, out-of-order segments
(within window but sequence number does not equal to rcv.next)
that carry data (i.e. pure acks are not tested) are also subject to a
check of tsecr, the timestamp echo value.

This make blind data injection more challenging if a connection is using
tcp timestamps as injected packet needs to match expected next sequence
number or carry a timestamp echo within an given range.

This approach could later be made more strict, f.e. this currently
uses tp->lsndtime which is only updated when we send data.

Because we only care about ofo segments, normal fastpath
(successful header prediction) is not changed. We also do not
send a challenge ack, as at least one packet is known to be in-flight
(or lost).

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 Documentation/networking/ip-sysctl.txt |  8 ++++++--
 net/ipv4/tcp_input.c                   | 35 ++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 9ae9293..1e360a3 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -609,8 +609,12 @@ tcp_syn_retries - INTEGER
 	with the current initial RTO of 1second. With this the final timeout
 	for an active TCP connection attempt will happen after 127seconds.
 
-tcp_timestamps - BOOLEAN
-	Enable timestamps as defined in RFC1323.
+tcp_timestamps - INTEGER
+	0: Disabled.
+	1: Enable timestamps as defined in RFC1323.
+	2: Like 1, but also drop segments that arrive out-of-order if
+	their echo timestamp (TSecr) is outside of an expected range.
+	Default: 1
 
 tcp_min_tso_segs - INTEGER
 	Minimal number of segments per TSO frame.
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 092f2dd..1234244 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -128,6 +128,8 @@ int sysctl_tcp_invalid_ratelimit __read_mostly = HZ/2;
 #define REXMIT_LOST	1 /* retransmit packets marked lost */
 #define REXMIT_NEW	2 /* FRTO-style transmit of unsent/new packets */
 
+#define TSECR_MAX_DELAY		TCP_RTO_MAX
+
 /* Adapt the MSS value used to make delayed ack decision to the
  * real world.
  */
@@ -5165,6 +5167,30 @@ static int tcp_copy_to_iovec(struct sock *sk, struct sk_buff *skb, int hlen)
 	return err;
 }
 
+static bool tcp_validate_tsecr(struct sock *sk, const struct tcphdr *th)
+{
+	const struct tcp_sock *tp = tcp_sk(sk);
+	const struct sk_buff *skb = NULL;
+	u32 ts_low;
+
+	if (sysctl_tcp_timestamps != 2 || !tp->rx_opt.tstamp_ok)
+		return true;
+
+	/* packet has no timestamp but connection is supposed to */
+	if (!tp->rx_opt.saw_tstamp)
+		return false;
+
+	ts_low = tp->retrans_stamp;
+	if (ts_low == 0) {
+		skb = tcp_write_queue_head(sk);
+		ts_low = skb ? tcp_skb_timestamp(skb) : tp->lsndtime;
+	}
+
+	ts_low -= TSECR_MAX_DELAY;
+
+	return between(tp->rx_opt.rcv_tsecr, ts_low, tcp_time_stamp);
+}
+
 /* Does PAWS and seqno based validation of an incoming segment, flags will
  * play significant role here.
  */
@@ -5255,6 +5281,15 @@ syn_challenge:
 		goto discard;
 	}
 
+	if (TCP_SKB_CB(skb)->seq == tp->rcv_nxt ||
+	    TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq)
+		return true;
+
+	/* last step: ofo and not pure ack: check tsecr */
+	if (!tcp_validate_tsecr(sk, th)) {
+		goto discard;
+	}
+
 	return true;
 
 discard:
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC 3/3] tcp: add mib counter to track ts tsecr validation failures
  2016-08-18 12:48 [RFC 0/3] tcp: increase resilence vs. blind data injection Florian Westphal
  2016-08-18 12:48 ` [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection Florian Westphal
  2016-08-18 12:48 ` [RFC 2/3] tcp: add tcp_timestamps=2 mode to force tsecr validation on ofo segments Florian Westphal
@ 2016-08-18 12:48 ` Florian Westphal
  2 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-18 12:48 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

useful to detect packet drop due to enabled tcp_timestamps=2 mode.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/uapi/linux/snmp.h | 1 +
 net/ipv4/proc.c           | 1 +
 net/ipv4/tcp_input.c      | 1 +
 3 files changed, 3 insertions(+)

diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index 25a9ad8..79916ca 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -280,6 +280,7 @@ enum
 	LINUX_MIB_TCPKEEPALIVE,			/* TCPKeepAlive */
 	LINUX_MIB_TCPMTUPFAIL,			/* TCPMTUPFail */
 	LINUX_MIB_TCPMTUPSUCCESS,		/* TCPMTUPSuccess */
+	LINUX_MIB_TCPTSECRFAIL,			/* TCPTSecrFail */
 	__LINUX_MIB_MAX
 };
 
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 9f665b6..e7667f8 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -302,6 +302,7 @@ static const struct snmp_mib snmp4_net_list[] = {
 	SNMP_MIB_ITEM("TCPKeepAlive", LINUX_MIB_TCPKEEPALIVE),
 	SNMP_MIB_ITEM("TCPMTUPFail", LINUX_MIB_TCPMTUPFAIL),
 	SNMP_MIB_ITEM("TCPMTUPSuccess", LINUX_MIB_TCPMTUPSUCCESS),
+	SNMP_MIB_ITEM("TCPTSecrFail", LINUX_MIB_TCPTSECRFAIL),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1234244..e114777 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5287,6 +5287,7 @@ syn_challenge:
 
 	/* last step: ofo and not pure ack: check tsecr */
 	if (!tcp_validate_tsecr(sk, th)) {
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPTSECRFAIL);
 		goto discard;
 	}
 
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-18 12:48 ` [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection Florian Westphal
@ 2016-08-18 16:18   ` Eric Dumazet
  2016-08-18 22:32     ` Florian Westphal
  2016-08-25  9:06     ` Florian Westphal
  2016-08-25 19:34   ` Eric Dumazet
  2016-08-25 22:06   ` Eric Dumazet
  2 siblings, 2 replies; 16+ messages in thread
From: Eric Dumazet @ 2016-08-18 16:18 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev

On Thu, 2016-08-18 at 14:48 +0200, Florian Westphal wrote:
> commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
> added the main infrastructure that is needed for per-connection
> randomization, in particular writing/reading the on-wire tcp header
> format takes the offset into account so rest of stack can use normal
> tcp_time_stamp (jiffies).
> 
> So only two items are left:
>  - add a tsoffset for request sockets
>  - extend the tcp isn generator to also return another 32bit number
>  in addition to the ISN.
> 
> Re-use of ISN generator also means timestamps are still monotonically
> increasing for same connection quadruple.

I like the idea, but the implementation looks a bit complex.

Instead of initializing tsoffset to 0, we could simply use

jhash(src_addr, dst_addr, boot_time_rnd)

This way, even syncookies would be handled, and we do not need to
increase tcp_request_sock size.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-18 16:18   ` Eric Dumazet
@ 2016-08-18 22:32     ` Florian Westphal
  2016-08-25  9:06     ` Florian Westphal
  1 sibling, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-18 22:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Florian Westphal, netdev

Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2016-08-18 at 14:48 +0200, Florian Westphal wrote:
> > commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
> > added the main infrastructure that is needed for per-connection
> > randomization, in particular writing/reading the on-wire tcp header
> > format takes the offset into account so rest of stack can use normal
> > tcp_time_stamp (jiffies).
> > 
> > So only two items are left:
> >  - add a tsoffset for request sockets
> >  - extend the tcp isn generator to also return another 32bit number
> >  in addition to the ISN.
> > 
> > Re-use of ISN generator also means timestamps are still monotonically
> > increasing for same connection quadruple.
> 
> I like the idea, but the implementation looks a bit complex.
> 
> Instead of initializing tsoffset to 0, we could simply use
> 
> jhash(src_addr, dst_addr, boot_time_rnd)
> 
> This way, even syncookies would be handled, and we do not need to
> increase tcp_request_sock size.

True, however I think it would be fairly easy to discover
boot_time_rnd given a few outputs, as jhash is not cryptograhic hash, no?

If thats not a concern I can just use jhash (not taking ports
into account doesn't seem to be a problem).

Alternatively (if tcp_request_sock increase/complexity is a problem)
I could either call the isn generator again, or add an extra function
for it (again using md5), I did not do this because I was afraid
it would be too expensive to do two md5 calculations.

Thanks for reviewing!

For cookies I had planned to just extend the cookie sha1 similar
to isn generator here, alternatives welcome.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-18 16:18   ` Eric Dumazet
  2016-08-18 22:32     ` Florian Westphal
@ 2016-08-25  9:06     ` Florian Westphal
  2016-08-25 14:15       ` Eric Dumazet
  1 sibling, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2016-08-25  9:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Florian Westphal, netdev

Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2016-08-18 at 14:48 +0200, Florian Westphal wrote:
> > commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
> > added the main infrastructure that is needed for per-connection
> > randomization, in particular writing/reading the on-wire tcp header
> > format takes the offset into account so rest of stack can use normal
> > tcp_time_stamp (jiffies).
> > 
> > So only two items are left:
> >  - add a tsoffset for request sockets
> >  - extend the tcp isn generator to also return another 32bit number
> >  in addition to the ISN.
> > 
> > Re-use of ISN generator also means timestamps are still monotonically
> > increasing for same connection quadruple.
> 
> I like the idea, but the implementation looks a bit complex.
> 
> Instead of initializing tsoffset to 0, we could simply use
> 
> jhash(src_addr, dst_addr, boot_time_rnd)
> 
> This way, even syncookies would be handled, and we do not need to
> increase tcp_request_sock size.

So I gave this a try and it does avoid this tcp_request_sock increase,
but I feel that getting boot_time_rnd is too easy.

I tried a few other ideas but nothing satisfying/simpler came out of it
(e.g. i tried to also hash the isn but that gets scaled w. current
 clock so it doesn't work).

Are you more concerned wrt. complexity or the reqsk increase?

One could use tfo boolean padding in the struct to avoid size increase
(1 bit tfo_listener, 31 for tsoff).

I would then split this patch in two (one to add tsoff to reqsk, one
to add the randomization).

The only other alternative I see is to eat 2nd md5_transform and
add a tso_offset function to secure_seq.c -- but I don't like that
either.

Any other idea?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-25  9:06     ` Florian Westphal
@ 2016-08-25 14:15       ` Eric Dumazet
  2016-08-25 14:49         ` Florian Westphal
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2016-08-25 14:15 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev

On Thu, 2016-08-25 at 11:06 +0200, Florian Westphal wrote:
> So I gave this a try and it does avoid this tcp_request_sock increase,
> but I feel that getting boot_time_rnd is too easy.

Fair enough, I didn't think very hard about it.

> 
> I tried a few other ideas but nothing satisfying/simpler came out of it
> (e.g. i tried to also hash the isn but that gets scaled w. current
>  clock so it doesn't work).
> 
> Are you more concerned wrt. complexity or the reqsk increase?
> 

No, a reqsk increase is really fine.

I guess I was simply worrying your work would make my future submission
more tricky.

Here at Google we have been using usec resolution in TCP timestamps for
a while for all our DC traffic, and we have to upstream this at some
point.

It would be nice if the randomization was optional, because having usec
timestamps with a common base (ie no per flow random base) helps a lot
to understand the host delays at transmit time, and receive time.

I will review your patch more in depth today, thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-25 14:15       ` Eric Dumazet
@ 2016-08-25 14:49         ` Florian Westphal
  2016-08-25 16:05           ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2016-08-25 14:49 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Florian Westphal, netdev

Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Here at Google we have been using usec resolution in TCP timestamps for
> a while for all our DC traffic, and we have to upstream this at some
> point.
> 
> It would be nice if the randomization was optional, because having usec
> timestamps with a common base (ie no per flow random base) helps a lot
> to understand the host delays at transmit time, and receive time.

Would it help to make it per-host instead of per-flow?

> I will review your patch more in depth today, thanks.

Great, thanks a lot!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-25 14:49         ` Florian Westphal
@ 2016-08-25 16:05           ` Eric Dumazet
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2016-08-25 16:05 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev

On Thu, 2016-08-25 at 16:49 +0200, Florian Westphal wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > Here at Google we have been using usec resolution in TCP timestamps for
> > a while for all our DC traffic, and we have to upstream this at some
> > point.
> > 
> > It would be nice if the randomization was optional, because having usec
> > timestamps with a common base (ie no per flow random base) helps a lot
> > to understand the host delays at transmit time, and receive time.
> 
> Would it help to make it per-host instead of per-flow?

Not really.

By looking at tcpdump, and TS val of xmit packets of multiple flows, we
can deduct the relative qdisc delays (think of fq pacing).
This should work even if we have one flow per remote peer.

> 
> > I will review your patch more in depth today, thanks.
> 
> Great, thanks a lot!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-18 12:48 ` [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection Florian Westphal
  2016-08-18 16:18   ` Eric Dumazet
@ 2016-08-25 19:34   ` Eric Dumazet
  2016-08-25 20:31     ` Florian Westphal
  2016-08-25 22:06   ` Eric Dumazet
  2 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2016-08-25 19:34 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev

On Thu, 2016-08-18 at 14:48 +0200, Florian Westphal wrote:
> commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
> added the main infrastructure that is needed for per-connection
> randomization, in particular writing/reading the on-wire tcp header
> format takes the offset into account so rest of stack can use normal
> tcp_time_stamp (jiffies).
> 
> So only two items are left:
>  - add a tsoffset for request sockets
>  - extend the tcp isn generator to also return another 32bit number
>  in addition to the ISN.
> 
> Re-use of ISN generator also means timestamps are still monotonically
> increasing for same connection quadruple.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  include/linux/tcp.h      |  1 +
>  include/net/secure_seq.h | 13 +++++++++----
>  include/net/tcp.h        |  2 +-
>  net/core/secure_seq.c    | 19 +++++++++++++------
>  net/ipv4/syncookies.c    |  1 +
>  net/ipv4/tcp_input.c     |  7 ++++++-
>  net/ipv4/tcp_ipv4.c      | 30 ++++++++++++++++++++----------
>  net/ipv4/tcp_minisocks.c |  4 +++-
>  net/ipv4/tcp_output.c    |  2 +-
>  net/ipv6/syncookies.c    |  1 +
>  net/ipv6/tcp_ipv6.c      | 28 ++++++++++++++++++----------
>  11 files changed, 74 insertions(+), 34 deletions(-)

It seems tcp_v4_reqsk_send_ack() and tcp_v6_reqsk_send_ack() were not
taken into account.

See commit 20a2b49fc5385 changelog
packetdrill test showing the possible issue if the TS sent on an ACK in
SYN_RECV state is wrong.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-25 19:34   ` Eric Dumazet
@ 2016-08-25 20:31     ` Florian Westphal
  2016-08-25 21:06       ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2016-08-25 20:31 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Florian Westphal, netdev

Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2016-08-18 at 14:48 +0200, Florian Westphal wrote:
> It seems tcp_v4_reqsk_send_ack() and tcp_v6_reqsk_send_ack() were not
> taken into account.
> 
> See commit 20a2b49fc5385 changelog
> packetdrill test showing the possible issue if the TS sent on an ACK in
> SYN_RECV state is wrong.

Thanks a lot Eric, I'll fix this.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-25 20:31     ` Florian Westphal
@ 2016-08-25 21:06       ` Eric Dumazet
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2016-08-25 21:06 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev

On Thu, 2016-08-25 at 22:31 +0200, Florian Westphal wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Thu, 2016-08-18 at 14:48 +0200, Florian Westphal wrote:
> > It seems tcp_v4_reqsk_send_ack() and tcp_v6_reqsk_send_ack() were not
> > taken into account.
> > 
> > See commit 20a2b49fc5385 changelog
> > packetdrill test showing the possible issue if the TS sent on an ACK in
> > SYN_RECV state is wrong.
> 
> Thanks a lot Eric, I'll fix this.

I would try :

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 47683c798f57..c2bf284239f6 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -826,7 +826,7 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
 
        tcp_v4_send_ack(sock_net(sk), skb, seq,
                        tcp_rsk(req)->rcv_nxt, req->rsk_rcv_wnd,
-                       tcp_time_stamp,
+                       tcp_time_stamp + tcp_rsk(req)->ts_off,
                        req->ts_recent,
                        0,
                        tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->daddr,
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index ce029c090f94..6411aa378b1b 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -955,7 +955,8 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
        tcp_v6_send_ack(sk, skb, (sk->sk_state == TCP_LISTEN) ?
                        tcp_rsk(req)->snt_isn + 1 : tcp_sk(sk)->snd_nxt,
                        tcp_rsk(req)->rcv_nxt, req->rsk_rcv_wnd,
-                       tcp_time_stamp, req->ts_recent, sk->sk_bound_dev_if,
+                       tcp_time_stamp + tcp_rsk(req)->ts_off,
+                       req->ts_recent, sk->sk_bound_dev_if,
                        tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->daddr),
                        0, 0);
 }

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-18 12:48 ` [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection Florian Westphal
  2016-08-18 16:18   ` Eric Dumazet
  2016-08-25 19:34   ` Eric Dumazet
@ 2016-08-25 22:06   ` Eric Dumazet
  2016-08-25 23:46     ` Florian Westphal
  2 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2016-08-25 22:06 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev

On Thu, 2016-08-18 at 14:48 +0200, Florian Westphal wrote:
> commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
> added the main infrastructure that is needed for per-connection
> randomization, in particular writing/reading the on-wire tcp header
> format takes the offset into account so rest of stack can use normal
> tcp_time_stamp (jiffies).

...

> +struct secure_tcp_seq {
> +	u32 seq;
> +	u32 tsoff;
> +};
> +

...

>  #if IS_ENABLED(CONFIG_IPV6)
> -__u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
> -				   __be16 sport, __be16 dport)
> +struct secure_tcp_seq
> +secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
> +			     __be16 sport, __be16 dport)
>  {
>  	u32 secret[MD5_MESSAGE_BYTES / 4];
>  	u32 hash[MD5_DIGEST_WORDS];
> +	struct secure_tcp_seq seq;
>  	u32 i;
>  
>  	net_secret_init();
> @@ -58,7 +60,9 @@ __u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
>  
>  	md5_transform(hash, secret);
>  
> -	return seq_scale(hash[0]);
> +	seq.seq = seq_scale(hash[0]);
> +	seq.tsoff = hash[1];
> +	return seq;
>  }


I am not a super fan of this "struct secure_tcp_seq" being returned by
functions. This adds unnecessary overhead.

I would instead add a "u32 *ts_off" parameter, as you already did for
tcp_v4_init_sequence()

Patch on top of yours :

 include/net/secure_seq.h |   13 ++++---------
 net/core/secure_seq.c    |   21 ++++++++-------------
 net/ipv4/tcp_ipv4.c      |   30 +++++++++++-------------------
 net/ipv6/tcp_ipv6.c      |   31 ++++++++++++-------------------
 4 files changed, 35 insertions(+), 60 deletions(-)

diff --git a/include/net/secure_seq.h b/include/net/secure_seq.h
index 3d576f74169c..5f1cd2cb7d5d 100644
--- a/include/net/secure_seq.h
+++ b/include/net/secure_seq.h
@@ -3,18 +3,13 @@
 
 #include <linux/types.h>
 
-struct secure_tcp_seq {
-	u32 seq;
-	u32 tsoff;
-};
-
 u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport);
 u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
 			       __be16 dport);
-struct secure_tcp_seq secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
-						 __be16 sport, __be16 dport);
-struct secure_tcp_seq secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
-						    __be16 sport, __be16 dport);
+u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
+			       __be16 sport, __be16 dport, u32 *tsoff);
+u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
+			         __be16 sport, __be16 dport, u32 *tsoff);
 u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr,
 				__be16 sport, __be16 dport);
 u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr,
diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 613f2fd406d4..a8d6062cbb4a 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -40,13 +40,11 @@ static u32 seq_scale(u32 seq)
 #endif
 
 #if IS_ENABLED(CONFIG_IPV6)
-struct secure_tcp_seq
-secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
-			     __be16 sport, __be16 dport)
+u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
+				 __be16 sport, __be16 dport, u32 *tsoff)
 {
 	u32 secret[MD5_MESSAGE_BYTES / 4];
 	u32 hash[MD5_DIGEST_WORDS];
-	struct secure_tcp_seq seq;
 	u32 i;
 
 	net_secret_init();
@@ -60,9 +58,8 @@ secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
 
 	md5_transform(hash, secret);
 
-	seq.seq = seq_scale(hash[0]);
-	seq.tsoff = hash[1];
-	return seq;
+	*tsoff = hash[1];
+	return seq_scale(hash[0]);
 }
 EXPORT_SYMBOL(secure_tcpv6_sequence_number);
 
@@ -90,11 +87,10 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 
 #ifdef CONFIG_INET
 
-struct secure_tcp_seq secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
-						 __be16 sport, __be16 dport)
+u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
+			       __be16 sport, __be16 dport, u32 *tsoff)
 {
 	u32 hash[MD5_DIGEST_WORDS];
-	struct secure_tcp_seq seq;
 
 	net_secret_init();
 	hash[0] = (__force u32)saddr;
@@ -104,9 +100,8 @@ struct secure_tcp_seq secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
 
 	md5_transform(hash, net_secret);
 
-	seq.seq = seq_scale(hash[0]);
-	seq.tsoff = hash[1];
-	return seq;
+	*tsoff = hash[1];
+	return seq_scale(hash[0]);
 }
 
 u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 47683c798f57..ebfdbb0e1698 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -99,14 +99,10 @@ EXPORT_SYMBOL(tcp_hashinfo);
 
 static u32 tcp_v4_init_sequence(const struct sk_buff *skb, u32 *tsoff)
 {
-	struct secure_tcp_seq s;
-
-	s = secure_tcp_sequence_number(ip_hdr(skb)->daddr,
-				       ip_hdr(skb)->saddr,
-				       tcp_hdr(skb)->dest,
-				       tcp_hdr(skb)->source);
-	*tsoff = s.tsoff;
-	return s.seq;
+	return secure_tcp_sequence_number(ip_hdr(skb)->daddr,
+					  ip_hdr(skb)->saddr,
+					  tcp_hdr(skb)->dest,
+					  tcp_hdr(skb)->source, tsoff);
 }
 
 int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp)
@@ -239,16 +235,12 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	sk->sk_gso_type = SKB_GSO_TCPV4;
 	sk_setup_caps(sk, &rt->dst);
 
-	if (!tp->write_seq && likely(!tp->repair)) {
-		struct secure_tcp_seq seq;
-
-		seq = secure_tcp_sequence_number(inet->inet_saddr,
-						 inet->inet_daddr,
-						 inet->inet_sport,
-						 usin->sin_port);
-		tp->write_seq = seq.seq;
-		tp->tsoffset = seq.tsoff;
-	}
+	if (!tp->write_seq && likely(!tp->repair))
+		tp->write_seq = secure_tcp_sequence_number(inet->inet_saddr,
+					inet->inet_daddr,
+					inet->inet_sport,
+					usin->sin_port,
+					&tp->tsoffset);
 
 	inet->inet_id = tp->write_seq ^ jiffies;
 
@@ -826,7 +818,7 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
 
 	tcp_v4_send_ack(sock_net(sk), skb, seq,
 			tcp_rsk(req)->rcv_nxt, req->rsk_rcv_wnd,
-			tcp_time_stamp,
+			tcp_time_stamp + tcp_rsk(req)->ts_off,
 			req->ts_recent,
 			0,
 			tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->daddr,
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index ce029c090f94..3b4033ad87a0 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -103,14 +103,10 @@ static void inet6_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb)
 
 static u32 tcp_v6_init_sequence(const struct sk_buff *skb, u32 *tsoff)
 {
-	struct secure_tcp_seq s;
-
-	s = secure_tcpv6_sequence_number(ipv6_hdr(skb)->daddr.s6_addr32,
-					 ipv6_hdr(skb)->saddr.s6_addr32,
-					 tcp_hdr(skb)->dest,
-					 tcp_hdr(skb)->source);
-	*tsoff = s.tsoff;
-	return s.seq;
+	return secure_tcpv6_sequence_number(ipv6_hdr(skb)->daddr.s6_addr32,
+					    ipv6_hdr(skb)->saddr.s6_addr32,
+					    tcp_hdr(skb)->dest,
+					    tcp_hdr(skb)->source, tsoff);
 }
 
 static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
@@ -282,16 +278,12 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 
 	sk_set_txhash(sk);
 
-	if (!tp->write_seq && likely(!tp->repair)) {
-		struct secure_tcp_seq seq;
-
-		seq = secure_tcpv6_sequence_number(np->saddr.s6_addr32,
-						   sk->sk_v6_daddr.s6_addr32,
-						   inet->inet_sport,
-						   inet->inet_dport);
-		tp->write_seq = seq.seq;
-		tp->tsoffset = seq.tsoff;
-	}
+	if (!tp->write_seq && likely(!tp->repair))
+		tp->write_seq = secure_tcpv6_sequence_number(np->saddr.s6_addr32,
+							    sk->sk_v6_daddr.s6_addr32,
+							    inet->inet_sport,
+							    inet->inet_dport,
+							    &tp->tsoffset);
 	err = tcp_connect(sk);
 	if (err)
 		goto late_failure;
@@ -955,7 +947,8 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
 	tcp_v6_send_ack(sk, skb, (sk->sk_state == TCP_LISTEN) ?
 			tcp_rsk(req)->snt_isn + 1 : tcp_sk(sk)->snd_nxt,
 			tcp_rsk(req)->rcv_nxt, req->rsk_rcv_wnd,
-			tcp_time_stamp, req->ts_recent, sk->sk_bound_dev_if,
+			tcp_time_stamp + tcp_rsk(req)->ts_off,
+			req->ts_recent, sk->sk_bound_dev_if,
 			tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->daddr),
 			0, 0);
 }

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-25 22:06   ` Eric Dumazet
@ 2016-08-25 23:46     ` Florian Westphal
  2016-08-26  2:34       ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2016-08-25 23:46 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Florian Westphal, netdev

Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2016-08-18 at 14:48 +0200, Florian Westphal wrote:
> > commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
> > added the main infrastructure that is needed for per-connection
> > randomization, in particular writing/reading the on-wire tcp header
> > format takes the offset into account so rest of stack can use normal
> > tcp_time_stamp (jiffies).

[..]

> > +secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
> > +			     __be16 sport, __be16 dport)
> >  {
> >  	u32 secret[MD5_MESSAGE_BYTES / 4];
> >  	u32 hash[MD5_DIGEST_WORDS];
> > +	struct secure_tcp_seq seq;
> >  	u32 i;
> >  
> >  	net_secret_init();
> > @@ -58,7 +60,9 @@ __u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
> >  
> >  	md5_transform(hash, secret);
> >  
> > -	return seq_scale(hash[0]);
> > +	seq.seq = seq_scale(hash[0]);
> > +	seq.tsoff = hash[1];
> > +	return seq;
> >  }
> 
> 
> I am not a super fan of this "struct secure_tcp_seq" being returned by
> functions. This adds unnecessary overhead.
>
> I would instead add a "u32 *ts_off" parameter, as you already did for
> tcp_v4_init_sequence()
>
> Patch on top of yours :
[..]

Looks great, I squashed it into my working branch.

Wrt. making randomization optional:

Would you go for another sysctl or should I just change
secure_tcpvX_sequence_number to check for tcp_timestamps == 2 mode?

	*tsoff  = sysctl_tcp_timestamps == 2 ? hash[1] : 0;

Could also use a static key but I don't think its worth it vs. md5 cost.
What do you think?

Thanks,
Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection
  2016-08-25 23:46     ` Florian Westphal
@ 2016-08-26  2:34       ` Eric Dumazet
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2016-08-26  2:34 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev

On Fri, 2016-08-26 at 01:46 +0200, Florian Westphal wrote:

> Wrt. making randomization optional:
> 
> Would you go for another sysctl or should I just change
> secure_tcpvX_sequence_number to check for tcp_timestamps == 2 mode?
> 
> 	*tsoff  = sysctl_tcp_timestamps == 2 ? hash[1] : 0;

I've not yet look at your "add tcp_timestamps=2 mode to force tsecr
validation on ofo segments" patch.
It looks a bit scary to me :(

I would rather have a separate sysctl, or maybe a per route attribute.

Note that this could be done later.

> 
> Could also use a static key but I don't think its worth it vs. md5 cost.
> What do you think?

A static key wont help a lot I think ;)

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-08-26  2:34 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-18 12:48 [RFC 0/3] tcp: increase resilence vs. blind data injection Florian Westphal
2016-08-18 12:48 ` [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection Florian Westphal
2016-08-18 16:18   ` Eric Dumazet
2016-08-18 22:32     ` Florian Westphal
2016-08-25  9:06     ` Florian Westphal
2016-08-25 14:15       ` Eric Dumazet
2016-08-25 14:49         ` Florian Westphal
2016-08-25 16:05           ` Eric Dumazet
2016-08-25 19:34   ` Eric Dumazet
2016-08-25 20:31     ` Florian Westphal
2016-08-25 21:06       ` Eric Dumazet
2016-08-25 22:06   ` Eric Dumazet
2016-08-25 23:46     ` Florian Westphal
2016-08-26  2:34       ` Eric Dumazet
2016-08-18 12:48 ` [RFC 2/3] tcp: add tcp_timestamps=2 mode to force tsecr validation on ofo segments Florian Westphal
2016-08-18 12:48 ` [RFC 3/3] tcp: add mib counter to track ts tsecr validation failures Florian Westphal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.