* [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request
@ 2014-06-25 14:09 Octavian Purdila
2014-06-25 14:09 ` [net-next v2 01/13] tcp: cookie_v4_init_sequence: skb should be const Octavian Purdila
` (13 more replies)
0 siblings, 14 replies; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:09 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
This patch series unifies the TCPv4 and TCPv6 connection request flow
in a single new function (tcp_conn_request).
The first 3 patches are small cleanups and fixes found during the code
merge process.
The next patches add new methods in tcp_request_sock_ops to abstract
the IPv4/IPv6 operations and keep the TCP connection request flow
common.
To identify potential performance issues this patch has been tested
by measuring the connection per second rate with nginx and a httperf
like client (to allow for concurrent connection requests - 256 CC were
used during testing) using the loopback interface. A dual-core i5 Ivy
Bridge processor was used and each process was bounded to a different
core to make results consistent.
Results for IPv4, unit is connections per second, higher is better, 20
measurements have been collected:
before after
min 27917 27962
max 28262 28366
avg 28094.1 28212.75
stdev 87.35 97.26
Results for IPv6, unit is connections per second, higher is better, 20
measurements have been collected:
before after
min 24813 24877
max 25029 25119
avg 24935.5 25017
stdev 64.13 62.93
Changes since v1:
* add benchmarking datapoints
* fix a few issues in the last patch (IPv6 related)
Octavian Purdila (13):
tcp: cookie_v4_init_sequence: skb should be const
tcp: tcp_v[46]_conn_request: fix snt_synack initialization
net: remove inet6_reqsk_alloc
tcp: add init_req method to tcp_request_sock_ops
tcp: add init_cookie_seq method to tcp_request_sock_ops
tcp: add route_req method to tcp_request_sock_ops
tcp: move around a few calls in tcp_v6_conn_request
tcp: add init_seq method to tcp_request_sock_ops
tcp: add send_synack method to tcp_request_sock_ops
tcp: unify tcp_v4_rtx_synack and tcp_v6_rtx_synack
tcp: add mss_clamp to tcp_request_sock_ops
tcp: add queue_add_hash to tcp_request_sock_ops
tcp: add tcp_conn_request
include/linux/ipv6.h | 10 ---
include/linux/tcp.h | 3 -
include/net/inet_sock.h | 6 +-
include/net/tcp.h | 54 +++++++++----
net/dccp/ipv6.c | 2 +-
net/ipv4/syncookies.c | 3 +-
net/ipv4/tcp_input.c | 148 ++++++++++++++++++++++++++++++++++
net/ipv4/tcp_ipv4.c | 188 ++++++++++---------------------------------
net/ipv4/tcp_output.c | 15 ++++
net/ipv6/syncookies.c | 2 +-
net/ipv6/tcp_ipv6.c | 206 +++++++++++++-----------------------------------
11 files changed, 306 insertions(+), 331 deletions(-)
--
1.8.3.2
^ permalink raw reply [flat|nested] 25+ messages in thread
* [net-next v2 01/13] tcp: cookie_v4_init_sequence: skb should be const
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
@ 2014-06-25 14:09 ` Octavian Purdila
2014-06-25 14:09 ` [net-next v2 02/13] tcp: tcp_v[46]_conn_request: fix snt_synack initialization Octavian Purdila
` (12 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:09 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
include/net/tcp.h | 5 +++--
net/ipv4/syncookies.c | 3 ++-
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 06a3023..39e47c4 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -493,10 +493,11 @@ static inline u32 tcp_cookie_time(void)
u32 __cookie_v4_init_sequence(const struct iphdr *iph, const struct tcphdr *th,
u16 *mssp);
-__u32 cookie_v4_init_sequence(struct sock *sk, struct sk_buff *skb, __u16 *mss);
+__u32 cookie_v4_init_sequence(struct sock *sk, const struct sk_buff *skb,
+ __u16 *mss);
#else
static inline __u32 cookie_v4_init_sequence(struct sock *sk,
- struct sk_buff *skb,
+ const struct sk_buff *skb,
__u16 *mss)
{
return 0;
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index c86624b..c0c7568 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -170,7 +170,8 @@ u32 __cookie_v4_init_sequence(const struct iphdr *iph, const struct tcphdr *th,
}
EXPORT_SYMBOL_GPL(__cookie_v4_init_sequence);
-__u32 cookie_v4_init_sequence(struct sock *sk, struct sk_buff *skb, __u16 *mssp)
+__u32 cookie_v4_init_sequence(struct sock *sk, const struct sk_buff *skb,
+ __u16 *mssp)
{
const struct iphdr *iph = ip_hdr(skb);
const struct tcphdr *th = tcp_hdr(skb);
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 02/13] tcp: tcp_v[46]_conn_request: fix snt_synack initialization
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
2014-06-25 14:09 ` [net-next v2 01/13] tcp: cookie_v4_init_sequence: skb should be const Octavian Purdila
@ 2014-06-25 14:09 ` Octavian Purdila
2014-06-28 2:16 ` Neal Cardwell
2014-06-25 14:09 ` [net-next v2 03/13] net: remove inet6_reqsk_alloc Octavian Purdila
` (11 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:09 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila, Cardwell, Daniel Lee, Yuchung Cheng
Commit 016818d07 (tcp: TCP Fast Open Server - take SYNACK RTT after
completing 3WHS) changes the code to only take a snt_synack timestamp
when a SYNACK transmit or retransmit succeeds. This behaviour is later
broken by commit 843f4a55e (tcp: use tcp_v4_send_synack on first
SYN-ACK), as snt_synack is now updated even if tcp_v4_send_synack
fails.
Also, commit 3a19ce0ee (tcp: IPv6 support for fastopen server) misses
the required IPv6 updates for 016818d07.
This patch makes sure that snt_synack is updated only when the SYNACK
trasnmit/retransmit succeeds, for both IPv4 and IPv6.
Cc: Cardwell <ncardwell@google.com>
Cc: Daniel Lee <longinus00@gmail.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
net/ipv4/tcp_ipv4.c | 2 --
net/ipv6/tcp_ipv6.c | 3 ++-
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 180336d..145f640 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1370,7 +1370,6 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
goto drop_and_free;
tcp_rsk(req)->snt_isn = isn;
- tcp_rsk(req)->snt_synack = tcp_time_stamp;
tcp_openreq_init_rwin(req, sk, dst);
fastopen = !want_cookie &&
tcp_try_fastopen(sk, skb, req, &foc, dst);
@@ -1380,7 +1379,6 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
if (err || want_cookie)
goto drop_and_free;
- tcp_rsk(req)->snt_synack = tcp_time_stamp;
tcp_rsk(req)->listener = NULL;
inet_csk_reqsk_queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 08ae3da..a962455 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -497,6 +497,8 @@ static int tcp_v6_send_synack(struct sock *sk, struct dst_entry *dst,
skb_set_queue_mapping(skb, queue_mapping);
err = ip6_xmit(sk, skb, fl6, np->opt, np->tclass);
err = net_xmit_eval(err);
+ if (!tcp_rsk(req)->snt_synack && !err)
+ tcp_rsk(req)->snt_synack = tcp_time_stamp;
}
done:
@@ -1100,7 +1102,6 @@ have_isn:
goto drop_and_free;
tcp_rsk(req)->snt_isn = isn;
- tcp_rsk(req)->snt_synack = tcp_time_stamp;
tcp_openreq_init_rwin(req, sk, dst);
fastopen = !want_cookie &&
tcp_try_fastopen(sk, skb, req, &foc, dst);
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 03/13] net: remove inet6_reqsk_alloc
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
2014-06-25 14:09 ` [net-next v2 01/13] tcp: cookie_v4_init_sequence: skb should be const Octavian Purdila
2014-06-25 14:09 ` [net-next v2 02/13] tcp: tcp_v[46]_conn_request: fix snt_synack initialization Octavian Purdila
@ 2014-06-25 14:09 ` Octavian Purdila
2014-06-25 17:59 ` Cong Wang
2014-06-25 14:09 ` [net-next v2 04/13] tcp: add init_req method to tcp_request_sock_ops Octavian Purdila
` (10 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:09 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Since pktops is only used for IPv6 only and opts is used for IPv4
only, we can move these fields into a union and this allows us to drop
the inet6_reqsk_alloc function as after this change it becomes
equivalent with inet_reqsk_alloc.
This patch also fixes a kmemcheck issue in the IPv6 stack: the flags
field was not annotated after a request_sock was allocated.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
include/linux/ipv6.h | 10 ----------
include/net/inet_sock.h | 6 ++++--
net/dccp/ipv6.c | 2 +-
net/ipv6/syncookies.c | 2 +-
net/ipv6/tcp_ipv6.c | 2 +-
5 files changed, 7 insertions(+), 15 deletions(-)
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 2faef33..c811300 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -256,16 +256,6 @@ static inline struct ipv6_pinfo * inet6_sk(const struct sock *__sk)
return inet_sk(__sk)->pinet6;
}
-static inline struct request_sock *inet6_reqsk_alloc(struct request_sock_ops *ops)
-{
- struct request_sock *req = reqsk_alloc(ops);
-
- if (req)
- inet_rsk(req)->pktopts = NULL;
-
- return req;
-}
-
static inline struct raw6_sock *raw6_sk(const struct sock *sk)
{
return (struct raw6_sock *)sk;
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index b1edf17..a829b77 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -88,8 +88,10 @@ struct inet_request_sock {
acked : 1,
no_srccheck: 1;
kmemcheck_bitfield_end(flags);
- struct ip_options_rcu *opt;
- struct sk_buff *pktopts;
+ union {
+ struct ip_options_rcu *opt;
+ struct sk_buff *pktopts;
+ };
u32 ir_mark;
};
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 4db3c2a..04cb17d 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -386,7 +386,7 @@ static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
if (sk_acceptq_is_full(sk) && inet_csk_reqsk_queue_young(sk) > 1)
goto drop;
- req = inet6_reqsk_alloc(&dccp6_request_sock_ops);
+ req = inet_reqsk_alloc(&dccp6_request_sock_ops);
if (req == NULL)
goto drop;
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index a822b88..83cea1d 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -187,7 +187,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
goto out;
ret = NULL;
- req = inet6_reqsk_alloc(&tcp6_request_sock_ops);
+ req = inet_reqsk_alloc(&tcp6_request_sock_ops);
if (!req)
goto out;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index a962455..5e2d7e6 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1010,7 +1010,7 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
goto drop;
}
- req = inet6_reqsk_alloc(&tcp6_request_sock_ops);
+ req = inet_reqsk_alloc(&tcp6_request_sock_ops);
if (req == NULL)
goto drop;
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 04/13] tcp: add init_req method to tcp_request_sock_ops
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
` (2 preceding siblings ...)
2014-06-25 14:09 ` [net-next v2 03/13] net: remove inet6_reqsk_alloc Octavian Purdila
@ 2014-06-25 14:09 ` Octavian Purdila
2014-06-25 14:09 ` [net-next v2 05/13] tcp: add init_cookie_seq " Octavian Purdila
` (9 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:09 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Move the specific IPv4/IPv6 intializations to a new method in
tcp_request_sock_ops in preparation for unifying tcp_v4_conn_request
and tcp_v6_conn_request.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
include/linux/tcp.h | 3 ---
include/net/tcp.h | 2 ++
net/ipv4/tcp_ipv4.c | 29 ++++++++++++++++------------
net/ipv6/tcp_ipv6.c | 55 +++++++++++++++++++++++++++++++----------------------
4 files changed, 51 insertions(+), 38 deletions(-)
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index a051321..fa5258f 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -111,10 +111,7 @@ struct tcp_request_sock_ops;
struct tcp_request_sock {
struct inet_request_sock req;
-#ifdef CONFIG_TCP_MD5SIG
- /* Only used by TCP MD5 Signature so far. */
const struct tcp_request_sock_ops *af_specific;
-#endif
struct sock *listener; /* needed for TFO */
u32 rcv_isn;
u32 snt_isn;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 39e47c4..7ad8ce2 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1613,6 +1613,8 @@ struct tcp_request_sock_ops {
const struct request_sock *req,
const struct sk_buff *skb);
#endif
+ void (*init_req)(struct request_sock *req, struct sock *sk,
+ struct sk_buff *skb);
};
int tcpv4_offload_init(void);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 145f640..f86a86b 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1237,6 +1237,17 @@ static bool tcp_v4_inbound_md5_hash(struct sock *sk, const struct sk_buff *skb)
#endif
+static void tcp_v4_init_req(struct request_sock *req, struct sock *sk,
+ struct sk_buff *skb)
+{
+ struct inet_request_sock *ireq = inet_rsk(req);
+
+ ireq->ir_loc_addr = ip_hdr(skb)->daddr;
+ ireq->ir_rmt_addr = ip_hdr(skb)->saddr;
+ ireq->no_srccheck = inet_sk(sk)->transparent;
+ ireq->opt = tcp_v4_save_options(skb);
+}
+
struct request_sock_ops tcp_request_sock_ops __read_mostly = {
.family = PF_INET,
.obj_size = sizeof(struct tcp_request_sock),
@@ -1247,26 +1258,26 @@ struct request_sock_ops tcp_request_sock_ops __read_mostly = {
.syn_ack_timeout = tcp_syn_ack_timeout,
};
-#ifdef CONFIG_TCP_MD5SIG
static const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
+#ifdef CONFIG_TCP_MD5SIG
.md5_lookup = tcp_v4_reqsk_md5_lookup,
.calc_md5_hash = tcp_v4_md5_hash_skb,
-};
#endif
+ .init_req = tcp_v4_init_req,
+};
int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
{
struct tcp_options_received tmp_opt;
struct request_sock *req;
- struct inet_request_sock *ireq;
struct tcp_sock *tp = tcp_sk(sk);
struct dst_entry *dst = NULL;
__be32 saddr = ip_hdr(skb)->saddr;
- __be32 daddr = ip_hdr(skb)->daddr;
__u32 isn = TCP_SKB_CB(skb)->when;
bool want_cookie = false, fastopen;
struct flowi4 fl4;
struct tcp_fastopen_cookie foc = { .len = -1 };
+ const struct tcp_request_sock_ops *af_ops;
int err;
/* Never answer to SYNs send to broadcast or multicast */
@@ -1298,9 +1309,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
if (!req)
goto drop;
-#ifdef CONFIG_TCP_MD5SIG
- tcp_rsk(req)->af_specific = &tcp_request_sock_ipv4_ops;
-#endif
+ af_ops = tcp_rsk(req)->af_specific = &tcp_request_sock_ipv4_ops;
tcp_clear_options(&tmp_opt);
tmp_opt.mss_clamp = TCP_MSS_DEFAULT;
@@ -1313,11 +1322,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
tmp_opt.tstamp_ok = tmp_opt.saw_tstamp;
tcp_openreq_init(req, &tmp_opt, skb, sk);
- ireq = inet_rsk(req);
- ireq->ir_loc_addr = daddr;
- ireq->ir_rmt_addr = saddr;
- ireq->no_srccheck = inet_sk(sk)->transparent;
- ireq->opt = tcp_v4_save_options(skb);
+ af_ops->init_req(req, sk, skb);
if (security_inet_conn_request(sk, skb, req))
goto drop_and_free;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 5e2d7e6..87a360c 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -720,6 +720,31 @@ static int tcp_v6_inbound_md5_hash(struct sock *sk, const struct sk_buff *skb)
}
#endif
+static void tcp_v6_init_req(struct request_sock *req, struct sock *sk,
+ struct sk_buff *skb)
+{
+ struct inet_request_sock *ireq = inet_rsk(req);
+ struct ipv6_pinfo *np = inet6_sk(sk);
+
+ ireq->ir_v6_rmt_addr = ipv6_hdr(skb)->saddr;
+ ireq->ir_v6_loc_addr = ipv6_hdr(skb)->daddr;
+
+ ireq->ir_iif = sk->sk_bound_dev_if;
+
+ /* So that link locals have meaning */
+ if (!sk->sk_bound_dev_if &&
+ ipv6_addr_type(&ireq->ir_v6_rmt_addr) & IPV6_ADDR_LINKLOCAL)
+ ireq->ir_iif = inet6_iif(skb);
+
+ if (!TCP_SKB_CB(skb)->when &&
+ (ipv6_opt_accepted(sk, skb) || np->rxopt.bits.rxinfo ||
+ np->rxopt.bits.rxoinfo || np->rxopt.bits.rxhlim ||
+ np->rxopt.bits.rxohlim || np->repflow)) {
+ atomic_inc(&skb->users);
+ ireq->pktopts = skb;
+ }
+}
+
struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
.family = AF_INET6,
.obj_size = sizeof(struct tcp6_request_sock),
@@ -730,12 +755,13 @@ struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
.syn_ack_timeout = tcp_syn_ack_timeout,
};
-#ifdef CONFIG_TCP_MD5SIG
static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
+#ifdef CONFIG_TCP_MD5SIG
.md5_lookup = tcp_v6_reqsk_md5_lookup,
.calc_md5_hash = tcp_v6_md5_hash_skb,
-};
#endif
+ .init_req = tcp_v6_init_req,
+};
static void tcp_v6_send_response(struct sk_buff *skb, u32 seq, u32 ack, u32 win,
u32 tsval, u32 tsecr, int oif,
@@ -983,13 +1009,13 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
struct tcp_options_received tmp_opt;
struct request_sock *req;
struct inet_request_sock *ireq;
- struct ipv6_pinfo *np = inet6_sk(sk);
struct tcp_sock *tp = tcp_sk(sk);
__u32 isn = TCP_SKB_CB(skb)->when;
struct dst_entry *dst = NULL;
struct tcp_fastopen_cookie foc = { .len = -1 };
bool want_cookie = false, fastopen;
struct flowi6 fl6;
+ const struct tcp_request_sock_ops *af_ops;
int err;
if (skb->protocol == htons(ETH_P_IP))
@@ -1014,9 +1040,7 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
if (req == NULL)
goto drop;
-#ifdef CONFIG_TCP_MD5SIG
- tcp_rsk(req)->af_specific = &tcp_request_sock_ipv6_ops;
-#endif
+ af_ops = tcp_rsk(req)->af_specific = &tcp_request_sock_ipv6_ops;
tcp_clear_options(&tmp_opt);
tmp_opt.mss_clamp = IPV6_MIN_MTU - sizeof(struct tcphdr) - sizeof(struct ipv6hdr);
@@ -1030,27 +1054,12 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
tcp_openreq_init(req, &tmp_opt, skb, sk);
ireq = inet_rsk(req);
- ireq->ir_v6_rmt_addr = ipv6_hdr(skb)->saddr;
- ireq->ir_v6_loc_addr = ipv6_hdr(skb)->daddr;
+ af_ops->init_req(req, sk, skb);
+
if (!want_cookie || tmp_opt.tstamp_ok)
TCP_ECN_create_request(req, skb, sock_net(sk));
- ireq->ir_iif = sk->sk_bound_dev_if;
-
- /* So that link locals have meaning */
- if (!sk->sk_bound_dev_if &&
- ipv6_addr_type(&ireq->ir_v6_rmt_addr) & IPV6_ADDR_LINKLOCAL)
- ireq->ir_iif = inet6_iif(skb);
-
if (!isn) {
- if (ipv6_opt_accepted(sk, skb) ||
- np->rxopt.bits.rxinfo || np->rxopt.bits.rxoinfo ||
- np->rxopt.bits.rxhlim || np->rxopt.bits.rxohlim ||
- np->repflow) {
- atomic_inc(&skb->users);
- ireq->pktopts = skb;
- }
-
if (want_cookie) {
isn = cookie_v6_init_sequence(sk, skb, &req->mss);
req->cookie_ts = tmp_opt.tstamp_ok;
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 05/13] tcp: add init_cookie_seq method to tcp_request_sock_ops
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
` (3 preceding siblings ...)
2014-06-25 14:09 ` [net-next v2 04/13] tcp: add init_req method to tcp_request_sock_ops Octavian Purdila
@ 2014-06-25 14:09 ` Octavian Purdila
2014-06-25 14:09 ` [net-next v2 06/13] tcp: add route_req " Octavian Purdila
` (8 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:09 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Move the specific IPv4/IPv6 cookie sequence initialization to a new
method in tcp_request_sock_ops in preparation for unifying
tcp_v4_conn_request and tcp_v6_conn_request.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
include/net/tcp.h | 34 ++++++++++++++++++++--------------
net/ipv4/tcp_ipv4.c | 5 ++++-
net/ipv6/tcp_ipv6.c | 5 ++++-
3 files changed, 28 insertions(+), 16 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7ad8ce2..086d00e 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -495,13 +495,6 @@ u32 __cookie_v4_init_sequence(const struct iphdr *iph, const struct tcphdr *th,
u16 *mssp);
__u32 cookie_v4_init_sequence(struct sock *sk, const struct sk_buff *skb,
__u16 *mss);
-#else
-static inline __u32 cookie_v4_init_sequence(struct sock *sk,
- const struct sk_buff *skb,
- __u16 *mss)
-{
- return 0;
-}
#endif
__u32 cookie_init_timestamp(struct request_sock *req);
@@ -517,13 +510,6 @@ u32 __cookie_v6_init_sequence(const struct ipv6hdr *iph,
const struct tcphdr *th, u16 *mssp);
__u32 cookie_v6_init_sequence(struct sock *sk, const struct sk_buff *skb,
__u16 *mss);
-#else
-static inline __u32 cookie_v6_init_sequence(struct sock *sk,
- struct sk_buff *skb,
- __u16 *mss)
-{
- return 0;
-}
#endif
/* tcp_output.c */
@@ -1615,8 +1601,28 @@ struct tcp_request_sock_ops {
#endif
void (*init_req)(struct request_sock *req, struct sock *sk,
struct sk_buff *skb);
+#ifdef CONFIG_SYN_COOKIES
+ __u32 (*cookie_init_seq)(struct sock *sk, const struct sk_buff *skb,
+ __u16 *mss);
+#endif
};
+#ifdef CONFIG_SYN_COOKIES
+static inline __u32 cookie_init_sequence(const struct tcp_request_sock_ops *ops,
+ struct sock *sk, struct sk_buff *skb,
+ __u16 *mss)
+{
+ return ops->cookie_init_seq(sk, skb, mss);
+}
+#else
+static inline __u32 cookie_init_sequence(const struct tcp_request_sock_ops *ops,
+ struct sock *sk, struct sk_buff *skb,
+ __u16 *mss)
+{
+ return 0;
+}
+#endif
+
int tcpv4_offload_init(void);
void tcp_v4_init(void);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f86a86b..8c69e44 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1264,6 +1264,9 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
.calc_md5_hash = tcp_v4_md5_hash_skb,
#endif
.init_req = tcp_v4_init_req,
+#ifdef CONFIG_SYN_COOKIES
+ .cookie_init_seq = cookie_v4_init_sequence,
+#endif
};
int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
@@ -1331,7 +1334,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
TCP_ECN_create_request(req, skb, sock_net(sk));
if (want_cookie) {
- isn = cookie_v4_init_sequence(sk, skb, &req->mss);
+ isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
req->cookie_ts = tmp_opt.tstamp_ok;
} else if (!isn) {
/* VJ's idea. We save last timestamp seen
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 87a360c..17710cf 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -761,6 +761,9 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
.calc_md5_hash = tcp_v6_md5_hash_skb,
#endif
.init_req = tcp_v6_init_req,
+#ifdef CONFIG_SYN_COOKIES
+ .cookie_init_seq = cookie_v6_init_sequence,
+#endif
};
static void tcp_v6_send_response(struct sk_buff *skb, u32 seq, u32 ack, u32 win,
@@ -1061,7 +1064,7 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
if (!isn) {
if (want_cookie) {
- isn = cookie_v6_init_sequence(sk, skb, &req->mss);
+ isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
req->cookie_ts = tmp_opt.tstamp_ok;
goto have_isn;
}
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 06/13] tcp: add route_req method to tcp_request_sock_ops
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
` (4 preceding siblings ...)
2014-06-25 14:09 ` [net-next v2 05/13] tcp: add init_cookie_seq " Octavian Purdila
@ 2014-06-25 14:09 ` Octavian Purdila
2014-06-25 14:09 ` [net-next v2 07/13] tcp: move around a few calls in tcp_v6_conn_request Octavian Purdila
` (7 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:09 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Create wrappers with same signature for the IPv4/IPv6 request routing
calls and use these wrappers (via route_req method from
tcp_request_sock_ops) in tcp_v4_conn_request and tcp_v6_conn_request
with the purpose of unifying the two functions in a later patch.
We can later drop the wrapper functions and modify inet_csk_route_req
and inet6_cks_route_req to use the same signature.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
include/net/tcp.h | 3 +++
net/ipv4/tcp_ipv4.c | 36 +++++++++++++++++++++++++++++-------
net/ipv6/tcp_ipv6.c | 26 ++++++++++++++++++++------
3 files changed, 52 insertions(+), 13 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 086d00e..59fcc59 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1605,6 +1605,9 @@ struct tcp_request_sock_ops {
__u32 (*cookie_init_seq)(struct sock *sk, const struct sk_buff *skb,
__u16 *mss);
#endif
+ struct dst_entry *(*route_req)(struct sock *sk, struct flowi *fl,
+ const struct request_sock *req,
+ bool *strict);
};
#ifdef CONFIG_SYN_COOKIES
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 8c69e44..54fbbd8 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1248,6 +1248,22 @@ static void tcp_v4_init_req(struct request_sock *req, struct sock *sk,
ireq->opt = tcp_v4_save_options(skb);
}
+static struct dst_entry *tcp_v4_route_req(struct sock *sk, struct flowi *fl,
+ const struct request_sock *req,
+ bool *strict)
+{
+ struct dst_entry *dst = inet_csk_route_req(sk, &fl->u.ip4, req);
+
+ if (strict) {
+ if (fl->u.ip4.daddr == inet_rsk(req)->ir_rmt_addr)
+ *strict = true;
+ else
+ *strict = false;
+ }
+
+ return dst;
+}
+
struct request_sock_ops tcp_request_sock_ops __read_mostly = {
.family = PF_INET,
.obj_size = sizeof(struct tcp_request_sock),
@@ -1267,6 +1283,7 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
#ifdef CONFIG_SYN_COOKIES
.cookie_init_seq = cookie_v4_init_sequence,
#endif
+ .route_req = tcp_v4_route_req,
};
int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
@@ -1346,11 +1363,13 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
* timewait bucket, so that all the necessary checks
* are made in the function processing timewait state.
*/
- if (tmp_opt.saw_tstamp &&
- tcp_death_row.sysctl_tw_recycle &&
- (dst = inet_csk_route_req(sk, &fl4, req)) != NULL &&
- fl4.daddr == saddr) {
- if (!tcp_peer_is_proven(req, dst, true)) {
+ if (tmp_opt.saw_tstamp && tcp_death_row.sysctl_tw_recycle) {
+ bool strict;
+
+ dst = af_ops->route_req(sk, (struct flowi *)&fl4, req,
+ &strict);
+ if (dst && strict &&
+ !tcp_peer_is_proven(req, dst, true)) {
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
goto drop_and_release;
}
@@ -1374,8 +1393,11 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
isn = tcp_v4_init_sequence(skb);
}
- if (!dst && (dst = inet_csk_route_req(sk, &fl4, req)) == NULL)
- goto drop_and_free;
+ if (!dst) {
+ dst = af_ops->route_req(sk, (struct flowi *)&fl4, req, NULL);
+ if (!dst)
+ goto drop_and_free;
+ }
tcp_rsk(req)->snt_isn = isn;
tcp_openreq_init_rwin(req, sk, dst);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 17710cf..d780d88 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -745,6 +745,16 @@ static void tcp_v6_init_req(struct request_sock *req, struct sock *sk,
}
}
+static struct dst_entry *tcp_v6_route_req(struct sock *sk, struct flowi *fl,
+ const struct request_sock *req,
+ bool *strict)
+{
+ if (strict)
+ *strict = true;
+ return inet6_csk_route_req(sk, &fl->u.ip6, req);
+}
+
+
struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
.family = AF_INET6,
.obj_size = sizeof(struct tcp6_request_sock),
@@ -764,6 +774,7 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
#ifdef CONFIG_SYN_COOKIES
.cookie_init_seq = cookie_v6_init_sequence,
#endif
+ .route_req = tcp_v6_route_req,
};
static void tcp_v6_send_response(struct sk_buff *skb, u32 seq, u32 ack, u32 win,
@@ -1078,10 +1089,10 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
* timewait bucket, so that all the necessary checks
* are made in the function processing timewait state.
*/
- if (tmp_opt.saw_tstamp &&
- tcp_death_row.sysctl_tw_recycle &&
- (dst = inet6_csk_route_req(sk, &fl6, req)) != NULL) {
- if (!tcp_peer_is_proven(req, dst, true)) {
+ if (tmp_opt.saw_tstamp && tcp_death_row.sysctl_tw_recycle) {
+ dst = af_ops->route_req(sk, (struct flowi *)&fl6, req,
+ NULL);
+ if (dst && !tcp_peer_is_proven(req, dst, true)) {
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
goto drop_and_release;
}
@@ -1110,8 +1121,11 @@ have_isn:
if (security_inet_conn_request(sk, skb, req))
goto drop_and_release;
- if (!dst && (dst = inet6_csk_route_req(sk, &fl6, req)) == NULL)
- goto drop_and_free;
+ if (!dst) {
+ dst = af_ops->route_req(sk, (struct flowi *)&fl6, req, NULL);
+ if (!dst)
+ goto drop_and_free;
+ }
tcp_rsk(req)->snt_isn = isn;
tcp_openreq_init_rwin(req, sk, dst);
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 07/13] tcp: move around a few calls in tcp_v6_conn_request
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
` (5 preceding siblings ...)
2014-06-25 14:09 ` [net-next v2 06/13] tcp: add route_req " Octavian Purdila
@ 2014-06-25 14:09 ` Octavian Purdila
2014-06-25 14:29 ` Paul Moore
2014-06-25 14:09 ` [net-next v2 08/13] tcp: add init_seq method to tcp_request_sock_ops Octavian Purdila
` (6 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:09 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Make the tcp_v6_conn_request calls flow similar with that of
tcp_v4_conn_request.
Note that want_cookie can be true only if isn is zero and that is why
we can move the if (want_cookie) block out of the if (!isn) block.
Moving security_inet_conn_request() has a couple of side effects:
missing inet_rsk(req)->ecn_ok update and the req->cookie_ts
update. However, neither SELinux nor Smack security hooks seems to
check them. This change should also avoid future different behaviour
for IPv4 and IPv6 in the security hooks.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
net/ipv6/tcp_ipv6.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index d780d88..91b8a2e 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1070,16 +1070,16 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
ireq = inet_rsk(req);
af_ops->init_req(req, sk, skb);
+ if (security_inet_conn_request(sk, skb, req))
+ goto drop_and_release;
+
if (!want_cookie || tmp_opt.tstamp_ok)
TCP_ECN_create_request(req, skb, sock_net(sk));
- if (!isn) {
- if (want_cookie) {
- isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
- req->cookie_ts = tmp_opt.tstamp_ok;
- goto have_isn;
- }
-
+ if (want_cookie) {
+ isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
+ req->cookie_ts = tmp_opt.tstamp_ok;
+ } else if (!isn) {
/* VJ's idea. We save last timestamp seen
* from the destination in peer table, when entering
* state TIME-WAIT, and check against it before
@@ -1116,10 +1116,6 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
isn = tcp_v6_init_sequence(skb);
}
-have_isn:
-
- if (security_inet_conn_request(sk, skb, req))
- goto drop_and_release;
if (!dst) {
dst = af_ops->route_req(sk, (struct flowi *)&fl6, req, NULL);
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 08/13] tcp: add init_seq method to tcp_request_sock_ops
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
` (6 preceding siblings ...)
2014-06-25 14:09 ` [net-next v2 07/13] tcp: move around a few calls in tcp_v6_conn_request Octavian Purdila
@ 2014-06-25 14:09 ` Octavian Purdila
2014-06-25 14:09 ` [net-next v2 09/13] tcp: add send_synack " Octavian Purdila
` (5 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:09 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
More work in preparation of unifying tcp_v4_conn_request and
tcp_v6_conn_request: indirect the init sequence calls via the
tcp_request_sock_ops.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
include/net/tcp.h | 1 +
net/ipv4/tcp_ipv4.c | 5 +++--
net/ipv6/tcp_ipv6.c | 3 ++-
3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 59fcc59..8cacf0d 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1608,6 +1608,7 @@ struct tcp_request_sock_ops {
struct dst_entry *(*route_req)(struct sock *sk, struct flowi *fl,
const struct request_sock *req,
bool *strict);
+ __u32 (*init_seq)(const struct sk_buff *skb);
};
#ifdef CONFIG_SYN_COOKIES
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 54fbbd8..4347826 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -99,7 +99,7 @@ static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
struct inet_hashinfo tcp_hashinfo;
EXPORT_SYMBOL(tcp_hashinfo);
-static inline __u32 tcp_v4_init_sequence(const struct sk_buff *skb)
+static __u32 tcp_v4_init_sequence(const struct sk_buff *skb)
{
return secure_tcp_sequence_number(ip_hdr(skb)->daddr,
ip_hdr(skb)->saddr,
@@ -1284,6 +1284,7 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
.cookie_init_seq = cookie_v4_init_sequence,
#endif
.route_req = tcp_v4_route_req,
+ .init_seq = tcp_v4_init_sequence,
};
int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
@@ -1391,7 +1392,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
goto drop_and_release;
}
- isn = tcp_v4_init_sequence(skb);
+ isn = af_ops->init_seq(skb);
}
if (!dst) {
dst = af_ops->route_req(sk, (struct flowi *)&fl4, req, NULL);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 91b8a2e..2fd886f 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -775,6 +775,7 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
.cookie_init_seq = cookie_v6_init_sequence,
#endif
.route_req = tcp_v6_route_req,
+ .init_seq = tcp_v6_init_sequence,
};
static void tcp_v6_send_response(struct sk_buff *skb, u32 seq, u32 ack, u32 win,
@@ -1114,7 +1115,7 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
goto drop_and_release;
}
- isn = tcp_v6_init_sequence(skb);
+ isn = af_ops->init_seq(skb);
}
if (!dst) {
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 09/13] tcp: add send_synack method to tcp_request_sock_ops
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
` (7 preceding siblings ...)
2014-06-25 14:09 ` [net-next v2 08/13] tcp: add init_seq method to tcp_request_sock_ops Octavian Purdila
@ 2014-06-25 14:09 ` Octavian Purdila
2014-06-25 14:09 ` [net-next v2 10/13] tcp: unify tcp_v4_rtx_synack and tcp_v6_rtx_synack Octavian Purdila
` (4 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:09 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Create a new tcp_request_sock_ops method to unify the IPv4/IPv6
signature for tcp_v[46]_send_synack. This allows us to later unify
tcp_v4_rtx_synack with tcp_v6_rtx_synack and tcp_v4_conn_request with
tcp_v4_conn_request.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
include/net/tcp.h | 3 +++
net/ipv4/tcp_ipv4.c | 9 ++++++---
net/ipv6/tcp_ipv6.c | 14 ++++++++------
3 files changed, 17 insertions(+), 9 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 8cacf0d..8c05c25 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1609,6 +1609,9 @@ struct tcp_request_sock_ops {
const struct request_sock *req,
bool *strict);
__u32 (*init_seq)(const struct sk_buff *skb);
+ int (*send_synack)(struct sock *sk, struct dst_entry *dst,
+ struct flowi *fl, struct request_sock *req,
+ u16 queue_mapping, struct tcp_fastopen_cookie *foc);
};
#ifdef CONFIG_SYN_COOKIES
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 4347826..b5945ac 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -814,6 +814,7 @@ static void tcp_v4_reqsk_send_ack(struct sock *sk, struct sk_buff *skb,
* socket.
*/
static int tcp_v4_send_synack(struct sock *sk, struct dst_entry *dst,
+ struct flowi *fl,
struct request_sock *req,
u16 queue_mapping,
struct tcp_fastopen_cookie *foc)
@@ -846,7 +847,8 @@ static int tcp_v4_send_synack(struct sock *sk, struct dst_entry *dst,
static int tcp_v4_rtx_synack(struct sock *sk, struct request_sock *req)
{
- int res = tcp_v4_send_synack(sk, NULL, req, 0, NULL);
+ const struct tcp_request_sock_ops *af_ops = tcp_rsk(req)->af_specific;
+ int res = af_ops->send_synack(sk, NULL, NULL, req, 0, NULL);
if (!res) {
TCP_INC_STATS_BH(sock_net(sk), TCP_MIB_RETRANSSEGS);
@@ -1285,6 +1287,7 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
#endif
.route_req = tcp_v4_route_req,
.init_seq = tcp_v4_init_sequence,
+ .send_synack = tcp_v4_send_synack,
};
int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
@@ -1404,8 +1407,8 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
tcp_openreq_init_rwin(req, sk, dst);
fastopen = !want_cookie &&
tcp_try_fastopen(sk, skb, req, &foc, dst);
- err = tcp_v4_send_synack(sk, dst, req,
- skb_get_queue_mapping(skb), &foc);
+ err = af_ops->send_synack(sk, dst, NULL, req,
+ skb_get_queue_mapping(skb), &foc);
if (!fastopen) {
if (err || want_cookie)
goto drop_and_free;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 2fd886f..210b610 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -470,13 +470,14 @@ out:
static int tcp_v6_send_synack(struct sock *sk, struct dst_entry *dst,
- struct flowi6 *fl6,
+ struct flowi *fl,
struct request_sock *req,
u16 queue_mapping,
struct tcp_fastopen_cookie *foc)
{
struct inet_request_sock *ireq = inet_rsk(req);
struct ipv6_pinfo *np = inet6_sk(sk);
+ struct flowi6 *fl6 = &fl->u.ip6;
struct sk_buff *skb;
int err = -ENOMEM;
@@ -507,10 +508,11 @@ done:
static int tcp_v6_rtx_synack(struct sock *sk, struct request_sock *req)
{
- struct flowi6 fl6;
+ const struct tcp_request_sock_ops *af_ops = tcp_rsk(req)->af_specific;
+ struct flowi fl;
int res;
- res = tcp_v6_send_synack(sk, NULL, &fl6, req, 0, NULL);
+ res = af_ops->send_synack(sk, NULL, &fl, req, 0, NULL);
if (!res) {
TCP_INC_STATS_BH(sock_net(sk), TCP_MIB_RETRANSSEGS);
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
@@ -754,7 +756,6 @@ static struct dst_entry *tcp_v6_route_req(struct sock *sk, struct flowi *fl,
return inet6_csk_route_req(sk, &fl->u.ip6, req);
}
-
struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
.family = AF_INET6,
.obj_size = sizeof(struct tcp6_request_sock),
@@ -776,6 +777,7 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
#endif
.route_req = tcp_v6_route_req,
.init_seq = tcp_v6_init_sequence,
+ .send_synack = tcp_v6_send_synack,
};
static void tcp_v6_send_response(struct sk_buff *skb, u32 seq, u32 ack, u32 win,
@@ -1128,8 +1130,8 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
tcp_openreq_init_rwin(req, sk, dst);
fastopen = !want_cookie &&
tcp_try_fastopen(sk, skb, req, &foc, dst);
- err = tcp_v6_send_synack(sk, dst, &fl6, req,
- skb_get_queue_mapping(skb), &foc);
+ err = af_ops->send_synack(sk, dst, (struct flowi *)&fl6, req,
+ skb_get_queue_mapping(skb), &foc);
if (!fastopen) {
if (err || want_cookie)
goto drop_and_free;
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 10/13] tcp: unify tcp_v4_rtx_synack and tcp_v6_rtx_synack
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
` (8 preceding siblings ...)
2014-06-25 14:09 ` [net-next v2 09/13] tcp: add send_synack " Octavian Purdila
@ 2014-06-25 14:09 ` Octavian Purdila
2014-06-25 14:10 ` [net-next v2 11/13] tcp: add mss_clamp to tcp_request_sock_ops Octavian Purdila
` (3 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:09 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
include/net/tcp.h | 2 ++
net/ipv4/tcp_ipv4.c | 14 +-------------
net/ipv4/tcp_output.c | 15 +++++++++++++++
net/ipv6/tcp_ipv6.c | 15 +--------------
4 files changed, 19 insertions(+), 27 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 8c05c25..8e9c28d 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1573,6 +1573,8 @@ int tcp4_proc_init(void);
void tcp4_proc_exit(void);
#endif
+int tcp_rtx_synack(struct sock *sk, struct request_sock *req);
+
/* TCP af-specific functions */
struct tcp_sock_af_ops {
#ifdef CONFIG_TCP_MD5SIG
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index b5945ac..597dd9d75 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -845,18 +845,6 @@ static int tcp_v4_send_synack(struct sock *sk, struct dst_entry *dst,
return err;
}
-static int tcp_v4_rtx_synack(struct sock *sk, struct request_sock *req)
-{
- const struct tcp_request_sock_ops *af_ops = tcp_rsk(req)->af_specific;
- int res = af_ops->send_synack(sk, NULL, NULL, req, 0, NULL);
-
- if (!res) {
- TCP_INC_STATS_BH(sock_net(sk), TCP_MIB_RETRANSSEGS);
- NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
- }
- return res;
-}
-
/*
* IPv4 request_sock destructor.
*/
@@ -1269,7 +1257,7 @@ static struct dst_entry *tcp_v4_route_req(struct sock *sk, struct flowi *fl,
struct request_sock_ops tcp_request_sock_ops __read_mostly = {
.family = PF_INET,
.obj_size = sizeof(struct tcp_request_sock),
- .rtx_syn_ack = tcp_v4_rtx_synack,
+ .rtx_syn_ack = tcp_rtx_synack,
.send_ack = tcp_v4_reqsk_send_ack,
.destructor = tcp_v4_reqsk_destructor,
.send_reset = tcp_v4_send_reset,
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index d92bce0..f8f2a94 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3299,3 +3299,18 @@ void tcp_send_probe0(struct sock *sk)
TCP_RTO_MAX);
}
}
+
+int tcp_rtx_synack(struct sock *sk, struct request_sock *req)
+{
+ const struct tcp_request_sock_ops *af_ops = tcp_rsk(req)->af_specific;
+ struct flowi fl;
+ int res;
+
+ res = af_ops->send_synack(sk, NULL, &fl, req, 0, NULL);
+ if (!res) {
+ TCP_INC_STATS_BH(sock_net(sk), TCP_MIB_RETRANSSEGS);
+ NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
+ }
+ return res;
+}
+EXPORT_SYMBOL(tcp_rtx_synack);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 210b610..41389bb 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -506,19 +506,6 @@ done:
return err;
}
-static int tcp_v6_rtx_synack(struct sock *sk, struct request_sock *req)
-{
- const struct tcp_request_sock_ops *af_ops = tcp_rsk(req)->af_specific;
- struct flowi fl;
- int res;
-
- res = af_ops->send_synack(sk, NULL, &fl, req, 0, NULL);
- if (!res) {
- TCP_INC_STATS_BH(sock_net(sk), TCP_MIB_RETRANSSEGS);
- NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
- }
- return res;
-}
static void tcp_v6_reqsk_destructor(struct request_sock *req)
{
@@ -759,7 +746,7 @@ static struct dst_entry *tcp_v6_route_req(struct sock *sk, struct flowi *fl,
struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
.family = AF_INET6,
.obj_size = sizeof(struct tcp6_request_sock),
- .rtx_syn_ack = tcp_v6_rtx_synack,
+ .rtx_syn_ack = tcp_rtx_synack,
.send_ack = tcp_v6_reqsk_send_ack,
.destructor = tcp_v6_reqsk_destructor,
.send_reset = tcp_v6_send_reset,
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 11/13] tcp: add mss_clamp to tcp_request_sock_ops
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
` (9 preceding siblings ...)
2014-06-25 14:09 ` [net-next v2 10/13] tcp: unify tcp_v4_rtx_synack and tcp_v6_rtx_synack Octavian Purdila
@ 2014-06-25 14:10 ` Octavian Purdila
2014-06-25 14:10 ` [net-next v2 12/13] tcp: add queue_add_hash " Octavian Purdila
` (2 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:10 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Add mss_clamp member to tcp_request_sock_ops so that we can later
unify tcp_v4_conn_request and tcp_v6_conn_request.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
include/net/tcp.h | 1 +
net/ipv4/tcp_ipv4.c | 3 ++-
net/ipv6/tcp_ipv6.c | 4 +++-
3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 8e9c28d..30fe98b 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1592,6 +1592,7 @@ struct tcp_sock_af_ops {
};
struct tcp_request_sock_ops {
+ u16 mss_clamp;
#ifdef CONFIG_TCP_MD5SIG
struct tcp_md5sig_key *(*md5_lookup) (struct sock *sk,
struct request_sock *req);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 597dd9d75..499d440 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1265,6 +1265,7 @@ struct request_sock_ops tcp_request_sock_ops __read_mostly = {
};
static const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
+ .mss_clamp = TCP_MSS_DEFAULT,
#ifdef CONFIG_TCP_MD5SIG
.md5_lookup = tcp_v4_reqsk_md5_lookup,
.calc_md5_hash = tcp_v4_md5_hash_skb,
@@ -1324,7 +1325,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
af_ops = tcp_rsk(req)->af_specific = &tcp_request_sock_ipv4_ops;
tcp_clear_options(&tmp_opt);
- tmp_opt.mss_clamp = TCP_MSS_DEFAULT;
+ tmp_opt.mss_clamp = af_ops->mss_clamp;
tmp_opt.user_mss = tp->rx_opt.user_mss;
tcp_parse_options(skb, &tmp_opt, 0, want_cookie ? NULL : &foc);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 41389bb..ad65833 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -754,6 +754,8 @@ struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
};
static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
+ .mss_clamp = IPV6_MIN_MTU - sizeof(struct tcphdr) -
+ sizeof(struct ipv6hdr),
#ifdef CONFIG_TCP_MD5SIG
.md5_lookup = tcp_v6_reqsk_md5_lookup,
.calc_md5_hash = tcp_v6_md5_hash_skb,
@@ -1047,7 +1049,7 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
af_ops = tcp_rsk(req)->af_specific = &tcp_request_sock_ipv6_ops;
tcp_clear_options(&tmp_opt);
- tmp_opt.mss_clamp = IPV6_MIN_MTU - sizeof(struct tcphdr) - sizeof(struct ipv6hdr);
+ tmp_opt.mss_clamp = af_ops->mss_clamp;
tmp_opt.user_mss = tp->rx_opt.user_mss;
tcp_parse_options(skb, &tmp_opt, 0, want_cookie ? NULL : &foc);
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 12/13] tcp: add queue_add_hash to tcp_request_sock_ops
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
` (10 preceding siblings ...)
2014-06-25 14:10 ` [net-next v2 11/13] tcp: add mss_clamp to tcp_request_sock_ops Octavian Purdila
@ 2014-06-25 14:10 ` Octavian Purdila
2014-06-25 19:05 ` Cong Wang
2014-06-25 14:10 ` [net-next v2 13/13] tcp: add tcp_conn_request Octavian Purdila
2014-06-27 22:54 ` [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request David Miller
13 siblings, 1 reply; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:10 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Add queue_add_hash member to tcp_request_sock_ops so that we can later
unify tcp_v4_conn_request and tcp_v6_conn_request.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
include/net/tcp.h | 2 ++
net/ipv4/tcp_ipv4.c | 3 ++-
net/ipv6/tcp_ipv6.c | 3 ++-
3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 30fe98b..cec6e2c 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1615,6 +1615,8 @@ struct tcp_request_sock_ops {
int (*send_synack)(struct sock *sk, struct dst_entry *dst,
struct flowi *fl, struct request_sock *req,
u16 queue_mapping, struct tcp_fastopen_cookie *foc);
+ void (*queue_hash_add)(struct sock *sk, struct request_sock *req,
+ const unsigned long timeout);
};
#ifdef CONFIG_SYN_COOKIES
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 499d440..845c39d 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1277,6 +1277,7 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
.route_req = tcp_v4_route_req,
.init_seq = tcp_v4_init_sequence,
.send_synack = tcp_v4_send_synack,
+ .queue_hash_add = inet_csk_reqsk_queue_hash_add,
};
int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
@@ -1403,7 +1404,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
goto drop_and_free;
tcp_rsk(req)->listener = NULL;
- inet_csk_reqsk_queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
+ af_ops->queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
}
return 0;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index ad65833..8232bc7 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -767,6 +767,7 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
.route_req = tcp_v6_route_req,
.init_seq = tcp_v6_init_sequence,
.send_synack = tcp_v6_send_synack,
+ .queue_hash_add = inet6_csk_reqsk_queue_hash_add,
};
static void tcp_v6_send_response(struct sk_buff *skb, u32 seq, u32 ack, u32 win,
@@ -1126,7 +1127,7 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
goto drop_and_free;
tcp_rsk(req)->listener = NULL;
- inet6_csk_reqsk_queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
+ af_ops->queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
}
return 0;
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [net-next v2 13/13] tcp: add tcp_conn_request
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
` (11 preceding siblings ...)
2014-06-25 14:10 ` [net-next v2 12/13] tcp: add queue_add_hash " Octavian Purdila
@ 2014-06-25 14:10 ` Octavian Purdila
2014-06-27 22:54 ` [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request David Miller
13 siblings, 0 replies; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 14:10 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Create tcp_conn_request and remove most of the code from
tcp_v4_conn_request and tcp_v6_conn_request.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
---
include/net/tcp.h | 3 ++
net/ipv4/tcp_input.c | 148 +++++++++++++++++++++++++++++++++++++++++++++++++++
net/ipv4/tcp_ipv4.c | 128 +-------------------------------------------
net/ipv6/tcp_ipv6.c | 120 +----------------------------------------
4 files changed, 155 insertions(+), 244 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index cec6e2c..0d5389ae 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1574,6 +1574,9 @@ void tcp4_proc_exit(void);
#endif
int tcp_rtx_synack(struct sock *sk, struct request_sock *req);
+int tcp_conn_request(struct request_sock_ops *rsk_ops,
+ const struct tcp_request_sock_ops *af_ops,
+ struct sock *sk, struct sk_buff *skb);
/* TCP af-specific functions */
struct tcp_sock_af_ops {
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 40661fc..401164e 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5877,3 +5877,151 @@ discard:
return 0;
}
EXPORT_SYMBOL(tcp_rcv_state_process);
+
+static inline void pr_drop_req(struct request_sock *req, __u16 port, int family)
+{
+ struct inet_request_sock *ireq = inet_rsk(req);
+
+ if (family == AF_INET)
+ LIMIT_NETDEBUG(KERN_DEBUG pr_fmt("drop open request from %pI4/%u\n"),
+ &ireq->ir_rmt_addr, port);
+ else
+ LIMIT_NETDEBUG(KERN_DEBUG pr_fmt("drop open request from %pI6/%u\n"),
+ &ireq->ir_v6_rmt_addr, port);
+}
+
+int tcp_conn_request(struct request_sock_ops *rsk_ops,
+ const struct tcp_request_sock_ops *af_ops,
+ struct sock *sk, struct sk_buff *skb)
+{
+ struct tcp_options_received tmp_opt;
+ struct request_sock *req;
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct dst_entry *dst = NULL;
+ __u32 isn = TCP_SKB_CB(skb)->when;
+ bool want_cookie = false, fastopen;
+ struct flowi fl;
+ struct tcp_fastopen_cookie foc = { .len = -1 };
+ int err;
+
+
+ /* TW buckets are converted to open requests without
+ * limitations, they conserve resources and peer is
+ * evidently real one.
+ */
+ if ((sysctl_tcp_syncookies == 2 ||
+ inet_csk_reqsk_queue_is_full(sk)) && !isn) {
+ want_cookie = tcp_syn_flood_action(sk, skb, rsk_ops->slab_name);
+ if (!want_cookie)
+ goto drop;
+ }
+
+
+ /* Accept backlog is full. If we have already queued enough
+ * of warm entries in syn queue, drop request. It is better than
+ * clogging syn queue with openreqs with exponentially increasing
+ * timeout.
+ */
+ if (sk_acceptq_is_full(sk) && inet_csk_reqsk_queue_young(sk) > 1) {
+ NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
+ goto drop;
+ }
+
+ req = inet_reqsk_alloc(rsk_ops);
+ if (!req)
+ goto drop;
+
+ tcp_rsk(req)->af_specific = af_ops;
+
+ tcp_clear_options(&tmp_opt);
+ tmp_opt.mss_clamp = af_ops->mss_clamp;
+ tmp_opt.user_mss = tp->rx_opt.user_mss;
+ tcp_parse_options(skb, &tmp_opt, 0, want_cookie ? NULL : &foc);
+
+ if (want_cookie && !tmp_opt.saw_tstamp)
+ tcp_clear_options(&tmp_opt);
+
+ tmp_opt.tstamp_ok = tmp_opt.saw_tstamp;
+ tcp_openreq_init(req, &tmp_opt, skb, sk);
+
+ af_ops->init_req(req, sk, skb);
+
+ if (security_inet_conn_request(sk, skb, req))
+ goto drop_and_free;
+
+ if (!want_cookie || tmp_opt.tstamp_ok)
+ TCP_ECN_create_request(req, skb, sock_net(sk));
+
+ if (want_cookie) {
+ isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
+ req->cookie_ts = tmp_opt.tstamp_ok;
+ } else if (!isn) {
+ /* VJ's idea. We save last timestamp seen
+ * from the destination in peer table, when entering
+ * state TIME-WAIT, and check against it before
+ * accepting new connection request.
+ *
+ * If "isn" is not zero, this request hit alive
+ * timewait bucket, so that all the necessary checks
+ * are made in the function processing timewait state.
+ */
+ if (tmp_opt.saw_tstamp && tcp_death_row.sysctl_tw_recycle) {
+ bool strict;
+
+ dst = af_ops->route_req(sk, &fl, req, &strict);
+ if (dst && strict &&
+ !tcp_peer_is_proven(req, dst, true)) {
+ NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
+ goto drop_and_release;
+ }
+ }
+ /* Kill the following clause, if you dislike this way. */
+ else if (!sysctl_tcp_syncookies &&
+ (sysctl_max_syn_backlog - inet_csk_reqsk_queue_len(sk) <
+ (sysctl_max_syn_backlog >> 2)) &&
+ !tcp_peer_is_proven(req, dst, false)) {
+ /* Without syncookies last quarter of
+ * backlog is filled with destinations,
+ * proven to be alive.
+ * It means that we continue to communicate
+ * to destinations, already remembered
+ * to the moment of synflood.
+ */
+ pr_drop_req(req, ntohs(tcp_hdr(skb)->source),
+ rsk_ops->family);
+ goto drop_and_release;
+ }
+
+ isn = af_ops->init_seq(skb);
+ }
+ if (!dst) {
+ dst = af_ops->route_req(sk, &fl, req, NULL);
+ if (!dst)
+ goto drop_and_free;
+ }
+
+ tcp_rsk(req)->snt_isn = isn;
+ tcp_openreq_init_rwin(req, sk, dst);
+ fastopen = !want_cookie &&
+ tcp_try_fastopen(sk, skb, req, &foc, dst);
+ err = af_ops->send_synack(sk, dst, &fl, req,
+ skb_get_queue_mapping(skb), &foc);
+ if (!fastopen) {
+ if (err || want_cookie)
+ goto drop_and_free;
+
+ tcp_rsk(req)->listener = NULL;
+ af_ops->queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
+ }
+
+ return 0;
+
+drop_and_release:
+ dst_release(dst);
+drop_and_free:
+ reqsk_free(req);
+drop:
+ NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENDROPS);
+ return 0;
+}
+EXPORT_SYMBOL(tcp_conn_request);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 845c39d..5dfebd2 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1282,137 +1282,13 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
{
- struct tcp_options_received tmp_opt;
- struct request_sock *req;
- struct tcp_sock *tp = tcp_sk(sk);
- struct dst_entry *dst = NULL;
- __be32 saddr = ip_hdr(skb)->saddr;
- __u32 isn = TCP_SKB_CB(skb)->when;
- bool want_cookie = false, fastopen;
- struct flowi4 fl4;
- struct tcp_fastopen_cookie foc = { .len = -1 };
- const struct tcp_request_sock_ops *af_ops;
- int err;
-
/* Never answer to SYNs send to broadcast or multicast */
if (skb_rtable(skb)->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))
goto drop;
- /* TW buckets are converted to open requests without
- * limitations, they conserve resources and peer is
- * evidently real one.
- */
- if ((sysctl_tcp_syncookies == 2 ||
- inet_csk_reqsk_queue_is_full(sk)) && !isn) {
- want_cookie = tcp_syn_flood_action(sk, skb, "TCP");
- if (!want_cookie)
- goto drop;
- }
-
- /* Accept backlog is full. If we have already queued enough
- * of warm entries in syn queue, drop request. It is better than
- * clogging syn queue with openreqs with exponentially increasing
- * timeout.
- */
- if (sk_acceptq_is_full(sk) && inet_csk_reqsk_queue_young(sk) > 1) {
- NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
- goto drop;
- }
-
- req = inet_reqsk_alloc(&tcp_request_sock_ops);
- if (!req)
- goto drop;
-
- af_ops = tcp_rsk(req)->af_specific = &tcp_request_sock_ipv4_ops;
-
- tcp_clear_options(&tmp_opt);
- tmp_opt.mss_clamp = af_ops->mss_clamp;
- tmp_opt.user_mss = tp->rx_opt.user_mss;
- tcp_parse_options(skb, &tmp_opt, 0, want_cookie ? NULL : &foc);
-
- if (want_cookie && !tmp_opt.saw_tstamp)
- tcp_clear_options(&tmp_opt);
-
- tmp_opt.tstamp_ok = tmp_opt.saw_tstamp;
- tcp_openreq_init(req, &tmp_opt, skb, sk);
-
- af_ops->init_req(req, sk, skb);
-
- if (security_inet_conn_request(sk, skb, req))
- goto drop_and_free;
+ return tcp_conn_request(&tcp_request_sock_ops,
+ &tcp_request_sock_ipv4_ops, sk, skb);
- if (!want_cookie || tmp_opt.tstamp_ok)
- TCP_ECN_create_request(req, skb, sock_net(sk));
-
- if (want_cookie) {
- isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
- req->cookie_ts = tmp_opt.tstamp_ok;
- } else if (!isn) {
- /* VJ's idea. We save last timestamp seen
- * from the destination in peer table, when entering
- * state TIME-WAIT, and check against it before
- * accepting new connection request.
- *
- * If "isn" is not zero, this request hit alive
- * timewait bucket, so that all the necessary checks
- * are made in the function processing timewait state.
- */
- if (tmp_opt.saw_tstamp && tcp_death_row.sysctl_tw_recycle) {
- bool strict;
-
- dst = af_ops->route_req(sk, (struct flowi *)&fl4, req,
- &strict);
- if (dst && strict &&
- !tcp_peer_is_proven(req, dst, true)) {
- NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
- goto drop_and_release;
- }
- }
- /* Kill the following clause, if you dislike this way. */
- else if (!sysctl_tcp_syncookies &&
- (sysctl_max_syn_backlog - inet_csk_reqsk_queue_len(sk) <
- (sysctl_max_syn_backlog >> 2)) &&
- !tcp_peer_is_proven(req, dst, false)) {
- /* Without syncookies last quarter of
- * backlog is filled with destinations,
- * proven to be alive.
- * It means that we continue to communicate
- * to destinations, already remembered
- * to the moment of synflood.
- */
- LIMIT_NETDEBUG(KERN_DEBUG pr_fmt("drop open request from %pI4/%u\n"),
- &saddr, ntohs(tcp_hdr(skb)->source));
- goto drop_and_release;
- }
-
- isn = af_ops->init_seq(skb);
- }
- if (!dst) {
- dst = af_ops->route_req(sk, (struct flowi *)&fl4, req, NULL);
- if (!dst)
- goto drop_and_free;
- }
-
- tcp_rsk(req)->snt_isn = isn;
- tcp_openreq_init_rwin(req, sk, dst);
- fastopen = !want_cookie &&
- tcp_try_fastopen(sk, skb, req, &foc, dst);
- err = af_ops->send_synack(sk, dst, NULL, req,
- skb_get_queue_mapping(skb), &foc);
- if (!fastopen) {
- if (err || want_cookie)
- goto drop_and_free;
-
- tcp_rsk(req)->listener = NULL;
- af_ops->queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
- }
-
- return 0;
-
-drop_and_release:
- dst_release(dst);
-drop_and_free:
- reqsk_free(req);
drop:
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENDROPS);
return 0;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 8232bc7..bc24ee2 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1008,133 +1008,17 @@ static struct sock *tcp_v6_hnd_req(struct sock *sk, struct sk_buff *skb)
return sk;
}
-/* FIXME: this is substantially similar to the ipv4 code.
- * Can some kind of merge be done? -- erics
- */
static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
{
- struct tcp_options_received tmp_opt;
- struct request_sock *req;
- struct inet_request_sock *ireq;
- struct tcp_sock *tp = tcp_sk(sk);
- __u32 isn = TCP_SKB_CB(skb)->when;
- struct dst_entry *dst = NULL;
- struct tcp_fastopen_cookie foc = { .len = -1 };
- bool want_cookie = false, fastopen;
- struct flowi6 fl6;
- const struct tcp_request_sock_ops *af_ops;
- int err;
-
if (skb->protocol == htons(ETH_P_IP))
return tcp_v4_conn_request(sk, skb);
if (!ipv6_unicast_destination(skb))
goto drop;
- if ((sysctl_tcp_syncookies == 2 ||
- inet_csk_reqsk_queue_is_full(sk)) && !isn) {
- want_cookie = tcp_syn_flood_action(sk, skb, "TCPv6");
- if (!want_cookie)
- goto drop;
- }
+ return tcp_conn_request(&tcp6_request_sock_ops,
+ &tcp_request_sock_ipv6_ops, sk, skb);
- if (sk_acceptq_is_full(sk) && inet_csk_reqsk_queue_young(sk) > 1) {
- NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
- goto drop;
- }
-
- req = inet_reqsk_alloc(&tcp6_request_sock_ops);
- if (req == NULL)
- goto drop;
-
- af_ops = tcp_rsk(req)->af_specific = &tcp_request_sock_ipv6_ops;
-
- tcp_clear_options(&tmp_opt);
- tmp_opt.mss_clamp = af_ops->mss_clamp;
- tmp_opt.user_mss = tp->rx_opt.user_mss;
- tcp_parse_options(skb, &tmp_opt, 0, want_cookie ? NULL : &foc);
-
- if (want_cookie && !tmp_opt.saw_tstamp)
- tcp_clear_options(&tmp_opt);
-
- tmp_opt.tstamp_ok = tmp_opt.saw_tstamp;
- tcp_openreq_init(req, &tmp_opt, skb, sk);
-
- ireq = inet_rsk(req);
- af_ops->init_req(req, sk, skb);
-
- if (security_inet_conn_request(sk, skb, req))
- goto drop_and_release;
-
- if (!want_cookie || tmp_opt.tstamp_ok)
- TCP_ECN_create_request(req, skb, sock_net(sk));
-
- if (want_cookie) {
- isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
- req->cookie_ts = tmp_opt.tstamp_ok;
- } else if (!isn) {
- /* VJ's idea. We save last timestamp seen
- * from the destination in peer table, when entering
- * state TIME-WAIT, and check against it before
- * accepting new connection request.
- *
- * If "isn" is not zero, this request hit alive
- * timewait bucket, so that all the necessary checks
- * are made in the function processing timewait state.
- */
- if (tmp_opt.saw_tstamp && tcp_death_row.sysctl_tw_recycle) {
- dst = af_ops->route_req(sk, (struct flowi *)&fl6, req,
- NULL);
- if (dst && !tcp_peer_is_proven(req, dst, true)) {
- NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
- goto drop_and_release;
- }
- }
- /* Kill the following clause, if you dislike this way. */
- else if (!sysctl_tcp_syncookies &&
- (sysctl_max_syn_backlog - inet_csk_reqsk_queue_len(sk) <
- (sysctl_max_syn_backlog >> 2)) &&
- !tcp_peer_is_proven(req, dst, false)) {
- /* Without syncookies last quarter of
- * backlog is filled with destinations,
- * proven to be alive.
- * It means that we continue to communicate
- * to destinations, already remembered
- * to the moment of synflood.
- */
- LIMIT_NETDEBUG(KERN_DEBUG "TCP: drop open request from %pI6/%u\n",
- &ireq->ir_v6_rmt_addr, ntohs(tcp_hdr(skb)->source));
- goto drop_and_release;
- }
-
- isn = af_ops->init_seq(skb);
- }
-
- if (!dst) {
- dst = af_ops->route_req(sk, (struct flowi *)&fl6, req, NULL);
- if (!dst)
- goto drop_and_free;
- }
-
- tcp_rsk(req)->snt_isn = isn;
- tcp_openreq_init_rwin(req, sk, dst);
- fastopen = !want_cookie &&
- tcp_try_fastopen(sk, skb, req, &foc, dst);
- err = af_ops->send_synack(sk, dst, (struct flowi *)&fl6, req,
- skb_get_queue_mapping(skb), &foc);
- if (!fastopen) {
- if (err || want_cookie)
- goto drop_and_free;
-
- tcp_rsk(req)->listener = NULL;
- af_ops->queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
- }
- return 0;
-
-drop_and_release:
- dst_release(dst);
-drop_and_free:
- reqsk_free(req);
drop:
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENDROPS);
return 0; /* don't send reset */
--
1.8.3.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [net-next v2 07/13] tcp: move around a few calls in tcp_v6_conn_request
2014-06-25 14:09 ` [net-next v2 07/13] tcp: move around a few calls in tcp_v6_conn_request Octavian Purdila
@ 2014-06-25 14:29 ` Paul Moore
0 siblings, 0 replies; 25+ messages in thread
From: Paul Moore @ 2014-06-25 14:29 UTC (permalink / raw)
To: Octavian Purdila, netdev; +Cc: linux-security-module
On Wednesday, June 25, 2014 05:09:56 PM Octavian Purdila wrote:
> Make the tcp_v6_conn_request calls flow similar with that of
> tcp_v4_conn_request.
>
> Note that want_cookie can be true only if isn is zero and that is why
> we can move the if (want_cookie) block out of the if (!isn) block.
>
> Moving security_inet_conn_request() has a couple of side effects:
> missing inet_rsk(req)->ecn_ok update and the req->cookie_ts
> update. However, neither SELinux nor Smack security hooks seems to
> check them. This change should also avoid future different behaviour
> for IPv4 and IPv6 in the security hooks.
>
> Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
> ---
> net/ipv6/tcp_ipv6.c | 18 +++++++-----------
> 1 file changed, 7 insertions(+), 11 deletions(-)
Please send changes to the LSM hooks to the LSM mailing list (CC'd) for
review/comment.
Mailing list issues aside, this looks fine to me; the loss of the ecn_ok and
cookie_ts field updates should not matter to the current LSMs, or any
reasonable future LSM, as the LSMs are more concerned about updating the LSM
context assigned to the request_sock in this particular hook.
Acked-by: Paul Moore <paul@paul-moore.com>
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index d780d88..91b8a2e 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -1070,16 +1070,16 @@ static int tcp_v6_conn_request(struct sock *sk,
> struct sk_buff *skb) ireq = inet_rsk(req);
> af_ops->init_req(req, sk, skb);
>
> + if (security_inet_conn_request(sk, skb, req))
> + goto drop_and_release;
> +
> if (!want_cookie || tmp_opt.tstamp_ok)
> TCP_ECN_create_request(req, skb, sock_net(sk));
>
> - if (!isn) {
> - if (want_cookie) {
> - isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
> - req->cookie_ts = tmp_opt.tstamp_ok;
> - goto have_isn;
> - }
> -
> + if (want_cookie) {
> + isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
> + req->cookie_ts = tmp_opt.tstamp_ok;
> + } else if (!isn) {
> /* VJ's idea. We save last timestamp seen
> * from the destination in peer table, when entering
> * state TIME-WAIT, and check against it before
> @@ -1116,10 +1116,6 @@ static int tcp_v6_conn_request(struct sock *sk,
> struct sk_buff *skb)
>
> isn = tcp_v6_init_sequence(skb);
> }
> -have_isn:
> -
> - if (security_inet_conn_request(sk, skb, req))
> - goto drop_and_release;
>
> if (!dst) {
> dst = af_ops->route_req(sk, (struct flowi *)&fl6, req, NULL);
--
paul moore
www.paul-moore.com
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [net-next v2 03/13] net: remove inet6_reqsk_alloc
2014-06-25 14:09 ` [net-next v2 03/13] net: remove inet6_reqsk_alloc Octavian Purdila
@ 2014-06-25 17:59 ` Cong Wang
2014-06-25 18:04 ` Octavian Purdila
0 siblings, 1 reply; 25+ messages in thread
From: Cong Wang @ 2014-06-25 17:59 UTC (permalink / raw)
To: Octavian Purdila; +Cc: netdev
On Wed, Jun 25, 2014 at 7:09 AM, Octavian Purdila
<octavian.purdila@intel.com> wrote:
> Since pktops is only used for IPv6 only and opts is used for IPv4
> only, we can move these fields into a union and this allows us to drop
> the inet6_reqsk_alloc function as after this change it becomes
> equivalent with inet_reqsk_alloc.
>
Or use kmem_cache_zalloc() so that inet_reqsk_alloc() doesn't have
to init ->opt to NULL explicitly.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [net-next v2 03/13] net: remove inet6_reqsk_alloc
2014-06-25 17:59 ` Cong Wang
@ 2014-06-25 18:04 ` Octavian Purdila
0 siblings, 0 replies; 25+ messages in thread
From: Octavian Purdila @ 2014-06-25 18:04 UTC (permalink / raw)
To: Cong Wang; +Cc: netdev
On Wed, Jun 25, 2014 at 8:59 PM, Cong Wang <cwang@twopensource.com> wrote:
> On Wed, Jun 25, 2014 at 7:09 AM, Octavian Purdila
> <octavian.purdila@intel.com> wrote:
>> Since pktops is only used for IPv6 only and opts is used for IPv4
>> only, we can move these fields into a union and this allows us to drop
>> the inet6_reqsk_alloc function as after this change it becomes
>> equivalent with inet_reqsk_alloc.
>>
>
> Or use kmem_cache_zalloc() so that inet_reqsk_alloc() doesn't have
> to init ->opt to NULL explicitly.
Wouldn't that be costly?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [net-next v2 12/13] tcp: add queue_add_hash to tcp_request_sock_ops
2014-06-25 14:10 ` [net-next v2 12/13] tcp: add queue_add_hash " Octavian Purdila
@ 2014-06-25 19:05 ` Cong Wang
0 siblings, 0 replies; 25+ messages in thread
From: Cong Wang @ 2014-06-25 19:05 UTC (permalink / raw)
To: Octavian Purdila; +Cc: netdev
On Wed, Jun 25, 2014 at 7:10 AM, Octavian Purdila
<octavian.purdila@intel.com> wrote:
> Add queue_add_hash member to tcp_request_sock_ops so that we can later
> unify tcp_v4_conn_request and tcp_v6_conn_request.
>
Alternatively, you can unify inet6_csk_reqsk_queue_hash_add() with
inet_csk_reqsk_queue_hash_add().
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
` (12 preceding siblings ...)
2014-06-25 14:10 ` [net-next v2 13/13] tcp: add tcp_conn_request Octavian Purdila
@ 2014-06-27 22:54 ` David Miller
13 siblings, 0 replies; 25+ messages in thread
From: David Miller @ 2014-06-27 22:54 UTC (permalink / raw)
To: octavian.purdila; +Cc: netdev
From: Octavian Purdila <octavian.purdila@intel.com>
Date: Wed, 25 Jun 2014 17:09:49 +0300
> This patch series unifies the TCPv4 and TCPv6 connection request flow
> in a single new function (tcp_conn_request).
Looks good, thanks for auditing the issue wrt. ipv4 mapped sockets.
Applied to net-next, thanks again.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [net-next v2 02/13] tcp: tcp_v[46]_conn_request: fix snt_synack initialization
2014-06-25 14:09 ` [net-next v2 02/13] tcp: tcp_v[46]_conn_request: fix snt_synack initialization Octavian Purdila
@ 2014-06-28 2:16 ` Neal Cardwell
2014-06-28 2:24 ` Neal Cardwell
2014-06-30 11:50 ` Octavian Purdila
0 siblings, 2 replies; 25+ messages in thread
From: Neal Cardwell @ 2014-06-28 2:16 UTC (permalink / raw)
To: Octavian Purdila; +Cc: Netdev, Daniel Lee, Yuchung Cheng, Jerry Chu
On Wed, Jun 25, 2014 at 10:09 AM, Octavian Purdila
<octavian.purdila@intel.com> wrote:
> Commit 016818d07 (tcp: TCP Fast Open Server - take SYNACK RTT after
> completing 3WHS) changes the code to only take a snt_synack timestamp
> when a SYNACK transmit or retransmit succeeds. This behaviour is later
> broken by commit 843f4a55e (tcp: use tcp_v4_send_synack on first
> SYN-ACK), as snt_synack is now updated even if tcp_v4_send_synack
> fails.
>
> Also, commit 3a19ce0ee (tcp: IPv6 support for fastopen server) misses
> the required IPv6 updates for 016818d07.
>
> This patch makes sure that snt_synack is updated only when the SYNACK
> trasnmit/retransmit succeeds, for both IPv4 and IPv6.
>
> Cc: Cardwell <ncardwell@google.com>
> Cc: Daniel Lee <longinus00@gmail.com>
> Cc: Yuchung Cheng <ycheng@google.com>
>
> Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 08ae3da..a962455 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -497,6 +497,8 @@ static int tcp_v6_send_synack(struct sock *sk, struct dst_entry *dst,
> skb_set_queue_mapping(skb, queue_mapping);
> err = ip6_xmit(sk, skb, fl6, np->opt, np->tclass);
> err = net_xmit_eval(err);
> + if (!tcp_rsk(req)->snt_synack && !err)
> + tcp_rsk(req)->snt_synack = tcp_time_stamp;
> }
Octavian, thanks for finding this issue!
I think we should resolve the inconsistency you have discovered by
converging on the approach in Yuchung's 843f4a55e (tcp: use
tcp_v4_send_synack on first SYN-ACK), which restores the original
behavior of snt_synack from the commit where it was added by Jerry:
9ad7c049f0f79 ("tcp: RFC2988bis + taking RTT sample from 3WHS for the
passive open side") in v3.1. In that commit tcp_v4_conn_request() and
tcp_v6_conn_request() unconditionally store the time at which the
server received the first client SYN in snt_synack:
tcp_rsk(req)->snt_synack = tcp_time_stamp;
At Google we have found that behavior very useful because it allows
extracting useful performance metrics summarizing how long it took to
establish a passive TCP connection.
So I would suggest that we fix this inconsistency you have noted by
removing the two lines in tcp_v4_send_synack() that touch snt_synack.
That is the approach we have been using at Google recently, and I
think it was an oversight that we didn't remove those two lines.
Thanks!
neal
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [net-next v2 02/13] tcp: tcp_v[46]_conn_request: fix snt_synack initialization
2014-06-28 2:16 ` Neal Cardwell
@ 2014-06-28 2:24 ` Neal Cardwell
2014-06-30 11:50 ` Octavian Purdila
1 sibling, 0 replies; 25+ messages in thread
From: Neal Cardwell @ 2014-06-28 2:24 UTC (permalink / raw)
To: Octavian Purdila; +Cc: Netdev, Daniel Lee, Yuchung Cheng, Jerry Chu
On Fri, Jun 27, 2014 at 10:16 PM, Neal Cardwell <ncardwell@google.com> wrote:
> So I would suggest that we fix this inconsistency you have noted by
> removing the two lines in tcp_v4_send_synack() that touch snt_synack.
> That is the approach we have been using at Google recently, and I
> think it was an oversight that we didn't remove those two lines.
I see David merged the patch series tonight, so since I came late to
the party I'm happy to make and test this change if there is consensus
that this is reasonable.
neal
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [net-next v2 02/13] tcp: tcp_v[46]_conn_request: fix snt_synack initialization
2014-06-28 2:16 ` Neal Cardwell
2014-06-28 2:24 ` Neal Cardwell
@ 2014-06-30 11:50 ` Octavian Purdila
2014-06-30 19:11 ` Neal Cardwell
1 sibling, 1 reply; 25+ messages in thread
From: Octavian Purdila @ 2014-06-30 11:50 UTC (permalink / raw)
To: Neal Cardwell; +Cc: Netdev, Daniel Lee, Yuchung Cheng, Jerry Chu
On Sat, Jun 28, 2014 at 5:16 AM, Neal Cardwell <ncardwell@google.com> wrote:
> On Wed, Jun 25, 2014 at 10:09 AM, Octavian Purdila
> <octavian.purdila@intel.com> wrote:
>> Commit 016818d07 (tcp: TCP Fast Open Server - take SYNACK RTT after
>> completing 3WHS) changes the code to only take a snt_synack timestamp
>> when a SYNACK transmit or retransmit succeeds. This behaviour is later
>> broken by commit 843f4a55e (tcp: use tcp_v4_send_synack on first
>> SYN-ACK), as snt_synack is now updated even if tcp_v4_send_synack
>> fails.
>>
>> Also, commit 3a19ce0ee (tcp: IPv6 support for fastopen server) misses
>> the required IPv6 updates for 016818d07.
>>
>> This patch makes sure that snt_synack is updated only when the SYNACK
>> trasnmit/retransmit succeeds, for both IPv4 and IPv6.
>>
>> Cc: Cardwell <ncardwell@google.com>
>> Cc: Daniel Lee <longinus00@gmail.com>
>> Cc: Yuchung Cheng <ycheng@google.com>
>>
>> Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
>
>> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
>> index 08ae3da..a962455 100644
>> --- a/net/ipv6/tcp_ipv6.c
>> +++ b/net/ipv6/tcp_ipv6.c
>> @@ -497,6 +497,8 @@ static int tcp_v6_send_synack(struct sock *sk, struct dst_entry *dst,
>> skb_set_queue_mapping(skb, queue_mapping);
>> err = ip6_xmit(sk, skb, fl6, np->opt, np->tclass);
>> err = net_xmit_eval(err);
>> + if (!tcp_rsk(req)->snt_synack && !err)
>> + tcp_rsk(req)->snt_synack = tcp_time_stamp;
>> }
>
> Octavian, thanks for finding this issue!
>
Hi Neal,
> I think we should resolve the inconsistency you have discovered by
> converging on the approach in Yuchung's 843f4a55e (tcp: use
> tcp_v4_send_synack on first SYN-ACK), which restores the original
> behavior of snt_synack from the commit where it was added by Jerry:
> 9ad7c049f0f79 ("tcp: RFC2988bis + taking RTT sample from 3WHS for the
> passive open side") in v3.1. In that commit tcp_v4_conn_request() and
> tcp_v6_conn_request() unconditionally store the time at which the
> server received the first client SYN in snt_synack:
>
> tcp_rsk(req)->snt_synack = tcp_time_stamp;
>
> At Google we have found that behavior very useful because it allows
> extracting useful performance metrics summarizing how long it took to
> establish a passive TCP connection.
In that case perhaps it is better to add a new field (or rename the
existing one if it is not needed anymore) to store the syn arrival
time? I think it is confusing to store the syn arrival time in the
"synack sent time" field.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [net-next v2 02/13] tcp: tcp_v[46]_conn_request: fix snt_synack initialization
2014-06-30 11:50 ` Octavian Purdila
@ 2014-06-30 19:11 ` Neal Cardwell
2014-06-30 21:00 ` Octavian Purdila
0 siblings, 1 reply; 25+ messages in thread
From: Neal Cardwell @ 2014-06-30 19:11 UTC (permalink / raw)
To: Octavian Purdila; +Cc: Netdev, Daniel Lee, Yuchung Cheng, Jerry Chu
On Mon, Jun 30, 2014 at 7:50 AM, Octavian Purdila
<octavian.purdila@intel.com> wrote:
> In that case perhaps it is better to add a new field (or rename the
> existing one if it is not needed anymore) to store the syn arrival
> time? I think it is confusing to store the syn arrival time in the
> "synack sent time" field.
That would be reasonable, but I think the longstanding "snt_synack"
name is also good, since the primary purpose of that field is to
measure the RTT of the SYNACK packet.
neal
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [net-next v2 02/13] tcp: tcp_v[46]_conn_request: fix snt_synack initialization
2014-06-30 19:11 ` Neal Cardwell
@ 2014-06-30 21:00 ` Octavian Purdila
2014-07-01 2:24 ` Neal Cardwell
0 siblings, 1 reply; 25+ messages in thread
From: Octavian Purdila @ 2014-06-30 21:00 UTC (permalink / raw)
To: Neal Cardwell; +Cc: Netdev, Daniel Lee, Yuchung Cheng, Jerry Chu
On Mon, Jun 30, 2014 at 10:11 PM, Neal Cardwell <ncardwell@google.com> wrote:
> On Mon, Jun 30, 2014 at 7:50 AM, Octavian Purdila
> <octavian.purdila@intel.com> wrote:
>> In that case perhaps it is better to add a new field (or rename the
>> existing one if it is not needed anymore) to store the syn arrival
>> time? I think it is confusing to store the syn arrival time in the
>> "synack sent time" field.
>
> That would be reasonable, but I think the longstanding "snt_synack"
> name is also good, since the primary purpose of that field is to
> measure the RTT of the SYNACK packet.
>
So, if we use it to measure the RTT, with this approach, wouldn't the
RTT estimate be artificially high if sending the syn-ack fails? And
wouldn't that negatively affect congestion control ?
On second thought, if we need to retransmit the syn-ack then it does
not matter much. Is this the reason we don't care?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [net-next v2 02/13] tcp: tcp_v[46]_conn_request: fix snt_synack initialization
2014-06-30 21:00 ` Octavian Purdila
@ 2014-07-01 2:24 ` Neal Cardwell
0 siblings, 0 replies; 25+ messages in thread
From: Neal Cardwell @ 2014-07-01 2:24 UTC (permalink / raw)
To: Octavian Purdila; +Cc: Netdev, Daniel Lee, Yuchung Cheng, Jerry Chu
On Mon, Jun 30, 2014 at 5:00 PM, Octavian Purdila
<octavian.purdila@intel.com> wrote:
> On Mon, Jun 30, 2014 at 10:11 PM, Neal Cardwell <ncardwell@google.com> wrote:
>> On Mon, Jun 30, 2014 at 7:50 AM, Octavian Purdila
>> <octavian.purdila@intel.com> wrote:
>>> In that case perhaps it is better to add a new field (or rename the
>>> existing one if it is not needed anymore) to store the syn arrival
>>> time? I think it is confusing to store the syn arrival time in the
>>> "synack sent time" field.
>>
>> That would be reasonable, but I think the longstanding "snt_synack"
>> name is also good, since the primary purpose of that field is to
>> measure the RTT of the SYNACK packet.
>>
>
> So, if we use it to measure the RTT, with this approach, wouldn't the
> RTT estimate be artificially high if sending the syn-ack fails? And
> wouldn't that negatively affect congestion control ?
>
> On second thought, if we need to retransmit the syn-ack then it does
> not matter much. Is this the reason we don't care?
tcp_synack_rtt_meas() already explicitly checks to see if the SYNACK
was retransmitted before using the time to calculate the SYNACK RTT.
So this kind of scenario should be OK.
neal
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2014-07-01 2:24 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-25 14:09 [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request Octavian Purdila
2014-06-25 14:09 ` [net-next v2 01/13] tcp: cookie_v4_init_sequence: skb should be const Octavian Purdila
2014-06-25 14:09 ` [net-next v2 02/13] tcp: tcp_v[46]_conn_request: fix snt_synack initialization Octavian Purdila
2014-06-28 2:16 ` Neal Cardwell
2014-06-28 2:24 ` Neal Cardwell
2014-06-30 11:50 ` Octavian Purdila
2014-06-30 19:11 ` Neal Cardwell
2014-06-30 21:00 ` Octavian Purdila
2014-07-01 2:24 ` Neal Cardwell
2014-06-25 14:09 ` [net-next v2 03/13] net: remove inet6_reqsk_alloc Octavian Purdila
2014-06-25 17:59 ` Cong Wang
2014-06-25 18:04 ` Octavian Purdila
2014-06-25 14:09 ` [net-next v2 04/13] tcp: add init_req method to tcp_request_sock_ops Octavian Purdila
2014-06-25 14:09 ` [net-next v2 05/13] tcp: add init_cookie_seq " Octavian Purdila
2014-06-25 14:09 ` [net-next v2 06/13] tcp: add route_req " Octavian Purdila
2014-06-25 14:09 ` [net-next v2 07/13] tcp: move around a few calls in tcp_v6_conn_request Octavian Purdila
2014-06-25 14:29 ` Paul Moore
2014-06-25 14:09 ` [net-next v2 08/13] tcp: add init_seq method to tcp_request_sock_ops Octavian Purdila
2014-06-25 14:09 ` [net-next v2 09/13] tcp: add send_synack " Octavian Purdila
2014-06-25 14:09 ` [net-next v2 10/13] tcp: unify tcp_v4_rtx_synack and tcp_v6_rtx_synack Octavian Purdila
2014-06-25 14:10 ` [net-next v2 11/13] tcp: add mss_clamp to tcp_request_sock_ops Octavian Purdila
2014-06-25 14:10 ` [net-next v2 12/13] tcp: add queue_add_hash " Octavian Purdila
2014-06-25 19:05 ` Cong Wang
2014-06-25 14:10 ` [net-next v2 13/13] tcp: add tcp_conn_request Octavian Purdila
2014-06-27 22:54 ` [net-next v2 00/13] remove code duplication in tcp_v[46]_conn_request David Miller
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.