* [PATCH net 0/2] RDS: TCP: sychronization during connection startup
@ 2016-05-01 23:10 Sowmini Varadhan
2016-05-01 23:10 ` [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan
2016-05-01 23:10 ` [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock Sowmini Varadhan
0 siblings, 2 replies; 9+ messages in thread
From: Sowmini Varadhan @ 2016-05-01 23:10 UTC (permalink / raw)
To: sowmini.varadhan, netdev, rds-devel; +Cc: santosh.shilimkar, davem
This patch series ensures that the passive (accept) side of the
TCP connection used for RDS-TCP is correctly synchronized with
any concurrent active (connect) attempts for a given pair of peers.
Patch 1 in the series makes sure that the t_sock in struct
rds_tcp_connection is only reset after any threads in rds_tcp_xmit
have completed (otherwise a null-ptr deref may be encountered).
Patch 2 synchronizes rds_tcp_accept_one() with the rds_tcp*connect()
path.
Sowmini Varadhan (2):
RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when
resetting t_sock
RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock.
net/rds/tcp.c | 3 +-
net/rds/tcp.h | 4 +++
net/rds/tcp_connect.c | 8 +++++++
net/rds/tcp_listen.c | 54 ++++++++++++++++++++++++++++++++----------------
4 files changed, 50 insertions(+), 19 deletions(-)
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
2016-05-01 23:10 [PATCH net 0/2] RDS: TCP: sychronization during connection startup Sowmini Varadhan
@ 2016-05-01 23:10 ` Sowmini Varadhan
2016-05-02 16:20 ` Santosh Shilimkar
2016-05-01 23:10 ` [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock Sowmini Varadhan
1 sibling, 1 reply; 9+ messages in thread
From: Sowmini Varadhan @ 2016-05-01 23:10 UTC (permalink / raw)
To: sowmini.varadhan, netdev, rds-devel; +Cc: santosh.shilimkar, davem
There is a race condition between rds_send_xmit -> rds_tcp_xmit
and the code that deals with resolution of duelling syns added
by commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()").
Specifically, we may end up derefencing a null pointer in rds_send_xmit
if we have the interleaving sequencee:
rds_tcp_accept_one rds_send_xmit
conn is RDS_CONN_UP, so
invoke rds_tcp_xmit
tc = conn->c_transport_data
rds_tcp_restore_callbacks
/* reset t_sock */
null ptr deref from tc->t_sock
The race condition can be avoided without adding the overhead of
additional locking in the xmit path: have rds_tcp_accept_one wait
for rds_tcp_xmit threads to complete before resetting callbacks.
The synchronization can be done in the same manner as rds_conn_shutdown().
First set the rds_conn_state to something other than RDS_CONN_UP
(so that new threads cannot get into rds_tcp_xmit()), then wait for
RDS_IN_XMIT to be cleared in the conn->c_flags indicating that any
threads in rds_tcp_xmit are done.
Fixes: 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
---
net/rds/tcp.c | 2 +-
net/rds/tcp_listen.c | 40 ++++++++++++++++++++++++----------------
2 files changed, 25 insertions(+), 17 deletions(-)
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index 61ed2a8..9134544 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -127,7 +127,7 @@ void rds_tcp_restore_callbacks(struct socket *sock,
/*
* This is the only path that sets tc->t_sock. Send and receive trust that
- * it is set. The RDS_CONN_CONNECTED bit protects those paths from being
+ * it is set. The RDS_CONN_UP bit protects those paths from being
* called while it isn't set.
*/
void rds_tcp_set_callbacks(struct socket *sock, struct rds_connection *conn)
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 0936a4a..0896187 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -115,24 +115,32 @@ int rds_tcp_accept_one(struct socket *sock)
* rds_tcp_state_change() will do that cleanup
*/
rs_tcp = (struct rds_tcp_connection *)conn->c_transport_data;
- if (rs_tcp->t_sock &&
- ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) {
- struct sock *nsk = new_sock->sk;
-
- nsk->sk_user_data = NULL;
- nsk->sk_prot->disconnect(nsk, 0);
- tcp_done(nsk);
- new_sock = NULL;
- ret = 0;
- goto out;
- } else if (rs_tcp->t_sock) {
- rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp);
- conn->c_outgoing = 0;
- }
-
rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
+ if (rs_tcp->t_sock) {
+ /* Need to resolve a duelling SYN between peers.
+ * We have an outstanding SYN to this peer, which may
+ * potentially have transitioned to the RDS_CONN_UP state,
+ * so we must quiesce any send threads before resetting
+ * c_transport_data.
+ */
+ wait_event(conn->c_waitq,
+ !test_bit(RDS_IN_XMIT, &conn->c_flags));
+ if (ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) {
+ struct sock *nsk = new_sock->sk;
+
+ nsk->sk_user_data = NULL;
+ nsk->sk_prot->disconnect(nsk, 0);
+ tcp_done(nsk);
+ new_sock = NULL;
+ ret = 0;
+ goto out;
+ } else if (rs_tcp->t_sock) {
+ rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp);
+ conn->c_outgoing = 0;
+ }
+ }
rds_tcp_set_callbacks(new_sock, conn);
- rds_connect_complete(conn);
+ rds_connect_complete(conn); /* marks RDS_CONN_UP */
new_sock = NULL;
ret = 0;
--
1.7.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock.
2016-05-01 23:10 [PATCH net 0/2] RDS: TCP: sychronization during connection startup Sowmini Varadhan
2016-05-01 23:10 ` [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan
@ 2016-05-01 23:10 ` Sowmini Varadhan
2016-05-02 16:33 ` Santosh Shilimkar
1 sibling, 1 reply; 9+ messages in thread
From: Sowmini Varadhan @ 2016-05-01 23:10 UTC (permalink / raw)
To: sowmini.varadhan, netdev, rds-devel; +Cc: santosh.shilimkar, davem
An arbitration scheme for duelling SYNs is implemented as part of
commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()") which ensures that both nodes
involved will arrive at the same arbitration decision. However, this
needs to be synchronized with an outgoing SYN to be generated by
rds_tcp_conn_connect(). This commit achieves the synchronization
through the t_conn_lock mutex in struct rds_tcp_connection.
The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring
the t_conn_lock mutex. A SYN is sent out only if the RDS connection is
not already UP (an UP would indicate that rds_tcp_accept_one() has
completed 3WH, so no SYN needs to be generated).
Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after
acquiring the t_conn_lock mutex. The only acceptable states (to
allow continuation of the arbitration logic) are UP (i.e., outgoing SYN
was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent
outgoing SYN before we saw incoming SYN).
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
---
net/rds/tcp.c | 1 +
net/rds/tcp.h | 4 ++++
net/rds/tcp_connect.c | 8 ++++++++
net/rds/tcp_listen.c | 30 ++++++++++++++++++++----------
4 files changed, 33 insertions(+), 10 deletions(-)
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index 9134544..86187da 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -216,6 +216,7 @@ static int rds_tcp_conn_alloc(struct rds_connection *conn, gfp_t gfp)
if (!tc)
return -ENOMEM;
+ mutex_init(&tc->t_conn_lock);
tc->t_sock = NULL;
tc->t_tinc = NULL;
tc->t_tinc_hdr_rem = sizeof(struct rds_header);
diff --git a/net/rds/tcp.h b/net/rds/tcp.h
index 64f873c..41c2283 100644
--- a/net/rds/tcp.h
+++ b/net/rds/tcp.h
@@ -12,6 +12,10 @@ struct rds_tcp_connection {
struct list_head t_tcp_node;
struct rds_connection *conn;
+ /* t_conn_lock synchronizes the connection establishment between
+ * rds_tcp_accept_one and rds_tcp_conn_connect
+ */
+ struct mutex t_conn_lock;
struct socket *t_sock;
void *t_orig_write_space;
void *t_orig_data_ready;
diff --git a/net/rds/tcp_connect.c b/net/rds/tcp_connect.c
index 5cb1687..49a3fcf 100644
--- a/net/rds/tcp_connect.c
+++ b/net/rds/tcp_connect.c
@@ -78,7 +78,14 @@ int rds_tcp_conn_connect(struct rds_connection *conn)
struct socket *sock = NULL;
struct sockaddr_in src, dest;
int ret;
+ struct rds_tcp_connection *tc = conn->c_transport_data;
+
+ mutex_lock(&tc->t_conn_lock);
+ if (rds_conn_up(conn)) {
+ mutex_unlock(&tc->t_conn_lock);
+ return 0;
+ }
ret = sock_create_kern(rds_conn_net(conn), PF_INET,
SOCK_STREAM, IPPROTO_TCP, &sock);
if (ret < 0)
@@ -120,6 +127,7 @@ int rds_tcp_conn_connect(struct rds_connection *conn)
}
out:
+ mutex_unlock(&tc->t_conn_lock);
if (sock)
sock_release(sock);
return ret;
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 0896187..cc8496f 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -76,7 +76,9 @@ int rds_tcp_accept_one(struct socket *sock)
struct rds_connection *conn;
int ret;
struct inet_sock *inet;
- struct rds_tcp_connection *rs_tcp;
+ struct rds_tcp_connection *rs_tcp = NULL;
+ int conn_state;
+ struct sock *nsk;
ret = sock_create_kern(sock_net(sock->sk), sock->sk->sk_family,
sock->sk->sk_type, sock->sk->sk_protocol,
@@ -116,6 +118,10 @@ int rds_tcp_accept_one(struct socket *sock)
*/
rs_tcp = (struct rds_tcp_connection *)conn->c_transport_data;
rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
+ mutex_lock(&rs_tcp->t_conn_lock);
+ conn_state = rds_conn_state(conn);
+ if (conn_state != RDS_CONN_CONNECTING && conn_state != RDS_CONN_UP)
+ goto rst_nsk;
if (rs_tcp->t_sock) {
/* Need to resolve a duelling SYN between peers.
* We have an outstanding SYN to this peer, which may
@@ -126,14 +132,7 @@ int rds_tcp_accept_one(struct socket *sock)
wait_event(conn->c_waitq,
!test_bit(RDS_IN_XMIT, &conn->c_flags));
if (ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) {
- struct sock *nsk = new_sock->sk;
-
- nsk->sk_user_data = NULL;
- nsk->sk_prot->disconnect(nsk, 0);
- tcp_done(nsk);
- new_sock = NULL;
- ret = 0;
- goto out;
+ goto rst_nsk;
} else if (rs_tcp->t_sock) {
rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp);
conn->c_outgoing = 0;
@@ -143,8 +142,19 @@ int rds_tcp_accept_one(struct socket *sock)
rds_connect_complete(conn); /* marks RDS_CONN_UP */
new_sock = NULL;
ret = 0;
-
+ goto out;
+rst_nsk:
+ /* rest the newly returned accept sock and bail */
+ nsk = new_sock->sk;
+ rds_tcp_stats_inc(s_tcp_listen_closed_stale);
+ nsk->sk_user_data = NULL;
+ nsk->sk_prot->disconnect(nsk, 0);
+ tcp_done(nsk);
+ new_sock = NULL;
+ ret = 0;
out:
+ if (rs_tcp)
+ mutex_unlock(&rs_tcp->t_conn_lock);
if (new_sock)
sock_release(new_sock);
return ret;
--
1.7.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
2016-05-01 23:10 ` [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan
@ 2016-05-02 16:20 ` Santosh Shilimkar
2016-05-02 16:37 ` Sowmini Varadhan
0 siblings, 1 reply; 9+ messages in thread
From: Santosh Shilimkar @ 2016-05-02 16:20 UTC (permalink / raw)
To: Sowmini Varadhan, netdev, rds-devel; +Cc: davem
On 5/1/2016 4:10 PM, Sowmini Varadhan wrote:
> There is a race condition between rds_send_xmit -> rds_tcp_xmit
> and the code that deals with resolution of duelling syns added
> by commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
> outgoing socket in rds_tcp_accept_one()").
>
> Specifically, we may end up derefencing a null pointer in rds_send_xmit
> if we have the interleaving sequencee:
> rds_tcp_accept_one rds_send_xmit
>
> conn is RDS_CONN_UP, so
> invoke rds_tcp_xmit
>
> tc = conn->c_transport_data
> rds_tcp_restore_callbacks
> /* reset t_sock */
> null ptr deref from tc->t_sock
>
> The race condition can be avoided without adding the overhead of
> additional locking in the xmit path: have rds_tcp_accept_one wait
> for rds_tcp_xmit threads to complete before resetting callbacks.
> The synchronization can be done in the same manner as rds_conn_shutdown().
> First set the rds_conn_state to something other than RDS_CONN_UP
> (so that new threads cannot get into rds_tcp_xmit()), then wait for
> RDS_IN_XMIT to be cleared in the conn->c_flags indicating that any
> threads in rds_tcp_xmit are done.
>
> Fixes: 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
> outgoing socket in rds_tcp_accept_one()")
>
> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
> ---
Mostly looks correct. A question below.
> net/rds/tcp.c | 2 +-
> net/rds/tcp_listen.c | 40 ++++++++++++++++++++++++----------------
> 2 files changed, 25 insertions(+), 17 deletions(-)
>
> diff --git a/net/rds/tcp.c b/net/rds/tcp.c
> index 61ed2a8..9134544 100644
> --- a/net/rds/tcp.c
> +++ b/net/rds/tcp.c
> @@ -127,7 +127,7 @@ void rds_tcp_restore_callbacks(struct socket *sock,
>
> /*
> * This is the only path that sets tc->t_sock. Send and receive trust that
> - * it is set. The RDS_CONN_CONNECTED bit protects those paths from being
> + * it is set. The RDS_CONN_UP bit protects those paths from being
> * called while it isn't set.
> */
> void rds_tcp_set_callbacks(struct socket *sock, struct rds_connection *conn)
> diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
> index 0936a4a..0896187 100644
> --- a/net/rds/tcp_listen.c
> +++ b/net/rds/tcp_listen.c
> @@ -115,24 +115,32 @@ int rds_tcp_accept_one(struct socket *sock)
> * rds_tcp_state_change() will do that cleanup
> */
> rs_tcp = (struct rds_tcp_connection *)conn->c_transport_data;
> - if (rs_tcp->t_sock &&
> - ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) {
> - struct sock *nsk = new_sock->sk;
> -
> - nsk->sk_user_data = NULL;
> - nsk->sk_prot->disconnect(nsk, 0);
> - tcp_done(nsk);
> - new_sock = NULL;
> - ret = 0;
> - goto out;
> - } else if (rs_tcp->t_sock) {
> - rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp);
> - conn->c_outgoing = 0;
> - }
> -
> rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
> + if (rs_tcp->t_sock) {
> + /* Need to resolve a duelling SYN between peers.
> + * We have an outstanding SYN to this peer, which may
> + * potentially have transitioned to the RDS_CONN_UP state,
> + * so we must quiesce any send threads before resetting
> + * c_transport_data.
> + */
> + wait_event(conn->c_waitq,
> + !test_bit(RDS_IN_XMIT, &conn->c_flags));
Would it be good to check the return value of rds_conn_transition()
since if CONN is already UP above will fail and then send message
might again race and we will let message through even though passive
hasn't finished its connection.
> + if (ntohl(inet->inet_saddr) < ntohl(inet->inet_daddr)) {
> + struct sock *nsk = new_sock->sk;
> +
> + nsk->sk_user_data = NULL;
> + nsk->sk_prot->disconnect(nsk, 0);
> + tcp_done(nsk);
> + new_sock = NULL;
> + ret = 0;
> + goto out;
> + } else if (rs_tcp->t_sock) {
> + rds_tcp_restore_callbacks(rs_tcp->t_sock, rs_tcp);
> + conn->c_outgoing = 0;
> + }
> + }
> rds_tcp_set_callbacks(new_sock, conn);
> - rds_connect_complete(conn);
> + rds_connect_complete(conn); /* marks RDS_CONN_UP */
> new_sock = NULL;
> ret = 0;
>
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock.
2016-05-01 23:10 ` [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock Sowmini Varadhan
@ 2016-05-02 16:33 ` Santosh Shilimkar
2016-05-02 16:43 ` Sowmini Varadhan
0 siblings, 1 reply; 9+ messages in thread
From: Santosh Shilimkar @ 2016-05-02 16:33 UTC (permalink / raw)
To: Sowmini Varadhan, netdev, rds-devel; +Cc: davem
On 5/1/2016 4:10 PM, Sowmini Varadhan wrote:
> An arbitration scheme for duelling SYNs is implemented as part of
> commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
> outgoing socket in rds_tcp_accept_one()") which ensures that both nodes
> involved will arrive at the same arbitration decision. However, this
> needs to be synchronized with an outgoing SYN to be generated by
> rds_tcp_conn_connect(). This commit achieves the synchronization
> through the t_conn_lock mutex in struct rds_tcp_connection.
>
> The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring
> the t_conn_lock mutex. A SYN is sent out only if the RDS connection is
> not already UP (an UP would indicate that rds_tcp_accept_one() has
> completed 3WH, so no SYN needs to be generated).
>
> Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after
> acquiring the t_conn_lock mutex. The only acceptable states (to
> allow continuation of the arbitration logic) are UP (i.e., outgoing SYN
> was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent
> outgoing SYN before we saw incoming SYN).
>
> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
> ---
> net/rds/tcp.c | 1 +
> net/rds/tcp.h | 4 ++++
> net/rds/tcp_connect.c | 8 ++++++++
> net/rds/tcp_listen.c | 30 ++++++++++++++++++++----------
> 4 files changed, 33 insertions(+), 10 deletions(-)
>
[...]
> diff --git a/net/rds/tcp_connect.c b/net/rds/tcp_connect.c
> index 5cb1687..49a3fcf 100644
> --- a/net/rds/tcp_connect.c
> +++ b/net/rds/tcp_connect.c
> @@ -78,7 +78,14 @@ int rds_tcp_conn_connect(struct rds_connection *conn)
> struct socket *sock = NULL;
> struct sockaddr_in src, dest;
> int ret;
> + struct rds_tcp_connection *tc = conn->c_transport_data;
> +
> + mutex_lock(&tc->t_conn_lock);
>
> + if (rds_conn_up(conn)) {
> + mutex_unlock(&tc->t_conn_lock);
> + return 0;
> + }
> ret = sock_create_kern(rds_conn_net(conn), PF_INET,
> SOCK_STREAM, IPPROTO_TCP, &sock);
> if (ret < 0)
> @@ -120,6 +127,7 @@ int rds_tcp_conn_connect(struct rds_connection *conn)
> }
>
> out:
> + mutex_unlock(&tc->t_conn_lock);
Just wondering whether the spin_lock() would better here considering
entry into rds_tcp_conn_connect() & rds_tcp_accept_one() might be
from softirq context. Ignore it if its not applicable.
> if (sock)
> sock_release(sock);
> return ret;
> diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
> index 0896187..cc8496f 100644
> --- a/net/rds/tcp_listen.c
> +++ b/net/rds/tcp_listen.c
> @@ -76,7 +76,9 @@ int rds_tcp_accept_one(struct socket *sock)
> struct rds_connection *conn;
> int ret;
> struct inet_sock *inet;
> - struct rds_tcp_connection *rs_tcp;
> + struct rds_tcp_connection *rs_tcp = NULL;
> + int conn_state;
> + struct sock *nsk;
>
> ret = sock_create_kern(sock_net(sock->sk), sock->sk->sk_family,
> sock->sk->sk_type, sock->sk->sk_protocol,
> @@ -116,6 +118,10 @@ int rds_tcp_accept_one(struct socket *sock)
> */
> rs_tcp = (struct rds_tcp_connection *)conn->c_transport_data;
> rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
Like patch 1/2, probably we can leverage return value of above.
> + conn_state = rds_conn_state(conn);
> + if (conn_state != RDS_CONN_CONNECTING && conn_state != RDS_CONN_UP)
You probably don't need the local 'conn_state' and below should work.
if (!rds_conn_connecting(conn) && !rds_conn_up(conn))
Regards,
Santosh
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
2016-05-02 16:20 ` Santosh Shilimkar
@ 2016-05-02 16:37 ` Sowmini Varadhan
2016-05-02 18:05 ` Santosh Shilimkar
0 siblings, 1 reply; 9+ messages in thread
From: Sowmini Varadhan @ 2016-05-02 16:37 UTC (permalink / raw)
To: Santosh Shilimkar; +Cc: netdev, rds-devel, davem
On (05/02/16 09:20), Santosh Shilimkar wrote:
> > rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
> >+ if (rs_tcp->t_sock) {
> >+ /* Need to resolve a duelling SYN between peers.
> >+ * We have an outstanding SYN to this peer, which may
> >+ * potentially have transitioned to the RDS_CONN_UP state,
> >+ * so we must quiesce any send threads before resetting
> >+ * c_transport_data.
> >+ */
> >+ wait_event(conn->c_waitq,
> >+ !test_bit(RDS_IN_XMIT, &conn->c_flags));
> Would it be good to check the return value of rds_conn_transition()
> since if CONN is already UP above will fail and then send message
> might again race and we will let message through even though passive
> hasn't finished its connection.
no, that was the original issue that I was running into, which needed
commit 241b2719 - prior to that commit, if the conn was already UP,
we'd end up doing a rds_conn_drop on a good connection, and both sides
would end up in a pair of infinite 3WH loops. Even if we dont do
a rds_conn_drop on the UP connection, we've just (before
rds_tcp_accept_one) sent out a syn-ack on the incoming syn, and now
need to RST that syn-ac. The other side is going to receive the rst,
and get confused about what to clean up (since there's already an UP
connection going on).
In short, when there is a duel, it's cleanest to have a deterministic
arbitration- both sides use the numeric value of saddr and faddr to
figure out which side is active, which side is passive. (Thus the
basis on the BGP router-id based model for 241b2719)
FWIW, much of this is actually a corner case- in practice, its not
frequent to have syns crossing each other at "almost the same time".
--Sowmini
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock.
2016-05-02 16:33 ` Santosh Shilimkar
@ 2016-05-02 16:43 ` Sowmini Varadhan
2016-05-02 18:08 ` Santosh Shilimkar
0 siblings, 1 reply; 9+ messages in thread
From: Sowmini Varadhan @ 2016-05-02 16:43 UTC (permalink / raw)
To: Santosh Shilimkar; +Cc: netdev, rds-devel, davem
On (05/02/16 09:33), Santosh Shilimkar wrote:
> >+ mutex_unlock(&tc->t_conn_lock);
> Just wondering whether the spin_lock() would better here considering
> entry into rds_tcp_conn_connect() & rds_tcp_accept_one() might be
> from softirq context. Ignore it if its not applicable.
It's not from softirq context (both are workqs), but I used a mutex
to follow c_cm_lock (which I considered reusing, given that it
is only IB specific?) But spin_lock vs mutex may not be a big
differentiator here- this is really a one-time start up (corner-case)
issue in the control path.
> > rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
> Like patch 1/2, probably we can leverage return value of above.
:
> You probably don't need the local 'conn_state' and below should work.
> if (!rds_conn_connecting(conn) && !rds_conn_up(conn))
see explanation for comment to 1/2.
--Sowmini
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
2016-05-02 16:37 ` Sowmini Varadhan
@ 2016-05-02 18:05 ` Santosh Shilimkar
0 siblings, 0 replies; 9+ messages in thread
From: Santosh Shilimkar @ 2016-05-02 18:05 UTC (permalink / raw)
To: Sowmini Varadhan; +Cc: netdev, rds-devel, davem
On 5/2/2016 9:37 AM, Sowmini Varadhan wrote:
> On (05/02/16 09:20), Santosh Shilimkar wrote:
>>> rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
>>> + if (rs_tcp->t_sock) {
>>> + /* Need to resolve a duelling SYN between peers.
>>> + * We have an outstanding SYN to this peer, which may
>>> + * potentially have transitioned to the RDS_CONN_UP state,
>>> + * so we must quiesce any send threads before resetting
>>> + * c_transport_data.
>>> + */
>>> + wait_event(conn->c_waitq,
>>> + !test_bit(RDS_IN_XMIT, &conn->c_flags));
>> Would it be good to check the return value of rds_conn_transition()
>> since if CONN is already UP above will fail and then send message
>> might again race and we will let message through even though passive
>> hasn't finished its connection.
>
> no, that was the original issue that I was running into, which needed
> commit 241b2719 - prior to that commit, if the conn was already UP,
> we'd end up doing a rds_conn_drop on a good connection, and both sides
> would end up in a pair of infinite 3WH loops. Even if we dont do
> a rds_conn_drop on the UP connection, we've just (before
> rds_tcp_accept_one) sent out a syn-ack on the incoming syn, and now
> need to RST that syn-ac. The other side is going to receive the rst,
> and get confused about what to clean up (since there's already an UP
> connection going on).
>
> In short, when there is a duel, it's cleanest to have a deterministic
> arbitration- both sides use the numeric value of saddr and faddr to
> figure out which side is active, which side is passive. (Thus the
> basis on the BGP router-id based model for 241b2719)
>
> FWIW, much of this is actually a corner case- in practice, its not
> frequent to have syns crossing each other at "almost the same time".
>
Sounds good. Thanks for expanding it. Patch looks good to me.
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock.
2016-05-02 16:43 ` Sowmini Varadhan
@ 2016-05-02 18:08 ` Santosh Shilimkar
0 siblings, 0 replies; 9+ messages in thread
From: Santosh Shilimkar @ 2016-05-02 18:08 UTC (permalink / raw)
To: Sowmini Varadhan; +Cc: netdev, rds-devel, davem
On 5/2/2016 9:43 AM, Sowmini Varadhan wrote:
> On (05/02/16 09:33), Santosh Shilimkar wrote:
>>> + mutex_unlock(&tc->t_conn_lock);
>> Just wondering whether the spin_lock() would better here considering
>> entry into rds_tcp_conn_connect() & rds_tcp_accept_one() might be
>> from softirq context. Ignore it if its not applicable.
>
> It's not from softirq context (both are workqs), but I used a mutex
> to follow c_cm_lock (which I considered reusing, given that it
> is only IB specific?) But spin_lock vs mutex may not be a big
> differentiator here- this is really a one-time start up (corner-case)
> issue in the control path.
>
That should be fine then.
>>> rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
>> Like patch 1/2, probably we can leverage return value of above.
> :
>> You probably don't need the local 'conn_state' and below should work.
>> if (!rds_conn_connecting(conn) && !rds_conn_up(conn))
>
> see explanation for comment to 1/2.
>
Yep.
> +rst_nsk:
> + /* rest the newly returned accept sock and bail */
s/rest/reset
With typo fixed,
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-05-02 18:08 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-01 23:10 [PATCH net 0/2] RDS: TCP: sychronization during connection startup Sowmini Varadhan
2016-05-01 23:10 ` [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock Sowmini Varadhan
2016-05-02 16:20 ` Santosh Shilimkar
2016-05-02 16:37 ` Sowmini Varadhan
2016-05-02 18:05 ` Santosh Shilimkar
2016-05-01 23:10 ` [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock Sowmini Varadhan
2016-05-02 16:33 ` Santosh Shilimkar
2016-05-02 16:43 ` Sowmini Varadhan
2016-05-02 18:08 ` Santosh Shilimkar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).