All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3
@ 2018-04-26 15:18 Ursula Braun
  2018-04-26 15:18 ` [PATCH net-next v3 1/4] net/smc: fix structure size Ursula Braun
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Ursula Braun @ 2018-04-26 15:18 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun

From: Ursula Braun <ursula.braun@linux.ibm.com>

Dave,

in the mean time we challenged the benefit of these CLC handshake
optimizations for the sockopts TCP_NODELAY and TCP_CORK.
We decided to give up on them for now, since SMC still works
properly without.
There is now version 3 of the patch series with patches 2-4 implementing
sockopts that require special handling in SMC.

Version 3 changes
   * no deferring of setsockopts TCP_NODELAY and TCP_CORK anymore
   * allow fallback for some sockopts eliminating SMC usage
   * when setting TCP_NODELAY always enforce data transmission
     (not only together with corked data)

Version 2 changes of Patch 2/4 (and 3/4):
   * return error -EOPNOTSUPP for TCP_FASTOPEN sockopts
   * fix a kernel_setsockopt() usage bug by switching parameter
     variable from type "u8" to "int"
   * add return code validation when calling kernel_setsockopt()
   * propagate a setsockopt error on the internal CLC socket
     to the SMC socket.

Regards, Ursula

Karsten Graul (1):
  net/smc: fix structure size

Ursula Braun (3):
  net/smc: handle sockopts forcing fallback
  net/smc: handle sockopt TCP_CORK
  net/smc: handle sockopt TCP_DEFER_ACCEPT

 net/smc/af_smc.c  | 98 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
 net/smc/smc.h     |  4 +++
 net/smc/smc_cdc.c |  2 +-
 net/smc/smc_cdc.h |  2 +-
 net/smc/smc_rx.c  |  2 +-
 net/smc/smc_rx.h  |  1 +
 net/smc/smc_tx.c  | 24 ++++++++++++--
 7 files changed, 122 insertions(+), 11 deletions(-)

-- 
2.13.5

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH net-next v3 1/4] net/smc: fix structure size
  2018-04-26 15:18 [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3 Ursula Braun
@ 2018-04-26 15:18 ` Ursula Braun
  2018-04-26 15:18 ` [PATCH net-next v3 2/4] net/smc: handle sockopts forcing fallback Ursula Braun
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Ursula Braun @ 2018-04-26 15:18 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun

From: Karsten Graul <kgraul@linux.ibm.com>

The struct smc_cdc_msg must be defined as packed so the
size is 44 bytes.
And change the structure size check so sizeof is checked.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/smc_cdc.c | 2 +-
 net/smc/smc_cdc.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
index b42395d24cba..42ad57365eca 100644
--- a/net/smc/smc_cdc.c
+++ b/net/smc/smc_cdc.c
@@ -82,7 +82,7 @@ static inline void smc_cdc_add_pending_send(struct smc_connection *conn,
 		sizeof(struct smc_cdc_msg) > SMC_WR_BUF_SIZE,
 		"must increase SMC_WR_BUF_SIZE to at least sizeof(struct smc_cdc_msg)");
 	BUILD_BUG_ON_MSG(
-		offsetof(struct smc_cdc_msg, reserved) > SMC_WR_TX_SIZE,
+		sizeof(struct smc_cdc_msg) != SMC_WR_TX_SIZE,
 		"must adapt SMC_WR_TX_SIZE to sizeof(struct smc_cdc_msg); if not all smc_wr upper layer protocols use the same message size any more, must start to set link->wr_tx_sges[i].length on each individual smc_wr_tx_send()");
 	BUILD_BUG_ON_MSG(
 		sizeof(struct smc_cdc_tx_pend) > SMC_WR_TX_PEND_PRIV_SIZE,
diff --git a/net/smc/smc_cdc.h b/net/smc/smc_cdc.h
index ab240b37ad11..d2012fd22100 100644
--- a/net/smc/smc_cdc.h
+++ b/net/smc/smc_cdc.h
@@ -48,7 +48,7 @@ struct smc_cdc_msg {
 	struct smc_cdc_producer_flags	prod_flags;
 	struct smc_cdc_conn_state_flags	conn_state_flags;
 	u8				reserved[18];
-} __aligned(8);
+} __packed;					/* format defined in RFC7609 */
 
 static inline bool smc_cdc_rxed_any_close(struct smc_connection *conn)
 {
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net-next v3 2/4] net/smc: handle sockopts forcing fallback
  2018-04-26 15:18 [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3 Ursula Braun
  2018-04-26 15:18 ` [PATCH net-next v3 1/4] net/smc: fix structure size Ursula Braun
@ 2018-04-26 15:18 ` Ursula Braun
  2018-04-26 15:18 ` [PATCH net-next v3 3/4] net/smc: sockopts TCP_NODELAY and TCP_CORK Ursula Braun
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Ursula Braun @ 2018-04-26 15:18 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun

From: Ursula Braun <ubraun@linux.ibm.com>

Several TCP sockopts do not work for SMC. One example are the
TCP_FASTOPEN sockopts, since SMC-connection setup is based on the TCP
three-way-handshake.
If the SMC socket is still in state SMC_INIT, such sockopts trigger
fallback to TCP. Otherwise an error is returned.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/af_smc.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 50 insertions(+), 4 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 4470501374bf..d274be7265ea 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -391,6 +391,9 @@ static int smc_connect_rdma(struct smc_sock *smc)
 
 	sock_hold(&smc->sk); /* sock put in passive closing */
 
+	if (smc->use_fallback)
+		goto out_connected;
+
 	if (!tcp_sk(smc->clcsock->sk)->syn_smc) {
 		/* peer has not signalled SMC-capability */
 		smc->use_fallback = true;
@@ -790,6 +793,9 @@ static void smc_listen_work(struct work_struct *work)
 	int rc = 0;
 	u8 ibport;
 
+	if (new_smc->use_fallback)
+		goto out_connected;
+
 	/* check if peer is smc capable */
 	if (!tcp_sk(newclcsock->sk)->syn_smc) {
 		new_smc->use_fallback = true;
@@ -968,7 +974,7 @@ static void smc_tcp_listen_work(struct work_struct *work)
 			continue;
 
 		new_smc->listen_smc = lsmc;
-		new_smc->use_fallback = false; /* assume rdma capability first*/
+		new_smc->use_fallback = lsmc->use_fallback;
 		sock_hold(lsk); /* sock_put in smc_listen_work */
 		INIT_WORK(&new_smc->smc_listen_work, smc_listen_work);
 		smc_copy_sock_settings_to_smc(new_smc);
@@ -1004,7 +1010,8 @@ static int smc_listen(struct socket *sock, int backlog)
 	 * them to the clc socket -- copy smc socket options to clc socket
 	 */
 	smc_copy_sock_settings_to_clc(smc);
-	tcp_sk(smc->clcsock->sk)->syn_smc = 1;
+	if (!smc->use_fallback)
+		tcp_sk(smc->clcsock->sk)->syn_smc = 1;
 
 	rc = kernel_listen(smc->clcsock, backlog);
 	if (rc)
@@ -1097,6 +1104,16 @@ static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	    (sk->sk_state != SMC_APPCLOSEWAIT1) &&
 	    (sk->sk_state != SMC_INIT))
 		goto out;
+
+	if (msg->msg_flags & MSG_FASTOPEN) {
+		if (sk->sk_state == SMC_INIT) {
+			smc->use_fallback = true;
+		} else {
+			rc = -EINVAL;
+			goto out;
+		}
+	}
+
 	if (smc->use_fallback)
 		rc = smc->clcsock->ops->sendmsg(smc->clcsock, msg, len);
 	else
@@ -1274,14 +1291,43 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
+	int rc;
 
 	smc = smc_sk(sk);
 
 	/* generic setsockopts reaching us here always apply to the
 	 * CLC socket
 	 */
-	return smc->clcsock->ops->setsockopt(smc->clcsock, level, optname,
-					     optval, optlen);
+	rc = smc->clcsock->ops->setsockopt(smc->clcsock, level, optname,
+					   optval, optlen);
+	if (smc->clcsock->sk->sk_err) {
+		sk->sk_err = smc->clcsock->sk->sk_err;
+		sk->sk_error_report(sk);
+	}
+	if (rc)
+		return rc;
+
+	lock_sock(sk);
+	switch (optname) {
+	case TCP_ULP:
+	case TCP_FASTOPEN:
+	case TCP_FASTOPEN_CONNECT:
+	case TCP_FASTOPEN_KEY:
+	case TCP_FASTOPEN_NO_COOKIE:
+		/* option not supported by SMC */
+		if (sk->sk_state == SMC_INIT) {
+			smc->use_fallback = true;
+		} else {
+			if (!smc->use_fallback)
+				rc = -EINVAL;
+		}
+		break;
+	default:
+		break;
+	}
+	release_sock(sk);
+
+	return rc;
 }
 
 static int smc_getsockopt(struct socket *sock, int level, int optname,
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net-next v3 3/4] net/smc: sockopts TCP_NODELAY and TCP_CORK
  2018-04-26 15:18 [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3 Ursula Braun
  2018-04-26 15:18 ` [PATCH net-next v3 1/4] net/smc: fix structure size Ursula Braun
  2018-04-26 15:18 ` [PATCH net-next v3 2/4] net/smc: handle sockopts forcing fallback Ursula Braun
@ 2018-04-26 15:18 ` Ursula Braun
  2018-04-26 15:18 ` [PATCH net-next v3 4/4] net/smc: handle sockopt TCP_DEFER_ACCEPT Ursula Braun
  2018-04-27 18:03 ` [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3 David Miller
  4 siblings, 0 replies; 6+ messages in thread
From: Ursula Braun @ 2018-04-26 15:18 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun

From: Ursula Braun <ubraun@linux.ibm.com>

Setting sockopt TCP_NODELAY or resetting sockopt TCP_CORK
triggers data transfer.

For a corked SMC socket RDMA writes are deferred, if there is
still sufficient send buffer space available.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/af_smc.c | 20 +++++++++++++++++++-
 net/smc/smc_tx.c | 24 +++++++++++++++++++++---
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index d274be7265ea..9d8b381281e3 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -1291,7 +1291,7 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
-	int rc;
+	int val, rc;
 
 	smc = smc_sk(sk);
 
@@ -1307,6 +1307,10 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
 	if (rc)
 		return rc;
 
+	if (optlen < sizeof(int))
+		return rc;
+	get_user(val, (int __user *)optval);
+
 	lock_sock(sk);
 	switch (optname) {
 	case TCP_ULP:
@@ -1322,6 +1326,20 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
 				rc = -EINVAL;
 		}
 		break;
+	case TCP_NODELAY:
+		if (sk->sk_state != SMC_INIT && sk->sk_state != SMC_LISTEN) {
+			if (val)
+				mod_delayed_work(system_wq, &smc->conn.tx_work,
+						 0);
+		}
+		break;
+	case TCP_CORK:
+		if (sk->sk_state != SMC_INIT && sk->sk_state != SMC_LISTEN) {
+			if (!val)
+				mod_delayed_work(system_wq, &smc->conn.tx_work,
+						 0);
+		}
+		break;
 	default:
 		break;
 	}
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 72f004c9c9b1..58dfe0bd9d60 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -19,6 +19,7 @@
 #include <linux/sched/signal.h>
 
 #include <net/sock.h>
+#include <net/tcp.h>
 
 #include "smc.h"
 #include "smc_wr.h"
@@ -26,6 +27,7 @@
 #include "smc_tx.h"
 
 #define SMC_TX_WORK_DELAY	HZ
+#define SMC_TX_CORK_DELAY	(HZ >> 2)	/* 250 ms */
 
 /***************************** sndbuf producer *******************************/
 
@@ -115,6 +117,13 @@ static int smc_tx_wait_memory(struct smc_sock *smc, int flags)
 	return rc;
 }
 
+static bool smc_tx_is_corked(struct smc_sock *smc)
+{
+	struct tcp_sock *tp = tcp_sk(smc->clcsock->sk);
+
+	return (tp->nonagle & TCP_NAGLE_CORK) ? true : false;
+}
+
 /* sndbuf producer: main API called by socket layer.
  * called under sock lock.
  */
@@ -209,7 +218,16 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 		/* since we just produced more new data into sndbuf,
 		 * trigger sndbuf consumer: RDMA write into peer RMBE and CDC
 		 */
-		smc_tx_sndbuf_nonempty(conn);
+		if ((msg->msg_flags & MSG_MORE || smc_tx_is_corked(smc)) &&
+		    (atomic_read(&conn->sndbuf_space) >
+						(conn->sndbuf_size >> 1)))
+			/* for a corked socket defer the RDMA writes if there
+			 * is still sufficient sndbuf_space available
+			 */
+			schedule_delayed_work(&conn->tx_work,
+					      SMC_TX_CORK_DELAY);
+		else
+			smc_tx_sndbuf_nonempty(conn);
 	} /* while (msg_data_left(msg)) */
 
 	return send_done;
@@ -409,8 +427,8 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
 			}
 			rc = 0;
 			if (conn->alert_token_local) /* connection healthy */
-				schedule_delayed_work(&conn->tx_work,
-						      SMC_TX_WORK_DELAY);
+				mod_delayed_work(system_wq, &conn->tx_work,
+						 SMC_TX_WORK_DELAY);
 		}
 		goto out_unlock;
 	}
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net-next v3 4/4] net/smc: handle sockopt TCP_DEFER_ACCEPT
  2018-04-26 15:18 [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3 Ursula Braun
                   ` (2 preceding siblings ...)
  2018-04-26 15:18 ` [PATCH net-next v3 3/4] net/smc: sockopts TCP_NODELAY and TCP_CORK Ursula Braun
@ 2018-04-26 15:18 ` Ursula Braun
  2018-04-27 18:03 ` [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3 David Miller
  4 siblings, 0 replies; 6+ messages in thread
From: Ursula Braun @ 2018-04-26 15:18 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun

From: Ursula Braun <ubraun@linux.ibm.com>

If sockopt TCP_DEFER_ACCEPT is set, the accept is delayed till
data is available.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/af_smc.c | 26 +++++++++++++++++++++++++-
 net/smc/smc.h    |  4 ++++
 net/smc/smc_rx.c |  2 +-
 net/smc/smc_rx.h |  1 +
 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 9d8b381281e3..20aa4175b9f8 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -1044,6 +1044,7 @@ static int smc_accept(struct socket *sock, struct socket *new_sock,
 
 	if (lsmc->sk.sk_state != SMC_LISTEN) {
 		rc = -EINVAL;
+		release_sock(sk);
 		goto out;
 	}
 
@@ -1071,9 +1072,29 @@ static int smc_accept(struct socket *sock, struct socket *new_sock,
 
 	if (!rc)
 		rc = sock_error(nsk);
+	release_sock(sk);
+	if (rc)
+		goto out;
+
+	if (lsmc->sockopt_defer_accept && !(flags & O_NONBLOCK)) {
+		/* wait till data arrives on the socket */
+		timeo = msecs_to_jiffies(lsmc->sockopt_defer_accept *
+								MSEC_PER_SEC);
+		if (smc_sk(nsk)->use_fallback) {
+			struct sock *clcsk = smc_sk(nsk)->clcsock->sk;
+
+			lock_sock(clcsk);
+			if (skb_queue_empty(&clcsk->sk_receive_queue))
+				sk_wait_data(clcsk, &timeo, NULL);
+			release_sock(clcsk);
+		} else if (!atomic_read(&smc_sk(nsk)->conn.bytes_to_rcv)) {
+			lock_sock(nsk);
+			smc_rx_wait_data(smc_sk(nsk), &timeo);
+			release_sock(nsk);
+		}
+	}
 
 out:
-	release_sock(sk);
 	sock_put(sk); /* sock_hold above */
 	return rc;
 }
@@ -1340,6 +1361,9 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
 						 0);
 		}
 		break;
+	case TCP_DEFER_ACCEPT:
+		smc->sockopt_defer_accept = val;
+		break;
 	default:
 		break;
 	}
diff --git a/net/smc/smc.h b/net/smc/smc.h
index e4829a2f46ba..2405e889b93d 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -180,6 +180,10 @@ struct smc_sock {				/* smc sock container */
 	struct list_head	accept_q;	/* sockets to be accepted */
 	spinlock_t		accept_q_lock;	/* protects accept_q */
 	bool			use_fallback;	/* fallback to tcp */
+	int			sockopt_defer_accept;
+						/* sockopt TCP_DEFER_ACCEPT
+						 * value
+						 */
 	u8			wait_close_tx_prepared : 1;
 						/* shutdown wr or close
 						 * started, waiting for unsent
diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
index eff4e0d0bb31..af851d8df1f8 100644
--- a/net/smc/smc_rx.c
+++ b/net/smc/smc_rx.c
@@ -51,7 +51,7 @@ static void smc_rx_data_ready(struct sock *sk)
  * 1 if at least 1 byte available in rcvbuf or if socket error/shutdown.
  * 0 otherwise (nothing in rcvbuf nor timeout, e.g. interrupted).
  */
-static int smc_rx_wait_data(struct smc_sock *smc, long *timeo)
+int smc_rx_wait_data(struct smc_sock *smc, long *timeo)
 {
 	DEFINE_WAIT_FUNC(wait, woken_wake_function);
 	struct smc_connection *conn = &smc->conn;
diff --git a/net/smc/smc_rx.h b/net/smc/smc_rx.h
index 3a32b59bf06c..0b75a6b470e6 100644
--- a/net/smc/smc_rx.h
+++ b/net/smc/smc_rx.h
@@ -20,5 +20,6 @@
 void smc_rx_init(struct smc_sock *smc);
 int smc_rx_recvmsg(struct smc_sock *smc, struct msghdr *msg, size_t len,
 		   int flags);
+int smc_rx_wait_data(struct smc_sock *smc, long *timeo);
 
 #endif /* SMC_RX_H */
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3
  2018-04-26 15:18 [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3 Ursula Braun
                   ` (3 preceding siblings ...)
  2018-04-26 15:18 ` [PATCH net-next v3 4/4] net/smc: handle sockopt TCP_DEFER_ACCEPT Ursula Braun
@ 2018-04-27 18:03 ` David Miller
  4 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2018-04-27 18:03 UTC (permalink / raw)
  To: ubraun; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun

From: Ursula Braun <ubraun@linux.ibm.com>
Date: Thu, 26 Apr 2018 17:18:19 +0200

> Version 3 changes
>    * no deferring of setsockopts TCP_NODELAY and TCP_CORK anymore
>    * allow fallback for some sockopts eliminating SMC usage
>    * when setting TCP_NODELAY always enforce data transmission
>      (not only together with corked data)

This looks a lot better than what you were doing before.

Series applied, thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-04-27 18:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-26 15:18 [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3 Ursula Braun
2018-04-26 15:18 ` [PATCH net-next v3 1/4] net/smc: fix structure size Ursula Braun
2018-04-26 15:18 ` [PATCH net-next v3 2/4] net/smc: handle sockopts forcing fallback Ursula Braun
2018-04-26 15:18 ` [PATCH net-next v3 3/4] net/smc: sockopts TCP_NODELAY and TCP_CORK Ursula Braun
2018-04-26 15:18 ` [PATCH net-next v3 4/4] net/smc: handle sockopt TCP_DEFER_ACCEPT Ursula Braun
2018-04-27 18:03 ` [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3 David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.