netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/3]  Optimizing performance in short-lived
@ 2022-01-27 12:08 D. Wythe
  2022-01-27 12:08 ` [PATCH net-next 1/3] net/smc: Make smc_tcp_listen_work() independent D. Wythe
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: D. Wythe @ 2022-01-27 12:08 UTC (permalink / raw)
  To: kgraul; +Cc: kuba, davem, netdev, linux-s390, linux-rdma, D. Wythe

From: "D. Wythe" <alibuda@linux.alibaba.com>

This patch set aims to optimizing performance of SMC in short-lived
links scenarios, which is quite unsatisfactory right now.

In our benchmark, we test it with follow scripts:

./wrk -c 10000 -t 4 -H 'Connection: Close' -d 20 http://smc-server

Current performance figures like that:

Running 20s test @ http://11.213.45.6
  4 threads and 10000 connections
  4956 requests in 20.06s, 3.24MB read
  Socket errors: connect 0, read 0, write 672, timeout 0
Requests/sec:    247.07
Transfer/sec:    165.28KB

There are many reasons for this phenomenon, this patch set doesn't
solve it all though, but it can be well alleviated with it in.

Patch 1/3  (Make smc_tcp_listen_work() independent) :

Separate smc_tcp_listen_work() from smc_listen_work(), make them
independent of each other, the busy SMC handshake can not affect new TCP
connections visit any more. Avoid discarding a large number of TCP
connections after being overstock, which is undoubtedly raise the
connection establishment time.

Patch 2/3 (Limits SMC backlog connections):

Since patch 1 has separated smc_tcp_listen_work() from
smc_listen_work(), an unrestricted TCP accept have come into being. This
patch try to put a limit on SMC backlog connections refers to
implementation of TCP.

Patch 3/3 (Fallback when SMC handshake workqueue congested):

Considering the complexity of SMC handshake right now, in short-lived
links scenarios, this may not be the main scenario of SMC though, it's
performance is still quite poor. This Patch try to provide auto fallback
case when SMC handshake workqueue congested, which is the sign of SMC
handshake stacking in our opinion.

Of course, it's optional.

After this patch set, performance figures like that:

Running 20s test @ http://11.213.45.6
  4 threads and 10000 connections
  693253 requests in 20.10s, 452.88MB read
Requests/sec:  34488.13
Transfer/sec:     22.53MB

That's a quite well performance improvement, about to 6 to 7 times in my
environment.

D. Wythe (3):
  net/smc: Make smc_tcp_listen_work() independent
  net/smc: Limits backlog connections
  net/smc: Fallback when SMC handshake workqueue

 include/linux/tcp.h  |  1 +
 net/ipv4/tcp_input.c |  3 +-
 net/smc/Kconfig      | 12 ++++++++
 net/smc/af_smc.c     | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 net/smc/smc.h        |  5 ++++
 5 files changed, 96 insertions(+), 3 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH net-next 1/3] net/smc: Make smc_tcp_listen_work() independent
  2022-01-27 12:08 [PATCH net-next 0/3] Optimizing performance in short-lived D. Wythe
@ 2022-01-27 12:08 ` D. Wythe
  2022-01-27 12:08 ` [PATCH net-next 2/3] net/smc: Limits backlog connections D. Wythe
  2022-01-27 12:08 ` [PATCH net-next 3/3] net/smc: Fallback when handshake workqueue congested D. Wythe
  2 siblings, 0 replies; 11+ messages in thread
From: D. Wythe @ 2022-01-27 12:08 UTC (permalink / raw)
  To: kgraul; +Cc: kuba, davem, netdev, linux-s390, linux-rdma, D. Wythe

From: "D. Wythe" <alibuda@linux.alibaba.com>

In multithread and 10K connections benchmark, the backend TCP connection
established very slowly, and lots of TCP connections stay in SYN_SENT
state.

Client: smc_run wrk -c 10000 -t 4 http://server

the netstate of server host shows like:
    145042 times the listen queue of a socket overflowed
    145042 SYNs to LISTEN sockets dropped

One reason of this issue is that, since the smc_tcp_listen_work() shared
the same workqueue (smc_hs_wq) with smc_listen_work(), while the
smc_listen_work() do blocking wait for smc connection established. Once
the workqueue became congested, it's will block the accpet() from TCP
listen.

This patch creates a independent workqueue(smc_tcp_ls_wq) for
smc_tcp_listen_work(), separate it from smc_listen_work(), which is
quite acceptable considering that smc_tcp_listen_work() runs very fast.

Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 net/smc/af_smc.c | 13 +++++++++++--
 net/smc/smc.h    |  1 +
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index d5ea62b..1b40304 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -59,6 +59,7 @@
 						 * creation on client
 						 */
 
+struct workqueue_struct	*smc_tcp_ls_wq;	/* wq for tcp listen work */
 struct workqueue_struct	*smc_hs_wq;	/* wq for handshake work */
 struct workqueue_struct	*smc_close_wq;	/* wq for close work */
 
@@ -2124,7 +2125,7 @@ static void smc_clcsock_data_ready(struct sock *listen_clcsock)
 	lsmc->clcsk_data_ready(listen_clcsock);
 	if (lsmc->sk.sk_state == SMC_LISTEN) {
 		sock_hold(&lsmc->sk); /* sock_put in smc_tcp_listen_work() */
-		if (!queue_work(smc_hs_wq, &lsmc->tcp_listen_work))
+		if (!queue_work(smc_tcp_ls_wq, &lsmc->tcp_listen_work))
 			sock_put(&lsmc->sk);
 	}
 }
@@ -2919,9 +2920,14 @@ static int __init smc_init(void)
 		goto out_nl;
 
 	rc = -ENOMEM;
+
+	smc_tcp_ls_wq = alloc_workqueue("smc_tcp_ls_wq", 0, 0);
+	if (!smc_tcp_ls_wq)
+		goto out_pnet;
+
 	smc_hs_wq = alloc_workqueue("smc_hs_wq", 0, 0);
 	if (!smc_hs_wq)
-		goto out_pnet;
+		goto out_alloc_tcp_ls_wq;
 
 	smc_close_wq = alloc_workqueue("smc_close_wq", 0, 0);
 	if (!smc_close_wq)
@@ -2992,6 +2998,8 @@ static int __init smc_init(void)
 	destroy_workqueue(smc_close_wq);
 out_alloc_hs_wq:
 	destroy_workqueue(smc_hs_wq);
+out_alloc_tcp_ls_wq:
+	destroy_workqueue(smc_tcp_ls_wq);
 out_pnet:
 	smc_pnet_exit();
 out_nl:
@@ -3010,6 +3018,7 @@ static void __exit smc_exit(void)
 	smc_core_exit();
 	smc_ib_unregister_client();
 	destroy_workqueue(smc_close_wq);
+	destroy_workqueue(smc_tcp_ls_wq);
 	destroy_workqueue(smc_hs_wq);
 	proto_unregister(&smc_proto6);
 	proto_unregister(&smc_proto);
diff --git a/net/smc/smc.h b/net/smc/smc.h
index 3d0b8e3..bd2f3dc 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -264,6 +264,7 @@ static inline struct smc_sock *smc_sk(const struct sock *sk)
 	return (struct smc_sock *)sk;
 }
 
+extern struct workqueue_struct	*smc_tcp_ls_wq;	/* wq for tcp listen work */
 extern struct workqueue_struct	*smc_hs_wq;	/* wq for handshake work */
 extern struct workqueue_struct	*smc_close_wq;	/* wq for close work */
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 2/3] net/smc: Limits backlog connections
  2022-01-27 12:08 [PATCH net-next 0/3] Optimizing performance in short-lived D. Wythe
  2022-01-27 12:08 ` [PATCH net-next 1/3] net/smc: Make smc_tcp_listen_work() independent D. Wythe
@ 2022-01-27 12:08 ` D. Wythe
  2022-01-27 15:52   ` kernel test robot
                     ` (2 more replies)
  2022-01-27 12:08 ` [PATCH net-next 3/3] net/smc: Fallback when handshake workqueue congested D. Wythe
  2 siblings, 3 replies; 11+ messages in thread
From: D. Wythe @ 2022-01-27 12:08 UTC (permalink / raw)
  To: kgraul; +Cc: kuba, davem, netdev, linux-s390, linux-rdma, D. Wythe

From: "D. Wythe" <alibuda@linux.alibaba.com>

Current implementation does not handling backlog semantics, one
potential risk is that server will be flooded by infinite amount
connections, even if client was SMC-incapable.

This patch works to put a limit on backlog connections, referring to the
TCP implementation, we divides SMC connections into two categories:

1. Half SMC connection, which includes all TCP established while SMC not
connections.

2. Full SMC connection, which includes all SMC established connections.

For half SMC connection, since all half SMC connections starts with TCP
established, we can achieve our goal by put a limit before TCP
established. Refer to the implementation of TCP, this limits will based
on not only the half SMC connections but also the full connections,
which is also a constraint on full SMC connections.

For full SMC connections, although we know exactly where it starts, it's
quite hard to put a limit before it. The easiest way is to block wait
before receive SMC confirm CLC message, while it's under protection by
smc_server_lgr_pending, a global lock, which leads this limit to the
entire host instead of a single listen socket. Another way is to drop
the full connections, but considering the cast of SMC connections, we
prefer to keep full SMC connections.

Even so, the limits of full SMC connections still exists, see commits
about half SMC connection below.

After this patch, the limits of backend connection shows like:

For SMC:

1. Client with SMC-capability can makes 2 * backlog full SMC connections
   or 1 * backlog half SMC connections and 1 * backlog full SMC
   connections at most.

2. Client without SMC-capability can only makes 1 * backlog half TCP
   connections and 1 * backlog full TCP connections.

Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 net/smc/af_smc.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 net/smc/smc.h    |  4 ++++
 2 files changed, 47 insertions(+)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 1b40304..66a0e64 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -73,6 +73,34 @@ static void smc_set_keepalive(struct sock *sk, int val)
 	smc->clcsock->sk->sk_prot->keepalive(smc->clcsock->sk, val);
 }
 
+static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
+					  struct request_sock *req,
+					  struct dst_entry *dst,
+					  struct request_sock *req_unhash,
+					  bool *own_req)
+{
+	struct smc_sock *smc;
+
+	smc = (struct smc_sock *)((uintptr_t)sk->sk_user_data & ~SK_USER_DATA_NOCOPY);
+
+	if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->smc_pendings) >
+				sk->sk_max_ack_backlog)
+		goto drop;
+
+	if (sk_acceptq_is_full(&smc->sk)) {
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
+		goto drop;
+	}
+
+	/* passthrough to origin syn recv sock fct */
+	return smc->ori_af_ops->syn_recv_sock(sk, skb, req, dst, req_unhash, own_req);
+
+drop:
+	dst_release(dst);
+	tcp_listendrop(sk);
+	return NULL;
+}
+
 static struct smc_hashinfo smc_v4_hashinfo = {
 	.lock = __RW_LOCK_UNLOCKED(smc_v4_hashinfo.lock),
 };
@@ -1491,6 +1519,9 @@ static void smc_listen_out(struct smc_sock *new_smc)
 	struct smc_sock *lsmc = new_smc->listen_smc;
 	struct sock *newsmcsk = &new_smc->sk;
 
+	if (tcp_sk(new_smc->clcsock->sk)->syn_smc)
+		atomic_dec(&lsmc->smc_pendings);
+
 	if (lsmc->sk.sk_state == SMC_LISTEN) {
 		lock_sock_nested(&lsmc->sk, SINGLE_DEPTH_NESTING);
 		smc_accept_enqueue(&lsmc->sk, newsmcsk);
@@ -2096,6 +2127,9 @@ static void smc_tcp_listen_work(struct work_struct *work)
 		if (!new_smc)
 			continue;
 
+		if (tcp_sk(new_smc->clcsock->sk)->syn_smc)
+			atomic_inc(&lsmc->smc_pendings);
+
 		new_smc->listen_smc = lsmc;
 		new_smc->use_fallback = lsmc->use_fallback;
 		new_smc->fallback_rsn = lsmc->fallback_rsn;
@@ -2163,6 +2197,15 @@ static int smc_listen(struct socket *sock, int backlog)
 	smc->clcsock->sk->sk_data_ready = smc_clcsock_data_ready;
 	smc->clcsock->sk->sk_user_data =
 		(void *)((uintptr_t)smc | SK_USER_DATA_NOCOPY);
+
+	/* save origin ops */
+	smc->ori_af_ops = inet_csk(smc->clcsock->sk)->icsk_af_ops;
+
+	smc->af_ops = *smc->ori_af_ops;
+	smc->af_ops.syn_recv_sock = smc_tcp_syn_recv_sock;
+
+	inet_csk(smc->clcsock->sk)->icsk_af_ops = &smc->af_ops;
+
 	rc = kernel_listen(smc->clcsock, backlog);
 	if (rc) {
 		smc->clcsock->sk->sk_data_ready = smc->clcsk_data_ready;
diff --git a/net/smc/smc.h b/net/smc/smc.h
index bd2f3dc..4366f24 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -240,6 +240,10 @@ struct smc_sock {				/* smc sock container */
 	bool			use_fallback;	/* fallback to tcp */
 	int			fallback_rsn;	/* reason for fallback */
 	u32			peer_diagnosis; /* decline reason from peer */
+	atomic_t                smc_pendings;   /* pending smc connections */
+	struct inet_connection_sock_af_ops	af_ops;
+	struct inet_connection_sock_af_ops	*ori_af_ops;
+						/* origin af ops */
 	int			sockopt_defer_accept;
 						/* sockopt TCP_DEFER_ACCEPT
 						 * value
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 3/3] net/smc: Fallback when handshake workqueue congested
  2022-01-27 12:08 [PATCH net-next 0/3] Optimizing performance in short-lived D. Wythe
  2022-01-27 12:08 ` [PATCH net-next 1/3] net/smc: Make smc_tcp_listen_work() independent D. Wythe
  2022-01-27 12:08 ` [PATCH net-next 2/3] net/smc: Limits backlog connections D. Wythe
@ 2022-01-27 12:08 ` D. Wythe
  2022-01-27 17:09   ` Matthieu Baerts
  2022-01-28  7:17   ` Tony Lu
  2 siblings, 2 replies; 11+ messages in thread
From: D. Wythe @ 2022-01-27 12:08 UTC (permalink / raw)
  To: kgraul; +Cc: kuba, davem, netdev, linux-s390, linux-rdma, D. Wythe

From: "D. Wythe" <alibuda@linux.alibaba.com>

This patch intends to provide a mechanism to allow automatic fallback to
TCP according to the pressure of SMC handshake process. At present,
frequent visits will cause the incoming connections to be backlogged in
SMC handshake queue, raise the connections established time. Which is
quite unacceptable for those applications who base on short lived
connections.

It should be optional for applications that don't care about connection
established time. For now, this patch only provides the switch at the
compile time.

There are two ways to implement this mechanism:

1. Fallback when TCP established.
2. Fallback before TCP established.

In the first way, we need to wait and receive CLC messages that the
client will potentially send, and then actively reply with a decline
message, in a sense, which is also a sort of SMC handshake, affect the
connections established time on its way.

In the second way, the only problem is that we need to inject SMC logic
into TCP when it is about to reply the incoming SYN, since we already do
that, it's seems not a problem anymore. And advantage is obvious, few
additional processes are required to complete the fallback.

This patch use the second way.

Link: https://lore.kernel.org/all/1641301961-59331-1-git-send-email-alibuda@linux.alibaba.com/
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 include/linux/tcp.h  |  1 +
 net/ipv4/tcp_input.c |  3 ++-
 net/smc/Kconfig      | 12 ++++++++++++
 net/smc/af_smc.c     | 22 ++++++++++++++++++++++
 4 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 78b91bb..1c4ae5d 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -394,6 +394,7 @@ struct tcp_sock {
 	bool	is_mptcp;
 #endif
 #if IS_ENABLED(CONFIG_SMC)
+	bool	(*smc_in_limited)(const struct sock *sk);
 	bool	syn_smc;	/* SYN includes SMC */
 #endif
 
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index dc49a3d..9890de9 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6701,7 +6701,8 @@ static void tcp_openreq_init(struct request_sock *req,
 	ireq->ir_num = ntohs(tcp_hdr(skb)->dest);
 	ireq->ir_mark = inet_request_mark(sk, skb);
 #if IS_ENABLED(CONFIG_SMC)
-	ireq->smc_ok = rx_opt->smc_ok;
+	ireq->smc_ok = rx_opt->smc_ok && !(tcp_sk(sk)->smc_in_limited &&
+			tcp_sk(sk)->smc_in_limited(sk));
 #endif
 }
 
diff --git a/net/smc/Kconfig b/net/smc/Kconfig
index 1ab3c5a..1903927 100644
--- a/net/smc/Kconfig
+++ b/net/smc/Kconfig
@@ -19,3 +19,15 @@ config SMC_DIAG
 	  smcss.
 
 	  if unsure, say Y.
+
+if MPTCP
+
+config SMC_AUTO_FALLBACK
+	bool "SMC: automatic fallback to TCP"
+	default y
+	help
+	  Allow automatic fallback to TCP accroding to the pressure of SMC-R
+	  handshake process.
+
+	  If that's not what you except or unsure, say N.
+endif
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 66a0e64..49b8a29 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -101,6 +101,24 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk, struct sk_buff
 	return NULL;
 }
 
+#if IS_ENABLED(CONFIG_SMC_AUTO_FALLBACK)
+static bool smc_is_in_limited(const struct sock *sk)
+{
+	const struct smc_sock *smc;
+
+	smc = (const struct smc_sock *)
+		((uintptr_t)sk->sk_user_data & ~SK_USER_DATA_NOCOPY);
+
+	if (!smc)
+		return true;
+
+	if (workqueue_congested(WORK_CPU_UNBOUND, smc_hs_wq))
+		return true;
+
+	return false;
+}
+#endif
+
 static struct smc_hashinfo smc_v4_hashinfo = {
 	.lock = __RW_LOCK_UNLOCKED(smc_v4_hashinfo.lock),
 };
@@ -2206,6 +2224,10 @@ static int smc_listen(struct socket *sock, int backlog)
 
 	inet_csk(smc->clcsock->sk)->icsk_af_ops = &smc->af_ops;
 
+#if IS_ENABLED(CONFIG_SMC_AUTO_FALLBACK)
+	tcp_sk(smc->clcsock->sk)->smc_in_limited = smc_is_in_limited;
+#endif
+
 	rc = kernel_listen(smc->clcsock, backlog);
 	if (rc) {
 		smc->clcsock->sk->sk_data_ready = smc->clcsk_data_ready;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/3] net/smc: Limits backlog connections
  2022-01-27 12:08 ` [PATCH net-next 2/3] net/smc: Limits backlog connections D. Wythe
@ 2022-01-27 15:52   ` kernel test robot
  2022-01-27 15:52   ` kernel test robot
  2022-01-27 23:23   ` kernel test robot
  2 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2022-01-27 15:52 UTC (permalink / raw)
  To: D. Wythe, kgraul
  Cc: kbuild-all, kuba, davem, netdev, linux-s390, linux-rdma, D. Wythe

Hi Wythe",

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/D-Wythe/Optimizing-performance-in-short-lived/20220127-200912
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git fbb8295248e1d6f576d444309fcf79356008eac1
config: arc-allyesconfig (https://download.01.org/0day-ci/archive/20220127/202201272349.KA4IX9hr-lkp@intel.com/config)
compiler: arceb-elf-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/718aff24f3fcc73ecb7bff17fcbe029b799c6624
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review D-Wythe/Optimizing-performance-in-short-lived/20220127-200912
        git checkout 718aff24f3fcc73ecb7bff17fcbe029b799c6624
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=arc SHELL=/bin/bash net/smc/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   net/smc/af_smc.c: In function 'smc_listen':
>> net/smc/af_smc.c:2202:25: warning: assignment discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
    2202 |         smc->ori_af_ops = inet_csk(smc->clcsock->sk)->icsk_af_ops;
         |                         ^


vim +/const +2202 net/smc/af_smc.c

  2166	
  2167	static int smc_listen(struct socket *sock, int backlog)
  2168	{
  2169		struct sock *sk = sock->sk;
  2170		struct smc_sock *smc;
  2171		int rc;
  2172	
  2173		smc = smc_sk(sk);
  2174		lock_sock(sk);
  2175	
  2176		rc = -EINVAL;
  2177		if ((sk->sk_state != SMC_INIT && sk->sk_state != SMC_LISTEN) ||
  2178		    smc->connect_nonblock)
  2179			goto out;
  2180	
  2181		rc = 0;
  2182		if (sk->sk_state == SMC_LISTEN) {
  2183			sk->sk_max_ack_backlog = backlog;
  2184			goto out;
  2185		}
  2186		/* some socket options are handled in core, so we could not apply
  2187		 * them to the clc socket -- copy smc socket options to clc socket
  2188		 */
  2189		smc_copy_sock_settings_to_clc(smc);
  2190		if (!smc->use_fallback)
  2191			tcp_sk(smc->clcsock->sk)->syn_smc = 1;
  2192	
  2193		/* save original sk_data_ready function and establish
  2194		 * smc-specific sk_data_ready function
  2195		 */
  2196		smc->clcsk_data_ready = smc->clcsock->sk->sk_data_ready;
  2197		smc->clcsock->sk->sk_data_ready = smc_clcsock_data_ready;
  2198		smc->clcsock->sk->sk_user_data =
  2199			(void *)((uintptr_t)smc | SK_USER_DATA_NOCOPY);
  2200	
  2201		/* save origin ops */
> 2202		smc->ori_af_ops = inet_csk(smc->clcsock->sk)->icsk_af_ops;
  2203	
  2204		smc->af_ops = *smc->ori_af_ops;
  2205		smc->af_ops.syn_recv_sock = smc_tcp_syn_recv_sock;
  2206	
  2207		inet_csk(smc->clcsock->sk)->icsk_af_ops = &smc->af_ops;
  2208	
  2209		rc = kernel_listen(smc->clcsock, backlog);
  2210		if (rc) {
  2211			smc->clcsock->sk->sk_data_ready = smc->clcsk_data_ready;
  2212			goto out;
  2213		}
  2214		sk->sk_max_ack_backlog = backlog;
  2215		sk->sk_ack_backlog = 0;
  2216		sk->sk_state = SMC_LISTEN;
  2217	
  2218	out:
  2219		release_sock(sk);
  2220		return rc;
  2221	}
  2222	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/3] net/smc: Limits backlog connections
  2022-01-27 12:08 ` [PATCH net-next 2/3] net/smc: Limits backlog connections D. Wythe
  2022-01-27 15:52   ` kernel test robot
@ 2022-01-27 15:52   ` kernel test robot
  2022-01-27 23:23   ` kernel test robot
  2 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2022-01-27 15:52 UTC (permalink / raw)
  To: D. Wythe, kgraul
  Cc: llvm, kbuild-all, kuba, davem, netdev, linux-s390, linux-rdma, D. Wythe

Hi Wythe",

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/D-Wythe/Optimizing-performance-in-short-lived/20220127-200912
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git fbb8295248e1d6f576d444309fcf79356008eac1
config: mips-buildonly-randconfig-r004-20220124 (https://download.01.org/0day-ci/archive/20220127/202201272328.JtsPKLkq-lkp@intel.com/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project f32dccb9a43b02ce4e540d6ba5dbbdb188f2dc7d)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install mips cross compiling tool for clang build
        # apt-get install binutils-mips-linux-gnu
        # https://github.com/0day-ci/linux/commit/718aff24f3fcc73ecb7bff17fcbe029b799c6624
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review D-Wythe/Optimizing-performance-in-short-lived/20220127-200912
        git checkout 718aff24f3fcc73ecb7bff17fcbe029b799c6624
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=mips SHELL=/bin/bash arch/mips/kernel/ net/smc/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> net/smc/af_smc.c:2202:18: error: assigning to 'struct inet_connection_sock_af_ops *' from 'const struct inet_connection_sock_af_ops *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers]
           smc->ori_af_ops = inet_csk(smc->clcsock->sk)->icsk_af_ops;
                           ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   1 error generated.


vim +2202 net/smc/af_smc.c

  2166	
  2167	static int smc_listen(struct socket *sock, int backlog)
  2168	{
  2169		struct sock *sk = sock->sk;
  2170		struct smc_sock *smc;
  2171		int rc;
  2172	
  2173		smc = smc_sk(sk);
  2174		lock_sock(sk);
  2175	
  2176		rc = -EINVAL;
  2177		if ((sk->sk_state != SMC_INIT && sk->sk_state != SMC_LISTEN) ||
  2178		    smc->connect_nonblock)
  2179			goto out;
  2180	
  2181		rc = 0;
  2182		if (sk->sk_state == SMC_LISTEN) {
  2183			sk->sk_max_ack_backlog = backlog;
  2184			goto out;
  2185		}
  2186		/* some socket options are handled in core, so we could not apply
  2187		 * them to the clc socket -- copy smc socket options to clc socket
  2188		 */
  2189		smc_copy_sock_settings_to_clc(smc);
  2190		if (!smc->use_fallback)
  2191			tcp_sk(smc->clcsock->sk)->syn_smc = 1;
  2192	
  2193		/* save original sk_data_ready function and establish
  2194		 * smc-specific sk_data_ready function
  2195		 */
  2196		smc->clcsk_data_ready = smc->clcsock->sk->sk_data_ready;
  2197		smc->clcsock->sk->sk_data_ready = smc_clcsock_data_ready;
  2198		smc->clcsock->sk->sk_user_data =
  2199			(void *)((uintptr_t)smc | SK_USER_DATA_NOCOPY);
  2200	
  2201		/* save origin ops */
> 2202		smc->ori_af_ops = inet_csk(smc->clcsock->sk)->icsk_af_ops;
  2203	
  2204		smc->af_ops = *smc->ori_af_ops;
  2205		smc->af_ops.syn_recv_sock = smc_tcp_syn_recv_sock;
  2206	
  2207		inet_csk(smc->clcsock->sk)->icsk_af_ops = &smc->af_ops;
  2208	
  2209		rc = kernel_listen(smc->clcsock, backlog);
  2210		if (rc) {
  2211			smc->clcsock->sk->sk_data_ready = smc->clcsk_data_ready;
  2212			goto out;
  2213		}
  2214		sk->sk_max_ack_backlog = backlog;
  2215		sk->sk_ack_backlog = 0;
  2216		sk->sk_state = SMC_LISTEN;
  2217	
  2218	out:
  2219		release_sock(sk);
  2220		return rc;
  2221	}
  2222	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 3/3] net/smc: Fallback when handshake workqueue congested
  2022-01-27 12:08 ` [PATCH net-next 3/3] net/smc: Fallback when handshake workqueue congested D. Wythe
@ 2022-01-27 17:09   ` Matthieu Baerts
  2022-01-28 13:54     ` D. Wythe
  2022-01-28 14:05     ` D. Wythe
  2022-01-28  7:17   ` Tony Lu
  1 sibling, 2 replies; 11+ messages in thread
From: Matthieu Baerts @ 2022-01-27 17:09 UTC (permalink / raw)
  To: D. Wythe, kgraul
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, MPTCP Upstream

Hi,

(+cc MPTCP ML)

On 27/01/2022 13:08, D. Wythe wrote:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> This patch intends to provide a mechanism to allow automatic fallback to
> TCP according to the pressure of SMC handshake process. At present,
> frequent visits will cause the incoming connections to be backlogged in
> SMC handshake queue, raise the connections established time. Which is
> quite unacceptable for those applications who base on short lived
> connections.

(...)

> diff --git a/net/smc/Kconfig b/net/smc/Kconfig
> index 1ab3c5a..1903927 100644
> --- a/net/smc/Kconfig
> +++ b/net/smc/Kconfig
> @@ -19,3 +19,15 @@ config SMC_DIAG
>  	  smcss.
>  
>  	  if unsure, say Y.
> +
> +if MPTCP

After having read the code and the commit message, it is not clear to me
 why this new feature requires to have MPTCP enabled. May you share some
explanations about that please?

> +
> +config SMC_AUTO_FALLBACK
> +	bool "SMC: automatic fallback to TCP"
> +	default y
> +	help
> +	  Allow automatic fallback to TCP accroding to the pressure of SMC-R
> +	  handshake process.
> +
> +	  If that's not what you except or unsure, say N.
> +endif

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/3] net/smc: Limits backlog connections
  2022-01-27 12:08 ` [PATCH net-next 2/3] net/smc: Limits backlog connections D. Wythe
  2022-01-27 15:52   ` kernel test robot
  2022-01-27 15:52   ` kernel test robot
@ 2022-01-27 23:23   ` kernel test robot
  2 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2022-01-27 23:23 UTC (permalink / raw)
  To: D. Wythe, kgraul
  Cc: kbuild-all, kuba, davem, netdev, linux-s390, linux-rdma, D. Wythe

Hi Wythe",

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/D-Wythe/Optimizing-performance-in-short-lived/20220127-200912
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git fbb8295248e1d6f576d444309fcf79356008eac1
config: x86_64-randconfig-s022-20220124 (https://download.01.org/0day-ci/archive/20220128/202201280741.2EsIf9Jy-lkp@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.4-dirty
        # https://github.com/0day-ci/linux/commit/718aff24f3fcc73ecb7bff17fcbe029b799c6624
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review D-Wythe/Optimizing-performance-in-short-lived/20220127-200912
        git checkout 718aff24f3fcc73ecb7bff17fcbe029b799c6624
        # save the config file to linux build tree
        mkdir build_dir
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=x86_64 SHELL=/bin/bash net/smc/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)
>> net/smc/af_smc.c:2202:25: sparse: sparse: incorrect type in assignment (different modifiers) @@     expected struct inet_connection_sock_af_ops *ori_af_ops @@     got struct inet_connection_sock_af_ops const *icsk_af_ops @@
   net/smc/af_smc.c:2202:25: sparse:     expected struct inet_connection_sock_af_ops *ori_af_ops
   net/smc/af_smc.c:2202:25: sparse:     got struct inet_connection_sock_af_ops const *icsk_af_ops

vim +2202 net/smc/af_smc.c

  2166	
  2167	static int smc_listen(struct socket *sock, int backlog)
  2168	{
  2169		struct sock *sk = sock->sk;
  2170		struct smc_sock *smc;
  2171		int rc;
  2172	
  2173		smc = smc_sk(sk);
  2174		lock_sock(sk);
  2175	
  2176		rc = -EINVAL;
  2177		if ((sk->sk_state != SMC_INIT && sk->sk_state != SMC_LISTEN) ||
  2178		    smc->connect_nonblock)
  2179			goto out;
  2180	
  2181		rc = 0;
  2182		if (sk->sk_state == SMC_LISTEN) {
  2183			sk->sk_max_ack_backlog = backlog;
  2184			goto out;
  2185		}
  2186		/* some socket options are handled in core, so we could not apply
  2187		 * them to the clc socket -- copy smc socket options to clc socket
  2188		 */
  2189		smc_copy_sock_settings_to_clc(smc);
  2190		if (!smc->use_fallback)
  2191			tcp_sk(smc->clcsock->sk)->syn_smc = 1;
  2192	
  2193		/* save original sk_data_ready function and establish
  2194		 * smc-specific sk_data_ready function
  2195		 */
  2196		smc->clcsk_data_ready = smc->clcsock->sk->sk_data_ready;
  2197		smc->clcsock->sk->sk_data_ready = smc_clcsock_data_ready;
  2198		smc->clcsock->sk->sk_user_data =
  2199			(void *)((uintptr_t)smc | SK_USER_DATA_NOCOPY);
  2200	
  2201		/* save origin ops */
> 2202		smc->ori_af_ops = inet_csk(smc->clcsock->sk)->icsk_af_ops;
  2203	
  2204		smc->af_ops = *smc->ori_af_ops;
  2205		smc->af_ops.syn_recv_sock = smc_tcp_syn_recv_sock;
  2206	
  2207		inet_csk(smc->clcsock->sk)->icsk_af_ops = &smc->af_ops;
  2208	
  2209		rc = kernel_listen(smc->clcsock, backlog);
  2210		if (rc) {
  2211			smc->clcsock->sk->sk_data_ready = smc->clcsk_data_ready;
  2212			goto out;
  2213		}
  2214		sk->sk_max_ack_backlog = backlog;
  2215		sk->sk_ack_backlog = 0;
  2216		sk->sk_state = SMC_LISTEN;
  2217	
  2218	out:
  2219		release_sock(sk);
  2220		return rc;
  2221	}
  2222	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 3/3] net/smc: Fallback when handshake workqueue congested
  2022-01-27 12:08 ` [PATCH net-next 3/3] net/smc: Fallback when handshake workqueue congested D. Wythe
  2022-01-27 17:09   ` Matthieu Baerts
@ 2022-01-28  7:17   ` Tony Lu
  1 sibling, 0 replies; 11+ messages in thread
From: Tony Lu @ 2022-01-28  7:17 UTC (permalink / raw)
  To: D. Wythe; +Cc: kgraul, kuba, davem, netdev, linux-s390, linux-rdma

On Thu, Jan 27, 2022 at 08:08:03PM +0800, D. Wythe wrote:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> This patch intends to provide a mechanism to allow automatic fallback to
> TCP according to the pressure of SMC handshake process. At present,
> frequent visits will cause the incoming connections to be backlogged in
> SMC handshake queue, raise the connections established time. Which is
> quite unacceptable for those applications who base on short lived
> connections.
> 
> It should be optional for applications that don't care about connection
> established time. For now, this patch only provides the switch at the
> compile time.
> 
> There are two ways to implement this mechanism:
> 
> 1. Fallback when TCP established.
> 2. Fallback before TCP established.
> 
> In the first way, we need to wait and receive CLC messages that the
> client will potentially send, and then actively reply with a decline
> message, in a sense, which is also a sort of SMC handshake, affect the
> connections established time on its way.
> 
> In the second way, the only problem is that we need to inject SMC logic
> into TCP when it is about to reply the incoming SYN, since we already do
> that, it's seems not a problem anymore. And advantage is obvious, few
> additional processes are required to complete the fallback.
> 
> This patch use the second way.
> 
> Link: https://lore.kernel.org/all/1641301961-59331-1-git-send-email-alibuda@linux.alibaba.com/
> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
> ---
>  include/linux/tcp.h  |  1 +
>  net/ipv4/tcp_input.c |  3 ++-
>  net/smc/Kconfig      | 12 ++++++++++++
>  net/smc/af_smc.c     | 22 ++++++++++++++++++++++
>  4 files changed, 37 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index 78b91bb..1c4ae5d 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -394,6 +394,7 @@ struct tcp_sock {
>  	bool	is_mptcp;
>  #endif
>  #if IS_ENABLED(CONFIG_SMC)
> +	bool	(*smc_in_limited)(const struct sock *sk);
>  	bool	syn_smc;	/* SYN includes SMC */
>  #endif
>  
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index dc49a3d..9890de9 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -6701,7 +6701,8 @@ static void tcp_openreq_init(struct request_sock *req,
>  	ireq->ir_num = ntohs(tcp_hdr(skb)->dest);
>  	ireq->ir_mark = inet_request_mark(sk, skb);
>  #if IS_ENABLED(CONFIG_SMC)
> -	ireq->smc_ok = rx_opt->smc_ok;
> +	ireq->smc_ok = rx_opt->smc_ok && !(tcp_sk(sk)->smc_in_limited &&
> +			tcp_sk(sk)->smc_in_limited(sk));
>  #endif
>  }
>  
> diff --git a/net/smc/Kconfig b/net/smc/Kconfig
> index 1ab3c5a..1903927 100644
> --- a/net/smc/Kconfig
> +++ b/net/smc/Kconfig
> @@ -19,3 +19,15 @@ config SMC_DIAG
>  	  smcss.
>  
>  	  if unsure, say Y.
> +
> +if MPTCP

If we really need MPTCP? According the context, this doesn't seem necessary.

> +
> +config SMC_AUTO_FALLBACK
> +	bool "SMC: automatic fallback to TCP"
> +	default y
> +	help
> +	  Allow automatic fallback to TCP accroding to the pressure of SMC-R
> +	  handshake process.
> +
> +	  If that's not what you except or unsure, say N.
> +endif

Consider using a dynamic switch to enable or disable this feature? SMC
currently have netlink interface in smc_netlink.c, we can extend this in
userspace tool smc-tools.

Thank you,
Tony Lu

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 3/3] net/smc: Fallback when handshake workqueue congested
  2022-01-27 17:09   ` Matthieu Baerts
@ 2022-01-28 13:54     ` D. Wythe
  2022-01-28 14:05     ` D. Wythe
  1 sibling, 0 replies; 11+ messages in thread
From: D. Wythe @ 2022-01-28 13:54 UTC (permalink / raw)
  To: Matthieu Baerts, kgraul
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, MPTCP Upstream


This is a spelling error, which has nothing to do with MPTCP. I'll fix 
it soon.


在 2022/1/28 上午1:09, Matthieu Baerts 写道:
> Hi,
> 
> (+cc MPTCP ML)
> 
> On 27/01/2022 13:08, D. Wythe wrote:
>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>
>> This patch intends to provide a mechanism to allow automatic fallback to
>> TCP according to the pressure of SMC handshake process. At present,
>> frequent visits will cause the incoming connections to be backlogged in
>> SMC handshake queue, raise the connections established time. Which is
>> quite unacceptable for those applications who base on short lived
>> connections.
> 
> (...)
> 
>> diff --git a/net/smc/Kconfig b/net/smc/Kconfig
>> index 1ab3c5a..1903927 100644
>> --- a/net/smc/Kconfig
>> +++ b/net/smc/Kconfig
>> @@ -19,3 +19,15 @@ config SMC_DIAG
>>   	  smcss.
>>   
>>   	  if unsure, say Y.
>> +
>> +if MPTCP
> 
> After having read the code and the commit message, it is not clear to me
>   why this new feature requires to have MPTCP enabled. May you share some
> explanations about that please?
> 
>> +
>> +config SMC_AUTO_FALLBACK
>> +	bool "SMC: automatic fallback to TCP"
>> +	default y
>> +	help
>> +	  Allow automatic fallback to TCP accroding to the pressure of SMC-R
>> +	  handshake process.
>> +
>> +	  If that's not what you except or unsure, say N.
>> +endif
> 
> Cheers,
> Matt

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 3/3] net/smc: Fallback when handshake workqueue congested
  2022-01-27 17:09   ` Matthieu Baerts
  2022-01-28 13:54     ` D. Wythe
@ 2022-01-28 14:05     ` D. Wythe
  1 sibling, 0 replies; 11+ messages in thread
From: D. Wythe @ 2022-01-28 14:05 UTC (permalink / raw)
  To: Matthieu Baerts, kgraul
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, MPTCP Upstream

Sorry for this mistake。 Thanks for your pointing out.

Thanks again.


在 2022/1/28 上午1:09, Matthieu Baerts 写道:
> Hi,
> 
> (+cc MPTCP ML)
> 
> On 27/01/2022 13:08, D. Wythe wrote:
>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>
>> This patch intends to provide a mechanism to allow automatic fallback to
>> TCP according to the pressure of SMC handshake process. At present,
>> frequent visits will cause the incoming connections to be backlogged in
>> SMC handshake queue, raise the connections established time. Which is
>> quite unacceptable for those applications who base on short lived
>> connections.
> 
> (...)
> 
>> diff --git a/net/smc/Kconfig b/net/smc/Kconfig
>> index 1ab3c5a..1903927 100644
>> --- a/net/smc/Kconfig
>> +++ b/net/smc/Kconfig
>> @@ -19,3 +19,15 @@ config SMC_DIAG
>>   	  smcss.
>>   
>>   	  if unsure, say Y.
>> +
>> +if MPTCP
> 
> After having read the code and the commit message, it is not clear to me
>   why this new feature requires to have MPTCP enabled. May you share some
> explanations about that please?
> 
>> +
>> +config SMC_AUTO_FALLBACK
>> +	bool "SMC: automatic fallback to TCP"
>> +	default y
>> +	help
>> +	  Allow automatic fallback to TCP accroding to the pressure of SMC-R
>> +	  handshake process.
>> +
>> +	  If that's not what you except or unsure, say N.
>> +endif
> 
> Cheers,
> Matt

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-01-28 14:05 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-27 12:08 [PATCH net-next 0/3] Optimizing performance in short-lived D. Wythe
2022-01-27 12:08 ` [PATCH net-next 1/3] net/smc: Make smc_tcp_listen_work() independent D. Wythe
2022-01-27 12:08 ` [PATCH net-next 2/3] net/smc: Limits backlog connections D. Wythe
2022-01-27 15:52   ` kernel test robot
2022-01-27 15:52   ` kernel test robot
2022-01-27 23:23   ` kernel test robot
2022-01-27 12:08 ` [PATCH net-next 3/3] net/smc: Fallback when handshake workqueue congested D. Wythe
2022-01-27 17:09   ` Matthieu Baerts
2022-01-28 13:54     ` D. Wythe
2022-01-28 14:05     ` D. Wythe
2022-01-28  7:17   ` Tony Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).