* [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect()
@ 2021-11-11 16:21 Paolo Abeni
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 1/7] mptcp: keep snd_una updated for fallback socket Paolo Abeni
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Paolo Abeni @ 2021-11-11 16:21 UTC (permalink / raw)
  To: mptcp

As outlined in the public meeting, mptcp_accept() is currently quite
suboptimal in terms of both performance and code complexity.

This series tries to clean it up, enforcing a wider lifetime for
the initial subflow, so that we don't need to acquire additional
references there.

To reach that goal we need to properly define the disconnect()
behavior, which is currently quite incomplete. Additionally, allow
user-space to actually disconnect established connections.

Disconnect() needs in turn an egress FASTCLOSE implementation,
added here according to option R (reset - the simpler form).

Finally, the self-tests require, as a prerequisite, Florian's patches
implementing SIOCOUTQ.

v1 -> v2:
 - update mptcp_connect argument lists and usage() in patch 7/7

RFC -> v1:
 - added patches 1/7, 3/7, 6/7, 7/7
 - added a few missing bits in patch 4/7

Paolo Abeni (7):
  mptcp: keep snd_una updated for fallback socket
  mptcp: never allow the PM to close a listener subflow
  mptcp: implement fastclose xmit path
  mptcp: full disconnect implementation
  mptcp: cleanup accept and poll
  mptcp: implement support for user-space disconnect
  mptcp: add disconnect selftests

 net/mptcp/options.c                           |  57 +++++--
 net/mptcp/pm.c                                |  10 +-
 net/mptcp/pm_netlink.c                        |   3 +
 net/mptcp/protocol.c                          | 144 +++++++++++------
 net/mptcp/protocol.h                          |  16 +-
 net/mptcp/subflow.c                           |   1 -
 net/mptcp/token.c                             |   1 +
 .../selftests/net/mptcp/mptcp_connect.c       | 148 +++++++++++++++---
 .../selftests/net/mptcp/mptcp_connect.sh      |  39 ++++-
 9 files changed, 324 insertions(+), 95 deletions(-)

-- 
2.33.1



* [PATCH v2 mptcp-next 1/7] mptcp: keep snd_una updated for fallback socket
  2021-11-11 16:21 [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect() Paolo Abeni
@ 2021-11-11 16:21 ` Paolo Abeni
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 2/7] mptcp: never allow the PM to close a listener subflow Paolo Abeni
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2021-11-11 16:21 UTC (permalink / raw)
  To: mptcp

After shutdown, for fallback MPTCP sockets, we always have

write_seq == snd_una+1

The above will foul OUTQ ioctl(). Keep snd_una in sync with
write_seq even after shutdown.
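
To illustrate the off-by-one, here is a minimal user-space sketch, not
kernel code. It assumes the out-queue size is computed as
write_seq - snd_una; struct fake_msk and both helpers are hypothetical
names used only for this example.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two msk sequence counters involved. */
struct fake_msk {
	uint64_t write_seq;	/* next sequence number to be written */
	uint64_t snd_una;	/* oldest unacknowledged sequence number */
};

/* Assumed OUTQ computation: bytes written but not yet acknowledged. */
static uint64_t fake_outq(const struct fake_msk *msk)
{
	return msk->write_seq - msk->snd_una;
}

/* Mirrors the one-line fix: on fallback sockets, pull snd_una up. */
static void fake_sync_snd_una(struct fake_msk *msk)
{
	msk->snd_una = msk->write_seq;
}
```

With write_seq == snd_una + 1 the sketch reports one byte still queued;
after the sync it reports zero, which is what a caller polling the
queue until it drains would need to see.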

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/protocol.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 29b6f57b917e..3fef1b4e7780 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2690,6 +2690,7 @@ static void __mptcp_check_send_data_fin(struct sock *sk)
 	 * state now
 	 */
 	if (__mptcp_check_fallback(msk)) {
+		WRITE_ONCE(msk->snd_una, msk->write_seq);
 		if ((1 << sk->sk_state) & (TCPF_CLOSING | TCPF_LAST_ACK)) {
 			inet_sk_state_store(sk, TCP_CLOSE);
 			mptcp_close_wake_up(sk);
-- 
2.33.1



* [PATCH v2 mptcp-next 2/7] mptcp: never allow the PM to close a listener subflow
  2021-11-11 16:21 [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect() Paolo Abeni
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 1/7] mptcp: keep snd_una updated for fallback socket Paolo Abeni
@ 2021-11-11 16:21 ` Paolo Abeni
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 3/7] mptcp: implement fastclose xmit path Paolo Abeni
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2021-11-11 16:21 UTC (permalink / raw)
  To: mptcp

Currently, when deleting an endpoint the netlink PM traverses
all the local MPTCP sockets, regardless of their status.

If an MPTCP listener socket is bound to the IP matching the
deleted endpoint, the listener TCP socket will be closed.
That is unexpected: the PM should only affect data subflows.

Fix the issue by explicitly skipping MPTCP sockets in TCP_LISTEN
state.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/pm_netlink.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 7b96be1e9f14..f523051f5aef 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -700,6 +700,9 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
 
 	msk_owned_by_me(msk);
 
+	if (sk->sk_state == TCP_LISTEN)
+		return;
+
 	if (!rm_list->nr)
 		return;
 
-- 
2.33.1



* [PATCH v2 mptcp-next 3/7] mptcp: implement fastclose xmit path
  2021-11-11 16:21 [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect() Paolo Abeni
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 1/7] mptcp: keep snd_una updated for fallback socket Paolo Abeni
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 2/7] mptcp: never allow the PM to close a listener subflow Paolo Abeni
@ 2021-11-11 16:21 ` Paolo Abeni
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 4/7] mptcp: full disconnect implementation Paolo Abeni
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2021-11-11 16:21 UTC (permalink / raw)
  To: mptcp

Allow the MPTCP xmit path to add the MP_FASTCLOSE suboption
on egress RST packets.

Additionally, reorder the writing of the related options to reduce
the number of conditionals required in the fast path.
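
For reference, the on-wire layout of the suboption can be sketched in
plain user-space C. This is an illustration only, assuming the RFC 8684
layout (kind 30, length 12, subtype 0x7, 12 reserved bits, then the
receiver's 64-bit key in network byte order); the SKETCH_* constants and
sketch_write_fastclose() are hypothetical names, not kernel symbols.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TCPOPT_MPTCP		30	/* TCP option kind for MPTCP */
#define SKETCH_MPTCPOPT_MP_FASTCLOSE	7	/* MP_FASTCLOSE subtype */
#define SKETCH_TCPOLEN_MPTCP_FASTCLOSE	12	/* 4-byte header + 8-byte key */

/* Write an MP_FASTCLOSE option into buf. Returns the number of bytes
 * written, or 0 when the remaining option space is too small and the
 * option has to be skipped.
 */
static size_t sketch_write_fastclose(uint8_t *buf, size_t remaining,
				     uint64_t rcvr_key)
{
	int i;

	if (remaining < SKETCH_TCPOLEN_MPTCP_FASTCLOSE)
		return 0;

	buf[0] = SKETCH_TCPOPT_MPTCP;
	buf[1] = SKETCH_TCPOLEN_MPTCP_FASTCLOSE;
	buf[2] = SKETCH_MPTCPOPT_MP_FASTCLOSE << 4;	/* subtype + 4 reserved bits */
	buf[3] = 0;					/* 8 more reserved bits */

	/* receiver's key, most significant byte first */
	for (i = 0; i < 8; i++)
		buf[4 + i] = (uint8_t)(rcvr_key >> (56 - 8 * i));

	return SKETCH_TCPOLEN_MPTCP_FASTCLOSE;
}
```

The early return on insufficient space plays the same role as the
"remaining < TCPOLEN_MPTCP_FASTCLOSE" check in the patch.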

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/options.c  | 57 +++++++++++++++++++++++++++++++-------------
 net/mptcp/protocol.h |  1 +
 2 files changed, 42 insertions(+), 16 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 68a9a1c79200..8a1020e4285c 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -768,6 +768,28 @@ static noinline bool mptcp_established_options_rst(struct sock *sk, struct sk_bu
 	return true;
 }
 
+static bool mptcp_established_options_fastclose(struct sock *sk,
+						unsigned int *size,
+						unsigned int remaining,
+						struct mptcp_out_options *opts)
+{
+	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
+	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
+
+	if (likely(!subflow->send_fastclose))
+		return false;
+
+	if (remaining < TCPOLEN_MPTCP_FASTCLOSE)
+		return false;
+
+	*size = TCPOLEN_MPTCP_FASTCLOSE;
+	opts->suboptions |= OPTION_MPTCP_FASTCLOSE;
+	opts->rcvr_key = msk->remote_key;
+
+	pr_debug("FASTCLOSE key=%llu", opts->rcvr_key);
+	return true;
+}
+
 static bool mptcp_established_options_mp_fail(struct sock *sk,
 					      unsigned int *size,
 					      unsigned int remaining,
@@ -806,11 +828,9 @@ bool mptcp_established_options(struct sock *sk, struct sk_buff *skb,
 		return false;
 
 	if (unlikely(skb && TCP_SKB_CB(skb)->tcp_flags & TCPHDR_RST)) {
-		if (mptcp_established_options_mp_fail(sk, &opt_size, remaining, opts)) {
-			*size += opt_size;
-			remaining -= opt_size;
-		}
-		if (mptcp_established_options_rst(sk, skb, &opt_size, remaining, opts)) {
+		if (mptcp_established_options_fastclose(sk, &opt_size, remaining, opts) ||
+		    mptcp_established_options_mp_fail(sk, &opt_size, remaining, opts) ||
+		    mptcp_established_options_rst(sk, skb, &opt_size, remaining, opts)) {
 			*size += opt_size;
 			remaining -= opt_size;
 		}
@@ -1251,17 +1271,8 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 		ptr += 2;
 	}
 
-	/* RST is mutually exclusive with everything else */
-	if (unlikely(OPTION_MPTCP_RST & opts->suboptions)) {
-		*ptr++ = mptcp_option(MPTCPOPT_RST,
-				      TCPOLEN_MPTCP_RST,
-				      opts->reset_transient,
-				      opts->reset_reason);
-		return;
-	}
-
-	/* DSS, MPC, MPJ and ADD_ADDR are mutually exclusive, see
-	 * mptcp_established_options*()
+	/* DSS, MPC, MPJ, ADD_ADDR, FASTCLOSE and RST are mutually exclusive,
+	 * see mptcp_established_options*()
 	 */
 	if (likely(OPTION_MPTCP_DSS & opts->suboptions)) {
 		struct mptcp_ext *mpext = &opts->ext_copy;
@@ -1447,6 +1458,20 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 				ptr += 1;
 			}
 		}
+	} else if (unlikely(OPTION_MPTCP_RST & opts->suboptions)) {
+		/* RST is mutually exclusive with everything else */
+		*ptr++ = mptcp_option(MPTCPOPT_RST,
+				      TCPOLEN_MPTCP_RST,
+				      opts->reset_transient,
+				      opts->reset_reason);
+		return;
+	} else if (unlikely(OPTION_MPTCP_FASTCLOSE & opts->suboptions)) {
+		/* FASTCLOSE is mutually exclusive with everything else */
+		*ptr++ = mptcp_option(MPTCPOPT_MP_FASTCLOSE,
+				      TCPOLEN_MPTCP_FASTCLOSE,
+				      0, 0);
+		put_unaligned_be64(opts->rcvr_key, ptr);
+		return;
 	}
 
 	if (OPTION_MPTCP_PRIO & opts->suboptions) {
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index e77de7662df0..cee323de1a1c 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -422,6 +422,7 @@ struct mptcp_subflow_context {
 		backup : 1,
 		send_mp_prio : 1,
 		send_mp_fail : 1,
+		send_fastclose : 1,
 		send_infinite_map : 1,
 		rx_eof : 1,
 		can_ack : 1,        /* only after processing the remote a key */
-- 
2.33.1



* [PATCH v2 mptcp-next 4/7] mptcp: full disconnect implementation
  2021-11-11 16:21 [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect() Paolo Abeni
                   ` (2 preceding siblings ...)
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 3/7] mptcp: implement fastclose xmit path Paolo Abeni
@ 2021-11-11 16:21 ` Paolo Abeni
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 5/7] mptcp: cleanup accept and poll Paolo Abeni
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2021-11-11 16:21 UTC (permalink / raw)
  To: mptcp

The current mptcp_disconnect() implementation lacks several
steps: we additionally need to reset the msk socket state
and flush the subflow list.

Factor out the needed helpers to avoid code duplication.

Additionally, ensure that the initial subflow is disposed of
only at mptcp_close() time; at disconnect time it is just reset.
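
The "just reset" step clears a contiguous range of the subflow context
while preserving the fields outside that range. A minimal user-space
sketch of the idea follows; it uses offsetof() instead of zero-length
marker arrays to stay within standard C, and struct sketch_ctx with its
fields is a hypothetical, heavily simplified stand-in for the real
context.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical context: 'node' and 'delegated_node' must survive a
 * reset, everything in between is cleared.
 */
struct sketch_ctx {
	void *node;			/* list linkage, preserved */
	unsigned long local_key;	/* first field of the reset range */
	unsigned long remote_key;
	unsigned int request_mptcp;
	void *delegated_node;		/* first field after the reset range */
};

static void sketch_ctx_reset(struct sketch_ctx *ctx)
{
	size_t start = offsetof(struct sketch_ctx, local_key);
	size_t end = offsetof(struct sketch_ctx, delegated_node);

	/* zero only the bytes between the two boundaries */
	memset((char *)ctx + start, 0, end - start);

	/* a reset context asks for MPTCP again on the next connect */
	ctx->request_mptcp = 1;
}
```

Keeping the list linkage outside the cleared range matters because the
initial subflow stays on the connection list across a disconnect.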

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v1 -> v2:
 - fix compile warning (CI)
 - reset first subflow socket state on disconnect
 - use fast-close on disconnect
---
 net/mptcp/pm.c       |  10 +++--
 net/mptcp/protocol.c | 101 ++++++++++++++++++++++++++++++++-----------
 net/mptcp/protocol.h |  14 ++++++
 net/mptcp/token.c    |   1 +
 4 files changed, 98 insertions(+), 28 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 86b38a830b4c..761995a34124 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -362,7 +362,7 @@ void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk)
 	}
 }
 
-void mptcp_pm_data_init(struct mptcp_sock *msk)
+void mptcp_pm_data_reset(struct mptcp_sock *msk)
 {
 	msk->pm.add_addr_signaled = 0;
 	msk->pm.add_addr_accepted = 0;
@@ -377,10 +377,14 @@ void mptcp_pm_data_init(struct mptcp_sock *msk)
 	WRITE_ONCE(msk->pm.remote_deny_join_id0, false);
 	msk->pm.status = 0;
 
+	mptcp_pm_nl_data_init(msk);
+}
+
+void mptcp_pm_data_init(struct mptcp_sock *msk)
+{
 	spin_lock_init(&msk->pm.lock);
 	INIT_LIST_HEAD(&msk->pm.anno_list);
-
-	mptcp_pm_nl_data_init(msk);
+	mptcp_pm_data_reset(msk);
 }
 
 void __init mptcp_pm_init(void)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3fef1b4e7780..84a3df43a38d 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2271,6 +2271,10 @@ bool __mptcp_retransmit_pending_data(struct sock *sk)
 	return true;
 }
 
+/* flags for __mptcp_close_ssk() */
+#define MPTCP_CF_PUSH		BIT(1)
+#define MPTCP_CF_FASTCLOSE	BIT(2)
+
 /* subflow sockets can be either outgoing (connect) or incoming
  * (accept).
  *
@@ -2280,22 +2284,37 @@ bool __mptcp_retransmit_pending_data(struct sock *sk)
  * parent socket.
  */
 static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
-			      struct mptcp_subflow_context *subflow)
+			      struct mptcp_subflow_context *subflow,
+			      unsigned int flags)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
-	bool need_push;
+	bool need_push, dispose_it;
 
-	list_del(&subflow->node);
+	dispose_it = !msk->subflow || ssk != msk->subflow->sk;
+	if (dispose_it)
+		list_del(&subflow->node);
 
 	lock_sock_nested(ssk, SINGLE_DEPTH_NESTING);
 
+	if (flags & MPTCP_CF_FASTCLOSE)
+		subflow->send_fastclose = 1;
+
+	need_push = (flags & MPTCP_CF_PUSH) && __mptcp_retransmit_pending_data(sk);
+	if (!dispose_it) {
+		tcp_disconnect(ssk, 0);
+		msk->subflow->state = SS_UNCONNECTED;
+		mptcp_subflow_ctx_reset(subflow);
+		release_sock(ssk);
+
+		goto out;
+	}
+
 	/* if we are invoked by the msk cleanup code, the subflow is
 	 * already orphaned
 	 */
 	if (ssk->sk_socket)
 		sock_orphan(ssk);
 
-	need_push = __mptcp_retransmit_pending_data(sk);
 	subflow->disposable = 1;
 
 	/* if ssk hit tcp_done(), tcp_cleanup_ulp() cleared the related ops
@@ -2315,14 +2334,12 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 
 	sock_put(ssk);
 
-	if (ssk == msk->last_snd)
-		msk->last_snd = NULL;
-
 	if (ssk == msk->first)
 		msk->first = NULL;
 
-	if (msk->subflow && ssk == msk->subflow->sk)
-		mptcp_dispose_initial_subflow(msk);
+out:
+	if (ssk == msk->last_snd)
+		msk->last_snd = NULL;
 
 	if (need_push)
 		__mptcp_push_pending(sk, 0);
@@ -2333,7 +2350,7 @@ void mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 {
 	if (sk->sk_state == TCP_ESTABLISHED)
 		mptcp_event(MPTCP_EVENT_SUB_CLOSED, mptcp_sk(sk), ssk, GFP_KERNEL);
-	__mptcp_close_ssk(sk, ssk, subflow);
+	__mptcp_close_ssk(sk, ssk, subflow, MPTCP_CF_PUSH);
 }
 
 static unsigned int mptcp_sync_mss(struct sock *sk, u32 pmtu)
@@ -2557,9 +2574,20 @@ static int __mptcp_init_sock(struct sock *sk)
 	return 0;
 }
 
-static int mptcp_init_sock(struct sock *sk)
+static void mptcp_ca_reset(struct sock *sk)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
+
+	tcp_assign_congestion_control(sk);
+	strcpy(mptcp_sk(sk)->ca_name, icsk->icsk_ca_ops->name);
+
+	/* no need to keep a reference to the ops, the name will suffice */
+	tcp_cleanup_congestion_control(sk);
+	icsk->icsk_ca_ops = NULL;
+}
+
+static int mptcp_init_sock(struct sock *sk)
+{
 	struct net *net = sock_net(sk);
 	int ret;
 
@@ -2580,12 +2608,7 @@ static int mptcp_init_sock(struct sock *sk)
 	/* fetch the ca name; do it outside __mptcp_init_sock(), so that clone will
 	 * propagate the correct value
 	 */
-	tcp_assign_congestion_control(sk);
-	strcpy(mptcp_sk(sk)->ca_name, icsk->icsk_ca_ops->name);
-
-	/* no need to keep a reference to the ops, the name will suffice */
-	tcp_cleanup_congestion_control(sk);
-	icsk->icsk_ca_ops = NULL;
+	mptcp_ca_reset(sk);
 
 	sk_sockets_allocated_inc(sk);
 	sk->sk_rcvbuf = sock_net(sk)->ipv4.sysctl_tcp_rmem[1];
@@ -2744,9 +2767,13 @@ static void __mptcp_destroy_sock(struct sock *sk)
 	sk_stop_timer(sk, &sk->sk_timer);
 	msk->pm.status = 0;
 
+	/* clears msk->subflow, allowing the following loop to close
+	 * even the initial subflow
+	 */
+	mptcp_dispose_initial_subflow(msk);
 	list_for_each_entry_safe(subflow, tmp, &conn_list, node) {
 		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
-		__mptcp_close_ssk(sk, ssk, subflow);
+		__mptcp_close_ssk(sk, ssk, subflow, 0);
 	}
 
 	sk->sk_prot->destroy(sk);
@@ -2757,7 +2784,6 @@ static void __mptcp_destroy_sock(struct sock *sk)
 	xfrm_sk_free_policy(sk);
 
 	sk_refcnt_debug_release(sk);
-	mptcp_dispose_initial_subflow(msk);
 	sock_put(sk);
 }
 
@@ -2793,6 +2819,9 @@ static void mptcp_close(struct sock *sk, long timeout)
 
 	sock_hold(sk);
 	pr_debug("msk=%p state=%d", sk, sk->sk_state);
+	if (mptcp_sk(sk)->token)
+		mptcp_event(MPTCP_EVENT_CLOSED, mptcp_sk(sk), NULL, GFP_KERNEL);
+
 	if (sk->sk_state == TCP_CLOSE) {
 		__mptcp_destroy_sock(sk);
 		do_cancel_work = true;
@@ -2803,9 +2832,6 @@ static void mptcp_close(struct sock *sk, long timeout)
 	if (do_cancel_work)
 		mptcp_cancel_work(sk);
 
-	if (mptcp_sk(sk)->token)
-		mptcp_event(MPTCP_EVENT_CLOSED, mptcp_sk(sk), NULL, GFP_KERNEL);
-
 	sock_put(sk);
 }
 
@@ -2839,13 +2865,36 @@ static int mptcp_disconnect(struct sock *sk, int flags)
 
 	mptcp_do_flush_join_list(msk);
 
+	inet_sk_state_store(sk, TCP_CLOSE);
+
 	mptcp_for_each_subflow(msk, subflow) {
 		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
 
-		lock_sock(ssk);
-		tcp_disconnect(ssk, flags);
-		release_sock(ssk);
+		__mptcp_close_ssk(sk, ssk, subflow, MPTCP_CF_FASTCLOSE);
 	}
+
+	sk_stop_timer(sk, &msk->sk.icsk_retransmit_timer);
+	sk_stop_timer(sk, &sk->sk_timer);
+
+	if (mptcp_sk(sk)->token)
+		mptcp_event(MPTCP_EVENT_CLOSED, mptcp_sk(sk), NULL, GFP_KERNEL);
+
+	mptcp_destroy_common(msk);
+	msk->last_snd = NULL;
+	msk->flags = 0;
+	msk->recovery = false;
+	msk->can_ack = false;
+	msk->fully_established = false;
+	msk->rcv_data_fin = false;
+	msk->snd_data_fin_enable = false;
+	msk->rcv_fastclose = false;
+	msk->use_64bit_ack = false;
+	WRITE_ONCE(msk->csum_enabled, mptcp_is_checksum_enabled(sock_net(sk)));
+	mptcp_pm_data_reset(msk);
+	mptcp_ca_reset(sk);
+
+	sk->sk_shutdown = 0;
+	sk_error_report(sk);
 	return 0;
 }
 
@@ -2985,9 +3034,11 @@ void mptcp_destroy_common(struct mptcp_sock *msk)
 	__mptcp_clear_xmit(sk);
 
 	/* move to sk_receive_queue, sk_stream_kill_queues will purge it */
+	mptcp_data_lock(sk);
 	skb_queue_splice_tail_init(&msk->receive_queue, &sk->sk_receive_queue);
 	__skb_queue_purge(&sk->sk_receive_queue);
 	skb_rbtree_purge(&msk->out_of_order_queue);
+	mptcp_data_unlock(sk);
 
 	/* move all the rx fwd alloc into the sk_mem_reclaim_final in
 	 * inet_sock_destruct() will dispose it
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index cee323de1a1c..f8ed68c5ef9d 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -394,6 +394,9 @@ DECLARE_PER_CPU(struct mptcp_delegated_action, mptcp_delegated_actions);
 /* MPTCP subflow context */
 struct mptcp_subflow_context {
 	struct	list_head node;/* conn_list of subflows */
+
+	char	reset_start[0];
+
 	unsigned long avg_pacing_rate; /* protected by msk socket lock */
 	u64	local_key;
 	u64	remote_key;
@@ -442,6 +445,9 @@ struct mptcp_subflow_context {
 	u8	stale_count;
 
 	long	delegated_status;
+
+	char	reset_end[0];
+
 	struct	list_head delegated_node;   /* link into delegated_action, protected by local BH */
 
 	u32	setsockopt_seq;
@@ -473,6 +479,13 @@ mptcp_subflow_tcp_sock(const struct mptcp_subflow_context *subflow)
 	return subflow->tcp_sock;
 }
 
+static inline void
+mptcp_subflow_ctx_reset(struct mptcp_subflow_context *subflow)
+{
+	memset(subflow->reset_start, 0, subflow->reset_end - subflow->reset_start);
+	subflow->request_mptcp = 1;
+}
+
 static inline u64
 mptcp_subflow_get_map_offset(const struct mptcp_subflow_context *subflow)
 {
@@ -712,6 +725,7 @@ void mptcp_crypto_hmac_sha(u64 key1, u64 key2, u8 *msg, int len, void *hmac);
 
 void __init mptcp_pm_init(void);
 void mptcp_pm_data_init(struct mptcp_sock *msk);
+void mptcp_pm_data_reset(struct mptcp_sock *msk);
 void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk);
 void mptcp_pm_nl_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk);
 void mptcp_pm_new_connection(struct mptcp_sock *msk, const struct sock *ssk, int server_side);
diff --git a/net/mptcp/token.c b/net/mptcp/token.c
index e581b341c5be..f52ee7b26aed 100644
--- a/net/mptcp/token.c
+++ b/net/mptcp/token.c
@@ -384,6 +384,7 @@ void mptcp_token_destroy(struct mptcp_sock *msk)
 		bucket->chain_len--;
 	}
 	spin_unlock_bh(&bucket->lock);
+	WRITE_ONCE(msk->token, 0);
 }
 
 void __init mptcp_token_init(void)
-- 
2.33.1



* [PATCH v2 mptcp-next 5/7] mptcp: cleanup accept and poll
  2021-11-11 16:21 [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect() Paolo Abeni
                   ` (3 preceding siblings ...)
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 4/7] mptcp: full disconnect implementation Paolo Abeni
@ 2021-11-11 16:21 ` Paolo Abeni
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 6/7] mptcp: implement support for user-space disconnect Paolo Abeni
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2021-11-11 16:21 UTC (permalink / raw)
  To: mptcp

After the previous patch, msk->subflow will never be deleted during
the whole msk lifetime. We no longer need to acquire references to
it in mptcp_stream_accept(), and we can use the listener subflow accept
queue to simplify mptcp_poll() for listener sockets.

Overall this removes a lock pair and 4 more atomic operations per accept().

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/protocol.c | 25 +++++++------------------
 net/mptcp/protocol.h |  1 -
 net/mptcp/subflow.c  |  1 -
 3 files changed, 7 insertions(+), 20 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 84a3df43a38d..ee2a5169c13a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3517,17 +3517,9 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
 
 	pr_debug("msk=%p", msk);
 
-	lock_sock(sock->sk);
-	if (sock->sk->sk_state != TCP_LISTEN)
-		goto unlock_fail;
-
 	ssock = __mptcp_nmpc_socket(msk);
 	if (!ssock)
-		goto unlock_fail;
-
-	clear_bit(MPTCP_DATA_READY, &msk->flags);
-	sock_hold(ssock->sk);
-	release_sock(sock->sk);
+		return -EINVAL;
 
 	err = ssock->ops->accept(sock, newsock, flags, kern);
 	if (err == 0 && !mptcp_is_tcpsk(newsock->sk)) {
@@ -3567,14 +3559,7 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
 		release_sock(newsk);
 	}
 
-	if (inet_csk_listen_poll(ssock->sk))
-		set_bit(MPTCP_DATA_READY, &msk->flags);
-	sock_put(ssock->sk);
 	return err;
-
-unlock_fail:
-	release_sock(sock->sk);
-	return -EINVAL;
 }
 
 static __poll_t mptcp_check_readable(struct mptcp_sock *msk)
@@ -3620,8 +3605,12 @@ static __poll_t mptcp_poll(struct file *file, struct socket *sock,
 
 	state = inet_sk_state_load(sk);
 	pr_debug("msk=%p state=%d flags=%lx", msk, state, msk->flags);
-	if (state == TCP_LISTEN)
-		return test_bit(MPTCP_DATA_READY, &msk->flags) ? EPOLLIN | EPOLLRDNORM : 0;
+	if (state == TCP_LISTEN) {
+		if (WARN_ON_ONCE(!msk->subflow || !msk->subflow->sk))
+			return 0;
+
+		return inet_csk_listen_poll(msk->subflow->sk);
+	}
 
 	if (state != TCP_SYN_SENT && state != TCP_SYN_RECV) {
 		mask |= mptcp_check_readable(msk);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index f8ed68c5ef9d..a6a4bd7de5b4 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -111,7 +111,6 @@
 #define MPTCP_RST_TRANSIENT	BIT(0)
 
 /* MPTCP socket flags */
-#define MPTCP_DATA_READY	0
 #define MPTCP_NOSPACE		1
 #define MPTCP_WORK_RTX		2
 #define MPTCP_WORK_EOF		3
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 12c49f898e28..2aea7935019e 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1298,7 +1298,6 @@ static void subflow_data_ready(struct sock *sk)
 		if (reqsk_queue_empty(&inet_csk(sk)->icsk_accept_queue))
 			return;
 
-		set_bit(MPTCP_DATA_READY, &msk->flags);
 		parent->sk_data_ready(parent);
 		return;
 	}
-- 
2.33.1



* [PATCH v2 mptcp-next 6/7] mptcp: implement support for user-space disconnect
  2021-11-11 16:21 [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect() Paolo Abeni
                   ` (4 preceding siblings ...)
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 5/7] mptcp: cleanup accept and poll Paolo Abeni
@ 2021-11-11 16:21 ` Paolo Abeni
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 7/7] mptcp: add disconnect selftests Paolo Abeni
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2021-11-11 16:21 UTC (permalink / raw)
  To: mptcp

Explicitly handle AF_UNSPEC in mptcp_stream_connect() to
allow user-space to disconnect established MPTCP connections.
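
From user-space, the disconnect is requested with a plain connect()
call carrying an AF_UNSPEC address. A minimal sketch, where
sketch_disconnect() is a hypothetical helper name and the caller is
assumed to have flushed any pending data beforehand:

```c
#include <string.h>
#include <sys/socket.h>

/* Dissolve the association of a connected socket by "connecting" it to
 * an AF_UNSPEC address. Returns 0 on success, -1 with errno set on error.
 */
static int sketch_disconnect(int fd)
{
	struct sockaddr_storage empty;

	memset(&empty, 0, sizeof(empty));
	empty.ss_family = AF_UNSPEC;

	return connect(fd, (struct sockaddr *)&empty, sizeof(empty));
}
```

After a successful call the same file descriptor can be passed to
connect() again with a real peer address.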

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/protocol.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index ee2a5169c13a..8319db8ae3ed 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3428,9 +3428,20 @@ static int mptcp_stream_connect(struct socket *sock, struct sockaddr *uaddr,
 	struct mptcp_sock *msk = mptcp_sk(sock->sk);
 	struct mptcp_subflow_context *subflow;
 	struct socket *ssock;
-	int err;
+	int err = -EINVAL;
 
 	lock_sock(sock->sk);
+	if (uaddr) {
+		if (addr_len < sizeof(uaddr->sa_family))
+			goto unlock;
+
+		if (uaddr->sa_family == AF_UNSPEC) {
+			err = mptcp_disconnect(sock->sk, flags);
+			sock->state = err ? SS_DISCONNECTING : SS_UNCONNECTED;
+			goto unlock;
+		}
+	}
+
 	if (sock->state != SS_UNCONNECTED && msk->subflow) {
 		/* pending connection or invalid state, let existing subflow
 		 * cope with that
@@ -3440,10 +3451,8 @@ static int mptcp_stream_connect(struct socket *sock, struct sockaddr *uaddr,
 	}
 
 	ssock = __mptcp_nmpc_socket(msk);
-	if (!ssock) {
-		err = -EINVAL;
+	if (!ssock)
 		goto unlock;
-	}
 
 	mptcp_token_destroy(msk);
 	inet_sk_state_store(sock->sk, TCP_SYN_SENT);
-- 
2.33.1



* [PATCH v2 mptcp-next 7/7] mptcp: add disconnect selftests
  2021-11-11 16:21 [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect() Paolo Abeni
                   ` (5 preceding siblings ...)
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 6/7] mptcp: implement support for user-space disconnect Paolo Abeni
@ 2021-11-11 16:21 ` Paolo Abeni
  2021-11-12  4:24 ` [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect() Mat Martineau
  2021-11-19 16:47 ` Matthieu Baerts
  8 siblings, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2021-11-11 16:21 UTC (permalink / raw)
  To: mptcp

Perform several disconnect/reconnect cycles on the same socket,
ensuring the overall transfer is successful.

The new test leverages ioctl(SIOCOUTQ) to ensure all the
pending data is acked before disconnecting.

Additionally, sort the test program argument list alphabetically
for better maintainability.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
notes:
- v1 -> v2:
  - updated usage, re-ordered args
- this is on top of Florian's patches
---
 .../selftests/net/mptcp/mptcp_connect.c       | 148 +++++++++++++++---
 .../selftests/net/mptcp/mptcp_connect.sh      |  39 ++++-
 2 files changed, 160 insertions(+), 27 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index e3e4338d610f..61abf98e897e 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -16,6 +16,7 @@
 #include <unistd.h>
 #include <time.h>
 
+#include <sys/ioctl.h>
 #include <sys/poll.h>
 #include <sys/sendfile.h>
 #include <sys/stat.h>
@@ -28,6 +29,7 @@
 
 #include <linux/tcp.h>
 #include <linux/time_types.h>
+#include <linux/sockios.h>
 
 extern int optind;
 
@@ -69,6 +71,8 @@ static unsigned int cfg_time;
 static unsigned int cfg_do_w;
 static int cfg_wait;
 static uint32_t cfg_mark;
+static char *cfg_input = NULL;
+static int cfg_repeat = 1;
 
 struct cfg_cmsg_types {
 	unsigned int cmsg_enabled:1;
@@ -92,23 +96,32 @@ static struct cfg_sockopt_types cfg_sockopt_types;
 
 static void die_usage(void)
 {
-	fprintf(stderr, "Usage: mptcp_connect [-6] [-u] [-s MPTCP|TCP] [-p port] [-m mode]"
-		"[-l] [-w sec] [-t num] [-T num] connect_address\n");
+	fprintf(stderr, "Usage: mptcp_connect [-6] [-c cmsg] [-i file] [-I num] [-j] [-l] "
+		"[-m mode] [-M mark] [-o option] [-p port] [-P mode] [-r num] [-R num] "
+		"[-s MPTCP|TCP] [-S num] [-t num] [-T num] [-u] [-w sec] connect_address\n");
 	fprintf(stderr, "\t-6 use ipv6\n");
+	fprintf(stderr, "\t-c cmsg -- test cmsg type <cmsg>\n");
+	fprintf(stderr, "\t-i file -- read the data to send from the given file instead of stdin\n");
+	fprintf(stderr, "\t-I num -- repeat the transfer 'num' times. In listen mode accepts num "
+		"incoming connections, in client mode, disconnect and reconnect to the server\n");
+	fprintf(stderr, "\t-j     -- add additional sleep at connection start and tear down "
+		"-- for MPJ tests\n");
+	fprintf(stderr, "\t-l     -- listens mode, accepts incoming connection\n");
+	fprintf(stderr, "\t-m [poll|mmap|sendfile] -- use poll(default)/mmap+write/sendfile\n");
+	fprintf(stderr, "\t-M mark -- set socket packet mark\n");
+	fprintf(stderr, "\t-o option -- test sockopt <option>\n");
+	fprintf(stderr, "\t-p num -- use port num\n");
+	fprintf(stderr,
+		"\t-P [saveWithPeek|saveAfterPeek] -- save data with/after MSG_PEEK form tcp socket\n");
 	fprintf(stderr, "\t-t num -- set poll timeout to num\n");
 	fprintf(stderr, "\t-T num -- set expected runtime to num ms\n");
-	fprintf(stderr, "\t-S num -- set SO_SNDBUF to num\n");
+	fprintf(stderr, "\t-r num -- enable slow mode, limiting each write to num bytes "
+		"-- for remove addr tests\n");
 	fprintf(stderr, "\t-R num -- set SO_RCVBUF to num\n");
-	fprintf(stderr, "\t-p num -- use port num\n");
 	fprintf(stderr, "\t-s [MPTCP|TCP] -- use mptcp(default) or tcp sockets\n");
-	fprintf(stderr, "\t-m [poll|mmap|sendfile] -- use poll(default)/mmap+write/sendfile\n");
-	fprintf(stderr, "\t-M mark -- set socket packet mark\n");
+	fprintf(stderr, "\t-S num -- set SO_SNDBUF to num\n");
 	fprintf(stderr, "\t-u -- check mptcp ulp\n");
 	fprintf(stderr, "\t-w num -- wait num sec before closing the socket\n");
-	fprintf(stderr, "\t-c cmsg -- test cmsg type <cmsg>\n");
-	fprintf(stderr, "\t-o option -- test sockopt <option>\n");
-	fprintf(stderr,
-		"\t-P [saveWithPeek|saveAfterPeek] -- save data with/after MSG_PEEK form tcp socket\n");
 	exit(1);
 }
 
@@ -307,7 +320,8 @@ static bool sock_test_tcpulp(const char * const remoteaddr,
 }
 
 static int sock_connect_mptcp(const char * const remoteaddr,
-			      const char * const port, int proto)
+			      const char * const port, int proto,
+			      struct addrinfo **peer)
 {
 	struct addrinfo hints = {
 		.ai_protocol = IPPROTO_TCP,
@@ -329,8 +343,10 @@ static int sock_connect_mptcp(const char * const remoteaddr,
 		if (cfg_mark)
 			set_mark(sock, cfg_mark);
 
-		if (connect(sock, a->ai_addr, a->ai_addrlen) == 0)
+		if (connect(sock, a->ai_addr, a->ai_addrlen) == 0) {
+			*peer = a;
 			break; /* success */
+		}
 
 		perror("connect()");
 		close(sock);
@@ -513,14 +529,17 @@ static ssize_t do_rnd_read(const int fd, char *buf, const size_t len)
 	return ret;
 }
 
-static void set_nonblock(int fd)
+static void set_nonblock(int fd, bool nonblock)
 {
 	int flags = fcntl(fd, F_GETFL);
 
 	if (flags == -1)
 		return;
 
-	fcntl(fd, F_SETFL, flags | O_NONBLOCK);
+	if (nonblock)
+		fcntl(fd, F_SETFL, flags | O_NONBLOCK);
+	else
+		fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
 }
 
 static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after_out)
@@ -532,7 +551,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
 	unsigned int woff = 0, wlen = 0;
 	char wbuf[8192];
 
-	set_nonblock(peerfd);
+	set_nonblock(peerfd, true);
 
 	for (;;) {
 		char rbuf[8192];
@@ -627,7 +646,6 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
 	if (cfg_remove)
 		usleep(cfg_wait);
 
-	close(peerfd);
 	return 0;
 }
 
@@ -769,7 +787,7 @@ static int copyfd_io_sendfile(int infd, int peerfd, int outfd,
 	return err;
 }
 
-static int copyfd_io(int infd, int peerfd, int outfd)
+static int copyfd_io(int infd, int peerfd, int outfd, bool close_peerfd)
 {
 	bool in_closed_after_out = false;
 	struct timespec start, end;
@@ -808,6 +826,9 @@ static int copyfd_io(int infd, int peerfd, int outfd)
 	if (ret)
 		return ret;
 
+	if (close_peerfd)
+		close(peerfd);
+
 	if (cfg_time) {
 		unsigned int delta_ms;
 
@@ -919,7 +940,7 @@ static void maybe_close(int fd)
 {
 	unsigned int r = rand();
 
-	if (!(cfg_join || cfg_remove) && (r & 1))
+	if (!(cfg_join || cfg_remove || (cfg_repeat > 1)) && (r & 1))
 		close(fd);
 }
 
@@ -929,7 +950,9 @@ int main_loop_s(int listensock)
 	struct pollfd polls;
 	socklen_t salen;
 	int remotesock;
+	int fd = 0;
 
+again:
 	polls.fd = listensock;
 	polls.events = POLLIN;
 
@@ -950,12 +973,25 @@ int main_loop_s(int listensock)
 		check_sockaddr(pf, &ss, salen);
 		check_getpeername(remotesock, &ss, salen);
 
-		return copyfd_io(0, remotesock, 1);
+		if (cfg_input) {
+			fd = open(cfg_input, O_RDONLY);
+			if (fd < 0)
+				xerror("can't open %s: %d", cfg_input, errno);
+		}
+
+		copyfd_io(fd, remotesock, 1, true);
+	} else {
+		perror("accept");
+		return 1;
 	}
 
-	perror("accept");
+	if (--cfg_repeat > 0) {
+		if (cfg_input)
+			close(fd);
+		goto again;
+	}
 
-	return 1;
+	return 0;
 }
 
 static void init_rng(void)
@@ -1044,15 +1080,47 @@ static void parse_setsock_options(const char *name)
 	exit(1);
 }
 
+void xdisconnect(int fd, int addrlen)
+{
+	struct sockaddr_storage empty;
+	int msec_sleep = 10;
+	int queued = 1;
+	int i;
+
+	shutdown(fd, SHUT_WR);
+
+	/* wait until the pending data is completely flushed, otherwise the
+	 * later disconnect will bypass/ignore/drop any pending data.
+	 */
+	for (i = 0; ; i += msec_sleep) {
+		if (ioctl(fd, SIOCOUTQ, &queued) < 0)
+			xerror("can't query out socket queue: %d", errno);
+
+		if (!queued)
+			break;
+
+		if (i > poll_timeout)
+			xerror("timeout while waiting for spool to complete");
+		usleep(msec_sleep * 1000);
+	}
+
+	memset(&empty, 0, sizeof(empty));
+	empty.ss_family = AF_UNSPEC;
+	if (connect(fd, (struct sockaddr *)&empty, addrlen) < 0)
+		xerror("can't disconnect: %d", errno);
+}
+
 int main_loop(void)
 {
-	int fd;
+	int fd, ret, fd_in = 0;
+	struct addrinfo *peer;
 
 	/* listener is ready. */
-	fd = sock_connect_mptcp(cfg_host, cfg_port, cfg_sock_proto);
+	fd = sock_connect_mptcp(cfg_host, cfg_port, cfg_sock_proto, &peer);
 	if (fd < 0)
 		return 2;
 
+again:
 	check_getpeername_connect(fd);
 
 	if (cfg_rcvbuf)
@@ -1062,7 +1130,31 @@ int main_loop(void)
 	if (cfg_cmsg_types.cmsg_enabled)
 		apply_cmsg_types(fd, &cfg_cmsg_types);
 
-	return copyfd_io(0, fd, 1);
+	if (cfg_input) {
+		fd_in = open(cfg_input, O_RDONLY);
+		if (fd_in < 0)
+			xerror("can't open %s: %d", cfg_input, errno);
+	}
+
+	/* close the client socket only if we are not going to reconnect */
+	ret = copyfd_io(fd_in, fd, 1, cfg_repeat == 1);
+	if (ret)
+		return ret;
+
+	if (--cfg_repeat > 0) {
+		xdisconnect(fd, peer->ai_addrlen);
+
+		/* the socket could be non-blocking at this point, we need the
+		 * connect to be blocking
+		 */
+		set_nonblock(fd, false);
+		if (connect(fd, peer->ai_addr, peer->ai_addrlen))
+			xerror("can't reconnect: %d", errno);
+		if (cfg_input)
+			close(fd_in);
+		goto again;
+	}
+	return 0;
 }
 
 int parse_proto(const char *proto)
@@ -1147,7 +1239,7 @@ static void parse_opts(int argc, char **argv)
 {
 	int c;
 
-	while ((c = getopt(argc, argv, "6jr:lp:s:hut:T:m:S:R:w:M:P:c:o:")) != -1) {
+	while ((c = getopt(argc, argv, "6c:hi:I:jlm:M:o:p:P:r:R:s:S:t:T:uw:")) != -1) {
 		switch (c) {
 		case 'j':
 			cfg_join = true;
@@ -1161,6 +1253,12 @@ static void parse_opts(int argc, char **argv)
 			if (cfg_do_w <= 0)
 				cfg_do_w = 50;
 			break;
+		case 'i':
+			cfg_input = optarg;
+			break;
+		case 'I':
+			cfg_repeat = atoi(optarg);
+			break;
 		case 'l':
 			listen_mode = true;
 			break;
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
index a4226b608c68..de6c630a59da 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
@@ -7,6 +7,7 @@ optstring="S:R:d:e:l:r:h4cm:f:tC"
 ret=0
 sin=""
 sout=""
+cin_disconnect=""
 cin=""
 cout=""
 ksft_skip=4
@@ -24,6 +25,7 @@ options_log=true
 do_tcp=0
 checksum=false
 filesize=0
+connect_per_transfer=1
 
 if [ $tc_loss -eq 100 ];then
 	tc_loss=1%
@@ -127,6 +129,7 @@ TEST_COUNT=0
 
 cleanup()
 {
+	rm -f "$cin_disconnect" "$cout_disconnect"
 	rm -f "$cin" "$cout"
 	rm -f "$sin" "$sout"
 	rm -f "$capout"
@@ -149,6 +152,8 @@ sout=$(mktemp)
 cin=$(mktemp)
 cout=$(mktemp)
 capout=$(mktemp)
+cin_disconnect="$cin".disconnect
+cout_disconnect="$cout".disconnect
 trap cleanup EXIT
 
 for i in "$ns1" "$ns2" "$ns3" "$ns4";do
@@ -518,8 +523,8 @@ do_transfer()
 	cookies=${cookies##*=}
 
 	if [ ${cl_proto} = "MPTCP" ] && [ ${srv_proto} = "MPTCP" ]; then
-		expect_synrx=$((stat_synrx_last_l+1))
-		expect_ackrx=$((stat_ackrx_last_l+1))
+		expect_synrx=$((stat_synrx_last_l+$connect_per_transfer))
+		expect_ackrx=$((stat_ackrx_last_l+$connect_per_transfer))
 	fi
 
 	if [ ${stat_synrx_now_l} -lt ${expect_synrx} ]; then
@@ -756,6 +761,33 @@ run_tests_peekmode()
 	run_tests_lo "$ns1" "$ns1" dead:beef:1::1 1 "-P ${peekmode}"
 }
 
+run_tests_disconnect()
+{
+	local peekmode="$1"
+	local old_cin=$cin
+	local old_sin=$sin
+
+	cat $cin $cin $cin > "$cin".disconnect
+
+	# force do_transfer to cope with the multiple transmissions
+	sin="$cin.disconnect"
+	sin_disconnect=$old_sin
+	cin="$cin.disconnect"
+	cin_disconnect="$old_cin"
+	connect_per_transfer=3
+
+	echo "INFO: disconnect"
+	run_tests_lo "$ns1" "$ns1" 10.0.1.1 1 "-I 3 -i $old_cin"
+	run_tests_lo "$ns1" "$ns1" dead:beef:1::1 1 "-I 3 -i $old_cin"
+
+	# restore previous status
+	sin=$old_sin
+	cout_disconnect="$cout".disconnect
+	cin=$old_cin
+	cin_disconnect="$cin".disconnect
+	connect_per_transfer=1
+}
+
 display_time()
 {
 	time_end=$(date +%s)
@@ -873,6 +905,9 @@ stop_if_error "Tests with peek mode have failed"
 # connect to ns4 ip address, ns2 should intercept/proxy
 run_test_transparent 10.0.3.1 "tproxy ipv4"
 run_test_transparent dead:beef:3::1 "tproxy ipv6"
+stop_if_error "Tests with tproxy have failed"
+
+run_tests_disconnect
 
 display_time
 exit $ret
-- 
2.33.1

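The drain-then-disconnect step that xdisconnect() performs in patch 7/7 can be sketched in isolation as below. This is only a minimal sketch, not the selftest code: `drain_and_disconnect`, `timeout_ms` and `step_ms` are hypothetical names, the code is Linux-specific (SIOCOUTQ), and it returns an error where the selftest calls xerror().

```c
#include <linux/sockios.h>	/* SIOCOUTQ */
#include <poll.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Hypothetical helper, loosely following xdisconnect() from the patch.
 * Returns 0 on success, -1 on error or if the send queue is still not
 * empty after roughly timeout_ms milliseconds.
 */
int drain_and_disconnect(int fd, int timeout_ms)
{
	struct sockaddr_storage unspec;
	const int step_ms = 10;
	int queued = 1;
	int waited;

	/* nothing more will be written on this connection */
	if (shutdown(fd, SHUT_WR) < 0)
		return -1;

	/* disconnecting drops pending data, so wait for the queue to drain */
	for (waited = 0; ; waited += step_ms) {
		if (ioctl(fd, SIOCOUTQ, &queued) < 0)
			return -1;
		if (!queued)
			break;
		if (waited > timeout_ms)
			return -1;
		poll(NULL, 0, step_ms);	/* sleep for step_ms milliseconds */
	}

	/* connect() to an AF_UNSPEC address dissolves the association */
	memset(&unspec, 0, sizeof(unspec));
	unspec.ss_family = AF_UNSPEC;
	return connect(fd, (struct sockaddr *)&unspec, sizeof(unspec));
}
```

After a successful return the same fd can be passed to connect() again with the saved peer address, which is what main_loop() does in the patch when cfg_repeat is greater than 1.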

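The reworked set_nonblock(fd, nonblock) from the same patch toggles O_NONBLOCK in both directions but ignores fcntl() failures. A sketch of a checked variant follows. `set_nonblock_checked` and `is_nonblock` are hypothetical helpers introduced here for illustration and are not part of the series.

```c
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical variant of set_nonblock(): sets or clears O_NONBLOCK and
 * returns 0 on success, -1 if either fcntl() call fails.
 */
int set_nonblock_checked(int fd, bool nonblock)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return -1;

	if (nonblock)
		flags |= O_NONBLOCK;
	else
		flags &= ~O_NONBLOCK;

	return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}

/* Hypothetical helper: 1 if O_NONBLOCK is set, 0 if clear, -1 on error */
int is_nonblock(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return -1;

	return (flags & O_NONBLOCK) ? 1 : 0;
}
```

In the reconnect path this would correspond to calling set_nonblock_checked(fd, false) before the blocking connect(), since copyfd_io_poll() may have left the socket in non-blocking mode.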

* Re: [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect()
  2021-11-11 16:21 [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect() Paolo Abeni
                   ` (6 preceding siblings ...)
  2021-11-11 16:21 ` [PATCH v2 mptcp-next 7/7] mptcp: add disconnect selftests Paolo Abeni
@ 2021-11-12  4:24 ` Mat Martineau
  2021-11-19 16:47 ` Matthieu Baerts
  8 siblings, 0 replies; 10+ messages in thread
From: Mat Martineau @ 2021-11-12  4:24 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp

On Thu, 11 Nov 2021, Paolo Abeni wrote:

> As outlined in the public mtg, mptcp_accept() is currently quite
> suboptimal, both from performance and code complexity
>
> This series tries to clean it up, enforcing a wider lifetime for
> the initial subflow, so that we don't need to acquire additional
> references there.
>
> To reach such goal we need to properly define the disconnect()
> behavior, which is currently quite incomplete. Additionally allow
> user-space to really disconnect established connections.
>
> Disconnect() needs in turn an egress FASTCLOSE implementation,
> added here according to option R (reset - the simpler form).
>
> Finally, the self-tests need as pre-req Florian's patches implementing
> SIOCOUTQ
>
> v1 -> v2:
> - update mptcp_connect argument lists and usage() in patch 7/7
>

Thanks for the v2 Paolo. Looks good to me.

(Matthieu, note the dependency on Florian's TCP_INQ series)


Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>

> RFC -> v1:
> - added patches 1/7, 3/7, 6/7, 7/7
> - added a few missing bits in patch 4/7
>
> Paolo Abeni (7):
>  mptcp: keep snd_una updated for fallback socket
>  mptcp: never allow the PM to close a listener subflow
>  mptcp: implement fastclose xmit path
>  mptcp: full disconnect implementation
>  mptcp: cleanup accept and poll
>  mptcp: implement support for user-space disconnect
>  mptcp: add disconnect selftests
>
> net/mptcp/options.c                           |  57 +++++--
> net/mptcp/pm.c                                |  10 +-
> net/mptcp/pm_netlink.c                        |   3 +
> net/mptcp/protocol.c                          | 144 +++++++++++------
> net/mptcp/protocol.h                          |  16 +-
> net/mptcp/subflow.c                           |   1 -
> net/mptcp/token.c                             |   1 +
> .../selftests/net/mptcp/mptcp_connect.c       | 148 +++++++++++++++---
> .../selftests/net/mptcp/mptcp_connect.sh      |  39 ++++-
> 9 files changed, 324 insertions(+), 95 deletions(-)
>
> -- 
> 2.33.1
>
>
>

--
Mat Martineau
Intel


* Re: [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect()
  2021-11-11 16:21 [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect() Paolo Abeni
                   ` (7 preceding siblings ...)
  2021-11-12  4:24 ` [PATCH v2 mptcp-next 0/7] mptcp: improve accept() and disconnect() Mat Martineau
@ 2021-11-19 16:47 ` Matthieu Baerts
  8 siblings, 0 replies; 10+ messages in thread
From: Matthieu Baerts @ 2021-11-19 16:47 UTC (permalink / raw)
  To: Paolo Abeni, Mat Martineau; +Cc: mptcp

Hi Paolo, Mat,

On 11/11/2021 17:21, Paolo Abeni wrote:
> As outlined in the public mtg, mptcp_accept() is currently quite
> suboptimal, both from performance and code complexity
> 
> This series tries to clean it up, enforcing a wider lifetime for
> the initial subflow, so that we don't need to acquire additional
> references there.
> 
> To reach such goal we need to properly define the disconnect()
> behavior, which is currently quite incomplete. Additionally allow
> user-space to really disconnect established connections.
> 
> Disconnect() needs in turn an egress FASTCLOSE implementation,
> added here according to option R (reset - the simpler form).
> 
> Finally, the self-tests need as pre-req Florian's patches implementing
> SIOCOUTQ

Thank you for the patches and reviews!

Now that Florian's patches are in our tree, we can apply this series
too. The patches are now in our tree with Mat's RvB tags and with some
checkpatch warnings in the selftests fixed (only small things; I left
the ones that would require splitting strings across lines):

- be5470135e9c: mptcp: keep snd_una updated for fallback socket
- 3dd1385ee6a2: mptcp: never allow the PM to close a listener subflow
- b382fb5fa27c: mptcp: implement fastclose xmit path
- 445bf9531ef7: mptcp: full disconnect implementation
- a2fcf84f7b73: mptcp: cleanup accept and poll
- c999f3f089c7: mptcp: implement support for user-space disconnect
- 707b9bbf2820: mptcp: add disconnect selftests
- Results: 3ebf8b8d1426..736c6532dd92

Builds and tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20211119T164652
https://github.com/multipath-tcp/mptcp_net-next/actions/workflows/build-validation.yml?query=branch:export

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

