mptcp.lists.linux.dev archive mirror
* [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error.
@ 2022-09-01 17:31 Paolo Abeni
  2022-09-01 17:31 ` [PATCH v3 mptcp-next 2/3] mptcp: use fastclose on more edge scenarios Paolo Abeni
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Paolo Abeni @ 2022-09-01 17:31 UTC (permalink / raw)
  To: mptcp

When an mptcp socket is closed due to an incoming FASTCLOSE
option, no specific sk_err is set and later syscalls will
usually fail with EPIPE.

Align the current fastclose error handling with TCP reset,
properly setting the socket error according to the current
msk state and propagating such error.

Additionally, sendmsg() currently does not handle sk_err
properly and always returns EPIPE.
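
For illustration, the intended state-to-error mapping (mirroring
tcp_reset()) can be sketched in plain C. This is not the kernel code
itself; the enum and function names here are made up for the example,
and only the errno choices come from the patch:

```c
#include <errno.h>
#include <stdio.h>

/* Sketch of the state -> error mapping applied to the msk on an
 * incoming FASTCLOSE: which errno userspace sees depends on the
 * socket state when the reset is received. */
enum mstate { M_SYN_SENT, M_CLOSE_WAIT, M_CLOSE, M_OTHER };

static int fastclose_err(enum mstate s)
{
	switch (s) {
	case M_SYN_SENT:
		return ECONNREFUSED;	/* connect() never completed */
	case M_CLOSE_WAIT:
		return EPIPE;		/* write side already shut down */
	case M_CLOSE:
		return 0;		/* already closed: nothing to report */
	default:
		return ECONNRESET;	/* e.g. ESTABLISHED */
	}
}

int main(void)
{
	printf("SYN_SENT -> ECONNREFUSED: %d\n",
	       fastclose_err(M_SYN_SENT) == ECONNREFUSED);
	printf("ESTABLISHED -> ECONNRESET: %d\n",
	       fastclose_err(M_OTHER) == ECONNRESET);
	printf("CLOSE_WAIT -> EPIPE: %d\n",
	       fastclose_err(M_CLOSE_WAIT) == EPIPE);
	return 0;
}
```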

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/protocol.c | 45 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 35 insertions(+), 10 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index f4891db86217..b04f184695e4 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1685,9 +1685,13 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) {
 		ret = sk_stream_wait_connect(sk, &timeo);
 		if (ret)
-			goto out;
+			goto do_error;
 	}
 
+	ret = -EPIPE;
+	if (unlikely(sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)))
+		goto do_error;
+
 	pfrag = sk_page_frag(sk);
 
 	while (msg_data_left(msg)) {
@@ -1696,11 +1700,6 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		bool dfrag_collapsed;
 		size_t psize, offset;
 
-		if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) {
-			ret = -EPIPE;
-			goto out;
-		}
-
 		/* reuse tail pfrag, if possible, or carve a new one from the
 		 * page allocator
 		 */
@@ -1732,7 +1731,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		if (copy_page_from_iter(dfrag->page, offset, psize,
 					&msg->msg_iter) != psize) {
 			ret = -EFAULT;
-			goto out;
+			goto do_error;
 		}
 
 		/* data successfully copied into the write queue */
@@ -1764,7 +1763,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		__mptcp_push_pending(sk, msg->msg_flags);
 		ret = sk_stream_wait_memory(sk, &timeo);
 		if (ret)
-			goto out;
+			goto do_error;
 	}
 
 	if (copied)
@@ -1772,7 +1771,14 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 out:
 	release_sock(sk);
-	return copied ? : ret;
+	return copied;
+
+do_error:
+	if (copied)
+		goto out;
+
+	copied = sk_stream_error(sk, msg->msg_flags, ret);
+	goto out;
 }
 
 static int __mptcp_recvmsg_mskq(struct mptcp_sock *msk,
@@ -2404,12 +2410,31 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
 		unlock_sock_fast(tcp_sk, slow);
 	}
 
+	/* Mirror the tcp_reset() error propagation */
+	switch (sk->sk_state) {
+	case TCP_SYN_SENT:
+		sk->sk_err = ECONNREFUSED;
+		break;
+	case TCP_CLOSE_WAIT:
+		sk->sk_err = EPIPE;
+		break;
+	case TCP_CLOSE:
+		return;
+	default:
+		sk->sk_err = ECONNRESET;
+	}
+
 	inet_sk_state_store(sk, TCP_CLOSE);
 	sk->sk_shutdown = SHUTDOWN_MASK;
 	smp_mb__before_atomic(); /* SHUTDOWN must be visible first */
 	set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags);
 
-	mptcp_close_wake_up(sk);
+	/* the calling mptcp_worker will properly destroy the socket */
+	if (sock_flag(sk, SOCK_DEAD))
+		return;
+
+	sk->sk_state_change(sk);
+	sk_error_report(sk);
 }
 
 static void __mptcp_retrans(struct sock *sk)
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 mptcp-next 2/3] mptcp: use fastclose on more edge scenarios.
  2022-09-01 17:31 [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error Paolo Abeni
@ 2022-09-01 17:31 ` Paolo Abeni
  2022-09-01 17:31 ` [PATCH v3 mptcp-next 3/3] selftests: mptcp: update and extend fastclose test-cases Paolo Abeni
  2022-09-07 17:22 ` [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error Matthieu Baerts
  2 siblings, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2022-09-01 17:31 UTC (permalink / raw)
  To: mptcp

Daire reported a user-space application hang-up when the
peer is forcibly closed before the data transfer completion.

The relevant application expects the peer to either
do an application-level clean shutdown or a transport-level
connection reset.

We can accommodate such a user by extending the fastclose
usage: at fd close time, if the msk socket has some unread
data, and at FIN_WAIT timeout.

Note that at MPTCP close time we must ensure that the TCP
subflows will issue a reset: set the linger socket option to a
suitable value.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
--
v2 -> v3:
 - move self-tests update to a different patch

v1 -> v2:
 - ensure mptcp_do_fastclose will reset all the subflows,
 - update fastclose self-tests
---
 net/mptcp/protocol.c | 50 +++++++++++++++++++++++++++++++-------------
 1 file changed, 36 insertions(+), 14 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index b04f184695e4..fec542fea02a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2276,8 +2276,14 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 
 	lock_sock_nested(ssk, SINGLE_DEPTH_NESTING);
 
-	if (flags & MPTCP_CF_FASTCLOSE)
+	if (flags & MPTCP_CF_FASTCLOSE) {
+		/* be sure to force the tcp_disconnect() path,
+		 * to generate the egress reset
+		 */
+		ssk->sk_lingertime = 0;
+		sock_set_flag(ssk, SOCK_LINGER);
 		subflow->send_fastclose = 1;
+	}
 
 	need_push = (flags & MPTCP_CF_PUSH) && __mptcp_retransmit_pending_data(sk);
 	if (!dispose_it) {
@@ -2540,6 +2546,16 @@ static void mptcp_mp_fail_no_response(struct mptcp_sock *msk)
 	mptcp_reset_timeout(msk, 0);
 }
 
+static void mptcp_do_fastclose(struct sock *sk)
+{
+	struct mptcp_subflow_context *subflow, *tmp;
+	struct mptcp_sock *msk = mptcp_sk(sk);
+
+	mptcp_for_each_subflow_safe(msk, subflow, tmp)
+		__mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow),
+				  subflow, MPTCP_CF_FASTCLOSE);
+}
+
 static void mptcp_worker(struct work_struct *work)
 {
 	struct mptcp_sock *msk = container_of(work, struct mptcp_sock, work);
@@ -2571,6 +2587,7 @@ static void mptcp_worker(struct work_struct *work)
 	if (sock_flag(sk, SOCK_DEAD) &&
 	    (mptcp_check_close_timeout(sk) || sk->sk_state == TCP_CLOSE)) {
 		inet_sk_state_store(sk, TCP_CLOSE);
+		mptcp_do_fastclose(sk);
 		__mptcp_destroy_sock(sk);
 		goto unlock;
 	}
@@ -2823,6 +2840,18 @@ static void __mptcp_destroy_sock(struct sock *sk)
 	sock_put(sk);
 }
 
+static __poll_t mptcp_check_readable(struct mptcp_sock *msk)
+{
+	/* Concurrent splices from sk_receive_queue into receive_queue will
+	 * always show at least one non-empty queue when checked in this order.
+	 */
+	if (skb_queue_empty_lockless(&((struct sock *)msk)->sk_receive_queue) &&
+	    skb_queue_empty_lockless(&msk->receive_queue))
+		return 0;
+
+	return EPOLLIN | EPOLLRDNORM;
+}
+
 static void mptcp_close(struct sock *sk, long timeout)
 {
 	struct mptcp_subflow_context *subflow;
@@ -2837,8 +2866,13 @@ static void mptcp_close(struct sock *sk, long timeout)
 		goto cleanup;
 	}
 
-	if (mptcp_close_state(sk))
+	if (mptcp_check_readable(msk)) {
+		/* the msk has read data, do the MPTCP equivalent of TCP reset */
+		inet_sk_state_store(sk, TCP_CLOSE);
+		mptcp_do_fastclose(sk);
+	} else if (mptcp_close_state(sk)) {
 		__mptcp_wr_shutdown(sk);
+	}
 
 	sk_stream_wait_close(sk, timeout);
 
@@ -3646,18 +3680,6 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
 	return err;
 }
 
-static __poll_t mptcp_check_readable(struct mptcp_sock *msk)
-{
-	/* Concurrent splices from sk_receive_queue into receive_queue will
-	 * always show at least one non-empty queue when checked in this order.
-	 */
-	if (skb_queue_empty_lockless(&((struct sock *)msk)->sk_receive_queue) &&
-	    skb_queue_empty_lockless(&msk->receive_queue))
-		return 0;
-
-	return EPOLLIN | EPOLLRDNORM;
-}
-
 static __poll_t mptcp_check_writeable(struct mptcp_sock *msk)
 {
 	struct sock *sk = (struct sock *)msk;
-- 
2.37.2



* [PATCH v3 mptcp-next 3/3] selftests: mptcp: update and extend fastclose test-cases
  2022-09-01 17:31 [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error Paolo Abeni
  2022-09-01 17:31 ` [PATCH v3 mptcp-next 2/3] mptcp: use fastclose on more edge scenarios Paolo Abeni
@ 2022-09-01 17:31 ` Paolo Abeni
  2022-09-07 17:22   ` Matthieu Baerts
  2022-09-07 17:22 ` [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error Matthieu Baerts
  2 siblings, 1 reply; 10+ messages in thread
From: Paolo Abeni @ 2022-09-01 17:31 UTC (permalink / raw)
  To: mptcp

After the previous patches, the MPTCP protocol can generate
fast-closes on both ends of the connection. Rework the relevant
test-case to carefully trigger the fast-close code-path on a
single end at a time, while ensuring that a predictable amount
of data is spooled on both ends.

Additionally, add another test-case for the passive socket
fast-close.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v2 -> v3:
 - split out of 2/2
 - fixed a bunch of checkpatch issues
---
 .../selftests/net/mptcp/mptcp_connect.c       | 58 ++++++++++--
 .../testing/selftests/net/mptcp/mptcp_join.sh | 88 +++++++++++++++----
 2 files changed, 120 insertions(+), 26 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index 24d4e9cb617e..ee6df83c55f3 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -72,6 +72,7 @@ static int cfg_wait;
 static uint32_t cfg_mark;
 static char *cfg_input;
 static int cfg_repeat = 1;
+static int cfg_truncate;
 
 struct cfg_cmsg_types {
 	unsigned int cmsg_enabled:1;
@@ -381,8 +382,6 @@ static size_t do_rnd_write(const int fd, char *buf, const size_t len)
 		do_w = cfg_do_w;
 
 	bw = write(fd, buf, do_w);
-	if (bw < 0)
-		perror("write");
 
 	/* let the join handshake complete, before going on */
 	if (cfg_join && first) {
@@ -571,7 +570,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
 		.fd = peerfd,
 		.events = POLLIN | POLLOUT,
 	};
-	unsigned int woff = 0, wlen = 0;
+	unsigned int woff = 0, wlen = 0, total_wlen = 0, total_rlen = 0;
 	char wbuf[8192];
 
 	set_nonblock(peerfd, true);
@@ -597,7 +596,16 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
 		}
 
 		if (fds.revents & POLLIN) {
-			len = do_rnd_read(peerfd, rbuf, sizeof(rbuf));
+			ssize_t rb = sizeof(rbuf);
+
+			/* limit the total amount of read data to the trunc value*/
+			if (cfg_truncate > 0) {
+				if (rb + total_rlen > cfg_truncate)
+					rb = cfg_truncate - total_rlen;
+				len = read(peerfd, rbuf, rb);
+			} else {
+				len = do_rnd_read(peerfd, rbuf, sizeof(rbuf));
+			}
 			if (len == 0) {
 				/* no more data to receive:
 				 * peer has closed its write side
@@ -612,10 +620,14 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
 
 			/* Else, still have data to transmit */
 			} else if (len < 0) {
+				/* ignore errors on I/O operation when the peer is fastclosing */
+				if (cfg_truncate < 0)
+					return 0;
 				perror("read");
 				return 3;
 			}
 
+			total_rlen += len;
 			do_write(outfd, rbuf, len);
 		}
 
@@ -628,12 +640,21 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
 			if (wlen > 0) {
 				ssize_t bw;
 
+				/* limit the total amount of written data to the trunc value */
+				if (cfg_truncate > 0 && wlen + total_wlen > cfg_truncate)
+					wlen = cfg_truncate - total_wlen;
+
 				bw = do_rnd_write(peerfd, wbuf + woff, wlen);
-				if (bw < 0)
+				if (bw < 0) {
+					if (cfg_truncate < 0)
+						return 0;
+					perror("write");
 					return 111;
+				}
 
 				woff += bw;
 				wlen -= bw;
+				total_wlen += bw;
 			} else if (wlen == 0) {
 				/* We have no more data to send. */
 				fds.events &= ~POLLOUT;
@@ -652,10 +673,19 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
 		}
 
 		if (fds.revents & (POLLERR | POLLNVAL)) {
+			/* when the peer is truncating the file and fastclosing, ignore
+			 * HUP/connection reset errors
+			 */
+			if (cfg_truncate < 0)
+				return 0;
 			fprintf(stderr, "Unexpected revents: "
 				"POLLERR/POLLNVAL(%x)\n", fds.revents);
 			return 5;
 		}
+
+		if (cfg_truncate > 0 && total_wlen >= cfg_truncate &&
+		    total_rlen >= cfg_truncate)
+			break;
 	}
 
 	/* leave some time for late join/announce */
@@ -1160,11 +1190,13 @@ int main_loop(void)
 	}
 
 	/* close the client socket open only if we are not going to reconnect */
-	ret = copyfd_io(fd_in, fd, 1, cfg_repeat == 1);
+	ret = copyfd_io(fd_in, fd, 1, 0);
 	if (ret)
 		return ret;
 
-	if (--cfg_repeat > 0) {
+	if (cfg_truncate > 0) {
+		xdisconnect(fd, peer->ai_addrlen);
+	} else if (--cfg_repeat > 0) {
 		xdisconnect(fd, peer->ai_addrlen);
 
 		/* the socket could be unblocking at this point, we need the
@@ -1176,7 +1208,8 @@ int main_loop(void)
 		if (cfg_input)
 			close(fd_in);
 		goto again;
-	}
+	} else
+		close(fd);
 	return 0;
 }
 
@@ -1262,8 +1295,15 @@ static void parse_opts(int argc, char **argv)
 {
 	int c;
 
-	while ((c = getopt(argc, argv, "6c:hi:I:jlm:M:o:p:P:r:R:s:S:t:T:w:")) != -1) {
+	while ((c = getopt(argc, argv, "6c:f:hi:I:jlm:M:o:p:P:r:R:s:S:t:T:w:")) != -1) {
 		switch (c) {
+		case 'f':
+			cfg_truncate = atoi(optarg);
+
+			/* when receiving a fastclose, ignore PIPE signals */
+			if (cfg_truncate < 0)
+				signal(SIGPIPE, handle_signal);
+			break;
 		case 'j':
 			cfg_join = true;
 			cfg_mode = CFG_MODE_POLL;
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 2957fe414639..0f4c9c4bcb5b 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -346,10 +346,21 @@ check_transfer()
 	local in=$1
 	local out=$2
 	local what=$3
+	local bytes=$4
 	local i a b
 
 	local line
-	cmp -l "$in" "$out" | while read -r i a b; do
+	if [ -n "$bytes" ]; then
+		# when truncating we must check the size explicitly
+		local out_size=$(wc -c $out | awk '{print $1}')
+		if [ $out_size -ne $bytes ]; then
+			echo "[ FAIL ] $what output file has wrong size ($out_size, $bytes)"
+			fail_test
+			return 1
+		fi
+		bytes="--bytes=${bytes}"
+	fi
+	cmp -l "$in" "$out" ${bytes} | while read -r i a b; do
 		local sum=$((0${a} + 0${b}))
 		if [ $check_invert -eq 0 ] || [ $sum -ne $((0xff)) ]; then
 			echo "[ FAIL ] $what does not match (in, out):"
@@ -707,9 +718,29 @@ do_transfer()
 	fi
 
 	local flags="subflow"
+	local extra_cl_args=""
+	local extra_srv_args=""
+	local trunc_size=""
 	if [[ "${addr_nr_ns2}" = "fastclose_"* ]]; then
+		if [ ${test_link_fail} -le 1 ]; then
+			echo "fastclose tests need test_link_fail argument"
+			return 0
+		fi
+
 		# disconnect
-		extra_args="$extra_args -I ${addr_nr_ns2:10}"
+		trunc_size=${test_link_fail}
+		local side=${addr_nr_ns2:10}
+
+		if [ ${side} = "client" ]; then
+			extra_cl_args="-f ${test_link_fail}"
+			extra_srv_args="-f -1"
+		elif [ ${side} = "server" ]; then
+			extra_srv_args="-f ${test_link_fail}"
+			extra_cl_args="-f -1"
+		else
+			echo "wrong/unknown fastclose spec ${side}"
+			return 0
+		fi
 		addr_nr_ns2=0
 	elif [[ "${addr_nr_ns2}" = "userspace_"* ]]; then
 		userspace_pm=1
@@ -737,39 +768,41 @@ do_transfer()
 		local_addr="0.0.0.0"
 	fi
 
+	extra_srv_args="$extra_args $extra_srv_args"
 	if [ "$test_link_fail" -gt 1 ];then
 		timeout ${timeout_test} \
 			ip netns exec ${listener_ns} \
 				./mptcp_connect -t ${timeout_poll} -l -p $port -s ${srv_proto} \
-					$extra_args ${local_addr} < "$sinfail" > "$sout" &
+					$extra_srv_args ${local_addr} < "$sinfail" > "$sout" &
 	else
 		timeout ${timeout_test} \
 			ip netns exec ${listener_ns} \
 				./mptcp_connect -t ${timeout_poll} -l -p $port -s ${srv_proto} \
-					$extra_args ${local_addr} < "$sin" > "$sout" &
+					$extra_srv_args ${local_addr} < "$sin" > "$sout" &
 	fi
 	local spid=$!
 
 	wait_local_port_listen "${listener_ns}" "${port}"
 
+	extra_cl_args="$extra_args $extra_cl_args"
 	if [ "$test_link_fail" -eq 0 ];then
 		timeout ${timeout_test} \
 			ip netns exec ${connector_ns} \
 				./mptcp_connect -t ${timeout_poll} -p $port -s ${cl_proto} \
-					$extra_args $connect_addr < "$cin" > "$cout" &
+					$extra_cl_args $connect_addr < "$cin" > "$cout" &
 	elif [ "$test_link_fail" -eq 1 ] || [ "$test_link_fail" -eq 2 ];then
 		( cat "$cinfail" ; sleep 2; link_failure $listener_ns ; cat "$cinfail" ) | \
 			tee "$cinsent" | \
 			timeout ${timeout_test} \
 				ip netns exec ${connector_ns} \
 					./mptcp_connect -t ${timeout_poll} -p $port -s ${cl_proto} \
-						$extra_args $connect_addr > "$cout" &
+						$extra_cl_args $connect_addr > "$cout" &
 	else
 		tee "$cinsent" < "$cinfail" | \
 			timeout ${timeout_test} \
 				ip netns exec ${connector_ns} \
 					./mptcp_connect -t ${timeout_poll} -p $port -s ${cl_proto} \
-						$extra_args $connect_addr > "$cout" &
+						$extra_cl_args $connect_addr > "$cout" &
 	fi
 	local cpid=$!
 
@@ -971,15 +1004,15 @@ do_transfer()
 	fi
 
 	if [ "$test_link_fail" -gt 1 ];then
-		check_transfer $sinfail $cout "file received by client"
+		check_transfer $sinfail $cout "file received by client" $trunc_size
 	else
-		check_transfer $sin $cout "file received by client"
+		check_transfer $sin $cout "file received by client" $trunc_size
 	fi
 	retc=$?
 	if [ "$test_link_fail" -eq 0 ];then
-		check_transfer $cin $sout "file received by server"
+		check_transfer $cin $sout "file received by server" $trunc_size
 	else
-		check_transfer $cinsent $sout "file received by server"
+		check_transfer $cinsent $sout "file received by server" $trunc_size
 	fi
 	rets=$?
 
@@ -1188,12 +1221,23 @@ chk_fclose_nr()
 {
 	local fclose_tx=$1
 	local fclose_rx=$2
+	local ns_invert=${3:-""}
 	local count
 	local dump_stats
+	local ns_tx=$ns2
+	local ns_rx=$ns1
+	local extra_msg=""
+
+	if [[ $ns_invert = "invert" ]]; then
+		ns_tx=$ns1
+		ns_rx=$ns2
+		extra_msg="   invert"
+	fi
 
 	printf "%-${nr_blank}s %s" " " "ctx"
-	count=$(ip netns exec $ns2 nstat -as | grep MPTcpExtMPFastcloseTx | awk '{print $2}')
+	count=$(ip netns exec $ns_tx nstat -as | grep MPTcpExtMPFastcloseTx | awk '{print $2}')
 	[ -z "$count" ] && count=0
+	[ "$count" != "$fclose_tx" ] && extra_msg="$extra_msg,tx=$count"
 	if [ "$count" != "$fclose_tx" ]; then
 		echo "[fail] got $count MP_FASTCLOSE[s] TX expected $fclose_tx"
 		fail_test
@@ -1203,17 +1247,20 @@ chk_fclose_nr()
 	fi
 
 	echo -n " - fclzrx"
-	count=$(ip netns exec $ns1 nstat -as | grep MPTcpExtMPFastcloseRx | awk '{print $2}')
+	count=$(ip netns exec $ns_rx nstat -as | grep MPTcpExtMPFastcloseRx | awk '{print $2}')
 	[ -z "$count" ] && count=0
+	[ "$count" != "$fclose_rx" ] && extra_msg="$extra_msg,rx=$count"
 	if [ "$count" != "$fclose_rx" ]; then
 		echo "[fail] got $count MP_FASTCLOSE[s] RX expected $fclose_rx"
 		fail_test
 		dump_stats=1
 	else
-		echo "[ ok ]"
+		echo -n "[ ok ]"
 	fi
 
 	[ "${dump_stats}" = 1 ] && dump_stats
+
+	echo "$extra_msg"
 }
 
 chk_rst_nr()
@@ -1236,7 +1283,7 @@ chk_rst_nr()
 	printf "%-${nr_blank}s %s" " " "rtx"
 	count=$(ip netns exec $ns_tx nstat -as | grep MPTcpExtMPRstTx | awk '{print $2}')
 	[ -z "$count" ] && count=0
-	if [ "$count" != "$rst_tx" ]; then
+	if [ $count -lt $rst_tx ]; then
 		echo "[fail] got $count MP_RST[s] TX expected $rst_tx"
 		fail_test
 		dump_stats=1
@@ -1247,7 +1294,7 @@ chk_rst_nr()
 	echo -n " - rstrx "
 	count=$(ip netns exec $ns_rx nstat -as | grep MPTcpExtMPRstRx | awk '{print $2}')
 	[ -z "$count" ] && count=0
-	if [ "$count" != "$rst_rx" ]; then
+	if [ "$count" -lt "$rst_rx" ]; then
 		echo "[fail] got $count MP_RST[s] RX expected $rst_rx"
 		fail_test
 		dump_stats=1
@@ -2801,11 +2848,18 @@ fullmesh_tests()
 fastclose_tests()
 {
 	if reset "fastclose test"; then
-		run_tests $ns1 $ns2 10.0.1.1 1024 0 fastclose_2
+		run_tests $ns1 $ns2 10.0.1.1 1024 0 fastclose_client
 		chk_join_nr 0 0 0
 		chk_fclose_nr 1 1
 		chk_rst_nr 1 1 invert
 	fi
+
+	if reset "fastclose server test"; then
+		run_tests $ns1 $ns2 10.0.1.1 1024 0 fastclose_server
+		chk_join_nr 0 0 0
+		chk_fclose_nr 1 1 invert
+		chk_rst_nr 1 1
+	fi
 }
 
 pedit_action_pkts()
-- 
2.37.2



* Re: [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error.
  2022-09-01 17:31 [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error Paolo Abeni
  2022-09-01 17:31 ` [PATCH v3 mptcp-next 2/3] mptcp: use fastclose on more edge scenarios Paolo Abeni
  2022-09-01 17:31 ` [PATCH v3 mptcp-next 3/3] selftests: mptcp: update and extend fastclose test-cases Paolo Abeni
@ 2022-09-07 17:22 ` Matthieu Baerts
  2022-09-08 11:48   ` Paolo Abeni
  2 siblings, 1 reply; 10+ messages in thread
From: Matthieu Baerts @ 2022-09-07 17:22 UTC (permalink / raw)
  To: Paolo Abeni, mptcp

Hi Paolo,

On 01/09/2022 19:31, Paolo Abeni wrote:
> When an mptcp socket is closed due to an incoming FASTCLOSE
> option, no specific sk_err is set and later syscalls will
> usually fail with EPIPE.
> 
> Align the current fastclose error handling with TCP reset,
> properly setting the socket error according to the current
> msk state and propagating such error.

Thank you for the patch and sorry for the delay!

I have some questions here below:

> Additionally, sendmsg() currently does not handle sk_err
> properly and always returns EPIPE.
Could it be seen as a bug-fix deserving a dedicated commit?

> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  net/mptcp/protocol.c | 45 ++++++++++++++++++++++++++++++++++----------
>  1 file changed, 35 insertions(+), 10 deletions(-)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index f4891db86217..b04f184695e4 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1685,9 +1685,13 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
>  	if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) {
>  		ret = sk_stream_wait_connect(sk, &timeo);
>  		if (ret)
> -			goto out;
> +			goto do_error;
>  	}
>  
> +	ret = -EPIPE;

Regarding this line above... (see below)

> +	if (unlikely(sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)))
> +		goto do_error;
> +
>  	pfrag = sk_page_frag(sk);
>  
>  	while (msg_data_left(msg)) {

...Could we eventually have no data left and never go through the while
loop?

In this case "copied" will be 0 and we will return the new -EPIPE, maybe
not what we want, no?
Or should we change 'ret' value just before calling 'goto do_error' to
avoid that?

(...)

> @@ -2404,12 +2410,31 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
>  		unlock_sock_fast(tcp_sk, slow);
>  	}
>  
> +	/* Mirror the tcp_reset() error propagation */
> +	switch (sk->sk_state) {
> +	case TCP_SYN_SENT:
> +		sk->sk_err = ECONNREFUSED;
> +		break;
> +	case TCP_CLOSE_WAIT:
> +		sk->sk_err = EPIPE;
> +		break;
> +	case TCP_CLOSE:
> +		return;
> +	default:
> +		sk->sk_err = ECONNRESET;
> +	}

Should we eventually move this out of tcp_reset(), export it (inline
function?) and re-use it here to avoid the duplication (and eventually
going out-of-sync in case of changes on the TCP side? even if I guess
this would not change, but well)

> +
>  	inet_sk_state_store(sk, TCP_CLOSE);
>  	sk->sk_shutdown = SHUTDOWN_MASK;
>  	smp_mb__before_atomic(); /* SHUTDOWN must be visible first */
>  	set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags);
>  
> -	mptcp_close_wake_up(sk);
> +	/* the calling mptcp_worker will properly destroy the socket */
> +	if (sock_flag(sk, SOCK_DEAD))
> +		return;
> +
> +	sk->sk_state_change(sk);
> +	sk_error_report(sk);
>  }
>  
>  static void __mptcp_retrans(struct sock *sk)

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net


* Re: [PATCH v3 mptcp-next 3/3] selftests: mptcp: update and extend fastclose test-cases
  2022-09-01 17:31 ` [PATCH v3 mptcp-next 3/3] selftests: mptcp: update and extend fastclose test-cases Paolo Abeni
@ 2022-09-07 17:22   ` Matthieu Baerts
  2022-09-07 18:58     ` Matthieu Baerts
  2022-09-09  8:09     ` Paolo Abeni
  0 siblings, 2 replies; 10+ messages in thread
From: Matthieu Baerts @ 2022-09-07 17:22 UTC (permalink / raw)
  To: Paolo Abeni, mptcp

Hi Paolo,

On 01/09/2022 19:31, Paolo Abeni wrote:
> After the previous patches, the MPTCP protocol can generate
> fast-closes on both ends of the connection. Rework the relevant
> test-case to carefully trigger the fast-close code-path on a
> single end at a time, while ensuring that a predictable amount
> of data is spooled on both ends.
> 
> Additionally, add another test-case for the passive socket
> fast-close.

Thank you for the v3!

(...)

> diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
> index 24d4e9cb617e..ee6df83c55f3 100644
> --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
> +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c

(...)

> @@ -381,8 +382,6 @@ static size_t do_rnd_write(const int fd, char *buf, const size_t len)
>  		do_w = cfg_do_w;
>  
>  	bw = write(fd, buf, do_w);
> -	if (bw < 0)
> -		perror("write");

Should we skip the sleep done just below and return bw directly in
case of error?
Or is it more of a small optimization that is not really needed?

>  
>  	/* let the join handshake complete, before going on */
>  	if (cfg_join && first) {
> @@ -571,7 +570,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
>  		.fd = peerfd,
>  		.events = POLLIN | POLLOUT,
>  	};
> -	unsigned int woff = 0, wlen = 0;
> +	unsigned int woff = 0, wlen = 0, total_wlen = 0, total_rlen = 0;
>  	char wbuf[8192];
>  
>  	set_nonblock(peerfd, true);
> @@ -597,7 +596,16 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
>  		}
>  
>  		if (fds.revents & POLLIN) {
> -			len = do_rnd_read(peerfd, rbuf, sizeof(rbuf));
> +			ssize_t rb = sizeof(rbuf);
> +
> +			/* limit the total amount of read data to the trunc value*/
> +			if (cfg_truncate > 0) {
> +				if (rb + total_rlen > cfg_truncate)
> +					rb = cfg_truncate - total_rlen;
> +				len = read(peerfd, rbuf, rb);
> +			} else {
> +				len = do_rnd_read(peerfd, rbuf, sizeof(rbuf));
> +			}
>  			if (len == 0) {
>  				/* no more data to receive:
>  				 * peer has closed its write side
> @@ -612,10 +620,14 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
>  
>  			/* Else, still have data to transmit */
>  			} else if (len < 0) {
> +				/* ignore errors on I/O operation when the peer is fastclosing */
> +				if (cfg_truncate < 0)

(it is confusing to me to use negative values for a special case
(requiring an additional comment each time) instead of using a new
dedicated option with a clear name, but that's probably just me :) )

> +					return 0;
>  				perror("read");
>  				return 3;
>  			}
>  
> +			total_rlen += len;
>  			do_write(outfd, rbuf, len);
>  		}
>  

(...)

> @@ -1262,8 +1295,15 @@ static void parse_opts(int argc, char **argv)
>  {
>  	int c;
>  
> -	while ((c = getopt(argc, argv, "6c:hi:I:jlm:M:o:p:P:r:R:s:S:t:T:w:")) != -1) {
> +	while ((c = getopt(argc, argv, "6c:f:hi:I:jlm:M:o:p:P:r:R:s:S:t:T:w:")) != -1) {
>  		switch (c) {
> +		case 'f':

Maybe better to update the die_usage() function if you don't mind.

> +			cfg_truncate = atoi(optarg);
> +
> +			/* when receiving a fastclose, ignore PIPE signals */
> +			if (cfg_truncate < 0)
> +				signal(SIGPIPE, handle_signal);
> +			break;
>  		case 'j':
>  			cfg_join = true;
>  			cfg_mode = CFG_MODE_POLL;
> diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> index 2957fe414639..0f4c9c4bcb5b 100755
> --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh

(...)

> @@ -707,9 +718,29 @@ do_transfer()
>  	fi
>  
>  	local flags="subflow"
> +	local extra_cl_args=""
> +	local extra_srv_args=""
> +	local trunc_size=""
>  	if [[ "${addr_nr_ns2}" = "fastclose_"* ]]; then
> +		if [ ${test_link_fail} -le 1 ]; then
> +			echo "fastclose tests need test_link_fail argument"
> +			return 0

Probably clearer to exit directly with an error, to be sure the CI
catches this, no?
Or do you think it is not needed?

> +		fi
> +
>  		# disconnect
> -		extra_args="$extra_args -I ${addr_nr_ns2:10}"
> +		trunc_size=${test_link_fail}
> +		local side=${addr_nr_ns2:10}
> +
> +		if [ ${side} = "client" ]; then
> +			extra_cl_args="-f ${test_link_fail}"
> +			extra_srv_args="-f -1"
> +		elif [ ${side} = "server" ]; then
> +			extra_srv_args="-f ${test_link_fail}"
> +			extra_cl_args="-f -1"
> +		else
> +			echo "wrong/unknown fastclose spec ${side}"
> +			return 0

Same here.

> +		fi
>  		addr_nr_ns2=0
>  	elif [[ "${addr_nr_ns2}" = "userspace_"* ]]; then
>  		userspace_pm=1

(...)

> @@ -1188,12 +1221,23 @@ chk_fclose_nr()
>  {
>  	local fclose_tx=$1
>  	local fclose_rx=$2
> +	local ns_invert=${3:-""}

(I'm not sure why you need to default to an empty string but that's a
detail :) )

>  	local count
>  	local dump_stats
> +	local ns_tx=$ns2
> +	local ns_rx=$ns1
> +	local extra_msg=""
> +
> +	if [[ $ns_invert = "invert" ]]; then
> +		ns_tx=$ns1
> +		ns_rx=$ns2
> +		extra_msg="   invert"
> +	fi
>  
>  	printf "%-${nr_blank}s %s" " " "ctx"
> -	count=$(ip netns exec $ns2 nstat -as | grep MPTcpExtMPFastcloseTx | awk '{print $2}')
> +	count=$(ip netns exec $ns_tx nstat -as | grep MPTcpExtMPFastcloseTx | awk '{print $2}')
>  	[ -z "$count" ] && count=0
> +	[ "$count" != "$fclose_tx" ] && extra_msg="$extra_msg,tx=$count"

detail: if we are not in the "invert" mode, extra_msg will be empty: no
need to add the 3 spaces? or init extra_msg with these 3 spaces?

>  	if [ "$count" != "$fclose_tx" ]; then
>  		echo "[fail] got $count MP_FASTCLOSE[s] TX expected $fclose_tx"
>  		fail_test

(...)

> @@ -1236,7 +1283,7 @@ chk_rst_nr()
>  	printf "%-${nr_blank}s %s" " " "rtx"
>  	count=$(ip netns exec $ns_tx nstat -as | grep MPTcpExtMPRstTx | awk '{print $2}')
>  	[ -z "$count" ] && count=0
> -	if [ "$count" != "$rst_tx" ]; then
> +	if [ $count -lt $rst_tx ]; then

Why is it fine if we send (and below receive) more than expected?

>  		echo "[fail] got $count MP_RST[s] TX expected $rst_tx"
>  		fail_test
>  		dump_stats=1
> @@ -1247,7 +1294,7 @@ chk_rst_nr()
>  	echo -n " - rstrx "
>  	count=$(ip netns exec $ns_rx nstat -as | grep MPTcpExtMPRstRx | awk '{print $2}')
>  	[ -z "$count" ] && count=0
> -	if [ "$count" != "$rst_rx" ]; then
> +	if [ "$count" -lt "$rst_rx" ]; then
>  		echo "[fail] got $count MP_RST[s] RX expected $rst_rx"
>  		fail_test
>  		dump_stats=1
> @@ -2801,11 +2848,18 @@ fullmesh_tests()
>  fastclose_tests()
>  {
>  	if reset "fastclose test"; then
> -		run_tests $ns1 $ns2 10.0.1.1 1024 0 fastclose_2
> +		run_tests $ns1 $ns2 10.0.1.1 1024 0 fastclose_client

clearer :)

>  		chk_join_nr 0 0 0
>  		chk_fclose_nr 1 1
>  		chk_rst_nr 1 1 invert
>  	fi
> +
> +	if reset "fastclose server test"; then
> +		run_tests $ns1 $ns2 10.0.1.1 1024 0 fastclose_server
> +		chk_join_nr 0 0 0
> +		chk_fclose_nr 1 1 invert
> +		chk_rst_nr 1 1
> +	fi
>  }
>  
>  pedit_action_pkts()

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net


* Re: [PATCH v3 mptcp-next 3/3] selftests: mptcp: update and extend fastclose test-cases
  2022-09-07 17:22   ` Matthieu Baerts
@ 2022-09-07 18:58     ` Matthieu Baerts
  2022-09-09  8:09     ` Paolo Abeni
  1 sibling, 0 replies; 10+ messages in thread
From: Matthieu Baerts @ 2022-09-07 18:58 UTC (permalink / raw)
  To: Paolo Abeni, mptcp

On 07/09/2022 19:22, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 01/09/2022 19:31, Paolo Abeni wrote:
>> After the previous patches, the MPTCP protocol can generate
>> fast-closes on both ends of the connection. Rework the relevant
>> test-case to carefully trigger the fast-close code-path on a
>> single end at a time, while ensuring that a predictable amount
>> of data is spooled on both ends.
>>
>> Additionally add another test-case for the passive socket
>> fast-close.
> 
> Thank you for the v3!

One last thing: it looks like some packetdrill tests -- dss and syscalls
-- will need to be adapted as well because in some situations, an RST is
now going to be sent instead of a FIN.

Cheers,
Matt

* Re: [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error.
  2022-09-07 17:22 ` [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error Matthieu Baerts
@ 2022-09-08 11:48   ` Paolo Abeni
  2022-09-08 12:13     ` Matthieu Baerts
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Abeni @ 2022-09-08 11:48 UTC (permalink / raw)
  To: Matthieu Baerts, mptcp

On Wed, 2022-09-07 at 19:22 +0200, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 01/09/2022 19:31, Paolo Abeni wrote:
> > When an mptcp socket is closed due to an incoming FASTCLOSE
> > option, no specific sk_err is set and later syscalls will
> > usually fail with EPIPE.
> > 
> > Align the current fastclose error handling with TCP reset,
> > properly setting the socket error according to the current
> > msk state and propagating such error.
> 
> Thank you for the patch and sorry for the delay!

Thank you for the review!
> 
> I have some questions here below:
> 
> > Additionally, sendmsg() currently does not handle sk_err
> > properly, always returning EPIPE.
> Could it be seen as a bug-fix and deserving a dedicated commit?

I personally see this as an improvement, as AFAIK there is no
RFC-mandated return error code for MPTCP.
> 
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  net/mptcp/protocol.c | 45 ++++++++++++++++++++++++++++++++++----------
> >  1 file changed, 35 insertions(+), 10 deletions(-)
> > 
> > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > index f4891db86217..b04f184695e4 100644
> > --- a/net/mptcp/protocol.c
> > +++ b/net/mptcp/protocol.c
> > @@ -1685,9 +1685,13 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> >  	if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) {
> >  		ret = sk_stream_wait_connect(sk, &timeo);
> >  		if (ret)
> > -			goto out;
> > +			goto do_error;
> >  	}
> >  
> > +	ret = -EPIPE;
> 
> Regarding this line above... (see below)
> 
> > +	if (unlikely(sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)))
> > +		goto do_error;
> > +
> >  	pfrag = sk_page_frag(sk);
> >  
> >  	while (msg_data_left(msg)) {
> 
> ...Could we eventually have no data left and not go through the while
> loop?

yes...

> In this case "copied" will be 0 and we will return the new -EPIPE, maybe
> not what we want, no?

no;) we now always return 'copied'. We set 'copied' to the correct
error code only in the error path, saving a conditional in the fast
path.

All gracefully stolen from plain TCP ;)

> Or should we change 'ret' value just before calling 'goto do_error' to
> avoid that?
> 
> (...)
> 
> > @@ -2404,12 +2410,31 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
> >  		unlock_sock_fast(tcp_sk, slow);
> >  	}
> >  
> > +	/* Mirror the tcp_reset() error propagation */
> > +	switch (sk->sk_state) {
> > +	case TCP_SYN_SENT:
> > +		sk->sk_err = ECONNREFUSED;
> > +		break;
> > +	case TCP_CLOSE_WAIT:
> > +		sk->sk_err = EPIPE;
> > +		break;
> > +	case TCP_CLOSE:
> > +		return;
> > +	default:
> > +		sk->sk_err = ECONNRESET;
> > +	}
> 
> Should we eventually move this out of tcp_reset(), export it (inline
> function?) and re-using it here to avoid the duplication (and eventually
> being out-of-sync in case of changes on TCP side? even if I guess this
> would not change but well)

That could be a possible follow-up I guess, but I would refrain from
touching TCP to address issues/295.


Thanks!

Paolo



* Re: [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error.
  2022-09-08 11:48   ` Paolo Abeni
@ 2022-09-08 12:13     ` Matthieu Baerts
  0 siblings, 0 replies; 10+ messages in thread
From: Matthieu Baerts @ 2022-09-08 12:13 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp

Hi Paolo,

Thank you for your replies!

On 08/09/2022 13:48, Paolo Abeni wrote:
> On Wed, 2022-09-07 at 19:22 +0200, Matthieu Baerts wrote:
>> On 01/09/2022 19:31, Paolo Abeni wrote:

(...)

>>> Additionally, sendmsg() currently does not handle sk_err
>>> properly, always returning EPIPE.
>> Could it be seen as a bug-fix and deserving a dedicated commit?
> 
> I personally see this as an improvement, as AFAIK there is no
> RFC-mandated return error code for MPTCP.

Mmh yes I see, we can say that.

>>
>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>> ---
>>>  net/mptcp/protocol.c | 45 ++++++++++++++++++++++++++++++++++----------
>>>  1 file changed, 35 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>>> index f4891db86217..b04f184695e4 100644
>>> --- a/net/mptcp/protocol.c
>>> +++ b/net/mptcp/protocol.c
>>> @@ -1685,9 +1685,13 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
>>>  	if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) {
>>>  		ret = sk_stream_wait_connect(sk, &timeo);
>>>  		if (ret)
>>> -			goto out;
>>> +			goto do_error;
>>>  	}
>>>  
>>> +	ret = -EPIPE;
>>
>> Regarding this line above... (see below)
>>
>>> +	if (unlikely(sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)))
>>> +		goto do_error;
>>> +
>>>  	pfrag = sk_page_frag(sk);
>>>  
>>>  	while (msg_data_left(msg)) {
>>
>> ...Could we eventually have no data left and not go through the while
>> loop?
> 
> yes...
> 
>> In this case "copied" will be 0 and we will return the new -EPIPE, maybe
>> not what we want, no?
> 
> no;) we now always return 'copied'. We set 'copied' to the correct
> error code only in the error path, saving a conditional in the fast
> path.

My bad, for the end of the function, I was looking at the code prior
to this patch... Indeed, all good with the rest of the modifications :)

> All gracefully stolen from plain TCP ;)

:)

> 
>> Or should we change 'ret' value just before calling 'goto do_error' to
>> avoid that?
>>
>> (...)
>>
>>> @@ -2404,12 +2410,31 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
>>>  		unlock_sock_fast(tcp_sk, slow);
>>>  	}
>>>  
>>> +	/* Mirror the tcp_reset() error propagation */
>>> +	switch (sk->sk_state) {
>>> +	case TCP_SYN_SENT:
>>> +		sk->sk_err = ECONNREFUSED;
>>> +		break;
>>> +	case TCP_CLOSE_WAIT:
>>> +		sk->sk_err = EPIPE;
>>> +		break;
>>> +	case TCP_CLOSE:
>>> +		return;
>>> +	default:
>>> +		sk->sk_err = ECONNRESET;
>>> +	}
>>
>> Should we eventually move this out of tcp_reset(), export it (inline
>> function?) and re-using it here to avoid the duplication (and eventually
>> being out-of-sync in case of changes on TCP side? even if I guess this
>> would not change but well)
> 
> That could be a possible follow-up I guess, but I would refrain from
> touching TCP to address issues/295.

Sure, we can do that later as a clean-up.


Patch 1/3 and 2/3 from this v3 are then good to me:

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>

(I'm not going to apply them now as we need patch 3/3 and I had some
questions there + packetdrill adaptations)


Cheers,
Matt

* Re: [PATCH v3 mptcp-next 3/3] selftests: mptcp: update and extend fastclose test-cases
  2022-09-07 17:22   ` Matthieu Baerts
  2022-09-07 18:58     ` Matthieu Baerts
@ 2022-09-09  8:09     ` Paolo Abeni
  2022-09-09  8:52       ` Matthieu Baerts
  1 sibling, 1 reply; 10+ messages in thread
From: Paolo Abeni @ 2022-09-09  8:09 UTC (permalink / raw)
  To: Matthieu Baerts, mptcp

On Wed, 2022-09-07 at 19:22 +0200, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 01/09/2022 19:31, Paolo Abeni wrote:
> > After the previous patches, the MPTCP protocol can generate
> > fast-closes on both ends of the connection. Rework the relevant
> > test-case to carefully trigger the fast-close code-path on a
> > single end at a time, while ensuring that a predictable amount
> > of data is spooled on both ends.
> > 
> > Additionally add another test-case for the passive socket
> > fast-close.
> 
> Thank you for the v3!
> 
> (...)
> 
> > diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
> > index 24d4e9cb617e..ee6df83c55f3 100644
> > --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
> > +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
> 
> (...)
> 
> > @@ -381,8 +382,6 @@ static size_t do_rnd_write(const int fd, char *buf, const size_t len)
> >  		do_w = cfg_do_w;
> >  
> >  	bw = write(fd, buf, do_w);
> > -	if (bw < 0)
> > -		perror("write");
> 
> Should we skip the sleep done here just below and return bw directly in
> case of error?
> Or is it more like a small optimization that is not really needed?
> 
> >  
> >  	/* let the join handshake complete, before going on */
> >  	if (cfg_join && first) {
> > @@ -571,7 +570,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
> >  		.fd = peerfd,
> >  		.events = POLLIN | POLLOUT,
> >  	};
> > -	unsigned int woff = 0, wlen = 0;
> > +	unsigned int woff = 0, wlen = 0, total_wlen = 0, total_rlen = 0;
> >  	char wbuf[8192];
> >  
> >  	set_nonblock(peerfd, true);
> > @@ -597,7 +596,16 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
> >  		}
> >  
> >  		if (fds.revents & POLLIN) {
> > -			len = do_rnd_read(peerfd, rbuf, sizeof(rbuf));
> > +			ssize_t rb = sizeof(rbuf);
> > +
> > +			/* limit the total amount of read data to the trunc value*/
> > +			if (cfg_truncate > 0) {
> > +				if (rb + total_rlen > cfg_truncate)
> > +					rb = cfg_truncate - total_rlen;
> > +				len = read(peerfd, rbuf, rb);
> > +			} else {
> > +				len = do_rnd_read(peerfd, rbuf, sizeof(rbuf));
> > +			}
> >  			if (len == 0) {
> >  				/* no more data to receive:
> >  				 * peer has closed its write side
> > @@ -612,10 +620,14 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
> >  
> >  			/* Else, still have data to transmit */
> >  			} else if (len < 0) {
> > +				/* ignore errors on I/O operation when the peer is fastclosing */
> > +				if (cfg_truncate < 0)
> 
> (it is confusing to me to use negative values for a special case
> (requiring an additional comment each time) instead of using a new
> dedicated option with a clear name but probably just me :) )
> 
> > +					return 0;
> >  				perror("read");
> >  				return 3;
> >  			}
> >  
> > +			total_rlen += len;
> >  			do_write(outfd, rbuf, len);
> >  		}
> >  
> 
> (...)
> 
> > @@ -1262,8 +1295,15 @@ static void parse_opts(int argc, char **argv)
> >  {
> >  	int c;
> >  
> > -	while ((c = getopt(argc, argv, "6c:hi:I:jlm:M:o:p:P:r:R:s:S:t:T:w:")) != -1) {
> > +	while ((c = getopt(argc, argv, "6c:f:hi:I:jlm:M:o:p:P:r:R:s:S:t:T:w:")) != -1) {
> >  		switch (c) {
> > +		case 'f':
> 
> Maybe better to update the die_usage() function if you don't mind.
> 
> > +			cfg_truncate = atoi(optarg);
> > +
> > +			/* when receiving a fastclose, ignore PIPE signals */
> > +			if (cfg_truncate < 0)
> > +				signal(SIGPIPE, handle_signal);
> > +			break;
> >  		case 'j':
> >  			cfg_join = true;
> >  			cfg_mode = CFG_MODE_POLL;
> > diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > index 2957fe414639..0f4c9c4bcb5b 100755
> > --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> 
> (...)
> 
> > @@ -707,9 +718,29 @@ do_transfer()
> >  	fi
> >  
> >  	local flags="subflow"
> > +	local extra_cl_args=""
> > +	local extra_srv_args=""
> > +	local trunc_size=""
> >  	if [[ "${addr_nr_ns2}" = "fastclose_"* ]]; then
> > +		if [ ${test_link_fail} -le 1 ]; then
> > +			echo "fastclose tests need test_link_fail argument"
> > +			return 0
> 
> Probably clearer to exit directly with an error, to be sure the CI
> catches this, no?
> Or do you think it is not needed?
> 
> > +		fi
> > +
> >  		# disconnect
> > -		extra_args="$extra_args -I ${addr_nr_ns2:10}"
> > +		trunc_size=${test_link_fail}
> > +		local side=${addr_nr_ns2:10}
> > +
> > +		if [ ${side} = "client" ]; then
> > +			extra_cl_args="-f ${test_link_fail}"
> > +			extra_srv_args="-f -1"
> > +		elif [ ${side} = "server" ]; then
> > +			extra_srv_args="-f ${test_link_fail}"
> > +			extra_cl_args="-f -1"
> > +		else
> > +			echo "wrong/unknown fastclose spec ${side}"
> > +			return 0
> 
> Same here.
> 
> > +		fi
> >  		addr_nr_ns2=0
> >  	elif [[ "${addr_nr_ns2}" = "userspace_"* ]]; then
> >  		userspace_pm=1
> 
> (...)
> 
> > @@ -1188,12 +1221,23 @@ chk_fclose_nr()
> >  {
> >  	local fclose_tx=$1
> >  	local fclose_rx=$2
> > +	local ns_invert=${3:-""}
> 
> (I'm not sure why you need to default to an empty string but that's a
> detail :) )
> 
> >  	local count
> >  	local dump_stats
> > +	local ns_tx=$ns2
> > +	local ns_rx=$ns1
> > +	local extra_msg=""
> > +
> > +	if [[ $ns_invert = "invert" ]]; then
> > +		ns_tx=$ns1
> > +		ns_rx=$ns2
> > +		extra_msg="   invert"
> > +	fi
> >  
> >  	printf "%-${nr_blank}s %s" " " "ctx"
> > -	count=$(ip netns exec $ns2 nstat -as | grep MPTcpExtMPFastcloseTx | awk '{print $2}')
> > +	count=$(ip netns exec $ns_tx nstat -as | grep MPTcpExtMPFastcloseTx | awk '{print $2}')
> >  	[ -z "$count" ] && count=0
> > +	[ "$count" != "$fclose_tx" ] && extra_msg="$extra_msg,tx=$count"
> 
> detail: if we are not in the "invert" mode, extra_msg will be empty: no
> need to add the 3 spaces? or init extra_msg with these 3 spaces?
> 
> >  	if [ "$count" != "$fclose_tx" ]; then
> >  		echo "[fail] got $count MP_FASTCLOSE[s] TX expected $fclose_tx"
> >  		fail_test
> 
> (...)
> 
> > @@ -1236,7 +1283,7 @@ chk_rst_nr()
> >  	printf "%-${nr_blank}s %s" " " "rtx"
> >  	count=$(ip netns exec $ns_tx nstat -as | grep MPTcpExtMPRstTx | awk '{print $2}')
> >  	[ -z "$count" ] && count=0
> > -	if [ "$count" != "$rst_tx" ]; then
> > +	if [ $count -lt $rst_tx ]; then
> 
> Why is it fine if we send (and below receive) more than expected?

[answering only here because this will be the only suggestion leading
to no changes in the next revision]

Because the RST can be retransmitted. I'm not sure/can't recall how the
self-test previously was able to enforce a single reset, but that does
not look possible now.

Thanks,

Paolo



* Re: [PATCH v3 mptcp-next 3/3] selftests: mptcp: update and extend fastclose test-cases
  2022-09-09  8:09     ` Paolo Abeni
@ 2022-09-09  8:52       ` Matthieu Baerts
  0 siblings, 0 replies; 10+ messages in thread
From: Matthieu Baerts @ 2022-09-09  8:52 UTC (permalink / raw)
  To: Paolo Abeni, mptcp

Hi Paolo,

On 09/09/2022 10:09, Paolo Abeni wrote:
> On Wed, 2022-09-07 at 19:22 +0200, Matthieu Baerts wrote:
>> On 01/09/2022 19:31, Paolo Abeni wrote:
>>> After the previous patches, the MPTCP protocol can generate
>>> fast-closes on both ends of the connection. Rework the relevant
>>> test-case to carefully trigger the fast-close code-path on a
>>> single end at the time, while ensuring than a predictable amount
>>> of data is spooled on both ends.
>>>
>>> Additionally add another test-case for the passive socket
>>> fast-close.
>>
>> Thank you for the v3!
>>
>> (...)
>>
>>> diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
>>> index 24d4e9cb617e..ee6df83c55f3 100644
>>> --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
>>> +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c

(...)

>>> @@ -1236,7 +1283,7 @@ chk_rst_nr()
>>>  	printf "%-${nr_blank}s %s" " " "rtx"
>>>  	count=$(ip netns exec $ns_tx nstat -as | grep MPTcpExtMPRstTx | awk '{print $2}')
>>>  	[ -z "$count" ] && count=0
>>> -	if [ "$count" != "$rst_tx" ]; then
>>> +	if [ $count -lt $rst_tx ]; then
>>
>> Why is it fine if we send (and below receive) more than expected?
> 
> [answering only here because this will be the only suggestion leading
> to no changes in the next revision]
> 
> Because the rst can be retransmitted. I'm not sure/can't recall how the
> self-test previously was able to enforce a single reset, but that does
> not look possible now.

Indeed, good idea to do that! I guess we can control the number of
retransmissions in general but not specifically for the reset: we still
want to accept retransmissions of the previous data packets.

(I'm sorry to repeat the same kind of thing again but should we extract
this part and have a dedicated commit for -net? :) )

Cheers,
Matt

end of thread, other threads:[~2022-09-09  8:52 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-01 17:31 [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error Paolo Abeni
2022-09-01 17:31 ` [PATCH v3 mptcp-next 2/3] mptcp: use fastclose on more edge scenarios Paolo Abeni
2022-09-01 17:31 ` [PATCH v3 mptcp-next 3/3] selftests: mptcp: update and extend fastclose test-cases Paolo Abeni
2022-09-07 17:22   ` Matthieu Baerts
2022-09-07 18:58     ` Matthieu Baerts
2022-09-09  8:09     ` Paolo Abeni
2022-09-09  8:52       ` Matthieu Baerts
2022-09-07 17:22 ` [PATCH v3 mptcp-next 1/3] mptcp: propagate fastclose error Matthieu Baerts
2022-09-08 11:48   ` Paolo Abeni
2022-09-08 12:13     ` Matthieu Baerts
