All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH mptcp-next 0/3] mptcp: improve subflow creation on errors
@ 2021-11-23 16:37 Paolo Abeni
  2021-11-23 16:37 ` [PATCH mptcp-next 1/3] mptcp: clean-up MPJ option writing Paolo Abeni
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Paolo Abeni @ 2021-11-23 16:37 UTC (permalink / raw)
  To: mptcp; +Cc: Phil Greenland

Currently when a subflow connection fails - either the TCP
connection is reset by the peer, or the MPJ handshake never
completes - the in kernel PM don't perform any further action.
Notably no additional subflow creation is attempted.

This series is aimed at improving the in-kernel path manager
behavior in the above scenario: on failure we try to move
to the next subflow, if any.

The first patch is minor cleanup, the feature is implemented
by patch 2/3. The last patch adds a specific self-tests.

This should possibly address/close:
https://github.com/multipath-tcp/mptcp_net-next/issues/235
https://github.com/multipath-tcp/mptcp_net-next/issues/242

@Phil: could you please have a spin in your testbed and
see if this is enough to fit you?

Paolo Abeni (3):
  mptcp: clean-up MPJ option writing.
  mptcp: do not block subflows creation on errors
  selftests: mptcp: add tests for subflow creation failure

 net/mptcp/options.c                           | 63 ++++++++++++-------
 net/mptcp/protocol.h                          |  1 +
 net/mptcp/subflow.c                           | 12 +++-
 .../testing/selftests/net/mptcp/mptcp_join.sh | 10 +++
 4 files changed, 61 insertions(+), 25 deletions(-)

-- 
2.33.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH mptcp-next 1/3] mptcp: clean-up MPJ option writing.
  2021-11-23 16:37 [PATCH mptcp-next 0/3] mptcp: improve subflow creation on errors Paolo Abeni
@ 2021-11-23 16:37 ` Paolo Abeni
  2021-11-24  9:52   ` Matthieu Baerts
  2021-11-23 16:37 ` [PATCH mptcp-next 2/3] mptcp: do not block subflows creation on errors Paolo Abeni
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Paolo Abeni @ 2021-11-23 16:37 UTC (permalink / raw)
  To: mptcp; +Cc: Phil Greenland

Check for all MPJ variant at once, this reduces the number
of conditionals traversed on average and will simplify the
next patch.

No functional change intended.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/options.c | 44 +++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 8a1020e4285c..7a6a39b71633 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -1381,27 +1381,29 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 
 		/* MPC is additionally mutually exclusive with MP_PRIO */
 		goto mp_capable_done;
-	} else if (OPTION_MPTCP_MPJ_SYN & opts->suboptions) {
-		*ptr++ = mptcp_option(MPTCPOPT_MP_JOIN,
-				      TCPOLEN_MPTCP_MPJ_SYN,
-				      opts->backup, opts->join_id);
-		put_unaligned_be32(opts->token, ptr);
-		ptr += 1;
-		put_unaligned_be32(opts->nonce, ptr);
-		ptr += 1;
-	} else if (OPTION_MPTCP_MPJ_SYNACK & opts->suboptions) {
-		*ptr++ = mptcp_option(MPTCPOPT_MP_JOIN,
-				      TCPOLEN_MPTCP_MPJ_SYNACK,
-				      opts->backup, opts->join_id);
-		put_unaligned_be64(opts->thmac, ptr);
-		ptr += 2;
-		put_unaligned_be32(opts->nonce, ptr);
-		ptr += 1;
-	} else if (OPTION_MPTCP_MPJ_ACK & opts->suboptions) {
-		*ptr++ = mptcp_option(MPTCPOPT_MP_JOIN,
-				      TCPOLEN_MPTCP_MPJ_ACK, 0, 0);
-		memcpy(ptr, opts->hmac, MPTCPOPT_HMAC_LEN);
-		ptr += 5;
+	} else if (OPTIONS_MPTCP_MPJ & opts->suboptions) {
+		if (OPTION_MPTCP_MPJ_SYN & opts->suboptions) {
+			*ptr++ = mptcp_option(MPTCPOPT_MP_JOIN,
+					      TCPOLEN_MPTCP_MPJ_SYN,
+					      opts->backup, opts->join_id);
+			put_unaligned_be32(opts->token, ptr);
+			ptr += 1;
+			put_unaligned_be32(opts->nonce, ptr);
+			ptr += 1;
+		} else if (OPTION_MPTCP_MPJ_SYNACK & opts->suboptions) {
+			*ptr++ = mptcp_option(MPTCPOPT_MP_JOIN,
+					      TCPOLEN_MPTCP_MPJ_SYNACK,
+					      opts->backup, opts->join_id);
+			put_unaligned_be64(opts->thmac, ptr);
+			ptr += 2;
+			put_unaligned_be32(opts->nonce, ptr);
+			ptr += 1;
+		} else {
+			*ptr++ = mptcp_option(MPTCPOPT_MP_JOIN,
+					      TCPOLEN_MPTCP_MPJ_ACK, 0, 0);
+			memcpy(ptr, opts->hmac, MPTCPOPT_HMAC_LEN);
+			ptr += 5;
+		}
 	} else if (OPTION_MPTCP_ADD_ADDR & opts->suboptions) {
 		u8 len = TCPOLEN_MPTCP_ADD_ADDR_BASE;
 		u8 echo = MPTCP_ADDR_ECHO;
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH mptcp-next 2/3] mptcp: do not block subflows creation on errors
  2021-11-23 16:37 [PATCH mptcp-next 0/3] mptcp: improve subflow creation on errors Paolo Abeni
  2021-11-23 16:37 ` [PATCH mptcp-next 1/3] mptcp: clean-up MPJ option writing Paolo Abeni
@ 2021-11-23 16:37 ` Paolo Abeni
  2021-11-24  9:55   ` Matthieu Baerts
  2021-11-23 16:37 ` [PATCH mptcp-next 3/3] selftests: mptcp: add tests for subflow creation failure Paolo Abeni
  2021-11-23 18:48 ` [PATCH mptcp-next 0/3] mptcp: improve subflow creation on errors Paolo Abeni
  3 siblings, 1 reply; 12+ messages in thread
From: Paolo Abeni @ 2021-11-23 16:37 UTC (permalink / raw)
  To: mptcp; +Cc: Phil Greenland

If the MPTCP configuration allows for multiple subflows
creation, and the first additional subflows never reach
the fully established status - e.g. due to packets drop or
reset - the in kernel path manager do not move to the
next subflow.

Let's trigger subflow creation on both the above event.
Note that we don't need an additional timer to catch timeout
and/or long delay in connection completion: we just need to
measure the time elapsed since the subflow creation every
time we emit an MPJ sub-option.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/options.c  | 19 ++++++++++++++++---
 net/mptcp/protocol.h |  1 +
 net/mptcp/subflow.c  | 12 +++++++++++-
 3 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 7a6a39b71633..d6c4f5c82d09 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -1257,10 +1257,10 @@ static u16 mptcp_make_csum(const struct mptcp_ext *mpext)
 void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 			 struct mptcp_out_options *opts)
 {
-	if (unlikely(OPTION_MPTCP_FAIL & opts->suboptions)) {
-		const struct sock *ssk = (const struct sock *)tp;
-		struct mptcp_subflow_context *subflow;
+	const struct sock *ssk = (const struct sock *)tp;
+	struct mptcp_subflow_context *subflow;
 
+	if (unlikely(OPTION_MPTCP_FAIL & opts->suboptions)) {
 		subflow = mptcp_subflow_ctx(ssk);
 		subflow->send_mp_fail = 0;
 
@@ -1382,6 +1382,19 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 		/* MPC is additionally mutually exclusive with MP_PRIO */
 		goto mp_capable_done;
 	} else if (OPTIONS_MPTCP_MPJ & opts->suboptions) {
+		if (ssk) {
+			subflow = mptcp_subflow_ctx(ssk);
+			if (subflow->start_stamp &&
+			    unlikely(after(tcp_jiffies32, subflow->start_stamp + HZ/10))) {
+				/* we are not established yet and "a lot" of time passed
+				 * e.g. due to syn retranmissions. We can attempt next
+				 * subflow creation
+				 */
+				mptcp_pm_subflow_established(mptcp_sk(subflow->conn));
+				subflow->start_stamp = 0;
+			}
+		}
+
 		if (OPTION_MPTCP_MPJ_SYN & opts->suboptions) {
 			*ptr++ = mptcp_option(MPTCPOPT_MP_JOIN,
 					      TCPOLEN_MPTCP_MPJ_SYN,
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index a6a4bd7de5b4..d5df04d5b3f0 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -432,6 +432,7 @@ struct mptcp_subflow_context {
 		stale : 1;	    /* unable to snd/rcv data, do not use for xmit */
 	enum mptcp_data_avail data_avail;
 	u32	remote_nonce;
+	u32	start_stamp;
 	u64	thmac;
 	u32	local_nonce;
 	u32	remote_token;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 0f90bd61de01..85986f3b58ed 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1273,7 +1273,16 @@ void __mptcp_error_report(struct sock *sk)
 
 static void subflow_error_report(struct sock *ssk)
 {
-	struct sock *sk = mptcp_subflow_ctx(ssk)->conn;
+	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+	struct sock *sk = subflow->conn;
+
+	/* subflow aborted before reaching the fully_established status
+	 * attempt the creation of the next subflow
+	 */
+	if (unlikely(!subflow->fully_established && subflow->start_stamp)) {
+		mptcp_pm_subflow_established(mptcp_sk(sk));
+		subflow->start_stamp = 0;
+	}
 
 	mptcp_data_lock(sk);
 	if (!sock_owned_by_user(sk))
@@ -1444,6 +1453,7 @@ int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_addr_info *loc,
 	subflow->remote_id = remote_id;
 	subflow->request_join = 1;
 	subflow->request_bkup = !!(flags & MPTCP_PM_ADDR_FLAG_BACKUP);
+	subflow->start_stamp = tcp_jiffies32;
 	mptcp_info2sockaddr(remote, &addr, ssk->sk_family);
 
 	mptcp_add_pending_subflow(msk, subflow);
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH mptcp-next 3/3] selftests: mptcp: add tests for subflow creation failure
  2021-11-23 16:37 [PATCH mptcp-next 0/3] mptcp: improve subflow creation on errors Paolo Abeni
  2021-11-23 16:37 ` [PATCH mptcp-next 1/3] mptcp: clean-up MPJ option writing Paolo Abeni
  2021-11-23 16:37 ` [PATCH mptcp-next 2/3] mptcp: do not block subflows creation on errors Paolo Abeni
@ 2021-11-23 16:37 ` Paolo Abeni
  2021-11-24  9:56   ` Matthieu Baerts
  2021-11-23 18:48 ` [PATCH mptcp-next 0/3] mptcp: improve subflow creation on errors Paolo Abeni
  3 siblings, 1 reply; 12+ messages in thread
From: Paolo Abeni @ 2021-11-23 16:37 UTC (permalink / raw)
  To: mptcp; +Cc: Phil Greenland

Verify that, when multiple endpoints are available, subflows
creation proceed even when the first additional subflow creation
fails - due to packet drop on the relevant link

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 2684ef9c0d42..dd4488746151 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -1023,6 +1023,16 @@ subflows_tests()
 	run_tests $ns1 $ns2 10.0.1.1
 	chk_join_nr "multiple subflows, limited by server" 2 2 1
 
+	# multiple subflows, with subflow creation error
+	reset
+	ip netns exec $ns1 ./pm_nl_ctl limits 0 2
+	ip netns exec $ns2 ./pm_nl_ctl limits 0 2
+	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
+	ip netns exec $ns2 ./pm_nl_ctl add 10.0.2.2 flags subflow
+	ip netns exec $ns1 iptables -A INPUT -s 10.0.3.2 -p tcp -j REJECT
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_join_nr "multiple subflows, with failing subflow" 1 1 1
+
 	# single subflow, dev
 	reset
 	ip netns exec $ns1 ./pm_nl_ctl limits 0 1
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH mptcp-next 0/3] mptcp: improve subflow creation on errors
  2021-11-23 16:37 [PATCH mptcp-next 0/3] mptcp: improve subflow creation on errors Paolo Abeni
                   ` (2 preceding siblings ...)
  2021-11-23 16:37 ` [PATCH mptcp-next 3/3] selftests: mptcp: add tests for subflow creation failure Paolo Abeni
@ 2021-11-23 18:48 ` Paolo Abeni
  3 siblings, 0 replies; 12+ messages in thread
From: Paolo Abeni @ 2021-11-23 18:48 UTC (permalink / raw)
  To: mptcp; +Cc: Phil Greenland

On Tue, 2021-11-23 at 17:37 +0100, Paolo Abeni wrote:
> Currently when a subflow connection fails - either the TCP
> connection is reset by the peer, or the MPJ handshake never
> completes - the in kernel PM don't perform any further action.
> Notably no additional subflow creation is attempted.
> 
> This series is aimed at improving the in-kernel path manager
> behavior in the above scenario: on failure we try to move
> to the next subflow, if any.
> 
> The first patch is minor cleanup, the feature is implemented
> by patch 2/3. The last patch adds a specific self-tests.
> 
> This should possibly address/close:
> https://github.com/multipath-tcp/mptcp_net-next/issues/235
> https://github.com/multipath-tcp/mptcp_net-next/issues/242

Uhm... I actually missed a few points from issues/242. This does not
address the latter, yet.

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH mptcp-next 1/3] mptcp: clean-up MPJ option writing.
  2021-11-23 16:37 ` [PATCH mptcp-next 1/3] mptcp: clean-up MPJ option writing Paolo Abeni
@ 2021-11-24  9:52   ` Matthieu Baerts
  0 siblings, 0 replies; 12+ messages in thread
From: Matthieu Baerts @ 2021-11-24  9:52 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Phil Greenland, mptcp

Hi Paolo,

On 23/11/2021 17:37, Paolo Abeni wrote:
> Check for all MPJ variant at once, this reduces the number
> of conditionals traversed on average and will simplify the
> next patch.
> 
> No functional change intended.
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Good idea!

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH mptcp-next 2/3] mptcp: do not block subflows creation on errors
  2021-11-23 16:37 ` [PATCH mptcp-next 2/3] mptcp: do not block subflows creation on errors Paolo Abeni
@ 2021-11-24  9:55   ` Matthieu Baerts
  2021-11-24 10:09     ` Paolo Abeni
  0 siblings, 1 reply; 12+ messages in thread
From: Matthieu Baerts @ 2021-11-24  9:55 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Phil Greenland, mptcp

Hi Paolo,

On 23/11/2021 17:37, Paolo Abeni wrote:
> If the MPTCP configuration allows for multiple subflows
> creation, and the first additional subflows never reach
> the fully established status - e.g. due to packets drop or
> reset - the in kernel path manager do not move to the
> next subflow.

Thank you for looking at that!

Why the PM could not try to establish all possible paths "in parallel"?
Is it because we have this limit of allowed subflows and we wait to know
if a path is established to count it as used subflow? But if here we
mark a path as established if it has an issue, does it still make sense?
I probably missed something here :)

> Let's trigger subflow creation on both the above event.
> Note that we don't need an additional timer to catch timeout
> and/or long delay in connection completion: we just need to
> measure the time elapsed since the subflow creation every
> time we emit an MPJ sub-option.

Just to be sure: can we do that because we always expect a retransmission?
> @@ -1382,6 +1382,19 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
>  		/* MPC is additionally mutually exclusive with MP_PRIO */
>  		goto mp_capable_done;
>  	} else if (OPTIONS_MPTCP_MPJ & opts->suboptions) {
> +		if (ssk) {
> +			subflow = mptcp_subflow_ctx(ssk);
> +			if (subflow->start_stamp &&
> +			    unlikely(after(tcp_jiffies32, subflow->start_stamp + HZ/10))) {

Small detail: checkpatch asks to have space around "/". I can add them
when applying the patch if only this needs to be modified.

> +				/* we are not established yet and "a lot" of time passed
> +				 * e.g. due to syn retranmissions. We can attempt next
> +				 * subflow creation

That's strange to tell the PM "yes, the subflow is established now"
while there is potentially (or clearly in case of error (RST) below) an
issue :)

Should we eventually have another function name doing the same and
resetting the stamp variable?

> +				 */
> +				mptcp_pm_subflow_established(mptcp_sk(subflow->conn));
> +				subflow->start_stamp = 0;
> +			}
> +		}
> +
>  		if (OPTION_MPTCP_MPJ_SYN & opts->suboptions) {
>  			*ptr++ = mptcp_option(MPTCPOPT_MP_JOIN,
>  					      TCPOLEN_MPTCP_MPJ_SYN,
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 0f90bd61de01..85986f3b58ed 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -1273,7 +1273,16 @@ void __mptcp_error_report(struct sock *sk)
>  
>  static void subflow_error_report(struct sock *ssk)
>  {
> -	struct sock *sk = mptcp_subflow_ctx(ssk)->conn;
> +	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
> +	struct sock *sk = subflow->conn;
> +
> +	/* subflow aborted before reaching the fully_established status
> +	 * attempt the creation of the next subflow
> +	 */
> +	if (unlikely(!subflow->fully_established && subflow->start_stamp)) {
> +		mptcp_pm_subflow_established(mptcp_sk(sk));
> +		subflow->start_stamp = 0;

Detail: do you need to set it to 0 here?

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH mptcp-next 3/3] selftests: mptcp: add tests for subflow creation failure
  2021-11-23 16:37 ` [PATCH mptcp-next 3/3] selftests: mptcp: add tests for subflow creation failure Paolo Abeni
@ 2021-11-24  9:56   ` Matthieu Baerts
  2021-11-24 10:10     ` Paolo Abeni
  0 siblings, 1 reply; 12+ messages in thread
From: Matthieu Baerts @ 2021-11-24  9:56 UTC (permalink / raw)
  To: Paolo Abeni, mptcp; +Cc: Phil Greenland

Hi Paolo,

On 23/11/2021 17:37, Paolo Abeni wrote:
> Verify that, when multiple endpoints are available, subflows
> creation proceed even when the first additional subflow creation
> fails - due to packet drop on the relevant link

Here you don't drop the packet but you reset it. Should we have a test
for both (REJECT/DROP) (or with an additional subflow where packets are
dropped)?

> diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> index 2684ef9c0d42..dd4488746151 100755
> --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> @@ -1023,6 +1023,16 @@ subflows_tests()
>  	run_tests $ns1 $ns2 10.0.1.1
>  	chk_join_nr "multiple subflows, limited by server" 2 2 1
>  
> +	# multiple subflows, with subflow creation error
> +	reset
> +	ip netns exec $ns1 ./pm_nl_ctl limits 0 2
> +	ip netns exec $ns2 ./pm_nl_ctl limits 0 2
> +	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
> +	ip netns exec $ns2 ./pm_nl_ctl add 10.0.2.2 flags subflow
> +	ip netns exec $ns1 iptables -A INPUT -s 10.0.3.2 -p tcp -j REJECT

I guess we should then add "CONFIG_IP_NF_TARGET_REJECT" in the config
file :)

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH mptcp-next 2/3] mptcp: do not block subflows creation on errors
  2021-11-24  9:55   ` Matthieu Baerts
@ 2021-11-24 10:09     ` Paolo Abeni
  2021-11-24 12:23       ` Matthieu Baerts
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Abeni @ 2021-11-24 10:09 UTC (permalink / raw)
  To: Matthieu Baerts; +Cc: Phil Greenland, mptcp

On Wed, 2021-11-24 at 10:55 +0100, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 23/11/2021 17:37, Paolo Abeni wrote:
> > If the MPTCP configuration allows for multiple subflows
> > creation, and the first additional subflows never reach
> > the fully established status - e.g. due to packets drop or
> > reset - the in kernel path manager do not move to the
> > next subflow.
> 
> Thank you for looking at that!
> 
> Why the PM could not try to establish all possible paths "in parallel"?
> Is it because we have this limit of allowed subflows and we wait to know
> if a path is established to count it as used subflow?

IIRC, that design choice is to avoid that an MPTCP endpoint could
easily become a DDoS bot, generating "a lot" of syn in response to
action from possibly untrusted peer.

>  But if here we
> mark a path as established if it has an issue, does it still make sense?
> I probably missed something here :)

With this patch we don't mark the path as astablished, we try to pick
the next one. 

> > Let's trigger subflow creation on both the above event.
> > Note that we don't need an additional timer to catch timeout
> > and/or long delay in connection completion: we just need to
> > measure the time elapsed since the subflow creation every
> > time we emit an MPJ sub-option.
> 
> Just to be sure: can we do that because we always expect a retransmission?
> > @@ -1382,6 +1382,19 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
> >  		/* MPC is additionally mutually exclusive with MP_PRIO */
> >  		goto mp_capable_done;
> >  	} else if (OPTIONS_MPTCP_MPJ & opts->suboptions) {
> > +		if (ssk) {
> > +			subflow = mptcp_subflow_ctx(ssk);
> > +			if (subflow->start_stamp &&
> > +			    unlikely(after(tcp_jiffies32, subflow->start_stamp + HZ/10))) {
> 
> Small detail: checkpatch asks to have space around "/". I can add them
> when applying the patch if only this needs to be modified.
> 
> > +				/* we are not established yet and "a lot" of time passed
> > +				 * e.g. due to syn retranmissions. We can attempt next
> > +				 * subflow creation
> 
> That's strange to tell the PM "yes, the subflow is established now"
> while there is potentially (or clearly in case of error (RST) below) an
> issue :)
> 
> Should we eventually have another function name doing the same and
> resetting the stamp variable?

Yes, in v2 I'll add another pm helper for that.

Side note: do we still need the pm -> pm_netlink additional layering?
That was initially introduced to allow multiple in-kernel PM, but
afaics we are now considering a single in kernel one and e.v. an user-
space one. Perhpas we can remove some intermediate abstraction?

(not in this series!)

> > +				 */
> > +				mptcp_pm_subflow_established(mptcp_sk(subflow->conn));
> > +				subflow->start_stamp = 0;
> > +			}
> > +		}
> > +
> >  		if (OPTION_MPTCP_MPJ_SYN & opts->suboptions) {
> >  			*ptr++ = mptcp_option(MPTCPOPT_MP_JOIN,
> >  					      TCPOLEN_MPTCP_MPJ_SYN,
> > diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> > index 0f90bd61de01..85986f3b58ed 100644
> > --- a/net/mptcp/subflow.c
> > +++ b/net/mptcp/subflow.c
> > @@ -1273,7 +1273,16 @@ void __mptcp_error_report(struct sock *sk)
> >  
> >  static void subflow_error_report(struct sock *ssk)
> >  {
> > -	struct sock *sk = mptcp_subflow_ctx(ssk)->conn;
> > +	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
> > +	struct sock *sk = subflow->conn;
> > +
> > +	/* subflow aborted before reaching the fully_established status
> > +	 * attempt the creation of the next subflow
> > +	 */
> > +	if (unlikely(!subflow->fully_established && subflow->start_stamp)) {
> > +		mptcp_pm_subflow_established(mptcp_sk(sk));
> > +		subflow->start_stamp = 0;
> 
> Detail: do you need to set it to 0 here?

So we call mptcp_pm_subflow_established() at most once due to subflow
timeout or reset. I feared a malicius peer could trigger a lot of
activity with a e.g. a burst of icmp unreachable.

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH mptcp-next 3/3] selftests: mptcp: add tests for subflow creation failure
  2021-11-24  9:56   ` Matthieu Baerts
@ 2021-11-24 10:10     ` Paolo Abeni
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Abeni @ 2021-11-24 10:10 UTC (permalink / raw)
  To: Matthieu Baerts, mptcp; +Cc: Phil Greenland

On Wed, 2021-11-24 at 10:56 +0100, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 23/11/2021 17:37, Paolo Abeni wrote:
> > Verify that, when multiple endpoints are available, subflows
> > creation proceed even when the first additional subflow creation
> > fails - due to packet drop on the relevant link
> 
> Here you don't drop the packet but you reset it. Should we have a test
> for both (REJECT/DROP) (or with an additional subflow where packets are
> dropped)?

I'll try to see if a drop test case will fit our infra. 

> > diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > index 2684ef9c0d42..dd4488746151 100755
> > --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > @@ -1023,6 +1023,16 @@ subflows_tests()
> >  	run_tests $ns1 $ns2 10.0.1.1
> >  	chk_join_nr "multiple subflows, limited by server" 2 2 1
> >  
> > +	# multiple subflows, with subflow creation error
> > +	reset
> > +	ip netns exec $ns1 ./pm_nl_ctl limits 0 2
> > +	ip netns exec $ns2 ./pm_nl_ctl limits 0 2
> > +	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
> > +	ip netns exec $ns2 ./pm_nl_ctl add 10.0.2.2 flags subflow
> > +	ip netns exec $ns1 iptables -A INPUT -s 10.0.3.2 -p tcp -j REJECT
> 
> I guess we should then add "CONFIG_IP_NF_TARGET_REJECT" in the config
> file :)

Will add in v2.

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH mptcp-next 2/3] mptcp: do not block subflows creation on errors
  2021-11-24 10:09     ` Paolo Abeni
@ 2021-11-24 12:23       ` Matthieu Baerts
  2021-11-24 16:31         ` Paolo Abeni
  0 siblings, 1 reply; 12+ messages in thread
From: Matthieu Baerts @ 2021-11-24 12:23 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Phil Greenland, mptcp

On 24/11/2021 11:09, Paolo Abeni wrote:
> On Wed, 2021-11-24 at 10:55 +0100, Matthieu Baerts wrote:
>> Hi Paolo,
>>
>> On 23/11/2021 17:37, Paolo Abeni wrote:
>>> If the MPTCP configuration allows for multiple subflows
>>> creation, and the first additional subflows never reach
>>> the fully established status - e.g. due to packets drop or
>>> reset - the in kernel path manager do not move to the
>>> next subflow.
>>
>> Thank you for looking at that!
>>
>> Why the PM could not try to establish all possible paths "in parallel"?
>> Is it because we have this limit of allowed subflows and we wait to know
>> if a path is established to count it as used subflow?
> 
> IIRC, that design choice is to avoid that an MPTCP endpoint could
> easily become a DDoS bot, generating "a lot" of syn in response to
> action from possibly untrusted peer.

The endpoint will only be able to produce a very few SYN+MPJ depending
on the limit that is in place.

Should we maybe do a distinction between local addresses and remote
ones? I mean when the MPTCP connection is established, the PM could open
all paths using different local addresses and then the remote ones one
at a time?
>>> +				/* we are not established yet and "a lot" of time passed
>>> +				 * e.g. due to syn retranmissions. We can attempt next
>>> +				 * subflow creation
>>
>> That's strange to tell the PM "yes, the subflow is established now"
>> while there is potentially (or clearly in case of error (RST) below) an
>> issue :)
>>
>> Should we eventually have another function name doing the same and
>> resetting the stamp variable?
> 
> Yes, in v2 I'll add another pm helper for that.

Thank you!

> Side note: do we still need the pm -> pm_netlink additional layering?
> That was initially introduced to allow multiple in-kernel PM, but
> afaics we are now considering a single in kernel one and e.v. an user-
> space one. Perhpas we can remove some intermediate abstraction?

I think we should keep it. It still makes sense to have a PM controlled
with eBPF for servers having to support a lot of connections and always
doing the same action. Maybe the in-kernel PM can already cover all use
cases for a server, mainly announcing addresses. Or maybe more might be
needed, e.g. announcing an IP or another depending on the initial one,
etc. and in this case, a controlled with eBPF would be interesting (but
could be done as an extension of the in-kernel PM one?)

> (not in this series!)

Of course!

>>> +				 */
>>> +				mptcp_pm_subflow_established(mptcp_sk(subflow->conn));
>>> +				subflow->start_stamp = 0;
>>> +			}
>>> +		}
>>> +
>>>  		if (OPTION_MPTCP_MPJ_SYN & opts->suboptions) {
>>>  			*ptr++ = mptcp_option(MPTCPOPT_MP_JOIN,
>>>  					      TCPOLEN_MPTCP_MPJ_SYN,
>>> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
>>> index 0f90bd61de01..85986f3b58ed 100644
>>> --- a/net/mptcp/subflow.c
>>> +++ b/net/mptcp/subflow.c
>>> @@ -1273,7 +1273,16 @@ void __mptcp_error_report(struct sock *sk)
>>>  
>>>  static void subflow_error_report(struct sock *ssk)
>>>  {
>>> -	struct sock *sk = mptcp_subflow_ctx(ssk)->conn;
>>> +	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
>>> +	struct sock *sk = subflow->conn;
>>> +
>>> +	/* subflow aborted before reaching the fully_established status
>>> +	 * attempt the creation of the next subflow
>>> +	 */
>>> +	if (unlikely(!subflow->fully_established && subflow->start_stamp)) {
>>> +		mptcp_pm_subflow_established(mptcp_sk(sk));
>>> +		subflow->start_stamp = 0;
>>
>> Detail: do you need to set it to 0 here?
> 
> So we call mptcp_pm_subflow_established() at most once due to subflow
> timeout or reset. I feared a malicius peer could trigger a lot of
> activity with a e.g. a burst of icmp unreachable.

Good point, thank you for your reply. I thought subflow_error_report()
would be called only once per ssk but then yes, safer to add this check.

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH mptcp-next 2/3] mptcp: do not block subflows creation on errors
  2021-11-24 12:23       ` Matthieu Baerts
@ 2021-11-24 16:31         ` Paolo Abeni
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Abeni @ 2021-11-24 16:31 UTC (permalink / raw)
  To: Matthieu Baerts; +Cc: Phil Greenland, mptcp

On Wed, 2021-11-24 at 13:23 +0100, Matthieu Baerts wrote:
> On 24/11/2021 11:09, Paolo Abeni wrote:
> > On Wed, 2021-11-24 at 10:55 +0100, Matthieu Baerts wrote:
> > > Hi Paolo,
> > > 
> > > On 23/11/2021 17:37, Paolo Abeni wrote:
> > > > If the MPTCP configuration allows for multiple subflows
> > > > creation, and the first additional subflows never reach
> > > > the fully established status - e.g. due to packets drop or
> > > > reset - the in kernel path manager do not move to the
> > > > next subflow.
> > > 
> > > Thank you for looking at that!
> > > 
> > > Why the PM could not try to establish all possible paths "in parallel"?
> > > Is it because we have this limit of allowed subflows and we wait to know
> > > if a path is established to count it as used subflow?
> > 
> > IIRC, that design choice is to avoid that an MPTCP endpoint could
> > easily become a DDoS bot, generating "a lot" of syn in response to
> > action from possibly untrusted peer.
> 
> The endpoint will only be able to produce a very few SYN+MPJ depending
> on the limit that is in place.
> 
> Should we maybe do a distinction between local addresses and remote
> ones? I mean when the MPTCP connection is established, the PM could open
> all paths using different local addresses and then the remote ones one
> at a time?
> > > 

I *think* that kind of change will be more invasive - we would have to
revisit the non trivial mptcp_pm_create_subflow_or_signal_addr() and
AFAICS we would still need to cope with subflow creation failure.

All the above IMHO ! 

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-11-24 16:31 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-23 16:37 [PATCH mptcp-next 0/3] mptcp: improve subflow creation on errors Paolo Abeni
2021-11-23 16:37 ` [PATCH mptcp-next 1/3] mptcp: clean-up MPJ option writing Paolo Abeni
2021-11-24  9:52   ` Matthieu Baerts
2021-11-23 16:37 ` [PATCH mptcp-next 2/3] mptcp: do not block subflows creation on errors Paolo Abeni
2021-11-24  9:55   ` Matthieu Baerts
2021-11-24 10:09     ` Paolo Abeni
2021-11-24 12:23       ` Matthieu Baerts
2021-11-24 16:31         ` Paolo Abeni
2021-11-23 16:37 ` [PATCH mptcp-next 3/3] selftests: mptcp: add tests for subflow creation failure Paolo Abeni
2021-11-24  9:56   ` Matthieu Baerts
2021-11-24 10:10     ` Paolo Abeni
2021-11-23 18:48 ` [PATCH mptcp-next 0/3] mptcp: improve subflow creation on errors Paolo Abeni

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.