All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 mptcp-net 0/2] mptcp: fix MPJ 3rd ack retransmissions
@ 2021-11-02 20:48 Paolo Abeni
  2021-11-02 20:48 ` [PATCH v2 mptcp-net 1/2] mptcp: fix delack timer Paolo Abeni
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Paolo Abeni @ 2021-11-02 20:48 UTC (permalink / raw)
  To: mptcp

The first patch is from Eric, thus is correct ;)
The 2nd one is much more doubtful, starting from the author ;)
The individual commit messages should be descriptive enough. With these
2 patches the related, pending pktdrill test:

https://github.com/multipath-tcp/packetdrill/pull/70

now complete successfully

v1 -> v2:
 - fix from delegated action invocation in patch 2/2 (Mat)

Eric Dumazet (1):
  mptcp: fix delack timer

Paolo Abeni (1):
  mptcp: use delegate action to schedule 3rd ack retrans

 net/mptcp/options.c  | 32 ++++++++--------------------
 net/mptcp/protocol.c | 50 ++++++++++++++++++++++++++++++++++++--------
 net/mptcp/protocol.h | 17 ++++++++-------
 3 files changed, 59 insertions(+), 40 deletions(-)

-- 
2.26.3


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2 mptcp-net 1/2] mptcp: fix delack timer
  2021-11-02 20:48 [PATCH v2 mptcp-net 0/2] mptcp: fix MPJ 3rd ack retransmissions Paolo Abeni
@ 2021-11-02 20:48 ` Paolo Abeni
  2021-11-03 22:51   ` Mat Martineau
  2021-11-02 20:48 ` [PATCH v2 mptcp-net 2/2] mptcp: use delegate action to schedule 3rd ack retrans Paolo Abeni
  2021-11-04 14:18 ` [PATCH v2 mptcp-net 0/2] mptcp: fix MPJ 3rd ack retransmissions Matthieu Baerts
  2 siblings, 1 reply; 5+ messages in thread
From: Paolo Abeni @ 2021-11-02 20:48 UTC (permalink / raw)
  To: mptcp

From: Eric Dumazet <edumazet@google.com>

To compute the rtx timeout schedule_3rdack_retransmission() does multiple
things in the wrong way: srtt_us is measures in usec/8 and the timeout
itself is an absolute vale.

Fixes: ec3edaa7ca6ce02f ("mptcp: Add handling of outgoing MP_JOIN requests")
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
commit message typos added by me
---
 net/mptcp/options.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 7c3420afb1a0..2ca077fdf10a 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -434,9 +434,10 @@ static void schedule_3rdack_retransmission(struct sock *sk)
 
 	/* reschedule with a timeout above RTT, as we must look only for drop */
 	if (tp->srtt_us)
-		timeout = tp->srtt_us << 1;
+		timeout = usecs_to_jiffies(tp->srtt_us >> (3-1));
 	else
 		timeout = TCP_TIMEOUT_INIT;
+	timeout += jiffies;
 
 	WARN_ON_ONCE(icsk->icsk_ack.pending & ICSK_ACK_TIMER);
 	icsk->icsk_ack.pending |= ICSK_ACK_SCHED | ICSK_ACK_TIMER;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 mptcp-net 2/2] mptcp: use delegate action to schedule 3rd ack retrans
  2021-11-02 20:48 [PATCH v2 mptcp-net 0/2] mptcp: fix MPJ 3rd ack retransmissions Paolo Abeni
  2021-11-02 20:48 ` [PATCH v2 mptcp-net 1/2] mptcp: fix delack timer Paolo Abeni
@ 2021-11-02 20:48 ` Paolo Abeni
  2021-11-04 14:18 ` [PATCH v2 mptcp-net 0/2] mptcp: fix MPJ 3rd ack retransmissions Matthieu Baerts
  2 siblings, 0 replies; 5+ messages in thread
From: Paolo Abeni @ 2021-11-02 20:48 UTC (permalink / raw)
  To: mptcp

Scheduling a delack in mptcp_established_options_mp() is
not a good idea: such function is called by tcp_send_ack() and
the pending delayed ack will be cleared shortly after by the
tcp_event_ack_sent() call in __tcp_transmit_skb().

Instead use the mptcp delegated action infrastructure to
schedule the delayed ack after the current bh processing completes.

Additionally moves the schedule_3rdack_retransmission() helper
into protocol.c to avoid making it visible in a different compilation
unit.

Fixes: ec3edaa7ca6ce02f ("mptcp: Add handling of outgoing MP_JOIN requests")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v1 -> v2:
 - fix wrong delegated action scheduling (Mat)
---
 net/mptcp/options.c  | 33 ++++++++---------------------
 net/mptcp/protocol.c | 50 ++++++++++++++++++++++++++++++++++++--------
 net/mptcp/protocol.h | 17 ++++++++-------
 3 files changed, 59 insertions(+), 41 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 2ca077fdf10a..fe98e4f475ba 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -422,29 +422,6 @@ bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
 	return false;
 }
 
-/* MP_JOIN client subflow must wait for 4th ack before sending any data:
- * TCP can't schedule delack timer before the subflow is fully established.
- * MPTCP uses the delack timer to do 3rd ack retransmissions
- */
-static void schedule_3rdack_retransmission(struct sock *sk)
-{
-	struct inet_connection_sock *icsk = inet_csk(sk);
-	struct tcp_sock *tp = tcp_sk(sk);
-	unsigned long timeout;
-
-	/* reschedule with a timeout above RTT, as we must look only for drop */
-	if (tp->srtt_us)
-		timeout = usecs_to_jiffies(tp->srtt_us >> (3-1));
-	else
-		timeout = TCP_TIMEOUT_INIT;
-	timeout += jiffies;
-
-	WARN_ON_ONCE(icsk->icsk_ack.pending & ICSK_ACK_TIMER);
-	icsk->icsk_ack.pending |= ICSK_ACK_SCHED | ICSK_ACK_TIMER;
-	icsk->icsk_ack.timeout = timeout;
-	sk_reset_timer(sk, &icsk->icsk_delack_timer, timeout);
-}
-
 static void clear_3rdack_retransmission(struct sock *sk)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
@@ -527,7 +504,15 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 		*size = TCPOLEN_MPTCP_MPJ_ACK;
 		pr_debug("subflow=%p", subflow);
 
-		schedule_3rdack_retransmission(sk);
+		/* we can use the full delegate action helper only from BH context
+		 * If we are in process context - sk is flushing the backlog at
+		 * socket lock release time - just set the appropriate flag, will
+		 * be handled by the release callback
+		 */
+		if (sock_owned_by_user(sk))
+			set_bit(MPTCP_DELEGATE_ACK, &subflow->delegated_status);
+		else
+			mptcp_subflow_delegate(subflow, MPTCP_DELEGATE_ACK);
 		return true;
 	}
 	return false;
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 1671fc9184c2..1ee3530ac2a3 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1618,7 +1618,7 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk)
 			if (!xmit_ssk)
 				goto out;
 			if (xmit_ssk != ssk) {
-				mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk));
+				mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk), MPTCP_DELEGATE_SEND);
 				goto out;
 			}
 
@@ -2971,7 +2971,7 @@ void __mptcp_check_push(struct sock *sk, struct sock *ssk)
 		if (xmit_ssk == ssk)
 			__mptcp_subflow_push_pending(sk, ssk);
 		else if (xmit_ssk)
-			mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk));
+			mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk), MPTCP_DELEGATE_SEND);
 	} else {
 		set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->flags);
 	}
@@ -3021,18 +3021,50 @@ static void mptcp_release_cb(struct sock *sk)
 	__mptcp_update_rmem(sk);
 }
 
+/* MP_JOIN client subflow must wait for 4th ack before sending any data:
+ * TCP can't schedule delack timer before the subflow is fully established.
+ * MPTCP uses the delack timer to do 3rd ack retransmissions
+ */
+static void schedule_3rdack_retransmission(struct sock *ssk)
+{
+	struct inet_connection_sock *icsk = inet_csk(ssk);
+	struct tcp_sock *tp = tcp_sk(ssk);
+	unsigned long timeout;
+
+	if (mptcp_subflow_ctx(ssk)->fully_established)
+		return;
+
+	/* reschedule with a timeout above RTT, as we must look only for drop */
+	if (tp->srtt_us)
+		timeout = usecs_to_jiffies(tp->srtt_us >> (3-1));
+	else
+		timeout = TCP_TIMEOUT_INIT;
+	timeout += jiffies;
+
+	WARN_ON_ONCE(icsk->icsk_ack.pending & ICSK_ACK_TIMER);
+	icsk->icsk_ack.pending |= ICSK_ACK_SCHED | ICSK_ACK_TIMER;
+	icsk->icsk_ack.timeout = timeout;
+	sk_reset_timer(ssk, &icsk->icsk_delack_timer, timeout);
+}
+
 void mptcp_subflow_process_delegated(struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
 	struct sock *sk = subflow->conn;
 
-	mptcp_data_lock(sk);
-	if (!sock_owned_by_user(sk))
-		__mptcp_subflow_push_pending(sk, ssk);
-	else
-		set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->flags);
-	mptcp_data_unlock(sk);
-	mptcp_subflow_delegated_done(subflow);
+	if (test_bit(MPTCP_DELEGATE_SEND, &subflow->delegated_status)) {
+		mptcp_data_lock(sk);
+		if (!sock_owned_by_user(sk))
+			__mptcp_subflow_push_pending(sk, ssk);
+		else
+			set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->flags);
+		mptcp_data_unlock(sk);
+		mptcp_subflow_delegated_done(subflow, MPTCP_DELEGATE_SEND);
+	}
+	if (test_bit(MPTCP_DELEGATE_ACK, &subflow->delegated_status)) {
+		schedule_3rdack_retransmission(ssk);
+		mptcp_subflow_delegated_done(subflow, MPTCP_DELEGATE_ACK);
+	}
 }
 
 static int mptcp_hash(struct sock *sk)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 46691acdea24..f6f4054e19ee 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -387,6 +387,7 @@ struct mptcp_delegated_action {
 DECLARE_PER_CPU(struct mptcp_delegated_action, mptcp_delegated_actions);
 
 #define MPTCP_DELEGATE_SEND		0
+#define MPTCP_DELEGATE_ACK		1
 
 /* MPTCP subflow context */
 struct mptcp_subflow_context {
@@ -493,23 +494,23 @@ static inline void mptcp_add_pending_subflow(struct mptcp_sock *msk,
 
 void mptcp_subflow_process_delegated(struct sock *ssk);
 
-static inline void mptcp_subflow_delegate(struct mptcp_subflow_context *subflow)
+static inline void mptcp_subflow_delegate(struct mptcp_subflow_context *subflow, int action)
 {
 	struct mptcp_delegated_action *delegated;
 	bool schedule;
 
+	/* the caller held the subflow bh socket lock */
+	lockdep_assert_in_softirq();
+
 	/* The implied barrier pairs with mptcp_subflow_delegated_done(), and
 	 * ensures the below list check sees list updates done prior to status
 	 * bit changes
 	 */
-	if (!test_and_set_bit(MPTCP_DELEGATE_SEND, &subflow->delegated_status)) {
+	if (!test_and_set_bit(action, &subflow->delegated_status)) {
 		/* still on delegated list from previous scheduling */
 		if (!list_empty(&subflow->delegated_node))
 			return;
 
-		/* the caller held the subflow bh socket lock */
-		lockdep_assert_in_softirq();
-
 		delegated = this_cpu_ptr(&mptcp_delegated_actions);
 		schedule = list_empty(&delegated->head);
 		list_add_tail(&subflow->delegated_node, &delegated->head);
@@ -534,16 +535,16 @@ mptcp_subflow_delegated_next(struct mptcp_delegated_action *delegated)
 
 static inline bool mptcp_subflow_has_delegated_action(const struct mptcp_subflow_context *subflow)
 {
-	return test_bit(MPTCP_DELEGATE_SEND, &subflow->delegated_status);
+	return !!READ_ONCE(subflow->delegated_status);
 }
 
-static inline void mptcp_subflow_delegated_done(struct mptcp_subflow_context *subflow)
+static inline void mptcp_subflow_delegated_done(struct mptcp_subflow_context *subflow, int action)
 {
 	/* pairs with mptcp_subflow_delegate, ensures delegate_node is updated before
 	 * touching the status bit
 	 */
 	smp_wmb();
-	clear_bit(MPTCP_DELEGATE_SEND, &subflow->delegated_status);
+	clear_bit(action, &subflow->delegated_status);
 }
 
 int mptcp_is_enabled(const struct net *net);
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 mptcp-net 1/2] mptcp: fix delack timer
  2021-11-02 20:48 ` [PATCH v2 mptcp-net 1/2] mptcp: fix delack timer Paolo Abeni
@ 2021-11-03 22:51   ` Mat Martineau
  0 siblings, 0 replies; 5+ messages in thread
From: Mat Martineau @ 2021-11-03 22:51 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp

On Tue, 2 Nov 2021, Paolo Abeni wrote:

> From: Eric Dumazet <edumazet@google.com>
>
> To compute the rtx timeout schedule_3rdack_retransmission() does multiple
> things in the wrong way: srtt_us is measures in usec/8 and the timeout
> itself is an absolute vale.
>
> Fixes: ec3edaa7ca6ce02f ("mptcp: Add handling of outgoing MP_JOIN requests")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
> commit message typos added by me

Thanks for the v2 Paolo (both patches). I think they're good for applying 
to the export branch, maybe Matthieu can s/measures/measured/ and 
s/vale/value/ above to restore Eric's spelling :)

Reviewed-by: Mat Martineau <mathew.j.martineau>@linux.intel.com>


> ---
> net/mptcp/options.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/mptcp/options.c b/net/mptcp/options.c
> index 7c3420afb1a0..2ca077fdf10a 100644
> --- a/net/mptcp/options.c
> +++ b/net/mptcp/options.c
> @@ -434,9 +434,10 @@ static void schedule_3rdack_retransmission(struct sock *sk)
>
> 	/* reschedule with a timeout above RTT, as we must look only for drop */
> 	if (tp->srtt_us)
> -		timeout = tp->srtt_us << 1;
> +		timeout = usecs_to_jiffies(tp->srtt_us >> (3-1));
> 	else
> 		timeout = TCP_TIMEOUT_INIT;
> +	timeout += jiffies;
>
> 	WARN_ON_ONCE(icsk->icsk_ack.pending & ICSK_ACK_TIMER);
> 	icsk->icsk_ack.pending |= ICSK_ACK_SCHED | ICSK_ACK_TIMER;
> -- 
> 2.26.3
>
>
>

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 mptcp-net 0/2] mptcp: fix MPJ 3rd ack retransmissions
  2021-11-02 20:48 [PATCH v2 mptcp-net 0/2] mptcp: fix MPJ 3rd ack retransmissions Paolo Abeni
  2021-11-02 20:48 ` [PATCH v2 mptcp-net 1/2] mptcp: fix delack timer Paolo Abeni
  2021-11-02 20:48 ` [PATCH v2 mptcp-net 2/2] mptcp: use delegate action to schedule 3rd ack retrans Paolo Abeni
@ 2021-11-04 14:18 ` Matthieu Baerts
  2 siblings, 0 replies; 5+ messages in thread
From: Matthieu Baerts @ 2021-11-04 14:18 UTC (permalink / raw)
  To: Paolo Abeni, Mat Martineau; +Cc: mptcp

Hi Paolo, Mat,

On 02/11/2021 21:48, Paolo Abeni wrote:
> The first patch is from Eric, thus is correct ;)
> The 2nd one is much more doubtful, starting from the author ;)

:-P

> The individual commit messages should be descriptive enough. With these
> 2 patches the related, pending pktdrill test:
> 
> https://github.com/multipath-tcp/packetdrill/pull/70

There is one comment left on this PR. Should we merge it anyway?

> now complete successfully
> 
> v1 -> v2:
>  - fix from delegated action invocation in patch 2/2 (Mat)
> 
> Eric Dumazet (1):
>   mptcp: fix delack timer

@Paolo: I added your Acked-by on this one, no objection?

> Paolo Abeni (1):
>   mptcp: use delegate action to schedule 3rd ack retrans

Thank you for the patches and the reviews!

Now in our tree (fixes for -net) with Mat's RvB tag, without 2 typos
spotted by Mat and with 2 minor modifications to satisfy checkpatch
(spaces around '-' and max 100 chars per line):

- 0eeedc170fdf: mptcp: fix delack timer
- 232f203394ce: mptcp: use delegate action to schedule 3rd ack retrans
- Results: 6a2b7fa0be7c..c1d29ad013b4

Builds and tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20211104T141805
https://github.com/multipath-tcp/mptcp_net-next/actions/workflows/build-validation.yml?query=branch:export

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-11-04 14:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-02 20:48 [PATCH v2 mptcp-net 0/2] mptcp: fix MPJ 3rd ack retransmissions Paolo Abeni
2021-11-02 20:48 ` [PATCH v2 mptcp-net 1/2] mptcp: fix delack timer Paolo Abeni
2021-11-03 22:51   ` Mat Martineau
2021-11-02 20:48 ` [PATCH v2 mptcp-net 2/2] mptcp: use delegate action to schedule 3rd ack retrans Paolo Abeni
2021-11-04 14:18 ` [PATCH v2 mptcp-net 0/2] mptcp: fix MPJ 3rd ack retransmissions Matthieu Baerts

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.