* [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock
@ 2023-05-30 13:17 Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 01/15] Squash to "mptcp: add struct mptcp_sched_ops" Geliang Tang
                   ` (14 more replies)
  0 siblings, 15 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC
  To: mptcp; +Cc: Geliang Tang

v3:
 - address Florian's comments in v2.
 - split into three more patches.

v2:
 - fix this error reported by CI:
KASAN: slab-use-after-free in __mptcp_close_ssk (net/mptcp/protocol.c:2461)
 - add bpf burst scheduler.

This patchset adds a sched_data pointer to mptcp_sock to save some
data at the MPTCP and subflow levels.
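
As a rough illustration of the idea (not code from this series), the
following userspace sketch keeps scheduler state in a heap-allocated
object that lives as long as the connection. The names demo_conn,
demo_sched_data, demo_init_sched() and demo_release_sched() are
hypothetical, and calloc()/free() stand in for kzalloc()/kfree():

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_SUBFLOWS_MAX 8

/* Hypothetical userspace stand-in for struct mptcp_sched_data. */
struct demo_sched_data {
	void	*last_snd;
	int	snd_burst;
	bool	reinject;
	void	*contexts[DEMO_SUBFLOWS_MAX];
};

/* Hypothetical stand-in for struct mptcp_sock: it only owns the pointer. */
struct demo_conn {
	struct demo_sched_data *sched_data;
};

/* Allocate zeroed scheduler data once, when the connection is set up. */
static int demo_init_sched(struct demo_conn *c)
{
	c->sched_data = calloc(1, sizeof(*c->sched_data));
	return c->sched_data ? 0 : -ENOMEM;
}

/* Free the data and clear the pointer so later users can test for NULL. */
static void demo_release_sched(struct demo_conn *c)
{
	free(c->sched_data);
	c->sched_data = NULL;
}
```

Because the data outlives a single scheduling call, a value such as
snd_burst written in one call is still there for the next one.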

With these changes, the old patch "mptcp: register default scheduler" in
[1] now works.

[1] https://patchwork.kernel.org/project/mptcp/cover/cover.1665753926.git.geliang.tang@suse.com/
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/342

Geliang Tang (15):
  Squash to "mptcp: add struct mptcp_sched_ops"
  Squash to "mptcp: add sched in mptcp_sock"
  Squash to "mptcp: add scheduler wrappers"
  mptcp: add last_snd in sched_data
  mptcp: add snd_burst in sched_data
  mptcp: register default scheduler
  mptcp: rename __mptcp_set_timeout for bpf_burst
  mptcp: add two wrappers for bpf_burst
  mptcp: add three helpers for bpf_burst
  Squash to "bpf: Add bpf_mptcp_sched_ops"
  Squash to "bpf: Add bpf_mptcp_sched_kfunc_set"
  Squash to "selftests/bpf: Add mptcp sched structs"
  Squash to "selftests/bpf: Add bpf_rr scheduler"
  selftests/bpf: Add bpf_burst scheduler
  selftests/bpf: Add bpf_burst test

 include/net/mptcp.h                           |   4 +-
 net/mptcp/bpf.c                               |  42 +++-
 net/mptcp/protocol.c                          |  73 +++++--
 net/mptcp/protocol.h                          |  12 +-
 net/mptcp/sched.c                             |  67 ++++--
 tools/testing/selftests/bpf/bpf_tcp_helpers.h |   7 +-
 .../testing/selftests/bpf/prog_tests/mptcp.c  |  38 ++++
 .../selftests/bpf/progs/mptcp_bpf_burst.c     | 195 ++++++++++++++++++
 .../selftests/bpf/progs/mptcp_bpf_rr.c        |   4 +-
 9 files changed, 392 insertions(+), 50 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/mptcp_bpf_burst.c

-- 
2.35.3



* [PATCH mptcp-next v3 01/15] Squash to "mptcp: add struct mptcp_sched_ops"
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 02/15] Squash to "mptcp: add sched in mptcp_sock" Geliang Tang
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC
  To: mptcp; +Cc: Geliang Tang

Use two tabs between the type and the name of the reinject member in
struct mptcp_sched_data.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 include/net/mptcp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 828b10ddabee..4bba29c99172 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -100,7 +100,7 @@ struct mptcp_out_options {
 #define MPTCP_SUBFLOWS_MAX	8
 
 struct mptcp_sched_data {
-	bool	reinject;
+	bool		reinject;
 	struct mptcp_subflow_context *contexts[MPTCP_SUBFLOWS_MAX];
 };
 
-- 
2.35.3



* [PATCH mptcp-next v3 02/15] Squash to "mptcp: add sched in mptcp_sock"
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 01/15] Squash to "mptcp: add struct mptcp_sched_ops" Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 03/15] Squash to "mptcp: add scheduler wrappers" Geliang Tang
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC
  To: mptcp; +Cc: Geliang Tang

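Add a sched_data pointer to mptcp_sock too. It is allocated in
mptcp_init_sched(), which takes a new gfp parameter, and freed in
mptcp_release_sched().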

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c |  5 +++--
 net/mptcp/protocol.h |  4 +++-
 net/mptcp/sched.c    | 13 ++++++++++++-
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 763f709fd5f5..a2560813bfb5 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2813,7 +2813,8 @@ static int mptcp_init_sock(struct sock *sk)
 		return -ENOMEM;
 
 	ret = mptcp_init_sched(mptcp_sk(sk),
-			       mptcp_sched_find(mptcp_get_scheduler(net)));
+			       mptcp_sched_find(mptcp_get_scheduler(net)),
+			       GFP_KERNEL);
 	if (ret)
 		return ret;
 
@@ -3204,7 +3205,7 @@ struct sock *mptcp_sk_clone_init(const struct sock *sk,
 	msk->snd_una = msk->write_seq;
 	msk->wnd_end = msk->snd_nxt + req->rsk_rcv_wnd;
 	msk->setsockopt_seq = mptcp_sk(sk)->setsockopt_seq;
-	mptcp_init_sched(msk, mptcp_sk(sk)->sched);
+	mptcp_init_sched(msk, mptcp_sk(sk)->sched, GFP_ATOMIC);
 
 	sock_reset_flag(nsk, SOCK_RCU_FREE);
 	security_inet_csk_clone(nsk, req);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index bd3771c7d79d..88469590eb32 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -311,6 +311,7 @@ struct mptcp_sock {
 				   */
 	struct sock	*first;
 	struct mptcp_pm_data	pm;
+	struct mptcp_sched_data *sched_data;
 	struct mptcp_sched_ops	*sched;
 	struct {
 		u32	space;	/* bytes copied in last measurement window */
@@ -653,7 +654,8 @@ struct mptcp_sched_ops *mptcp_sched_find(const char *name);
 int mptcp_register_scheduler(struct mptcp_sched_ops *sched);
 void mptcp_unregister_scheduler(struct mptcp_sched_ops *sched);
 int mptcp_init_sched(struct mptcp_sock *msk,
-		     struct mptcp_sched_ops *sched);
+		     struct mptcp_sched_ops *sched,
+		     gfp_t gfp);
 void mptcp_release_sched(struct mptcp_sock *msk);
 void mptcp_subflow_set_scheduled(struct mptcp_subflow_context *subflow,
 				 bool scheduled);
diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index c7c167e48d72..a053a9504dfd 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -56,7 +56,8 @@ void mptcp_unregister_scheduler(struct mptcp_sched_ops *sched)
 }
 
 int mptcp_init_sched(struct mptcp_sock *msk,
-		     struct mptcp_sched_ops *sched)
+		     struct mptcp_sched_ops *sched,
+		     gfp_t gfp)
 {
 	if (!sched)
 		goto out;
@@ -64,6 +65,12 @@ int mptcp_init_sched(struct mptcp_sock *msk,
 	if (!bpf_try_module_get(sched, sched->owner))
 		return -EBUSY;
 
+	msk->sched_data = kzalloc(sizeof(struct mptcp_sched_data), gfp);
+	if (!msk->sched_data) {
+		bpf_module_put(sched, sched->owner);
+		return -ENOMEM;
+	}
+
 	msk->sched = sched;
 	if (msk->sched->init)
 		msk->sched->init(msk);
@@ -81,6 +88,10 @@ void mptcp_release_sched(struct mptcp_sock *msk)
 	if (!sched)
 		return;
 
+	if (msk->sched_data) {
+		kfree(msk->sched_data);
+		msk->sched_data = NULL;
+	}
 	msk->sched = NULL;
 	if (sched->release)
 		sched->release(msk);
-- 
2.35.3



* [PATCH mptcp-next v3 03/15] Squash to "mptcp: add scheduler wrappers"
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 01/15] Squash to "mptcp: add struct mptcp_sched_ops" Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 02/15] Squash to "mptcp: add sched in mptcp_sock" Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 04/15] mptcp: add last_snd in sched_data Geliang Tang
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC
  To: mptcp; +Cc: Geliang Tang

Use msk->sched_data instead of the local variable data.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/sched.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index a053a9504dfd..5438a86e897a 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -127,7 +127,6 @@ void mptcp_sched_data_set_contexts(const struct mptcp_sock *msk,
 int mptcp_sched_get_send(struct mptcp_sock *msk)
 {
 	struct mptcp_subflow_context *subflow;
-	struct mptcp_sched_data data;
 
 	msk_owned_by_me(msk);
 
@@ -157,15 +156,14 @@ int mptcp_sched_get_send(struct mptcp_sock *msk)
 		return 0;
 	}
 
-	data.reinject = false;
-	msk->sched->data_init(msk, &data);
-	return msk->sched->get_subflow(msk, &data);
+	msk->sched_data->reinject = false;
+	msk->sched->data_init(msk, msk->sched_data);
+	return msk->sched->get_subflow(msk, msk->sched_data);
 }
 
 int mptcp_sched_get_retrans(struct mptcp_sock *msk)
 {
 	struct mptcp_subflow_context *subflow;
-	struct mptcp_sched_data data;
 
 	msk_owned_by_me(msk);
 
@@ -188,7 +186,7 @@ int mptcp_sched_get_retrans(struct mptcp_sock *msk)
 		return 0;
 	}
 
-	data.reinject = true;
-	msk->sched->data_init(msk, &data);
-	return msk->sched->get_subflow(msk, &data);
+	msk->sched_data->reinject = true;
+	msk->sched->data_init(msk, msk->sched_data);
+	return msk->sched->get_subflow(msk, msk->sched_data);
 }
-- 
2.35.3



* [PATCH mptcp-next v3 04/15] mptcp: add last_snd in sched_data
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (2 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 03/15] Squash to "mptcp: add scheduler wrappers" Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 05/15] mptcp: add snd_burst " Geliang Tang
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC
  To: mptcp; +Cc: Geliang Tang

This patch moves the member last_snd from struct mptcp_sock to
struct mptcp_sched_data to make it accessible to bpf schedulers.

With this change, msk->last_snd should be replaced by
msk->sched_data->last_snd.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 include/net/mptcp.h  |  1 +
 net/mptcp/protocol.c | 14 +++++++-------
 net/mptcp/protocol.h |  1 -
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 4bba29c99172..d52aeb8b4485 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -100,6 +100,7 @@ struct mptcp_out_options {
 #define MPTCP_SUBFLOWS_MAX	8
 
 struct mptcp_sched_data {
+	struct sock	*last_snd;
 	bool		reinject;
 	struct mptcp_subflow_context *contexts[MPTCP_SUBFLOWS_MAX];
 };
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index a2560813bfb5..05a98ae09226 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1619,7 +1619,7 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
 					continue;
 				}
 				do_check_data_fin = true;
-				msk->last_snd = ssk;
+				msk->sched_data->last_snd = ssk;
 			}
 		}
 	}
@@ -1660,7 +1660,7 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk, bool
 			if (ret <= 0)
 				break;
 			copied += ret;
-			msk->last_snd = ssk;
+			msk->sched_data->last_snd = ssk;
 			continue;
 		}
 
@@ -1673,7 +1673,7 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk, bool
 			if (ret <= 0)
 				keep_pushing = false;
 			copied += ret;
-			msk->last_snd = ssk;
+			msk->sched_data->last_snd = ssk;
 		}
 
 		mptcp_for_each_subflow(msk, subflow) {
@@ -2457,8 +2457,8 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 		WRITE_ONCE(msk->first, NULL);
 
 out:
-	if (ssk == msk->last_snd)
-		msk->last_snd = NULL;
+	if (msk->sched_data && ssk == msk->sched_data->last_snd)
+		msk->sched_data->last_snd = NULL;
 
 	if (need_push)
 		__mptcp_push_pending(sk, 0);
@@ -2640,7 +2640,7 @@ static void __mptcp_retrans(struct sock *sk)
 
 			release_sock(ssk);
 
-			msk->last_snd = ssk;
+			msk->sched_data->last_snd = ssk;
 		}
 	}
 	dfrag->already_sent = max(dfrag->already_sent, len);
@@ -3143,7 +3143,7 @@ static int mptcp_disconnect(struct sock *sk, int flags)
 	 * subflow
 	 */
 	mptcp_destroy_common(msk, MPTCP_CF_FASTCLOSE);
-	msk->last_snd = NULL;
+	msk->sched_data->last_snd = NULL;
 	WRITE_ONCE(msk->flags, 0);
 	msk->cb_flags = 0;
 	msk->push_pending = 0;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 88469590eb32..ce3be8eb68d6 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -266,7 +266,6 @@ struct mptcp_sock {
 	atomic64_t	rcv_wnd_sent;
 	u64		rcv_data_fin_seq;
 	int		rmem_fwd_alloc;
-	struct sock	*last_snd;
 	int		snd_burst;
 	int		old_wspace;
 	u64		recovery_snd_nxt;	/* in recovery mode accept up to this seq;
-- 
2.35.3



* [PATCH mptcp-next v3 05/15] mptcp: add snd_burst in sched_data
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (3 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 04/15] mptcp: add last_snd in sched_data Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 06/15] mptcp: register default scheduler Geliang Tang
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC
  To: mptcp; +Cc: Geliang Tang

This patch moves the member snd_burst from struct mptcp_sock to struct
mptcp_sched_data to make it accessible to bpf schedulers.

To make mptcp_subflow_get_send() fit the MPTCP scheduler API, its msk
parameter must be const. A new struct mptcp_sched_data parameter is
also needed.

With this change, msk->snd_burst should be replaced by
msk->sched_data->snd_burst.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 include/net/mptcp.h  |  1 +
 net/mptcp/protocol.c | 11 ++++++-----
 net/mptcp/protocol.h |  4 ++--
 net/mptcp/sched.c    |  2 +-
 4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index d52aeb8b4485..fb4e6a59afc8 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -101,6 +101,7 @@ struct mptcp_out_options {
 
 struct mptcp_sched_data {
 	struct sock	*last_snd;
+	int		snd_burst;
 	bool		reinject;
 	struct mptcp_subflow_context *contexts[MPTCP_SUBFLOWS_MAX];
 };
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 05a98ae09226..a178cd841d82 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1416,7 +1416,8 @@ bool mptcp_subflow_active(struct mptcp_subflow_context *subflow)
  * returns the subflow that will transmit the next DSS
  * additionally updates the rtx timeout
  */
-struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
+struct sock *mptcp_subflow_get_send(const struct mptcp_sock *msk,
+				    struct mptcp_sched_data *data)
 {
 	struct subflow_send_info send_info[SSK_MODE_MAX];
 	struct mptcp_subflow_context *subflow;
@@ -1486,7 +1487,7 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
 	subflow->avg_pacing_rate = div_u64((u64)subflow->avg_pacing_rate * wmem +
 					   READ_ONCE(ssk->sk_pacing_rate) * burst,
 					   burst + wmem);
-	msk->snd_burst = burst;
+	data->snd_burst = burst;
 	return ssk;
 }
 
@@ -1504,7 +1505,7 @@ static void mptcp_update_post_push(struct mptcp_sock *msk,
 
 	dfrag->already_sent += sent;
 
-	msk->snd_burst -= sent;
+	msk->sched_data->snd_burst -= sent;
 
 	snd_nxt_new += dfrag->already_sent;
 
@@ -1555,7 +1556,7 @@ static int __subflow_push_pending(struct sock *sk, struct sock *ssk,
 		}
 		WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
 
-		if (msk->snd_burst <= 0 ||
+		if (msk->sched_data->snd_burst <= 0 ||
 		    !sk_stream_memory_free(ssk) ||
 		    !mptcp_subflow_active(mptcp_subflow_ctx(ssk))) {
 			err = copied;
@@ -2349,7 +2350,7 @@ bool __mptcp_retransmit_pending_data(struct sock *sk)
 	mptcp_data_unlock(sk);
 
 	msk->first_pending = rtx_head;
-	msk->snd_burst = 0;
+	msk->sched_data->snd_burst = 0;
 
 	/* be sure to clear the "sent status" on all re-injected fragments */
 	list_for_each_entry(cur, &msk->rtx_queue, list) {
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index ce3be8eb68d6..37f5cb898b45 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -266,7 +266,6 @@ struct mptcp_sock {
 	atomic64_t	rcv_wnd_sent;
 	u64		rcv_data_fin_seq;
 	int		rmem_fwd_alloc;
-	int		snd_burst;
 	int		old_wspace;
 	u64		recovery_snd_nxt;	/* in recovery mode accept up to this seq;
 						 * recovery related fields are under data_lock
@@ -660,7 +659,8 @@ void mptcp_subflow_set_scheduled(struct mptcp_subflow_context *subflow,
 				 bool scheduled);
 void mptcp_sched_data_set_contexts(const struct mptcp_sock *msk,
 				   struct mptcp_sched_data *data);
-struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk);
+struct sock *mptcp_subflow_get_send(const struct mptcp_sock *msk,
+				    struct mptcp_sched_data *data);
 struct sock *mptcp_subflow_get_retrans(struct mptcp_sock *msk);
 int mptcp_sched_get_send(struct mptcp_sock *msk);
 int mptcp_sched_get_retrans(struct mptcp_sock *msk);
diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index 5438a86e897a..a6210ec4ba5f 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -149,7 +149,7 @@ int mptcp_sched_get_send(struct mptcp_sock *msk)
 	if (!msk->sched) {
 		struct sock *ssk;
 
-		ssk = mptcp_subflow_get_send(msk);
+		ssk = mptcp_subflow_get_send(msk, msk->sched_data);
 		if (!ssk)
 			return -EINVAL;
 		mptcp_subflow_set_scheduled(mptcp_subflow_ctx(ssk), true);
-- 
2.35.3



* [PATCH mptcp-next v3 06/15] mptcp: register default scheduler
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (4 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 05/15] mptcp: add snd_burst " Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 07/15] mptcp: rename __mptcp_set_timeout for bpf_burst Geliang Tang
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC
  To: mptcp; +Cc: Geliang Tang

This patch defines the default packet scheduler mptcp_sched_default.
Register it in mptcp_sched_init(), which is invoked in mptcp_proto_init().
Skip deleting this default scheduler in mptcp_unregister_scheduler().

Set msk->sched to the default scheduler when the input parameter of
mptcp_init_sched() is NULL.
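
The fallback behaviour described above can be sketched in userspace C
roughly as follows. This is only an illustration under simplifying
assumptions: demo_sched_ops, demo_registry and the demo_*() functions
are hypothetical names, the registry is a fixed array instead of an
RCU-protected list, and there is no locking or module refcounting:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_SCHEDS 4

/* Hypothetical, simplified stand-in for struct mptcp_sched_ops. */
struct demo_sched_ops {
	const char *name;
};

struct demo_conn {
	struct demo_sched_ops *sched;
};

static struct demo_sched_ops demo_sched_default = { .name = "default" };
static struct demo_sched_ops *demo_registry[DEMO_MAX_SCHEDS];

/* Store ops in the first free slot; fails when the registry is full. */
static bool demo_register(struct demo_sched_ops *ops)
{
	for (size_t i = 0; i < DEMO_MAX_SCHEDS; i++) {
		if (!demo_registry[i]) {
			demo_registry[i] = ops;
			return true;
		}
	}
	return false;
}

/* Returns true if ops was removed; the default is never removed. */
static bool demo_unregister(struct demo_sched_ops *ops)
{
	if (ops == &demo_sched_default)
		return false;

	for (size_t i = 0; i < DEMO_MAX_SCHEDS; i++) {
		if (demo_registry[i] == ops) {
			demo_registry[i] = NULL;
			return true;
		}
	}
	return false;
}

static struct demo_sched_ops *demo_find(const char *name)
{
	for (size_t i = 0; i < DEMO_MAX_SCHEDS; i++) {
		if (demo_registry[i] && !strcmp(demo_registry[i]->name, name))
			return demo_registry[i];
	}
	return NULL;
}

/* A NULL scheduler falls back to the default one. */
static void demo_init_sched(struct demo_conn *c, struct demo_sched_ops *ops)
{
	c->sched = ops ? ops : &demo_sched_default;
}
```

With this shape, a connection always ends up with a non-NULL scheduler,
so callers no longer need a separate "no scheduler" code path.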

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c |  3 ++-
 net/mptcp/protocol.h |  3 ++-
 net/mptcp/sched.c    | 38 ++++++++++++++++++++++++++++++++++++--
 3 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index a178cd841d82..0d85add4c8dc 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2278,7 +2278,7 @@ static void mptcp_timeout_timer(struct timer_list *t)
  *
  * A backup subflow is returned only if that is the only kind available.
  */
-struct sock *mptcp_subflow_get_retrans(struct mptcp_sock *msk)
+struct sock *mptcp_subflow_get_retrans(const struct mptcp_sock *msk)
 {
 	struct sock *backup = NULL, *pick = NULL;
 	struct mptcp_subflow_context *subflow;
@@ -4006,6 +4006,7 @@ void __init mptcp_proto_init(void)
 
 	mptcp_subflow_init();
 	mptcp_pm_init();
+	mptcp_sched_init();
 	mptcp_token_init();
 
 	if (proto_register(&mptcp_prot, 1) != 0)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 37f5cb898b45..41468a3a0d0e 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -651,6 +651,7 @@ void mptcp_info2sockaddr(const struct mptcp_addr_info *info,
 struct mptcp_sched_ops *mptcp_sched_find(const char *name);
 int mptcp_register_scheduler(struct mptcp_sched_ops *sched);
 void mptcp_unregister_scheduler(struct mptcp_sched_ops *sched);
+void mptcp_sched_init(void);
 int mptcp_init_sched(struct mptcp_sock *msk,
 		     struct mptcp_sched_ops *sched,
 		     gfp_t gfp);
@@ -661,7 +662,7 @@ void mptcp_sched_data_set_contexts(const struct mptcp_sock *msk,
 				   struct mptcp_sched_data *data);
 struct sock *mptcp_subflow_get_send(const struct mptcp_sock *msk,
 				    struct mptcp_sched_data *data);
-struct sock *mptcp_subflow_get_retrans(struct mptcp_sock *msk);
+struct sock *mptcp_subflow_get_retrans(const struct mptcp_sock *msk);
 int mptcp_sched_get_send(struct mptcp_sock *msk);
 int mptcp_sched_get_retrans(struct mptcp_sock *msk);
 
diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index a6210ec4ba5f..fdf2459a0fcc 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -16,6 +16,33 @@
 static DEFINE_SPINLOCK(mptcp_sched_list_lock);
 static LIST_HEAD(mptcp_sched_list);
 
+static void mptcp_sched_default_data_init(const struct mptcp_sock *msk,
+					  struct mptcp_sched_data *data)
+{
+	data->snd_burst = 0;
+}
+
+static int mptcp_sched_default_get_subflow(const struct mptcp_sock *msk,
+					   struct mptcp_sched_data *data)
+{
+	struct sock *ssk;
+
+	ssk = data->reinject ? mptcp_subflow_get_retrans(msk) :
+			       mptcp_subflow_get_send(msk, data);
+	if (!ssk)
+		return -EINVAL;
+
+	mptcp_subflow_set_scheduled(mptcp_subflow_ctx(ssk), true);
+	return 0;
+}
+
+static struct mptcp_sched_ops mptcp_sched_default = {
+	.data_init	= mptcp_sched_default_data_init,
+	.get_subflow	= mptcp_sched_default_get_subflow,
+	.name		= "default",
+	.owner		= THIS_MODULE,
+};
+
 /* Must be called with rcu read lock held */
 struct mptcp_sched_ops *mptcp_sched_find(const char *name)
 {
@@ -50,17 +77,25 @@ int mptcp_register_scheduler(struct mptcp_sched_ops *sched)
 
 void mptcp_unregister_scheduler(struct mptcp_sched_ops *sched)
 {
+	if (sched == &mptcp_sched_default)
+		return;
+
 	spin_lock(&mptcp_sched_list_lock);
 	list_del_rcu(&sched->list);
 	spin_unlock(&mptcp_sched_list_lock);
 }
 
+void mptcp_sched_init(void)
+{
+	mptcp_register_scheduler(&mptcp_sched_default);
+}
+
 int mptcp_init_sched(struct mptcp_sock *msk,
 		     struct mptcp_sched_ops *sched,
 		     gfp_t gfp)
 {
 	if (!sched)
-		goto out;
+		sched = &mptcp_sched_default;
 
 	if (!bpf_try_module_get(sched, sched->owner))
 		return -EBUSY;
@@ -77,7 +112,6 @@ int mptcp_init_sched(struct mptcp_sock *msk,
 
 	pr_debug("sched=%s", msk->sched->name);
 
-out:
 	return 0;
 }
 
-- 
2.35.3



* [PATCH mptcp-next v3 07/15] mptcp: rename __mptcp_set_timeout for bpf_burst
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (5 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 06/15] mptcp: register default scheduler Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 08/15] mptcp: add two wrappers " Geliang Tang
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC
  To: mptcp; +Cc: Geliang Tang

__mptcp_set_timeout() needs to be exported to the BPF context for the
bpf_burst scheduler, but the "__" prefix cannot be used there. So this
patch renames it to mptcp_set_timer() and makes it non-static.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 0d85add4c8dc..02e22bb49eb0 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -483,7 +483,7 @@ static void mptcp_set_datafin_timeout(struct sock *sk)
 	mptcp_sk(sk)->timer_ival = TCP_RTO_MIN << retransmits;
 }
 
-static void __mptcp_set_timeout(struct sock *sk, long tout)
+void mptcp_set_timer(struct sock *sk, long tout)
 {
 	mptcp_sk(sk)->timer_ival = tout > 0 ? tout : TCP_RTO_MIN;
 }
@@ -503,7 +503,7 @@ static void mptcp_set_timeout(struct sock *sk)
 
 	mptcp_for_each_subflow(mptcp_sk(sk), subflow)
 		tout = max(tout, mptcp_timeout_from_subflow(subflow));
-	__mptcp_set_timeout(sk, tout);
+	mptcp_set_timer(sk, tout);
 }
 
 static inline bool tcp_can_send_ack(const struct sock *ssk)
@@ -1457,7 +1457,7 @@ struct sock *mptcp_subflow_get_send(const struct mptcp_sock *msk,
 			send_info[subflow->backup].linger_time = linger_time;
 		}
 	}
-	__mptcp_set_timeout(sk, tout);
+	mptcp_set_timer(sk, tout);
 
 	/* pick the best backup if no other subflow is active */
 	if (!nr_active)
-- 
2.35.3



* [PATCH mptcp-next v3 08/15] mptcp: add two wrappers for bpf_burst
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (6 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 07/15] mptcp: rename __mptcp_set_timeout for bpf_burst Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 09/15] mptcp: add three helpers " Geliang Tang
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC
  To: mptcp; +Cc: Geliang Tang

sk_stream_memory_free() and tcp_rtx_and_write_queues_empty() need to be
exported to the BPF context for the bpf_burst scheduler, but both are
inline functions. So this patch adds two wrappers for them, so that the
wrappers can be exported to the BPF context instead.
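
The wrapper pattern can be sketched in plain C as below. This is a
simplified illustration, not the kernel code: struct demo_sock and both
function names are hypothetical, and the memory check is reduced to a
single comparison. The assumption is that an inline helper may have no
standalone symbol to refer to by name, while the out-of-line wrapper is
an ordinary function:

```c
#include <stdbool.h>

/* Hypothetical, much-simplified socket with only the fields used here. */
struct demo_sock {
	int wmem_queued;	/* bytes queued for sending */
	int sndbuf;		/* send buffer limit */
};

/* Inline helper: usually expanded at each call site, so there may be no
 * standalone symbol that other code could look up by name.
 */
static inline bool demo_memory_free_inline(const struct demo_sock *sk)
{
	return sk->wmem_queued < sk->sndbuf;
}

/* Out-of-line wrapper: an ordinary function with external linkage. */
bool demo_memory_free(const struct demo_sock *sk)
{
	return demo_memory_free_inline(sk);
}
```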

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 02e22bb49eb0..c8fd0915fa0b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1408,6 +1408,13 @@ bool mptcp_subflow_active(struct mptcp_subflow_context *subflow)
 	return __mptcp_subflow_active(subflow);
 }
 
+bool mptcp_stream_memory_free(struct mptcp_subflow_context *subflow)
+{
+	const struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+	return sk_stream_memory_free(ssk);
+}
+
 #define SSK_MODE_ACTIVE	0
 #define SSK_MODE_BACKUP	1
 #define SSK_MODE_MAX	2
@@ -2273,6 +2280,11 @@ static void mptcp_timeout_timer(struct timer_list *t)
 	sock_put(sk);
 }
 
+bool mptcp_rtx_and_write_queues_empty(const struct sock *sk)
+{
+	return tcp_rtx_and_write_queues_empty(sk);
+}
+
 /* Find an idle subflow.  Return NULL if there is unacked data at tcp
  * level.
  *
-- 
2.35.3



* [PATCH mptcp-next v3 09/15] mptcp: add three helpers for bpf_burst
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (7 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 08/15] mptcp: add two wrappers " Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 10/15] Squash to "bpf: Add bpf_mptcp_sched_ops" Geliang Tang
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC
  To: mptcp; +Cc: Geliang Tang

This patch adds three helpers that need to be exported to the BPF
context for the bpf_burst scheduler: mptcp_get_linger_time(),
mptcp_get_burst() and mptcp_get_pacing_rate().
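
The arithmetic behind the three helpers can be tried out in userspace
with the sketch below. The demo_*() names and the explicit parameters
are hypothetical: the real helpers read these values from the socket
and subflow, and the burst cap is passed in here instead of using
MPTCP_SEND_BURST_SIZE. The sketch assumes a non-zero pace, a non-zero
burst + wmem, and wnd_end >= snd_nxt:

```c
#include <stdint.h>

/* Time to drain the queued bytes at the given pace, scaled by 2^32 so
 * that precision survives the integer division.
 */
static uint64_t demo_linger_time(uint32_t wmem_queued, uint32_t pace)
{
	return ((uint64_t)wmem_queued << 32) / pace;
}

/* Bytes that may be sent now: the room left in the send window, capped
 * at the maximum burst size.
 */
static uint32_t demo_burst(uint32_t cap, uint64_t wnd_end, uint64_t snd_nxt)
{
	uint64_t room = wnd_end - snd_nxt;

	return room < cap ? (uint32_t)room : cap;
}

/* Average of the old and current pacing rates, weighted by the bytes
 * already queued (wmem) and the bytes about to be sent (burst).
 */
static uint64_t demo_pacing_rate(uint64_t avg_rate, uint64_t cur_rate,
				 uint32_t wmem, uint32_t burst)
{
	return (avg_rate * wmem + cur_rate * burst) /
	       ((uint64_t)burst + wmem);
}
```

For example, with an old average of 100, a current rate of 200, 3 bytes
queued and a burst of 1, the new average is (100 * 3 + 200 * 1) / 4 = 125.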

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index c8fd0915fa0b..8dbc9b9c3eb3 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1415,6 +1415,26 @@ bool mptcp_stream_memory_free(struct mptcp_subflow_context *subflow)
 	return sk_stream_memory_free(ssk);
 }
 
+u64 mptcp_get_linger_time(struct sock *ssk, u32 pace)
+{
+	return div_u64((u64)READ_ONCE(ssk->sk_wmem_queued) << 32, pace);
+}
+
+u32 mptcp_get_burst(const struct mptcp_sock *msk)
+{
+	return min_t(int, MPTCP_SEND_BURST_SIZE, mptcp_wnd_end(msk) - msk->snd_nxt);
+}
+
+unsigned long mptcp_get_pacing_rate(struct mptcp_subflow_context *subflow, u32 burst)
+{
+	struct sock *ssk =  mptcp_subflow_tcp_sock(subflow);
+	u32 wmem = READ_ONCE(ssk->sk_wmem_queued);
+
+	return div_u64((u64)subflow->avg_pacing_rate * wmem +
+		       READ_ONCE(ssk->sk_pacing_rate) * burst,
+		       burst + wmem);
+}
+
 #define SSK_MODE_ACTIVE	0
 #define SSK_MODE_BACKUP	1
 #define SSK_MODE_MAX	2
-- 
2.35.3



* [PATCH mptcp-next v3 10/15] Squash to "bpf: Add bpf_mptcp_sched_ops"
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (8 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 09/15] mptcp: add three helpers " Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 11/15] Squash to "bpf: Add bpf_mptcp_sched_kfunc_set" Geliang Tang
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC
  To: mptcp; +Cc: Geliang Tang

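Allow BPF write access to more struct members: avg_pacing_rate in
struct mptcp_subflow_context and snd_burst in struct mptcp_sched_data.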
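
The offset-based write check used for this kind of access control can
be sketched in userspace as follows. Everything here is hypothetical
and simplified to a single struct: demo_ctx, DEMO_OFFSETOFEND() (a
local stand-in for the kernel's offsetofend()) and demo_check_write()
are illustration names only:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical struct; only 'scheduled' and 'avg_pacing_rate' may be
 * written, 'token' is read-only.
 */
struct demo_ctx {
	unsigned int	token;
	bool		scheduled;
	unsigned long	avg_pacing_rate;
};

/* Offset of the first byte after the given member. */
#define DEMO_OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Allow a write of 'size' bytes at offset 'off' only if it starts at a
 * writable member and does not run past the end of that member.
 */
static int demo_check_write(size_t off, size_t size)
{
	size_t end;

	switch (off) {
	case offsetof(struct demo_ctx, scheduled):
		end = DEMO_OFFSETOFEND(struct demo_ctx, scheduled);
		break;
	case offsetof(struct demo_ctx, avg_pacing_rate):
		end = DEMO_OFFSETOFEND(struct demo_ctx, avg_pacing_rate);
		break;
	default:
		return -EACCES;
	}

	if (off + size > end)
		return -EACCES;

	return 0;
}
```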

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/bpf.c | 33 ++++++++++++++++++++++++---------
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index dd1208670c54..f2ce9acc2628 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -18,8 +18,9 @@
 #ifdef CONFIG_BPF_JIT
 extern struct bpf_struct_ops bpf_mptcp_sched_ops;
 extern struct btf *btf_vmlinux;
-static const struct btf_type *mptcp_sched_type __read_mostly;
-static u32 mptcp_sched_id;
+static const struct btf_type *mptcp_context_type __read_mostly;
+static const struct btf_type *mptcp_data_type __read_mostly;
+static u32 mptcp_context_id, mptcp_data_id;
 
 static u32 optional_sched_ops[] = {
 	offsetof(struct mptcp_sched_ops, init),
@@ -41,8 +42,8 @@ static int bpf_mptcp_sched_btf_struct_access(struct bpf_verifier_log *log,
 	size_t end;
 
 	t = btf_type_by_id(reg->btf, reg->btf_id);
-	if (t != mptcp_sched_type) {
-		bpf_log(log, "only access to mptcp_subflow_context is supported\n");
+	if (t != mptcp_context_type && t != mptcp_data_type) {
+		bpf_log(log, "only access to subflow_context or sched_data is supported\n");
 		return -EACCES;
 	}
 
@@ -50,14 +51,21 @@ static int bpf_mptcp_sched_btf_struct_access(struct bpf_verifier_log *log,
 	case offsetof(struct mptcp_subflow_context, scheduled):
 		end = offsetofend(struct mptcp_subflow_context, scheduled);
 		break;
+	case offsetof(struct mptcp_subflow_context, avg_pacing_rate):
+		end = offsetofend(struct mptcp_subflow_context, avg_pacing_rate);
+		break;
+	case offsetof(struct mptcp_sched_data, snd_burst):
+		end = offsetofend(struct mptcp_sched_data, snd_burst);
+		break;
 	default:
-		bpf_log(log, "no write support to mptcp_subflow_context at off %d\n", off);
+		bpf_log(log, "no write support to %s at off %d\n",
+			t == mptcp_context_type ? "subflow_context" : "sched_data", off);
 		return -EACCES;
 	}
 
 	if (off + size > end) {
-		bpf_log(log, "access beyond mptcp_subflow_context at off %u size %u ended at %zu",
-			off, size, end);
+		bpf_log(log, "access beyond %s at off %u size %u ended at %zu",
+			t == mptcp_context_type ? "subflow_context" : "sched_data", off, size, end);
 		return -EACCES;
 	}
 
@@ -141,8 +149,15 @@ static int bpf_mptcp_sched_init(struct btf *btf)
 					BTF_KIND_STRUCT);
 	if (type_id < 0)
 		return -EINVAL;
-	mptcp_sched_id = type_id;
-	mptcp_sched_type = btf_type_by_id(btf, mptcp_sched_id);
+	mptcp_context_id = type_id;
+	mptcp_context_type = btf_type_by_id(btf, mptcp_context_id);
+
+	type_id = btf_find_by_name_kind(btf, "mptcp_sched_data",
+					BTF_KIND_STRUCT);
+	if (type_id < 0)
+		return -EINVAL;
+	mptcp_data_id = type_id;
+	mptcp_data_type = btf_type_by_id(btf, mptcp_data_id);
 
 	return 0;
 }
-- 
2.35.3
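
Note: the write check in bpf_mptcp_sched_btf_struct_access() above
whitelists one field per case and then rejects any access that runs past
the end of that field. Below is a minimal userspace sketch of that
pattern. The struct demo_ctx layout, the demo_offsetofend() macro, the
check_write() name and the -1 error value are all made up for
illustration; they are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel struct; not the real layout. */
struct demo_ctx {
	unsigned long avg_pacing_rate;
	unsigned int scheduled;
	int private_state;	/* not writable in this model */
};

/* Userspace equivalent of the kernel's offsetofend() macro. */
#define demo_offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Return 0 if a write of 'size' bytes at offset 'off' is allowed, else -1. */
static int check_write(size_t off, size_t size)
{
	size_t end;

	assert(size > 0);

	switch (off) {
	case offsetof(struct demo_ctx, avg_pacing_rate):
		end = demo_offsetofend(struct demo_ctx, avg_pacing_rate);
		break;
	case offsetof(struct demo_ctx, scheduled):
		end = demo_offsetofend(struct demo_ctx, scheduled);
		break;
	default:
		return -1;	/* field is not on the whitelist */
	}

	if (off + size > end)
		return -1;	/* access runs past the whitelisted field */

	return 0;
}
```

In this model a write has to start exactly at the beginning of a
whitelisted field, which is also what the switch on 'off' in the patch
implies.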



* [PATCH mptcp-next v3 11/15] Squash to "bpf: Add bpf_mptcp_sched_kfunc_set"
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (9 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 10/15] Squash to "bpf: Add bpf_mptcp_sched_ops" Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 12/15] Squash to "selftests/bpf: Add mptcp sched structs" Geliang Tang
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

Add more bpf_burst-related functions to bpf_mptcp_sched_kfunc_set, and
make mptcp_timeout_from_subflow() non-static so that it can be used as a
kfunc.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/bpf.c      | 9 +++++++++
 net/mptcp/protocol.c | 2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index f2ce9acc2628..c5acce1409f5 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -175,6 +175,15 @@ struct bpf_struct_ops bpf_mptcp_sched_ops = {
 BTF_SET8_START(bpf_mptcp_sched_kfunc_ids)
 BTF_ID_FLAGS(func, mptcp_subflow_set_scheduled)
 BTF_ID_FLAGS(func, mptcp_sched_data_set_contexts)
+BTF_ID_FLAGS(func, mptcp_subflow_active)
+BTF_ID_FLAGS(func, mptcp_timeout_from_subflow)
+BTF_ID_FLAGS(func, mptcp_set_timer)
+BTF_ID_FLAGS(func, mptcp_stream_memory_free)
+BTF_ID_FLAGS(func, mptcp_get_linger_time)
+BTF_ID_FLAGS(func, mptcp_get_burst)
+BTF_ID_FLAGS(func, mptcp_get_pacing_rate)
+BTF_ID_FLAGS(func, mptcp_rtx_and_write_queues_empty)
+BTF_ID_FLAGS(func, mptcp_pm_subflow_chk_stale)
 BTF_SET8_END(bpf_mptcp_sched_kfunc_ids)
 
 static const struct btf_kfunc_id_set bpf_mptcp_sched_kfunc_set = {
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 8dbc9b9c3eb3..83ed8ab58be2 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -488,7 +488,7 @@ void mptcp_set_timer(struct sock *sk, long tout)
 	mptcp_sk(sk)->timer_ival = tout > 0 ? tout : TCP_RTO_MIN;
 }
 
-static long mptcp_timeout_from_subflow(const struct mptcp_subflow_context *subflow)
+long mptcp_timeout_from_subflow(const struct mptcp_subflow_context *subflow)
 {
 	const struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
 
-- 
2.35.3



* [PATCH mptcp-next v3 12/15] Squash to "selftests/bpf: Add mptcp sched structs"
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (10 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 11/15] Squash to "bpf: Add bpf_mptcp_sched_kfunc_set" Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 13/15] Squash to "selftests/bpf: Add bpf_rr scheduler" Geliang Tang
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

Use two tabs between 'bool' and 'reinject' in struct mptcp_sched_data.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 tools/testing/selftests/bpf/bpf_tcp_helpers.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
index 72c618037386..fcb023a749ad 100644
--- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
@@ -239,7 +239,7 @@ struct mptcp_subflow_context {
 } __attribute__((preserve_access_index));
 
 struct mptcp_sched_data {
-	bool	reinject;
+	bool		reinject;
 	struct mptcp_subflow_context *contexts[MPTCP_SUBFLOWS_MAX];
 } __attribute__((preserve_access_index));
 
-- 
2.35.3



* [PATCH mptcp-next v3 13/15] Squash to "selftests/bpf: Add bpf_rr scheduler"
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (11 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 12/15] Squash to "selftests/bpf: Add mptcp sched structs" Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 14/15] selftests/bpf: Add bpf_burst scheduler Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 15/15] selftests/bpf: Add bpf_burst test Geliang Tang
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

Use data->last_snd instead of msk->last_snd, since last_snd moves from
struct mptcp_sock to struct mptcp_sched_data.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
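Note: a minimal userspace sketch of the round-robin choice made in
bpf_rr_get_subflow(), in case it helps review. It picks the subflow
that follows the one used last and wraps to index 0 at the end of the
list. The demo_* types and rr_next() are illustrative stand-ins, not
the kernel or selftest definitions.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SUBFLOWS_MAX 8

/* Hypothetical stand-ins for the subflow context and sched_data. */
struct demo_subflow {
	const void *tcp_sock;	/* opaque handle of the subflow's TCP socket */
};

struct demo_sched_data {
	const void *last_snd;	/* socket used for the previous send, or NULL */
	struct demo_subflow *contexts[DEMO_SUBFLOWS_MAX];
};

/* Return the index of the subflow to schedule next. */
static int rr_next(const struct demo_sched_data *data)
{
	int i;

	assert(data != NULL);

	for (i = 0; i < DEMO_SUBFLOWS_MAX; i++) {
		if (!data->last_snd || !data->contexts[i])
			break;

		if (data->contexts[i]->tcp_sock == data->last_snd) {
			/* last entry in use: wrap around to index 0 */
			if (i + 1 == DEMO_SUBFLOWS_MAX || !data->contexts[i + 1])
				break;
			return i + 1;
		}
	}

	return 0;
}
```

With three subflows and last_snd pointing at the first one, rr_next()
returns 1; with last_snd pointing at the third one, it wraps to 0.
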
 tools/testing/selftests/bpf/bpf_tcp_helpers.h    | 2 +-
 tools/testing/selftests/bpf/progs/mptcp_bpf_rr.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
index fcb023a749ad..dddb51a47740 100644
--- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
@@ -239,6 +239,7 @@ struct mptcp_subflow_context {
 } __attribute__((preserve_access_index));
 
 struct mptcp_sched_data {
+	struct sock	*last_snd;
 	bool		reinject;
 	struct mptcp_subflow_context *contexts[MPTCP_SUBFLOWS_MAX];
 } __attribute__((preserve_access_index));
@@ -259,7 +260,6 @@ struct mptcp_sched_ops {
 struct mptcp_sock {
 	struct inet_connection_sock	sk;
 
-	struct sock	*last_snd;
 	__u32		token;
 	struct sock	*first;
 	char		ca_name[TCP_CA_NAME_MAX];
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf_rr.c b/tools/testing/selftests/bpf/progs/mptcp_bpf_rr.c
index e101428e5906..4b4141056fe2 100644
--- a/tools/testing/selftests/bpf/progs/mptcp_bpf_rr.c
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf_rr.c
@@ -28,10 +28,10 @@ int BPF_STRUCT_OPS(bpf_rr_get_subflow, const struct mptcp_sock *msk,
 	int nr = 0;
 
 	for (int i = 0; i < MPTCP_SUBFLOWS_MAX; i++) {
-		if (!msk->last_snd || !data->contexts[i])
+		if (!data->last_snd || !data->contexts[i])
 			break;
 
-		if (data->contexts[i]->tcp_sock == msk->last_snd) {
+		if (data->contexts[i]->tcp_sock == data->last_snd) {
 			if (i + 1 == MPTCP_SUBFLOWS_MAX || !data->contexts[i + 1])
 				break;
 
-- 
2.35.3



* [PATCH mptcp-next v3 14/15] selftests/bpf: Add bpf_burst scheduler
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (12 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 13/15] Squash to "selftests/bpf: Add bpf_rr scheduler" Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:17 ` [PATCH mptcp-next v3 15/15] selftests/bpf: Add bpf_burst test Geliang Tang
  14 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

This patch implements a BPF version of the burst MPTCP scheduler, named
bpf_burst. It mirrors the default scheduler implemented in protocol.c.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
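Note: a minimal userspace sketch of the selection done in
bpf_burst_get_send(). It keeps the subflow with the lowest linger time
per mode (active or backup) and falls back to the best backup only when
no active non-backup subflow exists. The demo_* names and burst_pick()
are illustrative. The linger time formula (queued bytes shifted left by
32, divided by the pacing rate) is an assumption, since the body of
mptcp_get_linger_time() is not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SUBFLOWS_MAX 8

enum { DEMO_MODE_ACTIVE, DEMO_MODE_BACKUP, DEMO_MODE_MAX };

/* Hypothetical, simplified view of one subflow. */
struct demo_subflow {
	int active;		/* usable for sending */
	int backup;		/* non-zero if flagged as backup */
	uint32_t queued;	/* bytes queued for transmission */
	uint64_t pace;		/* pacing rate in bytes/s, 0 if unknown */
};

/* Assumed formula: a lower value means the queue drains sooner. */
static uint64_t demo_linger_time(const struct demo_subflow *sf)
{
	return ((uint64_t)sf->queued << 32) / sf->pace;
}

/* Return the index of the chosen subflow, or -1 if none is usable. */
static int burst_pick(const struct demo_subflow *sf, int n)
{
	uint64_t best_time[DEMO_MODE_MAX] = { UINT64_MAX, UINT64_MAX };
	int best[DEMO_MODE_MAX] = { -1, -1 };
	int nr_active = 0;
	int i;

	assert(n >= 0 && n <= DEMO_SUBFLOWS_MAX);

	for (i = 0; i < n; i++) {
		int mode = sf[i].backup ? DEMO_MODE_BACKUP : DEMO_MODE_ACTIVE;
		uint64_t t;

		if (!sf[i].active)
			continue;

		nr_active += !sf[i].backup;
		if (!sf[i].pace)
			continue;	/* no pacing rate known yet */

		t = demo_linger_time(&sf[i]);
		if (t < best_time[mode]) {
			best[mode] = i;
			best_time[mode] = t;
		}
	}

	/* pick the best backup if no other subflow is active */
	if (!nr_active)
		best[DEMO_MODE_ACTIVE] = best[DEMO_MODE_BACKUP];

	return best[DEMO_MODE_ACTIVE];
}
```

The sketch leaves out the timer update, the memory check and the
snd_burst/avg_pacing_rate bookkeeping that the BPF program also does.
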
 tools/testing/selftests/bpf/bpf_tcp_helpers.h |   3 +
 .../selftests/bpf/progs/mptcp_bpf_burst.c     | 195 ++++++++++++++++++
 2 files changed, 198 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/mptcp_bpf_burst.c

diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
index dddb51a47740..d42a0c00ed73 100644
--- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
@@ -234,12 +234,15 @@ extern void tcp_cong_avoid_ai(struct tcp_sock *tp, __u32 w, __u32 acked) __ksym;
 #define MPTCP_SUBFLOWS_MAX	8
 
 struct mptcp_subflow_context {
+	unsigned long avg_pacing_rate;
 	__u32	backup : 1;
+	__u8	stale_count;
 	struct	sock *tcp_sock;	    /* tcp sk backpointer */
 } __attribute__((preserve_access_index));
 
 struct mptcp_sched_data {
 	struct sock	*last_snd;
+	int		snd_burst;
 	bool		reinject;
 	struct mptcp_subflow_context *contexts[MPTCP_SUBFLOWS_MAX];
 } __attribute__((preserve_access_index));
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf_burst.c b/tools/testing/selftests/bpf/progs/mptcp_bpf_burst.c
new file mode 100644
index 000000000000..305829162ce5
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf_burst.c
@@ -0,0 +1,195 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023, SUSE. */
+
+#include <linux/bpf.h>
+#include <limits.h>
+#include "bpf_tcp_helpers.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct subflow_send_info {
+	struct sock *ssk;
+	__u64 linger_time;
+};
+
+static inline struct sock *
+mptcp_subflow_tcp_sock(const struct mptcp_subflow_context *subflow)
+{
+	return subflow->tcp_sock;
+}
+
+extern bool mptcp_subflow_active(struct mptcp_subflow_context *subflow) __ksym;
+extern long mptcp_timeout_from_subflow(const struct mptcp_subflow_context *subflow) __ksym;
+extern void mptcp_set_timer(struct sock *sk, long tout) __ksym;
+extern bool mptcp_stream_memory_free(struct mptcp_subflow_context *subflow) __ksym;
+extern __u64 mptcp_get_linger_time(struct sock *ssk, __u32 pace) __ksym;
+extern __u32 mptcp_get_burst(const struct mptcp_sock *msk) __ksym;
+extern unsigned long mptcp_get_pacing_rate(struct mptcp_subflow_context *subflow, __u32 burst) __ksym;
+extern bool mptcp_rtx_and_write_queues_empty(const struct sock *sk) __ksym;
+extern void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk) __ksym;
+
+#define SSK_MODE_ACTIVE	0
+#define SSK_MODE_BACKUP	1
+#define SSK_MODE_MAX	2
+
+SEC("struct_ops/mptcp_sched_burst_init")
+void BPF_PROG(mptcp_sched_burst_init, const struct mptcp_sock *msk)
+{
+}
+
+SEC("struct_ops/mptcp_sched_burst_release")
+void BPF_PROG(mptcp_sched_burst_release, const struct mptcp_sock *msk)
+{
+}
+
+void BPF_STRUCT_OPS(bpf_burst_data_init, const struct mptcp_sock *msk,
+		    struct mptcp_sched_data *data)
+{
+	mptcp_sched_data_set_contexts(msk, data);
+}
+
+static int bpf_burst_get_send(const struct mptcp_sock *msk,
+			      struct mptcp_sched_data *data)
+{
+	struct subflow_send_info send_info[SSK_MODE_MAX];
+	struct mptcp_subflow_context *subflow;
+	struct sock *sk = (struct sock *)msk;
+	int i, nr_active = 0;
+	__u32 pace, burst;
+	__u64 linger_time;
+	struct sock *ssk;
+	long tout = 0;
+	int nr = 0;
+
+	/* pick the subflow with the lower wmem/wspace ratio */
+	for (i = 0; i < SSK_MODE_MAX; ++i) {
+		send_info[i].ssk = NULL;
+		send_info[i].linger_time = -1;
+	}
+
+	for (i = 0; i < MPTCP_SUBFLOWS_MAX; i++) {
+		if (!data->contexts[i])
+			break;
+
+		subflow = data->contexts[i];
+		ssk = mptcp_subflow_tcp_sock(subflow);
+		if (!mptcp_subflow_active(subflow))
+			continue;
+
+		tout = max(tout, mptcp_timeout_from_subflow(subflow));
+		nr_active += !subflow->backup;
+		pace = subflow->avg_pacing_rate;
+		if (!pace) {
+			/* init pacing rate from socket */
+			subflow->avg_pacing_rate = ssk->sk_pacing_rate;
+			pace = subflow->avg_pacing_rate;
+			if (!pace)
+				continue;
+		}
+
+		linger_time = mptcp_get_linger_time(ssk, pace);
+		if (linger_time < send_info[subflow->backup].linger_time) {
+			send_info[subflow->backup].ssk = ssk;
+			send_info[subflow->backup].linger_time = linger_time;
+		}
+	}
+	mptcp_set_timer(sk, tout);
+
+	/* pick the best backup if no other subflow is active */
+	if (!nr_active)
+		send_info[SSK_MODE_ACTIVE].ssk = send_info[SSK_MODE_BACKUP].ssk;
+
+	ssk = send_info[SSK_MODE_ACTIVE].ssk;
+	if (!ssk)
+		return -1;
+
+	for (i = 0; i < MPTCP_SUBFLOWS_MAX; i++) {
+		if (data->contexts[i]->tcp_sock == ssk) {
+			nr = i;
+			break;
+		}
+	}
+	subflow = data->contexts[nr];
+
+	if (!mptcp_stream_memory_free(subflow))
+		return -1;
+
+	burst = mptcp_get_burst(msk);
+	if (!burst)
+		goto out;
+
+	data->snd_burst = burst;
+	subflow->avg_pacing_rate = mptcp_get_pacing_rate(subflow, burst);
+
+out:
+	mptcp_subflow_set_scheduled(subflow, true);
+	return 0;
+}
+
+static int bpf_burst_get_retrans(const struct mptcp_sock *msk,
+				 struct mptcp_sched_data *data)
+{
+	struct sock *backup = NULL, *pick = NULL, *ret = NULL;
+	struct mptcp_subflow_context *subflow;
+	int min_stale_count = INT_MAX;
+	struct sock *ssk;
+	int i, nr = 0;
+
+	for (i = 0; i < MPTCP_SUBFLOWS_MAX; i++) {
+		if (!data->contexts[i])
+			break;
+
+		subflow = data->contexts[i];
+		ssk = mptcp_subflow_tcp_sock(subflow);
+		if (!mptcp_subflow_active(subflow))
+			continue;
+
+		/* still data outstanding at TCP level? skip this */
+		if (!mptcp_rtx_and_write_queues_empty(ssk)) {
+			mptcp_pm_subflow_chk_stale(msk, ssk);
+			min_stale_count = min(min_stale_count, subflow->stale_count);
+			continue;
+		}
+
+		if (subflow->backup) {
+			if (!backup)
+				backup = ssk;
+			continue;
+		}
+
+		if (!pick)
+			pick = ssk;
+	}
+
+	ret = pick;
+	if (!ret)
+		ret = min_stale_count > 1 ? backup : NULL;
+
+	if (ret) {
+		for (i = 0; i < MPTCP_SUBFLOWS_MAX; i++) {
+			if (data->contexts[i]->tcp_sock == ret) {
+				nr = i;
+				break;
+			}
+		}
+	}
+	mptcp_subflow_set_scheduled(data->contexts[nr], true);
+	return 0;
+}
+
+int BPF_STRUCT_OPS(bpf_burst_get_subflow, const struct mptcp_sock *msk,
+		   struct mptcp_sched_data *data)
+{
+	if (data->reinject)
+		return bpf_burst_get_retrans(msk, data);
+	return bpf_burst_get_send(msk, data);
+}
+
+SEC(".struct_ops")
+struct mptcp_sched_ops burst = {
+	.init		= (void *)mptcp_sched_burst_init,
+	.release	= (void *)mptcp_sched_burst_release,
+	.data_init	= (void *)bpf_burst_data_init,
+	.get_subflow	= (void *)bpf_burst_get_subflow,
+	.name		= "bpf_burst",
+};
-- 
2.35.3



* [PATCH mptcp-next v3 15/15] selftests/bpf: Add bpf_burst test
  2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
                   ` (13 preceding siblings ...)
  2023-05-30 13:17 ` [PATCH mptcp-next v3 14/15] selftests/bpf: Add bpf_burst scheduler Geliang Tang
@ 2023-05-30 13:17 ` Geliang Tang
  2023-05-30 13:50   ` selftests/bpf: Add bpf_burst test: Build Failure MPTCP CI
  2023-05-30 14:46   ` selftests/bpf: Add bpf_burst test: Tests Results MPTCP CI
  14 siblings, 2 replies; 19+ messages in thread
From: Geliang Tang @ 2023-05-30 13:17 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

This patch adds a test for the burst BPF MPTCP scheduler: test_burst().
It uses sysctl to set net.mptcp.scheduler to this scheduler, adds two
veth net devices to simulate the multiple-address case, and uses the
'ip mptcp endpoint' command to add the new endpoint ADDR_2 to the
netlink PM. It then sends data and checks bytes_sent in the 'ss' output
to make sure the data has been sent on both net devices.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
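Note: the check on bytes_sent relies on the "bytes_sent:<n>" token that
'ss -i' prints for a connection. As a rough illustration only, here is
a small C sketch that extracts this counter from one line of such
output. parse_bytes_sent() is a hypothetical helper, not the
has_bytes_sent() function used by the test, whose body is not part of
this patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of the first "bytes_sent:<n>" token in 'line',
 * or -1 if the token is absent or has no numeric value.
 */
static long long parse_bytes_sent(const char *line)
{
	static const char key[] = "bytes_sent:";
	const char *p;
	char *end;
	long long val;

	assert(line != NULL);

	p = strstr(line, key);
	if (!p)
		return -1;

	p += sizeof(key) - 1;		/* skip past the key itself */
	val = strtoll(p, &end, 10);
	if (end == p || val < 0)
		return -1;		/* no digits after the key */

	return val;
}
```

A subflow that carried no data would either lack the token or report a
value of 0, so a caller could treat any result <= 0 as "nothing sent".
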
 .../testing/selftests/bpf/prog_tests/mptcp.c  | 38 +++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c
index a968641cc94a..b9f6dcf995fd 100644
--- a/tools/testing/selftests/bpf/prog_tests/mptcp.c
+++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c
@@ -10,6 +10,7 @@
 #include "mptcp_bpf_bkup.skel.h"
 #include "mptcp_bpf_rr.skel.h"
 #include "mptcp_bpf_red.skel.h"
+#include "mptcp_bpf_burst.skel.h"
 
 char NS_TEST[32];
 
@@ -455,6 +456,41 @@ static void test_red(void)
 	mptcp_bpf_red__destroy(red_skel);
 }
 
+static void test_burst(void)
+{
+	struct mptcp_bpf_burst *burst_skel;
+	int server_fd, client_fd;
+	struct nstoken *nstoken;
+	struct bpf_link *link;
+
+	burst_skel = mptcp_bpf_burst__open_and_load();
+	if (!ASSERT_OK_PTR(burst_skel, "bpf_burst__open_and_load"))
+		return;
+
+	link = bpf_map__attach_struct_ops(burst_skel->maps.burst);
+	if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+		mptcp_bpf_burst__destroy(burst_skel);
+		return;
+	}
+
+	nstoken = sched_init("subflow", "bpf_burst");
+	if (!ASSERT_OK_PTR(nstoken, "sched_init:bpf_burst"))
+		goto fail;
+	server_fd = start_mptcp_server(AF_INET, ADDR_1, 0, 0);
+	client_fd = connect_to_fd(server_fd, 0);
+
+	send_data(server_fd, client_fd);
+	ASSERT_OK(has_bytes_sent(ADDR_1), "has_bytes_sent addr 1");
+	ASSERT_OK(has_bytes_sent(ADDR_2), "has_bytes_sent addr 2");
+
+	close(client_fd);
+	close(server_fd);
+fail:
+	cleanup_netns(nstoken);
+	bpf_link__destroy(link);
+	mptcp_bpf_burst__destroy(burst_skel);
+}
+
 void test_mptcp(void)
 {
 	if (test__start_subtest("base"))
@@ -467,4 +503,6 @@ void test_mptcp(void)
 		test_rr();
 	if (test__start_subtest("red"))
 		test_red();
+	if (test__start_subtest("burst"))
+		test_burst();
 }
-- 
2.35.3



* Re: selftests/bpf: Add bpf_burst test: Build Failure
  2023-05-30 13:17 ` [PATCH mptcp-next v3 15/15] selftests/bpf: Add bpf_burst test Geliang Tang
@ 2023-05-30 13:50   ` MPTCP CI
  2023-05-30 14:46   ` selftests/bpf: Add bpf_burst test: Tests Results MPTCP CI
  1 sibling, 0 replies; 19+ messages in thread
From: MPTCP CI @ 2023-05-30 13:50 UTC (permalink / raw)
  To: Geliang Tang; +Cc: mptcp

Hi Geliang,

Thank you for your modifications, that's great!

But sadly, our CI spotted some issues with it when trying to build it.

You can find more details there:

  https://patchwork.kernel.org/project/mptcp/patch/3be9b9fa90210c4543212c088bc7b4ebe4dc8ece.1685452619.git.geliang.tang@suse.com/
  https://github.com/multipath-tcp/mptcp_net-next/actions/runs/5122145743

Status: failure
Initiator: MPTCPimporter
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/ae3dad1762e2

Feel free to reply to this email if you cannot access logs, if you need
some support to fix the error, if this doesn't seem to be caused by your
modifications or if the error is a false positive one.

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)


* Re: selftests/bpf: Add bpf_burst test: Tests Results
  2023-05-30 13:17 ` [PATCH mptcp-next v3 15/15] selftests/bpf: Add bpf_burst test Geliang Tang
  2023-05-30 13:50   ` selftests/bpf: Add bpf_burst test: Build Failure MPTCP CI
@ 2023-05-30 14:46   ` MPTCP CI
  1 sibling, 0 replies; 19+ messages in thread
From: MPTCP CI @ 2023-05-30 14:46 UTC (permalink / raw)
  To: Geliang Tang; +Cc: mptcp

Hi Geliang,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join):
  - Unstable: 1 failed test(s): packetdrill_fastopen 🔴:
  - Task: https://cirrus-ci.com/task/6149320072757248
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6149320072757248/summary/summary.txt

- KVM Validation: normal (only selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/5586370119335936
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5586370119335936/summary/summary.txt

- KVM Validation: debug (only selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/4566023328759808
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/4566023328759808/summary/summary.txt

- KVM Validation: debug (except selftest_mptcp_join):
  - Unstable: 3 failed test(s): packetdrill_add_addr packetdrill_fastopen selftest_diag 🔴:
  - Task: https://cirrus-ci.com/task/6712270026178560
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6712270026178560/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/ae3dad1762e2


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-debug

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)


* Re: selftests/bpf: Add bpf_burst test: Build Failure
  2023-06-05 10:04 [PATCH mptcp-next v4 14/14] selftests/bpf: Add bpf_burst test Geliang Tang
@ 2023-06-05 10:34 ` MPTCP CI
  0 siblings, 0 replies; 19+ messages in thread
From: MPTCP CI @ 2023-06-05 10:34 UTC (permalink / raw)
  To: Geliang Tang; +Cc: mptcp

Hi Geliang,

Thank you for your modifications, that's great!

But sadly, our CI spotted some issues with it when trying to build it.

You can find more details there:

  https://patchwork.kernel.org/project/mptcp/patch/cc7e5e740b2f42bd69ad219190c221cf62fb83ba.1685959315.git.geliang.tang@suse.com/
  https://github.com/multipath-tcp/mptcp_net-next/actions/runs/5175972267

Status: failure
Initiator: MPTCPimporter
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/03eb102a4f4b

Feel free to reply to this email if you cannot access logs, if you need
some support to fix the error, if this doesn't seem to be caused by your
modifications or if the error is a false positive one.

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)


end of thread, other threads:[~2023-06-05 10:34 UTC | newest]

Thread overview: 19+ messages
2023-05-30 13:17 [PATCH mptcp-next v3 00/15] save sched_data at mptcp_sock Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 01/15] Squash to "mptcp: add struct mptcp_sched_ops" Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 02/15] Squash to "mptcp: add sched in mptcp_sock" Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 03/15] Squash to "mptcp: add scheduler wrappers" Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 04/15] mptcp: add last_snd in sched_data Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 05/15] mptcp: add snd_burst " Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 06/15] mptcp: register default scheduler Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 07/15] mptcp: rename __mptcp_set_timeout for bpf_burst Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 08/15] mptcp: add two wrappers " Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 09/15] mptcp: add three helpers " Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 10/15] Squash to "bpf: Add bpf_mptcp_sched_ops" Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 11/15] Squash to "bpf: Add bpf_mptcp_sched_kfunc_set" Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 12/15] Squash to "selftests/bpf: Add mptcp sched structs" Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 13/15] Squash to "selftests/bpf: Add bpf_rr scheduler" Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 14/15] selftests/bpf: Add bpf_burst scheduler Geliang Tang
2023-05-30 13:17 ` [PATCH mptcp-next v3 15/15] selftests/bpf: Add bpf_burst test Geliang Tang
2023-05-30 13:50   ` selftests/bpf: Add bpf_burst test: Build Failure MPTCP CI
2023-05-30 14:46   ` selftests/bpf: Add bpf_burst test: Tests Results MPTCP CI
2023-06-05 10:04 [PATCH mptcp-next v4 14/14] selftests/bpf: Add bpf_burst test Geliang Tang
2023-06-05 10:34 ` selftests/bpf: Add bpf_burst test: Build Failure MPTCP CI
