mptcp.lists.linux.dev archive mirror
* [PATCH RFC mptcp-next 0/4] round-robin packet scheduler support
@ 2021-09-07 10:41 Geliang Tang
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 1/4] mptcp: add a new sysctl scheduler Geliang Tang
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Geliang Tang @ 2021-09-07 10:41 UTC (permalink / raw)
  To: mptcp, geliangtang; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@xiaomi.com>

Implement the round-robin packet scheduler, ported from the mptcp.org kernel.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/194

Geliang Tang (4):
  mptcp: add a new sysctl scheduler
  mptcp: add struct mptcp_sched_ops
  mptcp: round-robin packet scheduler support
  selftests: mptcp: add round-robin testcase

 Documentation/networking/mptcp-sysctl.rst     |  8 ++
 net/mptcp/Kconfig                             |  7 ++
 net/mptcp/ctrl.c                              | 14 +++
 net/mptcp/protocol.c                          | 86 ++++++++++++++++++-
 net/mptcp/protocol.h                          | 11 +++
 .../testing/selftests/net/mptcp/mptcp_join.sh | 11 ++-
 6 files changed, 132 insertions(+), 5 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH RFC mptcp-next 1/4] mptcp: add a new sysctl scheduler
  2021-09-07 10:41 [PATCH RFC mptcp-next 0/4] round-robin packet scheduler support Geliang Tang
@ 2021-09-07 10:41 ` Geliang Tang
  2021-09-07 10:59   ` Paolo Abeni
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 2/4] mptcp: add struct mptcp_sched_ops Geliang Tang
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Geliang Tang @ 2021-09-07 10:41 UTC (permalink / raw)
  To: mptcp, geliangtang; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@xiaomi.com>

This patch adds a new sysctl, named scheduler, to support the selection
of different packet schedulers.

Signed-off-by: Geliang Tang <geliangtang@xiaomi.com>
---
 Documentation/networking/mptcp-sysctl.rst |  8 ++++++++
 net/mptcp/ctrl.c                          | 14 ++++++++++++++
 net/mptcp/protocol.h                      |  3 +++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
index b0d4da71e68e..ecd593c44e26 100644
--- a/Documentation/networking/mptcp-sysctl.rst
+++ b/Documentation/networking/mptcp-sysctl.rst
@@ -57,3 +57,11 @@ stale_loss_cnt - INTEGER
 	This is a per-namespace sysctl.
 
 	Default: 4
+
+scheduler - STRING
+	Select the scheduler of your choice.
+
+	Support for selection of different schedulers. This is a per-namespace
+	sysctl.
+
+	Default: "default"
diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
index 8b235468c88f..369369b0b17e 100644
--- a/net/mptcp/ctrl.c
+++ b/net/mptcp/ctrl.c
@@ -26,6 +26,7 @@ struct mptcp_pernet {
 	u8 mptcp_enabled;
 	u8 checksum_enabled;
 	u8 allow_join_initial_addr_port;
+	char scheduler[MPTCP_SCHED_NAME_MAX];
 };
 
 static struct mptcp_pernet *mptcp_get_pernet(const struct net *net)
@@ -58,6 +59,11 @@ unsigned int mptcp_stale_loss_cnt(const struct net *net)
 	return mptcp_get_pernet(net)->stale_loss_cnt;
 }
 
+char *mptcp_get_scheduler(struct net *net)
+{
+	return mptcp_get_pernet(net)->scheduler;
+}
+
 static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
 {
 	pernet->mptcp_enabled = 1;
@@ -65,6 +71,7 @@ static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
 	pernet->checksum_enabled = 0;
 	pernet->allow_join_initial_addr_port = 1;
 	pernet->stale_loss_cnt = 4;
+	strcpy(pernet->scheduler, "default");
 }
 
 #ifdef CONFIG_SYSCTL
@@ -108,6 +115,12 @@ static struct ctl_table mptcp_sysctl_table[] = {
 		.mode = 0644,
 		.proc_handler = proc_douintvec_minmax,
 	},
+	{
+		.procname	= "scheduler",
+		.maxlen		= MPTCP_SCHED_NAME_MAX,
+		.mode		= 0644,
+		.proc_handler	= proc_dostring,
+	},
 	{}
 };
 
@@ -128,6 +141,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
 	table[2].data = &pernet->checksum_enabled;
 	table[3].data = &pernet->allow_join_initial_addr_port;
 	table[4].data = &pernet->stale_loss_cnt;
+	table[5].data = &pernet->scheduler;
 
 	hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
 	if (!hdr)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 99a23fff7b03..305d373332b7 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -206,6 +206,8 @@ struct mptcp_pm_data {
 	struct mptcp_rm_list rm_list_rx;
 };
 
+#define MPTCP_SCHED_NAME_MAX 16
+
 struct mptcp_data_frag {
 	struct list_head list;
 	u64 data_seq;
@@ -564,6 +566,7 @@ unsigned int mptcp_get_add_addr_timeout(const struct net *net);
 int mptcp_is_checksum_enabled(const struct net *net);
 int mptcp_allow_join_id0(const struct net *net);
 unsigned int mptcp_stale_loss_cnt(const struct net *net);
+char *mptcp_get_scheduler(struct net *net);
 void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
 				     struct mptcp_options_received *mp_opt);
 bool __mptcp_retransmit_pending_data(struct sock *sk);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread
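
Assuming a kernel built with this patch applied, the new knob can be
exercised from a shell along these lines (the "roundrobin" value is only
meaningful once a scheduler with that name is registered, as in the later
patches of this series):

```shell
# Read the current scheduler of this netns (initial value is "default").
sysctl net.mptcp.scheduler

# Switch the current netns to another registered scheduler.
sysctl -w net.mptcp.scheduler=roundrobin

# The sysctl is per-namespace, so other namespaces keep their own value.
ip netns add mptcp-test
ip netns exec mptcp-test sysctl net.mptcp.scheduler   # still "default"
```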

* [PATCH RFC mptcp-next 2/4] mptcp: add struct mptcp_sched_ops
  2021-09-07 10:41 [PATCH RFC mptcp-next 0/4] round-robin packet scheduler support Geliang Tang
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 1/4] mptcp: add a new sysctl scheduler Geliang Tang
@ 2021-09-07 10:41 ` Geliang Tang
  2021-09-07 11:04   ` Paolo Abeni
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 3/4] mptcp: round-robin packet scheduler support Geliang Tang
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 4/4] selftests: mptcp: add round-robin testcase Geliang Tang
  3 siblings, 1 reply; 9+ messages in thread
From: Geliang Tang @ 2021-09-07 10:41 UTC (permalink / raw)
  To: mptcp, geliangtang; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@xiaomi.com>

This patch adds struct mptcp_sched_ops, and defines the scheduler
init, register, and find functions.

Signed-off-by: Geliang Tang <geliangtang@xiaomi.com>
---
 net/mptcp/protocol.c | 60 +++++++++++++++++++++++++++++++++++++++++---
 net/mptcp/protocol.h |  8 ++++++
 2 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2a525c7ae920..ab72a3950f2b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1515,6 +1515,58 @@ static struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
 	return NULL;
 }
 
+static struct mptcp_sched_ops mptcp_sched_default = {
+	.get_subflow	= mptcp_subflow_get_send,
+	.name		= "default",
+	.owner		= THIS_MODULE,
+};
+
+static DEFINE_SPINLOCK(mptcp_sched_list_lock);
+static LIST_HEAD(mptcp_sched_list);
+
+static struct mptcp_sched_ops *mptcp_sched_find(const char *name)
+{
+	struct mptcp_sched_ops *ops;
+
+	list_for_each_entry_rcu(ops, &mptcp_sched_list, list) {
+		if (!strcmp(ops->name, name))
+			return ops;
+	}
+
+	return NULL;
+}
+
+static int mptcp_register_scheduler(struct mptcp_sched_ops *sched)
+{
+	int ret = 0;
+
+	if (!sched->get_subflow)
+		return -EINVAL;
+
+	spin_lock(&mptcp_sched_list_lock);
+	if (mptcp_sched_find(sched->name)) {
+		pr_debug("%s already registered", sched->name);
+		ret = -EEXIST;
+	} else {
+		list_add_tail_rcu(&sched->list, &mptcp_sched_list);
+		pr_debug("%s registered", sched->name);
+	}
+	spin_unlock(&mptcp_sched_list_lock);
+	return ret;
+}
+
+static void mptcp_sched_data_init(struct mptcp_sock *msk)
+{
+	struct net *net = sock_net((struct sock *)msk);
+
+	msk->sched = mptcp_sched_find(mptcp_get_scheduler(net));
+}
+
+static void mptcp_sched_init(void)
+{
+	mptcp_register_scheduler(&mptcp_sched_default);
+}
+
 static void mptcp_push_release(struct sock *sk, struct sock *ssk,
 			       struct mptcp_sendmsg_info *info)
 {
@@ -1567,7 +1619,7 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
 
 			prev_ssk = ssk;
 			mptcp_flush_join_list(msk);
-			ssk = mptcp_subflow_get_send(msk);
+			ssk = msk->sched->get_subflow(msk);
 
 			/* First check. If the ssk has changed since
 			 * the last round, release prev_ssk
@@ -1634,7 +1686,7 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk)
 			 * check for a different subflow usage only after
 			 * spooling the first chunk of data
 			 */
-			xmit_ssk = first ? ssk : mptcp_subflow_get_send(mptcp_sk(sk));
+			xmit_ssk = first ? ssk : mptcp_sk(sk)->sched->get_subflow(mptcp_sk(sk));
 			if (!xmit_ssk)
 				goto out;
 			if (xmit_ssk != ssk) {
@@ -2534,6 +2586,7 @@ static int __mptcp_init_sock(struct sock *sk)
 	msk->recovery = false;
 
 	mptcp_pm_data_init(msk);
+	mptcp_sched_data_init(msk);
 
 	/* re-use the csk retrans timer for MPTCP-level retrans */
 	timer_setup(&msk->sk.icsk_retransmit_timer, mptcp_retransmit_timer, 0);
@@ -3005,7 +3058,7 @@ void __mptcp_check_push(struct sock *sk, struct sock *ssk)
 		return;
 
 	if (!sock_owned_by_user(sk)) {
-		struct sock *xmit_ssk = mptcp_subflow_get_send(mptcp_sk(sk));
+		struct sock *xmit_ssk = mptcp_sk(sk)->sched->get_subflow(mptcp_sk(sk));
 
 		if (xmit_ssk == ssk)
 			__mptcp_subflow_push_pending(sk, ssk);
@@ -3569,6 +3622,7 @@ void __init mptcp_proto_init(void)
 
 	mptcp_subflow_init();
 	mptcp_pm_init();
+	mptcp_sched_init();
 	mptcp_token_init();
 
 	if (proto_register(&mptcp_prot, MPTCP_USE_SLAB) != 0)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 305d373332b7..71237d1458ea 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -207,6 +207,13 @@ struct mptcp_pm_data {
 };
 
 #define MPTCP_SCHED_NAME_MAX 16
+struct mptcp_sched_ops {
+	struct list_head list;
+
+	struct sock *	(*get_subflow)(struct mptcp_sock *msk);
+	char		name[MPTCP_SCHED_NAME_MAX];
+	struct module	*owner;
+};
 
 struct mptcp_data_frag {
 	struct list_head list;
@@ -264,6 +271,7 @@ struct mptcp_sock {
 	struct socket	*subflow; /* outgoing connect/listener/!mp_capable */
 	struct sock	*first;
 	struct mptcp_pm_data	pm;
+	struct mptcp_sched_ops	*sched;
 	struct {
 		u32	space;	/* bytes copied in last measurement window */
 		u32	copied; /* bytes copied in this measurement window */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread
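
The register/find pair above is a classic name-keyed ops registry. A rough
userspace sketch of the same pattern (plain C; the kernel's spinlock and
RCU list walk are replaced by a bare singly linked list, and all names and
types here are illustrative, not the kernel API):

```c
#include <string.h>
#include <stddef.h>

#define SCHED_NAME_MAX 16

struct sched_ops {
	struct sched_ops *next;            /* registry is a simple linked list */
	void *(*get_subflow)(void *msk);   /* stand-in for the kernel hook */
	char name[SCHED_NAME_MAX];
};

static struct sched_ops *sched_list;   /* kernel guards this with a spinlock */

/* Trivial hook, standing in for mptcp_subflow_get_send(). */
void *echo_get_subflow(void *msk)
{
	return msk;
}

/* Walk the registry looking for a scheduler by name. */
struct sched_ops *sched_find(const char *name)
{
	struct sched_ops *ops;

	for (ops = sched_list; ops; ops = ops->next)
		if (!strcmp(ops->name, name))
			return ops;
	return NULL;
}

/* Register a scheduler: reject a missing hook (-EINVAL in the kernel
 * code) and a duplicate name (-EEXIST in the kernel code). */
int sched_register(struct sched_ops *sched)
{
	if (!sched->get_subflow)
		return -1;
	if (sched_find(sched->name))
		return -2;
	sched->next = sched_list;
	sched_list = sched;
	return 0;
}
```

At socket init time the sysctl string is then resolved through the find
function, which is what mptcp_sched_data_init() does with
mptcp_get_scheduler() in the patch.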

* [PATCH RFC mptcp-next 3/4] mptcp: round-robin packet scheduler support
  2021-09-07 10:41 [PATCH RFC mptcp-next 0/4] round-robin packet scheduler support Geliang Tang
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 1/4] mptcp: add a new sysctl scheduler Geliang Tang
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 2/4] mptcp: add struct mptcp_sched_ops Geliang Tang
@ 2021-09-07 10:41 ` Geliang Tang
  2021-09-07 11:16   ` Paolo Abeni
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 4/4] selftests: mptcp: add round-robin testcase Geliang Tang
  3 siblings, 1 reply; 9+ messages in thread
From: Geliang Tang @ 2021-09-07 10:41 UTC (permalink / raw)
  To: mptcp, geliangtang; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@xiaomi.com>

Implement the round-robin packet scheduler, as in the mptcp.org kernel.

Signed-off-by: Geliang Tang <geliangtang@xiaomi.com>
---
 net/mptcp/Kconfig    |  7 +++++++
 net/mptcp/protocol.c | 26 ++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/net/mptcp/Kconfig b/net/mptcp/Kconfig
index 10c97e19a7da..0df36991566a 100644
--- a/net/mptcp/Kconfig
+++ b/net/mptcp/Kconfig
@@ -23,6 +23,13 @@ config MPTCP_IPV6
 	depends on IPV6=y
 	default y
 
+config MPTCP_ROUNDROBIN
+	tristate "MPTCP Round-Robin"
+	default n
+	help
+	  This is a very simple round-robin scheduler. Probably has bad performance
+	  but might be interesting for researchers.
+
 config MPTCP_KUNIT_TEST
 	tristate "This builds the MPTCP KUnit tests" if !KUNIT_ALL_TESTS
 	depends on KUNIT
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index ab72a3950f2b..b78c4eb4947f 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1521,6 +1521,29 @@ static struct mptcp_sched_ops mptcp_sched_default = {
 	.owner		= THIS_MODULE,
 };
 
+#if IS_ENABLED(CONFIG_MPTCP_ROUNDROBIN)
+static struct sock *rr_get_subflow(struct mptcp_sock *msk)
+{
+	struct mptcp_subflow_context *subflow;
+	struct sock *ssk;
+
+	mptcp_for_each_subflow(msk, subflow) {
+		ssk = mptcp_subflow_tcp_sock(subflow);
+		if (ssk != msk->last_snd) {
+			pr_debug("msk=%p ssk=%p last_snd=%p", msk, ssk, msk->last_snd);
+			msk->last_snd = ssk;
+		}
+	}
+	return msk->last_snd;
+}
+
+static struct mptcp_sched_ops mptcp_sched_rr = {
+	.get_subflow	= rr_get_subflow,
+	.name		= "roundrobin",
+	.owner		= THIS_MODULE,
+};
+#endif
+
 static DEFINE_SPINLOCK(mptcp_sched_list_lock);
 static LIST_HEAD(mptcp_sched_list);
 
@@ -1565,6 +1588,9 @@ static void mptcp_sched_data_init(struct mptcp_sock *msk)
 static void mptcp_sched_init(void)
 {
 	mptcp_register_scheduler(&mptcp_sched_default);
+#if IS_ENABLED(CONFIG_MPTCP_ROUNDROBIN)
+	mptcp_register_scheduler(&mptcp_sched_rr);
+#endif
 }
 
 static void mptcp_push_release(struct sock *sk, struct sock *ssk,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH RFC mptcp-next 4/4] selftests: mptcp: add round-robin testcase
  2021-09-07 10:41 [PATCH RFC mptcp-next 0/4] round-robin packet scheduler support Geliang Tang
                   ` (2 preceding siblings ...)
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 3/4] mptcp: round-robin packet scheduler support Geliang Tang
@ 2021-09-07 10:41 ` Geliang Tang
  3 siblings, 0 replies; 9+ messages in thread
From: Geliang Tang @ 2021-09-07 10:41 UTC (permalink / raw)
  To: mptcp, geliangtang; +Cc: Geliang Tang

From: Geliang Tang <geliangtang@xiaomi.com>

Add the round-robin scheduler testcase.

Signed-off-by: Geliang Tang <geliangtang@xiaomi.com>
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 255793c5ac4f..c69b1dce95f7 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -15,6 +15,7 @@ timeout_test=$((timeout_poll * 2 + 1))
 mptcp_connect=""
 capture=0
 checksum=0
+roundrobin=0
 do_all_tests=1
 
 TEST_COUNT=0
@@ -55,6 +56,9 @@ init()
 		if [ $checksum -eq 1 ]; then
 			ip netns exec $netns sysctl -q net.mptcp.checksum_enabled=1
 		fi
+		if [ $roundrobin -eq 1 ]; then
+			ip netns exec $netns sysctl -q net.mptcp.scheduler="roundrobin"
+		fi
 	done
 
 	#  ns1              ns2
@@ -1854,9 +1858,12 @@ for arg in "$@"; do
 	if [[ "${arg}" =~ ^"-"[0-9a-zA-Z]*"C"[0-9a-zA-Z]*$ ]]; then
 		checksum=1
 	fi
+	if [[ "${arg}" =~ ^"-"[0-9a-zA-Z]*"R"[0-9a-zA-Z]*$ ]]; then
+		roundrobin=1
+	fi
 
-	# exception for the capture/checksum options, the rest means: a part of the tests
-	if [ "${arg}" != "-c" ] && [ "${arg}" != "-C" ]; then
+	# exception for the capture/checksum/roundrobin options, the rest means: a part of the tests
+	if [ "${arg}" != "-c" ] && [ "${arg}" != "-C" ] && [ "${arg}" != "-R" ]; then
 		do_all_tests=0
 	fi
 done
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread
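
The option handling added above piggybacks on the script's existing
pattern: a combined-flag regex plus a bare-flag exclusion list. The
matching logic can be sketched standalone (variable names follow the
script; this is an illustration, not the selftest itself):

```shell
#!/bin/bash
# Mimic mptcp_join.sh's flag detection for a single argument.
roundrobin=0
checksum=0
do_all_tests=1

parse_arg()
{
	local arg="$1"
	# A capital C/R anywhere in a combined short option enables the mode.
	if [[ "${arg}" =~ ^"-"[0-9a-zA-Z]*"C"[0-9a-zA-Z]*$ ]]; then
		checksum=1
	fi
	if [[ "${arg}" =~ ^"-"[0-9a-zA-Z]*"R"[0-9a-zA-Z]*$ ]]; then
		roundrobin=1
	fi
	# Only a bare -c/-C/-R keeps "run all tests"; anything else
	# selects a subset of the tests.
	if [ "${arg}" != "-c" ] && [ "${arg}" != "-C" ] && [ "${arg}" != "-R" ]; then
		do_all_tests=0
	fi
}

parse_arg "-R"
```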

* Re: [PATCH RFC mptcp-next 1/4] mptcp: add a new sysctl scheduler
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 1/4] mptcp: add a new sysctl scheduler Geliang Tang
@ 2021-09-07 10:59   ` Paolo Abeni
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Abeni @ 2021-09-07 10:59 UTC (permalink / raw)
  To: Geliang Tang, mptcp; +Cc: Geliang Tang

Hi,

On Tue, 2021-09-07 at 18:41 +0800, Geliang Tang wrote:
> From: Geliang Tang <geliangtang@xiaomi.com>
> 
> This patch adds a new sysctl, named scheduler, to support the selection
> of different packet schedulers.
> 
> Signed-off-by: Geliang Tang <geliangtang@xiaomi.com>
> ---
>  Documentation/networking/mptcp-sysctl.rst |  8 ++++++++
>  net/mptcp/ctrl.c                          | 14 ++++++++++++++
>  net/mptcp/protocol.h                      |  3 +++
>  3 files changed, 25 insertions(+)
> 
> diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
> index b0d4da71e68e..ecd593c44e26 100644
> --- a/Documentation/networking/mptcp-sysctl.rst
> +++ b/Documentation/networking/mptcp-sysctl.rst
> @@ -57,3 +57,11 @@ stale_loss_cnt - INTEGER
>  	This is a per-namespace sysctl.
>  
>  	Default: 4
> +
> +scheduler - STRING
> +	Select the scheduler of your choice.
> +
> +	Support for selection of different schedulers. This is a per-namespace
> +	sysctl.
> +
> +	Default: "default"
> diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
> index 8b235468c88f..369369b0b17e 100644
> --- a/net/mptcp/ctrl.c
> +++ b/net/mptcp/ctrl.c
> @@ -26,6 +26,7 @@ struct mptcp_pernet {
>  	u8 mptcp_enabled;
>  	u8 checksum_enabled;
>  	u8 allow_join_initial_addr_port;
> +	char scheduler[MPTCP_SCHED_NAME_MAX];
>  };
>  
>  static struct mptcp_pernet *mptcp_get_pernet(const struct net *net)
> @@ -58,6 +59,11 @@ unsigned int mptcp_stale_loss_cnt(const struct net *net)
>  	return mptcp_get_pernet(net)->stale_loss_cnt;
>  }
>  
> +char *mptcp_get_scheduler(struct net *net)

I think it should return 'const char *', and the 'net' argument should
be const as well.

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC mptcp-next 2/4] mptcp: add struct mptcp_sched_ops
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 2/4] mptcp: add struct mptcp_sched_ops Geliang Tang
@ 2021-09-07 11:04   ` Paolo Abeni
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Abeni @ 2021-09-07 11:04 UTC (permalink / raw)
  To: Geliang Tang, mptcp; +Cc: Geliang Tang

On Tue, 2021-09-07 at 18:41 +0800, Geliang Tang wrote:
> From: Geliang Tang <geliangtang@xiaomi.com>
> 
> This patch adds struct mptcp_sched_ops, and defines the scheduler
> init, register, and find functions.
> 
> Signed-off-by: Geliang Tang <geliangtang@xiaomi.com>
> ---
>  net/mptcp/protocol.c | 60 +++++++++++++++++++++++++++++++++++++++++---
>  net/mptcp/protocol.h |  8 ++++++
>  2 files changed, 65 insertions(+), 3 deletions(-)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 2a525c7ae920..ab72a3950f2b 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1515,6 +1515,58 @@ static struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
>  	return NULL;
>  }
>  
> +static struct mptcp_sched_ops mptcp_sched_default = {
> +	.get_subflow	= mptcp_subflow_get_send,
> +	.name		= "default",
> +	.owner		= THIS_MODULE,
> +};
> +
> +static DEFINE_SPINLOCK(mptcp_sched_list_lock);
> +static LIST_HEAD(mptcp_sched_list);
> +
> +static struct mptcp_sched_ops *mptcp_sched_find(const char *name)
> +{
> +	struct mptcp_sched_ops *ops;
> +
> +	list_for_each_entry_rcu(ops, &mptcp_sched_list, list) {
> +		if (!strcmp(ops->name, name))
> +			return ops;
> +	}
> +
> +	return NULL;
> +}
> +
> +static int mptcp_register_scheduler(struct mptcp_sched_ops *sched)
> +{
> +	int ret = 0;
> +
> +	if (!sched->get_subflow)
> +		return -EINVAL;
> +
> +	spin_lock(&mptcp_sched_list_lock);
> +	if (mptcp_sched_find(sched->name)) {
> +		pr_debug("%s already registered", sched->name);
> +		ret = -EEXIST;
> +	} else {
> +		list_add_tail_rcu(&sched->list, &mptcp_sched_list);
> +		pr_debug("%s registered", sched->name);
> +	}
> +	spin_unlock(&mptcp_sched_list_lock);
> +	return ret;
> +}
> +
> +static void mptcp_sched_data_init(struct mptcp_sock *msk)
> +{
> +	struct net *net = sock_net((struct sock *)msk);
> +
> +	msk->sched = mptcp_sched_find(mptcp_get_scheduler(net));
> +}
> +
> +static void mptcp_sched_init(void)
> +{
> +	mptcp_register_scheduler(&mptcp_sched_default);
> +}
> +

I think it would be nice to keep this infrastructure in a separate
file, but I have no strong opinion on it.

>  static void mptcp_push_release(struct sock *sk, struct sock *ssk,
>  			       struct mptcp_sendmsg_info *info)
>  {
> @@ -1567,7 +1619,7 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
>  
>  			prev_ssk = ssk;
>  			mptcp_flush_join_list(msk);
> -			ssk = mptcp_subflow_get_send(msk);
> +			ssk = msk->sched->get_subflow(msk);


uhm... it would be nice to avoid the indirect call, but I don't see any
solution other than using the indirect-call (IDC) wrappers
(see linux/indirect_call_wrapper.h)

>  			/* First check. If the ssk has changed since
>  			 * the last round, release prev_ssk
> @@ -1634,7 +1686,7 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk)
>  			 * check for a different subflow usage only after
>  			 * spooling the first chunk of data
>  			 */
> -			xmit_ssk = first ? ssk : mptcp_subflow_get_send(mptcp_sk(sk));
> +			xmit_ssk = first ? ssk : mptcp_sk(sk)->sched->get_subflow(mptcp_sk(sk));
>  			if (!xmit_ssk)
>  				goto out;
>  			if (xmit_ssk != ssk) {
> @@ -2534,6 +2586,7 @@ static int __mptcp_init_sock(struct sock *sk)
>  	msk->recovery = false;
>  
>  	mptcp_pm_data_init(msk);
> +	mptcp_sched_data_init(msk);
>  
>  	/* re-use the csk retrans timer for MPTCP-level retrans */
>  	timer_setup(&msk->sk.icsk_retransmit_timer, mptcp_retransmit_timer, 0);

I think we should do the lookup in mptcp_init_sock(), and copy the
scheduler from the parent in mptcp_sk_clone().

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 9+ messages in thread
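
For reference, the wrappers Paolo mentions turn an indirect call into a
direct-call fast path guarded by a pointer comparison, which is cheaper
under retpolines. A minimal userspace model of the idea (the real macros
live in include/linux/indirect_call_wrapper.h; this simplified macro is
illustrative only):

```c
/* Simplified model of INDIRECT_CALL_1(): if the function pointer is the
 * expected common case, call it directly so the compiler can inline it;
 * otherwise fall back to the indirect call. */
#define INDIRECT_CALL_1(f, f1, ...) \
	((f) == (f1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))

static int default_pick(int x) { return x + 1; }   /* the common case */
static int other_pick(int x)   { return x * 2; }   /* a registered alternative */

int pick(int (*hook)(int), int x)
{
	/* Compare against the expected default before calling indirectly. */
	return INDIRECT_CALL_1(hook, default_pick, x);
}
```

In the scheduler case, mptcp_sched_default.get_subflow would be the
expected common target.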

* Re: [PATCH RFC mptcp-next 3/4] mptcp: round-robin packet scheduler support
  2021-09-07 10:41 ` [PATCH RFC mptcp-next 3/4] mptcp: round-robin packet scheduler support Geliang Tang
@ 2021-09-07 11:16   ` Paolo Abeni
  2021-09-09  0:18     ` Mat Martineau
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Abeni @ 2021-09-07 11:16 UTC (permalink / raw)
  To: Geliang Tang, mptcp, Christoph Paasch; +Cc: Geliang Tang

On Tue, 2021-09-07 at 18:41 +0800, Geliang Tang wrote:
> From: Geliang Tang <geliangtang@xiaomi.com>
> 
> Implement the round-robin packet scheduler, as in the mptcp.org kernel.
> 
> Signed-off-by: Geliang Tang <geliangtang@xiaomi.com>
> ---
>  net/mptcp/Kconfig    |  7 +++++++
>  net/mptcp/protocol.c | 26 ++++++++++++++++++++++++++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/net/mptcp/Kconfig b/net/mptcp/Kconfig
> index 10c97e19a7da..0df36991566a 100644
> --- a/net/mptcp/Kconfig
> +++ b/net/mptcp/Kconfig
> @@ -23,6 +23,13 @@ config MPTCP_IPV6
>  	depends on IPV6=y
>  	default y
>  
> +config MPTCP_ROUNDROBIN
> +	tristate "MPTCP Round-Robin"
> +	default n
> +	help
> +	  This is a very simple round-robin scheduler. Probably has bad performance
> +	  but might be interesting for researchers.
> +
>  config MPTCP_KUNIT_TEST
>  	tristate "This builds the MPTCP KUnit tests" if !KUNIT_ALL_TESTS
>  	depends on KUNIT
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index ab72a3950f2b..b78c4eb4947f 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1521,6 +1521,29 @@ static struct mptcp_sched_ops mptcp_sched_default = {
>  	.owner		= THIS_MODULE,
>  };
>  
> +#if IS_ENABLED(CONFIG_MPTCP_ROUNDROBIN)
> +static struct sock *rr_get_subflow(struct mptcp_sock *msk)
> +{
> +	struct mptcp_subflow_context *subflow;
> +	struct sock *ssk;
> +
> +	mptcp_for_each_subflow(msk, subflow) {
> +		ssk = mptcp_subflow_tcp_sock(subflow);
> +		if (ssk != msk->last_snd) {
> +			pr_debug("msk=%p ssk=%p last_snd=%p", msk, ssk, msk->last_snd);
> +			msk->last_snd = ssk;
> +		}

This needs some more logic:
- it must take into account backup vs non-backup subflows
- it should consider whether the given subflow has available write
space/snd_wnd
- it should move from the 'last_snd' subflow to the 'next' one, instead
of always selecting the first subflow other than 'last_snd'. Otherwise,
with 3 non-backup subflows active, we will never pick the last created
one.

There is an important side question, already asked by Christoph: why
round-robin? Perhaps we should instead consider some different
scheduler, known to perform well. @Christoph: which is the 'best'
mptcp.org scheduler out there? blest?

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 9+ messages in thread
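
Paolo's three points amount to: start from last_snd, walk the subflow
list circularly, and return the first non-backup subflow with send space.
Modelled over a plain array in userspace C (kernel types and the real
list walk are elided; field and function names are illustrative):

```c
struct subflow {
	int backup;      /* backup-priority subflow */
	int snd_space;   /* nonzero if write space / snd_wnd is available */
};

/* Pick the next eligible subflow after index 'last', wrapping around,
 * so that with N active subflows each one gets its turn (unlike always
 * taking the first entry that differs from last_snd). Returns -1 if no
 * subflow is currently usable. */
int rr_next(const struct subflow *sf, int n, int last)
{
	int i;

	for (i = 1; i <= n; i++) {
		int idx = (last + i) % n;

		if (sf[idx].backup || !sf[idx].snd_space)
			continue;
		return idx;
	}
	return -1;
}
```

A full implementation would additionally fall back to backup subflows
when no regular subflow is usable; that refinement is omitted here.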

* Re: [PATCH RFC mptcp-next 3/4] mptcp: round-robin packet scheduler support
  2021-09-07 11:16   ` Paolo Abeni
@ 2021-09-09  0:18     ` Mat Martineau
  0 siblings, 0 replies; 9+ messages in thread
From: Mat Martineau @ 2021-09-09  0:18 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Geliang Tang, mptcp, Christoph Paasch, Geliang Tang


On Tue, 7 Sep 2021, Paolo Abeni wrote:

> There is an import side question, already asked by Christoph, why the
> round robin? perhpas we should consider instead some different
> scheduler, known to perform well. @Christoph: which is the 'best'
> mptcp.org scheduler out there? blest?
>

This is the main question I have!

We've also discussed the idea of eBPF-based schedulers, so I've been 
thinking we would have one built-in scheduler and then allow alternatives 
(like round robin) via eBPF.

If there's a good use case for round robin maybe there's a solution here 
similar to what we did with the fullmesh path manager: could there be a 
configuration knob for the existing scheduler that makes it try to *not* 
use the last-used subflow, instead of favoring the last-used subflow?

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-09-09  0:18 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-07 10:41 [PATCH RFC mptcp-next 0/4] round-robin packet scheduler support Geliang Tang
2021-09-07 10:41 ` [PATCH RFC mptcp-next 1/4] mptcp: add a new sysctl scheduler Geliang Tang
2021-09-07 10:59   ` Paolo Abeni
2021-09-07 10:41 ` [PATCH RFC mptcp-next 2/4] mptcp: add struct mptcp_sched_ops Geliang Tang
2021-09-07 11:04   ` Paolo Abeni
2021-09-07 10:41 ` [PATCH RFC mptcp-next 3/4] mptcp: round-robin packet scheduler support Geliang Tang
2021-09-07 11:16   ` Paolo Abeni
2021-09-09  0:18     ` Mat Martineau
2021-09-07 10:41 ` [PATCH RFC mptcp-next 4/4] selftests: mptcp: add round-robin testcase Geliang Tang
