* [PATCH mptcp-next v2 0/6] mptcp: Add userspace PM mode to bypass kernel PM
@ 2021-12-02  0:52 Mat Martineau
  2021-12-02  0:52 ` [PATCH mptcp-next v2 1/6] mptcp: Remove redundant assignments in path manager init Mat Martineau
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Mat Martineau @ 2021-12-02  0:52 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau

One part of supporting userspace path managers is to prevent the
in-kernel PM from acting on userspace-managed MPTCP connections.

These patches:

 * Add a per-MPTCP-socket 'pm_type'. A mix of kernel-managed and
   userspace-managed connections is supported within each namespace.

 * Conditionally decouple incoming ADD_ADDR/RM_ADDR and subflow changes
   from the in-kernel PM. Netlink events are still triggered, and ADD_ADDR
   echo handling is still in kernel code even if path management is
   otherwise handled in userspace.

 * Add a sysctl for setting the per-namespace default for in-kernel vs
   userspace path management of new MPTCP sockets. This is an integer
   value to allow extensibility.

 * Add selftests to confirm that the in-kernel PM is bypassed.

RFC -> v1: Changed sysctl from a bool to an integer, added patch 1
(cleanup) and patch 6 (selftests), fixed ADD_ADDR echo and initial
pm->subflows_allowed settings.

v1 -> v2: Rebased on latest export branch, removed extra kernel-mode
check when receiving ADD_ADDR, and fixed the !CONFIG_SYSCTL build.


Mat Martineau (6):
  mptcp: Remove redundant assignments in path manager init
  mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode
  mptcp: Bypass kernel PM when userspace PM is enabled
  mptcp: Make kernel path manager check for userspace-managed sockets
  mptcp: Add a per-namespace sysctl to set the default path manager type
  selftests: mptcp: Add tests for userspace PM type

 Documentation/networking/mptcp-sysctl.rst     | 18 +++++
 net/mptcp/ctrl.c                              | 21 ++++++
 net/mptcp/pm.c                                | 44 ++++++++----
 net/mptcp/pm_netlink.c                        | 32 ++++-----
 net/mptcp/protocol.h                          | 11 ++-
 .../testing/selftests/net/mptcp/mptcp_join.sh | 70 ++++++++++++++++++-
 6 files changed, 161 insertions(+), 35 deletions(-)


base-commit: 7efd6f5c53fb8e55fb71d5dea9e934de9b9b766b
-- 
2.34.1



* [PATCH mptcp-next v2 1/6] mptcp: Remove redundant assignments in path manager init
  2021-12-02  0:52 [PATCH mptcp-next v2 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
@ 2021-12-02  0:52 ` Mat Martineau
  2021-12-02  0:52 ` [PATCH mptcp-next v2 2/6] mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode Mat Martineau
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Mat Martineau @ 2021-12-02  0:52 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau

A few members of the mptcp_pm_data struct were assigned to hard-coded
values in mptcp_pm_data_reset(), and then immediately changed in
mptcp_pm_nl_data_init().

Instead, flatten all the assignments into mptcp_pm_data_reset().
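
As a rough illustration of the flattened initialization, the following
userspace sketch computes each flag once from the configured limits. The
struct and function names (pm_limits, pm_flags, pm_flags_reset) are
hypothetical stand-ins for the values returned by the
mptcp_pm_get_*_max() helpers; this is a model of the logic, not the
kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-namespace limits that the
 * mptcp_pm_get_*_max() helpers return in the kernel. */
struct pm_limits {
	unsigned int subflows_max;
	unsigned int local_addr_max;
	unsigned int add_addr_signal_max;
	unsigned int add_addr_accept_max;
};

/* Hypothetical subset of the flags kept in struct mptcp_pm_data. */
struct pm_flags {
	bool work_pending;
	bool accept_addr;
	bool accept_subflow;
};

/* Assign every flag exactly once, as the flattened reset does,
 * instead of writing a default and then overwriting it. */
static struct pm_flags pm_flags_reset(const struct pm_limits *lim)
{
	struct pm_flags f;
	bool subflows_allowed;

	assert(lim != NULL);
	subflows_allowed = lim->subflows_max != 0;

	f.work_pending = (lim->local_addr_max != 0 && subflows_allowed) ||
			 lim->add_addr_signal_max != 0;
	f.accept_addr = lim->add_addr_accept_max != 0 && subflows_allowed;
	f.accept_subflow = subflows_allowed;
	return f;
}
```

With all limits at zero every flag ends up false; a nonzero
add_addr_signal_max alone is enough to set work_pending.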

v2: Resolve conflicts due to rename of mptcp_pm_data_reset()

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/pm.c         | 32 ++++++++++++++++++--------------
 net/mptcp/pm_netlink.c | 12 ------------
 net/mptcp/protocol.h   |  1 -
 3 files changed, 18 insertions(+), 27 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 761995a34124..4b79b73aee3c 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -364,20 +364,24 @@ void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk)
 
 void mptcp_pm_data_reset(struct mptcp_sock *msk)
 {
-	msk->pm.add_addr_signaled = 0;
-	msk->pm.add_addr_accepted = 0;
-	msk->pm.local_addr_used = 0;
-	msk->pm.subflows = 0;
-	msk->pm.rm_list_tx.nr = 0;
-	msk->pm.rm_list_rx.nr = 0;
-	WRITE_ONCE(msk->pm.work_pending, false);
-	WRITE_ONCE(msk->pm.addr_signal, 0);
-	WRITE_ONCE(msk->pm.accept_addr, false);
-	WRITE_ONCE(msk->pm.accept_subflow, false);
-	WRITE_ONCE(msk->pm.remote_deny_join_id0, false);
-	msk->pm.status = 0;
-
-	mptcp_pm_nl_data_init(msk);
+	bool subflows_allowed = !!mptcp_pm_get_subflows_max(msk);
+	struct mptcp_pm_data *pm = &msk->pm;
+
+	pm->add_addr_signaled = 0;
+	pm->add_addr_accepted = 0;
+	pm->local_addr_used = 0;
+	pm->subflows = 0;
+	pm->rm_list_tx.nr = 0;
+	pm->rm_list_rx.nr = 0;
+	WRITE_ONCE(pm->work_pending,
+		   (!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
+		   !!mptcp_pm_get_add_addr_signal_max(msk));
+	WRITE_ONCE(pm->addr_signal, 0);
+	WRITE_ONCE(pm->accept_addr,
+		   !!mptcp_pm_get_add_addr_accept_max(msk) && subflows_allowed);
+	WRITE_ONCE(pm->accept_subflow, subflows_allowed);
+	WRITE_ONCE(pm->remote_deny_join_id0, false);
+	pm->status = 0;
 }
 
 void mptcp_pm_data_init(struct mptcp_sock *msk)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 3186d33b5208..a74eb0444cd2 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -957,18 +957,6 @@ int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct sock_common *skc)
 	return ret;
 }
 
-void mptcp_pm_nl_data_init(struct mptcp_sock *msk)
-{
-	struct mptcp_pm_data *pm = &msk->pm;
-	bool subflows;
-
-	subflows = !!mptcp_pm_get_subflows_max(msk);
-	WRITE_ONCE(pm->work_pending, (!!mptcp_pm_get_local_addr_max(msk) && subflows) ||
-		   !!mptcp_pm_get_add_addr_signal_max(msk));
-	WRITE_ONCE(pm->accept_addr, !!mptcp_pm_get_add_addr_accept_max(msk) && subflows);
-	WRITE_ONCE(pm->accept_subflow, subflows);
-}
-
 #define MPTCP_PM_CMD_GRP_OFFSET       0
 #define MPTCP_PM_EV_GRP_OFFSET        1
 
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 47d24478763c..50175e4cbcb8 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -821,7 +821,6 @@ bool mptcp_pm_rm_addr_signal(struct mptcp_sock *msk, unsigned int remaining,
 int mptcp_pm_get_local_id(struct mptcp_sock *msk, struct sock_common *skc);
 
 void __init mptcp_pm_nl_init(void);
-void mptcp_pm_nl_data_init(struct mptcp_sock *msk);
 void mptcp_pm_nl_work(struct mptcp_sock *msk);
 void mptcp_pm_nl_rm_subflow_received(struct mptcp_sock *msk,
 				     const struct mptcp_rm_list *rm_list);
-- 
2.34.1



* [PATCH mptcp-next v2 2/6] mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode
  2021-12-02  0:52 [PATCH mptcp-next v2 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
  2021-12-02  0:52 ` [PATCH mptcp-next v2 1/6] mptcp: Remove redundant assignments in path manager init Mat Martineau
@ 2021-12-02  0:52 ` Mat Martineau
  2021-12-02  0:52 ` [PATCH mptcp-next v2 3/6] mptcp: Bypass kernel PM when userspace PM is enabled Mat Martineau
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Mat Martineau @ 2021-12-02  0:52 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau

When adding support for netlink path management commands, the kernel
needs to know whether paths are being controlled by the in-kernel path
manager or a userspace PM.
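
The enum added below ends with two sentinel entries so that the largest
valid value tracks the list automatically. A minimal userspace sketch of
that pattern follows; the PM_TYPE_* names and pm_type_is_valid() are
hypothetical and only mirror the shape of the kernel enum.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the enum layout used in the patch: real
 * values first, then a count entry and a derived maximum. */
enum pm_type {
	PM_TYPE_KERNEL = 0,
	PM_TYPE_USERSPACE,

	PM_TYPE_NR,                   /* number of real entries */
	PM_TYPE_MAX = PM_TYPE_NR - 1, /* largest valid value */
};

/* Adding a new type above PM_TYPE_NR moves PM_TYPE_MAX with it. */
static_assert(PM_TYPE_MAX == PM_TYPE_USERSPACE,
	      "maximum must equal the last real entry");

/* Hypothetical helper: true when 'value' names a defined PM type. */
static bool pm_type_is_valid(int value)
{
	return value >= PM_TYPE_KERNEL && value <= PM_TYPE_MAX;
}
```

The same maximum can then serve as an upper bound wherever the value is
accepted from outside, without hard-coding the number of types.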

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/pm.c       | 4 ++++
 net/mptcp/protocol.h | 9 +++++++++
 2 files changed, 13 insertions(+)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 4b79b73aee3c..053afb058440 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -373,6 +373,10 @@ void mptcp_pm_data_reset(struct mptcp_sock *msk)
 	pm->subflows = 0;
 	pm->rm_list_tx.nr = 0;
 	pm->rm_list_rx.nr = 0;
+	WRITE_ONCE(pm->pm_type, MPTCP_PM_TYPE_KERNEL);
+	/* pm->work_pending must be only be set to 'true' when
+	 * pm->pm_type is set to MPTCP_PM_TYPE_KERNEL
+	 */
 	WRITE_ONCE(pm->work_pending,
 		   (!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
 		   !!mptcp_pm_get_add_addr_signal_max(msk));
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 50175e4cbcb8..478abe18b9e9 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -177,6 +177,14 @@ enum mptcp_pm_status {
 	MPTCP_PM_SUBFLOW_ESTABLISHED,
 };
 
+enum mptcp_pm_type {
+	MPTCP_PM_TYPE_KERNEL = 0,
+	MPTCP_PM_TYPE_USERSPACE,
+
+	__MPTCP_PM_TYPE_NR,
+	__MPTCP_PM_TYPE_MAX = __MPTCP_PM_TYPE_NR - 1,
+};
+
 enum mptcp_addr_signal_status {
 	MPTCP_ADD_ADDR_SIGNAL,
 	MPTCP_ADD_ADDR_ECHO,
@@ -199,6 +207,7 @@ struct mptcp_pm_data {
 	u8		add_addr_signaled;
 	u8		add_addr_accepted;
 	u8		local_addr_used;
+	u8		pm_type;
 	u8		subflows;
 	u8		status;
 	struct mptcp_rm_list rm_list_tx;
-- 
2.34.1



* [PATCH mptcp-next v2 3/6] mptcp: Bypass kernel PM when userspace PM is enabled
  2021-12-02  0:52 [PATCH mptcp-next v2 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
  2021-12-02  0:52 ` [PATCH mptcp-next v2 1/6] mptcp: Remove redundant assignments in path manager init Mat Martineau
  2021-12-02  0:52 ` [PATCH mptcp-next v2 2/6] mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode Mat Martineau
@ 2021-12-02  0:52 ` Mat Martineau
  2021-12-06 17:48   ` Paolo Abeni
  2021-12-02  0:52 ` [PATCH mptcp-next v2 4/6] mptcp: Make kernel path manager check for userspace-managed sockets Mat Martineau
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Mat Martineau @ 2021-12-02  0:52 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau

When an MPTCP connection is managed by a userspace PM, bypass the kernel
PM for incoming advertisements and subflow events. Netlink events are
still sent to userspace.
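
The decision made for an incoming ADD_ADDR can be modelled as below.
This is a simplified sketch under stated assumptions: the enum and
function names are hypothetical, and the real function also takes the PM
lock and only stores the remote address if the work was scheduled.

```c
#include <stdbool.h>

/* Hypothetical mirror of the PM types introduced by this series. */
enum pm_type {
	PM_TYPE_KERNEL = 0,
	PM_TYPE_USERSPACE,
};

/* Hypothetical labels for the two branches taken on ADD_ADDR. */
enum add_addr_action {
	ADD_ADDR_ECHO_ONLY,     /* echo and ack, no kernel PM work */
	ADD_ADDR_SCHEDULE_WORK, /* hand the address to the kernel PM */
};

/* The kernel PM only acts when it accepts addresses AND the socket
 * is kernel-managed; otherwise the address is just echoed. */
static enum add_addr_action add_addr_received_action(bool accept_addr,
						     enum pm_type type)
{
	if (!accept_addr || type != PM_TYPE_KERNEL)
		return ADD_ADDR_ECHO_ONLY;

	return ADD_ADDR_SCHEDULE_WORK;
}
```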

v2: Remove unneeded check in mptcp_pm_add_addr_received() (Kishen Maloor)

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/pm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 053afb058440..451948b11030 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -189,7 +189,8 @@ void mptcp_pm_add_addr_received(struct mptcp_sock *msk,
 
 	spin_lock_bh(&pm->lock);
 
-	if (!READ_ONCE(pm->accept_addr)) {
+	if (!READ_ONCE(pm->accept_addr) ||
+	    READ_ONCE(pm->pm_type) != MPTCP_PM_TYPE_KERNEL) {
 		mptcp_pm_announce_addr(msk, addr, true);
 		mptcp_pm_add_addr_send_ack(msk);
 	} else if (mptcp_pm_schedule_work(msk, MPTCP_PM_ADD_ADDR_RECEIVED)) {
-- 
2.34.1



* [PATCH mptcp-next v2 4/6] mptcp: Make kernel path manager check for userspace-managed sockets
  2021-12-02  0:52 [PATCH mptcp-next v2 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
                   ` (2 preceding siblings ...)
  2021-12-02  0:52 ` [PATCH mptcp-next v2 3/6] mptcp: Bypass kernel PM when userspace PM is enabled Mat Martineau
@ 2021-12-02  0:52 ` Mat Martineau
  2021-12-06 17:52   ` Paolo Abeni
  2021-12-02  0:52 ` [PATCH mptcp-next v2 5/6] mptcp: Add a per-namespace sysctl to set the default path manager type Mat Martineau
  2021-12-02  0:52 ` [PATCH mptcp-next v2 6/6] selftests: mptcp: Add tests for userspace PM type Mat Martineau
  5 siblings, 1 reply; 16+ messages in thread
From: Mat Martineau @ 2021-12-02  0:52 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau

Userspace-managed sockets should not have their subflows or
advertisements changed by the kernel path manager.
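
The pattern used in each netlink handler is a per-socket skip inside the
loop over all MPTCP sockets in the namespace. A minimal userspace model,
with a hypothetical struct conn and kernel_pm_apply() standing in for
the token iteration and the per-socket work:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the PM types introduced by this series. */
enum pm_type {
	PM_TYPE_KERNEL = 0,
	PM_TYPE_USERSPACE,
};

/* Hypothetical, reduced view of one MPTCP connection. */
struct conn {
	enum pm_type pm_type;
	bool fully_established;
	int kernel_actions; /* counts changes made by the kernel PM */
};

/* Apply a kernel PM change to every eligible connection and return
 * how many were touched. Userspace-managed connections are skipped,
 * like the 'goto next' checks added in the patch. */
static size_t kernel_pm_apply(struct conn *conns, size_t n)
{
	size_t touched = 0;

	assert(conns != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!conns[i].fully_established ||
		    conns[i].pm_type != PM_TYPE_KERNEL)
			continue;

		conns[i].kernel_actions++;
		touched++;
	}

	return touched;
}
```

Because the check is per connection, kernel-managed and
userspace-managed sockets can coexist in the same namespace.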

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/pm_netlink.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index a74eb0444cd2..2cd491229d23 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -1122,7 +1122,8 @@ static int mptcp_nl_add_subflow_or_signal_addr(struct net *net)
 	while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) {
 		struct sock *sk = (struct sock *)msk;
 
-		if (!READ_ONCE(msk->fully_established))
+		if (!READ_ONCE(msk->fully_established) ||
+		    (READ_ONCE(msk->pm.pm_type) != MPTCP_PM_TYPE_KERNEL))
 			goto next;
 
 		lock_sock(sk);
@@ -1260,6 +1261,9 @@ static int mptcp_nl_remove_subflow_and_signal_addr(struct net *net,
 		struct sock *sk = (struct sock *)msk;
 		bool remove_subflow;
 
+		if (READ_ONCE(msk->pm.pm_type) != MPTCP_PM_TYPE_KERNEL)
+			goto next;
+
 		if (list_empty(&msk->conn_list)) {
 			mptcp_pm_remove_anno_addr(msk, addr, false);
 			goto next;
@@ -1301,7 +1305,8 @@ static int mptcp_nl_remove_id_zero_address(struct net *net,
 		struct sock *sk = (struct sock *)msk;
 		struct mptcp_addr_info msk_local;
 
-		if (list_empty(&msk->conn_list))
+		if (list_empty(&msk->conn_list) ||
+		    (READ_ONCE(msk->pm.pm_type) != MPTCP_PM_TYPE_KERNEL))
 			goto next;
 
 		local_address((struct sock_common *)msk, &msk_local);
@@ -1410,9 +1415,11 @@ static void mptcp_nl_remove_addrs_list(struct net *net,
 	while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) {
 		struct sock *sk = (struct sock *)msk;
 
-		lock_sock(sk);
-		mptcp_pm_remove_addrs_and_subflows(msk, rm_list);
-		release_sock(sk);
+		if (READ_ONCE(msk->pm.pm_type) == MPTCP_PM_TYPE_KERNEL) {
+			lock_sock(sk);
+			mptcp_pm_remove_addrs_and_subflows(msk, rm_list);
+			release_sock(sk);
+		}
 
 		sock_put(sk);
 		cond_resched();
@@ -1674,7 +1681,8 @@ static int mptcp_nl_addr_backup(struct net *net,
 	while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) {
 		struct sock *sk = (struct sock *)msk;
 
-		if (list_empty(&msk->conn_list))
+		if (list_empty(&msk->conn_list) ||
+		    (READ_ONCE(msk->pm.pm_type) != MPTCP_PM_TYPE_KERNEL))
 			goto next;
 
 		lock_sock(sk);
-- 
2.34.1



* [PATCH mptcp-next v2 5/6] mptcp: Add a per-namespace sysctl to set the default path manager type
  2021-12-02  0:52 [PATCH mptcp-next v2 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
                   ` (3 preceding siblings ...)
  2021-12-02  0:52 ` [PATCH mptcp-next v2 4/6] mptcp: Make kernel path manager check for userspace-managed sockets Mat Martineau
@ 2021-12-02  0:52 ` Mat Martineau
  2021-12-06  7:44   ` Geliang Tang
  2021-12-06  7:56   ` Geliang Tang
  2021-12-02  0:52 ` [PATCH mptcp-next v2 6/6] selftests: mptcp: Add tests for userspace PM type Mat Martineau
  5 siblings, 2 replies; 16+ messages in thread
From: Mat Martineau @ 2021-12-02  0:52 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau

The new net.mptcp.pm_type sysctl determines which path manager will be
used by each newly-created MPTCP socket.
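
A small userspace model of that behaviour: the namespace holds a bounded
default, and each socket takes a snapshot of it when its PM data is
reset. All names here are hypothetical, and the -EINVAL return for an
out-of-range write is an assumption of this sketch rather than a
statement about what proc_dou8vec_minmax() returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mirror of the PM types and their maximum. */
enum pm_type {
	PM_TYPE_KERNEL = 0,
	PM_TYPE_USERSPACE,

	PM_TYPE_NR,
	PM_TYPE_MAX = PM_TYPE_NR - 1,
};

/* Hypothetical per-namespace settings and per-connection state. */
struct netns_cfg {
	unsigned char pm_type;
};

struct conn {
	unsigned char pm_type;
};

/* Bounded write, in the spirit of a sysctl with min/max limits.
 * Returns 0 on success, -EINVAL (assumed) when out of range. */
static int netns_set_pm_type(struct netns_cfg *ns, int value)
{
	assert(ns != NULL);

	if (value < PM_TYPE_KERNEL || value > PM_TYPE_MAX)
		return -EINVAL;

	ns->pm_type = (unsigned char)value;
	return 0;
}

/* A new connection copies the namespace default once. */
static void conn_reset(struct conn *c, const struct netns_cfg *ns)
{
	assert(c != NULL && ns != NULL);
	c->pm_type = ns->pm_type;
}
```

In this model, changing the namespace default afterwards does not alter
connections that already took their snapshot.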

v2: Handle builds without CONFIG_SYSCTL

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 Documentation/networking/mptcp-sysctl.rst | 18 ++++++++++++++++++
 net/mptcp/ctrl.c                          | 21 +++++++++++++++++++++
 net/mptcp/pm.c                            | 13 +++++++++----
 net/mptcp/protocol.h                      |  1 +
 4 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
index b0d4da71e68e..e263dfcc4b40 100644
--- a/Documentation/networking/mptcp-sysctl.rst
+++ b/Documentation/networking/mptcp-sysctl.rst
@@ -46,6 +46,24 @@ allow_join_initial_addr_port - BOOLEAN
 
 	Default: 1
 
+pm_type - INTEGER
+
+	Set the default path manager type to use for each new MPTCP
+	socket. In-kernel path management will control subflow
+	connections and address advertisements according to
+	per-namespace values configured over the MPTCP netlink
+	API. Userspace path management puts per-MPTCP-connection subflow
+	connection decisions and address advertisements under control of
+	a privileged userspace program, at the cost of more netlink
+	traffic to propagate all of the related events and commands.
+
+	This is a per-namespace sysctl.
+
+	* 0 - In-kernel path manager
+	* 1 - Userspace path manager
+
+	Default: 0
+
 stale_loss_cnt - INTEGER
 	The number of MPTCP-level retransmission intervals with no traffic and
 	pending outstanding data on a given subflow required to declare it stale.
diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
index 8b235468c88f..ae20b7d92e28 100644
--- a/net/mptcp/ctrl.c
+++ b/net/mptcp/ctrl.c
@@ -16,6 +16,11 @@
 #define MPTCP_SYSCTL_PATH "net/mptcp"
 
 static int mptcp_pernet_id;
+
+#ifdef CONFIG_SYSCTL
+static int mptcp_pm_type_max = __MPTCP_PM_TYPE_MAX;
+#endif
+
 struct mptcp_pernet {
 #ifdef CONFIG_SYSCTL
 	struct ctl_table_header *ctl_table_hdr;
@@ -26,6 +31,7 @@ struct mptcp_pernet {
 	u8 mptcp_enabled;
 	u8 checksum_enabled;
 	u8 allow_join_initial_addr_port;
+	u8 pm_type;
 };
 
 static struct mptcp_pernet *mptcp_get_pernet(const struct net *net)
@@ -58,6 +64,11 @@ unsigned int mptcp_stale_loss_cnt(const struct net *net)
 	return mptcp_get_pernet(net)->stale_loss_cnt;
 }
 
+int mptcp_get_pm_type(const struct net *net)
+{
+	return mptcp_get_pernet(net)->pm_type;
+}
+
 static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
 {
 	pernet->mptcp_enabled = 1;
@@ -65,6 +76,7 @@ static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
 	pernet->checksum_enabled = 0;
 	pernet->allow_join_initial_addr_port = 1;
 	pernet->stale_loss_cnt = 4;
+	pernet->pm_type = MPTCP_PM_TYPE_KERNEL;
 }
 
 #ifdef CONFIG_SYSCTL
@@ -108,6 +120,14 @@ static struct ctl_table mptcp_sysctl_table[] = {
 		.mode = 0644,
 		.proc_handler = proc_douintvec_minmax,
 	},
+	{
+		.procname = "pm_type",
+		.maxlen = sizeof(u8),
+		.mode = 0644,
+		.proc_handler = proc_dou8vec_minmax,
+		.extra1       = SYSCTL_ZERO,
+		.extra2       = &mptcp_pm_type_max
+	},
 	{}
 };
 
@@ -128,6 +148,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
 	table[2].data = &pernet->checksum_enabled;
 	table[3].data = &pernet->allow_join_initial_addr_port;
 	table[4].data = &pernet->stale_loss_cnt;
+	table[5].data = &pernet->pm_type;
 
 	hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
 	if (!hdr)
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 451948b11030..f1fc08d89c20 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -365,8 +365,12 @@ void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk)
 
 void mptcp_pm_data_reset(struct mptcp_sock *msk)
 {
-	bool subflows_allowed = !!mptcp_pm_get_subflows_max(msk);
+	u8 pm_type = mptcp_get_pm_type(sock_net((struct sock *)msk));
 	struct mptcp_pm_data *pm = &msk->pm;
+	bool subflows_allowed;
+
+	subflows_allowed = !!mptcp_pm_get_subflows_max(msk) &&
+		pm_type == MPTCP_PM_TYPE_KERNEL;
 
 	pm->add_addr_signaled = 0;
 	pm->add_addr_accepted = 0;
@@ -374,13 +378,14 @@ void mptcp_pm_data_reset(struct mptcp_sock *msk)
 	pm->subflows = 0;
 	pm->rm_list_tx.nr = 0;
 	pm->rm_list_rx.nr = 0;
-	WRITE_ONCE(pm->pm_type, MPTCP_PM_TYPE_KERNEL);
+	WRITE_ONCE(pm->pm_type, pm_type);
 	/* pm->work_pending must be only be set to 'true' when
 	 * pm->pm_type is set to MPTCP_PM_TYPE_KERNEL
 	 */
 	WRITE_ONCE(pm->work_pending,
-		   (!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
-		   !!mptcp_pm_get_add_addr_signal_max(msk));
+		   pm_type == MPTCP_PM_TYPE_KERNEL &&
+		   ((!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
+		    !!mptcp_pm_get_add_addr_signal_max(msk)));
 	WRITE_ONCE(pm->addr_signal, 0);
 	WRITE_ONCE(pm->accept_addr,
 		   !!mptcp_pm_get_add_addr_accept_max(msk) && subflows_allowed);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 478abe18b9e9..e4c2b39b029c 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -579,6 +579,7 @@ unsigned int mptcp_get_add_addr_timeout(const struct net *net);
 int mptcp_is_checksum_enabled(const struct net *net);
 int mptcp_allow_join_id0(const struct net *net);
 unsigned int mptcp_stale_loss_cnt(const struct net *net);
+int mptcp_get_pm_type(const struct net *net);
 void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
 				     struct mptcp_options_received *mp_opt);
 bool __mptcp_retransmit_pending_data(struct sock *sk);
-- 
2.34.1



* [PATCH mptcp-next v2 6/6] selftests: mptcp: Add tests for userspace PM type
  2021-12-02  0:52 [PATCH mptcp-next v2 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
                   ` (4 preceding siblings ...)
  2021-12-02  0:52 ` [PATCH mptcp-next v2 5/6] mptcp: Add a per-namespace sysctl to set the default path manager type Mat Martineau
@ 2021-12-02  0:52 ` Mat Martineau
  2021-12-06  8:37   ` Geliang Tang
  5 siblings, 1 reply; 16+ messages in thread
From: Mat Martineau @ 2021-12-02  0:52 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau

These tests ensure that the in-kernel path manager is bypassed when
the userspace path manager is configured. Kernel code is still
responsible for ADD_ADDR echo, so also make sure that's working.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 .../testing/selftests/net/mptcp/mptcp_join.sh | 70 ++++++++++++++++++-
 1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 2684ef9c0d42..7df9ddb307a8 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -50,6 +50,7 @@ init()
 		ip netns add $netns || exit $ksft_skip
 		ip -net $netns link set lo up
 		ip netns exec $netns sysctl -q net.mptcp.enabled=1
+		ip netns exec $netns sysctl -q net.mptcp.pm_type=0
 		ip netns exec $netns sysctl -q net.ipv4.conf.all.rp_filter=0
 		ip netns exec $netns sysctl -q net.ipv4.conf.default.rp_filter=0
 		if [ $checksum -eq 1 ]; then
@@ -1837,6 +1838,68 @@ fullmesh_tests()
 	chk_add_nr 1 1
 }
 
+userspace_tests()
+{
+	# userspace pm type prevents add_addr
+	reset
+	ip netns exec $ns1 sysctl -q net.mptcp.pm_type=1
+	ip netns exec $ns1 ./pm_nl_ctl limits 0 2
+	ip netns exec $ns2 ./pm_nl_ctl limits 0 2
+	ip netns exec $ns1 ./pm_nl_ctl add 10.0.2.1 flags signal
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_join_nr "userspace pm type prevents add_addr" 0 0 0
+	chk_add_nr 0 0
+
+	# userspace pm type echoes add_addr
+	reset
+	ip netns exec $ns2 sysctl -q net.mptcp.pm_type=1
+	ip netns exec $ns1 ./pm_nl_ctl limits 0 2
+	ip netns exec $ns2 ./pm_nl_ctl limits 0 2
+	ip netns exec $ns1 ./pm_nl_ctl add 10.0.2.1 flags signal
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_join_nr "userspace pm type echoes add_addr" 0 0 0
+	chk_add_nr 1 1
+
+	# userspace pm type rejects join
+	reset
+	ip netns exec $ns1 sysctl -q net.mptcp.pm_type=1
+	ip netns exec $ns1 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns2 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_join_nr "userspace pm type rejects join" 1 1 0
+
+	# userspace pm type does not send join
+	reset
+	ip netns exec $ns2 sysctl -q net.mptcp.pm_type=1
+	ip netns exec $ns1 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns2 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_join_nr "userspace pm type does not send join" 0 0 0
+
+	# userspace pm type prevents mp_prio
+	reset
+	ip netns exec $ns1 sysctl -q net.mptcp.pm_type=1
+	ip netns exec $ns1 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns2 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
+	run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow backup
+	chk_join_nr "userspace pm type prevents mp_prio" 1 1 0
+	chk_prio_nr 0 0
+
+	# userspace pm type prevents rm_addr
+	reset
+	ip netns exec $ns1 sysctl -q net.mptcp.pm_type=1
+	ip netns exec $ns2 sysctl -q net.mptcp.pm_type=1
+	ip netns exec $ns1 ./pm_nl_ctl limits 0 1
+	ip netns exec $ns2 ./pm_nl_ctl limits 0 1
+	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
+	run_tests $ns1 $ns2 10.0.1.1 0 0 -1 slow
+	chk_join_nr "userspace pm type prevents rm_addr" 0 0 0
+	chk_rm_nr 0 0
+}
+
 all_tests()
 {
 	subflows_tests
@@ -1853,6 +1916,7 @@ all_tests()
 	checksum_tests
 	deny_join_id0_tests
 	fullmesh_tests
+	userspace_tests
 }
 
 usage()
@@ -1872,6 +1936,7 @@ usage()
 	echo "  -S checksum_tests"
 	echo "  -d deny_join_id0_tests"
 	echo "  -m fullmesh_tests"
+	echo "  -u userspace_tests"
 	echo "  -c capture pcap files"
 	echo "  -C enable data checksum"
 	echo "  -h help"
@@ -1907,7 +1972,7 @@ if [ $do_all_tests -eq 1 ]; then
 	exit $ret
 fi
 
-while getopts 'fsltra64bpkdmchCS' opt; do
+while getopts 'fsltra64bpkdmuchCS' opt; do
 	case $opt in
 		f)
 			subflows_tests
@@ -1951,6 +2016,9 @@ while getopts 'fsltra64bpkdmchCS' opt; do
 		m)
 			fullmesh_tests
 			;;
+		u)
+			userspace_tests
+			;;
 		c)
 			;;
 		C)
-- 
2.34.1



* Re: [PATCH mptcp-next v2 5/6] mptcp: Add a per-namespace sysctl to set the default path manager type
  2021-12-02  0:52 ` [PATCH mptcp-next v2 5/6] mptcp: Add a per-namespace sysctl to set the default path manager type Mat Martineau
@ 2021-12-06  7:44   ` Geliang Tang
  2021-12-06  7:56   ` Geliang Tang
  1 sibling, 0 replies; 16+ messages in thread
From: Geliang Tang @ 2021-12-06  7:44 UTC (permalink / raw)
  To: Mat Martineau; +Cc: MPTCP Upstream

Hi Mat,

Mat Martineau <mathew.j.martineau@linux.intel.com> wrote on Thu, Dec 2, 2021 at 08:52:
>
> The new net.mptcp.pm_type sysctl determines which path manager will be
> used by each newly-created MPTCP socket.
>
> v2: Handle builds without CONFIG_SYSCTL
>
> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> ---
>  Documentation/networking/mptcp-sysctl.rst | 18 ++++++++++++++++++
>  net/mptcp/ctrl.c                          | 21 +++++++++++++++++++++
>  net/mptcp/pm.c                            | 13 +++++++++----
>  net/mptcp/protocol.h                      |  1 +
>  4 files changed, 49 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
> index b0d4da71e68e..e263dfcc4b40 100644
> --- a/Documentation/networking/mptcp-sysctl.rst
> +++ b/Documentation/networking/mptcp-sysctl.rst
> @@ -46,6 +46,24 @@ allow_join_initial_addr_port - BOOLEAN
>
>         Default: 1
>
> +pm_type - INTEGER
> +
> +       Set the default path manager type to use for each new MPTCP
> +       socket. In-kernel path management will control subflow
> +       connections and address advertisements according to
> +       per-namespace values configured over the MPTCP netlink
> +       API. Userspace path management puts per-MPTCP-connection subflow
> +       connection decisions and address advertisements under control of
> +       a privileged userspace program, at the cost of more netlink
> +       traffic to propagate all of the related events and commands.
> +
> +       This is a per-namespace sysctl.
> +
> +       * 0 - In-kernel path manager
> +       * 1 - Userspace path manager
> +
> +       Default: 0
> +
>  stale_loss_cnt - INTEGER
>         The number of MPTCP-level retransmission intervals with no traffic and
>         pending outstanding data on a given subflow required to declare it stale.
> diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
> index 8b235468c88f..ae20b7d92e28 100644
> --- a/net/mptcp/ctrl.c
> +++ b/net/mptcp/ctrl.c
> @@ -16,6 +16,11 @@
>  #define MPTCP_SYSCTL_PATH "net/mptcp"
>
>  static int mptcp_pernet_id;
> +
> +#ifdef CONFIG_SYSCTL
> +static int mptcp_pm_type_max = __MPTCP_PM_TYPE_MAX;
> +#endif
> +
>  struct mptcp_pernet {
>  #ifdef CONFIG_SYSCTL
>         struct ctl_table_header *ctl_table_hdr;
> @@ -26,6 +31,7 @@ struct mptcp_pernet {
>         u8 mptcp_enabled;
>         u8 checksum_enabled;
>         u8 allow_join_initial_addr_port;
> +       u8 pm_type;
>  };
>
>  static struct mptcp_pernet *mptcp_get_pernet(const struct net *net)
> @@ -58,6 +64,11 @@ unsigned int mptcp_stale_loss_cnt(const struct net *net)
>         return mptcp_get_pernet(net)->stale_loss_cnt;
>  }
>
> +int mptcp_get_pm_type(const struct net *net)
> +{
> +       return mptcp_get_pernet(net)->pm_type;
> +}
> +
>  static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
>  {
>         pernet->mptcp_enabled = 1;
> @@ -65,6 +76,7 @@ static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
>         pernet->checksum_enabled = 0;
>         pernet->allow_join_initial_addr_port = 1;
>         pernet->stale_loss_cnt = 4;
> +       pernet->pm_type = MPTCP_PM_TYPE_KERNEL;
>  }
>
>  #ifdef CONFIG_SYSCTL
> @@ -108,6 +120,14 @@ static struct ctl_table mptcp_sysctl_table[] = {
>                 .mode = 0644,
>                 .proc_handler = proc_douintvec_minmax,
>         },
> +       {
> +               .procname = "pm_type",
> +               .maxlen = sizeof(u8),
> +               .mode = 0644,
> +               .proc_handler = proc_dou8vec_minmax,
> +               .extra1       = SYSCTL_ZERO,
> +               .extra2       = &mptcp_pm_type_max
> +       },
>         {}
>  };
>
> @@ -128,6 +148,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
>         table[2].data = &pernet->checksum_enabled;
>         table[3].data = &pernet->allow_join_initial_addr_port;
>         table[4].data = &pernet->stale_loss_cnt;
> +       table[5].data = &pernet->pm_type;
>
>         hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
>         if (!hdr)
> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> index 451948b11030..f1fc08d89c20 100644
> --- a/net/mptcp/pm.c
> +++ b/net/mptcp/pm.c
> @@ -365,8 +365,12 @@ void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk)
>
>  void mptcp_pm_data_reset(struct mptcp_sock *msk)
>  {
> -       bool subflows_allowed = !!mptcp_pm_get_subflows_max(msk);
> +       u8 pm_type = mptcp_get_pm_type(sock_net((struct sock *)msk));
>         struct mptcp_pm_data *pm = &msk->pm;
> +       bool subflows_allowed;
> +
> +       subflows_allowed = !!mptcp_pm_get_subflows_max(msk) &&
> +               pm_type == MPTCP_PM_TYPE_KERNEL;
>
>         pm->add_addr_signaled = 0;
>         pm->add_addr_accepted = 0;
> @@ -374,13 +378,14 @@ void mptcp_pm_data_reset(struct mptcp_sock *msk)
>         pm->subflows = 0;
>         pm->rm_list_tx.nr = 0;
>         pm->rm_list_rx.nr = 0;
> -       WRITE_ONCE(pm->pm_type, MPTCP_PM_TYPE_KERNEL);
> +       WRITE_ONCE(pm->pm_type, pm_type);
>         /* pm->work_pending must be only be set to 'true' when
>          * pm->pm_type is set to MPTCP_PM_TYPE_KERNEL
>          */
>         WRITE_ONCE(pm->work_pending,
> -                  (!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
> -                  !!mptcp_pm_get_add_addr_signal_max(msk));
> +                  pm_type == MPTCP_PM_TYPE_KERNEL &&
> +                  ((!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
> +                   !!mptcp_pm_get_add_addr_signal_max(msk)));


+                  !!mptcp_pm_get_local_addr_max(msk) && subflows_allowed ||
+                  !!mptcp_pm_get_add_addr_signal_max(msk) && pm_type == MPTCP_PM_TYPE_KERNEL);
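
Assuming the unquoted lines above are meant as an alternative to the
quoted work_pending expression, the two forms can be compared with an
exhaustive truth table. The sketch below is a userspace model with
hypothetical function names; it relies on subflows_allowed already
including the pm_type check, as in this patch, and on && binding tighter
than || in C.

```c
#include <stdbool.h>

/* Expression as written in the patch. */
static bool work_pending_patch(bool kernel, bool local_max,
			       bool signal_max, bool subflows_max)
{
	bool subflows_allowed = subflows_max && kernel;

	return kernel && ((local_max && subflows_allowed) || signal_max);
}

/* Alternative form, with the implicit precedence made explicit. */
static bool work_pending_alt(bool kernel, bool local_max,
			     bool signal_max, bool subflows_max)
{
	bool subflows_allowed = subflows_max && kernel;

	return (local_max && subflows_allowed) || (signal_max && kernel);
}

/* Count the input combinations on which the two forms disagree. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int bits = 0; bits < 16; bits++) {
		bool kernel = bits & 1;
		bool local_max = bits & 2;
		bool signal_max = bits & 4;
		bool subflows_max = bits & 8;

		if (work_pending_patch(kernel, local_max, signal_max,
				       subflows_max) !=
		    work_pending_alt(kernel, local_max, signal_max,
				     subflows_max))
			mismatches++;
	}

	return mismatches;
}
```

Under these assumptions the two forms agree on all 16 input
combinations, so the difference would be one of style rather than
behaviour.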


>         WRITE_ONCE(pm->addr_signal, 0);
>         WRITE_ONCE(pm->accept_addr,
>                    !!mptcp_pm_get_add_addr_accept_max(msk) && subflows_allowed);
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 478abe18b9e9..e4c2b39b029c 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -579,6 +579,7 @@ unsigned int mptcp_get_add_addr_timeout(const struct net *net);
>  int mptcp_is_checksum_enabled(const struct net *net);
>  int mptcp_allow_join_id0(const struct net *net);
>  unsigned int mptcp_stale_loss_cnt(const struct net *net);
> +int mptcp_get_pm_type(const struct net *net);
>  void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
>                                      struct mptcp_options_received *mp_opt);
>  bool __mptcp_retransmit_pending_data(struct sock *sk);
> --
> 2.34.1
>
>


* Re: [PATCH mptcp-next v2 5/6] mptcp: Add a per-namespace sysctl to set the default path manager type
  2021-12-02  0:52 ` [PATCH mptcp-next v2 5/6] mptcp: Add a per-namespace sysctl to set the default path manager type Mat Martineau
  2021-12-06  7:44   ` Geliang Tang
@ 2021-12-06  7:56   ` Geliang Tang
  2021-12-06 17:47     ` Paolo Abeni
  1 sibling, 1 reply; 16+ messages in thread
From: Geliang Tang @ 2021-12-06  7:56 UTC (permalink / raw)
  To: Mat Martineau; +Cc: MPTCP Upstream

Hi Mat,

Mat Martineau <mathew.j.martineau@linux.intel.com> wrote on Thu, Dec 2, 2021 at 08:52:
>
> The new net.mptcp.pm_type sysctl determines which path manager will be
> used by each newly-created MPTCP socket.
>
> v2: Handle builds without CONFIG_SYSCTL
>
> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> ---
>  Documentation/networking/mptcp-sysctl.rst | 18 ++++++++++++++++++
>  net/mptcp/ctrl.c                          | 21 +++++++++++++++++++++
>  net/mptcp/pm.c                            | 13 +++++++++----
>  net/mptcp/protocol.h                      |  1 +
>  4 files changed, 49 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
> index b0d4da71e68e..e263dfcc4b40 100644
> --- a/Documentation/networking/mptcp-sysctl.rst
> +++ b/Documentation/networking/mptcp-sysctl.rst
> @@ -46,6 +46,24 @@ allow_join_initial_addr_port - BOOLEAN
>
>         Default: 1
>
> +pm_type - INTEGER
> +
> +       Set the default path manager type to use for each new MPTCP
> +       socket. In-kernel path management will control subflow
> +       connections and address advertisements according to
> +       per-namespace values configured over the MPTCP netlink
> +       API. Userspace path management puts per-MPTCP-connection subflow
> +       connection decisions and address advertisements under control of
> +       a privileged userspace program, at the cost of more netlink
> +       traffic to propagate all of the related events and commands.
> +
> +       This is a per-namespace sysctl.
> +
> +       * 0 - In-kernel path manager
> +       * 1 - Userspace path manager
> +
> +       Default: 0
> +
>  stale_loss_cnt - INTEGER
>         The number of MPTCP-level retransmission intervals with no traffic and
>         pending outstanding data on a given subflow required to declare it stale.
> diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
> index 8b235468c88f..ae20b7d92e28 100644
> --- a/net/mptcp/ctrl.c
> +++ b/net/mptcp/ctrl.c
> @@ -16,6 +16,11 @@
>  #define MPTCP_SYSCTL_PATH "net/mptcp"
>
>  static int mptcp_pernet_id;
> +
> +#ifdef CONFIG_SYSCTL
> +static int mptcp_pm_type_max = __MPTCP_PM_TYPE_MAX;
> +#endif
> +
>  struct mptcp_pernet {
>  #ifdef CONFIG_SYSCTL
>         struct ctl_table_header *ctl_table_hdr;
> @@ -26,6 +31,7 @@ struct mptcp_pernet {
>         u8 mptcp_enabled;
>         u8 checksum_enabled;
>         u8 allow_join_initial_addr_port;
> +       u8 pm_type;
>  };
>
>  static struct mptcp_pernet *mptcp_get_pernet(const struct net *net)
> @@ -58,6 +64,11 @@ unsigned int mptcp_stale_loss_cnt(const struct net *net)
>         return mptcp_get_pernet(net)->stale_loss_cnt;
>  }
>
> +int mptcp_get_pm_type(const struct net *net)
> +{
> +       return mptcp_get_pernet(net)->pm_type;
> +}
> +
>  static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
>  {
>         pernet->mptcp_enabled = 1;
> @@ -65,6 +76,7 @@ static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
>         pernet->checksum_enabled = 0;
>         pernet->allow_join_initial_addr_port = 1;
>         pernet->stale_loss_cnt = 4;
> +       pernet->pm_type = MPTCP_PM_TYPE_KERNEL;
>  }
>
>  #ifdef CONFIG_SYSCTL
> @@ -108,6 +120,14 @@ static struct ctl_table mptcp_sysctl_table[] = {
>                 .mode = 0644,
>                 .proc_handler = proc_douintvec_minmax,
>         },
> +       {
> +               .procname = "pm_type",
> +               .maxlen = sizeof(u8),
> +               .mode = 0644,
> +               .proc_handler = proc_dou8vec_minmax,
> +               .extra1       = SYSCTL_ZERO,
> +               .extra2       = &mptcp_pm_type_max
> +       },
>         {}
>  };
>
> @@ -128,6 +148,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
>         table[2].data = &pernet->checksum_enabled;
>         table[3].data = &pernet->allow_join_initial_addr_port;
>         table[4].data = &pernet->stale_loss_cnt;
> +       table[5].data = &pernet->pm_type;
>
>         hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
>         if (!hdr)
> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> index 451948b11030..f1fc08d89c20 100644
> --- a/net/mptcp/pm.c
> +++ b/net/mptcp/pm.c
> @@ -365,8 +365,12 @@ void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk)
>
>  void mptcp_pm_data_reset(struct mptcp_sock *msk)
>  {
> -       bool subflows_allowed = !!mptcp_pm_get_subflows_max(msk);
> +       u8 pm_type = mptcp_get_pm_type(sock_net((struct sock *)msk));
>         struct mptcp_pm_data *pm = &msk->pm;
> +       bool subflows_allowed;
> +
> +       subflows_allowed = !!mptcp_pm_get_subflows_max(msk) &&
> +               pm_type == MPTCP_PM_TYPE_KERNEL;
>
>         pm->add_addr_signaled = 0;
>         pm->add_addr_accepted = 0;
> @@ -374,13 +378,14 @@ void mptcp_pm_data_reset(struct mptcp_sock *msk)
>         pm->subflows = 0;
>         pm->rm_list_tx.nr = 0;
>         pm->rm_list_rx.nr = 0;
> -       WRITE_ONCE(pm->pm_type, MPTCP_PM_TYPE_KERNEL);
> +       WRITE_ONCE(pm->pm_type, pm_type);
>         /* pm->work_pending must be only be set to 'true' when
>          * pm->pm_type is set to MPTCP_PM_TYPE_KERNEL
>          */
>         WRITE_ONCE(pm->work_pending,
> -                  (!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
> -                  !!mptcp_pm_get_add_addr_signal_max(msk));
> +                  pm_type == MPTCP_PM_TYPE_KERNEL &&
> +                  ((!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
> +                   !!mptcp_pm_get_add_addr_signal_max(msk)));

I think the pm_type == MPTCP_PM_TYPE_KERNEL check is somewhat redundant here.
How about using this code instead:

        WRITE_ONCE(pm->work_pending,
                   (!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
                   (!!mptcp_pm_get_add_addr_signal_max(msk) &&
                    pm_type == MPTCP_PM_TYPE_KERNEL));
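
The v2 expression and the alternative can be compared outside the kernel.
What follows is only a minimal userspace sketch: plain bools stand in for
the mptcp_pm_get_*_max() results, and every name in it (pm_kind,
work_pending_v2, work_pending_alt, count_mismatches, verify_equivalence)
is made up for illustration and is not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for the kernel/userspace PM type values. */
enum pm_kind { PM_KIND_KERNEL = 0, PM_KIND_USERSPACE = 1 };

/* Form used in the v2 patch: the outer pm_type test guards everything. */
static bool work_pending_v2(enum pm_kind t, bool subflows_max,
			    bool local_addr_max, bool signal_max)
{
	bool subflows_allowed = subflows_max && t == PM_KIND_KERNEL;

	return t == PM_KIND_KERNEL &&
	       ((local_addr_max && subflows_allowed) || signal_max);
}

/* Alternative form: the pm_type test is applied to the signal term only. */
static bool work_pending_alt(enum pm_kind t, bool subflows_max,
			     bool local_addr_max, bool signal_max)
{
	bool subflows_allowed = subflows_max && t == PM_KIND_KERNEL;

	return (local_addr_max && subflows_allowed) ||
	       (signal_max && t == PM_KIND_KERNEL);
}

/* Walk all 16 input combinations and count where the two forms differ. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int bits = 0; bits < 16; bits++) {
		enum pm_kind t = (bits & 1) ? PM_KIND_USERSPACE : PM_KIND_KERNEL;
		bool s = (bits & 2) != 0;
		bool l = (bits & 4) != 0;
		bool a = (bits & 8) != 0;

		if (work_pending_v2(t, s, l, a) != work_pending_alt(t, s, l, a))
			mismatches++;
	}
	return mismatches;
}

/* Aborts if the two forms ever disagree. */
static void verify_equivalence(void)
{
	assert(count_mismatches() == 0);
}
```

The two forms agree because subflows_allowed already folds in the pm_type
test, so the explicit comparison only matters for the add_addr_signal term.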

Thanks,
-Geliang

>         WRITE_ONCE(pm->addr_signal, 0);
>         WRITE_ONCE(pm->accept_addr,
>                    !!mptcp_pm_get_add_addr_accept_max(msk) && subflows_allowed);
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 478abe18b9e9..e4c2b39b029c 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -579,6 +579,7 @@ unsigned int mptcp_get_add_addr_timeout(const struct net *net);
>  int mptcp_is_checksum_enabled(const struct net *net);
>  int mptcp_allow_join_id0(const struct net *net);
>  unsigned int mptcp_stale_loss_cnt(const struct net *net);
> +int mptcp_get_pm_type(const struct net *net);
>  void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
>                                      struct mptcp_options_received *mp_opt);
>  bool __mptcp_retransmit_pending_data(struct sock *sk);
> --
> 2.34.1
>
>


* Re: [PATCH mptcp-next v2 6/6] selftests: mptcp: Add tests for userspace PM type
  2021-12-02  0:52 ` [PATCH mptcp-next v2 6/6] selftests: mptcp: Add tests for userspace PM type Mat Martineau
@ 2021-12-06  8:37   ` Geliang Tang
  0 siblings, 0 replies; 16+ messages in thread
From: Geliang Tang @ 2021-12-06  8:37 UTC (permalink / raw)
  To: Mat Martineau; +Cc: MPTCP Upstream

Mat Martineau <mathew.j.martineau@linux.intel.com> wrote on Thu, Dec 2, 2021 at 08:52:
>
> These tests ensure that the in-kernel path manager is bypassed when
> the userspace path manager is configured. Kernel code is still
> responsible for ADD_ADDR echo, so also make sure that's working.
>
> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>

Tested-by: Geliang Tang <geliang.tang@suse.com>

> ---
>  .../testing/selftests/net/mptcp/mptcp_join.sh | 70 ++++++++++++++++++-
>  1 file changed, 69 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> index 2684ef9c0d42..7df9ddb307a8 100755
> --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> @@ -50,6 +50,7 @@ init()
>                 ip netns add $netns || exit $ksft_skip
>                 ip -net $netns link set lo up
>                 ip netns exec $netns sysctl -q net.mptcp.enabled=1
> +               ip netns exec $netns sysctl -q net.mptcp.pm_type=0
>                 ip netns exec $netns sysctl -q net.ipv4.conf.all.rp_filter=0
>                 ip netns exec $netns sysctl -q net.ipv4.conf.default.rp_filter=0
>                 if [ $checksum -eq 1 ]; then
> @@ -1837,6 +1838,68 @@ fullmesh_tests()
>         chk_add_nr 1 1
>  }
>
> +userspace_tests()
> +{
> +       # userspace pm type prevents add_addr
> +       reset
> +       ip netns exec $ns1 sysctl -q net.mptcp.pm_type=1
> +       ip netns exec $ns1 ./pm_nl_ctl limits 0 2
> +       ip netns exec $ns2 ./pm_nl_ctl limits 0 2
> +       ip netns exec $ns1 ./pm_nl_ctl add 10.0.2.1 flags signal
> +       run_tests $ns1 $ns2 10.0.1.1
> +       chk_join_nr "userspace pm type prevents add_addr" 0 0 0
> +       chk_add_nr 0 0
> +
> +       # userspace pm type echoes add_addr
> +       reset
> +       ip netns exec $ns2 sysctl -q net.mptcp.pm_type=1
> +       ip netns exec $ns1 ./pm_nl_ctl limits 0 2
> +       ip netns exec $ns2 ./pm_nl_ctl limits 0 2
> +       ip netns exec $ns1 ./pm_nl_ctl add 10.0.2.1 flags signal
> +       run_tests $ns1 $ns2 10.0.1.1
> +       chk_join_nr "userspace pm type echoes add_addr" 0 0 0
> +       chk_add_nr 1 1
> +
> +       # userspace pm type rejects join
> +       reset
> +       ip netns exec $ns1 sysctl -q net.mptcp.pm_type=1
> +       ip netns exec $ns1 ./pm_nl_ctl limits 1 1
> +       ip netns exec $ns2 ./pm_nl_ctl limits 1 1
> +       ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
> +       run_tests $ns1 $ns2 10.0.1.1
> +       chk_join_nr "userspace pm type rejects join" 1 1 0
> +
> +       # userspace pm type does not send join
> +       reset
> +       ip netns exec $ns2 sysctl -q net.mptcp.pm_type=1
> +       ip netns exec $ns1 ./pm_nl_ctl limits 1 1
> +       ip netns exec $ns2 ./pm_nl_ctl limits 1 1
> +       ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
> +       run_tests $ns1 $ns2 10.0.1.1
> +       chk_join_nr "userspace pm type does not send join" 0 0 0
> +
> +       # userspace pm type prevents mp_prio
> +       reset
> +       ip netns exec $ns1 sysctl -q net.mptcp.pm_type=1
> +       ip netns exec $ns1 ./pm_nl_ctl limits 1 1
> +       ip netns exec $ns2 ./pm_nl_ctl limits 1 1
> +       ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
> +       run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow backup
> +       chk_join_nr "userspace pm type prevents mp_prio" 1 1 0
> +       chk_prio_nr 0 0
> +
> +       # userspace pm type prevents rm_addr
> +       reset
> +       ip netns exec $ns1 sysctl -q net.mptcp.pm_type=1
> +       ip netns exec $ns2 sysctl -q net.mptcp.pm_type=1
> +       ip netns exec $ns1 ./pm_nl_ctl limits 0 1
> +       ip netns exec $ns2 ./pm_nl_ctl limits 0 1
> +       ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
> +       run_tests $ns1 $ns2 10.0.1.1 0 0 -1 slow
> +       chk_join_nr "userspace pm type prevents rm_addr" 0 0 0
> +       chk_rm_nr 0 0
> +}
> +
>  all_tests()
>  {
>         subflows_tests
> @@ -1853,6 +1916,7 @@ all_tests()
>         checksum_tests
>         deny_join_id0_tests
>         fullmesh_tests
> +       userspace_tests
>  }
>
>  usage()
> @@ -1872,6 +1936,7 @@ usage()
>         echo "  -S checksum_tests"
>         echo "  -d deny_join_id0_tests"
>         echo "  -m fullmesh_tests"
> +       echo "  -u userspace_tests"
>         echo "  -c capture pcap files"
>         echo "  -C enable data checksum"
>         echo "  -h help"
> @@ -1907,7 +1972,7 @@ if [ $do_all_tests -eq 1 ]; then
>         exit $ret
>  fi
>
> -while getopts 'fsltra64bpkdmchCS' opt; do
> +while getopts 'fsltra64bpkdmuchCS' opt; do
>         case $opt in
>                 f)
>                         subflows_tests
> @@ -1951,6 +2016,9 @@ while getopts 'fsltra64bpkdmchCS' opt; do
>                 m)
>                         fullmesh_tests
>                         ;;
> +               u)
> +                       userspace_tests
> +                       ;;
>                 c)
>                         ;;
>                 C)
> --
> 2.34.1
>
>


* Re: [PATCH mptcp-next v2 5/6] mptcp: Add a per-namespace sysctl to set the default path manager type
  2021-12-06  7:56   ` Geliang Tang
@ 2021-12-06 17:47     ` Paolo Abeni
  2021-12-06 22:42       ` Mat Martineau
  0 siblings, 1 reply; 16+ messages in thread
From: Paolo Abeni @ 2021-12-06 17:47 UTC (permalink / raw)
  To: Geliang Tang, Mat Martineau; +Cc: MPTCP Upstream

On Mon, 2021-12-06 at 15:56 +0800, Geliang Tang wrote:
> Hi Mat,
> 
> Mat Martineau <mathew.j.martineau@linux.intel.com> wrote on Thu, Dec 2, 2021 at 08:52:
> > [...]
> >         /* pm->work_pending must be only be set to 'true' when
> >          * pm->pm_type is set to MPTCP_PM_TYPE_KERNEL
> >          */
> >         WRITE_ONCE(pm->work_pending,
> > -                  (!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
> > -                  !!mptcp_pm_get_add_addr_signal_max(msk));
> > +                  pm_type == MPTCP_PM_TYPE_KERNEL &&
> > +                  ((!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
> > +                   !!mptcp_pm_get_add_addr_signal_max(msk)));
> 
> I think the pm_type == MPTCP_PM_TYPE_KERNEL check is somewhat redundant here.
> How about using this code instead:
> 
>         WRITE_ONCE(pm->work_pending,
>                    (!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
>                    (!!mptcp_pm_get_add_addr_signal_max(msk) &&
>                     pm_type == MPTCP_PM_TYPE_KERNEL));

Still on the subject of making this chunk of code more readable, what about:

	if (pm_type == MPTCP_PM_TYPE_KERNEL) {
		// [current code]
	} else {
		WRITE_ONCE(pm->work_pending, 0);
		WRITE_ONCE(pm->accept_addr, 0);
		WRITE_ONCE(pm->accept_subflow, 0); 
	}

?
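
To make the shape concrete, here is a minimal userspace sketch of that
branch. It rests on several assumptions: the structs and names
(pm_limits, pm_state, pm_state_reset, PM_KERNEL, PM_USERSPACE) are
hypothetical stand-ins rather than kernel types, plain assignments replace
WRITE_ONCE(), and accept_subflow is assumed to follow subflows_allowed in
the kernel branch, which this hunk does not show.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the PM type values. */
enum { PM_KERNEL = 0, PM_USERSPACE = 1 };

/* Per-namespace limits, modelled as plain counters. */
struct pm_limits {
	unsigned int subflows_max;
	unsigned int local_addr_max;
	unsigned int add_addr_signal_max;
	unsigned int add_addr_accept_max;
};

/* The subset of per-socket PM state touched by the reset. */
struct pm_state {
	int pm_type;
	bool work_pending;
	bool accept_addr;
	bool accept_subflow;
};

static void pm_state_reset(struct pm_state *pm, int pm_type,
			   const struct pm_limits *lim)
{
	pm->pm_type = pm_type;

	if (pm_type == PM_KERNEL) {
		/* Kernel-managed: derive the flags from the limits. */
		bool subflows_allowed = lim->subflows_max != 0;

		pm->work_pending = (lim->local_addr_max && subflows_allowed) ||
				   lim->add_addr_signal_max != 0;
		pm->accept_addr = lim->add_addr_accept_max && subflows_allowed;
		pm->accept_subflow = subflows_allowed;
	} else {
		/* Userspace-managed: the in-kernel PM does nothing. */
		pm->work_pending = false;
		pm->accept_addr = false;
		pm->accept_subflow = false;
	}
}
```

With the branch in place, subflows_allowed no longer needs to carry the
pm_type test, since it is only evaluated on the kernel path.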

Cheers,

Paolo



* Re: [PATCH mptcp-next v2 3/6] mptcp: Bypass kernel PM when userspace PM is enabled
  2021-12-02  0:52 ` [PATCH mptcp-next v2 3/6] mptcp: Bypass kernel PM when userspace PM is enabled Mat Martineau
@ 2021-12-06 17:48   ` Paolo Abeni
  2021-12-06 22:34     ` Mat Martineau
  0 siblings, 1 reply; 16+ messages in thread
From: Paolo Abeni @ 2021-12-06 17:48 UTC (permalink / raw)
  To: Mat Martineau, mptcp

On Wed, 2021-12-01 at 16:52 -0800, Mat Martineau wrote:
> When a MPTCP connection is managed by a userspace PM, bypass the kernel
> PM for incoming advertisements and subflow events. Netlink events are
> still sent to userspace.
> 
> v2: Remove unneeded check in mptcp_pm_add_addr_received() (Kishen Maloor)

I guess the above should be 'mptcp_pm_rm_addr_received'.

Cheers,

Paolo



* Re: [PATCH mptcp-next v2 4/6] mptcp: Make kernel path manager check for userspace-managed sockets
  2021-12-02  0:52 ` [PATCH mptcp-next v2 4/6] mptcp: Make kernel path manager check for userspace-managed sockets Mat Martineau
@ 2021-12-06 17:52   ` Paolo Abeni
  2021-12-06 22:35     ` Mat Martineau
  0 siblings, 1 reply; 16+ messages in thread
From: Paolo Abeni @ 2021-12-06 17:52 UTC (permalink / raw)
  To: Mat Martineau, mptcp

On Wed, 2021-12-01 at 16:52 -0800, Mat Martineau wrote:
> Userspace-managed sockets should not have their subflows or
> advertisements changed by the kernel path manager.
> 
> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> ---
>  net/mptcp/pm_netlink.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
> index a74eb0444cd2..2cd491229d23 100644
> --- a/net/mptcp/pm_netlink.c
> +++ b/net/mptcp/pm_netlink.c
> @@ -1122,7 +1122,8 @@ static int mptcp_nl_add_subflow_or_signal_addr(struct net *net)
>  	while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) {
>  		struct sock *sk = (struct sock *)msk;
>  
> -		if (!READ_ONCE(msk->fully_established))
> +		if (!READ_ONCE(msk->fully_established) ||
> +		    (READ_ONCE(msk->pm.pm_type) != MPTCP_PM_TYPE_KERNEL))

I'm ok with merging the series as is, but if you are considering
another version, what about replacing the 'READ_ONCE(msk->pm.pm_type) !=
MPTCP_PM_TYPE_KERNEL' check with a helper, e.g. something like
mptcp_pm_is_user_space(sk)?
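
A rough userspace sketch of what such a helper could look like follows.
The struct layouts and the names pm_data_model, msk_model, pm_is_userspace
and kernel_pm_should_skip are hypothetical, the enum values are local
stand-ins for the ones the series adds, and a plain load replaces
READ_ONCE().

```c
#include <stdbool.h>

/* Local stand-ins for the PM type values added by this series. */
enum mptcp_pm_type {
	MPTCP_PM_TYPE_KERNEL = 0,
	MPTCP_PM_TYPE_USERSPACE,
};

/* Minimal models of the fields the check reads. */
struct pm_data_model {
	unsigned char pm_type;
};

struct msk_model {
	bool fully_established;
	struct pm_data_model pm;
};

/* Hypothetical helper: true when the socket is not kernel-managed. */
static inline bool pm_is_userspace(const struct msk_model *msk)
{
	return msk->pm.pm_type != MPTCP_PM_TYPE_KERNEL;
}

/* The open-coded condition from the hunk above, rewritten with the helper. */
static bool kernel_pm_should_skip(const struct msk_model *msk)
{
	return !msk->fully_established || pm_is_userspace(msk);
}
```

Keeping the comparison in one place would also mean that any later PM
type only needs the helper updated, not every caller.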

Cheers,

Paolo



* Re: [PATCH mptcp-next v2 3/6] mptcp: Bypass kernel PM when userspace PM is enabled
  2021-12-06 17:48   ` Paolo Abeni
@ 2021-12-06 22:34     ` Mat Martineau
  0 siblings, 0 replies; 16+ messages in thread
From: Mat Martineau @ 2021-12-06 22:34 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp

On Mon, 6 Dec 2021, Paolo Abeni wrote:

> On Wed, 2021-12-01 at 16:52 -0800, Mat Martineau wrote:
>> When a MPTCP connection is managed by a userspace PM, bypass the kernel
>> PM for incoming advertisements and subflow events. Netlink events are
>> still sent to userspace.
>>
>> v2: Remove unneeded check in mptcp_pm_add_addr_received() (Kishen Maloor)
>
> I guess the above should be 'mptcp_pm_rm_addr_received'.
>

Thanks for noticing that, mptcp_pm_rm_addr_received is the function that 
was modified from v1.

--
Mat Martineau
Intel


* Re: [PATCH mptcp-next v2 4/6] mptcp: Make kernel path manager check for userspace-managed sockets
  2021-12-06 17:52   ` Paolo Abeni
@ 2021-12-06 22:35     ` Mat Martineau
  0 siblings, 0 replies; 16+ messages in thread
From: Mat Martineau @ 2021-12-06 22:35 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp

On Mon, 6 Dec 2021, Paolo Abeni wrote:

> On Wed, 2021-12-01 at 16:52 -0800, Mat Martineau wrote:
>> Userspace-managed sockets should not have their subflows or
>> advertisements changed by the kernel path manager.
>>
>> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
>> ---
>>  net/mptcp/pm_netlink.c | 20 ++++++++++++++------
>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
>> index a74eb0444cd2..2cd491229d23 100644
>> --- a/net/mptcp/pm_netlink.c
>> +++ b/net/mptcp/pm_netlink.c
>> @@ -1122,7 +1122,8 @@ static int mptcp_nl_add_subflow_or_signal_addr(struct net *net)
>>  	while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) {
>>  		struct sock *sk = (struct sock *)msk;
>>
>> -		if (!READ_ONCE(msk->fully_established))
>> +		if (!READ_ONCE(msk->fully_established) ||
>> +		    (READ_ONCE(msk->pm.pm_type) != MPTCP_PM_TYPE_KERNEL))
>
> I'm ok with merging the series as is, but if you are considering
> another version, what about replacing the 'READ_ONCE(msk->pm.pm_type) !=
> MPTCP_PM_TYPE_KERNEL' check with a helper, e.g. something like
> mptcp_pm_is_user_space(sk)?
>

I'll send a v3 - agree that a helper function would clarify this.

--
Mat Martineau
Intel


* Re: [PATCH mptcp-next v2 5/6] mptcp: Add a per-namespace sysctl to set the default path manager type
  2021-12-06 17:47     ` Paolo Abeni
@ 2021-12-06 22:42       ` Mat Martineau
  0 siblings, 0 replies; 16+ messages in thread
From: Mat Martineau @ 2021-12-06 22:42 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Geliang Tang, MPTCP Upstream


On Mon, 6 Dec 2021, Paolo Abeni wrote:

> On Mon, 2021-12-06 at 15:56 +0800, Geliang Tang wrote:
>> Hi Mat,
>>
>> Mat Martineau <mathew.j.martineau@linux.intel.com> wrote on Thu, Dec 2, 2021 at 08:52:
>>> [...]
>>>         /* pm->work_pending must be only be set to 'true' when
>>>          * pm->pm_type is set to MPTCP_PM_TYPE_KERNEL
>>>          */
>>>         WRITE_ONCE(pm->work_pending,
>>> -                  (!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
>>> -                  !!mptcp_pm_get_add_addr_signal_max(msk));
>>> +                  pm_type == MPTCP_PM_TYPE_KERNEL &&
>>> +                  ((!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
>>> +                   !!mptcp_pm_get_add_addr_signal_max(msk)));
>>
>> I think the pm_type == MPTCP_PM_TYPE_KERNEL check is somewhat redundant here.
>> How about using this code instead:
>>
>>         WRITE_ONCE(pm->work_pending,
>>                    (!!mptcp_pm_get_local_addr_max(msk) && subflows_allowed) ||
>>                    (!!mptcp_pm_get_add_addr_signal_max(msk) &&
>>                     pm_type == MPTCP_PM_TYPE_KERNEL));
>
> Still on the subject of making this chunk of code more readable, what about:
>
> 	if (pm_type == MPTCP_PM_TYPE_KERNEL) {
> 		// [current code]
> 	} else {
> 		WRITE_ONCE(pm->work_pending, 0);
> 		WRITE_ONCE(pm->accept_addr, 0);
> 		WRITE_ONCE(pm->accept_subflow, 0);
> 	}
>
> ?

I agree that it's more readable using the 'if' statement; I'll make that
change.

--
Mat Martineau
Intel


Thread overview: 16+ messages
2021-12-02  0:52 [PATCH mptcp-next v2 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
2021-12-02  0:52 ` [PATCH mptcp-next v2 1/6] mptcp: Remove redundant assignments in path manager init Mat Martineau
2021-12-02  0:52 ` [PATCH mptcp-next v2 2/6] mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode Mat Martineau
2021-12-02  0:52 ` [PATCH mptcp-next v2 3/6] mptcp: Bypass kernel PM when userspace PM is enabled Mat Martineau
2021-12-06 17:48   ` Paolo Abeni
2021-12-06 22:34     ` Mat Martineau
2021-12-02  0:52 ` [PATCH mptcp-next v2 4/6] mptcp: Make kernel path manager check for userspace-managed sockets Mat Martineau
2021-12-06 17:52   ` Paolo Abeni
2021-12-06 22:35     ` Mat Martineau
2021-12-02  0:52 ` [PATCH mptcp-next v2 5/6] mptcp: Add a per-namespace sysctl to set the default path manager type Mat Martineau
2021-12-06  7:44   ` Geliang Tang
2021-12-06  7:56   ` Geliang Tang
2021-12-06 17:47     ` Paolo Abeni
2021-12-06 22:42       ` Mat Martineau
2021-12-02  0:52 ` [PATCH mptcp-next v2 6/6] selftests: mptcp: Add tests for userspace PM type Mat Martineau
2021-12-06  8:37   ` Geliang Tang
