All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/6] mptcp: Add userspace PM mode to bypass kernel PM
@ 2021-10-06 23:59 Mat Martineau
  2021-10-06 23:59 ` [RFC PATCH 1/6] mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode Mat Martineau
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Mat Martineau @ 2021-10-06 23:59 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau, kishen.maloor

One part of supporting userspace path managers is to prevent the
in-kernel PM from acting on userspace-managed MPTCP connections.

These patches:

 * Add a per-MPTCP-socket 'userspace' bool to track which PM type is in
   use for that socket. A mix of kernel-managed and userspace-managed
   connections are supported within each namespace.

 * Conditionally decouple incoming ADD_ADDR/RM_ADDR and subflow changes
   from the in-kernel PM. Netlink events are still triggered.

 * Add a sysctl for setting the per-namespace default for in-kernel vs
   userspace path management of new MPTCP sockets. This might be better
   handled with a new netlink command to get/set the default, but I've
   included this commit so we can discuss the best approach!

 * Some debug code to confirm that the in-kernel PM is not interfering
   with userspace-managed sockets. Some of these warnings (perhaps
   changed to debug messages?) may be good to permanently include, but
   are labeled "DO-NOT-MERGE" for now.


I've confirmed that mptcp_join.sh selftests fail as they should with the
in-kernel PM disconnected, but do not run into the warnings I
added. These are RFC patches so testing continues.

Kishen has some netlink command patches to follow. We're also working on
mptcpd changes (https://github.com/intel/mptcpd/pull/156) to interact
with the upstream kernel, but still have to resolve some differences
relative to the multipath-tcp.org implementation of the netlink commands.


Mat Martineau (6):
  mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode
  mptcp: Bypass kernel PM when userspace PM is enabled
  mptcp: Make kernel path manager check for userspace-managed sockets
  mptcp: Add a per-namespace sysctl to set the default path manager type
  DO-NOT-MERGE: debug: mptcp: Warn on use of in-kernel PM functions
  DO-NOT-MERGE: debug: mptcp: Warn on unexpected events when userspace
    PM is active

 Documentation/networking/mptcp-sysctl.rst | 17 ++++++
 net/mptcp/ctrl.c                          | 16 ++++++
 net/mptcp/pm.c                            | 16 +++++-
 net/mptcp/pm_netlink.c                    | 70 +++++++++++++++++++----
 net/mptcp/protocol.h                      |  2 +
 5 files changed, 108 insertions(+), 13 deletions(-)

-- 
2.33.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC PATCH 1/6] mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode
  2021-10-06 23:59 [RFC PATCH 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
@ 2021-10-06 23:59 ` Mat Martineau
  2021-10-06 23:59 ` [RFC PATCH 2/6] mptcp: Bypass kernel PM when userspace PM is enabled Mat Martineau
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Mat Martineau @ 2021-10-06 23:59 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau, kishen.maloor

When adding support for netlink path management commands, the kernel
needs to know whether paths are being controlled by the in-kernel path
manager or a userspace PM.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/pm.c       | 1 +
 net/mptcp/protocol.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 6ab386ff3294..79aafcb51756 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -369,6 +369,7 @@ void mptcp_pm_data_init(struct mptcp_sock *msk)
 	WRITE_ONCE(msk->pm.accept_addr, false);
 	WRITE_ONCE(msk->pm.accept_subflow, false);
 	WRITE_ONCE(msk->pm.remote_deny_join_id0, false);
+	WRITE_ONCE(msk->pm.userspace, false);
 	msk->pm.status = 0;
 
 	spin_lock_init(&msk->pm.lock);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 284fdcec067e..b06c2307296d 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -197,6 +197,7 @@ struct mptcp_pm_data {
 	bool		accept_addr;
 	bool		accept_subflow;
 	bool		remote_deny_join_id0;
+	bool		userspace;
 	u8		add_addr_signaled;
 	u8		add_addr_accepted;
 	u8		local_addr_used;
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC PATCH 2/6] mptcp: Bypass kernel PM when userspace PM is enabled
  2021-10-06 23:59 [RFC PATCH 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
  2021-10-06 23:59 ` [RFC PATCH 1/6] mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode Mat Martineau
@ 2021-10-06 23:59 ` Mat Martineau
  2021-10-06 23:59 ` [RFC PATCH 3/6] mptcp: Make kernel path manager check for userspace-managed sockets Mat Martineau
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Mat Martineau @ 2021-10-06 23:59 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau, kishen.maloor

When a MPTCP connection is managed by a userspace PM, bypass the kernel
PM for incoming advertisements and subflow events. Netlink events are
still sent to userspace.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/pm.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 79aafcb51756..3fa3c0faed48 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -136,7 +136,7 @@ void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock *ssk,
 	 * racing paths - accept() and check_fully_established()
 	 * be sure to serve this event only once.
 	 */
-	if (READ_ONCE(pm->work_pending) &&
+	if (READ_ONCE(pm->work_pending) && !READ_ONCE(pm->userspace) &&
 	    !(msk->pm.status & BIT(MPTCP_PM_ALREADY_ESTABLISHED)))
 		mptcp_pm_schedule_work(msk, MPTCP_PM_ESTABLISHED);
 
@@ -161,7 +161,7 @@ void mptcp_pm_subflow_established(struct mptcp_sock *msk)
 
 	pr_debug("msk=%p", msk);
 
-	if (!READ_ONCE(pm->work_pending))
+	if (!READ_ONCE(pm->work_pending) || READ_ONCE(pm->userspace))
 		return;
 
 	spin_lock_bh(&pm->lock);
@@ -187,6 +187,9 @@ void mptcp_pm_add_addr_received(struct mptcp_sock *msk,
 
 	mptcp_event_addr_announced(msk, addr);
 
+	if (READ_ONCE(pm->userspace))
+		return;
+
 	spin_lock_bh(&pm->lock);
 
 	if (!READ_ONCE(pm->accept_addr)) {
@@ -206,6 +209,9 @@ void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk,
 
 	pr_debug("msk=%p", msk);
 
+	if (READ_ONCE(pm->userspace))
+		return;
+
 	spin_lock_bh(&pm->lock);
 
 	if (mptcp_lookup_anno_list_by_saddr(msk, addr) && READ_ONCE(pm->work_pending))
@@ -233,6 +239,9 @@ void mptcp_pm_rm_addr_received(struct mptcp_sock *msk,
 	for (i = 0; i < rm_list->nr; i++)
 		mptcp_event_addr_removed(msk, rm_list->ids[i]);
 
+	if (READ_ONCE(pm->userspace))
+		return;
+
 	spin_lock_bh(&pm->lock);
 	mptcp_pm_schedule_work(msk, MPTCP_PM_RM_ADDR_RECEIVED);
 	pm->rm_list_rx = *rm_list;
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC PATCH 3/6] mptcp: Make kernel path manager check for userspace-managed sockets
  2021-10-06 23:59 [RFC PATCH 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
  2021-10-06 23:59 ` [RFC PATCH 1/6] mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode Mat Martineau
  2021-10-06 23:59 ` [RFC PATCH 2/6] mptcp: Bypass kernel PM when userspace PM is enabled Mat Martineau
@ 2021-10-06 23:59 ` Mat Martineau
  2021-10-06 23:59 ` [RFC PATCH 4/6] mptcp: Add a per-namespace sysctl to set the default path manager type Mat Martineau
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Mat Martineau @ 2021-10-06 23:59 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau, kishen.maloor

Userspace-managed sockets should not have their subflows or advertisements changed by the kernel path manager.

(Possible squash with earlier commit)

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/pm_netlink.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 30b974801d51..434049836707 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -1131,7 +1131,8 @@ static int mptcp_nl_add_subflow_or_signal_addr(struct net *net)
 	while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) {
 		struct sock *sk = (struct sock *)msk;
 
-		if (!READ_ONCE(msk->fully_established))
+		if (!READ_ONCE(msk->fully_established) ||
+		    READ_ONCE(msk->pm.userspace))
 			goto next;
 
 		lock_sock(sk);
@@ -1269,6 +1270,9 @@ static int mptcp_nl_remove_subflow_and_signal_addr(struct net *net,
 		struct sock *sk = (struct sock *)msk;
 		bool remove_subflow;
 
+		if (READ_ONCE(msk->pm.userspace))
+			goto next;
+
 		if (list_empty(&msk->conn_list)) {
 			mptcp_pm_remove_anno_addr(msk, addr, false);
 			goto next;
@@ -1310,7 +1314,7 @@ static int mptcp_nl_remove_id_zero_address(struct net *net,
 		struct sock *sk = (struct sock *)msk;
 		struct mptcp_addr_info msk_local;
 
-		if (list_empty(&msk->conn_list))
+		if (list_empty(&msk->conn_list) || READ_ONCE(msk->pm.userspace))
 			goto next;
 
 		local_address((struct sock_common *)msk, &msk_local);
@@ -1419,9 +1423,11 @@ static void mptcp_nl_remove_addrs_list(struct net *net,
 	while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) {
 		struct sock *sk = (struct sock *)msk;
 
-		lock_sock(sk);
-		mptcp_pm_remove_addrs_and_subflows(msk, rm_list);
-		release_sock(sk);
+		if (!READ_ONCE(msk->pm.userspace)) {
+			lock_sock(sk);
+			mptcp_pm_remove_addrs_and_subflows(msk, rm_list);
+			release_sock(sk);
+		}
 
 		sock_put(sk);
 		cond_resched();
@@ -1683,7 +1689,8 @@ static int mptcp_nl_addr_backup(struct net *net,
 	while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) {
 		struct sock *sk = (struct sock *)msk;
 
-		if (list_empty(&msk->conn_list))
+		if (list_empty(&msk->conn_list) ||
+		    READ_ONCE(msk->pm.userspace))
 			goto next;
 
 		lock_sock(sk);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC PATCH 4/6] mptcp: Add a per-namespace sysctl to set the default path manager type
  2021-10-06 23:59 [RFC PATCH 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
                   ` (2 preceding siblings ...)
  2021-10-06 23:59 ` [RFC PATCH 3/6] mptcp: Make kernel path manager check for userspace-managed sockets Mat Martineau
@ 2021-10-06 23:59 ` Mat Martineau
  2021-10-07 10:27   ` Paolo Abeni
  2021-10-06 23:59 ` [RFC PATCH 5/6] DO-NOT-MERGE: debug: mptcp: Warn on use of in-kernel PM functions Mat Martineau
  2021-10-06 23:59 ` [RFC PATCH 6/6] DO-NOT-MERGE: debug: mptcp: Warn on unexpected events when userspace PM is active Mat Martineau
  5 siblings, 1 reply; 9+ messages in thread
From: Mat Martineau @ 2021-10-06 23:59 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau, kishen.maloor

The new net.mptcp.userspace_pm_default sysctl determines which path
manager will be used by each newly-created MPTCP socket.

Alternative: set the per-namespace default using the netlink API instead.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 Documentation/networking/mptcp-sysctl.rst | 17 +++++++++++++++++
 net/mptcp/ctrl.c                          | 16 ++++++++++++++++
 net/mptcp/pm.c                            |  4 +++-
 net/mptcp/protocol.h                      |  1 +
 4 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
index b0d4da71e68e..9f5d7bad9fba 100644
--- a/Documentation/networking/mptcp-sysctl.rst
+++ b/Documentation/networking/mptcp-sysctl.rst
@@ -46,6 +46,23 @@ allow_join_initial_addr_port - BOOLEAN
 
 	Default: 1
 
+userspace_pm_default - BOOLEAN
+
+	Set the default path manager type to use for each new MPTCP
+	socket.  The in-kernel path manager is used if this is set to 0,
+	and a userspace path manager is used if the value is set
+	to 1. In-kernel path management will control subflow connections
+	and address advertisements according to per-namespace values
+	configured over the MPTCP netlink API. Userspace path management
+	puts per-MPTCP-connection subflow connection decisions and
+	address advertisements under control of a privileged userspace
+	program, at the cost of more netlink traffic to propagate all of
+	the related events and commands.
+
+	This is a per-namespace sysctl.
+
+	Default: 0
+
 stale_loss_cnt - INTEGER
 	The number of MPTCP-level retransmission intervals with no traffic and
 	pending outstanding data on a given subflow required to declare it stale.
diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
index 8b235468c88f..e7aedb90e35a 100644
--- a/net/mptcp/ctrl.c
+++ b/net/mptcp/ctrl.c
@@ -26,6 +26,7 @@ struct mptcp_pernet {
 	u8 mptcp_enabled;
 	u8 checksum_enabled;
 	u8 allow_join_initial_addr_port;
+	u8 userspace_pm_default;
 };
 
 static struct mptcp_pernet *mptcp_get_pernet(const struct net *net)
@@ -58,6 +59,11 @@ unsigned int mptcp_stale_loss_cnt(const struct net *net)
 	return mptcp_get_pernet(net)->stale_loss_cnt;
 }
 
+int mptcp_userspace_pm_default(const struct net *net)
+{
+	return mptcp_get_pernet(net)->userspace_pm_default;
+}
+
 static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
 {
 	pernet->mptcp_enabled = 1;
@@ -65,6 +71,7 @@ static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
 	pernet->checksum_enabled = 0;
 	pernet->allow_join_initial_addr_port = 1;
 	pernet->stale_loss_cnt = 4;
+	pernet->userspace_pm_default = 0;
 }
 
 #ifdef CONFIG_SYSCTL
@@ -108,6 +115,14 @@ static struct ctl_table mptcp_sysctl_table[] = {
 		.mode = 0644,
 		.proc_handler = proc_douintvec_minmax,
 	},
+	{
+		.procname = "userspace_pm_default",
+		.maxlen = sizeof(u8),
+		.mode = 0644,
+		.proc_handler = proc_dou8vec_minmax,
+		.extra1       = SYSCTL_ZERO,
+		.extra2       = SYSCTL_ONE
+	},
 	{}
 };
 
@@ -128,6 +143,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
 	table[2].data = &pernet->checksum_enabled;
 	table[3].data = &pernet->allow_join_initial_addr_port;
 	table[4].data = &pernet->stale_loss_cnt;
+	table[5].data = &pernet->userspace_pm_default;
 
 	hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
 	if (!hdr)
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 3fa3c0faed48..e97a22fcc0fa 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -367,6 +367,8 @@ void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk)
 
 void mptcp_pm_data_init(struct mptcp_sock *msk)
 {
+	struct sock *sk = (struct sock *)msk;
+
 	msk->pm.add_addr_signaled = 0;
 	msk->pm.add_addr_accepted = 0;
 	msk->pm.local_addr_used = 0;
@@ -378,7 +380,7 @@ void mptcp_pm_data_init(struct mptcp_sock *msk)
 	WRITE_ONCE(msk->pm.accept_addr, false);
 	WRITE_ONCE(msk->pm.accept_subflow, false);
 	WRITE_ONCE(msk->pm.remote_deny_join_id0, false);
-	WRITE_ONCE(msk->pm.userspace, false);
+	WRITE_ONCE(msk->pm.userspace, mptcp_userspace_pm_default(sock_net(sk)));
 	msk->pm.status = 0;
 
 	spin_lock_init(&msk->pm.lock);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index b06c2307296d..a8f4133013bc 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -564,6 +564,7 @@ unsigned int mptcp_get_add_addr_timeout(const struct net *net);
 int mptcp_is_checksum_enabled(const struct net *net);
 int mptcp_allow_join_id0(const struct net *net);
 unsigned int mptcp_stale_loss_cnt(const struct net *net);
+int mptcp_userspace_pm_default(const struct net *net);
 void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
 				     struct mptcp_options_received *mp_opt);
 bool __mptcp_retransmit_pending_data(struct sock *sk);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC PATCH 5/6] DO-NOT-MERGE: debug: mptcp: Warn on use of in-kernel PM functions
  2021-10-06 23:59 [RFC PATCH 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
                   ` (3 preceding siblings ...)
  2021-10-06 23:59 ` [RFC PATCH 4/6] mptcp: Add a per-namespace sysctl to set the default path manager type Mat Martineau
@ 2021-10-06 23:59 ` Mat Martineau
  2021-10-06 23:59 ` [RFC PATCH 6/6] DO-NOT-MERGE: debug: mptcp: Warn on unexpected events when userspace PM is active Mat Martineau
  5 siblings, 0 replies; 9+ messages in thread
From: Mat Martineau @ 2021-10-06 23:59 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau, kishen.maloor

Double-check that all relevant call sites are filtered when sockets are
managed by userspace.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/pm_netlink.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 434049836707..affa84099bdd 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -468,6 +468,8 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
 	struct pm_nl_pernet *pernet;
 	unsigned int subflows_max;
 
+	WARN_ON_ONCE(READ_ONCE(msk->pm.userspace));
+
 	pernet = net_generic(sock_net(sk), pm_nl_pernet_id);
 
 	add_addr_signal_max = mptcp_pm_get_add_addr_signal_max(msk);
@@ -700,6 +702,8 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
 
 	msk_owned_by_me(msk);
 
+	WARN_ON_ONCE(READ_ONCE(msk->pm.userspace));
+
 	if (!rm_list->nr)
 		return;
 
@@ -1244,6 +1248,8 @@ static bool mptcp_pm_remove_anno_addr(struct mptcp_sock *msk,
 	struct mptcp_rm_list list = { .nr = 0 };
 	bool ret;
 
+	WARN_ON_ONCE(READ_ONCE(msk->pm.userspace));
+
 	list.ids[list.nr++] = addr->id;
 
 	ret = remove_anno_list_by_saddr(msk, addr);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC PATCH 6/6] DO-NOT-MERGE: debug: mptcp: Warn on unexpected events when userspace PM is active
  2021-10-06 23:59 [RFC PATCH 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
                   ` (4 preceding siblings ...)
  2021-10-06 23:59 ` [RFC PATCH 5/6] DO-NOT-MERGE: debug: mptcp: Warn on use of in-kernel PM functions Mat Martineau
@ 2021-10-06 23:59 ` Mat Martineau
  5 siblings, 0 replies; 9+ messages in thread
From: Mat Martineau @ 2021-10-06 23:59 UTC (permalink / raw)
  To: mptcp; +Cc: Mat Martineau, kishen.maloor

This is to confirm the previous commits work as expected, not sure if
such checks will be needed upstream.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/pm_netlink.c | 45 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 5 deletions(-)

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index affa84099bdd..f6299bda6eb2 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -753,14 +753,10 @@ void mptcp_pm_nl_rm_subflow_received(struct mptcp_sock *msk,
 	mptcp_pm_nl_rm_addr_or_subflow(msk, rm_list, MPTCP_MIB_RMSUBFLOW);
 }
 
-void mptcp_pm_nl_work(struct mptcp_sock *msk)
+static void mptcp_pm_kernel_work(struct mptcp_sock *msk)
 {
 	struct mptcp_pm_data *pm = &msk->pm;
 
-	msk_owned_by_me(msk);
-
-	spin_lock_bh(&msk->pm.lock);
-
 	pr_debug("msk=%p status=%x", msk, pm->status);
 	if (pm->status & BIT(MPTCP_PM_ADD_ADDR_RECEIVED)) {
 		pm->status &= ~BIT(MPTCP_PM_ADD_ADDR_RECEIVED);
@@ -782,6 +778,45 @@ void mptcp_pm_nl_work(struct mptcp_sock *msk)
 		pm->status &= ~BIT(MPTCP_PM_SUBFLOW_ESTABLISHED);
 		mptcp_pm_nl_subflow_established(msk);
 	}
+}
+
+static void mptcp_pm_userspace_work(struct mptcp_sock *msk)
+{
+	struct mptcp_pm_data *pm = &msk->pm;
+
+	pr_debug("msk=%p status=%x", msk, pm->status);
+	if (pm->status & BIT(MPTCP_PM_ADD_ADDR_RECEIVED)) {
+		pm->status &= ~BIT(MPTCP_PM_ADD_ADDR_RECEIVED);
+		WARN_ONCE(1, "Unexpected ADD_ADDR status in userspace PM mode");
+	}
+	if (pm->status & BIT(MPTCP_PM_ADD_ADDR_SEND_ACK)) {
+		pm->status &= ~BIT(MPTCP_PM_ADD_ADDR_SEND_ACK);
+		WARN_ONCE(1, "Unexpected ADD_ADDR retransmit status in userspace PM mode");
+	}
+	if (pm->status & BIT(MPTCP_PM_RM_ADDR_RECEIVED)) {
+		pm->status &= ~BIT(MPTCP_PM_RM_ADDR_RECEIVED);
+		WARN_ONCE(1, "Unexpected RM_ADDR status in userspace PM mode");
+	}
+	if (pm->status & BIT(MPTCP_PM_ESTABLISHED)) {
+		pm->status &= ~BIT(MPTCP_PM_ESTABLISHED);
+		WARN_ONCE(1, "Unexpected established status in userspace PM mode");
+	}
+	if (pm->status & BIT(MPTCP_PM_SUBFLOW_ESTABLISHED)) {
+		pm->status &= ~BIT(MPTCP_PM_SUBFLOW_ESTABLISHED);
+		WARN_ONCE(1, "Unexpected subflow established status in userspace PM mode");
+	}
+}
+
+void mptcp_pm_nl_work(struct mptcp_sock *msk)
+{
+	msk_owned_by_me(msk);
+
+	spin_lock_bh(&msk->pm.lock);
+
+	if (READ_ONCE(msk->pm.userspace))
+		mptcp_pm_userspace_work(msk);
+	else
+		mptcp_pm_kernel_work(msk);
 
 	spin_unlock_bh(&msk->pm.lock);
 }
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 4/6] mptcp: Add a per-namespace sysctl to set the default path manager type
  2021-10-06 23:59 ` [RFC PATCH 4/6] mptcp: Add a per-namespace sysctl to set the default path manager type Mat Martineau
@ 2021-10-07 10:27   ` Paolo Abeni
  2021-10-08 21:14     ` Mat Martineau
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Abeni @ 2021-10-07 10:27 UTC (permalink / raw)
  To: Mat Martineau, mptcp; +Cc: kishen.maloor

On Wed, 2021-10-06 at 16:59 -0700, Mat Martineau wrote:
> The new net.mptcp.userspace_pm_default sysctl determines which path
> manager will be used by each newly-created MPTCP socket.
> 
> Alternative: set the per-namespace default using the netlink API instead.
> 
> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> ---
>  Documentation/networking/mptcp-sysctl.rst | 17 +++++++++++++++++
>  net/mptcp/ctrl.c                          | 16 ++++++++++++++++
>  net/mptcp/pm.c                            |  4 +++-
>  net/mptcp/protocol.h                      |  1 +
>  4 files changed, 37 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
> index b0d4da71e68e..9f5d7bad9fba 100644
> --- a/Documentation/networking/mptcp-sysctl.rst
> +++ b/Documentation/networking/mptcp-sysctl.rst
> @@ -46,6 +46,23 @@ allow_join_initial_addr_port - BOOLEAN
>  
>  	Default: 1
>  
> +userspace_pm_default - BOOLEAN
> +
> +	Set the default path manager type to use for each new MPTCP
> +	socket.  The in-kernel path manager is used if this is set to 0,
> +	and a userspace path manager is used if the value is set
> +	to 1. In-kernel path management will control subflow connections
> +	and address advertisements according to per-namespace values
> +	configured over the MPTCP netlink API. Userspace path management
> +	puts per-MPTCP-connection subflow connection decisions and
> +	address advertisements under control of a privileged userspace
> +	program, at the cost of more netlink traffic to propagate all of
> +	the related events and commands.
> +
> +	This is a per-namespace sysctl.
> +
> +	Default: 0
> +
>  stale_loss_cnt - INTEGER
>  	The number of MPTCP-level retransmission intervals with no traffic and
>  	pending outstanding data on a given subflow required to declare it stale.
> diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
> index 8b235468c88f..e7aedb90e35a 100644
> --- a/net/mptcp/ctrl.c
> +++ b/net/mptcp/ctrl.c
> @@ -26,6 +26,7 @@ struct mptcp_pernet {
>  	u8 mptcp_enabled;
>  	u8 checksum_enabled;
>  	u8 allow_join_initial_addr_port;
> +	u8 userspace_pm_default;
>  };
>  
>  static struct mptcp_pernet *mptcp_get_pernet(const struct net *net)
> @@ -58,6 +59,11 @@ unsigned int mptcp_stale_loss_cnt(const struct net *net)
>  	return mptcp_get_pernet(net)->stale_loss_cnt;
>  }
>  
> +int mptcp_userspace_pm_default(const struct net *net)
> +{
> +	return mptcp_get_pernet(net)->userspace_pm_default;
> +}
> +
>  static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
>  {
>  	pernet->mptcp_enabled = 1;
> @@ -65,6 +71,7 @@ static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
>  	pernet->checksum_enabled = 0;
>  	pernet->allow_join_initial_addr_port = 1;
>  	pernet->stale_loss_cnt = 4;
> +	pernet->userspace_pm_default = 0;
>  }
>  
>  #ifdef CONFIG_SYSCTL
> @@ -108,6 +115,14 @@ static struct ctl_table mptcp_sysctl_table[] = {
>  		.mode = 0644,
>  		.proc_handler = proc_douintvec_minmax,
>  	},
> +	{
> +		.procname = "userspace_pm_default",
> +		.maxlen = sizeof(u8),
> +		.mode = 0644,
> +		.proc_handler = proc_dou8vec_minmax,
> +		.extra1       = SYSCTL_ZERO,
> +		.extra2       = SYSCTL_ONE
> +	},
>  	{}
>  };
>  
> @@ -128,6 +143,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
>  	table[2].data = &pernet->checksum_enabled;
>  	table[3].data = &pernet->allow_join_initial_addr_port;
>  	table[4].data = &pernet->stale_loss_cnt;
> +	table[5].data = &pernet->userspace_pm_default;
>  
>  	hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
>  	if (!hdr)
> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> index 3fa3c0faed48..e97a22fcc0fa 100644
> --- a/net/mptcp/pm.c
> +++ b/net/mptcp/pm.c
> @@ -367,6 +367,8 @@ void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk)
>  
>  void mptcp_pm_data_init(struct mptcp_sock *msk)
>  {
> +	struct sock *sk = (struct sock *)msk;
> +
>  	msk->pm.add_addr_signaled = 0;
>  	msk->pm.add_addr_accepted = 0;
>  	msk->pm.local_addr_used = 0;
> @@ -378,7 +380,7 @@ void mptcp_pm_data_init(struct mptcp_sock *msk)
>  	WRITE_ONCE(msk->pm.accept_addr, false);
>  	WRITE_ONCE(msk->pm.accept_subflow, false);
>  	WRITE_ONCE(msk->pm.remote_deny_join_id0, false);
> -	WRITE_ONCE(msk->pm.userspace, false);
> +	WRITE_ONCE(msk->pm.userspace, mptcp_userspace_pm_default(sock_net(sk)));

I'm wondering if we could set work_pending as well, and avoid some
multiple conditionals in patch 2/6

Other than that LGTM!

It would be nice to have some simple self-test, enabling user-space pm
and observing no action is taken on add_addr, connection established,
etc.

I think the procfs switch for user-space pm fits more easily self-
tests.

/P


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 4/6] mptcp: Add a per-namespace sysctl to set the default path manager type
  2021-10-07 10:27   ` Paolo Abeni
@ 2021-10-08 21:14     ` Mat Martineau
  0 siblings, 0 replies; 9+ messages in thread
From: Mat Martineau @ 2021-10-08 21:14 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp, kishen.maloor

On Thu, 7 Oct 2021, Paolo Abeni wrote:

> On Wed, 2021-10-06 at 16:59 -0700, Mat Martineau wrote:
>> The new net.mptcp.userspace_pm_default sysctl determines which path
>> manager will be used by each newly-created MPTCP socket.
>>
>> Alternative: set the per-namespace default using the netlink API instead.
>>
>> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
>> ---
>>  Documentation/networking/mptcp-sysctl.rst | 17 +++++++++++++++++
>>  net/mptcp/ctrl.c                          | 16 ++++++++++++++++
>>  net/mptcp/pm.c                            |  4 +++-
>>  net/mptcp/protocol.h                      |  1 +
>>  4 files changed, 37 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
>> index b0d4da71e68e..9f5d7bad9fba 100644
>> --- a/Documentation/networking/mptcp-sysctl.rst
>> +++ b/Documentation/networking/mptcp-sysctl.rst
>> @@ -46,6 +46,23 @@ allow_join_initial_addr_port - BOOLEAN
>>
>>  	Default: 1
>>
>> +userspace_pm_default - BOOLEAN
>> +
>> +	Set the default path manager type to use for each new MPTCP
>> +	socket.  The in-kernel path manager is used if this is set to 0,
>> +	and a userspace path manager is used if the value is set
>> +	to 1. In-kernel path management will control subflow connections
>> +	and address advertisements according to per-namespace values
>> +	configured over the MPTCP netlink API. Userspace path management
>> +	puts per-MPTCP-connection subflow connection decisions and
>> +	address advertisements under control of a privileged userspace
>> +	program, at the cost of more netlink traffic to propagate all of
>> +	the related events and commands.
>> +
>> +	This is a per-namespace sysctl.
>> +
>> +	Default: 0
>> +
>>  stale_loss_cnt - INTEGER
>>  	The number of MPTCP-level retransmission intervals with no traffic and
>>  	pending outstanding data on a given subflow required to declare it stale.
>> diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
>> index 8b235468c88f..e7aedb90e35a 100644
>> --- a/net/mptcp/ctrl.c
>> +++ b/net/mptcp/ctrl.c
>> @@ -26,6 +26,7 @@ struct mptcp_pernet {
>>  	u8 mptcp_enabled;
>>  	u8 checksum_enabled;
>>  	u8 allow_join_initial_addr_port;
>> +	u8 userspace_pm_default;
>>  };
>>
>>  static struct mptcp_pernet *mptcp_get_pernet(const struct net *net)
>> @@ -58,6 +59,11 @@ unsigned int mptcp_stale_loss_cnt(const struct net *net)
>>  	return mptcp_get_pernet(net)->stale_loss_cnt;
>>  }
>>
>> +int mptcp_userspace_pm_default(const struct net *net)
>> +{
>> +	return mptcp_get_pernet(net)->userspace_pm_default;
>> +}
>> +
>>  static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
>>  {
>>  	pernet->mptcp_enabled = 1;
>> @@ -65,6 +71,7 @@ static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
>>  	pernet->checksum_enabled = 0;
>>  	pernet->allow_join_initial_addr_port = 1;
>>  	pernet->stale_loss_cnt = 4;
>> +	pernet->userspace_pm_default = 0;
>>  }
>>
>>  #ifdef CONFIG_SYSCTL
>> @@ -108,6 +115,14 @@ static struct ctl_table mptcp_sysctl_table[] = {
>>  		.mode = 0644,
>>  		.proc_handler = proc_douintvec_minmax,
>>  	},
>> +	{
>> +		.procname = "userspace_pm_default",
>> +		.maxlen = sizeof(u8),
>> +		.mode = 0644,
>> +		.proc_handler = proc_dou8vec_minmax,
>> +		.extra1       = SYSCTL_ZERO,
>> +		.extra2       = SYSCTL_ONE
>> +	},
>>  	{}
>>  };
>>
>> @@ -128,6 +143,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
>>  	table[2].data = &pernet->checksum_enabled;
>>  	table[3].data = &pernet->allow_join_initial_addr_port;
>>  	table[4].data = &pernet->stale_loss_cnt;
>> +	table[5].data = &pernet->userspace_pm_default;
>>
>>  	hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
>>  	if (!hdr)
>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
>> index 3fa3c0faed48..e97a22fcc0fa 100644
>> --- a/net/mptcp/pm.c
>> +++ b/net/mptcp/pm.c
>> @@ -367,6 +367,8 @@ void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk)
>>
>>  void mptcp_pm_data_init(struct mptcp_sock *msk)
>>  {
>> +	struct sock *sk = (struct sock *)msk;
>> +
>>  	msk->pm.add_addr_signaled = 0;
>>  	msk->pm.add_addr_accepted = 0;
>>  	msk->pm.local_addr_used = 0;
>> @@ -378,7 +380,7 @@ void mptcp_pm_data_init(struct mptcp_sock *msk)
>>  	WRITE_ONCE(msk->pm.accept_addr, false);
>>  	WRITE_ONCE(msk->pm.accept_subflow, false);
>>  	WRITE_ONCE(msk->pm.remote_deny_join_id0, false);
>> -	WRITE_ONCE(msk->pm.userspace, false);
>> +	WRITE_ONCE(msk->pm.userspace, mptcp_userspace_pm_default(sock_net(sk)));
>
> I'm wondering if we could set work_pending as well, and avoid some
> multiple conditionals in patch 2/6

Since work_pending is only set to 'true' in mptcp_pm_nl_data_init() right 
now, that would work. work_pending (and a couple of other mptcp_pm_data 
members) are initialized twice right now, so I can clean that up and make 
the relationship with pm->userspace more obvious.

>
> Other than that LGTM!
>
> It would be nice to have some simple self-test, enabling user-space pm
> and observing no action is taken on add_addr, connection established,
> etc.
>

I will add tests for the next version of this series.

> I think the procfs switch for user-space pm fits more easily self-
> tests.


Thanks,

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-10-08 21:14 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-06 23:59 [RFC PATCH 0/6] mptcp: Add userspace PM mode to bypass kernel PM Mat Martineau
2021-10-06 23:59 ` [RFC PATCH 1/6] mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode Mat Martineau
2021-10-06 23:59 ` [RFC PATCH 2/6] mptcp: Bypass kernel PM when userspace PM is enabled Mat Martineau
2021-10-06 23:59 ` [RFC PATCH 3/6] mptcp: Make kernel path manager check for userspace-managed sockets Mat Martineau
2021-10-06 23:59 ` [RFC PATCH 4/6] mptcp: Add a per-namespace sysctl to set the default path manager type Mat Martineau
2021-10-07 10:27   ` Paolo Abeni
2021-10-08 21:14     ` Mat Martineau
2021-10-06 23:59 ` [RFC PATCH 5/6] DO-NOT-MERGE: debug: mptcp: Warn on use of in-kernel PM functions Mat Martineau
2021-10-06 23:59 ` [RFC PATCH 6/6] DO-NOT-MERGE: debug: mptcp: Warn on unexpected events when userspace PM is active Mat Martineau

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.