All of lore.kernel.org
 help / color / mirror / Atom feed
* [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag
@ 2021-05-14 14:32 Geliang Tang
  2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 1/6] mptcp: add sysctl allow_join_initial_addr_port Geliang Tang
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Geliang Tang @ 2021-05-14 14:32 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

v6:
 - rename mptcp_pm_add_entry to mptcp_pm_anno_entry
 - update mptcp_pm_free_add_list
 - update code in mptcp_pm_nl_add_addr_received
 - tag: export/20210514T055902

v5:
 - add a new patch "mptcp: add add_list in mptcp_pm_data"
 - fix !CONFIG_SYSCTL case
 - tag: export/20210507T174457

v4:
 - rename mptcp_is_allow_join_id0 to mptcp_allow_join_id0
 - rename deny_join_id0 in mptcp_pm_data to remote_deny_join_id0
 - add comments for self test cases
 - apply: export/20210504T064955 +
          "data checksum support" +
          "data checksum support cleanups"

v3:
 - use 'u8 allow_join_initial_addr_port'
 - drop the spinlock in patch 3

v2:
 - rename join_denied to allow_join_id0 in mptcp_out_options
 - rename join_denied to deny_join_id0 in mptcp_options_received
 - add a new function mptcp_pm_deny_join_id0_received
 - move deny_join_id0 flag from mptcp_sock to mptcp_pm_data
 - check deny_join_id0 flag in mptcp_pm_create_subflow_or_signal_addr
instead of in mptcp_syn_options.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/183

Geliang Tang (6):
  mptcp: add sysctl allow_join_initial_addr_port
  mptcp: add allow_join_id0 in mptcp_out_options
  mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry
  mptcp: add add_list in mptcp_pm_data
  mptcp: add deny_join_id0 in mptcp_options_received
  selftests: mptcp: add deny_join_id0 testcases

 Documentation/networking/mptcp-sysctl.rst     |  13 ++
 include/net/mptcp.h                           |   3 +-
 net/mptcp/ctrl.c                              |  16 ++
 net/mptcp/options.c                           |  14 +-
 net/mptcp/pm.c                                |   2 +
 net/mptcp/pm_netlink.c                        | 151 ++++++++++++++----
 net/mptcp/protocol.c                          |   1 +
 net/mptcp/protocol.h                          |  21 ++-
 net/mptcp/subflow.c                           |   3 +
 .../testing/selftests/net/mptcp/mptcp_join.sh |  56 ++++++-
 10 files changed, 238 insertions(+), 42 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [MPTCP][PATCH v6 mptcp-next 1/6] mptcp: add sysctl allow_join_initial_addr_port
  2021-05-14 14:32 [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag Geliang Tang
@ 2021-05-14 14:32 ` Geliang Tang
  2021-05-14 14:32   ` [MPTCP][PATCH v6 mptcp-next 2/6] mptcp: add allow_join_id0 in mptcp_out_options Geliang Tang
  2021-05-14 23:36 ` [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag Mat Martineau
  2021-05-17 14:17 ` Paolo Abeni
  2 siblings, 1 reply; 16+ messages in thread
From: Geliang Tang @ 2021-05-14 14:32 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang, Florian Westphal

This patch added a new sysctl, named allow_join_initial_addr_port, to
control whether allow peers to send join requests to the IP address and
port number used by the initial subflow.

Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 Documentation/networking/mptcp-sysctl.rst | 13 +++++++++++++
 net/mptcp/ctrl.c                          | 16 ++++++++++++++++
 net/mptcp/protocol.h                      |  1 +
 3 files changed, 30 insertions(+)

diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
index ee06fd782465..76d939e688b8 100644
--- a/Documentation/networking/mptcp-sysctl.rst
+++ b/Documentation/networking/mptcp-sysctl.rst
@@ -32,3 +32,16 @@ checksum_enabled - BOOLEAN
 	per-namespace sysctl.
 
 	Default: 0
+
+allow_join_initial_addr_port - BOOLEAN
+	Allow peers to send join requests to the IP address and port number used
+	by the initial subflow if the value is 1. This controls a flag that is
+	sent to the peer at connection time, and whether such join requests are
+	accepted or denied.
+
+	Joins to addresses advertised with ADD_ADDR are not affected by this
+	value.
+
+	This is a per-namespace sysctl.
+
+	Default: 1
diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
index 6c2639bb9c19..7d738bd06f2c 100644
--- a/net/mptcp/ctrl.c
+++ b/net/mptcp/ctrl.c
@@ -24,6 +24,7 @@ struct mptcp_pernet {
 	u8 mptcp_enabled;
 	unsigned int add_addr_timeout;
 	u8 checksum_enabled;
+	u8 allow_join_initial_addr_port;
 };
 
 static struct mptcp_pernet *mptcp_get_pernet(struct net *net)
@@ -46,11 +47,17 @@ int mptcp_is_checksum_enabled(struct net *net)
 	return mptcp_get_pernet(net)->checksum_enabled;
 }
 
+int mptcp_allow_join_id0(struct net *net)
+{
+	return mptcp_get_pernet(net)->allow_join_initial_addr_port;
+}
+
 static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
 {
 	pernet->mptcp_enabled = 1;
 	pernet->add_addr_timeout = TCP_RTO_MAX;
 	pernet->checksum_enabled = 0;
+	pernet->allow_join_initial_addr_port = 1;
 }
 
 #ifdef CONFIG_SYSCTL
@@ -80,6 +87,14 @@ static struct ctl_table mptcp_sysctl_table[] = {
 		.extra1       = SYSCTL_ZERO,
 		.extra2       = SYSCTL_ONE
 	},
+	{
+		.procname = "allow_join_initial_addr_port",
+		.maxlen = sizeof(u8),
+		.mode = 0644,
+		.proc_handler = proc_dou8vec_minmax,
+		.extra1       = SYSCTL_ZERO,
+		.extra2       = SYSCTL_ONE
+	},
 	{}
 };
 
@@ -98,6 +113,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
 	table[0].data = &pernet->mptcp_enabled;
 	table[1].data = &pernet->add_addr_timeout;
 	table[2].data = &pernet->checksum_enabled;
+	table[3].data = &pernet->allow_join_initial_addr_port;
 
 	hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
 	if (!hdr)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 868e878af526..3d6bbdf00bfb 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -543,6 +543,7 @@ static inline void mptcp_subflow_delegated_done(struct mptcp_subflow_context *su
 int mptcp_is_enabled(struct net *net);
 unsigned int mptcp_get_add_addr_timeout(struct net *net);
 int mptcp_is_checksum_enabled(struct net *net);
+int mptcp_allow_join_id0(struct net *net);
 void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow,
 				     struct mptcp_options_received *mp_opt);
 bool mptcp_subflow_data_available(struct sock *sk);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [MPTCP][PATCH v6 mptcp-next 2/6] mptcp: add allow_join_id0 in mptcp_out_options
  2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 1/6] mptcp: add sysctl allow_join_initial_addr_port Geliang Tang
@ 2021-05-14 14:32   ` Geliang Tang
  2021-05-14 14:32     ` [MPTCP][PATCH v6 mptcp-next 3/6] mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry Geliang Tang
  0 siblings, 1 reply; 16+ messages in thread
From: Geliang Tang @ 2021-05-14 14:32 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

This patch defined a new flag MPTCP_CAP_DENY_JOIN_ID0 for the third bit,
labeled "C" of the MP_CAPABLE option.

Add a new flag allow_join_id0 in struct mptcp_out_options. If this flag is
set, send out the MP_CAPABLE option with the flag MPTCP_CAP_DENY_JOIN_ID0.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 include/net/mptcp.h  | 3 ++-
 net/mptcp/options.c  | 6 ++++++
 net/mptcp/protocol.h | 6 ++++--
 net/mptcp/subflow.c  | 1 +
 4 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index d61bbbf11979..cb580b06152f 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -67,7 +67,8 @@ struct mptcp_out_options {
 	u8 backup;
 	u8 reset_reason:4,
 	   reset_transient:1,
-	   csum_reqd:1;
+	   csum_reqd:1,
+	   allow_join_id0:1;
 	u32 nonce;
 	u64 thmac;
 	u32 token;
diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index b713828819b6..ab9e39d2e645 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -401,6 +401,7 @@ bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
 	if (subflow->request_mptcp) {
 		opts->suboptions = OPTION_MPTCP_MPC_SYN;
 		opts->csum_reqd = mptcp_is_checksum_enabled(sock_net(sk));
+		opts->allow_join_id0 = mptcp_allow_join_id0(sock_net(sk));
 		*size = TCPOLEN_MPTCP_MPC_SYN;
 		return true;
 	} else if (subflow->request_join) {
@@ -489,6 +490,7 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 		opts->sndr_key = subflow->local_key;
 		opts->rcvr_key = subflow->remote_key;
 		opts->csum_reqd = READ_ONCE(msk->csum_enabled);
+		opts->allow_join_id0 = mptcp_allow_join_id0(sock_net(sk));
 
 		/* Section 3.1.
 		 * The MP_CAPABLE option is carried on the SYN, SYN/ACK, and ACK
@@ -826,6 +828,7 @@ bool mptcp_synack_options(const struct request_sock *req, unsigned int *size,
 		opts->suboptions = OPTION_MPTCP_MPC_SYNACK;
 		opts->sndr_key = subflow_req->local_key;
 		opts->csum_reqd = subflow_req->csum_reqd;
+		opts->allow_join_id0 = subflow_req->allow_join_id0;
 		*size = TCPOLEN_MPTCP_MPC_SYNACK;
 		pr_debug("subflow_req=%p, local_key=%llu",
 			 subflow_req, subflow_req->local_key);
@@ -1200,6 +1203,9 @@ void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp,
 		if (opts->csum_reqd)
 			flag |= MPTCP_CAP_CHECKSUM_REQD;
 
+		if (!opts->allow_join_id0)
+			flag |= MPTCP_CAP_DENY_JOIN_ID0;
+
 		*ptr++ = mptcp_option(MPTCPOPT_MP_CAPABLE, len,
 				      MPTCP_SUPPORTED_VERSION,
 				      flag);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 3d6bbdf00bfb..fd7ff2d28cb3 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -79,8 +79,9 @@
 #define MPTCP_VERSION_MASK	(0x0F)
 #define MPTCP_CAP_CHECKSUM_REQD	BIT(7)
 #define MPTCP_CAP_EXTENSIBILITY	BIT(6)
+#define MPTCP_CAP_DENY_JOIN_ID0	BIT(5)
 #define MPTCP_CAP_HMAC_SHA256	BIT(0)
-#define MPTCP_CAP_FLAG_MASK	(0x3F)
+#define MPTCP_CAP_FLAG_MASK	(0x1F)
 
 /* MPTCP DSS flags */
 #define MPTCP_DSS_DATA_FIN	BIT(4)
@@ -352,7 +353,8 @@ struct mptcp_subflow_request_sock {
 	u16	mp_capable : 1,
 		mp_join : 1,
 		backup : 1,
-		csum_reqd : 1;
+		csum_reqd : 1,
+		allow_join_id0 : 1;
 	u8	local_id;
 	u8	remote_id;
 	u64	local_key;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 554e7ccee02a..41e1b5091e42 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -109,6 +109,7 @@ static void subflow_init_req(struct request_sock *req, const struct sock *sk_lis
 	subflow_req->mp_capable = 0;
 	subflow_req->mp_join = 0;
 	subflow_req->csum_reqd = mptcp_is_checksum_enabled(sock_net(sk_listener));
+	subflow_req->allow_join_id0 = mptcp_allow_join_id0(sock_net(sk_listener));
 	subflow_req->msk = NULL;
 	mptcp_token_init_request(req);
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [MPTCP][PATCH v6 mptcp-next 3/6] mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry
  2021-05-14 14:32   ` [MPTCP][PATCH v6 mptcp-next 2/6] mptcp: add allow_join_id0 in mptcp_out_options Geliang Tang
@ 2021-05-14 14:32     ` Geliang Tang
  2021-05-14 14:32       ` [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data Geliang Tang
  0 siblings, 1 reply; 16+ messages in thread
From: Geliang Tang @ 2021-05-14 14:32 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

This patch renamed struct mptcp_pm_add_entry to mptcp_pm_anno_entry, since
these entries will be added to the anno_list of PM.

Also rename the struct member add_timer to anno_timer.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 net/mptcp/options.c    |  2 +-
 net/mptcp/pm_netlink.c | 50 +++++++++++++++++++++---------------------
 net/mptcp/protocol.h   |  8 +++----
 3 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index ab9e39d2e645..82fc481a25f0 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -1063,7 +1063,7 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
 			MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ADDADDR);
 		} else {
 			mptcp_pm_add_addr_echoed(msk, &mp_opt.addr);
-			mptcp_pm_del_add_timer(msk, &mp_opt.addr);
+			mptcp_pm_del_anno_timer(msk, &mp_opt.addr);
 			MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ECHOADD);
 		}
 
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index d094588afad8..4fff8aef45e4 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -31,10 +31,10 @@ struct mptcp_pm_addr_entry {
 	struct socket		*lsk;
 };
 
-struct mptcp_pm_add_entry {
+struct mptcp_pm_anno_entry {
 	struct list_head	list;
 	struct mptcp_addr_info	addr;
-	struct timer_list	add_timer;
+	struct timer_list	anno_timer;
 	struct mptcp_sock	*sock;
 	u8			retrans_times;
 };
@@ -263,11 +263,11 @@ static void check_work_pending(struct mptcp_sock *msk)
 		WRITE_ONCE(msk->pm.work_pending, false);
 }
 
-struct mptcp_pm_add_entry *
+struct mptcp_pm_anno_entry *
 mptcp_lookup_anno_list_by_saddr(struct mptcp_sock *msk,
 				struct mptcp_addr_info *addr)
 {
-	struct mptcp_pm_add_entry *entry;
+	struct mptcp_pm_anno_entry *entry;
 
 	lockdep_assert_held(&msk->pm.lock);
 
@@ -281,7 +281,7 @@ mptcp_lookup_anno_list_by_saddr(struct mptcp_sock *msk,
 
 bool mptcp_pm_sport_in_anno_list(struct mptcp_sock *msk, const struct sock *sk)
 {
-	struct mptcp_pm_add_entry *entry;
+	struct mptcp_pm_anno_entry *entry;
 	struct mptcp_addr_info saddr;
 	bool ret = false;
 
@@ -300,9 +300,9 @@ bool mptcp_pm_sport_in_anno_list(struct mptcp_sock *msk, const struct sock *sk)
 	return ret;
 }
 
-static void mptcp_pm_add_timer(struct timer_list *timer)
+static void mptcp_pm_anno_timer(struct timer_list *timer)
 {
-	struct mptcp_pm_add_entry *entry = from_timer(entry, timer, add_timer);
+	struct mptcp_pm_anno_entry *entry = from_timer(entry, timer, anno_timer);
 	struct mptcp_sock *msk = entry->sock;
 	struct sock *sk = (struct sock *)msk;
 
@@ -344,11 +344,11 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
 	__sock_put(sk);
 }
 
-struct mptcp_pm_add_entry *
-mptcp_pm_del_add_timer(struct mptcp_sock *msk,
-		       struct mptcp_addr_info *addr)
+struct mptcp_pm_anno_entry *
+mptcp_pm_del_anno_timer(struct mptcp_sock *msk,
+		        struct mptcp_addr_info *addr)
 {
-	struct mptcp_pm_add_entry *entry;
+	struct mptcp_pm_anno_entry *entry;
 	struct sock *sk = (struct sock *)msk;
 
 	spin_lock_bh(&msk->pm.lock);
@@ -358,7 +358,7 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk,
 	spin_unlock_bh(&msk->pm.lock);
 
 	if (entry)
-		sk_stop_timer_sync(sk, &entry->add_timer);
+		sk_stop_timer_sync(sk, &entry->anno_timer);
 
 	return entry;
 }
@@ -366,7 +366,7 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk,
 static bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
 				     struct mptcp_pm_addr_entry *entry)
 {
-	struct mptcp_pm_add_entry *add_entry = NULL;
+	struct mptcp_pm_anno_entry *anno_entry = NULL;
 	struct sock *sk = (struct sock *)msk;
 	struct net *net = sock_net(sk);
 
@@ -375,18 +375,18 @@ static bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
 	if (mptcp_lookup_anno_list_by_saddr(msk, &entry->addr))
 		return false;
 
-	add_entry = kmalloc(sizeof(*add_entry), GFP_ATOMIC);
-	if (!add_entry)
+	anno_entry = kmalloc(sizeof(*anno_entry), GFP_ATOMIC);
+	if (!anno_entry)
 		return false;
 
-	list_add(&add_entry->list, &msk->pm.anno_list);
+	list_add(&anno_entry->list, &msk->pm.anno_list);
 
-	add_entry->addr = entry->addr;
-	add_entry->sock = msk;
-	add_entry->retrans_times = 0;
+	anno_entry->addr = entry->addr;
+	anno_entry->sock = msk;
+	anno_entry->retrans_times = 0;
 
-	timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0);
-	sk_reset_timer(sk, &add_entry->add_timer,
+	timer_setup(&anno_entry->anno_timer, mptcp_pm_anno_timer, 0);
+	sk_reset_timer(sk, &anno_entry->anno_timer,
 		       jiffies + mptcp_get_add_addr_timeout(net));
 
 	return true;
@@ -394,7 +394,7 @@ static bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
 
 void mptcp_pm_free_anno_list(struct mptcp_sock *msk)
 {
-	struct mptcp_pm_add_entry *entry, *tmp;
+	struct mptcp_pm_anno_entry *entry, *tmp;
 	struct sock *sk = (struct sock *)msk;
 	LIST_HEAD(free_list);
 
@@ -405,7 +405,7 @@ void mptcp_pm_free_anno_list(struct mptcp_sock *msk)
 	spin_unlock_bh(&msk->pm.lock);
 
 	list_for_each_entry_safe(entry, tmp, &free_list, list) {
-		sk_stop_timer_sync(sk, &entry->add_timer);
+		sk_stop_timer_sync(sk, &entry->anno_timer);
 		kfree(entry);
 	}
 }
@@ -1068,9 +1068,9 @@ __lookup_addr_by_id(struct pm_nl_pernet *pernet, unsigned int id)
 static bool remove_anno_list_by_saddr(struct mptcp_sock *msk,
 				      struct mptcp_addr_info *addr)
 {
-	struct mptcp_pm_add_entry *entry;
+	struct mptcp_pm_anno_entry *entry;
 
-	entry = mptcp_pm_del_add_timer(msk, addr);
+	entry = mptcp_pm_del_anno_timer(msk, addr);
 	if (entry) {
 		list_del(&entry->list);
 		kfree(entry);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index fd7ff2d28cb3..5ccc0d3e5693 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -694,10 +694,10 @@ int mptcp_pm_nl_mp_prio_send_ack(struct mptcp_sock *msk,
 				 u8 bkup);
 void mptcp_pm_free_anno_list(struct mptcp_sock *msk);
 bool mptcp_pm_sport_in_anno_list(struct mptcp_sock *msk, const struct sock *sk);
-struct mptcp_pm_add_entry *
-mptcp_pm_del_add_timer(struct mptcp_sock *msk,
-		       struct mptcp_addr_info *addr);
-struct mptcp_pm_add_entry *
+struct mptcp_pm_anno_entry *
+mptcp_pm_del_anno_timer(struct mptcp_sock *msk,
+		        struct mptcp_addr_info *addr);
+struct mptcp_pm_anno_entry *
 mptcp_lookup_anno_list_by_saddr(struct mptcp_sock *msk,
 				struct mptcp_addr_info *addr);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data
  2021-05-14 14:32     ` [MPTCP][PATCH v6 mptcp-next 3/6] mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry Geliang Tang
@ 2021-05-14 14:32       ` Geliang Tang
  2021-05-14 14:32         ` [MPTCP][PATCH v6 mptcp-next 5/6] mptcp: add deny_join_id0 in mptcp_options_received Geliang Tang
  2021-05-20 22:50         ` [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data Mat Martineau
  0 siblings, 2 replies; 16+ messages in thread
From: Geliang Tang @ 2021-05-14 14:32 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

Like the anno_list member in struct mptcp_pm_data, this patch added a
new member named add_list in it, to save all the received ADD_ADDRs in
this add_list.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 net/mptcp/pm.c         |  1 +
 net/mptcp/pm_netlink.c | 89 +++++++++++++++++++++++++++++++++++++++---
 net/mptcp/protocol.c   |  1 +
 net/mptcp/protocol.h   |  2 +
 4 files changed, 87 insertions(+), 6 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 9d00fa6d22e9..9456fe17b6a3 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -324,6 +324,7 @@ void mptcp_pm_data_init(struct mptcp_sock *msk)
 
 	spin_lock_init(&msk->pm.lock);
 	INIT_LIST_HEAD(&msk->pm.anno_list);
+	INIT_LIST_HEAD(&msk->pm.add_list);
 
 	mptcp_pm_nl_data_init(msk);
 }
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 4fff8aef45e4..3fcc167ea702 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -39,6 +39,11 @@ struct mptcp_pm_anno_entry {
 	u8			retrans_times;
 };
 
+struct mptcp_pm_add_entry {
+	struct list_head	list;
+	struct mptcp_addr_info	addr;
+};
+
 #define MAX_ADDR_ID		255
 #define BITMAP_SZ DIV_ROUND_UP(MAX_ADDR_ID + 1, BITS_PER_LONG)
 
@@ -483,6 +488,69 @@ static void mptcp_pm_nl_subflow_established(struct mptcp_sock *msk)
 	mptcp_pm_create_subflow_or_signal_addr(msk);
 }
 
+struct mptcp_pm_add_entry *
+mptcp_lookup_add_list_by_id(struct mptcp_sock *msk, u8 id)
+{
+	struct mptcp_pm_add_entry *entry;
+
+	lockdep_assert_held(&msk->pm.lock);
+
+	list_for_each_entry(entry, &msk->pm.add_list, list) {
+		if (entry->addr.id == id)
+			return entry;
+	}
+
+	return NULL;
+}
+
+struct mptcp_pm_add_entry *
+mptcp_lookup_add_list_by_saddr(struct mptcp_sock *msk,
+			       struct mptcp_addr_info *addr)
+{
+	struct mptcp_pm_add_entry *entry;
+
+	lockdep_assert_held(&msk->pm.lock);
+
+	list_for_each_entry(entry, &msk->pm.add_list, list) {
+		if (addresses_equal(&entry->addr, addr, true))
+			return entry;
+	}
+
+	return NULL;
+}
+
+static bool mptcp_pm_alloc_add_list(struct mptcp_sock *msk,
+				    struct mptcp_addr_info *addr)
+{
+	struct mptcp_pm_add_entry *add_entry = NULL;
+
+	lockdep_assert_held(&msk->pm.lock);
+
+	if (mptcp_lookup_add_list_by_saddr(msk, addr))
+		return false;
+
+	add_entry = kmalloc(sizeof(*add_entry), GFP_ATOMIC);
+	if (!add_entry)
+		return false;
+
+	list_add(&add_entry->list, &msk->pm.add_list);
+	add_entry->addr = *addr;
+
+	return true;
+}
+
+void mptcp_pm_free_add_list(struct mptcp_sock *msk)
+{
+	struct mptcp_pm_add_entry *entry, *tmp;
+
+	pr_debug("msk=%p", msk);
+
+	spin_lock_bh(&msk->pm.lock);
+	list_for_each_entry_safe(entry, tmp, &msk->pm.add_list, list)
+		kfree(entry);
+	spin_unlock_bh(&msk->pm.lock);
+}
+
 static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
 {
 	struct sock *sk = (struct sock *)msk;
@@ -501,12 +569,6 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
 	if (lookup_subflow_by_daddr(&msk->conn_list, &msk->pm.remote))
 		goto add_addr_echo;
 
-	msk->pm.add_addr_accepted++;
-	msk->pm.subflows++;
-	if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
-	    msk->pm.subflows >= subflows_max)
-		WRITE_ONCE(msk->pm.accept_addr, false);
-
 	/* connect to the specified remote address, using whatever
 	 * local address the routing configuration will pick.
 	 */
@@ -516,6 +578,15 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
 	memset(&local, 0, sizeof(local));
 	local.family = remote.family;
 
+	if (!mptcp_pm_alloc_add_list(msk, &remote))
+		return;
+
+	msk->pm.add_addr_accepted++;
+	msk->pm.subflows++;
+	if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
+	    msk->pm.subflows >= subflows_max)
+		WRITE_ONCE(msk->pm.accept_addr, false);
+
 	spin_unlock_bh(&msk->pm.lock);
 	__mptcp_subflow_connect(sk, &local, &remote, 0, 0);
 	spin_lock_bh(&msk->pm.lock);
@@ -612,6 +683,7 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
 		list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) {
 			struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
 			int how = RCV_SHUTDOWN | SEND_SHUTDOWN;
+			struct mptcp_pm_add_entry *entry;
 			u8 id = subflow->local_id;
 
 			if (rm_type == MPTCP_MIB_RMADDR)
@@ -631,6 +703,11 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
 			if (rm_type == MPTCP_MIB_RMADDR) {
 				msk->pm.add_addr_accepted--;
 				WRITE_ONCE(msk->pm.accept_addr, true);
+				entry = mptcp_lookup_add_list_by_id(msk, id);
+				if (entry) {
+					list_del(&entry->list);
+					kfree(entry);
+				}
 			} else if (rm_type == MPTCP_MIB_RMSUBFLOW) {
 				msk->pm.local_addr_used--;
 			}
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 35c0b1ca95c3..05ceba3972f6 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2888,6 +2888,7 @@ void mptcp_destroy_common(struct mptcp_sock *msk)
 	skb_rbtree_purge(&msk->out_of_order_queue);
 	mptcp_token_destroy(msk);
 	mptcp_pm_free_anno_list(msk);
+	mptcp_pm_free_add_list(msk);
 }
 
 static void mptcp_destroy(struct sock *sk)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 5ccc0d3e5693..1df8da3da695 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -185,6 +185,7 @@ struct mptcp_pm_data {
 	struct mptcp_addr_info local;
 	struct mptcp_addr_info remote;
 	struct list_head anno_list;
+	struct list_head add_list;
 
 	spinlock_t	lock;		/*protects the whole PM data */
 
@@ -693,6 +694,7 @@ int mptcp_pm_nl_mp_prio_send_ack(struct mptcp_sock *msk,
 				 struct mptcp_addr_info *addr,
 				 u8 bkup);
 void mptcp_pm_free_anno_list(struct mptcp_sock *msk);
+void mptcp_pm_free_add_list(struct mptcp_sock *msk);
 bool mptcp_pm_sport_in_anno_list(struct mptcp_sock *msk, const struct sock *sk);
 struct mptcp_pm_anno_entry *
 mptcp_pm_del_anno_timer(struct mptcp_sock *msk,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [MPTCP][PATCH v6 mptcp-next 5/6] mptcp: add deny_join_id0 in mptcp_options_received
  2021-05-14 14:32       ` [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data Geliang Tang
@ 2021-05-14 14:32         ` Geliang Tang
  2021-05-14 14:32           ` [MPTCP][PATCH v6 mptcp-next 6/6] selftests: mptcp: add deny_join_id0 testcases Geliang Tang
  2021-05-20 22:50         ` [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data Mat Martineau
  1 sibling, 1 reply; 16+ messages in thread
From: Geliang Tang @ 2021-05-14 14:32 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang, Florian Westphal

This patch added a new flag named deny_join_id0 in struct
mptcp_options_received. Set it when MP_CAPABLE with the flag
MPTCP_CAP_DENYJOIN_ID0 is received.

Also add a new flag remote_deny_join_id0 in struct mptcp_pm_data. When the
flag deny_join_id0 is set, set this remote_deny_join_id0 flag.

In mptcp_pm_create_subflow_or_signal_addr, if the remote_deny_join_id0 flag
is set, and the remote address id is zero, stop this connection.

Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 net/mptcp/options.c    |  6 ++++++
 net/mptcp/pm.c         |  1 +
 net/mptcp/pm_netlink.c | 12 +++++++++++-
 net/mptcp/protocol.h   |  4 +++-
 net/mptcp/subflow.c    |  2 ++
 5 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 82fc481a25f0..bb600eb29f3c 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -83,6 +83,9 @@ static void mptcp_parse_option(const struct sk_buff *skb,
 		if (flags & MPTCP_CAP_CHECKSUM_REQD)
 			mp_opt->csum_reqd = 1;
 
+		if (flags & MPTCP_CAP_DENY_JOIN_ID0)
+			mp_opt->deny_join_id0 = 1;
+
 		mp_opt->mp_capable = 1;
 		if (opsize >= TCPOLEN_MPTCP_MPC_SYNACK) {
 			mp_opt->sndr_key = get_unaligned_be64(ptr);
@@ -361,6 +364,7 @@ void mptcp_get_options(const struct sock *sk,
 	mp_opt->mp_prio = 0;
 	mp_opt->reset = 0;
 	mp_opt->csum_reqd = READ_ONCE(msk->csum_enabled);
+	mp_opt->deny_join_id0 = 0;
 
 	length = (th->doff * 4) - sizeof(struct tcphdr);
 	ptr = (const unsigned char *)(th + 1);
@@ -1048,6 +1052,8 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
 	}
 
 	mptcp_get_options(sk, skb, &mp_opt);
+	if (mp_opt.deny_join_id0)
+		WRITE_ONCE(msk->pm.remote_deny_join_id0, true);
 	if (!check_fully_established(msk, sk, subflow, skb, &mp_opt))
 		return;
 
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 9456fe17b6a3..23a309f641aa 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -320,6 +320,7 @@ void mptcp_pm_data_init(struct mptcp_sock *msk)
 	WRITE_ONCE(msk->pm.addr_signal, 0);
 	WRITE_ONCE(msk->pm.accept_addr, false);
 	WRITE_ONCE(msk->pm.accept_subflow, false);
+	WRITE_ONCE(msk->pm.remote_deny_join_id0, false);
 	msk->pm.status = 0;
 
 	spin_lock_init(&msk->pm.lock);
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 3fcc167ea702..bdc174cdb42e 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -461,10 +461,20 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
 		if (local) {
 			struct mptcp_addr_info remote = { 0 };
 
+			remote_address((struct sock_common *)sk, &remote);
+			if (READ_ONCE(msk->pm.remote_deny_join_id0)) {
+				struct mptcp_pm_add_entry *entry;
+
+				entry = list_first_entry_or_null(&msk->pm.add_list,
+								 typeof(*entry), list);
+				if (!entry)
+					return;
+
+				remote = entry->addr;
+			}
 			msk->pm.local_addr_used++;
 			msk->pm.subflows++;
 			check_work_pending(msk);
-			remote_address((struct sock_common *)sk, &remote);
 			spin_unlock_bh(&msk->pm.lock);
 			__mptcp_subflow_connect(sk, &local->addr, &remote,
 						local->flags, local->ifindex);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 1df8da3da695..136b7f9b5cd9 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -138,7 +138,8 @@ struct mptcp_options_received {
 		mp_prio : 1,
 		echo : 1,
 		csum_reqd : 1,
-		backup : 1;
+		backup : 1,
+		deny_join_id0 : 1;
 	u32	token;
 	u32	nonce;
 	u64	thmac;
@@ -194,6 +195,7 @@ struct mptcp_pm_data {
 	bool		work_pending;
 	bool		accept_addr;
 	bool		accept_subflow;
+	bool		remote_deny_join_id0;
 	u8		add_addr_signaled;
 	u8		add_addr_accepted;
 	u8		local_addr_used;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 41e1b5091e42..0e6e9c9dc976 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -408,6 +408,8 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb)
 
 		if (mp_opt.csum_reqd)
 			WRITE_ONCE(mptcp_sk(parent)->csum_enabled, true);
+		if (mp_opt.deny_join_id0)
+			WRITE_ONCE(mptcp_sk(parent)->pm.remote_deny_join_id0, true);
 		subflow->mp_capable = 1;
 		subflow->can_ack = 1;
 		subflow->remote_key = mp_opt.sndr_key;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [MPTCP][PATCH v6 mptcp-next 6/6] selftests: mptcp: add deny_join_id0 testcases
  2021-05-14 14:32         ` [MPTCP][PATCH v6 mptcp-next 5/6] mptcp: add deny_join_id0 in mptcp_options_received Geliang Tang
@ 2021-05-14 14:32           ` Geliang Tang
  0 siblings, 0 replies; 16+ messages in thread
From: Geliang Tang @ 2021-05-14 14:32 UTC (permalink / raw)
  To: mptcp; +Cc: Geliang Tang

This patch added a new argument '-d' for mptcp_join.sh script, to invoke
the testcases for the MP_CAPABLE 'C' flag.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 .../testing/selftests/net/mptcp/mptcp_join.sh | 56 ++++++++++++++++++-
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 523c7797f30a..17b385f011d2 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -139,6 +139,17 @@ reset_with_checksum()
 	ip netns exec $ns2 sysctl -q net.mptcp.checksum_enabled=$ns2_enable
 }
 
+reset_with_allow_join_id0()
+{
+	local ns1_enable=$1
+	local ns2_enable=$2
+
+	reset
+
+	ip netns exec $ns1 sysctl -q net.mptcp.allow_join_initial_addr_port=$ns1_enable
+	ip netns exec $ns2 sysctl -q net.mptcp.allow_join_initial_addr_port=$ns2_enable
+}
+
 ip -Version > /dev/null 2>&1
 if [ $? -ne 0 ];then
 	echo "SKIP: Could not run test without ip tool"
@@ -1462,6 +1473,44 @@ checksum_tests()
 	chk_csum_nr "checksum test 1 0"
 }
 
+deny_join_id0_tests()
+{
+	# subflow allow join id0 ns1
+	reset_with_allow_join_id0 1 0
+	ip netns exec $ns1 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns2 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_join_nr "single subflow allow join id0 ns1" 1 1 1
+
+	# subflow allow join id0 ns2
+	reset_with_allow_join_id0 0 1
+	ip netns exec $ns1 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns2 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_join_nr "single subflow allow join id0 ns2" 0 0 0
+
+	# subflow and address allow join id0
+	reset_with_allow_join_id0 0 1
+	ip netns exec $ns1 ./pm_nl_ctl limits 2 2
+	ip netns exec $ns2 ./pm_nl_ctl limits 2 2
+	ip netns exec $ns1 ./pm_nl_ctl add 10.0.2.1 flags signal
+	ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_join_nr "subflow and address allow join id0" 2 2 2
+
+	# signal address allow join id0
+	# ADD_ADDRs are not affected by allow_join_id0 value.
+	reset_with_allow_join_id0 0 0
+	ip netns exec $ns1 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns2 ./pm_nl_ctl limits 1 1
+	ip netns exec $ns1 ./pm_nl_ctl add 10.0.2.1 flags signal
+	run_tests $ns1 $ns2 10.0.1.1
+	chk_join_nr "signal address allow join id0" 1 1 1
+	chk_add_nr 1 1
+}
+
 all_tests()
 {
 	subflows_tests
@@ -1476,6 +1525,7 @@ all_tests()
 	add_addr_ports_tests
 	syncookies_tests
 	checksum_tests
+	deny_join_id0_tests
 }
 
 usage()
@@ -1493,6 +1543,7 @@ usage()
 	echo "  -p add_addr_ports_tests"
 	echo "  -k syncookies_tests"
 	echo "  -S checksum_tests"
+	echo "  -d deny_join_id0_tests"
 	echo "  -c capture pcap files"
 	echo "  -C enable data checksum"
 	echo "  -h help"
@@ -1528,7 +1579,7 @@ if [ $do_all_tests -eq 1 ]; then
 	exit $ret
 fi
 
-while getopts 'fsltra64bpkchCS' opt; do
+while getopts 'fsltra64bpkdchCS' opt; do
 	case $opt in
 		f)
 			subflows_tests
@@ -1566,6 +1617,9 @@ while getopts 'fsltra64bpkchCS' opt; do
 		S)
 			checksum_tests
 			;;
+		d)
+			deny_join_id0_tests
+			;;
 		c)
 			;;
 		C)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag
  2021-05-14 14:32 [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag Geliang Tang
  2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 1/6] mptcp: add sysctl allow_join_initial_addr_port Geliang Tang
@ 2021-05-14 23:36 ` Mat Martineau
  2021-05-17 14:17 ` Paolo Abeni
  2 siblings, 0 replies; 16+ messages in thread
From: Mat Martineau @ 2021-05-14 23:36 UTC (permalink / raw)
  To: Geliang Tang; +Cc: mptcp


On Fri, 14 May 2021, Geliang Tang wrote:

> v6:
> - rename mptcp_pm_add_entry to mptcp_pm_anno_entry
> - update mptcp_pm_free_add_list
> - update code in mptcp_pm_nl_add_addr_received
> - tag: export/20210514T055902
>

Hi Geliang -

Thanks for the changes, v6 looks good for the export branch.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>


-Mat


> v5:
> - add a new patch "mptcp: add add_list in mptcp_pm_data"
> - fix !CONFIG_SYSCTL case
> - tag: export/20210507T174457
>
> v4:
> - rename mptcp_is_allow_join_id0 to mptcp_allow_join_id0
> - rename deny_join_id0 in mptcp_pm_data to remote_deny_join_id0
> - add comments for self test cases
> - apply: export/20210504T064955 +
>          "data checksum support" +
>          "data checksum support cleanups"
>
> v3:
> - use 'u8 allow_join_initial_addr_port'
> - drop the spinlock in patch 3
>
> v2:
> - rename join_denied to allow_join_id0 in mptcp_out_options
> - rename join_denied to deny_join_id0 in mptcp_options_received
> - add a new function mptcp_pm_deny_join_id0_received
> - move deny_join_id0 flag from mptcp_sock to mptcp_pm_data
> - check deny_join_id0 flag in mptcp_pm_create_subflow_or_signal_addr
> instead of in mptcp_syn_options.
>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/183
>
> Geliang Tang (6):
>  mptcp: add sysctl allow_join_initial_addr_port
>  mptcp: add allow_join_id0 in mptcp_out_options
>  mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry
>  mptcp: add add_list in mptcp_pm_data
>  mptcp: add deny_join_id0 in mptcp_options_received
>  selftests: mptcp: add deny_join_id0 testcases
>
> Documentation/networking/mptcp-sysctl.rst     |  13 ++
> include/net/mptcp.h                           |   3 +-
> net/mptcp/ctrl.c                              |  16 ++
> net/mptcp/options.c                           |  14 +-
> net/mptcp/pm.c                                |   2 +
> net/mptcp/pm_netlink.c                        | 151 ++++++++++++++----
> net/mptcp/protocol.c                          |   1 +
> net/mptcp/protocol.h                          |  21 ++-
> net/mptcp/subflow.c                           |   3 +
> .../testing/selftests/net/mptcp/mptcp_join.sh |  56 ++++++-
> 10 files changed, 238 insertions(+), 42 deletions(-)
>
> -- 
> 2.31.1
>
>
>

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag
  2021-05-14 14:32 [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag Geliang Tang
  2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 1/6] mptcp: add sysctl allow_join_initial_addr_port Geliang Tang
  2021-05-14 23:36 ` [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag Mat Martineau
@ 2021-05-17 14:17 ` Paolo Abeni
  2021-05-17 19:59   ` Mat Martineau
  2 siblings, 1 reply; 16+ messages in thread
From: Paolo Abeni @ 2021-05-17 14:17 UTC (permalink / raw)
  To: Geliang Tang, mptcp, Mat Martineau

Hello,

On Fri, 2021-05-14 at 22:32 +0800, Geliang Tang wrote:
> v6:
>  - rename mptcp_pm_add_entry to mptcp_pm_anno_entry
>  - update mptcp_pm_free_add_list
>  - update code in mptcp_pm_nl_add_addr_received
>  - tag: export/20210514T055902
> 
> v5:
>  - add a new patch "mptcp: add add_list in mptcp_pm_data"
>  - fix !CONFIG_SYSCTL case
>  - tag: export/20210507T174457
> 
> v4:
>  - rename mptcp_is_allow_join_id0 to mptcp_allow_join_id0
>  - rename deny_join_id0 in mptcp_pm_data to remote_deny_join_id0
>  - add comments for self test cases
>  - apply: export/20210504T064955 +
>           "data checksum support" +
>           "data checksum support cleanups"
> 
> v3:
>  - use 'u8 allow_join_initial_addr_port'
>  - drop the spinlock in patch 3
> 
> v2:
>  - rename join_denied to allow_join_id0 in mptcp_out_options
>  - rename join_denied to deny_join_id0 in mptcp_options_received
>  - add a new function mptcp_pm_deny_join_id0_received
>  - move deny_join_id0 flag from mptcp_sock to mptcp_pm_data
>  - check deny_join_id0 flag in mptcp_pm_create_subflow_or_signal_addr
> instead of in mptcp_syn_options.
> 
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/183
> 
> Geliang Tang (6):
>   mptcp: add sysctl allow_join_initial_addr_port
>   mptcp: add allow_join_id0 in mptcp_out_options
>   mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry
>   mptcp: add add_list in mptcp_pm_data
>   mptcp: add deny_join_id0 in mptcp_options_received
>   selftests: mptcp: add deny_join_id0 testcases
> 
>  Documentation/networking/mptcp-sysctl.rst     |  13 ++
>  include/net/mptcp.h                           |   3 +-
>  net/mptcp/ctrl.c                              |  16 ++
>  net/mptcp/options.c                           |  14 +-
>  net/mptcp/pm.c                                |   2 +
>  net/mptcp/pm_netlink.c                        | 151 ++++++++++++++----
>  net/mptcp/protocol.c                          |   1 +
>  net/mptcp/protocol.h                          |  21 ++-
>  net/mptcp/subflow.c                           |   3 +
>  .../testing/selftests/net/mptcp/mptcp_join.sh |  56 ++++++-
>  10 files changed, 238 insertions(+), 42 deletions(-)
> 
I'm very sorry to give feedback this late, but I was unable to review
this before.

I have some very generic/broad questions:
1) shuold 'allow_join_initial_addr_port' be by default 0 for clients?
If I read correctly, will be 1 after this patch.

2) is the additional add_addr accounting required? I mean storing the
received addrs in the msk pm add_list (patch 4/6). Can we instead
simply enforce !allow_join_initial_addr_port looking up the relevant
address into the existing subflows? (or caching first subflow dst addr
and port in some new msk fields, whatever is simpler)

if !allow_join_initial_addr_port && <exists subflow with target
ip/port> && <such subflow is mp_capable/>
	<do not create new subflow>

3) If I read correctly the series enforce the peer who could create the
new subflow to respect the 'allow_join_initial_addr_port' field, but
does not implement the corrsponding check on the other end. e.g. if a
server sets 'allow_join_initial_addr_port' to 0 and a client tries to
create additional subflows, the server will not forbit that. I
think/fear the missing complexity here is quite significant.

I think it would be nice avoid adding additional complexity for the 'C'
bit - currently even mptcp.org support it.

Perhaps we could keep 'allow_join_initial_addr_port' constant (always 0
for client, always 0 for server) - avoding the additional complexity of
point 3) above.

Again, I'm sorry for suggesting significant changes this late.

Paolo




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag
  2021-05-17 14:17 ` Paolo Abeni
@ 2021-05-17 19:59   ` Mat Martineau
  2021-05-18 10:11     ` Geliang Tang
  2021-05-18 13:56     ` Paolo Abeni
  0 siblings, 2 replies; 16+ messages in thread
From: Mat Martineau @ 2021-05-17 19:59 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Geliang Tang, mptcp

On Mon, 17 May 2021, Paolo Abeni wrote:

> Hello,
>
> On Fri, 2021-05-14 at 22:32 +0800, Geliang Tang wrote:
>> v6:
>>  - rename mptcp_pm_add_entry to mptcp_pm_anno_entry
>>  - update mptcp_pm_free_add_list
>>  - update code in mptcp_pm_nl_add_addr_received
>>  - tag: export/20210514T055902
>>
>> v5:
>>  - add a new patch "mptcp: add add_list in mptcp_pm_data"
>>  - fix !CONFIG_SYSCTL case
>>  - tag: export/20210507T174457
>>
>> v4:
>>  - rename mptcp_is_allow_join_id0 to mptcp_allow_join_id0
>>  - rename deny_join_id0 in mptcp_pm_data to remote_deny_join_id0
>>  - add comments for self test cases
>>  - apply: export/20210504T064955 +
>>           "data checksum support" +
>>           "data checksum support cleanups"
>>
>> v3:
>>  - use 'u8 allow_join_initial_addr_port'
>>  - drop the spinlock in patch 3
>>
>> v2:
>>  - rename join_denied to allow_join_id0 in mptcp_out_options
>>  - rename join_denied to deny_join_id0 in mptcp_options_received
>>  - add a new function mptcp_pm_deny_join_id0_received
>>  - move deny_join_id0 flag from mptcp_sock to mptcp_pm_data
>>  - check deny_join_id0 flag in mptcp_pm_create_subflow_or_signal_addr
>> instead of in mptcp_syn_options.
>>
>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/183
>>
>> Geliang Tang (6):
>>   mptcp: add sysctl allow_join_initial_addr_port
>>   mptcp: add allow_join_id0 in mptcp_out_options
>>   mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry
>>   mptcp: add add_list in mptcp_pm_data
>>   mptcp: add deny_join_id0 in mptcp_options_received
>>   selftests: mptcp: add deny_join_id0 testcases
>>
>>  Documentation/networking/mptcp-sysctl.rst     |  13 ++
>>  include/net/mptcp.h                           |   3 +-
>>  net/mptcp/ctrl.c                              |  16 ++
>>  net/mptcp/options.c                           |  14 +-
>>  net/mptcp/pm.c                                |   2 +
>>  net/mptcp/pm_netlink.c                        | 151 ++++++++++++++----
>>  net/mptcp/protocol.c                          |   1 +
>>  net/mptcp/protocol.h                          |  21 ++-
>>  net/mptcp/subflow.c                           |   3 +
>>  .../testing/selftests/net/mptcp/mptcp_join.sh |  56 ++++++-
>>  10 files changed, 238 insertions(+), 42 deletions(-)
>>
> I'm very sorry to give feedback this late, but I was unable to review
> this before.
>
> I have some very generic/broad questions:
> 1) shuold 'allow_join_initial_addr_port' be by default 0 for clients?
> If I read correctly, will be 1 after this patch.

It should default to 1 because that is what keeps the current behavior.

The logic is flipped from the 'C' bit: C=1 in the header means "do not 
connect to initial subflow addr/port", but allow_join_initial_addr_port=1 
means "*do* connect to the initial subflow addr/port".

> 2) is the additional add_addr accounting required? I mean storing the
> received addrs in the msk pm add_list (patch 4/6). Can we instead
> simply enforce !allow_join_initial_addr_port looking up the relevant
> address into the existing subflows? (or caching first subflow dst addr
> and port in some new msk fields, whatever is simpler)
>
> if !allow_join_initial_addr_port && <exists subflow with target
> ip/port> && <such subflow is mp_capable/>
> 	<do not create new subflow>
>

Caching the first subflow dst addr/port is enough to prevent sending a 
MP_JOIN to that address, but I don't think it does quite enough.

Consider:

1. Initial connection is made between Peer A and Peer B with C=1 (both 
directions).

2. Peer A sends ADD_ADDR. Peer B currently has limits set that don't allow 
outgoing MP_JOINs.

3. Later, Peer B updates limits to allow outgoing MP_JOINs.

At this point, Peer B needs to know what to connect to. Without the 
add_list, the only active address it knows for Peer A is one it is not 
allowed to connect to.


I've expected that we would add better tracking of ADD_ADDRs - remember 
that ADD_ADDR is an advertisement, not a request to "please connect to 
this address right now". Userspace path managers will certainly have more 
persistance of the peer's advertised addresses. Any more comprehensive 
in-kernel PM will also need the add_list. The main concern is that the 
list doesn't consume excessive memory, and the add_list is not expanded 
when pm->accept_addr is false.


> 3) If I read correctly the series enforce the peer who could create the
> new subflow to respect the 'allow_join_initial_addr_port' field, but
> does not implement the corrsponding check on the other end. e.g. if a
> server sets 'allow_join_initial_addr_port' to 0 and a client tries to
> create additional subflows, the server will not forbit that. I
> think/fear the missing complexity here is quite significant.
>
> I think it would be nice avoid adding additional complexity for the 'C'
> bit - currently even mptcp.org support it.
>

Assuming you intended to say "currently even mptcp.org does not support 
it."

You're right: the incoming join address check is missing. But I'm not sure 
the complexity is so bad, the PM can already reject joins. I think we 
could pass enough information to mptcp_pm_allow_new_subflow() to implement 
this check. Am I oversimplifying - is it harder to get at the dest 
addr/port? Or is it something else?

> Perhaps we could keep 'allow_join_initial_addr_port' constant (always 0
> for client, always 0 for server) - avoding the additional complexity of
> point 3) above.
>

Given the answer to #1 is this still what you'd propose? 
allow_join_initial_addr_port=1 is unchanged behavior from the kernel 
today, and is the simpler case.

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag
  2021-05-17 19:59   ` Mat Martineau
@ 2021-05-18 10:11     ` Geliang Tang
  2021-05-18 13:56     ` Paolo Abeni
  1 sibling, 0 replies; 16+ messages in thread
From: Geliang Tang @ 2021-05-18 10:11 UTC (permalink / raw)
  To: Mat Martineau; +Cc: Paolo Abeni, mptcp

Hi Mat, Paolo,

Mat Martineau <mathew.j.martineau@linux.intel.com> 于2021年5月18日周二 上午3:59写道:
>
> On Mon, 17 May 2021, Paolo Abeni wrote:
>
> > Hello,
> >
> > On Fri, 2021-05-14 at 22:32 +0800, Geliang Tang wrote:
> >> v6:
> >>  - rename mptcp_pm_add_entry to mptcp_pm_anno_entry
> >>  - update mptcp_pm_free_add_list
> >>  - update code in mptcp_pm_nl_add_addr_received
> >>  - tag: export/20210514T055902
> >>
> >> v5:
> >>  - add a new patch "mptcp: add add_list in mptcp_pm_data"
> >>  - fix !CONFIG_SYSCTL case
> >>  - tag: export/20210507T174457
> >>
> >> v4:
> >>  - rename mptcp_is_allow_join_id0 to mptcp_allow_join_id0
> >>  - rename deny_join_id0 in mptcp_pm_data to remote_deny_join_id0
> >>  - add comments for self test cases
> >>  - apply: export/20210504T064955 +
> >>           "data checksum support" +
> >>           "data checksum support cleanups"
> >>
> >> v3:
> >>  - use 'u8 allow_join_initial_addr_port'
> >>  - drop the spinlock in patch 3
> >>
> >> v2:
> >>  - rename join_denied to allow_join_id0 in mptcp_out_options
> >>  - rename join_denied to deny_join_id0 in mptcp_options_received
> >>  - add a new function mptcp_pm_deny_join_id0_received
> >>  - move deny_join_id0 flag from mptcp_sock to mptcp_pm_data
> >>  - check deny_join_id0 flag in mptcp_pm_create_subflow_or_signal_addr
> >> instead of in mptcp_syn_options.
> >>
> >> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/183
> >>
> >> Geliang Tang (6):
> >>   mptcp: add sysctl allow_join_initial_addr_port
> >>   mptcp: add allow_join_id0 in mptcp_out_options
> >>   mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry
> >>   mptcp: add add_list in mptcp_pm_data
> >>   mptcp: add deny_join_id0 in mptcp_options_received
> >>   selftests: mptcp: add deny_join_id0 testcases
> >>
> >>  Documentation/networking/mptcp-sysctl.rst     |  13 ++
> >>  include/net/mptcp.h                           |   3 +-
> >>  net/mptcp/ctrl.c                              |  16 ++
> >>  net/mptcp/options.c                           |  14 +-
> >>  net/mptcp/pm.c                                |   2 +
> >>  net/mptcp/pm_netlink.c                        | 151 ++++++++++++++----
> >>  net/mptcp/protocol.c                          |   1 +
> >>  net/mptcp/protocol.h                          |  21 ++-
> >>  net/mptcp/subflow.c                           |   3 +
> >>  .../testing/selftests/net/mptcp/mptcp_join.sh |  56 ++++++-
> >>  10 files changed, 238 insertions(+), 42 deletions(-)
> >>
> > I'm very sorry to give feedback this late, but I was unable to review
> > this before.
> >
> > I have some very generic/broad questions:
> > 1) shuold 'allow_join_initial_addr_port' be by default 0 for clients?
> > If I read correctly, will be 1 after this patch.
>
> It should default to 1 because that is what keeps the current behavior.
>
> The logic is flipped from the 'C' bit: C=1 in the header means "do not
> connect to initial subflow addr/port", but allow_join_initial_addr_port=1
> means "*do* connect to the initial subflow addr/port".
>
> > 2) is the additional add_addr accounting required? I mean storing the
> > received addrs in the msk pm add_list (patch 4/6). Can we instead
> > simply enforce !allow_join_initial_addr_port looking up the relevant
> > address into the existing subflows? (or caching first subflow dst addr
> > and port in some new msk fields, whatever is simpler)
> >
> > if !allow_join_initial_addr_port && <exists subflow with target
> > ip/port> && <such subflow is mp_capable/>
> >       <do not create new subflow>
> >
>
> Caching the first subflow dst addr/port is enough to prevent sending a
> MP_JOIN to that address, but I don't think it does quite enough.
>
> Consider:
>
> 1. Initial connection is made between Peer A and Peer B with C=1 (both
> directions).
>
> 2. Peer A sends ADD_ADDR. Peer B currently has limits set that don't allow
> outgoing MP_JOINs.
>
> 3. Later, Peer B updates limits to allow outgoing MP_JOINs.
>
> At this point, Peer B needs to know what to connect to. Without the
> add_list, the only active address it knows for Peer A is one it is not
> allowed to connect to.
>
>
> I've expected that we would add better tracking of ADD_ADDRs - remember
> that ADD_ADDR is an advertisement, not a request to "please connect to
> this address right now". Userspace path managers will certainly have more
> persistance of the peer's advertised addresses. Any more comprehensive
> in-kernel PM will also need the add_list. The main concern is that the
> list doesn't consume excessive memory, and the add_list is not expanded
> when pm->accept_addr is false.
>
>
> > 3) If I read correctly the series enforce the peer who could create the
> > new subflow to respect the 'allow_join_initial_addr_port' field, but
> > does not implement the corrsponding check on the other end. e.g. if a
> > server sets 'allow_join_initial_addr_port' to 0 and a client tries to
> > create additional subflows, the server will not forbit that. I
> > think/fear the missing complexity here is quite significant.
> >
> > I think it would be nice avoid adding additional complexity for the 'C'
> > bit - currently even mptcp.org support it.
> >
>
> Assuming you intended to say "currently even mptcp.org does not support
> it."
>
> You're right: the incoming join address check is missing. But I'm not sure
> the complexity is so bad, the PM can already reject joins. I think we
> could pass enough information to mptcp_pm_allow_new_subflow() to implement
> this check. Am I oversimplifying - is it harder to get at the dest
> addr/port? Or is it something else?

Thanks for your review.

I just sent out a patch named (Squash to "mptcp: add deny_join_id0 in
mptcp_options_received") to fix this.

-Geliang

>
> > Perhaps we could keep 'allow_join_initial_addr_port' constant (always 0
> > for client, always 0 for server) - avoding the additional complexity of
> > point 3) above.
> >
>
> Given the answer to #1 is this still what you'd propose?
> allow_join_initial_addr_port=1 is unchanged behavior from the kernel
> today, and is the simpler case.
>
> --
> Mat Martineau
> Intel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag
  2021-05-17 19:59   ` Mat Martineau
  2021-05-18 10:11     ` Geliang Tang
@ 2021-05-18 13:56     ` Paolo Abeni
  2021-05-18 19:07       ` Mat Martineau
  1 sibling, 1 reply; 16+ messages in thread
From: Paolo Abeni @ 2021-05-18 13:56 UTC (permalink / raw)
  To: Mat Martineau; +Cc: Geliang Tang, mptcp

On Mon, 2021-05-17 at 12:59 -0700, Mat Martineau wrote:
> On Mon, 17 May 2021, Paolo Abeni wrote:
> 
> > Hello,
> > 
> > On Fri, 2021-05-14 at 22:32 +0800, Geliang Tang wrote:
> > > v6:
> > >  - rename mptcp_pm_add_entry to mptcp_pm_anno_entry
> > >  - update mptcp_pm_free_add_list
> > >  - update code in mptcp_pm_nl_add_addr_received
> > >  - tag: export/20210514T055902
> > > 
> > > v5:
> > >  - add a new patch "mptcp: add add_list in mptcp_pm_data"
> > >  - fix !CONFIG_SYSCTL case
> > >  - tag: export/20210507T174457
> > > 
> > > v4:
> > >  - rename mptcp_is_allow_join_id0 to mptcp_allow_join_id0
> > >  - rename deny_join_id0 in mptcp_pm_data to remote_deny_join_id0
> > >  - add comments for self test cases
> > >  - apply: export/20210504T064955 +
> > >           "data checksum support" +
> > >           "data checksum support cleanups"
> > > 
> > > v3:
> > >  - use 'u8 allow_join_initial_addr_port'
> > >  - drop the spinlock in patch 3
> > > 
> > > v2:
> > >  - rename join_denied to allow_join_id0 in mptcp_out_options
> > >  - rename join_denied to deny_join_id0 in mptcp_options_received
> > >  - add a new function mptcp_pm_deny_join_id0_received
> > >  - move deny_join_id0 flag from mptcp_sock to mptcp_pm_data
> > >  - check deny_join_id0 flag in mptcp_pm_create_subflow_or_signal_addr
> > > instead of in mptcp_syn_options.
> > > 
> > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/183
> > > 
> > > Geliang Tang (6):
> > >   mptcp: add sysctl allow_join_initial_addr_port
> > >   mptcp: add allow_join_id0 in mptcp_out_options
> > >   mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry
> > >   mptcp: add add_list in mptcp_pm_data
> > >   mptcp: add deny_join_id0 in mptcp_options_received
> > >   selftests: mptcp: add deny_join_id0 testcases
> > > 
> > >  Documentation/networking/mptcp-sysctl.rst     |  13 ++
> > >  include/net/mptcp.h                           |   3 +-
> > >  net/mptcp/ctrl.c                              |  16 ++
> > >  net/mptcp/options.c                           |  14 +-
> > >  net/mptcp/pm.c                                |   2 +
> > >  net/mptcp/pm_netlink.c                        | 151 ++++++++++++++----
> > >  net/mptcp/protocol.c                          |   1 +
> > >  net/mptcp/protocol.h                          |  21 ++-
> > >  net/mptcp/subflow.c                           |   3 +
> > >  .../testing/selftests/net/mptcp/mptcp_join.sh |  56 ++++++-
> > >  10 files changed, 238 insertions(+), 42 deletions(-)
> > > 
> > I'm very sorry to give feedback this late, but I was unable to review
> > this before.
> > 
> > I have some very generic/broad questions:
> > 1) shuold 'allow_join_initial_addr_port' be by default 0 for clients?
> > If I read correctly, will be 1 after this patch.
> 
> It should default to 1 because that is what keeps the current behavior.
> 
> The logic is flipped from the 'C' bit: C=1 in the header means "do not 
> connect to initial subflow addr/port", but allow_join_initial_addr_port=1 
> means "*do* connect to the initial subflow addr/port".
> 
> > 2) is the additional add_addr accounting required? I mean storing the
> > received addrs in the msk pm add_list (patch 4/6). Can we instead
> > simply enforce !allow_join_initial_addr_port looking up the relevant
> > address into the existing subflows? (or caching first subflow dst addr
> > and port in some new msk fields, whatever is simpler)
> > 
> > if !allow_join_initial_addr_port && <exists subflow with target
> > ip/port> && <such subflow is mp_capable/>
> > 	<do not create new subflow>
> > 
> 
> Caching the first subflow dst addr/port is enough to prevent sending a 
> MP_JOIN to that address, but I don't think it does quite enough.
> 
> Consider:
> 
> 1. Initial connection is made between Peer A and Peer B with C=1 (both 
> directions).
> 
> 2. Peer A sends ADD_ADDR. Peer B currently has limits set that don't allow 
> outgoing MP_JOINs.

Uhmmm.... Looking at the RFC, ADD_ADDR <MPC addr/MPC port> is a gray
area to me. My understanding is that with 'C' bit == 0 the peer is
allowed to create additional subflow towardst the initial subflow dst
addr/port, and such subflows are not allowed with C==1. 

But it looks like that ADD_ADDR should override that request. At the
very least an ADD_ADDR for the initial addr/port when the MPC handshake
completed with C==1 is weird.

> I've expected that we would add better tracking of ADD_ADDRs - 
> remember 
> that ADD_ADDR is an advertisement, not a request to "please connect 
> to 
> this address right now". Userspace path managers will certainly have 
> more 
> persistance of the peer's advertised addresses. Any more 
> comprehensive 
> in-kernel PM will also need the add_list. The main concern is that
> the 
> list doesn't consume excessive memory, and the add_list is not 
> expanded 
> when pm->accept_addr is false.

The in-kernel path manager is want to be as simple as possible. As soon
as we start allocating memory on peer req it's very easy to be prone to
DoS. Memory accounting and limits could save us or at least mitigate
the problem, but they will add more complexity.

> > 3) If I read correctly the series enforce the peer who could create the
> > new subflow to respect the 'allow_join_initial_addr_port' field, but
> > does not implement the corrsponding check on the other end. e.g. if a
> > server sets 'allow_join_initial_addr_port' to 0 and a client tries to
> > create additional subflows, the server will not forbit that. I
> > think/fear the missing complexity here is quite significant.
> > 
> > I think it would be nice avoid adding additional complexity for the 'C'
> > bit - currently even mptcp.org support it.
> > 
> 
> Assuming you intended to say "currently even mptcp.org does not support 
> it."
> 
> You're right: the incoming join address check is missing. But I'm not sure 
> the complexity is so bad, the PM can already reject joins. I think we 
> could pass enough information to mptcp_pm_allow_new_subflow() to implement 
> this check. Am I oversimplifying - is it harder to get at the dest 
> addr/port? Or is it something else?

What abot that setting C==0 on client sockets? does that imply that the
client will accept incoming connection on the src port/addr?

For the server case, even a list-based filter would add addtional
overhead and complexity for all setup.

AFAICS the RFC mentions this option as on optimization to avoid
connection attempts failing when something outside the endpoint
themself would prevent them from being succeful.

But this "optimization" has a cost in term of memory, complexity and
additional overhead for each connection/subflow. It will work only if
the administrator will configure the C bit properly. And that could be
non trivial, when load balancing and NAT are outside the mptcp host.

Just does not seam worthy to me, I really think we should start with
the bare minumum to respect the RFC itself, avoiding as much complexity
as possible:
- setting C always to 0 on server
- when the peer send C==1, cache the MPC addr/port, and do not attempt
other connection to such endpoint.

I'm not even sure the latter step is strictly necessary. Another option
would be passing the info to the user-space path manager - if available
- and let it deal with all the complexity. If no user-space path
manager is available just be dumb and simple ;)

> > Perhaps we could keep 'allow_join_initial_addr_port' constant (always 0
> > for client, always 0 for server) - avoding the additional complexity of
> > point 3) above.
> > 
> 
> Given the answer to #1 is this still what you'd propose? 
> allow_join_initial_addr_port=1 is unchanged behavior from the kernel 
> today, and is the simpler case.
> 
> --
> Mat Martineau
> Intel
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag
  2021-05-18 13:56     ` Paolo Abeni
@ 2021-05-18 19:07       ` Mat Martineau
  2021-05-19  9:51         ` Paolo Abeni
  0 siblings, 1 reply; 16+ messages in thread
From: Mat Martineau @ 2021-05-18 19:07 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Geliang Tang, mptcp

On Tue, 18 May 2021, Paolo Abeni wrote:

> On Mon, 2021-05-17 at 12:59 -0700, Mat Martineau wrote:
>> On Mon, 17 May 2021, Paolo Abeni wrote:
>>
>>> Hello,
>>>
>>> On Fri, 2021-05-14 at 22:32 +0800, Geliang Tang wrote:
>>>> v6:
>>>>  - rename mptcp_pm_add_entry to mptcp_pm_anno_entry
>>>>  - update mptcp_pm_free_add_list
>>>>  - update code in mptcp_pm_nl_add_addr_received
>>>>  - tag: export/20210514T055902
>>>>
>>>> v5:
>>>>  - add a new patch "mptcp: add add_list in mptcp_pm_data"
>>>>  - fix !CONFIG_SYSCTL case
>>>>  - tag: export/20210507T174457
>>>>
>>>> v4:
>>>>  - rename mptcp_is_allow_join_id0 to mptcp_allow_join_id0
>>>>  - rename deny_join_id0 in mptcp_pm_data to remote_deny_join_id0
>>>>  - add comments for self test cases
>>>>  - apply: export/20210504T064955 +
>>>>           "data checksum support" +
>>>>           "data checksum support cleanups"
>>>>
>>>> v3:
>>>>  - use 'u8 allow_join_initial_addr_port'
>>>>  - drop the spinlock in patch 3
>>>>
>>>> v2:
>>>>  - rename join_denied to allow_join_id0 in mptcp_out_options
>>>>  - rename join_denied to deny_join_id0 in mptcp_options_received
>>>>  - add a new function mptcp_pm_deny_join_id0_received
>>>>  - move deny_join_id0 flag from mptcp_sock to mptcp_pm_data
>>>>  - check deny_join_id0 flag in mptcp_pm_create_subflow_or_signal_addr
>>>> instead of in mptcp_syn_options.
>>>>
>>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/183
>>>>
>>>> Geliang Tang (6):
>>>>   mptcp: add sysctl allow_join_initial_addr_port
>>>>   mptcp: add allow_join_id0 in mptcp_out_options
>>>>   mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry
>>>>   mptcp: add add_list in mptcp_pm_data
>>>>   mptcp: add deny_join_id0 in mptcp_options_received
>>>>   selftests: mptcp: add deny_join_id0 testcases
>>>>
>>>>  Documentation/networking/mptcp-sysctl.rst     |  13 ++
>>>>  include/net/mptcp.h                           |   3 +-
>>>>  net/mptcp/ctrl.c                              |  16 ++
>>>>  net/mptcp/options.c                           |  14 +-
>>>>  net/mptcp/pm.c                                |   2 +
>>>>  net/mptcp/pm_netlink.c                        | 151 ++++++++++++++----
>>>>  net/mptcp/protocol.c                          |   1 +
>>>>  net/mptcp/protocol.h                          |  21 ++-
>>>>  net/mptcp/subflow.c                           |   3 +
>>>>  .../testing/selftests/net/mptcp/mptcp_join.sh |  56 ++++++-
>>>>  10 files changed, 238 insertions(+), 42 deletions(-)
>>>>
>>> I'm very sorry to give feedback this late, but I was unable to review
>>> this before.
>>>
>>> I have some very generic/broad questions:
>>> 1) shuold 'allow_join_initial_addr_port' be by default 0 for clients?
>>> If I read correctly, will be 1 after this patch.
>>
>> It should default to 1 because that is what keeps the current behavior.
>>
>> The logic is flipped from the 'C' bit: C=1 in the header means "do not
>> connect to initial subflow addr/port", but allow_join_initial_addr_port=1
>> means "*do* connect to the initial subflow addr/port".
>>
>>> 2) is the additional add_addr accounting required? I mean storing the
>>> received addrs in the msk pm add_list (patch 4/6). Can we instead
>>> simply enforce !allow_join_initial_addr_port looking up the relevant
>>> address into the existing subflows? (or caching first subflow dst addr
>>> and port in some new msk fields, whatever is simpler)
>>>
>>> if !allow_join_initial_addr_port && <exists subflow with target
>>> ip/port> && <such subflow is mp_capable/>
>>> 	<do not create new subflow>
>>>
>>
>> Caching the first subflow dst addr/port is enough to prevent sending a
>> MP_JOIN to that address, but I don't think it does quite enough.
>>
>> Consider:
>>
>> 1. Initial connection is made between Peer A and Peer B with C=1 (both
>> directions).
>>
>> 2. Peer A sends ADD_ADDR. Peer B currently has limits set that don't allow
>> outgoing MP_JOINs.
>
> Uhmmm.... Looking at the RFC, ADD_ADDR <MPC addr/MPC port> is a gray
> area to me. My understanding is that with 'C' bit == 0 the peer is
> allowed to create additional subflow towardst the initial subflow dst
> addr/port, and such subflows are not allowed with C==1.

That's correct. My example was unclear: my intent was to say Peer A sent 
an ADD_ADDR for a new/different address that it does want Peer B to 
connect to.

>
> But it looks like that ADD_ADDR should override that request. At the
> very least an ADD_ADDR for the initial addr/port when the MPC handshake
> completed with C==1 is weird.

This is a corner case that I hadn't considered and I agree that it's 
weird. Given the unconditional "MUST NOT" wording of the C bit 
description, I would not expect the ADD_ADDR to override. If a peer 
sends a weird ADD_ADDR they are asking for weirdness I guess, but it's ok 
to ignore them :)

> 
>> I've expected that we would add better tracking of ADD_ADDRs -
>> remember
>> that ADD_ADDR is an advertisement, not a request to "please connect
>> to
>> this address right now". Userspace path managers will certainly have
>> more
>> persistance of the peer's advertised addresses. Any more
>> comprehensive
>> in-kernel PM will also need the add_list. The main concern is that
>> the
>> list doesn't consume excessive memory, and the add_list is not
>> expanded
>> when pm->accept_addr is false.
>
> The in-kernel path manager is want to be as simple as possible. As soon
> as we start allocating memory on peer req it's very easy to be prone to
> DoS. Memory accounting and limits could save us or at least mitigate
> the problem, but they will add more complexity.

My assumption was that high-connection-count servers would probably have 
their limits set in a way that ADD_ADDRs would not be accepted anyway. 
This is the use case where clients may be behind a NAT and the server is 
accepting MP_JOINs from the client (but the server is never sending 
MP_JOINs), and the server is possibly sending ADD_ADDRs to the client. The 
configurable limit is already there from the existing in-kernel PM code.

I do agree with "as simple as possible", but the RFC requirements are 
constraining that. Even though the C bit is ignored in Linux 
implementations so far, I don't think it should stay that way. The current 
code always tries to connect to the initial subflow, so we need to do 
*something* different to remember an alternate peer address. There are 
other options (like remembering only the most recent ADD_ADDR), but given 
the configurable limits we have it doesn't seem that different 
memory-wise. What's important is implementing this RFC functionality well, 
so it's really helpful to explore the alternatives here - and your 
understanding of the resource issues and DoS tradeoffs is very useful.


>
>>> 3) If I read correctly the series enforce the peer who could create the
>>> new subflow to respect the 'allow_join_initial_addr_port' field, but
>>> does not implement the corrsponding check on the other end. e.g. if a
>>> server sets 'allow_join_initial_addr_port' to 0 and a client tries to
>>> create additional subflows, the server will not forbit that. I
>>> think/fear the missing complexity here is quite significant.
>>>
>>> I think it would be nice avoid adding additional complexity for the 'C'
>>> bit - currently even mptcp.org support it.
>>>
>>
>> Assuming you intended to say "currently even mptcp.org does not support
>> it."
>>
>> You're right: the incoming join address check is missing. But I'm not sure
>> the complexity is so bad, the PM can already reject joins. I think we
>> could pass enough information to mptcp_pm_allow_new_subflow() to implement
>> this check. Am I oversimplifying - is it harder to get at the dest
>> addr/port? Or is it something else?
>
> What abot that setting C==0 on client sockets? does that imply that the
> client will accept incoming connection on the src port/addr?

C==0 says it's valid to send MP_JOINs to the client on the src port/addr. 
It's up to the client's PM to accept or reject them.

>
> For the server case, even a list-based filter would add addtional
> overhead and complexity for all setup.
>

It's not free. The RFC wording is a little fuzzy here, since it says the 
sender of C==1 "will not accept" incoming connections on that subflow. 
There's no "MUST/MUST NOT", so maybe it's possible to get by without. It 
would be good to have some input from RFC authors on this :)


> AFAICS the RFC mentions this option as on optimization to avoid
> connection attempts failing when something outside the endpoint
> themself would prevent them from being succeful.
>
> But this "optimization" has a cost in term of memory, complexity and
> additional overhead for each connection/subflow. It will work only if
> the administrator will configure the C bit properly. And that could be
> non trivial, when load balancing and NAT are outside the mptcp host.
>
> Just does not seam worthy to me, I really think we should start with
> the bare minumum to respect the RFC itself, avoiding as much complexity
> as possible:
> - setting C always to 0 on server
> - when the peer send C==1, cache the MPC addr/port, and do not attempt
> other connection to such endpoint.
>
> I'm not even sure the latter step is strictly necessary.

For the former, it is certainly simpler to always send C==0, but sending 
C==1 from a high-connection-count server behind a load balancer is the 
primary use case.

I think the "MUST NOT" wording of the RFC does require the latter step 
(server or client), and that implies a minimally functional in-kernel PM 
would need to track some other address to connect to.


> Another option
> would be passing the info to the user-space path manager - if available
> - and let it deal with all the complexity. If no user-space path
> manager is available just be dumb and simple ;)

The C bit does need to propagate to the userspace path manager. Since the 
userspace PM is mostly a client-side thing, it is situated well to handle 
the complexity of tracking incoming ADD_ADDRs and sending/not-sending 
MP_JOINs that comply with the C bit from the server.

However, an important complexity question in this email thread is about 
the server side (without userspace PM) that may or may not need to reject 
incoming MP_JOINs that it said it won't accept (as I mentioned above, the 
RFC is fuzzy here).



Is any of that more convincing about what the minimal kernel PM needs? Do 
you think some RFC author input on the rejection of MP_JOINs by the C==1 
sender would be useful?


Thanks for the questions!


>
>>> Perhaps we could keep 'allow_join_initial_addr_port' constant (always 0
>>> for client, always 0 for server) - avoding the additional complexity of
>>> point 3) above.
>>>
>>
>> Given the answer to #1 is this still what you'd propose?
>> allow_join_initial_addr_port=1 is unchanged behavior from the kernel
>> today, and is the simpler case.

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag
  2021-05-18 19:07       ` Mat Martineau
@ 2021-05-19  9:51         ` Paolo Abeni
  2021-05-19 18:06           ` Mat Martineau
  0 siblings, 1 reply; 16+ messages in thread
From: Paolo Abeni @ 2021-05-19  9:51 UTC (permalink / raw)
  To: Mat Martineau, Florian Westphal; +Cc: Geliang Tang, mptcp

On Tue, 2021-05-18 at 12:07 -0700, Mat Martineau wrote:
> On Tue, 18 May 2021, Paolo Abeni wrote:
> > The in-kernel path manager is want to be as simple as possible. As soon
> > as we start allocating memory on peer req it's very easy to be prone to
> > DoS. Memory accounting and limits could save us or at least mitigate
> > the problem, but they will add more complexity.
> 
> My assumption was that high-connection-count servers would probably have 
> their limits set in a way that ADD_ADDRs would not be accepted anyway. 
> This is the use case where clients may be behind a NAT and the server is 
> accepting MP_JOINs from the client (but the server is never sending 
> MP_JOINs), and the server is possibly sending ADD_ADDRs to the client. The 
> configurable limit is already there from the existing in-kernel PM code.
> 
> I do agree with "as simple as possible", but the RFC requirements are 
> constraining that. Even though the C bit is ignored in Linux 
> implementations so far, I don't think it should stay that way. The current 
> code always tries to connect to the initial subflow, so we need to do 
> *something* different to remember an alternate peer address. There are 
> other options (like remembering only the most recent ADD_ADDR), but given 
> the configurable limits we have it doesn't seem that different 
> memory-wise. What's important is implementing this RFC functionality well, 
> so it's really helpful to explore the alternatives here - and your 
> understanding of the resource issues and DoS tradeoffs is very useful.

Uhmmm... I start feeling I misunderstand the RFC WRT to 'C' bit
handling.

"""
		 The third bit, labeled "C", is set to 1 to indicate
                 that the sender of this option will not accept
                 additional MPTCP subflows to the source address and
                 port, and therefore the receiver MUST NOT try to open
                 any additional subflows toward this address and port.
"""

To avoid creating additional subflows towards the MPC server
address/port, we don't need to store any additional information beyond
the 'C' flag: net-next uses that pair only for new subflow with
'subflow' type (yep, the type name is not very self-explaining :(

We just need to forbit them:

---
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index d094588afad8..2ae9fac00623 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -451,7 +451,7 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
 
        /* check if should create a new subflow */
        if (msk->pm.local_addr_used < local_addr_max &&
-           msk->pm.subflows < subflows_max) {
+           msk->pm.subflows < subflows_max && !msk->pm.remote_deny_join_id0) {
                local = select_local_address(pernet, msk);
                if (local) {
                        struct mptcp_addr_info remote = { 0 };

---

> > Just does not seam worthy to me, I really think we should start with
> > the bare minumum to respect the RFC itself, avoiding as much complexity
> > as possible:
> > - setting C always to 0 on server
> > - when the peer send C==1, cache the MPC addr/port, and do not attempt
> > other connection to such endpoint.
> > 
> > I'm not even sure the latter step is strictly necessary.
> 
> For the former, it is certainly simpler to always send C==0, but sending 
> C==1 from a high-connection-count server behind a load balancer is the 
> primary use case.

I *think* it would be far better (and possibly not that complex!!!
@Florian: WDYT? ) implementing an MPTCP aware load balancer. The MPJ
syn pkt carries the remote token, so it's possible for the load
balancer to make the "correct" decision.

> I think the "MUST NOT" wording of the RFC does require the latter step 
> (server or client), and that implies a minimally functional in-kernel PM 
> would need to track some other address to connect to.

This part is unclear to me. Why should the kernel cache more
addresses? 

It's already capabale of using as many ADD_ADDRs as the peer provides
with the current options handling. Sure, there are some constraints.
The stricter one posed by the current implementation is that the kernel
fully processes (up to subflow creation) a given ADD_ADDR option before
using another one.

That is useful to avoid transforming an MPTCP client in a DDOS tool
controlled by an (evil) MPTCP server, and sounds like a reasonable
tread-off between complexity and flexibility.

Looks like we have some topic for tomorrow's mtg :)

Thanks!

Paolo


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag
  2021-05-19  9:51         ` Paolo Abeni
@ 2021-05-19 18:06           ` Mat Martineau
  0 siblings, 0 replies; 16+ messages in thread
From: Mat Martineau @ 2021-05-19 18:06 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Florian Westphal, Geliang Tang, mptcp


On Wed, 19 May 2021, Paolo Abeni wrote:

> On Tue, 2021-05-18 at 12:07 -0700, Mat Martineau wrote:
>> On Tue, 18 May 2021, Paolo Abeni wrote:
>>> The in-kernel path manager is want to be as simple as possible. As soon
>>> as we start allocating memory on peer req it's very easy to be prone to
>>> DoS. Memory accounting and limits could save us or at least mitigate
>>> the problem, but they will add more complexity.
>>
>> My assumption was that high-connection-count servers would probably have
>> their limits set in a way that ADD_ADDRs would not be accepted anyway.
>> This is the use case where clients may be behind a NAT and the server is
>> accepting MP_JOINs from the client (but the server is never sending
>> MP_JOINs), and the server is possibly sending ADD_ADDRs to the client. The
>> configurable limit is already there from the existing in-kernel PM code.
>>
>> I do agree with "as simple as possible", but the RFC requirements are
>> constraining that. Even though the C bit is ignored in Linux
>> implementations so far, I don't think it should stay that way. The current
>> code always tries to connect to the initial subflow, so we need to do
>> *something* different to remember an alternate peer address. There are
>> other options (like remembering only the most recent ADD_ADDR), but given
>> the configurable limits we have it doesn't seem that different
>> memory-wise. What's important is implementing this RFC functionality well,
>> so it's really helpful to explore the alternatives here - and your
>> understanding of the resource issues and DoS tradeoffs is very useful.
>
> Uhmmm... I start feeling I misunderstand the RFC WRT to 'C' bit
> handling.
>
> """
> 		 The third bit, labeled "C", is set to 1 to indicate
>                 that the sender of this option will not accept
>                 additional MPTCP subflows to the source address and
>                 port, and therefore the receiver MUST NOT try to open
>                 any additional subflows toward this address and port.
> """
>
> To avoid creating additional subflows towards the MPC server
> address/port, we don't need to store any additional information beyond
> the 'C' flag: net-next uses that pair only for new subflow with
> 'subflow' type (yep, the type name is not very self-explaining :(
>
> We just need to forbit them:
>
> ---
> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
> index d094588afad8..2ae9fac00623 100644
> --- a/net/mptcp/pm_netlink.c
> +++ b/net/mptcp/pm_netlink.c
> @@ -451,7 +451,7 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
>
>        /* check if should create a new subflow */
>        if (msk->pm.local_addr_used < local_addr_max &&
> -           msk->pm.subflows < subflows_max) {
> +           msk->pm.subflows < subflows_max && !msk->pm.remote_deny_join_id0) {
>                local = select_local_address(pernet, msk);
>                if (local) {
>                        struct mptcp_addr_info remote = { 0 };
>
> ---
>
>>> Just does not seam worthy to me, I really think we should start with
>>> the bare minumum to respect the RFC itself, avoiding as much complexity
>>> as possible:
>>> - setting C always to 0 on server
>>> - when the peer send C==1, cache the MPC addr/port, and do not attempt
>>> other connection to such endpoint.
>>>
>>> I'm not even sure the latter step is strictly necessary.
>>
>> For the former, it is certainly simpler to always send C==0, but sending
>> C==1 from a high-connection-count server behind a load balancer is the
>> primary use case.
>
> I *think* it would be far better (and possibly not that complex!!!
> @Florian: WDYT? ) implementing an MPTCP aware load balancer. The MPJ
> syn pkt carries the remote token, so it's possible for the load
> balancer to make the "correct" decision.
>
>> I think the "MUST NOT" wording of the RFC does require the latter step
>> (server or client), and that implies a minimally functional in-kernel PM
>> would need to track some other address to connect to.
>
> This part is unclear to me. Why should the kernel cache more
> addresses?
>
> It's already capabale of using as many ADD_ADDRs as the peer provides
> with the current options handling. Sure, there are some constraints.
> The stricter one posed by the current implementation is that the kernel
> fully processes (up to subflow creation) a given ADD_ADDR option before
> using another one.
>
> That is useful to avoid transforming an MPTCP client in a DDOS tool
> controlled by an (evil) MPTCP server, and sounds like a reasonable
> tread-off between complexity and flexibility.
>
> Looks like we have some topic for tomorrow's mtg :)
>

Yeah, I think it will be far easier to discuss in that format!

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data
  2021-05-14 14:32       ` [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data Geliang Tang
  2021-05-14 14:32         ` [MPTCP][PATCH v6 mptcp-next 5/6] mptcp: add deny_join_id0 in mptcp_options_received Geliang Tang
@ 2021-05-20 22:50         ` Mat Martineau
  1 sibling, 0 replies; 16+ messages in thread
From: Mat Martineau @ 2021-05-20 22:50 UTC (permalink / raw)
  To: Geliang Tang; +Cc: mptcp, Paolo Abeni

On Fri, 14 May 2021, Geliang Tang wrote:

> Like the anno_list member in struct mptcp_pm_data, this patch added a
> new member named add_list in it, to save all the received ADD_ADDRs in
> this add_list.

Hi Geliang -

How to handle MP_JOIN with 'C==1' connections was a topic in today's 
community meeting, and Paolo made a strong case that reception of 
ADD_ADDRs from a peer should not be tracked in a linked list (to avoid 
resource issues in servers with lots of connections).

My concern is that preventing any call to __mptcp_subflow_connect() from 
mptcp_pm_create_subflow_or_signal_addr() when the peer sent C==1, would be 
a confusing/unexpected silent failure to establish subflows.


For v7, could you:

1. Drop patches 3/6 & 4/6

2. Track the most recent received ADD_ADDR information in new struct 
members of mptcp_pm_data (also, be sure to invalidate if a matching 
REMOVE_ADDR is received).

3. If C==1 was received, used the stored ADD_ADDR address for the remote 
address in mptcp_pm_create_subflow_or_signal_addr(). If C==1 and there is 
no stored address, skip subflow creation. If C==0 was received, keep 
existing behavior.


I've captured what I recall from the discussion - if anyone has 
corrections, clarifications, or a better proposal, please follow up!


-Mat


>
> Signed-off-by: Geliang Tang <geliangtang@gmail.com>
> ---
> net/mptcp/pm.c         |  1 +
> net/mptcp/pm_netlink.c | 89 +++++++++++++++++++++++++++++++++++++++---
> net/mptcp/protocol.c   |  1 +
> net/mptcp/protocol.h   |  2 +
> 4 files changed, 87 insertions(+), 6 deletions(-)
>
> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> index 9d00fa6d22e9..9456fe17b6a3 100644
> --- a/net/mptcp/pm.c
> +++ b/net/mptcp/pm.c
> @@ -324,6 +324,7 @@ void mptcp_pm_data_init(struct mptcp_sock *msk)
>
> 	spin_lock_init(&msk->pm.lock);
> 	INIT_LIST_HEAD(&msk->pm.anno_list);
> +	INIT_LIST_HEAD(&msk->pm.add_list);
>
> 	mptcp_pm_nl_data_init(msk);
> }
> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
> index 4fff8aef45e4..3fcc167ea702 100644
> --- a/net/mptcp/pm_netlink.c
> +++ b/net/mptcp/pm_netlink.c
> @@ -39,6 +39,11 @@ struct mptcp_pm_anno_entry {
> 	u8			retrans_times;
> };
>
> +struct mptcp_pm_add_entry {
> +	struct list_head	list;
> +	struct mptcp_addr_info	addr;
> +};
> +
> #define MAX_ADDR_ID		255
> #define BITMAP_SZ DIV_ROUND_UP(MAX_ADDR_ID + 1, BITS_PER_LONG)
>
> @@ -483,6 +488,69 @@ static void mptcp_pm_nl_subflow_established(struct mptcp_sock *msk)
> 	mptcp_pm_create_subflow_or_signal_addr(msk);
> }
>
> +struct mptcp_pm_add_entry *
> +mptcp_lookup_add_list_by_id(struct mptcp_sock *msk, u8 id)
> +{
> +	struct mptcp_pm_add_entry *entry;
> +
> +	lockdep_assert_held(&msk->pm.lock);
> +
> +	list_for_each_entry(entry, &msk->pm.add_list, list) {
> +		if (entry->addr.id == id)
> +			return entry;
> +	}
> +
> +	return NULL;
> +}
> +
> +struct mptcp_pm_add_entry *
> +mptcp_lookup_add_list_by_saddr(struct mptcp_sock *msk,
> +			       struct mptcp_addr_info *addr)
> +{
> +	struct mptcp_pm_add_entry *entry;
> +
> +	lockdep_assert_held(&msk->pm.lock);
> +
> +	list_for_each_entry(entry, &msk->pm.add_list, list) {
> +		if (addresses_equal(&entry->addr, addr, true))
> +			return entry;
> +	}
> +
> +	return NULL;
> +}
> +
> +static bool mptcp_pm_alloc_add_list(struct mptcp_sock *msk,
> +				    struct mptcp_addr_info *addr)
> +{
> +	struct mptcp_pm_add_entry *add_entry = NULL;
> +
> +	lockdep_assert_held(&msk->pm.lock);
> +
> +	if (mptcp_lookup_add_list_by_saddr(msk, addr))
> +		return false;
> +
> +	add_entry = kmalloc(sizeof(*add_entry), GFP_ATOMIC);
> +	if (!add_entry)
> +		return false;
> +
> +	list_add(&add_entry->list, &msk->pm.add_list);
> +	add_entry->addr = *addr;
> +
> +	return true;
> +}
> +
> +void mptcp_pm_free_add_list(struct mptcp_sock *msk)
> +{
> +	struct mptcp_pm_add_entry *entry, *tmp;
> +
> +	pr_debug("msk=%p", msk);
> +
> +	spin_lock_bh(&msk->pm.lock);
> +	list_for_each_entry_safe(entry, tmp, &msk->pm.add_list, list)
> +		kfree(entry);
> +	spin_unlock_bh(&msk->pm.lock);
> +}
> +
> static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
> {
> 	struct sock *sk = (struct sock *)msk;
> @@ -501,12 +569,6 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
> 	if (lookup_subflow_by_daddr(&msk->conn_list, &msk->pm.remote))
> 		goto add_addr_echo;
>
> -	msk->pm.add_addr_accepted++;
> -	msk->pm.subflows++;
> -	if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
> -	    msk->pm.subflows >= subflows_max)
> -		WRITE_ONCE(msk->pm.accept_addr, false);
> -
> 	/* connect to the specified remote address, using whatever
> 	 * local address the routing configuration will pick.
> 	 */
> @@ -516,6 +578,15 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
> 	memset(&local, 0, sizeof(local));
> 	local.family = remote.family;
>
> +	if (!mptcp_pm_alloc_add_list(msk, &remote))
> +		return;
> +
> +	msk->pm.add_addr_accepted++;
> +	msk->pm.subflows++;
> +	if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
> +	    msk->pm.subflows >= subflows_max)
> +		WRITE_ONCE(msk->pm.accept_addr, false);
> +
> 	spin_unlock_bh(&msk->pm.lock);
> 	__mptcp_subflow_connect(sk, &local, &remote, 0, 0);
> 	spin_lock_bh(&msk->pm.lock);
> @@ -612,6 +683,7 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
> 		list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) {
> 			struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> 			int how = RCV_SHUTDOWN | SEND_SHUTDOWN;
> +			struct mptcp_pm_add_entry *entry;
> 			u8 id = subflow->local_id;
>
> 			if (rm_type == MPTCP_MIB_RMADDR)
> @@ -631,6 +703,11 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
> 			if (rm_type == MPTCP_MIB_RMADDR) {
> 				msk->pm.add_addr_accepted--;
> 				WRITE_ONCE(msk->pm.accept_addr, true);
> +				entry = mptcp_lookup_add_list_by_id(msk, id);
> +				if (entry) {
> +					list_del(&entry->list);
> +					kfree(entry);
> +				}
> 			} else if (rm_type == MPTCP_MIB_RMSUBFLOW) {
> 				msk->pm.local_addr_used--;
> 			}
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 35c0b1ca95c3..05ceba3972f6 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -2888,6 +2888,7 @@ void mptcp_destroy_common(struct mptcp_sock *msk)
> 	skb_rbtree_purge(&msk->out_of_order_queue);
> 	mptcp_token_destroy(msk);
> 	mptcp_pm_free_anno_list(msk);
> +	mptcp_pm_free_add_list(msk);
> }
>
> static void mptcp_destroy(struct sock *sk)
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 5ccc0d3e5693..1df8da3da695 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -185,6 +185,7 @@ struct mptcp_pm_data {
> 	struct mptcp_addr_info local;
> 	struct mptcp_addr_info remote;
> 	struct list_head anno_list;
> +	struct list_head add_list;
>
> 	spinlock_t	lock;		/*protects the whole PM data */
>
> @@ -693,6 +694,7 @@ int mptcp_pm_nl_mp_prio_send_ack(struct mptcp_sock *msk,
> 				 struct mptcp_addr_info *addr,
> 				 u8 bkup);
> void mptcp_pm_free_anno_list(struct mptcp_sock *msk);
> +void mptcp_pm_free_add_list(struct mptcp_sock *msk);
> bool mptcp_pm_sport_in_anno_list(struct mptcp_sock *msk, const struct sock *sk);
> struct mptcp_pm_anno_entry *
> mptcp_pm_del_anno_timer(struct mptcp_sock *msk,
> -- 
> 2.31.1
>
>
>

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-05-20 22:50 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-14 14:32 [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 1/6] mptcp: add sysctl allow_join_initial_addr_port Geliang Tang
2021-05-14 14:32   ` [MPTCP][PATCH v6 mptcp-next 2/6] mptcp: add allow_join_id0 in mptcp_out_options Geliang Tang
2021-05-14 14:32     ` [MPTCP][PATCH v6 mptcp-next 3/6] mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry Geliang Tang
2021-05-14 14:32       ` [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data Geliang Tang
2021-05-14 14:32         ` [MPTCP][PATCH v6 mptcp-next 5/6] mptcp: add deny_join_id0 in mptcp_options_received Geliang Tang
2021-05-14 14:32           ` [MPTCP][PATCH v6 mptcp-next 6/6] selftests: mptcp: add deny_join_id0 testcases Geliang Tang
2021-05-20 22:50         ` [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data Mat Martineau
2021-05-14 23:36 ` [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag Mat Martineau
2021-05-17 14:17 ` Paolo Abeni
2021-05-17 19:59   ` Mat Martineau
2021-05-18 10:11     ` Geliang Tang
2021-05-18 13:56     ` Paolo Abeni
2021-05-18 19:07       ` Mat Martineau
2021-05-19  9:51         ` Paolo Abeni
2021-05-19 18:06           ` Mat Martineau

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.