All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH mptcp-next v3 0/8] mptcp: fixes and enhancements related to path management
@ 2022-01-28  0:38 Kishen Maloor
  2022-01-28  0:38 ` [PATCH mptcp-next v3 1/8] mptcp: do not restrict subflows with non-kernel PMs Kishen Maloor
                   ` (7 more replies)
  0 siblings, 8 replies; 17+ messages in thread
From: Kishen Maloor @ 2022-01-28  0:38 UTC (permalink / raw)
  To: kishen.maloor, mptcp

This patch series contains fixes and enhancements related to
path management over MPTCP connections, particularly in support of
out-of-kernel PMs. The changes ensure that the required bits of
information are conveyed through MPTCP netlink events which 
would be consumed by a path manager in making decisions, more
flexibility in establishing paths from either end of an MPTCP
connection, and better handling of listening sockets which serve
in MPJ handshakes.

v1 -> v2:
-fixed formatting
-check_fully_established: check for 3rd ACK retransmission only on passive
side of the MPJ handshake

v2 -> v3:
-subflow_simultaneous_connect: check for active subflow socket
-new helper lsk_list_find_or_create()
-updated mptcp_pm_nl_create_listen_socket() to take struct net* as param
-new addr flag MPTCP_PM_ADDR_FLAG_NO_LISTEN to skip creating a
listening socket in the kernel during an ADD_ADDR request
-reflect the pm.server_side attribute in the MPTCP_EVENT_CREATED
and MPTCP_EVENT_ESTABLISHED events 

Kishen Maloor (8):
  mptcp: do not restrict subflows with non-kernel PMs
  mptcp: store remote id from MP_JOIN SYN/ACK in local ctx
  mptcp: reflect remote port (not 0) in ANNOUNCED events
  mptcp: establish subflows from either end of connection
  mptcp: netlink: store per namespace list of refcounted listen socks
  mptcp: netlink: store lsk ref in mptcp_pm_addr_entry
  mptcp: attempt to add listening sockets for announced addrs
  mptcp: expose server_side attribute in MPTCP netlink events

 include/uapi/linux/mptcp.h |   2 +
 net/mptcp/options.c        |   4 +-
 net/mptcp/pm.c             |   8 +-
 net/mptcp/pm_netlink.c     | 210 ++++++++++++++++++++++++++++++++-----
 net/mptcp/protocol.c       |   5 +-
 net/mptcp/protocol.h       |  14 ++-
 net/mptcp/subflow.c        |   4 +-
 7 files changed, 209 insertions(+), 38 deletions(-)


base-commit: 05854a699d27b11d8fb3217ec2e0dbf28ecb58e8
-- 
2.31.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH mptcp-next v3 1/8] mptcp: do not restrict subflows with non-kernel PMs
  2022-01-28  0:38 [PATCH mptcp-next v3 0/8] mptcp: fixes and enhancements related to path management Kishen Maloor
@ 2022-01-28  0:38 ` Kishen Maloor
  2022-01-28  0:38 ` [PATCH mptcp-next v3 2/8] mptcp: store remote id from MP_JOIN SYN/ACK in local ctx Kishen Maloor
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Kishen Maloor @ 2022-01-28  0:38 UTC (permalink / raw)
  To: kishen.maloor, mptcp

Current limits on the # of subflows must apply only to in-kernel
PM managed sockets. Thus this change bypasses such limitations for
connections overseen by non-kernel (e.g. userspace) PMs.

Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
---
 net/mptcp/pm.c      | 3 +++
 net/mptcp/subflow.c | 3 ++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 1f8878cc29e3..6b6220895929 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -87,6 +87,9 @@ bool mptcp_pm_allow_new_subflow(struct mptcp_sock *msk)
 	unsigned int subflows_max;
 	int ret = 0;
 
+	if (READ_ONCE(pm->pm_type) != MPTCP_PM_TYPE_KERNEL)
+		return true;
+
 	subflows_max = mptcp_pm_get_subflows_max(msk);
 
 	pr_debug("msk=%p subflows=%d max=%d allow=%d", msk, pm->subflows,
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index c8126986793e..a7b2d6fd1c1e 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -62,7 +62,8 @@ static void subflow_generate_hmac(u64 key1, u64 key2, u32 nonce1, u32 nonce2,
 static bool mptcp_can_accept_new_subflow(const struct mptcp_sock *msk)
 {
 	return mptcp_is_fully_established((void *)msk) &&
-	       READ_ONCE(msk->pm.accept_subflow);
+	       ((READ_ONCE(msk->pm.pm_type) != MPTCP_PM_TYPE_KERNEL) ||
+		READ_ONCE(msk->pm.accept_subflow));
 }
 
 /* validate received token and create truncated hmac and nonce for SYN-ACK */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH mptcp-next v3 2/8] mptcp: store remote id from MP_JOIN SYN/ACK in local ctx
  2022-01-28  0:38 [PATCH mptcp-next v3 0/8] mptcp: fixes and enhancements related to path management Kishen Maloor
  2022-01-28  0:38 ` [PATCH mptcp-next v3 1/8] mptcp: do not restrict subflows with non-kernel PMs Kishen Maloor
@ 2022-01-28  0:38 ` Kishen Maloor
  2022-01-28  0:38 ` [PATCH mptcp-next v3 3/8] mptcp: reflect remote port (not 0) in ANNOUNCED events Kishen Maloor
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Kishen Maloor @ 2022-01-28  0:38 UTC (permalink / raw)
  To: kishen.maloor, mptcp

This change reads the addr id assigned to the remote endpoint
of a subflow from the MP_JOIN SYN/ACK message and stores it
in the related subflow context. The remote id was not being
captured prior to this change, and will now provide a consistent
view of remote endpoints and their ids as seen through netlink
events.

Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
---
 net/mptcp/subflow.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index a7b2d6fd1c1e..479a4f53bbdd 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -444,6 +444,7 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb)
 		subflow->backup = mp_opt.backup;
 		subflow->thmac = mp_opt.thmac;
 		subflow->remote_nonce = mp_opt.nonce;
+		subflow->remote_id = mp_opt.join_id;
 		pr_debug("subflow=%p, thmac=%llu, remote_nonce=%u backup=%d",
 			 subflow, subflow->thmac, subflow->remote_nonce,
 			 subflow->backup);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH mptcp-next v3 3/8] mptcp: reflect remote port (not 0) in ANNOUNCED events
  2022-01-28  0:38 [PATCH mptcp-next v3 0/8] mptcp: fixes and enhancements related to path management Kishen Maloor
  2022-01-28  0:38 ` [PATCH mptcp-next v3 1/8] mptcp: do not restrict subflows with non-kernel PMs Kishen Maloor
  2022-01-28  0:38 ` [PATCH mptcp-next v3 2/8] mptcp: store remote id from MP_JOIN SYN/ACK in local ctx Kishen Maloor
@ 2022-01-28  0:38 ` Kishen Maloor
  2022-01-28  4:07   ` Geliang Tang
  2022-01-28  0:38 ` [PATCH mptcp-next v3 4/8] mptcp: establish subflows from either end of connection Kishen Maloor
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Kishen Maloor @ 2022-01-28  0:38 UTC (permalink / raw)
  To: kishen.maloor, mptcp

Per RFC 8684, if no port is specified in an ADD_ADDR message, MPTCP
SHOULD attempt to connect to the specified address on the same port
as the port that is already in use by the subflow on which the
ADD_ADDR signal was sent.

To facilitate that, this change reflects the specific remote port in
use by that subflow in MPTCP_EVENT_ANNOUNCED events.

Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
---
 net/mptcp/options.c    | 2 +-
 net/mptcp/pm.c         | 5 +++--
 net/mptcp/pm_netlink.c | 8 ++++++--
 net/mptcp/protocol.h   | 6 ++++--
 4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 0d0d2eb8c8ca..0d3c8f7e5be6 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -1132,7 +1132,7 @@ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
 		if ((mp_opt.suboptions & OPTION_MPTCP_ADD_ADDR) &&
 		    add_addr_hmac_valid(msk, &mp_opt)) {
 			if (!mp_opt.echo) {
-				mptcp_pm_add_addr_received(msk, &mp_opt.addr);
+				mptcp_pm_add_addr_received(msk, &mp_opt.addr, sk);
 				MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ADDADDR);
 			} else {
 				mptcp_pm_add_addr_echoed(msk, &mp_opt.addr);
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 6b6220895929..e5d5cb847209 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -200,14 +200,15 @@ void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk,
 }
 
 void mptcp_pm_add_addr_received(struct mptcp_sock *msk,
-				const struct mptcp_addr_info *addr)
+				const struct mptcp_addr_info *addr,
+				const struct sock *ssk)
 {
 	struct mptcp_pm_data *pm = &msk->pm;
 
 	pr_debug("msk=%p remote_id=%d accept=%d", msk, addr->id,
 		 READ_ONCE(pm->accept_addr));
 
-	mptcp_event_addr_announced(msk, addr);
+	mptcp_event_addr_announced(msk, addr, ssk);
 
 	spin_lock_bh(&pm->lock);
 
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 93800f32fcb6..f90e77c3775d 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -1972,7 +1972,8 @@ void mptcp_event_addr_removed(const struct mptcp_sock *msk, uint8_t id)
 }
 
 void mptcp_event_addr_announced(const struct mptcp_sock *msk,
-				const struct mptcp_addr_info *info)
+				const struct mptcp_addr_info *info,
+				const struct sock *ssk)
 {
 	struct net *net = sock_net((const struct sock *)msk);
 	struct nlmsghdr *nlh;
@@ -1996,7 +1997,10 @@ void mptcp_event_addr_announced(const struct mptcp_sock *msk,
 	if (nla_put_u8(skb, MPTCP_ATTR_REM_ID, info->id))
 		goto nla_put_failure;
 
-	if (nla_put_be16(skb, MPTCP_ATTR_DPORT, info->port))
+	if (nla_put_be16(skb, MPTCP_ATTR_DPORT,
+			 info->port  == 0 ?
+			 ((struct inet_sock *)inet_sk(ssk))->inet_dport :
+			 info->port))
 		goto nla_put_failure;
 
 	switch (info->family) {
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index c47d69a42fcb..d20c65fcba89 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -753,7 +753,8 @@ bool mptcp_pm_nl_check_work_pending(struct mptcp_sock *msk);
 void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk,
 				 const struct mptcp_subflow_context *subflow);
 void mptcp_pm_add_addr_received(struct mptcp_sock *msk,
-				const struct mptcp_addr_info *addr);
+				const struct mptcp_addr_info *addr,
+				const struct sock *ssk);
 void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk,
 			      struct mptcp_addr_info *addr);
 void mptcp_pm_add_addr_send_ack(struct mptcp_sock *msk);
@@ -781,7 +782,8 @@ int mptcp_pm_remove_subflow(struct mptcp_sock *msk, const struct mptcp_rm_list *
 
 void mptcp_event(enum mptcp_event_type type, const struct mptcp_sock *msk,
 		 const struct sock *ssk, gfp_t gfp);
-void mptcp_event_addr_announced(const struct mptcp_sock *msk, const struct mptcp_addr_info *info);
+void mptcp_event_addr_announced(const struct mptcp_sock *msk, const struct mptcp_addr_info *info,
+				const struct sock *ssk);
 void mptcp_event_addr_removed(const struct mptcp_sock *msk, u8 id);
 
 static inline bool mptcp_pm_should_add_signal(struct mptcp_sock *msk)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH mptcp-next v3 4/8] mptcp: establish subflows from either end of connection
  2022-01-28  0:38 [PATCH mptcp-next v3 0/8] mptcp: fixes and enhancements related to path management Kishen Maloor
                   ` (2 preceding siblings ...)
  2022-01-28  0:38 ` [PATCH mptcp-next v3 3/8] mptcp: reflect remote port (not 0) in ANNOUNCED events Kishen Maloor
@ 2022-01-28  0:38 ` Kishen Maloor
  2022-01-28  0:38 ` [PATCH mptcp-next v3 5/8] mptcp: netlink: store per namespace list of refcounted listen socks Kishen Maloor
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Kishen Maloor @ 2022-01-28  0:38 UTC (permalink / raw)
  To: kishen.maloor, mptcp

This change updates internal logic to permit subflows to be
established from either the client or server ends of MPTCP
connections. This symmetry and added flexibility may be
harnessed by PM implementations running on either end in
creating new subflows.

The essence of this change lies in not relying on the
"server_side" flag (which continues to be available if needed).

Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
---
v2: check for 3rd ACK retransmission only on passive side
of the MPJ handshake
v3: check for active subflow socket in subflow_simultaneous_connect
---
 net/mptcp/options.c  | 2 +-
 net/mptcp/protocol.c | 5 +----
 net/mptcp/protocol.h | 8 ++++++--
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 0d3c8f7e5be6..947820a9da0d 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -930,7 +930,7 @@ static bool check_fully_established(struct mptcp_sock *msk, struct sock *ssk,
 		if (TCP_SKB_CB(skb)->seq == subflow->ssn_offset + 1 &&
 		    TCP_SKB_CB(skb)->end_seq == TCP_SKB_CB(skb)->seq &&
 		    subflow->mp_join && (mp_opt->suboptions & OPTIONS_MPTCP_MPJ) &&
-		    READ_ONCE(msk->pm.server_side))
+		    !subflow->request_join)
 			tcp_send_ack(ssk);
 		goto fully_established;
 	}
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 526a0f5ba415..7c591177c3e8 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3256,15 +3256,12 @@ bool mptcp_finish_join(struct sock *ssk)
 		return false;
 	}
 
-	if (!msk->pm.server_side)
+	if (!list_empty(&subflow->node))
 		goto out;
 
 	if (!mptcp_pm_allow_new_subflow(msk))
 		goto err_prohibited;
 
-	if (WARN_ON_ONCE(!list_empty(&subflow->node)))
-		goto err_prohibited;
-
 	/* active connections are already on conn_list.
 	 * If we can't acquire msk socket lock here, let the release callback
 	 * handle it
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index d20c65fcba89..30006735afb7 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -906,13 +906,17 @@ static inline bool mptcp_check_infinite_map(struct sk_buff *skb)
 	return false;
 }
 
+static inline bool is_active_ssk(struct mptcp_subflow_context *subflow)
+{
+	return (subflow->request_mptcp || subflow->request_join);
+}
+
 static inline bool subflow_simultaneous_connect(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
-	struct sock *parent = subflow->conn;
 
 	return sk->sk_state == TCP_ESTABLISHED &&
-	       !mptcp_sk(parent)->pm.server_side &&
+	       is_active_ssk(subflow) &&
 	       !subflow->conn_finished;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH mptcp-next v3 5/8] mptcp: netlink: store per namespace list of refcounted listen socks
  2022-01-28  0:38 [PATCH mptcp-next v3 0/8] mptcp: fixes and enhancements related to path management Kishen Maloor
                   ` (3 preceding siblings ...)
  2022-01-28  0:38 ` [PATCH mptcp-next v3 4/8] mptcp: establish subflows from either end of connection Kishen Maloor
@ 2022-01-28  0:38 ` Kishen Maloor
  2022-01-28  0:38 ` [PATCH mptcp-next v3 6/8] mptcp: netlink: store lsk ref in mptcp_pm_addr_entry Kishen Maloor
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Kishen Maloor @ 2022-01-28  0:38 UTC (permalink / raw)
  To: kishen.maloor, mptcp

The kernel can create listening sockets bound to announced addresses
via the ADD_ADDR option for receiving MP_JOIN requests. Path
managers may further choose to advertise the same addr+port over multiple
MPTCP connections. So this change provides a simple framework to
manage a list of all distinct listning sockets created in the kernel
over a namespace by encapsulating the socket in a structure that is
ref counted and can be shared across multiple connections. The sockets
are released when there are no more references.

Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
---
v2: fixed formatting
---
 net/mptcp/pm_netlink.c | 76 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index f90e77c3775d..aac9438dbf6c 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -22,6 +22,14 @@ static struct genl_family mptcp_genl_family;
 
 static int pm_nl_pernet_id;
 
+struct mptcp_local_lsk {
+	struct list_head	list;
+	struct mptcp_addr_info	addr;
+	struct socket		*lsk;
+	struct rcu_head		rcu;
+	refcount_t		refcount;
+};
+
 struct mptcp_pm_addr_entry {
 	struct list_head	list;
 	struct mptcp_addr_info	addr;
@@ -41,7 +49,10 @@ struct mptcp_pm_add_entry {
 struct pm_nl_pernet {
 	/* protects pernet updates */
 	spinlock_t		lock;
+	/* protects access to pernet lsk list */
+	spinlock_t		lsk_list_lock;
 	struct list_head	local_addr_list;
+	struct list_head	lsk_list;
 	unsigned int		addrs;
 	unsigned int		stale_loss_cnt;
 	unsigned int		add_addr_signal_max;
@@ -83,6 +94,69 @@ static bool addresses_equal(const struct mptcp_addr_info *a,
 	return a->port == b->port;
 }
 
+static struct mptcp_local_lsk *lsk_list_find(struct pm_nl_pernet *pernet,
+					     struct mptcp_addr_info *addr)
+{
+	struct mptcp_local_lsk *lsk_ref = NULL;
+	struct mptcp_local_lsk *i;
+
+	rcu_read_lock();
+
+	list_for_each_entry_rcu(i, &pernet->lsk_list, list) {
+		if (addresses_equal(&i->addr, addr, true)) {
+			if (refcount_inc_not_zero(&i->refcount)) {
+				lsk_ref = i;
+				break;
+			}
+		}
+	}
+
+	rcu_read_unlock();
+
+	return lsk_ref;
+}
+
+static void lsk_list_add_ref(struct mptcp_local_lsk *lsk_ref)
+{
+	refcount_inc(&lsk_ref->refcount);
+}
+
+static struct mptcp_local_lsk *lsk_list_add(struct pm_nl_pernet *pernet,
+					    struct mptcp_addr_info *addr,
+					    struct socket *lsk)
+{
+	struct mptcp_local_lsk *lsk_ref;
+
+	lsk_ref = kmalloc(sizeof(*lsk_ref), GFP_ATOMIC);
+
+	if (!lsk_ref)
+		return NULL;
+
+	lsk_ref->lsk = lsk;
+	memcpy(&lsk_ref->addr, addr, sizeof(struct mptcp_addr_info));
+	refcount_set(&lsk_ref->refcount, 1);
+
+	spin_lock_bh(&pernet->lsk_list_lock);
+	list_add_rcu(&lsk_ref->list, &pernet->lsk_list);
+	spin_unlock_bh(&pernet->lsk_list_lock);
+
+	return lsk_ref;
+}
+
+static void lsk_list_release(struct pm_nl_pernet *pernet,
+			     struct mptcp_local_lsk *lsk_ref)
+{
+	if (lsk_ref && refcount_dec_and_test(&lsk_ref->refcount)) {
+		sock_release(lsk_ref->lsk);
+
+		spin_lock_bh(&pernet->lsk_list_lock);
+		list_del_rcu(&lsk_ref->list);
+		spin_unlock_bh(&pernet->lsk_list_lock);
+
+		kfree_rcu(lsk_ref, rcu);
+	}
+}
+
 static bool address_zero(const struct mptcp_addr_info *addr)
 {
 	struct mptcp_addr_info zero;
@@ -2137,12 +2211,14 @@ static int __net_init pm_nl_init_net(struct net *net)
 	struct pm_nl_pernet *pernet = net_generic(net, pm_nl_pernet_id);
 
 	INIT_LIST_HEAD_RCU(&pernet->local_addr_list);
+	INIT_LIST_HEAD_RCU(&pernet->lsk_list);
 
 	/* Cit. 2 subflows ought to be enough for anybody. */
 	pernet->subflows_max = 2;
 	pernet->next_id = 1;
 	pernet->stale_loss_cnt = 4;
 	spin_lock_init(&pernet->lock);
+	spin_lock_init(&pernet->lsk_list_lock);
 
 	/* No need to initialize other pernet fields, the struct is zeroed at
 	 * allocation time.
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH mptcp-next v3 6/8] mptcp: netlink: store lsk ref in mptcp_pm_addr_entry
  2022-01-28  0:38 [PATCH mptcp-next v3 0/8] mptcp: fixes and enhancements related to path management Kishen Maloor
                   ` (4 preceding siblings ...)
  2022-01-28  0:38 ` [PATCH mptcp-next v3 5/8] mptcp: netlink: store per namespace list of refcounted listen socks Kishen Maloor
@ 2022-01-28  0:38 ` Kishen Maloor
  2022-02-01 11:31   ` Paolo Abeni
  2022-01-28  0:38 ` [PATCH mptcp-next v3 7/8] mptcp: attempt to add listening sockets for announced addrs Kishen Maloor
  2022-01-28  0:38 ` [PATCH mptcp-next v3 8/8] mptcp: expose server_side attribute in MPTCP netlink events Kishen Maloor
  7 siblings, 1 reply; 17+ messages in thread
From: Kishen Maloor @ 2022-01-28  0:38 UTC (permalink / raw)
  To: kishen.maloor, mptcp

This change updates struct mptcp_pm_addr_entry to store a
listening socket (lsk) reference, i.e. a pointer to a reference
counted structure containing the lsk (struct socket *) instead
of the lsk itself. Code blocks that previously operated on
the lsk in struct mptcp_pm_addr_entry have been updated to work
with the lsk ref instead, utilizing new helper functions.

Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
---
v2: fixed formatting
v3: added helper lsk_list_find_or_create(), updated
mptcp_pm_nl_create_listen_socket() to take struct net* as param
---
 net/mptcp/pm_netlink.c | 76 ++++++++++++++++++++++++++++++------------
 1 file changed, 55 insertions(+), 21 deletions(-)

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index aac9438dbf6c..dc02dfe917e1 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -35,7 +35,7 @@ struct mptcp_pm_addr_entry {
 	struct mptcp_addr_info	addr;
 	u8			flags;
 	int			ifindex;
-	struct socket		*lsk;
+	struct mptcp_local_lsk	*lsk_ref;
 };
 
 struct mptcp_pm_add_entry {
@@ -157,6 +157,33 @@ static void lsk_list_release(struct pm_nl_pernet *pernet,
 	}
 }
 
+static struct mptcp_local_lsk *lsk_list_find_or_create(struct net *net,
+						       struct pm_nl_pernet *pernet,
+						       struct mptcp_pm_addr_entry *entry,
+						       int *createlsk_err)
+{
+	struct mptcp_local_lsk *lsk_ref;
+	struct socket *lsk;
+	int err;
+
+	lsk_ref = lsk_list_find(pernet, &entry->addr);
+
+	if (!lsk_ref) {
+		err = mptcp_pm_nl_create_listen_socket(net, entry, &lsk);
+
+		if (createlsk_err)
+			*createlsk_err = err;
+
+		if (lsk)
+			lsk_ref = lsk_list_add(pernet, &entry->addr, lsk);
+
+		if (lsk && !lsk_ref)
+			sock_release(lsk);
+	}
+
+	return lsk_ref;
+}
+
 static bool address_zero(const struct mptcp_addr_info *addr)
 {
 	struct mptcp_addr_info zero;
@@ -996,8 +1023,9 @@ static int mptcp_pm_nl_append_new_local_addr(struct pm_nl_pernet *pernet,
 	return ret;
 }
 
-static int mptcp_pm_nl_create_listen_socket(struct sock *sk,
-					    struct mptcp_pm_addr_entry *entry)
+static int mptcp_pm_nl_create_listen_socket(struct net *net,
+					    struct mptcp_pm_addr_entry *entry,
+					    struct socket **lsk)
 {
 	int addrlen = sizeof(struct sockaddr_in);
 	struct sockaddr_storage addr;
@@ -1006,12 +1034,12 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk,
 	int backlog = 1024;
 	int err;
 
-	err = sock_create_kern(sock_net(sk), entry->addr.family,
-			       SOCK_STREAM, IPPROTO_MPTCP, &entry->lsk);
+	err = sock_create_kern(net, entry->addr.family,
+			       SOCK_STREAM, IPPROTO_MPTCP, lsk);
 	if (err)
 		return err;
 
-	msk = mptcp_sk(entry->lsk->sk);
+	msk = mptcp_sk((*lsk)->sk);
 	if (!msk) {
 		err = -EINVAL;
 		goto out;
@@ -1043,7 +1071,8 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk,
 	return 0;
 
 out:
-	sock_release(entry->lsk);
+	sock_release(*lsk);
+	*lsk = NULL;
 	return err;
 }
 
@@ -1092,7 +1121,7 @@ int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct sock_common *skc)
 	entry->addr.port = 0;
 	entry->ifindex = 0;
 	entry->flags = 0;
-	entry->lsk = NULL;
+	entry->lsk_ref = NULL;
 	ret = mptcp_pm_nl_append_new_local_addr(pernet, entry);
 	if (ret < 0)
 		kfree(entry);
@@ -1301,18 +1330,22 @@ static int mptcp_nl_cmd_add_addr(struct sk_buff *skb, struct genl_info *info)
 
 	*entry = addr;
 	if (entry->addr.port) {
-		ret = mptcp_pm_nl_create_listen_socket(skb->sk, entry);
-		if (ret) {
-			GENL_SET_ERR_MSG(info, "create listen socket error");
+		entry->lsk_ref = lsk_list_find_or_create(sock_net(skb->sk), pernet, entry, &ret);
+
+		if (!entry->lsk_ref) {
+			GENL_SET_ERR_MSG(info, "can't create/allocate lsk");
 			kfree(entry);
+			ret = (ret == 0) ? -ENOMEM : ret;
 			return ret;
 		}
 	}
+
 	ret = mptcp_pm_nl_append_new_local_addr(pernet, entry);
+
 	if (ret < 0) {
 		GENL_SET_ERR_MSG(info, "too many addresses or duplicate one");
-		if (entry->lsk)
-			sock_release(entry->lsk);
+		if (entry->lsk_ref)
+			lsk_list_release(pernet, entry->lsk_ref);
 		kfree(entry);
 		return ret;
 	}
@@ -1415,10 +1448,11 @@ static int mptcp_nl_remove_subflow_and_signal_addr(struct net *net,
 }
 
 /* caller must ensure the RCU grace period is already elapsed */
-static void __mptcp_pm_release_addr_entry(struct mptcp_pm_addr_entry *entry)
+static void __mptcp_pm_release_addr_entry(struct pm_nl_pernet *pernet,
+					  struct mptcp_pm_addr_entry *entry)
 {
-	if (entry->lsk)
-		sock_release(entry->lsk);
+	if (entry->lsk_ref)
+		lsk_list_release(pernet, entry->lsk_ref);
 	kfree(entry);
 }
 
@@ -1500,7 +1534,7 @@ static int mptcp_nl_cmd_del_addr(struct sk_buff *skb, struct genl_info *info)
 
 	mptcp_nl_remove_subflow_and_signal_addr(sock_net(skb->sk), &entry->addr);
 	synchronize_rcu();
-	__mptcp_pm_release_addr_entry(entry);
+	__mptcp_pm_release_addr_entry(pernet, entry);
 
 	return ret;
 }
@@ -1556,7 +1590,7 @@ static void mptcp_nl_remove_addrs_list(struct net *net,
 }
 
 /* caller must ensure the RCU grace period is already elapsed */
-static void __flush_addrs(struct list_head *list)
+static void __flush_addrs(struct pm_nl_pernet *pernet, struct list_head *list)
 {
 	while (!list_empty(list)) {
 		struct mptcp_pm_addr_entry *cur;
@@ -1564,7 +1598,7 @@ static void __flush_addrs(struct list_head *list)
 		cur = list_entry(list->next,
 				 struct mptcp_pm_addr_entry, list);
 		list_del_rcu(&cur->list);
-		__mptcp_pm_release_addr_entry(cur);
+		__mptcp_pm_release_addr_entry(pernet, cur);
 	}
 }
 
@@ -1589,7 +1623,7 @@ static int mptcp_nl_cmd_flush_addrs(struct sk_buff *skb, struct genl_info *info)
 	spin_unlock_bh(&pernet->lock);
 	mptcp_nl_remove_addrs_list(sock_net(skb->sk), &free_list);
 	synchronize_rcu();
-	__flush_addrs(&free_list);
+	__flush_addrs(pernet, &free_list);
 	return 0;
 }
 
@@ -2238,7 +2272,7 @@ static void __net_exit pm_nl_exit_net(struct list_head *net_list)
 		 * other modifiers, also netns core already waited for a
 		 * RCU grace period.
 		 */
-		__flush_addrs(&pernet->local_addr_list);
+		__flush_addrs(pernet, &pernet->local_addr_list);
 	}
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH mptcp-next v3 7/8] mptcp: attempt to add listening sockets for announced addrs
  2022-01-28  0:38 [PATCH mptcp-next v3 0/8] mptcp: fixes and enhancements related to path management Kishen Maloor
                   ` (5 preceding siblings ...)
  2022-01-28  0:38 ` [PATCH mptcp-next v3 6/8] mptcp: netlink: store lsk ref in mptcp_pm_addr_entry Kishen Maloor
@ 2022-01-28  0:38 ` Kishen Maloor
  2022-02-01 11:42   ` Paolo Abeni
  2022-01-28  0:38 ` [PATCH mptcp-next v3 8/8] mptcp: expose server_side attribute in MPTCP netlink events Kishen Maloor
  7 siblings, 1 reply; 17+ messages in thread
From: Kishen Maloor @ 2022-01-28  0:38 UTC (permalink / raw)
  To: kishen.maloor, mptcp

When ADD_ADDR announcements use the port associated with an
active subflow, this change ensures that a listening socket is bound
to the announced addr+port in the kernel for subsequently receiving
MP_JOINs. But if a listening socket for this address is already held
by the application then no action is taken.

A listening socket is created (when there isn't a listener)
just prior to the addr advertisement. If it is desired to not create
a listening socket in the kernel for an address, then this can be
requested by including the MPTCP_PM_ADDR_FLAG_NO_LISTEN flag
with the address.

When a listening socket is created, it is stored in
struct mptcp_pm_add_entry and released accordingly.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/203
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
---
v2: fixed formatting
v3: added new addr flag MPTCP_PM_ADDR_FLAG_NO_LISTEN to skip creating a
listening socket in the kernel during an ADD_ADDR request, use this flag
along the in-kernel PM flow for ADD_ADDR requests (Note: listening sockets
are always created for port-based endpoints as before), use the
lsk_list_find_or_create() helper
---
 include/uapi/linux/mptcp.h |  1 +
 net/mptcp/pm_netlink.c     | 47 ++++++++++++++++++++++++++++++++++++--
 2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h
index f106a3941cdf..265cabc0d7aa 100644
--- a/include/uapi/linux/mptcp.h
+++ b/include/uapi/linux/mptcp.h
@@ -81,6 +81,7 @@ enum {
 #define MPTCP_PM_ADDR_FLAG_SUBFLOW			(1 << 1)
 #define MPTCP_PM_ADDR_FLAG_BACKUP			(1 << 2)
 #define MPTCP_PM_ADDR_FLAG_FULLMESH			(1 << 3)
+#define MPTCP_PM_ADDR_FLAG_NO_LISTEN			(1 << 4)
 
 enum {
 	MPTCP_PM_CMD_UNSPEC,
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index dc02dfe917e1..ceb4517a6e2b 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -43,6 +43,7 @@ struct mptcp_pm_add_entry {
 	struct mptcp_addr_info	addr;
 	struct timer_list	add_timer;
 	struct mptcp_sock	*sock;
+	struct mptcp_local_lsk	*lsk_ref;
 	u8			retrans_times;
 };
 
@@ -66,6 +67,10 @@ struct pm_nl_pernet {
 #define MPTCP_PM_ADDR_MAX	8
 #define ADD_ADDR_RETRANS_MAX	3
 
+static int mptcp_pm_nl_create_listen_socket(struct net *net,
+					    struct mptcp_pm_addr_entry *entry,
+					    struct socket **lsk);
+
 static bool addresses_equal(const struct mptcp_addr_info *a,
 			    const struct mptcp_addr_info *b, bool use_port)
 {
@@ -465,7 +470,8 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk,
 }
 
 static bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
-				     struct mptcp_pm_addr_entry *entry)
+				     struct mptcp_pm_addr_entry *entry,
+				     struct mptcp_local_lsk *lsk_ref)
 {
 	struct mptcp_pm_add_entry *add_entry = NULL;
 	struct sock *sk = (struct sock *)msk;
@@ -485,6 +491,10 @@ static bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
 	add_entry->addr = entry->addr;
 	add_entry->sock = msk;
 	add_entry->retrans_times = 0;
+	add_entry->lsk_ref = lsk_ref;
+
+	if (lsk_ref)
+		lsk_list_add_ref(lsk_ref);
 
 	timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0);
 	sk_reset_timer(sk, &add_entry->add_timer,
@@ -497,8 +507,11 @@ void mptcp_pm_free_anno_list(struct mptcp_sock *msk)
 {
 	struct mptcp_pm_add_entry *entry, *tmp;
 	struct sock *sk = (struct sock *)msk;
+	struct pm_nl_pernet *pernet;
 	LIST_HEAD(free_list);
 
+	pernet = net_generic(sock_net(sk), pm_nl_pernet_id);
+
 	pr_debug("msk=%p", msk);
 
 	spin_lock_bh(&msk->pm.lock);
@@ -507,6 +520,8 @@ void mptcp_pm_free_anno_list(struct mptcp_sock *msk)
 
 	list_for_each_entry_safe(entry, tmp, &free_list, list) {
 		sk_stop_timer_sync(sk, &entry->add_timer);
+		if (entry->lsk_ref)
+			lsk_list_release(pernet, entry->lsk_ref);
 		kfree(entry);
 	}
 }
@@ -611,7 +626,9 @@ lookup_id_by_addr(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *add
 }
 
 static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
+	__must_hold(&msk->pm.lock)
 {
+	struct mptcp_local_lsk *lsk_ref = NULL;
 	struct sock *sk = (struct sock *)msk;
 	struct mptcp_pm_addr_entry *local;
 	unsigned int add_addr_signal_max;
@@ -648,12 +665,31 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
 		local = select_signal_address(pernet, msk);
 
 		if (local) {
-			if (mptcp_pm_alloc_anno_list(msk, local)) {
+			if (!(local->flags & MPTCP_PM_ADDR_FLAG_NO_LISTEN) &&
+			    !local->addr.port) {
+				local->addr.port =
+					((struct inet_sock *)inet_sk
+					 ((struct sock *)msk))->inet_sport;
+
+				spin_unlock_bh(&msk->pm.lock);
+
+				lsk_ref = lsk_list_find_or_create(sock_net(sk), pernet,
+								  local, NULL);
+
+				spin_lock_bh(&msk->pm.lock);
+
+				local->addr.port = 0;
+			}
+
+			if (mptcp_pm_alloc_anno_list(msk, local, lsk_ref)) {
 				__clear_bit(local->addr.id, msk->pm.id_avail_bitmap);
 				msk->pm.add_addr_signaled++;
 				mptcp_pm_announce_addr(msk, &local->addr, false);
 				mptcp_pm_nl_addr_send_ack(msk);
 			}
+
+			if (lsk_ref)
+				lsk_list_release(pernet, lsk_ref);
 		}
 	}
 
@@ -745,6 +781,7 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk,
 }
 
 static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
+	__must_hold(&msk->pm.lock)
 {
 	struct mptcp_addr_info addrs[MPTCP_PM_ADDR_MAX];
 	struct sock *sk = (struct sock *)msk;
@@ -1379,11 +1416,17 @@ int mptcp_pm_get_flags_and_ifindex_by_id(struct net *net, unsigned int id,
 static bool remove_anno_list_by_saddr(struct mptcp_sock *msk,
 				      struct mptcp_addr_info *addr)
 {
+	struct sock *sk = (struct sock *)msk;
 	struct mptcp_pm_add_entry *entry;
+	struct pm_nl_pernet *pernet;
+
+	pernet = net_generic(sock_net(sk), pm_nl_pernet_id);
 
 	entry = mptcp_pm_del_add_timer(msk, addr, false);
 	if (entry) {
 		list_del(&entry->list);
+		if (entry->lsk_ref)
+			lsk_list_release(pernet, entry->lsk_ref);
 		kfree(entry);
 		return true;
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH mptcp-next v3 8/8] mptcp: expose server_side attribute in MPTCP netlink events
  2022-01-28  0:38 [PATCH mptcp-next v3 0/8] mptcp: fixes and enhancements related to path management Kishen Maloor
                   ` (6 preceding siblings ...)
  2022-01-28  0:38 ` [PATCH mptcp-next v3 7/8] mptcp: attempt to add listening sockets for announced addrs Kishen Maloor
@ 2022-01-28  0:38 ` Kishen Maloor
  7 siblings, 0 replies; 17+ messages in thread
From: Kishen Maloor @ 2022-01-28  0:38 UTC (permalink / raw)
  To: kishen.maloor, mptcp

This change records the server_side attribute in MPTCP_EVENT_CREATED
and MPTCP_EVENT_ESTABLISHED events to inform the recipient of the role
of the associated MPTCP application (Client/Server) that is handling
it's end of the MPTCP connection.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/246
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
---
 include/uapi/linux/mptcp.h | 1 +
 net/mptcp/pm_netlink.c     | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h
index 265cabc0d7aa..0df44a116a31 100644
--- a/include/uapi/linux/mptcp.h
+++ b/include/uapi/linux/mptcp.h
@@ -188,6 +188,7 @@ enum mptcp_event_attr {
 	MPTCP_ATTR_IF_IDX,	/* s32 */
 	MPTCP_ATTR_RESET_REASON,/* u32 */
 	MPTCP_ATTR_RESET_FLAGS, /* u32 */
+	MPTCP_ATTR_SERVER_SIDE,	/* u8 */
 
 	__MPTCP_ATTR_AFTER_LAST
 };
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index ceb4517a6e2b..126cc961a4fd 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -2088,6 +2088,9 @@ static int mptcp_event_created(struct sk_buff *skb,
 	if (err)
 		return err;
 
+	if (nla_put_u8(skb, MPTCP_ATTR_SERVER_SIDE, READ_ONCE(msk->pm.server_side)))
+		return -EMSGSIZE;
+
 	return mptcp_event_add_subflow(skb, ssk);
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH mptcp-next v3 3/8] mptcp: reflect remote port (not 0) in ANNOUNCED events
  2022-01-28  0:38 ` [PATCH mptcp-next v3 3/8] mptcp: reflect remote port (not 0) in ANNOUNCED events Kishen Maloor
@ 2022-01-28  4:07   ` Geliang Tang
  2022-01-31 22:22     ` Kishen Maloor
  0 siblings, 1 reply; 17+ messages in thread
From: Geliang Tang @ 2022-01-28  4:07 UTC (permalink / raw)
  To: Kishen Maloor; +Cc: MPTCP Upstream

Hi Kishen,

Kishen Maloor <kishen.maloor@intel.com> 于2022年1月28日周五 08:38写道:
>
> Per RFC 8684, if no port is specified in an ADD_ADDR message, MPTCP
> SHOULD attempt to connect to the specified address on the same port
> as the port that is already in use by the subflow on which the
> ADD_ADDR signal was sent.
>
> To facilitate that, this change reflects the specific remote port in
> use by that subflow in MPTCP_EVENT_ANNOUNCED events.
>
> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
> ---
>  net/mptcp/options.c    | 2 +-
>  net/mptcp/pm.c         | 5 +++--
>  net/mptcp/pm_netlink.c | 8 ++++++--
>  net/mptcp/protocol.h   | 6 ++++--
>  4 files changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/net/mptcp/options.c b/net/mptcp/options.c
> index 0d0d2eb8c8ca..0d3c8f7e5be6 100644
> --- a/net/mptcp/options.c
> +++ b/net/mptcp/options.c
> @@ -1132,7 +1132,7 @@ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
>                 if ((mp_opt.suboptions & OPTION_MPTCP_ADD_ADDR) &&
>                     add_addr_hmac_valid(msk, &mp_opt)) {
>                         if (!mp_opt.echo) {
> -                               mptcp_pm_add_addr_received(msk, &mp_opt.addr);
> +                               mptcp_pm_add_addr_received(msk, &mp_opt.addr, sk);
>                                 MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ADDADDR);
>                         } else {
>                                 mptcp_pm_add_addr_echoed(msk, &mp_opt.addr);
> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> index 6b6220895929..e5d5cb847209 100644
> --- a/net/mptcp/pm.c
> +++ b/net/mptcp/pm.c
> @@ -200,14 +200,15 @@ void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk,
>  }
>
>  void mptcp_pm_add_addr_received(struct mptcp_sock *msk,
> -                               const struct mptcp_addr_info *addr)
> +                               const struct mptcp_addr_info *addr,
> +                               const struct sock *ssk)

I think the better parameters order of this function is:

(struct mptcp_sock *msk, const struct sock *ssk, const struct
mptcp_addr_info *addr)

Put the new parameter ssk just after msk.

Furthermore, I think instead of adding a new parameters ssk here, it's
better to change the parameter msk to ssk.

We can get the msk from ssk like this:

    struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
    struct mptcp_sock *msk = mptcp_sk(subflow->conn);

>  {
>         struct mptcp_pm_data *pm = &msk->pm;
>
>         pr_debug("msk=%p remote_id=%d accept=%d", msk, addr->id,
>                  READ_ONCE(pm->accept_addr));
>
> -       mptcp_event_addr_announced(msk, addr);
> +       mptcp_event_addr_announced(msk, addr, ssk);
>
>         spin_lock_bh(&pm->lock);
>
> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
> index 93800f32fcb6..f90e77c3775d 100644
> --- a/net/mptcp/pm_netlink.c
> +++ b/net/mptcp/pm_netlink.c
> @@ -1972,7 +1972,8 @@ void mptcp_event_addr_removed(const struct mptcp_sock *msk, uint8_t id)
>  }
>
>  void mptcp_event_addr_announced(const struct mptcp_sock *msk,
> -                               const struct mptcp_addr_info *info)
> +                               const struct mptcp_addr_info *info,
> +                               const struct sock *ssk)

Here change the parameter msk to ssk too.

>  {
>         struct net *net = sock_net((const struct sock *)msk);

    struct net *net = sock_net(ssk);

>         struct nlmsghdr *nlh;
> @@ -1996,7 +1997,10 @@ void mptcp_event_addr_announced(const struct mptcp_sock *msk,
>         if (nla_put_u8(skb, MPTCP_ATTR_REM_ID, info->id))
>                 goto nla_put_failure;
>
> -       if (nla_put_be16(skb, MPTCP_ATTR_DPORT, info->port))
> +       if (nla_put_be16(skb, MPTCP_ATTR_DPORT,
> +                        info->port  == 0 ?

Here two blanks before '=='.

Thanks,
-Geliang

> +                        ((struct inet_sock *)inet_sk(ssk))->inet_dport :
> +                        info->port))
>                 goto nla_put_failure;
>
>         switch (info->family) {
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index c47d69a42fcb..d20c65fcba89 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -753,7 +753,8 @@ bool mptcp_pm_nl_check_work_pending(struct mptcp_sock *msk);
>  void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk,
>                                  const struct mptcp_subflow_context *subflow);
>  void mptcp_pm_add_addr_received(struct mptcp_sock *msk,
> -                               const struct mptcp_addr_info *addr);
> +                               const struct mptcp_addr_info *addr,
> +                               const struct sock *ssk);
>  void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk,
>                               struct mptcp_addr_info *addr);
>  void mptcp_pm_add_addr_send_ack(struct mptcp_sock *msk);
> @@ -781,7 +782,8 @@ int mptcp_pm_remove_subflow(struct mptcp_sock *msk, const struct mptcp_rm_list *
>
>  void mptcp_event(enum mptcp_event_type type, const struct mptcp_sock *msk,
>                  const struct sock *ssk, gfp_t gfp);
> -void mptcp_event_addr_announced(const struct mptcp_sock *msk, const struct mptcp_addr_info *info);
> +void mptcp_event_addr_announced(const struct mptcp_sock *msk, const struct mptcp_addr_info *info,
> +                               const struct sock *ssk);
>  void mptcp_event_addr_removed(const struct mptcp_sock *msk, u8 id);
>
>  static inline bool mptcp_pm_should_add_signal(struct mptcp_sock *msk)
> --
> 2.31.1
>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH mptcp-next v3 3/8] mptcp: reflect remote port (not 0) in ANNOUNCED events
  2022-01-28  4:07   ` Geliang Tang
@ 2022-01-31 22:22     ` Kishen Maloor
  0 siblings, 0 replies; 17+ messages in thread
From: Kishen Maloor @ 2022-01-31 22:22 UTC (permalink / raw)
  To: Geliang Tang; +Cc: MPTCP Upstream

On 1/27/22 8:07 PM, Geliang Tang wrote:
> Hi Kishen,
> 
> Kishen Maloor <kishen.maloor@intel.com> 于2022年1月28日周五 08:38写道:
>>
>> Per RFC 8684, if no port is specified in an ADD_ADDR message, MPTCP
>> SHOULD attempt to connect to the specified address on the same port
>> as the port that is already in use by the subflow on which the
>> ADD_ADDR signal was sent.
>>
>> To facilitate that, this change reflects the specific remote port in
>> use by that subflow in MPTCP_EVENT_ANNOUNCED events.
>>
>> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
>> ---
>>  net/mptcp/options.c    | 2 +-
>>  net/mptcp/pm.c         | 5 +++--
>>  net/mptcp/pm_netlink.c | 8 ++++++--
>>  net/mptcp/protocol.h   | 6 ++++--
>>  4 files changed, 14 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/mptcp/options.c b/net/mptcp/options.c
>> index 0d0d2eb8c8ca..0d3c8f7e5be6 100644
>> --- a/net/mptcp/options.c
>> +++ b/net/mptcp/options.c
>> @@ -1132,7 +1132,7 @@ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
>>                 if ((mp_opt.suboptions & OPTION_MPTCP_ADD_ADDR) &&
>>                     add_addr_hmac_valid(msk, &mp_opt)) {
>>                         if (!mp_opt.echo) {
>> -                               mptcp_pm_add_addr_received(msk, &mp_opt.addr);
>> +                               mptcp_pm_add_addr_received(msk, &mp_opt.addr, sk);
>>                                 MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ADDADDR);
>>                         } else {
>>                                 mptcp_pm_add_addr_echoed(msk, &mp_opt.addr);
>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
>> index 6b6220895929..e5d5cb847209 100644
>> --- a/net/mptcp/pm.c
>> +++ b/net/mptcp/pm.c
>> @@ -200,14 +200,15 @@ void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk,
>>  }
>>
>>  void mptcp_pm_add_addr_received(struct mptcp_sock *msk,
>> -                               const struct mptcp_addr_info *addr)
>> +                               const struct mptcp_addr_info *addr,
>> +                               const struct sock *ssk)
> 
> I think the better parameters order of this function is:
> 
> (struct mptcp_sock *msk, const struct sock *ssk, const struct
> mptcp_addr_info *addr)
> 
> Put the new parameter ssk just after msk.
> 
> Furthermore, I think instead of adding a new parameters ssk here, it's
> better to change the parameter msk to ssk.
> 
> We can get the msk from ssk like this:
> 
>     struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
>     struct mptcp_sock *msk = mptcp_sk(subflow->conn);
> 

Thanks! Your suggestions look fine to me. I shall reflect this in v4.

>>  {
>>         struct mptcp_pm_data *pm = &msk->pm;
>>
>>         pr_debug("msk=%p remote_id=%d accept=%d", msk, addr->id,
>>                  READ_ONCE(pm->accept_addr));
>>
>> -       mptcp_event_addr_announced(msk, addr);
>> +       mptcp_event_addr_announced(msk, addr, ssk);
>>
>>         spin_lock_bh(&pm->lock);
>>
>> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
>> index 93800f32fcb6..f90e77c3775d 100644
>> --- a/net/mptcp/pm_netlink.c
>> +++ b/net/mptcp/pm_netlink.c
>> @@ -1972,7 +1972,8 @@ void mptcp_event_addr_removed(const struct mptcp_sock *msk, uint8_t id)
>>  }
>>
>>  void mptcp_event_addr_announced(const struct mptcp_sock *msk,
>> -                               const struct mptcp_addr_info *info)
>> +                               const struct mptcp_addr_info *info,
>> +                               const struct sock *ssk)
> 
> Here change the parameter msk to ssk too.
> 
>>  {
>>         struct net *net = sock_net((const struct sock *)msk);
> 
>     struct net *net = sock_net(ssk);
> 
>>         struct nlmsghdr *nlh;
>> @@ -1996,7 +1997,10 @@ void mptcp_event_addr_announced(const struct mptcp_sock *msk,
>>         if (nla_put_u8(skb, MPTCP_ATTR_REM_ID, info->id))
>>                 goto nla_put_failure;
>>
>> -       if (nla_put_be16(skb, MPTCP_ATTR_DPORT, info->port))
>> +       if (nla_put_be16(skb, MPTCP_ATTR_DPORT,
>> +                        info->port  == 0 ?
> 
> Here two blanks before '=='.
> 
> Thanks,
> -Geliang
> 
>> +                        ((struct inet_sock *)inet_sk(ssk))->inet_dport :
>> +                        info->port))
>>                 goto nla_put_failure;
>>
>>         switch (info->family) {
>> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
>> index c47d69a42fcb..d20c65fcba89 100644
>> --- a/net/mptcp/protocol.h
>> +++ b/net/mptcp/protocol.h
>> @@ -753,7 +753,8 @@ bool mptcp_pm_nl_check_work_pending(struct mptcp_sock *msk);
>>  void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk,
>>                                  const struct mptcp_subflow_context *subflow);
>>  void mptcp_pm_add_addr_received(struct mptcp_sock *msk,
>> -                               const struct mptcp_addr_info *addr);
>> +                               const struct mptcp_addr_info *addr,
>> +                               const struct sock *ssk);
>>  void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk,
>>                               struct mptcp_addr_info *addr);
>>  void mptcp_pm_add_addr_send_ack(struct mptcp_sock *msk);
>> @@ -781,7 +782,8 @@ int mptcp_pm_remove_subflow(struct mptcp_sock *msk, const struct mptcp_rm_list *
>>
>>  void mptcp_event(enum mptcp_event_type type, const struct mptcp_sock *msk,
>>                  const struct sock *ssk, gfp_t gfp);
>> -void mptcp_event_addr_announced(const struct mptcp_sock *msk, const struct mptcp_addr_info *info);
>> +void mptcp_event_addr_announced(const struct mptcp_sock *msk, const struct mptcp_addr_info *info,
>> +                               const struct sock *ssk);
>>  void mptcp_event_addr_removed(const struct mptcp_sock *msk, u8 id);
>>
>>  static inline bool mptcp_pm_should_add_signal(struct mptcp_sock *msk)
>> --
>> 2.31.1
>>
>>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH mptcp-next v3 6/8] mptcp: netlink: store lsk ref in mptcp_pm_addr_entry
  2022-01-28  0:38 ` [PATCH mptcp-next v3 6/8] mptcp: netlink: store lsk ref in mptcp_pm_addr_entry Kishen Maloor
@ 2022-02-01 11:31   ` Paolo Abeni
  2022-02-01 21:19     ` Kishen Maloor
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Abeni @ 2022-02-01 11:31 UTC (permalink / raw)
  To: Kishen Maloor, mptcp

Hello,

On Thu, 2022-01-27 at 19:38 -0500, Kishen Maloor wrote:
> @@ -157,6 +157,33 @@ static void lsk_list_release(struct pm_nl_pernet *pernet,
>  	}
>  }
>  
> +static struct mptcp_local_lsk *lsk_list_find_or_create(struct net *net,
> +						       struct pm_nl_pernet *pernet,
> +						       struct mptcp_pm_addr_entry *entry,
> +						       int *createlsk_err)
> +{
> +	struct mptcp_local_lsk *lsk_ref;
> +	struct socket *lsk;
> +	int err;
> +
> +	lsk_ref = lsk_list_find(pernet, &entry->addr);
> +
> +	if (!lsk_ref) {
> +		err = mptcp_pm_nl_create_listen_socket(net, entry, &lsk);

What happens if multiple cores call 'lsk_list_find_or_create'
simultaneously? Is that possible/expected?

I think the expected behaviour in that scenario is creating a single
new lsk, and have all the callers fetching such instances. If the race
happens on mptcp_pm_nl_create_listen_socket() it looks like only one
caller will get a valid lsk reference, all the others will get back an
error.

Possibly calling again lsk_list_find() in case of failure could address
the above.

If the race is not possible, it should be at least documented in a
comment why it can't happen.

Side note: using:

	if (lsk_ref)
		return lsk_ref;

instead of:
	if (!lsk_ref) { //...

will reduce the indentation level.

/P


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH mptcp-next v3 7/8] mptcp: attempt to add listening sockets for announced addrs
  2022-01-28  0:38 ` [PATCH mptcp-next v3 7/8] mptcp: attempt to add listening sockets for announced addrs Kishen Maloor
@ 2022-02-01 11:42   ` Paolo Abeni
  2022-02-01 17:25     ` Matthieu Baerts
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Abeni @ 2022-02-01 11:42 UTC (permalink / raw)
  To: Kishen Maloor, mptcp

On Thu, 2022-01-27 at 19:38 -0500, Kishen Maloor wrote:
> When ADD_ADDR announcements use the port associated with an
> active subflow, this change ensures that a listening socket is bound
> to the announced addr+port in the kernel for subsequently receiving
> MP_JOINs. But if a listening socket for this address is already held
> by the application then no action is taken.
> 
> A listening socket is created (when there isn't a listener)
> just prior to the addr advertisement. If it is desired to not create
> a listening socket in the kernel for an address, then this can be
> requested by including the MPTCP_PM_ADDR_FLAG_NO_LISTEN flag
> with the address.
> 
> When a listening socket is created, it is stored in
> struct mptcp_pm_add_entry and released accordingly.
> 
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/203
> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
> ---
> v2: fixed formatting
> v3: added new addr flag MPTCP_PM_ADDR_FLAG_NO_LISTEN to skip creating a
> listening socket in the kernel during an ADD_ADDR request, use this flag
> along the in-kernel PM flow for ADD_ADDR requests (Note: listening sockets
> are always created for port-based endpoints as before), use the
> lsk_list_find_or_create() helper

I think it's better introducing the opposite flag (e.g.
'MPTCP_PM_ADDR_FLAG_LISTEN') otherwise this will change the default
behavior

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH mptcp-next v3 7/8] mptcp: attempt to add listening sockets for announced addrs
  2022-02-01 11:42   ` Paolo Abeni
@ 2022-02-01 17:25     ` Matthieu Baerts
  2022-02-01 21:21       ` Kishen Maloor
  0 siblings, 1 reply; 17+ messages in thread
From: Matthieu Baerts @ 2022-02-01 17:25 UTC (permalink / raw)
  To: Paolo Abeni, Kishen Maloor; +Cc: mptcp

Hi Paolo, Kishen,

On 01/02/2022 12:42, Paolo Abeni wrote:
> On Thu, 2022-01-27 at 19:38 -0500, Kishen Maloor wrote:
>> When ADD_ADDR announcements use the port associated with an
>> active subflow, this change ensures that a listening socket is bound
>> to the announced addr+port in the kernel for subsequently receiving
>> MP_JOINs. But if a listening socket for this address is already held
>> by the application then no action is taken.
>>
>> A listening socket is created (when there isn't a listener)
>> just prior to the addr advertisement. If it is desired to not create
>> a listening socket in the kernel for an address, then this can be
>> requested by including the MPTCP_PM_ADDR_FLAG_NO_LISTEN flag
>> with the address.
>>
>> When a listening socket is created, it is stored in
>> struct mptcp_pm_add_entry and released accordingly.
>>
>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/203
>> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
>> ---
>> v2: fixed formatting
>> v3: added new addr flag MPTCP_PM_ADDR_FLAG_NO_LISTEN to skip creating a
>> listening socket in the kernel during an ADD_ADDR request, use this flag
>> along the in-kernel PM flow for ADD_ADDR requests (Note: listening sockets
>> are always created for port-based endpoints as before), use the
>> lsk_list_find_or_create() helper
> 
> I think it's better introducing the opposite flag (e.g.
> 'MPTCP_PM_ADDR_FLAG_LISTEN') otherwise this will change the default
> behavior

Maybe it is fine to change the behaviour. Without changing the user
exposed API of course.

I mean: it all depends if we consider the fact that when the userspace
closes the listening socket to accept new "MPTCP" connections, it is not
normal (bug) to close the possibility to create new subflows → socket
controlled by the user vs socket controlled by the PM. If we do consider
this as a "bug", then that's OK to change the default behaviour, no?

I don't know if other people are sharing my view here.

For me, if an app closes the listening socket after having accepted a
new connection, it is just not to receive new "main" connections on this
socket but it is OK to accept new subflows as they are part of existing
connections (and managed by the PM).
We would then avoid people hitting issues like #203. If you hit this
issue, it is not easy to find the answer I think.

If people know they don't need/want the creation of a listening socket
only to accept new subflows, they can set the NO_LISTEN flag when adding
an address with "ip mptcp". But if the cost is minimal most of the time
because no additional listening sockets will be actually created, that's
fine to use a "NO_LISTEN" flag I think.

WDYT?

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH mptcp-next v3 6/8] mptcp: netlink: store lsk ref in mptcp_pm_addr_entry
  2022-02-01 11:31   ` Paolo Abeni
@ 2022-02-01 21:19     ` Kishen Maloor
  0 siblings, 0 replies; 17+ messages in thread
From: Kishen Maloor @ 2022-02-01 21:19 UTC (permalink / raw)
  To: Paolo Abeni, mptcp

On 2/1/22 3:31 AM, Paolo Abeni wrote:
> Hello,
> 
> On Thu, 2022-01-27 at 19:38 -0500, Kishen Maloor wrote:
>> @@ -157,6 +157,33 @@ static void lsk_list_release(struct pm_nl_pernet *pernet,
>>  	}
>>  }
>>  
>> +static struct mptcp_local_lsk *lsk_list_find_or_create(struct net *net,
>> +						       struct pm_nl_pernet *pernet,
>> +						       struct mptcp_pm_addr_entry *entry,
>> +						       int *createlsk_err)
>> +{
>> +	struct mptcp_local_lsk *lsk_ref;
>> +	struct socket *lsk;
>> +	int err;
>> +
>> +	lsk_ref = lsk_list_find(pernet, &entry->addr);
>> +
>> +	if (!lsk_ref) {
>> +		err = mptcp_pm_nl_create_listen_socket(net, entry, &lsk);
> 
> What happens if multiple cores call 'lsk_list_find_or_create'
> simultaneously? Is that possible/expected?
> 

That is technically possible, yes.

> I think the expected behaviour in that scenario is creating a single
> new lsk, and have all the callers fetching such instances. If the race
> happens on mptcp_pm_nl_create_listen_socket() it looks like only one
> caller will get a valid lsk reference, all the others will get back an
> error.
> 

That's right, there will be at most one socket created per addr+port.
I think there shouldn't be a problem with simultaneous calls for
different addresses.

But for a scenario where the simultaneous calls refer to the same addr+port,
I could (as you suggest below) add another lsk_list_find() following a failed
lsk_list_find_or_create() to then either a) obtain a lsk_ref to a
recently created lsk (by the parallel call), or b) fail more definitively.

> Possibly calling again lsk_list_find() in case of failure could address
> the above.
> 
> If the race is not possible, it should be at least documented in a
> comment why it can't happen.
> 
> Side note: using:
> 
> 	if (lsk_ref)
> 		return lsk_ref;
> 
> instead of:
> 	if (!lsk_ref) { //...
> 
> will reduce the indentation level.
> 
> /P
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH mptcp-next v3 7/8] mptcp: attempt to add listening sockets for announced addrs
  2022-02-01 17:25     ` Matthieu Baerts
@ 2022-02-01 21:21       ` Kishen Maloor
  2022-02-02  1:18         ` Mat Martineau
  0 siblings, 1 reply; 17+ messages in thread
From: Kishen Maloor @ 2022-02-01 21:21 UTC (permalink / raw)
  To: Matthieu Baerts, Paolo Abeni; +Cc: mptcp

On 2/1/22 9:25 AM, Matthieu Baerts wrote:
> Hi Paolo, Kishen,
> 
> On 01/02/2022 12:42, Paolo Abeni wrote:
>> On Thu, 2022-01-27 at 19:38 -0500, Kishen Maloor wrote:
>>> When ADD_ADDR announcements use the port associated with an
>>> active subflow, this change ensures that a listening socket is bound
>>> to the announced addr+port in the kernel for subsequently receiving
>>> MP_JOINs. But if a listening socket for this address is already held
>>> by the application then no action is taken.
>>>
>>> A listening socket is created (when there isn't a listener)
>>> just prior to the addr advertisement. If it is desired to not create
>>> a listening socket in the kernel for an address, then this can be
>>> requested by including the MPTCP_PM_ADDR_FLAG_NO_LISTEN flag
>>> with the address.
>>>
>>> When a listening socket is created, it is stored in
>>> struct mptcp_pm_add_entry and released accordingly.
>>>
>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/203
>>> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
>>> ---
>>> v2: fixed formatting
>>> v3: added new addr flag MPTCP_PM_ADDR_FLAG_NO_LISTEN to skip creating a
>>> listening socket in the kernel during an ADD_ADDR request, use this flag
>>> along the in-kernel PM flow for ADD_ADDR requests (Note: listening sockets
>>> are always created for port-based endpoints as before), use the
>>> lsk_list_find_or_create() helper
>>
>> I think it's better introducing the opposite flag (e.g.
>> 'MPTCP_PM_ADDR_FLAG_LISTEN') otherwise this will change the default
>> behavior
> 
> Maybe it is fine to change the behaviour. Without changing the user
> exposed API of course.
> 
> I mean: it all depends if we consider the fact that when the userspace
> closes the listening socket to accept new "MPTCP" connections, it is not
> normal (bug) to close the possibility to create new subflows → socket
> controlled by the user vs socket controlled by the PM. If we do consider
> this as a "bug", then that's OK to change the default behaviour, no?
> 
> I don't know if other people are sharing my view here.
> 
> For me, if an app closes the listening socket after having accepted a
> new connection, it is just not to receive new "main" connections on this
> socket but it is OK to accept new subflows as they are part of existing
> connections (and managed by the PM).
> We would then avoid people hitting issues like #203. If you hit this
> issue, it is not easy to find the answer I think.
> 

I think Matthieu has clearly captured above my rationale that led to this change because
I did consider this a "bug".

> If people know they don't need/want the creation of a listening socket
> only to accept new subflows, they can set the NO_LISTEN flag when adding
> an address with "ip mptcp". But if the cost is minimal most of the time
> because no additional listening sockets will be actually created, that's
> fine to use a "NO_LISTEN" flag I think.
> 

Yes, when the application is listening (which may be the majority of cases), no listening
socket is created for address announcements reusing the subflow's port. If the application
happens to not be listening, then a listener is established, so issues like #203 would be
addressed by default. Further, if the user does not want (for any reason) a listener
to be created, then they could supply the NO_LISTEN flag with the address.

> WDYT?
> 
> Cheers,
> Matt


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH mptcp-next v3 7/8] mptcp: attempt to add listening sockets for announced addrs
  2022-02-01 21:21       ` Kishen Maloor
@ 2022-02-02  1:18         ` Mat Martineau
  0 siblings, 0 replies; 17+ messages in thread
From: Mat Martineau @ 2022-02-02  1:18 UTC (permalink / raw)
  To: Kishen Maloor; +Cc: Matthieu Baerts, Paolo Abeni, mptcp

[-- Attachment #1: Type: text/plain, Size: 3813 bytes --]

On Tue, 1 Feb 2022, Kishen Maloor wrote:

> On 2/1/22 9:25 AM, Matthieu Baerts wrote:
>> Hi Paolo, Kishen,
>>
>> On 01/02/2022 12:42, Paolo Abeni wrote:
>>> On Thu, 2022-01-27 at 19:38 -0500, Kishen Maloor wrote:
>>>> When ADD_ADDR announcements use the port associated with an
>>>> active subflow, this change ensures that a listening socket is bound
>>>> to the announced addr+port in the kernel for subsequently receiving
>>>> MP_JOINs. But if a listening socket for this address is already held
>>>> by the application then no action is taken.
>>>>
>>>> A listening socket is created (when there isn't a listener)
>>>> just prior to the addr advertisement. If it is desired to not create
>>>> a listening socket in the kernel for an address, then this can be
>>>> requested by including the MPTCP_PM_ADDR_FLAG_NO_LISTEN flag
>>>> with the address.
>>>>
>>>> When a listening socket is created, it is stored in
>>>> struct mptcp_pm_add_entry and released accordingly.
>>>>
>>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/203
>>>> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
>>>> ---
>>>> v2: fixed formatting
>>>> v3: added new addr flag MPTCP_PM_ADDR_FLAG_NO_LISTEN to skip creating a
>>>> listening socket in the kernel during an ADD_ADDR request, use this flag
>>>> along the in-kernel PM flow for ADD_ADDR requests (Note: listening sockets
>>>> are always created for port-based endpoints as before), use the
>>>> lsk_list_find_or_create() helper
>>>
>>> I think it's better introducing the opposite flag (e.g.
>>> 'MPTCP_PM_ADDR_FLAG_LISTEN') otherwise this will change the default
>>> behavior
>>
>> Maybe it is fine to change the behaviour. Without changing the user
>> exposed API of course.
>>
>> I mean: it all depends if we consider the fact that when the userspace
>> closes the listening socket to accept new "MPTCP" connections, it is not
>> normal (bug) to close the possibility to create new subflows → socket
>> controlled by the user vs socket controlled by the PM. If we do consider
>> this as a "bug", then that's OK to change the default behaviour, no?
>>
>> I don't know if other people are sharing my view here.
>>
>> For me, if an app closes the listening socket after having accepted a
>> new connection, it is just not to receive new "main" connections on this
>> socket but it is OK to accept new subflows as they are part of existing
>> connections (and managed by the PM).
>> We would then avoid people hitting issues like #203. If you hit this
>> issue, it is not easy to find the answer I think.
>>
>
> I think Matthieu has clearly captured above my rationale that led to this change because
> I did consider this a "bug".
>
>> If people know they don't need/want the creation of a listening socket
>> only to accept new subflows, they can set the NO_LISTEN flag when adding
>> an address with "ip mptcp". But if the cost is minimal most of the time
>> because no additional listening sockets will be actually created, that's
>> fine to use a "NO_LISTEN" flag I think.
>>
>
> Yes, when the application is listening (which may be the majority of cases), no listening
> socket is created for address announcements reusing the subflow's port. If the application
> happens to not be listening, then a listener is established, so issues like #203 would be
> addressed by default. Further, if the user does not want (for any reason) a listener
> to be created, then they could supply the NO_LISTEN flag with the address.
>

I concur with Matthieu and Kishen here, I think the NO_LISTEN flag gives 
enough control that everyone will be able to get the behavior they need 
with the least surprising defaults. It seems like keeping exactly the same 
existing behavior in different special cases would be a less desirable 
tradeoff.

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-02-02  1:18 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-28  0:38 [PATCH mptcp-next v3 0/8] mptcp: fixes and enhancements related to path management Kishen Maloor
2022-01-28  0:38 ` [PATCH mptcp-next v3 1/8] mptcp: do not restrict subflows with non-kernel PMs Kishen Maloor
2022-01-28  0:38 ` [PATCH mptcp-next v3 2/8] mptcp: store remote id from MP_JOIN SYN/ACK in local ctx Kishen Maloor
2022-01-28  0:38 ` [PATCH mptcp-next v3 3/8] mptcp: reflect remote port (not 0) in ANNOUNCED events Kishen Maloor
2022-01-28  4:07   ` Geliang Tang
2022-01-31 22:22     ` Kishen Maloor
2022-01-28  0:38 ` [PATCH mptcp-next v3 4/8] mptcp: establish subflows from either end of connection Kishen Maloor
2022-01-28  0:38 ` [PATCH mptcp-next v3 5/8] mptcp: netlink: store per namespace list of refcounted listen socks Kishen Maloor
2022-01-28  0:38 ` [PATCH mptcp-next v3 6/8] mptcp: netlink: store lsk ref in mptcp_pm_addr_entry Kishen Maloor
2022-02-01 11:31   ` Paolo Abeni
2022-02-01 21:19     ` Kishen Maloor
2022-01-28  0:38 ` [PATCH mptcp-next v3 7/8] mptcp: attempt to add listening sockets for announced addrs Kishen Maloor
2022-02-01 11:42   ` Paolo Abeni
2022-02-01 17:25     ` Matthieu Baerts
2022-02-01 21:21       ` Kishen Maloor
2022-02-02  1:18         ` Mat Martineau
2022-01-28  0:38 ` [PATCH mptcp-next v3 8/8] mptcp: expose server_side attribute in MPTCP netlink events Kishen Maloor

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.