All of lore.kernel.org
 help / color / mirror / Atom feed
* [MPTCP] [PATCH net-next 00/12] Exchange MPTCP DATA_FIN/DATA_ACK before TCP FIN
@ 2020-07-28 22:11 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:11 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 1614 bytes --]

This series allows the MPTCP-level connection to be closed with the
peers exchanging DATA_FIN and DATA_ACK according to the state machine in
appendix D of RFC 8684. The process is very similar to the TCP
disconnect state machine. 

The prior code sends DATA_FIN only when TCP FIN packets are sent, and
does not allow for the MPTCP-level connection to be half-closed.

Patch 8 ("mptcp: Use full MPTCP-level disconnect state machine") is the
core of the series. Earlier patches in the series have some small fixes
and helpers in preparation, and the final four small patches do some
cleanup.


Mat Martineau (12):
  mptcp: Allow DATA_FIN in headers without TCP FIN
  mptcp: Return EPIPE if sending is shut down during a sendmsg
  mptcp: Remove outdated and incorrect comment
  mptcp: Use MPTCP-level flag for sending DATA_FIN
  mptcp: Track received DATA_FIN sequence number and add related helpers
  mptcp: Add mptcp_close_state() helper
  mptcp: Add helper to process acks of DATA_FIN
  mptcp: Use full MPTCP-level disconnect state machine
  mptcp: Only use subflow EOF signaling on fallback connections
  mptcp: Skip unnecessary skb extension allocation for bare acks
  mptcp: Safely read sequence number when lock isn't held
  mptcp: Safely store sequence number when sending data

 net/mptcp/options.c  |  57 +++++++--
 net/mptcp/protocol.c | 295 ++++++++++++++++++++++++++++++++++++-------
 net/mptcp/protocol.h |   6 +-
 net/mptcp/subflow.c  |  14 +-
 4 files changed, 306 insertions(+), 66 deletions(-)


base-commit: 0003041e7a0bf24594e5d66fe217bbbefdac44ab
-- 
2.28.0

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH net-next 00/12] Exchange MPTCP DATA_FIN/DATA_ACK before TCP FIN
@ 2020-07-28 22:11 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:11 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

This series allows the MPTCP-level connection to be closed with the
peers exchanging DATA_FIN and DATA_ACK according to the state machine in
appendix D of RFC 8684. The process is very similar to the TCP
disconnect state machine. 

The prior code sends DATA_FIN only when TCP FIN packets are sent, and
does not allow for the MPTCP-level connection to be half-closed.

Patch 8 ("mptcp: Use full MPTCP-level disconnect state machine") is the
core of the series. Earlier patches in the series have some small fixes
and helpers in preparation, and the final four small patches do some
cleanup.


Mat Martineau (12):
  mptcp: Allow DATA_FIN in headers without TCP FIN
  mptcp: Return EPIPE if sending is shut down during a sendmsg
  mptcp: Remove outdated and incorrect comment
  mptcp: Use MPTCP-level flag for sending DATA_FIN
  mptcp: Track received DATA_FIN sequence number and add related helpers
  mptcp: Add mptcp_close_state() helper
  mptcp: Add helper to process acks of DATA_FIN
  mptcp: Use full MPTCP-level disconnect state machine
  mptcp: Only use subflow EOF signaling on fallback connections
  mptcp: Skip unnecessary skb extension allocation for bare acks
  mptcp: Safely read sequence number when lock isn't held
  mptcp: Safely store sequence number when sending data

 net/mptcp/options.c  |  57 +++++++--
 net/mptcp/protocol.c | 295 ++++++++++++++++++++++++++++++++++++-------
 net/mptcp/protocol.h |   6 +-
 net/mptcp/subflow.c  |  14 +-
 4 files changed, 306 insertions(+), 66 deletions(-)


base-commit: 0003041e7a0bf24594e5d66fe217bbbefdac44ab
-- 
2.28.0


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 01/12] mptcp: Allow DATA_FIN in headers without TCP FIN
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:11 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:11 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 1493 bytes --]

RFC 8684-compliant DATA_FIN needs to be sent and ack'd before subflows
are closed with TCP FIN, so write DATA_FIN DSS headers whenever their
transmission has been enabled by the MPTCP connection-level socket.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/options.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 3bc56eb608d8..0b122b2a9c69 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -482,17 +482,10 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 	struct mptcp_sock *msk;
 	unsigned int ack_size;
 	bool ret = false;
-	u8 tcp_fin;
 
-	if (skb) {
-		mpext = mptcp_get_ext(skb);
-		tcp_fin = TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN;
-	} else {
-		mpext = NULL;
-		tcp_fin = 0;
-	}
+	mpext = skb ? mptcp_get_ext(skb) : NULL;
 
-	if (!skb || (mpext && mpext->use_map) || tcp_fin) {
+	if (!skb || (mpext && mpext->use_map) || subflow->data_fin_tx_enable) {
 		unsigned int map_size;
 
 		map_size = TCPOLEN_MPTCP_DSS_BASE + TCPOLEN_MPTCP_DSS_MAP64;
@@ -502,7 +495,7 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 		if (mpext)
 			opts->ext_copy = *mpext;
 
-		if (skb && tcp_fin && subflow->data_fin_tx_enable)
+		if (skb && subflow->data_fin_tx_enable)
 			mptcp_write_data_fin(subflow, skb, &opts->ext_copy);
 		ret = true;
 	}
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 01/12] mptcp: Allow DATA_FIN in headers without TCP FIN
@ 2020-07-28 22:11 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:11 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

RFC 8684-compliant DATA_FIN needs to be sent and ack'd before subflows
are closed with TCP FIN, so write DATA_FIN DSS headers whenever their
transmission has been enabled by the MPTCP connection-level socket.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/options.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 3bc56eb608d8..0b122b2a9c69 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -482,17 +482,10 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 	struct mptcp_sock *msk;
 	unsigned int ack_size;
 	bool ret = false;
-	u8 tcp_fin;
 
-	if (skb) {
-		mpext = mptcp_get_ext(skb);
-		tcp_fin = TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN;
-	} else {
-		mpext = NULL;
-		tcp_fin = 0;
-	}
+	mpext = skb ? mptcp_get_ext(skb) : NULL;
 
-	if (!skb || (mpext && mpext->use_map) || tcp_fin) {
+	if (!skb || (mpext && mpext->use_map) || subflow->data_fin_tx_enable) {
 		unsigned int map_size;
 
 		map_size = TCPOLEN_MPTCP_DSS_BASE + TCPOLEN_MPTCP_DSS_MAP64;
@@ -502,7 +495,7 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 		if (mpext)
 			opts->ext_copy = *mpext;
 
-		if (skb && tcp_fin && subflow->data_fin_tx_enable)
+		if (skb && subflow->data_fin_tx_enable)
 			mptcp_write_data_fin(subflow, skb, &opts->ext_copy);
 		ret = true;
 	}
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 02/12] mptcp: Return EPIPE if sending is shut down during a sendmsg
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:12 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 767 bytes --]

A MPTCP socket where sending has been shut down should not attempt to
send additional data, since DATA_FIN has already been sent.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/protocol.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2891ae8a1028..b3c3dbc89b3f 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -748,6 +748,11 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 restart:
 	mptcp_clean_una(sk);
 
+	if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) {
+		ret = -EPIPE;
+		goto out;
+	}
+
 wait_for_sndbuf:
 	__mptcp_flush_join_list(msk);
 	ssk = mptcp_subflow_get_send(msk);
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 02/12] mptcp: Return EPIPE if sending is shut down during a sendmsg
@ 2020-07-28 22:12 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

A MPTCP socket where sending has been shut down should not attempt to
send additional data, since DATA_FIN has already been sent.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/protocol.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2891ae8a1028..b3c3dbc89b3f 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -748,6 +748,11 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 restart:
 	mptcp_clean_una(sk);
 
+	if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) {
+		ret = -EPIPE;
+		goto out;
+	}
+
 wait_for_sndbuf:
 	__mptcp_flush_join_list(msk);
 	ssk = mptcp_subflow_get_send(msk);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 03/12] mptcp: Remove outdated and incorrect comment
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:12 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 704 bytes --]

mptcp_close() acquires the msk lock, so it clearly should not be held
before the function is called.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/protocol.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index b3c3dbc89b3f..7d7e0fa17219 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1421,7 +1421,6 @@ static void mptcp_subflow_shutdown(struct sock *ssk, int how,
 	release_sock(ssk);
 }
 
-/* Called with msk lock held, releases such lock before returning */
 static void mptcp_close(struct sock *sk, long timeout)
 {
 	struct mptcp_subflow_context *subflow, *tmp;
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 03/12] mptcp: Remove outdated and incorrect comment
@ 2020-07-28 22:12 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

mptcp_close() acquires the msk lock, so it clearly should not be held
before the function is called.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/protocol.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index b3c3dbc89b3f..7d7e0fa17219 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1421,7 +1421,6 @@ static void mptcp_subflow_shutdown(struct sock *ssk, int how,
 	release_sock(ssk);
 }
 
-/* Called with msk lock held, releases such lock before returning */
 static void mptcp_close(struct sock *sk, long timeout)
 {
 	struct mptcp_subflow_context *subflow, *tmp;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 04/12] mptcp: Use MPTCP-level flag for sending DATA_FIN
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:12 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 5975 bytes --]

Since DATA_FIN information is the same for every subflow, store it only
in the mptcp_sock.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/options.c  | 18 ++++++++++++------
 net/mptcp/protocol.c | 21 +++++----------------
 net/mptcp/protocol.h |  3 +--
 3 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 0b122b2a9c69..f157cb7e14c0 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -451,6 +451,8 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 static void mptcp_write_data_fin(struct mptcp_subflow_context *subflow,
 				 struct sk_buff *skb, struct mptcp_ext *ext)
 {
+	u64 data_fin_tx_seq = READ_ONCE(mptcp_sk(subflow->conn)->write_seq);
+
 	if (!ext->use_map || !skb->len) {
 		/* RFC6824 requires a DSS mapping with specific values
 		 * if DATA_FIN is set but no data payload is mapped
@@ -458,10 +460,13 @@ static void mptcp_write_data_fin(struct mptcp_subflow_context *subflow,
 		ext->data_fin = 1;
 		ext->use_map = 1;
 		ext->dsn64 = 1;
-		ext->data_seq = subflow->data_fin_tx_seq;
+		/* The write_seq value has already been incremented, so
+		 * the actual sequence number for the DATA_FIN is one less.
+		 */
+		ext->data_seq = data_fin_tx_seq - 1;
 		ext->subflow_seq = 0;
 		ext->data_len = 1;
-	} else if (ext->data_seq + ext->data_len == subflow->data_fin_tx_seq) {
+	} else if (ext->data_seq + ext->data_len == data_fin_tx_seq) {
 		/* If there's an existing DSS mapping and it is the
 		 * final mapping, DATA_FIN consumes 1 additional byte of
 		 * mapping space.
@@ -477,15 +482,17 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 					  struct mptcp_out_options *opts)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
+	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
 	unsigned int dss_size = 0;
+	u64 snd_data_fin_enable;
 	struct mptcp_ext *mpext;
-	struct mptcp_sock *msk;
 	unsigned int ack_size;
 	bool ret = false;
 
 	mpext = skb ? mptcp_get_ext(skb) : NULL;
+	snd_data_fin_enable = READ_ONCE(msk->snd_data_fin_enable);
 
-	if (!skb || (mpext && mpext->use_map) || subflow->data_fin_tx_enable) {
+	if (!skb || (mpext && mpext->use_map) || snd_data_fin_enable) {
 		unsigned int map_size;
 
 		map_size = TCPOLEN_MPTCP_DSS_BASE + TCPOLEN_MPTCP_DSS_MAP64;
@@ -495,7 +502,7 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 		if (mpext)
 			opts->ext_copy = *mpext;
 
-		if (skb && subflow->data_fin_tx_enable)
+		if (skb && snd_data_fin_enable)
 			mptcp_write_data_fin(subflow, skb, &opts->ext_copy);
 		ret = true;
 	}
@@ -504,7 +511,6 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 	 * if the first subflow may have the already the remote key handy
 	 */
 	opts->ext_copy.use_ack = 0;
-	msk = mptcp_sk(subflow->conn);
 	if (!READ_ONCE(msk->can_ack)) {
 		*size = ALIGN(dss_size, 4);
 		return ret;
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 7d7e0fa17219..dd403ba3679a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1391,8 +1391,7 @@ static void mptcp_cancel_work(struct sock *sk)
 		sock_put(sk);
 }
 
-static void mptcp_subflow_shutdown(struct sock *ssk, int how,
-				   bool data_fin_tx_enable, u64 data_fin_tx_seq)
+static void mptcp_subflow_shutdown(struct sock *ssk, int how)
 {
 	lock_sock(ssk);
 
@@ -1405,14 +1404,6 @@ static void mptcp_subflow_shutdown(struct sock *ssk, int how,
 		tcp_disconnect(ssk, O_NONBLOCK);
 		break;
 	default:
-		if (data_fin_tx_enable) {
-			struct mptcp_subflow_context *subflow;
-
-			subflow = mptcp_subflow_ctx(ssk);
-			subflow->data_fin_tx_seq = data_fin_tx_seq;
-			subflow->data_fin_tx_enable = 1;
-		}
-
 		ssk->sk_shutdown |= how;
 		tcp_shutdown(ssk, how);
 		break;
@@ -1426,7 +1417,6 @@ static void mptcp_close(struct sock *sk, long timeout)
 	struct mptcp_subflow_context *subflow, *tmp;
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	LIST_HEAD(conn_list);
-	u64 data_fin_tx_seq;
 
 	lock_sock(sk);
 
@@ -1440,7 +1430,7 @@ static void mptcp_close(struct sock *sk, long timeout)
 	spin_unlock_bh(&msk->join_list_lock);
 	list_splice_init(&msk->conn_list, &conn_list);
 
-	data_fin_tx_seq = msk->write_seq;
+	msk->snd_data_fin_enable = 1;
 
 	__mptcp_clear_xmit(sk);
 
@@ -1448,9 +1438,6 @@ static void mptcp_close(struct sock *sk, long timeout)
 
 	list_for_each_entry_safe(subflow, tmp, &conn_list, node) {
 		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
-
-		subflow->data_fin_tx_seq = data_fin_tx_seq;
-		subflow->data_fin_tx_enable = 1;
 		__mptcp_close_ssk(sk, ssk, subflow, timeout);
 	}
 
@@ -2146,10 +2133,12 @@ static int mptcp_shutdown(struct socket *sock, int how)
 	}
 
 	__mptcp_flush_join_list(msk);
+	msk->snd_data_fin_enable = 1;
+
 	mptcp_for_each_subflow(msk, subflow) {
 		struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);
 
-		mptcp_subflow_shutdown(tcp_sk, how, 1, msk->write_seq);
+		mptcp_subflow_shutdown(tcp_sk, how);
 	}
 
 	/* Wake up anyone sleeping in poll. */
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 67634b595466..3f49cc105772 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -199,6 +199,7 @@ struct mptcp_sock {
 	unsigned long	flags;
 	bool		can_ack;
 	bool		fully_established;
+	bool		snd_data_fin_enable;
 	spinlock_t	join_list_lock;
 	struct work_struct work;
 	struct list_head conn_list;
@@ -291,10 +292,8 @@ struct mptcp_subflow_context {
 		backup : 1,
 		data_avail : 1,
 		rx_eof : 1,
-		data_fin_tx_enable : 1,
 		use_64bit_ack : 1, /* Set when we received a 64-bit DSN */
 		can_ack : 1;	    /* only after processing the remote a key */
-	u64	data_fin_tx_seq;
 	u32	remote_nonce;
 	u64	thmac;
 	u32	local_nonce;
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 04/12] mptcp: Use MPTCP-level flag for sending DATA_FIN
@ 2020-07-28 22:12 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

Since DATA_FIN information is the same for every subflow, store it only
in the mptcp_sock.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/options.c  | 18 ++++++++++++------
 net/mptcp/protocol.c | 21 +++++----------------
 net/mptcp/protocol.h |  3 +--
 3 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 0b122b2a9c69..f157cb7e14c0 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -451,6 +451,8 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 static void mptcp_write_data_fin(struct mptcp_subflow_context *subflow,
 				 struct sk_buff *skb, struct mptcp_ext *ext)
 {
+	u64 data_fin_tx_seq = READ_ONCE(mptcp_sk(subflow->conn)->write_seq);
+
 	if (!ext->use_map || !skb->len) {
 		/* RFC6824 requires a DSS mapping with specific values
 		 * if DATA_FIN is set but no data payload is mapped
@@ -458,10 +460,13 @@ static void mptcp_write_data_fin(struct mptcp_subflow_context *subflow,
 		ext->data_fin = 1;
 		ext->use_map = 1;
 		ext->dsn64 = 1;
-		ext->data_seq = subflow->data_fin_tx_seq;
+		/* The write_seq value has already been incremented, so
+		 * the actual sequence number for the DATA_FIN is one less.
+		 */
+		ext->data_seq = data_fin_tx_seq - 1;
 		ext->subflow_seq = 0;
 		ext->data_len = 1;
-	} else if (ext->data_seq + ext->data_len == subflow->data_fin_tx_seq) {
+	} else if (ext->data_seq + ext->data_len == data_fin_tx_seq) {
 		/* If there's an existing DSS mapping and it is the
 		 * final mapping, DATA_FIN consumes 1 additional byte of
 		 * mapping space.
@@ -477,15 +482,17 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 					  struct mptcp_out_options *opts)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
+	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
 	unsigned int dss_size = 0;
+	u64 snd_data_fin_enable;
 	struct mptcp_ext *mpext;
-	struct mptcp_sock *msk;
 	unsigned int ack_size;
 	bool ret = false;
 
 	mpext = skb ? mptcp_get_ext(skb) : NULL;
+	snd_data_fin_enable = READ_ONCE(msk->snd_data_fin_enable);
 
-	if (!skb || (mpext && mpext->use_map) || subflow->data_fin_tx_enable) {
+	if (!skb || (mpext && mpext->use_map) || snd_data_fin_enable) {
 		unsigned int map_size;
 
 		map_size = TCPOLEN_MPTCP_DSS_BASE + TCPOLEN_MPTCP_DSS_MAP64;
@@ -495,7 +502,7 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 		if (mpext)
 			opts->ext_copy = *mpext;
 
-		if (skb && subflow->data_fin_tx_enable)
+		if (skb && snd_data_fin_enable)
 			mptcp_write_data_fin(subflow, skb, &opts->ext_copy);
 		ret = true;
 	}
@@ -504,7 +511,6 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
 	 * if the first subflow may have the already the remote key handy
 	 */
 	opts->ext_copy.use_ack = 0;
-	msk = mptcp_sk(subflow->conn);
 	if (!READ_ONCE(msk->can_ack)) {
 		*size = ALIGN(dss_size, 4);
 		return ret;
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 7d7e0fa17219..dd403ba3679a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1391,8 +1391,7 @@ static void mptcp_cancel_work(struct sock *sk)
 		sock_put(sk);
 }
 
-static void mptcp_subflow_shutdown(struct sock *ssk, int how,
-				   bool data_fin_tx_enable, u64 data_fin_tx_seq)
+static void mptcp_subflow_shutdown(struct sock *ssk, int how)
 {
 	lock_sock(ssk);
 
@@ -1405,14 +1404,6 @@ static void mptcp_subflow_shutdown(struct sock *ssk, int how,
 		tcp_disconnect(ssk, O_NONBLOCK);
 		break;
 	default:
-		if (data_fin_tx_enable) {
-			struct mptcp_subflow_context *subflow;
-
-			subflow = mptcp_subflow_ctx(ssk);
-			subflow->data_fin_tx_seq = data_fin_tx_seq;
-			subflow->data_fin_tx_enable = 1;
-		}
-
 		ssk->sk_shutdown |= how;
 		tcp_shutdown(ssk, how);
 		break;
@@ -1426,7 +1417,6 @@ static void mptcp_close(struct sock *sk, long timeout)
 	struct mptcp_subflow_context *subflow, *tmp;
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	LIST_HEAD(conn_list);
-	u64 data_fin_tx_seq;
 
 	lock_sock(sk);
 
@@ -1440,7 +1430,7 @@ static void mptcp_close(struct sock *sk, long timeout)
 	spin_unlock_bh(&msk->join_list_lock);
 	list_splice_init(&msk->conn_list, &conn_list);
 
-	data_fin_tx_seq = msk->write_seq;
+	msk->snd_data_fin_enable = 1;
 
 	__mptcp_clear_xmit(sk);
 
@@ -1448,9 +1438,6 @@ static void mptcp_close(struct sock *sk, long timeout)
 
 	list_for_each_entry_safe(subflow, tmp, &conn_list, node) {
 		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
-
-		subflow->data_fin_tx_seq = data_fin_tx_seq;
-		subflow->data_fin_tx_enable = 1;
 		__mptcp_close_ssk(sk, ssk, subflow, timeout);
 	}
 
@@ -2146,10 +2133,12 @@ static int mptcp_shutdown(struct socket *sock, int how)
 	}
 
 	__mptcp_flush_join_list(msk);
+	msk->snd_data_fin_enable = 1;
+
 	mptcp_for_each_subflow(msk, subflow) {
 		struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);
 
-		mptcp_subflow_shutdown(tcp_sk, how, 1, msk->write_seq);
+		mptcp_subflow_shutdown(tcp_sk, how);
 	}
 
 	/* Wake up anyone sleeping in poll. */
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 67634b595466..3f49cc105772 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -199,6 +199,7 @@ struct mptcp_sock {
 	unsigned long	flags;
 	bool		can_ack;
 	bool		fully_established;
+	bool		snd_data_fin_enable;
 	spinlock_t	join_list_lock;
 	struct work_struct work;
 	struct list_head conn_list;
@@ -291,10 +292,8 @@ struct mptcp_subflow_context {
 		backup : 1,
 		data_avail : 1,
 		rx_eof : 1,
-		data_fin_tx_enable : 1,
 		use_64bit_ack : 1, /* Set when we received a 64-bit DSN */
 		can_ack : 1;	    /* only after processing the remote a key */
-	u64	data_fin_tx_seq;
 	u32	remote_nonce;
 	u64	thmac;
 	u32	local_nonce;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 05/12] mptcp: Track received DATA_FIN sequence number and add related helpers
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:12 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 6172 bytes --]

Incoming DATA_FIN headers need to propagate the presence of the DATA_FIN
bit and the associated sequence number to the MPTCP layer, even when
arriving on a bare ACK that does not get added to the receive queue. Add
structure members to store the DATA_FIN information and helpers to set
and check those values.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/options.c  |  16 +++++++
 net/mptcp/protocol.c | 106 +++++++++++++++++++++++++++++++++++++++----
 net/mptcp/protocol.h |   3 ++
 3 files changed, 115 insertions(+), 10 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index f157cb7e14c0..38583d1b9b5f 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -782,6 +782,22 @@ static void update_una(struct mptcp_sock *msk,
 	}
 }
 
+bool mptcp_update_rcv_data_fin(struct mptcp_sock *msk, u64 data_fin_seq)
+{
+	/* Skip if DATA_FIN was already received.
+	 * If updating simultaneously with the recvmsg loop, values
+	 * should match. If they mismatch, the peer is misbehaving and
+	 * we will prefer the most recent information.
+	 */
+	if (READ_ONCE(msk->rcv_data_fin) || !READ_ONCE(msk->first))
+		return false;
+
+	WRITE_ONCE(msk->rcv_data_fin_seq, data_fin_seq);
+	WRITE_ONCE(msk->rcv_data_fin, 1);
+
+	return true;
+}
+
 static bool add_addr_hmac_valid(struct mptcp_sock *msk,
 				struct mptcp_options_received *mp_opt)
 {
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index dd403ba3679a..e1c71bfd61a3 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -16,6 +16,7 @@
 #include <net/inet_hashtables.h>
 #include <net/protocol.h>
 #include <net/tcp.h>
+#include <net/tcp_states.h>
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
 #include <net/transp_v6.h>
 #endif
@@ -163,6 +164,101 @@ static bool mptcp_subflow_dsn_valid(const struct mptcp_sock *msk,
 	return mptcp_subflow_data_available(ssk);
 }
 
+static bool mptcp_pending_data_fin(struct sock *sk, u64 *seq)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+
+	if (READ_ONCE(msk->rcv_data_fin) &&
+	    ((1 << sk->sk_state) &
+	     (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2))) {
+		u64 rcv_data_fin_seq = READ_ONCE(msk->rcv_data_fin_seq);
+
+		if (msk->ack_seq == rcv_data_fin_seq) {
+			if (seq)
+				*seq = rcv_data_fin_seq;
+
+			return true;
+		}
+	}
+
+	return false;
+}
+
+static void mptcp_set_timeout(const struct sock *sk, const struct sock *ssk)
+{
+	long tout = ssk && inet_csk(ssk)->icsk_pending ?
+				      inet_csk(ssk)->icsk_timeout - jiffies : 0;
+
+	if (tout <= 0)
+		tout = mptcp_sk(sk)->timer_ival;
+	mptcp_sk(sk)->timer_ival = tout > 0 ? tout : TCP_RTO_MIN;
+}
+
+static void mptcp_check_data_fin(struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	u64 rcv_data_fin_seq;
+
+	if (__mptcp_check_fallback(msk) || !msk->first)
+		return;
+
+	/* Need to ack a DATA_FIN received from a peer while this side
+	 * of the connection is in ESTABLISHED, FIN_WAIT1, or FIN_WAIT2.
+	 * msk->rcv_data_fin was set when parsing the incoming options
+	 * at the subflow level and the msk lock was not held, so this
+	 * is the first opportunity to act on the DATA_FIN and change
+	 * the msk state.
+	 *
+	 * If we are caught up to the sequence number of the incoming
+	 * DATA_FIN, send the DATA_ACK now and do state transition.  If
+	 * not caught up, do nothing and let the recv code send DATA_ACK
+	 * when catching up.
+	 */
+
+	if (mptcp_pending_data_fin(sk, &rcv_data_fin_seq)) {
+		struct mptcp_subflow_context *subflow;
+
+		msk->ack_seq++;
+		WRITE_ONCE(msk->rcv_data_fin, 0);
+
+		sk->sk_shutdown |= RCV_SHUTDOWN;
+
+		switch (sk->sk_state) {
+		case TCP_ESTABLISHED:
+			inet_sk_state_store(sk, TCP_CLOSE_WAIT);
+			break;
+		case TCP_FIN_WAIT1:
+			inet_sk_state_store(sk, TCP_CLOSING);
+			break;
+		case TCP_FIN_WAIT2:
+			inet_sk_state_store(sk, TCP_CLOSE);
+			// @@ Close subflows now?
+			break;
+		default:
+			/* Other states not expected */
+			WARN_ON_ONCE(1);
+			break;
+		}
+
+		mptcp_set_timeout(sk, NULL);
+		mptcp_for_each_subflow(msk, subflow) {
+			struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+			lock_sock(ssk);
+			tcp_send_ack(ssk);
+			release_sock(ssk);
+		}
+
+		sk->sk_state_change(sk);
+
+		if (sk->sk_shutdown == SHUTDOWN_MASK ||
+		    sk->sk_state == TCP_CLOSE)
+			sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_HUP);
+		else
+			sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
+	}
+}
+
 static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 					   struct sock *ssk,
 					   unsigned int *bytes)
@@ -303,16 +399,6 @@ static void __mptcp_flush_join_list(struct mptcp_sock *msk)
 	spin_unlock_bh(&msk->join_list_lock);
 }
 
-static void mptcp_set_timeout(const struct sock *sk, const struct sock *ssk)
-{
-	long tout = ssk && inet_csk(ssk)->icsk_pending ?
-				      inet_csk(ssk)->icsk_timeout - jiffies : 0;
-
-	if (tout <= 0)
-		tout = mptcp_sk(sk)->timer_ival;
-	mptcp_sk(sk)->timer_ival = tout > 0 ? tout : TCP_RTO_MIN;
-}
-
 static bool mptcp_timer_pending(struct sock *sk)
 {
 	return timer_pending(&inet_csk(sk)->icsk_retransmit_timer);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 3f49cc105772..beb34b8a5363 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -193,12 +193,14 @@ struct mptcp_sock {
 	u64		remote_key;
 	u64		write_seq;
 	u64		ack_seq;
+	u64		rcv_data_fin_seq;
 	atomic64_t	snd_una;
 	unsigned long	timer_ival;
 	u32		token;
 	unsigned long	flags;
 	bool		can_ack;
 	bool		fully_established;
+	bool		rcv_data_fin;
 	bool		snd_data_fin_enable;
 	spinlock_t	join_list_lock;
 	struct work_struct work;
@@ -385,6 +387,7 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk);
 bool mptcp_finish_join(struct sock *sk);
 void mptcp_data_acked(struct sock *sk);
 void mptcp_subflow_eof(struct sock *sk);
+bool mptcp_update_rcv_data_fin(struct mptcp_sock *msk, u64 data_fin_seq);
 
 void __init mptcp_token_init(void);
 static inline void mptcp_token_init_request(struct request_sock *req)
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 05/12] mptcp: Track received DATA_FIN sequence number and add related helpers
@ 2020-07-28 22:12 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

Incoming DATA_FIN headers need to propagate the presence of the DATA_FIN
bit and the associated sequence number to the MPTCP layer, even when
arriving on a bare ACK that does not get added to the receive queue. Add
structure members to store the DATA_FIN information and helpers to set
and check those values.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/options.c  |  16 +++++++
 net/mptcp/protocol.c | 106 +++++++++++++++++++++++++++++++++++++++----
 net/mptcp/protocol.h |   3 ++
 3 files changed, 115 insertions(+), 10 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index f157cb7e14c0..38583d1b9b5f 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -782,6 +782,22 @@ static void update_una(struct mptcp_sock *msk,
 	}
 }
 
+bool mptcp_update_rcv_data_fin(struct mptcp_sock *msk, u64 data_fin_seq)
+{
+	/* Skip if DATA_FIN was already received.
+	 * If updating simultaneously with the recvmsg loop, values
+	 * should match. If they mismatch, the peer is misbehaving and
+	 * we will prefer the most recent information.
+	 */
+	if (READ_ONCE(msk->rcv_data_fin) || !READ_ONCE(msk->first))
+		return false;
+
+	WRITE_ONCE(msk->rcv_data_fin_seq, data_fin_seq);
+	WRITE_ONCE(msk->rcv_data_fin, 1);
+
+	return true;
+}
+
 static bool add_addr_hmac_valid(struct mptcp_sock *msk,
 				struct mptcp_options_received *mp_opt)
 {
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index dd403ba3679a..e1c71bfd61a3 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -16,6 +16,7 @@
 #include <net/inet_hashtables.h>
 #include <net/protocol.h>
 #include <net/tcp.h>
+#include <net/tcp_states.h>
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
 #include <net/transp_v6.h>
 #endif
@@ -163,6 +164,101 @@ static bool mptcp_subflow_dsn_valid(const struct mptcp_sock *msk,
 	return mptcp_subflow_data_available(ssk);
 }
 
+static bool mptcp_pending_data_fin(struct sock *sk, u64 *seq)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+
+	if (READ_ONCE(msk->rcv_data_fin) &&
+	    ((1 << sk->sk_state) &
+	     (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2))) {
+		u64 rcv_data_fin_seq = READ_ONCE(msk->rcv_data_fin_seq);
+
+		if (msk->ack_seq == rcv_data_fin_seq) {
+			if (seq)
+				*seq = rcv_data_fin_seq;
+
+			return true;
+		}
+	}
+
+	return false;
+}
+
+static void mptcp_set_timeout(const struct sock *sk, const struct sock *ssk)
+{
+	long tout = ssk && inet_csk(ssk)->icsk_pending ?
+				      inet_csk(ssk)->icsk_timeout - jiffies : 0;
+
+	if (tout <= 0)
+		tout = mptcp_sk(sk)->timer_ival;
+	mptcp_sk(sk)->timer_ival = tout > 0 ? tout : TCP_RTO_MIN;
+}
+
+static void mptcp_check_data_fin(struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	u64 rcv_data_fin_seq;
+
+	if (__mptcp_check_fallback(msk) || !msk->first)
+		return;
+
+	/* Need to ack a DATA_FIN received from a peer while this side
+	 * of the connection is in ESTABLISHED, FIN_WAIT1, or FIN_WAIT2.
+	 * msk->rcv_data_fin was set when parsing the incoming options
+	 * at the subflow level and the msk lock was not held, so this
+	 * is the first opportunity to act on the DATA_FIN and change
+	 * the msk state.
+	 *
+	 * If we are caught up to the sequence number of the incoming
+	 * DATA_FIN, send the DATA_ACK now and do state transition.  If
+	 * not caught up, do nothing and let the recv code send DATA_ACK
+	 * when catching up.
+	 */
+
+	if (mptcp_pending_data_fin(sk, &rcv_data_fin_seq)) {
+		struct mptcp_subflow_context *subflow;
+
+		msk->ack_seq++;
+		WRITE_ONCE(msk->rcv_data_fin, 0);
+
+		sk->sk_shutdown |= RCV_SHUTDOWN;
+
+		switch (sk->sk_state) {
+		case TCP_ESTABLISHED:
+			inet_sk_state_store(sk, TCP_CLOSE_WAIT);
+			break;
+		case TCP_FIN_WAIT1:
+			inet_sk_state_store(sk, TCP_CLOSING);
+			break;
+		case TCP_FIN_WAIT2:
+			inet_sk_state_store(sk, TCP_CLOSE);
+			// @@ Close subflows now?
+			break;
+		default:
+			/* Other states not expected */
+			WARN_ON_ONCE(1);
+			break;
+		}
+
+		mptcp_set_timeout(sk, NULL);
+		mptcp_for_each_subflow(msk, subflow) {
+			struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+			lock_sock(ssk);
+			tcp_send_ack(ssk);
+			release_sock(ssk);
+		}
+
+		sk->sk_state_change(sk);
+
+		if (sk->sk_shutdown == SHUTDOWN_MASK ||
+		    sk->sk_state == TCP_CLOSE)
+			sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_HUP);
+		else
+			sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
+	}
+}
+
 static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 					   struct sock *ssk,
 					   unsigned int *bytes)
@@ -303,16 +399,6 @@ static void __mptcp_flush_join_list(struct mptcp_sock *msk)
 	spin_unlock_bh(&msk->join_list_lock);
 }
 
-static void mptcp_set_timeout(const struct sock *sk, const struct sock *ssk)
-{
-	long tout = ssk && inet_csk(ssk)->icsk_pending ?
-				      inet_csk(ssk)->icsk_timeout - jiffies : 0;
-
-	if (tout <= 0)
-		tout = mptcp_sk(sk)->timer_ival;
-	mptcp_sk(sk)->timer_ival = tout > 0 ? tout : TCP_RTO_MIN;
-}
-
 static bool mptcp_timer_pending(struct sock *sk)
 {
 	return timer_pending(&inet_csk(sk)->icsk_retransmit_timer);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 3f49cc105772..beb34b8a5363 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -193,12 +193,14 @@ struct mptcp_sock {
 	u64		remote_key;
 	u64		write_seq;
 	u64		ack_seq;
+	u64		rcv_data_fin_seq;
 	atomic64_t	snd_una;
 	unsigned long	timer_ival;
 	u32		token;
 	unsigned long	flags;
 	bool		can_ack;
 	bool		fully_established;
+	bool		rcv_data_fin;
 	bool		snd_data_fin_enable;
 	spinlock_t	join_list_lock;
 	struct work_struct work;
@@ -385,6 +387,7 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk);
 bool mptcp_finish_join(struct sock *sk);
 void mptcp_data_acked(struct sock *sk);
 void mptcp_subflow_eof(struct sock *sk);
+bool mptcp_update_rcv_data_fin(struct mptcp_sock *msk, u64 data_fin_seq);
 
 void __init mptcp_token_init(void);
 static inline void mptcp_token_init_request(struct request_sock *req)
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 06/12] mptcp: Add mptcp_close_state() helper
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:12 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 1613 bytes --]

This will be used to transition to the appropriate state on close and
determine if a DATA_FIN needs to be sent for that state transition.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/protocol.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e1c71bfd61a3..51370b69e30b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1498,6 +1498,33 @@ static void mptcp_subflow_shutdown(struct sock *ssk, int how)
 	release_sock(ssk);
 }
 
+static const unsigned char new_state[16] = {
+	/* current state:     new state:      action:	*/
+	[0 /* (Invalid) */] = TCP_CLOSE,
+	[TCP_ESTABLISHED]   = TCP_FIN_WAIT1 | TCP_ACTION_FIN,
+	[TCP_SYN_SENT]      = TCP_CLOSE,
+	[TCP_SYN_RECV]      = TCP_FIN_WAIT1 | TCP_ACTION_FIN,
+	[TCP_FIN_WAIT1]     = TCP_FIN_WAIT1,
+	[TCP_FIN_WAIT2]     = TCP_FIN_WAIT2,
+	[TCP_TIME_WAIT]     = TCP_CLOSE,	/* should not happen ! */
+	[TCP_CLOSE]         = TCP_CLOSE,
+	[TCP_CLOSE_WAIT]    = TCP_LAST_ACK  | TCP_ACTION_FIN,
+	[TCP_LAST_ACK]      = TCP_LAST_ACK,
+	[TCP_LISTEN]        = TCP_CLOSE,
+	[TCP_CLOSING]       = TCP_CLOSING,
+	[TCP_NEW_SYN_RECV]  = TCP_CLOSE,	/* should not happen ! */
+};
+
+static int mptcp_close_state(struct sock *sk)
+{
+	int next = (int)new_state[sk->sk_state];
+	int ns = next & TCP_STATE_MASK;
+
+	inet_sk_state_store(sk, ns);
+
+	return next & TCP_ACTION_FIN;
+}
+
 static void mptcp_close(struct sock *sk, long timeout)
 {
 	struct mptcp_subflow_context *subflow, *tmp;
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 06/12] mptcp: Add mptcp_close_state() helper
@ 2020-07-28 22:12 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

This will be used to transition to the appropriate state on close and
determine if a DATA_FIN needs to be sent for that state transition.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/protocol.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e1c71bfd61a3..51370b69e30b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1498,6 +1498,33 @@ static void mptcp_subflow_shutdown(struct sock *ssk, int how)
 	release_sock(ssk);
 }
 
+static const unsigned char new_state[16] = {
+	/* current state:     new state:      action:	*/
+	[0 /* (Invalid) */] = TCP_CLOSE,
+	[TCP_ESTABLISHED]   = TCP_FIN_WAIT1 | TCP_ACTION_FIN,
+	[TCP_SYN_SENT]      = TCP_CLOSE,
+	[TCP_SYN_RECV]      = TCP_FIN_WAIT1 | TCP_ACTION_FIN,
+	[TCP_FIN_WAIT1]     = TCP_FIN_WAIT1,
+	[TCP_FIN_WAIT2]     = TCP_FIN_WAIT2,
+	[TCP_TIME_WAIT]     = TCP_CLOSE,	/* should not happen ! */
+	[TCP_CLOSE]         = TCP_CLOSE,
+	[TCP_CLOSE_WAIT]    = TCP_LAST_ACK  | TCP_ACTION_FIN,
+	[TCP_LAST_ACK]      = TCP_LAST_ACK,
+	[TCP_LISTEN]        = TCP_CLOSE,
+	[TCP_CLOSING]       = TCP_CLOSING,
+	[TCP_NEW_SYN_RECV]  = TCP_CLOSE,	/* should not happen ! */
+};
+
+static int mptcp_close_state(struct sock *sk)
+{
+	int next = (int)new_state[sk->sk_state];
+	int ns = next & TCP_STATE_MASK;
+
+	inet_sk_state_store(sk, ns);
+
+	return next & TCP_ACTION_FIN;
+}
+
 static void mptcp_close(struct sock *sk, long timeout)
 {
 	struct mptcp_subflow_context *subflow, *tmp;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 07/12] mptcp: Add helper to process acks of DATA_FIN
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:12 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 2901 bytes --]

After DATA_FIN has been sent, the peer will acknowledge it. An ack of
the relevant MPTCP-level sequence number will update the MPTCP
connection state appropriately.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/protocol.c | 54 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 51370b69e30b..b3350830e14d 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -143,6 +143,14 @@ static void __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
 	MPTCP_SKB_CB(skb)->offset = offset;
 }
 
+static void mptcp_stop_timer(struct sock *sk)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+
+	sk_stop_timer(sk, &icsk->icsk_retransmit_timer);
+	mptcp_sk(sk)->timer_ival = 0;
+}
+
 /* both sockets must be locked */
 static bool mptcp_subflow_dsn_valid(const struct mptcp_sock *msk,
 				    struct sock *ssk)
@@ -164,6 +172,42 @@ static bool mptcp_subflow_dsn_valid(const struct mptcp_sock *msk,
 	return mptcp_subflow_data_available(ssk);
 }
 
+static void mptcp_check_data_fin_ack(struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+
+	if (__mptcp_check_fallback(msk))
+		return;
+
+	/* Look for an acknowledged DATA_FIN */
+	if (((1 << sk->sk_state) &
+	     (TCPF_FIN_WAIT1 | TCPF_CLOSING | TCPF_LAST_ACK)) &&
+	    msk->write_seq == atomic64_read(&msk->snd_una)) {
+		mptcp_stop_timer(sk);
+
+		WRITE_ONCE(msk->snd_data_fin_enable, 0);
+
+		switch (sk->sk_state) {
+		case TCP_FIN_WAIT1:
+			inet_sk_state_store(sk, TCP_FIN_WAIT2);
+			sk->sk_state_change(sk);
+			break;
+		case TCP_CLOSING:
+			fallthrough;
+		case TCP_LAST_ACK:
+			inet_sk_state_store(sk, TCP_CLOSE);
+			sk->sk_state_change(sk);
+			break;
+		}
+
+		if (sk->sk_shutdown == SHUTDOWN_MASK ||
+		    sk->sk_state == TCP_CLOSE)
+			sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_HUP);
+		else
+			sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
+	}
+}
+
 static bool mptcp_pending_data_fin(struct sock *sk, u64 *seq)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
@@ -222,6 +266,8 @@ static void mptcp_check_data_fin(struct sock *sk)
 		WRITE_ONCE(msk->rcv_data_fin, 0);
 
 		sk->sk_shutdown |= RCV_SHUTDOWN;
+		smp_mb__before_atomic(); /* SHUTDOWN must be visible first */
+		set_bit(MPTCP_DATA_READY, &msk->flags);
 
 		switch (sk->sk_state) {
 		case TCP_ESTABLISHED:
@@ -455,14 +501,6 @@ static void mptcp_check_for_eof(struct mptcp_sock *msk)
 	}
 }
 
-static void mptcp_stop_timer(struct sock *sk)
-{
-	struct inet_connection_sock *icsk = inet_csk(sk);
-
-	sk_stop_timer(sk, &icsk->icsk_retransmit_timer);
-	mptcp_sk(sk)->timer_ival = 0;
-}
-
 static bool mptcp_ext_cache_refill(struct mptcp_sock *msk)
 {
 	const struct sock *sk = (const struct sock *)msk;
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 07/12] mptcp: Add helper to process acks of DATA_FIN
@ 2020-07-28 22:12 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

After DATA_FIN has been sent, the peer will acknowledge it. An ack of
the relevant MPTCP-level sequence number will update the MPTCP
connection state appropriately.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/protocol.c | 54 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 51370b69e30b..b3350830e14d 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -143,6 +143,14 @@ static void __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
 	MPTCP_SKB_CB(skb)->offset = offset;
 }
 
+static void mptcp_stop_timer(struct sock *sk)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+
+	sk_stop_timer(sk, &icsk->icsk_retransmit_timer);
+	mptcp_sk(sk)->timer_ival = 0;
+}
+
 /* both sockets must be locked */
 static bool mptcp_subflow_dsn_valid(const struct mptcp_sock *msk,
 				    struct sock *ssk)
@@ -164,6 +172,42 @@ static bool mptcp_subflow_dsn_valid(const struct mptcp_sock *msk,
 	return mptcp_subflow_data_available(ssk);
 }
 
+static void mptcp_check_data_fin_ack(struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+
+	if (__mptcp_check_fallback(msk))
+		return;
+
+	/* Look for an acknowledged DATA_FIN */
+	if (((1 << sk->sk_state) &
+	     (TCPF_FIN_WAIT1 | TCPF_CLOSING | TCPF_LAST_ACK)) &&
+	    msk->write_seq == atomic64_read(&msk->snd_una)) {
+		mptcp_stop_timer(sk);
+
+		WRITE_ONCE(msk->snd_data_fin_enable, 0);
+
+		switch (sk->sk_state) {
+		case TCP_FIN_WAIT1:
+			inet_sk_state_store(sk, TCP_FIN_WAIT2);
+			sk->sk_state_change(sk);
+			break;
+		case TCP_CLOSING:
+			fallthrough;
+		case TCP_LAST_ACK:
+			inet_sk_state_store(sk, TCP_CLOSE);
+			sk->sk_state_change(sk);
+			break;
+		}
+
+		if (sk->sk_shutdown == SHUTDOWN_MASK ||
+		    sk->sk_state == TCP_CLOSE)
+			sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_HUP);
+		else
+			sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
+	}
+}
+
 static bool mptcp_pending_data_fin(struct sock *sk, u64 *seq)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
@@ -222,6 +266,8 @@ static void mptcp_check_data_fin(struct sock *sk)
 		WRITE_ONCE(msk->rcv_data_fin, 0);
 
 		sk->sk_shutdown |= RCV_SHUTDOWN;
+		smp_mb__before_atomic(); /* SHUTDOWN must be visible first */
+		set_bit(MPTCP_DATA_READY, &msk->flags);
 
 		switch (sk->sk_state) {
 		case TCP_ESTABLISHED:
@@ -455,14 +501,6 @@ static void mptcp_check_for_eof(struct mptcp_sock *msk)
 	}
 }
 
-static void mptcp_stop_timer(struct sock *sk)
-{
-	struct inet_connection_sock *icsk = inet_csk(sk);
-
-	sk_stop_timer(sk, &icsk->icsk_retransmit_timer);
-	mptcp_sk(sk)->timer_ival = 0;
-}
-
 static bool mptcp_ext_cache_refill(struct mptcp_sock *msk)
 {
 	const struct sock *sk = (const struct sock *)msk;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 08/12] mptcp: Use full MPTCP-level disconnect state machine
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:12 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 8595 bytes --]

RFC 8684 appendix D describes the connection state machine for
MPTCP. This patch implements the DATA_FIN / DATA_ACK exchanges and
MPTCP-level socket state changes described in that appendix, rather than
simply sending DATA_FIN along with TCP FIN when disconnecting subflows.

DATA_FIN is now sent and acknowledged before shutting down the
subflows. Received DATA_FIN information (if not part of a data packet)
is written to the MPTCP socket when the incoming DSS option is parsed by
the subflow, and the MPTCP worker is scheduled to process the
flag. DATA_FIN received as part of a full DSS mapping will be handled
when the mapping is processed.

The DATA_FIN is acknowledged by the worker if the reader is caught
up. If there is still data to be moved to the MPTCP-level queue, ack_seq
will be incremented to account for the DATA_FIN when it reaches the end
of the stream and a DATA_ACK will be sent to the peer.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/options.c  | 11 ++++++
 net/mptcp/protocol.c | 87 +++++++++++++++++++++++++++++++++++++-------
 net/mptcp/subflow.c  | 11 ++++--
 3 files changed, 92 insertions(+), 17 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 38583d1b9b5f..b4458ecd01f8 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -868,6 +868,17 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
 	if (mp_opt.use_ack)
 		update_una(msk, &mp_opt);
 
+	/* Zero-length packets, like bare ACKs carrying a DATA_FIN, are
+	 * dropped by the caller and not propagated to the MPTCP layer.
+	 * Copy the DATA_FIN information now.
+	 */
+	if (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq) {
+		if (mp_opt.data_fin && mp_opt.data_len == 1 &&
+		    mptcp_update_rcv_data_fin(msk, mp_opt.data_seq) &&
+		    schedule_work(&msk->work))
+			sock_hold(subflow->conn);
+	}
+
 	mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
 	if (!mpext)
 		return;
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index b3350830e14d..f264ea15e081 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -381,6 +381,15 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 
 	*bytes = moved;
 
+	/* If the moves have caught up with the DATA_FIN sequence number
+	 * it's time to ack the DATA_FIN and change socket state, but
+	 * this is not a good place to change state. Let the workqueue
+	 * do it.
+	 */
+	if (mptcp_pending_data_fin(sk, NULL) &&
+	    schedule_work(&msk->work))
+		sock_hold(sk);
+
 	return done;
 }
 
@@ -466,7 +475,8 @@ void mptcp_data_acked(struct sock *sk)
 {
 	mptcp_reset_timer(sk);
 
-	if (!sk_stream_is_writeable(sk) &&
+	if ((!sk_stream_is_writeable(sk) ||
+	     (inet_sk_state_load(sk) != TCP_ESTABLISHED)) &&
 	    schedule_work(&mptcp_sk(sk)->work))
 		sock_hold(sk);
 }
@@ -1384,6 +1394,7 @@ static void mptcp_worker(struct work_struct *work)
 
 	lock_sock(sk);
 	mptcp_clean_una(sk);
+	mptcp_check_data_fin_ack(sk);
 	__mptcp_flush_join_list(msk);
 	__mptcp_move_skbs(msk);
 
@@ -1393,6 +1404,8 @@ static void mptcp_worker(struct work_struct *work)
 	if (test_and_clear_bit(MPTCP_WORK_EOF, &msk->flags))
 		mptcp_check_for_eof(msk);
 
+	mptcp_check_data_fin(sk);
+
 	if (!test_and_clear_bit(MPTCP_WORK_RTX, &msk->flags))
 		goto unlock;
 
@@ -1515,7 +1528,7 @@ static void mptcp_cancel_work(struct sock *sk)
 		sock_put(sk);
 }
 
-static void mptcp_subflow_shutdown(struct sock *ssk, int how)
+static void mptcp_subflow_shutdown(struct sock *sk, struct sock *ssk, int how)
 {
 	lock_sock(ssk);
 
@@ -1528,8 +1541,15 @@ static void mptcp_subflow_shutdown(struct sock *ssk, int how)
 		tcp_disconnect(ssk, O_NONBLOCK);
 		break;
 	default:
-		ssk->sk_shutdown |= how;
-		tcp_shutdown(ssk, how);
+		if (__mptcp_check_fallback(mptcp_sk(sk))) {
+			pr_debug("Fallback");
+			ssk->sk_shutdown |= how;
+			tcp_shutdown(ssk, how);
+		} else {
+			pr_debug("Sending DATA_FIN on subflow %p", ssk);
+			mptcp_set_timeout(sk, ssk);
+			tcp_send_ack(ssk);
+		}
 		break;
 	}
 
@@ -1570,9 +1590,35 @@ static void mptcp_close(struct sock *sk, long timeout)
 	LIST_HEAD(conn_list);
 
 	lock_sock(sk);
+	sk->sk_shutdown = SHUTDOWN_MASK;
+
+	if (sk->sk_state == TCP_LISTEN) {
+		inet_sk_state_store(sk, TCP_CLOSE);
+		goto cleanup;
+	} else if (sk->sk_state == TCP_CLOSE) {
+		goto cleanup;
+	}
+
+	if (__mptcp_check_fallback(msk)) {
+		goto update_state;
+	} else if (mptcp_close_state(sk)) {
+		pr_debug("Sending DATA_FIN sk=%p", sk);
+		WRITE_ONCE(msk->write_seq, msk->write_seq + 1);
+		WRITE_ONCE(msk->snd_data_fin_enable, 1);
+
+		mptcp_for_each_subflow(msk, subflow) {
+			struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);
+
+			mptcp_subflow_shutdown(sk, tcp_sk, SHUTDOWN_MASK);
+		}
+	}
 
+	sk_stream_wait_close(sk, timeout);
+
+update_state:
 	inet_sk_state_store(sk, TCP_CLOSE);
 
+cleanup:
 	/* be sure to always acquire the join list lock, to sync vs
 	 * mptcp_finish_join().
 	 */
@@ -1581,8 +1627,6 @@ static void mptcp_close(struct sock *sk, long timeout)
 	spin_unlock_bh(&msk->join_list_lock);
 	list_splice_init(&msk->conn_list, &conn_list);
 
-	msk->snd_data_fin_enable = 1;
-
 	__mptcp_clear_xmit(sk);
 
 	release_sock(sk);
@@ -2265,11 +2309,8 @@ static int mptcp_shutdown(struct socket *sock, int how)
 	pr_debug("sk=%p, how=%d", msk, how);
 
 	lock_sock(sock->sk);
-	if (how == SHUT_WR || how == SHUT_RDWR)
-		inet_sk_state_store(sock->sk, TCP_FIN_WAIT1);
 
 	how++;
-
 	if ((how & ~SHUTDOWN_MASK) || !how) {
 		ret = -EINVAL;
 		goto out_unlock;
@@ -2283,13 +2324,31 @@ static int mptcp_shutdown(struct socket *sock, int how)
 			sock->state = SS_CONNECTED;
 	}
 
-	__mptcp_flush_join_list(msk);
-	msk->snd_data_fin_enable = 1;
+	/* If we've already sent a FIN, or it's a closed state, skip this. */
+	if (__mptcp_check_fallback(msk)) {
+		if (how == SHUT_WR || how == SHUT_RDWR)
+			inet_sk_state_store(sock->sk, TCP_FIN_WAIT1);
 
-	mptcp_for_each_subflow(msk, subflow) {
-		struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);
+		mptcp_for_each_subflow(msk, subflow) {
+			struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);
 
-		mptcp_subflow_shutdown(tcp_sk, how);
+			mptcp_subflow_shutdown(sock->sk, tcp_sk, how);
+		}
+	} else if ((how & SEND_SHUTDOWN) &&
+		   ((1 << sock->sk->sk_state) &
+		    (TCPF_ESTABLISHED | TCPF_SYN_SENT |
+		     TCPF_SYN_RECV | TCPF_CLOSE_WAIT)) &&
+		   mptcp_close_state(sock->sk)) {
+		__mptcp_flush_join_list(msk);
+
+		WRITE_ONCE(msk->write_seq, msk->write_seq + 1);
+		WRITE_ONCE(msk->snd_data_fin_enable, 1);
+
+		mptcp_for_each_subflow(msk, subflow) {
+			struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);
+
+			mptcp_subflow_shutdown(sock->sk, tcp_sk, how);
+		}
 	}
 
 	/* Wake up anyone sleeping in poll. */
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index e645483d1200..7ab2a52ad150 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -598,7 +598,8 @@ static bool validate_mapping(struct sock *ssk, struct sk_buff *skb)
 	return true;
 }
 
-static enum mapping_status get_mapping_status(struct sock *ssk)
+static enum mapping_status get_mapping_status(struct sock *ssk,
+					      struct mptcp_sock *msk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
 	struct mptcp_ext *mpext;
@@ -648,7 +649,8 @@ static enum mapping_status get_mapping_status(struct sock *ssk)
 
 	if (mpext->data_fin == 1) {
 		if (data_len == 1) {
-			pr_debug("DATA_FIN with no payload");
+			mptcp_update_rcv_data_fin(msk, mpext->data_seq);
+			pr_debug("DATA_FIN with no payload seq=%llu", mpext->data_seq);
 			if (subflow->map_valid) {
 				/* A DATA_FIN might arrive in a DSS
 				 * option before the previous mapping
@@ -660,6 +662,9 @@ static enum mapping_status get_mapping_status(struct sock *ssk)
 			} else {
 				return MAPPING_DATA_FIN;
 			}
+		} else {
+			mptcp_update_rcv_data_fin(msk, mpext->data_seq + data_len);
+			pr_debug("DATA_FIN with mapping seq=%llu", mpext->data_seq + data_len);
 		}
 
 		/* Adjust for DATA_FIN using 1 byte of sequence space */
@@ -748,7 +753,7 @@ static bool subflow_check_data_avail(struct sock *ssk)
 		u64 ack_seq;
 		u64 old_ack;
 
-		status = get_mapping_status(ssk);
+		status = get_mapping_status(ssk, msk);
 		pr_debug("msk=%p ssk=%p status=%d", msk, ssk, status);
 		if (status == MAPPING_INVALID) {
 			ssk->sk_err = EBADMSG;
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 08/12] mptcp: Use full MPTCP-level disconnect state machine
@ 2020-07-28 22:12 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

RFC 8684 appendix D describes the connection state machine for
MPTCP. This patch implements the DATA_FIN / DATA_ACK exchanges and
MPTCP-level socket state changes described in that appendix, rather than
simply sending DATA_FIN along with TCP FIN when disconnecting subflows.

DATA_FIN is now sent and acknowledged before shutting down the
subflows. Received DATA_FIN information (if not part of a data packet)
is written to the MPTCP socket when the incoming DSS option is parsed by
the subflow, and the MPTCP worker is scheduled to process the
flag. DATA_FIN received as part of a full DSS mapping will be handled
when the mapping is processed.

The DATA_FIN is acknowledged by the worker if the reader is caught
up. If there is still data to be moved to the MPTCP-level queue, ack_seq
will be incremented to account for the DATA_FIN when it reaches the end
of the stream and a DATA_ACK will be sent to the peer.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/options.c  | 11 ++++++
 net/mptcp/protocol.c | 87 +++++++++++++++++++++++++++++++++++++-------
 net/mptcp/subflow.c  | 11 ++++--
 3 files changed, 92 insertions(+), 17 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 38583d1b9b5f..b4458ecd01f8 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -868,6 +868,17 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
 	if (mp_opt.use_ack)
 		update_una(msk, &mp_opt);
 
+	/* Zero-length packets, like bare ACKs carrying a DATA_FIN, are
+	 * dropped by the caller and not propagated to the MPTCP layer.
+	 * Copy the DATA_FIN information now.
+	 */
+	if (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq) {
+		if (mp_opt.data_fin && mp_opt.data_len == 1 &&
+		    mptcp_update_rcv_data_fin(msk, mp_opt.data_seq) &&
+		    schedule_work(&msk->work))
+			sock_hold(subflow->conn);
+	}
+
 	mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
 	if (!mpext)
 		return;
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index b3350830e14d..f264ea15e081 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -381,6 +381,15 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 
 	*bytes = moved;
 
+	/* If the moves have caught up with the DATA_FIN sequence number
+	 * it's time to ack the DATA_FIN and change socket state, but
+	 * this is not a good place to change state. Let the workqueue
+	 * do it.
+	 */
+	if (mptcp_pending_data_fin(sk, NULL) &&
+	    schedule_work(&msk->work))
+		sock_hold(sk);
+
 	return done;
 }
 
@@ -466,7 +475,8 @@ void mptcp_data_acked(struct sock *sk)
 {
 	mptcp_reset_timer(sk);
 
-	if (!sk_stream_is_writeable(sk) &&
+	if ((!sk_stream_is_writeable(sk) ||
+	     (inet_sk_state_load(sk) != TCP_ESTABLISHED)) &&
 	    schedule_work(&mptcp_sk(sk)->work))
 		sock_hold(sk);
 }
@@ -1384,6 +1394,7 @@ static void mptcp_worker(struct work_struct *work)
 
 	lock_sock(sk);
 	mptcp_clean_una(sk);
+	mptcp_check_data_fin_ack(sk);
 	__mptcp_flush_join_list(msk);
 	__mptcp_move_skbs(msk);
 
@@ -1393,6 +1404,8 @@ static void mptcp_worker(struct work_struct *work)
 	if (test_and_clear_bit(MPTCP_WORK_EOF, &msk->flags))
 		mptcp_check_for_eof(msk);
 
+	mptcp_check_data_fin(sk);
+
 	if (!test_and_clear_bit(MPTCP_WORK_RTX, &msk->flags))
 		goto unlock;
 
@@ -1515,7 +1528,7 @@ static void mptcp_cancel_work(struct sock *sk)
 		sock_put(sk);
 }
 
-static void mptcp_subflow_shutdown(struct sock *ssk, int how)
+static void mptcp_subflow_shutdown(struct sock *sk, struct sock *ssk, int how)
 {
 	lock_sock(ssk);
 
@@ -1528,8 +1541,15 @@ static void mptcp_subflow_shutdown(struct sock *ssk, int how)
 		tcp_disconnect(ssk, O_NONBLOCK);
 		break;
 	default:
-		ssk->sk_shutdown |= how;
-		tcp_shutdown(ssk, how);
+		if (__mptcp_check_fallback(mptcp_sk(sk))) {
+			pr_debug("Fallback");
+			ssk->sk_shutdown |= how;
+			tcp_shutdown(ssk, how);
+		} else {
+			pr_debug("Sending DATA_FIN on subflow %p", ssk);
+			mptcp_set_timeout(sk, ssk);
+			tcp_send_ack(ssk);
+		}
 		break;
 	}
 
@@ -1570,9 +1590,35 @@ static void mptcp_close(struct sock *sk, long timeout)
 	LIST_HEAD(conn_list);
 
 	lock_sock(sk);
+	sk->sk_shutdown = SHUTDOWN_MASK;
+
+	if (sk->sk_state == TCP_LISTEN) {
+		inet_sk_state_store(sk, TCP_CLOSE);
+		goto cleanup;
+	} else if (sk->sk_state == TCP_CLOSE) {
+		goto cleanup;
+	}
+
+	if (__mptcp_check_fallback(msk)) {
+		goto update_state;
+	} else if (mptcp_close_state(sk)) {
+		pr_debug("Sending DATA_FIN sk=%p", sk);
+		WRITE_ONCE(msk->write_seq, msk->write_seq + 1);
+		WRITE_ONCE(msk->snd_data_fin_enable, 1);
+
+		mptcp_for_each_subflow(msk, subflow) {
+			struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);
+
+			mptcp_subflow_shutdown(sk, tcp_sk, SHUTDOWN_MASK);
+		}
+	}
 
+	sk_stream_wait_close(sk, timeout);
+
+update_state:
 	inet_sk_state_store(sk, TCP_CLOSE);
 
+cleanup:
 	/* be sure to always acquire the join list lock, to sync vs
 	 * mptcp_finish_join().
 	 */
@@ -1581,8 +1627,6 @@ static void mptcp_close(struct sock *sk, long timeout)
 	spin_unlock_bh(&msk->join_list_lock);
 	list_splice_init(&msk->conn_list, &conn_list);
 
-	msk->snd_data_fin_enable = 1;
-
 	__mptcp_clear_xmit(sk);
 
 	release_sock(sk);
@@ -2265,11 +2309,8 @@ static int mptcp_shutdown(struct socket *sock, int how)
 	pr_debug("sk=%p, how=%d", msk, how);
 
 	lock_sock(sock->sk);
-	if (how == SHUT_WR || how == SHUT_RDWR)
-		inet_sk_state_store(sock->sk, TCP_FIN_WAIT1);
 
 	how++;
-
 	if ((how & ~SHUTDOWN_MASK) || !how) {
 		ret = -EINVAL;
 		goto out_unlock;
@@ -2283,13 +2324,31 @@ static int mptcp_shutdown(struct socket *sock, int how)
 			sock->state = SS_CONNECTED;
 	}
 
-	__mptcp_flush_join_list(msk);
-	msk->snd_data_fin_enable = 1;
+	/* If we've already sent a FIN, or it's a closed state, skip this. */
+	if (__mptcp_check_fallback(msk)) {
+		if (how == SHUT_WR || how == SHUT_RDWR)
+			inet_sk_state_store(sock->sk, TCP_FIN_WAIT1);
 
-	mptcp_for_each_subflow(msk, subflow) {
-		struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);
+		mptcp_for_each_subflow(msk, subflow) {
+			struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);
 
-		mptcp_subflow_shutdown(tcp_sk, how);
+			mptcp_subflow_shutdown(sock->sk, tcp_sk, how);
+		}
+	} else if ((how & SEND_SHUTDOWN) &&
+		   ((1 << sock->sk->sk_state) &
+		    (TCPF_ESTABLISHED | TCPF_SYN_SENT |
+		     TCPF_SYN_RECV | TCPF_CLOSE_WAIT)) &&
+		   mptcp_close_state(sock->sk)) {
+		__mptcp_flush_join_list(msk);
+
+		WRITE_ONCE(msk->write_seq, msk->write_seq + 1);
+		WRITE_ONCE(msk->snd_data_fin_enable, 1);
+
+		mptcp_for_each_subflow(msk, subflow) {
+			struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);
+
+			mptcp_subflow_shutdown(sock->sk, tcp_sk, how);
+		}
 	}
 
 	/* Wake up anyone sleeping in poll. */
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index e645483d1200..7ab2a52ad150 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -598,7 +598,8 @@ static bool validate_mapping(struct sock *ssk, struct sk_buff *skb)
 	return true;
 }
 
-static enum mapping_status get_mapping_status(struct sock *ssk)
+static enum mapping_status get_mapping_status(struct sock *ssk,
+					      struct mptcp_sock *msk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
 	struct mptcp_ext *mpext;
@@ -648,7 +649,8 @@ static enum mapping_status get_mapping_status(struct sock *ssk)
 
 	if (mpext->data_fin == 1) {
 		if (data_len == 1) {
-			pr_debug("DATA_FIN with no payload");
+			mptcp_update_rcv_data_fin(msk, mpext->data_seq);
+			pr_debug("DATA_FIN with no payload seq=%llu", mpext->data_seq);
 			if (subflow->map_valid) {
 				/* A DATA_FIN might arrive in a DSS
 				 * option before the previous mapping
@@ -660,6 +662,9 @@ static enum mapping_status get_mapping_status(struct sock *ssk)
 			} else {
 				return MAPPING_DATA_FIN;
 			}
+		} else {
+			mptcp_update_rcv_data_fin(msk, mpext->data_seq + data_len);
+			pr_debug("DATA_FIN with mapping seq=%llu", mpext->data_seq + data_len);
 		}
 
 		/* Adjust for DATA_FIN using 1 byte of sequence space */
@@ -748,7 +753,7 @@ static bool subflow_check_data_avail(struct sock *ssk)
 		u64 ack_seq;
 		u64 old_ack;
 
-		status = get_mapping_status(ssk);
+		status = get_mapping_status(ssk, msk);
 		pr_debug("msk=%p ssk=%p status=%d", msk, ssk, status);
 		if (status == MAPPING_INVALID) {
 			ssk->sk_err = EBADMSG;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 09/12] mptcp: Only use subflow EOF signaling on fallback connections
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:12 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 886 bytes --]

The MPTCP state machine handles disconnections on non-fallback connections,
but the mptcp_sock still needs to get notified when fallback subflows
disconnect.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/subflow.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 7ab2a52ad150..1c8482bc2ce5 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1159,7 +1159,8 @@ static void subflow_state_change(struct sock *sk)
 	if (mptcp_subflow_data_available(sk))
 		mptcp_data_ready(parent, sk);
 
-	if (!(parent->sk_shutdown & RCV_SHUTDOWN) &&
+	if (__mptcp_check_fallback(mptcp_sk(parent)) &&
+	    !(parent->sk_shutdown & RCV_SHUTDOWN) &&
 	    !subflow->rx_eof && subflow_is_done(sk)) {
 		subflow->rx_eof = 1;
 		mptcp_subflow_eof(parent);
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 09/12] mptcp: Only use subflow EOF signaling on fallback connections
@ 2020-07-28 22:12 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

The MPTCP state machine handles disconnections on non-fallback connections,
but the mptcp_sock still needs to get notified when fallback subflows
disconnect.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/subflow.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 7ab2a52ad150..1c8482bc2ce5 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1159,7 +1159,8 @@ static void subflow_state_change(struct sock *sk)
 	if (mptcp_subflow_data_available(sk))
 		mptcp_data_ready(parent, sk);
 
-	if (!(parent->sk_shutdown & RCV_SHUTDOWN) &&
+	if (__mptcp_check_fallback(mptcp_sk(parent)) &&
+	    !(parent->sk_shutdown & RCV_SHUTDOWN) &&
 	    !subflow->rx_eof && subflow_is_done(sk)) {
 		subflow->rx_eof = 1;
 		mptcp_subflow_eof(parent);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 10/12] mptcp: Skip unnecessary skb extension allocation for bare acks
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:12 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 1416 bytes --]

Bare TCP ack skbs are freed right after MPTCP sees them, so the work to
allocate, zero, and populate the MPTCP skb extension is wasted. Detect
these skbs and do not add skb extensions to them.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/options.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index b4458ecd01f8..7fa822b55c34 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -868,15 +868,18 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
 	if (mp_opt.use_ack)
 		update_una(msk, &mp_opt);
 
-	/* Zero-length packets, like bare ACKs carrying a DATA_FIN, are
-	 * dropped by the caller and not propagated to the MPTCP layer.
-	 * Copy the DATA_FIN information now.
+	/* Zero-data-length packets are dropped by the caller and not
+	 * propagated to the MPTCP layer, so the skb extension does not
+	 * need to be allocated or populated. DATA_FIN information, if
+	 * present, needs to be updated here before the skb is freed.
 	 */
 	if (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq) {
 		if (mp_opt.data_fin && mp_opt.data_len == 1 &&
 		    mptcp_update_rcv_data_fin(msk, mp_opt.data_seq) &&
 		    schedule_work(&msk->work))
 			sock_hold(subflow->conn);
+
+		return;
 	}
 
 	mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 10/12] mptcp: Skip unnecessary skb extension allocation for bare acks
@ 2020-07-28 22:12 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

Bare TCP ack skbs are freed right after MPTCP sees them, so the work to
allocate, zero, and populate the MPTCP skb extension is wasted. Detect
these skbs and do not add skb extensions to them.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/options.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index b4458ecd01f8..7fa822b55c34 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -868,15 +868,18 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb,
 	if (mp_opt.use_ack)
 		update_una(msk, &mp_opt);
 
-	/* Zero-length packets, like bare ACKs carrying a DATA_FIN, are
-	 * dropped by the caller and not propagated to the MPTCP layer.
-	 * Copy the DATA_FIN information now.
+	/* Zero-data-length packets are dropped by the caller and not
+	 * propagated to the MPTCP layer, so the skb extension does not
+	 * need to be allocated or populated. DATA_FIN information, if
+	 * present, needs to be updated here before the skb is freed.
 	 */
 	if (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq) {
 		if (mp_opt.data_fin && mp_opt.data_len == 1 &&
 		    mptcp_update_rcv_data_fin(msk, mp_opt.data_seq) &&
 		    schedule_work(&msk->work))
 			sock_hold(subflow->conn);
+
+		return;
 	}
 
 	mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 11/12] mptcp: Safely read sequence number when lock isn't held
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:12 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 756 bytes --]

The MPTCP socket's write_seq member should be read with READ_ONCE() when
the msk lock is not held.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/protocol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index f264ea15e081..f2455a68d231 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1269,7 +1269,7 @@ static void mptcp_retransmit_handler(struct sock *sk)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 
-	if (atomic64_read(&msk->snd_una) == msk->write_seq) {
+	if (atomic64_read(&msk->snd_una) == READ_ONCE(msk->write_seq)) {
 		mptcp_stop_timer(sk);
 	} else {
 		set_bit(MPTCP_WORK_RTX, &msk->flags);
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 11/12] mptcp: Safely read sequence number when lock isn't held
@ 2020-07-28 22:12 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

The MPTCP socket's write_seq member should be read with READ_ONCE() when
the msk lock is not held.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/protocol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index f264ea15e081..f2455a68d231 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1269,7 +1269,7 @@ static void mptcp_retransmit_handler(struct sock *sk)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 
-	if (atomic64_read(&msk->snd_una) == msk->write_seq) {
+	if (atomic64_read(&msk->snd_una) == READ_ONCE(msk->write_seq)) {
 		mptcp_stop_timer(sk);
 	} else {
 		set_bit(MPTCP_WORK_RTX, &msk->flags);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] [PATCH net-next 12/12] mptcp: Safely store sequence number when sending data
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-28 22:12 ` Mat Martineau
  -1 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 721 bytes --]

The MPTCP socket's write_seq member can be read without the msk lock
held, so use WRITE_ONCE() to store it.

Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
 net/mptcp/protocol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index f2455a68d231..687f0bea2b35 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -793,7 +793,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 out:
 	if (!retransmission)
 		pfrag->offset += frag_truesize;
-	*write_seq += ret;
+	WRITE_ONCE(*write_seq, *write_seq + ret);
 	mptcp_subflow_ctx(ssk)->rel_write_seq += ret;
 
 	return ret;
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next 12/12] mptcp: Safely store sequence number when sending data
@ 2020-07-28 22:12 ` Mat Martineau
  0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2020-07-28 22:12 UTC (permalink / raw)
  To: netdev; +Cc: Mat Martineau, mptcp, matthieu.baerts, pabeni

The MPTCP socket's write_seq member can be read without the msk lock
held, so use WRITE_ONCE() to store it.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 net/mptcp/protocol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index f2455a68d231..687f0bea2b35 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -793,7 +793,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 out:
 	if (!retransmission)
 		pfrag->offset += frag_truesize;
-	*write_seq += ret;
+	WRITE_ONCE(*write_seq, *write_seq + ret);
 	mptcp_subflow_ctx(ssk)->rel_write_seq += ret;
 
 	return ret;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [MPTCP] Re: [PATCH net-next 00/12] Exchange MPTCP DATA_FIN/DATA_ACK before TCP FIN
  2020-07-28 22:11 ` Mat Martineau
@ 2020-07-29  0:02 ` David Miller
  -1 siblings, 0 replies; 28+ messages in thread
From: David Miller @ 2020-07-29  0:02 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 752 bytes --]

From: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
Date: Tue, 28 Jul 2020 15:11:58 -0700

> This series allows the MPTCP-level connection to be closed with the
> peers exchanging DATA_FIN and DATA_ACK according to the state machine in
> appendix D of RFC 8684. The process is very similar to the TCP
> disconnect state machine. 
> 
> The prior code sends DATA_FIN only when TCP FIN packets are sent, and
> does not allow for the MPTCP-level connection to be half-closed.
> 
> Patch 8 ("mptcp: Use full MPTCP-level disconnect state machine") is the
> core of the series. Earlier patches in the series have some small fixes
> and helpers in preparation, and the final four small patches do some
> cleanup.

Series applied, thanks.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH net-next 00/12] Exchange MPTCP DATA_FIN/DATA_ACK before TCP FIN
@ 2020-07-29  0:02 ` David Miller
  0 siblings, 0 replies; 28+ messages in thread
From: David Miller @ 2020-07-29  0:02 UTC (permalink / raw)
  To: mathew.j.martineau; +Cc: netdev, mptcp, matthieu.baerts, pabeni

From: Mat Martineau <mathew.j.martineau@linux.intel.com>
Date: Tue, 28 Jul 2020 15:11:58 -0700

> This series allows the MPTCP-level connection to be closed with the
> peers exchanging DATA_FIN and DATA_ACK according to the state machine in
> appendix D of RFC 8684. The process is very similar to the TCP
> disconnect state machine. 
> 
> The prior code sends DATA_FIN only when TCP FIN packets are sent, and
> does not allow for the MPTCP-level connection to be half-closed.
> 
> Patch 8 ("mptcp: Use full MPTCP-level disconnect state machine") is the
> core of the series. Earlier patches in the series have some small fixes
> and helpers in preparation, and the final four small patches do some
> cleanup.

Series applied, thanks.

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-07-29  0:02 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-28 22:12 [MPTCP] [PATCH net-next 08/12] mptcp: Use full MPTCP-level disconnect state machine Mat Martineau
2020-07-28 22:12 ` Mat Martineau
  -- strict thread matches above, loose matches on Subject: below --
2020-07-29  0:02 [MPTCP] Re: [PATCH net-next 00/12] Exchange MPTCP DATA_FIN/DATA_ACK before TCP FIN David Miller
2020-07-29  0:02 ` David Miller
2020-07-28 22:12 [MPTCP] [PATCH net-next 12/12] mptcp: Safely store sequence number when sending data Mat Martineau
2020-07-28 22:12 ` Mat Martineau
2020-07-28 22:12 [MPTCP] [PATCH net-next 11/12] mptcp: Safely read sequence number when lock isn't held Mat Martineau
2020-07-28 22:12 ` Mat Martineau
2020-07-28 22:12 [MPTCP] [PATCH net-next 10/12] mptcp: Skip unnecessary skb extension allocation for bare acks Mat Martineau
2020-07-28 22:12 ` Mat Martineau
2020-07-28 22:12 [MPTCP] [PATCH net-next 09/12] mptcp: Only use subflow EOF signaling on fallback connections Mat Martineau
2020-07-28 22:12 ` Mat Martineau
2020-07-28 22:12 [MPTCP] [PATCH net-next 07/12] mptcp: Add helper to process acks of DATA_FIN Mat Martineau
2020-07-28 22:12 ` Mat Martineau
2020-07-28 22:12 [MPTCP] [PATCH net-next 06/12] mptcp: Add mptcp_close_state() helper Mat Martineau
2020-07-28 22:12 ` Mat Martineau
2020-07-28 22:12 [MPTCP] [PATCH net-next 05/12] mptcp: Track received DATA_FIN sequence number and add related helpers Mat Martineau
2020-07-28 22:12 ` Mat Martineau
2020-07-28 22:12 [MPTCP] [PATCH net-next 04/12] mptcp: Use MPTCP-level flag for sending DATA_FIN Mat Martineau
2020-07-28 22:12 ` Mat Martineau
2020-07-28 22:12 [MPTCP] [PATCH net-next 03/12] mptcp: Remove outdated and incorrect comment Mat Martineau
2020-07-28 22:12 ` Mat Martineau
2020-07-28 22:12 [MPTCP] [PATCH net-next 02/12] mptcp: Return EPIPE if sending is shut down during a sendmsg Mat Martineau
2020-07-28 22:12 ` Mat Martineau
2020-07-28 22:11 [MPTCP] [PATCH net-next 01/12] mptcp: Allow DATA_FIN in headers without TCP FIN Mat Martineau
2020-07-28 22:11 ` Mat Martineau
2020-07-28 22:11 [MPTCP] [PATCH net-next 00/12] Exchange MPTCP DATA_FIN/DATA_ACK before " Mat Martineau
2020-07-28 22:11 ` Mat Martineau

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.