All of lore.kernel.org
 help / color / mirror / Atom feed
From: Xin Long <lucien.xin@gmail.com>
To: network dev <netdev@vger.kernel.org>,
	davem@davemloft.net, kuba@kernel.org,
	Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
	linux-sctp@vger.kernel.org
Subject: [PATCH net-next 09/14] sctp: do state transition when receiving an icmp TOOBIG packet
Date: Sun, 20 Jun 2021 21:38:44 -0400	[thread overview]
Message-ID: <079e91bfe9ea9f081df75d04e74253b099ce7a84.1624239422.git.lucien.xin@gmail.com> (raw)
In-Reply-To: <cover.1624239422.git.lucien.xin@gmail.com>

PLPMTUD will short-circuit the old process for icmp TOOBIG packets.
This part is described in rfc8899#section-4.6.2 (PL_PTB_SIZE =
PTB_SIZE - other_headers_len). Note that from rfc8899#section-5.2
State Machine, each case below is for some specific states only:

  a) PL_PTB_SIZE < MIN_PLPMTU || PL_PTB_SIZE >= PROBED_SIZE,
     discard it, for any state

  b) MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU,
     Base -> Error, for Base state

  c) BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU,
     Search -> Base or Complete -> Base, for Search and Complete states.

  d) PLPMTU < PL_PTB_SIZE < PROBED_SIZE,
     set pl.probe_size to PL_PTB_SIZE then verify it, for Search state.

The most important one is case d), which will help find the optimal
fast during searching. Like when pathmtu = 1392 for SCTP over IPv4,
the search will be (20 is iphdr_len):

  1. probe with 1200 - 20
  2. probe with 1232 - 20
  3. probe with 1264 - 20
  ...
  7. probe with 1388 - 20
  8. probe with 1420 - 20

When sending the probe with 1420 - 20, TOOBIG may come with PL_PTB_SIZE =
1392 - 20. Then it matches case d), and saves some rounds to try with the
1392 - 20 probe. But of course, PLPMTUD doesn't trust TOOBIG packets, and
it will go back to the common searching once the probe with the new size
can't be verified.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sctp/input.c     |  4 +++-
 net/sctp/transport.c | 51 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/net/sctp/input.c b/net/sctp/input.c
index d508f6f3dd08..9ffdbd6526e9 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -385,7 +385,9 @@ static int sctp_add_backlog(struct sock *sk, struct sk_buff *skb)
 void sctp_icmp_frag_needed(struct sock *sk, struct sctp_association *asoc,
 			   struct sctp_transport *t, __u32 pmtu)
 {
-	if (!t || (t->pathmtu <= pmtu))
+	if (!t ||
+	    (t->pathmtu <= pmtu &&
+	     t->pl.probe_size + sctp_transport_pl_hlen(t) <= pmtu))
 		return;
 
 	if (sock_owned_by_user(sk)) {
diff --git a/net/sctp/transport.c b/net/sctp/transport.c
index 79ff5ca6b472..5cefb4eab8a0 100644
--- a/net/sctp/transport.c
+++ b/net/sctp/transport.c
@@ -343,10 +343,55 @@ void sctp_transport_pl_recv(struct sctp_transport *t)
 	}
 }
 
+static bool sctp_transport_pl_toobig(struct sctp_transport *t, u32 pmtu)
+{
+	pr_debug("%s: PLPMTUD: transport: %p, state: %d, pmtu: %d, size: %d, ptb: %d\n",
+		 __func__, t, t->pl.state, t->pl.pmtu, t->pl.probe_size, pmtu);
+
+	if (pmtu < SCTP_MIN_PLPMTU || pmtu >= t->pl.probe_size)
+		return false;
+
+	if (t->pl.state == SCTP_PL_BASE) {
+		if (pmtu >= SCTP_MIN_PLPMTU && pmtu < SCTP_BASE_PLPMTU) {
+			t->pl.state = SCTP_PL_ERROR; /* Base -> Error */
+
+			t->pl.pmtu = SCTP_MIN_PLPMTU;
+			t->pathmtu = t->pl.pmtu + sctp_transport_pl_hlen(t);
+		}
+	} else if (t->pl.state == SCTP_PL_SEARCH) {
+		if (pmtu >= SCTP_BASE_PLPMTU && pmtu < t->pl.pmtu) {
+			t->pl.state = SCTP_PL_BASE;  /* Search -> Base */
+			t->pl.probe_size = SCTP_BASE_PLPMTU;
+			t->pl.probe_count = 0;
+
+			t->pl.probe_high = 0;
+			t->pl.pmtu = SCTP_BASE_PLPMTU;
+			t->pathmtu = t->pl.pmtu + sctp_transport_pl_hlen(t);
+		} else if (pmtu > t->pl.pmtu && pmtu < t->pl.probe_size) {
+			t->pl.probe_size = pmtu;
+			t->pl.probe_count = 0;
+
+			return false;
+		}
+	} else if (t->pl.state == SCTP_PL_COMPLETE) {
+		if (pmtu >= SCTP_BASE_PLPMTU && pmtu < t->pl.pmtu) {
+			t->pl.state = SCTP_PL_BASE;  /* Complete -> Base */
+			t->pl.probe_size = SCTP_BASE_PLPMTU;
+			t->pl.probe_count = 0;
+
+			t->pl.probe_high = 0;
+			t->pl.pmtu = SCTP_BASE_PLPMTU;
+			t->pathmtu = t->pl.pmtu + sctp_transport_pl_hlen(t);
+		}
+	}
+
+	return true;
+}
+
 bool sctp_transport_update_pmtu(struct sctp_transport *t, u32 pmtu)
 {
-	struct dst_entry *dst = sctp_transport_dst_check(t);
 	struct sock *sk = t->asoc->base.sk;
+	struct dst_entry *dst;
 	bool change = true;
 
 	if (unlikely(pmtu < SCTP_DEFAULT_MINSEGMENT)) {
@@ -357,6 +402,10 @@ bool sctp_transport_update_pmtu(struct sctp_transport *t, u32 pmtu)
 	}
 	pmtu = SCTP_TRUNC4(pmtu);
 
+	if (sctp_transport_pl_enabled(t))
+		return sctp_transport_pl_toobig(t, pmtu - sctp_transport_pl_hlen(t));
+
+	dst = sctp_transport_dst_check(t);
 	if (dst) {
 		struct sctp_pf *pf = sctp_get_pf_specific(dst->ops->family);
 		union sctp_addr addr;
-- 
2.27.0


  parent reply	other threads:[~2021-06-21  1:39 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-21  1:38 [PATCH net-next 00/14] sctp: implement RFC8899: Packetization Layer Path MTU Discovery for SCTP transport Xin Long
2021-06-21  1:38 ` [PATCH net-next 01/14] sctp: add pad chunk and its make function and event table Xin Long
2021-06-21  1:38 ` [PATCH net-next 02/14] sctp: add probe_interval in sysctl and sock/asoc/transport Xin Long
2021-06-21  1:38 ` [PATCH net-next 03/14] sctp: add SCTP_PLPMTUD_PROBE_INTERVAL sockopt for sock/asoc/transport Xin Long
2021-06-21  1:38 ` [PATCH net-next 04/14] sctp: add the constants/variables and states and some APIs for transport Xin Long
2021-06-21  1:38 ` [PATCH net-next 05/14] sctp: add the probe timer in transport for PLPMTUD Xin Long
2021-06-21  1:38 ` [PATCH net-next 06/14] sctp: do the basic send and recv for PLPMTUD probe Xin Long
2021-06-21  3:49   ` kernel test robot
2021-06-21  3:49     ` kernel test robot
2021-06-22  1:13     ` Xin Long
2021-06-22  1:13       ` Xin Long
2021-06-22 17:02       ` David Miller
2021-06-22 17:02         ` David Miller
2021-06-21  1:38 ` [PATCH net-next 07/14] sctp: do state transition when PROBE_COUNT == MAX_PROBES on HB send path Xin Long
2021-06-21  1:38 ` [PATCH net-next 08/14] sctp: do state transition when a probe succeeds on HB ACK recv path Xin Long
2021-06-21  1:38 ` Xin Long [this message]
2021-06-21  1:38 ` [PATCH net-next 10/14] sctp: enable PLPMTUD when the transport is ready Xin Long
2021-06-21  1:38 ` [PATCH net-next 11/14] sctp: remove the unessessary hold for idev in sctp_v6_err Xin Long
2021-06-21  1:38 ` [PATCH net-next 12/14] sctp: extract sctp_v6_err_handle function from sctp_v6_err Xin Long
2021-06-21  1:38 ` [PATCH net-next 13/14] sctp: extract sctp_v4_err_handle function from sctp_v4_err Xin Long
2021-06-21  1:38 ` [PATCH net-next 14/14] sctp: process sctp over udp icmp err on sctp side Xin Long
2021-06-22  1:30 ` [PATCH net-next 00/14] sctp: implement RFC8899: Packetization Layer Path MTU Discovery for SCTP transport Marcelo Ricardo Leitner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=079e91bfe9ea9f081df75d04e74253b099ce7a84.1624239422.git.lucien.xin@gmail.com \
    --to=lucien.xin@gmail.com \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=linux-sctp@vger.kernel.org \
    --cc=marcelo.leitner@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.