netdev.vger.kernel.org archive mirror
From: "Nabil S. Alramli" <nalramli@fastly.com>
To: sbhogavilli@fastly.com, davem@davemloft.net, dsahern@kernel.org,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: srao@fastly.com, dev@nalramli.com
Subject: [net] ipv4: Fix broken PMTUD when using L4 multipath hash
Date: Thu, 12 Oct 2023 19:40:25 -0400
Message-ID: <20231012234025.4025-1-nalramli@fastly.com>
In-Reply-To: <20231012005721.2742-2-nalramli@fastly.com>

From: Suresh Bhogavilli <sbhogavilli@fastly.com>

On a node with multiple network interfaces, if we enable the layer 4 hash
policy with net.ipv4.fib_multipath_hash_policy=1, path MTU discovery is
broken and a TCP connection does not make progress unless the incoming
ICMP Fragmentation Needed (type 3, code 4) message is received on the
egress interface of the nexthop selected for the socket.

This is because build_sk_flow_key() does not provide the sport and dport
from the socket when calling flowi4_init_output(). This appears to be a
copy/paste error from the build_skb_flow_key() -> __build_flow_key() ->
flowi4_init_output() call chain used for packet forwarding. There, an skb
is present and is later passed to fib_multipath_hash(), which can scrape
out both sport and dport from the skb if the L4 hash policy is in use.
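
As an illustration of what "scraping" the ports from a packet involves, here
is a minimal user-space sketch. It is not the kernel's flow dissector; the
names toy_ports and toy_scrape_ports() are hypothetical and the packet is
assumed to be a raw IPv4 header followed directly by a TCP or UDP header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the two L4 ports, in host byte order. */
struct toy_ports {
	uint16_t sport;
	uint16_t dport;
};

/*
 * Extract the L4 ports from a raw IPv4 packet.
 * Returns 0 on success, -1 if the buffer does not hold a first-fragment
 * IPv4 TCP/UDP packet with at least the port fields present.
 */
static int toy_scrape_ports(const uint8_t *pkt, size_t len,
			    struct toy_ports *out)
{
	size_t ihl;
	unsigned int frag_off;

	if (len < 20 || (pkt[0] >> 4) != 4)
		return -1;		/* too short or not IPv4 */

	ihl = (size_t)(pkt[0] & 0x0f) * 4;	/* header length in bytes */
	if (ihl < 20 || len < ihl + 4)
		return -1;		/* bad IHL or ports truncated */

	/* Non-first fragments carry no L4 header. */
	frag_off = ((unsigned int)(pkt[6] & 0x1f) << 8) | pkt[7];
	if (frag_off != 0)
		return -1;

	if (pkt[9] != 6 && pkt[9] != 17)
		return -1;		/* neither TCP (6) nor UDP (17) */

	/* Both TCP and UDP start with source port, then destination port. */
	out->sport = (uint16_t)((pkt[ihl] << 8) | pkt[ihl + 1]);
	out->dport = (uint16_t)((pkt[ihl + 2] << 8) | pkt[ihl + 3]);
	return 0;
}
```

A socket has no such packet to parse at this point, which is why the ports
have to come from the socket itself.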

In the socket write case, fib_multipath_hash() does not get an skb so
it expects the fl4 to have sport and dport populated when L4 hashing is
in use. Not populating them results in creating a nexthop exception
entry against a nexthop that may not be the one used by the socket.
Hence it is not later matched when inet_csk_rebuild_route() is called to
update the cached dst entry in the socket, so TCP does not lower its MSS
and the connection does not make progress.
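
The mismatch can be reproduced with a toy model. The sketch below is not
the kernel's fib_multipath_hash(); toy_flow, toy_l4_hash() and the other
toy_* names are hypothetical, and FNV-1a is swapped in as a simple stand-in
hash. It only shows that hashing the same flow with zeroed ports can pick a
different nexthop than hashing it with the real ports.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct flowi4 fields that matter here. */
struct toy_flow {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
	uint8_t  proto;
};

/* Fold one 32-bit word into an FNV-1a hash, least significant byte first. */
static uint32_t toy_mix(uint32_t h, uint32_t v)
{
	int i;

	for (i = 0; i < 4; i++) {
		h ^= (v >> (8 * i)) & 0xff;
		h *= 16777619u;		/* FNV prime */
	}
	return h;
}

/* Toy L4 hash over the 5-tuple (assumption: NOT the kernel's hash). */
static uint32_t toy_l4_hash(const struct toy_flow *fl)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	h = toy_mix(h, fl->saddr);
	h = toy_mix(h, fl->daddr);
	h = toy_mix(h, ((uint32_t)fl->sport << 16) | fl->dport);
	h = toy_mix(h, fl->proto);
	return h;
}

/* Map the hash onto one of nr_nexthops equally weighted nexthops. */
static unsigned int toy_select_nexthop(const struct toy_flow *fl,
				       unsigned int nr_nexthops)
{
	assert(nr_nexthops > 0);
	return toy_l4_hash(fl) % nr_nexthops;
}

/*
 * Return 1 if a lookup with the ports zeroed (the pre-fix flow key) picks
 * a different nexthop than a lookup with the real ports, 0 otherwise.
 */
static int toy_lookups_disagree(const struct toy_flow *fl,
				unsigned int nr_nexthops)
{
	struct toy_flow zeroed = *fl;

	zeroed.sport = 0;
	zeroed.dport = 0;
	return toy_select_nexthop(fl, nr_nexthops) !=
	       toy_select_nexthop(&zeroed, nr_nexthops);
}
```

In this model some flows still land on the same nexthop by chance, which
matches the "may not be the one used by the socket" wording above: the
failure depends on the ports of the individual connection.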

Fix this by providing the source and destination ports to the
flowi4_init_output() call in build_sk_flow_key().

Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions.")
Signed-off-by: Suresh Bhogavilli <sbhogavilli@fastly.com>
---
 net/ipv4/route.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index e2bf4602b559..2517eb12b7ef 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -557,7 +557,8 @@ static void build_sk_flow_key(struct flowi4 *fl4, const struct sock *sk)
 			   inet_test_bit(HDRINCL, sk) ?
 				IPPROTO_RAW : sk->sk_protocol,
 			   inet_sk_flowi_flags(sk),
-			   daddr, inet->inet_saddr, 0, 0, sk->sk_uid);
+			   daddr, inet->inet_saddr, inet->inet_dport, inet->inet_sport,
+			   sk->sk_uid);
 	rcu_read_unlock();
 }
 
-- 
2.31.1


Thread overview: 5+ messages
     [not found] <20231012005721.2742-2-nalramli@fastly.com>
2023-10-12 23:40 ` Nabil S. Alramli [this message]
2023-10-13 16:19   ` [net] ipv4: Fix broken PMTUD when using L4 multipath hash David Ahern
2023-10-16 18:51     ` Nabil S. Alramli
2024-02-09 17:11       ` Suresh Bhogavilli
2024-02-09 22:27         ` David Ahern
