netdev.vger.kernel.org archive mirror
From: Jakub Kicinski <kuba@kernel.org>
To: Maximilien Cuony <maximilien.cuony@arcanite.ch>
Cc: netdev@vger.kernel.org, Phil Sutter <phil@nwl.cc>,
	Florian Westphal <fw@strlen.de>,
	Mike Manning <mvrmanning@gmail.com>,
	David Ahern <dsahern@kernel.org>,
	netfilter-devel@vger.kernel.org
Subject: Re: [REGRESSION] Unable to NAT own TCP packets from another VRF with tcp_l3mdev_accept = 1
Date: Fri, 30 Sep 2022 17:42:37 -0700
Message-ID: <20220930174237.2e89c9e1@kernel.org>
In-Reply-To: <98348818-28c5-4cb2-556b-5061f77e112c@arcanite.ch>

Adding netfilter and vrf experts.

On Wed, 28 Sep 2022 16:02:43 +0200 Maximilien Cuony wrote:
> Hello,
> 
> We're using VRF on a machine that acts as a router, and we have a specific 
> issue where the router doesn't handle its own packets correctly during 
> NATing if the packet is coming from a different VRF.
> 
> We had the issue with Debian buster (4.19), but the issue resolved itself 
> when we updated to Debian bullseye (5.10.92).
> 
> However, during an upgrade of Debian bullseye to the latest kernel, the 
> issue appeared again (5.10.140).
> 
> We did a bisection and this led us to 
> "b0d67ef5b43aedbb558b9def2da5b4fffeb19966 net: allow unbound socket for 
> packets in VRF when tcp_l3mdev_accept set [ Upstream commit 
> 944fd1aeacb627fa617f85f8e5a34f7ae8ea4d8e ]".
> 
> Simplified case setup:
> 
> There are two machines in the setup. They both forward packets 
> (net.ipv4.ip_forward = 1) and there are two interfaces between them.
> 
> The main machine has two VRFs. The default VRF is using the second 
> machine as the default route, on a specific interface.
> The second machine has its default route via the main machine, in the other 
> VRF, using the second pair of interfaces.
> 
> On the main machine, the second interface is in a specific VRF. In that 
> VRF, packets are NATed to the internet on a third interface.
> 
> A diagram of the normal flow is available here: 
> https://etinacra.ch/kernel.png
> 
> Configuration commands:
> 
> Main machine:
> sysctl -w net.ipv4.tcp_l3mdev_accept=1
> sysctl -w net.ipv4.ip_forward=1
> iptables -t raw -A PREROUTING -i eth0 -j CT --zone 5
> iptables -t raw -A OUTPUT -o eth0 -j CT --zone 5
> iptables -t nat -A POSTROUTING -o eth2 -j SNAT --to 192.168.1.1
> cat /etc/network/interfaces
> 
> auto firewall
> iface firewall
>      vrf-table 1200
> 
> auto eth0
> iface eth0
>      address 192.168.5.1/24
>      gateway 192.168.5.2
> 
> auto eth1
> iface eth1
>      address 192.168.10.1/24
>      vrf firewall
>      up ip route add 192.168.5.0/24 via 192.168.10.2 vrf firewall
> 
> auto eth2
> iface eth2
>      address 192.168.1.1/24
>      gateway 192.168.1.250
>      vrf firewall
> 
> ==
> 
> Second machine:
> 
> sysctl -w net.ipv4.ip_forward=1
> 
> cat /etc/network/interfaces
> 
> auto eth0
> iface eth0
>      address 192.168.5.2/24
> 
> auto eth1
> iface eth1
>      address 192.168.10.2/24
>      gateway 192.168.10.1
> 
> ==
> 
> When the issue is absent, a tcpdump on all interfaces of the main 
> machine shows that everything is fine (output truncated):
> 
> 10:28:32.811283 eth0 Out IP 192.168.5.1.55750 > 99.99.99.99.80: Flags [S], seq 2216112145
> 10:28:32.811666 eth1 In  IP 192.168.5.1.55750 > 99.99.99.99.80: Flags [S], seq 2216112145
> 10:28:32.811679 eth2 Out IP 192.168.1.1.55750 > 99.99.99.99.80: Flags [S], seq 2216112145
> 10:28:32.835138 eth2 In  IP 99.99.99.99.80 > 192.168.1.1.55750: Flags [S.], seq 383992840, ack 2216112146
> 10:28:32.835152 eth1 Out IP 99.99.99.99.80 > 192.168.5.1.55750: Flags [S.], seq 383992840, ack 2216112146
> 10:28:32.835457 eth0 In  IP 99.99.99.99.80 > 192.168.5.1.55750: Flags [S.], seq 383992840, ack 2216112146
> 10:28:32.835511 eth0 Out IP 192.168.5.1.55750 > 99.99.99.99.80: Flags [.], ack 1, win 502
> 
> However, when the issue is present, the SYN-ACK does arrive on eth2, but 
> is never "unNATed" back to eth1:
> 
> 10:25:07.644433 eth0 Out IP 192.168.5.1.48684 > 99.99.99.99.80: Flags [S], seq 3207393154
> 10:25:07.644782 eth1 In  IP 192.168.5.1.48684 > 99.99.99.99.80: Flags [S], seq 3207393154
> 10:25:07.644793 eth2 Out IP 192.168.1.1.48684 > 99.99.99.99.80: Flags [S], seq 3207393154
> 10:25:07.668551 eth2 In  IP 54.36.61.42.80 > 192.168.1.1.48684: Flags [S.], seq 823335485, ack 3207393155
> 
> The issue only affects TCP connections. UDP and ICMP work fine.
> 
> Setting net.ipv4.tcp_l3mdev_accept back to 0 also fixes the issue, but 
> we need this flag since we use some sockets that do not understand VRFs.
> 
> We did have a look at the diff and the code of inet_bound_dev_eq, but we 
> didn't really understand the underlying problem. It does seem that 
> bound_dev_if is now checked to be non-zero before the bound_dev_if == 
> dif || bound_dev_if == sdif comparison, something that was not the case 
> before (especially since it's dependent on l3mdev_accept).
> 
> Maybe our setup is wrong and we should not be able to route packets like 
> that?
> 
> Thanks a lot and have a nice day!
> 
> Maximilien Cuony
> 
> 
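
For readers following the inet_bound_dev_eq remark in the quoted report,
here is a minimal sketch of the two predicates as the report describes
them. It is an illustration under assumptions, not code copied from the
kernel tree: the function names match_plain and match_checked are
hypothetical, the "before" form is taken from the report's wording (a bare
bound_dev_if == dif || bound_dev_if == sdif comparison), and the "after"
form assumes that an unbound socket is accepted when the packet is not in
a VRF or when l3mdev_accept is set. It only shows where the two forms
differ; it does not establish the root cause of the regression.

```c
#include <stdbool.h>

/*
 * Illustrative model only (hypothetical names, not kernel source).
 *   bound_dev_if: ifindex the socket is bound to, 0 if unbound
 *   dif:          ifindex of the interface the packet arrived on
 *   sdif:         ifindex of that interface's VRF master, 0 if none
 */

/* "Before", per the report: a plain comparison with no zero check. */
static inline bool match_plain(int bound_dev_if, int dif, int sdif)
{
	return bound_dev_if == dif || bound_dev_if == sdif;
}

/*
 * "After", per the report: the unbound case is tested first and its
 * result depends on l3mdev_accept (assumed semantics, see lead-in).
 */
static inline bool match_checked(bool l3mdev_accept, int bound_dev_if,
				 int dif, int sdif)
{
	if (!bound_dev_if)
		return !sdif || l3mdev_accept;
	return bound_dev_if == dif || bound_dev_if == sdif;
}
```

With made-up ifindex values (4 for an interface enslaved to a VRF device
with ifindex 10), an unbound socket gives match_plain(0, 4, 10) == false,
match_checked(false, 0, 4, 10) == false, and
match_checked(true, 0, 4, 10) == true. For a packet arriving outside any
VRF (sdif == 0), both forms accept an unbound socket, and for bound
sockets the two forms agree.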

