Hi!

> I do not see how this worked on 4.19. My comment above is a fundamental
> property of VRF and has been needed since day 1. That's why 'ip vrf
> exec' exists.

I'm afraid I have to disagree here: first of all, I created a
regression test in NixOS for this purpose a while ago[1]. The third
test case (lines 197-208) does basically what I demonstrated in my
previous emails (opening SSH connections through a local VRF). This
worked fine until we bumped our default kernel to 5.4.x, which is the
reason why this test case is temporarily commented out.

While this is helpful to demonstrate the issue, I acknowledge that this
is pretty useless for a non-NixOS user, which is why I did some further
research today:

After skimming through the VRF-related changes in 4.20 and 5.0 (which
might've had some relevant changes as you suggested previously), I
rebuilt the kernels 5.4.29 and 5.5.13 with
3c82a21f4320c8d54cf6456b27c8d49e5ffb722e[2] reverted on top, and the
commented-out test case works fine again. In other words, my use case
seems to have worked before and the mentioned commit appears to cause
the "regression".

To provide you with further context, I decided to run

    `sudo perf record -e fib:* -a -g -- ssh root@92.60.36.231 -o ConnectTimeout=10s true`

again on my patched kernel at 5.5.13. The result is available under
https://gist.githubusercontent.com/Ma27/a6f83e05f6ffede21c2e27d5c7d27098/raw/40c78603d5f76caa8717e293aba5c609bf7f013d/perf-report.txt

Thanks!

  Maximilian

[1] https://github.com/NixOS/nixpkgs/blob/58c7a952a13a65398bed3f539061e69f523ee377/nixos/tests/systemd-networkd-vrf.nix
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c82a21f4320c8d54cf6456b27c8d49e5ffb722e

On Wed, Apr 01, 2020 at 02:41:56PM -0600, David Ahern wrote:
> On 4/1/20 2:35 PM, Maximilian Bosch wrote:
> > Hi!
> >
> >> This should work:
> >>     make -C tools/testing/selftests/net nettest
> >>     PATH=$PWD/tools/testing/selftests/net:$PATH
> >>     tools/testing/selftests/net/fcnal-test.sh
> >
> > Thanks, will try this out later.
> >
> >> If you want that ssh connection to work over a VRF you either need to
> >> set the shell context:
> >>     ip vrf exec su - $USER
> >>
> >
> > Yes, using `ip vrf exec` is basically my current workaround.
>
> that's not a workaround, it's a requirement. With VRF configured all
> addresses are relative to the L3 domain. When trying to connect to a
> remote host, the VRF needs to be given.
>
> >
> >> or add 'ip vrf exec' before the ssh. If it is an incoming connection to
> >> a server the ssh server either needs to be bound to the VRF or you need
> >> 'net.ipv4.tcp_l3mdev_accept = 1'
> >
> > Does this mean that the `*l3mdev_accept`-parameters only "fix" this
> > issue if the VRF is on the server I connect to?
>
> server side setting only.
>
> >
> > In my case the VRF is on my local machine and I try to connect through
> > the VRF to the server.
> >
> >> The tcp reset suggests you are doing an outbound connection but the
> >> lookup for what must be the SYN-ACK is not finding the local socket -
> >> and that is because of the missing 'ip vrf exec' above.
> >
> > I only experience this behavior on a 5.x kernel, not on e.g. 4.19
> > though. I may be wrong, but isn't this a breaking change for userspace
> > applications in the end?
>
> I do not see how this worked on 4.19. My comment above is a fundamental
> property of VRF and has been needed since day 1. That's why 'ip vrf
> exec' exists.
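For anyone following the thread, here is a rough sketch of the client-side
options quoted above (prefixing a single command with `ip vrf exec`, or
moving the whole login shell into the VRF context). The helper name
`vrf_run`, the `DRY_RUN` switch and the VRF name `vrf-blue` are made up for
illustration and do not appear in the thread; running it for real needs
root and an existing VRF device.

```shell
#!/bin/sh
# Sketch only: "vrf-blue" is a placeholder VRF name, not taken from the thread.

# Run a command inside the given VRF's L3 domain via `ip vrf exec`.
# With DRY_RUN set to a non-empty value, print the command instead of
# executing it (handy for checking what would be run without root).
vrf_run() {
    vrf_name="$1"
    shift
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "ip vrf exec $vrf_name $*"
    else
        ip vrf exec "$vrf_name" "$@"
    fi
}

# Client side, one command at a time:
#   vrf_run vrf-blue ssh root@92.60.36.231 -o ConnectTimeout=10s true
# Client side, whole login shell in the VRF context:
#   vrf_run vrf-blue su - "$USER"
# Server side only, for listening sockets that are not bound to a VRF:
#   sysctl -w net.ipv4.tcp_l3mdev_accept=1
```

With `DRY_RUN=1` set, `vrf_run vrf-blue ssh root@example.org true` only
prints the resulting `ip vrf exec` invocation, so the wrapper can be
checked on a machine without any VRF configured.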