* PMTUD broken inside network namespace with multipath routing
From: mastertheknife @ 2020-08-03 11:14 UTC
To: netdev

Hi,

I have observed that PMTUD (Path MTU Discovery) is broken when multipath
routing is used inside a network namespace. This breaks TCP, because the
sender keeps trying to send oversized packets.

Observed on kernel 5.4.44; other kernels were not tested. However, I went
through net/ipv4/route.c and have not spotted changes in this area, so I
believe the bug is still present.

Host test with multipath routing:
---------------------------------
root@host1:~# ip route add 192.168.247.100/32 dev vmbr2 nexthop via 192.168.252.250 dev vmbr2 nexthop via 192.168.252.252 dev vmbr2
root@host1:~# ip route | grep -A2 192.168.247.100
192.168.247.100
        nexthop via 192.168.252.250 dev vmbr2 weight 1
        nexthop via 192.168.252.252 dev vmbr2 weight 1
root@host1:~# ping -M do -s 1380 192.168.247.100
PING 192.168.247.100 (192.168.247.100) 1380(1408) bytes of data.
From 192.168.252.252 icmp_seq=1 Frag needed and DF set (mtu = 1406)
ping: local error: Message too long, mtu=1406
ping: local error: Message too long, mtu=1406
ping: local error: Message too long, mtu=1406
ping: local error: Message too long, mtu=1406
^C
--- 192.168.247.100 ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 80ms
root@host1:~# ip route get 192.168.247.100
192.168.247.100 via 192.168.252.250 dev vmbr2 src 192.168.252.15 uid 0
    cache expires 583sec mtu 1406

LXC container inside that host with multipath routing:
------------------------------------------------------
[root@lxctest ~]# ip route add 192.168.247.100/32 dev eth0 nexthop via 192.168.252.250 dev eth0 nexthop via 192.168.252.252 dev eth0
[root@lxctest ~]# ip route
default via 192.168.252.100 dev eth0 proto static metric 100
192.168.247.100
        nexthop via 192.168.252.250 dev eth0 weight 1
        nexthop via 192.168.252.252 dev eth0 weight 1
192.168.252.0/24 dev eth0 proto kernel scope link src 192.168.252.207 metric 100
[root@lxctest ~]# ping -M do -s 1380 192.168.247.100
PING 192.168.247.100 (192.168.247.100) 1380(1408) bytes of data.
From 192.168.252.252 icmp_seq=1 Frag needed and DF set (mtu = 1406)
From 192.168.252.252 icmp_seq=2 Frag needed and DF set (mtu = 1406)
From 192.168.252.252 icmp_seq=3 Frag needed and DF set (mtu = 1406)
From 192.168.252.252 icmp_seq=4 Frag needed and DF set (mtu = 1406)
[root@lxctest ~]# ip route get 192.168.247.100
192.168.247.100 via 192.168.252.252 dev eth0 src 192.168.252.207 uid 0
    cache

LXC container inside that host with regular routing:
----------------------------------------------------
[root@lxctest ~]# ip route add 192.168.247.100/32 via 192.168.252.252 dev eth0
[root@lxctest ~]# ip route
default via 192.168.252.100 dev eth0 proto static metric 100
192.168.247.100 via 192.168.252.252 dev eth0
192.168.252.0/24 dev eth0 proto kernel scope link src 192.168.252.207 metric 100
[root@lxctest ~]# ping -M do -s 1380 192.168.247.100
PING 192.168.247.100 (192.168.247.100) 1380(1408) bytes of data.
From 192.168.252.252 icmp_seq=1 Frag needed and DF set (mtu = 1406)
ping: local error: Message too long, mtu=1406
ping: local error: Message too long, mtu=1406
ping: local error: Message too long, mtu=1406
^C
--- 192.168.247.100 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 82ms
[root@lxctest ~]# ip route get 192.168.247.100
192.168.247.100 via 192.168.252.252 dev eth0 src 192.168.252.207 uid 0
    cache expires 591sec mtu 1406

What seems to be happening is that when multipath routing is used inside
LXC (or any network namespace), the kernel does not create a routing
exception to enforce the lower MTU. I believe this is a kernel bug.

Kfir Itzhak

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: PMTUD broken inside network namespace with multipath routing
From: David Ahern @ 2020-08-03 13:32 UTC
To: mastertheknife, netdev

On 8/3/20 5:14 AM, mastertheknife wrote:
> What seems to be happening, is that when multipath routing is used
> inside LXC (or any network namespace), the kernel doesn't generate a
> routing exception to force the lower MTU.
> I believe this is a bug inside the kernel.
>

Known problem. The original message can take path 1 while the ICMP
message takes path 2. The exception is then created on the wrong path.
* Re: PMTUD broken inside network namespace with multipath routing
From: mastertheknife @ 2020-08-03 14:24 UTC
To: David Ahern; +Cc: netdev

Hi David,

In this case both paths are in the same layer-2 network, so there is no
asymmetry in the multipath routing:
if the original message takes path 1, the ICMP response comes from path 1;
if the original message takes path 2, the ICMP response comes from path 2.
Also, it works fine outside of LXC.

Thank you,
Kfir

On Mon, Aug 3, 2020 at 4:32 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 8/3/20 5:14 AM, mastertheknife wrote:
> > What seems to be happening, is that when multipath routing is used
> > inside LXC (or any network namespace), the kernel doesn't generate a
> > routing exception to force the lower MTU.
> > I believe this is a bug inside the kernel.
> >
> Known problem. Original message can take path 1 and ICMP message can
> path 2. The exception is then created on the wrong path.
* Re: PMTUD broken inside network namespace with multipath routing
From: David Ahern @ 2020-08-03 15:38 UTC
To: mastertheknife; +Cc: netdev

On 8/3/20 8:24 AM, mastertheknife wrote:
> Hi David,
>
> In this case, both paths are in the same layer2 network, there is no
> symmetric multi-path routing.
> If original message takes path 1, ICMP response will come from path 1
> If original message takes path 2, ICMP response will come from path 2
> Also, It works fine outside of LXC.
>

I'll take a look when I get some time; most likely end of the week.
* Re: PMTUD broken inside network namespace with multipath routing
From: mastertheknife @ 2020-08-03 18:39 UTC
To: David Ahern; +Cc: netdev

Hi David,

I found something that can shed some light on the issue: it only happens
when the ICMP response does not come from the first nexthop. In my case,
both nexthops are Linux routers, and they are the ones generating the
ICMP (because of IPsec). This is what I meant earlier: the ICMP path is
identical to the path of the original message.

Test IP #1 - 192.168.249.116 - hash chooses nexthop #1
Test IP #2 - 192.168.249.117 - hash chooses nexthop #2

Test with 252.250 as nexthop #1:
--------------------------------
root@lxctest:[~] # ip route add 192.168.249.0/24 dev eth1 nexthop via 192.168.252.250 dev eth1 nexthop via 192.168.252.252 dev eth1
root@lxctest:[~] # ping -M do -s 1450 192.168.249.116
PING 192.168.249.116 (192.168.249.116) 1450(1478) bytes of data.
From 192.168.252.250 icmp_seq=1 Frag needed and DF set (mtu = 1446)
ping: local error: Message too long, mtu=1446
ping: local error: Message too long, mtu=1446
ping: local error: Message too long, mtu=1446
^C
--- 192.168.249.116 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 3067ms
root@testlxc:[~] # ping -M do -s 1450 192.168.249.117
PING 192.168.249.117 (192.168.249.117) 1450(1478) bytes of data.
From 192.168.252.252 icmp_seq=1 Frag needed and DF set (mtu = 1446)
From 192.168.252.252 icmp_seq=2 Frag needed and DF set (mtu = 1446)
From 192.168.252.252 icmp_seq=3 Frag needed and DF set (mtu = 1446)
^C
--- 192.168.249.117 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2052ms

Test with 252.252 as nexthop #1:
--------------------------------
root@testlxc:[~] # ip route add 192.168.249.0/24 dev eth1 nexthop via 192.168.252.252 dev eth1 nexthop via 192.168.252.250 dev eth1
root@testlxc:[~] # ping -M do -s 1450 192.168.249.116
PING 192.168.249.116 (192.168.249.116) 1450(1478) bytes of data.
From 192.168.252.252 icmp_seq=1 Frag needed and DF set (mtu = 1446)
ping: local error: Message too long, mtu=1446
ping: local error: Message too long, mtu=1446
^C
--- 192.168.249.116 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2044ms
root@testlxc:[~] # ping -M do -s 1450 192.168.249.117
PING 192.168.249.117 (192.168.249.117) 1450(1478) bytes of data.
From 192.168.252.250 icmp_seq=1 Frag needed and DF set (mtu = 1446)
From 192.168.252.250 icmp_seq=2 Frag needed and DF set (mtu = 1446)
From 192.168.252.250 icmp_seq=3 Frag needed and DF set (mtu = 1446)
^C
--- 192.168.249.117 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2046ms

In summary: it does not seem to matter which router is the first
nexthop. If the ICMP response is not from the first nexthop, it is
rejected.
I don't know yet why I couldn't reproduce this outside LXC, but I will
keep trying to figure it out. Let me know if you need me to test
anything.

Thank you,
Kfir Itzhak

On Mon, Aug 3, 2020 at 6:38 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 8/3/20 8:24 AM, mastertheknife wrote:
> > Hi David,
> >
> > In this case, both paths are in the same layer2 network, there is no
> > symmetric multi-path routing.
> > If original message takes path 1, ICMP response will come from path 1
> > If original message takes path 2, ICMP response will come from path 2
> > Also, It works fine outside of LXC.
> >
>
> I'll take a look when I get some time; most likely end of the week.
* Re: PMTUD broken inside network namespace with multipath routing
From: David Ahern @ 2020-08-10 22:13 UTC
To: mastertheknife; +Cc: netdev

On 8/3/20 12:39 PM, mastertheknife wrote:
> In summary: It seems that it doesn't matter who is the nexthop. If the
> ICMP response isn't from the nexthop, it'll be rejected.
> About why i couldn't reproduce this outside LXC, i don't know yet but
> i will keep trying to figure this out.

do you have a shell script that reproduces the problem?
* Re: PMTUD broken inside network namespace with multipath routing
From: mastertheknife @ 2020-08-12 12:37 UTC
To: David Ahern; +Cc: netdev

Hello David,

I tried, and it seems I can reproduce it:

# Create the test namespace
root@host:~# ip netns add testns
# Create a veth pair: veth0 in the host, veth1 in the namespace
root@host:~# ip link add veth0 type veth peer name veth1
root@host:~# ip link set veth1 netns testns
# Configure veth1 (namespace side)
root@host:~# ip netns exec testns ip addr add 192.168.252.209/24 dev veth1
root@host:~# ip netns exec testns ip link set dev veth1 up
root@host:~# ip netns exec testns ip route add default via 192.168.252.100
root@host:~# ip netns exec testns ip route add 192.168.249.0/24 nexthop via 192.168.252.250 nexthop via 192.168.252.252
# Configure veth0 (host side)
root@host:~# brctl addif vmbr2 veth0
root@host:~# ip link set veth0 up

# Tests
root@host:~# ip netns exec testns ping -M do -s 1450 192.168.249.116
PING 192.168.249.116 (192.168.249.116) 1450(1478) bytes of data.
From 192.168.252.252 icmp_seq=1 Frag needed and DF set (mtu = 1366)
ping: local error: Message too long, mtu=1366
ping: local error: Message too long, mtu=1366
ping: local error: Message too long, mtu=1366
^C
--- 192.168.249.116 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 81ms
root@host:~# ip netns exec testns ping -M do -s 1450 192.168.249.134
PING 192.168.249.134 (192.168.249.134) 1450(1478) bytes of data.
From 192.168.252.252 icmp_seq=1 Frag needed and DF set (mtu = 1366)
From 192.168.252.252 icmp_seq=2 Frag needed and DF set (mtu = 1366)
From 192.168.252.252 icmp_seq=3 Frag needed and DF set (mtu = 1366)
^C
--- 192.168.249.134 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 40ms
root@host:~# ip netns exec testns ip route show cache
192.168.249.116 via 192.168.252.250 dev veth1
    cache expires 584sec mtu 1366
192.168.249.134 via 192.168.252.250 dev veth1
    cache expires 593sec mtu 1366
root@host:~# ip netns exec testns ip route get 192.168.249.116
192.168.249.116 via 192.168.252.250 dev veth1 src 192.168.252.209 uid 0
    cache expires 578sec mtu 1366
root@host:~# ip netns exec testns ip route get 192.168.249.134
192.168.249.134 via 192.168.252.252 dev veth1 src 192.168.252.209 uid 0
    cache

Please notice above that 'ip route show cache' and 'ip route get' return
different nexthops for 192.168.249.134; I suspect that may be part of
the problem.

Thank you,
Kfir

On Tue, Aug 11, 2020 at 1:13 AM David Ahern <dsahern@gmail.com> wrote:
>
> On 8/3/20 12:39 PM, mastertheknife wrote:
> > In summary: It seems that it doesn't matter who is the nexthop. If the
> > ICMP response isn't from the nexthop, it'll be rejected.
> > About why i couldn't reproduce this outside LXC, i don't know yet but
> > i will keep trying to figure this out.
>
> do you have a shell script that reproduces the problem?
* Re: PMTUD broken inside network namespace with multipath routing
From: David Ahern @ 2020-08-12 19:21 UTC
To: mastertheknife; +Cc: netdev

On 8/12/20 6:37 AM, mastertheknife wrote:
> Hello David,
>
> I tried and it seems i can reproduce it:
>
> # Create test NS
> root@host:~# ip netns add testns
> # Create veth pair, veth0 in host, veth1 in NS
> root@host:~# ip link add veth0 type veth peer name veth1
> root@host:~# ip link set veth1 netns testns
> # Configure veth1 (NS)
> root@host:~# ip netns exec testns ip addr add 192.168.252.209/24 dev veth1
> root@host:~# ip netns exec testns ip link set dev veth1 up
> root@host:~# ip netns exec testns ip route add default via 192.168.252.100
> root@host:~# ip netns exec testns ip route add 192.168.249.0/24
> nexthop via 192.168.252.250 nexthop via 192.168.252.252
> # Configure veth0 (host)
> root@host:~# brctl addif vmbr2 veth0

vmbr2's config is not defined.

ip li add vmbr2 type bridge
ip li set veth0 master vmbr2
ip link set veth0 up

anything else? e.g., an address for vmbr2? What holds 192.168.252.250
and 192.168.252.252?
* Re: PMTUD broken inside network namespace with multipath routing
From: mastertheknife @ 2020-08-14 7:08 UTC
To: David Ahern; +Cc: netdev

Hello David,

It's on a production system. vmbr2 is a bridge with an eth.X VLAN
interface inside for connectivity on the 192.168.252.0/24 network; in
this case vmbr2 has the address 192.168.252.5.
192.168.252.250 and 192.168.252.252 are CentOS 8 LXCs on another host,
running libreswan for any/any IPsec tunnels with VTI interfaces.

Everything is kernel 5.4.44 LTS.

I wish I could fully reproduce all of it in a script, but I am not sure
how to create hops that return this ICMP.

Thank you,
Kfir

On Wed, Aug 12, 2020 at 10:21 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 8/12/20 6:37 AM, mastertheknife wrote:
> > Hello David,
> >
> > I tried and it seems i can reproduce it:
> >
> > # Create test NS
> > root@host:~# ip netns add testns
> > # Create veth pair, veth0 in host, veth1 in NS
> > root@host:~# ip link add veth0 type veth peer name veth1
> > root@host:~# ip link set veth1 netns testns
> > # Configure veth1 (NS)
> > root@host:~# ip netns exec testns ip addr add 192.168.252.209/24 dev veth1
> > root@host:~# ip netns exec testns ip link set dev veth1 up
> > root@host:~# ip netns exec testns ip route add default via 192.168.252.100
> > root@host:~# ip netns exec testns ip route add 192.168.249.0/24
> > nexthop via 192.168.252.250 nexthop via 192.168.252.252
> > # Configure veth0 (host)
> > root@host:~# brctl addif vmbr2 veth0
>
> vmbr2's config is not defined.
>
> ip li add vmbr2 type bridge
> ip li set veth0 master vmbr2
> ip link set veth0 up
>
> anything else? e.g., address for vmbr2? What holds 192.168.252.250 and
> 192.168.252.252
* Re: PMTUD broken inside network namespace with multipath routing
From: mastertheknife @ 2020-09-01 10:40 UTC
To: David Ahern; +Cc: netdev

Hello David,

I was able to solve it while troubleshooting a fragmentation issue.
The VTI interfaces had an MTU of 1480 by default; I reduced them to the
real path MTU (1366) and now it is all working fine. I am not sure how
or why this is related, but it seems to have solved the issue.

P.S.: While reading the relevant kernel code, I think I spotted a
mistake in net/ipv4/route.c, in the function update_or_create_fnhe().
It looks like it loops over all the exceptions for the nexthop entry
but always overwrites the first (and only) entry, so effectively only
one exception can exist per nexthop entry.
Line 678:
    if (fnhe) {
should probably be:
    if (fnhe && fnhe->fnhe_daddr == daddr) {

Thank you for your efforts,
Kfir Itzhak

On Fri, Aug 14, 2020 at 10:08 AM mastertheknife
<mastertheknife@gmail.com> wrote:
>
> Hello David,
>
> It's on a production system, vmbr2 is a bridge with eth.X VLAN
> interface inside for the connectivity on that 252.0/24 network. vmbr2
> has address 192.168.252.5 in that case
> 192.168.252.250 and 192.168.252.252 are CentOS8 LXCs on another host,
> with libreswan inside for any/any IPSECs with VTi interfaces.
>
> Everything is kernel 5.4.44 LTS
>
> I wish i could fully reproduce all of it in a script, but i am not
> sure how to create such hops that return this ICMP
>
> Thank you,
> Kfir
>
> On Wed, Aug 12, 2020 at 10:21 PM David Ahern <dsahern@gmail.com> wrote:
> >
> > On 8/12/20 6:37 AM, mastertheknife wrote:
> > > Hello David,
> > >
> > > I tried and it seems i can reproduce it:
> > >
> > > # Create test NS
> > > root@host:~# ip netns add testns
> > > # Create veth pair, veth0 in host, veth1 in NS
> > > root@host:~# ip link add veth0 type veth peer name veth1
> > > root@host:~# ip link set veth1 netns testns
> > > # Configure veth1 (NS)
> > > root@host:~# ip netns exec testns ip addr add 192.168.252.209/24 dev veth1
> > > root@host:~# ip netns exec testns ip link set dev veth1 up
> > > root@host:~# ip netns exec testns ip route add default via 192.168.252.100
> > > root@host:~# ip netns exec testns ip route add 192.168.249.0/24
> > > nexthop via 192.168.252.250 nexthop via 192.168.252.252
> > > # Configure veth0 (host)
> > > root@host:~# brctl addif vmbr2 veth0
> >
> > vmbr2's config is not defined.
> >
> > ip li add vmbr2 type bridge
> > ip li set veth0 master vmbr2
> > ip link set veth0 up
> >
> > anything else? e.g., address for vmbr2? What holds 192.168.252.250 and
> > 192.168.252.252
* Re: PMTUD broken inside network namespace with multipath routing
From: mastertheknife @ 2020-09-01 10:44 UTC
To: David Ahern; +Cc: netdev

Hello David,

A quick correction: the issue is not solved; that was a mistake in my
testing. The issue is still there.

Kfir

On Tue, Sep 1, 2020 at 1:40 PM mastertheknife <mastertheknife@gmail.com> wrote:
>
> Hello David.
>
> I was able to solve it while troubleshooting some fragmentation issue.
> The VTI interfaces had MTU of 1480 by default. I reduced to them to
> the real PMTUD (1366) and now its all working just fine.
> I am not sure how its related and why, but seems like it solved the issue.
>
> P.S: while reading the relevant code in the kernel, i think i spotted
> some mistake in net/ipv4/route.c, in function "update_or_create_fnhe".
> It looks like it loops over all the exceptions for the nexthop entry,
> but always overwriting the first (and only) entry, so effectively only
> 1 exception can exist per nexthop entry.
> Line 678:
> "if (fnhe) {"
> Should probably be:
> "if (fnhe && fnhe->fnhe_daddr == daddr) {"
>
> Thank you for your efforts,
> Kfir Itzhak
>
> On Fri, Aug 14, 2020 at 10:08 AM mastertheknife
> <mastertheknife@gmail.com> wrote:
> >
> > Hello David,
> >
> > It's on a production system, vmbr2 is a bridge with eth.X VLAN
> > interface inside for the connectivity on that 252.0/24 network. vmbr2
> > has address 192.168.252.5 in that case
> > 192.168.252.250 and 192.168.252.252 are CentOS8 LXCs on another host,
> > with libreswan inside for any/any IPSECs with VTi interfaces.
> >
> > Everything is kernel 5.4.44 LTS
> >
> > I wish i could fully reproduce all of it in a script, but i am not
> > sure how to create such hops that return this ICMP
> >
> > Thank you,
> > Kfir
> >
> > On Wed, Aug 12, 2020 at 10:21 PM David Ahern <dsahern@gmail.com> wrote:
> > >
> > > On 8/12/20 6:37 AM, mastertheknife wrote:
> > > > Hello David,
> > > >
> > > > I tried and it seems i can reproduce it:
> > > >
> > > > # Create test NS
> > > > root@host:~# ip netns add testns
> > > > # Create veth pair, veth0 in host, veth1 in NS
> > > > root@host:~# ip link add veth0 type veth peer name veth1
> > > > root@host:~# ip link set veth1 netns testns
> > > > # Configure veth1 (NS)
> > > > root@host:~# ip netns exec testns ip addr add 192.168.252.209/24 dev veth1
> > > > root@host:~# ip netns exec testns ip link set dev veth1 up
> > > > root@host:~# ip netns exec testns ip route add default via 192.168.252.100
> > > > root@host:~# ip netns exec testns ip route add 192.168.249.0/24
> > > > nexthop via 192.168.252.250 nexthop via 192.168.252.252
> > > > # Configure veth0 (host)
> > > > root@host:~# brctl addif vmbr2 veth0
> > >
> > > vmbr2's config is not defined.
> > >
> > > ip li add vmbr2 type bridge
> > > ip li set veth0 master vmbr2
> > > ip link set veth0 up
> > >
> > > anything else? e.g., address for vmbr2? What holds 192.168.252.250 and
> > > 192.168.252.252
* Re: PMTUD broken inside network namespace with multipath routing
From: David Ahern @ 2020-09-02 0:42 UTC
To: mastertheknife; +Cc: netdev

On 9/1/20 4:40 AM, mastertheknife wrote:
>
> P.S: while reading the relevant code in the kernel, i think i spotted
> some mistake in net/ipv4/route.c, in function "update_or_create_fnhe".
> It looks like it loops over all the exceptions for the nexthop entry,
> but always overwriting the first (and only) entry, so effectively only
> 1 exception can exist per nexthop entry.
> Line 678:
> "if (fnhe) {"
> Should probably be:
> "if (fnhe && fnhe->fnhe_daddr == daddr) {"
>

Right above that line is:

        for (fnhe = rcu_dereference(hash->chain); fnhe;
             fnhe = rcu_dereference(fnhe->fnhe_next)) {
                if (fnhe->fnhe_daddr == daddr)
                        break;
                depth++;
        }

so fnhe is set based on daddr match.