* Re: VRF with enslaved L3 enabled bridge
  [not found] <9E3A1F70-E009-49DC-B639-B48B28F99C52@ciena.com>
@ 2018-07-20  3:37 ` David Ahern
  2018-07-20 19:03   ` [**EXTERNAL**] " D'Souza, Nelson
  0 siblings, 1 reply; 17+ messages in thread

From: David Ahern @ 2018-07-20  3:37 UTC (permalink / raw)
To: D'Souza, Nelson, netdev

On 7/19/18 8:19 PM, D'Souza, Nelson wrote:
> Hi,
>
> I'm seeing the following issue on a system running a 4.14.52 Linux kernel.
>
> With an eth interface enslaved to a VRF device, pings sent out on the
> VRF to a neighboring host are successful. But with an eth interface
> enslaved to an L3-enabled bridge (mgmtbr0), and the bridge enslaved to a
> l3mdev VRF (mgmtvrf), pings sent out on the VRF are not received back at
> the application level.

you mean this setup:

    eth1 (ingress port) -> br0 (bridge) -> red (vrf)

IP address on br0:

9: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master red state UP group default qlen 1000
    link/ether 02:e0:f9:1c:00:37 brd ff:ff:ff:ff:ff:ff
    inet 10.100.1.4/24 scope global br0
       valid_lft forever preferred_lft forever
    inet6 fe80::e0:f9ff:fe1c:37/64 scope link
       valid_lft forever preferred_lft forever

And then ping a neighbor:

# ping -I red -c1 -w1 10.100.1.254
ping: Warning: source address might be selected on device other than red.
PING 10.100.1.254 (10.100.1.254) from 10.100.1.4 red: 56(84) bytes of data.
64 bytes from 10.100.1.254: icmp_seq=1 ttl=64 time=0.810 ms

--- 10.100.1.254 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.810/0.810/0.810/0.000 ms

> ICMP echo requests are successfully sent out on the mgmtvrf device to a
> neighboring host. However, ICMP echo replies that are received back from
> the neighboring host via the eth and mgmtbr0 interfaces are not seen at
> the vrf device level and therefore fail to be delivered locally to the
> ping application.

Does tcpdump on each level show the response?
tcpdump on eth, tcpdump on bridge and tcpdump on the vrf device?

> The following LOG rules were added to the raw table PREROUTING chain
> and the filter table OUTPUT chain:
>
> root@x10sdv-4c-tln4f:~# iptables -t raw -S PREROUTING
> -P PREROUTING ACCEPT
> -A PREROUTING -s 10.32.8.135/32 -i mgmtbr0 -j LOG
> -A PREROUTING -s 10.32.8.135/32 -i mgmtvrf -j LOG
>
> root@x10sdv-4c-tln4f:~# iptables -S OUTPUT
> -P OUTPUT ACCEPT
> -A OUTPUT -o mgmtvrf -j LOG
> -A OUTPUT -o mgmtbr0 -j LOG
>
> Pings are sent on the management VRF to a neighboring host (10.32.8.135);
> the netfilter logs are included below. Note that the ICMP echo requests
> match the OUTPUT rules for mgmtvrf and mgmtbr0, but the ICMP echo
> replies are only seen on mgmtbr0, not on mgmtvrf.
>
> root@x10sdv-4c-tln4f:~# ping 10.32.8.135 -I mgmtvrf -c 1
> PING 10.32.8.135 (10.32.8.135): 56 data bytes
> [ 2679.683027] IN= OUT=mgmtvrf SRC=10.33.96.131 DST=10.32.8.135 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=23921 DF PROTO=ICMP TYPE=8 CODE=0 ID=32610 SEQ=0   <<< ICMP echo request sent on mgmtvrf
> [ 2679.697560] IN= OUT=mgmtbr0 SRC=10.33.96.131 DST=10.32.8.135 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=23921 DF PROTO=ICMP TYPE=8 CODE=0 ID=32610 SEQ=0   <<< ICMP echo request sent on mgmtbr0
> [ 2679.713312] IN=mgmtbr0 OUT= PHYSIN=ethUSB MAC=c0:56:27:90:4f:75:c4:7d:4f:bb:02:e7:08:00 SRC=10.32.8.135 DST=10.33.96.131 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=64949 PROTO=ICMP TYPE=0 CODE=0 ID=32610 SEQ=0   <<< ICMP echo reply rcvd on mgmtbr0, but not on mgmtvrf
>
> --- 10.32.8.135 ping statistics ---
> 1 packets transmitted, 0 packets received, 100% packet loss   <<<< ping failed
>
> I'd like to know if this is an outstanding/resolved issue.

This one works (see above), so I suspect it is something with your setup.

^ permalink raw reply [flat|nested] 17+ messages in thread
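The three captures being asked for can be run side by side, which shows exactly where the reply stops being visible. This is a sketch using the interface names and neighbor address from the report; it must run as root on the affected box, and if the reply never reaches mgmtvrf that capture will not exit on its own:

```shell
# Capture the ICMP request/reply pair at each layer: physical port,
# bridge, then VRF device. -c 2 makes a capture exit after two packets.
tcpdump -ni ethUSB  icmp -c 2 &
tcpdump -ni mgmtbr0 icmp -c 2 &
tcpdump -ni mgmtvrf icmp -c 2 &
ping 10.32.8.135 -I mgmtvrf -c 1 -w 1
wait
```

The layer whose capture shows the request but not the reply (or shows neither) is where delivery breaks down.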
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-07-20  3:37 ` VRF with enslaved L3 enabled bridge David Ahern
@ 2018-07-20 19:03   ` D'Souza, Nelson
  2018-07-20 19:11     ` David Ahern
  2018-07-23 22:00     ` David Ahern
  0 siblings, 2 replies; 17+ messages in thread

From: D'Souza, Nelson @ 2018-07-20 19:03 UTC (permalink / raw)
To: David Ahern, netdev

Hi Dave,

It is good to know that this works in your case. However, I'm not able to
pinpoint the issue and am looking for a way to narrow down to the root
cause. Do you know if this has been an issue in the past that was resolved
in Linux kernel versions after 4.14.52?

I have the same setup as you, and tcpdump works at all levels (eth,
bridge, vrf). The setup is as follows:

    ethUSB (ingress port) -> mgmtbr0 (bridge) -> mgmtvrf (vrf)

Logs from my setup:

a) ethUSB is enslaved to mgmtbr0 (bridge)

root@x10sdv-4c-tln4f:~# ip link show master mgmtbr0
6: ethUSB: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master mgmtbr0 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether c0:56:27:90:4f:75 brd ff:ff:ff:ff:ff:ff

b) mgmtbr0 bridge is enslaved to mgmtvrf (vrf)

root@x10sdv-4c-tln4f:~# ip link show master mgmtvrf
16: mgmtbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master mgmtvrf state UP mode DEFAULT group default qlen 1000
    link/ether c0:56:27:90:4f:75 brd ff:ff:ff:ff:ff:ff

c) IP address configured on mgmtbr0

root@x10sdv-4c-tln4f:~# ip addr show dev mgmtbr0
16: mgmtbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master mgmtvrf state UP group default qlen 1000
    link/ether c0:56:27:90:4f:75 brd ff:ff:ff:ff:ff:ff
    inet 10.33.96.131/24 brd 10.33.96.255 scope global mgmtbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::c256:27ff:fe90:4f75/64 scope link
       valid_lft forever preferred_lft forever

d) tcpdump on ethUSB sees the reply, but ping fails

root@x10sdv-4c-tln4f:~# ping 10.32.8.135 -I mgmtvrf -c1 -w1
PING 10.32.8.135 (10.32.8.135): 56 data bytes
--- 10.32.8.135 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss

root@x10sdv-4c-tln4f:~# tcpdump -i ethUSB icmp
11:38:37.169678 IP 10.33.96.131 > 10.32.8.135: ICMP echo request, id 62312, seq 0, length 64
11:38:37.170906 IP 10.32.8.135 > 10.33.96.131: ICMP echo reply, id 62312, seq 0, length 64

e) tcpdump on mgmtbr0 sees the reply, but ping fails

root@x10sdv-4c-tln4f:~# ping 10.32.8.135 -I mgmtvrf -c1 -w1
PING 10.32.8.135 (10.32.8.135): 56 data bytes
--- 10.32.8.135 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss

root@x10sdv-4c-tln4f:~# tcpdump -i mgmtbr0 icmp
11:46:21.566739 IP 10.33.96.131 > 10.32.8.135: ICMP echo request, id 617, seq 0, length 64
11:46:21.567982 IP 10.32.8.135 > 10.33.96.131: ICMP echo reply, id 617, seq 0, length 64

f) tcpdump on mgmtvrf sees the reply, but ping fails

root@x10sdv-4c-tln4f:~# ping 10.32.8.135 -I mgmtvrf -c1 -w1
PING 10.32.8.135 (10.32.8.135): 56 data bytes
--- 10.32.8.135 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss

root@x10sdv-4c-tln4f:~# tcpdump -i mgmtvrf icmp
11:50:24.155706 IP 10.33.96.131 > 10.32.8.135: ICMP echo request, id 2153, seq 0, length 64
11:50:24.156977 IP 10.32.8.135 > 10.33.96.131: ICMP echo reply, id 2153, seq 0, length 64

g) Netfilter PREROUTING rules in the raw table only see packets ingressing on mgmtbr0, not mgmtvrf

root@x10sdv-4c-tln4f:~# iptables -t raw -nvL PREROUTING
Chain PREROUTING (policy ACCEPT 3 packets, 252 bytes)
 pkts bytes target  prot opt in       out  source       destination
    3   252 LOG     all  --  mgmtbr0  *    10.32.8.135  0.0.0.0/0    LOG flags 0 level 4
    0     0 LOG     all  --  mgmtvrf  *    10.32.8.135  0.0.0.0/0    LOG flags 0 level 4

It's strange that while tcpdump works at the mgmtvrf level, netfilter
PREROUTING rules do not match at the mgmtvrf level.

Appreciate the help; please let me know if you need additional logs.
Thanks,
Nelson

On 7/19/18, 8:37 PM, "David Ahern" <dsa@cumulusnetworks.com> wrote:

    [full quote of the previous message elided]

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: VRF with enslaved L3 enabled bridge
  2018-07-20 19:03 ` [**EXTERNAL**] " D'Souza, Nelson
@ 2018-07-20 19:11   ` David Ahern
  2018-07-20 21:59     ` [**EXTERNAL**] " D'Souza, Nelson
  2018-07-23 22:00     ` David Ahern
  1 sibling, 1 reply; 17+ messages in thread

From: David Ahern @ 2018-07-20 19:11 UTC (permalink / raw)
To: D'Souza, Nelson, netdev

On 7/20/18 1:03 PM, D'Souza, Nelson wrote:
> Hi Dave,
>
> It is good to know that this works in your case. However, I'm not able
> to pinpoint what the issue is and looking for a way to narrow down to
> the root cause. Do you know if this has been an issue in the past and
> resolved in Linux kernel versions after 4.14.52?

It has always worked as far as I recall.

> I have the same setup as you and tcpdump works at all levels (eth,
> bridge, vrf).
>
> [setup description and logs quoted from the previous message elided]

First, is this a modified kernel?

What does the following show?

$ ip ru ls
$ ip route ls vrf mgmt
$ ip li sh vrf mgmt

Try perf:

    perf record -e fib:* -a -g -- sleep 3
    (run ping during record)
    perf script

Look at the table used for lookups. Is the correct one for the mgmt vrf?

^ permalink raw reply [flat|nested] 17+ messages in thread
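The last step (checking which table the lookups used) can be done with standard text tools on the `perf script` output. A sketch: the `table <id>` field is how `fib:fib_table_lookup` events typically print, but the exact field layout varies by kernel version, so the sample line below is an illustrative assumption, not captured output:

```shell
# Summarize the routing-table IDs consulted, from `perf script` output of
# a `perf record -e fib:* -a -g` session run alongside the failing ping.
# Assumption: fib:fib_table_lookup events carry a "table <id>" field.
fib_tables() {
    grep 'fib_table_lookup' | sed -n 's/.*table \([0-9][0-9]*\).*/\1/p' | sort -u
}

# Illustrative sample line; in real use pipe `perf script` in instead.
sample='ping  2879 [001]   314.159265: fib:fib_table_lookup: table 123 oif 0 iif 16 src 10.32.8.135 dst 10.33.96.131'
printf '%s\n' "$sample" | fib_tables
```

For the mgmt VRF the output should list the VRF's own table (123 in the sample), not the main table (254).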
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-07-20 19:11 ` David Ahern
@ 2018-07-20 21:59   ` D'Souza, Nelson
  0 siblings, 0 replies; 17+ messages in thread

From: D'Souza, Nelson @ 2018-07-20 21:59 UTC (permalink / raw)
To: David Ahern, netdev

The kernel has patches applied beyond 4.14.52, but aside from that it has
no custom changes.

We currently don't have perf on the system, so I will have to get back to
you with the perf traces. Meanwhile, here are the ip outputs you
requested.

root@x10sdv-4c-tln4f:~# ip rule ls
0:      from all lookup local
1000:   from all lookup [l3mdev-table]
32766:  from all lookup main
32767:  from all lookup default

root@x10sdv-4c-tln4f:~# ip route ls vrf mgmtvrf
default via 10.33.96.1 dev mgmtbr0
10.33.96.0/24 dev mgmtbr0 proto kernel scope link src 10.33.96.131

root@x10sdv-4c-tln4f:~# ip link show vrf mgmtvrf
16: mgmtbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master mgmtvrf state UP mode DEFAULT group default qlen 1000
    link/ether c0:56:27:90:4f:75 brd ff:ff:ff:ff:ff:ff

Thanks,
Nelson

On 7/20/18, 12:11 PM, "David Ahern" <dsa@cumulusnetworks.com> wrote:

    [full quote of the previous message elided]

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-07-20 19:03 ` [**EXTERNAL**] " D'Souza, Nelson
  2018-07-20 19:11   ` David Ahern
@ 2018-07-23 22:00   ` David Ahern
  2018-07-24  1:43     ` D'Souza, Nelson
  1 sibling, 1 reply; 17+ messages in thread

From: David Ahern @ 2018-07-23 22:00 UTC (permalink / raw)
To: D'Souza, Nelson, netdev

On 7/20/18 1:03 PM, D'Souza, Nelson wrote:
> Setup is as follows:
>
> ethUSB (ingress port) -> mgmtbr0 (bridge) -> mgmtvrf (vrf)

                                 |  netns foo
  [ test-vrf ]                   |
       |                         |
    [ br0 ] 172.16.1.1           |
       |                         |
  [ veth1 ] =====================|===== [ veth2 ]       lo
                                 |      172.16.1.2      172.16.2.2

Copy and paste the following into your environment:

ip netns add foo
ip li add veth1 type veth peer name veth2
ip li set veth2 netns foo
ip -netns foo li set lo up
ip -netns foo li set veth2 up
ip -netns foo addr add 172.16.1.2/24 dev veth2

ip li add test-vrf type vrf table 123
ip li set test-vrf up
ip ro add vrf test-vrf unreachable default

ip li add br0 type bridge
ip li set veth1 master br0
ip li set veth1 up
ip li set br0 up
ip addr add dev br0 172.16.1.1/24
ip li set br0 master test-vrf

ip -netns foo addr add 172.16.2.2/32 dev lo
ip ro add vrf test-vrf 172.16.2.2/32 via 172.16.1.2

Does ping work?

# ping -I test-vrf 172.16.2.2
ping: Warning: source address might be selected on device other than test-vrf.
PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 test-vrf: 56(84) bytes of data.
64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.228 ms
64 bytes from 172.16.2.2: icmp_seq=2 ttl=64 time=0.263 ms

and:

# ping -I br0 172.16.2.2
PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 br0: 56(84) bytes of data.
64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.227 ms
64 bytes from 172.16.2.2: icmp_seq=2 ttl=64 time=0.223 ms
^C
--- 172.16.2.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.223/0.225/0.227/0.002 ms

^ permalink raw reply [flat|nested] 17+ messages in thread
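The reproducer leaves state behind between runs. A teardown sketch, assuming the names used in the recipe (run as root); deleting netns foo destroys veth2 and, with it, its peer veth1, so the veth pair needs no separate cleanup:

```shell
# Tear down the netns/bridge/VRF reproducer.
ip li del br0        # removes the bridge (and its test-vrf enslavement)
ip li del test-vrf   # removes the VRF device and its FIB table binding
ip netns del foo     # removes veth2, which also destroys peer veth1
```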
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-07-23 22:00 ` David Ahern
@ 2018-07-24  1:43   ` D'Souza, Nelson
  2018-07-24 12:54     ` David Ahern
  0 siblings, 1 reply; 17+ messages in thread

From: D'Souza, Nelson @ 2018-07-24  1:43 UTC (permalink / raw)
To: David Ahern, netdev

Hi David,

I copied and pasted the configs onto my device, but pings on test-vrf do
not work in my setup. I'm essentially seeing the same issue as I reported
before.

In this case, pings sent out on test-vrf (host ns) are received and
replied to by the loopback interface (foo ns). Although the replies are
seen at the test-vrf level, they are not locally delivered to the ping
application.

Logs are as follows...

a) pings on test-vrf or br0 fail

# ping -I test-vrf 172.16.2.2 -c1 -w1
PING 172.16.2.2 (172.16.2.2): 56 data bytes
--- 172.16.2.2 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss

b) tcpdump in the foo namespace shows the ICMP echoes/replies on veth2

# ip netns exec foo tcpdump -i veth2 icmp -c 2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth2, link-type EN10MB (Ethernet), capture size 262144 bytes
18:34:13.205210 IP 172.16.1.1 > 172.16.2.2: ICMP echo request, id 19513, seq 0, length 64
18:34:13.205253 IP 172.16.2.2 > 172.16.1.1: ICMP echo reply, id 19513, seq 0, length 64
2 packets captured
2 packets received by filter
0 packets dropped by kernel

c) tcpdump in the host namespace shows the ICMP echoes/replies on test-vrf, br0 and veth1:

# tcpdump -i test-vrf icmp -c 2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on test-vrf, link-type EN10MB (Ethernet), capture size 262144 bytes
18:34:13.204061 IP 172.16.1.1 > 172.16.2.2: ICMP echo request, id 19513, seq 0, length 64
18:34:13.205278 IP 172.16.2.2 > 172.16.1.1: ICMP echo reply, id 19513, seq 0, length 64
2 packets captured
2 packets received by filter
0 packets dropped by kernel

Thanks,
Nelson

On 7/23/18, 3:00 PM, "David Ahern" <dsa@cumulusnetworks.com> wrote:

    [full quote of the previous message elided]

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-07-24  1:43 ` D'Souza, Nelson
@ 2018-07-24 12:54   ` David Ahern
  2018-07-24 15:58     ` D'Souza, Nelson
  0 siblings, 1 reply; 17+ messages in thread

From: David Ahern @ 2018-07-24 12:54 UTC (permalink / raw)
To: D'Souza, Nelson, netdev

On 7/23/18 7:43 PM, D'Souza, Nelson wrote:
> I copy and pasted the configs onto my device, but pings on test-vrf do
> not work in my setup. I'm essentially seeing the same issue as I
> reported before.
>
> In this case, pings sent out on test-vrf (host ns) are received and
> replied to by the loopback interface (foo ns). Although the replies are
> seen at the test-vrf level, they are not locally delivered to the ping
> application.

I just built a v4.14.52 kernel and ran those commands; it worked fine. It
is something specific to your environment. Is your shell tied to a VRF
(ip vrf id)?

After that, I suggest you create a VM running a newer distribution of
your choice (Ubuntu 17.10 or newer, Debian stretch with a 4.14 kernel, or
Fedora 26 or newer) and run the commands there.

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-07-24 12:54 ` David Ahern
@ 2018-07-24 15:58   ` D'Souza, Nelson
  2018-07-24 16:08     ` D'Souza, Nelson
  0 siblings, 1 reply; 17+ messages in thread

From: D'Souza, Nelson @ 2018-07-24 15:58 UTC (permalink / raw)
To: David Ahern, netdev

Thank you David, really appreciate the help. Most likely it is something
specific to my environment.

ip vrf id does not report anything on my system. Here's the result after
running the command:

# ip vrf id
#

I'll follow up with a VM.

Nelson

On 7/24/18, 5:55 AM, "David Ahern" <dsa@cumulusnetworks.com> wrote:

    [full quote of the previous message elided]

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-07-24 15:58 ` D'Souza, Nelson
@ 2018-07-24 16:08   ` D'Souza, Nelson
  2018-07-26  0:35     ` D'Souza, Nelson
  0 siblings, 1 reply; 17+ messages in thread

From: D'Souza, Nelson @ 2018-07-24 16:08 UTC (permalink / raw)
To: David Ahern, netdev

It's strange that enslaving eth1 -> br0 -> test-vrf does not work, but
enslaving eth1 -> test-vrf works fine.

Nelson

On 7/24/18, 8:58 AM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote:

    [full quotes of the previous messages elided]

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-07-24 16:08 ` D'Souza, Nelson
@ 2018-07-26  0:35   ` D'Souza, Nelson
  2018-07-26  0:44     ` D'Souza, Nelson
  2018-07-27 23:29     ` D'Souza, Nelson
  0 siblings, 2 replies; 17+ messages in thread

From: D'Souza, Nelson @ 2018-07-26  0:35 UTC (permalink / raw)
To: David Ahern, netdev

David,

I tried out the commands on an Ubuntu 17.10.1 VM. The pings on test-vrf
are successful, but the pings on br0 are not.

# uname -rv
4.13.0-21-generic #24-Ubuntu SMP Mon Dec 18 17:29:16 UTC 2017

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 17.10
Release:        17.10
Codename:       artful

# ip rule          --> Note: it's missing the l3mdev rule
0:      from all lookup local
32766:  from all lookup main
32767:  from all lookup default

Ran the configs from a bash script vrf.sh:

# ./vrf.sh
+ ip netns add foo
+ ip li add veth1 type veth peer name veth2
+ ip li set veth2 netns foo
+ ip -netns foo li set lo up
+ ip -netns foo li set veth2 up
+ ip -netns foo addr add 172.16.1.2/24 dev veth2
+ ip li add test-vrf type vrf table 123
+ ip li set test-vrf up
+ ip ro add vrf test-vrf unreachable default
+ ip li add br0 type bridge
+ ip li set veth1 master br0
+ ip li set veth1 up
+ ip li set br0 up
+ ip addr add dev br0 172.16.1.1/24
+ ip li set br0 master test-vrf
+ ip -netns foo addr add 172.16.2.2/32 dev lo
+ ip ro add vrf test-vrf 172.16.2.2/32 via 172.16.1.2

# ping -I test-vrf 172.16.2.2 -c 2        <<< successful on test-vrf
ping: Warning: source address might be selected on device other than test-vrf.
PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 test-vrf: 56(84) bytes of data.
64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.035 ms
64 bytes from 172.16.2.2: icmp_seq=2 ttl=64 time=0.045 ms

--- 172.16.2.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1022ms
rtt min/avg/max/mdev = 0.035/0.040/0.045/0.005 ms

# ping -I br0 172.16.2.2 -c 2             <<< fails on br0
PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 br0: 56(84) bytes of data.
--- 172.16.2.2 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1022ms

Please let me know if I should try a different version.

Nelson

On 7/24/18, 9:08 AM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote:

    [full quotes of the previous messages elided]

^ permalink raw reply [flat|nested] 17+ messages in thread
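The Ubuntu `ip rule` output above highlights the key diagnostic: with a VRF device present, `ip rule` should contain a `1000: from all lookup [l3mdev-table]` entry, and without it VRF-directed lookups never reach the VRF's table. A minimal check, shown here against canned sample output rather than a live `ip rule` call:

```shell
# Check whether the l3mdev FIB rule is installed. In real use, capture
# live output with:  rules=$(ip rule)
rules='0:      from all lookup local
1000:  from all lookup [l3mdev-table]
32766: from all lookup main
32767: from all lookup default'

if printf '%s\n' "$rules" | grep -q 'l3mdev-table'; then
    echo "l3mdev rule present"
else
    # If absent, it can be added by hand (assumes an iproute2 recent
    # enough to understand l3mdev rules):
    #   ip rule add l3mdev pref 1000
    echo "l3mdev rule missing"
fi
```

With the sample above the check prints "l3mdev rule present"; on the Ubuntu VM output quoted earlier it would report the rule missing.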
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge 2018-07-26 0:35 ` D'Souza, Nelson @ 2018-07-26 0:44 ` D'Souza, Nelson 2018-07-27 23:29 ` D'Souza, Nelson 1 sibling, 0 replies; 17+ messages in thread From: D'Souza, Nelson @ 2018-07-26 0:44 UTC (permalink / raw) To: David Ahern, netdev David, To narrow down on the issue, I've been requested by our kernel team for the following information: "Can you clarify what kernel configuration was used for the clean 4.14.52 kernel (no changes) The kernel configuration may be available in /proc/config.gz, or it might be available as a text file in the /boot directory." Would you be able to provide this? Nelson On 7/25/18, 5:35 PM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: David, I tried out the commands on an Ubuntu 17.10.1 VM. The pings on test-vrf are successful, but the pings on br0 are not successful. # uname -rv 4.13.0-21-generic #24-Ubuntu SMP Mon Dec 18 17:29:16 UTC 2017 # lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 17.10 Release: 17.10 Codename: artful # ip rule --> Note: its missing the l3mdev rule 0: from all lookup local 32766: from all lookup main 32767: from all lookup default Ran the configs from a bash script vrf.sh # ./vrf.sh + ip netns add foo + ip li add veth1 type veth peer name veth2 + ip li set veth2 netns foo + ip -netns foo li set lo up + ip -netns foo li set veth2 up + ip -netns foo addr add 172.16.1.2/24 dev veth2 + ip li add test-vrf type vrf table 123 + ip li set test-vrf up + ip ro add vrf test-vrf unreachable default + ip li add br0 type bridge + ip li set veth1 master br0 + ip li set veth1 up + ip li set br0 up + ip addr add dev br0 172.16.1.1/24 + ip li set br0 master test-vrf + ip -netns foo addr add 172.16.2.2/32 dev lo + ip ro add vrf test-vrf 172.16.2.2/32 via 172.16.1.2 # ping -I test-vrf 172.16.2.2 -c 2 <<< successful on test-vrf ping: Warning: source address might be selected on device other than test-vrf. 
PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 test-vrf: 56(84) bytes of data. 64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.035 ms 64 bytes from 172.16.2.2: icmp_seq=2 ttl=64 time=0.045 ms --- 172.16.2.2 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1022ms rtt min/avg/max/mdev = 0.035/0.040/0.045/0.005 ms #ping -I br0 172.16.2.2 -c 2 <<< fails on br0 PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 br0: 56(84) bytes of data. --- 172.16.2.2 ping statistics --- 2 packets transmitted, 0 received, 100% packet loss, time 1022ms Please let me know if I should try a different version. Nelson On 7/24/18, 9:08 AM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: It's strange that enslaving eth1 -> br0 -> test-vrf does not work, but enslaving eth1->test-vrf works fine. Nelson On 7/24/18, 8:58 AM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: Thank you David, really appreciate the help. Most likely something specific to my environment. ip vrf id, does not report anything on my system. Here's the result after running the command. # ip vrf id # I'll follow up with a VM. Nelson On 7/24/18, 5:55 AM, "David Ahern" <dsa@cumulusnetworks.com> wrote: On 7/23/18 7:43 PM, D'Souza, Nelson wrote: > I copy and pasted the configs onto my device, but pings on test-vrf do not work in my setup. > I'm essentially seeing the same issue as I reported before. > > In this case, pings sent out on test-vrf (host ns) are received and replied to by the loopback interface (foo ns). Although the replies are seen at the test-vrf level, they are not locally delivered to the ping application. > I just built v4.14.52 kernel and ran those commands - worked fine. It is something specific to your environment. Is your shell tied to a VRF -- (ip vrf id)? After that, I suggest you create a VM running a newer distribution of your choice (Ubuntu 17.10 or newer, debian stretch with 4.14 kernel, or Fedora 26 or newer) and run the commands there. 
^ permalink raw reply [flat|nested] 17+ messages in thread
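The kernel configuration asked about above can usually be recovered with something like the following (a sketch: /proc/config.gz exists only when the kernel was built with CONFIG_IKCONFIG_PROC=y, and the grepped option names are my guess at the VRF/bridge-relevant ones):

```shell
# Print the running kernel's build config, preferring /proc/config.gz and
# falling back to the copy most distros ship in /boot.
{ zcat /proc/config.gz 2>/dev/null || cat "/boot/config-$(uname -r)"; } |
  grep -E 'CONFIG_NET_VRF|CONFIG_NET_L3_MASTER_DEV|CONFIG_BRIDGE_NETFILTER'
```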
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-07-26  0:35 ` D'Souza, Nelson
  2018-07-26  0:44   ` D'Souza, Nelson
@ 2018-07-27 23:29   ` D'Souza, Nelson
  2018-08-02 23:12     ` D'Souza, Nelson
  1 sibling, 1 reply; 17+ messages in thread
From: D'Souza, Nelson @ 2018-07-27 23:29 UTC (permalink / raw)
To: David Ahern, netdev

David,

With Ubuntu 18.04.1 (kernel 4.15.0-29) pings sent out on test-vrf and br0
are successful.

# uname -rv
4.15.0-29-generic #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018

# ping -c 1 -I test-vrf 172.16.2.2
ping: Warning: source address might be selected on device other than test-vrf.
PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 test-vrf: 56(84) bytes of data.
64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.050 ms

--- 172.16.2.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.050/0.050/0.050/0.000 ms

# ping -c 1 -I br0 172.16.2.2
PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 br0: 56(84) bytes of data.
64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.026 ms

--- 172.16.2.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.026/0.026/0.026/0.000 ms

However, with Ubuntu 17.10.1 (kernel 4.13.0-21) only the pings on test-vrf
are successful; pings on br0 are not. So it seems there may be a change
after 4.13.0-21 that causes pings on br0 to pass.

Nelson

On 7/25/18, 5:35 PM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote:

David,

I tried out the commands on an Ubuntu 17.10.1 VM. The pings on test-vrf
are successful, but the pings on br0 are not.

# uname -rv
4.13.0-21-generic #24-Ubuntu SMP Mon Dec 18 17:29:16 UTC 2017

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu Description: Ubuntu 17.10 Release: 17.10 Codename: artful # ip rule --> Note: its missing the l3mdev rule 0: from all lookup local 32766: from all lookup main 32767: from all lookup default Ran the configs from a bash script vrf.sh # ./vrf.sh + ip netns add foo + ip li add veth1 type veth peer name veth2 + ip li set veth2 netns foo + ip -netns foo li set lo up + ip -netns foo li set veth2 up + ip -netns foo addr add 172.16.1.2/24 dev veth2 + ip li add test-vrf type vrf table 123 + ip li set test-vrf up + ip ro add vrf test-vrf unreachable default + ip li add br0 type bridge + ip li set veth1 master br0 + ip li set veth1 up + ip li set br0 up + ip addr add dev br0 172.16.1.1/24 + ip li set br0 master test-vrf + ip -netns foo addr add 172.16.2.2/32 dev lo + ip ro add vrf test-vrf 172.16.2.2/32 via 172.16.1.2 # ping -I test-vrf 172.16.2.2 -c 2 <<< successful on test-vrf ping: Warning: source address might be selected on device other than test-vrf. PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 test-vrf: 56(84) bytes of data. 64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.035 ms 64 bytes from 172.16.2.2: icmp_seq=2 ttl=64 time=0.045 ms --- 172.16.2.2 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1022ms rtt min/avg/max/mdev = 0.035/0.040/0.045/0.005 ms #ping -I br0 172.16.2.2 -c 2 <<< fails on br0 PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 br0: 56(84) bytes of data. --- 172.16.2.2 ping statistics --- 2 packets transmitted, 0 received, 100% packet loss, time 1022ms Please let me know if I should try a different version. Nelson On 7/24/18, 9:08 AM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: It's strange that enslaving eth1 -> br0 -> test-vrf does not work, but enslaving eth1->test-vrf works fine. Nelson On 7/24/18, 8:58 AM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: Thank you David, really appreciate the help. Most likely something specific to my environment. ip vrf id, does not report anything on my system. 
Here's the result after running the command. # ip vrf id # I'll follow up with a VM. Nelson On 7/24/18, 5:55 AM, "David Ahern" <dsa@cumulusnetworks.com> wrote: On 7/23/18 7:43 PM, D'Souza, Nelson wrote: > I copy and pasted the configs onto my device, but pings on test-vrf do not work in my setup. > I'm essentially seeing the same issue as I reported before. > > In this case, pings sent out on test-vrf (host ns) are received and replied to by the loopback interface (foo ns). Although the replies are seen at the test-vrf level, they are not locally delivered to the ping application. > I just built v4.14.52 kernel and ran those commands - worked fine. It is something specific to your environment. Is your shell tied to a VRF -- (ip vrf id)? After that, I suggest you create a VM running a newer distribution of your choice (Ubuntu 17.10 or newer, debian stretch with 4.14 kernel, or Fedora 26 or newer) and run the commands there. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-07-27 23:29 ` D'Souza, Nelson
@ 2018-08-02 23:12   ` D'Souza, Nelson
  2018-09-05 18:00     ` D'Souza, Nelson
  0 siblings, 1 reply; 17+ messages in thread
From: D'Souza, Nelson @ 2018-08-02 23:12 UTC (permalink / raw)
To: David Ahern, netdev

Hi David,

Turns out the VRF bridge Rx issue is triggered by a docker install. Docker
makes the following sysctl changes:

net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1   <<< exposes the ipv4 VRF Rx issue
                                             when a bridge is enslaved to a VRF

which causes packets flowing through all bridges to be subjected to netfilter
rules. This is required for bridge net filtering when ip forwarding is
enabled. Please refer to
https://github.com/docker/libnetwork/blob/master/drivers/bridge/setup_bridgenetfiltering.go#L53

Setting net.bridge.bridge-nf-call-iptables = 0 resolves the issue, but is not
really a viable option given that bridge net filtering is a basic requirement
in existing docker deployments.

It's not clear to me why this conf setting breaks local Rx delivery for a
bridge enslaved to a VRF, because these packets would always be sent up by
the bridge for IP netfilter processing.

This issue is easily reproducible on an Ubuntu 18.04.1 VM. Simply installing
docker will cause pings running on test-vrf to fail. Clearing the sysctl conf
restores Rx local delivery.

Thanks,
Nelson

On 7/27/18, 4:29 PM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote:

David,

With Ubuntu 18.04.1 (kernel 4.15.0-29) pings sent out on test-vrf and br0
are successful.

# uname -rv
4.15.0-29-generic #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018

# ping -c 1 -I test-vrf 172.16.2.2
ping: Warning: source address might be selected on device other than test-vrf.
PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 test-vrf: 56(84) bytes of data.
64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.050 ms --- 172.16.2.2 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.050/0.050/0.050/0.000 ms # ping -c 1 -I br0 172.16.2.2 PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 br0: 56(84) bytes of data. 64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.026 ms --- 172.16.2.2 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.026/0.026/0.026/0.000 ms However, with Ubuntu 17.10.1 (kernel 4.13.0-21) pings on only test-vrf are successful. Pings on br0 are not successful. So it seems like there maybe a change in versions after 4.13.0-21 that causes pings on br0 to pass. Nelson On 7/25/18, 5:35 PM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: David, I tried out the commands on an Ubuntu 17.10.1 VM. The pings on test-vrf are successful, but the pings on br0 are not successful. # uname -rv 4.13.0-21-generic #24-Ubuntu SMP Mon Dec 18 17:29:16 UTC 2017 # lsb_release -a No LSB modules are available. 
Distributor ID: Ubuntu Description: Ubuntu 17.10 Release: 17.10 Codename: artful # ip rule --> Note: its missing the l3mdev rule 0: from all lookup local 32766: from all lookup main 32767: from all lookup default Ran the configs from a bash script vrf.sh # ./vrf.sh + ip netns add foo + ip li add veth1 type veth peer name veth2 + ip li set veth2 netns foo + ip -netns foo li set lo up + ip -netns foo li set veth2 up + ip -netns foo addr add 172.16.1.2/24 dev veth2 + ip li add test-vrf type vrf table 123 + ip li set test-vrf up + ip ro add vrf test-vrf unreachable default + ip li add br0 type bridge + ip li set veth1 master br0 + ip li set veth1 up + ip li set br0 up + ip addr add dev br0 172.16.1.1/24 + ip li set br0 master test-vrf + ip -netns foo addr add 172.16.2.2/32 dev lo + ip ro add vrf test-vrf 172.16.2.2/32 via 172.16.1.2 # ping -I test-vrf 172.16.2.2 -c 2 <<< successful on test-vrf ping: Warning: source address might be selected on device other than test-vrf. PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 test-vrf: 56(84) bytes of data. 64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.035 ms 64 bytes from 172.16.2.2: icmp_seq=2 ttl=64 time=0.045 ms --- 172.16.2.2 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1022ms rtt min/avg/max/mdev = 0.035/0.040/0.045/0.005 ms #ping -I br0 172.16.2.2 -c 2 <<< fails on br0 PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 br0: 56(84) bytes of data. --- 172.16.2.2 ping statistics --- 2 packets transmitted, 0 received, 100% packet loss, time 1022ms Please let me know if I should try a different version. Nelson On 7/24/18, 9:08 AM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: It's strange that enslaving eth1 -> br0 -> test-vrf does not work, but enslaving eth1->test-vrf works fine. Nelson On 7/24/18, 8:58 AM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: Thank you David, really appreciate the help. Most likely something specific to my environment. ip vrf id, does not report anything on my system. 
Here's the result after running the command. # ip vrf id # I'll follow up with a VM. Nelson On 7/24/18, 5:55 AM, "David Ahern" <dsa@cumulusnetworks.com> wrote: On 7/23/18 7:43 PM, D'Souza, Nelson wrote: > I copy and pasted the configs onto my device, but pings on test-vrf do not work in my setup. > I'm essentially seeing the same issue as I reported before. > > In this case, pings sent out on test-vrf (host ns) are received and replied to by the loopback interface (foo ns). Although the replies are seen at the test-vrf level, they are not locally delivered to the ping application. > I just built v4.14.52 kernel and ran those commands - worked fine. It is something specific to your environment. Is your shell tied to a VRF -- (ip vrf id)? After that, I suggest you create a VM running a newer distribution of your choice (Ubuntu 17.10 or newer, debian stretch with 4.14 kernel, or Fedora 26 or newer) and run the commands there. ^ permalink raw reply [flat|nested] 17+ messages in thread
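The docker-installed sysctls described above can be inspected and, for testing only, reverted as follows (a sketch; it requires root, and the net.bridge.* keys exist only once the br_netfilter code is loaded):

```shell
# Show whether bridged IPv4 traffic is currently diverted through iptables.
sysctl net.bridge.bridge-nf-call-iptables

# Temporarily disable the hook to confirm it is what breaks VRF local Rx
# delivery -- per the thread, not viable as a fix on docker hosts.
sysctl -w net.bridge.bridge-nf-call-iptables=0
```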
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge 2018-08-02 23:12 ` D'Souza, Nelson @ 2018-09-05 18:00 ` D'Souza, Nelson 2018-09-07 0:26 ` David Ahern 0 siblings, 1 reply; 17+ messages in thread From: D'Souza, Nelson @ 2018-09-05 18:00 UTC (permalink / raw) To: David Ahern, netdev Hi David, Just following up.... would you be able to confirm that this is a Linux VRF issue? Also, how do I log a VRF related defect to ensure this gets resolved in a subsequent release. Thanks, Nelson On 8/2/18, 4:12 PM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: Hi David, Turns out the VRF bridge Rx issue is triggered by a docker install. Docker makes the following sysctl changes: net.bridge.bridge-nf-call-arptables = 1 net.bridge.bridge-nf-call-ip6tables = 1 net.bridge.bridge-nf-call-iptables = 1 <<< exposes the ipv4 VRF Rx issue when a bridge is enslaved to a VRF which causes packets flowing through all bridges to be subjected to netfilter rules. This is required for bridge net filtering when ip forwarding is enabled. Please refer to https://github.com/docker/libnetwork/blob/master/drivers/bridge/setup_bridgenetfiltering.go#L53 Setting net.bridge.bridge-nf-call-iptables = 0 resolves the issue, but is not really a viable option given that bridge net filtering is a basic requirement in existing docker deployments. It's not clear to me why this conf setting breaks local Rx delivery for a bridge enslaved to a VRF, because these packets would always be sent up by the bridge for IP netfilter processing. This issue is easily reproducible on an Ubuntu 18.04.1 VM. Simply installing docker will cause pings running on test-vrf to fail. Clearing the sysctl conf restores Rx local delivery. Thanks, Nelson On 7/27/18, 4:29 PM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: David, With Ubuntu 18.04.1 (kernel 4.15.0-29) pings sent out on test-vrf and br0 are successful. 
# uname -rv 4.15.0-29-generic #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018 # ping -c 1 -I test-vrf 172.16.2.2 ping: Warning: source address might be selected on device other than test-vrf. PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 test-vrf: 56(84) bytes of data. 64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.050 ms --- 172.16.2.2 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.050/0.050/0.050/0.000 ms # ping -c 1 -I br0 172.16.2.2 PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 br0: 56(84) bytes of data. 64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.026 ms --- 172.16.2.2 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.026/0.026/0.026/0.000 ms However, with Ubuntu 17.10.1 (kernel 4.13.0-21) pings on only test-vrf are successful. Pings on br0 are not successful. So it seems like there maybe a change in versions after 4.13.0-21 that causes pings on br0 to pass. Nelson On 7/25/18, 5:35 PM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: David, I tried out the commands on an Ubuntu 17.10.1 VM. The pings on test-vrf are successful, but the pings on br0 are not successful. # uname -rv 4.13.0-21-generic #24-Ubuntu SMP Mon Dec 18 17:29:16 UTC 2017 # lsb_release -a No LSB modules are available. 
Distributor ID: Ubuntu Description: Ubuntu 17.10 Release: 17.10 Codename: artful # ip rule --> Note: its missing the l3mdev rule 0: from all lookup local 32766: from all lookup main 32767: from all lookup default Ran the configs from a bash script vrf.sh # ./vrf.sh + ip netns add foo + ip li add veth1 type veth peer name veth2 + ip li set veth2 netns foo + ip -netns foo li set lo up + ip -netns foo li set veth2 up + ip -netns foo addr add 172.16.1.2/24 dev veth2 + ip li add test-vrf type vrf table 123 + ip li set test-vrf up + ip ro add vrf test-vrf unreachable default + ip li add br0 type bridge + ip li set veth1 master br0 + ip li set veth1 up + ip li set br0 up + ip addr add dev br0 172.16.1.1/24 + ip li set br0 master test-vrf + ip -netns foo addr add 172.16.2.2/32 dev lo + ip ro add vrf test-vrf 172.16.2.2/32 via 172.16.1.2 # ping -I test-vrf 172.16.2.2 -c 2 <<< successful on test-vrf ping: Warning: source address might be selected on device other than test-vrf. PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 test-vrf: 56(84) bytes of data. 64 bytes from 172.16.2.2: icmp_seq=1 ttl=64 time=0.035 ms 64 bytes from 172.16.2.2: icmp_seq=2 ttl=64 time=0.045 ms --- 172.16.2.2 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1022ms rtt min/avg/max/mdev = 0.035/0.040/0.045/0.005 ms #ping -I br0 172.16.2.2 -c 2 <<< fails on br0 PING 172.16.2.2 (172.16.2.2) from 172.16.1.1 br0: 56(84) bytes of data. --- 172.16.2.2 ping statistics --- 2 packets transmitted, 0 received, 100% packet loss, time 1022ms Please let me know if I should try a different version. Nelson On 7/24/18, 9:08 AM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: It's strange that enslaving eth1 -> br0 -> test-vrf does not work, but enslaving eth1->test-vrf works fine. Nelson On 7/24/18, 8:58 AM, "D'Souza, Nelson" <ndsouza@ciena.com> wrote: Thank you David, really appreciate the help. Most likely something specific to my environment. ip vrf id, does not report anything on my system. 
Here's the result after running the command. # ip vrf id # I'll follow up with a VM. Nelson On 7/24/18, 5:55 AM, "David Ahern" <dsa@cumulusnetworks.com> wrote: On 7/23/18 7:43 PM, D'Souza, Nelson wrote: > I copy and pasted the configs onto my device, but pings on test-vrf do not work in my setup. > I'm essentially seeing the same issue as I reported before. > > In this case, pings sent out on test-vrf (host ns) are received and replied to by the loopback interface (foo ns). Although the replies are seen at the test-vrf level, they are not locally delivered to the ping application. > I just built v4.14.52 kernel and ran those commands - worked fine. It is something specific to your environment. Is your shell tied to a VRF -- (ip vrf id)? After that, I suggest you create a VM running a newer distribution of your choice (Ubuntu 17.10 or newer, debian stretch with 4.14 kernel, or Fedora 26 or newer) and run the commands there. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-09-05 18:00 ` D'Souza, Nelson
@ 2018-09-07  0:26   ` David Ahern
  [not found]           ` <BYAPR04MB3797F5219C2164F3C0B6DE57AD000@BYAPR04MB3797.namprd04.prod.outlook.com>
  0 siblings, 1 reply; 17+ messages in thread
From: David Ahern @ 2018-09-07 0:26 UTC (permalink / raw)
To: D'Souza, Nelson, netdev

On 9/5/18 12:00 PM, D'Souza, Nelson wrote:
> Just following up.... would you be able to confirm that this is a Linux VRF issue?

I can confirm that I can reproduce the problem. Need to find time to dig
into it.

^ permalink raw reply	[flat|nested] 17+ messages in thread
[parent not found: <BYAPR04MB3797F5219C2164F3C0B6DE57AD000@BYAPR04MB3797.namprd04.prod.outlook.com>]
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  [not found] ` <BYAPR04MB3797F5219C2164F3C0B6DE57AD000@BYAPR04MB3797.namprd04.prod.outlook.com>
@ 2018-09-07 16:09   ` David Ahern
  2018-09-07 23:42     ` D'Souza, Nelson
  0 siblings, 1 reply; 17+ messages in thread
From: David Ahern @ 2018-09-07 16:09 UTC (permalink / raw)
To: D'Souza, Nelson, netdev; +Cc: Ido Schimmel

On 9/7/18 9:56 AM, D'Souza, Nelson wrote:
> ------------------------------------------------------------------------
> *From:* David Ahern <dsa@cumulusnetworks.com>
> *Sent:* Thursday, September 6, 2018 5:27 PM
> *To:* D'Souza, Nelson; netdev@vger.kernel.org
> *Subject:* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
>
> On 9/5/18 12:00 PM, D'Souza, Nelson wrote:
>> Just following up.... would you be able to confirm that this is a
> Linux VRF issue?
>
> I can confirm that I can reproduce the problem. Need to find time to dig
> into it.

The bridge's netfilter hook is dropping the packet.

The bridge's netfilter code registers hook operations that are invoked when
nf_hook is called; it then sees all subsequent calls to nf_hook. Packet-wise,
the bridge netfilter hook runs first: br_nf_pre_routing allocates nf_bridge,
sets in_prerouting to 1 and calls NF_HOOK for NF_INET_PRE_ROUTING. Its finish
function, br_nf_pre_routing_finish, then resets the in_prerouting flag to 0.
Any subsequent call to nf_hook invokes ip_sabotage_in. That function sees
that in_prerouting is not set and steals (drops) the packet.

The simplest change is to have ip_sabotage_in recognize that the bridge can
be enslaved to a VRF (L3 master device) and allow the packet to continue.

Thanks to Ido for the hint on ip_sabotage_in.
This patch works for me:

diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 6e0dc6bcd32a..37278dc280eb 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -835,7 +835,8 @@ static unsigned int ip_sabotage_in(void *priv,
 				   struct sk_buff *skb,
 				   const struct nf_hook_state *state)
 {
-	if (skb->nf_bridge && !skb->nf_bridge->in_prerouting) {
+	if (skb->nf_bridge && !skb->nf_bridge->in_prerouting &&
+	    !netif_is_l3_master(skb->dev)) {
 		state->okfn(state->net, state->sk, skb);
 		return NF_STOLEN;
 	}

^ permalink raw reply related	[flat|nested] 17+ messages in thread
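With the patch applied, the reproduction steps quoted earlier in the thread (reconstructed below from the `set -x` trace of vrf.sh) should show both pings succeeding even with net.bridge.bridge-nf-call-iptables = 1. This is a sketch for verification only; it needs root and a kernel with CONFIG_NET_VRF:

```shell
#!/bin/sh
# vrf.sh, reconstructed from the trace quoted earlier in this thread.
set -ex
ip netns add foo
ip li add veth1 type veth peer name veth2
ip li set veth2 netns foo
ip -netns foo li set lo up
ip -netns foo li set veth2 up
ip -netns foo addr add 172.16.1.2/24 dev veth2
ip li add test-vrf type vrf table 123
ip li set test-vrf up
ip ro add vrf test-vrf unreachable default
ip li add br0 type bridge
ip li set veth1 master br0
ip li set veth1 up
ip li set br0 up
ip addr add dev br0 172.16.1.1/24
ip li set br0 master test-vrf
ip -netns foo addr add 172.16.2.2/32 dev lo
ip ro add vrf test-vrf 172.16.2.2/32 via 172.16.1.2

# With the fix, both of these should succeed:
ping -I test-vrf 172.16.2.2 -c 2
ping -I br0 172.16.2.2 -c 2
```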
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
  2018-09-07 16:09 ` David Ahern
@ 2018-09-07 23:42   ` D'Souza, Nelson
  0 siblings, 0 replies; 17+ messages in thread
From: D'Souza, Nelson @ 2018-09-07 23:42 UTC (permalink / raw)
To: David Ahern, netdev; +Cc: Ido Schimmel

Thanks, David and Ido, for finding the root cause of bridge Rx packets
getting dropped, and for coming up with a patch.

Regards,
Nelson

On 9/7/18, 9:09 AM, "David Ahern" <dsa@cumulusnetworks.com> wrote:

On 9/7/18 9:56 AM, D'Souza, Nelson wrote:
> ------------------------------------------------------------------------
> *From:* David Ahern <dsa@cumulusnetworks.com>
> *Sent:* Thursday, September 6, 2018 5:27 PM
> *To:* D'Souza, Nelson; netdev@vger.kernel.org
> *Subject:* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
>
> On 9/5/18 12:00 PM, D'Souza, Nelson wrote:
>> Just following up.... would you be able to confirm that this is a
> Linux VRF issue?
>
> I can confirm that I can reproduce the problem. Need to find time to dig
> into it.

The bridge's netfilter hook is dropping the packet.

The bridge's netfilter code registers hook operations that are invoked when
nf_hook is called; it then sees all subsequent calls to nf_hook. Packet-wise,
the bridge netfilter hook runs first: br_nf_pre_routing allocates nf_bridge,
sets in_prerouting to 1 and calls NF_HOOK for NF_INET_PRE_ROUTING. Its finish
function, br_nf_pre_routing_finish, then resets the in_prerouting flag to 0.
Any subsequent call to nf_hook invokes ip_sabotage_in. That function sees
that in_prerouting is not set and steals (drops) the packet.

The simplest change is to have ip_sabotage_in recognize that the bridge can
be enslaved to a VRF (L3 master device) and allow the packet to continue.

Thanks to Ido for the hint on ip_sabotage_in.
This patch works for me:

diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 6e0dc6bcd32a..37278dc280eb 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -835,7 +835,8 @@ static unsigned int ip_sabotage_in(void *priv,
 				   struct sk_buff *skb,
 				   const struct nf_hook_state *state)
 {
-	if (skb->nf_bridge && !skb->nf_bridge->in_prerouting) {
+	if (skb->nf_bridge && !skb->nf_bridge->in_prerouting &&
+	    !netif_is_l3_master(skb->dev)) {
 		state->okfn(state->net, state->sk, skb);
 		return NF_STOLEN;
 	}

^ permalink raw reply	[flat|nested] 17+ messages in thread
end of thread, other threads:[~2018-09-08  4:26 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <9E3A1F70-E009-49DC-B639-B48B28F99C52@ciena.com>
2018-07-20  3:37 ` VRF with enslaved L3 enabled bridge David Ahern
2018-07-20 19:03   ` [**EXTERNAL**] " D'Souza, Nelson
2018-07-20 19:11     ` David Ahern
2018-07-20 21:59       ` [**EXTERNAL**] " D'Souza, Nelson
2018-07-23 22:00         ` David Ahern
2018-07-24  1:43           ` D'Souza, Nelson
2018-07-24 12:54             ` David Ahern
2018-07-24 15:58               ` D'Souza, Nelson
2018-07-24 16:08                 ` D'Souza, Nelson
2018-07-26  0:35                   ` D'Souza, Nelson
2018-07-26  0:44                     ` D'Souza, Nelson
2018-07-27 23:29                     ` D'Souza, Nelson
2018-08-02 23:12                       ` D'Souza, Nelson
2018-09-05 18:00                         ` D'Souza, Nelson
2018-09-07  0:26                           ` David Ahern
     [not found]                             ` <BYAPR04MB3797F5219C2164F3C0B6DE57AD000@BYAPR04MB3797.namprd04.prod.outlook.com>
2018-09-07 16:09                               ` David Ahern
2018-09-07 23:42                                 ` D'Souza, Nelson