Re: VRF Issue Since kernel 5

From: Gowen <gowen@potatocomputing.co.uk>
To: Alexis Bauvin <abauvin@online.net>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: VRF Issue Since kernel 5
Date: Tue, 10 Sep 2019 14:22:39 +0000
Message-ID: <CWLP265MB1554D3C90B56AB5A6EC23771FDB60@CWLP265MB1554.GBRP265.PROD.OUTLOOK.COM>
In-Reply-To: <CWLP265MB1554989CAA2DB59B6862A6A2FDB70@CWLP265MB1554.GBRP265.PROD.OUTLOOK.COM>

Hi Alexis,

I enabled the TRACE target and found that the packet is passing through the security table, which I thought was for SELinux only. As far as I can tell the config is working and the packet is being seen by iptables, but for some reason it is not getting accepted by the local process, which surely isn't right. TRACE output for the 91.0.0.0/8 subnet (used for the updates) is below:

Sep 10 13:50:37 NETM06 kernel: [442740.425992] TRACE: raw:PREROUTING:policy:2 IN=mgmt-vrf OUT= MAC=00:22:48:07:cc:ad:74:83:ef:a9:ca:c1:08:00 SRC=91.189.88.24 DST=10.24.12.10 LEN=60 TOS=0x00 PREC=0x00 TTL=51 ID=0 DF PROTO=TCP SPT=80 DPT=40164 SEQ=2210516855 ACK=3954601288 WINDOW=28960 RES=0x00 ACK SYN URGP=0 OPT (0204058A0402080AD622157697AC236D01030307)
Sep 10 13:50:37 NETM06 kernel: [442740.426045] TRACE: filter:INPUT:rule:1 IN=mgmt-vrf OUT= MAC=00:22:48:07:cc:ad:74:83:ef:a9:ca:c1:08:00 SRC=91.189.88.24 DST=10.24.12.10 LEN=60 TOS=0x00 PREC=0x00 TTL=51 ID=0 DF PROTO=TCP SPT=80 DPT=40164 SEQ=2210516855 ACK=3954601288 WINDOW=28960 RES=0x00 ACK SYN URGP=0 OPT (0204058A0402080AD622157697AC236D01030307)
Sep 10 13:50:37 NETM06 kernel: [442740.426060] TRACE: security:INPUT:rule:1 IN=mgmt-vrf OUT= MAC=00:22:48:07:cc:ad:74:83:ef:a9:ca:c1:08:00 SRC=91.189.88.24 DST=10.24.12.10 LEN=60 TOS=0x00 PREC=0x00 TTL=51 ID=0 DF PROTO=TCP SPT=80 DPT=40164 SEQ=2210516855 ACK=3954601288 WINDOW=28960 RES=0x00 ACK SYN URGP=0 OPT (0204058A0402080AD622157697AC236D01030307)
Sep 10 13:50:37 NETM06 kernel: [442740.426108] TRACE: security:INPUT:policy:2 IN=mgmt-vrf OUT= MAC=00:22:48:07:cc:ad:74:83:ef:a9:ca:c1:08:00 SRC=91.189.88.24 DST=10.24.12.10 LEN=60 TOS=0x00 PREC=0x00 TTL=51 ID=0 DF PROTO=TCP SPT=80 DPT=40164 SEQ=2210516855 ACK=3954601288 WINDOW=28960 RES=0x00 ACK SYN URGP=0 OPT (0204058A0402080AD622157697AC236D01030307)

Admin@NETM06:~$ sudo iptables -L PREROUTING -t raw  -n -v
Chain PREROUTING (policy ACCEPT 56061 packets, 5260K bytes)
 pkts bytes target     prot opt in     out     source               destination
  296 16480 TRACE      tcp  --  mgmt-vrf *       91.0.0.0/8           0.0.0.0/0            ctstate RELATED,ESTABLISHED tcp spt:80

Chain INPUT (policy DROP 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination
1      330 18260 ACCEPT     tcp  --  mgmt-vrf *       91.0.0.0/8           0.0.0.0/0            ctstate RELATED,ESTABLISHED tcp spt:80

Admin@NETM06:~$ sudo iptables -L  -t security  -n -v --line-numbers
Chain INPUT (policy ACCEPT 4190 packets, 371K bytes)
num   pkts bytes target     prot opt in     out     source               destination
1      248 13980 LOG        all  --  *      *       91.0.0.0/8           0.0.0.0/0            LOG flags 0 level 4 prefix "LOG-SECURITY"
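
The TRACE lines above can be condensed into the sequence of table/chain hops the packet took. This is a minimal Python sketch, assuming the kernel log prefix format shown above (`TRACE: table:chain:type:rulenum`). The name `trace_path` is illustrative and not part of any tool, and the sample lines are abbreviated copies of the four log entries.

```python
import re

# Matches the "TRACE: table:chain:type:rulenum" prefix seen in the kernel log.
TRACE_RE = re.compile(r"TRACE: (\w+):(\w+):(\w+):(\d+)")

def trace_path(log_lines):
    """Return (table, chain, type, rule number) for each TRACE line, in order."""
    hops = []
    for line in log_lines:
        m = TRACE_RE.search(line)
        if m:
            table, chain, kind, num = m.groups()
            hops.append((table, chain, kind, int(num)))
    return hops

# Abbreviated copies of the four log lines from the message above.
sample = [
    "kernel: [442740.425992] TRACE: raw:PREROUTING:policy:2 IN=mgmt-vrf OUT=",
    "kernel: [442740.426045] TRACE: filter:INPUT:rule:1 IN=mgmt-vrf OUT=",
    "kernel: [442740.426060] TRACE: security:INPUT:rule:1 IN=mgmt-vrf OUT=",
    "kernel: [442740.426108] TRACE: security:INPUT:policy:2 IN=mgmt-vrf OUT=",
]

for hop in trace_path(sample):
    print("%s:%s %s %d" % hop)
```

Read against the rule listings above, the packet matches rule 1 of filter INPUT (the ACCEPT rule), then the LOG rule in security INPUT, and finally reaches the ACCEPT policy of security INPUT. On that reading, nothing in this trace shows netfilter dropping it.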

From: Gowen
Sent: 09 September 2019 20:43
To: Alexis Bauvin <abauvin@online.net>
Cc: netdev@vger.kernel.org <netdev@vger.kernel.org>
Subject: RE: VRF Issue Since kernel 5

Hi Alexis,

I did this earlier today and it made no difference.

Tomorrow I'll add some conntrack rules to check whether the return traffic is reaching the INPUT chain and whether it matches any of them. If not, do you have any hints or techniques I can use to find the source of the issue?

Gareth

-----Original Message-----
From: Alexis Bauvin <abauvin@online.net>
Sent: 09 September 2019 13:02
To: Gowen <gowen@potatocomputing.co.uk>
Cc: netdev@vger.kernel.org
Subject: Re: VRF Issue Since kernel 5

Hi,

I guess all routing from the management VRF itself is working correctly (i.e. cURLing an IP from this VRF or digging any DNS), and it is your route leakage that’s at fault.

Could you try swapping the local and l3mdev rules?

`ip rule del pref 0; ip rule add from all lookup local pref 1001`

I ran into security issues and odd behaviour with the default kernel rule ordering where the default VRF is concerned.
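
The effect of that swap on lookup order can be modelled in a few lines. This is only a sketch of the ordering, assuming the four rules Gareth reports from `ip rule show`. It does not touch the kernel, and the function names are illustrative. Policy rules are consulted in increasing preference, so moving the local rule from 0 to 1001 puts the l3mdev (VRF) table ahead of the local table.

```python
# Policy rules as (preference, action) pairs, as reported by `ip rule show`
# in this thread; the kernel consults lower preferences first.
default_rules = [
    (0, "lookup local"),
    (1000, "lookup [l3mdev-table]"),
    (32766, "lookup main"),
    (32767, "lookup default"),
]

def swap_local_rule(rules, new_pref=1001):
    """Model `ip rule del pref 0; ip rule add from all lookup local pref 1001`."""
    kept = [rule for rule in rules if rule[0] != 0]
    kept.append((new_pref, "lookup local"))
    return sorted(kept)

def lookup_order(rules):
    """Return the actions in the order the kernel would consult them."""
    return [action for _, action in sorted(rules)]

print(lookup_order(default_rules))
print(lookup_order(swap_local_rule(default_rules)))
```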

Alexis

> On 9 Sep 2019, at 12:53, Gowen <gowen@potatocomputing.co.uk> wrote:
> 
> Hi Alexis,
> 
> Admin@NETM06:~$ sysctl net.ipv4.tcp_l3mdev_accept
> net.ipv4.tcp_l3mdev_accept = 1
> 
> Admin@NETM06:~$ sudo ip vrf exec mgmt-vrf curl kernel.org
> curl: (6) Could not resolve host: kernel.org
> 
> The failure to resolve is the same for all DNS lookups from any process I've run.
> 
> The route is there from the guide I originally used. I can't remember its purpose, but I know I don't need it. I've removed it now and there is no change.
> 
> Admin@NETM06:~$ ip rule show
> 0:      from all lookup local
> 1000:   from all lookup [l3mdev-table]
> 32766:  from all lookup main
> 32767:  from all lookup default
> 
> I could switch the VRFs over, but this is a test box and I have prod boxes on this setup as well, so I'm not keen on that if I can avoid it.
> 
> From what I can tell, because the TCP return traffic is met with an RST, it looks like it may be something to do with iptables. But even if I set the policy to ACCEPT and flush all the rules, the behaviour remains the same.
> 
> Is it possible that the TCP stack isn't aware of the session (because it is mapped to the wrong VRF internally, or something to that effect) and is therefore sending the RST?
> 
> Gareth
> 
> From: Alexis Bauvin <abauvin@online.net>
> Sent: 09 September 2019 10:28
> To: Gowen <gowen@potatocomputing.co.uk>
> Cc: netdev@vger.kernel.org <netdev@vger.kernel.org>
> Subject: Re: VRF Issue Since kernel 5
> 
> Hi,
> 
> There have been some changes regarding VRF isolation in Linux 5 IIRC,
> namely proper isolation of the default VRF.
> 
> Some things you may try:
> 
> - looking at the l3mdev_accept sysctls (e.g. `net.ipv4.tcp_l3mdev_accept`)
> - querying stuff from the management vrf through `ip vrf exec vrf-mgmt <stuff>`
>   e.g. `ip vrf exec vrf-mgmt curl kernel.org`
>        `ip vrf exec vrf-mgmt dig @1.1.1.1 kernel.org`
> - reversing your logic: default VRF is your management one, the other one is for your
>   other boxes
> 
> Also, your `unreachable default metric 4278198272` route looks odd to me.
> 
> What are your routing rules? (`ip rule`)
> 
> Alexis
> 
> > On 9 Sep 2019, at 09:46, Gowen <gowen@potatocomputing.co.uk> wrote:
> > 
> > Hi there,
> > 
> > Dave A said this was the mailing list to send this to.
> > 
> > I've been using my management interface in a VRF for several months now and it has worked perfectly. I've been able to update/upgrade the packages just fine, and iptables works excellently with it, exactly as I needed.
> > 
> > Since kernel 5, though, I am no longer able to update. The issue is quite a curious one, as some traffic appears to be fine (DNS lookups use the VRF correctly) but other traffic doesn't (updating/upgrading the packages).
> > 
> > I have 2 interfaces on this device:
> > Eth0 for management - inbound SSH, DNS, updates/upgrades
> > Eth1 for managing other boxes (ansible using SSH)
> > 
> > Link and addr info shown below:
> > 
> > Admin@NETM06:~$ ip link show
> > 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
> >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> > 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master mgmt-vrf state UP mode DEFAULT group default qlen 1000
> >     link/ether 00:22:48:07:cc:ad brd ff:ff:ff:ff:ff:ff
> > 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
> >     link/ether 00:22:48:07:c9:6c brd ff:ff:ff:ff:ff:ff
> > 4: mgmt-vrf: <NOARP,MASTER,UP,LOWER_UP> mtu 65536 qdisc noqueue state UP mode DEFAULT group default qlen 1000
> >     link/ether 8a:f6:26:65:02:5a brd ff:ff:ff:ff:ff:ff
> > 
> > Admin@NETM06:~$ ip addr
> > 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
> >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >     inet 127.0.0.1/8 scope host lo
> >        valid_lft forever preferred_lft forever
> >     inet6 ::1/128 scope host
> >        valid_lft forever preferred_lft forever
> > 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master mgmt-vrf state UP group default qlen 1000
> >     link/ether 00:22:48:07:cc:ad brd ff:ff:ff:ff:ff:ff
> >     inet 10.24.12.10/24 brd 10.24.12.255 scope global eth0
> >        valid_lft forever preferred_lft forever
> >     inet6 fe80::222:48ff:fe07:ccad/64 scope link
> >        valid_lft forever preferred_lft forever
> > 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
> >     link/ether 00:22:48:07:c9:6c brd ff:ff:ff:ff:ff:ff
> >     inet 10.24.12.9/24 brd 10.24.12.255 scope global eth1
> >        valid_lft forever preferred_lft forever
> >     inet6 fe80::222:48ff:fe07:c96c/64 scope link
> >        valid_lft forever preferred_lft forever
> > 4: mgmt-vrf: <NOARP,MASTER,UP,LOWER_UP> mtu 65536 qdisc noqueue state UP group default qlen 1000
> >     link/ether 8a:f6:26:65:02:5a brd ff:ff:ff:ff:ff:ff
> > 
> > The production traffic is all in the 10.0.0.0/8 network (eth1, global VRF) except for a few subnets (DNS) which are routed out eth0 (mgmt-vrf).
> > 
> > Admin@NETM06:~$ ip route show
> > default via 10.24.12.1 dev eth0
> > 10.0.0.0/8 via 10.24.12.1 dev eth1
> > 10.24.12.0/24 dev eth1 proto kernel scope link src 10.24.12.9
> > 10.24.65.0/24 via 10.24.12.1 dev eth0
> > 10.25.65.0/24 via 10.24.12.1 dev eth0
> > 10.26.0.0/21 via 10.24.12.1 dev eth0
> > 10.26.64.0/21 via 10.24.12.1 dev eth0
> > 
> > Admin@NETM06:~$ ip route show vrf mgmt-vrf
> > default via 10.24.12.1 dev eth0
> > unreachable default metric 4278198272
> > 10.24.12.0/24 dev eth0 proto kernel scope link src 10.24.12.10
> > 10.24.65.0/24 via 10.24.12.1 dev eth0
> > 10.25.65.0/24 via 10.24.12.1 dev eth0
> > 10.26.0.0/21 via 10.24.12.1 dev eth0
> > 10.26.64.0/21 via 10.24.12.1 dev eth0
> > 
> > The strange activity occurs when I run "sudo apt update": I can resolve the DNS request (10.24.65.203 or 10.25.65.203, verified with tcpdump) out eth0, but there is no activity for the actual update traffic:
> > 
> > sudo tcpdump -i eth0 '(host 10.24.65.203 or host 10.25.65.203) and port 53' -n
> > <OUTPUT OMITTED FOR BREVITY>
> > 10:06:05.268735 IP 10.24.12.10.39963 > 10.24.65.203.53: 48798+ [1au] A? security.ubuntu.com. (48)
> > <OUTPUT OMITTED FOR BREVITY>
> > 10:06:05.284403 IP 10.24.65.203.53 > 10.24.12.10.39963: 48798 13/0/1 A 91.189.91.23, A 91.189.88.24, A 91.189.91.26, A 91.189.88.162, A 91.189.88.149, A 91.189.91.24, A 91.189.88.173, A 91.189.88.177, A 91.189.88.31, A 91.189.91.14, A 91.189.88.176, A 91.189.88.175, A 91.189.88.174 (256)
> > 
> > You can see that the update traffic is returned but is not accepted by the stack, and an RST is sent:
> > 
> > Admin@NETM06:~$ sudo tcpdump -i eth0 '(not host 168.63.129.16 and port 80)' -n
> > tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> > listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
> > 10:17:12.690658 IP 10.24.12.10.40216 > 91.189.88.175.80: Flags [S], seq 2279624826, win 64240, options [mss 1460,sackOK,TS val 2029365856 ecr 0,nop,wscale 7], length 0
> > 10:17:12.691929 IP 10.24.12.10.52362 > 91.189.95.83.80: Flags [S], seq 1465797256, win 64240, options [mss 1460,sackOK,TS val 3833463674 ecr 0,nop,wscale 7], length 0
> > 10:17:12.696270 IP 91.189.88.175.80 > 10.24.12.10.40216: Flags [S.], seq 968450722, ack 2279624827, win 28960, options [mss 1418,sackOK,TS val 81957103 ecr 2029365856,nop,wscale 7], length 0
> > 10:17:12.696301 IP 10.24.12.10.40216 > 91.189.88.175.80: Flags [R], seq 2279624827, win 0, length 0
> > 10:17:12.697884 IP 91.189.95.83.80 > 10.24.12.10.52362: Flags [S.], seq 4148330738, ack 1465797257, win 28960, options [mss 1418,sackOK,TS val 2257624414 ecr 3833463674,nop,wscale 8], length 0
> > 10:17:12.697909 IP 10.24.12.10.52362 > 91.189.95.83.80: Flags [R], seq 1465797257, win 0, length 0
> > 
> > I can emulate the DNS lookup using netcat in the VRF:
> > 
> > sudo ip vrf exec mgmt-vrf nc -u 10.24.65.203 53
> > 
> > then interactively enter the binary for a www.google.co.uk request:
> > 
> > 0035624be394010000010000000000010377777706676f6f676c6502636f02756b00000100010000290200000000000000
> > 
> > This returns as expected:
> > 
> > 00624be394010000010000000000010377777706676f6f676c6502636f02756b00000100010000290200000000000000
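
The request bytes quoted above can be decoded to check what is actually being sent. This is a rough Python sketch, not a full DNS parser. It assumes, as an interpretation rather than something stated in the thread, that the first four bytes are the tail of a UDP header (length 0x0035 = 53 and a checksum) and that the DNS message starts at byte 4. The helper name `decode_query` is illustrative.

```python
import struct

# Request bytes as pasted in the original message.
RAW = bytes.fromhex(
    "0035624be394010000010000000000010377777706676f6f676c6502636f02756b00"
    "000100010000290200000000000000"
)

def decode_query(payload):
    """Decode the 12-byte DNS header and the first question of a message."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", payload[:12])
    labels, pos = [], 12
    while payload[pos] != 0:  # a zero length byte ends the name
        length = payload[pos]
        labels.append(payload[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    qtype, qclass = struct.unpack("!2H", payload[pos + 1:pos + 5])
    return {"id": ident, "flags": flags, "qdcount": qd, "arcount": ar,
            "name": ".".join(labels), "qtype": qtype, "qclass": qclass}

# Assumption: bytes 0-3 are the UDP length and checksum, not DNS data.
udp_length, udp_checksum = struct.unpack("!2H", RAW[:4])
query = decode_query(RAW[4:])

print(udp_length, len(RAW[4:]) + 8)  # claimed length vs. 8-byte header + payload
print(query)
```

Under that assumption the length field agrees with the payload size, and the question decodes to an A/IN query for www.google.co.uk with one additional (EDNS OPT) record.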

> > 

> > 

> > I can run:
> > 
> > Admin@NETM06:~$ host www.google.co.uk
> > www.google.co.uk has address 172.217.169.3
> > www.google.co.uk has IPv6 address 2a00:1450:4009:80d::2003
> > 
> > but I get a timeout for:
> > 
> > sudo ip vrf exec mgmt-vrf host www.google.co.uk
> > ;; connection timed out; no servers could be reached
> > 
> > However, I can take a repo address and vrf exec to it on port 80:
> > 
> > Admin@NETM06:~$ sudo ip vrf exec mgmt-vrf nc 91.189.91.23 80
> > hello
> > HTTP/1.1 400 Bad Request
> > <OUTPUT OMITTED>
> > 
> > My iptables rules:
> > 
> > sudo iptables -Z
> > Admin@NETM06:~$ sudo iptables -L -v
> > Chain INPUT (policy DROP 16 packets, 3592 bytes)
> >  pkts bytes target     prot opt in     out     source               destination
> >    44  2360 ACCEPT     tcp  --  any    any     anywhere             anywhere             tcp spt:http ctstate RELATED,ESTABLISHED
> >    83 10243 ACCEPT     udp  --  any    any     anywhere             anywhere             udp spt:domain ctstate RELATED,ESTABLISHED
> > 
> > I cannot find out why the update isn't working. Any help greatly appreciated.
> > 
> > Kind Regards,
> > 
> > Gareth