All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Eugene Crosser <crosser@average.org>,
	David Ahern <dsahern@kernel.org>,
	"David S . Miller" <davem@davemloft.net>,
	Sasha Levin <sashal@kernel.org>,
	kuba@kernel.org, netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 5.14 14/18] vrf: Revert "Reset skb conntrack connection..."
Date: Mon, 25 Oct 2021 12:59:27 -0400	[thread overview]
Message-ID: <20211025165939.1393655-14-sashal@kernel.org> (raw)
In-Reply-To: <20211025165939.1393655-1-sashal@kernel.org>

From: Eugene Crosser <crosser@average.org>

[ Upstream commit 55161e67d44fdd23900be166a81e996abd6e3be9 ]

This reverts commit 09e856d54bda5f288ef8437a90ab2b9b3eab83d1.

When an interface is enslaved in a VRF, prerouting conntrack hook is
called twice: once in the context of the original input interface, and
once in the context of the VRF interface. If no special precausions are
taken, this leads to creation of two conntrack entries instead of one,
and breaks SNAT.

Commit above was intended to avoid creation of extra conntrack entries
when input interface is enslaved in a VRF. It did so by resetting
conntrack related data associated with the skb when it enters VRF context.

However it breaks netfilter operation. Imagine a use case when conntrack
zone must be assigned based on the original input interface, rather than
VRF interface (that would make original interfaces indistinguishable). One
could create netfilter rules similar to these:

        chain rawprerouting {
                type filter hook prerouting priority raw;
                iif realiface1 ct zone set 1 return
                iif realiface2 ct zone set 2 return
        }

This works before the mentioned commit, but not after: zone assignment
is "forgotten", and any subsequent NAT or filtering that is dependent
on the conntrack zone does not work.

Here is a reproducer script that demonstrates the difference in behaviour.

==========
#!/bin/sh

# This script demonstrates unexpected change of nftables behaviour
# caused by commit 09e856d54bda5f28 ""vrf: Reset skb conntrack
# connection on VRF rcv"
# https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=09e856d54bda5f288ef8437a90ab2b9b3eab83d1
#
# Before the commit, it was possible to assign conntrack zone to a
# packet (or mark it for `notracking`) in the prerouting chanin, raw
# priority, based on the `iif` (interface from which the packet
# arrived).
# After the change, # if the interface is enslaved in a VRF, such
# assignment is lost. Instead, assignment based on the `iif` matching
# the VRF master interface is honored. Thus it is impossible to
# distinguish packets based on the original interface.
#
# This script demonstrates this change of behaviour: conntrack zone 1
# or 2 is assigned depending on the match with the original interface
# or the vrf master interface. It can be observed that conntrack entry
# appears in different zone in the kernel versions before and after
# the commit.

IPIN=172.30.30.1
IPOUT=172.30.30.2
PFXL=30

ip li sh vein >/dev/null 2>&1 && ip li del vein
ip li sh tvrf >/dev/null 2>&1 && ip li del tvrf
nft list table testct >/dev/null 2>&1 && nft delete table testct

ip li add vein type veth peer veout
ip li add tvrf type vrf table 9876
ip li set veout master tvrf
ip li set vein up
ip li set veout up
ip li set tvrf up
/sbin/sysctl -w net.ipv4.conf.veout.accept_local=1
/sbin/sysctl -w net.ipv4.conf.veout.rp_filter=0
ip addr add $IPIN/$PFXL dev vein
ip addr add $IPOUT/$PFXL dev veout

nft -f - <<__END__
table testct {
	chain rawpre {
		type filter hook prerouting priority raw;
		iif { veout, tvrf } meta nftrace set 1
		iif veout ct zone set 1 return
		iif tvrf ct zone set 2 return
		notrack
	}
	chain rawout {
		type filter hook output priority raw;
		notrack
	}
}
__END__

uname -rv
conntrack -F
ping -W 1 -c 1 -I vein $IPOUT
conntrack -L

Signed-off-by: Eugene Crosser <crosser@average.org>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/vrf.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 8bbe2a7bb141..2b1b944d4b28 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -1367,8 +1367,6 @@ static struct sk_buff *vrf_ip6_rcv(struct net_device *vrf_dev,
 	bool need_strict = rt6_need_strict(&ipv6_hdr(skb)->daddr);
 	bool is_ndisc = ipv6_ndisc_frame(skb);
 
-	nf_reset_ct(skb);
-
 	/* loopback, multicast & non-ND link-local traffic; do not push through
 	 * packet taps again. Reset pkt_type for upper layers to process skb.
 	 * For strict packets with a source LLA, determine the dst using the
@@ -1431,8 +1429,6 @@ static struct sk_buff *vrf_ip_rcv(struct net_device *vrf_dev,
 	skb->skb_iif = vrf_dev->ifindex;
 	IPCB(skb)->flags |= IPSKB_L3SLAVE;
 
-	nf_reset_ct(skb);
-
 	if (ipv4_is_multicast(ip_hdr(skb)->daddr))
 		goto out;
 
-- 
2.33.0


  parent reply	other threads:[~2021-10-25 17:00 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-25 16:59 [PATCH AUTOSEL 5.14 01/18] KVM: arm64: Report corrupted refcount at EL2 Sasha Levin
2021-10-25 16:59 ` Sasha Levin
2021-10-25 16:59 ` Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 02/18] ASoC: soc-core: fix null-ptr-deref in snd_soc_del_component_unlocked() Sasha Levin
2021-10-25 16:59   ` Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 03/18] ASoC: cs42l42: Ensure 0dB full scale volume is used for headsets Sasha Levin
2021-10-25 16:59   ` Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 04/18] scsi: core: Put LLD module refcnt after SCSI device is released Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 05/18] ALSA: hda/realtek: Fixes HP Spectre x360 15-eb1xxx speakers Sasha Levin
2021-10-25 16:59   ` Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 06/18] ptp: fix error print of ptp_kvm on X86_64 platform Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 07/18] net: sparx5: Add of_node_put() before goto Sasha Levin
2021-10-25 16:59   ` Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 08/18] net: mscc: ocelot: " Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 09/18] cavium: Return negative value when pci_alloc_irq_vectors() fails Sasha Levin
2021-10-25 16:59   ` Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 10/18] scsi: qla2xxx: Return -ENOMEM if kzalloc() fails Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 11/18] scsi: qla2xxx: Fix unmap of already freed sgl Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 12/18] mISDN: Fix return values of the probe function Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 13/18] cavium: " Sasha Levin
2021-10-25 16:59   ` Sasha Levin
2021-10-25 16:59 ` Sasha Levin [this message]
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 15/18] sfc: Export fibre-specific supported link modes Sasha Levin
2021-10-25 18:24   ` Erik Ekman
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 16/18] sfc: Don't use netif_info before net_device setup Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 17/18] usbnet: sanity check for maxpacket Sasha Levin
2021-10-25 16:59 ` [PATCH AUTOSEL 5.14 18/18] hyperv/vmbus: include linux/bitops.h Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211025165939.1393655-14-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=crosser@average.org \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.