netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH nf-next 0/2] netfilter: conntrack: do not renew timeout while in tcp SYN_SENT state
@ 2021-06-24 10:36 Florian Westphal
  2021-06-24 10:36 ` [PATCH nf-next 1/2] selftest: netfilter: add test case for unreplied tcp connections Florian Westphal
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Florian Westphal @ 2021-06-24 10:36 UTC (permalink / raw)
  To: netfilter-devel; +Cc: kadlec, Florian Westphal

Antonio Ojea reported a problem with a container environment where
connection retries prevent expiry of a SYN_SENT conntrack entry.

This in turn prevents a NAT rule from becoming active.

Consider:
  client -----> conntrack ---> Host

client sends a SYN, but $Host is unreachable/silent.

In the reported case, $host address doesn't exist at all --
its a 'virtual' ip that is made accessible via dnat/redirect.

The routing table even passes the packet back via the same interface
it arrived on.

In the mean time, a NAT rule has been added to the conntrack
namespace, but it has no effect until the existing conntrack
entry times out.

Unfortunately, in the above scenario, the client retries reconnects
faster than the SYN default timeout (60 seconds), i.e. the entry
never expires and the 'virtual' ip never becomes active.

First patch adds a test case:
 3 namespaces, one sender, one receiver.
 sender connects to non-existent/virtual ip.
 Then a dnat rule gets added.

 The test case succeeds once conntrack tool shows that the nat rule
 was evaluated.

Second patch prevents timeout refresh for entries stuck in
SYN_SENT state.

Without second patch the test case doesn't pass even though syn
timeout is set to 10 seconds.

Florian Westphal (2):
  selftest: netfilter: add test case for unreplied tcp connections
  netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state

 net/netfilter/nf_conntrack_proto_tcp.c        |  10 ++
 tools/testing/selftests/netfilter/Makefile    |   2 +-
 .../netfilter/conntrack_tcp_unreplied.sh      | 167 ++++++++++++++++++
 3 files changed, 178 insertions(+), 1 deletion(-)
 create mode 100755 tools/testing/selftests/netfilter/conntrack_tcp_unreplied.sh

-- 
2.31.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH nf-next 1/2] selftest: netfilter: add test case for unreplied tcp connections
  2021-06-24 10:36 [PATCH nf-next 0/2] netfilter: conntrack: do not renew timeout while in tcp SYN_SENT state Florian Westphal
@ 2021-06-24 10:36 ` Florian Westphal
  2021-06-24 10:36 ` [PATCH nf-next 2/2] netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state Florian Westphal
  2021-07-02  0:53 ` [PATCH nf-next 0/2] netfilter: conntrack: do not renew timeout while " Pablo Neira Ayuso
  2 siblings, 0 replies; 4+ messages in thread
From: Florian Westphal @ 2021-06-24 10:36 UTC (permalink / raw)
  To: netfilter-devel; +Cc: kadlec, Florian Westphal

TCP connections in UNREPLIED state (only SYN seen) can be kept alive
indefinitely, as each SYN re-sets the timeout.

This means that even if a peer has closed its socket the entry
never times out.

This also prevents re-evaluation of configured NAT rules.
Add a test case that sets SYN timeout to 10 seconds, then check
that the nat redirection added later eventually takes effect.

This is based off a repro script from Antonio Ojea.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 tools/testing/selftests/netfilter/Makefile    |   2 +-
 .../netfilter/conntrack_tcp_unreplied.sh      | 167 ++++++++++++++++++
 2 files changed, 168 insertions(+), 1 deletion(-)
 create mode 100755 tools/testing/selftests/netfilter/conntrack_tcp_unreplied.sh

diff --git a/tools/testing/selftests/netfilter/Makefile b/tools/testing/selftests/netfilter/Makefile
index cd6430b39982..8748199ac109 100644
--- a/tools/testing/selftests/netfilter/Makefile
+++ b/tools/testing/selftests/netfilter/Makefile
@@ -5,7 +5,7 @@ TEST_PROGS := nft_trans_stress.sh nft_fib.sh nft_nat.sh bridge_brouter.sh \
 	conntrack_icmp_related.sh nft_flowtable.sh ipvs.sh \
 	nft_concat_range.sh nft_conntrack_helper.sh \
 	nft_queue.sh nft_meta.sh nf_nat_edemux.sh \
-	ipip-conntrack-mtu.sh
+	ipip-conntrack-mtu.sh conntrack_tcp_unreplied.sh
 
 LDLIBS = -lmnl
 TEST_GEN_FILES =  nf-queue
diff --git a/tools/testing/selftests/netfilter/conntrack_tcp_unreplied.sh b/tools/testing/selftests/netfilter/conntrack_tcp_unreplied.sh
new file mode 100755
index 000000000000..e7d7bf13cff5
--- /dev/null
+++ b/tools/testing/selftests/netfilter/conntrack_tcp_unreplied.sh
@@ -0,0 +1,167 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Check that UNREPLIED tcp conntrack will eventually timeout.
+#
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+ret=0
+
+waittime=20
+sfx=$(mktemp -u "XXXXXXXX")
+ns1="ns1-$sfx"
+ns2="ns2-$sfx"
+
+nft --version > /dev/null 2>&1
+if [ $? -ne 0 ];then
+	echo "SKIP: Could not run test without nft tool"
+	exit $ksft_skip
+fi
+
+ip -Version > /dev/null 2>&1
+if [ $? -ne 0 ];then
+	echo "SKIP: Could not run test without ip tool"
+	exit $ksft_skip
+fi
+
+cleanup() {
+	ip netns pids $ns1 | xargs kill 2>/dev/null
+	ip netns pids $ns2 | xargs kill 2>/dev/null
+
+	ip netns del $ns1
+	ip netns del $ns2
+}
+
+ipv4() {
+    echo -n 192.168.$1.2
+}
+
+check_counter()
+{
+	ns=$1
+	name=$2
+	expect=$3
+	local lret=0
+
+	cnt=$(ip netns exec $ns2 nft list counter inet filter "$name" | grep -q "$expect")
+	if [ $? -ne 0 ]; then
+		echo "ERROR: counter $name in $ns2 has unexpected value (expected $expect)" 1>&2
+		ip netns exec $ns2 nft list counter inet filter "$name" 1>&2
+		lret=1
+	fi
+
+	return $lret
+}
+
+# Create test namespaces
+ip netns add $ns1 || exit 1
+
+trap cleanup EXIT
+
+ip netns add $ns2 || exit 1
+
+# Connect the namespace to the host using a veth pair
+ip -net $ns1 link add name veth1 type veth peer name veth2
+ip -net $ns1 link set netns $ns2 dev veth2
+
+ip -net $ns1 link set up dev lo
+ip -net $ns2 link set up dev lo
+ip -net $ns1 link set up dev veth1
+ip -net $ns2 link set up dev veth2
+
+ip -net $ns2 addr add 10.11.11.2/24 dev veth2
+ip -net $ns2 route add default via 10.11.11.1
+
+ip netns exec $ns2 sysctl -q net.ipv4.conf.veth2.forwarding=1
+
+# add a rule inside NS so we enable conntrack
+ip netns exec $ns1 iptables -A INPUT -m state --state established,related -j ACCEPT
+
+ip -net $ns1 addr add 10.11.11.1/24 dev veth1
+ip -net $ns1 route add 10.99.99.99 via 10.11.11.2
+
+# Check connectivity works
+ip netns exec $ns1 ping -q -c 2 10.11.11.2 >/dev/null || exit 1
+
+ip netns exec $ns2 nc -l -p 8080 < /dev/null &
+
+# however, conntrack entries are there
+
+ip netns exec $ns2 nft -f - <<EOF
+table inet filter {
+	counter connreq { }
+	counter redir { }
+	chain input {
+		type filter hook input priority 0; policy accept;
+		ct state new tcp flags syn ip daddr 10.99.99.99 tcp dport 80 counter name "connreq" accept
+		ct state new ct status dnat tcp dport 8080 counter name "redir" accept
+	}
+}
+EOF
+if [ $? -ne 0 ]; then
+	echo "ERROR: Could not load nft rules"
+	exit 1
+fi
+
+ip netns exec $ns2 sysctl -q net.netfilter.nf_conntrack_tcp_timeout_syn_sent=10
+
+echo "INFO: connect $ns1 -> $ns2 to the virtual ip"
+ip netns exec $ns1 bash -c 'while true ; do
+	nc -p 60000 10.99.99.99 80
+	sleep 1
+	done' &
+
+sleep 1
+
+ip netns exec $ns2 nft -f - <<EOF
+table inet nat {
+	chain prerouting {
+		type nat hook prerouting priority 0; policy accept;
+		ip daddr 10.99.99.99 tcp dport 80 redirect to :8080
+	}
+}
+EOF
+if [ $? -ne 0 ]; then
+	echo "ERROR: Could not load nat redirect"
+	exit 1
+fi
+
+count=$(ip netns exec $ns2 conntrack -L -p tcp --dport 80 2>/dev/null | wc -l)
+if [ $count -eq 0 ]; then
+	echo "ERROR: $ns2 did not pick up tcp connection from peer"
+	exit 1
+fi
+
+echo "INFO: NAT redirect added in ns $ns2, waiting for $waittime seconds for nat to take effect"
+for i in $(seq 1 $waittime); do
+	echo -n "."
+
+	sleep 1
+
+	count=$(ip netns exec $ns2 conntrack -L -p tcp --reply-port-src 8080 2>/dev/null | wc -l)
+	if [ $count -gt 0 ]; then
+		echo
+		echo "PASS: redirection took effect after $i seconds"
+		break
+	fi
+
+	m=$((i%20))
+	if [ $m -eq 0 ]; then
+		echo " waited for $i seconds"
+	fi
+done
+
+expect="packets 1 bytes 60"
+check_counter "$ns2" "redir" "$expect"
+if [ $? -ne 0 ]; then
+	ret=1
+fi
+
+if [ $ret -eq 0 ];then
+	echo "PASS: redirection counter has expected values"
+else
+	echo "ERROR: no tcp connection was redirected"
+fi
+
+exit $ret
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH nf-next 2/2] netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state
  2021-06-24 10:36 [PATCH nf-next 0/2] netfilter: conntrack: do not renew timeout while in tcp SYN_SENT state Florian Westphal
  2021-06-24 10:36 ` [PATCH nf-next 1/2] selftest: netfilter: add test case for unreplied tcp connections Florian Westphal
@ 2021-06-24 10:36 ` Florian Westphal
  2021-07-02  0:53 ` [PATCH nf-next 0/2] netfilter: conntrack: do not renew timeout while " Pablo Neira Ayuso
  2 siblings, 0 replies; 4+ messages in thread
From: Florian Westphal @ 2021-06-24 10:36 UTC (permalink / raw)
  To: netfilter-devel; +Cc: kadlec, Florian Westphal

Consider:
  client -----> conntrack ---> Host

client sends a SYN, but $Host is unreachable/silent.
Client eventually gives up and the conntrack entry will time out.

However, if the client is restarted with same addr/port pair, it
may prevent the conntrack entry from timing out.

This is noticeable when the existing conntrack entry has no NAT
transformation or an outdated one and port reuse happens either
on client or due to a NAT middlebox.

This change prevents refresh of the timeout for SYN retransmits,
so entry is going away after nf_conntrack_tcp_timeout_syn_sent
seconds (default: 60).

Entry will be re-created on next connection attempt, but then
nat rules will be evaluated again.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_conntrack_proto_tcp.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index f7e8baf59b51..eb4de92077a8 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -1134,6 +1134,16 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct,
 			nf_ct_kill_acct(ct, ctinfo, skb);
 			return NF_ACCEPT;
 		}
+
+		if (index == TCP_SYN_SET && old_state == TCP_CONNTRACK_SYN_SENT) {
+			/* do not renew timeout on SYN retransmit.
+			 *
+			 * Else port reuse by client or NAT middlebox can keep
+			 * entry alive indefinitely (including nat info).
+			 */
+			return NF_ACCEPT;
+		}
+
 		/* ESTABLISHED without SEEN_REPLY, i.e. mid-connection
 		 * pickup with loose=1. Avoid large ESTABLISHED timeout.
 		 */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH nf-next 0/2] netfilter: conntrack: do not renew timeout while in tcp SYN_SENT state
  2021-06-24 10:36 [PATCH nf-next 0/2] netfilter: conntrack: do not renew timeout while in tcp SYN_SENT state Florian Westphal
  2021-06-24 10:36 ` [PATCH nf-next 1/2] selftest: netfilter: add test case for unreplied tcp connections Florian Westphal
  2021-06-24 10:36 ` [PATCH nf-next 2/2] netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state Florian Westphal
@ 2021-07-02  0:53 ` Pablo Neira Ayuso
  2 siblings, 0 replies; 4+ messages in thread
From: Pablo Neira Ayuso @ 2021-07-02  0:53 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, kadlec

On Thu, Jun 24, 2021 at 12:36:40PM +0200, Florian Westphal wrote:
> Antonio Ojea reported a problem with a container environment where
> connection retries prevent expiry of a SYN_SENT conntrack entry.
> 
> This in turn prevents a NAT rule from becoming active.
> 
> Consider:
>   client -----> conntrack ---> Host
> 
> client sends a SYN, but $Host is unreachable/silent.
> 
> In the reported case, $host address doesn't exist at all --
> its a 'virtual' ip that is made accessible via dnat/redirect.
> 
> The routing table even passes the packet back via the same interface
> it arrived on.
> 
> In the mean time, a NAT rule has been added to the conntrack
> namespace, but it has no effect until the existing conntrack
> entry times out.
> 
> Unfortunately, in the above scenario, the client retries reconnects
> faster than the SYN default timeout (60 seconds), i.e. the entry
> never expires and the 'virtual' ip never becomes active.
> 
> First patch adds a test case:
>  3 namespaces, one sender, one receiver.
>  sender connects to non-existent/virtual ip.
>  Then a dnat rule gets added.
> 
>  The test case succeeds once conntrack tool shows that the nat rule
>  was evaluated.
> 
> Second patch prevents timeout refresh for entries stuck in
> SYN_SENT state.
> 
> Without second patch the test case doesn't pass even though syn
> timeout is set to 10 seconds.

Series applied, thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-07-02  0:53 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-24 10:36 [PATCH nf-next 0/2] netfilter: conntrack: do not renew timeout while in tcp SYN_SENT state Florian Westphal
2021-06-24 10:36 ` [PATCH nf-next 1/2] selftest: netfilter: add test case for unreplied tcp connections Florian Westphal
2021-06-24 10:36 ` [PATCH nf-next 2/2] netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state Florian Westphal
2021-07-02  0:53 ` [PATCH nf-next 0/2] netfilter: conntrack: do not renew timeout while " Pablo Neira Ayuso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).