From: "Maciej Żenczykowski" <zenczykowski@gmail.com>
To: Jan Engelhardt <jengelh@inai.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>,
	Florian Westphal <fw@strlen.de>,
	Linux Network Development Mailing List  <netdev@vger.kernel.org>,
	Netfilter Development Mailing List 
	<netfilter-devel@vger.kernel.org>
Subject: Re: [PATCH] document danger of '-j REJECT'ing of '-m state INVALID' packets
Date: Sat, 9 May 2020 10:45:42 -0700
Message-ID: <CANP3RGeL_VuCChw=YX5W0kenmXctMY0ROoxPYe_nRnuemaWUfg@mail.gmail.com>
In-Reply-To: <nycvar.YFH.7.77.849.2005091231090.11519@n3.vanv.qr>

So I've never tried to figure out how things break, just observed that
they do - first many many years ago (close to 15ish) - between my wifi
connected laptop at home and my university server in the same city.
I've kept an INVALID->DROP rule in all my firewalls since then and not
had problems.  I vaguely recall seeing delayed packets when I debugged
it back then.

See for example: https://github.com/moby/libnetwork/issues/1090 for
others running into this.

Now we've hit an issue at work where a network misconfiguration has
asymmetric one-way pathing, with the result that some packets were
getting *massively* delayed, and it's been causing user firewalls to
generate tcp resets for 'too old', 'already ACKed' packets (i.e. dups).

While this is of course a misconfig, and it shouldn't happen, in
practice it sometimes simply does.
All it takes is for a packet to get into a long queue, and the network
path to shift (immediately after it) to a less congested path.
Due to bufferbloat those long queues can take seconds to drain and
exceed path rtt by orders of magnitude.
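
As a rough back-of-the-envelope illustration of that drain time (the
queue size and link rate below are made-up numbers, not measurements
from the incident), the time to empty a full queue is simply the
queued bits divided by the link rate:

```shell
# Hypothetical helper: seconds needed to drain a full queue.
# $1 = queued bytes, $2 = link rate in bits per second
drain_seconds() {
    awk -v bytes="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", bytes * 8 / rate }'
}

# Assumed example: 1 MB queued behind a 1 Mbit/s bottleneck.
drain_seconds 1000000 1000000   # prints 8.00
```

With an assumed path rtt of a few tens of milliseconds, an 8 second
drain is already a couple of orders of magnitude larger than the rtt.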

I *think* what happens is:

A non-final tcp packet gets massively delayed, the packet after it
makes it through to the receiver and triggers an ACK with SACK, which
makes it back to the sender and triggers a retransmit, and the
connection keeps on making forward progress.  Then eventually the
delayed packet arrives, is no longer considered valid, and
triggers a tcp reset.  How massive the delay has to be of course
depends on the rtt and retransmit aggressiveness.

Here's my attempt to demonstrate what I believe the problem to be:

(on a freshly booted clean/empty/idle fedora 31 vm)

iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state INVALID -j DROP
modprobe ifb
ip link set dev ifb0 up
tc qdisc add dev ifb0 root netem reorder 99% 0% delay 10s
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress u32 match u32 0 0 action mirred egress redirect dev ifb0
wget -O /dev/null https://git.kernel.org/torvalds/t/linux-5.7-rc4.tar.gz
iptables-save -c

...
/dev/null    [     <=>     ] 169.58M  2.93MB/s    in 45s
2020-05-09 10:35:44 (3.81 MB/s) - ‘/dev/null’ saved [177819073]
...
[31750:181080717] -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
[244:1403178] -A INPUT -m state --state INVALID -j DROP
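
To put counters like these in proportion, here is a small sketch that
reads `iptables-save -c` style lines and prints the percentage of
packets that matched the INVALID rule.  The function name
invalid_share is made up for illustration, and it only counts packets
that matched an explicit INPUT rule (packets falling through to the
chain policy are not included):

```shell
# Hypothetical helper: share of INPUT-rule packet hits that were INVALID.
# Reads lines of the form "[pkts:bytes] -A INPUT ..." on stdin.
invalid_share() {
    awk '
        /^\[[0-9]+:[0-9]+\] -A INPUT / {
            split(substr($1, 2), c, ":")   # c[1] = packet counter
            total += c[1]
            if ($0 ~ /--state INVALID/) invalid += c[1]
        }
        END { if (total > 0) printf "%.2f\n", 100 * invalid / total }
    '
}

# Usage (as root): iptables-save -c | invalid_share
```

Fed the two counter lines above, this gives 0.76, i.e. under one
percent of the packets were classified INVALID during the download.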


Now if I reboot, and run the same script, except instead of the
INVALID/DROP rule I do
  iptables -A INPUT -p tcp -j REJECT --reject-with tcp-reset
then the download never finishes (it hangs after 15MB @ 2MB/s and
eventually times out).

[4170:16758894] -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
[37:147454] -A INPUT -p tcp -j REJECT --reject-with tcp-reset

(arguably since this is a VM, and thus NAT'ed by my host, and then
again by the real ipv4 NAT, the setup isn't entirely clear, but I hope
it makes my point: INVALID state needs to be dropped, not rejected)
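
For a ruleset that does want to reject with tcp-reset, one way to
follow that advice is to put the INVALID drop ahead of the REJECT
rule.  The following is only a sketch in iptables-restore format built
from the rules used in this mail; the function name emit_rules and the
ACCEPT chain policy are assumptions, and loading it with
iptables-restore replaces the existing filter table:

```shell
# Sketch: print a filter table where INVALID is dropped before the
# catch-all tcp REJECT rule can see it.  Adapt to the real policy.
emit_rules() {
    cat <<'EOF'
*filter
:INPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -m state --state INVALID -j DROP
-A INPUT -p tcp -j REJECT --reject-with tcp-reset
COMMIT
EOF
}

# To load (as root): emit_rules | iptables-restore
```

Rule order is what matters here: a delayed duplicate that conntrack
marks INVALID is silently discarded by the DROP rule and never reaches
the REJECT rule, so no reset is sent for it.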

Thread overview: 14+ messages
2020-05-09  5:22 [PATCH] document danger of '-j REJECT'ing of '-m state INVALID' packets Maciej Żenczykowski
2020-05-09 10:52 ` Jan Engelhardt
2020-05-09 17:45   ` Maciej Żenczykowski [this message]
2020-05-09 18:02     ` Maciej Żenczykowski
2020-05-09 21:17     ` [PATCH] doc: document danger of applying REJECT to INVALID CTs Jan Engelhardt
2020-05-09 21:28       ` Maciej Żenczykowski
2020-05-09 21:31         ` Maciej Żenczykowski
2020-05-12 21:00         ` [PATCH v2] " Jan Engelhardt
2020-05-12 21:25           ` Maciej Żenczykowski
2020-05-13  4:39           ` Benjamin Poirier
2020-05-13  9:17             ` [PATCH v3] " Jan Engelhardt
2020-05-13  9:28               ` Maciej Żenczykowski
2020-05-13  9:39                 ` [PATCH v4] " Jan Engelhardt
2020-05-13 16:29                   ` Maciej Żenczykowski
