* [PATCH nf v2] netfilter: conntrack: connection timeout after re-register @ 2020-10-07 19:32 Francesco Ruggeri 2020-10-08 23:41 ` Francesco Ruggeri 2020-10-20 15:21 ` Pablo Neira Ayuso 0 siblings, 2 replies; 15+ messages in thread From: Francesco Ruggeri @ 2020-10-07 19:32 UTC (permalink / raw) To: linux-kernel, netdev, coreteam, netfilter-devel, kuba, davem, fw, kadlec, pablo, fruggeri If the first packet conntrack sees after a re-register is an outgoing keepalive packet with no data (SEG.SEQ = SND.NXT-1), td_end is set to SND.NXT-1. When the peer correctly acknowledges SND.NXT, tcp_in_window fails check III (Upper bound for valid (s)ack: sack <= receiver.td_end) and returns false, which cascades into nf_conntrack_in setting skb->_nfct = 0 and in later conntrack iptables rules not matching. In cases where iptables are dropping packets that do not match conntrack rules this can result in idle tcp connections to time out. v2: adjust td_end when getting the reply rather than when sending out the keepalive packet. Fixes: f94e63801ab2 ("netfilter: conntrack: reset tcp maxwin on re-register") Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> --- net/netfilter/nf_conntrack_proto_tcp.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index e8c86ee4c1c4..c8fb2187ad4b 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -541,13 +541,20 @@ static bool tcp_in_window(const struct nf_conn *ct, swin = win << sender->td_scale; sender->td_maxwin = (swin == 0 ? 1 : swin); sender->td_maxend = end + sender->td_maxwin; - /* - * We haven't seen traffic in the other direction yet - * but we have to tweak window tracking to pass III - * and IV until that happens. - */ - if (receiver->td_maxwin == 0) + if (receiver->td_maxwin == 0) { + /* We haven't seen traffic in the other + * direction yet but we have to tweak window + * tracking to pass III and IV until that + * happens. + */ receiver->td_end = receiver->td_maxend = sack; + } else if (sack == receiver->td_end + 1) { + /* Likely a reply to a keepalive. + * Needed for III. + */ + receiver->td_end++; + } + } } else if (((state->state == TCP_CONNTRACK_SYN_SENT && dir == IP_CT_DIR_ORIGINAL) -- 2.28.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-07 19:32 [PATCH nf v2] netfilter: conntrack: connection timeout after re-register Francesco Ruggeri @ 2020-10-08 23:41 ` Francesco Ruggeri 2020-10-09 6:52 ` Jozsef Kadlecsik 2020-10-20 15:21 ` Pablo Neira Ayuso 1 sibling, 1 reply; 15+ messages in thread From: Francesco Ruggeri @ 2020-10-08 23:41 UTC (permalink / raw) To: open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller, fw, kadlec, Pablo Neira Ayuso, Francesco Ruggeri On Wed, Oct 7, 2020 at 12:32 PM Francesco Ruggeri <fruggeri@arista.com> wrote: > > If the first packet conntrack sees after a re-register is an outgoing > keepalive packet with no data (SEG.SEQ = SND.NXT-1), td_end is set to > SND.NXT-1. > When the peer correctly acknowledges SND.NXT, tcp_in_window fails > check III (Upper bound for valid (s)ack: sack <= receiver.td_end) and > returns false, which cascades into nf_conntrack_in setting > skb->_nfct = 0 and in later conntrack iptables rules not matching. > In cases where iptables are dropping packets that do not match > conntrack rules this can result in idle tcp connections to time out. > > v2: adjust td_end when getting the reply rather than when sending out > the keepalive packet. > Any comments? Here is a simple reproducer. The idea is to show that keepalive packets in an idle tcp connection will be dropped (and the connection will time out) if conntrack hooks are de-registered and then re-registered. The reproducer has two files. client_server.py creates both ends of a tcp connection, bounces a few packets back and forth, and then blocks on a recv on the client side. The client's keepalive is configured to time out in 20 seconds. This connection should not time out. test is a bash script that creates a net namespace where it sets iptables rules for the connection, starts client_server.py, and then clears and restores the iptables rules (which causes conntrack hooks to be de-registered and re-registered). ================ file client_server.py #!/usr/bin/python import socket PORT=4446 # create server socket sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) sock.bind(('localhost', PORT)) sock.listen(1) # create client socket cl_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) cl_sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1) cl_sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 2) cl_sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 2) cl_sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 10) cl_sock.connect(('localhost', PORT)) srv_sock, _ = sock.accept() # Bounce a packet back and forth a few times buf = 'aaaaaaaaaaaa' for i in range(5): cl_sock.send(buf) buf = srv_sock.recv(100) srv_sock.send(buf) buf = cl_sock.recv(100) print buf # idle the connection try: buf = cl_sock.recv(100) except socket.error, e: print "Error: %s" % e sock.close() cl_sock.close() srv_sock.close() ============== file test #!/bin/bash ip netns add dummy ip netns exec dummy ip link set lo up echo "Created namespace" ip netns exec dummy iptables-restore <<END *filter :INPUT DROP [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT -A INPUT -p tcp -m tcp --dport 4446 -j ACCEPT COMMIT END echo "Installed iptables rules" ip netns exec dummy ./client_server.py & echo "Created tcp connection" sleep 2 ip netns exec dummy iptables-restore << END *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] COMMIT END echo "Cleared iptables rules" sleep 4 ip netns exec dummy iptables-restore << END *filter :INPUT DROP [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT -A INPUT -p tcp -m tcp --dport 4446 -j ACCEPT COMMIT END echo "Restored original iptables rules" wait ip netns del dummy exit 0 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-08 23:41 ` Francesco Ruggeri @ 2020-10-09 6:52 ` Jozsef Kadlecsik 2020-10-09 11:03 ` Florian Westphal 0 siblings, 1 reply; 15+ messages in thread From: Jozsef Kadlecsik @ 2020-10-09 6:52 UTC (permalink / raw) To: Francesco Ruggeri Cc: open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller, fw, Pablo Neira Ayuso Hi Francesco, On Thu, 8 Oct 2020, Francesco Ruggeri wrote: > On Wed, Oct 7, 2020 at 12:32 PM Francesco Ruggeri <fruggeri@arista.com> wrote: > > > > If the first packet conntrack sees after a re-register is an outgoing > > keepalive packet with no data (SEG.SEQ = SND.NXT-1), td_end is set to > > SND.NXT-1. When the peer correctly acknowledges SND.NXT, tcp_in_window > > fails check III (Upper bound for valid (s)ack: sack <= > > receiver.td_end) and returns false, which cascades into > > nf_conntrack_in setting skb->_nfct = 0 and in later conntrack iptables > > rules not matching. In cases where iptables are dropping packets that > > do not match conntrack rules this can result in idle tcp connections > > to time out. > > > > v2: adjust td_end when getting the reply rather than when sending out > > the keepalive packet. > > > > Any comments? > Here is a simple reproducer. The idea is to show that keepalive packets > in an idle tcp connection will be dropped (and the connection will time > out) if conntrack hooks are de-registered and then re-registered. The > reproducer has two files. client_server.py creates both ends of a tcp > connection, bounces a few packets back and forth, and then blocks on a > recv on the client side. The client's keepalive is configured to time > out in 20 seconds. This connection should not time out. test is a bash > script that creates a net namespace where it sets iptables rules for the > connection, starts client_server.py, and then clears and restores the > iptables rules (which causes conntrack hooks to be de-registered and > re-registered). In my opinion an iptables restore should not cause conntrack hooks to be de-registered and re-registered, because important TCP initialization parameters cannot be "restored" later from the packets. Therefore the proper fix would be to prevent it to happen. Otherwise your patch looks OK to handle the case when conntrack is intentionally restarted. Best regards, Jozsef > ================ file client_server.py > #!/usr/bin/python > > import socket > > PORT=4446 > > # create server socket > sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > sock.bind(('localhost', PORT)) > sock.listen(1) > > # create client socket > cl_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > cl_sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1) > cl_sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 2) > cl_sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 2) > cl_sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 10) > cl_sock.connect(('localhost', PORT)) > > srv_sock, _ = sock.accept() > > # Bounce a packet back and forth a few times > buf = 'aaaaaaaaaaaa' > for i in range(5): > cl_sock.send(buf) > buf = srv_sock.recv(100) > srv_sock.send(buf) > buf = cl_sock.recv(100) > print buf > > # idle the connection > try: > buf = cl_sock.recv(100) > except socket.error, e: > print "Error: %s" % e > > sock.close() > cl_sock.close() > srv_sock.close() > > ============== file test > #!/bin/bash > > ip netns add dummy > ip netns exec dummy ip link set lo up > echo "Created namespace" > > ip netns exec dummy iptables-restore <<END > *filter > :INPUT DROP [0:0] > :FORWARD ACCEPT [0:0] > :OUTPUT ACCEPT [0:0] > -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT > -A INPUT -p tcp -m tcp --dport 4446 -j ACCEPT > COMMIT > END > echo "Installed iptables rules" > > ip netns exec dummy ./client_server.py & > echo "Created tcp connection" > sleep 2 > > ip netns exec dummy iptables-restore << END > *filter > :INPUT ACCEPT [0:0] > :FORWARD ACCEPT [0:0] > :OUTPUT ACCEPT [0:0] > COMMIT > END > echo "Cleared iptables rules" > sleep 4 > > ip netns exec dummy iptables-restore << END > *filter > :INPUT DROP [0:0] > :FORWARD ACCEPT [0:0] > :OUTPUT ACCEPT [0:0] > -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT > -A INPUT -p tcp -m tcp --dport 4446 -j ACCEPT > COMMIT > END > echo "Restored original iptables rules" > > wait > ip netns del dummy > exit 0 > - E-mail : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-09 6:52 ` Jozsef Kadlecsik @ 2020-10-09 11:03 ` Florian Westphal 2020-10-09 18:48 ` Jozsef Kadlecsik 0 siblings, 1 reply; 15+ messages in thread From: Florian Westphal @ 2020-10-09 11:03 UTC (permalink / raw) To: Jozsef Kadlecsik Cc: Francesco Ruggeri, open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller, fw, Pablo Neira Ayuso Jozsef Kadlecsik <kadlec@netfilter.org> wrote: > > Any comments? > > Here is a simple reproducer. The idea is to show that keepalive packets > > in an idle tcp connection will be dropped (and the connection will time > > out) if conntrack hooks are de-registered and then re-registered. The > > reproducer has two files. client_server.py creates both ends of a tcp > > connection, bounces a few packets back and forth, and then blocks on a > > recv on the client side. The client's keepalive is configured to time > > out in 20 seconds. This connection should not time out. test is a bash > > script that creates a net namespace where it sets iptables rules for the > > connection, starts client_server.py, and then clears and restores the > > iptables rules (which causes conntrack hooks to be de-registered and > > re-registered). > > In my opinion an iptables restore should not cause conntrack hooks to be > de-registered and re-registered, because important TCP initialization > parameters cannot be "restored" later from the packets. Therefore the > proper fix would be to prevent it to happen. Otherwise your patch looks OK > to handle the case when conntrack is intentionally restarted. The repro clears all rules, waits 4 seconds, then restores the ruleset. using iptables-restore < FOO; sleep 4; iptables-restore < FOO will not result in any unregister ops. We could make kernel defer unregister via some work queue but i don't see what this would help/accomplish (and its questionable of how long it should wait). We could disallow unregister, but that seems silly (forces reboot...). I think the patch is fine. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-09 11:03 ` Florian Westphal @ 2020-10-09 18:48 ` Jozsef Kadlecsik 2020-10-09 18:55 ` Florian Westphal 0 siblings, 1 reply; 15+ messages in thread From: Jozsef Kadlecsik @ 2020-10-09 18:48 UTC (permalink / raw) To: Florian Westphal Cc: Francesco Ruggeri, open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller, fw, Pablo Neira Ayuso Hi Florian, On Fri, 9 Oct 2020, Florian Westphal wrote: > Jozsef Kadlecsik <kadlec@netfilter.org> wrote: > > > The reproducer has two files. client_server.py creates both ends of > > > a tcp connection, bounces a few packets back and forth, and then > > > blocks on a recv on the client side. The client's keepalive is > > > configured to time out in 20 seconds. This connection should not > > > time out. test is a bash script that creates a net namespace where > > > it sets iptables rules for the connection, starts client_server.py, > > > and then clears and restores the iptables rules (which causes > > > conntrack hooks to be de-registered and re-registered). > > > > In my opinion an iptables restore should not cause conntrack hooks to be > > de-registered and re-registered, because important TCP initialization > > parameters cannot be "restored" later from the packets. Therefore the > > proper fix would be to prevent it to happen. Otherwise your patch looks OK > > to handle the case when conntrack is intentionally restarted. > > The repro clears all rules, waits 4 seconds, then restores the ruleset. > using iptables-restore < FOO; sleep 4; iptables-restore < FOO will not > result in any unregister ops. > > We could make kernel defer unregister via some work queue but i don't > see what this would help/accomplish (and its questionable of how long it > should wait). Sorry, I can't put together the two paragraphs above: in the first you wrote that no (hook) unregister-register happens and in the second one that those could be derefed. > We could disallow unregister, but that seems silly (forces reboot...). > > I think the patch is fine. The patch is fine, but why the packets are handled by conntrack (after the first restore and during the 4s sleep? And then again after the second restore?) as if all conntrack entries were removed? Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-09 18:48 ` Jozsef Kadlecsik @ 2020-10-09 18:55 ` Florian Westphal 2020-10-09 19:49 ` Jozsef Kadlecsik 0 siblings, 1 reply; 15+ messages in thread From: Florian Westphal @ 2020-10-09 18:55 UTC (permalink / raw) To: Jozsef Kadlecsik Cc: Florian Westphal, Francesco Ruggeri, open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller, fw, Pablo Neira Ayuso Jozsef Kadlecsik <kadlec@netfilter.org> wrote: > > The repro clears all rules, waits 4 seconds, then restores the ruleset. > > using iptables-restore < FOO; sleep 4; iptables-restore < FOO will not > > result in any unregister ops. > > > > We could make kernel defer unregister via some work queue but i don't > > see what this would help/accomplish (and its questionable of how long it > > should wait). > > Sorry, I can't put together the two paragraphs above: in the first you > wrote that no (hook) unregister-register happens and in the second one > that those could be derefed. Sorry, my reply is confusing indeed. Matches/targets that need conntrack increment a refcount. So, when all rules are flushed, refcount goes down to 0 and conntrack is disabled because the hooks get removed.. Just doing iptables-restore doesn't unregister as long as both the old and new rulesets need conntrack. The "delay unregister" remark was wrt. the "all rules were deleted" case, i.e. add a "grace period" rather than acting right away when conntrack use count did hit 0. > > We could disallow unregister, but that seems silly (forces reboot...). > > > > I think the patch is fine. > > The patch is fine, but why the packets are handled by conntrack (after the > first restore and during the 4s sleep? And then again after the second > restore?) as if all conntrack entries were removed? Conntrack entries are not removed, only the base hooks get unregistered. This is a problem for tcp window tracking. When re-register occurs, kernel is supposed to switch the existing entries to "loose" mode so window tracking won't flag packets as invalid, but apparently this isn't enough to handle keepalive case. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-09 18:55 ` Florian Westphal @ 2020-10-09 19:49 ` Jozsef Kadlecsik 2020-10-09 20:00 ` Francesco Ruggeri 2020-10-09 20:05 ` Florian Westphal 0 siblings, 2 replies; 15+ messages in thread From: Jozsef Kadlecsik @ 2020-10-09 19:49 UTC (permalink / raw) To: Florian Westphal Cc: Francesco Ruggeri, open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller, Pablo Neira Ayuso On Fri, 9 Oct 2020, Florian Westphal wrote: > Matches/targets that need conntrack increment a refcount. So, when all > rules are flushed, refcount goes down to 0 and conntrack is disabled > because the hooks get removed.. > > Just doing iptables-restore doesn't unregister as long as both the old > and new rulesets need conntrack. > > The "delay unregister" remark was wrt. the "all rules were deleted" > case, i.e. add a "grace period" rather than acting right away when > conntrack use count did hit 0. Now I understand it, thanks really. The hooks are removed, so conntrack cannot "see" the packets and the entries become stale. What is the rationale behind "remove the conntrack hooks when there are no rule left referring to conntrack"? Performance optimization? But then the content of the whole conntrack table could be deleted too... ;-) > Conntrack entries are not removed, only the base hooks get unregistered. > This is a problem for tcp window tracking. > > When re-register occurs, kernel is supposed to switch the existing > entries to "loose" mode so window tracking won't flag packets as > invalid, but apparently this isn't enough to handle keepalive case. "loose" (nf_ct_tcp_loose) mode doesn't disable window tracking, it enables/disables picking up already established connections. nf_ct_tcp_be_liberal would disable TCP window checking (but not tracking) for non RST packets. But both seems to be modified only via the proc entries. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-09 19:49 ` Jozsef Kadlecsik @ 2020-10-09 20:00 ` Francesco Ruggeri 2020-10-09 20:05 ` Florian Westphal 1 sibling, 0 replies; 15+ messages in thread From: Francesco Ruggeri @ 2020-10-09 20:00 UTC (permalink / raw) To: Jozsef Kadlecsik Cc: Florian Westphal, open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller, Pablo Neira Ayuso On Fri, Oct 9, 2020 at 12:49 PM Jozsef Kadlecsik <kadlec@netfilter.org> wrote: > What is the rationale behind "remove the conntrack hooks when there are no > rule left referring to conntrack"? Performance optimization? That seems to be the case. See commit 4d3a57f23dec ("netfilter: conntrack: do not enable connection tracking unless needed"). Francesco ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-09 19:49 ` Jozsef Kadlecsik 2020-10-09 20:00 ` Francesco Ruggeri @ 2020-10-09 20:05 ` Florian Westphal 2020-10-14 0:06 ` Pablo Neira Ayuso 1 sibling, 1 reply; 15+ messages in thread From: Florian Westphal @ 2020-10-09 20:05 UTC (permalink / raw) To: Jozsef Kadlecsik Cc: Florian Westphal, Francesco Ruggeri, open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller, Pablo Neira Ayuso Jozsef Kadlecsik <kadlec@netfilter.org> wrote: > > The "delay unregister" remark was wrt. the "all rules were deleted" > > case, i.e. add a "grace period" rather than acting right away when > > conntrack use count did hit 0. > > Now I understand it, thanks really. The hooks are removed, so conntrack > cannot "see" the packets and the entries become stale. Yes. > What is the rationale behind "remove the conntrack hooks when there are no > rule left referring to conntrack"? Performance optimization? But then the > content of the whole conntrack table could be deleted too... ;-) Yes, this isn't the case at the moment -- only hooks are removed, entries will eventually time out. > > Conntrack entries are not removed, only the base hooks get unregistered. > > This is a problem for tcp window tracking. > > > > When re-register occurs, kernel is supposed to switch the existing > > entries to "loose" mode so window tracking won't flag packets as > > invalid, but apparently this isn't enough to handle keepalive case. > > "loose" (nf_ct_tcp_loose) mode doesn't disable window tracking, it > enables/disables picking up already established connections. > > nf_ct_tcp_be_liberal would disable TCP window checking (but not tracking) > for non RST packets. You are right, mixup on my part. > But both seems to be modified only via the proc entries. Yes, we iterate table on re-register and modify the existing entries. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-09 20:05 ` Florian Westphal @ 2020-10-14 0:06 ` Pablo Neira Ayuso 2020-10-14 8:11 ` Pablo Neira Ayuso 2020-10-14 8:23 ` Florian Westphal 0 siblings, 2 replies; 15+ messages in thread From: Pablo Neira Ayuso @ 2020-10-14 0:06 UTC (permalink / raw) To: Florian Westphal Cc: Jozsef Kadlecsik, Francesco Ruggeri, open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller On Fri, Oct 09, 2020 at 10:05:48PM +0200, Florian Westphal wrote: > Jozsef Kadlecsik <kadlec@netfilter.org> wrote: > > > The "delay unregister" remark was wrt. the "all rules were deleted" > > > case, i.e. add a "grace period" rather than acting right away when > > > conntrack use count did hit 0. > > > > Now I understand it, thanks really. The hooks are removed, so conntrack > > cannot "see" the packets and the entries become stale. > > Yes. > > > What is the rationale behind "remove the conntrack hooks when there are no > > rule left referring to conntrack"? Performance optimization? But then the > > content of the whole conntrack table could be deleted too... ;-) > > Yes, this isn't the case at the moment -- only hooks are removed, > entries will eventually time out. > > > > Conntrack entries are not removed, only the base hooks get unregistered. > > > This is a problem for tcp window tracking. > > > > > > When re-register occurs, kernel is supposed to switch the existing > > > entries to "loose" mode so window tracking won't flag packets as > > > invalid, but apparently this isn't enough to handle keepalive case. > > > > "loose" (nf_ct_tcp_loose) mode doesn't disable window tracking, it > > enables/disables picking up already established connections. > > > > nf_ct_tcp_be_liberal would disable TCP window checking (but not tracking) > > for non RST packets. > > You are right, mixup on my part. > > > But both seems to be modified only via the proc entries. > > Yes, we iterate table on re-register and modify the existing entries. For iptables-nft, it might be possible to avoid this deregister + register ct hooks in the same transaction: Maybe add something like nf_ct_netns_get_all() to bump refcounters by one _iff_ they are > 0 before starting the transaction processing, then call nf_ct_netns_put_all() which decrements refcounters and unregister hooks if they reach 0. The only problem with this approach is that this pulls in the conntrack module, to solve that, struct nf_ct_hook in net/netfilter/core.c could be used to store the reference to ->netns_get_all and ->net_put_all. Legacy would still be flawed though. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-14 0:06 ` Pablo Neira Ayuso @ 2020-10-14 8:11 ` Pablo Neira Ayuso 2020-10-14 8:23 ` Florian Westphal 1 sibling, 0 replies; 15+ messages in thread From: Pablo Neira Ayuso @ 2020-10-14 8:11 UTC (permalink / raw) To: Florian Westphal Cc: Jozsef Kadlecsik, Francesco Ruggeri, open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller On Wed, Oct 14, 2020 at 02:06:28AM +0200, Pablo Neira Ayuso wrote: > On Fri, Oct 09, 2020 at 10:05:48PM +0200, Florian Westphal wrote: > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote: > > > > The "delay unregister" remark was wrt. the "all rules were deleted" > > > > case, i.e. add a "grace period" rather than acting right away when > > > > conntrack use count did hit 0. > > > > > > Now I understand it, thanks really. The hooks are removed, so conntrack > > > cannot "see" the packets and the entries become stale. > > > > Yes. > > > > > What is the rationale behind "remove the conntrack hooks when there are no > > > rule left referring to conntrack"? Performance optimization? But then the > > > content of the whole conntrack table could be deleted too... ;-) > > > > Yes, this isn't the case at the moment -- only hooks are removed, > > entries will eventually time out. > > > > > > Conntrack entries are not removed, only the base hooks get unregistered. > > > > This is a problem for tcp window tracking. > > > > > > > > When re-register occurs, kernel is supposed to switch the existing > > > > entries to "loose" mode so window tracking won't flag packets as > > > > invalid, but apparently this isn't enough to handle keepalive case. > > > > > > "loose" (nf_ct_tcp_loose) mode doesn't disable window tracking, it > > > enables/disables picking up already established connections. > > > > > > nf_ct_tcp_be_liberal would disable TCP window checking (but not tracking) > > > for non RST packets. > > > > You are right, mixup on my part. > > > > > But both seems to be modified only via the proc entries. > > > > Yes, we iterate table on re-register and modify the existing entries. > > For iptables-nft, it might be possible to avoid this deregister + > register ct hooks in the same transaction: Maybe add something like > nf_ct_netns_get_all() to bump refcounters by one _iff_ they are > 0 > before starting the transaction processing, then call > nf_ct_netns_put_all() which decrements refcounters and unregister > hooks if they reach 0. Hm, scratch that, put_all() would create an imbalance with this conditional increment. > The only problem with this approach is that this pulls in the > conntrack module, to solve that, struct nf_ct_hook in > net/netfilter/core.c could be used to store the reference to > ->netns_get_all and ->net_put_all. > > Legacy would still be flawed though. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-14 0:06 ` Pablo Neira Ayuso 2020-10-14 8:11 ` Pablo Neira Ayuso @ 2020-10-14 8:23 ` Florian Westphal 2020-10-14 18:42 ` Francesco Ruggeri 1 sibling, 1 reply; 15+ messages in thread From: Florian Westphal @ 2020-10-14 8:23 UTC (permalink / raw) To: Pablo Neira Ayuso Cc: Florian Westphal, Jozsef Kadlecsik, Francesco Ruggeri, open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > Yes, we iterate table on re-register and modify the existing entries. > > For iptables-nft, it might be possible to avoid this deregister + > register ct hooks in the same transaction: Maybe add something like > nf_ct_netns_get_all() to bump refcounters by one _iff_ they are > 0 > before starting the transaction processing, then call > nf_ct_netns_put_all() which decrements refcounters and unregister > hooks if they reach 0. No need, its already fine. Decrement happens from destroy path, so new rules are already in place. > The only problem with this approach is that this pulls in the > conntrack module, to solve that, struct nf_ct_hook in > net/netfilter/core.c could be used to store the reference to > ->netns_get_all and ->net_put_all. > > Legacy would still be flawed though. Its fine too, new rule blob gets handled (and match/target checkentry called) before old one is dismantled. We only have a 0 refcount + hook unregister when rules get flushed/removed explicitly. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-14 8:23 ` Florian Westphal @ 2020-10-14 18:42 ` Francesco Ruggeri 2020-10-14 19:35 ` Florian Westphal 0 siblings, 1 reply; 15+ messages in thread From: Francesco Ruggeri @ 2020-10-14 18:42 UTC (permalink / raw) To: Florian Westphal Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller On Wed, Oct 14, 2020 at 1:23 AM Florian Westphal <fw@strlen.de> wrote: > > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > Legacy would still be flawed though. > > Its fine too, new rule blob gets handled (and match/target checkentry > called) before old one is dismantled. > > We only have a 0 refcount + hook unregister when rules get > flushed/removed explicitly. Should the patch be used in the meantime while this gets worked out? Francesco ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-14 18:42 ` Francesco Ruggeri @ 2020-10-14 19:35 ` Florian Westphal 0 siblings, 0 replies; 15+ messages in thread From: Florian Westphal @ 2020-10-14 19:35 UTC (permalink / raw) To: Francesco Ruggeri Cc: Florian Westphal, Pablo Neira Ayuso, Jozsef Kadlecsik, open list, netdev, coreteam, netfilter-devel, Jakub Kicinski, David Miller Francesco Ruggeri <fruggeri@arista.com> wrote: > On Wed, Oct 14, 2020 at 1:23 AM Florian Westphal <fw@strlen.de> wrote: > > > > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > Legacy would still be flawed though. > > > > Its fine too, new rule blob gets handled (and match/target checkentry > > called) before old one is dismantled. > > > > We only have a 0 refcount + hook unregister when rules get > > flushed/removed explicitly. > > Should the patch be used in the meantime while this gets > worked out? I think the patch is correct, and I do NOT see a better solution. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH nf v2] netfilter: conntrack: connection timeout after re-register 2020-10-07 19:32 [PATCH nf v2] netfilter: conntrack: connection timeout after re-register Francesco Ruggeri 2020-10-08 23:41 ` Francesco Ruggeri @ 2020-10-20 15:21 ` Pablo Neira Ayuso 1 sibling, 0 replies; 15+ messages in thread From: Pablo Neira Ayuso @ 2020-10-20 15:21 UTC (permalink / raw) To: Francesco Ruggeri Cc: linux-kernel, netdev, coreteam, netfilter-devel, kuba, davem, fw, kadlec On Wed, Oct 07, 2020 at 12:32:52PM -0700, Francesco Ruggeri wrote: > If the first packet conntrack sees after a re-register is an outgoing > keepalive packet with no data (SEG.SEQ = SND.NXT-1), td_end is set to > SND.NXT-1. > When the peer correctly acknowledges SND.NXT, tcp_in_window fails > check III (Upper bound for valid (s)ack: sack <= receiver.td_end) and > returns false, which cascades into nf_conntrack_in setting > skb->_nfct = 0 and in later conntrack iptables rules not matching. > In cases where iptables are dropping packets that do not match > conntrack rules this can result in idle tcp connections to time out. Applied, thanks. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2020-10-20 15:21 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-10-07 19:32 [PATCH nf v2] netfilter: conntrack: connection timeout after re-register Francesco Ruggeri 2020-10-08 23:41 ` Francesco Ruggeri 2020-10-09 6:52 ` Jozsef Kadlecsik 2020-10-09 11:03 ` Florian Westphal 2020-10-09 18:48 ` Jozsef Kadlecsik 2020-10-09 18:55 ` Florian Westphal 2020-10-09 19:49 ` Jozsef Kadlecsik 2020-10-09 20:00 ` Francesco Ruggeri 2020-10-09 20:05 ` Florian Westphal 2020-10-14 0:06 ` Pablo Neira Ayuso 2020-10-14 8:11 ` Pablo Neira Ayuso 2020-10-14 8:23 ` Florian Westphal 2020-10-14 18:42 ` Francesco Ruggeri 2020-10-14 19:35 ` Florian Westphal 2020-10-20 15:21 ` Pablo Neira Ayuso
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).