* Strange nf_conntrack_tcp_timeout_established behavior
@ 2020-02-17 17:33 FUSTE Emmanuel
  2020-02-18 12:00 ` FUSTE Emmanuel
  0 siblings, 1 reply; 2+ messages in thread
From: FUSTE Emmanuel @ 2020-02-17 17:33 UTC (permalink / raw)
  To: netfilter-devel

Hello,
I am facing a strange problem with recent kernels.

On "bad" kernels, the default value of 
nf_conntrack_tcp_timeout_established is not honored, and conntrack -L 
returns different results on the same machine in different root SSH 
sessions.

Ubuntu vendor kernel 4.15 (64-bit): correct behaviour
Ubuntu vendor kernel 5.3.0 (64-bit): BAD (with iptables 1.6.1 -> 
iptables rules)
Debian kernel 5.4.19 (32-bit): BAD (with iptables-nft -> nft rules)

Clean boot, no conntrack module loaded:
# cat /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established
cat: /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established: No such file or directory
# modprobe nf_conntrack
# cat /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established
432000

Add an iptables rule to start connection tracking:
# iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT

show tcp session tracking :
# conntrack -L |grep ^tcp
tcp      6 299 ESTABLISHED src=10.222.219.164 dst=10.222.219.8 
sport=54470 dport=22 src=10.222.219.8 dst=10.222.219.164 sport=22 
dport=54470 [ASSURED] mark=0 use=1

The timeout is not 432000 s but 300 s.
On a moderately loaded SMTP server, all sessions are at 300 s.
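
To compare timeouts across many sessions, the third column of each 
conntrack -L line (the remaining timeout in seconds) can be pulled out 
with awk. This is only a minimal sketch: the function name 
ct_established_timeouts is made up for illustration, and it assumes each 
entry arrives on a single line (the wrapping in the output above comes 
from the mail client).

```shell
# Read `conntrack -L` output on stdin and print, for every ESTABLISHED
# TCP entry, the remaining timeout followed by the original-direction
# tuple (src, dst, sport, dport).
ct_established_timeouts() {
    awk '$1 == "tcp" && $4 == "ESTABLISHED" { print $3, $5, $6, $7, $8 }'
}

# Possible usage (needs root; stderr carries the "flow entries" banner):
#   conntrack -L 2>/dev/null | ct_established_timeouts | sort -n
```

Sorting numerically puts the short-lived entries first, which makes a 
300 s entry easy to spot among 432000 s ones.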

Then do:
# echo 432000 >/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established
Sometimes sessions start to pick up 432000 as the new timeout, sometimes 
not.
To force things to happen:
# conntrack -F
# conntrack -L |grep ^tcp |grep ESTABLISHED |grep ASSURED
Now, on the loaded server, most TCP sessions pick up the 432000 timeout 
value, but from time to time some still get 300 s.
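
To put a number on "most" and "some", the ESTABLISHED entries can be 
counted in two buckets: those at or below 300 s and those above. A rough 
sketch follows; the function name ct_timeout_buckets and the 300 s 
default threshold are choices made for this illustration, not anything 
provided by conntrack-tools.

```shell
# Read `conntrack -L` output on stdin and count ESTABLISHED TCP entries
# whose remaining timeout is <= the limit (default 300 s) or above it.
ct_timeout_buckets() {
    _limit=${1:-300}
    awk -v limit="$_limit" '
        $1 == "tcp" && $4 == "ESTABLISHED" {
            if ($3 + 0 <= limit) n_short++; else n_long++
        }
        END { printf "short=%d long=%d\n", n_short, n_long }'
}

# Possible usage (needs root):
#   conntrack -L 2>/dev/null | ct_timeout_buckets 300
```

Note that an entry legitimately counting down from 432000 only falls 
into the "short" bucket after almost five days without traffic, so on a 
freshly flushed table the short bucket should be close to empty if the 
configured timeout is being applied.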

On the Debian test machine, three SSH sessions are open in three windows 
(I don't have console access on this machine).
First SSH session:
# conntrack -L |grep ^tcp
conntrack v1.4.5 (conntrack-tools): 24 flow entries have been shown.
tcp      6 431144 ESTABLISHED src=10.222.219.164 dst=10.222.219.8 
sport=55243 dport=22 src=10.222.219.8 dst=10.222.219.164 sport=22 
dport=55243 [ASSURED] mark=0 use=1
tcp      6 431120 ESTABLISHED src=10.222.219.8 dst=10.222.219.164 
sport=22 dport=55339 src=10.222.219.164 dst=10.222.219.8 sport=55339 
dport=22 [ASSURED] mark=0 use=1
tcp      6 299 ESTABLISHED src=10.222.219.164 dst=10.222.219.8 
sport=54470 dport=22 src=10.222.219.8 dst=10.222.219.164 sport=22 
dport=54470 [ASSURED] mark=0 use=1

second one:
~# conntrack -L |grep ^tcp
conntrack v1.4.5 (conntrack-tools): 27 flow entries have been shown.
tcp      6 431099 ESTABLISHED src=10.222.219.164 dst=10.222.219.8 
sport=55243 dport=22 src=10.222.219.8 dst=10.222.219.164 sport=22 
dport=55243 [ASSURED] mark=0 use=1
tcp      6 431999 ESTABLISHED src=10.222.219.8 dst=10.222.219.164 
sport=22 dport=55339 src=10.222.219.164 dst=10.222.219.8 sport=55339 
dport=22 [ASSURED] mark=0 use=1
tcp      6 431963 ESTABLISHED src=10.222.219.164 dst=10.222.219.8 
sport=54470 dport=22 src=10.222.219.8 dst=10.222.219.164 sport=22 
dport=54470 [ASSURED] mark=0 use=1

last one:
# conntrack -L |grep ^tcp
conntrack v1.4.5 (conntrack-tools): 22 flow entries have been shown.
tcp      6 431999 ESTABLISHED src=10.222.219.164 dst=10.222.219.8 
sport=55243 dport=22 src=10.222.219.8 dst=10.222.219.164 sport=22 
dport=55243 [ASSURED] mark=0 use=1
tcp      6 431979 ESTABLISHED src=10.222.219.8 dst=10.222.219.164 
sport=22 dport=55339 src=10.222.219.164 dst=10.222.219.8 sport=55339 
dport=22 [ASSURED] mark=0 use=1
tcp      6 431942 ESTABLISHED src=10.222.219.164 dst=10.222.219.8 
sport=54470 dport=22 src=10.222.219.8 dst=10.222.219.164 sport=22 
dport=54470 [ASSURED] mark=0 use=1

Crazy, no?

OK, these are all "vendor" kernels, but the Debian one is pretty close 
to upstream. It seems that some upstream bugs are lurking around. Debian 
kernel 5.2.9 (32-bit) does not seem to be affected, but Ubuntu 5.0 is 
partially affected: 10% of connections (due to some backports?).

On the most affected production machine (Ubuntu with the 5.3 kernel), 
the same conntrack -L invocation sometimes returns 300 and sometimes 
432000 for the same long-running TCP connection. I don't know whether it 
is a netlink problem or a real change of the conntrack timer triggered 
by activity on the TCP session. But as my SSH sessions never survive 
more than 10 to 15 minutes, I think there is a real problem with the 
conntrack timers.
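
One way to tell a display glitch from a real timer change is to sample 
the remaining timeout of a single connection at a fixed interval and 
flag every sample that is not a plain countdown. The sketch below 
assumes one sample per line on stdin, taken about once per second; the 
function name ct_timer_jumps and the 5 s slack are hypothetical choices.

```shell
# Read one timeout sample per line and print "previous -> current" each
# time the value rises (timer refreshed) or drops by more than the slack
# (timer rewritten to a shorter value instead of just counting down).
ct_timer_jumps() {
    awk -v slack="${1:-5}" '
        { cur = $1 + 0 }
        NR > 1 && (cur > prev || prev - cur > slack) { print prev " -> " cur }
        { prev = cur }'
}

# Possible usage (needs root; 54470 is the source port of the session
# shown earlier in this message):
#   while sleep 1; do
#       conntrack -L -p tcp --sport 54470 2>/dev/null | awk '{ print $3 }'
#   done | ct_timer_jumps
```

If the reported jumps alternate between values near 432000 and values 
near 300, the kernel timer itself is being rewritten, which would point 
away from a pure netlink or display problem.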

Any thoughts ?

Emmanuel.

* Re: Strange nf_conntrack_tcp_timeout_established behavior
  2020-02-17 17:33 Strange nf_conntrack_tcp_timeout_established behavior FUSTE Emmanuel
@ 2020-02-18 12:00 ` FUSTE Emmanuel
  0 siblings, 0 replies; 2+ messages in thread
From: FUSTE Emmanuel @ 2020-02-18 12:00 UTC (permalink / raw)
  To: netfilter-devel

OK, top-posting on my own message: forget it.

Debugging SSH TCP session tracking over an SSH session is a very bad 
idea; the results of my test on the Debian machine are normal.
I think I found the culprit of my headache: heavy ZWP (zero window 
probe) filtering by some firewalls.
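
For anyone hitting the same symptom: in upstream kernels, 
nf_conntrack_tcp_timeout_max_retrans and 
nf_conntrack_tcp_timeout_unacknowledged both default to 300 seconds, 
which matches the value observed above. Whether one of them was actually 
the timer applied in this case is an assumption; the thread does not 
confirm it. A small sketch to list all TCP conntrack timeouts side by 
side (the function name ct_tcp_timeouts and its optional directory 
argument are made up for this example):

```shell
# Print name=value for every nf_conntrack_tcp_timeout_* sysctl found in
# the given directory (default: the live /proc/sys/net/netfilter tree).
ct_tcp_timeouts() {
    _dir=${1:-/proc/sys/net/netfilter}
    for _f in "$_dir"/nf_conntrack_tcp_timeout_*; do
        [ -r "$_f" ] || continue
        printf '%s=%s\n' "${_f##*/}" "$(cat "$_f")"
    done
}

# Possible usage (the nf_conntrack module must be loaded):
#   ct_tcp_timeouts | grep '=300$'
```

Any entry listed with 300 is a candidate for the timeout seen on the 
affected sessions.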

Emmanuel.

