* Conntrack not matching properly - producing serious outages
@ 2011-08-11  9:46 John A. Sullivan III
  2011-08-11 10:10 ` Eric Leblond
  2011-08-11 10:12 ` Jozsef Kadlecsik
  0 siblings, 2 replies; 19+ messages in thread
From: John A. Sullivan III @ 2011-08-11  9:46 UTC (permalink / raw)
  To: netfilter

Hello, all.  We have been having a subtle problem with conntrack for
quite a long time but it has suddenly gotten much worse.  Packets are
being matched as INVALID when we would expect them to be ESTABLISHED.
We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
iptables-1.3.5-5.3.el5_4.1.  This has escalated from a minor annoyance
that we were going to investigate to provoking serious outages and all
hands to the pump.

The conntrack table is not swamped although we did increase the max
count and the hashsize just in case to no avail:
[root@fw01 netfilter]# cat ip_conntrack_max
65536
[root@fw01 netfilter]# cat ip_conntrack_count
532
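
For scale, the two values above put the table at well under 1% of its limit (532 of 65536 entries). A minimal sketch of that arithmetic follows; conntrack_usage is a hypothetical helper name used only for illustration, not part of any netfilter tool.

```shell
#!/bin/sh
# Sketch: report conntrack table utilisation as a whole-number percentage.
# conntrack_usage is a hypothetical helper; it takes the current count and
# the maximum as arguments so it works with whichever /proc files exist.
conntrack_usage() {
    count=$1
    max=$2
    # Integer arithmetic only, so the result rounds down.
    echo "$(( count * 100 / max ))"
}

# Example, using the file names from the listing above (run from the
# same directory the author was in):
#   conntrack_usage "$(cat ip_conntrack_count)" "$(cat ip_conntrack_max)"
```

With the figures quoted above the helper prints 0, which is consistent with the statement that the table is not swamped.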


Here are three specific examples.  The first is from the FORWARD chain.
Here are the logging messages:


Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0

Aug 11 03:29:19 fw01 kernel: No Match: IN=bond1 OUT=bond4 SRC=172.x.y.73
DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940 DF PROTO=TCP
SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0

The above is a reply packet in response to 172.x.z.34 sending a packet
to 172.x.y.73 on TCP port 8080.
The iptables sequence for the initiating packet is:
Chain INPUT (policy DROP 488 packets, 45215 bytes)
 pkts bytes target     prot opt in     out     source               destination
 175K   26M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
   56  5924 LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0           state INVALID LOG flags 0 level 4 prefix `INPUT INVALID '
    4   234 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0
  344 20692 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:9xxx state NEW
    8  4376 ACCEPT     esp  --  *      *       0.0.0.0/0            0.0.0.0/0
    0     0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0           udp dpt:500
    0     0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0           udp dpt:4500
  420 31920 VPN_ALLOW  all  --  *      *       0.0.0.0/0            0.0.0.0/0           MARK match 0xcccc/0xcccc
 1181  113K UPEPIN_DENY  all  --  *      *       0.0.0.0/0            0.0.0.0/0
 1181  113K UPEPIN     all  --  *      *       0.0.0.0/0            0.0.0.0/0
  488 45215 LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0           LOG flags 0 level 4 prefix `No Match: '

It should find a match in UPEPIN where it will hit the ACCESS_GROUPS
rule:
Chain UPEPIN (2 references)
 pkts bytes target     prot opt in     out     source               destination
67188 9977K ProtectionFilterSource  all  --  *      *       0.0.0.0/0            0.0.0.0/0
21302 1311K ProtectionFilterTCP  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0
16218 1235K ProtectionFilterICMP  icmp --  *      *       0.0.0.0/0            0.0.0.0/0
67188 9977K ACCESS_GROUPS  all  --  *      *       0.0.0.0/0            0.0.0.0/0

Inside ACCESS_GROUPS, it will match and jump to chain c52:
Chain ACCESS_GROUPS (3 references)
 pkts bytes target     prot opt in     out     source               destination
2549  798K c52        all  --  *      *       172.x.z.34          0.0.0.0/0

c52 jumps it to chain c29:
Chain c52 (6 references)
 pkts bytes target     prot opt in     out     source               destination
 4991 1598K c29        all  --  *      *       0.0.0.0/0            0.0.0.0/0
  313 49263 c47        all  --  *      *       0.0.0.0/0            0.0.0.0/0

where it finds a match
 pkts bytes target     prot opt in     out     source               destination
  142  8512 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           destination IP range 172.x.y.72-172.x.y.73 tcp dpt:8080

So why is the reply packet INVALID instead of ESTABLISHED? How can we
troubleshoot?

The following two examples on the INPUT chain should not even be on the
INPUT chain as NAT should be translating the addresses and placing the
packet in the FORWARD chain:

Aug 11 04:13:58 fw01 kernel: INPUT INVALID IN=bond3 OUT=
MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=54331 DF PROTO=TCP
SPT=23012 DPT=441 WINDOW=1126 RES=0x00 ACK PSH URGP=0
Aug 11 04:13:58 fw01 kernel: No Match: IN=bond3 OUT=
MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=54331 DF PROTO=TCP
SPT=23012 DPT=441 WINDOW=1126 RES=0x00 ACK PSH URGP=0


Aug 10 19:12:19 fw01 kernel: INPUT INVALID IN=bond3 OUT=
MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=74.75.231.235
DST=208.a.b.8 LEN=52 TOS=0x00 PREC=0x00 TTL=47 ID=12470 DF PROTO=TCP
SPT=47233 DPT=443 WINDOW=1716 RES=0x00 ACK FIN URGP=0
Aug 10 19:12:19 fw01 kernel: No Match: IN=bond3 OUT=
MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=74.75.231.235
DST=208.a.b.8 LEN=52 TOS=0x00 PREC=0x00 TTL=47 ID=12470 DF PROTO=TCP
SPT=47233 DPT=443 WINDOW=1716 RES=0x00 ACK FIN URGP=0

Here is the iptables sequence:
Chain PREROUTING (policy ACCEPT 58761 packets, 4122K bytes)
 pkts bytes target     prot opt in     out     source               destination
59417 4161K ServiceDNAT  all  --  *      *       0.0.0.0/0            0.0.0.0/0
59337 4156K NetNATPRE  all  --  *      *       0.0.0.0/0            0.0.0.0/0

They should hit the ServiceDNAT chain:
Chain ServiceDNAT (1 references)
 pkts bytes target     prot opt in     out     source               destination
60414 4233K ProxyDNAT  all  --  *      *       0.0.0.0/0            0.0.0.0/0
    6   360 DNAT       tcp  --  bond3  *       0.0.0.0/0            208.a.b.8         tcp dpt:441 to:172.c.d.3:9xxx
    5   372 DNAT       tcp  --  bond3  *       0.0.0.0/0            208.a.b.8         tcp dpt:443 to:172.c.d.1:9xxx

So why are we seeing these packets on the INPUT chain? They should be
picked up by CONNTRACK, translated, and then placed on the FORWARD
chain.  Is this typical of a misconfiguration on our part? A bug? How do
we troubleshoot it? Thanks very much - John




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Conntrack not matching properly - producing serious outages
  2011-08-11  9:46 Conntrack not matching properly - producing serious outages John A. Sullivan III
@ 2011-08-11 10:10 ` Eric Leblond
  2011-08-11 12:03   ` John A. Sullivan III
  2011-08-11 16:35   ` John A. Sullivan III
  2011-08-11 10:12 ` Jozsef Kadlecsik
  1 sibling, 2 replies; 19+ messages in thread
From: Eric Leblond @ 2011-08-11 10:10 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netfilter


Hello John,

Nice to hear from you again ;)

On Thu, 2011-08-11 at 05:46 -0400, John A. Sullivan III wrote:
> Hello, all.  We have been having a subtle problem with conntrack for
> quite a long time but it has suddenly gotten much worse.  Packets are
> being matched as INVALID when we would expect them to be ESTABLISHED.
> We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
> iptables-1.3.5-5.3.el5_4.1.  This has escalated from a minor annoyance
> that we were going to investigate to provoking serious outages and all
> hands to the pump.
> 
> The conntrack table is not swamped although we did increase the max
> count and the hashsize just in case to no avail:
> [root@fw01 netfilter]# cat ip_conntrack_max
> 65536
> [root@fw01 netfilter]# cat ip_conntrack_count
> 532
> 
> 
> Here are three specific examples.  The first is from the FORWARD chain.
> Here are the logging messages:
> 
> 
> Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
> SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
> DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0

I've already observed this kind of problem. It was related to some
software/OSes having really strange timeout values.

To check whether this is the same problem, you can ask the kernel to log
the reason why the packets are invalid. This can be done by running:

        echo "255">/proc/sys/net/netfilter/nf_conntrack_log_invalid

After doing this, the kernel will log all invalid packets through the
default log system. You can check which one you are using by doing:
  
        cat /proc/net/netfilter/nf_log 
         0 NONE ()
         1 NONE ()
         2 ipt_LOG (ipt_LOG)
        ...
        
2 is the protocol family number for IPv4. With that ipt_LOG value, the
messages are sent via the standard kernel log. If instead of this value
you've got something like ULOG or NFLOG, you will need to get the
messages by listening to nflog-group or ulog-group 0 in ulogd[2].
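
The check described above can be scripted. The following is only a sketch: nf_logger_for is a hypothetical helper name, and it assumes the nf_log file has the three-column layout shown in the sample output above.

```shell
#!/bin/sh
# Sketch: print the logger registered for one protocol family.
# Reads text in the /proc/net/netfilter/nf_log layout from stdin.
# nf_logger_for is a hypothetical helper name, not an existing tool.
nf_logger_for() {
    family=$1
    # Column 1 is the protocol family number, column 2 the active logger.
    # Exit non-zero if the requested family is not listed at all.
    awk -v fam="$family" '$1 == fam { print $2; found = 1 }
                          END { exit !found }'
}

# Example (family 2 is IPv4):
#   nf_logger_for 2 < /proc/net/netfilter/nf_log
```

If this prints ipt_LOG, the reasons for INVALID classification should appear in the ordinary kernel log once nf_conntrack_log_invalid is set.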

If this is a timeout issue, you can play with the conntrack timeout
settings in the /proc/sys/net/netfilter/nf_conntrack_*time* files.
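
Before changing any timeout it helps to record the current values. A minimal sketch, assuming the files live under /proc/sys/net/netfilter as stated above; show_conntrack_timeouts is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against a copy of the files.

```shell
#!/bin/sh
# Sketch: print every conntrack time-related setting and its value.
# show_conntrack_timeouts is a hypothetical helper name.
show_conntrack_timeouts() {
    dir=${1:-/proc/sys/net/netfilter}
    # Same glob as in the message above.
    for f in "$dir"/nf_conntrack_*time*; do
        # Skip the literal pattern if nothing matched, and unreadable files.
        [ -r "$f" ] || continue
        printf '%s = %s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Example:
#   show_conntrack_timeouts > timeouts.before
```

Saving the output before and after a change makes it easy to revert if the adjustment does not help.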

BR,
-- 
Eric Leblond 
Blog: http://home.regit.org/



* Re: Conntrack not matching properly - producing serious outages
  2011-08-11  9:46 Conntrack not matching properly - producing serious outages John A. Sullivan III
  2011-08-11 10:10 ` Eric Leblond
@ 2011-08-11 10:12 ` Jozsef Kadlecsik
  2011-08-11 12:09   ` John A. Sullivan III
  2011-08-11 14:00   ` Jan Engelhardt
  1 sibling, 2 replies; 19+ messages in thread
From: Jozsef Kadlecsik @ 2011-08-11 10:12 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netfilter

Hi,

On Thu, 11 Aug 2011, John A. Sullivan III wrote:

> Hello, all.  We have been having a subtle problem with conntrack for
> quite a long time but it has suddenly gotten much worse.  Packets are
> being matched as INVALID when we would expect them to be ESTABLISHED.
> We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
> iptables-1.3.5-5.3.el5_4.1.  This has escalated from a minor annoyance
> that we were going to investigate to provoking serious outages and all
> hands to the pump.
> 
> The conntrack table is not swamped although we did increase the max
> count and the hashsize just in case to no avail:
> [root@fw01 netfilter]# cat ip_conntrack_max
> 65536
> [root@fw01 netfilter]# cat ip_conntrack_count
> 532
> 
> Here are three specific examples.  The first is from the FORWARD chain.
> Here are the logging messages:
>  
> Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
> SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
> DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0

Those are, with high probability, late FIN packets: the corresponding
conntrack entry has already been deleted, so conntrack cannot find the
matching stream and therefore marks the packet as INVALID.
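
One way to see how many INVALID hits fit this late-FIN pattern is to tally the logged packets by TCP flags. The sketch below assumes each LOG entry is on a single line in the syslog file (the entries in this thread are wrapped only by the mail client) and that the flags sit between RES=0x.. and URGP=, as in the samples. invalid_by_flags is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count INVALID log entries per TCP flag combination.
# Reads syslog lines on stdin. invalid_by_flags is a hypothetical helper.
invalid_by_flags() {
    awk '/INVALID/ {
        # Ignore lines that do not carry the TCP fields we rely on.
        if ($0 !~ /RES=0x/) next
        flags = $0
        sub(/.*RES=0x[0-9a-fA-F]+ /, "", flags)  # drop everything up to the flags
        sub(/ URGP=.*/, "", flags)               # drop everything after them
        n[flags]++
    }
    END { for (f in n) print n[f], f }' | sort
}

# Example:
#   invalid_by_flags < /var/log/messages
```

Entries with FIN set are candidates for the late-FIN explanation; entries such as ACK PSH arrive mid-stream and would need a different explanation. The "No Match:" lines are ignored because they do not contain the word INVALID, so each packet is counted once.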

> So why is the reply packet INVALID instead of ESTABLISHED? How can we
> troubleshoot?

If NAT is enabled, never ever let packets with INVALID state pass through, 
because NAT will skip them.
 
Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary


* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 10:10 ` Eric Leblond
@ 2011-08-11 12:03   ` John A. Sullivan III
  2011-08-11 16:35   ` John A. Sullivan III
  1 sibling, 0 replies; 19+ messages in thread
From: John A. Sullivan III @ 2011-08-11 12:03 UTC (permalink / raw)
  To: Eric Leblond; +Cc: netfilter

On Thu, 2011-08-11 at 12:10 +0200, Eric Leblond wrote:
> Hello John,
> 
> Nice to hear from you again ;)
Thanks, it has been a while since I've been here. We're still plodding
along with ISCS (http://iscs.sourceforge.net).  Although we've not
updated the site in years and I'm way behind on a new release, we have
actually made considerable progress.  There still seems to be nothing
else that does what it does.  We have coined the term Firepipes to
describe it as opposed to Firewall as that better describes our model of
no inside and no outside, i.e., no one can go anywhere on the network
unless they have a firepipe and explosions in the firepipe stay in the
firepipe, i.e., no escalation of privileges.  So, hopefully at some
point, we'll pick up some corporate sponsorship as it is far too big for
our limited resources.
> 
> On Thu, 2011-08-11 at 05:46 -0400, John A. Sullivan III wrote:
> > Hello, all.  We have been having a subtle problem with conntrack for
> > quite a long time but it has suddenly gotten much worse.  Packets are
> > being matched as INVALID when we would expect them to be ESTABLISHED.
> > We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
> > iptables-1.3.5-5.3.el5_4.1.  This has escalated from a minor annoyance
> > that we were going to investigate to provoking serious outages and all
> > hands to the pump.
> > 
> > The conntrack table is not swamped although we did increase the max
> > count and the hashsize just in case to no avail:
> > [root@fw01 netfilter]# cat ip_conntrack_max
> > 65536
> > [root@fw01 netfilter]# cat ip_conntrack_count
> > 532
> > 
> > 
> > Here are three specific examples.  The first is from the FORWARD chain.
> > Here are the logging messages:
> > 
> > 
> > Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
> > SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
> > DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0
> 
> I've already observed this kind of problem. It was related to some
> software/OSes having really strange timeout values.
> 
> To check whether this is the same problem, you can ask the kernel to log
> the reason why the packets are invalid. This can be done by running:
> 
>         echo "255">/proc/sys/net/netfilter/nf_conntrack_log_invalid
> 
> After doing this, the kernel will log all invalid packets through the
> default log system. You can check which one you are using by doing:
>   
>         cat /proc/net/netfilter/nf_log 
>          0 NONE ()
>          1 NONE ()
>          2 ipt_LOG (ipt_LOG)
>         ...
>         
Ah - that's why we didn't see anything when we enabled it.  I'll do that
again with the proper log setting.  Thanks.
> 2 is the protocol family number for IPv4. With that ipt_LOG value, the
> messages are sent via the standard kernel log. If instead of this value
> you've got something like ULOG or NFLOG, you will need to get the
> messages by listening to nflog-group or ulog-group 0 in ulogd[2].
> 
> If this is a timeout issue, you can play with the conntrack timeout
> settings in the /proc/sys/net/netfilter/nf_conntrack_*time* files.
> 
> BR,
We'll see what the logging says. The strange thing about the last two
examples is that they happen midstream.  They are X2Go sessions
(www.x2go.org), an NX implementation for remote display presentation.
The users are typing away when, suddenly, their session drops and we see
these INVALID packets and associated drops so it doesn't smell like a
timing issue.  But we'll log and see what we get.  Thanks again - John



* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 10:12 ` Jozsef Kadlecsik
@ 2011-08-11 12:09   ` John A. Sullivan III
  2011-08-11 12:26     ` Jozsef Kadlecsik
  2011-08-11 14:00   ` Jan Engelhardt
  1 sibling, 1 reply; 19+ messages in thread
From: John A. Sullivan III @ 2011-08-11 12:09 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: netfilter

On Thu, 2011-08-11 at 12:12 +0200, Jozsef Kadlecsik wrote:
> Hi,
> 
> On Thu, 11 Aug 2011, John A. Sullivan III wrote:
> 
> > Hello, all.  We have been having a subtle problem with conntrack for
> > quite a long time but it has suddenly gotten much worse.  Packets are
> > being matched as INVALID when we would expect them to be ESTABLISHED.
> > We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
> > iptables-1.3.5-5.3.el5_4.1.  This has escalated from a minor annoyance
> > that we were going to investigate to provoking serious outages and all
> > hands to the pump.
> > 
> > The conntrack table is not swamped although we did increase the max
> > count and the hashsize just in case to no avail:
> > [root@fw01 netfilter]# cat ip_conntrack_max
> > 65536
> > [root@fw01 netfilter]# cat ip_conntrack_count
> > 532
> > 
> > Here are three specific examples.  The first is from the FORWARD chain.
> > Here are the logging messages:
> >  
> > Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
> > SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
> > DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0
> 
> Those are, with high probability, late FIN packets: the corresponding
> conntrack entry has already been deleted, so conntrack cannot find the
> matching stream and therefore marks the packet as INVALID.
Thank you very much, Jozsef.  That would explain why we did not
categorize this as a high priority in the past as it seemed to have
minimal impact.  I would guess we do not need to be concerned about
these.

However, the other two are much more problematic and what escalated this
into a crisis.  As I just explained in another reply, these are
happening in the middle of activity, i.e., they are NX remote desktop
sessions being carried via SSH.  The users are in the middle of typing
or scrolling through their desktops, in other words, the connection is
definitely active and passing many packets.  Then, without warning,
their desktops freeze, the connection eventually times out, and we see
these INVALID and dropped packets.  That's the one we really need to
solve.
> 
> > So why is the reply packet INVALID instead of ESTABLISHED? How can we
> > troubleshoot?
> 
> If NAT is enabled, never ever let packets with INVALID state pass through, 
> because NAT will skip them.
I'm not entirely sure what you mean by this - sorry.  Are you saying we
should always have a rule to drop INVALID packets at the beginning of
NAT or are you saying that the reason we are seeing these in the INPUT
chain is because they were "labeled" as INVALID before hitting the nat
table and that's why NAT skipped them? If the latter, we are still back
to the original problem of why are these ESTABLISHED packets being
considered as INVALID?

Thanks very much - John
>  
> Best regards,
> Jozsef
> -
> E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
> PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
> Address : KFKI Research Institute for Particle and Nuclear Physics
>           H-1525 Budapest 114, POB. 49, Hungary




* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 12:09   ` John A. Sullivan III
@ 2011-08-11 12:26     ` Jozsef Kadlecsik
  2011-08-11 12:36       ` John A. Sullivan III
  2011-08-11 19:14       ` John A. Sullivan III
  0 siblings, 2 replies; 19+ messages in thread
From: Jozsef Kadlecsik @ 2011-08-11 12:26 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netfilter

On Thu, 11 Aug 2011, John A. Sullivan III wrote:

> On Thu, 2011-08-11 at 12:12 +0200, Jozsef Kadlecsik wrote:
> > 
> > On Thu, 11 Aug 2011, John A. Sullivan III wrote:
> > 
> > > Hello, all.  We have been having a subtle problem with conntrack for
> > > quite a long time but it has suddenly gotten much worse.  Packets are
> > > being matched as INVALID when we would expect them to be ESTABLISHED.
> > > We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
> > > iptables-1.3.5-5.3.el5_4.1.  This has escalated from a minor annoyance
> > > that we were going to investigate to provoking serious outages and all
> > > hands to the pump.
> > > 
> > > The conntrack table is not swamped although we did increase the max
> > > count and the hashsize just in case to no avail:
> > > [root@fw01 netfilter]# cat ip_conntrack_max
> > > 65536
> > > [root@fw01 netfilter]# cat ip_conntrack_count
> > > 532
> > > 
> > > Here are three specific examples.  The first is from the FORWARD chain.
> > > Here are the logging messages:
> > >  
> > > Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
> > > SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
> > > DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0
> > 
> > Those are, with high probability, late FIN packets: the corresponding
> > conntrack entry has already been deleted, so conntrack cannot find the
> > matching stream and therefore marks the packet as INVALID.
> Thank you very much, Jozsef.  That would explain why we did not
> categorize this as a high priority in the past as it seemed to have
> minimal impact.  I would guess we do not need to be concerned about
> these.
> 
> However, the other two are much more problematic and what escalated this
> into a crisis.  As I just explained in another reply, these are
> happening in the middle of activity, i.e., they are NX remote desktop
> sessions being carried via SSH.  The users are in the middle of typing
> or scrolling through their desktops, in other words, the connection is
> definitely active and passing many packets.  Then, without warning,
> their desktops freeze, the connection eventually times out, and we see
> these INVALID and dropped packets.  That's the one we really need to
> solve.

That might be related to SACK option handling: some "clever" devices love 
to mangle TCP SEQ/ACK values, but forget about the SACK options. Try to 
disable SACK support on both communicating endpoints. If the problem 
disappears, then it's a SACK issue.
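
On a Linux endpoint, SACK is controlled by the net.ipv4.tcp_sack sysctl. The sketch below only reports the current state; sack_state is a hypothetical helper name, and other operating systems will need their own equivalent setting.

```shell
#!/bin/sh
# Sketch: report whether SACK is enabled on a Linux endpoint.
# sack_state is a hypothetical helper; the optional argument lets it
# read a file other than the live sysctl.
sack_state() {
    file=${1:-/proc/sys/net/ipv4/tcp_sack}
    if [ "$(cat "$file")" = "0" ]; then
        echo disabled
    else
        echo enabled
    fi
}

# To turn SACK off for the duration of the test (as root; the setting
# is not persistent across reboots):
#   sysctl -w net.ipv4.tcp_sack=0
```

Since SACK is negotiated when a connection is opened, the affected sessions should be re-established after the change for the test to be meaningful.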

> > > So why is the reply packet INVALID instead of ESTABLISHED? How can we
> > > troubleshoot?
> > 
> > If NAT is enabled, never ever let packets with INVALID state pass through, 
> > because NAT will skip them.
> I'm not entirely sure what you mean by this - sorry.  Are you saying we
> should always have a rule to drop INVALID packets at the beginning of
> NAT or are you saying that the reason we are seeing these in the INPUT
> chain is because they were "labeled" as INVALID before hitting the nat
> table and that's why NAT skipped them? If the latter, we are still back
> to the original problem of why are these ESTABLISHED packets being
> considered as INVALID?

Yes, drop INVALID packets. Of course not in the NAT table, but in the 
filter table. The NAT engine will skip them and they'd be sent out
without natting.
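
As a rough sketch of that advice, the rules could look like the output of the helper below. It only prints the iptables commands so they can be reviewed first; invalid_drop_rules is a hypothetical helper name, and the chain names are whatever the local ruleset uses.

```shell
#!/bin/sh
# Sketch: print (not execute) filter-table rules that drop INVALID packets.
# invalid_drop_rules is a hypothetical helper name.
invalid_drop_rules() {
    for chain in "$@"; do
        # -I without a rule number inserts at the head of the chain, so
        # the DROP is evaluated before any rule that could ACCEPT.
        echo "iptables -I $chain -m state --state INVALID -j DROP"
    done
}

# Example:
#   invalid_drop_rules INPUT FORWARD OUTPUT
```

If INVALID packets should still be logged, a LOG rule with the same match has to sit in front of the DROP rule, since a dropped packet never reaches later rules.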

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary


* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 12:26     ` Jozsef Kadlecsik
@ 2011-08-11 12:36       ` John A. Sullivan III
  2011-08-11 19:14       ` John A. Sullivan III
  1 sibling, 0 replies; 19+ messages in thread
From: John A. Sullivan III @ 2011-08-11 12:36 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: netfilter

On Thu, 2011-08-11 at 14:26 +0200, Jozsef Kadlecsik wrote:
> On Thu, 11 Aug 2011, John A. Sullivan III wrote:
> 
> > On Thu, 2011-08-11 at 12:12 +0200, Jozsef Kadlecsik wrote:
> > > 
> > > On Thu, 11 Aug 2011, John A. Sullivan III wrote:
> > > 
> > > > Hello, all.  We have been having a subtle problem with conntrack for
> > > > quite a long time but it has suddenly gotten much worse.  Packets are
> > > > being matched as INVALID when we would expect them to be ESTABLISHED.
> > > > We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
> > > > iptables-1.3.5-5.3.el5_4.1.  This has escalated from a minor annoyance
> > > > that we were going to investigate to provoking serious outages and all
> > > > hands to the pump.
> > > > 
> > > > The conntrack table is not swamped although we did increase the max
> > > > count and the hashsize just in case to no avail:
> > > > [root@fw01 netfilter]# cat ip_conntrack_max
> > > > 65536
> > > > [root@fw01 netfilter]# cat ip_conntrack_count
> > > > 532
> > > > 
> > > > Here are three specific examples.  The first is from the FORWARD chain.
> > > > Here are the logging messages:
> > > >  
> > > > Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
> > > > SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
> > > > DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0
> > > 
> > > Those are, with high probability, late FIN packets: the corresponding
> > > conntrack entry has already been deleted, so conntrack cannot find the
> > > matching stream and therefore marks the packet as INVALID.
> > Thank you very much, Jozsef.  That would explain why we did not
> > categorize this as a high priority in the past as it seemed to have
> > minimal impact.  I would guess we do not need to be concerned about
> > these.
> > 
> > However, the other two are much more problematic and what escalated this
> > into a crisis.  As I just explained in another reply, these are
> > happening in the middle of activity, i.e., they are NX remote desktop
> > sessions being carried via SSH.  The users are in the middle of typing
> > or scrolling through their desktops, in other words, the connection is
> > definitely active and passing many packets.  Then, without warning,
> > their desktops freeze, the connection eventually times out, and we see
> > these INVALID and dropped packets.  That's the one we really need to
> > solve.
> 
> That might be related to SACK option handling: some "clever" devices love 
> to mangle TCP SEQ/ACK values, but forget about the SACK options. Try to 
> disable SACK support on both communicating endpoints. If the problem 
> disappears, then it's a SACK issue.
Thanks, I'll need to refill my SACK knowledge!
> 
> > > > So why is the reply packet INVALID instead of ESTABLISHED? How can we
> > > > troubleshoot?
> > > 
> > > If NAT is enabled, never ever let packets with INVALID state pass through, 
> > > because NAT will skip them.
> > I'm not entirely sure what you mean by this - sorry.  Are you saying we
> > should always have a rule to drop INVALID packets at the beginning of
> > NAT or are you saying that the reason we are seeing these in the INPUT
> > chain is because they were "labeled" as INVALID before hitting the nat
> > table and that's why NAT skipped them? If the latter, we are still back
> > to the original problem of why are these ESTABLISHED packets being
> > considered as INVALID?
> 
> Yes, drop INVALID packets. Of course not in the NAT table, but in the 
> filter table. The NAT engine will skip them and they'd be sent out
> without natting.
Ah, OK - POSTROUTING.  I've been focused on our PREROUTING issue.  I
think we're covered outbound in that, in our configuration, if it's not
ACCEPTed somewhere in the filter table, it is dropped.  Thanks - John
> 
> Best regards,
> Jozsef
> -
> E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
> PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
> Address : KFKI Research Institute for Particle and Nuclear Physics
>           H-1525 Budapest 114, POB. 49, Hungary




* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 10:12 ` Jozsef Kadlecsik
  2011-08-11 12:09   ` John A. Sullivan III
@ 2011-08-11 14:00   ` Jan Engelhardt
  2011-08-11 14:36     ` Jozsef Kadlecsik
  1 sibling, 1 reply; 19+ messages in thread
From: Jan Engelhardt @ 2011-08-11 14:00 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: John A. Sullivan III, netfilter


On Thursday 2011-08-11 12:12, Jozsef Kadlecsik wrote:
>> Packets are
>> being matched as INVALID when we would expect them to be ESTABLISHED.
>> We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
>> iptables-1.3.5-5.3.el5_4.1.
>> [...]
>> Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
>> SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
>> DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0
>
>Those are, with high probability, late FIN packets: the corresponding
>conntrack entry has already been deleted, so conntrack cannot find the
>matching stream and therefore marks the packet as INVALID.

Should not FIN retransmissions ideally be classified as ESTABLISHED (or
perhaps a new state) as long as the final ACK has not been seen?


* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 14:00   ` Jan Engelhardt
@ 2011-08-11 14:36     ` Jozsef Kadlecsik
  2011-08-11 14:38       ` Jan Engelhardt
  0 siblings, 1 reply; 19+ messages in thread
From: Jozsef Kadlecsik @ 2011-08-11 14:36 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: John A. Sullivan III, netfilter

On Thu, 11 Aug 2011, Jan Engelhardt wrote:

> On Thursday 2011-08-11 12:12, Jozsef Kadlecsik wrote:
> >> Packets are
> >> being matched as INVALID when we would expect them to be ESTABLISHED.
> >> We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
> >> iptables-1.3.5-5.3.el5_4.1.
> >> [...]
> >> Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
> >> SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
> >> DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0
> >
> >Those are, with high probability, late FIN packets: the corresponding
> >conntrack entry has already been deleted, so conntrack cannot find the
> >matching stream and therefore marks the packet as INVALID.
> 
> Should not FIN retransmissions ideally be classified as ESTABLISHED (or
> perhaps a new state) as long as the final ACK has not been seen?

The final ACK might have already been seen. A full tcpdump could tell us 
what happened exactly.
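
A capture of one affected flow could be taken along these lines. This is only a sketch: capture_cmd is a hypothetical helper that prints a tcpdump command for review, and the interface, addresses, port and output file are placeholders to be filled in locally.

```shell
#!/bin/sh
# Sketch: build a tcpdump command that records one TCP flow in full.
# capture_cmd is a hypothetical helper; it prints the command only.
capture_cmd() {
    iface=$1; a=$2; b=$3; port=$4; out=$5
    # -n     no name resolution
    # -s 0   capture whole packets, so TCP options are not truncated
    # -w     write raw packets to a file for later analysis
    echo "tcpdump -n -i $iface -s 0 -w $out host $a and host $b and tcp port $port"
}

# Example (placeholder addresses):
#   capture_cmd bond1 192.0.2.10 192.0.2.20 8080 /tmp/flow.pcap
```

Capturing on both the ingress and egress interfaces of the firewall would show whether the final ACK passed through before the late FIN arrived.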

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary


* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 14:36     ` Jozsef Kadlecsik
@ 2011-08-11 14:38       ` Jan Engelhardt
  2011-08-11 14:48         ` Jozsef Kadlecsik
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Engelhardt @ 2011-08-11 14:38 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: John A. Sullivan III, netfilter

On Thursday 2011-08-11 16:36, Jozsef Kadlecsik wrote:

>On Thu, 11 Aug 2011, Jan Engelhardt wrote:
>
>> On Thursday 2011-08-11 12:12, Jozsef Kadlecsik wrote:
>> >> Packets are
>> >> being matched as INVALID when we would expect them to be ESTABLISHED.
>> >> We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
>> >> iptables-1.3.5-5.3.el5_4.1.
>> >> [...]
>> >> Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
>> >> SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
>> >> DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0
>> >
>> >Those are, with high probability, late FIN packets: the corresponding
>> >conntrack entry has already been deleted, so conntrack cannot find the
>> >matching stream and therefore marks the packet as INVALID.
>> 
>> Should not FIN retransmissions ideally be classified as ESTABLISHED (or
>> perhaps a new state) as long as the final ACK has not been seen?
>
>The final ACK might have already been seen. A full tcpdump could tell us 
>what happened exactly.

But perhaps NFCT should assume that it did not reach its destination
and should accept more FIN-ACKs until the MSL has elapsed.


* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 14:38       ` Jan Engelhardt
@ 2011-08-11 14:48         ` Jozsef Kadlecsik
  2011-08-11 14:59           ` AW: " Fiedler Roman
  0 siblings, 1 reply; 19+ messages in thread
From: Jozsef Kadlecsik @ 2011-08-11 14:48 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: John A. Sullivan III, netfilter

On Thu, 11 Aug 2011, Jan Engelhardt wrote:

> On Thursday 2011-08-11 16:36, Jozsef Kadlecsik wrote:
> 
> >On Thu, 11 Aug 2011, Jan Engelhardt wrote:
> >
> >> On Thursday 2011-08-11 12:12, Jozsef Kadlecsik wrote:
> >> >> Packets are
> >> >> being matched as INVALID when we would expect them to be ESTABLISHED.
> >> >> We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
> >> >> iptables-1.3.5-5.3.el5_4.1.
> >> >> [...]
> >> >> Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
> >> >> SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
> >> >> DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0
> >> >
> >> >Those are, with high probability, late FIN packets: the belonging conntrack 
> >> >entry has already been deleted and thus conntrack cannot find the matching 
> >> >stream, therefore it sets as INVALID.
> >> 
> >> Should not FIN retransmissions ideally be classified as ESTABLISHED (or
> >> perhaps a new state) as long as the final ACK has not been seen?
> >
> >The final ACK might have already been seen. A full tcpdump could tell us 
> >what happened exactly.
> 
> But perhaps NFCT should assume that it did not reach its destination
> and should accept more FIN-ACKs until the MSL has elapsed.

The price is wasted memory, because every conntrack entry would be kept
longer.

We would need to receive more reports that the current default values are
not appropriate before changing them.
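
The trade-off discussed above can be illustrated with a toy model. This is
only a sketch under stated assumptions, not how nf_conntrack is implemented:
the class ToyConntrack, its single close_timeout value and the flow keys are
all hypothetical, and the real TCP tracker uses several per-state timeouts.
It shows that a retransmitted FIN is classified INVALID once the entry has
expired, and that a longer timeout avoids this at the cost of keeping the
entry in the table.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConntrack:
    """Toy flow table (hypothetical, NOT the kernel's data structure)."""
    close_timeout: float                           # seconds an entry lingers after the first FIN
    expires: dict = field(default_factory=dict)    # flow key -> absolute expiry time

    def track(self, flow):
        """Start tracking an established flow with no expiry yet."""
        self.expires[flow] = float("inf")

    def classify(self, flow, now):
        """Return ESTABLISHED if an unexpired entry exists, else INVALID."""
        exp = self.expires.get(flow)
        if exp is None or now >= exp:
            self.expires.pop(flow, None)           # expired entries are dropped
            return "INVALID"
        return "ESTABLISHED"

    def fin(self, flow, now):
        """Classify a FIN; the first FIN seen starts the close timer."""
        state = self.classify(flow, now)
        if state == "ESTABLISHED" and self.expires[flow] == float("inf"):
            self.expires[flow] = now + self.close_timeout
        return state

    def live_entries(self, now):
        """Number of entries still occupying the table at time `now`."""
        return sum(1 for exp in self.expires.values() if now < exp)


# A FIN retransmitted 30 s after the first one, under two timeouts.
short, long_ = ToyConntrack(close_timeout=10), ToyConntrack(close_timeout=120)
for table in (short, long_):
    table.track("flow-a")
    table.fin("flow-a", now=0)
late_short = short.fin("flow-a", now=30)
late_long = long_.fin("flow-a", now=30)
```

With the short timeout the late FIN finds no entry, while the long timeout
still matches it but keeps one more entry alive in the table.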

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary


* AW: Conntrack not matching properly - producing serious outages
  2011-08-11 14:48         ` Jozsef Kadlecsik
@ 2011-08-11 14:59           ` Fiedler Roman
  0 siblings, 0 replies; 19+ messages in thread
From: Fiedler Roman @ 2011-08-11 14:59 UTC (permalink / raw)
  To: Jozsef Kadlecsik, Jan Engelhardt; +Cc: John A. Sullivan III, netfilter

> > >> >Those are, with high probability, late FIN packets: the belonging conntrack
> > >> >entry has already been deleted and thus conntrack cannot find the matching
> > >> >stream, therefore it sets as INVALID.
> > >>
> > >> Should not FIN retransmissions ideally be classified as ESTABLISHED (or
> > >> perhaps a new state) as long as the final ACK has not been seen?
> > >
> > >The final ACK might have already been seen. A full tcpdump could tell us
> > >what happened exactly.
> >
> > But perhaps NFCT should assume that it did not reach its destination
> > and should accept more FIN-ACKs until the MSL has elapsed.
> 
> The price is to waste the memory, by keeping every conntrack entry longer.
> 
> We should receive more reports that the current default values are not
> appropriate.

I have also observed the FIN-repeat problem, and I thought that increasing
nf_conntrack_tcp_timeout_fin_wait fixed it. It was a dirty trial-and-error
hack, though: I could not find documentation on that parameter and did not
want to dig into the kernel source for that old machine, so this might just
have been superstition.
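
For anyone wanting to see the current values before experimenting, here is
a minimal sketch that lists the TCP timeout settings. It assumes the
nf_conntrack module is loaded and exposes its sysctls under
/proc/sys/net/netfilter; the function name read_conntrack_timeouts is
hypothetical, and the base directory is a parameter only so the code can be
tried against a scratch directory.

```python
from pathlib import Path


def read_conntrack_timeouts(base="/proc/sys/net/netfilter"):
    """Return {sysctl_name: seconds} for the nf_conntrack TCP timeouts.

    Files that cannot be read or do not hold an integer are skipped.
    """
    result = {}
    for path in sorted(Path(base).glob("nf_conntrack_tcp_timeout_*")):
        try:
            result[path.name] = int(path.read_text().strip())
        except (OSError, ValueError):
            continue
    return result


if __name__ == "__main__":
    for name, seconds in read_conntrack_timeouts().items():
        print(f"{name} = {seconds}")
```

Recording the values before and after a change makes it easier to tell
whether a tweak such as a larger fin_wait timeout really made a difference.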

Roman


* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 10:10 ` Eric Leblond
  2011-08-11 12:03   ` John A. Sullivan III
@ 2011-08-11 16:35   ` John A. Sullivan III
  2011-08-11 20:41     ` Jozsef Kadlecsik
  1 sibling, 1 reply; 19+ messages in thread
From: John A. Sullivan III @ 2011-08-11 16:35 UTC (permalink / raw)
  To: Eric Leblond; +Cc: netfilter

On Thu, 2011-08-11 at 12:10 +0200, Eric Leblond wrote:
> Hello John,
> 
> Nice to hear from you again ;)
> 
> On Thu, 2011-08-11 at 05:46 -0400, John A. Sullivan III wrote:
> > Hello, all.  We have been having a subtle problem with conntrack for
> > quite a long time but it has suddenly gotten much worse.  Packets are
> > being matched as INVALID when we would expect them to be ESTABLISHED.
> > We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
> > iptables-1.3.5-5.3.el5_4.1.  This has escalated from a minor annoyance
> > that we were going to investigate to provoking serious outages and all
> > hands to the pump.
> > 
> > The conntrack table is not swamped although we did increase the max
> > count and the hashsize just in case to no avail:
> > [root@fw01 netfilter]# cat ip_conntrack_max
> > 65536
> > [root@fw01 netfilter]# cat ip_conntrack_count
> > 532
> > 
> > 
> > Here are three specific examples.  The first is from the FORWARD chain.
> > Here are the logging messages:
> > 
> > 
> > Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
> > SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
> > DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0
> 
> I've already observed this kind of problem. This was related with some
> software/OSes having really strange timeout value.
> 
> To check whether this is the same problem, you can ask the kernel to log
> the reason why the packets are invalid. This can be done by running:
> 
>         echo "255">/proc/sys/net/netfilter/nf_conntrack_log_invalid
> 
> After doing this, the kernel will log all invalid packets through the
> default log system. You can check which one you are using by doing:
>   
>         cat /proc/net/netfilter/nf_log 
>          0 NONE ()
>          1 NONE ()
>          2 ipt_LOG (ipt_LOG)
>         ...
>         
> 2 is the protocol family number for IPv4 (AF_INET). With that ipt_LOG
> value, the messages are sent via the standard kernel log. If instead of
> this value, you've got something like ULOG or NFLOG, you will need to get
> the messages by listening to the nflog-group or ulog-group 0 in ulogd[2].
> 
> If this is timeout issue, you can play with the timeout setting of the
> conntrack in the /proc/sys/net/netfilter/nf_conntrack_*time* files.
> 
> BR,
I've just begun to wade my way through SACK as Jozsef suggested after
getting some sleep but I was able to catch a live one with logging
enabled:

Aug 11 11:56:24 fw01 kernel: nf_ct_tcp: bad TCP checksum IN= OUT=
SRC=95.172.228.42 DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52
ID=29203 DF PROTO=TCP SPT=46721 DPT=441 SEQ=2834861284 ACK=3682327577
WINDOW=1002 RES=0x00 ACK PSH URGP=0 OPT (0101080A01249B0846B0F23B)

Aug 11 11:56:24 fw01 kernel: INPUT INVALID IN=bond3 OUT=
MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=29203 DF PROTO=TCP
SPT=46721 DPT=441 WINDOW=1002 RES=0x00 ACK PSH URGP=0

Aug 11 11:56:24 fw01 kernel: No Match: IN=bond3 OUT=
MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=29203 DF PROTO=TCP
SPT=46721 DPT=441 WINDOW=1002 RES=0x00 ACK PSH URGP=0
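
When comparing many entries like the three above, it can help to split
each log message into fields. The following is a rough sketch, assuming the
message has the KEY=VALUE layout shown in these examples and has been
joined onto one line; the function name parse_nf_log is hypothetical and
this is not an official parser for the LOG target's output.

```python
import re


def parse_nf_log(message):
    """Split a netfilter log message into (prefix, fields, flags, options).

    prefix  : free text before the first "IN=" (for example the reason
              printed by nf_conntrack_log_invalid)
    fields  : dict of KEY=VALUE tokens (values stay strings)
    flags   : bare tokens such as DF, ACK, PSH, FIN
    options : hex string from a trailing "OPT (...)" group, or None
    """
    start = message.find("IN=")
    if start < 0:
        raise ValueError("no IN= field found")
    prefix, rest = message[:start].strip(), message[start:]

    options = None
    match = re.search(r"\s*OPT \(([0-9A-Fa-f]*)\)", rest)
    if match:
        options = match.group(1)
        rest = rest[:match.start()] + rest[match.end():]

    fields, flags = {}, []
    for token in rest.split():
        if "=" in token:
            key, _, value = token.partition("=")
            fields[key] = value
        else:
            flags.append(token)
    return prefix, fields, flags, options


example = (
    "nf_ct_tcp: bad TCP checksum IN= OUT= SRC=95.172.228.42 DST=208.a.b.8 "
    "LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=29203 DF PROTO=TCP SPT=46721 "
    "DPT=441 SEQ=2834861284 ACK=3682327577 WINDOW=1002 RES=0x00 ACK PSH "
    "URGP=0 OPT (0101080A01249B0846B0F23B)"
)
prefix, fields, flags, options = parse_nf_log(example)
```

Matching on SRC, DST, SPT, DPT and ID shows that all three messages above
describe the same packet, and the prefix of the first one carries the
reason conntrack gave for rejecting it.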

Is this telling me that the reason the packet has been classified as
INVALID is because the TCP checksum is bad? We are doing checksum
offloading so I would think the checksum in the packet evaluated by the
kernel would be irrelevant.  We also have no problem if the users run
their sessions through an OpenVPN tunnel.

I'll be digging into SACK next but wonder if I'm staring at the smoking
gun and just don't recognize it.  I can try disabling offloading but not
right now as the system is in heavy production.  Thanks - John




* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 12:26     ` Jozsef Kadlecsik
  2011-08-11 12:36       ` John A. Sullivan III
@ 2011-08-11 19:14       ` John A. Sullivan III
  2011-08-11 20:21         ` Jozsef Kadlecsik
  1 sibling, 1 reply; 19+ messages in thread
From: John A. Sullivan III @ 2011-08-11 19:14 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: netfilter

On Thu, 2011-08-11 at 14:26 +0200, Jozsef Kadlecsik wrote:
> On Thu, 11 Aug 2011, John A. Sullivan III wrote:
> 
> > On Thu, 2011-08-11 at 12:12 +0200, Jozsef Kadlecsik wrote:
> > > 
> > > On Thu, 11 Aug 2011, John A. Sullivan III wrote:
> > > 
> > > > Hello, all.  We have been having a subtle problem with conntrack for
> > > > quite a long time but it has suddenly gotten much worse.  Packets are
> > > > being matched as INVALID when we would expect them to be ESTABLISHED.
> > > > We are running on kernel 2.6.30.5 on X86_64 with CentOS 5.4 and
> > > > iptables-1.3.5-5.3.el5_4.1.  This has escalated from a minor annoyance
> > > > that we were going to investigate to provoking serious outages and all
> > > > hands to the pump.
> > > > 
> > > > The conntrack table is not swamped although we did increase the max
> > > > count and the hashsize just in case to no avail:
> > > > [root@fw01 netfilter]# cat ip_conntrack_max
> > > > 65536
> > > > [root@fw01 netfilter]# cat ip_conntrack_count
> > > > 532
> > > > 
> > > > Here are three specific examples.  The first is from the FORWARD chain.
> > > > Here are the logging messages:
> > > >  
> > > > Aug 11 03:29:19 fw01 kernel: FORWARD INVALID IN=bond1 OUT=bond4
> > > > SRC=172.x.y.73 DST=172.x.z.34 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=32940
> > > > DF PROTO=TCP SPT=8080 DPT=52999 WINDOW=34 RES=0x00 ACK FIN URGP=0
> > > 
> > > Those are, with high probability, late FIN packets: the belonging conntrack 
> > > entry has already been deleted and thus conntrack cannot find the matching 
> > > stream, therefore it sets as INVALID.
> > Thank you very much, Jozsef.  That would explain why we did not
> > categorize this as a high priority in the past as it seemed to have
> > minimal impact.  I would guess we do not need to be concerned about
> > these.
> > 
> > However, the other two are much more problematic and what escalated this
> > into a crisis.  As I just explained in another reply, these are
> > happening in the middle of activity, i.e., they are NX remote desktop
> > sessions being carried via SSH.  The users are in the middle of typing
> > or scrolling through their desktops, in other words, the connection is
> > definitely active and passing many packets.  Then, without warning,
> > their desktops freeze, the connection eventually times out, and we see
> > these INVALID and dropped packets.  That's the one we really need to
> > solve.
> 
> That might be related to SACK option handling: some "clever" devices love 
> to mangle TCP SEQ/ACK values, but forget about the SACK options. Try to 
> disable SACK support on both communicating endpoints. If the problem 
> disappears, then it's a SACK issue.
> 
<snip>
Alas, it is not SACK.  We disabled sack and dsack on both sides of one
user and it still took all of a few seconds for him to lock up.

Where do we look next? Thanks - John



* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 19:14       ` John A. Sullivan III
@ 2011-08-11 20:21         ` Jozsef Kadlecsik
  0 siblings, 0 replies; 19+ messages in thread
From: Jozsef Kadlecsik @ 2011-08-11 20:21 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: netfilter

On Thu, 11 Aug 2011, John A. Sullivan III wrote:

> > That might be related to SACK option handling: some "clever" devices love 
> > to mangle TCP SEQ/ACK values, but forget about the SACK options. Try to 
> > disable SACK support on both communicating endpoints. If the problem 
> > disappears, then it's a SACK issue.
> > 
> <snip>
> Alas, it is not SACK.  We disabled sack and dsack on both sides of one
> user and it still took all of a few seconds for him to lock up.
> 
> Where do we look next? Thanks - John

Please capture a full TCP session traffic (from the very first SYN to 
the very last ACK) with "tcpdump -s 0 ..." and send me the pcap 
file. Then I'll be able to replay it and check what's going on.
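
A capture like the one requested above could be started along these lines.
This is only a sketch: the helper name tcpdump_argv, the output file name
and the choice of filtering on one host and TCP port are assumptions, with
the interface, address and port taken from the log lines earlier in the
thread. Only the standard tcpdump options -n, -i, -s and -w are used.

```python
import shlex


def tcpdump_argv(interface, host, port, outfile):
    """Build an argv list for a full-packet capture of one TCP flow.

    -n    do not resolve addresses or ports to names
    -i    capture on the given interface
    -s 0  capture whole packets instead of a truncated snapshot
    -w    write raw packets to a pcap file for later replay
    """
    capture_filter = f"host {host} and tcp port {int(port)}"
    return ["tcpdump", "-n", "-i", interface, "-s", "0",
            "-w", outfile, capture_filter]


argv = tcpdump_argv("bond3", "95.172.228.42", 441, "session.pcap")
command_line = shlex.join(argv)   # printable form for a root shell
```

The capture has to be running before the session starts, otherwise the
initial SYN is missed and the session cannot be replayed from the start.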

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary


* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 16:35   ` John A. Sullivan III
@ 2011-08-11 20:41     ` Jozsef Kadlecsik
  2011-08-11 21:30       ` John A. Sullivan III
  0 siblings, 1 reply; 19+ messages in thread
From: Jozsef Kadlecsik @ 2011-08-11 20:41 UTC (permalink / raw)
  To: John A. Sullivan III; +Cc: Eric Leblond, netfilter

On Thu, 11 Aug 2011, John A. Sullivan III wrote:

> I've just begun to wade my way through SACK as Jozsef suggested after
> getting some sleep but I was able to catch a live one with logging
> enabled:
> 
> Aug 11 11:56:24 fw01 kernel: nf_ct_tcp: bad TCP checksum IN= OUT=
> SRC=95.172.228.42 DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52
> ID=29203 DF PROTO=TCP SPT=46721 DPT=441 SEQ=2834861284 ACK=3682327577
> WINDOW=1002 RES=0x00 ACK PSH URGP=0 OPT (0101080A01249B0846B0F23B)

That's Noop, Noop and Timestamp options and not SACK.
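
The reading of the OPT bytes above can be checked with a short decoder.
This is a minimal sketch: the function name parse_tcp_options and the
NAMES table are illustrative, while the option kinds (0 end of list,
1 no-operation, 5 SACK, 8 timestamps) are the standard TCP option numbers.

```python
import struct

# Standard TCP option kinds; anything else is reported as "kind-N".
NAMES = {2: "MSS", 3: "WSCALE", 4: "SACK-PERMITTED", 5: "SACK", 8: "TIMESTAMP"}


def parse_tcp_options(raw):
    """Decode a TCP options byte string into a list of (name, payload)."""
    options, i = [], 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:                      # end of option list
            options.append(("EOL", b""))
            break
        if kind == 1:                      # single-byte no-operation padding
            options.append(("NOP", b""))
            i += 1
            continue
        if i + 1 >= len(raw):
            raise ValueError("truncated option header")
        length = raw[i + 1]                # covers kind + length + payload
        if length < 2 or i + length > len(raw):
            raise ValueError("bad option length")
        options.append((NAMES.get(kind, f"kind-{kind}"), raw[i + 2:i + length]))
        i += length
    return options


decoded = parse_tcp_options(bytes.fromhex("0101080A01249B0846B0F23B"))
names = [name for name, _ in decoded]
tsval, tsecr = struct.unpack("!II", decoded[2][1])   # timestamp value and echo reply
```

The bytes 01 01 are the two no-operation pads, and 08 0A introduces a
10-byte timestamp option, so no SACK block (kind 5) is present.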

But the TCP checksum check in conntrack says that the TCP checksum of 
the received packet is invalid, therefore it assigns the INVALID 
state to the packet.
 
> Aug 11 11:56:24 fw01 kernel: INPUT INVALID IN=bond3 OUT=
> MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
> DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=29203 DF PROTO=TCP
> SPT=46721 DPT=441 WINDOW=1002 RES=0x00 ACK PSH URGP=0
> 
> Aug 11 11:56:24 fw01 kernel: No Match: IN=bond3 OUT=
> MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
> DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=29203 DF PROTO=TCP
> SPT=46721 DPT=441 WINDOW=1002 RES=0x00 ACK PSH URGP=0
> 
> Is this telling me that the reason the packet has been classified as
> INVALID is because the TCP checksum is bad? We are doing checksum
> offloading so I would think the checksum in the packet evaluated by the
> kernel would be irrelevant.  We also have no problem if the users run
> their sessions through an OpenVPN tunnel.

TCP checksum offloading does not discard incoming packets with an invalid 
checksum.
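
For reference, the check itself is the ordinary Internet checksum over an
IPv4 pseudo-header plus the TCP segment. The sketch below shows that
calculation in a generic form; it is not the kernel's code, the function
names are hypothetical, and the addresses used when trying it out are
documentation addresses rather than anything from this thread.

```python
import struct


def internet_checksum(data):
    """16-bit ones' complement of the ones' complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"                    # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                     # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip, dst_ip, segment_len):
    """IPv4 pseudo-header: addresses, zero byte, protocol 6 (TCP), length."""
    return src_ip + dst_ip + struct.pack("!BBH", 0, 6, segment_len)


def tcp_checksum_ok(src_ip, dst_ip, segment):
    """True if the checksum carried in the TCP segment verifies.

    src_ip and dst_ip are 4-byte packed IPv4 addresses; segment is the
    TCP header plus payload exactly as received.
    """
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return internet_checksum(data) == 0


def fill_tcp_checksum(src_ip, dst_ip, segment):
    """Return `segment` with a freshly computed checksum at bytes 16-17."""
    zeroed = segment[:16] + b"\x00\x00" + segment[18:]
    value = internet_checksum(pseudo_header(src_ip, dst_ip, len(zeroed)) + zeroed)
    return zeroed[:16] + struct.pack("!H", value) + zeroed[18:]
```

A segment that was damaged in transit fails this test however it reached
the host, which is consistent with conntrack marking it INVALID.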
 
> I'll be digging into SACK next but wonder if I'm staring at the smoking
> gun and just don't recognize it.  I can try disabling offloading but not
> right now as the system is in heavy production.  Thanks - John

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary


* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 20:41     ` Jozsef Kadlecsik
@ 2011-08-11 21:30       ` John A. Sullivan III
  2011-08-12 17:12         ` John A. Sullivan III
  0 siblings, 1 reply; 19+ messages in thread
From: John A. Sullivan III @ 2011-08-11 21:30 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: Eric Leblond, netfilter

On Thu, 2011-08-11 at 22:41 +0200, Jozsef Kadlecsik wrote:
> On Thu, 11 Aug 2011, John A. Sullivan III wrote:
> 
> > I've just begun to wade my way through SACK as Jozsef suggested after
> > getting some sleep but I was able to catch a live one with logging
> > enabled:
> > 
> > Aug 11 11:56:24 fw01 kernel: nf_ct_tcp: bad TCP checksum IN= OUT=
> > SRC=95.172.228.42 DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52
> > ID=29203 DF PROTO=TCP SPT=46721 DPT=441 SEQ=2834861284 ACK=3682327577
> > WINDOW=1002 RES=0x00 ACK PSH URGP=0 OPT (0101080A01249B0846B0F23B)
> 
> That's Noop, Noop and Timestamp options and not SACK.
> 
> But the TCP checksum checking in conntrack says that the TCP checksum of 
> the received packet is invalid, therefore it assigns the INVALID 
> state to the packet.
Ah, so we do suspect that this is the culprit?
>  
> > Aug 11 11:56:24 fw01 kernel: INPUT INVALID IN=bond3 OUT=
> > MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
> > DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=29203 DF PROTO=TCP
> > SPT=46721 DPT=441 WINDOW=1002 RES=0x00 ACK PSH URGP=0
> > 
> > Aug 11 11:56:24 fw01 kernel: No Match: IN=bond3 OUT=
> > MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
> > DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=29203 DF PROTO=TCP
> > SPT=46721 DPT=441 WINDOW=1002 RES=0x00 ACK PSH URGP=0
> > 
> > Is this telling me that the reason the packet has been classified as
> > INVALID is because the TCP checksum is bad? We are doing checksum
> > offloading so I would think the checksum in the packet evaluated by the
> > kernel would be irrelevant.  We also have no problem if the users run
> > their sessions through an OpenVPN tunnel.
> 
> TCP checksum offloading does not discard incoming packets with invalid 
> checksum.
Hmm . . . I wonder if we have a card which is going bad. This came on
all of a sudden.  I was planning to disable offloading anyway to see if
it solved the problem; I'm just awaiting a tester.  I'll report back
what I find.  I certainly appreciate all the help - John
>  
> > I'll be digging into SACK next but wonder if I'm staring at the smoking
> > gun and just don't recognize it.  I can try disabling offloading but not
> > right now as the system is in heavy production.  Thanks - John
> 
> Best regards,
> Jozsef
> -
> E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
> PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
> Address : KFKI Research Institute for Particle and Nuclear Physics
>           H-1525 Budapest 114, POB. 49, Hungary




* Re: Conntrack not matching properly - producing serious outages
  2011-08-11 21:30       ` John A. Sullivan III
@ 2011-08-12 17:12         ` John A. Sullivan III
  2011-08-12 22:31           ` John A. Sullivan III
  0 siblings, 1 reply; 19+ messages in thread
From: John A. Sullivan III @ 2011-08-12 17:12 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: Eric Leblond, netfilter

On Thu, 2011-08-11 at 17:30 -0400, John A. Sullivan III wrote: 
> On Thu, 2011-08-11 at 22:41 +0200, Jozsef Kadlecsik wrote:
> > On Thu, 11 Aug 2011, John A. Sullivan III wrote:
> > 
> > > I've just begun to wade my way through SACK as Jozsef suggested after
> > > getting some sleep but I was able to catch a live one with logging
> > > enabled:
> > > 
> > > Aug 11 11:56:24 fw01 kernel: nf_ct_tcp: bad TCP checksum IN= OUT=
> > > SRC=95.172.228.42 DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52
> > > ID=29203 DF PROTO=TCP SPT=46721 DPT=441 SEQ=2834861284 ACK=3682327577
> > > WINDOW=1002 RES=0x00 ACK PSH URGP=0 OPT (0101080A01249B0846B0F23B)
> > 
> > That's Noop, Noop and Timestamp options and not SACK.
> > 
> > But the TCP checksum checking in conntrack says that the TCP checksum of 
> > the received packet is invalid, therefore it assigns the INVALID 
> > state to the packet.
> Ah, so we do suspect that this is the culprit?
> >  
> > > Aug 11 11:56:24 fw01 kernel: INPUT INVALID IN=bond3 OUT=
> > > MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
> > > DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=29203 DF PROTO=TCP
> > > SPT=46721 DPT=441 WINDOW=1002 RES=0x00 ACK PSH URGP=0
> > > 
> > > Aug 11 11:56:24 fw01 kernel: No Match: IN=bond3 OUT=
> > > MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
> > > DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=29203 DF PROTO=TCP
> > > SPT=46721 DPT=441 WINDOW=1002 RES=0x00 ACK PSH URGP=0
> > > 
> > > Is this telling me that the reason the packet has been classified as
> > > INVALID is because the TCP checksum is bad? We are doing checksum
> > > offloading so I would think the checksum in the packet evaluated by the
> > > kernel would be irrelevant.  We also have no problem if the users run
> > > their sessions through an OpenVPN tunnel.
> > 
> > TCP checksum offloading does not discard incoming packets with invalid 
> > checksum.
> Hmm . . . I wonder if we have a card which is going bad. This came on
> all of a sudden.  I was planning to disable offloading anyway to see if
> it solved the problem; I'm just awaiting a tester.  I'll report back
> what I find.  I certainly appreciate all the help - John
> >  
> > > I'll be digging into SACK next but wonder if I'm staring at the smoking
> > > gun and just don't recognize it.  I can try disabling offloading but not
> > > right now as the system is in heavy production.  Thanks - John
> > <snip>
Thanks to everyone for their help and my apologies for not getting back
sooner - we've been up almost continually battling this problem.

It looks like the netfilter involvement was a red herring.  We disabled
checksumming and the INVALID packet problem went away but the problem
persists.  We have hit and miss access and piles of duplicate ACKs and
retransmissions but it does not appear to be netfilter related.  Still
trying to figure out what changed or if we have some failing hardware.
Thanks again - John



* Re: Conntrack not matching properly - producing serious outages
  2011-08-12 17:12         ` John A. Sullivan III
@ 2011-08-12 22:31           ` John A. Sullivan III
  0 siblings, 0 replies; 19+ messages in thread
From: John A. Sullivan III @ 2011-08-12 22:31 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: Eric Leblond, netfilter

On Fri, 2011-08-12 at 13:12 -0400, John A. Sullivan III wrote: 
> On Thu, 2011-08-11 at 17:30 -0400, John A. Sullivan III wrote: 
> > On Thu, 2011-08-11 at 22:41 +0200, Jozsef Kadlecsik wrote:
> > > On Thu, 11 Aug 2011, John A. Sullivan III wrote:
> > > 
> > > > I've just begun to wade my way through SACK as Jozsef suggested after
> > > > getting some sleep but I was able to catch a live one with logging
> > > > enabled:
> > > > 
> > > > Aug 11 11:56:24 fw01 kernel: nf_ct_tcp: bad TCP checksum IN= OUT=
> > > > SRC=95.172.228.42 DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52
> > > > ID=29203 DF PROTO=TCP SPT=46721 DPT=441 SEQ=2834861284 ACK=3682327577
> > > > WINDOW=1002 RES=0x00 ACK PSH URGP=0 OPT (0101080A01249B0846B0F23B)
> > > 
> > > That's Noop, Noop and Timestamp options and not SACK.
> > > 
> > > But the TCP checksum checking in conntrack says that the TCP checksum of 
> > > the received packet is invalid, therefore it assigns the INVALID 
> > > state to the packet.
> > Ah, so we do suspect that this is the culprit?
> > >  
> > > > Aug 11 11:56:24 fw01 kernel: INPUT INVALID IN=bond3 OUT=
> > > > MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
> > > > DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=29203 DF PROTO=TCP
> > > > SPT=46721 DPT=441 WINDOW=1002 RES=0x00 ACK PSH URGP=0
> > > > 
> > > > Aug 11 11:56:24 fw01 kernel: No Match: IN=bond3 OUT=
> > > > MAC=00:15:17:90:3c:0b:00:1c:58:ea:79:ff:08:00 SRC=95.172.228.42
> > > > DST=208.a.b.8 LEN=260 TOS=0x00 PREC=0x00 TTL=52 ID=29203 DF PROTO=TCP
> > > > SPT=46721 DPT=441 WINDOW=1002 RES=0x00 ACK PSH URGP=0
> > > > 
> > > > Is this telling me that the reason the packet has been classified as
> > > > INVALID is because the TCP checksum is bad? We are doing checksum
> > > > offloading so I would think the checksum in the packet evaluated by the
> > > > kernel would be irrelevant.  We also have no problem if the users run
> > > > their sessions through an OpenVPN tunnel.
> > > 
> > > TCP checksum offloading does not discard incoming packets with invalid 
> > > checksum.
> > Hmm . . . I wonder if we have a card which is going bad. This came on
> > all of a sudden.  I was planning to disable offloading anyway to see if
> > it solved the problem; I'm just awaiting a tester.  I'll report back
> > what I find.  I certainly appreciate all the help - John
> > >  
> > > > I'll be digging into SACK next but wonder if I'm staring at the smoking
> > > > gun and just don't recognize it.  I can try disabling offloading but not
> > > > right now as the system is in heavy production.  Thanks - John
> > > <snip>
> Thanks to everyone for their help and my apologies for not getting back
> sooner - we've been up almost continually battling this problem.
> 
> It looks like the netfilter involvement was a red herring.  We disabled
> checksumming and the INVALID packet problem went away but the problem
> persists.  We have hit and miss access and piles of duplicate ACKs and
> retransmissions but it does not appear to be netfilter related.  Still
> trying to figure out what changed or if we have some failing hardware.
> Thanks again - John
<snip>
Looks like it might be a malfunctioning trunk port.  That would explain
the wild randomness of the problem.  Thanks again, all.  I certainly
learned (and internally documented) a lot about troubleshooting
conntrack with your help - John



end of thread, other threads:[~2011-08-12 22:31 UTC | newest]

Thread overview: 19+ messages
2011-08-11  9:46 Conntrack not matching properly - producing serious outages John A. Sullivan III
2011-08-11 10:10 ` Eric Leblond
2011-08-11 12:03   ` John A. Sullivan III
2011-08-11 16:35   ` John A. Sullivan III
2011-08-11 20:41     ` Jozsef Kadlecsik
2011-08-11 21:30       ` John A. Sullivan III
2011-08-12 17:12         ` John A. Sullivan III
2011-08-12 22:31           ` John A. Sullivan III
2011-08-11 10:12 ` Jozsef Kadlecsik
2011-08-11 12:09   ` John A. Sullivan III
2011-08-11 12:26     ` Jozsef Kadlecsik
2011-08-11 12:36       ` John A. Sullivan III
2011-08-11 19:14       ` John A. Sullivan III
2011-08-11 20:21         ` Jozsef Kadlecsik
2011-08-11 14:00   ` Jan Engelhardt
2011-08-11 14:36     ` Jozsef Kadlecsik
2011-08-11 14:38       ` Jan Engelhardt
2011-08-11 14:48         ` Jozsef Kadlecsik
2011-08-11 14:59           ` AW: " Fiedler Roman
