All of lore.kernel.org
 help / color / mirror / Atom feed
* netfilter 00/03: netfilter update/fixes
@ 2008-07-31  6:33 Patrick McHardy
  2008-07-31  6:33 ` netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged Patrick McHardy
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Patrick McHardy @ 2008-07-31  6:33 UTC (permalink / raw)
  To: davem; +Cc: Patrick McHardy, netfilter-devel

Hi Dave,

these patches fix a proc file removal race in ipt_recent, a timer removal
race in hashlimit and, based upon a suggestion by Herbert, change TCP
conntrack to keep track of unacknowledged data and reduce the timeout to
5 minutes while data is unacknowledged in order to more aggressively prune
dead connections.

Please apply, thanks.


 include/linux/netfilter/nf_conntrack_tcp.h |    3 ++
 net/ipv4/netfilter/ipt_recent.c            |    2 +-
 net/netfilter/nf_conntrack_proto_tcp.c     |   29 +++++++++++++++++++++++----
 net/netfilter/xt_hashlimit.c               |    4 +--
 4 files changed, 29 insertions(+), 9 deletions(-)

Patrick McHardy (1):
      netfilter: nf_conntrack_tcp: decrease timeouts while data in unacknowledged

Pavel Emelyanov (2):
      netfilter: ipt_recent: fix race between recent_mt_destroy and proc manipulations
      netfilter: xt_hashlimit: fix race between htable_destroy and htable_gc

^ permalink raw reply	[flat|nested] 12+ messages in thread

* netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged
  2008-07-31  6:33 netfilter 00/03: netfilter update/fixes Patrick McHardy
@ 2008-07-31  6:33 ` Patrick McHardy
  2008-07-31  7:38   ` David Miller
  2009-06-26 14:39   ` Krzysztof Oledzki
  2008-07-31  6:33 ` netfilter 02/03: ipt_recent: fix race between recent_mt_destroy and proc manipulations Patrick McHardy
  2008-07-31  6:33 ` netfilter 03/03: xt_hashlimit: fix race between htable_destroy and htable_gc Patrick McHardy
  2 siblings, 2 replies; 12+ messages in thread
From: Patrick McHardy @ 2008-07-31  6:33 UTC (permalink / raw)
  To: davem; +Cc: Patrick McHardy, netfilter-devel

netfilter: nf_conntrack_tcp: decrease timeouts while data in unacknowledged

In order to time out dead connections quicker, keep track of outstanding data
and cap the timeout.

Suggested by Herbert Xu.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 827768d3835412f6ba386c9fc2d6fdbde0c2c0ee
tree 03b536e5beeeb4585d4aec822125177c04353f2e
parent e93dc4891df93d7efa59d861fdcbb529a1819343
author Patrick McHardy <kaber@trash.net> Wed, 30 Jul 2008 12:06:26 +0200
committer Patrick McHardy <kaber@trash.net> Wed, 30 Jul 2008 12:06:26 +0200

 include/linux/netfilter/nf_conntrack_tcp.h |    3 +++
 net/netfilter/nf_conntrack_proto_tcp.c     |   29 +++++++++++++++++++++++-----
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/include/linux/netfilter/nf_conntrack_tcp.h b/include/linux/netfilter/nf_conntrack_tcp.h
index 22ce299..a049df4 100644
--- a/include/linux/netfilter/nf_conntrack_tcp.h
+++ b/include/linux/netfilter/nf_conntrack_tcp.h
@@ -30,6 +30,9 @@ enum tcp_conntrack {
 /* Be liberal in window checking */
 #define IP_CT_TCP_FLAG_BE_LIBERAL		0x08
 
+/* Has unacknowledged data */
+#define IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED	0x10
+
 struct nf_ct_tcp_flags {
 	u_int8_t flags;
 	u_int8_t mask;
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 420a10d..6f61261 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -67,7 +67,8 @@ static const char *const tcp_conntrack_names[] = {
 /* RFC1122 says the R2 limit should be at least 100 seconds.
    Linux uses 15 packets as limit, which corresponds
    to ~13-30min depending on RTO. */
-static unsigned int nf_ct_tcp_timeout_max_retrans __read_mostly =   5 MINS;
+static unsigned int nf_ct_tcp_timeout_max_retrans __read_mostly    =   5 MINS;
+static unsigned int nf_ct_tcp_timeout_unacknowledged __read_mostly =   5 MINS;
 
 static unsigned int tcp_timeouts[TCP_CONNTRACK_MAX] __read_mostly = {
 	[TCP_CONNTRACK_SYN_SENT]	= 2 MINS,
@@ -625,8 +626,10 @@ static bool tcp_in_window(const struct nf_conn *ct,
 		swin = win + (sack - ack);
 		if (sender->td_maxwin < swin)
 			sender->td_maxwin = swin;
-		if (after(end, sender->td_end))
+		if (after(end, sender->td_end)) {
 			sender->td_end = end;
+			sender->flags |= IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED;
+		}
 		/*
 		 * Update receiver data.
 		 */
@@ -637,6 +640,8 @@ static bool tcp_in_window(const struct nf_conn *ct,
 			if (win == 0)
 				receiver->td_maxend++;
 		}
+		if (ack == receiver->td_end)
+			receiver->flags &= ~IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED;
 
 		/*
 		 * Check retransmissions.
@@ -951,9 +956,16 @@ static int tcp_packet(struct nf_conn *ct,
 	if (old_state != new_state
 	    && new_state == TCP_CONNTRACK_FIN_WAIT)
 		ct->proto.tcp.seen[dir].flags |= IP_CT_TCP_FLAG_CLOSE_INIT;
-	timeout = ct->proto.tcp.retrans >= nf_ct_tcp_max_retrans
-		  && tcp_timeouts[new_state] > nf_ct_tcp_timeout_max_retrans
-		  ? nf_ct_tcp_timeout_max_retrans : tcp_timeouts[new_state];
+
+	if (ct->proto.tcp.retrans >= nf_ct_tcp_max_retrans &&
+	    tcp_timeouts[new_state] > nf_ct_tcp_timeout_max_retrans)
+		timeout = nf_ct_tcp_timeout_max_retrans;
+	else if ((ct->proto.tcp.seen[0].flags | ct->proto.tcp.seen[1].flags) &
+		 IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED &&
+		 tcp_timeouts[new_state] > nf_ct_tcp_timeout_unacknowledged)
+		timeout = nf_ct_tcp_timeout_unacknowledged;
+	else
+		timeout = tcp_timeouts[new_state];
 	write_unlock_bh(&tcp_lock);
 
 	nf_conntrack_event_cache(IPCT_PROTOINFO_VOLATILE, skb);
@@ -1236,6 +1248,13 @@ static struct ctl_table tcp_sysctl_table[] = {
 		.proc_handler	= &proc_dointvec_jiffies,
 	},
 	{
+		.procname	= "nf_conntrack_tcp_timeout_unacknowledged",
+		.data		= &nf_ct_tcp_timeout_unacknowledged,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec_jiffies,
+	},
+	{
 		.ctl_name	= NET_NF_CONNTRACK_TCP_LOOSE,
 		.procname	= "nf_conntrack_tcp_loose",
 		.data		= &nf_ct_tcp_loose,

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* netfilter 02/03: ipt_recent: fix race between recent_mt_destroy and proc manipulations
  2008-07-31  6:33 netfilter 00/03: netfilter update/fixes Patrick McHardy
  2008-07-31  6:33 ` netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged Patrick McHardy
@ 2008-07-31  6:33 ` Patrick McHardy
  2008-07-31  6:33 ` netfilter 03/03: xt_hashlimit: fix race between htable_destroy and htable_gc Patrick McHardy
  2 siblings, 0 replies; 12+ messages in thread
From: Patrick McHardy @ 2008-07-31  6:33 UTC (permalink / raw)
  To: davem; +Cc: Patrick McHardy, netfilter-devel

netfilter: ipt_recent: fix race between recent_mt_destroy and proc manipulations

The thing is that recent_mt_destroy first flushes the entries
from table with the recent_table_flush and only *after* this
removes the proc file, corresponding to that table.

Thus, if we manage to write to this file the '+XXX' command we
will leak some entries. If we manage to write there a 'clean'
command we'll race in two recent_table_flush flows, since the
recent_mt_destroy calls this outside the recent_lock.

The proper solution as I see it is to remove the proc file first
and then go on with flushing the table. This flushing becomes
safe w/o the lock, since the table is already inaccessible from
the outside.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 728845f8c11ec11be4a3725a70389a3013cd4e48
tree c6dfca9346769ab48550ba27c28a3ba2af94e30b
parent 827768d3835412f6ba386c9fc2d6fdbde0c2c0ee
author Pavel Emelyanov <xemul@openvz.org> Wed, 30 Jul 2008 12:52:38 +0200
committer Patrick McHardy <kaber@trash.net> Wed, 30 Jul 2008 12:52:38 +0200

 net/ipv4/netfilter/ipt_recent.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/netfilter/ipt_recent.c b/net/ipv4/netfilter/ipt_recent.c
index 21cb053..3974d7c 100644
--- a/net/ipv4/netfilter/ipt_recent.c
+++ b/net/ipv4/netfilter/ipt_recent.c
@@ -305,10 +305,10 @@ static void recent_mt_destroy(const struct xt_match *match, void *matchinfo)
 		spin_lock_bh(&recent_lock);
 		list_del(&t->list);
 		spin_unlock_bh(&recent_lock);
-		recent_table_flush(t);
 #ifdef CONFIG_PROC_FS
 		remove_proc_entry(t->name, proc_dir);
 #endif
+		recent_table_flush(t);
 		kfree(t);
 	}
 	mutex_unlock(&recent_mutex);

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* netfilter 03/03: xt_hashlimit: fix race between htable_destroy and htable_gc
  2008-07-31  6:33 netfilter 00/03: netfilter update/fixes Patrick McHardy
  2008-07-31  6:33 ` netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged Patrick McHardy
  2008-07-31  6:33 ` netfilter 02/03: ipt_recent: fix race between recent_mt_destroy and proc manipulations Patrick McHardy
@ 2008-07-31  6:33 ` Patrick McHardy
  2008-07-31  7:39   ` David Miller
  2 siblings, 1 reply; 12+ messages in thread
From: Patrick McHardy @ 2008-07-31  6:33 UTC (permalink / raw)
  To: davem; +Cc: Patrick McHardy, netfilter-devel

netfilter: xt_hashlimit: fix race between htable_destroy and htable_gc

Deleting a timer with del_timer doesn't guarantee, that the
timer function is not running at the moment of deletion. Thus
in the xt_hashlimit case we can get into a ticklish situation
when the htable_gc rearms the timer back and we'll actually
delete an entry with a pending timer.

Fix it with using del_timer_sync().

AFAIK del_timer_sync checks for the timer to be pending by
itself, so I remove the check.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit d97740075c062dc888ecb4d710e6aafd2a253383
tree d20ba6ebf2b48f7c45014c9d4f6bf8eb208f18d5
parent 728845f8c11ec11be4a3725a70389a3013cd4e48
author Pavel Emelyanov <xemul@openvz.org> Wed, 30 Jul 2008 12:53:30 +0200
committer Patrick McHardy <kaber@trash.net> Wed, 30 Jul 2008 12:53:30 +0200

 net/netfilter/xt_hashlimit.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 6809af5..d9418a2 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -367,9 +367,7 @@ static void htable_gc(unsigned long htlong)
 
 static void htable_destroy(struct xt_hashlimit_htable *hinfo)
 {
-	/* remove timer, if it is pending */
-	if (timer_pending(&hinfo->timer))
-		del_timer(&hinfo->timer);
+	del_timer_sync(&hinfo->timer);
 
 	/* remove proc entry */
 	remove_proc_entry(hinfo->pde->name,

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged
  2008-07-31  6:33 ` netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged Patrick McHardy
@ 2008-07-31  7:38   ` David Miller
  2009-06-26 14:39   ` Krzysztof Oledzki
  1 sibling, 0 replies; 12+ messages in thread
From: David Miller @ 2008-07-31  7:38 UTC (permalink / raw)
  To: kaber; +Cc: netfilter-devel

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 31 Jul 2008 08:33:14 +0200 (MEST)

> netfilter: nf_conntrack_tcp: decrease timeouts while data in unacknowledged
> 
> In order to time out dead connections quicker, keep track of outstanding data
> and cap the timeout.
> 
> Suggested by Herbert Xu.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: netfilter 03/03: xt_hashlimit: fix race between htable_destroy and htable_gc
  2008-07-31  6:33 ` netfilter 03/03: xt_hashlimit: fix race between htable_destroy and htable_gc Patrick McHardy
@ 2008-07-31  7:39   ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2008-07-31  7:39 UTC (permalink / raw)
  To: kaber; +Cc: netfilter-devel

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 31 Jul 2008 08:33:16 +0200 (MEST)

> netfilter: xt_hashlimit: fix race between htable_destroy and htable_gc
> 
> Deleting a timer with del_timer doesn't guarantee, that the
> timer function is not running at the moment of deletion. Thus
> in the xt_hashlimit case we can get into a ticklish situation
> when the htable_gc rearms the timer back and we'll actually
> delete an entry with a pending timer.
> 
> Fix it with using del_timer_sync().
> 
> AFAIK del_timer_sync checks for the timer to be pending by
> itself, so I remove the check.
> 
> Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Also applied, thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged
  2008-07-31  6:33 ` netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged Patrick McHardy
  2008-07-31  7:38   ` David Miller
@ 2009-06-26 14:39   ` Krzysztof Oledzki
  2009-06-26 15:14     ` Patrick McHardy
  1 sibling, 1 reply; 12+ messages in thread
From: Krzysztof Oledzki @ 2009-06-26 14:39 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: davem, netfilter-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 913 bytes --]



On Thu, 31 Jul 2008, Patrick McHardy wrote:

> netfilter: nf_conntrack_tcp: decrease timeouts while data in unacknowledged
>
> In order to time out dead connections quicker, keep track of outstanding data
> and cap the timeout.
>
> Suggested by Herbert Xu.
>
> Signed-off-by: Patrick McHardy <kaber@trash.net>

<CUT>

This patch kills long living ftp transfers from one of my hosts. I'm not 
able to transfer large files if it takes more than 
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged seconds.

After logging to the remote host and issuing any FTP command (ls or 
put/get for example) tuple's timeout is reduced. Additional commands are 
able to bump it but only upto 
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged.

It seems that IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED flag is never cleard.

Tested on 2.6.28.10.

Attaching raw tcpdump from the session.

Best regards,


 				Krzysztof Olędzki

[-- Attachment #2: Type: APPLICATION/octet-stream, Size: 4893 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged
  2009-06-26 14:39   ` Krzysztof Oledzki
@ 2009-06-26 15:14     ` Patrick McHardy
  2009-06-26 16:31       ` Krzysztof Oledzki
  0 siblings, 1 reply; 12+ messages in thread
From: Patrick McHardy @ 2009-06-26 15:14 UTC (permalink / raw)
  To: Krzysztof Oledzki; +Cc: davem, netfilter-devel

Krzysztof Oledzki wrote:
> This patch kills long living ftp transfers from one of my hosts. I'm not 
> able to transfer large files if it takes more than 
> net.netfilter.nf_conntrack_tcp_timeout_unacknowledged seconds.
> 
> After logging to the remote host and issuing any FTP command (ls or 
> put/get for example) tuple's timeout is reduced. Additional commands are 
> able to bump it but only upto 
> net.netfilter.nf_conntrack_tcp_timeout_unacknowledged.
> 
> It seems that IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED flag is never cleard.
> 
> Tested on 2.6.28.10.

Interesting, are you using the FTP NAT helper?

I'm guessing there is some bad interaction between sequence number
adjustments when changing the packet sizes and sequence number
tracking in conntrack.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged
  2009-06-26 15:14     ` Patrick McHardy
@ 2009-06-26 16:31       ` Krzysztof Oledzki
  2009-06-26 17:03         ` Patrick McHardy
  0 siblings, 1 reply; 12+ messages in thread
From: Krzysztof Oledzki @ 2009-06-26 16:31 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: davem, netfilter-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 946 bytes --]



On Fri, 26 Jun 2009, Patrick McHardy wrote:

> Krzysztof Oledzki wrote:
>> This patch kills long living ftp transfers from one of my hosts. I'm not 
>> able to transfer large files if it takes more than 
>> net.netfilter.nf_conntrack_tcp_timeout_unacknowledged seconds.
>> 
>> After logging to the remote host and issuing any FTP command (ls or put/get 
>> for example) tuple's timeout is reduced. Additional commands are able to 
>> bump it but only upto 
>> net.netfilter.nf_conntrack_tcp_timeout_unacknowledged.
>> 
>> It seems that IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED flag is never cleard.
>> 
>> Tested on 2.6.28.10.
>
> Interesting, are you using the FTP NAT helper?

Actually, yes :)

> I'm guessing there is some bad interaction between sequence number
> adjustments when changing the packet sizes and sequence number
> tracking in conntrack.

Very likely, src ip and nat ip have different lengths.

Best regards,

 				Krzysztof Olędzki

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged
  2009-06-26 16:31       ` Krzysztof Oledzki
@ 2009-06-26 17:03         ` Patrick McHardy
  2009-06-26 17:31           ` Krzysztof Oledzki
  0 siblings, 1 reply; 12+ messages in thread
From: Patrick McHardy @ 2009-06-26 17:03 UTC (permalink / raw)
  To: Krzysztof Oledzki; +Cc: davem, netfilter-devel, Jozsef Kadlecsik

[-- Attachment #1: Type: text/plain, Size: 1681 bytes --]

Krzysztof Oledzki wrote:
> On Fri, 26 Jun 2009, Patrick McHardy wrote:
> 
>> Krzysztof Oledzki wrote:
>>> This patch kills long living ftp transfers from one of my hosts. I'm 
>>> not able to transfer large files if it takes more than 
>>> net.netfilter.nf_conntrack_tcp_timeout_unacknowledged seconds.
>>>
>>> After logging to the remote host and issuing any FTP command (ls or 
>>> put/get for example) tuple's timeout is reduced. Additional commands 
>>> are able to bump it but only upto 
>>> net.netfilter.nf_conntrack_tcp_timeout_unacknowledged.
>>>
>>> It seems that IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED flag is never cleard.
>>>
>>> Tested on 2.6.28.10.
>>
>> Interesting, are you using the FTP NAT helper?
> 
> Actually, yes :)
> 
>> I'm guessing there is some bad interaction between sequence number
>> adjustments when changing the packet sizes and sequence number
>> tracking in conntrack.
> 
> Very likely, src ip and nat ip have different lengths.

I think I know what the problem is. We update the highest seen
sequence number when adjusting packet lengths in NAT, but only
upwards:

void nf_conntrack_tcp_update(const struct sk_buff *skb,
			     unsigned int dataoff,
			     struct nf_conn *ct,
			     int dir)
{
...
	if (after(end, ct->proto.tcp.seen[dir].td_end))
		ct->proto.tcp.seen[dir].td_end = end;

So I'm guessing in your case, the NAT address is shorter than
the original one.

This patch changes TCP conntrack to update the highest sequence
number when the old number matches the new one + the size offset,
so it should effectively update it after NAT whenever the packet
caused an update before NAT.

Please test whether this patch fixes the problem.


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 3616 bytes --]

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index a632689..cbdd628 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -258,8 +258,8 @@ static inline bool nf_ct_kill(struct nf_conn *ct)
 /* Update TCP window tracking data when NAT mangles the packet */
 extern void nf_conntrack_tcp_update(const struct sk_buff *skb,
 				    unsigned int dataoff,
-				    struct nf_conn *ct,
-				    int dir);
+				    struct nf_conn *ct, int dir,
+				    s16 offset);
 
 /* Fake conntrack entry for untracked connections */
 extern struct nf_conn nf_conntrack_untracked;
diff --git a/net/ipv4/netfilter/nf_nat_helper.c b/net/ipv4/netfilter/nf_nat_helper.c
index 155c008..09172a6 100644
--- a/net/ipv4/netfilter/nf_nat_helper.c
+++ b/net/ipv4/netfilter/nf_nat_helper.c
@@ -191,7 +191,8 @@ nf_nat_mangle_tcp_packet(struct sk_buff *skb,
 				    ct, ctinfo);
 		/* Tell TCP window tracking about seq change */
 		nf_conntrack_tcp_update(skb, ip_hdrlen(skb),
-					ct, CTINFO2DIR(ctinfo));
+					ct, CTINFO2DIR(ctinfo),
+					(int)rep_len - (int)match_len);
 
 		nf_conntrack_event_cache(IPCT_NATSEQADJ, ct);
 	}
@@ -377,6 +378,7 @@ nf_nat_seq_adjust(struct sk_buff *skb,
 	struct tcphdr *tcph;
 	int dir;
 	__be32 newseq, newack;
+	s16 seqoff, ackoff;
 	struct nf_conn_nat *nat = nfct_nat(ct);
 	struct nf_nat_seq *this_way, *other_way;
 
@@ -390,15 +392,18 @@ nf_nat_seq_adjust(struct sk_buff *skb,
 
 	tcph = (void *)skb->data + ip_hdrlen(skb);
 	if (after(ntohl(tcph->seq), this_way->correction_pos))
-		newseq = htonl(ntohl(tcph->seq) + this_way->offset_after);
+		seqoff = this_way->offset_after;
 	else
-		newseq = htonl(ntohl(tcph->seq) + this_way->offset_before);
+		seqoff = this_way->offset_before;
 
 	if (after(ntohl(tcph->ack_seq) - other_way->offset_before,
 		  other_way->correction_pos))
-		newack = htonl(ntohl(tcph->ack_seq) - other_way->offset_after);
+		ackoff = other_way->offset_after;
 	else
-		newack = htonl(ntohl(tcph->ack_seq) - other_way->offset_before);
+		ackoff = other_way->offset_before;
+
+	newseq = htonl(ntohl(tcph->seq) + seqoff);
+	newack = htonl(ntohl(tcph->ack_seq) - ackoff);
 
 	inet_proto_csum_replace4(&tcph->check, skb, tcph->seq, newseq, 0);
 	inet_proto_csum_replace4(&tcph->check, skb, tcph->ack_seq, newack, 0);
@@ -413,7 +418,7 @@ nf_nat_seq_adjust(struct sk_buff *skb,
 	if (!nf_nat_sack_adjust(skb, tcph, ct, ctinfo))
 		return 0;
 
-	nf_conntrack_tcp_update(skb, ip_hdrlen(skb), ct, dir);
+	nf_conntrack_tcp_update(skb, ip_hdrlen(skb), ct, dir, seqoff);
 
 	return 1;
 }
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 33fc0a4..97a82ba 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -720,8 +720,8 @@ static bool tcp_in_window(const struct nf_conn *ct,
 /* Caller must linearize skb at tcp header. */
 void nf_conntrack_tcp_update(const struct sk_buff *skb,
 			     unsigned int dataoff,
-			     struct nf_conn *ct,
-			     int dir)
+			     struct nf_conn *ct, int dir,
+			     s16 offset)
 {
 	const struct tcphdr *tcph = (const void *)skb->data + dataoff;
 	const struct ip_ct_tcp_state *sender = &ct->proto.tcp.seen[dir];
@@ -734,7 +734,7 @@ void nf_conntrack_tcp_update(const struct sk_buff *skb,
 	/*
 	 * We have to worry for the ack in the reply packet only...
 	 */
-	if (after(end, ct->proto.tcp.seen[dir].td_end))
+	if (ct->proto.tcp.seen[dir].td_end + offset == end)
 		ct->proto.tcp.seen[dir].td_end = end;
 	ct->proto.tcp.last_end = end;
 	spin_unlock_bh(&ct->lock);

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged
  2009-06-26 17:03         ` Patrick McHardy
@ 2009-06-26 17:31           ` Krzysztof Oledzki
  2009-06-29 12:20             ` Patrick McHardy
  0 siblings, 1 reply; 12+ messages in thread
From: Krzysztof Oledzki @ 2009-06-26 17:31 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: davem, netfilter-devel, Jozsef Kadlecsik

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2018 bytes --]



On Fri, 26 Jun 2009, Patrick McHardy wrote:

> Krzysztof Oledzki wrote:
>> On Fri, 26 Jun 2009, Patrick McHardy wrote:
>> 
>>> Krzysztof Oledzki wrote:
>>>> This patch kills long living ftp transfers from one of my hosts. I'm not 
>>>> able to transfer large files if it takes more than 
>>>> net.netfilter.nf_conntrack_tcp_timeout_unacknowledged seconds.
>>>> 
>>>> After logging to the remote host and issuing any FTP command (ls or 
>>>> put/get for example) tuple's timeout is reduced. Additional commands are 
>>>> able to bump it but only upto 
>>>> net.netfilter.nf_conntrack_tcp_timeout_unacknowledged.
>>>> 
>>>> It seems that IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED flag is never cleard.
>>>> 
>>>> Tested on 2.6.28.10.
>>> 
>>> Interesting, are you using the FTP NAT helper?
>> 
>> Actually, yes :)
>> 
>>> I'm guessing there is some bad interaction between sequence number
>>> adjustments when changing the packet sizes and sequence number
>>> tracking in conntrack.
>> 
>> Very likely, src ip and nat ip have different lengths.
>
> I think I know what the problem is. We update the highest seen
> sequence number when adjusting packet lengths in NAT, but only
> upwards:
>
> void nf_conntrack_tcp_update(const struct sk_buff *skb,
> 			     unsigned int dataoff,
> 			     struct nf_conn *ct,
> 			     int dir)
> {
> ...
> 	if (after(end, ct->proto.tcp.seen[dir].td_end))
> 		ct->proto.tcp.seen[dir].td_end = end;
>
> So I'm guessing in your case, the NAT address is shorter than
> the original one.

Exactly. Just checked and this problem does not exist when
strlen(src_ip) <= strlen(nat_ip)

> This patch changes TCP conntrack to update the highest sequence
> number when the old number matches the new one + the size offset,
> so it should effectively update it after NAT whenever the packet
> caused an update before NAT.
>
> Please test whether this patch fixes the problem.

The patch fixes my problem. Thank you.

Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>


Best regards,

 			Krzysztof Olędzki

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged
  2009-06-26 17:31           ` Krzysztof Oledzki
@ 2009-06-29 12:20             ` Patrick McHardy
  0 siblings, 0 replies; 12+ messages in thread
From: Patrick McHardy @ 2009-06-29 12:20 UTC (permalink / raw)
  To: Krzysztof Oledzki; +Cc: davem, netfilter-devel, Jozsef Kadlecsik

Krzysztof Oledzki wrote:
>> This patch changes TCP conntrack to update the highest sequence
>> number when the old number matches the new one + the size offset,
>> so it should effectively update it after NAT whenever the packet
>> caused an update before NAT.
>>
>> Please test whether this patch fixes the problem.
> 
> The patch fixes my problem. Thank you.
> 
> Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>

Thanks, I've queued the patch for upstream.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2009-06-29 12:20 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-07-31  6:33 netfilter 00/03: netfilter update/fixes Patrick McHardy
2008-07-31  6:33 ` netfilter 01/03: nf_conntrack_tcp: decrease timeouts while data in unacknowledged Patrick McHardy
2008-07-31  7:38   ` David Miller
2009-06-26 14:39   ` Krzysztof Oledzki
2009-06-26 15:14     ` Patrick McHardy
2009-06-26 16:31       ` Krzysztof Oledzki
2009-06-26 17:03         ` Patrick McHardy
2009-06-26 17:31           ` Krzysztof Oledzki
2009-06-29 12:20             ` Patrick McHardy
2008-07-31  6:33 ` netfilter 02/03: ipt_recent: fix race between recent_mt_destroy and proc manipulations Patrick McHardy
2008-07-31  6:33 ` netfilter 03/03: xt_hashlimit: fix race between htable_destroy and htable_gc Patrick McHardy
2008-07-31  7:39   ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.