netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick McHardy <kaber@trash.net>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: Ingo Molnar <mingo@elte.hu>, David Miller <davem@davemloft.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Pablo Neira Ayuso <pablo@netfilter.org>
Subject: Re: [bug] __nf_ct_refresh_acct(): WARNING: at lib/list_debug.c:30 __list_add+0x7d/0xad()
Date: Thu, 18 Jun 2009 16:56:25 +0200	[thread overview]
Message-ID: <4A3A5599.4080906@trash.net> (raw)
In-Reply-To: <4A3A0D45.8090806@trash.net>

[-- Attachment #1: Type: text/plain, Size: 2003 bytes --]

Patrick McHardy wrote:
> Eric Dumazet wrote:
>>> a test-box still triggered this crash overnight:
>>>
>>> [  252.433471] ------------[ cut here ]------------
>>> [  252.436031] WARNING: at lib/list_debug.c:30 __list_add+0x95/0xa0()
>>> ...
>>> With your patch (repeated below) applied. Is Patrick's alternative 
>>> patch supposed to fix something that yours does not?
>>
>> Hmm, not really, Patrick patch should fix same problem, but without 
>> extra locking
>> as mine.
>>
>> This new stack trace is somewhat different, as corruption is detected 
>> in the add_timer()
>> call in __nf_conntrack_confirm()
>>
>> So there is another problem. CCed Pablo Neira Ayuso who added some stuff
>> in netfilter and timeout logic recently.
> 
> That timeout logic shouldn't be relevant in this case, its only
> activated when netlink event delivery is used, a userspace process
> is actually listening and it has the socket marked for reliable
> delivery.
> 
> I think its still the same problem, the detection is just noticed
> at a different point.

I can't find anything wrong with the conntrack timer use itself.
Since its the timer base that is corrupted, its possible that
something else is using timers incorrectly and conntrack just
happens to hit the problem afterwards, but it seems rather
unlikely since the first backtrace originated in a network driver,
the second one in the socket layer.

But I've noticed a different problem, the lockless lookup might
incorrectly return an conntrack entry that is supposed to be
invisible. This can happen because we only check for equal tuples
and !atomic_inc_not_zero(refcnt), but the conntrack can be removed
from the lists and still have references to it. It should never
have an active timer at that point however, so all following
mod_timer_pending() calls *should* do nothing unless I'm missing
something.

Ingo, could you please try whether this patch (combined with the
last one) makes any difference? Enabling CONFIG_NETFILTER_DEBUG
could also help.



[-- Attachment #2: x --]
[-- Type: text/plain, Size: 876 bytes --]

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5f72b94..ce17a69 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -335,7 +335,8 @@ begin:
 	h = __nf_conntrack_find(net, tuple);
 	if (h) {
 		ct = nf_ct_tuplehash_to_ctrack(h);
-		if (unlikely(!atomic_inc_not_zero(&ct->ct_general.use)))
+		if (unlikely(nf_ct_is_dying(ct) ||
+			     !atomic_inc_not_zero(&ct->ct_general.use)))
 			h = NULL;
 		else {
 			if (unlikely(!nf_ct_tuple_equal(tuple, &h->tuple))) {
@@ -503,7 +504,8 @@ static noinline int early_drop(struct net *net, unsigned int hash)
 			cnt++;
 		}
 
-		if (ct && unlikely(!atomic_inc_not_zero(&ct->ct_general.use)))
+		if (ct && unlikely(nf_ct_is_dying(ct) ||
+				   !atomic_inc_not_zero(&ct->ct_general.use)))
 			ct = NULL;
 		if (ct || cnt >= NF_CT_EVICTION_RANGE)
 			break;

  reply	other threads:[~2009-06-18 14:56 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-06-15 12:04 [GIT]: Networking David Miller
2009-06-16  9:15 ` Ingo Molnar
2009-06-16  9:19   ` David Miller
2009-06-16  9:33     ` Ingo Molnar
2009-06-16  9:44       ` Ingo Molnar
2009-06-16  9:44     ` Alan Cox
2009-06-16  9:48       ` Ingo Molnar
2009-06-16  9:56         ` David Miller
2009-06-16 10:11           ` Ingo Molnar
2009-06-16 10:35             ` David Miller
2009-06-16 13:38               ` Eric Dumazet
2009-06-16 14:19                 ` Eric Dumazet
2009-06-16 18:59                   ` Linus Torvalds
2009-06-16 19:08                     ` David Miller
2009-06-16 19:37                       ` Eric Dumazet
2009-06-16 20:12                         ` Eric Dumazet
2009-06-17  6:41                           ` [PATCH] net: correct off-by-one write allocations reports Eric Dumazet
2009-06-17 11:31                             ` [PATCH] atm: sk_wmem_alloc initial value is one Eric Dumazet
2009-06-18  2:06                               ` David Miller
2009-06-18  2:05                             ` [PATCH] net: correct off-by-one write allocations reports David Miller
2009-06-17 11:32                           ` [GIT]: Networking David Miller
2009-06-16  9:21   ` David Miller
2009-06-16  9:26     ` Ingo Molnar
2009-06-16 10:47   ` David Miller
2009-06-16 10:53     ` Ingo Molnar
2009-06-16 12:24       ` Ingo Molnar
2009-06-16 12:39         ` David Miller
2009-06-17  9:21         ` [bug] __nf_ct_refresh_acct(): WARNING: at lib/list_debug.c:30 __list_add+0x7d/0xad() Ingo Molnar
2009-06-17 10:18           ` Eric Dumazet
2009-06-17 11:08             ` Ingo Molnar
2009-06-17 11:35               ` David Miller
2009-06-18  5:23               ` Ingo Molnar
2009-06-18  5:58                 ` Eric Dumazet
2009-06-18  9:47                   ` Patrick McHardy
2009-06-18 14:56                     ` Patrick McHardy [this message]
2009-06-18 15:46                       ` Eric Dumazet
2009-06-18 16:09                         ` Patrick McHardy
2009-06-18 16:13                           ` Pablo Neira Ayuso
2009-06-18 22:46                           ` [PATCH] netfilter: conntrack: death_by_timeout() fix Eric Dumazet
2009-06-19 11:15                             ` Patrick McHardy
2009-06-20 15:47                       ` [bug] __nf_ct_refresh_acct(): WARNING: at lib/list_debug.c:30 __list_add+0x7d/0xad() Ingo Molnar
2009-06-20 15:56                         ` Patrick McHardy
2009-06-17 11:38             ` Patrick McHardy
2009-06-17 11:51               ` Patrick McHardy
2009-06-17 11:55               ` Eric Dumazet
2009-06-17 12:00                 ` Patrick McHardy
2009-06-17 12:33                   ` Eric Dumazet
2009-06-17 12:36                     ` Patrick McHardy
2009-06-17 13:27                       ` Eric Dumazet
2009-06-17 13:29                         ` Patrick McHardy
2009-06-17 14:23                           ` Patrick McHardy
2009-06-17 15:29                             ` Eric Dumazet
2009-06-17 15:34                               ` Patrick McHardy
2009-06-18 23:18 ` [GIT]: Networking Tilman Schmidt
2009-06-19  3:36   ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A3A5599.4080906@trash.net \
    --to=kaber@trash.net \
    --cc=akpm@linux-foundation.org \
    --cc=dada1@cosmosbay.com \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=netdev@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).