[PATCH 1/1 linux-next] netfilter: conntrack: fix kmemleak false positive

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/1 linux-next] netfilter: conntrack: fix kmemleak false positive
@ 2016-09-21 19:49 Fabian Frederick
  2016-09-21 21:02 ` Florian Westphal
  0 siblings, 1 reply; 5+ messages in thread
From: Fabian Frederick @ 2016-09-21 19:49 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Eric Dumazet, Pablo Neira Ayuso, linux-kernel, netfilter-devel, fabf

Since commit f330a7fdbe16
("netfilter: conntrack: get rid of conntrack timer")

closed connections remain longer in /proc/net/nf_conntrack

Running current kernel; just after boot:
cat /proc/net/nf_conntrack | wc -l = 5
4 minutes required to clean up the table.

Going back to kernel version before commit above there are
no connections after some seconds.

Referring to the commit changelog this was an expected behaviour but
it results in temporary kmemleak reports:

unreferenced object 0xffff88003b0e6600 (size 248):
  comm "rsyslogd", pid 1595, jiffies 4294741312 (age 7.343s)
  ...
  backtrace:
    [<ffffffff818f0163>] kmemleak_alloc+0x23/0x40
    [<ffffffff8117abf9>] kmem_cache_alloc+0xd9/0x180
    [<ffffffff817533a8>] __nf_conntrack_alloc.isra.50+0x48/0x170
    [<ffffffff81754262>] nf_conntrack_in+0x3a2/0x5f0
    [<ffffffff817bc9e0>] ipv4_conntrack_local+0x40/0x50
    [<ffffffff8174f17d>] nf_iterate+0x5d/0x70
    [<ffffffff8174f1ef>] nf_hook_slow+0x5f/0xb0
    [<ffffffff8176ffbd>] __ip_local_out+0xad/0xe0
    [<ffffffff81770007>] ip_local_out+0x17/0x40
    [<ffffffff817711f4>] ip_send_skb+0x14/0x40
    [<ffffffff81797591>] udp_send_skb+0x91/0x260
    [<ffffffff817985e5>] udp_sendmsg+0x2f5/0x950
    [<ffffffff817a4f80>] inet_sendmsg+0x60/0x90
    [<ffffffff816fe7d3>] sock_sendmsg+0x33/0x40
    [<ffffffff816fed0e>] SYSC_sendto+0xee/0x160
    [<ffffffff816ff719>] SyS_sendto+0x9/0x10

(248 bytes being an nf_conn structure)

Those structures being cleared in gc_worker() later on we can't talk
about unreferenced object so this patch uses kmemleak_not_leak() to
prevent those warnings.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
---
 net/netfilter/nf_conntrack_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 50106a1..59396f3 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1022,6 +1022,7 @@ __nf_conntrack_alloc(struct net *net,
 	 * SLAB_DESTROY_BY_RCU.
 	 */
 	ct = kmem_cache_alloc(nf_conntrack_cachep, gfp);
+	kmemleak_not_leak(ct);
 	if (ct == NULL)
 		goto out;
 
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1 linux-next] netfilter: conntrack: fix kmemleak false positive
  2016-09-21 19:49 [PATCH 1/1 linux-next] netfilter: conntrack: fix kmemleak false positive Fabian Frederick
@ 2016-09-21 21:02 ` Florian Westphal
  2016-09-22 17:55   ` Fabian Frederick
  0 siblings, 1 reply; 5+ messages in thread
From: Florian Westphal @ 2016-09-21 21:02 UTC (permalink / raw)
  To: Fabian Frederick
  Cc: Florian Westphal, Eric Dumazet, Pablo Neira Ayuso, linux-kernel,
	netfilter-devel

Fabian Frederick <fabf@skynet.be> wrote:
> Since commit f330a7fdbe16
> ("netfilter: conntrack: get rid of conntrack timer")
> 
> closed connections remain longer in /proc/net/nf_conntrack
> 
> Running current kernel; just after boot:
> cat /proc/net/nf_conntrack | wc -l = 5
> 4 minutes required to clean up the table.

We should reap the stale entries while iterating, just like
we do for ctnetlink interface.

Can you try this patch?

diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -212,6 +212,11 @@ static int ct_seq_show(struct seq_file *s, void *v)
 	if (unlikely(!atomic_inc_not_zero(&ct->ct_general.use)))
 		return 0;
 
+	if (nf_ct_should_gc(ct)) {
+		nf_ct_kill(ct);
+		goto release;
+	}
+
 	/* we only want to print DIR_ORIGINAL */
 	if (NF_CT_DIRECTION(hash))
 		goto release;

> Going back to kernel version before commit above there are
> no connections after some seconds.
> 
> Referring to the commit changelog this was an expected behaviour but
> it results in temporary kmemleak reports:

I don't see kmemleak complaints on my test vm, I'm reluctant to
turn it off.

Can you explain why we see such 'false positive'?

The conntracks should still be referenced, as they
are in main table.

> unreferenced object 0xffff88003b0e6600 (size 248):
>   comm "rsyslogd", pid 1595, jiffies 4294741312 (age 7.343s)
>   ...
>   backtrace:
>     [<ffffffff818f0163>] kmemleak_alloc+0x23/0x40
>     [<ffffffff8117abf9>] kmem_cache_alloc+0xd9/0x180
>     [<ffffffff817533a8>] __nf_conntrack_alloc.isra.50+0x48/0x170
>     [<ffffffff81754262>] nf_conntrack_in+0x3a2/0x5f0
>     [<ffffffff817bc9e0>] ipv4_conntrack_local+0x40/0x50
>     [<ffffffff8174f17d>] nf_iterate+0x5d/0x70
>     [<ffffffff8174f1ef>] nf_hook_slow+0x5f/0xb0
>     [<ffffffff8176ffbd>] __ip_local_out+0xad/0xe0
>     [<ffffffff81770007>] ip_local_out+0x17/0x40
>     [<ffffffff817711f4>] ip_send_skb+0x14/0x40
>     [<ffffffff81797591>] udp_send_skb+0x91/0x260
>     [<ffffffff817985e5>] udp_sendmsg+0x2f5/0x950
>     [<ffffffff817a4f80>] inet_sendmsg+0x60/0x90
>     [<ffffffff816fe7d3>] sock_sendmsg+0x33/0x40
>     [<ffffffff816fed0e>] SYSC_sendto+0xee/0x160
>     [<ffffffff816ff719>] SyS_sendto+0x9/0x10
> 
> (248 bytes being an nf_conn structure)
> 
> Those structures being cleared in gc_worker() later on we can't talk
> about unreferenced object so this patch uses kmemleak_not_leak() to
> prevent those warnings.

If thats the case, why is kmemleak complaining? Are you sure this
is a false positive?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1 linux-next] netfilter: conntrack: fix kmemleak false positive
  2016-09-21 21:02 ` Florian Westphal
@ 2016-09-22 17:55   ` Fabian Frederick
  2016-09-22 21:56     ` Florian Westphal
  0 siblings, 1 reply; 5+ messages in thread
From: Fabian Frederick @ 2016-09-22 17:55 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Pablo Neira Ayuso, linux-kernel, netfilter-devel, Eric Dumazet



> On 21 September 2016 at 23:02 Florian Westphal <fw@strlen.de> wrote:
>
>
> Fabian Frederick <fabf@skynet.be> wrote:
> > Since commit f330a7fdbe16
> > ("netfilter: conntrack: get rid of conntrack timer")
> >
> > closed connections remain longer in /proc/net/nf_conntrack
> >
> > Running current kernel; just after boot:
> > cat /proc/net/nf_conntrack | wc -l = 5
> > 4 minutes required to clean up the table.
>
> We should reap the stale entries while iterating, just like
> we do for ctnetlink interface.
>
> Can you try this patch?
>
> diff --git a/net/netfilter/nf_conntrack_standalone.c
> b/net/netfilter/nf_conntrack_standalone.c
> --- a/net/netfilter/nf_conntrack_standalone.c
> +++ b/net/netfilter/nf_conntrack_standalone.c
> @@ -212,6 +212,11 @@ static int ct_seq_show(struct seq_file *s, void *v)
>       if (unlikely(!atomic_inc_not_zero(&ct->ct_general.use)))
>               return 0;
> 
> +     if (nf_ct_should_gc(ct)) {
> +             nf_ct_kill(ct);
> +             goto release;
> +     }
> +
>       /* we only want to print DIR_ORIGINAL */
>       if (NF_CT_DIRECTION(hash))
>               goto release;
>

Hello Florian,

        First problem is solved: table gets cleared 3 minutes earlier
but I still have kmemleak before running the following:

echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak
Nothing
echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak
-> rsyslogd

I talked about false positive because everything is cleared later.

Note that problem appears only in a VM which is really slow due to
ksoftirqd eating lot of CPU for an unknown reason. Maybe you should test
somewhere else before applying.

Regards,
Fabian

> > Going back to kernel version before commit above there are
> > no connections after some seconds.
> >
> > Referring to the commit changelog this was an expected behaviour but
> > it results in temporary kmemleak reports:
>
> I don't see kmemleak complaints on my test vm, I'm reluctant to
> turn it off.
>
> Can you explain why we see such 'false positive'?
>
> The conntracks should still be referenced, as they
> are in main table.
>
> > unreferenced object 0xffff88003b0e6600 (size 248):
> >   comm "rsyslogd", pid 1595, jiffies 4294741312 (age 7.343s)
> >   ...
> >   backtrace:
> >     [] kmemleak_alloc+0x23/0x40
> >     [] kmem_cache_alloc+0xd9/0x180
> >     [] __nf_conntrack_alloc.isra.50+0x48/0x170
> >     [] nf_conntrack_in+0x3a2/0x5f0
> >     [] ipv4_conntrack_local+0x40/0x50
> >     [] nf_iterate+0x5d/0x70
> >     [] nf_hook_slow+0x5f/0xb0
> >     [] __ip_local_out+0xad/0xe0
> >     [] ip_local_out+0x17/0x40
> >     [] ip_send_skb+0x14/0x40
> >     [] udp_send_skb+0x91/0x260
> >     [] udp_sendmsg+0x2f5/0x950
> >     [] inet_sendmsg+0x60/0x90
> >     [] sock_sendmsg+0x33/0x40
> >     [] SYSC_sendto+0xee/0x160
> >     [] SyS_sendto+0x9/0x10
> >
> > (248 bytes being an nf_conn structure)
> >
> > Those structures being cleared in gc_worker() later on we can't talk
> > about unreferenced object so this patch uses kmemleak_not_leak() to
> > prevent those warnings.
>
> If thats the case, why is kmemleak complaining? Are you sure this
> is a false positive?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1 linux-next] netfilter: conntrack: fix kmemleak false positive
  2016-09-22 17:55   ` Fabian Frederick
@ 2016-09-22 21:56     ` Florian Westphal
  2016-09-23 19:40       ` Fabian Frederick
  0 siblings, 1 reply; 5+ messages in thread
From: Florian Westphal @ 2016-09-22 21:56 UTC (permalink / raw)
  To: Fabian Frederick
  Cc: Florian Westphal, Pablo Neira Ayuso, linux-kernel,
	netfilter-devel, Eric Dumazet

Fabian Frederick <fabf@skynet.be> wrote:
> Hello Florian,
> 
>         First problem is solved: table gets cleared 3 minutes earlier
> but I still have kmemleak before running the following:
> 
> echo scan > /sys/kernel/debug/kmemleak
> cat /sys/kernel/debug/kmemleak
> Nothing
> echo scan > /sys/kernel/debug/kmemleak
> cat /sys/kernel/debug/kmemleak
> -> rsyslogd
> 
> I talked about false positive because everything is cleared later.

Hmm, I fear this is a real bug and not false positive.

Should be possible to confirm this via slabinfo:

grep nf_conntrack /proc/slabinfo

The active objects should match the conntrack count.
(conntrack -C, or wc -l < /proc/....).

> > > unreferenced object 0xffff88003b0e6600 (size 248):
> > >   comm "rsyslogd", pid 1595, jiffies 4294741312 (age 7.343s)
> > >   ...
> > >   backtrace:
> > >     [] kmemleak_alloc+0x23/0x40
> > >     [] kmem_cache_alloc+0xd9/0x180
> > >     [] __nf_conntrack_alloc.isra.50+0x48/0x170
> > >     [] nf_conntrack_in+0x3a2/0x5f0
> > >     [] ipv4_conntrack_local+0x40/0x50
> > >     [] nf_iterate+0x5d/0x70
> > >     [] nf_hook_slow+0x5f/0xb0
> > >     [] __ip_local_out+0xad/0xe0
> > >     [] ip_local_out+0x17/0x40
> > >     [] ip_send_skb+0x14/0x40
> > >     [] udp_send_skb+0x91/0x260
> > >     [] udp_sendmsg+0x2f5/0x950
> > >     [] inet_sendmsg+0x60/0x90
> > >     [] sock_sendmsg+0x33/0x40
> > >     [] SYSC_sendto+0xee/0x160
> > >     [] SyS_sendto+0x9/0x10

Hmm, so we leak when allocating conntrack for outgoing packet.
Do you do any filtering (DROP) in output/postrouting?

> > > (248 bytes being an nf_conn structure)
> > >
> > > Those structures being cleared in gc_worker() later on we can't talk
> > > about unreferenced object so this patch uses kmemleak_not_leak() to
> > > prevent those warnings.
> >
> > If thats the case, why is kmemleak complaining? Are you sure this
> > is a false positive?

Looks like a real bug to me, but I don't see anything obvious so far.
I'll look at this again tomorrow.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1 linux-next] netfilter: conntrack: fix kmemleak false positive
  2016-09-22 21:56     ` Florian Westphal
@ 2016-09-23 19:40       ` Fabian Frederick
  0 siblings, 0 replies; 5+ messages in thread
From: Fabian Frederick @ 2016-09-23 19:40 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Pablo Neira Ayuso, linux-kernel, netfilter-devel, Eric Dumazet



> On 22 September 2016 at 23:56 Florian Westphal <fw@strlen.de> wrote:
>
>
> Fabian Frederick <fabf@skynet.be> wrote:
> > Hello Florian,
> >
> >         First problem is solved: table gets cleared 3 minutes earlier
> > but I still have kmemleak before running the following:
> >
> > echo scan > /sys/kernel/debug/kmemleak
> > cat /sys/kernel/debug/kmemleak
> > Nothing
> > echo scan > /sys/kernel/debug/kmemleak
> > cat /sys/kernel/debug/kmemleak
> > -> rsyslogd
> >
> > I talked about false positive because everything is cleared later.
>
> Hmm, I fear this is a real bug and not false positive.
>
> Should be possible to confirm this via slabinfo:
>
> grep nf_conntrack /proc/slabinfo
>
> The active objects should match the conntrack count.
> (conntrack -C, or wc -l < /proc/....).

Softirq latencies have been solved today in linux-next: commit
60bf815c0d08
("Revert "softirq: fix tasklet_kill() and its users"")

but nf_conntrack still needs a lot of time to reach 0

Slabinfo always remains the following:

/proc/slabinfo: 
nf_conntrack          16     16    256   16

even when /proc/net/nf_conntrack is 0

Going back to kernel version before commit f330a7fdbe16
It's the same with 12 active objects:

/proc/slabinfo: 
nf_conntrack          12     12    320  12

>
> > > > unreferenced object 0xffff88003b0e6600 (size 248):
> > > >   comm "rsyslogd", pid 1595, jiffies 4294741312 (age 7.343s)
> > > >   ...
> > > >   backtrace:
> > > >     [] kmemleak_alloc+0x23/0x40
> > > >     [] kmem_cache_alloc+0xd9/0x180
> > > >     [] __nf_conntrack_alloc.isra.50+0x48/0x170
> > > >     [] nf_conntrack_in+0x3a2/0x5f0
> > > >     [] ipv4_conntrack_local+0x40/0x50
> > > >     [] nf_iterate+0x5d/0x70
> > > >     [] nf_hook_slow+0x5f/0xb0
> > > >     [] __ip_local_out+0xad/0xe0
> > > >     [] ip_local_out+0x17/0x40
> > > >     [] ip_send_skb+0x14/0x40
> > > >     [] udp_send_skb+0x91/0x260
> > > >     [] udp_sendmsg+0x2f5/0x950
> > > >     [] inet_sendmsg+0x60/0x90
> > > >     [] sock_sendmsg+0x33/0x40
> > > >     [] SYSC_sendto+0xee/0x160
> > > >     [] SyS_sendto+0x9/0x10
>
> Hmm, so we leak when allocating conntrack for outgoing packet.
> Do you do any filtering (DROP) in output/postrouting?
iptables -L gives ACCEPT for all.

>
> > > > (248 bytes being an nf_conn structure)
> > > >
> > > > Those structures being cleared in gc_worker() later on we can't talk
> > > > about unreferenced object so this patch uses kmemleak_not_leak() to
> > > > prevent those warnings.
> > >
> > > If thats the case, why is kmemleak complaining? Are you sure this
> > > is a false positive?
>
> Looks like a real bug to me, but I don't see anything obvious so far.
> I'll look at this again tomorrow.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-09-23 19:40 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-21 19:49 [PATCH 1/1 linux-next] netfilter: conntrack: fix kmemleak false positive Fabian Frederick
2016-09-21 21:02 ` Florian Westphal
2016-09-22 17:55   ` Fabian Frederick
2016-09-22 21:56     ` Florian Westphal
2016-09-23 19:40       ` Fabian Frederick

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.