netdev.vger.kernel.org archive mirror
* Memory leak in ip_dst_cache
@ 2011-09-09  5:04 Kumar S
  2011-09-09  5:30 ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Kumar S @ 2011-09-09  5:04 UTC (permalink / raw)
  To: netdev

Hi,
We are running a Linux 2.6.24 kernel on an MPC8360. Although forwarding is disabled, when the system is connected to a public network it runs out of memory and reboots frequently. After some investigation we found that the "ip_dst_cache" slab grows slowly and does not release entries. Interestingly, "ip route ls cache" shows fewer route entries than the active objects listed by "cat /proc/slabinfo | grep ip_dst_cache". We could reproduce this in the lab by injecting packets with different source IP addresses. Ideally ageing should happen and old entries should be cleared out, but that doesn't happen. We do see one or two entries being aged out, but not all.
After some time we did see "rt_run_flush" kick in and flush the ip_dst_cache. That is why "ip route ls cache" shows fewer entries. But it looks like __cache_free() doesn't get called, so these entries are not really released. This results in the leak.
 
Any idea what is going wrong here? Is it a known bug?
 
Regards
psk

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Memory leak in ip_dst_cache
  2011-09-09  5:04 Memory leak in ip_dst_cache Kumar S
@ 2011-09-09  5:30 ` Eric Dumazet
       [not found]   ` <1315593553.98279.YahooMailNeo@web113904.mail.gq1.yahoo.com>
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2011-09-09  5:30 UTC (permalink / raw)
  To: Kumar S; +Cc: netdev

On Thursday, 8 September 2011 at 22:04 -0700, Kumar S wrote:
> Hi,
> We are running a Linux 2.6.24 kernel on an MPC8360. Although forwarding
> is disabled, when the system is connected to a public network it runs
> out of memory and reboots frequently. After some investigation we found
> that the "ip_dst_cache" slab grows slowly and does not release entries.
> Interestingly, "ip route ls cache" shows fewer route entries than the
> active objects listed by "cat /proc/slabinfo | grep ip_dst_cache". We
> could reproduce this in the lab by injecting packets with different
> source IP addresses. Ideally ageing should happen and old entries
> should be cleared out, but that doesn't happen. We do see one or two
> entries being aged out, but not all.
> After some time we did see "rt_run_flush" kick in and flush the
> ip_dst_cache. That is why "ip route ls cache" shows fewer entries. But
> it looks like __cache_free() doesn't get called, so these entries are
> not really released. This results in the leak.
>
> Any idea what is going wrong here? Is it a known bug?

Please send:

grep . /proc/sys/net/ipv4/route/*
rtstat -c10 -i1

This is a very well-known problem. You need to upgrade your kernel in order
to avoid very complex tuning of your IP route cache.

Recent kernels have smooth garbage collection.

In the meantime, you can tune a bit:

echo 1 >/proc/sys/net/ipv4/route/gc_interval
echo 4 >/proc/sys/net/ipv4/route/gc_elasticity
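
One cautious way to apply the two settings above is to record the current values first so they can be put back later. The following is only a sketch, assuming a POSIX shell and root access on the target; the helper name set_route_tunable and the backup file are hypothetical, and the directory is passed as a parameter so the function can be tried against a scratch directory rather than the real /proc tree.

```shell
#!/bin/sh
# Hypothetical helper: save the old value of a tunable, then write a new one.
# $1 = directory holding the tunables (e.g. /proc/sys/net/ipv4/route)
# $2 = tunable name, $3 = new value, $4 = backup file ("name:value" lines)
set_route_tunable() {
    dir=$1 name=$2 new=$3 backup=$4
    old=$(cat "$dir/$name") || return 1
    printf '%s:%s\n' "$name" "$old" >>"$backup"
    printf '%s\n' "$new" >"$dir/$name"
}

# Demonstration against a scratch directory, seeded with the values
# reported in this thread (gc_interval 60, gc_elasticity 8).
scratch=$(mktemp -d)
echo 60 >"$scratch/gc_interval"
echo 8 >"$scratch/gc_elasticity"
set_route_tunable "$scratch" gc_interval 1 "$scratch/backup"
set_route_tunable "$scratch" gc_elasticity 4 "$scratch/backup"
cat "$scratch/backup"
```

On a real system the first argument would be /proc/sys/net/ipv4/route, and the backup file then lists the values to restore if the tuning does not help.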


* Re: Memory leak in ip_dst_cache
       [not found]     ` <1315596786.2606.3.camel@edumazet-laptop>
@ 2011-09-09 20:48       ` Kumar S
  2011-09-09 21:47         ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Kumar S @ 2011-09-09 20:48 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

Hi Eric,
I did not flush those routes explicitly. They are flushed out automatically by "rt_run_flush". It in turn calls rt_free(), but __cache_free() never gets called after that. Basically, I have 512 MBytes of memory on board. I want to restrict the cache to around 64k entries with max_size. But if I do that, then because of this leak TCP/IP communication stops working when the cache reaches 64k.
 
Thanks
psk


----- Original Message -----
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Kumar S <ps2kumar@yahoo.com>
Cc: 
Sent: Friday, September 9, 2011 12:33 PM
Subject: Re: Memory leak in ip_dst_cache

On Friday, 9 September 2011 at 11:39 -0700, Kumar S wrote:
> Hi Eric,
> Thanks for the response. Here is the output of the commands you mentioned.
> 

Hmm, you should have included netdev in this message.

> After injecting around 11k routes:
> -sh-2.05b#
> -sh-2.05b# ip route ls cache | wc
>   23256  162772 1166903
> -sh-2.05b#
> 
> After route flush out:
> 
> -sh-2.05b#
> -sh-2.05b# ip route ls cache | wc
>      14      88     710
> -sh-2.05b#
> -sh-2.05b#
> -sh-2.05b#
> -sh-2.05b# cat /proc/slabinfo | grep ip_dst
> ip_dst_cache       11673  11685    256   15    1 : tunables  120   60    0 : slabdata    779    779      0

That's OK: each TCP session holds a pointer to a dst.

When you run "ip route flush cache", all busy dsts are invalidated.
"ip route ls cache" doesn't output invalidated dsts.

As soon as a TCP session tries to send a packet, it will notice the dst
is obsolete and 'free' it.


> -sh-2.05b#
> -sh-2.05b# grep . /proc/sys/net/ipv4/route/*
> /proc/sys/net/ipv4/route/error_burst:1250
> /proc/sys/net/ipv4/route/error_cost:250
> grep: /proc/sys/net/ipv4/route/flush: Permission denied
> /proc/sys/net/ipv4/route/gc_elasticity:8
> /proc/sys/net/ipv4/route/gc_interval:60
> /proc/sys/net/ipv4/route/gc_min_interval:0
> /proc/sys/net/ipv4/route/gc_min_interval_ms:500
> /proc/sys/net/ipv4/route/gc_thresh:16384
> /proc/sys/net/ipv4/route/gc_timeout:300
> /proc/sys/net/ipv4/route/max_delay:10
> /proc/sys/net/ipv4/route/max_size:2097152
> /proc/sys/net/ipv4/route/min_adv_mss:256
> /proc/sys/net/ipv4/route/min_delay:2
> /proc/sys/net/ipv4/route/min_pmtu:552
> /proc/sys/net/ipv4/route/mtu_expires:600
> /proc/sys/net/ipv4/route/redirect_load:5
> /proc/sys/net/ipv4/route/redirect_number:9
> /proc/sys/net/ipv4/route/redirect_silence:5120
> /proc/sys/net/ipv4/route/secret_interval:600
> -sh-2.05b# rtstat -c10 -i1
> rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
>  entries|  in_hit|in_slow_|in_slow_|in_no_ro|  in_brd|in_marti|in_marti| out_hit|out_slow|out_slow|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|out_hlis|
>         |        |     tot|      mc|     ute|        |  an_dst|  an_src|        |    _tot|     _mc|        |      ed|    miss| verflow| _search|t_search|
>    11675|   27700|  101014|       0|       0|     130|       0|       0|   36284|   23436|       0|       0|       0|       0|       0|    9658|   23140|
>    11675|       6|      18|       0|       0|       0|       0|       0|       2|       0|       0|       0|       0|       0|       0|       0|       0|
>    11675|       2|      10|       0|       0|       0|       0|       0|       2|       0|       0|       0|       0|       0|       0|       0|       0|
>    11675|       8|      18|       0|       0|       0|       0|       0|       2|       0|       0|       0|       0|       0|       0|       0|       0|
>    11675|       6|      18|       0|       0|       0|       0|       0|       2|       0|       0|       0|       0|       0|       0|       0|       0|
>    11675|       4|      16|       0|       0|       0|       0|       0|       4|       0|       0|       0|       0|       0|       0|       0|       0|
>    11675|       8|      12|       0|       0|       0|       0|       0|       2|       0|       0|       0|       0|       0|       0|       0|       0|
>    11675|       6|      10|       0|       0|       0|       0|       0|       2|       0|       0|       0|       0|       0|       0|       0|       0|
>    11675|       2|      14|       0|       0|       0|       0|       0|       2|       0|       0|       0|       0|       0|       0|       0|       0|
>    11675|       8|      12|       0|       0|       0|       0|       0|       2|       0|       0|       0|       0|       0|       0|       0|       0|
> -sh-2.05b#
> 
> 

That's pretty low IP route usage; you don't need to tweak your tunables.

Thanks


* Re: Memory leak in ip_dst_cache
  2011-09-09 20:48       ` Kumar S
@ 2011-09-09 21:47         ` Eric Dumazet
       [not found]           ` <1315605497.25052.YahooMailNeo@web113916.mail.gq1.yahoo.com>
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2011-09-09 21:47 UTC (permalink / raw)
  To: Kumar S; +Cc: netdev

On Friday, 9 September 2011 at 13:48 -0700, Kumar S wrote:
> Hi Eric,
> I did not flush those routes explicitly. They are flushed out
> automatically by "rt_run_flush". It in turn calls rt_free(),
> but __cache_free() never gets called after that. Basically, I have 512
> MBytes of memory on board. I want to restrict the cache to around 64k
> entries with max_size. But if I do that, then because of this leak
> TCP/IP communication stops working when the cache reaches 64k.
>

Your dsts use about 3 MB of memory out of 512 MB. It's certainly not the
reason for the failures you have.

It's not a leak.

All "in use" dsts are queued in dst_gc; check net/core/dst.c for details.
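
The roughly 3 MB figure follows from the slabinfo line posted earlier in this thread: number of objects times object size. Here is a minimal sketch of that arithmetic, assuming the slabinfo 2.x column order (name, active_objs, num_objs, objsize); the function name slab_kib is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read slabinfo-format lines on stdin and print the
# memory held by the objects of one cache, in KiB (num_objs * objsize / 1024).
slab_kib() {
    awk -v cache="$1" '$1 == cache { printf "%d\n", ($3 * $4) / 1024 }'
}

# The ip_dst_cache line exactly as posted in this thread.
sample='ip_dst_cache       11673  11685    256   15    1 : tunables  120   60    0 : slabdata    779    779      0'
printf '%s\n' "$sample" | slab_kib ip_dst_cache
```

On a live system the function could instead be fed from /proc/slabinfo (often readable only by root). By the same arithmetic, 1200000 objects of 256 bytes each come to 300000 KiB, which matches the "close to 300 MB" reported for the 24-hour run.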


* Re: Memory leak in ip_dst_cache
       [not found]           ` <1315605497.25052.YahooMailNeo@web113916.mail.gq1.yahoo.com>
@ 2011-09-09 22:08             ` Eric Dumazet
  2011-09-09 22:53               ` Kumar S
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2011-09-09 22:08 UTC (permalink / raw)
  To: Kumar S; +Cc: netdev

On Friday, 9 September 2011 at 14:58 -0700, Kumar S wrote:
> I agree, 3 MB is not a concern. But actually this is exceeding, we are
> seeing more than 1200000 objects in a span of 24 hours, which is close
> to 300 MB.

Hmm, please include netdev in your mails.

Then there is something else.

Maybe a timer bug is preventing dst_gc_task() from doing its job.

> Other tasks require somewhere around 200 MB+. When it reaches this
> number, the board reboots. We could simulate this in the lab, which is
> how I'm trying to understand the problem. I'm injecting around 10k
> routes and waiting for them to age out.
>

I don't know; you could try disabling the IP route cache.

On recent (2.6.29+) kernels, that's very easy:

# Disable IP route cache (negative value in rebuild_count)
echo 3000000000 >/proc/sys/net/ipv4/rt_cache_rebuild_count

On an old kernel, you probably need to patch your kernel, or backport
commit 1080d709fb9d8cd43 (net: implement emergency route cache rebulds
when gc_elasticity is exceeded)


* Re: Memory leak in ip_dst_cache
  2011-09-09 22:08             ` Eric Dumazet
@ 2011-09-09 22:53               ` Kumar S
  2011-09-10 13:04                 ` Neil Horman
  0 siblings, 1 reply; 14+ messages in thread
From: Kumar S @ 2011-09-09 22:53 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

Yes Eric, this command doesn't work on 2.6.24. Which timer are you suspecting?



* Re: Memory leak in ip_dst_cache
  2011-09-09 22:53               ` Kumar S
@ 2011-09-10 13:04                 ` Neil Horman
  2011-09-12  3:38                   ` Kumar S
  0 siblings, 1 reply; 14+ messages in thread
From: Neil Horman @ 2011-09-10 13:04 UTC (permalink / raw)
  To: Kumar S; +Cc: Eric Dumazet, netdev

On Fri, Sep 09, 2011 at 03:53:45PM -0700, Kumar S wrote:
> Yes Eric, this command doesn't work on 2.6.24. Which timer are you suspecting?
> 
He means that the timer that starts the workqueue which calls dst_gc_task is
either delayed indefinitely or otherwise not run, which means that the route
cache will never be scanned for old entries.  The implication is that the dst
entries aren't leaked per se, but they never expire, so they just sit
around.  You can check this by instrumenting dst_gc_task with a printk or two
and watching to see if it ever pops up.

If you only have 512 MB of RAM and really want to restrict your cache size, you
should definitely consider backporting the route cache rebuild patch to disable
route cache usage entirely.  It shouldn't be too hard to do.
Neil
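
Before patching the kernel, a coarser check is possible from user space: sample the active object count of ip_dst_cache twice and see whether it ever drops. The sketch below assumes the slabinfo 2.x column order (active_objs in the second column); the function names and the second sample value are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the active object count of one cache from
# slabinfo-format input on stdin.
active_objs() {
    awk -v cache="$1" '$1 == cache { print $2 }'
}

# Compare two samples and report whether the cache shrank, grew, or stayed flat.
trend() {
    if [ "$2" -lt "$1" ]; then
        echo shrinking
    elif [ "$2" -gt "$1" ]; then
        echo growing
    else
        echo flat
    fi
}

# First sample is the count posted in this thread; the second is made up.
before=$(printf 'ip_dst_cache 11673 11685 256 15 1\n' | active_objs ip_dst_cache)
after=$(printf 'ip_dst_cache 11675 11685 256 15 1\n' | active_objs ip_dst_cache)
trend "$before" "$after"
```

On the target, each sample would come from /proc/slabinfo, taken some multiple of gc_timeout apart. A count that never reports "shrinking" with no traffic injected would be consistent with the garbage collector not running, though it does not prove it.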

* Re: Memory leak in ip_dst_cache
  2011-09-10 13:04                 ` Neil Horman
@ 2011-09-12  3:38                   ` Kumar S
  2011-09-12  5:40                     ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Kumar S @ 2011-09-12  3:38 UTC (permalink / raw)
  To: Neil Horman; +Cc: Eric Dumazet, netdev

Thanks Neil. I did try with printk(). I do see entries getting aged out, but they are not getting deallocated. This seems to be happening because of "ref_cnt". When the route entries are added, ref_cnt is set to 1. It looks like this prevents the entries from being cleared completely. If I set ref_cnt to 0, I can see it working. Now I'm trying to understand whether this is right. Please let me know if you have any thoughts on it.



* Re: Memory leak in ip_dst_cache
  2011-09-12  3:38                   ` Kumar S
@ 2011-09-12  5:40                     ` Eric Dumazet
  2011-09-12  6:07                       ` Kumar S
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2011-09-12  5:40 UTC (permalink / raw)
  To: Kumar S; +Cc: Neil Horman, netdev

On Sunday, 11 September 2011 at 20:38 -0700, Kumar S wrote:

Please don't top-post.

> Thanks Neil. I did try with printk(). I do see entries getting aged
> out, but they are not getting deallocated. This seems to be happening
> because of "ref_cnt". When the route entries are added, ref_cnt is
> set to 1. It looks like this prevents the entries from being cleared
> completely. If I set ref_cnt to 0, I can see it working. Now I'm
> trying to understand whether this is right. Please let me know if you
> have any thoughts on it.

I believe I already explained what was happening.

A TCP socket has a pointer to a dst, so it holds a reference on it, to
make sure the dst cannot be freed while at least one socket can still
reference it. (Otherwise the socket could reference freed memory and crash.)

As soon as the TCP socket tries to transmit some data, the dst is
checked and we notice it's obsolete: we then release the refcount and
the dst pointer.

Later, the garbage collector can notice the dst refcount is zero and can
free the dst.

If you have dormant TCP sockets (no traffic at all), they hold their dst.

A dormant TCP socket has a much higher memory cost than its
dst. (Socket structure, dentry, inode, and probably in user land a
thread or process, and data.)


* Re: Memory leak in ip_dst_cache
  2011-09-12  5:40                     ` Eric Dumazet
@ 2011-09-12  6:07                       ` Kumar S
  2011-09-12  6:28                         ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Kumar S @ 2011-09-12  6:07 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Neil Horman, netdev

Thanks Eric for the detailed explanation. You did mention this before. What I see is that the cache entries related to the TCP sockets are cleared whenever they age out. The issue we see here is with broadcast messages such as SMB and network neighborhood messages: their entries never get freed. There is no traffic to those destinations from our board.


* Re: Memory leak in ip_dst_cache
  2011-09-12  6:07                       ` Kumar S
@ 2011-09-12  6:28                         ` Eric Dumazet
  2011-09-12 17:16                           ` Kumar S
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2011-09-12  6:28 UTC (permalink / raw)
  To: Kumar S; +Cc: Neil Horman, netdev

On Sunday, 11 September 2011 at 23:07 -0700, Kumar S wrote:
> Thanks Eric for the detailed explanation. You did mention this before.
> What I see is that the cache entries related to the TCP sockets are
> cleared whenever they age out. The issue we see here is with broadcast
> messages such as SMB and network neighborhood messages: their entries
> never get freed. There is no traffic to those destinations from our
> board.

What do you mean? Is your box a router only?

Are those SMB messages going through it?


* Re: Memory leak in ip_dst_cache
  2011-09-12  6:28                         ` Eric Dumazet
@ 2011-09-12 17:16                           ` Kumar S
  2011-09-12 17:57                             ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Kumar S @ 2011-09-12 17:16 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Neil Horman, netdev

> What do you mean? Is your box a router only?
>
> Are those SMB messages going through it?
 
Our box is a stand-alone system with an L2 Quick Engine. This QE forwards all broadcasts to the other ports and also sends a copy to the CPU port.


* Re: Memory leak in ip_dst_cache
  2011-09-12 17:16                           ` Kumar S
@ 2011-09-12 17:57                             ` Eric Dumazet
  2011-09-12 22:20                               ` Kumar S
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2011-09-12 17:57 UTC (permalink / raw)
  To: Kumar S; +Cc: Neil Horman, netdev

On Monday, 12 September 2011 at 10:16 -0700, Kumar S wrote:
> > What do you mean? Is your box a router only?
> >
> > Are those SMB messages going through it?
>
> Our box is a stand-alone system with an L2 Quick Engine. This QE forwards
> all broadcasts to the other ports and also sends a copy to the CPU port.

It sounds like a modified kernel, maybe you added a bug in the code...


* Re: Memory leak in ip_dst_cache
  2011-09-12 17:57                             ` Eric Dumazet
@ 2011-09-12 22:20                               ` Kumar S
  0 siblings, 0 replies; 14+ messages in thread
From: Kumar S @ 2011-09-12 22:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Neil Horman, netdev



----- Original Message -----
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Kumar S <ps2kumar@yahoo.com>
Cc: Neil Horman <nhorman@tuxdriver.com>; netdev <netdev@vger.kernel.org>
Sent: Monday, September 12, 2011 10:57 AM
Subject: Re: Memory leak in ip_dst_cache

On Monday, 12 September 2011 at 10:16 -0700, Kumar S wrote:
> > > What do you mean? Is your box a router only?
> > >
> > > Are those SMB messages going through it?
> >
> > Our box is a stand-alone system with an L2 Quick Engine. This QE forwards
> > all broadcasts to the other ports and also sends a copy to the CPU port.
>
> It sounds like a modified kernel, maybe you added a bug in the code...

It is possible. I'm trying to isolate it. Your input is helping a lot in understanding the flow related to ip_dst_cache.

