linux-kernel.vger.kernel.org archive mirror
* Memory leak in 2.6.11-rc1?
@ 2005-01-21 16:19 Jan Kasprzak
  2005-01-22  2:23 ` Alexander Nyberg
  2005-02-07 11:00 ` Jan Kasprzak
  0 siblings, 2 replies; 87+ messages in thread
From: Jan Kasprzak @ 2005-01-21 16:19 UTC (permalink / raw)
  To: linux-kernel

	Hi all,

I've been running 2.6.11-rc1 on my dual opteron Fedora Core 3 box for a week
now, and I think there is a memory leak somewhere. I am measuring the
size of active and inactive pages (from /proc/meminfo), and it seems
that the sum of (active+inactive) pages is decreasing. Please
take a look at the graphs at

http://www.linux.cz/stats/mrtg-rrd/vm_active.html

(especially the "monthly" graph) - I've booted 2.6.11-rc1 last Friday,
and since then the size of "inactive" pages is decreasing almost
constantly, while "active" is not increasing. The active+inactive
sum has been steady before, as you can see from both the monthly
and yearly graphs.
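
The check itself is trivial; roughly what the data collection boils down to
(just the two /proc/meminfo fields, a minimal sketch with error handling
mostly omitted):

#include <stdio.h>

int main(void)
{
	char line[128];
	unsigned long active = 0, inactive = 0, val;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "Active: %lu kB", &val) == 1)
			active = val;
		else if (sscanf(line, "Inactive: %lu kB", &val) == 1)
			inactive = val;
	}
	fclose(f);
	printf("active=%lu kB inactive=%lu kB sum=%lu kB\n",
	       active, inactive, active + inactive);
	return 0;
}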

Now I am playing with 2.6.11-rc1-bk snapshots to see what happens.
I was running 2.6.10-rc3 before. More info is available; please ask me.
The box runs a 3ware 7506-8 controller with SW RAID-0, 1, and 5 volumes,
and a Tigon3 network card. The main load is an FTP server, and there is also
an HTTP server and Qmail.

	Thanks,

-Yenya

-- 
| Jan "Yenya" Kasprzak  <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839      Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/   Czech Linux Homepage: http://www.linux.cz/ |
> Whatever the Java applications and desktop dances may lead to, Unix will <
> still be pushing the packets around for quite a while.        --Rob Pike <


* Re: Memory leak in 2.6.11-rc1?
  2005-01-21 16:19 Memory leak in 2.6.11-rc1? Jan Kasprzak
@ 2005-01-22  2:23 ` Alexander Nyberg
  2005-01-23  9:11   ` Jens Axboe
  2005-02-07 11:00 ` Jan Kasprzak
  1 sibling, 1 reply; 87+ messages in thread
From: Alexander Nyberg @ 2005-01-22  2:23 UTC (permalink / raw)
  To: Jan Kasprzak; +Cc: linux-kernel

On Fri, 2005-01-21 at 17:19 +0100, Jan Kasprzak wrote:
> 	Hi all,
> 
> I've been running 2.6.11-rc1 on my dual opteron Fedora Core 3 box for a week
> now, and I think there is a memory leak somewhere. I am measuring the
> size of active and inactive pages (from /proc/meminfo), and it seems
> that the sum of (active+inactive) pages is decreasing. Please
> take a look at the graphs at
> 
> http://www.linux.cz/stats/mrtg-rrd/vm_active.html
> 
> (especially the "monthly" graph) - I've booted 2.6.11-rc1 last Friday,
> and since then the size of "inactive" pages is decreasing almost
> constantly, while "active" is not increasing. The active+inactive
> sum has been steady before, as you can see from both the monthly
> and yearly graphs.
> 
> Now I am playing with 2.6.11-rc1-bk snapshots to see what happens.
> I was running 2.6.10-rc3 before. More info is available; please ask me.
> The box runs a 3ware 7506-8 controller with SW RAID-0, 1, and 5 volumes,
> and a Tigon3 network card. The main load is an FTP server, and there is also
> an HTTP server and Qmail.

Others have seen this as well, the reports indicated that it takes a day
or two before it becomes noticeable. When it happens next time please
capture the output of /proc/meminfo and /proc/slabinfo.
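
If it helps, something like this can rank the caches in a slabinfo dump by
rough footprint (just a sketch, assuming the slabinfo 2.1 layout; it
estimates each cache as num_slabs * pagesperslab * 4 kB and ignores the rest):

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[512], name[64];
	unsigned long active, num, objsize, objperslab, pagesperslab;
	unsigned long active_slabs, num_slabs;

	/* expects "cat /proc/slabinfo" on stdin, skips the header lines */
	while (fgets(line, sizeof(line), stdin)) {
		if (line[0] == '#' || !strncmp(line, "slabinfo", 8))
			continue;
		if (sscanf(line, "%63s %lu %lu %lu %lu %lu : tunables %*u %*u %*u"
			   " : slabdata %lu %lu",
			   name, &active, &num, &objsize, &objperslab,
			   &pagesperslab, &active_slabs, &num_slabs) == 8)
			printf("%-24s %8lu kB\n", name,
			       num_slabs * pagesperslab * 4);
	}
	return 0;
}

Piping the output through "sort -nr -k2" puts the biggest caches first.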

Thanks
Alexander



* Re: Memory leak in 2.6.11-rc1?
  2005-01-22  2:23 ` Alexander Nyberg
@ 2005-01-23  9:11   ` Jens Axboe
  2005-01-23  9:19     ` Andrew Morton
  0 siblings, 1 reply; 87+ messages in thread
From: Jens Axboe @ 2005-01-23  9:11 UTC (permalink / raw)
  To: Alexander Nyberg; +Cc: Jan Kasprzak, linux-kernel, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 1444 bytes --]

On Sat, Jan 22 2005, Alexander Nyberg wrote:
> On Fri, 2005-01-21 at 17:19 +0100, Jan Kasprzak wrote:
> > 	Hi all,
> > 
> > I've been running 2.6.11-rc1 on my dual opteron Fedora Core 3 box for a week
> > now, and I think there is a memory leak somewhere. I am measuring the
> > size of active and inactive pages (from /proc/meminfo), and it seems
> > that the sum of (active+inactive) pages is decreasing. Please
> > take a look at the graphs at
> > 
> > http://www.linux.cz/stats/mrtg-rrd/vm_active.html
> > 
> > (especially the "monthly" graph) - I've booted 2.6.11-rc1 last Friday,
> > and since then the size of "inactive" pages is decreasing almost
> > constantly, while "active" is not increasing. The active+inactive
> > sum has been steady before, as you can see from both the monthly
> > and yearly graphs.
> > 
> > Now I am playing with 2.6.11-rc1-bk snapshots to see what happens.
> > I was running 2.6.10-rc3 before. More info is available; please ask me.
> > The box runs a 3ware 7506-8 controller with SW RAID-0, 1, and 5 volumes,
> > and a Tigon3 network card. The main load is an FTP server, and there is also
> > an HTTP server and Qmail.
> 
> Others have seen this as well, the reports indicated that it takes a day
> or two before it becomes noticeable. When it happens next time please
> capture the output of /proc/meminfo and /proc/slabinfo.

This is after 2 days of uptime, the box is basically unusable.

-- 
Jens Axboe


[-- Attachment #2: meminfo --]
[-- Type: text/plain, Size: 676 bytes --]

MemTotal:      1022372 kB
MemFree:         10024 kB
Buffers:          4664 kB
Cached:         121564 kB
SwapCached:      33636 kB
Active:         429544 kB
Inactive:       109512 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      1022372 kB
LowFree:         10024 kB
SwapTotal:     1116476 kB
SwapFree:       729056 kB
Dirty:             180 kB
Writeback:           0 kB
Mapped:         422216 kB
Slab:            42948 kB
CommitLimit:   1627660 kB
Committed_AS:  1134080 kB
PageTables:       7976 kB
VmallocTotal: 34359738367 kB
VmallocUsed:      1152 kB
VmallocChunk: 34359737171 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     2048 kB

[-- Attachment #3: slabinfo --]
[-- Type: text/plain, Size: 13371 bytes --]

slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <batchcount> <limit> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
fat_inode_cache        0      0    592    6    1 : tunables   54   27    0 : slabdata      0      0      0
fat_cache              0      0     32  119    1 : tunables  120   60    0 : slabdata      0      0      0
rpc_buffers            8      8   2048    2    1 : tunables   24   12    0 : slabdata      4      4      0
rpc_tasks              8     10    384   10    1 : tunables   54   27    0 : slabdata      1      1      0
rpc_inode_cache       10     10    768    5    1 : tunables   54   27    0 : slabdata      2      2      0
fib6_nodes             7     61     64   61    1 : tunables  120   60    0 : slabdata      1      1      0
ip6_dst_cache          7     12    320   12    1 : tunables   54   27    0 : slabdata      1      1      0
ndisc_cache            1     15    256   15    1 : tunables  120   60    0 : slabdata      1      1      0
rawv6_sock             3      4    960    4    1 : tunables   54   27    0 : slabdata      1      1      0
udpv6_sock             1      4    960    4    1 : tunables   54   27    0 : slabdata      1      1      0
tcpv6_sock             2      5   1600    5    2 : tunables   24   12    0 : slabdata      1      1      0
unix_sock            167    187    704   11    2 : tunables   54   27    0 : slabdata     17     17      0
tcp_tw_bucket          0      0    192   20    1 : tunables  120   60    0 : slabdata      0      0      0
tcp_bind_bucket       42    119     32  119    1 : tunables  120   60    0 : slabdata      1      1      0
tcp_open_request       0      0    128   31    1 : tunables  120   60    0 : slabdata      0      0      0
inet_peer_cache        0      0    128   31    1 : tunables  120   60    0 : slabdata      0      0      0
ip_fib_alias          15    119     32  119    1 : tunables  120   60    0 : slabdata      1      1      0
ip_fib_hash           15     61     64   61    1 : tunables  120   60    0 : slabdata      1      1      0
ip_dst_cache         110    130    384   10    1 : tunables   54   27    0 : slabdata     13     13      0
arp_cache              1     15    256   15    1 : tunables  120   60    0 : slabdata      1      1      0
raw_sock               3      5    768    5    1 : tunables   54   27    0 : slabdata      1      1      0
udp_sock              13     20    768    5    1 : tunables   54   27    0 : slabdata      4      4      0
tcp_sock              43     55   1408    5    2 : tunables   24   12    0 : slabdata     11     11      0
flow_cache             0      0    128   31    1 : tunables  120   60    0 : slabdata      0      0      0
uhci_urb_priv          3     45     88   45    1 : tunables  120   60    0 : slabdata      1      1      0
scsi_cmd_cache         7      7    512    7    1 : tunables   54   27    0 : slabdata      1      1      0
cfq_ioc_pool         615    615     96   41    1 : tunables  120   60    0 : slabdata     15     15      0
cfq_pool             137    140    192   20    1 : tunables  120   60    0 : slabdata      7      7      0
crq_pool             391    451     96   41    1 : tunables  120   60    0 : slabdata     11     11      0
deadline_drq           0      0     96   41    1 : tunables  120   60    0 : slabdata      0      0      0
as_arq                 0      0    112   35    1 : tunables  120   60    0 : slabdata      0      0      0
mqueue_inode_cache      1      9    832    9    2 : tunables   54   27    0 : slabdata      1      1      0
nfs_write_data        36     36    832    9    2 : tunables   54   27    0 : slabdata      4      4      0
nfs_read_data         32     35    768    5    1 : tunables   54   27    0 : slabdata      7      7      0
nfs_inode_cache        3      8    848    4    1 : tunables   54   27    0 : slabdata      2      2      0
nfs_page               0      0    128   31    1 : tunables  120   60    0 : slabdata      0      0      0
isofs_inode_cache      0      0    568    7    1 : tunables   54   27    0 : slabdata      0      0      0
hugetlbfs_inode_cache      1      7    520    7    1 : tunables   54   27    0 : slabdata      1      1      0
ext2_inode_cache       1      6    672    6    1 : tunables   54   27    0 : slabdata      1      1      0
ext2_xattr             0      0     88   45    1 : tunables  120   60    0 : slabdata      0      0      0
journal_handle         8    156     24  156    1 : tunables  120   60    0 : slabdata      1      1      0
journal_head          59    180     88   45    1 : tunables  120   60    0 : slabdata      4      4      0
revoke_table           8    225     16  225    1 : tunables  120   60    0 : slabdata      1      1      0
revoke_record          0      0     32  119    1 : tunables  120   60    0 : slabdata      0      0      0
ext3_inode_cache    8721  22905    776    5    1 : tunables   54   27    0 : slabdata   4581   4581      0
ext3_xattr             0      0     88   45    1 : tunables  120   60    0 : slabdata      0      0      0
dnotify_cache        112    192     40   96    1 : tunables  120   60    0 : slabdata      2      2      0
eventpoll_pwq          0      0     72   54    1 : tunables  120   60    0 : slabdata      0      0      0
eventpoll_epi          0      0    192   20    1 : tunables  120   60    0 : slabdata      0      0      0
kioctx                 0      0    320   12    1 : tunables   54   27    0 : slabdata      0      0      0
kiocb                  0      0    256   15    1 : tunables  120   60    0 : slabdata      0      0      0
fasync_cache           1    156     24  156    1 : tunables  120   60    0 : slabdata      1      1      0
shmem_inode_cache     14     22    704   11    2 : tunables   54   27    0 : slabdata      2      2      0
posix_timers_cache      0      0    168   23    1 : tunables  120   60    0 : slabdata      0      0      0
uid_cache              4     61     64   61    1 : tunables  120   60    0 : slabdata      1      1      0
sgpool-128            32     32   4096    1    1 : tunables   24   12    0 : slabdata     32     32      0
sgpool-64             34     34   2048    2    1 : tunables   24   12    0 : slabdata     17     17      0
sgpool-32             36     36   1024    4    1 : tunables   54   27    0 : slabdata      9      9      0
sgpool-16             40     40    512    8    1 : tunables   54   27    0 : slabdata      5      5      0
sgpool-8              45     45    256   15    1 : tunables  120   60    0 : slabdata      3      3      0
blkdev_ioc           115    138     56   69    1 : tunables  120   60    0 : slabdata      2      2      0
blkdev_queue          74     78    640    6    1 : tunables   54   27    0 : slabdata     13     13      0
blkdev_requests      416    416    248   16    1 : tunables  120   60    0 : slabdata     26     26      0
biovec-(256)         256    256   4096    1    1 : tunables   24   12    0 : slabdata    256    256      0
biovec-128           272    284   2048    2    1 : tunables   24   12    0 : slabdata    137    142      0
biovec-64            260    264   1024    4    1 : tunables   54   27    0 : slabdata     66     66      0
biovec-16            270    270    256   15    1 : tunables  120   60    0 : slabdata     18     18      0
biovec-4             264    305     64   61    1 : tunables  120   60    0 : slabdata      5      5      0
biovec-1             360    900     16  225    1 : tunables  120   60    0 : slabdata      4      4      0
bio                  345    465    128   31    1 : tunables  120   60    0 : slabdata     15     15      0
file_lock_cache        2     26    152   26    1 : tunables  120   60    0 : slabdata      1      1      0
sock_inode_cache     236    252    640    6    1 : tunables   54   27    0 : slabdata     42     42      0
skbuff_head_cache    796    960    256   15    1 : tunables  120   60    0 : slabdata     64     64      0
sock                   5      7    576    7    1 : tunables   54   27    0 : slabdata      1      1      0
proc_inode_cache     191    427    552    7    1 : tunables   54   27    0 : slabdata     61     61      0
sigqueue              23     23    168   23    1 : tunables  120   60    0 : slabdata      1      1      0
radix_tree_node     2786   2996    536    7    1 : tunables   54   27    0 : slabdata    428    428      0
bdev_cache            10     10    704    5    1 : tunables   54   27    0 : slabdata      2      2      0
sysfs_dir_cache     2804   2867     64   61    1 : tunables  120   60    0 : slabdata     47     47      0
mnt_cache             27     40    192   20    1 : tunables  120   60    0 : slabdata      2      2      0
inode_cache         1068   1120    520    7    1 : tunables   54   27    0 : slabdata    160    160      0
dentry_cache        6300  46782    216   18    1 : tunables  120   60    0 : slabdata   2599   2599      0
filp                2955   3345    256   15    1 : tunables  120   60    0 : slabdata    223    223      0
names_cache           15     15   4096    1    1 : tunables   24   12    0 : slabdata     15     15      0
idr_layer_cache       63     70    528    7    1 : tunables   54   27    0 : slabdata     10     10      0
buffer_head         1276   3465     88   45    1 : tunables  120   60    0 : slabdata     77     77      0
mm_struct            104    105   1088    7    2 : tunables   24   12    0 : slabdata     15     15      0
vm_area_struct      7688   8360    176   22    1 : tunables  120   60    0 : slabdata    380    380      0
fs_cache             111    183     64   61    1 : tunables  120   60    0 : slabdata      3      3      0
files_cache          108    108    832    9    2 : tunables   54   27    0 : slabdata     12     12      0
signal_cache         135    135    448    9    1 : tunables   54   27    0 : slabdata     15     15      0
sighand_cache        117    117   2112    3    2 : tunables   24   12    0 : slabdata     39     39      0
task_struct          149    152   1728    4    2 : tunables   24   12    0 : slabdata     38     38      0
anon_vma            2143   2475     16  225    1 : tunables  120   60    0 : slabdata     11     11      0
size-131072(DMA)       0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-131072            0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-65536(DMA)        0      0  65536    1   16 : tunables    8    4    0 : slabdata      0      0      0
size-65536             2      2  65536    1   16 : tunables    8    4    0 : slabdata      2      2      0
size-32768(DMA)        0      0  32768    1    8 : tunables    8    4    0 : slabdata      0      0      0
size-32768             0      0  32768    1    8 : tunables    8    4    0 : slabdata      0      0      0
size-16384(DMA)        0      0  16384    1    4 : tunables    8    4    0 : slabdata      0      0      0
size-16384             2      2  16384    1    4 : tunables    8    4    0 : slabdata      2      2      0
size-8192(DMA)         0      0   8192    1    2 : tunables    8    4    0 : slabdata      0      0      0
size-8192             38     38   8192    1    2 : tunables    8    4    0 : slabdata     38     38      0
size-4096(DMA)         0      0   4096    1    1 : tunables   24   12    0 : slabdata      0      0      0
size-4096             86     86   4096    1    1 : tunables   24   12    0 : slabdata     86     86      0
size-2048(DMA)         0      0   2048    2    1 : tunables   24   12    0 : slabdata      0      0      0
size-2048            694    694   2048    2    1 : tunables   24   12    0 : slabdata    347    347      0
size-1024(DMA)         0      0   1024    4    1 : tunables   54   27    0 : slabdata      0      0      0
size-1024            384    384   1024    4    1 : tunables   54   27    0 : slabdata     96     96      0
size-512(DMA)          0      0    512    8    1 : tunables   54   27    0 : slabdata      0      0      0
size-512             572    672    512    8    1 : tunables   54   27    0 : slabdata     84     84      0
size-256(DMA)          0      0    256   15    1 : tunables  120   60    0 : slabdata      0      0      0
size-256              73     75    256   15    1 : tunables  120   60    0 : slabdata      5      5      0
size-192(DMA)          0      0    192   20    1 : tunables  120   60    0 : slabdata      0      0      0
size-192            1760   1760    192   20    1 : tunables  120   60    0 : slabdata     88     88      0
size-128(DMA)          0      0    128   31    1 : tunables  120   60    0 : slabdata      0      0      0
size-128            2269   2356    128   31    1 : tunables  120   60    0 : slabdata     76     76      0
size-64(DMA)           0      0     64   61    1 : tunables  120   60    0 : slabdata      0      0      0
size-64             1991   8357     64   61    1 : tunables  120   60    0 : slabdata    137    137      0
size-32(DMA)           0      0     32  119    1 : tunables  120   60    0 : slabdata      0      0      0
size-32             1131   1190     32  119    1 : tunables  120   60    0 : slabdata     10     10      0
kmem_cache           140    140    192   20    1 : tunables  120   60    0 : slabdata      7      7      0


* Re: Memory leak in 2.6.11-rc1?
  2005-01-23  9:11   ` Jens Axboe
@ 2005-01-23  9:19     ` Andrew Morton
  2005-01-23  9:56       ` Jens Axboe
  0 siblings, 1 reply; 87+ messages in thread
From: Andrew Morton @ 2005-01-23  9:19 UTC (permalink / raw)
  To: Jens Axboe; +Cc: alexn, kas, linux-kernel

Jens Axboe <axboe@suse.de> wrote:
>
>  This is after 2 days of uptime, the box is basically unusable.

hm, no indication where it all went.

Does the machine still page properly?  Can you do a couple of monster
usemems or fillmems to page everything out, then take another look at
meminfo and the sysrq-M output?
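
(If you don't have a fillmem handy, a stand-in is trivial; a minimal sketch
that just dirties the requested number of megabytes of anonymous memory, with
the size taken from argv[1]:)

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
	unsigned long mb = argc > 1 ? strtoul(argv[1], NULL, 0) : 256;
	size_t size = (size_t)mb << 20;
	char *p = malloc(size);
	size_t i;

	if (!p) {
		perror("malloc");
		return 1;
	}
	/* touch every page so it is really instantiated and must be paged out */
	for (i = 0; i < size; i += 4096)
		p[i] = 1;
	printf("touched %lu MB\n", mb);
	return 0;
}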



* Re: Memory leak in 2.6.11-rc1?
  2005-01-23  9:19     ` Andrew Morton
@ 2005-01-23  9:56       ` Jens Axboe
  2005-01-23 10:32         ` Andrew Morton
  0 siblings, 1 reply; 87+ messages in thread
From: Jens Axboe @ 2005-01-23  9:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: alexn, kas, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 581 bytes --]

On Sun, Jan 23 2005, Andrew Morton wrote:
> Jens Axboe <axboe@suse.de> wrote:
> >
> >  This is after 2 days of uptime, the box is basically unusable.
> 
> hm, no indication where it all went.

Nope, that's the annoying part.

> Does the machine still page properly?  Can you do a couple of monster
> usemems or fillmems to page everything out, then take another look at
> meminfo and the sysrq-M output?

It seems so, yes. But I'm still stuck with all of my ram gone after a
600MB fillmem, half of it is just in swap.

Attaching meminfo and sysrq-m after fillmem.

-- 
Jens Axboe


[-- Attachment #2: meminfo --]
[-- Type: text/plain, Size: 676 bytes --]

MemTotal:      1022372 kB
MemFree:        383824 kB
Buffers:           908 kB
Cached:          29468 kB
SwapCached:      49912 kB
Active:         122232 kB
Inactive:        64228 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      1022372 kB
LowFree:        383824 kB
SwapTotal:     1116476 kB
SwapFree:       437884 kB
Dirty:              44 kB
Writeback:           0 kB
Mapped:         130812 kB
Slab:            21948 kB
CommitLimit:   1627660 kB
Committed_AS:  1130368 kB
PageTables:       7884 kB
VmallocTotal: 34359738367 kB
VmallocUsed:      1152 kB
VmallocChunk: 34359737171 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     2048 kB

[-- Attachment #3: sysrq-m --]
[-- Type: text/plain, Size: 1303 bytes --]

SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:      379280kB (0kB HighMem)
Active:31304 inactive:16430 dirty:80 writeback:0 unstable:0 free:94820 slab:5486 mapped:33169 pagetables:1987
DMA free:8632kB min:60kB low:72kB high:88kB active:120kB inactive:2076kB present:16384kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 1007 1007
Normal free:370648kB min:4028kB low:5032kB high:6040kB active:125096kB inactive:63644kB present:1031360kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 414*4kB 220*8kB 54*16kB 10*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 8632kB
Normal: 36308*4kB 17473*8kB 4136*16kB 456*32kB 16*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 370648kB
HighMem: empty
Swap cache: add 301052, delete 288547, find 27582/38199, race 0+0
Free swap  = 437948kB
Total swap = 1116476kB
Free swap:       437948kB
261936 pages of RAM
6456 reserved pages
18505 pages shared
12505 pages swap cached

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-23  9:56       ` Jens Axboe
@ 2005-01-23 10:32         ` Andrew Morton
  2005-01-23 20:03           ` Russell King
  2005-01-24  0:56           ` Alexander Nyberg
  0 siblings, 2 replies; 87+ messages in thread
From: Andrew Morton @ 2005-01-23 10:32 UTC (permalink / raw)
  To: Jens Axboe; +Cc: alexn, kas, linux-kernel

Jens Axboe <axboe@suse.de> wrote:
>
> But I'm still stuck with all of my ram gone after a
>  600MB fillmem, half of it is just in swap.

Well.  Half of it has gone so far ;)

> 
>  Attaching meminfo and sysrq-m after fillmem.

(I meant a really big fillmem: a couple of 2GB ones.  Not to worry.)

It's not in slab and the pagecache and anonymous memory stuff seems to be
working OK.  So it has to be something else, which does a bare
__alloc_pages().  Low-level block stuff, networking, arch code, perhaps.

I don't think I've ever really seen code to diagnose this.

A simplistic approach would be to add eight or so ulongs into struct page,
populate them with builtin_return_address(0...7) at allocation time, then
modify sysrq-m to walk mem_map[] printing it all out for pages which have
page_count() > 0.  That'd find the culprit.
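
Roughly like this (just a sketch, assuming the extra page->order and
page->trace[8] fields described above; print_symbol() is the kallsyms helper):

	unsigned long pfn;
	int i;

	/* walk the whole mem_map and dump the owner of every in-use page */
	for (pfn = min_low_pfn; pfn < max_pfn; pfn++) {
		struct page *page = pfn_to_page(pfn);

		if (!page_count(page))
			continue;
		printk("pfn %lu allocated via order %d\n", pfn, page->order);
		for (i = 0; i < 8 && page->trace[i]; i++)
			print_symbol("  %s\n", page->trace[i]);
	}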


* Re: Memory leak in 2.6.11-rc1?
  2005-01-23 10:32         ` Andrew Morton
@ 2005-01-23 20:03           ` Russell King
  2005-01-24 11:48             ` Russell King
  2005-01-24  0:56           ` Alexander Nyberg
  1 sibling, 1 reply; 87+ messages in thread
From: Russell King @ 2005-01-23 20:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jens Axboe, alexn, kas, linux-kernel, netdev

On Sun, Jan 23, 2005 at 02:32:48AM -0800, Andrew Morton wrote:
> Jens Axboe <axboe@suse.de> wrote:
> >
> > But I'm still stuck with all of my ram gone after a
> >  600MB fillmem, half of it is just in swap.
> 
> Well.  Half of it has gone so far ;)
> 
> > 
> >  Attaching meminfo and sysrq-m after fillmem.
> 
> (I meant a really big fillmem: a couple of 2GB ones.  Not to worry.)
> 
> It's not in slab and the pagecache and anonymous memory stuff seems to be
> working OK.  So it has to be something else, which does a bare
> __alloc_pages().  Low-level block stuff, networking, arch code, perhaps.
> 
> I don't think I've ever really seen code to diagnose this.
> 
> A simplistic approach would be to add eight or so ulongs into struct page,
> populate them with builtin_return_address(0...7) at allocation time, then
> modify sysrq-m to walk mem_map[] printing it all out for pages which have
> page_count() > 0.  That'd find the culprit.

I think I may be seeing something odd here, maybe a possible memory leak.
The only problem I have is wondering whether I'm actually comparing like
with like.  Maybe some networking people can provide a hint?

Below is gathered from 2.6.11-rc1.

bash-2.05a# head -n2 /proc/slabinfo
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
115
ip_dst_cache         759    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
117
ip_dst_cache         770    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
133
ip_dst_cache         775    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
18
ip_dst_cache         664    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
20
ip_dst_cache         664    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
22
ip_dst_cache         673    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
23
ip_dst_cache         670    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
24
ip_dst_cache         675    885    256   15    1
bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
24
ip_dst_cache         669    885    256   15    1

I'm fairly positive when I rebooted the machine a couple of days ago,
ip_dst_cache was significantly smaller for the same number of lines in
/proc/net/rt_cache.
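
To make sure I keep comparing like with like, the two numbers can be sampled
together; a rough sketch (same two proc files as above) that prints one
timestamped line per run, suitable for cron:

#include <stdio.h>
#include <string.h>
#include <time.h>

int main(void)
{
	char line[512];
	unsigned long routes = 0, dst_active = 0;
	FILE *f;

	f = fopen("/proc/net/rt_cache", "r");
	if (f) {
		while (fgets(line, sizeof(line), f))
			routes++;
		fclose(f);
	}

	f = fopen("/proc/slabinfo", "r");
	if (f) {
		while (fgets(line, sizeof(line), f))
			if (!strncmp(line, "ip_dst_cache ", 13))
				sscanf(line, "ip_dst_cache %lu", &dst_active);
		fclose(f);
	}

	/* columns: unix time, rt_cache lines, ip_dst_cache active objects */
	printf("%lu %lu %lu\n", (unsigned long)time(NULL), routes, dst_active);
	return 0;
}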

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core


* Re: Memory leak in 2.6.11-rc1?
  2005-01-23 10:32         ` Andrew Morton
  2005-01-23 20:03           ` Russell King
@ 2005-01-24  0:56           ` Alexander Nyberg
  2005-01-24 20:47             ` Jens Axboe
  2005-01-24 22:05             ` Andrew Morton
  1 sibling, 2 replies; 87+ messages in thread
From: Alexander Nyberg @ 2005-01-24  0:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jens Axboe, kas, linux-kernel, lennert.vanalboom

> I don't think I've ever really seen code to diagnose this.
> 
> A simplistic approach would be to add eight or so ulongs into struct page,
> populate them with builtin_return_address(0...7) at allocation time, then
> modify sysrq-m to walk mem_map[] printing it all out for pages which have
> page_count() > 0.  That'd find the culprit.

Hi Andrew

I put together something similar to what you described, but I made it a
proc-file. It lists all pages owned by some caller and keeps a backtrace
of at most 8 addresses. Each page has an order: -1 for unused; if used, the
first page carries the order under which it was allocated, and the rest of
the group are kept at -1. Below is also a program to sort the enormous amount
of output; it groups together backtraces that are alike and lists them like:

5 times: Page allocated via order 0
[0xffffffff8015861f] __get_free_pages+31
[0xffffffff8015c0ef] cache_alloc_refill+719
[0xffffffff8015bd74] kmem_cache_alloc+84
[0xffffffff8015bddc] alloc_arraycache+60
[0xffffffff8015d15d] do_tune_cpucache+93
[0xffffffff8015bbf8] cache_alloc_debugcheck_after+280
[0xffffffff8015d31d] enable_cpucache+93
[0xffffffff8015d8a5] kmem_cache_create+1365

It's a bit of hackety-hack in the stack trace routine, because doing
__builtin_return_address(0..7) doesn't work very well when it
runs off the end of the stack and the function itself doesn't check for it.

Tested on x86 with and without CONFIG_FRAME_POINTER and on x86-64 (which
might be the only archs it'll work on). I hope you like it ;)


Suggested use is
cat /proc/page_owner > pgown; 
./below_program pgown pgsorted; 
vim pgsorted

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

struct block_list {
	struct block_list *next;
	char *txt;
	int len;
	int num;
};

struct block_list *block_head;

/* Read one record (terminated by a blank line) into buf; returns its length, or -1 on EOF. */
int read_block(char *buf, int fd)
{
	int ret = 0, rd = 0;
	int hit = 0;
	char *curr = buf;
	
	for (;;) {
		rd = read(fd, curr, 1);
		if (rd <= 0)
			return -1;
		
		ret += rd;
		if (*curr == '\n' && hit == 1)
			return ret - 1;
		else if (*curr == '\n')
			hit = 1;
		else
			hit = 0;
		curr++;
	}
}

/* Merge a record into the global list, bumping its count if an identical backtrace was already seen. */
int find_duplicate(char *buf, int len)
{	
	struct block_list *iterate, *item, *prev;
	char *txt;
		
	iterate = block_head;
	while (iterate) {
		if (len != iterate->len)
			goto iterate;
		if (!memcmp(buf, iterate->txt, len)) {
			iterate->num++;
			return 1;
		}
iterate:
		iterate = iterate->next;
	}
	
	/* this block didn't exist */
	txt = malloc(len);
	item = malloc(sizeof(struct block_list));
	strncpy(txt, buf, len);
	item->len = len;
	item->txt = txt;
	item->num = 1;
	item->next = NULL;

	if (block_head) {
		prev = block_head->next;
		block_head->next = item;
		item->next = prev;
	} else
		block_head = item;

	return 0;
}
int main(int argc, char **argv)
{
	int fdin, fdout;
	char buf[1024];
	int ret;
	struct block_list *item;
	
	fdin = open(argv[1], O_RDONLY);
	fdout = open(argv[2], O_CREAT | O_RDWR | O_EXCL, S_IWUSR | S_IRUSR);
	if (fdin < 0 || fdout < 0) {
		printf("Usage: ./program <input> <output>\n");
		perror("open: ");
		exit(2);
	}

	for(;;) {
		ret = read_block(buf, fdin);
		if (ret < 0)
			break;
		
		buf[ret] = '\0';
		find_duplicate(buf, ret);
	}

	for (item = block_head; item; item = item->next) {
		int written;

		/* errors? what errors... */
		ret = snprintf(buf, 1024, "%d times: ", item->num);
		written = write(fdout, buf, ret);
		written = write(fdout, item->txt, item->len);
		written = write(fdout, "\n", 1);
	}
	return 0;
}




===== fs/proc/proc_misc.c 1.113 vs edited =====
--- 1.113/fs/proc/proc_misc.c	2005-01-12 01:42:35 +01:00
+++ edited/fs/proc/proc_misc.c	2005-01-24 00:59:23 +01:00
@@ -534,6 +534,62 @@ static struct file_operations proc_sysrq
 };
 #endif
 
+#if 1
+#include <linux/bootmem.h>
+#include <linux/kallsyms.h>
+static ssize_t
+read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
+{
+	struct page *start = pfn_to_page(min_low_pfn);
+	static struct page *page;
+	char *kbuf, *modname;
+	const char *symname;
+	int ret = 0, next_idx = 1;
+	char namebuf[128];
+	unsigned long offset = 0, symsize;
+	int i;
+	
+	page = start + *ppos;
+	for (; page < pfn_to_page(max_pfn); page++) {
+		if (page->order >= 0)
+			break;
+		next_idx++;
+		continue;
+	}
+
+	if (page >= pfn_to_page(max_pfn))
+		return 0;
+	
+	*ppos += next_idx;
+
+	kbuf = kmalloc(count, GFP_KERNEL);
+	if (!kbuf)
+		return -ENOMEM;
+
+	ret = snprintf(kbuf, count, "Page allocated via order %d\n", page->order);
+	
+	for (i = 0; i < 8; i++) {
+		if (!page->trace[i])
+			break;
+		symname = kallsyms_lookup(page->trace[i], &symsize, &offset, &modname, namebuf);
+		ret += snprintf(kbuf + ret, count - ret, "[0x%lx] %s+%lu\n", 
+			page->trace[i], namebuf, offset);
+	}
+	
+	ret += snprintf(kbuf + ret, count -ret, "\n");
+	
+	if (copy_to_user(buf, kbuf, ret))
+		ret = -EFAULT;
+	
+	kfree(kbuf);
+	return ret;
+}
+
+static struct file_operations proc_page_owner_operations = {
+	.read		= read_page_owner,
+};
+#endif
+
 struct proc_dir_entry *proc_root_kcore;
 
 void create_seq_entry(char *name, mode_t mode, struct file_operations *f)
@@ -610,6 +666,13 @@ void __init proc_misc_init(void)
 		entry = create_proc_entry("ppc_htab", S_IRUGO|S_IWUSR, NULL);
 		if (entry)
 			entry->proc_fops = &ppc_htab_operations;
+	}
+#endif
+#if 1
+	entry = create_proc_entry("page_owner", S_IWUSR | S_IRUGO, NULL);
+	if (entry) {
+		entry->proc_fops = &proc_page_owner_operations;
+		entry->size = 1024;
 	}
 #endif
 }
===== include/linux/mm.h 1.211 vs edited =====
--- 1.211/include/linux/mm.h	2005-01-11 02:29:23 +01:00
+++ edited/include/linux/mm.h	2005-01-23 23:22:52 +01:00
@@ -260,6 +260,10 @@ struct page {
 	void *virtual;			/* Kernel virtual address (NULL if
 					   not kmapped, ie. highmem) */
 #endif /* WANT_PAGE_VIRTUAL */
+#if 1 
+	int order;
+	unsigned long trace[8];
+#endif
 };
 
 /*
===== mm/page_alloc.c 1.254 vs edited =====
--- 1.254/mm/page_alloc.c	2005-01-11 02:29:33 +01:00
+++ edited/mm/page_alloc.c	2005-01-24 01:04:38 +01:00
@@ -103,6 +103,7 @@ static void bad_page(const char *functio
 	tainted |= TAINT_BAD_PAGE;
 }
 
+
 #ifndef CONFIG_HUGETLB_PAGE
 #define prep_compound_page(page, order) do { } while (0)
 #define destroy_compound_page(page, order) do { } while (0)
@@ -680,6 +681,41 @@ int zone_watermark_ok(struct zone *z, in
 	return 1;
 }
 
+static inline int valid_stack_ptr(struct thread_info *tinfo, void *p)
+{
+	return	p > (void *)tinfo &&
+		p < (void *)tinfo + THREAD_SIZE - 3;
+}
+
+static inline void __stack_trace(struct page *page, unsigned long *stack, unsigned long bp)
+{
+	int i = 0;
+	unsigned long addr;
+	struct thread_info *tinfo = (struct thread_info *) 
+		((unsigned long)stack & (~(THREAD_SIZE - 1)));
+	
+	memset(page->trace, 0, sizeof(long) * 8);
+	
+#ifdef	CONFIG_FRAME_POINTER
+	while (valid_stack_ptr(tinfo, (void *)bp)) {
+		addr = *(unsigned long *)(bp + sizeof(long));
+		page->trace[i] = addr;
+		if (++i >= 8)
+			break;
+		bp = *(unsigned long *)bp;
+	}
+#else	
+	while (valid_stack_ptr(tinfo, stack)) {
+		addr = *stack++;
+		if (__kernel_text_address(addr)) {
+			page->trace[i] = addr;
+			if (++i >= 8)
+				break;
+		}
+	}
+#endif
+}
+
 /*
  * This is the 'heart' of the zoned buddy allocator.
  *
@@ -709,6 +745,7 @@ __alloc_pages(unsigned int gfp_mask, uns
 	int alloc_type;
 	int do_retry;
 	int can_try_harder;
+	unsigned long address, bp;
 
 	might_sleep_if(wait);
 
@@ -825,6 +862,14 @@ nopage:
 	return NULL;
 got_pg:
 	zone_statistics(zonelist, z);
+	page->order = (int) order;
+#ifdef X86_64
+	asm ("movq %%rbp, %0" : "=r" (bp) : );
+#else
+	asm ("movl %%ebp, %0" : "=r" (bp) : );
+#endif
+	__stack_trace(page, &address, bp);
+	
 	return page;
 }
 
@@ -877,6 +922,7 @@ fastcall void __free_pages(struct page *
 			free_hot_page(page);
 		else
 			__free_pages_ok(page, order);
+		page->order = -1;
 	}
 }
 
@@ -1508,6 +1554,7 @@ void __init memmap_init_zone(unsigned lo
 			set_page_address(page, __va(start_pfn << PAGE_SHIFT));
 #endif
 		start_pfn++;
+		page->order = -1;
 	}
 }
 




* Re: Memory leak in 2.6.11-rc1?
  2005-01-23 20:03           ` Russell King
@ 2005-01-24 11:48             ` Russell King
  2005-01-25 19:32               ` Russell King
  0 siblings, 1 reply; 87+ messages in thread
From: Russell King @ 2005-01-24 11:48 UTC (permalink / raw)
  To: Andrew Morton, Jens Axboe, alexn, kas, linux-kernel, netdev

On Sun, Jan 23, 2005 at 08:03:15PM +0000, Russell King wrote:
> I think I may be seeing something odd here, maybe a possible memory leak.
> The only problem I have is wondering whether I'm actually comparing like
> with like.  Maybe some networking people can provide a hint?
> 
> Below is gathered from 2.6.11-rc1.
> 
> bash-2.05a# head -n2 /proc/slabinfo
> slabinfo - version: 2.1
> # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 115
> ip_dst_cache         759    885    256   15    1
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 117
> ip_dst_cache         770    885    256   15    1
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 133
> ip_dst_cache         775    885    256   15    1
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 18
> ip_dst_cache         664    885    256   15    1
>...
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 24
> ip_dst_cache         675    885    256   15    1
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 24
> ip_dst_cache         669    885    256   15    1
> 
> I'm fairly positive when I rebooted the machine a couple of days ago,
> ip_dst_cache was significantly smaller for the same number of lines in
> /proc/net/rt_cache.

FYI, today it looks like this:

bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
26
ip_dst_cache         820   1065    256   15    1 

So the dst cache seems to have grown by 151 in 16 hours...  I'll continue
monitoring and providing updates.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core


* Re: Memory leak in 2.6.11-rc1?
  2005-01-24  0:56           ` Alexander Nyberg
@ 2005-01-24 20:47             ` Jens Axboe
  2005-01-24 20:56               ` Andrew Morton
  2005-01-24 22:05             ` Andrew Morton
  1 sibling, 1 reply; 87+ messages in thread
From: Jens Axboe @ 2005-01-24 20:47 UTC (permalink / raw)
  To: Alexander Nyberg; +Cc: Andrew Morton, kas, linux-kernel, lennert.vanalboom

[-- Attachment #1: Type: text/plain, Size: 1458 bytes --]

On Mon, Jan 24 2005, Alexander Nyberg wrote:
> > I don't think I've ever really seen code to diagnose this.
> > 
> > A simplistic approach would be to add eight or so ulongs into struct page,
> > populate them with builtin_return_address(0...7) at allocation time, then
> > modify sysrq-m to walk mem_map[] printing it all out for pages which have
> > page_count() > 0.  That'd find the culprit.
> 
> Hi Andrew
> 
> I put something similar together of what you described but I made it a 
> proc-file. It lists all pages owned by some caller and keeps a backtrace
> of max 8 addresses. Each page has an order, -1 for unused and if used it lists
> the order under which the first page is allocated, the rest in the group are kept -1.
> Below is also a program to sort the enormous amount of
> output, it will group together backtraces that are alike and list them like:
> 
> 5 times: Page allocated via order 0
> [0xffffffff8015861f] __get_free_pages+31
> [0xffffffff8015c0ef] cache_alloc_refill+719
> [0xffffffff8015bd74] kmem_cache_alloc+84
> [0xffffffff8015bddc] alloc_arraycache+60
> [0xffffffff8015d15d] do_tune_cpucache+93
> [0xffffffff8015bbf8] cache_alloc_debugcheck_after+280
> [0xffffffff8015d31d] enable_cpucache+93
> [0xffffffff8015d8a5] kmem_cache_create+1365

Here is the output of your program (somewhat modified, I cut the runtime
by 19/20 killing the 1-byte reads :-) after 10 hours of use with
bk-current as of this morning.

-- 
Jens Axboe


[-- Attachment #2: page_owner_sorted.bz2 --]
[-- Type: application/x-bunzip2, Size: 48730 bytes --]


* Re: Memory leak in 2.6.11-rc1?
  2005-01-24 20:47             ` Jens Axboe
@ 2005-01-24 20:56               ` Andrew Morton
  2005-01-24 21:05                 ` Jens Axboe
  2005-01-24 22:35                 ` Linus Torvalds
  0 siblings, 2 replies; 87+ messages in thread
From: Andrew Morton @ 2005-01-24 20:56 UTC (permalink / raw)
  To: Jens Axboe; +Cc: alexn, kas, linux-kernel, lennert.vanalboom, Linus Torvalds

Jens Axboe <axboe@suse.de> wrote:
>
>  Here is the output of your program (somewhat modified, I cut the runtime
>  by 19/20 killing the 1-byte reads :-) after 10 hours of use with
>  bk-current as of this morning.

hmm..

62130 times: Page allocated via order 0
[0xffffffff80173d6e] pipe_writev+574
[0xffffffff8017402a] pipe_write+26
[0xffffffff80168b47] vfs_write+199
[0xffffffff80168cb3] sys_write+83
[0xffffffff8011e4f3] cstar_do_call+27

55552 times: Page allocated via order 0
[0xffffffff80173d6e] pipe_writev+574
[0xffffffff8017402a] pipe_write+26
[0xffffffff8038b88d] thread_return+41
[0xffffffff80168b47] vfs_write+199
[0xffffffff80168cb3] sys_write+83
[0xffffffff8011e4f3] cstar_do_call+27

Would indicate that the new pipe code is leaking.


* Re: Memory leak in 2.6.11-rc1?
  2005-01-24 20:56               ` Andrew Morton
@ 2005-01-24 21:05                 ` Jens Axboe
  2005-01-24 22:35                 ` Linus Torvalds
  1 sibling, 0 replies; 87+ messages in thread
From: Jens Axboe @ 2005-01-24 21:05 UTC (permalink / raw)
  To: Andrew Morton; +Cc: alexn, kas, linux-kernel, lennert.vanalboom, Linus Torvalds

On Mon, Jan 24 2005, Andrew Morton wrote:
> Jens Axboe <axboe@suse.de> wrote:
> >
> >  Here is the output of your program (somewhat modified, I cut the runtime
> >  by 19/20 killing the 1-byte reads :-) after 10 hours of use with
> >  bk-current as of this morning.
> 
> hmm..
> 
> 62130 times: Page allocated via order 0
> [0xffffffff80173d6e] pipe_writev+574
> [0xffffffff8017402a] pipe_write+26
> [0xffffffff80168b47] vfs_write+199
> [0xffffffff80168cb3] sys_write+83
> [0xffffffff8011e4f3] cstar_do_call+27
> 
> 55552 times: Page allocated via order 0
> [0xffffffff80173d6e] pipe_writev+574
> [0xffffffff8017402a] pipe_write+26
> [0xffffffff8038b88d] thread_return+41
> [0xffffffff80168b47] vfs_write+199
> [0xffffffff80168cb3] sys_write+83
> [0xffffffff8011e4f3] cstar_do_call+27
> 
> Would indicate that the new pipe code is leaking.

I suspected that, I even tried backing out the new pipe patches but it
still seemed to leak. And the test cases I tried to come up with could
not provoke a pipe leak. But yeah, it certainly is the most likely
culprit and the leak did start in the period when it was introduced.

-- 
Jens Axboe



* Re: Memory leak in 2.6.11-rc1?
  2005-01-24  0:56           ` Alexander Nyberg
  2005-01-24 20:47             ` Jens Axboe
@ 2005-01-24 22:05             ` Andrew Morton
  1 sibling, 0 replies; 87+ messages in thread
From: Andrew Morton @ 2005-01-24 22:05 UTC (permalink / raw)
  To: Alexander Nyberg; +Cc: axboe, kas, linux-kernel, lennert.vanalboom

Alexander Nyberg <alexn@dsv.su.se> wrote:
>
> I put something similar together of what you described but I made it a 
> proc-file. It lists all pages owned by some caller and keeps a backtrace
> of max 8 addresses.
> ...
> I hope you like it ;)

I do!  If you have time, please give it all a real config option under the
kernel-hacking menu and I'll sustain it in -mm, thanks.


* Re: Memory leak in 2.6.11-rc1?
  2005-01-24 20:56               ` Andrew Morton
  2005-01-24 21:05                 ` Jens Axboe
@ 2005-01-24 22:35                 ` Linus Torvalds
  2005-01-25 15:53                   ` OT " Paulo Marques
                                     ` (2 more replies)
  1 sibling, 3 replies; 87+ messages in thread
From: Linus Torvalds @ 2005-01-24 22:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jens Axboe, alexn, kas, linux-kernel, lennert.vanalboom



On Mon, 24 Jan 2005, Andrew Morton wrote:
> 
> Would indicate that the new pipe code is leaking.

Duh. It's the pipe merging: anon_pipe_buf_release() stashes the released
page in info->tmp_page for reuse instead of freeing it, so freeing tmp_page
before running the buffers' release ops leaks whatever page ends up stashed
there afterwards.

		Linus

----
--- 1.40/fs/pipe.c	2005-01-15 12:01:16 -08:00
+++ edited/fs/pipe.c	2005-01-24 14:35:09 -08:00
@@ -630,13 +630,13 @@
 	struct pipe_inode_info *info = inode->i_pipe;
 
 	inode->i_pipe = NULL;
-	if (info->tmp_page)
-		__free_page(info->tmp_page);
 	for (i = 0; i < PIPE_BUFFERS; i++) {
 		struct pipe_buffer *buf = info->bufs + i;
 		if (buf->ops)
 			buf->ops->release(info, buf);
 	}
+	if (info->tmp_page)
+		__free_page(info->tmp_page);
 	kfree(info);
 }
 


* OT Re: Memory leak in 2.6.11-rc1?
  2005-01-24 22:35                 ` Linus Torvalds
@ 2005-01-25 15:53                   ` Paulo Marques
  2005-01-26  8:01                   ` Jens Axboe
  2005-02-02  9:29                   ` Lennert Van Alboom
  2 siblings, 0 replies; 87+ messages in thread
From: Paulo Marques @ 2005-01-25 15:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Jens Axboe, alexn, kas, linux-kernel, lennert.vanalboom

Linus Torvalds wrote:
> 
> On Mon, 24 Jan 2005, Andrew Morton wrote:
> 
>>Would indicate that the new pipe code is leaking.
> 
> 
> Duh. It's the pipe merging.

Have we just seen the "plumber" side of Linus?

After all, he just fixed a "leaking pipe" :)


(sorry for the OT, just couldn't help it)

-- 
Paulo Marques - www.grupopie.com

"A journey of a thousand miles begins with a single step."
Lao-tzu, The Way of Lao-tzu


* Re: Memory leak in 2.6.11-rc1?
  2005-01-24 11:48             ` Russell King
@ 2005-01-25 19:32               ` Russell King
  2005-01-27  8:28                 ` Russell King
  0 siblings, 1 reply; 87+ messages in thread
From: Russell King @ 2005-01-25 19:32 UTC (permalink / raw)
  To: Andrew Morton, Jens Axboe, alexn, kas, linux-kernel, netdev

On Mon, Jan 24, 2005 at 11:48:53AM +0000, Russell King wrote:
> On Sun, Jan 23, 2005 at 08:03:15PM +0000, Russell King wrote:
> > I think I may be seeing something odd here, maybe a possible memory leak.
> > The only problem I have is wondering whether I'm actually comparing like
> > with like.  Maybe some networking people can provide a hint?
> > 
> > Below is gathered from 2.6.11-rc1.
> > 
> > bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> > 24
> > ip_dst_cache         669    885    256   15    1
> > 
> > I'm fairly positive when I rebooted the machine a couple of days ago,
> > ip_dst_cache was significantly smaller for the same number of lines in
> > /proc/net/rt_cache.
> 
> FYI, today it looks like this:
> 
> bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> 26
> ip_dst_cache         820   1065    256   15    1 
> 
> So the dst cache seems to have grown by 151 in 16 hours...  I'll continue
> monitoring and providing updates.

Tonights update:
50
ip_dst_cache        1024   1245    256   15    1

As you can see, the dst cache is consistently growing by about 200
entries per day.  Given this, I predict that the box will fall over
due to "dst cache overflow" in roughly 35 days.

kernel network configuration:

CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_FWMARK=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_SYN_COOKIES=y
CONFIG_IPV6=y
CONFIG_NETFILTER=y
CONFIG_IP_NF_CONNTRACK=y
CONFIG_IP_NF_CONNTRACK_MARK=y
CONFIG_IP_NF_FTP=y
CONFIG_IP_NF_IRC=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_MATCH_LIMIT=y
CONFIG_IP_NF_MATCH_IPRANGE=y
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=y
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=y
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_AH_ESP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=y
CONFIG_IP_NF_MATCH_STATE=y
CONFIG_IP_NF_MATCH_CONNTRACK=y
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_MATCH_REALM=m
CONFIG_IP_NF_MATCH_CONNMARK=m
CONFIG_IP_NF_MATCH_HASHLIMIT=m
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_REJECT=y
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
CONFIG_IP_NF_NAT=y
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=y
CONFIG_IP_NF_TARGET_REDIRECT=y
CONFIG_IP_NF_TARGET_NETMAP=y
CONFIG_IP_NF_TARGET_SAME=y
CONFIG_IP_NF_NAT_IRC=y
CONFIG_IP_NF_NAT_FTP=y
CONFIG_IP_NF_MANGLE=y
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_DSCP=m
CONFIG_IP_NF_TARGET_MARK=y
CONFIG_IP_NF_TARGET_CLASSIFY=m
CONFIG_IP_NF_TARGET_CONNMARK=m
CONFIG_IP_NF_TARGET_CLUSTERIP=m
CONFIG_IP6_NF_IPTABLES=y
CONFIG_IP6_NF_MATCH_LIMIT=y
CONFIG_IP6_NF_MATCH_MAC=y
CONFIG_IP6_NF_MATCH_RT=y
CONFIG_IP6_NF_MATCH_OPTS=y
CONFIG_IP6_NF_MATCH_FRAG=y
CONFIG_IP6_NF_MATCH_HL=y
CONFIG_IP6_NF_MATCH_MULTIPORT=y
CONFIG_IP6_NF_MATCH_MARK=y
CONFIG_IP6_NF_MATCH_AHESP=y
CONFIG_IP6_NF_MATCH_LENGTH=y
CONFIG_IP6_NF_FILTER=y
CONFIG_IP6_NF_MANGLE=y
CONFIG_IP6_NF_TARGET_MARK=y


-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core


* Re: Memory leak in 2.6.11-rc1?
  2005-01-24 22:35                 ` Linus Torvalds
  2005-01-25 15:53                   ` OT " Paulo Marques
@ 2005-01-26  8:01                   ` Jens Axboe
  2005-01-26  8:11                     ` Andrew Morton
  2005-02-02  9:29                   ` Lennert Van Alboom
  2 siblings, 1 reply; 87+ messages in thread
From: Jens Axboe @ 2005-01-26  8:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, alexn, kas, linux-kernel, lennert.vanalboom

On Mon, Jan 24 2005, Linus Torvalds wrote:
> 
> 
> On Mon, 24 Jan 2005, Andrew Morton wrote:
> > 
> > Would indicate that the new pipe code is leaking.
> 
> Duh. It's the pipe merging.
> 
> 		Linus
> 
> ----
> --- 1.40/fs/pipe.c	2005-01-15 12:01:16 -08:00
> +++ edited/fs/pipe.c	2005-01-24 14:35:09 -08:00
> @@ -630,13 +630,13 @@
>  	struct pipe_inode_info *info = inode->i_pipe;
>  
>  	inode->i_pipe = NULL;
> -	if (info->tmp_page)
> -		__free_page(info->tmp_page);
>  	for (i = 0; i < PIPE_BUFFERS; i++) {
>  		struct pipe_buffer *buf = info->bufs + i;
>  		if (buf->ops)
>  			buf->ops->release(info, buf);
>  	}
> +	if (info->tmp_page)
> +		__free_page(info->tmp_page);
>  	kfree(info);
>  }

It's better now, no leak anymore. But the 2.6.11-rcX vm is still very
screwy, to get something close to nice and smooth behaviour I have to
run a fillmem every now and then to reclaim used memory.

-- 
Jens Axboe



* Re: Memory leak in 2.6.11-rc1?
  2005-01-26  8:01                   ` Jens Axboe
@ 2005-01-26  8:11                     ` Andrew Morton
  2005-01-26  8:40                       ` Jens Axboe
  0 siblings, 1 reply; 87+ messages in thread
From: Andrew Morton @ 2005-01-26  8:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: torvalds, alexn, kas, linux-kernel, lennert.vanalboom

Jens Axboe <axboe@suse.de> wrote:
>
> But the 2.6.11-rcX vm is still very
>  screwy, to get something close to nice and smooth behaviour I have to
>  run a fillmem every now and then to reclaim used memory.

Can you provide more details?


* Re: Memory leak in 2.6.11-rc1?
  2005-01-26  8:11                     ` Andrew Morton
@ 2005-01-26  8:40                       ` Jens Axboe
  2005-01-26  8:44                         ` Andrew Morton
  0 siblings, 1 reply; 87+ messages in thread
From: Jens Axboe @ 2005-01-26  8:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, alexn, kas, linux-kernel, lennert.vanalboom

On Wed, Jan 26 2005, Andrew Morton wrote:
> Jens Axboe <axboe@suse.de> wrote:
> >
> > But the 2.6.11-rcX vm is still very
> >  screwy, to get something close to nice and smooth behaviour I have to
> >  run a fillmem every now and then to reclaim used memory.
> 
> Can you provide more details?

Hmm not really, I just seem to have a very large piece of
non-cache/buffer memory that seems reluctant to shrink on light memory
pressure. This makes the box feel sluggish, if I force reclaim by
running fillmem and swapping on/off again, it feels much better.

I should mention that this is with 2.6.bk + andreas oom patches that he
asked me to test. I can try 2.6.11-rc2-bkX if you think I should.

-- 
Jens Axboe



* Re: Memory leak in 2.6.11-rc1?
  2005-01-26  8:40                       ` Jens Axboe
@ 2005-01-26  8:44                         ` Andrew Morton
  2005-01-26  8:47                           ` Jens Axboe
  0 siblings, 1 reply; 87+ messages in thread
From: Andrew Morton @ 2005-01-26  8:44 UTC (permalink / raw)
  To: Jens Axboe; +Cc: torvalds, alexn, kas, linux-kernel, lennert.vanalboom

Jens Axboe <axboe@suse.de> wrote:
>
> On Wed, Jan 26 2005, Andrew Morton wrote:
> > Jens Axboe <axboe@suse.de> wrote:
> > >
> > > But the 2.6.11-rcX vm is still very
> > >  screwy, to get something close to nice and smooth behaviour I have to
> > >  run a fillmem every now and then to reclaim used memory.
> > 
> > Can you provide more details?
> 
> Hmm not really, I just seem to have a very large piece of
> non-cache/buffer memory that seems reluctant to shrink on light memory
> pressure.

If it's not pagecache then what is it? slab?

> This makes the box feel sluggish, if I force reclaim by
> running fillmem and swapping on/off again, it feels much better.

before-n-after /proc/meminfo would be interesting.

If you actually meant that it _is_ sticky pagecache then perhaps the recent
mark_page_accessed() changes in filemap.c, although I'd be surprised.

> I should mention that this is with 2.6.bk + andreas oom patches that he
> asked me to test. I can try 2.6.11-rc2-bkX if you think I should.

They shouldn't be causing this sort of thing.


* Re: Memory leak in 2.6.11-rc1?
  2005-01-26  8:44                         ` Andrew Morton
@ 2005-01-26  8:47                           ` Jens Axboe
  2005-01-26  8:52                             ` Jens Axboe
  2005-01-26  8:58                             ` Andrew Morton
  0 siblings, 2 replies; 87+ messages in thread
From: Jens Axboe @ 2005-01-26  8:47 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, alexn, kas, linux-kernel, lennert.vanalboom

On Wed, Jan 26 2005, Andrew Morton wrote:
> Jens Axboe <axboe@suse.de> wrote:
> >
> > On Wed, Jan 26 2005, Andrew Morton wrote:
> > > Jens Axboe <axboe@suse.de> wrote:
> > > >
> > > > But the 2.6.11-rcX vm is still very
> > > >  screwy, to get something close to nice and smooth behaviour I have to
> > > >  run a fillmem every now and then to reclaim used memory.
> > > 
> > > Can you provide more details?
> > 
> > Hmm not really, I just seem to have a very large piece of
> > non-cache/buffer memory that seems reluctant to shrink on light memory
> > pressure.
> 
> If it's not pagecache then what is it? slab?

Must be, if it's reclaimable.

> > This makes the box feel sluggish, if I force reclaim by
> > running fillmem and swapping on/off again, it feels much better.
> 
> before-n-after /proc/meminfo would be interesting.
> 
> If you actually meant that it _is_ sticky pagecache then perhaps the recent
> mark_page_accessed() changes in filemap.c, although I'd be surprised.

I don't think it's sticky page cache, it seems to shrink just fine. This
is my current situation:

axboe@wiggum:/home/axboe $ free
             total       used       free     shared    buffers cached
Mem:       1024992    1015288       9704          0      76680 328148
-/+ buffers/cache:     610460     414532
Swap:            0          0          0

axboe@wiggum:/home/axboe $ cat /proc/meminfo 
MemTotal:      1024992 kB
MemFree:          9768 kB
Buffers:         76664 kB
Cached:         328024 kB
SwapCached:          0 kB
Active:         534956 kB
Inactive:       224060 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      1024992 kB
LowFree:          9768 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:            1400 kB
Writeback:           0 kB
Mapped:         464232 kB
Slab:           225864 kB
CommitLimit:    512496 kB
Committed_AS:   773844 kB
PageTables:       8004 kB
VmallocTotal: 34359738367 kB
VmallocUsed:       644 kB
VmallocChunk: 34359737167 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     2048 kB

> > I should mention that this is with 2.6.bk + Andrea's OOM patches that he
> > asked me to test. I can try 2.6.11-rc2-bkX if you think I should.
> 
> They shouldn't be causing this sort of thing.

I didn't think so, just mentioning it for completeness :)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-26  8:47                           ` Jens Axboe
@ 2005-01-26  8:52                             ` Jens Axboe
  2005-01-26  9:00                               ` William Lee Irwin III
  2005-01-26  8:58                             ` Andrew Morton
  1 sibling, 1 reply; 87+ messages in thread
From: Jens Axboe @ 2005-01-26  8:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, alexn, kas, linux-kernel, lennert.vanalboom

On Wed, Jan 26 2005, Jens Axboe wrote:
> On Wed, Jan 26 2005, Andrew Morton wrote:
> > Jens Axboe <axboe@suse.de> wrote:
> > >
> > > On Wed, Jan 26 2005, Andrew Morton wrote:
> > > > Jens Axboe <axboe@suse.de> wrote:
> > > > >
> > > > > But the 2.6.11-rcX vm is still very
> > > > >  screwy, to get something close to nice and smooth behaviour I have to
> > > > >  run a fillmem every now and then to reclaim used memory.
> > > > 
> > > > Can you provide more details?
> > > 
> > > Hmm not really, I just seem to have a very large piece of
> > > non-cache/buffer memory that seems reluctant to shrink on light memory
> > > pressure.
> > 
> > If it's not pagecache then what is it? slab?
> 
> Must be, if it's reclaimable.
> 
> > > This makes the box feel sluggish, if I force reclaim by
> > > running fillmem and swapping on/off again, it feels much better.
> > 
> > before-n-after /proc/meminfo would be interesting.
> > 
> > If you actually meant that it _is_ sticky pagecache then perhaps the recent
> > mark_page_accessed() changes in filemap.c, although I'd be surprised.
> 
> I don't think it's sticky page cache; it seems to shrink just fine. This
> is my current situation:
> 
> axboe@wiggum:/home/axboe $ free
>              total       used       free     shared    buffers cached
> Mem:       1024992    1015288       9704          0      76680 328148
> -/+ buffers/cache:     610460     414532
> Swap:            0          0          0
> 
> axboe@wiggum:/home/axboe $ cat /proc/meminfo 
> MemTotal:      1024992 kB
> MemFree:          9768 kB
> Buffers:         76664 kB
> Cached:         328024 kB
> SwapCached:          0 kB
> Active:         534956 kB
> Inactive:       224060 kB
> HighTotal:           0 kB
> HighFree:            0 kB
> LowTotal:      1024992 kB
> LowFree:          9768 kB
> SwapTotal:           0 kB
> SwapFree:            0 kB
> Dirty:            1400 kB
> Writeback:           0 kB
> Mapped:         464232 kB
> Slab:           225864 kB
> CommitLimit:    512496 kB
> Committed_AS:   773844 kB
> PageTables:       8004 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed:       644 kB
> VmallocChunk: 34359737167 kB
> HugePages_Total:     0
> HugePages_Free:      0
> Hugepagesize:     2048 kB

The (by far) two largest slab consumers are:

dentry_cache      140940 183060    216   18    1 : tunables  120   60    0 : slabdata  10170  10170      0

and

ext3_inode_cache  185494 194265    776    5    1 : tunables   54   27    0 : slabdata  38853  38853      0

there are about ~40k buffer_head entries as well.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-26  8:47                           ` Jens Axboe
  2005-01-26  8:52                             ` Jens Axboe
@ 2005-01-26  8:58                             ` Andrew Morton
  2005-01-26  9:03                               ` Jens Axboe
  2005-01-26 15:52                               ` Parag Warudkar
  1 sibling, 2 replies; 87+ messages in thread
From: Andrew Morton @ 2005-01-26  8:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: torvalds, alexn, kas, linux-kernel, lennert.vanalboom

Jens Axboe <axboe@suse.de> wrote:
>
> This is my current situation:
> 
> ...
>  axboe@wiggum:/home/axboe $ cat /proc/meminfo 
>  MemTotal:      1024992 kB
>  MemFree:          9768 kB
>  Buffers:         76664 kB
>  Cached:         328024 kB
>  SwapCached:          0 kB
>  Active:         534956 kB
>  Inactive:       224060 kB
>  HighTotal:           0 kB
>  HighFree:            0 kB
>  LowTotal:      1024992 kB
>  LowFree:          9768 kB
>  SwapTotal:           0 kB
>  SwapFree:            0 kB
>  Dirty:            1400 kB
>  Writeback:           0 kB
>  Mapped:         464232 kB
>  Slab:           225864 kB
>  CommitLimit:    512496 kB
>  Committed_AS:   773844 kB
>  PageTables:       8004 kB
>  VmallocTotal: 34359738367 kB
>  VmallocUsed:       644 kB
>  VmallocChunk: 34359737167 kB
>  HugePages_Total:     0
>  HugePages_Free:      0
>  Hugepagesize:     2048 kB

OK.  There's rather a lot of anonymous memory there - 700M on the LRU, 300M
pagecache, 400M anon, 200M of slab.  You need some swapspace ;)
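
(Roughly how that breaks down from the meminfo above: Active + Inactive =
534956 + 224060 kB, i.e. ~740M on the LRU; Cached is ~320M of pagecache plus
~75M of Buffers; the remainder on the LRU - ballpark 350M - is anonymous;
and Slab adds another ~220M on top.)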

What are the symptoms?  Slow to load applications?  Lots of paging?  Poor
I/O speeds?

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-26  8:52                             ` Jens Axboe
@ 2005-01-26  9:00                               ` William Lee Irwin III
  0 siblings, 0 replies; 87+ messages in thread
From: William Lee Irwin III @ 2005-01-26  9:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Andrew Morton, torvalds, alexn, kas, linux-kernel, lennert.vanalboom

On Wed, Jan 26 2005, Jens Axboe wrote:
>> Slab:           225864 kB

On Wed, Jan 26, 2005 at 09:52:30AM +0100, Jens Axboe wrote:
> The (by far) two largest slab consumers are:
> dentry_cache      140940 183060    216   18    1 : tunables  120   60    0 : slabdata  10170  10170      0
> and
> ext3_inode_cache  185494 194265    776    5    1 : tunables   54   27    0 : slabdata  38853  38853      0
> there are about ~40k buffer_head entries as well.

These don't appear to be due to fragmentation. The dcache has 76.99%
utilization and ext3_inode_cache has 95.48% utilization.
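
(Those utilization figures are just active_objs / num_objs from the slabinfo
lines quoted above: 140940 / 183060 ~= 77.0% for dentry_cache and
185494 / 194265 ~= 95.5% for ext3_inode_cache; the remaining columns are
object size, objects per slab and pages per slab.)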


-- wli

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-26  8:58                             ` Andrew Morton
@ 2005-01-26  9:03                               ` Jens Axboe
  2005-01-26 15:52                               ` Parag Warudkar
  1 sibling, 0 replies; 87+ messages in thread
From: Jens Axboe @ 2005-01-26  9:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, alexn, kas, linux-kernel, lennert.vanalboom

On Wed, Jan 26 2005, Andrew Morton wrote:
> Jens Axboe <axboe@suse.de> wrote:
> >
> > This is my current situation:
> > 
> > ...
> >  axboe@wiggum:/home/axboe $ cat /proc/meminfo 
> >  MemTotal:      1024992 kB
> >  MemFree:          9768 kB
> >  Buffers:         76664 kB
> >  Cached:         328024 kB
> >  SwapCached:          0 kB
> >  Active:         534956 kB
> >  Inactive:       224060 kB
> >  HighTotal:           0 kB
> >  HighFree:            0 kB
> >  LowTotal:      1024992 kB
> >  LowFree:          9768 kB
> >  SwapTotal:           0 kB
> >  SwapFree:            0 kB
> >  Dirty:            1400 kB
> >  Writeback:           0 kB
> >  Mapped:         464232 kB
> >  Slab:           225864 kB
> >  CommitLimit:    512496 kB
> >  Committed_AS:   773844 kB
> >  PageTables:       8004 kB
> >  VmallocTotal: 34359738367 kB
> >  VmallocUsed:       644 kB
> >  VmallocChunk: 34359737167 kB
> >  HugePages_Total:     0
> >  HugePages_Free:      0
> >  Hugepagesize:     2048 kB
> 
> OK.  There's rather a lot of anonymous memory there - 700M on the LRU, 300M
> pagecache, 400M anon, 200M of slab.  You need some swapspace ;)

Just forgot to swapon again after the recent fillmem cleanup; I usually
do have 1G of swap on as well!

> What are the symptoms?  Slow to load applications?  Lots of paging?  Poor
> I/O speeds?

No paging, it basically never hits swap. Buffered I/O by itself seems to
run at full speed, but application startup seems sluggish. It's hard to
explain really, but there's a noticeable difference in how the box feels
before and after it has been force-pruned with fillmem.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-26  8:58                             ` Andrew Morton
  2005-01-26  9:03                               ` Jens Axboe
@ 2005-01-26 15:52                               ` Parag Warudkar
  1 sibling, 0 replies; 87+ messages in thread
From: Parag Warudkar @ 2005-01-26 15:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jens Axboe, torvalds, alexn, kas, linux-kernel, lennert.vanalboom

I am running 2.6.11-rc2 plus Linus's fix for the pipe-related leak. I am 
currently running a QT+KDE compile with distcc on two machines.  I have been 
running these machines for around 11 hours now, and swap usage seems to be 
growing steadily on the -rc2 box - it went to ~260 kB after 10 hrs, after 
which I ran swapoff.  Now, a couple of hours later, it is at 40 kB. The other 
machine runs a Knoppix 2.4.26 kernel with less memory and it hasn't touched 
swap at all.

On the -rc2 machine, however, I don't feel anything is sluggish yet. But 
I think if I leave it running long enough it might run out of memory.

I don't know if this is perfectly normal given the differences between 
2.4.x and 2.6.x VM. I will  keep it running and under load for a while 
and report any interesting stuff.

Here is /proc/meminfo on the rc2 box as of now -
[root@localhost paragw]# cat /proc/meminfo
MemTotal:       775012 kB
MemFree:         55260 kB
Buffers:         72732 kB
Cached:         371956 kB
SwapCached:         40 kB
Active:         489508 kB
Inactive:       182360 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       775012 kB
LowFree:         55260 kB
SwapTotal:      787176 kB
SwapFree:       787136 kB
Dirty:            2936 kB
Writeback:           0 kB
Mapped:         259024 kB
Slab:            32288 kB
CommitLimit:   1174680 kB
Committed_AS:   450692 kB
PageTables:       3072 kB
VmallocTotal:   253876 kB
VmallocUsed:     25996 kB
VmallocChunk:   226736 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     4096 kB

Parag
Andrew Morton wrote:

>Jens Axboe <axboe@suse.de> wrote:
>  
>
>>This is my current situation:
>>
>>...
>> axboe@wiggum:/home/axboe $ cat /proc/meminfo 
>> MemTotal:      1024992 kB
>> MemFree:          9768 kB
>> Buffers:         76664 kB
>> Cached:         328024 kB
>> SwapCached:          0 kB
>> Active:         534956 kB
>> Inactive:       224060 kB
>> HighTotal:           0 kB
>> HighFree:            0 kB
>> LowTotal:      1024992 kB
>> LowFree:          9768 kB
>> SwapTotal:           0 kB
>> SwapFree:            0 kB
>> Dirty:            1400 kB
>> Writeback:           0 kB
>> Mapped:         464232 kB
>> Slab:           225864 kB
>> CommitLimit:    512496 kB
>> Committed_AS:   773844 kB
>> PageTables:       8004 kB
>> VmallocTotal: 34359738367 kB
>> VmallocUsed:       644 kB
>> VmallocChunk: 34359737167 kB
>> HugePages_Total:     0
>> HugePages_Free:      0
>> Hugepagesize:     2048 kB
>>    
>>
>
>OK.  There's rather a lot of anonymous memory there - 700M on the LRU, 300M
>pagecache, 400M anon, 200M of slab.  You need some swapspace ;)
>
>What are the symptoms?  Slow to load applications?  Lots of paging?  Poor
>I/O speeds?


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-25 19:32               ` Russell King
@ 2005-01-27  8:28                 ` Russell King
  2005-01-27  8:47                   ` Andrew Morton
  0 siblings, 1 reply; 87+ messages in thread
From: Russell King @ 2005-01-27  8:28 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds, alexn, kas, linux-kernel, netdev

On Tue, Jan 25, 2005 at 07:32:07PM +0000, Russell King wrote:
> On Mon, Jan 24, 2005 at 11:48:53AM +0000, Russell King wrote:
> > On Sun, Jan 23, 2005 at 08:03:15PM +0000, Russell King wrote:
> > > I think I may be seeing something odd here, maybe a possible memory leak.
> > > The only problem I have is wondering whether I'm actually comparing like
> > > with like.  Maybe some networking people can provide a hint?
> > > 
> > > Below is gathered from 2.6.11-rc1.
> > > 
> > > bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> > > 24
> > > ip_dst_cache         669    885    256   15    1
> > > 
> > > I'm fairly positive when I rebooted the machine a couple of days ago,
> > > ip_dst_cache was significantly smaller for the same number of lines in
> > > /proc/net/rt_cache.
> > 
> > FYI, today it looks like this:
> > 
> > bash-2.05a# cat /proc/net/rt_cache | wc -l; grep ip_dst /proc/slabinfo
> > 26
> > ip_dst_cache         820   1065    256   15    1 
> > 
> > So the dst cache seems to have grown by 151 in 16 hours...  I'll continue
> > monitoring and providing updates.
> 
> Tonight's update:
> 50
> ip_dst_cache        1024   1245    256   15    1
> 
> As you can see, the dst cache is consistently growing by about 200
> entries per day.  Given this, I predict that the box will fall over
> due to "dst cache overflow" in roughly 35 days.

This morning's magic numbers are:

3
ip_dst_cache        1292   1485    256   15    1

Is no one interested in the fact that the DST cache is leaking and
eventually takes out machines?  I've had virtually zero interest in
this problem so far.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27  8:28                 ` Russell King
@ 2005-01-27  8:47                   ` Andrew Morton
  2005-01-27 10:19                     ` Alessandro Suardi
                                       ` (2 more replies)
  0 siblings, 3 replies; 87+ messages in thread
From: Andrew Morton @ 2005-01-27  8:47 UTC (permalink / raw)
  To: Russell King; +Cc: torvalds, alexn, kas, linux-kernel, netdev

Russell King <rmk+lkml@arm.linux.org.uk> wrote:
>
> This morning's magic numbers are:
> 
>  3
>  ip_dst_cache        1292   1485    256   15    1

I just did a q-n-d test here: send one UDP frame to 1.1.1.1 up to
1.1.255.255.  The ip_dst_cache grew to ~15k entries and grew no further. 
It's now gradually shrinking.  So there doesn't appear to be a trivial
bug..
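
For reference, a minimal user-space sketch of that kind of sweep - an
illustrative reconstruction, not the actual test program - could look like
this; it sends one datagram to the discard port (9) of each host in
1.1.0.0/16, forcing a route lookup and hence an ip_dst_cache entry per
destination:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	struct sockaddr_in sin;
	char payload = 0;
	int a, b;

	if (fd < 0) {
		perror("socket");
		return 1;
	}

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(9);	/* discard port */

	for (a = 1; a <= 255; a++) {
		for (b = 1; b <= 255; b++) {
			char addr[16];

			snprintf(addr, sizeof(addr), "1.1.%d.%d", a, b);
			sin.sin_addr.s_addr = inet_addr(addr);
			/* one datagram per destination is enough to
			 * instantiate a routing cache entry */
			sendto(fd, &payload, 1, 0,
			       (struct sockaddr *)&sin, sizeof(sin));
		}
	}

	close(fd);
	return 0;
}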

>  Is no one interested in the fact that the DST cache is leaking and
>  eventually takes out machines?  I've had virtually zero interest in
>  this problem so far.

I guess we should find a way to make it happen faster.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27  8:47                   ` Andrew Morton
@ 2005-01-27 10:19                     ` Alessandro Suardi
  2005-01-27 12:17                     ` Martin Josefsson
  2005-01-27 12:56                     ` Robert Olsson
  2 siblings, 0 replies; 87+ messages in thread
From: Alessandro Suardi @ 2005-01-27 10:19 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Russell King, torvalds, alexn, kas, linux-kernel, netdev

On Thu, 27 Jan 2005 00:47:32 -0800, Andrew Morton <akpm@osdl.org> wrote:
> Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> >
> > This morning's magic numbers are:
> >
> >  3
> >  ip_dst_cache        1292   1485    256   15    1
> 
> I just did a q-n-d test here: send one UDP frame to 1.1.1.1 up to
> 1.1.255.255.  The ip_dst_cache grew to ~15k entries and grew no further.
> It's now gradually shrinking.  So there doesn't appear to be a trivial
> bug..
> 
> >  Is no one interested in the fact that the DST cache is leaking and
> >  eventually takes out machines?  I've had virtually zero interest in
> >  this problem so far.
> 
> I guess we should find a way to make it happen faster.

Data point... on my box, used as an ed2k/bittorrent
 machine, the ip_dst_cache grows and shrinks quite
 fast; these two samples were ~3 minutes apart:


[root@donkey ~]# grep ip_dst /proc/slabinfo
ip_dst_cache         998   1005    256   15    1 : tunables  120   60    0 : slabdata     67     67      0
[root@donkey ~]# wc -l /proc/net/rt_cache
926 /proc/net/rt_cache

[root@donkey ~]# grep ip_dst /proc/slabinfo
ip_dst_cache         466    795    256   15    1 : tunables  120   60    0 : slabdata     53     53      0
[root@donkey ~]# wc -l /proc/net/rt_cache
443 /proc/net/rt_cache

 and these were 2 seconds apart

[root@donkey ~]# wc -l /proc/net/rt_cache
737 /proc/net/rt_cache
[root@donkey ~]# grep ip_dst /proc/slabinfo
ip_dst_cache         795    795    256   15    1 : tunables  120   60    0 : slabdata     53     53      0

[root@donkey ~]# wc -l /proc/net/rt_cache
1023 /proc/net/rt_cache
[root@donkey ~]# grep ip_dst /proc/slabinfo
ip_dst_cache        1035   1035    256   15    1 : tunables  120   60    0 : slabdata     69     69      0

--alessandro
 
  "And every dream, every, is just a dream after all"
 
     (Heather Nova, "Paper Cup")

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27  8:47                   ` Andrew Morton
  2005-01-27 10:19                     ` Alessandro Suardi
@ 2005-01-27 12:17                     ` Martin Josefsson
  2005-01-27 12:56                     ` Robert Olsson
  2 siblings, 0 replies; 87+ messages in thread
From: Martin Josefsson @ 2005-01-27 12:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Russell King, torvalds, alexn, kas, linux-kernel, netdev

On Thu, 27 Jan 2005, Andrew Morton wrote:

> Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> >
> > This morning's magic numbers are:
> >
> >  3
> >  ip_dst_cache        1292   1485    256   15    1
>
> I just did a q-n-d test here: send one UDP frame to 1.1.1.1 up to
> 1.1.255.255.  The ip_dst_cache grew to ~15k entries and grew no further.
> It's now gradually shrinking.  So there doesn't appear to be a trivial
> bug..
>
> >  Is no one interested in the fact that the DST cache is leaking and
> >  eventually takes out machines?  I've had virtually zero interest in
> >  this problem so far.
>
> I guess we should find a way to make it happen faster.

It could be a refcount problem. I think Russell is using NAT; it could be
the MASQUERADE target if that is in use. A simple test would be to switch
to SNAT and try again if possible.

/Martin

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27  8:47                   ` Andrew Morton
  2005-01-27 10:19                     ` Alessandro Suardi
  2005-01-27 12:17                     ` Martin Josefsson
@ 2005-01-27 12:56                     ` Robert Olsson
  2005-01-27 13:03                       ` Robert Olsson
  2005-01-27 16:49                       ` Russell King
  2 siblings, 2 replies; 87+ messages in thread
From: Robert Olsson @ 2005-01-27 12:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Russell King, torvalds, alexn, kas, linux-kernel, netdev


Andrew Morton writes:
 > Russell King <rmk+lkml@arm.linux.org.uk> wrote:

 > >  ip_dst_cache        1292   1485    256   15    1

 > I guess we should find a way to make it happen faster.
 
Here is a route DoS attack. Pure routing, no NAT, no filter.

Start
=====
ip_dst_cache           5     30    256   15    1 : tunables  120   60    8 : slabdata      2      2      0

After DoS
=========
ip_dst_cache       66045  76125    256   15    1 : tunables  120   60    8 : slabdata   5075   5075    480

After some GC runs.
==================
ip_dst_cache           2     15    256   15    1 : tunables  120   60    8 : slabdata      1      1      0

No problems here. I saw Martin talking about NAT...

							--ro

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 12:56                     ` Robert Olsson
@ 2005-01-27 13:03                       ` Robert Olsson
  2005-01-27 16:49                       ` Russell King
  1 sibling, 0 replies; 87+ messages in thread
From: Robert Olsson @ 2005-01-27 13:03 UTC (permalink / raw)
  To: Robert Olsson
  Cc: Andrew Morton, Russell King, torvalds, alexn, kas, linux-kernel, netdev


Oh. Linux version 2.6.11-rc2 was used.

Robert Olsson writes:
 > 
 > Andrew Morton writes:
 >  > Russell King <rmk+lkml@arm.linux.org.uk> wrote:
 > 
 >  > >  ip_dst_cache        1292   1485    256   15    1
 > 
 >  > I guess we should find a way to make it happen faster.
 >  
 > Here is a route DoS attack. Pure routing, no NAT, no filter.
 > 
 > Start
 > =====
 > ip_dst_cache           5     30    256   15    1 : tunables  120   60    8 : slabdata      2      2      0
 > 
 > After DoS
 > =========
 > ip_dst_cache       66045  76125    256   15    1 : tunables  120   60    8 : slabdata   5075   5075    480
 > 
 > After some GC runs.
 > ==================
 > ip_dst_cache           2     15    256   15    1 : tunables  120   60    8 : slabdata      1      1      0
 > 
 > No problems here. I saw Martin talking about NAT...
 > 
 > 							--ro

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 12:56                     ` Robert Olsson
  2005-01-27 13:03                       ` Robert Olsson
@ 2005-01-27 16:49                       ` Russell King
  2005-01-27 18:37                         ` Phil Oester
  2005-01-27 20:33                         ` David S. Miller
  1 sibling, 2 replies; 87+ messages in thread
From: Russell King @ 2005-01-27 16:49 UTC (permalink / raw)
  To: Robert Olsson; +Cc: Andrew Morton, torvalds, alexn, kas, linux-kernel, netdev

On Thu, Jan 27, 2005 at 01:56:30PM +0100, Robert Olsson wrote:
> 
> Andrew Morton writes:
>  > Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> 
>  > >  ip_dst_cache        1292   1485    256   15    1
> 
>  > I guess we should find a way to make it happen faster.
>  
> Here is a route DoS attack. Pure routing, no NAT, no filter.
> 
> Start
> =====
> ip_dst_cache           5     30    256   15    1 : tunables  120   60    8 : slabdata      2      2      0
> 
> After DoS
> =========
> ip_dst_cache       66045  76125    256   15    1 : tunables  120   60    8 : slabdata   5075   5075    480
> 
> After some GC runs.
> ==================
> ip_dst_cache           2     15    256   15    1 : tunables  120   60    8 : slabdata      1      1      0
> 
> No problems here. I saw Martin talking about NAT...

Yes, I can reproduce that same behaviour - I can artificially inflate
the DST cache, and the GC does run and trim it back down to something
reasonable.

BUT, over time, my DST cache just increases in size and won't trim back
down.  Not even by writing to the /proc/sys/net/ipv4/route/flush sysctl
(which, if I'm reading the code correctly - confirmation from those who
actually know this stuff would be welcome - should force an immediate
flush of the DST cache).

For instance, I have (in sequence):

# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
581
ip_dst_cache        1860   1860    256   15    1 : tunables  120   60    0 : slabdata    124    124      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
717
ip_dst_cache        1995   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
690
ip_dst_cache        1995   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
696
ip_dst_cache        1995   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
700
ip_dst_cache        1995   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
718
ip_dst_cache        1993   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
653
ip_dst_cache        1993   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
667
ip_dst_cache        1956   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
620
ip_dst_cache        1944   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
623
ip_dst_cache        1920   1995    256   15    1 : tunables  120   60    0 : slabdata    133    133      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
8
ip_dst_cache        1380   1980    256   15    1 : tunables  120   60    0 : slabdata    132    132      0
# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo
86
ip_dst_cache        1375   1875    256   15    1 : tunables  120   60    0 : slabdata    125    125      0

so obviously the GC does appear to be working - as can be seen from the
number of entries in /proc/net/rt_cache.  However, the number of objects
in the slab cache does grow day on day.  About 4 days ago, it was only
about 600 active objects.  Now it's more than twice that, and it'll
continue increasing until it hits 8192, whereupon it's game over.

And, here's the above with /proc/net/stat/rt_cache included:

# cat /proc/net/rt_cache|wc -l;grep ip_dst /proc/slabinfo; cat /proc/net/stat/rt_cache
61
ip_dst_cache        1340   1680    256   15    1 : tunables  120   60    0 : slabdata    112    112      0
entries  in_hit in_slow_tot in_no_route in_brd in_martian_dst in_martian_src  out_hit out_slow_tot out_slow_mc  gc_total gc_ignored gc_goal_miss gc_dst_overflow in_hlist_search out_hlist_search
00000538  005c9f10 0005e163 00000000 00000013 000002e2 00000000 00000005  003102e3 00038f6d 00000000 0007887a 0005286d 00001142 00000000 00138855 0010848d
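
(The stat values are printed in hex; the first column, entries, is
0x00000538 = 1336, which is where the figure below comes from.)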

notice how /proc/net/stat/rt_cache says there are 1336 entries in the
route cache.  _Where_ are they?  They're not there according to
/proc/net/rt_cache.

(PS, the formatting of the headings in /proc/net/stat/rt_cache doesn't
appear to tie up with the formatting of the data which is _really_
confusing.)

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 16:49                       ` Russell King
@ 2005-01-27 18:37                         ` Phil Oester
  2005-01-27 19:25                           ` Russell King
  2005-01-27 20:33                         ` David S. Miller
  1 sibling, 1 reply; 87+ messages in thread
From: Phil Oester @ 2005-01-27 18:37 UTC (permalink / raw)
  To: Robert Olsson, Andrew Morton, torvalds, alexn, kas, linux-kernel, netdev

On Thu, Jan 27, 2005 at 04:49:18PM +0000, Russell King wrote:
> so obviously the GC does appear to be working - as can be seen from the
> number of entries in /proc/net/rt_cache.  However, the number of objects
> in the slab cache does grow day on day.  About 4 days ago, it was only
> about 600 active objects.  Now it's more than twice that, and it'll
> continue increasing until it hits 8192, whereupon it's game over.

I can confirm the behavior you are seeing -- does seem to be a leak
somewhere.  Below from a heavily used gateway with 26 days uptime:

# wc -l /proc/net/rt_cache ; grep ip_dst /proc/slabinfo
  12870 /proc/net/rt_cache
ip_dst_cache       53327  57855

Eventually I get the dst_cache overflow errors and have to reboot.

Phil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 18:37                         ` Phil Oester
@ 2005-01-27 19:25                           ` Russell King
  2005-01-27 20:40                             ` Phil Oester
  0 siblings, 1 reply; 87+ messages in thread
From: Russell King @ 2005-01-27 19:25 UTC (permalink / raw)
  To: Phil Oester
  Cc: Robert Olsson, Andrew Morton, torvalds, alexn, kas, linux-kernel, netdev

On Thu, Jan 27, 2005 at 10:37:45AM -0800, Phil Oester wrote:
> On Thu, Jan 27, 2005 at 04:49:18PM +0000, Russell King wrote:
> > so obviously the GC does appear to be working - as can be seen from the
> > number of entries in /proc/net/rt_cache.  However, the number of objects
> > in the slab cache does grow day on day.  About 4 days ago, it was only
> > about 600 active objects.  Now it's more than twice that, and it'll
> > continue increasing until it hits 8192, whereupon it's game over.
> 
> I can confirm the behavior you are seeing -- does seem to be a leak
> somewhere.  Below from a heavily used gateway with 26 days uptime:
> 
> # wc -l /proc/net/rt_cache ; grep ip_dst /proc/slabinfo
>   12870 /proc/net/rt_cache
> ip_dst_cache       53327  57855
> 
> Eventually I get the dst_cache overflow errors and have to reboot.

Can you provide some details, eg kernel configuration, loaded modules
and a brief overview of any netfilter modules you may be using.

Maybe we can work out what's common between our setups.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 16:49                       ` Russell King
  2005-01-27 18:37                         ` Phil Oester
@ 2005-01-27 20:33                         ` David S. Miller
  2005-01-28  0:17                           ` Russell King
  1 sibling, 1 reply; 87+ messages in thread
From: David S. Miller @ 2005-01-27 20:33 UTC (permalink / raw)
  To: Russell King
  Cc: Robert.Olsson, akpm, torvalds, alexn, kas, linux-kernel, netdev

On Thu, 27 Jan 2005 16:49:18 +0000
Russell King <rmk+lkml@arm.linux.org.uk> wrote:

> notice how /proc/net/stat/rt_cache says there are 1336 entries in the
> route cache.  _Where_ are they?  They're not there according to
> /proc/net/rt_cache.

When the route cache is flushed, that kills a reference to each
entry in the routing cache.  If, for some reason, other references
remain (a route connected to a socket, some leak in the stack somewhere),
the route cache entry can't be completely freed up immediately.

So they won't be listed in /proc/net/rt_cache (since they've been
removed from the lookup table) but they will be accounted for in
/proc/net/stat/rt_cache until the final release is done on the
routing cache object and it can be completely freed up.

Do you happen to be using IPV6 in any way by chance?

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 19:25                           ` Russell King
@ 2005-01-27 20:40                             ` Phil Oester
  2005-01-28  9:32                               ` Russell King
  0 siblings, 1 reply; 87+ messages in thread
From: Phil Oester @ 2005-01-27 20:40 UTC (permalink / raw)
  To: Robert Olsson, Andrew Morton, torvalds, alexn, kas, linux-kernel, netdev

On Thu, Jan 27, 2005 at 07:25:04PM +0000, Russell King wrote:
> Can you provide some details, eg kernel configuration, loaded modules
> and a brief overview of any netfilter modules you may be using.
> 
> Maybe we can work out what's common between our setups.

Vanilla 2.6.10, though I've been seeing these problems since 2.6.8 or
earlier.  Netfilter running on all boxes, some utilizing SNAT, others
not -- none using MASQ.  This is from a box running no NAT at all,
although it has some other filter rules:

# wc -l /proc/net/rt_cache ; grep dst_cache /proc/slabinfo
     50 /proc/net/rt_cache
ip_dst_cache       84285  84285

Also with uptime of 26 days.  

These boxes are all running the quagga OSPF daemon, but those that
are lightly loaded are not exhibiting these problems.

Phil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 20:33                         ` David S. Miller
@ 2005-01-28  0:17                           ` Russell King
  2005-01-28  0:34                             ` David S. Miller
  2005-01-28  1:41                             ` Phil Oester
  0 siblings, 2 replies; 87+ messages in thread
From: Russell King @ 2005-01-28  0:17 UTC (permalink / raw)
  To: David S. Miller
  Cc: Robert.Olsson, akpm, torvalds, alexn, kas, linux-kernel, netdev

On Thu, Jan 27, 2005 at 12:33:26PM -0800, David S. Miller wrote:
> So they won't be listed in /proc/net/rt_cache (since they've been
> removed from the lookup table) but they will be accounted for in
> /proc/net/stat/rt_cache until the final release is done on the
> routing cache object and it can be completely freed up.
> 
> Do you happen to be using IPV6 in any way by chance?

Yes.  Someone suggested this evening that there may have been a recent
change to do with some IPv6 refcounting which may have caused this
problem.  Is that something you can confirm?

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-28  0:17                           ` Russell King
@ 2005-01-28  0:34                             ` David S. Miller
  2005-01-28  8:58                               ` Russell King
  2005-01-28  1:41                             ` Phil Oester
  1 sibling, 1 reply; 87+ messages in thread
From: David S. Miller @ 2005-01-28  0:34 UTC (permalink / raw)
  To: Russell King
  Cc: Robert.Olsson, akpm, torvalds, alexn, kas, linux-kernel, netdev

On Fri, 28 Jan 2005 00:17:01 +0000
Russell King <rmk+lkml@arm.linux.org.uk> wrote:

> Yes.  Someone suggested this evening that there may have been a recent
> change to do with some IPv6 refcounting which may have caused this
> problem.  Is that something you can confirm?

Yep, it would be this change below.  Try backing it out and see
if that makes your leak go away.

# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
#   2005/01/14 20:41:55-08:00 herbert@gondor.apana.org.au 
#   [IPV6]: Fix locking in ip6_dst_lookup().
#   
#   The caller does not necessarily have the socket locked
#   (udpv6sendmsg() is one such case) so we have to use
#   sk_dst_check() instead of __sk_dst_check().
#   
#   Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
#   Signed-off-by: David S. Miller <davem@davemloft.net>
# 
# net/ipv6/ip6_output.c
#   2005/01/14 20:41:34-08:00 herbert@gondor.apana.org.au +3 -3
#   [IPV6]: Fix locking in ip6_dst_lookup().
#   
#   The caller does not necessarily have the socket locked
#   (udpv6sendmsg() is one such case) so we have to use
#   sk_dst_check() instead of __sk_dst_check().
#   
#   Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
#   Signed-off-by: David S. Miller <davem@davemloft.net>
# 
diff -Nru a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
--- a/net/ipv6/ip6_output.c	2005-01-27 16:07:21 -08:00
+++ b/net/ipv6/ip6_output.c	2005-01-27 16:07:21 -08:00
@@ -745,7 +745,7 @@
 	if (sk) {
 		struct ipv6_pinfo *np = inet6_sk(sk);
 	
-		*dst = __sk_dst_check(sk, np->dst_cookie);
+		*dst = sk_dst_check(sk, np->dst_cookie);
 		if (*dst) {
 			struct rt6_info *rt = (struct rt6_info*)*dst;
 	
@@ -772,9 +772,9 @@
 			     && (np->daddr_cache == NULL ||
 				 !ipv6_addr_equal(&fl->fl6_dst, np->daddr_cache)))
 			    || (fl->oif && fl->oif != (*dst)->dev->ifindex)) {
+				dst_release(*dst);
 				*dst = NULL;
-			} else
-				dst_hold(*dst);
+			}
 		}
 	}
 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-28  0:17                           ` Russell King
  2005-01-28  0:34                             ` David S. Miller
@ 2005-01-28  1:41                             ` Phil Oester
  1 sibling, 0 replies; 87+ messages in thread
From: Phil Oester @ 2005-01-28  1:41 UTC (permalink / raw)
  To: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Fri, Jan 28, 2005 at 12:17:01AM +0000, Russell King wrote:
> On Thu, Jan 27, 2005 at 12:33:26PM -0800, David S. Miller wrote:
> > So they won't be listed in /proc/net/rt_cache (since they've been
> > removed from the lookup table) but they will be accounted for in
> > /proc/net/stat/rt_cache until the final release is done on the
> > routing cache object and it can be completely freed up.
> > 
> > Do you happen to be using IPV6 in any way by chance?
> 
> Yes.  Someone suggested this evening that there may have been a recent
> change to do with some IPv6 refcounting which may have caused this
> problem.  Is that something you can confirm?

FWIW, I do not use IPv6, and it is not compiled into the kernel.

Phil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-28  0:34                             ` David S. Miller
@ 2005-01-28  8:58                               ` Russell King
  2005-01-30 13:23                                 ` Russell King
  0 siblings, 1 reply; 87+ messages in thread
From: Russell King @ 2005-01-28  8:58 UTC (permalink / raw)
  To: David S. Miller
  Cc: Robert.Olsson, akpm, torvalds, alexn, kas, linux-kernel, netdev

On Thu, Jan 27, 2005 at 04:34:44PM -0800, David S. Miller wrote:
> On Fri, 28 Jan 2005 00:17:01 +0000
> Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> > Yes.  Someone suggested this evening that there may have been a recent
> > change to do with some IPv6 refcounting which may have caused this
> > problem.  Is that something you can confirm?
> 
> Yep, it would be this change below.  Try backing it out and see
> if that makes your leak go away.

Thanks.  I'll try it, but:

1. Looking at the date of the change it seems unlikely.  The recent
   death occurred with 2.6.10-rc2, booted on 29th November and dying
   on 19th January, which obviously predates this cset.
2. It'll take a couple of days to confirm the behaviour of the dst cache.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-27 20:40                             ` Phil Oester
@ 2005-01-28  9:32                               ` Russell King
  0 siblings, 0 replies; 87+ messages in thread
From: Russell King @ 2005-01-28  9:32 UTC (permalink / raw)
  To: Phil Oester
  Cc: Robert Olsson, Andrew Morton, torvalds, alexn, kas, linux-kernel, netdev

On Thu, Jan 27, 2005 at 12:40:12PM -0800, Phil Oester wrote:
> Vanilla 2.6.10, though I've been seeing these problems since 2.6.8 or
> earlier.

Right.  For me:

- 2.6.9-rc3 (installed 8th Oct) died with dst cache overflow on 29th November
- 2.6.10-rc2 (booted 29th Nov) died with the same on 19th January
- 2.6.11-rc1 (booted 19th Jan) appears to have the same problem, but
  it hasn't died yet.

> Netfilter running on all boxes, some utilizing SNAT, others
> not -- none using MASQ.

IPv4 filter targets: ACCEPT, DROP, REJECT, LOG
	using: state, limit & protocol

IPv4 nat targets: DNAT, MASQ
	using: protocol

IPv4 mangle targets: ACCEPT, MARK
	using: protocol

IPv6 filter targets: ACCEPT, DROP
	using: protocol

IPv6 mangle targets: none

(protocol == at least one rule matching tcp, icmp or udp packets)

IPv6 configured native on internal interface, tun6to4 for external IPv6
communication.

IPv4 and IPv6 forwarding enabled.
IPv4 rpfilter, proxyarp, syncookies enabled.
IPv4 proxy delay on internal interface set to '1'.

> These boxes are all running the quagga OSPF daemon, but those that
> are lightly loaded are not exhibiting these problems.

Running zebra (for IPv6 route advertisement on the local network only).

Network traffic-wise, 2.6.11-rc1 has this on its public facing
interface(s) in 8.5 days.

4: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    RX: bytes  packets  errors  dropped overrun mcast
    667468541  2603373  0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    1245774764 2777605  0       0       1       2252

5: tun6to4@NONE: <NOARP,UP> mtu 1480 qdisc noqueue
    RX: bytes  packets  errors  dropped overrun mcast
    19130536   84034    0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    10436749   91589    0       0       0       0


-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-28  8:58                               ` Russell King
@ 2005-01-30 13:23                                 ` Russell King
  2005-01-30 15:34                                   ` Russell King
  2005-01-30 17:23                                   ` Patrick McHardy
  0 siblings, 2 replies; 87+ messages in thread
From: Russell King @ 2005-01-30 13:23 UTC (permalink / raw)
  To: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Fri, Jan 28, 2005 at 08:58:59AM +0000, Russell King wrote:
> On Thu, Jan 27, 2005 at 04:34:44PM -0800, David S. Miller wrote:
> > On Fri, 28 Jan 2005 00:17:01 +0000
> > Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> > > Yes.  Someone suggested this evening that there may have been a recent
> > > change to do with some IPv6 refcounting which may have caused this
> > > problem.  Is that something you can confirm?
> > 
> > Yep, it would be this change below.  Try backing it out and see
> > if that makes your leak go away.
> 
> Thanks.  I'll try it, but:
> 
> 1. Looking at the date of the change it seems unlikely.  The recent
>    death occurred with 2.6.10-rc2, booted on 29th November and dying
>    on 19th January, which obviously predates this cset.
> 2. It'll take a couple of days to confirm the behaviour of the dst cache.

I have another question about whether ip6_output.c is really the problem -
the leak is with ip_dst_cache (== IPv4).  If the problem were in
ip6_output.c, wouldn't we see ip6_dst_cache leaking instead?

Anyway, I've produced some code which keeps a record of the __refcnt
increments and decrements, and I think it's produced some interesting
results.  Essentially, I'm seeing the odd dst entry with a __refcnt of
14000 or so (which is still in active use, so probably ok), and a number
with 4, 7, and 13 which haven't had the refcount touched for at least 14
minutes.

One of these were created via ip_route_input_slow(), the other three via
ip_route_output_slow().  That isn't significant on its own.

However, whenever ip_copy_metadata() appears in the refcount log, I see
that half of the increments due to it still remain to be decremented (see
the output below).  In the logs, 0 = "mark", positive numbers = refcnt
incremented this many times, negative numbers = refcnt decremented this
many times.

I don't know if the code is using fragment lists in ip_fragment(), but
on reading the code a question comes to mind: if we have a list of
fragments, does each fragment skb have a valid (and refcounted) dst
pointer before ip_fragment() does its job?  If yes, then isn't the
first ip_copy_metadata() in ip_fragment() going to overwrite this
pointer without dropping the refcount?
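
(As far as I can see, the relevant line in ip_copy_metadata() is simply

	to->dst = dst_clone(from->dst);

so if `to' already carries a refcounted skb->dst, that reference is
overwritten and never dropped.)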

All that said, it's probably far too early to read much into these
results - once the machine has been running for more than 19 minutes
and has a significant number of "stuck" dst cache entries, I think
it'll be far more conclusive.  Nevertheless, it looks like food for
thought.

dst pointer: creation time (200Hz jiffies) last reference time (200Hz jiffies)
c1c66260: ffff6c79 ffff879d:
	location count	function
        c01054f4 0      dst_alloc
        c0114a80 1      ip_route_input_slow
        c00fa95c -18    __kfree_skb
        c0115104 13     ip_route_input
        c011ae1c 8      ip_copy_metadata
        c01055ac 0      __dst_free
	untracked counts
        : 0
	total
	= 4
  next=c1c66b60 refcnt=00000004 use=0000000d dst=24f45cc3 src=0f00a8c0

c1c66b60: ffff20fe ffff5066:
        c01054f4 0      dst_alloc
        c01156e8 1      ip_route_output_slow
        c011b854 6813   ip_append_data
        c011c7e0 6813   ip_push_pending_frames
        c00fa95c -6826  __kfree_skb
        c011c8fc -6813  ip_push_pending_frames
        c0139dbc -6813  udp_sendmsg
        c0115a0c 6814   __ip_route_output_key
        c013764c -2     ip4_datagram_connect
        c011ae1c 26     ip_copy_metadata
        c01055ac 0      __dst_free
        : 0
	= 13
  next=c1c57680 refcnt=0000000d use=00001a9e dst=bbe812d4 src=bae812d4

c1c66960: ffff89ac ffffa42d:
        c01054f4 0      dst_alloc
        c01156e8 1      ip_route_output_slow
        c011b854 3028   ip_append_data
        c0139dbc -3028  udp_sendmsg
        c011c7e0 3028   ip_push_pending_frames
        c011ae1c 8      ip_copy_metadata
        c00fa95c -3032  __kfree_skb
        c011c8fc -3028  ip_push_pending_frames
        c0115a0c 3027   __ip_route_output_key
        c01055ac 0      __dst_free
        : 0
	= 4
  next=c16d1080 refcnt=00000004 use=00000bd3 dst=bbe812d4 src=bae812d4

c16d1080: ffff879b ffff89af:
        c01054f4 0      dst_alloc
        c01156e8 1      ip_route_output_slow
        c011b854 240    ip_append_data
        c011c7e0 240    ip_push_pending_frames
        c00fa95c -247   __kfree_skb
        c011c8fc -240   ip_push_pending_frames
        c0139dbc -240   udp_sendmsg
        c0115a0c 239    __ip_route_output_key
        c011ae1c 14     ip_copy_metadata
        c01055ac 0      __dst_free
        : 0
	= 7
  next=c1c66260 refcnt=00000007 use=000000ef dst=bbe812d4 src=bae812d4


-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 13:23                                 ` Russell King
@ 2005-01-30 15:34                                   ` Russell King
  2005-01-30 16:57                                     ` Phil Oester
  2005-01-30 17:23                                   ` Patrick McHardy
  1 sibling, 1 reply; 87+ messages in thread
From: Russell King @ 2005-01-30 15:34 UTC (permalink / raw)
  To: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Sun, Jan 30, 2005 at 01:23:43PM +0000, Russell King wrote:
> Anyway, I've produced some code which keeps a record of the __refcnt
> increments and decrements, and I think it's produced some interesting
> results.  Essentially, I'm seeing the odd dst entry with a __refcnt of
> 14000 or so (which is still in active use, so probably ok), and a number
> with 4, 7, and 13 which haven't had the refcount touched for at least 14
> minutes.

An hour or so goes by.  I now have 14 dst cache entries with non-zero
refcounts, and these have the following properties:

* The five from before (with counts 13, 14473, 4, 4, 7 respectively):
  + all remain unfreed.
  + show precisely no change in the refcounts.
  + the refcount has not been touched for more than an hour.
* They have all been touched by ip_copy_metadata.
* Their remaining refcounts are precisely half the number of
  ip_copy_metadata calls in every instance.

None of the entries with a refcount of zero have ip_copy_metadata() in
their logs, and those entries do appear in /proc/net/rt_cache.

The following may also be a pointer - from /proc/net/snmp:

Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails FragOKs FragFails FragCreates
Ip: 1 64 140510 0 0 36861 0 0 93549 131703 485 0 21 46622 15695 21 21950 0 0
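
(Reading off the relevant columns: ReasmReqds 46622, ReasmOKs 15695,
ReasmFails 21, FragOKs 21950, FragFails 0, FragCreates 0.)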

Since FragCreates is 0, this means that we are using the frag_lists
rather than creating our own fragments (and indeed the first
ip_copy_metadata() call rather than the second in ip_fragment()).

I think the case against the IPv4 fragmentation code is mounting.
However, without knowing what the expected conditions for this code are
(eg, are skbs on the fraglist supposed to have NULL skb->dst?), I'm
unable to progress this any further.  Nevertheless, I think it's quite
clear that there is something bad going on here.

Why many more people aren't seeing this I've no idea.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 15:34                                   ` Russell King
@ 2005-01-30 16:57                                     ` Phil Oester
  0 siblings, 0 replies; 87+ messages in thread
From: Phil Oester @ 2005-01-30 16:57 UTC (permalink / raw)
  To: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Sun, Jan 30, 2005 at 03:34:49PM +0000, Russell King wrote:
> I think the case against the IPv4 fragmentation code is mounting.
> However, without knowing what the expected conditions for this code are
> (eg, are skbs on the fraglist supposed to have NULL skb->dst?), I'm
> unable to progress this any further.  Nevertheless, I think it's quite
> clear that there is something bad going on here.

Interesting...the gateway which exhibits the problem fastest in my
area does have a large number of fragmented UDP packets running through it,
as shown by tcpdump 'ip[6:2] & 0x1fff != 0'.
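
(That filter looks at the 16-bit flags/fragment-offset field at offset 6 in
the IP header; masking with 0x1fff keeps just the fragment offset, so it
matches every fragment except the first of each packet.)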

> Why many more people aren't seeing this I've no idea.

Perhaps you (and I) experience more fragments than the average user???

Nice detective work!

Phil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 13:23                                 ` Russell King
  2005-01-30 15:34                                   ` Russell King
@ 2005-01-30 17:23                                   ` Patrick McHardy
  2005-01-30 17:26                                     ` Patrick McHardy
  1 sibling, 1 reply; 87+ messages in thread
From: Patrick McHardy @ 2005-01-30 17:23 UTC (permalink / raw)
  To: Russell King
  Cc: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

Russell King wrote:

>I don't know if the code is using fragment lists in ip_fragment(), but
>on reading the code a question comes to mind: if we have a list of
>fragments, does each fragment skb have a valid (and refcounted) dst
>pointer before ip_fragment() does its job?  If yes, then isn't the
>first ip_copy_metadata() in ip_fragment() going to overwrite this
>pointer without dropping the refcount?
>
Nice spotting. If conntrack isn't loaded defragmentation happens after
routing, so this is likely the cause.

Regards
Patrick


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:23                                   ` Patrick McHardy
@ 2005-01-30 17:26                                     ` Patrick McHardy
  2005-01-30 17:58                                       ` Patrick McHardy
  2005-01-30 18:01                                       ` Russell King
  0 siblings, 2 replies; 87+ messages in thread
From: Patrick McHardy @ 2005-01-30 17:26 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Russell King, David S. Miller, Robert.Olsson, akpm, torvalds,
	alexn, kas, linux-kernel, netdev

Patrick McHardy wrote:

> Russell King wrote:
>
>> I don't know if the code is using fragment lists in ip_fragment(), but
>> on reading the code a question comes to mind: if we have a list of
>> fragments, does each fragment skb have a valid (and refcounted) dst
>> pointer before ip_fragment() does its job?  If yes, then isn't the
>> first ip_copy_metadata() in ip_fragment() going to overwrite this
>> pointer without dropping the refcount?
>>
> Nice spotting. If conntrack isn't loaded defragmentation happens after
> routing, so this is likely the cause.

OTOH, if conntrack isn't loaded forwarded packets are never defragmented,
so frag_list should be empty. So probably false alarm, sorry.

> Regards
> Patrick




^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:26                                     ` Patrick McHardy
@ 2005-01-30 17:58                                       ` Patrick McHardy
  2005-01-30 18:45                                         ` Russell King
                                                           ` (2 more replies)
  2005-01-30 18:01                                       ` Russell King
  1 sibling, 3 replies; 87+ messages in thread
From: Patrick McHardy @ 2005-01-30 17:58 UTC (permalink / raw)
  To: Russell King
  Cc: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 899 bytes --]

Patrick McHardy wrote:

>> Russell King wrote:
>>
>>> I don't know if the code is using fragment lists in ip_fragment(), but
>>> on reading the code a question comes to mind: if we have a list of
>>> fragments, does each fragment skb have a valid (and refcounted) dst
>>> pointer before ip_fragment() does its job?  If yes, then isn't the
>>> first ip_copy_metadata() in ip_fragment() going to overwrite this
>>> pointer without dropping the refcount?
>>>
>> Nice spotting. If conntrack isn't loaded defragmentation happens after
>> routing, so this is likely the cause.
>
>
> OTOH, if conntrack isn't loaded forwarded packets are never defragmented,
> so frag_list should be empty. So probably false alarm, sorry.

Ok, final decision: you are right :) conntrack also defragments locally
generated packets before they hit ip_fragment. In this case the fragments
have skb->dst set.

Regards
Patrick


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 366 bytes --]

===== net/ipv4/ip_output.c 1.74 vs edited =====
--- 1.74/net/ipv4/ip_output.c	2005-01-25 01:40:10 +01:00
+++ edited/net/ipv4/ip_output.c	2005-01-30 18:54:43 +01:00
@@ -389,6 +389,7 @@
 	to->priority = from->priority;
 	to->protocol = from->protocol;
 	to->security = from->security;
+	dst_release(to->dst);
 	to->dst = dst_clone(from->dst);
 	to->dev = from->dev;
 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:26                                     ` Patrick McHardy
  2005-01-30 17:58                                       ` Patrick McHardy
@ 2005-01-30 18:01                                       ` Russell King
  2005-01-30 18:19                                         ` Phil Oester
  1 sibling, 1 reply; 87+ messages in thread
From: Russell King @ 2005-01-30 18:01 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Sun, Jan 30, 2005 at 06:26:29PM +0100, Patrick McHardy wrote:
> Patrick McHardy wrote:
> 
> > Russell King wrote:
> >
> >> I don't know if the code is using fragment lists in ip_fragment(), but
> >> on reading the code a question comes to mind: if we have a list of
> >> fragments, does each fragment skb have a valid (and refcounted) dst
> >> pointer before ip_fragment() does its job?  If yes, then isn't the
> >> first ip_copy_metadata() in ip_fragment() going to overwrite this
> >> pointer without dropping the refcount?
> >>
> > Nice spotting. If conntrack isn't loaded defragmentation happens after
> > routing, so this is likely the cause.
> 
> > OTOH, if conntrack isn't loaded forwarded packets are never defragmented,
> so frag_list should be empty. So probably false alarm, sorry.

I've just checked Phil's mails - both Phil and myself are using
netfilter on the troublesome boxen.

Also, since FragCreates is zero, the frag_list must not have been empty
in any of the cases so far where ip_fragment() has been called.
(Reading the code, if frag_list were empty, we'd have to create some
fragments, which increments the FragCreates statistic.)
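
The rough shape of the code, as I read it (paraphrased from memory, so
the details may well be off):

	if (skb_shinfo(skb)->frag_list) {
		/* Fast path: reuse the fragments already sitting on
		 * frag_list.  No new skbs are allocated, so FragCreates
		 * never gets bumped, but ip_copy_metadata() is run on
		 * every one of those fragments - and that is the call
		 * which overwrites frag->dst. */
		for (frag = skb_shinfo(skb)->frag_list; frag; frag = frag->next)
			ip_copy_metadata(frag, skb);
	} else {
		/* Slow path: carve the packet into freshly allocated
		 * fragment skbs; this is the only place the FragCreates
		 * counter is incremented. */
		while (left > 0) {
			skb2 = alloc_skb(...);
			ip_copy_metadata(skb2, skb);
			IP_INC_STATS(IPSTATS_MIB_FRAGCREATES);
		}
	}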

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 18:01                                       ` Russell King
@ 2005-01-30 18:19                                         ` Phil Oester
  0 siblings, 0 replies; 87+ messages in thread
From: Phil Oester @ 2005-01-30 18:19 UTC (permalink / raw)
  To: Patrick McHardy, David S. Miller, Robert.Olsson, akpm, torvalds,
	alexn, kas, linux-kernel, netdev

On Sun, Jan 30, 2005 at 06:01:46PM +0000, Russell King wrote:
> > OTOH, if conntrack isn't loaded, forwarded packets are never defragmented,
> > so frag_list should be empty. So probably false alarm, sorry.
> 
> I've just checked Phil's mails - both Phil and myself are using
> netfilter on the troublesome boxen.
> 
> Also, since FragCreates is zero, the frag_list must not have been empty
> in any of the cases so far where ip_fragment() has been called.
> (Reading the code, if frag_list were empty, we'd have to create some
> fragments, which increments the FragCreates statistic.)

The below testcase seems to illustrate the problem nicely -- ip_dst_cache
grows but never shrinks:

On gateway:

iptables -I FORWARD -d 10.10.10.0/24 -j DROP

On client:

for i in `seq 1 254` ; do ping -s 1500 -c 5 -w 1 -f 10.10.10.$i ; done


Phil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:58                                       ` Patrick McHardy
@ 2005-01-30 18:45                                         ` Russell King
  2005-01-31  2:48                                         ` David S. Miller
  2005-01-31  4:11                                         ` Herbert Xu
  2 siblings, 0 replies; 87+ messages in thread
From: Russell King @ 2005-01-30 18:45 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: David S. Miller, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Sun, Jan 30, 2005 at 06:58:27PM +0100, Patrick McHardy wrote:
> Patrick McHardy wrote:
> >> Russell King wrote:
> >>> I don't know if the code is using fragment lists in ip_fragment(), but
> >>> on reading the code a question comes to mind: if we have a list of
> >>> fragments, does each fragment skb have a valid (and refcounted) dst
> >>> pointer before ip_fragment() does its job?  If yes, then isn't the
> >>> first ip_copy_metadata() in ip_fragment() going to overwrite this
> >>> pointer without dropping the refcount?
> >>>
> >> Nice spotting. If conntrack isn't loaded defragmentation happens after
> >> routing, so this is likely the cause.
> >
> > OTOH, if conntrack isn't loaded, forwarded packets are never defragmented,
> > so frag_list should be empty. So probably false alarm, sorry.
> 
> Ok, final decision: you are right :) conntrack also defragments locally
> generated packets before they hit ip_fragment. In this case the fragments
> have skb->dst set.

Good news - with this in place, I no longer have refcounts of 14000!
After 18 minutes (the first clearout of the dst cache from 500 odd
down to 11 or so), all dst cache entries have a ref count of zero.

I'll check it again later this evening to be sure.

Thanks Patrick.

> ===== net/ipv4/ip_output.c 1.74 vs edited =====
> --- 1.74/net/ipv4/ip_output.c	2005-01-25 01:40:10 +01:00
> +++ edited/net/ipv4/ip_output.c	2005-01-30 18:54:43 +01:00
> @@ -389,6 +389,7 @@
>  	to->priority = from->priority;
>  	to->protocol = from->protocol;
>  	to->security = from->security;
> +	dst_release(to->dst);
>  	to->dst = dst_clone(from->dst);
>  	to->dev = from->dev;
>  


-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:58                                       ` Patrick McHardy
  2005-01-30 18:45                                         ` Russell King
@ 2005-01-31  2:48                                         ` David S. Miller
  2005-01-31  4:11                                         ` Herbert Xu
  2 siblings, 0 replies; 87+ messages in thread
From: David S. Miller @ 2005-01-31  2:48 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: rmk+lkml, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

On Sun, 30 Jan 2005 18:58:27 +0100
Patrick McHardy <kaber@trash.net> wrote:

> Ok, final decision: you are right :) conntrack also defragments locally
> generated packets before they hit ip_fragment. In this case the fragments
> have skb->dst set.

It's amazing how many bugs exist due to the local defragmentation and
refragmentation done by netfilter. :-)

Good catch Patrick, I'll apply this and push upstream.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-30 17:58                                       ` Patrick McHardy
  2005-01-30 18:45                                         ` Russell King
  2005-01-31  2:48                                         ` David S. Miller
@ 2005-01-31  4:11                                         ` Herbert Xu
  2005-01-31  4:45                                           ` YOSHIFUJI Hideaki / 吉藤英明
  2 siblings, 1 reply; 87+ messages in thread
From: Herbert Xu @ 2005-01-31  4:11 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: rmk+lkml, davem, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev

Patrick McHardy <kaber@trash.net> wrote:
> 
> Ok, final decision: you are right :) conntrack also defragments locally
> generated packets before they hit ip_fragment. In this case the fragments
> have skb->dst set.

Well caught.  The same thing is needed for IPv6, right?
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  4:11                                         ` Herbert Xu
@ 2005-01-31  4:45                                           ` YOSHIFUJI Hideaki / 吉藤英明
  2005-01-31  5:00                                             ` Patrick McHardy
  0 siblings, 1 reply; 87+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2005-01-31  4:45 UTC (permalink / raw)
  To: herbert, davem
  Cc: kaber, rmk+lkml, Robert.Olsson, akpm, torvalds, alexn, kas,
	linux-kernel, netdev, yoshfuji

In article <E1CvSuS-00056x-00@gondolin.me.apana.org.au> (at Mon, 31 Jan 2005 15:11:32 +1100), Herbert Xu <herbert@gondor.apana.org.au> says:

> Patrick McHardy <kaber@trash.net> wrote:
> > 
> > Ok, final decision: you are right :) conntrack also defragments locally
> > generated packets before they hit ip_fragment. In this case the fragments
> > have skb->dst set.
> 
> Well caught.  The same thing is needed for IPv6, right?

(not yet confirmed, but) yes, please.

Signed-off-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>

===== net/ipv6/ip6_output.c 1.82 vs edited =====
--- 1.82/net/ipv6/ip6_output.c	2005-01-25 09:40:10 +09:00
+++ edited/net/ipv6/ip6_output.c	2005-01-31 13:44:01 +09:00
@@ -463,6 +463,7 @@
 	to->priority = from->priority;
 	to->protocol = from->protocol;
 	to->security = from->security;
+	dst_release(to->dst);
 	to->dst = dst_clone(from->dst);
 	to->dev = from->dev;
 

--yoshfuji

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  4:45                                           ` YOSHIFUJI Hideaki / 吉藤英明
@ 2005-01-31  5:00                                             ` Patrick McHardy
  2005-01-31  5:11                                               ` David S. Miller
  2005-01-31  5:16                                               ` YOSHIFUJI Hideaki / 吉藤英明
  0 siblings, 2 replies; 87+ messages in thread
From: Patrick McHardy @ 2005-01-31  5:00 UTC (permalink / raw)
  To: yoshfuji
  Cc: herbert, davem, rmk+lkml, Robert.Olsson, akpm, torvalds, alexn,
	kas, linux-kernel, netdev

YOSHIFUJI Hideaki / 吉藤英明 wrote:

>In article <E1CvSuS-00056x-00@gondolin.me.apana.org.au> (at Mon, 31 Jan 2005 15:11:32 +1100), Herbert Xu <herbert@gondor.apana.org.au> says:
>
>
>>Patrick McHardy <kaber@trash.net> wrote:
>>
>>>Ok, final decision: you are right :) conntrack also defragments locally
>>>generated packets before they hit ip_fragment. In this case the fragments
>>>have skb->dst set.
>>>
>>Well caught.  The same thing is needed for IPv6, right?
>>
>
>(not yet confirmed, but) yes, please.
>
We don't need this for IPv6 yet. Once we get nf_conntrack in we
might need this, but its IPv6 fragment handling is different from
ip_conntrack, I need to check first.

Regards
Patrick


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  5:00                                             ` Patrick McHardy
@ 2005-01-31  5:11                                               ` David S. Miller
  2005-01-31  5:40                                                 ` Herbert Xu
  2005-01-31  5:16                                               ` YOSHIFUJI Hideaki / 吉藤英明
  1 sibling, 1 reply; 87+ messages in thread
From: David S. Miller @ 2005-01-31  5:11 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: yoshfuji, herbert, rmk+lkml, Robert.Olsson, akpm, torvalds,
	alexn, kas, linux-kernel, netdev

On Mon, 31 Jan 2005 06:00:40 +0100
Patrick McHardy <kaber@trash.net> wrote:

> We don't need this for IPv6 yet. Once we get nf_conntrack in we
> might need this, but its IPv6 fragment handling is different from
> ip_conntrack, I need to check first.

Right, ipv6 netfilter cannot create this situation yet.

However, logically the fix is still correct and I'll add
it into the tree.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  5:00                                             ` Patrick McHardy
  2005-01-31  5:11                                               ` David S. Miller
@ 2005-01-31  5:16                                               ` YOSHIFUJI Hideaki / 吉藤英明
  2005-01-31  5:42                                                 ` Yasuyuki KOZAKAI
  1 sibling, 1 reply; 87+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2005-01-31  5:16 UTC (permalink / raw)
  To: kaber, kozakai
  Cc: herbert, davem, rmk+lkml, Robert.Olsson, akpm, torvalds, alexn,
	kas, linux-kernel, netdev

In article <41FDBB78.2050403@trash.net> (at Mon, 31 Jan 2005 06:00:40 +0100), Patrick McHardy <kaber@trash.net> says:

|We don't need this for IPv6 yet. Once we get nf_conntrack in we
|might need this, but its IPv6 fragment handling is different from
|ip_conntrack, I need to check first.

Ok. It would be better to have some comment but anyway...
kozakai-san?

--yoshfuji

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  5:11                                               ` David S. Miller
@ 2005-01-31  5:40                                                 ` Herbert Xu
  0 siblings, 0 replies; 87+ messages in thread
From: Herbert Xu @ 2005-01-31  5:40 UTC (permalink / raw)
  To: David S. Miller
  Cc: Patrick McHardy, yoshfuji, rmk+lkml, Robert.Olsson, akpm,
	torvalds, alexn, kas, linux-kernel, netdev

On Sun, Jan 30, 2005 at 09:11:50PM -0800, David S. Miller wrote:
> On Mon, 31 Jan 2005 06:00:40 +0100
> Patrick McHardy <kaber@trash.net> wrote:
> 
> > We don't need this for IPv6 yet. Once we get nf_conntrack in we
> > might need this, but its IPv6 fragment handling is different from
> > ip_conntrack, I need to check first.
> 
> Right, ipv6 netfilter cannot create this situation yet.

Not through netfilter but I'm not convinced that other paths
won't do this.

For instance, what about ipv6_frag_rcv -> esp6_input -> ... -> ip6_fragment?
That would seem to be a potential path for a non-NULL dst to survive
through to ip6_fragment, no?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-31  5:16                                               ` YOSHIFUJI Hideaki / 吉藤英明
@ 2005-01-31  5:42                                                 ` Yasuyuki KOZAKAI
  0 siblings, 0 replies; 87+ messages in thread
From: Yasuyuki KOZAKAI @ 2005-01-31  5:42 UTC (permalink / raw)
  To: yoshfuji
  Cc: kaber, kozakai, herbert, davem, rmk+lkml, Robert.Olsson, akpm,
	torvalds, alexn, kas, linux-kernel, netdev


Hi,

From: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org>
Date: Mon, 31 Jan 2005 14:16:36 +0900 (JST)

> In article <41FDBB78.2050403@trash.net> (at Mon, 31 Jan 2005 06:00:40 +0100), Patrick McHardy <kaber@trash.net> says:
> 
> |We don't need this for IPv6 yet. Once we get nf_conntrack in we
> |might need this, but its IPv6 fragment handling is different from
> |ip_conntrack, I need to check first.
> 
> Ok. It would be better to have some comment but anyway...
> kozakai-san?

IMO, a fix for nf_conntrack isn't needed yet, because someone may change
the IPv6 fragment handling in nf_conntrack.

Anyway, the current nf_conntrack passes the original (not de-fragmented) skb
to the IPv6 stack. nf_conntrack doesn't touch its dst.

Regards,
----------------------------------------
Yasuyuki KOZAKAI

Communication Platform Laboratory,
Corporate Research & Development Center,
Toshiba Corporation

yasuyuki.kozakai@toshiba.co.jp
----------------------------------------

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-24 22:35                 ` Linus Torvalds
  2005-01-25 15:53                   ` OT " Paulo Marques
  2005-01-26  8:01                   ` Jens Axboe
@ 2005-02-02  9:29                   ` Lennert Van Alboom
  2005-02-02 16:00                     ` Linus Torvalds
  2 siblings, 1 reply; 87+ messages in thread
From: Lennert Van Alboom @ 2005-02-02  9:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Jens Axboe, alexn, kas, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 986 bytes --]

I applied the patch and it works like a charm. As a kinky side effect: before 
this patch, using a compiled-in vesa or vga16 framebuffer worked with the 
proprietary nvidia driver, whereas now tty1-6 are corrupt when not using 
80x25. Strangeness :)

Lennert

On Monday 24 January 2005 23:35, Linus Torvalds wrote:
> On Mon, 24 Jan 2005, Andrew Morton wrote:
> > Would indicate that the new pipe code is leaking.
>
> Duh. It's the pipe merging.
>
> 		Linus
>
> ----
> --- 1.40/fs/pipe.c	2005-01-15 12:01:16 -08:00
> +++ edited/fs/pipe.c	2005-01-24 14:35:09 -08:00
> @@ -630,13 +630,13 @@
>  	struct pipe_inode_info *info = inode->i_pipe;
>
>  	inode->i_pipe = NULL;
> -	if (info->tmp_page)
> -		__free_page(info->tmp_page);
>  	for (i = 0; i < PIPE_BUFFERS; i++) {
>  		struct pipe_buffer *buf = info->bufs + i;
>  		if (buf->ops)
>  			buf->ops->release(info, buf);
>  	}
> +	if (info->tmp_page)
> +		__free_page(info->tmp_page);
>  	kfree(info);
>  }

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-02  9:29                   ` Lennert Van Alboom
@ 2005-02-02 16:00                     ` Linus Torvalds
  2005-02-02 16:19                       ` Lennert Van Alboom
  2005-02-02 17:49                       ` Dave Hansen
  0 siblings, 2 replies; 87+ messages in thread
From: Linus Torvalds @ 2005-02-02 16:00 UTC (permalink / raw)
  To: Lennert Van Alboom; +Cc: Andrew Morton, Jens Axboe, alexn, kas, linux-kernel



On Wed, 2 Feb 2005, Lennert Van Alboom wrote:
>
> I applied the patch and it works like a charm. As a kinky side effect: before 
> this patch, using a compiled-in vesa or vga16 framebuffer worked with the 
> proprietary nvidia driver, whereas now tty1-6 are corrupt when not using 
> 80x25. Strangeness :)

It really sounds like you should lay off those pharmaceutical drugs ;)

That is _strange_. Is it literally just this single pipe merging change
that matters to you? No other changes? I don't see how it could
_possibly_ make any difference at all to anything else.

		Linus

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-02 16:00                     ` Linus Torvalds
@ 2005-02-02 16:19                       ` Lennert Van Alboom
  2005-02-02 17:49                       ` Dave Hansen
  1 sibling, 0 replies; 87+ messages in thread
From: Lennert Van Alboom @ 2005-02-02 16:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Jens Axboe, alexn, kas, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 996 bytes --]

Positive, I only applied this single two-line change. I'm not capable of 
messing with kernel code myself so I prefer not to. Probably just a lucky 
shot that the vesa didn't go nuts with nvidia before... O well, with a bit 
more o'those pharmaceutical drugs even this 80x25 doesn't look too bad. 
Hurray!

Lennert

On Wednesday 02 February 2005 17:00, Linus Torvalds wrote:
> On Wed, 2 Feb 2005, Lennert Van Alboom wrote:
> > I applied the patch and it works like a charm. As a kinky side effect:
> > before this patch, using a compiled-in vesa or vga16 framebuffer worked
> > with the proprietary nvidia driver, whereas now tty1-6 are corrupt when
> > not using 80x25. Strangeness :)
>
> It really sounds like you should lay off those pharmaceutical drugs ;)
>
> That is _strange_. Is it literally just this single pipe merging change
> that matters to you? No other changes? I don't see how it could
> _possibly_ make any difference at all to anything else.
>
> 		Linus

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-02 16:00                     ` Linus Torvalds
  2005-02-02 16:19                       ` Lennert Van Alboom
@ 2005-02-02 17:49                       ` Dave Hansen
  2005-02-02 18:27                         ` Linus Torvalds
  1 sibling, 1 reply; 87+ messages in thread
From: Dave Hansen @ 2005-02-02 17:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Lennert Van Alboom, Andrew Morton, Jens Axboe, alexn, kas,
	Linux Kernel Mailing List

I think there's still something funky going on in the pipe code, at
least in 2.6.11-rc2-mm2, which does contain the misordered __free_page()
fix in pipe.c.  I notice any leak pretty easily because I'm
attempting memory removal of highmem areas, and these apparently leaked
pipe pages are the only things keeping those from succeeding.

In any case, I'm running a horribly hacked up kernel, but this is
certainly a new problem, and not one that I've run into before.  Here's
output from the new CONFIG_PAGE_OWNER code:

Page (e0c4f8b8) pfn: 00566606 allocated via order 0
[0xc0162ef6] pipe_writev+542
[0xc0157f48] do_readv_writev+288
[0xc0163114] pipe_write+0
[0xc0134484] ltt_log_event+64
[0xc0158077] vfs_writev+75
[0xc01581ac] sys_writev+104
[0xc0102430] no_syscall_entry_trace+11

And some more information about the page (yes, it's in the vmalloc
space)

page: e0c4f8b8
pfn: 0008a54e 566606
count: 1
mapcount: 0
index: 786431
mapping: 00000000
private: 00000000
lru->prev: 00200200
lru->next: 00100100
        PG_locked:      0
        PG_error:       0
        PG_referenced:  0
        PG_uptodate:    0
        PG_dirty:       0
        PG_lru: 0
        PG_active:      0
        PG_slab:        0
        PG_highmem:     1
        PG_checked:     0
        PG_arch_1:      0
        PG_reserved:    0
        PG_private:     0
        PG_writeback:   0
        PG_nosave:      0
        PG_compound:    0
        PG_swapcache:   0
        PG_mappedtodisk:        0
        PG_reclaim:     0
        PG_nosave_free: 0
        PG_capture:     1


-- Dave


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-02 17:49                       ` Dave Hansen
@ 2005-02-02 18:27                         ` Linus Torvalds
  2005-02-02 19:07                           ` Dave Hansen
  0 siblings, 1 reply; 87+ messages in thread
From: Linus Torvalds @ 2005-02-02 18:27 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Lennert Van Alboom, Andrew Morton, Jens Axboe, alexn, kas,
	Linux Kernel Mailing List



On Wed, 2 Feb 2005, Dave Hansen wrote:
> 
> In any case, I'm running a horribly hacked up kernel, but this is
> certainly a new problem, and not one that I've run into before.  Here's
> output from the new CONFIG_PAGE_OWNER code:

Hmm.. Everything looks fine. One new thing about the pipe code is that it 
historically never allocated HIGHMEM pages, and the new code no longer 
cares and thus can allocate anything. So there's nothing strange in your 
output that I can see.

How many of these pages do you see? It's normal for a single pipe to be 
associated with up to 16 pages (although that would only happen if there 
is no reader or a slow reader, which is obviously not very common). 

Now, if your memory freeing code depends on the fact that all HIGHMEM
pages are always "freeable" (page cache + VM mappings only), then yes, the
new pipe code introduces highmem pages that weren't highmem before.  But
such long-lived and unfreeable pages have been there before too:  kernel
modules (or any other vmalloc() user, for that matter) also do the same
thing.

Now, there _is_ another possibility here: we might have had a pipe leak
before, and the new pipe code would potentially make it a lot more
noticeable, with up to sixteen times as many pages lost if somebody freed
a pipe inode without calling "free_pipe_info()". I don't see where that 
would happen - all the normal "release" functions seem fine.

Hmm.. Adding a 

	WARN_ON(inode->i_pipe);

to "iput_final()" might be a good idea - showing if somebody is releasing 
an inode while it still associated with a pipe-info data structure.

Also, while I don't see how a write could leak, maybe you could
add a

	WARN_ON(buf->ops);

to the pipe_writev() case just before we insert a new buffer (ie to just
after the comment that says "Insert it into the buffer array"). Just to
see if the circular buffer handling might overwrite an old entry (although
I _really_ don't see that - it's not like the code is complex, and it
would also be accompanied by data-loss in the pipe, so we'd have seen
that, methinks).
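
IOW, something like this (typed straight into the mail, not even
compile-tested):

	/* fs/inode.c, in iput_final(): */
	WARN_ON(inode->i_pipe);

	/* fs/pipe.c, in pipe_writev(), right after the "Insert it
	 * into the buffer array" comment and before the new buffer
	 * is filled in: */
	WARN_ON(buf->ops);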

		Linus

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-02 18:27                         ` Linus Torvalds
@ 2005-02-02 19:07                           ` Dave Hansen
  2005-02-02 21:08                             ` Linus Torvalds
  0 siblings, 1 reply; 87+ messages in thread
From: Dave Hansen @ 2005-02-02 19:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Lennert Van Alboom, Andrew Morton, Jens Axboe, alexn, kas,
	Linux Kernel Mailing List

On Wed, 2005-02-02 at 10:27 -0800, Linus Torvalds wrote:
> How many of these pages do you see? It's normal for a single pipe to be 
> associated with up to 16 pages (although that would only happen if there 
> is no reader or a slow reader, which is obviously not very common). 

Strangely enough, it seems to be one single, persistent page.  

> Now, if your memory freeing code depends on the fact that all HIGHMEM
> pages are always "freeable" (page cache + VM mappings only), then yes, the
> new pipe code introduces highmem pages that weren't highmem before.  But
> such long-lived and unfreeable pages have been there before too:  kernel
> modules (or any other vmalloc() user, for that matter) also do the same
> thing.

That might be it.  For now, I just changed the GFP masks for vmalloc() so
that I don't have to deal with it yet.  But I certainly can see
how this is a new user of highmem.

I did go around killing processes like mad to see if any of them still
had a hold of the pipe, but the shotgun approach didn't seem to help.

> Now, there _is_ another possibility here: we might have had a pipe leak
> before, and the new pipe code would potentially make it a lot more
> noticeable, with up to sixteen times as many pages lost if somebody freed
> a pipe inode without calling "free_pipe_info()". I don't see where that 
> would happen - all the normal "release" functions seem fine.
> 
> Hmm.. Adding a 
> 
> 	WARN_ON(inode->i_pipe);
> 
> to "iput_final()" might be a good idea - showing if somebody is releasing 
> an inode while it still associated with a pipe-info data structure.
> 
> Also, while I don't see how a write could leak, maybe you could
> add a
> 
> 	WARN_ON(buf->ops);
> 
> to the pipe_writev() case just before we insert a new buffer (ie to just
> after the comment that says "Insert it into the buffer array"). Just to
> see if the circular buffer handling might overwrite an old entry (although
> I _really_ don't see that - it's not like the code is complex, and it
> would also be accompanied by data-loss in the pipe, so we'd have seen
> that, methinks).

I'll put the warnings in, and see if anything comes up.

-- Dave


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-02 19:07                           ` Dave Hansen
@ 2005-02-02 21:08                             ` Linus Torvalds
  0 siblings, 0 replies; 87+ messages in thread
From: Linus Torvalds @ 2005-02-02 21:08 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Lennert Van Alboom, Andrew Morton, Jens Axboe, alexn, kas,
	Linux Kernel Mailing List



On Wed, 2 Feb 2005, Dave Hansen wrote:
> 
> Strangely enough, it seems to be one single, persistent page.  

Ok. Almost certainly not a leak. 

It's most likely the FIFO that "init" opens (/dev/initctl). FIFO's use the 
pipe code too.

If you don't want unreclaimable highmem pages, then I suspect you just 
need to change the GFP_HIGHUSER to a GFP_USER in fs/pipe.c
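
Something like this (from memory, not compile-tested):

	/* fs/pipe.c, in pipe_writev(): */
	-	page = alloc_page(GFP_HIGHUSER);
	+	page = alloc_page(GFP_USER);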

		Linus

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-01-21 16:19 Memory leak in 2.6.11-rc1? Jan Kasprzak
  2005-01-22  2:23 ` Alexander Nyberg
@ 2005-02-07 11:00 ` Jan Kasprzak
  2005-02-07 11:11   ` William Lee Irwin III
  2005-02-07 15:38   ` Linus Torvalds
  1 sibling, 2 replies; 87+ messages in thread
From: Jan Kasprzak @ 2005-02-07 11:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: torvalds

: I've been running 2.6.11-rc1 on my dual opteron Fedora Core 3 box for a week
: now, and I think there is a memory leak somewhere. I am measuring the
: size of active and inactive pages (from /proc/meminfo), and it seems
: that the count of sum (active+inactive) pages is decreasing. Please
: take look at the graphs at
: 
: http://www.linux.cz/stats/mrtg-rrd/vm_active.html

	Well, with Linus' patch to fs/pipe.c the situation seems to
improve a bit, but some leak is still there (look at the "monthly" graph
at the above URL). The server has been running 2.6.11-rc2 + patch to fs/pipe.c
for last 8 days. I am letting it run for a few more days in case you want
some debugging info from a live system. I am attaching my /proc/meminfo
and /proc/slabinfo.

-Yenya

# cat /proc/meminfo
MemTotal:      4045168 kB
MemFree:         59396 kB
Buffers:         17812 kB
Cached:        2861648 kB
SwapCached:          0 kB
Active:         827700 kB
Inactive:      2239752 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      4045168 kB
LowFree:         59396 kB
SwapTotal:    14651256 kB
SwapFree:     14650584 kB
Dirty:            1616 kB
Writeback:           0 kB
Mapped:         206540 kB
Slab:           861176 kB
CommitLimit:  16673840 kB
Committed_AS:   565684 kB
PageTables:      20812 kB
VmallocTotal: 34359738367 kB
VmallocUsed:      7400 kB
VmallocChunk: 34359730867 kB
# cat /proc/slabinfo
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <batchcount> <limit> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
raid5/md5            256    260   1416    5    2 : tunables   24   12    8 : slabdata     52     52      0
rpc_buffers            8      8   2048    2    1 : tunables   24   12    8 : slabdata      4      4      0
rpc_tasks             12     20    384   10    1 : tunables   54   27    8 : slabdata      2      2      0
rpc_inode_cache        8     10    768    5    1 : tunables   54   27    8 : slabdata      2      2      0
fib6_nodes            27     61     64   61    1 : tunables  120   60    8 : slabdata      1      1      0
ip6_dst_cache         17     36    320   12    1 : tunables   54   27    8 : slabdata      3      3      0
ndisc_cache            2     30    256   15    1 : tunables  120   60    8 : slabdata      2      2      0
rawv6_sock             4      4   1024    4    1 : tunables   54   27    8 : slabdata      1      1      0
udpv6_sock             1      4    960    4    1 : tunables   54   27    8 : slabdata      1      1      0
tcpv6_sock             8      8   1664    4    2 : tunables   24   12    8 : slabdata      2      2      0
unix_sock            567    650    768    5    1 : tunables   54   27    8 : slabdata    130    130      0
tcp_tw_bucket        445    920    192   20    1 : tunables  120   60    8 : slabdata     46     46      0
tcp_bind_bucket      389   2261     32  119    1 : tunables  120   60    8 : slabdata     19     19      0
tcp_open_request     135    310    128   31    1 : tunables  120   60    8 : slabdata     10     10      0
inet_peer_cache       32     62    128   31    1 : tunables  120   60    8 : slabdata      2      2      0
ip_fib_alias          20    119     32  119    1 : tunables  120   60    8 : slabdata      1      1      0
ip_fib_hash           18     61     64   61    1 : tunables  120   60    8 : slabdata      1      1      0
ip_dst_cache        1738   2060    384   10    1 : tunables   54   27    8 : slabdata    206    206      0
arp_cache              8     30    256   15    1 : tunables  120   60    8 : slabdata      2      2      0
raw_sock               3      9    832    9    2 : tunables   54   27    8 : slabdata      1      1      0
udp_sock              45     45    832    9    2 : tunables   54   27    8 : slabdata      5      5      0
tcp_sock             431    600   1472    5    2 : tunables   24   12    8 : slabdata    120    120      0
flow_cache             0      0    128   31    1 : tunables  120   60    8 : slabdata      0      0      0
dm_tio                 0      0     24  156    1 : tunables  120   60    8 : slabdata      0      0      0
dm_io                  0      0     32  119    1 : tunables  120   60    8 : slabdata      0      0      0
scsi_cmd_cache       261    315    512    7    1 : tunables   54   27    8 : slabdata     45     45    216
cfq_ioc_pool           0      0     48   81    1 : tunables  120   60    8 : slabdata      0      0      0
cfq_pool               0      0    176   22    1 : tunables  120   60    8 : slabdata      0      0      0
crq_pool               0      0    104   38    1 : tunables  120   60    8 : slabdata      0      0      0
deadline_drq           0      0     96   41    1 : tunables  120   60    8 : slabdata      0      0      0
as_arq               580    700    112   35    1 : tunables  120   60    8 : slabdata     20     20    432
xfs_acl                0      0    304   13    1 : tunables   54   27    8 : slabdata      0      0      0
xfs_chashlist        380   4879     32  119    1 : tunables  120   60    8 : slabdata     41     41     30
xfs_ili               15    120    192   20    1 : tunables  120   60    8 : slabdata      6      6      0
xfs_ifork              0      0     64   61    1 : tunables  120   60    8 : slabdata      0      0      0
xfs_efi_item           0      0    352   11    1 : tunables   54   27    8 : slabdata      0      0      0
xfs_efd_item           0      0    360   11    1 : tunables   54   27    8 : slabdata      0      0      0
xfs_buf_item           5     21    184   21    1 : tunables  120   60    8 : slabdata      1      1      0
xfs_dabuf             10    156     24  156    1 : tunables  120   60    8 : slabdata      1      1      0
xfs_da_state           2      8    488    8    1 : tunables   54   27    8 : slabdata      1      1      0
xfs_trans              1      9    872    9    2 : tunables   54   27    8 : slabdata      1      1      0
xfs_inode            500    959    528    7    1 : tunables   54   27    8 : slabdata    137    137      0
xfs_btree_cur          2     20    192   20    1 : tunables  120   60    8 : slabdata      1      1      0
xfs_bmap_free_item      0      0     24  156    1 : tunables  120   60    8 : slabdata      0      0      0
xfs_buf_t             44     72    448    9    1 : tunables   54   27    8 : slabdata      8      8      0
linvfs_icache        499    792    600    6    1 : tunables   54   27    8 : slabdata    132    132      0
nfs_write_data        36     36    832    9    2 : tunables   54   27    8 : slabdata      4      4      0
nfs_read_data         32     35    768    5    1 : tunables   54   27    8 : slabdata      7      7      0
nfs_inode_cache       28     72    952    4    1 : tunables   54   27    8 : slabdata     10     18      5
nfs_page               2     31    128   31    1 : tunables  120   60    8 : slabdata      1      1      0
isofs_inode_cache     10     12    600    6    1 : tunables   54   27    8 : slabdata      2      2      0
journal_handle        96    156     24  156    1 : tunables  120   60    8 : slabdata      1      1      0
journal_head         324    630     88   45    1 : tunables  120   60    8 : slabdata     14     14     60
revoke_table           6    225     16  225    1 : tunables  120   60    8 : slabdata      1      1      0
revoke_record          0      0     32  119    1 : tunables  120   60    8 : slabdata      0      0      0
ext3_inode_cache     829   1150    816    5    1 : tunables   54   27    8 : slabdata    230    230     54
ext3_xattr             0      0     88   45    1 : tunables  120   60    8 : slabdata      0      0      0
dnotify_cache          1     96     40   96    1 : tunables  120   60    8 : slabdata      1      1      0
dquot                  0      0    256   15    1 : tunables  120   60    8 : slabdata      0      0      0
eventpoll_pwq          0      0     72   54    1 : tunables  120   60    8 : slabdata      0      0      0
eventpoll_epi          0      0    192   20    1 : tunables  120   60    8 : slabdata      0      0      0
kioctx                 0      0    384   10    1 : tunables   54   27    8 : slabdata      0      0      0
kiocb                  0      0    256   15    1 : tunables  120   60    8 : slabdata      0      0      0
fasync_cache           0      0     24  156    1 : tunables  120   60    8 : slabdata      0      0      0
shmem_inode_cache    847    855    760    5    1 : tunables   54   27    8 : slabdata    171    171      0
posix_timers_cache      0      0    176   22    1 : tunables  120   60    8 : slabdata      0      0      0
uid_cache             17    122     64   61    1 : tunables  120   60    8 : slabdata      2      2      0
sgpool-128            32     32   4096    1    1 : tunables   24   12    8 : slabdata     32     32      0
sgpool-64             32     32   2048    2    1 : tunables   24   12    8 : slabdata     16     16      0
sgpool-32            140    140   1024    4    1 : tunables   54   27    8 : slabdata     35     35     76
sgpool-16             77     88    512    8    1 : tunables   54   27    8 : slabdata     11     11      0
sgpool-8             405    405    256   15    1 : tunables  120   60    8 : slabdata     27     27    284
blkdev_ioc           259    480     40   96    1 : tunables  120   60    8 : slabdata      5      5      0
blkdev_queue          80     84    680    6    1 : tunables   54   27    8 : slabdata     14     14      0
blkdev_requests      628    688    248   16    1 : tunables  120   60    8 : slabdata     43     43    480
biovec-(256)         256    256   4096    1    1 : tunables   24   12    8 : slabdata    256    256      0
biovec-128           256    256   2048    2    1 : tunables   24   12    8 : slabdata    128    128      0
biovec-64            358    380   1024    4    1 : tunables   54   27    8 : slabdata     95     95     54
biovec-16            270    300    256   15    1 : tunables  120   60    8 : slabdata     20     20      0
biovec-4             342    366     64   61    1 : tunables  120   60    8 : slabdata      6      6      0
biovec-1          5506200 5506200     16  225    1 : tunables  120   60    8 : slabdata  24472  24472    240
bio               5506189 5506189    128   31    1 : tunables  120   60    8 : slabdata 177619 177619    180
file_lock_cache       35     75    160   25    1 : tunables  120   60    8 : slabdata      3      3      0
sock_inode_cache    1069   1368    640    6    1 : tunables   54   27    8 : slabdata    228    228      0
skbuff_head_cache   5738   7185    256   15    1 : tunables  120   60    8 : slabdata    479    479    360
sock                   4     12    640    6    1 : tunables   54   27    8 : slabdata      2      2      0
proc_inode_cache     222    483    584    7    1 : tunables   54   27    8 : slabdata     69     69    183
sigqueue              23     23    168   23    1 : tunables  120   60    8 : slabdata      1      1      0
radix_tree_node    18317  21476    536    7    1 : tunables   54   27    8 : slabdata   3068   3068     27
bdev_cache            55     60    768    5    1 : tunables   54   27    8 : slabdata     12     12      0
sysfs_dir_cache     3112   3172     64   61    1 : tunables  120   60    8 : slabdata     52     52      0
mnt_cache             37     60    192   20    1 : tunables  120   60    8 : slabdata      3      3      0
inode_cache         1085   1134    552    7    1 : tunables   54   27    8 : slabdata    162    162      0
dentry_cache        4510  12410    224   17    1 : tunables  120   60    8 : slabdata    730    730    404
filp                2970   4380    256   15    1 : tunables  120   60    8 : slabdata    292    292     30
names_cache           25     25   4096    1    1 : tunables   24   12    8 : slabdata     25     25      0
idr_layer_cache       75     77    528    7    1 : tunables   54   27    8 : slabdata     11     11      0
buffer_head         6061  22770     88   45    1 : tunables  120   60    8 : slabdata    506    506      0
mm_struct            319    455   1152    7    2 : tunables   24   12    8 : slabdata     65     65      0
vm_area_struct     18395  30513    184   21    1 : tunables  120   60    8 : slabdata   1453   1453    420
fs_cache             367    793     64   61    1 : tunables  120   60    8 : slabdata     13     13      0
files_cache          332    513    832    9    2 : tunables   54   27    8 : slabdata     57     57      0
signal_cache         379    549    448    9    1 : tunables   54   27    8 : slabdata     61     61      0
sighand_cache        350    420   2112    3    2 : tunables   24   12    8 : slabdata    140    140      6
task_struct          378    460   1744    4    2 : tunables   24   12    8 : slabdata    115    115      6
anon_vma            1098   2340     24  156    1 : tunables  120   60    8 : slabdata     15     15      0
shared_policy_node      0      0     56   69    1 : tunables  120   60    8 : slabdata      0      0      0
numa_policy           33    225     16  225    1 : tunables  120   60    8 : slabdata      1      1      0
size-131072(DMA)       0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-131072            0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-65536(DMA)        0      0  65536    1   16 : tunables    8    4    0 : slabdata      0      0      0
size-65536             0      0  65536    1   16 : tunables    8    4    0 : slabdata      0      0      0
size-32768(DMA)        0      0  32768    1    8 : tunables    8    4    0 : slabdata      0      0      0
size-32768            19     19  32768    1    8 : tunables    8    4    0 : slabdata     19     19      0
size-16384(DMA)        0      0  16384    1    4 : tunables    8    4    0 : slabdata      0      0      0
size-16384             2      2  16384    1    4 : tunables    8    4    0 : slabdata      2      2      0
size-8192(DMA)         0      0   8192    1    2 : tunables    8    4    0 : slabdata      0      0      0
size-8192             33     35   8192    1    2 : tunables    8    4    0 : slabdata     33     35      0
size-4096(DMA)         0      0   4096    1    1 : tunables   24   12    8 : slabdata      0      0      0
size-4096            146    146   4096    1    1 : tunables   24   12    8 : slabdata    146    146      0
size-2048(DMA)         0      0   2048    2    1 : tunables   24   12    8 : slabdata      0      0      0
size-2048            526    546   2048    2    1 : tunables   24   12    8 : slabdata    273    273     88
size-1024(DMA)         0      0   1024    4    1 : tunables   54   27    8 : slabdata      0      0      0
size-1024           5533   6100   1024    4    1 : tunables   54   27    8 : slabdata   1525   1525    189
size-512(DMA)          0      0    512    8    1 : tunables   54   27    8 : slabdata      0      0      0
size-512             409    480    512    8    1 : tunables   54   27    8 : slabdata     60     60     27
size-256(DMA)          0      0    256   15    1 : tunables  120   60    8 : slabdata      0      0      0
size-256              97    105    256   15    1 : tunables  120   60    8 : slabdata      7      7      0
size-192(DMA)          0      0    192   20    1 : tunables  120   60    8 : slabdata      0      0      0
size-192            1747   2240    192   20    1 : tunables  120   60    8 : slabdata    112    112      0
size-128(DMA)          0      0    128   31    1 : tunables  120   60    8 : slabdata      0      0      0
size-128            2858   4495    128   31    1 : tunables  120   60    8 : slabdata    145    145     30
size-64(DMA)           0      0     64   61    1 : tunables  120   60    8 : slabdata      0      0      0
size-64             4595  23302     64   61    1 : tunables  120   60    8 : slabdata    382    382     60
size-32(DMA)           0      0     32  119    1 : tunables  120   60    8 : slabdata      0      0      0
size-32             1888   2142     32  119    1 : tunables  120   60    8 : slabdata     18     18      0
kmem_cache           150    150    256   15    1 : tunables  120   60    8 : slabdata     10     10      0
#

-Yenya

-- 
| Jan "Yenya" Kasprzak  <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839      Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/   Czech Linux Homepage: http://www.linux.cz/ |
> Whatever the Java applications and desktop dances may lead to, Unix will <
> still be pushing the packets around for a quite a while.      --Rob Pike <

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-07 11:00 ` Jan Kasprzak
@ 2005-02-07 11:11   ` William Lee Irwin III
  2005-02-07 15:38   ` Linus Torvalds
  1 sibling, 0 replies; 87+ messages in thread
From: William Lee Irwin III @ 2005-02-07 11:11 UTC (permalink / raw)
  To: Jan Kasprzak; +Cc: linux-kernel, torvalds

On Mon, Feb 07, 2005 at 12:00:30PM +0100, Jan Kasprzak wrote:
> 	Well, with Linus' patch to fs/pipe.c the situation seems to
> improve a bit, but some leak is still there (look at the "monthly" graph
> at the above URL). The server has been running 2.6.11-rc2 + patch to fs/pipe.c
> for the last 8 days. I am letting it run for a few more days in case you want
> some debugging info from a live system. I am attaching my /proc/meminfo
> and /proc/slabinfo.

Congratulations. You have 688MB of bio's.


-- wli

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-07 11:00 ` Jan Kasprzak
  2005-02-07 11:11   ` William Lee Irwin III
@ 2005-02-07 15:38   ` Linus Torvalds
  2005-02-07 15:52     ` Jan Kasprzak
  2005-02-08  2:47     ` Memory leak in 2.6.11-rc1? (also here) Noel Maddy
  1 sibling, 2 replies; 87+ messages in thread
From: Linus Torvalds @ 2005-02-07 15:38 UTC (permalink / raw)
  To: Jan Kasprzak, Jens Axboe; +Cc: Kernel Mailing List



On Mon, 7 Feb 2005, Jan Kasprzak wrote:
>
>The server has been running 2.6.11-rc2 + patch to fs/pipe.c
>for the last 8 days. 
> 
> # cat /proc/meminfo
> MemTotal:      4045168 kB
> Cached:        2861648 kB
> LowFree:         59396 kB
> Mapped:         206540 kB
> Slab:           861176 kB

Ok, pretty much everything there and accounted for: you've got 4GB of 
memory, and it's pretty much all in cached/mapped/slab. So if something is 
leaking, it's in one of those three.

And I think I see which one it is:

> # cat /proc/slabinfo
> slabinfo - version: 2.1
> # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <batchcount> <limit> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
> biovec-1          5506200 5506200     16  225    1 : tunables  120   60    8 : slabdata  24472  24472    240
> bio               5506189 5506189    128   31    1 : tunables  120   60    8 : slabdata 177619 177619    180

Whee. You've got 5 _million_ bio's "active". Which account for about 750MB
of your 860MB of slab usage.
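
(Back of the envelope: 5,506,189 bio objects at 128 bytes each is
roughly 670MB, and the 5,506,200 biovec-1 entries at 16 bytes apiece
add another ~85MB on top of that.)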

Jens, any ideas? Doesn't look like the "md sync_page_io bio leak", since
that would just lose one bio per md superblock read according to you (and
that's the only one I can find fixed since -rc2). I doubt Jan has caused
five million of those..

Jan - can you give Jens a bit of an idea of what drivers and/or schedulers 
you're using?

		Linus

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-07 15:38   ` Linus Torvalds
@ 2005-02-07 15:52     ` Jan Kasprzak
  2005-02-07 16:38       ` axboe
  2005-02-08  2:47     ` Memory leak in 2.6.11-rc1? (also here) Noel Maddy
  1 sibling, 1 reply; 87+ messages in thread
From: Jan Kasprzak @ 2005-02-07 15:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jens Axboe, Kernel Mailing List

Linus Torvalds wrote:
: Jan - can you give Jens a bit of an idea of what drivers and/or schedulers 
: you're using?

	I have a Tyan S2882 dual Opteron, network is on-board tg3,
there are 8 P-ATA HDDs hooked on a 3ware 7506-8 controller (no HW RAID
there, but the drives are partitioned and the partitions grouped to form
software RAID-0, 1, 5, and 10 volumes) - the main fileserving traffic
is on a RAID-5 volume, and /var is on a RAID-10 volume.

	Filesystems are XFS for that RAID-5 volume, ext3 for the rest
of the system. I have compiled-in the following I/O schedulers (according
to my /var/log/dmesg :-)

io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered

I have not changed the scheduler by hand, so I suppose the anticipatory
is the default.

	No X, just serial console. The server mostly does FTP serving
(ProFTPd with sendfile() compiled in), sends mail via qmail (about
100-200k mails a day), and does bits of other work (rsync, Apache, ...).
Fedora core 3 with all relevant updates.

	My fstab (physical devices only):
/dev/md0                /                       ext3    defaults        1 1
/dev/md1                /home                   ext3    defaults        1 2
/dev/md6                /var                    ext3    defaults        1 2
/dev/md4                /fastraid               xfs     noatime         1 3
/dev/md5                /export                 xfs     noatime         1 4
/dev/sde4               swap                    swap    pri=10          0 0
/dev/sdf4               swap                    swap    pri=10          0 0
/dev/sdg4               swap                    swap    pri=10          0 0
/dev/sdh4               swap                    swap    pri=10          0 0

	My mdstat:

Personalities : [raid0] [raid1] [raid5]
md6 : active raid0 md3[0] md2[1]
      19550720 blocks 64k chunks

md1 : active raid1 sdd1[1] sdc1[0]
      14659200 blocks [2/2] [UU]

md2 : active raid1 sdf1[1] sde1[0]
      9775424 blocks [2/2] [UU]

md3 : active raid1 sdh1[1] sdg1[0]
      9775424 blocks [2/2] [UU]

md4 : active raid0 sdh2[7] sdg2[6] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0]
      39133184 blocks 256k chunks

md5 : active raid5 sdh3[7] sdg3[6] sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb3[1] sda3[0]
      1572512256 blocks level 5, 256k chunk, algorithm 2 [8/8] [UUUUUUUU]

md0 : active raid1 sdb1[1] sda1[0]
      14659200 blocks [2/2] [UU]

unused devices: <none>

	Anything else you want to know? Thanks,

-Yenya

-- 
| Jan "Yenya" Kasprzak  <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839      Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/   Czech Linux Homepage: http://www.linux.cz/ |
> Whatever the Java applications and desktop dances may lead to, Unix will <
> still be pushing the packets around for a quite a while.      --Rob Pike <

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-07 15:52     ` Jan Kasprzak
@ 2005-02-07 16:38       ` axboe
  2005-02-07 17:35         ` Jan Kasprzak
  0 siblings, 1 reply; 87+ messages in thread
From: axboe @ 2005-02-07 16:38 UTC (permalink / raw)
  To: Jan Kasprzak; +Cc: Linus Torvalds, Jens Axboe, Kernel Mailing List

> Linus Torvalds wrote:
> : Jan - can you give Jens a bit of an idea of what drivers and/or
> schedulers
> : you're using?
>
> 	I have a Tyan S2882 dual Opteron, network is on-board tg3,
> there are 8 P-ATA HDDs hooked on 3ware 7506-8 controller (no HW RAID
> there, but the drives are partitioned and partition grouped to form
> software RAID-0, 1, 5, and 10 volumes - the main fileserving traffic
> is on a RAID-5 volume, and /var is on RAID-10 volume.
>
> 	Filesystems are XFS for that RAID-5 volume, ext3 for the rest
> of the system. I have compiled-in the following I/O schedulers (according
> to my /var/log/dmesg :-)
>
> io scheduler noop registered
> io scheduler anticipatory registered
> io scheduler deadline registered
> io scheduler cfq registered
>
> I have not changed the scheduler by hand, so I suppose the anticipatory
> is the default.
>
> 	No X, just serial console. The server does FTP serving mostly
> (ProFTPd with sendfile() compiled in), sending mail via qmail (cca
> 100-200k mails a day), and bits of other work (rsync, Apache, ...).
> Fedora core 3 with all relevant updates.
>
> 	My fstab (physical devices only):
> /dev/md0                /                       ext3    defaults        1 1
> /dev/md1                /home                   ext3    defaults        1 2
> /dev/md6                /var                    ext3    defaults        1 2
> /dev/md4                /fastraid               xfs     noatime         1 3
> /dev/md5                /export                 xfs     noatime         1 4
> /dev/sde4               swap                    swap    pri=10          0 0
> /dev/sdf4               swap                    swap    pri=10          0 0
> /dev/sdg4               swap                    swap    pri=10          0 0
> /dev/sdh4               swap                    swap    pri=10          0 0
>
> 	My mdstat:
>
> Personalities : [raid0] [raid1] [raid5]
> md6 : active raid0 md3[0] md2[1]
>       19550720 blocks 64k chunks
>
> md1 : active raid1 sdd1[1] sdc1[0]
>       14659200 blocks [2/2] [UU]
>
> md2 : active raid1 sdf1[1] sde1[0]
>       9775424 blocks [2/2] [UU]
>
> md3 : active raid1 sdh1[1] sdg1[0]
>       9775424 blocks [2/2] [UU]
>
> md4 : active raid0 sdh2[7] sdg2[6] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0]
>       39133184 blocks 256k chunks
>
> md5 : active raid5 sdh3[7] sdg3[6] sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb3[1] sda3[0]
>       1572512256 blocks level 5, 256k chunk, algorithm 2 [8/8] [UUUUUUUU]
>
> md0 : active raid1 sdb1[1] sda1[0]
>       14659200 blocks [2/2] [UU]

My guess would be the clone change, if raid was not leaking before. I
cannot look up any patches at the moment, as I'm still at the hospital
taking care of my newborn baby and wife :)

But try and reverse the patches to fs/bio.c that mention corruption due to
bio_clone and bio->bi_io_vec and see if that cures it. If it does, I know
where to look. When did you notice this started to leak?

Jens


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-07 16:38       ` axboe
@ 2005-02-07 17:35         ` Jan Kasprzak
  2005-02-07 21:10           ` Jan Kasprzak
  0 siblings, 1 reply; 87+ messages in thread
From: Jan Kasprzak @ 2005-02-07 17:35 UTC (permalink / raw)
  To: axboe; +Cc: Linus Torvalds, Jens Axboe, Kernel Mailing List

axboe@home.kernel.dk wrote:
: My guess would be the clone change, if raid was not leaking before. I
: cannot look up any patches at the moment, as I'm still at the hospital
: taking care of my newborn baby and wife :)

	Congratulations!

: But try and reverse the patches to fs/bio.c that mention corruption due to
: bio_clone and bio->bi_io_vec and see if that cures it. If it does, I know
: where to look. When did you notice this started to leak?

	I think I have been running 2.6.10-rc3 before. I've copied
the fs/bio.c from 2.6.10-rc3 to my 2.6.11-rc2 sources and booted the
resulting kernel. I hope it will not eat my filesystems :-) I will send
my /proc/slabinfo in a few days.

-Yenya

-- 
| Jan "Yenya" Kasprzak  <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839      Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/   Czech Linux Homepage: http://www.linux.cz/ |
> Whatever the Java applications and desktop dances may lead to, Unix will <
> still be pushing the packets around for a quite a while.      --Rob Pike <

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1?
  2005-02-07 17:35         ` Jan Kasprzak
@ 2005-02-07 21:10           ` Jan Kasprzak
  0 siblings, 0 replies; 87+ messages in thread
From: Jan Kasprzak @ 2005-02-07 21:10 UTC (permalink / raw)
  To: axboe; +Cc: Linus Torvalds, Jens Axboe, Kernel Mailing List

Jan Kasprzak wrote:
: 	I think I have been running 2.6.10-rc3 before. I've copied
: the fs/bio.c from 2.6.10-rc3 to my 2.6.11-rc2 sources and booted the
: resulting kernel. I hope it will not eat my filesystems :-) I will send
: my /proc/slabinfo in a few days.

	Hmm, after 3h35min of uptime I have

biovec-1           92157  92250     16  225    1 : tunables  120   60    8 : slabdata    410    410     60
bio                92163  92163    128   31    1 : tunables  120   60    8 : slabdata   2973   2973     60

so it is probably still leaking - about half an hour ago it was

biovec-1           77685  77850     16  225    1 : tunables  120   60    8 : slabdata    346    346      0
bio                77841  77841    128   31    1 : tunables  120   60    8 : slabdata   2511   2511    180

-Yenya

-- 
| Jan "Yenya" Kasprzak  <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839      Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/   Czech Linux Homepage: http://www.linux.cz/ |
> Whatever the Java applications and desktop dances may lead to, Unix will <
> still be pushing the packets around for a quite a while.      --Rob Pike <

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Memory leak in 2.6.11-rc1? (also here)
  2005-02-07 15:38   ` Linus Torvalds
  2005-02-07 15:52     ` Jan Kasprzak
@ 2005-02-08  2:47     ` Noel Maddy
  2005-02-16  4:00       ` -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?] Parag Warudkar
  1 sibling, 1 reply; 87+ messages in thread
From: Noel Maddy @ 2005-02-08  2:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jan Kasprzak, Jens Axboe, Kernel Mailing List

On Mon, Feb 07, 2005 at 07:38:12AM -0800, Linus Torvalds wrote:
> 
> Whee. You've got 5 _million_ bio's "active". Which account for about 750MB
> of your 860MB of slab usage.

Same situation here, at different rates on two different platforms,
both running same kernel build. Both show steadily increasing biovec-1.

uglybox was previously running Ingo's 2.6.11-rc2-RT-V0.7.36-03, and was
well over 3,000,000 bios after about a week of uptime. With only 512M of
memory, it was pretty sluggish.

Interesting that the 4-disk RAID5 seems to be growing about 4 times as
fast as the RAID1.

If there's anything else that could help, or patches you want me to try,
just ask.

Details:

=================================
#1: Soyo KT600 Platinum, Athlon 2500+, 512MB
	2 SATA, 2 PATA (all on 8237)
	RAID1 and RAID5
	on-board tg3
================================

>uname -a
Linux uglybox 2.6.11-rc3 #2 Thu Feb 3 16:19:44 EST 2005 i686 GNU/Linux
>uptime
 21:27:47 up  7:04,  4 users,  load average: 1.06, 1.03, 1.02
>grep '^bio' /proc/slabinfo
biovec-(256)         256    256   3072    2    2 : tunables   24   12    0 : slabdata    128    128      0
biovec-128           256    260   1536    5    2 : tunables   24   12    0 : slabdata     52     52      0
biovec-64            256    260    768    5    1 : tunables   54   27    0 : slabdata     52     52      0
biovec-16            256    260    192   20    1 : tunables  120   60    0 : slabdata     13     13      0
biovec-4             256    305     64   61    1 : tunables  120   60    0 : slabdata      5      5      0
biovec-1           64547  64636     16  226    1 : tunables  120   60    0 : slabdata    286    286      0
bio                64551  64599     64   61    1 : tunables  120   60    0 : slabdata   1059   1059      0
>lsmod
Module                  Size  Used by
ppp_deflate             4928  2 
zlib_deflate           21144  1 ppp_deflate
bsd_comp                5376  0 
ppp_async               9280  1 
crc_ccitt               1728  1 ppp_async
ppp_generic            21396  7 ppp_deflate,bsd_comp,ppp_async
slhc                    6720  1 ppp_generic
radeon                 76224  1 
ipv6                  235456  27 
pcspkr                  3300  0 
tg3                    84932  0 
ohci1394               31748  0 
ieee1394               94196  1 ohci1394
snd_cmipci             30112  1 
snd_pcm_oss            48480  0 
snd_mixer_oss          17728  1 snd_pcm_oss
usbhid                 31168  0 
snd_pcm                83528  2 snd_cmipci,snd_pcm_oss
snd_page_alloc          7620  1 snd_pcm
snd_opl3_lib            9472  1 snd_cmipci
snd_timer              21828  2 snd_pcm,snd_opl3_lib
snd_hwdep               7456  1 snd_opl3_lib
snd_mpu401_uart         6528  1 snd_cmipci
snd_rawmidi            20704  1 snd_mpu401_uart
snd_seq_device          7116  2 snd_opl3_lib,snd_rawmidi
snd                    48996  12 snd_cmipci,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_opl3_lib,snd_timer,snd_hwdep,snd_mpu401_uart,snd_rawmidi,snd_seq_device
soundcore               7648  1 snd
uhci_hcd               29968  0 
ehci_hcd               29000  0 
usbcore               106744  4 usbhid,uhci_hcd,ehci_hcd
dm_mod                 52796  0 
it87                   23900  0 
eeprom                  5776  0 
lm90                   11044  0 
i2c_sensor              2944  3 it87,eeprom,lm90
i2c_isa                 1728  0 
i2c_viapro              6412  0 
i2c_core               18512  6 it87,eeprom,lm90,i2c_sensor,i2c_isa,i2c_viapro
>lspci
0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge (rev 80)
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
0000:00:07.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5705 Gigabit Ethernet (rev 03)
0000:00:0d.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 46)
0000:00:0e.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
0000:00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
0000:00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
0000:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [K8T800 South]
0000:00:13.0 RAID bus controller: Silicon Image, Inc. (formerly CMD Technology Inc) SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)
0000:01:00.0 VGA compatible controller: ATI Technologies Inc Radeon RV200 QW [Radeon 7500]
>cat /proc/mdstat
Personalities : [raid0] [raid1] [raid5] 
md1 : active raid1 sdb1[0] sda1[1]
      489856 blocks [2/2] [UU]
      
md4 : active raid5 sdb3[2] sda3[3] hdc3[1] hda3[0]
      8795136 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
      
md5 : active raid5 sdb5[2] sda5[3] hdc5[1] hda5[0]
      14650752 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
      
md6 : active raid5 sdb6[2] sda6[3] hdc6[1] hda6[0]
      43953408 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
      
md7 : active raid5 sdb7[2] sda7[3] hdc7[1] hda7[0]
      164103552 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
      
md0 : active raid1 hdc1[1] hda1[0]
      489856 blocks [2/2] [UU]
      
unused devices: <none>

================================
#2: Soyo KT400 Platinum, Athlon 2500+, 512MB
	2 PATA (one on 8235, one on HPT372)
	RAID1
	on-board via rhine
================================

>uname -a
Linux lepke 2.6.11-rc3 #2 Thu Feb 3 16:19:44 EST 2005 i686 GNU/Linux
>uptime
 21:30:13 up  7:16,  1 user,  load average: 1.00, 1.00, 1.23
>grep '^bio' /proc/slabinfo
biovec-(256)         256    256   3072    2    2 : tunables   24   12    0 : slabdata    128    128      0
biovec-128           256    260   1536    5    2 : tunables   24   12    0 : slabdata     52     52      0
biovec-64            256    260    768    5    1 : tunables   54   27    0 : slabdata     52     52      0
biovec-16            256    260    192   20    1 : tunables  120   60    0 : slabdata     13     13      0
biovec-4             256    305     64   61    1 : tunables  120   60    0 : slabdata      5      5      0
biovec-1           14926  15142     16  226    1 : tunables  120   60    0 : slabdata     67     67      0
bio                14923  15006     64   61    1 : tunables  120   60    0 : slabdata    246    246      0
Module                  Size  Used by
ipv6                  235456  17 
pcspkr                  3300  0 
tuner                  21220  0 
ub                     15324  0 
usbhid                 31168  0 
bttv                  146064  0 
video_buf              17540  1 bttv
firmware_class          7936  1 bttv
i2c_algo_bit            8840  1 bttv
v4l2_common             4736  1 bttv
btcx_risc               3912  1 bttv
tveeprom               11544  1 bttv
videodev                7488  1 bttv
uhci_hcd               29968  0 
ehci_hcd               29000  0 
usbcore               106744  5 ub,usbhid,uhci_hcd,ehci_hcd
via_ircc               23380  0 
irda                  121784  1 via_ircc
crc_ccitt               1728  1 irda
via_rhine              19844  0 
mii                     4032  1 via_rhine
dm_mod                 52796  0 
snd_bt87x              12360  0 
snd_cmipci             30112  0 
snd_opl3_lib            9472  1 snd_cmipci
snd_hwdep               7456  1 snd_opl3_lib
snd_mpu401_uart         6528  1 snd_cmipci
snd_cs46xx             85064  0 
snd_rawmidi            20704  2 snd_mpu401_uart,snd_cs46xx
snd_seq_device          7116  2 snd_opl3_lib,snd_rawmidi
snd_ac97_codec         73976  1 snd_cs46xx
snd_pcm_oss            48480  0 
snd_mixer_oss          17728  1 snd_pcm_oss
snd_pcm                83528  5 snd_bt87x,snd_cmipci,snd_cs46xx,snd_ac97_codec,snd_pcm_oss
snd_timer              21828  2 snd_opl3_lib,snd_pcm
snd                    48996  13 snd_bt87x,snd_cmipci,snd_opl3_lib,snd_hwdep,snd_mpu401_uart,snd_cs46xx,snd_rawmidi,snd_seq_device,snd_ac97_codec,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
soundcore               7648  1 snd
snd_page_alloc          7620  3 snd_bt87x,snd_cs46xx,snd_pcm
lm90                   11044  0 
eeprom                  5776  0 
it87                   23900  0 
i2c_sensor              2944  3 lm90,eeprom,it87
i2c_isa                 1728  0 
i2c_viapro              6412  0 
i2c_core               18512  10 tuner,bttv,i2c_algo_bit,tveeprom,lm90,eeprom,it87,i2c_sensor,i2c_isa,i2c_viapro
0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
0000:00:09.0 Multimedia audio controller: Cirrus Logic CS 4614/22/24 [CrystalClear SoundFusion Audio Accelerator] (rev 01)
0000:00:0b.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 11)
0000:00:0b.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 11)
0000:00:0e.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
0000:00:0f.0 RAID bus controller: Triones Technologies, Inc. HPT366/368/370/370A/372 (rev 05)
0000:00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
0000:00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
0000:00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
0000:00:10.3 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 82)
0000:00:11.0 ISA bridge: VIA Technologies, Inc. VT8235 ISA Bridge
0000:00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 74)
0000:01:00.0 VGA compatible controller: ATI Technologies Inc Radeon R200 QM [Radeon 9100]
Personalities : [raid0] [raid1] [raid5] 
md4 : active raid1 hda1[0] hde1[1]
      995904 blocks [2/2] [UU]
      
md5 : active raid1 hda2[0] hde2[1]
      995904 blocks [2/2] [UU]
      
md6 : active raid1 hda7[0] hde7[1]
      5855552 blocks [2/2] [UU]
      
md7 : active raid0 hda8[0] hde8[1]
      136496128 blocks 32k chunks
      
unused devices: <none>


-- 
Educators cannot hope to instill a desire for life-long learning in
students until they themselves are life-long learners.
					    -- cvd6262, on slashdot.org
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
Noel Maddy <noel@zhtwn.com>

^ permalink raw reply	[flat|nested] 87+ messages in thread

* -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-08  2:47     ` Memory leak in 2.6.11-rc1? (also here) Noel Maddy
@ 2005-02-16  4:00       ` Parag Warudkar
  2005-02-16  5:12         ` Andrew Morton
  0 siblings, 1 reply; 87+ messages in thread
From: Parag Warudkar @ 2005-02-16  4:00 UTC (permalink / raw)
  To: Noel Maddy; +Cc: Linus Torvalds, Jan Kasprzak, Jens Axboe, Kernel Mailing List

I am running -rc3 on my AMD64 laptop and I noticed it becomes sluggish after
some use, mainly due to growing swap usage.  It has 768M of RAM and a Gig of
swap.  After following this thread, I started monitoring /proc/slabinfo. It
seems size-64 is continuously growing, and doing a compile run seems to make
it grow noticeably faster. After a day's uptime the size-64 line in
/proc/slabinfo looks like

size-64           7216543 7216544     64   61    1 : tunables  120   60    0 : 
slabdata 118304 118304      0
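
(That is roughly 7.2 million 64-byte objects - on the order of 440 MB tied up
in size-64 alone, which is consistent with the ~118,000 slab pages reported on
the same line.)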

Since this doesn't seem to be bio related, I think we have another slab leak
somewhere.  The box recently went OOM during a gcc compile run after I turned
swap off.

Output from free, the OOM killer, and /proc/slabinfo is below.

free output -
           total       used       free     shared    buffers     cached
Mem:        767996     758120       9876          0       5276     130360
-/+ buffers/cache:     622484     145512
Swap:      1052248      67668     984580

OOM Killer Output
oom-killer: gfp_mask=0x1d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:        7260kB (0kB HighMem)
Active:62385 inactive:850 dirty:0 writeback:0 unstable:0 free:1815 slab:120136 
mapped:62334 pagetables:2110
DMA free:3076kB min:72kB low:88kB high:108kB active:3328kB inactive:0kB 
present:16384kB pages_scanned:4446 all_unreclaimable? yes
lowmem_reserve[]: 0 751 751
Normal free:4184kB min:3468kB low:4332kB high:5200kB active:246212kB 
inactive:3400kB present:769472kB pages_scanned:3834 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB 
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 
1*2048kB 0*4096kB = 3076kB
Normal: 170*4kB 10*8kB 2*16kB 0*32kB 1*64kB 0*128kB 1*256kB 2*512kB 0*1024kB 
1*2048kB 0*4096kB = 4184kB
HighMem: empty
Swap cache: add 310423, delete 310423, find 74707/105490, race 0+0
Free swap  = 0kB
Total swap = 0kB
Out of Memory: Killed process 4898 (klauncher).
oom-killer: gfp_mask=0x1d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:        7020kB (0kB HighMem)
Active:62308 inactive:648 dirty:0 writeback:0 unstable:0 free:1755 slab:120439 
mapped:62199 pagetables:2020
DMA free:3076kB min:72kB low:88kB high:108kB active:3336kB inactive:0kB 
present:16384kB pages_scanned:7087 all_unreclaimable? yes
lowmem_reserve[]: 0 751 751
Normal free:3944kB min:3468kB low:4332kB high:5200kB active:245896kB 
inactive:2592kB present:769472kB pages_scanned:3861 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB 
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 
1*2048kB 0*4096kB = 3076kB
Normal: 112*4kB 9*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 2*512kB 0*1024kB 
1*2048kB 0*4096kB = 3944kB
HighMem: empty
Swap cache: add 310423, delete 310423, find 74707/105490, race 0+0
Free swap  = 0kB
Total swap = 0kB
Out of Memory: Killed process 4918 (kwin).

/proc/slabinfo output

ipx_sock               0      0    896    4    1 : tunables   54   27    0 : 
slabdata      0      0      0
scsi_cmd_cache         3      7    576    7    1 : tunables   54   27    0 : 
slabdata      1      1      0
ip_fib_alias          10    119     32  119    1 : tunables  120   60    0 : 
slabdata      1      1      0
ip_fib_hash           10     61     64   61    1 : tunables  120   60    0 : 
slabdata      1      1      0
sgpool-128            32     32   4096    1    1 : tunables   24   12    0 : 
slabdata     32     32      0
sgpool-64             32     32   2048    2    1 : tunables   24   12    0 : 
slabdata     16     16      0
sgpool-32             32     32   1024    4    1 : tunables   54   27    0 : 
slabdata      8      8      0
sgpool-16             32     32    512    8    1 : tunables   54   27    0 : 
slabdata      4      4      0
sgpool-8              32     45    256   15    1 : tunables  120   60    0 : 
slabdata      3      3      0
ext3_inode_cache    2805   3063   1224    3    1 : tunables   24   12    0 : 
slabdata   1021   1021      0
ext3_xattr             0      0     88   45    1 : tunables  120   60    0 : 
slabdata      0      0      0
journal_handle        16    156     24  156    1 : tunables  120   60    0 : 
slabdata      1      1      0
journal_head          49    180     88   45    1 : tunables  120   60    0 : 
slabdata      4      4      0
revoke_table           6    225     16  225    1 : tunables  120   60    0 : 
slabdata      1      1      0
revoke_record          0      0     32  119    1 : tunables  120   60    0 : 
slabdata      0      0      0
unix_sock            170    175   1088    7    2 : tunables   24   12    0 : 
slabdata     25     25      0
ip_mrt_cache           0      0    128   31    1 : tunables  120   60    0 : 
slabdata      0      0      0
tcp_tw_bucket          1     20    192   20    1 : tunables  120   60    0 : 
slabdata      1      1      0
tcp_bind_bucket        4    119     32  119    1 : tunables  120   60    0 : 
slabdata      1      1      0
tcp_open_request       0      0    128   31    1 : tunables  120   60    0 : 
slabdata      0      0      0
inet_peer_cache        0      0    128   31    1 : tunables  120   60    0 : 
slabdata      0      0      0
secpath_cache          0      0    192   20    1 : tunables  120   60    0 : 
slabdata      0      0      0
xfrm_dst_cache         0      0    384   10    1 : tunables   54   27    0 : 
slabdata      0      0      0
ip_dst_cache          14     20    384   10    1 : tunables   54   27    0 : 
slabdata      2      2      0
arp_cache              2     12    320   12    1 : tunables   54   27    0 : 
slabdata      1      1      0
raw_sock               2      7   1088    7    2 : tunables   24   12    0 : 
slabdata      1      1      0
udp_sock               7      7   1088    7    2 : tunables   24   12    0 : 
slabdata      1      1      0
tcp_sock               4      4   1920    2    1 : tunables   24   12    0 : 
slabdata      2      2      0
flow_cache             0      0    128   31    1 : tunables  120   60    0 : 
slabdata      0      0      0
cfq_ioc_pool           0      0     48   81    1 : tunables  120   60    0 : 
slabdata      0      0      0
cfq_pool               0      0    176   22    1 : tunables  120   60    0 : 
slabdata      0      0      0
crq_pool               0      0    104   38    1 : tunables  120   60    0 : 
slabdata      0      0      0
deadline_drq           0      0     96   41    1 : tunables  120   60    0 : 
slabdata      0      0      0
as_arq                32     70    112   35    1 : tunables  120   60    0 : 
slabdata      2      2      0
mqueue_inode_cache      1      3   1216    3    1 : tunables   24   12    0 : 
slabdata      1      1      0
isofs_inode_cache      0      0    872    4    1 : tunables   54   27    0 : 
slabdata      0      0      0
hugetlbfs_inode_cache      1      9    824    9    2 : tunables   54   27    
0 : slabdata      1      1      0
ext2_inode_cache       0      0   1024    4    1 : tunables   54   27    0 : 
slabdata      0      0      0
ext2_xattr             0      0     88   45    1 : tunables  120   60    0 : 
slabdata      0      0      0
dnotify_cache         75     96     40   96    1 : tunables  120   60    0 : 
slabdata      1      1      0
dquot                  0      0    320   12    1 : tunables   54   27    0 : 
slabdata      0      0      0
eventpoll_pwq          1     54     72   54    1 : tunables  120   60    0 : 
slabdata      1      1      0
eventpoll_epi          1     20    192   20    1 : tunables  120   60    0 : 
slabdata      1      1      0
kioctx                 0      0    512    7    1 : tunables   54   27    0 : 
slabdata      0      0      0
kiocb                  0      0    256   15    1 : tunables  120   60    0 : 
slabdata      0      0      0
fasync_cache           2    156     24  156    1 : tunables  120   60    0 : 
slabdata      1      1      0
shmem_inode_cache    302    308   1056    7    2 : tunables   24   12    0 : 
slabdata     44     44      0
posix_timers_cache      0      0    264   15    1 : tunables   54   27    0 : 
slabdata      0      0      0
uid_cache              5     61     64   61    1 : tunables  120   60    0 : 
slabdata      1      1      0
blkdev_ioc            84     90     88   45    1 : tunables  120   60    0 : 
slabdata      2      2      0
blkdev_queue          20     27    880    9    2 : tunables   54   27    0 : 
slabdata      3      3      0
blkdev_requests       32     32    248   16    1 : tunables  120   60    0 : 
slabdata      2      2      0
biovec-(256)         256    256   4096    1    1 : tunables   24   12    0 : 
slabdata    256    256      0
biovec-128           256    256   2048    2    1 : tunables   24   12    0 : 
slabdata    128    128      0
biovec-64            256    256   1024    4    1 : tunables   54   27    0 : 
slabdata     64     64      0
biovec-16            256    270    256   15    1 : tunables  120   60    0 : 
slabdata     18     18      0
biovec-4             256    305     64   61    1 : tunables  120   60    0 : 
slabdata      5      5      0
biovec-1             272    450     16  225    1 : tunables  120   60    0 : 
slabdata      2      2      0
bio                  272    279    128   31    1 : tunables  120   60    0 : 
slabdata      9      9      0
file_lock_cache        7     40    200   20    1 : tunables  120   60    0 : 
slabdata      2      2      0
sock_inode_cache     192    192    960    4    1 : tunables   54   27    0 : 
slabdata     48     48      0
skbuff_head_cache     45     72    320   12    1 : tunables   54   27    0 : 
slabdata      6      6      0
sock                   6      8    896    4    1 : tunables   54   27    0 : 
slabdata      2      2      0
proc_inode_cache      50    128    856    4    1 : tunables   54   27    0 : 
slabdata     32     32      0
sigqueue              23     23    168   23    1 : tunables  120   60    0 : 
slabdata      1      1      0
radix_tree_node     2668   2856    536    7    1 : tunables   54   27    0 : 
slabdata    408    408      0
bdev_cache             9      9   1152    3    1 : tunables   24   12    0 : 
slabdata      3      3      0
sysfs_dir_cache     2437   2440     64   61    1 : tunables  120   60    0 : 
slabdata     40     40      0
mnt_cache             26     40    192   20    1 : tunables  120   60    0 : 
slabdata      2      2      0
inode_cache          778    918    824    9    2 : tunables   54   27    0 : 
slabdata    102    102      0
dentry_cache        4320   8895    264   15    1 : tunables   54   27    0 : 
slabdata    593    593      0
filp                1488   1488    320   12    1 : tunables   54   27    0 : 
slabdata    124    124      0
names_cache           11     11   4096    1    1 : tunables   24   12    0 : 
slabdata     11     11      0
idr_layer_cache       76     77    528    7    1 : tunables   54   27    0 : 
slabdata     11     11      0
buffer_head         2360   2385     88   45    1 : tunables  120   60    0 : 
slabdata     53     53      0
mm_struct             65     65   1472    5    2 : tunables   24   12    0 : 
slabdata     13     13      0
vm_area_struct      5628   5632    176   22    1 : tunables  120   60    0 : 
slabdata    256    256      0
fs_cache              76    122     64   61    1 : tunables  120   60    0 : 
slabdata      2      2      0
files_cache           64     64    896    4    1 : tunables   54   27    0 : 
slabdata     16     16      0
signal_cache          96    119    512    7    1 : tunables   54   27    0 : 
slabdata     17     17      0
sighand_cache         78     78   2112    3    2 : tunables   24   12    0 : 
slabdata     26     26      0
task_struct           96     96   1936    2    1 : tunables   24   12    0 : 
slabdata     48     48      0
anon_vma            1464   1464     64   61    1 : tunables  120   60    0 : 
slabdata     24     24      0
size-131072(DMA)       0      0 131072    1   32 : tunables    8    4    0 : 
slabdata      0      0      0
size-131072            0      0 131072    1   32 : tunables    8    4    0 : 
slabdata      0      0      0
size-65536(DMA)        0      0  65536    1   16 : tunables    8    4    0 : 
slabdata      0      0      0
size-65536             3      3  65536    1   16 : tunables    8    4    0 : 
slabdata      3      3      0
size-32768(DMA)        0      0  32768    1    8 : tunables    8    4    0 : 
slabdata      0      0      0
size-32768             4      4  32768    1    8 : tunables    8    4    0 : 
slabdata      4      4      0
size-16384(DMA)        0      0  16384    1    4 : tunables    8    4    0 : 
slabdata      0      0      0
size-16384             4      4  16384    1    4 : tunables    8    4    0 : 
slabdata      4      4      0
size-8192(DMA)         0      0   8192    1    2 : tunables    8    4    0 : 
slabdata      0      0      0
size-8192             31     31   8192    1    2 : tunables    8    4    0 : 
slabdata     31     31      0
size-4096(DMA)         0      0   4096    1    1 : tunables   24   12    0 : 
slabdata      0      0      0
size-4096             56     56   4096    1    1 : tunables   24   12    0 : 
slabdata     56     56      0
size-2048(DMA)         0      0   2048    2    1 : tunables   24   12    0 : 
slabdata      0      0      0
size-2048            123    126   2048    2    1 : tunables   24   12    0 : 
slabdata     63     63      0
size-1024(DMA)         0      0   1024    4    1 : tunables   54   27    0 : 
slabdata      0      0      0
size-1024            252    252   1024    4    1 : tunables   54   27    0 : 
slabdata     63     63      0
size-512(DMA)          0      0    512    8    1 : tunables   54   27    0 : 
slabdata      0      0      0
size-512             421    448    512    8    1 : tunables   54   27    0 : 
slabdata     56     56      0
size-256(DMA)          0      0    256   15    1 : tunables  120   60    0 : 
slabdata      0      0      0
size-256             108    120    256   15    1 : tunables  120   60    0 : 
slabdata      8      8      0
size-192(DMA)          0      0    192   20    1 : tunables  120   60    0 : 
slabdata      0      0      0
size-192            1204   1220    192   20    1 : tunables  120   60    0 : 
slabdata     61     61      0
size-128(DMA)          0      0    128   31    1 : tunables  120   60    0 : 
slabdata      0      0      0
size-128            1247   1426    128   31    1 : tunables  120   60    0 : 
slabdata     46     46      0
size-64(DMA)           0      0     64   61    1 : tunables  120   60    0 : 
slabdata      0      0      0
size-64           7265953 7265954     64   61    1 : tunables  120   60    0 : 
slabdata 119114 119114      0
size-32(DMA)           0      0     32  119    1 : tunables  120   60    0 : 
slabdata      0      0      0
size-32             1071   1071     32  119    1 : tunables  120   60    0 : 
slabdata      9      9      0
kmem_cache           120    120    256   15    1 : tunables  120   60    0 : 
slabdata      8      8      0

Parag

On Monday 07 February 2005 09:47 pm, Noel Maddy wrote:
> On Mon, Feb 07, 2005 at 07:38:12AM -0800, Linus Torvalds wrote:
> > Whee. You've got 5 _million_ bio's "active". Which account for about
> > 750MB of your 860MB of slab usage.
>
> Same situation here, at different rates on two different platforms,
> both running same kernel build. Both show steadily increasing biovec-1.
>
> uglybox was previously running Ingo's 2.6.11-rc2-RT-V0.7.36-03, and was
> well over 3,000,000 bios after about a week of uptime. With only 512M of
> memory, it was pretty sluggish.
>
> Interesting that the 4-disk RAID5 seems to be growing about 4 times as
> fast as the RAID1.
>
> If there's anything else that could help, or patches you want me to try,
> just ask.
>
> Details:
>
> =================================
> #1: Soyo KT600 Platinum, Athlon 2500+, 512MB
> 	2 SATA, 2 PATA (all on 8237)
> 	RAID1 and RAID5
> 	on-board tg3
> ================================
>
> >uname -a
>
> Linux uglybox 2.6.11-rc3 #2 Thu Feb 3 16:19:44 EST 2005 i686 GNU/Linux
>
> >uptime
>
>  21:27:47 up  7:04,  4 users,  load average: 1.06, 1.03, 1.02
>
> >grep '^bio' /proc/slabinfo
>
> biovec-(256)         256    256   3072    2    2 : tunables   24   12    0
> : slabdata    128    128      0 biovec-128           256    260   1536    5
>    2 : tunables   24   12    0 : slabdata     52     52      0 biovec-64   
>         256    260    768    5    1 : tunables   54   27    0 : slabdata   
>  52     52      0 biovec-16            256    260    192   20    1 :
> tunables  120   60    0 : slabdata     13     13      0 biovec-4           
>  256    305     64   61    1 : tunables  120   60    0 : slabdata      5   
>   5      0 biovec-1           64547  64636     16  226    1 : tunables  120
>   60    0 : slabdata    286    286      0 bio                64551  64599  
>   64   61    1 : tunables  120   60    0 : slabdata   1059   1059      0
>
> >lsmod
>
> Module                  Size  Used by
> ppp_deflate             4928  2
> zlib_deflate           21144  1 ppp_deflate
> bsd_comp                5376  0
> ppp_async               9280  1
> crc_ccitt               1728  1 ppp_async
> ppp_generic            21396  7 ppp_deflate,bsd_comp,ppp_async
> slhc                    6720  1 ppp_generic
> radeon                 76224  1
> ipv6                  235456  27
> pcspkr                  3300  0
> tg3                    84932  0
> ohci1394               31748  0
> ieee1394               94196  1 ohci1394
> snd_cmipci             30112  1
> snd_pcm_oss            48480  0
> snd_mixer_oss          17728  1 snd_pcm_oss
> usbhid                 31168  0
> snd_pcm                83528  2 snd_cmipci,snd_pcm_oss
> snd_page_alloc          7620  1 snd_pcm
> snd_opl3_lib            9472  1 snd_cmipci
> snd_timer              21828  2 snd_pcm,snd_opl3_lib
> snd_hwdep               7456  1 snd_opl3_lib
> snd_mpu401_uart         6528  1 snd_cmipci
> snd_rawmidi            20704  1 snd_mpu401_uart
> snd_seq_device          7116  2 snd_opl3_lib,snd_rawmidi
> snd                    48996  12
> snd_cmipci,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_opl3_lib,snd_timer,snd_hwd
>ep,snd_mpu401_uart,snd_rawmidi,snd_seq_device soundcore               7648 
> 1 snd
> uhci_hcd               29968  0
> ehci_hcd               29000  0
> usbcore               106744  4 usbhid,uhci_hcd,ehci_hcd
> dm_mod                 52796  0
> it87                   23900  0
> eeprom                  5776  0
> lm90                   11044  0
> i2c_sensor              2944  3 it87,eeprom,lm90
> i2c_isa                 1728  0
> i2c_viapro              6412  0
> i2c_core               18512  6
> it87,eeprom,lm90,i2c_sensor,i2c_isa,i2c_viapro
>
> >lspci
>
> 0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP]
> Host Bridge (rev 80) 0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8237
> PCI Bridge
> 0000:00:07.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5705
> Gigabit Ethernet (rev 03) 0000:00:0d.0 FireWire (IEEE 1394): VIA
> Technologies, Inc. IEEE 1394 Host Controller (rev 46) 0000:00:0e.0
> Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
> 0000:00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA
> RAID Controller (rev 80) 0000:00:0f.1 IDE interface: VIA Technologies, Inc.
> VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
> 0000:00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
> Controller (rev 81) 0000:00:10.1 USB Controller: VIA Technologies, Inc.
> VT82xxxxx UHCI USB 1.1 Controller (rev 81) 0000:00:10.2 USB Controller: VIA
> Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81) 0000:00:10.3
> USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
> (rev 81) 0000:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev
> 86) 0000:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge
> [K8T800 South] 0000:00:13.0 RAID bus controller: Silicon Image, Inc.
> (formerly CMD Technology Inc) SiI 3112 [SATALink/SATARaid] Serial ATA
> Controller (rev 02) 0000:01:00.0 VGA compatible controller: ATI
> Technologies Inc Radeon RV200 QW [Radeon 7500]
>
> >cat /proc/mdstat
>
> Personalities : [raid0] [raid1] [raid5]
> md1 : active raid1 sdb1[0] sda1[1]
>       489856 blocks [2/2] [UU]
>
> md4 : active raid5 sdb3[2] sda3[3] hdc3[1] hda3[0]
>       8795136 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
>
> md5 : active raid5 sdb5[2] sda5[3] hdc5[1] hda5[0]
>       14650752 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
>
> md6 : active raid5 sdb6[2] sda6[3] hdc6[1] hda6[0]
>       43953408 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
>
> md7 : active raid5 sdb7[2] sda7[3] hdc7[1] hda7[0]
>       164103552 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
>
> md0 : active raid1 hdc1[1] hda1[0]
>       489856 blocks [2/2] [UU]
>
> unused devices: <none>
>
> ================================
> #2: Soyo KT400 Platinum, Athlon 2500+, 512MB
> 	2 PATA (one on 8235, one on HPT372)
> 	RAID1
> 	on-board via rhine
> ================================
>
> >uname -a
>
> Linux lepke 2.6.11-rc3 #2 Thu Feb 3 16:19:44 EST 2005 i686 GNU/Linux
>
> >uptime
>
>  21:30:13 up  7:16,  1 user,  load average: 1.00, 1.00, 1.23
>
> >grep '^bio' /proc/slabinfo
>
> biovec-(256)         256    256   3072    2    2 : tunables   24   12    0
> : slabdata    128    128      0 biovec-128           256    260   1536    5
>    2 : tunables   24   12    0 : slabdata     52     52      0 biovec-64   
>         256    260    768    5    1 : tunables   54   27    0 : slabdata   
>  52     52      0 biovec-16            256    260    192   20    1 :
> tunables  120   60    0 : slabdata     13     13      0 biovec-4           
>  256    305     64   61    1 : tunables  120   60    0 : slabdata      5   
>   5      0 biovec-1           14926  15142     16  226    1 : tunables  120
>   60    0 : slabdata     67     67      0 bio                14923  15006  
>   64   61    1 : tunables  120   60    0 : slabdata    246    246      0
> Module                  Size  Used by
> ipv6                  235456  17
> pcspkr                  3300  0
> tuner                  21220  0
> ub                     15324  0
> usbhid                 31168  0
> bttv                  146064  0
> video_buf              17540  1 bttv
> firmware_class          7936  1 bttv
> i2c_algo_bit            8840  1 bttv
> v4l2_common             4736  1 bttv
> btcx_risc               3912  1 bttv
> tveeprom               11544  1 bttv
> videodev                7488  1 bttv
> uhci_hcd               29968  0
> ehci_hcd               29000  0
> usbcore               106744  5 ub,usbhid,uhci_hcd,ehci_hcd
> via_ircc               23380  0
> irda                  121784  1 via_ircc
> crc_ccitt               1728  1 irda
> via_rhine              19844  0
> mii                     4032  1 via_rhine
> dm_mod                 52796  0
> snd_bt87x              12360  0
> snd_cmipci             30112  0
> snd_opl3_lib            9472  1 snd_cmipci
> snd_hwdep               7456  1 snd_opl3_lib
> snd_mpu401_uart         6528  1 snd_cmipci
> snd_cs46xx             85064  0
> snd_rawmidi            20704  2 snd_mpu401_uart,snd_cs46xx
> snd_seq_device          7116  2 snd_opl3_lib,snd_rawmidi
> snd_ac97_codec         73976  1 snd_cs46xx
> snd_pcm_oss            48480  0
> snd_mixer_oss          17728  1 snd_pcm_oss
> snd_pcm                83528  5
> snd_bt87x,snd_cmipci,snd_cs46xx,snd_ac97_codec,snd_pcm_oss snd_timer       
>       21828  2 snd_opl3_lib,snd_pcm
> snd                    48996  13
> snd_bt87x,snd_cmipci,snd_opl3_lib,snd_hwdep,snd_mpu401_uart,snd_cs46xx,snd_
>rawmidi,snd_seq_device,snd_ac97_codec,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_
>timer soundcore               7648  1 snd
> snd_page_alloc          7620  3 snd_bt87x,snd_cs46xx,snd_pcm
> lm90                   11044  0
> eeprom                  5776  0
> it87                   23900  0
> i2c_sensor              2944  3 lm90,eeprom,it87
> i2c_isa                 1728  0
> i2c_viapro              6412  0
> i2c_core               18512  10
> tuner,bttv,i2c_algo_bit,tveeprom,lm90,eeprom,it87,i2c_sensor,i2c_isa,i2c_vi
>apro 0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600
> AGP] Host Bridge 0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI
> Bridge
> 0000:00:09.0 Multimedia audio controller: Cirrus Logic CS 4614/22/24
> [CrystalClear SoundFusion Audio Accelerator] (rev 01) 0000:00:0b.0
> Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev
> 11) 0000:00:0b.1 Multimedia controller: Brooktree Corporation Bt878 Audio
> Capture (rev 11) 0000:00:0e.0 Multimedia audio controller: C-Media
> Electronics Inc CM8738 (rev 10) 0000:00:0f.0 RAID bus controller: Triones
> Technologies, Inc. HPT366/368/370/370A/372 (rev 05) 0000:00:10.0 USB
> Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev
> 80) 0000:00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB
> 1.1 Controller (rev 80) 0000:00:10.2 USB Controller: VIA Technologies, Inc.
> VT82xxxxx UHCI USB 1.1 Controller (rev 80) 0000:00:10.3 USB Controller: VIA
> Technologies, Inc. USB 2.0 (rev 82) 0000:00:11.0 ISA bridge: VIA
> Technologies, Inc. VT8235 ISA Bridge
> 0000:00:11.1 IDE interface: VIA Technologies, Inc.
> VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
> 0000:00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II]
> (rev 74) 0000:01:00.0 VGA compatible controller: ATI Technologies Inc
> Radeon R200 QM [Radeon 9100] Personalities : [raid0] [raid1] [raid5]
> md4 : active raid1 hda1[0] hde1[1]
>       995904 blocks [2/2] [UU]
>
> md5 : active raid1 hda2[0] hde2[1]
>       995904 blocks [2/2] [UU]
>
> md6 : active raid1 hda7[0] hde7[1]
>       5855552 blocks [2/2] [UU]
>
> md7 : active raid0 hda8[0] hde8[1]
>       136496128 blocks 32k chunks
>
> unused devices: <none>

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-16  4:00       ` -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?] Parag Warudkar
@ 2005-02-16  5:12         ` Andrew Morton
  2005-02-16  6:07           ` Parag Warudkar
  2005-02-16 23:31           ` Parag Warudkar
  0 siblings, 2 replies; 87+ messages in thread
From: Andrew Morton @ 2005-02-16  5:12 UTC (permalink / raw)
  To: Parag Warudkar; +Cc: noel, torvalds, kas, axboe, linux-kernel

Parag Warudkar <kernel-stuff@comcast.net> wrote:
>
> I am running -rc3 on my AMD64 laptop and I noticed it becomes sluggish
> after some use, mainly due to growing swap usage.  It has 768M of RAM and a
> Gig of swap.  After following this thread, I started monitoring
> /proc/slabinfo. It seems size-64 is continuously growing, and doing a
> compile run seems to make it grow noticeably faster. After a day's uptime
> the size-64 line in /proc/slabinfo looks like
> 
> size-64           7216543 7216544     64   61    1 : tunables  120   60    0 : 
> slabdata 118304 118304      0

Plenty of moisture there.

Could you please use this patch?  Make sure that you enable
CONFIG_FRAME_POINTER (might not be needed for __builtin_return_address(0),
but let's be sure).  Also enable CONFIG_DEBUG_SLAB.



From: Manfred Spraul <manfred@colorfullife.com>

With the patch applied,

	echo "size-4096 0 0 0" > /proc/slabinfo

walks the objects in the size-4096 slab, printing out the calling address
of whoever allocated that object.

It is for leak detection.


diff -puN mm/slab.c~slab-leak-detector mm/slab.c
--- 25/mm/slab.c~slab-leak-detector	2005-02-15 21:06:44.000000000 -0800
+++ 25-akpm/mm/slab.c	2005-02-15 21:06:44.000000000 -0800
@@ -2116,6 +2116,15 @@ cache_alloc_debugcheck_after(kmem_cache_
 		*dbg_redzone1(cachep, objp) = RED_ACTIVE;
 		*dbg_redzone2(cachep, objp) = RED_ACTIVE;
 	}
+	{
+		int objnr;
+		struct slab *slabp;
+
+		slabp = GET_PAGE_SLAB(virt_to_page(objp));
+
+		objnr = (objp - slabp->s_mem) / cachep->objsize;
+		slab_bufctl(slabp)[objnr] = (unsigned long)caller;
+	}
 	objp += obj_dbghead(cachep);
 	if (cachep->ctor && cachep->flags & SLAB_POISON) {
 		unsigned long	ctor_flags = SLAB_CTOR_CONSTRUCTOR;
@@ -2179,12 +2188,14 @@ static void free_block(kmem_cache_t *cac
 		objnr = (objp - slabp->s_mem) / cachep->objsize;
 		check_slabp(cachep, slabp);
 #if DEBUG
+#if 0
 		if (slab_bufctl(slabp)[objnr] != BUFCTL_FREE) {
 			printk(KERN_ERR "slab: double free detected in cache '%s', objp %p.\n",
 						cachep->name, objp);
 			BUG();
 		}
 #endif
+#endif
 		slab_bufctl(slabp)[objnr] = slabp->free;
 		slabp->free = objnr;
 		STATS_DEC_ACTIVE(cachep);
@@ -2998,6 +3009,29 @@ struct seq_operations slabinfo_op = {
 	.show	= s_show,
 };
 
+static void do_dump_slabp(kmem_cache_t *cachep)
+{
+#if DEBUG
+	struct list_head *q;
+
+	check_irq_on();
+	spin_lock_irq(&cachep->spinlock);
+	list_for_each(q,&cachep->lists.slabs_full) {
+		struct slab *slabp;
+		int i;
+		slabp = list_entry(q, struct slab, list);
+		for (i = 0; i < cachep->num; i++) {
+			unsigned long sym = slab_bufctl(slabp)[i];
+
+			printk("obj %p/%d: %p", slabp, i, (void *)sym);
+			print_symbol(" <%s>", sym);
+			printk("\n");
+		}
+	}
+	spin_unlock_irq(&cachep->spinlock);
+#endif
+}
+
 #define MAX_SLABINFO_WRITE 128
 /**
  * slabinfo_write - Tuning for the slab allocator
@@ -3038,9 +3072,11 @@ ssize_t slabinfo_write(struct file *file
 			    batchcount < 1 ||
 			    batchcount > limit ||
 			    shared < 0) {
-				res = -EINVAL;
+				do_dump_slabp(cachep);
+				res = 0;
 			} else {
-				res = do_tune_cpucache(cachep, limit, batchcount, shared);
+				res = do_tune_cpucache(cachep, limit,
+							batchcount, shared);
 			}
 			break;
 		}
_
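
A hypothetical helper for boiling the resulting output down (names and limits
arbitrary): feed it the "obj <slab>/<idx>: <caller>" lines from the kernel log
and it tallies how many live objects each calling address accounts for:

/*
 * Hypothetical post-processor for the dump produced by the patch above:
 * reads "obj <slab>/<idx>: <caller> ..." lines on stdin and prints the
 * number of live objects per calling address, largest count first.
 */
#include <stdio.h>
#include <stdlib.h>

struct site {
	unsigned long caller;
	long count;
};

static int cmp(const void *a, const void *b)
{
	const struct site *x = a, *y = b;

	return (y->count > x->count) - (y->count < x->count);
}

int main(void)
{
	static struct site sites[4096];
	unsigned long slabp, caller;
	char line[256];
	int nsites = 0, idx, i;

	while (fgets(line, sizeof(line), stdin)) {
		if (sscanf(line, "obj %lx/%d: %lx", &slabp, &idx, &caller) != 3)
			continue;
		for (i = 0; i < nsites; i++)
			if (sites[i].caller == caller)
				break;
		if (i == nsites) {
			if (nsites == 4096)
				continue;
			sites[nsites++].caller = caller;
		}
		sites[i].count++;
	}
	qsort(sites, nsites, sizeof(sites[0]), cmp);
	for (i = 0; i < nsites; i++)
		printf("%10ld objects from %016lx\n",
		       sites[i].count, sites[i].caller);
	return 0;
}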


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-16  5:12         ` Andrew Morton
@ 2005-02-16  6:07           ` Parag Warudkar
  2005-02-16 23:52             ` Andrew Morton
  2005-02-16 23:31           ` Parag Warudkar
  1 sibling, 1 reply; 87+ messages in thread
From: Parag Warudkar @ 2005-02-16  6:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: noel, torvalds, kas, axboe, linux-kernel

On Wednesday 16 February 2005 12:12 am, Andrew Morton wrote:
> Plenty of moisture there.
>
> Could you please use this patch?  Make sure that you enable
> CONFIG_FRAME_POINTER (might not be needed for __builtin_return_address(0),
> but let's be sure).  Also enable CONFIG_DEBUG_SLAB.

Will try that out. For now I tried -rc4 and a couple of other things -
removing the nvidia module doesn't make any difference, but after removing
ndiswrapper and running with no networking the slab growth stops. With the
8139too driver and networking the growth is still there, but noticeably
slower than with ndiswrapper. With 8139too + some network activity the slab
count even seems to shrink sometimes.

Seems to be either an ndiswrapper or a networking-related leak. Will report
the results with Manfred's patch tomorrow.

Thanks
Parag

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-16  5:12         ` Andrew Morton
  2005-02-16  6:07           ` Parag Warudkar
@ 2005-02-16 23:31           ` Parag Warudkar
  2005-02-16 23:51             ` Andrew Morton
  1 sibling, 1 reply; 87+ messages in thread
From: Parag Warudkar @ 2005-02-16 23:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 890 bytes --]

On Wednesday 16 February 2005 12:12 am, Andrew Morton wrote:
> echo "size-4096 0 0 0" > /proc/slabinfo

Is there a reason X86_64 doesn't have CONFIG_FRAME_POINTER anywhere in
the .config? I tried -rc4 with Manfred's patch and with CONFIG_DEBUG_SLAB and
CONFIG_DEBUG.

I get the following output from
echo "size-64 0 0 0" > /proc/slabinfo

obj ffff81002fe80000/0: 00000000000008a8 <0x8a8>
obj ffff81002fe80000/1: 00000000000008a8 <0x8a8>
obj ffff81002fe80000/2: 00000000000008a8 <0x8a8>
:                                 3
:                                 4
:                                 :
obj ffff81002fe80000/43: 00000000000008a8 <0x8a8>
obj ffff81002fe80000/44: 00000000000008a8 <0x8a8>
 
How do I know what is at ffff81002fe80000? I tried the normal tricks (gdb
-c /proc/kcore vmlinux, objdump -d, etc.) but none of them shows this
address.

I am attaching my config.

Parag

[-- Attachment #2: .config.zip --]
[-- Type: application/x-zip, Size: 9961 bytes --]


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-16 23:31           ` Parag Warudkar
@ 2005-02-16 23:51             ` Andrew Morton
  2005-02-17  1:19               ` Parag Warudkar
  2005-02-17  3:48               ` Horst von Brand
  0 siblings, 2 replies; 87+ messages in thread
From: Andrew Morton @ 2005-02-16 23:51 UTC (permalink / raw)
  To: Parag Warudkar; +Cc: torvalds, linux-kernel

Parag Warudkar <kernel-stuff@comcast.net> wrote:
>
> On Wednesday 16 February 2005 12:12 am, Andrew Morton wrote:
> > echo "size-4096 0 0 0" > /proc/slabinfo
> 
> Is there a reason X86_64 doesn't have CONFIG_FRAME_POINTER anywhere in 
> the .config?

No good reason, I suspect.

> I tried -rc4 with Manfred's patch and with CONFIG_DEBUG_SLAB and 
> CONFIG_DEBUG.

Thanks.

> I get the following output from
> echo "size-64 0 0 0" > /proc/slabinfo
> 
> obj ffff81002fe80000/0: 00000000000008a8 <0x8a8>
> obj ffff81002fe80000/1: 00000000000008a8 <0x8a8>
> obj ffff81002fe80000/2: 00000000000008a8 <0x8a8>
> :                                 3
> :                                 4
> :                                 :
> obj ffff81002fe80000/43: 00000000000008a8 <0x8a8>
> obj ffff81002fe80000/44: 00000000000008a8 <0x8a8>
>  
> How do I know what is at ffff81002fe80000? I tried the normal tricks (gdb 
> -c /proc/kcore vmlinux, objdump -d etc.) but none of the places list this 
> address.

ffff81002fe80000 is the address of the slab object.  00000000000008a8 is
supposed to be the caller's text address.  It appears that
__builtin_return_address(0) is returning junk.  Perhaps due to
-fomit-frame-pointer.
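
(Illustrative only - a tiny userspace demo of what __builtin_return_address(0)
is supposed to hand back; the slab debug code stores the equivalent value next
to each object, so if the builtin returns junk the stored addresses are
meaningless, which is what the repeated 0x8a8 entries above look like:)

#include <stdio.h>

/* Hypothetical demo, not kernel code: capture the caller's text address
 * the same way the slab debug store does. */
static __attribute__((noinline)) void *whoami(void)
{
	return __builtin_return_address(0);
}

int main(void)
{
	printf("called from %p\n", whoami());	/* prints an address inside main() */
	return 0;
}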


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-16  6:07           ` Parag Warudkar
@ 2005-02-16 23:52             ` Andrew Morton
  2005-02-17 13:00               ` Parag Warudkar
  0 siblings, 1 reply; 87+ messages in thread
From: Andrew Morton @ 2005-02-16 23:52 UTC (permalink / raw)
  To: Parag Warudkar; +Cc: noel, torvalds, kas, axboe, linux-kernel

Parag Warudkar <kernel-stuff@comcast.net> wrote:
>
> On Wednesday 16 February 2005 12:12 am, Andrew Morton wrote:
> > Plenty of moisture there.
> >
> > Could you please use this patch?  Make sure that you enable
> > CONFIG_FRAME_POINTER (might not be needed for __builtin_return_address(0),
> > but let's be sure).  Also enable CONFIG_DEBUG_SLAB.
> 
> Will try that out. For now I tried -rc4 and a couple of other things - removing 
> the nvidia module doesn't make any difference, but with ndiswrapper removed and no 
> networking the slab growth stops. With the 8139too driver and networking the growth 
> is still there, but much slower than with ndiswrapper. With 8139too + some network 
> activity the slab count even seems to shrink sometimes.

OK.

> Seems either an ndiswrapper or a networking related leak. Will report the 
> results with Manfred's patch tomorrow.

So it's probably an ndiswrapper bug?

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-16 23:51             ` Andrew Morton
@ 2005-02-17  1:19               ` Parag Warudkar
  2005-02-17  3:48               ` Horst von Brand
  1 sibling, 0 replies; 87+ messages in thread
From: Parag Warudkar @ 2005-02-17  1:19 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, linux-kernel

On Wednesday 16 February 2005 06:51 pm, Andrew Morton wrote:
> ffff81002fe80000 is the address of the slab object.  00000000000008a8 is
> supposed to be the caller's text address.  It appears that
> __builtin_return_address(0) is returning junk.  Perhaps due to
> -fomit-frame-pointer.
I tried manually removing -fomit-frame-pointer from the Makefile and adding 
-fno-omit-frame-pointer, but with the same results - junk return addresses. 
Probably an x86-64 issue.

> So it's probably an ndiswrapper bug?
I looked at ndiswrapper mailing lists and found this explanation for the same 
issue of growing size-64 with ndiswrapper  -
----------------------------------
"It looks like the problem is kernel-version related, not ndiswrapper. 
 ndiswrapper just uses some API that starts the memory leak but the 
 problem is indeed in the kernel itself. versions from 2.6.10 up to 
 .11-rc3 have this problem afaik. haven't tested rc4 but maybe this one 
 doesn't have the problem anymore, we will see"
----------------------------------

I tested -rc4 and it has the problem too. Moreover, with the plain old 8139too 
driver the slab still continues to grow, albeit slowly. So there is reason 
to suspect a kernel leak as well. I will try binary searching...

Parag

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-16 23:51             ` Andrew Morton
  2005-02-17  1:19               ` Parag Warudkar
@ 2005-02-17  3:48               ` Horst von Brand
  2005-02-17 13:35                 ` Parag Warudkar
  1 sibling, 1 reply; 87+ messages in thread
From: Horst von Brand @ 2005-02-17  3:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Parag Warudkar, torvalds, linux-kernel

Andrew Morton <akpm@osdl.org> said:
> Parag Warudkar <kernel-stuff@comcast.net> wrote:

[...]

> > Is there a reason X86_64 doesn't have CONFIG_FRAME_POINTER anywhere in 
> > the .config?

> No good reason, I suspect.

Does x86_64 use up a (freeable) register for the frame pointer or not?
I.e., does -fomit-frame-pointer have any effect on the generated code?
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-16 23:52             ` Andrew Morton
@ 2005-02-17 13:00               ` Parag Warudkar
  2005-02-17 18:18                 ` Linus Torvalds
  2005-02-18  1:38                 ` Badari Pulavarty
  0 siblings, 2 replies; 87+ messages in thread
From: Parag Warudkar @ 2005-02-17 13:00 UTC (permalink / raw)
  To: Andrew Morton; +Cc: noel, torvalds, kas, axboe, linux-kernel

On Wednesday 16 February 2005 06:52 pm, Andrew Morton wrote:
> So it's probably an ndiswrapper bug?
Andrew, 
It looks like it is a kernel bug triggered by NdisWrapper. Without 
NdisWrapper, and with just 8139too plus some light network activity the 
size-64 grew from ~ 1100 to 4500 overnight. Is this normal? I will keep it 
running to see where it goes.

A question - is it safe to assume it is  a kmalloc based leak? (I am thinking 
of tracking it down by using kprobes to insert a probe into __kmalloc and 
record the stack to see what is causing so many allocations.)
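
(A rough sketch of that kprobes approach, purely illustrative; it assumes
kallsyms_lookup_name() is usable from a module, which may need exporting by
hand on this kernel:)

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kprobes.h>
#include <linux/kallsyms.h>

/* Illustrative sketch: dump a stack trace on every __kmalloc() call. */
static int kmalloc_pre(struct kprobe *p, struct pt_regs *regs)
{
	dump_stack();
	return 0;
}

static struct kprobe kp = {
	.pre_handler = kmalloc_pre,
};

static int __init probe_init(void)
{
	kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("__kmalloc");
	if (!kp.addr)
		return -EINVAL;
	return register_kprobe(&kp);
}

static void __exit probe_exit(void)
{
	unregister_kprobe(&kp);
}

module_init(probe_init);
module_exit(probe_exit);
MODULE_LICENSE("GPL");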

Thanks
Parag

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-17  3:48               ` Horst von Brand
@ 2005-02-17 13:35                 ` Parag Warudkar
  0 siblings, 0 replies; 87+ messages in thread
From: Parag Warudkar @ 2005-02-17 13:35 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Andrew Morton, linux-kernel

On Wednesday 16 February 2005 10:48 pm, Horst von Brand wrote:
> Does x86_64 use up a (freeable) register for the frame pointer or not?
> I.e., does -fomit-frame-pointer have any effect on the generated code?

{Took Linus out of the loop as he probably isn't interested}

The generated code is different in the two cases, but for some reason gcc has 
trouble with __builtin_return_address on x86-64.

For example, compiled with -fomit-frame-pointer, a method produces the following assembly.

method_1:
.LFB2:
        subq    $8, %rsp
.LCFI0:
        movl    $__FUNCTION__.0, %esi
        movl    $.LC0, %edi
        movl    $0, %eax
        call    printf
        movl    $0, %eax
        addq    $8, %rsp
        ret

And with -fno-omit-frame-pointer, the same method yields 

method_1:
.LFB2:
        pushq   %rbp
.LCFI0:
        movq    %rsp, %rbp
.LCFI1:
        movl    $__FUNCTION__.0, %esi
        movl    $.LC0, %edi
        movl    $0, %eax
        call    printf
        movl    $0, %eax
        leave
        ret

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-17 13:00               ` Parag Warudkar
@ 2005-02-17 18:18                 ` Linus Torvalds
  2005-02-18  1:38                 ` Badari Pulavarty
  1 sibling, 0 replies; 87+ messages in thread
From: Linus Torvalds @ 2005-02-17 18:18 UTC (permalink / raw)
  To: Parag Warudkar; +Cc: Andrew Morton, noel, kas, axboe, linux-kernel



On Thu, 17 Feb 2005, Parag Warudkar wrote:
> 
> A question - is it safe to assume it is  a kmalloc based leak? (I am thinking 
> of tracking it down by using kprobes to insert a probe into __kmalloc and 
> record the stack to see what is causing so many allocations.)

It's definitely kmalloc-based, but you may not catch it in __kmalloc. The 
"kmalloc()" function is actually an inline function which has some magic 
compile-time code that statically determines when the size is constant and 
can be turned into a direct call to "kmem_cache_alloc()" with the proper 
cache descriptor.

So you'd need to either instrument kmem_cache_alloc() (and trigger on the 
proper slab descriptor) or you would need to modify the kmalloc() 
definition in <linux/slab.h> to not do the constant size optimization, at 
which point you can instrument just __kmalloc() and avoid some of the 
overhead.
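
(Roughly, with the compile-time size-class lookup collapsed into a
hypothetical pick_sized_cache() helper - the real <linux/slab.h> walks a
table of size classes instead:)

/* Simplified sketch of the constant-size optimization, not the actual
 * <linux/slab.h> code; pick_sized_cache() is a made-up stand-in for the
 * compile-time table walk that selects e.g. the size-64 cache. */
static inline void *kmalloc(size_t size, int flags)
{
	if (__builtin_constant_p(size)) {
		kmem_cache_t *cachep = pick_sized_cache(size);
		return kmem_cache_alloc(cachep, flags);
	}
	return __kmalloc(size, flags);	/* non-constant sizes go out of line */
}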

		Linus

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-17 13:00               ` Parag Warudkar
  2005-02-17 18:18                 ` Linus Torvalds
@ 2005-02-18  1:38                 ` Badari Pulavarty
  2005-02-21  4:57                   ` Parag Warudkar
  1 sibling, 1 reply; 87+ messages in thread
From: Badari Pulavarty @ 2005-02-18  1:38 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Andrew Morton, noel, torvalds, kas, axboe, Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 982 bytes --]

On Thu, 2005-02-17 at 05:00, Parag Warudkar wrote:
> On Wednesday 16 February 2005 06:52 pm, Andrew Morton wrote:
> > So it's probably an ndiswrapper bug?
> Andrew, 
> It looks like it is a kernel bug triggered by NdisWrapper. Without 
> NdisWrapper, and with just 8139too plus some light network activity the 
> size-64 grew from ~ 1100 to 4500 overnight. Is this normal? I will keep it 
> running to see where it goes.
> 
> A question - is it safe to assume it is  a kmalloc based leak? (I am thinking 
> of tracking it down by using kprobes to insert a probe into __kmalloc and 
> record the stack to see what is causing so many allocations.)
> 

Last time I debugged something like this, I ended up adding dump_stack()
in kmem_cache_alloc() for the specific slab.

If you are really interested, you can try to get the following jprobe
module working (you need to teach it about the kmem_cache_t structure to
get it to compile, and export the kallsyms_lookup_name() symbol, etc.).

Thanks,
Badari




[-- Attachment #2: kmod.c --]
[-- Type: text/x-c, Size: 1224 bytes --]

#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/kallsyms.h>
#include <linux/kdev_t.h>

MODULE_PARM_DESC(kmod, "\n");

int count = 0;
void fastcall inst_kmem_cache_alloc(kmem_cache_t *cachep, int flags)
{
	if (cachep->objsize == 64) {
		if (count++ == 100) {
			dump_stack();
			count = 0;
		}
	}
	jprobe_return();
}
static char *fn_names[] = {
	"kmem_cache_alloc",
};

static struct jprobe kmem_probes[] = {
  {
    .entry = (kprobe_opcode_t *) inst_kmem_cache_alloc,
    .kp.addr=(kprobe_opcode_t *) 0,
  }
};

#define MAX_KMEM_ROUTINE (sizeof(kmem_probes)/sizeof(struct jprobe))	/* array elements are jprobes, not kprobes */

/* installs the probes in the appropriate places */
static int init_kmods(void)
{
	int i;

	for (i = 0; i < MAX_KMEM_ROUTINE; i++) {
		kmem_probes[i].kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name(fn_names[i]);
		if (kmem_probes[i].kp.addr) { 
			printk("plant jprobe at name %s %p, handler addr %p\n",
		          fn_names[i], kmem_probes[i].kp.addr, kmem_probes[i].entry);
			register_jprobe(&kmem_probes[i]);
		}
	}
	return 0;
}

static void cleanup_kmods(void)
{
	int i;
	for (i = 0; i < MAX_KMEM_ROUTINE; i++) {
		unregister_jprobe(&kmem_probes[i]);
	}
}

module_init(init_kmods);
module_exit(cleanup_kmods);
MODULE_LICENSE("GPL");

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
  2005-02-18  1:38                 ` Badari Pulavarty
@ 2005-02-21  4:57                   ` Parag Warudkar
  0 siblings, 0 replies; 87+ messages in thread
From: Parag Warudkar @ 2005-02-21  4:57 UTC (permalink / raw)
  To: Badari Pulavarty
  Cc: Andrew Morton, noel, torvalds, kas, axboe, Linux Kernel Mailing List

On Thursday 17 February 2005 08:38 pm, Badari Pulavarty wrote:
> > On Wednesday 16 February 2005 06:52 pm, Andrew Morton wrote:
> > > So it's probably an ndiswrapper bug?
> >
> > Andrew,
> > It looks like it is a kernel bug triggered by NdisWrapper. Without
> > NdisWrapper, and with just 8139too plus some light network activity the
> > size-64 grew from ~ 1100 to 4500 overnight. Is this normal? I will keep
> > it running to see where it goes.

[OT]

Didn't want to keep this hanging - it turned out to be a strange ndiswrapper 
bug. It seems that the other OS in question allows the following without a 
leak ;) -
ptr = Allocate(...);
ptr = Allocate(...);
:
repeat this a zillion times without ever fearing that 'ptr' will leak..

I sent a fix to ndiswrapper-general mailing list on sourceforge if any one is 
using ndiswrapper and having a similar problem.

Parag

^ permalink raw reply	[flat|nested] 87+ messages in thread

end of thread, other threads:[~2005-02-21  4:58 UTC | newest]

Thread overview: 87+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-01-21 16:19 Memory leak in 2.6.11-rc1? Jan Kasprzak
2005-01-22  2:23 ` Alexander Nyberg
2005-01-23  9:11   ` Jens Axboe
2005-01-23  9:19     ` Andrew Morton
2005-01-23  9:56       ` Jens Axboe
2005-01-23 10:32         ` Andrew Morton
2005-01-23 20:03           ` Russell King
2005-01-24 11:48             ` Russell King
2005-01-25 19:32               ` Russell King
2005-01-27  8:28                 ` Russell King
2005-01-27  8:47                   ` Andrew Morton
2005-01-27 10:19                     ` Alessandro Suardi
2005-01-27 12:17                     ` Martin Josefsson
2005-01-27 12:56                     ` Robert Olsson
2005-01-27 13:03                       ` Robert Olsson
2005-01-27 16:49                       ` Russell King
2005-01-27 18:37                         ` Phil Oester
2005-01-27 19:25                           ` Russell King
2005-01-27 20:40                             ` Phil Oester
2005-01-28  9:32                               ` Russell King
2005-01-27 20:33                         ` David S. Miller
2005-01-28  0:17                           ` Russell King
2005-01-28  0:34                             ` David S. Miller
2005-01-28  8:58                               ` Russell King
2005-01-30 13:23                                 ` Russell King
2005-01-30 15:34                                   ` Russell King
2005-01-30 16:57                                     ` Phil Oester
2005-01-30 17:23                                   ` Patrick McHardy
2005-01-30 17:26                                     ` Patrick McHardy
2005-01-30 17:58                                       ` Patrick McHardy
2005-01-30 18:45                                         ` Russell King
2005-01-31  2:48                                         ` David S. Miller
2005-01-31  4:11                                         ` Herbert Xu
2005-01-31  4:45                                           ` YOSHIFUJI Hideaki / 吉藤英明
2005-01-31  5:00                                             ` Patrick McHardy
2005-01-31  5:11                                               ` David S. Miller
2005-01-31  5:40                                                 ` Herbert Xu
2005-01-31  5:16                                               ` YOSHIFUJI Hideaki / 吉藤英明
2005-01-31  5:42                                                 ` Yasuyuki KOZAKAI
2005-01-30 18:01                                       ` Russell King
2005-01-30 18:19                                         ` Phil Oester
2005-01-28  1:41                             ` Phil Oester
2005-01-24  0:56           ` Alexander Nyberg
2005-01-24 20:47             ` Jens Axboe
2005-01-24 20:56               ` Andrew Morton
2005-01-24 21:05                 ` Jens Axboe
2005-01-24 22:35                 ` Linus Torvalds
2005-01-25 15:53                   ` OT " Paulo Marques
2005-01-26  8:01                   ` Jens Axboe
2005-01-26  8:11                     ` Andrew Morton
2005-01-26  8:40                       ` Jens Axboe
2005-01-26  8:44                         ` Andrew Morton
2005-01-26  8:47                           ` Jens Axboe
2005-01-26  8:52                             ` Jens Axboe
2005-01-26  9:00                               ` William Lee Irwin III
2005-01-26  8:58                             ` Andrew Morton
2005-01-26  9:03                               ` Jens Axboe
2005-01-26 15:52                               ` Parag Warudkar
2005-02-02  9:29                   ` Lennert Van Alboom
2005-02-02 16:00                     ` Linus Torvalds
2005-02-02 16:19                       ` Lennert Van Alboom
2005-02-02 17:49                       ` Dave Hansen
2005-02-02 18:27                         ` Linus Torvalds
2005-02-02 19:07                           ` Dave Hansen
2005-02-02 21:08                             ` Linus Torvalds
2005-01-24 22:05             ` Andrew Morton
2005-02-07 11:00 ` Jan Kasprzak
2005-02-07 11:11   ` William Lee Irwin III
2005-02-07 15:38   ` Linus Torvalds
2005-02-07 15:52     ` Jan Kasprzak
2005-02-07 16:38       ` axboe
2005-02-07 17:35         ` Jan Kasprzak
2005-02-07 21:10           ` Jan Kasprzak
2005-02-08  2:47     ` Memory leak in 2.6.11-rc1? (also here) Noel Maddy
2005-02-16  4:00       ` -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?] Parag Warudkar
2005-02-16  5:12         ` Andrew Morton
2005-02-16  6:07           ` Parag Warudkar
2005-02-16 23:52             ` Andrew Morton
2005-02-17 13:00               ` Parag Warudkar
2005-02-17 18:18                 ` Linus Torvalds
2005-02-18  1:38                 ` Badari Pulavarty
2005-02-21  4:57                   ` Parag Warudkar
2005-02-16 23:31           ` Parag Warudkar
2005-02-16 23:51             ` Andrew Morton
2005-02-17  1:19               ` Parag Warudkar
2005-02-17  3:48               ` Horst von Brand
2005-02-17 13:35                 ` Parag Warudkar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).