From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx109.postini.com [74.125.245.109]) by kanga.kvack.org (Postfix) with SMTP id 8556A6B0074 for ; Tue, 13 Nov 2012 17:03:54 -0500 (EST) Date: Tue, 13 Nov 2012 14:03:52 -0800 From: Andrew Morton Subject: Re: [Bug 50181] New: Memory usage doubles after more then 20 hours of uptime. Message-Id: <20121113140352.4d2db9e8.akpm@linux-foundation.org> In-Reply-To: References: Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: sukijaki@gmail.com Cc: bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org (switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). On Tue, 6 Nov 2012 15:11:48 +0000 (UTC) bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=50181 > > Summary: Memory usage doubles after more then 20 hours of > uptime. > Product: Memory Management > Version: 2.5 > Kernel Version: 3.7-rc3 and 3.7-rc4 > Platform: All > OS/Version: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: Other > AssignedTo: akpm@linux-foundation.org > ReportedBy: sukijaki@gmail.com > Regression: Yes > > > Created an attachment (id=85721) > --> (https://bugzilla.kernel.org/attachment.cgi?id=85721) > kernel config file > > After 20 hours of uptime, memory usage starts going up. Normal usage for my > system was around 2.5GB max with all my apps and services up and running. But > with 3.7-rc3 and now -rc4 kernel, after more then 20 hours of uptime, it starts > to going up. With kernel before 3.7-rc3, my machine could be up for 10 days and > not go beyond 2.6GB memory usage. > > If I start some app that uses a lot of memory, when there is already 4 or even > 6GB used already, insted of freeing the memory, it starts to swap it, and > everything slows down with a lot of iowait. 
> > Here is "free -m" output after 24 hours of uptime: > > free -m > total used free shared buffers cached > Mem: 7989 7563 426 0 146 2772 > -/+ buffers/cache: 4643 3345 > Swap: 1953 688 1264 > > > I know that it is ok for memory to be used this much for buffers and cache, but > it is not normal not to relase it when it is needed. > > In attachment is my kernel config file. > Sounds like a memory leak. Please get the machine into this state and then send us - the contents of /proc/meminfo - the contents of /proc/slabinfo - the contents of /proc/vmstat - as root: dmesg -c echo m > /proc/sysrq-trigger dmesg thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx114.postini.com [74.125.245.114]) by kanga.kvack.org (Postfix) with SMTP id B3FFC6B004D for ; Tue, 13 Nov 2012 18:04:24 -0500 (EST) Received: by mail-ea0-f169.google.com with SMTP id k11so3721979eaa.14 for ; Tue, 13 Nov 2012 15:04:23 -0800 (PST) Message-ID: <1352847858.19536.5.camel@c2d-desktop.mypicture.info> Subject: Re: [Bug 50181] New: Memory usage doubles after more then 20 hours of uptime. From: Milos Jakovljevic Date: Wed, 14 Nov 2012 00:04:18 +0100 In-Reply-To: <20121113140352.4d2db9e8.akpm@linux-foundation.org> References: <20121113140352.4d2db9e8.akpm@linux-foundation.org> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org On Tue, 2012-11-13 at 14:03 -0800, Andrew Morton wrote: > (switched to email. Please respond via emailed reply-to-all, not via the > bugzilla web interface). 
> > On Tue, 6 Nov 2012 15:11:48 +0000 (UTC) > bugzilla-daemon@bugzilla.kernel.org wrote: > > > https://bugzilla.kernel.org/show_bug.cgi?id=50181 > > > > Summary: Memory usage doubles after more then 20 hours of > > uptime. > > Product: Memory Management > > Version: 2.5 > > Kernel Version: 3.7-rc3 and 3.7-rc4 > > Platform: All > > OS/Version: Linux > > Tree: Mainline > > Status: NEW > > Severity: normal > > Priority: P1 > > Component: Other > > AssignedTo: akpm@linux-foundation.org > > ReportedBy: sukijaki@gmail.com > > Regression: Yes > > > > > > Created an attachment (id=85721) > > --> (https://bugzilla.kernel.org/attachment.cgi?id=85721) > > kernel config file > > > > After 20 hours of uptime, memory usage starts going up. Normal usage for my > > system was around 2.5GB max with all my apps and services up and running. But > > with 3.7-rc3 and now -rc4 kernel, after more then 20 hours of uptime, it starts > > to going up. With kernel before 3.7-rc3, my machine could be up for 10 days and > > not go beyond 2.6GB memory usage. > > > > If I start some app that uses a lot of memory, when there is already 4 or even > > 6GB used already, insted of freeing the memory, it starts to swap it, and > > everything slows down with a lot of iowait. > > > > Here is "free -m" output after 24 hours of uptime: > > > > free -m > > total used free shared buffers cached > > Mem: 7989 7563 426 0 146 2772 > > -/+ buffers/cache: 4643 3345 > > Swap: 1953 688 1264 > > > > > > I know that it is ok for memory to be used this much for buffers and cache, but > > it is not normal not to relase it when it is needed. > > > > In attachment is my kernel config file. > > > > Sounds like a memory leak. > > Please get the machine into this state and then send us > > - the contents of /proc/meminfo > > - the contents of /proc/slabinfo > > - the contents of /proc/vmstat > > - as root: > > dmesg -c > echo m > /proc/sysrq-trigger > dmesg > > thanks. Will do. 
But it will take a day or two to get there, I rebooted today because of this problem.

From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx145.postini.com [74.125.245.145]) by kanga.kvack.org (Postfix) with SMTP id 013D16B005A for ; Thu, 15 Nov 2012 09:05:55 -0500 (EST) Received: by mail-bk0-f41.google.com with SMTP id jg9so836135bkc.14 for ; Thu, 15 Nov 2012 06:05:54 -0800 (PST) Message-ID: <1352988349.6409.4.camel@c2d-desktop.mypicture.info> Subject: Re: [Bug 50181] New: Memory usage doubles after more then 20 hours of uptime. From: Milos Jakovljevic Date: Thu, 15 Nov 2012 15:05:49 +0100 In-Reply-To: <20121113140352.4d2db9e8.akpm@linux-foundation.org> References: <20121113140352.4d2db9e8.akpm@linux-foundation.org> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org On Tue, 2012-11-13 at 14:03 -0800, Andrew Morton wrote: > (switched to email. Please respond via emailed reply-to-all, not via the > bugzilla web interface). > > On Tue, 6 Nov 2012 15:11:48 +0000 (UTC) > bugzilla-daemon@bugzilla.kernel.org wrote: > > > https://bugzilla.kernel.org/show_bug.cgi?id=50181 > > > > Summary: Memory usage doubles after more then 20 hours of > > uptime.
> > Product: Memory Management > > Version: 2.5 > > Kernel Version: 3.7-rc3 and 3.7-rc4 > > Platform: All > > OS/Version: Linux > > Tree: Mainline > > Status: NEW > > Severity: normal > > Priority: P1 > > Component: Other > > AssignedTo: akpm@linux-foundation.org > > ReportedBy: sukijaki@gmail.com > > Regression: Yes > > > > > > Created an attachment (id=85721) > > --> (https://bugzilla.kernel.org/attachment.cgi?id=85721) > > kernel config file > > > > After 20 hours of uptime, memory usage starts going up. Normal usage for my > > system was around 2.5GB max with all my apps and services up and running. But > > with 3.7-rc3 and now -rc4 kernel, after more then 20 hours of uptime, it starts > > to going up. With kernel before 3.7-rc3, my machine could be up for 10 days and > > not go beyond 2.6GB memory usage. > > > > If I start some app that uses a lot of memory, when there is already 4 or even > > 6GB used already, insted of freeing the memory, it starts to swap it, and > > everything slows down with a lot of iowait. > > > > Here is "free -m" output after 24 hours of uptime: > > > > free -m > > total used free shared buffers cached > > Mem: 7989 7563 426 0 146 2772 > > -/+ buffers/cache: 4643 3345 > > Swap: 1953 688 1264 > > > > > > I know that it is ok for memory to be used this much for buffers and cache, but > > it is not normal not to relase it when it is needed. > > > > In attachment is my kernel config file. > > > > Sounds like a memory leak. > > Please get the machine into this state and then send us > > - the contents of /proc/meminfo > > - the contents of /proc/slabinfo > > - the contents of /proc/vmstat > > - as root: > > dmesg -c > echo m > /proc/sysrq-trigger > dmesg > > thanks. 
Here is the requested content: free -m: http://pastebin.com/vb878a9Y cat /proc/meminfo : http://pastebin.com/zUDFcYEW cat /proc/slabinfo : http://pastebin.com/kswsJ7Hk cat /proc/vmstat : http://pastebin.com/wUebJqJe dmesg -c : http://pastebin.com/f7cTu8Wv echo m > /proc/sysrq-trigger && dmesg : http://pastebin.com/p68DcHUy And here are also files with that content: http://ubuntuone.com/5GUVahBTiZRP0QjQdP3gkQ

From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx102.postini.com [74.125.245.102]) by kanga.kvack.org (Postfix) with SMTP id 9C71D6B0074 for ; Thu, 15 Nov 2012 17:13:00 -0500 (EST) Date: Thu, 15 Nov 2012 14:12:58 -0800 From: Andrew Morton Subject: Re: [Bug 50181] New: Memory usage doubles after more then 20 hours of uptime. Message-Id: <20121115141258.8e5cc669.akpm@linux-foundation.org> In-Reply-To: <1352988349.6409.4.camel@c2d-desktop.mypicture.info> References: <20121113140352.4d2db9e8.akpm@linux-foundation.org> <1352988349.6409.4.camel@c2d-desktop.mypicture.info> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Milos Jakovljevic Cc: bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org, Dave Hansen On Thu, 15 Nov 2012 15:05:49 +0100 Milos Jakovljevic wrote: > Here is the requested content: > > free -m: http://pastebin.com/vb878a9Y > cat /proc/meminfo : http://pastebin.com/zUDFcYEW > cat /proc/slabinfo : http://pastebin.com/kswsJ7Hk > cat /proc/vmstat : http://pastebin.com/wUebJqJe > > dmesg -c : http://pastebin.com/f7cTu8Wv > > echo m > /proc/sysrq-trigger && dmesg : http://pastebin.com/p68DcHUy > > And here are also files with that content: > http://ubuntuone.com/5GUVahBTiZRP0QjQdP3gkQ > You've lost 2-3GB of ZONE_NORMAL and I see no sign there
to indicate where it went. /proc/slabinfo indicates that it isn't a slab leak, and kmemleak won't tell us about alloc_pages() leaks. I'm stumped. Dave, any progress at your end?

From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx164.postini.com [74.125.245.164]) by kanga.kvack.org (Postfix) with SMTP id 0FDFB6B004D for ; Thu, 15 Nov 2012 18:11:49 -0500 (EST) Received: by mail-ea0-f169.google.com with SMTP id a12so56302eaa.14 for ; Thu, 15 Nov 2012 15:11:47 -0800 (PST) Message-ID: <1353021103.6409.31.camel@c2d-desktop.mypicture.info> Subject: Re: [Bug 50181] New: Memory usage doubles after more then 20 hours of uptime. From: Milos Jakovljevic Date: Fri, 16 Nov 2012 00:11:43 +0100 In-Reply-To: <20121115141258.8e5cc669.akpm@linux-foundation.org> References: <20121113140352.4d2db9e8.akpm@linux-foundation.org> <1352988349.6409.4.camel@c2d-desktop.mypicture.info> <20121115141258.8e5cc669.akpm@linux-foundation.org> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org, Dave Hansen On Thu, 2012-11-15 at 14:12 -0800, Andrew Morton wrote: > On Thu, 15 Nov 2012 15:05:49 +0100 > Milos Jakovljevic wrote: > > > Here is the requested content: > > > > free -m: http://pastebin.com/vb878a9Y > > cat /proc/meminfo : http://pastebin.com/zUDFcYEW > > cat /proc/slabinfo : http://pastebin.com/kswsJ7Hk > > cat /proc/vmstat : http://pastebin.com/wUebJqJe > > > > dmesg -c : http://pastebin.com/f7cTu8Wv > > > > echo m > /proc/sysrq-trigger && dmesg : http://pastebin.com/p68DcHUy > > > > And here are also files with that content: > > http://ubuntuone.com/5GUVahBTiZRP0QjQdP3gkQ > > > > You've lost 2-3GB of ZONE_NORMAL
and I see no sign there to indicate > where it went. > > /proc/slabinfo indicates that it isn't a slab leak, and kmemleak won't > tell us about alloc_pages() leaks. I'm stumped. Dave, any progress at > your end? > > I didn't understand anything you said, but never mind. In -rc2 there was a problem with massive iowait whenever anything was done (starting an app, loading a web page, etc.), and there was a massive read operation from my /home partition. In -rc3 that stopped, and this started happening. Maybe it is related somehow? Or maybe it is just some problem with the nvidia blob and the 3.7 kernel losing VM_RELEASE (in the blob's mmap.c it was replaced with VM_DONTEXPAND | VM_DONTDUMP). Or maybe I'm just talking nonsense here.

From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx157.postini.com [74.125.245.157]) by kanga.kvack.org (Postfix) with SMTP id 9E1E76B006C for ; Thu, 15 Nov 2012 21:51:34 -0500 (EST) Received: from /spool/local by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
Violators will be prosecuted for from ; Thu, 15 Nov 2012 19:51:32 -0700 Received: from d03relay01.boulder.ibm.com (d03relay01.boulder.ibm.com [9.17.195.226]) by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id B86601FF001C for ; Thu, 15 Nov 2012 19:51:26 -0700 (MST) Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d03relay01.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id qAG2pToM213414 for ; Thu, 15 Nov 2012 19:51:29 -0700 Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id qAG2pT3O011773 for ; Thu, 15 Nov 2012 19:51:29 -0700 Message-ID: <50A5AA2D.4020003@linux.vnet.ibm.com> Date: Thu, 15 Nov 2012 18:51:25 -0800 From: Dave Hansen MIME-Version: 1.0 Subject: Re: [Bug 50181] New: Memory usage doubles after more then 20 hours of uptime. References: <20121113140352.4d2db9e8.akpm@linux-foundation.org> <1352988349.6409.4.camel@c2d-desktop.mypicture.info> <20121115141258.8e5cc669.akpm@linux-foundation.org> In-Reply-To: <20121115141258.8e5cc669.akpm@linux-foundation.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Milos Jakovljevic , bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org resending to linux-mm@... On 11/15/2012 02:12 PM, Andrew Morton wrote: > /proc/slabinfo indicates that it isn't a slab leak, and kmemleak won't > tell us about alloc_pages() leaks. I'm stumped. Dave, any progress at > your end? I turned on kmemleak and was able to reproduce this on a second reboot, but it went most of a workday before I noticed it had leaked a bunch. Unfortunately, kmemleak didn't help at all. It found a few small things that _may_ be leaks, but nothing to account for this _massive_ loss. I'm stumped so far. 
My next step is to add some logging to at least see if this is a gradual thing or it happens all at once, and maybe figure out what the heck I'm doing to trigger it.

From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx107.postini.com [74.125.245.107]) by kanga.kvack.org (Postfix) with SMTP id A9B616B0074 for ; Fri, 16 Nov 2012 13:34:22 -0500 (EST) Received: from /spool/local by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 16 Nov 2012 11:34:21 -0700 Received: from d03relay01.boulder.ibm.com (d03relay01.boulder.ibm.com [9.17.195.226]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id B43F23E40039 for ; Fri, 16 Nov 2012 11:34:13 -0700 (MST) Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by d03relay01.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id qAGIY4uI124596 for ; Fri, 16 Nov 2012 11:34:04 -0700 Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id qAGIY3bO025756 for ; Fri, 16 Nov 2012 11:34:03 -0700 Message-ID: <50A68718.3070002@linux.vnet.ibm.com> Date: Fri, 16 Nov 2012 10:34:00 -0800 From: Dave Hansen MIME-Version: 1.0 Subject: Re: [Bug 50181] New: Memory usage doubles after more then 20 hours of uptime.
References: <20121113140352.4d2db9e8.akpm@linux-foundation.org> <1352988349.6409.4.camel@c2d-desktop.mypicture.info> <20121115141258.8e5cc669.akpm@linux-foundation.org> <1353021103.6409.31.camel@c2d-desktop.mypicture.info> In-Reply-To: <1353021103.6409.31.camel@c2d-desktop.mypicture.info> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Milos Jakovljevic Cc: Andrew Morton , bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org On 11/15/2012 03:11 PM, Milos Jakovljevic wrote: > Or maybe, it is just some problem with nvidia blob and 3.7 kernel > loosing VM_RELEASE (in a blob's mmap.c it was replaced with > VM_DONTEXPAND | VM_DONTDUMP ). - or maybe I'm just saying nonsense > here. I'm using Intel graphics, so it's not nvidia related for me, at least. I've been recording a bunch of gunk from /proc once a minute for the past 16 hours or so. I've grepped some of it in to a log file (but I've got a *LOT* more than this): http://sr71.net/~dave/linux/leak-20121113/log.1353087988.txt.gz From meminfo, it shows MemFree/Buffers/Cached/AnonPages/Slab/PageTables, and their sum. That should capture _most_ of the memory use on the system, and if we see that sum going down, it's probably a sign of the leak, especially when we see a trend over a long period. The file is in roughly this format, if anyone cares: sums: The system in question is my laptop. What I can tell is that it doesn't leak much when I'm not using it. But, it's leaking pretty steadily since I started using the system today (~6am in the logs). It _averages_ leaking about 400kB/minute when idle and almost 9MB/minute when in active use. I've tried to provoke the leak doing specific things like large downloads, kernel compiles, watching video, alloc'ing a bunch of transparent huge pages, then exiting... No smoking gun so far. Anybody have ideas what to try next or want to poke holes in my statistics?
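The accounting described above can be sketched roughly as follows. This is a minimal Python illustration, not the script actually used to produce the log: the function names (`parse_meminfo`, `accounted_kb`, `leak_rate_kb_per_min`) are hypothetical, and it assumes the usual `Name:   value kB` layout of /proc/meminfo. It sums the six fields named in the message and estimates an average loss rate from the first and last samples.

```python
# Fields whose sum is tracked, as listed in the message above.
FIELDS = ("MemFree", "Buffers", "Cached", "AnonPages", "Slab", "PageTables")


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {name: value_in_kB} dict."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def accounted_kb(meminfo):
    """Sum of the tracked fields; a steady drop over time suggests a leak."""
    return sum(meminfo[field] for field in FIELDS)


def read_accounted_kb(path="/proc/meminfo"):
    """Take one sample from a live system (Linux only)."""
    with open(path) as f:
        return accounted_kb(parse_meminfo(f.read()))


def leak_rate_kb_per_min(samples):
    """Average loss in kB/minute between the first and last sample.

    `samples` is a list of (minute, accounted_kb) tuples in time order.
    """
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    return (s0 - s1) / (t1 - t0)
```

Calling `read_accounted_kb()` once a minute and appending `(minute, value)` to a list would give the input for `leak_rate_kb_per_min`. Comparing only the end points hides short-term noise such as page cache churn, which is one reason to look at the trend over a long period as suggested above.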
:)

From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx197.postini.com [74.125.245.197]) by kanga.kvack.org (Postfix) with SMTP id 95BBC6B0068 for ; Fri, 16 Nov 2012 13:46:25 -0500 (EST) Received: by mail-ea0-f169.google.com with SMTP id a12so496960eaa.14 for ; Fri, 16 Nov 2012 10:46:21 -0800 (PST) Message-ID: <1353091574.23064.14.camel@c2d-desktop.mypicture.info> Subject: Re: [Bug 50181] New: Memory usage doubles after more then 20 hours of uptime. From: Milos Jakovljevic Date: Fri, 16 Nov 2012 19:46:14 +0100 In-Reply-To: <50A68718.3070002@linux.vnet.ibm.com> References: <20121113140352.4d2db9e8.akpm@linux-foundation.org> <1352988349.6409.4.camel@c2d-desktop.mypicture.info> <20121115141258.8e5cc669.akpm@linux-foundation.org> <1353021103.6409.31.camel@c2d-desktop.mypicture.info> <50A68718.3070002@linux.vnet.ibm.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Dave Hansen Cc: Andrew Morton , bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org On Fri, 2012-11-16 at 10:34 -0800, Dave Hansen wrote: > On 11/15/2012 03:11 PM, Milos Jakovljevic wrote: > > Or maybe, it is just some problem with nvidia blob and 3.7 kernel > > loosing VM_RELEASE (in a blob's mmap.c it was replaced with > > VM_DONTEXPAND | VM_DONTDUMP ). - or maybe I'm just saying nonsense > > here. > > I'm using Intel graphics, so it's not nvidia related for me, at least. > > I've been recording a bunch of gunk from /proc once a minute for the > past 16 hours or so.
I've grepped some of it in to a log file (but I've > got a *LOT* more than this): > > http://sr71.net/~dave/linux/leak-20121113/log.1353087988.txt.gz > > From meminfo, it shows MemFree/Buffers/Cached/AnonPages/Slab/PageTables, > and their sum. That should capture _most_ of the memory use on the > system, and if we see that sum going down, it's probably a sign of the > leak, especially when we see a trend over a long period. The file is in > roughly this format, if anyone cares: > > sums: > > The system in question is my laptop. What I can tell is that it doesn't > leak much when I'm not using it. But, it's leaking pretty steadily > since I started using the system today (~6am in the logs). It > _averages_ leaking about 400kB/minute when idle and almost 9MB/minute > when in active use. > > I've tried to provoke the leak doing specific things like large > downloads, kernel compiles, watching video, alloc'ing a bunch of > transparent huge pages, then exiting... No smoking gun so far. > > Anybody have ideas what to try next or want to poke holes in my > statistics? :) > For me, it mostly happens overnight, when only tvtime and deluge are active (the rest of the programs are open but I don't use them: firefox, evolution and pidgin). Only once did it happen while I was at the PC doing something. It was fresh after a reboot, and I was restarting Firefox 10 or more times (I was experimenting with add-ons).

From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx138.postini.com [74.125.245.138]) by kanga.kvack.org (Postfix) with SMTP id A50FA6B0072 for ; Fri, 16 Nov 2012 14:16:00 -0500 (EST) Date: Fri, 16 Nov 2012 11:15:59 -0800 From: Andrew Morton Subject: Re: [Bug 50181] New: Memory usage doubles after more then 20 hours of uptime.
Message-Id: <20121116111559.63ec1622.akpm@linux-foundation.org> In-Reply-To: <50A68718.3070002@linux.vnet.ibm.com> References: <20121113140352.4d2db9e8.akpm@linux-foundation.org> <1352988349.6409.4.camel@c2d-desktop.mypicture.info> <20121115141258.8e5cc669.akpm@linux-foundation.org> <1353021103.6409.31.camel@c2d-desktop.mypicture.info> <50A68718.3070002@linux.vnet.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Dave Hansen Cc: Milos Jakovljevic , bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org On Fri, 16 Nov 2012 10:34:00 -0800 Dave Hansen wrote: > Anybody have ideas what to try next or want to poke holes in my > statistics? :) Maybe resurrect the below patch? It's probably six years old. It should allow us to find out who allocated those pages. Then perhaps we should merge the sucker this time. From: Alexander Nyberg Introduces CONFIG_PAGE_OWNER that keeps track of the call chain under which a page was allocated. Includes a user-space helper in Documentation/page_owner.c to sort the enormous amount of output that this may give (thanks tridge). Information available through /proc/page_owner x86_64 introduces some stack noise in certain call chains so for exact output use of x86 && CONFIG_FRAME_POINTER is suggested. Tested on x86, x86 && CONFIG_FRAME_POINTER, x86_64 Output looks like: 4819 times: Page allocated via order 0, mask 0x50 [0xc012b7b9] find_lock_page+25 [0xc012b8c8] find_or_create_page+152 [0xc0147d74] grow_dev_page+36 [0xc0148164] __find_get_block+84 [0xc0147ebc] __getblk_slow+124 [0xc0148164] __find_get_block+84 [0xc01481e7] __getblk+55 [0xc0185d14] do_readahead+100 We use a custom stack unwinder because using __builtin_return_address([0-7]) causes gcc to generate code that might try to unwind the stack looking for function return addresses and "fall off" causing early panics if the call chain is not deep enough. 
So in that case we could have had a depth of around 3 functions in all traces (I experimented a bit with this). From: Dave Hansen make page_owner handle non-contiguous page ranges From: Alexander Nyberg I've cleaned up the __alloc_pages() part to a simple set_page_owner() call. Signed-off-by: Alexander Nyberg Signed-off-by: Dave Hansen Signed-Off-by: Kamezawa Hiroyuki DESC Update page->order at an appropriate time when tracking PAGE_OWNER EDESC From: mel@skynet.ie (Mel Gorman) PAGE_OWNER tracks free pages by setting page->order to -1. However, it is set during __free_pages() which is not the only free path as __pagevec_free() and free_compound_page() do not go through __free_pages(). This leads to a situation where free pages are visible in /proc/page_owner which is confusing and might be interpreted as a memory leak. This patch sets page->owner when PageBuddy is set. It also prints a warning to the kernel log if a free page is found that does not appear free to PAGE_OWNER. This should be considered a fix to page-owner-tracking-leak-detector.patch. This only applies to -mm as PAGE_OWNER is not in mainline. Signed-off-by: Mel Gorman Acked-by: Andy Whitcroft DESC Print out PAGE_OWNER statistics in relation to fragmentation avoidance EDESC From: Mel Gorman When PAGE_OWNER is set, more information is available of relevance to fragmentation avoidance. A second line is added to /proc/page_owner showing the PFN, the pageblock number, the mobility type of the page based on its allocation flags, whether the allocation is improperly placed and the flags. A sample entry looks like Page allocated via order 0, mask 0x1280d2 PFN 7355 Block 7 type 3 Fallback Flags LA [0xc01528c6] __handle_mm_fault+598 [0xc0320427] do_page_fault+279 [0xc031ed9a] error_code+114 This information can be used to identify pages that are improperly placed. As the format of PAGE_OWNER data is now different, the comment at the top of Documentation/page_owner.c is updated with new instructions. 
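To make the record layout described above concrete, here is a minimal Python sketch that parses one /proc/page_owner entry in the two-header-line format shown in the sample. It is an illustration only and not part of the patch: the function name `parse_entry` and the dictionary keys are hypothetical, and the regular expressions assume the exact wording of the sample entry (flag letters as single characters, an optional "Fallback" marker).

```python
import re

# Patterns follow the sample entry quoted above; they are assumptions about
# the text format, not an interface defined by the patch.
HEADER_RE = re.compile(r"Page allocated via order (\d+), mask (0x[0-9a-fA-F]+)")
PFN_RE = re.compile(
    r"PFN (\d+) Block (\d+) type (\d+)\s+(Fallback)?\s*Flags([ A-Z]*)"
)
FRAME_RE = re.compile(r"\[(0x[0-9a-fA-F]+)\] (\S+)\+(\d+)")


def parse_entry(text):
    """Parse a single page_owner record into a dict, or return None."""
    header = HEADER_RE.search(text)
    pfn = PFN_RE.search(text)
    if not header or not pfn:
        return None
    return {
        "order": int(header.group(1)),
        "gfp_mask": int(header.group(2), 16),
        "pfn": int(pfn.group(1)),
        "block": int(pfn.group(2)),
        "block_type": int(pfn.group(3)),
        # "Fallback" marks a page whose migrate type differs from its block's.
        "fallback": pfn.group(4) is not None,
        # Flag letters are printed as single characters padded with spaces.
        "flags": [c for c in pfn.group(5) if c != " "],
        # Each frame: (address, symbol name, offset into the symbol).
        "trace": [
            (int(addr, 16), sym, int(off))
            for addr, sym, off in FRAME_RE.findall(text)
        ],
    }
```

Filtering parsed entries on `fallback` is one way to list the improperly placed pages the description mentions. Records in the older single-header format (no PFN line) are rejected with `None` rather than guessed at.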
As PAGE_OWNER tracks the GFP flags used to allocate the pages, /proc/pagetypeinfo is enhanced to contain how many mixed blocks exist. The additional output looks like

Number of mixed blocks    Unmovable  Reclaimable      Movable      Reserve
Node 0, zone      DMA             0            1            2            1
Node 0, zone   Normal             2           11           33            0

Signed-off-by: Mel Gorman Acked-by: Andy Whitcroft Acked-by: Christoph Lameter DESC Allow PAGE_OWNER to be set on any architecture EDESC From: Mel Gorman Currently PAGE_OWNER depends on CONFIG_X86. This appears to be due to pfn_to_page() being called in a manner inappropriate for many memory models and the presence of memory holes. This patch ensures that pfn_valid() and pfn_valid_within() are called at the appropriate places and the offsets correctly updated so that PAGE_OWNER is safe on any architecture. In situations where CONFIG_HOLES_IN_ZONES is set (IA64 with VIRTUAL_MEM_MAP), there may be cases where pages allocated within a MAX_ORDER_NR_PAGES block of pages may not be displayed in /proc/page_owner if the hole is at the start of the block. Addressing this would be quite complex, perform slowly and be of no clear benefit. Once PAGE_OWNER is allowed on all architectures, the statistics for grouping pages by mobility that declare how many pageblocks contain mixed page types become optionally available on all arches. This patch was tested successfully on x86, x86_64, ppc64 and IA64 machines. Signed-off-by: Mel Gorman Acked-by: Andy Whitcroft DESC allow-page_owner-to-be-set-on-any-architecture-fix EDESC From: Andrew Morton Cc: Andy Whitcroft Cc: Mel Gorman DESC allow-page_owner-to-be-set-on-any-architecture-fix fix EDESC From: mel@skynet.ie (Mel Gorman) Page-owner-tracking stores a backtrace of an allocation in the struct page. How the stack trace is generated depends on whether CONFIG_FRAME_POINTER is set or not. If CONFIG_FRAME_POINTER is set, the frame pointer must be read using some inline assembler which is not available for all architectures.
This patch uses the frame pointer where it is available but has a fallback where it is not. Signed-off-by: Mel Gorman Cc: Andy Whitcroft Signed-off-by: Andrew Morton --- Documentation/page_owner.c | 141 ++++++++++++++++++++++++++++++++++ fs/proc/proc_misc.c | 145 +++++++++++++++++++++++++++++++++++ include/linux/mm_types.h | 5 + lib/Kconfig.debug | 10 ++ mm/page_alloc.c | 66 +++++++++++++++ mm/vmstat.c | 93 ++++++++++++++++++++++ 6 files changed, 460 insertions(+) diff -puN /dev/null Documentation/page_owner.c --- /dev/null +++ a/Documentation/page_owner.c @@ -0,0 +1,141 @@ +/* + * User-space helper to sort the output of /proc/page_owner + * + * Example use: + * cat /proc/page_owner > page_owner_full.txt + * grep -v ^PFN page_owner_full.txt > page_owner.txt + * ./sort page_owner.txt sorted_page_owner.txt +*/ + +#include +#include +#include +#include +#include +#include +#include + +struct block_list { + char *txt; + int len; + int num; +}; + + +static struct block_list *list; +static int list_size; +static int max_size; + +struct block_list *block_head; + +int read_block(char *buf, FILE *fin) +{ + int ret = 0; + int hit = 0; + char *curr = buf; + + for (;;) { + *curr = getc(fin); + if (*curr == EOF) return -1; + + ret++; + if (*curr == '\n' && hit == 1) + return ret - 1; + else if (*curr == '\n') + hit = 1; + else + hit = 0; + curr++; + } +} + +static int compare_txt(struct block_list *l1, struct block_list *l2) +{ + return strcmp(l1->txt, l2->txt); +} + +static int compare_num(struct block_list *l1, struct block_list *l2) +{ + return l2->num - l1->num; +} + +static void add_list(char *buf, int len) +{ + if (list_size != 0 && + len == list[list_size-1].len && + memcmp(buf, list[list_size-1].txt, len) == 0) { + list[list_size-1].num++; + return; + } + if (list_size == max_size) { + printf("max_size too small??\n"); + exit(1); + } + list[list_size].txt = malloc(len+1); + list[list_size].len = len; + list[list_size].num = 1; + memcpy(list[list_size].txt, buf, len); + 
list[list_size].txt[len] = 0; + list_size++; + if (list_size % 1000 == 0) { + printf("loaded %d\r", list_size); + fflush(stdout); + } +} + +int main(int argc, char **argv) +{ + FILE *fin, *fout; + char buf[1024]; + int ret, i, count; + struct block_list *list2; + struct stat st; + + fin = fopen(argv[1], "r"); + fout = fopen(argv[2], "w"); + if (!fin || !fout) { + printf("Usage: ./program \n"); + perror("open: "); + exit(2); + } + + fstat(fileno(fin), &st); + max_size = st.st_size / 100; /* hack ... */ + + list = malloc(max_size * sizeof(*list)); + + for(;;) { + ret = read_block(buf, fin); + if (ret < 0) + break; + + buf[ret] = '\0'; + add_list(buf, ret); + } + + printf("loaded %d\n", list_size); + + printf("sorting ....\n"); + + qsort(list, list_size, sizeof(list[0]), compare_txt); + + list2 = malloc(sizeof(*list) * list_size); + + printf("culling\n"); + + for (i=count=0;i +#include +static ssize_t +read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos) +{ + unsigned long pfn; + struct page *page; + char *kbuf, *modname; + const char *symname; + int ret = 0; + char namebuf[128]; + unsigned long offset = 0, symsize; + int i; + ssize_t num_written = 0; + int blocktype = 0, pagetype = 0; + + page = NULL; + pfn = min_low_pfn + *ppos; + + /* Find a valid PFN or the start of a MAX_ORDER_NR_PAGES area */ + while (!pfn_valid(pfn) && (pfn & (MAX_ORDER_NR_PAGES - 1)) != 0) + pfn++; + + /* Find an allocated page */ + for (; pfn < max_pfn; pfn++) { + /* + * If the new page is in a new MAX_ORDER_NR_PAGES area, + * validate the area as existing, skip it if not + */ + if ((pfn & (MAX_ORDER_NR_PAGES - 1)) == 0 && !pfn_valid(pfn)) { + pfn += MAX_ORDER_NR_PAGES - 1; + continue; + } + + /* Check for holes within a MAX_ORDER area */ + if (!pfn_valid_within(pfn)) + continue; + + page = pfn_to_page(pfn); + + /* Catch situations where free pages have a bad ->order */ + if (page->order >= 0 && PageBuddy(page)) + printk(KERN_WARNING + "PageOwner info inaccurate 
for PFN %lu\n", + pfn); + + /* Stop search if page is allocated and has trace info */ + if (page->order >= 0 && page->trace[0]) + break; + } + + if (!pfn_valid(pfn)) + return 0; + + /* Record the next PFN to read in the file offset */ + *ppos = (pfn - min_low_pfn) + 1; + + kbuf = kmalloc(count, GFP_KERNEL); + if (!kbuf) + return -ENOMEM; + + ret = snprintf(kbuf, count, "Page allocated via order %d, mask 0x%x\n", + page->order, page->gfp_mask); + if (ret >= count) { + ret = -ENOMEM; + goto out; + } + + /* Print information relevant to grouping pages by mobility */ + blocktype = get_pageblock_migratetype(page); + pagetype = allocflags_to_migratetype(page->gfp_mask); + ret += snprintf(kbuf+ret, count-ret, + "PFN %lu Block %lu type %d %s " + "Flags %s%s%s%s%s%s%s%s%s%s%s%s\n", + pfn, + pfn >> pageblock_order, + blocktype, + blocktype != pagetype ? "Fallback" : " ", + PageLocked(page) ? "K" : " ", + PageError(page) ? "E" : " ", + PageReferenced(page) ? "R" : " ", + PageUptodate(page) ? "U" : " ", + PageDirty(page) ? "D" : " ", + PageLRU(page) ? "L" : " ", + PageActive(page) ? "A" : " ", + PageSlab(page) ? "S" : " ", + PageWriteback(page) ? "W" : " ", + PageCompound(page) ? "C" : " ", + PageSwapCache(page) ? "B" : " ", + PageMappedToDisk(page) ? 
"M" : " "); + if (ret >= count) { + ret = -ENOMEM; + goto out; + } + + num_written = ret; + + for (i = 0; i < 8; i++) { + if (!page->trace[i]) + break; + symname = kallsyms_lookup(page->trace[i], &symsize, &offset, + &modname, namebuf); + ret = snprintf(kbuf + num_written, count - num_written, + "[0x%lx] %s+%lu\n", + page->trace[i], namebuf, offset); + if (ret >= count - num_written) { + ret = -ENOMEM; + goto out; + } + num_written += ret; + } + + ret = snprintf(kbuf + num_written, count - num_written, "\n"); + if (ret >= count - num_written) { + ret = -ENOMEM; + goto out; + } + + num_written += ret; + ret = num_written; + + if (copy_to_user(buf, kbuf, ret)) + ret = -EFAULT; +out: + kfree(kbuf); + return ret; +} + +static struct file_operations proc_page_owner_operations = { + .read = read_page_owner, +}; +#endif + struct proc_dir_entry *proc_root_kcore; void __init proc_misc_init(void) @@ -932,4 +1066,15 @@ void __init proc_misc_init(void) #ifdef CONFIG_PROC_VMCORE proc_vmcore = proc_create("vmcore", S_IRUSR, NULL, &proc_vmcore_operations); #endif +#ifdef CONFIG_PAGE_OWNER + { + struct proc_dir_entry *entry; + entry = create_proc_entry("page_owner", + S_IWUSR | S_IRUGO, NULL); + if (entry) { + entry->proc_fops = &proc_page_owner_operations; + entry->size = 1024; + } + } +#endif } diff -puN include/linux/mm_types.h~page-owner-tracking-leak-detector include/linux/mm_types.h --- a/include/linux/mm_types.h~page-owner-tracking-leak-detector +++ a/include/linux/mm_types.h @@ -101,6 +101,11 @@ struct page { #ifdef CONFIG_KMEMCHECK void *shadow; #endif +#ifdef CONFIG_PAGE_OWNER + int order; + unsigned int gfp_mask; + unsigned long trace[8]; +#endif }; /* diff -puN lib/Kconfig.debug~page-owner-tracking-leak-detector lib/Kconfig.debug --- a/lib/Kconfig.debug~page-owner-tracking-leak-detector +++ a/lib/Kconfig.debug @@ -66,6 +66,16 @@ config UNUSED_SYMBOLS you really need it, and what the merge plan to the mainline kernel for your module is. 
+config PAGE_OWNER + bool "Track page owner" + depends on DEBUG_KERNEL + help + This keeps track of what call chain is the owner of a page, may + help to find bare alloc_page(s) leaks. Eats a fair amount of memory. + See Documentation/page_owner.c for user-space helper. + + If unsure, say N. + config DEBUG_FS bool "Debug Filesystem" depends on SYSFS diff -puN mm/page_alloc.c~page-owner-tracking-leak-detector mm/page_alloc.c --- a/mm/page_alloc.c~page-owner-tracking-leak-detector +++ a/mm/page_alloc.c @@ -316,6 +316,9 @@ static inline void set_page_order(struct { set_page_private(page, order); __SetPageBuddy(page); +#ifdef CONFIG_PAGE_OWNER + page->order = -1; +#endif } static inline void rmv_page_order(struct page *page) @@ -1434,6 +1437,62 @@ try_next_zone: return page; } +#ifdef CONFIG_PAGE_OWNER +static inline int valid_stack_ptr(struct thread_info *tinfo, void *p) +{ + return p > (void *)tinfo && + p < (void *)tinfo + THREAD_SIZE - 3; +} + +static inline void __stack_trace(struct page *page, unsigned long *stack, + unsigned long bp) +{ + int i = 0; + unsigned long addr; + struct thread_info *tinfo = (struct thread_info *) + ((unsigned long)stack & (~(THREAD_SIZE - 1))); + + memset(page->trace, 0, sizeof(long) * 8); + +#ifdef CONFIG_FRAME_POINTER + if (bp) { + while (valid_stack_ptr(tinfo, (void *)bp)) { + addr = *(unsigned long *)(bp + sizeof(long)); + page->trace[i] = addr; + if (++i >= 8) + break; + bp = *(unsigned long *)bp; + } + return; + } +#endif /* CONFIG_FRAME_POINTER */ + while (valid_stack_ptr(tinfo, stack)) { + addr = *stack++; + if (__kernel_text_address(addr)) { + page->trace[i] = addr; + if (++i >= 8) + break; + } + } +} + +static void set_page_owner(struct page *page, unsigned int order, + unsigned int gfp_mask) +{ + unsigned long address; + unsigned long bp = 0; +#ifdef CONFIG_X86_64 + asm ("movq %%rbp, %0" : "=r" (bp) : ); +#endif +#ifdef CONFIG_X86_32 + asm ("movl %%ebp, %0" : "=r" (bp) : ); +#endif + page->order = (int) order; + 
page->gfp_mask = gfp_mask; + __stack_trace(page, &address, bp); +} +#endif /* CONFIG_PAGE_OWNER */ + /* * This is the 'heart' of the zoned buddy allocator. */ @@ -1638,6 +1697,10 @@ nopage: show_mem(); } got_pg: +#ifdef CONFIG_PAGE_OWNER + if (page) + set_page_owner(page, order, gfp_mask); +#endif return page; } EXPORT_SYMBOL(__alloc_pages_internal); @@ -2635,6 +2698,9 @@ void __meminit memmap_init_zone(unsigned if (!is_highmem_idx(zone)) set_page_address(page, __va(pfn << PAGE_SHIFT)); #endif +#ifdef CONFIG_PAGE_OWNER + page->order = -1; +#endif } } diff -puN mm/vmstat.c~page-owner-tracking-leak-detector mm/vmstat.c --- a/mm/vmstat.c~page-owner-tracking-leak-detector +++ a/mm/vmstat.c @@ -15,6 +15,7 @@ #include #include #include +#include "internal.h" #ifdef CONFIG_VM_EVENT_COUNTERS DEFINE_PER_CPU(struct vm_event_state, vm_event_states) = {{0}}; @@ -560,6 +561,97 @@ static int pagetypeinfo_showblockcount(s return 0; } +#ifdef CONFIG_PAGE_OWNER +static void pagetypeinfo_showmixedcount_print(struct seq_file *m, + pg_data_t *pgdat, + struct zone *zone) +{ + int mtype, pagetype; + unsigned long pfn; + unsigned long start_pfn = zone->zone_start_pfn; + unsigned long end_pfn = start_pfn + zone->spanned_pages; + unsigned long count[MIGRATE_TYPES] = { 0, }; + + /* Align PFNs to pageblock_nr_pages boundary */ + pfn = start_pfn & ~(pageblock_nr_pages-1); + + /* + * Walk the zone in pageblock_nr_pages steps. If a page block spans + * a zone boundary, it will be double counted between zones. 
This does + * not matter as the mixed block count will still be correct + */ + for (; pfn < end_pfn; pfn += pageblock_nr_pages) { + struct page *page; + unsigned long offset = 0; + + /* Do not read before the zone start, use a valid page */ + if (pfn < start_pfn) + offset = start_pfn - pfn; + + if (!pfn_valid(pfn + offset)) + continue; + + page = pfn_to_page(pfn + offset); + mtype = get_pageblock_migratetype(page); + + /* Check the block for bad migrate types */ + for (; offset < pageblock_nr_pages; offset++) { + /* Do not go past the end of the zone */ + if (pfn + offset >= end_pfn) + break; + + if (!pfn_valid_within(pfn + offset)) + continue; + + page = pfn_to_page(pfn + offset); + + /* Skip free pages */ + if (PageBuddy(page)) { + offset += (1UL << page_order(page)) - 1UL; + continue; + } + if (page->order < 0) + continue; + + pagetype = allocflags_to_migratetype(page->gfp_mask); + if (pagetype != mtype) { + count[mtype]++; + break; + } + + /* Move to end of this allocation */ + offset += (1 << page->order) - 1; + } + } + + /* Print counts */ + seq_printf(m, "Node %d, zone %8s ", pgdat->node_id, zone->name); + for (mtype = 0; mtype < MIGRATE_TYPES; mtype++) + seq_printf(m, "%12lu ", count[mtype]); + seq_putc(m, '\n'); +} +#endif /* CONFIG_PAGE_OWNER */ + +/* + * Print out the number of pageblocks for each migratetype that contain pages + * of other types. This gives an indication of how well fallbacks are being + * contained by rmqueue_fallback(). 
It requires information from PAGE_OWNER + * to determine what is going on + */ +static void pagetypeinfo_showmixedcount(struct seq_file *m, pg_data_t *pgdat) +{ +#ifdef CONFIG_PAGE_OWNER + int mtype; + + seq_printf(m, "\n%-23s", "Number of mixed blocks "); + for (mtype = 0; mtype < MIGRATE_TYPES; mtype++) + seq_printf(m, "%12s ", migratetype_names[mtype]); + seq_putc(m, '\n'); + + walk_zones_in_node(m, pgdat, pagetypeinfo_showmixedcount_print); +#endif /* CONFIG_PAGE_OWNER */ +} + /* * This prints out statistics in relation to grouping pages by mobility. * It is expensive to collect so do not constantly read the file. @@ -577,6 +669,7 @@ static int pagetypeinfo_show(struct seq_ seq_putc(m, '\n'); pagetypeinfo_showfree(m, pgdat); pagetypeinfo_showblockcount(m, pgdat); + pagetypeinfo_showmixedcount(m, pgdat); return 0; } _ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx156.postini.com [74.125.245.156]) by kanga.kvack.org (Postfix) with SMTP id 14A056B0070 for ; Fri, 16 Nov 2012 18:59:26 -0500 (EST) Received: from /spool/local by e3.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Fri, 16 Nov 2012 18:59:24 -0500 Received: from d01relay03.pok.ibm.com (d01relay03.pok.ibm.com [9.56.227.235]) by d01dlp03.pok.ibm.com (Postfix) with ESMTP id 36D69C9003C for ; Fri, 16 Nov 2012 18:59:22 -0500 (EST) Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64]) by d01relay03.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id qAGNxMFg279892 for ; Fri, 16 Nov 2012 18:59:22 -0500 Received: from d01av04.pok.ibm.com (loopback [127.0.0.1]) by d01av04.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id qAGNxLgS024010 for ; Fri, 16 Nov 2012 18:59:21 -0500 Message-ID: <50A6D357.3070103@linux.vnet.ibm.com> Date: Fri, 16 Nov 2012 15:59:19 -0800 From: Dave Hansen MIME-Version: 1.0 Subject: Re: [Bug 50181] New: Memory usage doubles after more then 20 hours of uptime. References: <20121113140352.4d2db9e8.akpm@linux-foundation.org> <1352988349.6409.4.camel@c2d-desktop.mypicture.info> <20121115141258.8e5cc669.akpm@linux-foundation.org> <1353021103.6409.31.camel@c2d-desktop.mypicture.info> <50A68718.3070002@linux.vnet.ibm.com> <20121116111559.63ec1622.akpm@linux-foundation.org> In-Reply-To: <20121116111559.63ec1622.akpm@linux-foundation.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Milos Jakovljevic , bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org On 11/16/2012 11:15 AM, Andrew Morton wrote: > On Fri, 16 Nov 2012 10:34:00 -0800 > Dave Hansen wrote: >> Anybody have ideas what to try next or want to poke holes in my >> statistics? :) > > Maybe resurrect the below patch? It's probably six years old. It > should allow us to find out who allocated those pages. > > Then perhaps we should merge the sucker this time. I at least got the sucker recompiling. 
Not my finest work, but here goes: http://sr71.net/~dave/linux/leak-20121113/pageowner_for_3.7-rc5.patch It's not pretty, and probably needs to (at least) get moved over to debugfs before getting merged, but it does appear to give some reasonable output. Figured I'd post it in case anyone else wants to give it a spin. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx174.postini.com [74.125.245.174]) by kanga.kvack.org (Postfix) with SMTP id 57FCE6B004D for ; Sat, 17 Nov 2012 18:08:56 -0500 (EST) Received: from /spool/local by e8.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Sat, 17 Nov 2012 18:08:55 -0500 Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by d01dlp01.pok.ibm.com (Postfix) with ESMTP id A73B838C8039 for ; Sat, 17 Nov 2012 18:08:51 -0500 (EST) Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217]) by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id qAHN8oxl295424 for ; Sat, 17 Nov 2012 18:08:51 -0500 Received: from d01av03.pok.ibm.com (loopback [127.0.0.1]) by d01av03.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id qAHN8oO1018112 for ; Sat, 17 Nov 2012 21:08:50 -0200 Message-ID: <50A81900.8000801@linux.vnet.ibm.com> Date: Sat, 17 Nov 2012 15:08:48 -0800 From: Dave Hansen MIME-Version: 1.0 Subject: 3.7-rc6 memory accounting problem (manifests like a memory leak) References: <20121113140352.4d2db9e8.akpm@linux-foundation.org> <1352988349.6409.4.camel@c2d-desktop.mypicture.info> <20121115141258.8e5cc669.akpm@linux-foundation.org> <1353021103.6409.31.camel@c2d-desktop.mypicture.info> <50A68718.3070002@linux.vnet.ibm.com> <20121116111559.63ec1622.akpm@linux-foundation.org> <50A6D357.3070103@linux.vnet.ibm.com> 
In-Reply-To: <50A6D357.3070103@linux.vnet.ibm.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Milos Jakovljevic , bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org, Bartlomiej Zolnierkiewicz , Kyungmin Park page_owner didn't help, at least not directly. As the pages "leak", they stop showing up in page_owner, which means they're in the allocator. Check out buddyinfo/meminfo: > dave@nimitz:~/ltc/linux.git$ cat /proc/buddyinfo /proc/meminfo > Node 0, zone DMA 0 0 0 1 2 1 1 0 1 1 3 > Node 0, zone DMA32 25450 13645 11665 2994 1242 665 234 50 12 1 1 > Node 0, zone Normal 6494 28630 16790 5872 3524 1666 844 238 146 60 398 > MemTotal: 7825604 kB > MemFree: 1285260 kB ... Just the 398 order-10 zone Normal pages account for ~1.6GB of free memory, yet the MemFree is ~1.3GB, and that's a *SINGLE* bucket in the buddy allocator. Adding them all up, it's fairly close to the amount of memory that I'm missing at the moment. Rather than being a real leak, it looks like this might just be an accounting problem: $ cat /proc/zoneinfo | egrep 'free_pages|Node' Node 0, zone DMA nr_free_pages 3976 Node 0, zone DMA32 nr_free_pages 177041 Node 0, zone Normal nr_free_pages 16148 That 16148 pages for ZONE_NORMAL is obviously bogus compared to what buddyinfo is saying. Commit d1ce749a0d did mess with NR_FREE_PAGES accounting quite a bit. Guess I'll try a revert and see where I end up. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx169.postini.com [74.125.245.169]) by kanga.kvack.org (Postfix) with SMTP id D3B406B002B for ; Mon, 19 Nov 2012 11:44:58 -0500 (EST) Received: from /spool/local by e9.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Mon, 19 Nov 2012 11:44:55 -0500 Received: from d01relay01.pok.ibm.com (d01relay01.pok.ibm.com [9.56.227.233]) by d01dlp03.pok.ibm.com (Postfix) with ESMTP id 1BACAC9006F for ; Mon, 19 Nov 2012 11:44:54 -0500 (EST) Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64]) by d01relay01.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id qAJGirx4283754 for ; Mon, 19 Nov 2012 11:44:53 -0500 Received: from d01av04.pok.ibm.com (loopback [127.0.0.1]) by d01av04.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id qAJGircY011504 for ; Mon, 19 Nov 2012 11:44:53 -0500 Message-ID: <50AA6203.4010407@linux.vnet.ibm.com> Date: Mon, 19 Nov 2012 08:44:51 -0800 From: Dave Hansen MIME-Version: 1.0 Subject: Re: [Bug 50181] New: Memory usage doubles after more then 20 hours of uptime. References: <20121113140352.4d2db9e8.akpm@linux-foundation.org> <1352988349.6409.4.camel@c2d-desktop.mypicture.info> <20121115141258.8e5cc669.akpm@linux-foundation.org> <1353021103.6409.31.camel@c2d-desktop.mypicture.info> <50A68718.3070002@linux.vnet.ibm.com> <20121116111559.63ec1622.akpm@linux-foundation.org> <50A6D357.3070103@linux.vnet.ibm.com> In-Reply-To: <50A6D357.3070103@linux.vnet.ibm.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Milos Jakovljevic , bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org I managed to reproduce this on a second machine. The new system ran basically all weekend doing kernel compiles: no leak. But, I added some memory pressure, and made it start allocating a bunch of hugetlbfs pages. That made this bug kick in there too. It's somewhat hard to tell, but I _think_ the leaking is correlated with compaction activity. I'm trying a bisect now. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
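For anyone wanting to repeat the buddyinfo-versus-nr_free_pages comparison from the 3.7-rc6 accounting message earlier in this thread, here is a minimal user-space sketch. It assumes the usual /proc/buddyinfo row layout, in which column k (counting from 0) holds the number of free blocks of 2^k pages. The function name buddy_free_pages is made up for illustration and is not part of the page_owner patch or of any kernel interface.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Sum the free pages described by one /proc/buddyinfo row, e.g.
 *   "Node 0, zone   Normal   6494  28630 ...".
 * Column k (starting at 0) counts free blocks of 2^k pages.
 * Returns -1 if the line does not look like a buddyinfo row.
 * (Hypothetical helper, for illustration only.)
 */
long buddy_free_pages(const char *line)
{
	int node, consumed = 0;
	char zone[32];
	long total = 0;
	unsigned int order;
	const char *p;
	char *end;

	/* %n records how many characters the row prefix consumed */
	if (sscanf(line, "Node %d, zone %31s%n", &node, zone, &consumed) != 2)
		return -1;

	p = line + consumed;
	for (order = 0; order < 32; order++) {
		unsigned long nr = strtoul(p, &end, 10);

		if (end == p)	/* no more numeric columns */
			break;
		total += (long)(nr << order);
		p = end;
	}
	return total;
}
```

Fed the zone DMA row quoted above, this adds up to 3976 pages, which agrees with the nr_free_pages value zoneinfo reports for that zone. The zone Normal row adds up to 847714 pages, against the 16148 that zoneinfo reports, which is the mismatch the message describes. Converting pages to kB depends on the page size (4 kB on the x86 machines in this thread).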