From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751581AbaKJAT7 (ORCPT ); Sun, 9 Nov 2014 19:19:59 -0500
Received: from mail.rmail.be ([85.234.218.189]:48816 "EHLO mail.rmail.be"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751320AbaKJAT6 (ORCPT ); Sun, 9 Nov 2014 19:19:58 -0500
Message-ID:
In-Reply-To: <545FE59B.3020702@suse.cz>
References: <8bdeb6866adef7f2d34a693040c33f12.squirrel@mail.rmail.be>
	<545F566F.7010102@suse.cz>
	<27d4dc6a448169861446f8c1b3c3cadd.squirrel@mail.rmail.be>
	<545FE59B.3020702@suse.cz>
Date: Mon, 10 Nov 2014 00:19:56 -0000
Subject: Re: Memory leaks on atom-based boards?
From: "AL13N"
To: linux-kernel@vger.kernel.org
Cc: "Vlastimil Babka"
User-Agent: SquirrelMail/1.4.22
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

> On 11/09/2014 05:38 PM, AL13N wrote:
>>> On 10/27/2014 07:44 PM, AL13N wrote:
>>>
>>> Hi, this does look like a kernel memory leak. There was recently a
>>> known one, fixed by a patch from https://lkml.org/lkml/2014/10/15/447,
>>> which made it into 3.18-rc3 and should be backported to stable
>>> kernels 3.8+ soon.
>>> You would recognize whether this is the fix for you by checking the
>>> thp_zero_page_alloc value in /proc/vmstat. A value X > 1 basically
>>> means that X*2 MB of memory is leaked.
>>> You say in the serverfault post that 3.17.2 helped, but the fix is
>>> not in 3.17.2... it could just be that the circumstances changed and
>>> THP zero pages are no longer freed and reallocated.
>>> So if you want to be sure, I would suggest trying again with a
>>> version where the problem appeared on your system, and checking
>>> thp_zero_page_alloc. Perhaps you'll see a >1 value even on 3.17.2,
>>> which would mean some leak occurred there as well, just not as severe.
>>
>>
>> i was gonna tell you guys, but i was waiting until i was sure: indeed
>> 3.17.2 fixed it. Where before i had OOM after 3, maybe 4 days (for at
>> least 2 months), i'm now up more than 4 days and MemAvailable is
>> still high enough, at about 3.5GB, whereas otherwise it would dwindle
>> to 0 (at about 1GB/day).
>>
>> Well, it reads 0 on 3.17.2... so... i guess not? i'll keep this value
>> under observation...
>
> Hm, 0 sounds like nobody was allocating transparent huge pages at all.
> What about the other thp_* stats?

thp_fault_alloc 0
thp_fault_fallback 0
thp_collapse_alloc 0
thp_collapse_alloc_failed 0
thp_split 0
thp_zero_page_alloc 0
thp_zero_page_alloc_failed 0

i guess on 3.17.2 there's something that doesn't allocate thp? either
that, or it was a different issue after all...

>>>> - How can i find out what is allocating all this memory?
>>>
>>> There's no simple way, unfortunately. Checking the kpageflags /proc
>>> file might help. IIRC there used to be a patch in the -mm tree to
>>> store who allocated what page, but it might be bitrotted.
>>
>>
>> i checked what was in kpageflags (or kpagecount) but it's all some
>> kind of binary stuff...
>>
>> do i need some tool to interpret these values?
>
> There's tools/vm/page-types.c in the kernel sources which can read
> kpageflags, but not kpagecount...

good to know...
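
for keeping thp_zero_page_alloc under observation, something like this
small C program should do; it's just a minimal sketch (untested, file
name is made up) that pulls the counter out of /proc/vmstat and applies
the X*2 MB reading from the explanation above:

/* watch_thp.c: minimal sketch; reads thp_zero_page_alloc from
 * /proc/vmstat. Per the explanation above, a value X > 1 would
 * suggest roughly X*2 MB tied up in leaked huge zero pages. */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/vmstat", "r");
	char line[256];
	unsigned long long val;

	if (!f) {
		perror("fopen /proc/vmstat");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		/* matches only thp_zero_page_alloc; the _failed line
		 * fails the %llu conversion and is skipped */
		if (sscanf(line, "thp_zero_page_alloc %llu", &val) == 1)
			printf("thp_zero_page_alloc = %llu (~%llu MB)\n",
			       val, val * 2);
	}
	fclose(f);
	return 0;
}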
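
also, one thing that would explain all the thp_* counters sitting at 0
is THP simply being disabled (policy "never", or a kernel built without
CONFIG_TRANSPARENT_HUGEPAGE). a quick check, assuming the standard
sysfs path:

/* thp_enabled.c: sketch; prints the current THP policy from
 * /sys/kernel/mm/transparent_hugepage/enabled. The active choice
 * is shown in [brackets], e.g. "always [madvise] never". If the
 * file is absent, the kernel was built without THP support. */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");
	char buf[128];

	if (!f) {
		perror("no THP sysfs file (kernel built without THP?)");
		return 1;
	}
	if (fgets(buf, sizeof(buf), f))
		printf("THP policy: %s", buf);
	fclose(f);
	return 0;
}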
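
as for the kpagecount side that page-types.c doesn't cover: the format
is documented in Documentation/vm/pagemap.txt, and it's simply one
64-bit map count per physical page, indexed by page frame number (PFN),
so the entry for PFN n sits at byte offset n * 8. a sketch that reads
the count for one PFN given on the command line (needs root):

/* kpagecount.c: sketch; reads one entry from /proc/kpagecount,
 * i.e. how many times the page at the given PFN is mapped. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	uint64_t pfn, count;
	int fd;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <pfn>\n", argv[0]);
		return 1;
	}
	pfn = strtoull(argv[1], NULL, 0);
	fd = open("/proc/kpagecount", O_RDONLY);
	if (fd < 0) {
		perror("open /proc/kpagecount (needs root)");
		return 1;
	}
	/* entry for PFN n is a 64-bit count at offset n * 8 */
	if (pread(fd, &count, sizeof(count), pfn * sizeof(count))
	    != sizeof(count)) {
		fprintf(stderr, "PFN %#llx out of range?\n",
			(unsigned long long)pfn);
		close(fd);
		return 1;
	}
	printf("PFN %#llx is mapped %llu times\n",
	       (unsigned long long)pfn, (unsigned long long)count);
	close(fd);
	return 0;
}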