From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1760593AbcJRNN4 (ORCPT ); Tue, 18 Oct 2016 09:13:56 -0400 Received: from mail-lf0-f68.google.com ([209.85.215.68]:33407 "EHLO mail-lf0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755097AbcJRNNr (ORCPT ); Tue, 18 Oct 2016 09:13:47 -0400 Date: Tue, 18 Oct 2016 15:13:43 +0200 From: Michal Hocko To: Janani Ravichandran Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [RFC] scripts: Include postprocessing script for memory allocation tracing Message-ID: <20161018131343.GJ12092@dhcp22.suse.cz> References: <20160912121635.GL14524@dhcp22.suse.cz> <0ACE5927-A6E5-4B49-891D-F990527A9F50@gmail.com> <20160919094224.GH10785@dhcp22.suse.cz> <20160923080709.GB4478@dhcp22.suse.cz> <2D27EF16-B63B-4516-A156-5E2FB675A1BB@gmail.com> <20161016073340.GA15839@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.6.0 (2016-04-01) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon 17-10-16 13:31:57, Janani Ravichandran wrote: > > > On Oct 17, 2016, at 1:24 PM, Janani Ravichandran wrote: > > > > > > On Sun, Oct 16, 2016 at 3:33 AM, Michal Hocko > wrote: > > > > trace_mm_page_alloc will tell you details about the allocation, like > > gfp mask, order but it doesn't tell you how long the allocation took at > > its current form. So either you have to note jiffies at the allocation > > start and then add the end-start in the trace point or we really need > > another trace point to note the start. The later has an advantage that > > we do not add unnecessary load for jiffies when the tracepoint is > > disabled. > > The function graph tracer can tell us how long alloc_pages_nodemask() took. > Can’t that, combined with the context information given by trace_mm_page_alloc > give us what we want? Correct me if I am wrong. yes, function_graph tracer will give you _some_ information but it will not have the context you are looking for, right? See the following example ------------------------------------------ 0) x-www-b-22756 => x-termi-4083 ------------------------------------------ 0) | __alloc_pages_nodemask() { 0) | /* mm_page_alloc: page=ffffea000411b380 pfn=1066702 order=0 migratetype=0 gfp_flags=GFP_KERNEL */ 0) 3.328 us | } 3) | __alloc_pages_nodemask() { 3) | /* mm_page_alloc: page=ffffea0008f1f6c0 pfn=2344923 order=0 migratetype=0 gfp_flags=GFP_KERNEL */ 3) 1.011 us | } 0) | __alloc_pages_nodemask() { 0) | /* mm_page_alloc: page=ffffea000411b380 pfn=1066702 order=0 migratetype=0 gfp_flags=GFP_KERNEL */ 0) 0.587 us | } 3) | __alloc_pages_nodemask() { 3) | /* mm_page_alloc: page=ffffea0008f1f6c0 pfn=2344923 order=0 migratetype=0 gfp_flags=GFP_KERNEL */ 3) 1.125 us | } How do I know which process has performed those allocations? I know that CPU0 should be running x-termi-4083 but what is running on other CPUs? Let me explain my usecase I am very interested in. Say I that a usespace application is not performing well. I would like to see some statistics about memory allocations performed for that app - are there few outliers or the allocation stalls increase gradually? Where do we spend time during that allocation? Reclaim LRU pages? Compaction or the slab shrinkers? To answer those questions I need to track particular events (alocation, reclaim, compaction) to the process and know how long each step took. Maybe we can reconstruct something from the above output but it is a major PITA. If we either hard start/stop pairs for each step (which we already do have for reclaim, compaction AFAIR) then this is an easy scripting. Another option would be to have only a single tracepoint for each step with a timing information. See my point? -- Michal Hocko SUSE Labs