Date: Wed, 15 Mar 2017 12:33:50 +0100
From: Jiri Olsa
To: Stephane Eranian
Cc: linux-kernel@vger.kernel.org, acme@redhat.com, peterz@infradead.org, mingo@elte.hu, namhyung.kim@kernel.org
Subject: Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale
Message-ID: <20170315113350.GA18147@krava>
References: <1489561041-19778-1-git-send-email-eranian@google.com>
In-Reply-To: <1489561041-19778-1-git-send-email-eranian@google.com>

On Tue, Mar 14, 2017 at 11:57:21PM -0700, Stephane Eranian wrote:
> This patch significantly improves the execution time of
> perf_event__synthesize_mmap_events() when running perf record
> on systems where processes have lots of threads. It just happens
> that cat /proc/pid/maps support uses an O(N^2) algorithm to generate
> each map line in the maps file. If you have 1000 threads, then you have
> necessarily 1000 stacks.
> For each vma, you need to check if it corresponds
> to a thread's stack. With a large number of threads, this can take a very
> long time. I have seen latencies >> 10 minutes.
>
> As of today, perf does not use the fact that a mapping is a stack,
> therefore we can work around the issue by using /proc/pid/task/pid/maps.
> This entry does not try to map a vma to a stack and is thus much
> faster with no loss of functionality.
>
> The proc-map-timeout logic is kept in case users still want some upper limit.
>
> Signed-off-by: Stephane Eranian
> ---
>  tools/perf/util/event.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 4ea7ce7..b137566 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
>  	if (machine__is_default_guest(machine))
>  		return 0;
>
> -	snprintf(filename, sizeof(filename), "%s/proc/%d/maps",
> -		 machine->root_dir, pid);
> +	snprintf(filename, sizeof(filename), "%s/proc/%d/task/%d/maps",
> +		 machine->root_dir, pid, pid);
>
>  	fp = fopen(filename, "r");
>  	if (fp == NULL) {
> --
> 2.5.0
>

nice..

Acked-by: Jiri Olsa

thanks,
jirka
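
For anyone who wants to try the path change outside of perf, here is a minimal standalone sketch, not the perf code itself. The helper names build_task_maps_path() and count_map_lines() are hypothetical, and root_dir only stands in for machine->root_dir from the patch. procfs names the per-thread directory "task" (see proc(5)), so the sketch uses that spelling.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: build the maps path of the main thread of `pid`.
 * `root_dir` stands in for machine->root_dir ("" for the host).
 * Returns 0 on success, -1 if the path did not fit in `buf`.
 */
static int build_task_maps_path(char *buf, size_t len,
				const char *root_dir, pid_t pid)
{
	int n = snprintf(buf, len, "%s/proc/%d/task/%d/maps",
			 root_dir, (int)pid, (int)pid);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: count the lines (one per mapping) in a maps file.
 * Returns -1 if the file cannot be opened.
 */
static long count_map_lines(const char *path)
{
	FILE *fp = fopen(path, "r");
	long lines = 0;
	int c;

	if (fp == NULL)
		return -1;

	/* Count newlines rather than using a fixed-size line buffer. */
	while ((c = fgetc(fp)) != EOF)
		if (c == '\n')
			lines++;

	fclose(fp);
	return lines;
}
```

On a Linux host, calling build_task_maps_path(buf, sizeof(buf), "", getpid()) and passing buf to count_map_lines() should give the number of mappings of the calling process.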