All of lore.kernel.org
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@redhat.com>
To: Stephane Eranian <eranian@google.com>
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org,
	Ingo Molnar <mingo@kernel.org>, Jiri Olsa <jolsa@redhat.com>,
	namhyung.kim@kernel.org, acme@kernel.org
Subject: Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale
Date: Wed, 15 Mar 2017 10:50:59 -0300	[thread overview]
Message-ID: <20170315135059.GC2177@redhat.com> (raw)
In-Reply-To: <1489561041-19778-1-git-send-email-eranian@google.com>

Em Tue, Mar 14, 2017 at 11:57:21PM -0700, Stephane Eranian escreveu:
> This patch significantly improves the execution time of
> perf_event__synthesize_mmap_events() when running perf record
> on systems where processes have lots of threads. It just happens
> that cat /proc/pid/maps support uses a O(N^2) algorithm to generate
> each map line in the maps file.  If you have 1000 threads, then you have
> necessarily 1000 stacks.  For each vma, you need to check if it corresponds
> to a thread's stack.  With a large number of threads, this can take a very long time. I have seen latencies >> 10mn.
> 
> As of today, perf does not use the fact that a mapping is a stack,
> therefore we can work around the issue by using /proc/pid/tasks/pid/maps.
> This entry does not try to map a vma to stack and is thus much
> faster with no loss of functonality.
> 
> The proc-map-timeout logic is kept in case user still want some uppre limit.
> 
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/util/event.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 4ea7ce7..b137566 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
>  	if (machine__is_default_guest(machine))
>  		return 0;
>  
> -	snprintf(filename, sizeof(filename), "%s/proc/%d/maps",
> -		 machine->root_dir, pid);
> +	snprintf(filename, sizeof(filename), "%s/proc/%d/tasks/%d/maps",
> +		 machine->root_dir, pid, pid);

Humm...

[root@jouet ~]# ps -C firefox -L
  PID   LWP TTY          TIME CMD
25023 25023 tty2     01:32:10 firefox
25023 25048 tty2     00:00:00 gmain
25023 25049 tty2     00:00:00 gdbus
25023 25056 tty2     00:00:00 Gecko_IOThread
25023 25058 tty2     00:00:00 Link Monitor
25023 25059 tty2     00:03:52 Socket Thread
25023 25060 tty2     00:00:00 JS Watchdog
25023 25061 tty2     00:00:12 JS Helper
25023 25062 tty2     00:00:12 JS Helper
25023 25063 tty2     00:00:13 JS Helper
25023 25064 tty2     00:00:12 JS Helper
25023 25065 tty2     00:00:13 JS Helper
25023 25066 tty2     00:00:12 JS Helper
25023 25067 tty2     00:00:13 JS Helper
25023 25068 tty2     00:00:13 JS Helper
25023 25069 tty2     00:00:00 Hang Monitor
25023 25070 tty2     00:00:00 BgHangManager
25023 25117 tty2     00:00:12 Cache2 I/O
25023 25118 tty2     00:05:37 Timer
25023 25146 tty2     00:00:00 DataStorage
25023 25150 tty2     00:00:00 GMPThread
25023 25152 tty2     00:04:12 Compositor
25023 25156 tty2     00:00:02 ImgDecoder #1
25023 25157 tty2     00:00:02 ImgDecoder #2
25023 25158 tty2     00:00:02 ImgDecoder #3
25023 25159 tty2     00:00:00 ImageIO
25023 25160 tty2     00:02:26 SoftwareVsyncTh
25023 25202 tty2     00:00:00 HTML5 Parser
25023 25204 tty2     00:00:00 IPDL Background
25023 25212 tty2     00:00:22 DOM Worker
25023 25219 tty2     00:00:23 URL Classifier
25023 25220 tty2     00:00:05 ImageBridgeChil
25023 25221 tty2     00:00:04 mozStorage #1
25023 25222 tty2     00:00:00 Proxy R~olution
25023 25223 tty2     00:00:00 DataStorage
25023 25234 tty2     00:00:00 Cache I/O
25023 25235 tty2     00:00:01 mozStorage #2
25023 25236 tty2     00:00:00 mozStorage #3
25023 25252 tty2     00:00:20 DOM Worker
25023 25256 tty2     00:00:00 mozStorage #4
25023 25257 tty2     00:00:00 localStorage DB
25023 25260 tty2     00:00:00 mozStorage #5
25023  2066 tty2     00:00:19 DOM Worker
25023  5988 tty2     00:00:00 threaded-ml
25023  6757 tty2     00:00:00 firefox
25023  6999 tty2     00:00:00 mozStorage #6
25023 19239 tty2     00:00:02 mozStorage #7
25023 12038 tty2     00:00:05 threaded-ml
25023 19301 tty2     00:00:00 firefox
[root@jouet ~]# 

[root@jouet ~]# file /proc/25023/tasks
/proc/25023/tasks: cannot open `/proc/25023/tasks' (No such file or directory)
[root@jouet ~]#

But...

[root@jouet ~]# file /proc/25023/task
/proc/25023/task: directory
[root@jouet ~]# wc -l /proc/25023/task/25023/maps
1200 /proc/25023/task/25023/maps
[root@jouet ~]# wc -l /proc/25023/task/25048/maps
1201 /proc/25023/task/25048/maps
[root@jouet ~]# wc -l /proc/25048/task/25048/maps
1201 /proc/25048/task/25048/maps
[root@jouet ~]#

[acme@jouet linux]$ grep \"tasks\"  fs/proc/*.c
[acme@jouet linux]$ grep \"task\"  fs/proc/*.c
fs/proc/base.c:	DIR("task",       S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations),
fs/proc/base.c:	name.name = "task";
[acme@jouet linux]$

These end up mapping to the later of these two:

const struct file_operations proc_pid_maps_operations = {
        .open           = pid_maps_open,
        .read           = seq_read,
        .llseek         = seq_lseek,
        .release        = proc_map_release,
};

const struct file_operations proc_tid_maps_operations = {
        .open           = tid_maps_open,
        .read           = seq_read, 
        .llseek         = seq_lseek,
        .release        = proc_map_release,
};


Which ends up going down to:

show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)

with that is_pid respectively true and false, but then show_map_vma() doesn't
use 'is_pid' at all :-\

/me scratches head, what am I missing?

The following commit, that appeared circa v4.9-rc2. So, fixing up the "tasks"
-> "tasks" we end up with something safe and that avoids this by now
non-existing problem, on older kernels, ok?

commit b18cb64ead400c01bf1580eeba330ace51f8087d
Author: Andy Lutomirski <luto@kernel.org>
Date:   Fri Sep 30 10:58:57 2016 -0700

    fs/proc: Stop trying to report thread stacks
    
    This reverts more of:
    
      b76437579d13 ("procfs: mark thread stack correctly in proc/<pid>/maps")
    
    ... which was partially reverted by:
    
      65376df58217 ("proc: revert /proc/<pid>/maps [stack:TID] annotation")
    
    Originally, /proc/PID/task/TID/maps was the same as /proc/TID/maps.
    
    In current kernels, /proc/PID/maps (or /proc/TID/maps even for
    threads) shows "[stack]" for VMAs in the mm's stack address range.
    
    In contrast, /proc/PID/task/TID/maps uses KSTK_ESP to guess the
    target thread's stack's VMA.  This is racy, probably returns garbage
    and, on arches with CONFIG_TASK_INFO_IN_THREAD=y, is also crash-prone:
    KSTK_ESP is not safe to use on tasks that aren't known to be running
    ordinary process-context kernel code.
    
    This patch removes the difference and just shows "[stack]" for VMAs
    in the mm's stack range.  This is IMO much more sensible -- the
    actual "stack" address really is treated specially by the VM code,
    and the current thread stack isn't even well-defined for programs
    that frequently switch stacks on their own.
    
    Reported-by: Jann Horn <jann@thejh.net>
    Signed-off-by: Andy Lutomirski <luto@kernel.org>
    Acked-by: Thomas Gleixner <tglx@linutronix.de>


>  
>  	fp = fopen(filename, "r");
>  	if (fp == NULL) {
> -- 
> 2.5.0

  parent reply	other threads:[~2017-03-15 13:52 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-15  6:57 [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale Stephane Eranian
2017-03-15 11:33 ` Jiri Olsa
2017-03-15 13:50 ` Arnaldo Carvalho de Melo [this message]
2017-03-15 13:58   ` Arnaldo Carvalho de Melo
2017-03-15 17:08     ` Stephane Eranian
2017-03-15 17:44       ` Arnaldo Carvalho de Melo
2017-03-16  2:27         ` Stephane Eranian
2017-03-15 17:17   ` [PATCH V2] " Stephane Eranian
2017-03-16 16:35     ` [tip:perf/core] perf tools: Make " tip-bot for Stephane Eranian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170315135059.GC2177@redhat.com \
    --to=acme@redhat.com \
    --cc=acme@kernel.org \
    --cc=eranian@google.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung.kim@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.