* [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale @ 2017-03-15 6:57 Stephane Eranian 2017-03-15 11:33 ` Jiri Olsa 2017-03-15 13:50 ` Arnaldo Carvalho de Melo 0 siblings, 2 replies; 9+ messages in thread From: Stephane Eranian @ 2017-03-15 6:57 UTC (permalink / raw) To: linux-kernel; +Cc: acme, peterz, mingo, jolsa, namhyung.kim This patch significantly improves the execution time of perf_event__synthesize_mmap_events() when running perf record on systems where processes have lots of threads. It just happens that cat /proc/pid/maps support uses a O(N^2) algorithm to generate each map line in the maps file. If you have 1000 threads, then you have necessarily 1000 stacks. For each vma, you need to check if it corresponds to a thread's stack. With a large number of threads, this can take a very long time. I have seen latencies >> 10mn. As of today, perf does not use the fact that a mapping is a stack, therefore we can work around the issue by using /proc/pid/tasks/pid/maps. This entry does not try to map a vma to stack and is thus much faster with no loss of functonality. The proc-map-timeout logic is kept in case user still want some uppre limit. Signed-off-by: Stephane Eranian <eranian@google.com> --- tools/perf/util/event.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index 4ea7ce7..b137566 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool, if (machine__is_default_guest(machine)) return 0; - snprintf(filename, sizeof(filename), "%s/proc/%d/maps", - machine->root_dir, pid); + snprintf(filename, sizeof(filename), "%s/proc/%d/tasks/%d/maps", + machine->root_dir, pid, pid); fp = fopen(filename, "r"); if (fp == NULL) { -- 2.5.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale 2017-03-15 6:57 [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale Stephane Eranian @ 2017-03-15 11:33 ` Jiri Olsa 2017-03-15 13:50 ` Arnaldo Carvalho de Melo 1 sibling, 0 replies; 9+ messages in thread From: Jiri Olsa @ 2017-03-15 11:33 UTC (permalink / raw) To: Stephane Eranian; +Cc: linux-kernel, acme, peterz, mingo, namhyung.kim On Tue, Mar 14, 2017 at 11:57:21PM -0700, Stephane Eranian wrote: > This patch significantly improves the execution time of > perf_event__synthesize_mmap_events() when running perf record > on systems where processes have lots of threads. It just happens > that cat /proc/pid/maps support uses a O(N^2) algorithm to generate > each map line in the maps file. If you have 1000 threads, then you have > necessarily 1000 stacks. For each vma, you need to check if it corresponds > to a thread's stack. With a large number of threads, this can take a very long time. I have seen latencies >> 10mn. > > As of today, perf does not use the fact that a mapping is a stack, > therefore we can work around the issue by using /proc/pid/tasks/pid/maps. > This entry does not try to map a vma to stack and is thus much > faster with no loss of functonality. > > The proc-map-timeout logic is kept in case user still want some uppre limit. > > Signed-off-by: Stephane Eranian <eranian@google.com> > --- > tools/perf/util/event.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c > index 4ea7ce7..b137566 100644 > --- a/tools/perf/util/event.c > +++ b/tools/perf/util/event.c > @@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool, > if (machine__is_default_guest(machine)) > return 0; > > - snprintf(filename, sizeof(filename), "%s/proc/%d/maps", > - machine->root_dir, pid); > + snprintf(filename, sizeof(filename), "%s/proc/%d/tasks/%d/maps", > + machine->root_dir, pid, pid); > > fp = fopen(filename, "r"); > if (fp == NULL) { > -- > 2.5.0 > nice.. Acked-by: Jiri Olsa <jolsa@kernel.org> thanks, jirka ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale 2017-03-15 6:57 [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale Stephane Eranian 2017-03-15 11:33 ` Jiri Olsa @ 2017-03-15 13:50 ` Arnaldo Carvalho de Melo 2017-03-15 13:58 ` Arnaldo Carvalho de Melo 2017-03-15 17:17 ` [PATCH V2] " Stephane Eranian 1 sibling, 2 replies; 9+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-03-15 13:50 UTC (permalink / raw) To: Stephane Eranian Cc: linux-kernel, peterz, Ingo Molnar, Jiri Olsa, namhyung.kim, acme Em Tue, Mar 14, 2017 at 11:57:21PM -0700, Stephane Eranian escreveu: > This patch significantly improves the execution time of > perf_event__synthesize_mmap_events() when running perf record > on systems where processes have lots of threads. It just happens > that cat /proc/pid/maps support uses a O(N^2) algorithm to generate > each map line in the maps file. If you have 1000 threads, then you have > necessarily 1000 stacks. For each vma, you need to check if it corresponds > to a thread's stack. With a large number of threads, this can take a very long time. I have seen latencies >> 10mn. > > As of today, perf does not use the fact that a mapping is a stack, > therefore we can work around the issue by using /proc/pid/tasks/pid/maps. > This entry does not try to map a vma to stack and is thus much > faster with no loss of functonality. > > The proc-map-timeout logic is kept in case user still want some uppre limit. > > Signed-off-by: Stephane Eranian <eranian@google.com> > --- > tools/perf/util/event.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c > index 4ea7ce7..b137566 100644 > --- a/tools/perf/util/event.c > +++ b/tools/perf/util/event.c > @@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool, > if (machine__is_default_guest(machine)) > return 0; > > - snprintf(filename, sizeof(filename), "%s/proc/%d/maps", > - machine->root_dir, pid); > + snprintf(filename, sizeof(filename), "%s/proc/%d/tasks/%d/maps", > + machine->root_dir, pid, pid); Humm... [root@jouet ~]# ps -C firefox -L PID LWP TTY TIME CMD 25023 25023 tty2 01:32:10 firefox 25023 25048 tty2 00:00:00 gmain 25023 25049 tty2 00:00:00 gdbus 25023 25056 tty2 00:00:00 Gecko_IOThread 25023 25058 tty2 00:00:00 Link Monitor 25023 25059 tty2 00:03:52 Socket Thread 25023 25060 tty2 00:00:00 JS Watchdog 25023 25061 tty2 00:00:12 JS Helper 25023 25062 tty2 00:00:12 JS Helper 25023 25063 tty2 00:00:13 JS Helper 25023 25064 tty2 00:00:12 JS Helper 25023 25065 tty2 00:00:13 JS Helper 25023 25066 tty2 00:00:12 JS Helper 25023 25067 tty2 00:00:13 JS Helper 25023 25068 tty2 00:00:13 JS Helper 25023 25069 tty2 00:00:00 Hang Monitor 25023 25070 tty2 00:00:00 BgHangManager 25023 25117 tty2 00:00:12 Cache2 I/O 25023 25118 tty2 00:05:37 Timer 25023 25146 tty2 00:00:00 DataStorage 25023 25150 tty2 00:00:00 GMPThread 25023 25152 tty2 00:04:12 Compositor 25023 25156 tty2 00:00:02 ImgDecoder #1 25023 25157 tty2 00:00:02 ImgDecoder #2 25023 25158 tty2 00:00:02 ImgDecoder #3 25023 25159 tty2 00:00:00 ImageIO 25023 25160 tty2 00:02:26 SoftwareVsyncTh 25023 25202 tty2 00:00:00 HTML5 Parser 25023 25204 tty2 00:00:00 IPDL Background 25023 25212 tty2 00:00:22 DOM Worker 25023 25219 tty2 00:00:23 URL Classifier 25023 25220 tty2 00:00:05 ImageBridgeChil 25023 25221 tty2 00:00:04 mozStorage #1 25023 25222 tty2 00:00:00 Proxy R~olution 25023 25223 tty2 00:00:00 DataStorage 25023 25234 tty2 00:00:00 Cache I/O 25023 25235 tty2 00:00:01 mozStorage #2 25023 25236 tty2 00:00:00 mozStorage #3 25023 25252 tty2 00:00:20 DOM Worker 25023 25256 tty2 00:00:00 mozStorage #4 25023 25257 tty2 00:00:00 localStorage DB 25023 25260 tty2 00:00:00 mozStorage #5 25023 2066 tty2 00:00:19 DOM Worker 25023 5988 tty2 00:00:00 threaded-ml 25023 6757 tty2 00:00:00 firefox 25023 6999 tty2 00:00:00 mozStorage #6 25023 19239 tty2 00:00:02 mozStorage #7 25023 12038 tty2 00:00:05 threaded-ml 25023 19301 tty2 00:00:00 firefox [root@jouet ~]# [root@jouet ~]# file /proc/25023/tasks /proc/25023/tasks: cannot open `/proc/25023/tasks' (No such file or directory) [root@jouet ~]# But... [root@jouet ~]# file /proc/25023/task /proc/25023/task: directory [root@jouet ~]# wc -l /proc/25023/task/25023/maps 1200 /proc/25023/task/25023/maps [root@jouet ~]# wc -l /proc/25023/task/25048/maps 1201 /proc/25023/task/25048/maps [root@jouet ~]# wc -l /proc/25048/task/25048/maps 1201 /proc/25048/task/25048/maps [root@jouet ~]# [acme@jouet linux]$ grep \"tasks\" fs/proc/*.c [acme@jouet linux]$ grep \"task\" fs/proc/*.c fs/proc/base.c: DIR("task", S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations), fs/proc/base.c: name.name = "task"; [acme@jouet linux]$ These end up mapping to the later of these two: const struct file_operations proc_pid_maps_operations = { .open = pid_maps_open, .read = seq_read, .llseek = seq_lseek, .release = proc_map_release, }; const struct file_operations proc_tid_maps_operations = { .open = tid_maps_open, .read = seq_read, .llseek = seq_lseek, .release = proc_map_release, }; Which ends up going down to: show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid) with that is_pid respectively true and false, but then show_map_vma() doesn't use 'is_pid' at all :-\ /me scratches head, what am I missing? The following commit, that appeared circa v4.9-rc2. So, fixing up the "tasks" -> "tasks" we end up with something safe and that avoids this by now non-existing problem, on older kernels, ok? commit b18cb64ead400c01bf1580eeba330ace51f8087d Author: Andy Lutomirski <luto@kernel.org> Date: Fri Sep 30 10:58:57 2016 -0700 fs/proc: Stop trying to report thread stacks This reverts more of: b76437579d13 ("procfs: mark thread stack correctly in proc/<pid>/maps") ... which was partially reverted by: 65376df58217 ("proc: revert /proc/<pid>/maps [stack:TID] annotation") Originally, /proc/PID/task/TID/maps was the same as /proc/TID/maps. In current kernels, /proc/PID/maps (or /proc/TID/maps even for threads) shows "[stack]" for VMAs in the mm's stack address range. In contrast, /proc/PID/task/TID/maps uses KSTK_ESP to guess the target thread's stack's VMA. This is racy, probably returns garbage and, on arches with CONFIG_TASK_INFO_IN_THREAD=y, is also crash-prone: KSTK_ESP is not safe to use on tasks that aren't known to be running ordinary process-context kernel code. This patch removes the difference and just shows "[stack]" for VMAs in the mm's stack range. This is IMO much more sensible -- the actual "stack" address really is treated specially by the VM code, and the current thread stack isn't even well-defined for programs that frequently switch stacks on their own. Reported-by: Jann Horn <jann@thejh.net> Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> > > fp = fopen(filename, "r"); > if (fp == NULL) { > -- > 2.5.0 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale 2017-03-15 13:50 ` Arnaldo Carvalho de Melo @ 2017-03-15 13:58 ` Arnaldo Carvalho de Melo 2017-03-15 17:08 ` Stephane Eranian 2017-03-15 17:17 ` [PATCH V2] " Stephane Eranian 1 sibling, 1 reply; 9+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-03-15 13:58 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Stephane Eranian, linux-kernel, peterz, Ingo Molnar, Jiri Olsa, namhyung.kim Em Wed, Mar 15, 2017 at 10:50:59AM -0300, Arnaldo Carvalho de Melo escreveu: > So, fixing up the "tasks" -> "tasks" we end up with something safe and > that avoids this by now "tasks" -> "task", grrr - Arnaldo ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale 2017-03-15 13:58 ` Arnaldo Carvalho de Melo @ 2017-03-15 17:08 ` Stephane Eranian 2017-03-15 17:44 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 9+ messages in thread From: Stephane Eranian @ 2017-03-15 17:08 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo, LKML, Peter Zijlstra, Ingo Molnar, Jiri Olsa, namhyung.kim On Wed, Mar 15, 2017 at 6:58 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > Em Wed, Mar 15, 2017 at 10:50:59AM -0300, Arnaldo Carvalho de Melo escreveu: >> So, fixing up the "tasks" -> "tasks" we end up with something safe and >> that avoids this by now > > "tasks" -> "task", grrr > Oops, yeah, sorry about that. Let me submit a V2 with the fix. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale 2017-03-15 17:08 ` Stephane Eranian @ 2017-03-15 17:44 ` Arnaldo Carvalho de Melo 2017-03-16 2:27 ` Stephane Eranian 0 siblings, 1 reply; 9+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-03-15 17:44 UTC (permalink / raw) To: Stephane Eranian Cc: Arnaldo Carvalho de Melo, LKML, Peter Zijlstra, Ingo Molnar, Jiri Olsa, namhyung.kim Em Wed, Mar 15, 2017 at 10:08:27AM -0700, Stephane Eranian escreveu: > On Wed, Mar 15, 2017 at 6:58 AM, Arnaldo Carvalho de Melo > <acme@kernel.org> wrote: > > Em Wed, Mar 15, 2017 at 10:50:59AM -0300, Arnaldo Carvalho de Melo escreveu: > >> So, fixing up the "tasks" -> "tasks" we end up with something safe and > >> that avoids this by now > > > > "tasks" -> "task", grrr > > > Oops, yeah, sorry about that. > Let me submit a V2 with the fix. Ok, but what about the other observations, what is the kernel you're using? I think this will not make any difference for any kernel >= 4.9, which isn't to say the patch shouldn't be applied, I'm just curious. - Arnaldo ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale 2017-03-15 17:44 ` Arnaldo Carvalho de Melo @ 2017-03-16 2:27 ` Stephane Eranian 0 siblings, 0 replies; 9+ messages in thread From: Stephane Eranian @ 2017-03-16 2:27 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo, LKML, Peter Zijlstra, Ingo Molnar, Jiri Olsa, namhyung.kim Arnaldo, On Wed, Mar 15, 2017 at 10:44 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Wed, Mar 15, 2017 at 10:08:27AM -0700, Stephane Eranian escreveu: > > On Wed, Mar 15, 2017 at 6:58 AM, Arnaldo Carvalho de Melo > > <acme@kernel.org> wrote: > > > Em Wed, Mar 15, 2017 at 10:50:59AM -0300, Arnaldo Carvalho de Melo escreveu: > > >> So, fixing up the "tasks" -> "tasks" we end up with something safe and > > >> that avoids this by now > > > > > > "tasks" -> "task", grrr > > > > > Oops, yeah, sorry about that. > > Let me submit a V2 with the fix. > > Ok, but what about the other observations, what is the kernel you're > using? > Ok, yeah, I was using an older kernel which still had [stack:tid] annotations. With more recent kernel, it seems this optimization is not needed. I guess it would only be useful to people running a new perf on a older kernel. > > I think this will not make any difference for any kernel >= 4.9, which > isn't to say the patch shouldn't be applied, I'm just curious. > correct, see above. > > - Arnaldo ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH V2] perf/record: make perf_event__synthesize_mmap_events() scale @ 2017-03-15 17:17 ` Stephane Eranian 2017-03-16 16:35 ` [tip:perf/core] perf tools: Make " tip-bot for Stephane Eranian 0 siblings, 1 reply; 9+ messages in thread From: Stephane Eranian @ 2017-03-15 17:17 UTC (permalink / raw) To: linux-kernel; +Cc: acme, peterz, mingo, jolsa, namhyung.kim This patch significantly improves the execution time of perf_event__synthesize_mmap_events() when running perf record on systems where processes have lots of threads. It just happens that cat /proc/pid/maps support uses a O(N^2) algorithm to generate each map line in the maps file. If you have 1000 threads, then you have necessarily 1000 stacks. For each vma, you need to check if it corresponds to a thread's stack. With a large number of threads, this can take a very long time. I have seen latencies >> 10mn. As of today, perf does not use the fact that a mapping is a stack, therefore we can work around the issue by using /proc/pid/tasks/pid/maps. This entry does not try to map a vma to stack and is thus much faster with no loss of functonality. The proc-map-timeout logic is kept in case users still want some upper limit. In V2, we fix the file path from /proc/pid/tasks/pid/maps to actual /proc/pid/task/pid/maps, tasks -> task. Thanks Arnaldo for catching this. Signed-off-by: Stephane Eranian <eranian@google.com> --- tools/perf/util/event.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index 4ea7ce7..b137566 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool, if (machine__is_default_guest(machine)) return 0; - snprintf(filename, sizeof(filename), "%s/proc/%d/maps", - machine->root_dir, pid); + snprintf(filename, sizeof(filename), "%s/proc/%d/task/%d/maps", + machine->root_dir, pid, pid); fp = fopen(filename, "r"); if (fp == NULL) { -- 2.5.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [tip:perf/core] perf tools: Make perf_event__synthesize_mmap_events() scale 2017-03-15 17:17 ` [PATCH V2] " Stephane Eranian @ 2017-03-16 16:35 ` tip-bot for Stephane Eranian 0 siblings, 0 replies; 9+ messages in thread From: tip-bot for Stephane Eranian @ 2017-03-16 16:35 UTC (permalink / raw) To: linux-tip-commits Cc: hpa, luto, mingo, jolsa, linux-kernel, namhyung, tglx, peterz, acme, eranian Commit-ID: 88b897a30c525c2eee6e7f16e1e8d0f18830845e Gitweb: http://git.kernel.org/tip/88b897a30c525c2eee6e7f16e1e8d0f18830845e Author: Stephane Eranian <eranian@google.com> AuthorDate: Wed, 15 Mar 2017 10:17:13 -0700 Committer: Arnaldo Carvalho de Melo <acme@redhat.com> CommitDate: Wed, 15 Mar 2017 17:48:37 -0300 perf tools: Make perf_event__synthesize_mmap_events() scale This patch significantly improves the execution time of perf_event__synthesize_mmap_events() when running perf record on systems where processes have lots of threads. It just happens that cat /proc/pid/maps support uses a O(N^2) algorithm to generate each map line in the maps file. If you have 1000 threads, then you have necessarily 1000 stacks. For each vma, you need to check if it corresponds to a thread's stack. With a large number of threads, this can take a very long time. I have seen latencies >> 10mn. As of today, perf does not use the fact that a mapping is a stack, therefore we can work around the issue by using /proc/pid/tasks/pid/maps. This entry does not try to map a vma to stack and is thus much faster with no loss of functonality. The proc-map-timeout logic is kept in case users still want some upper limit. In V2, we fix the file path from /proc/pid/tasks/pid/maps to actual /proc/pid/task/pid/maps, tasks -> task. Thanks Arnaldo for catching this. Committer note: This problem seems to have been elliminated in the kernel since commit : b18cb64ead40 ("fs/proc: Stop trying to report thread stacks"). Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170315135059.GC2177@redhat.com Link: http://lkml.kernel.org/r/1489598233-25586-1-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/util/event.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index d082cb7..33fc2e9 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -325,8 +325,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool, if (machine__is_default_guest(machine)) return 0; - snprintf(filename, sizeof(filename), "%s/proc/%d/maps", - machine->root_dir, pid); + snprintf(filename, sizeof(filename), "%s/proc/%d/task/%d/maps", + machine->root_dir, pid, pid); fp = fopen(filename, "r"); if (fp == NULL) { ^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-03-16 16:36 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-03-15 6:57 [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale Stephane Eranian 2017-03-15 11:33 ` Jiri Olsa 2017-03-15 13:50 ` Arnaldo Carvalho de Melo 2017-03-15 13:58 ` Arnaldo Carvalho de Melo 2017-03-15 17:08 ` Stephane Eranian 2017-03-15 17:44 ` Arnaldo Carvalho de Melo 2017-03-16 2:27 ` Stephane Eranian 2017-03-15 17:17 ` [PATCH V2] " Stephane Eranian 2017-03-16 16:35 ` [tip:perf/core] perf tools: Make " tip-bot for Stephane Eranian
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.