linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* perf's handling of unfindable user symbols...
@ 2018-10-14  7:42 David Miller
  2018-10-15 22:25 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2018-10-14  7:42 UTC (permalink / raw)
  To: acme; +Cc: linux-kernel


Perf has this hack where it uses the kernel symbol map as a backup
when a symbol can't be found in the user's symbol table(s).

This causes problems because the tests driving this code path use
machine__kernel_ip(), and that is completely meaningless on Sparc.  On
sparc64 the kernel and user live in physically separate virtual
address spaces, rather than a shared one.  And the kernel lives at a
virtual address that overlaps common userspace addresses.  So this
test passes almost all the time when a user symbol lookup fails.

The consequence of this is that, if the unfound user virtual address
in the sample doesn't match up to a kernel symbol either, we trigger
things like this code in builtin-top.c:

	if (al.sym == NULL && al.map != NULL) {
		const char *msg = "Kernel samples will not be resolved.\n";
		/*
		 * As we do lazy loading of symtabs we only will know if the
		 * specified vmlinux file is invalid when we actually have a
		 * hit in kernel space and then try to load it. So if we get
		 * here and there are _no_ symbols in the DSO backing the
		 * kernel map, bail out.
		 *
		 * We may never get here, for instance, if we use -K/
		 * --hide-kernel-symbols, even if the user specifies an
		 * invalid --vmlinux ;-)
		 */
		if (!machine->kptr_restrict_warned && !top->vmlinux_warned &&
		    __map__is_kernel(al.map) && map__has_symbols(al.map)) {
			if (symbol_conf.vmlinux_name) {
				char serr[256];
				dso__strerror_load(al.map->dso, serr, sizeof(serr));
				ui__warning("The %s file can't be used: %s\n%s",
					    symbol_conf.vmlinux_name, serr, msg);
			} else {
				ui__warning("A vmlinux file was not found.\n%s",
					    msg);
			}

			if (use_browser <= 0)
				sleep(5);
			top->vmlinux_warned = true;
		}
	}

When I fire up a compilation on sparc, this triggers immediately.

I'm trying to figure out what the "backup to kernel map" code is
accomplishing.

I see some language in the current code and in the changes that have
happened in this area talking about vdso.  Does that really happen?

The vdso is mapped into userspace virtual addresses, not kernel ones.

More history.  This didn't cause problems on sparc some time ago,
because the kernel IP check used to be "ip < 0" :-) Sparc kernel
addresses are not negative.  But now with machine__kernel_ip(), which
works using the symbol table determined kernel address range, it does
trigger.

What it all boils down to is that on architectures like sparc,
machine__kernel_ip() should always return false in this scenerio, and
therefore this kind of logic:

		if (cpumode == PERF_RECORD_MISC_USER && machine &&
		    mg != &machine->kmaps &&
		    machine__kernel_ip(machine, al->addr)) {

is basically invalid.  PERF_RECORD_MISC_USER implies no kernel address
can possibly match for the sample/event in question (no matter how
hard you try!) :-)

At the very least there should be an arch way to disable this logic,
or even formalize in some way the situation.  Have some kind of
"user_kernel_shared_address_space" boolean, and then
machine__kernel_ip() can take a cpumode parameter and it can thus say:

	if (cpumode == PERF_RECORD_MISC_USER &&
	    !user_kernel_shared_address_space)
		return false;

Comments?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf's handling of unfindable user symbols...
  2018-10-14  7:42 perf's handling of unfindable user symbols David Miller
@ 2018-10-15 22:25 ` Arnaldo Carvalho de Melo
  2018-10-15 23:02   ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-10-15 22:25 UTC (permalink / raw)
  To: David Miller; +Cc: linux-kernel, acme

Em Sun, Oct 14, 2018 at 12:42:38AM -0700, David Miller escreveu:
> 
> Perf has this hack where it uses the kernel symbol map as a backup
> when a symbol can't be found in the user's symbol table(s).

Right, I recall that, lemme find where this got introduced... there,
Ingo added it while perf lived in Documentation/:

commit ed966aac335a63083d3125198479447248637d9e
Author: Ingo Molnar <mingo@elte.hu>
Date:   Wed Jun 3 10:39:26 2009 +0200

    perf report: Handle vDSO symbols properly
    
    We were not looking up vDSO symbols properly, because they
    are in the kallsyms but are user-mode entries.
    
    Pass negative addresses to the kernel dso object, this
    way we resolve them properly:
    
         0.05%  [kernel]: vread_tsc
    
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: John Kacur <jkacur@redhat.com>
    LKML-Reference: <new-submission>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

diff --git a/Documentation/perf_counter/builtin-report.c b/Documentation/perf_counter/builtin-report.c
index a8d390596d8d..0f88d9ebb34a 100644
--- a/Documentation/perf_counter/builtin-report.c
+++ b/Documentation/perf_counter/builtin-report.c
@@ -728,6 +728,8 @@ static int __cmd_report(void)
                        event->ip.pid,
                        (void *)(long)ip);
 
+               dprintf(" ... thread: %s:%d\n", thread->comm, thread->pid);
+
                if (thread == NULL) {
                        fprintf(stderr, "problem processing %d event, skipping it.\n",
                                event->header.type);
@@ -740,6 +742,8 @@ static int __cmd_report(void)
 
                        dso = kernel_dso;
 
+                       dprintf(" ...... dso: %s\n", dso->name);
+
                } else if (event->header.misc & PERF_EVENT_MISC_USER) {
 
                        show = SHOW_USER;
@@ -749,11 +753,22 @@ static int __cmd_report(void)
                        if (map != NULL) {
                                dso = map->dso;
                                ip -= map->start + map->pgoff;
+                       } else {
+                               /*
+                                * If this is outside of all known maps,
+                                * and is a negative address, try to look it
+                                * up in the kernel dso, as it might be a
+                                * vsyscall (which executes in user-mode):
+                                */
+                               if ((long long)ip < 0)
+                                       dso = kernel_dso;
                        }
+                       dprintf(" ...... dso: %s\n", dso ? dso->name : "<not found>");
 
                }

 
> This causes problems because the tests driving this code path use
> machine__kernel_ip(), and that is completely meaningless on Sparc.  On

This machine__kernel_ip() got added later:

commit fbe2af45f6bd27ee69fd775303c936c3af4a4807
Author: Adrian Hunter <adrian.hunter@intel.com>
Date:   Fri Aug 15 22:08:39 2014 +0300

    perf tools: Add machine__kernel_ip()
    
    Add a function to determine if an address is in the kernel.  This is
    based on the kernel function kernel_ip().
    
    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Namhyung Kim <namhyung@gmail.com>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Link: http://lkml.kernel.org/r/1408129739-17368-5-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

static inline bool kernel_ip(unsigned long ip)
{
#ifdef CONFIG_X86_32
        return ip > PAGE_OFFSET;
#else
        return (long)ip < 0;
#endif
}

But this is in:

arch/x86/events/perf_event.h

totally x86 specific:

[acme@jouet perf]$ find . -name "*.[ch]" | xargs grep -w kernel_ip
./arch/x86/events/perf_event.h:static inline bool kernel_ip(unsigned long ip)
./arch/x86/events/perf_event.h:	regs->cs = kernel_ip(ip) ? __KERNEL_CS : __USER_CS;
./arch/x86/events/intel/pt.c:	return virt_addr_valid(ip) && kernel_ip(ip);
./arch/x86/events/intel/ds.c:		    (kernel_ip(at->from) || kernel_ip(at->to)))
./arch/x86/events/intel/ds.c:		    (kernel_ip(at->from) || kernel_ip(at->to)))
./arch/x86/events/intel/ds.c:	if (kernel_ip(ip) != kernel_ip(to))
./arch/x86/events/intel/ds.c:	if (!kernel_ip(ip)) {
./arch/x86/events/intel/ds.c:		is_64bit = kernel_ip(to) || !test_thread_flag(TIF_IA32);
./arch/x86/events/intel/lbr.c:	to_plm = kernel_ip(to) ? X86_BR_KERNEL : X86_BR_USER;
./arch/x86/events/intel/lbr.c:	from_plm = kernel_ip(from) ? X86_BR_KERNEL : X86_BR_USER;
./arch/x86/events/intel/lbr.c:	is64 = kernel_ip((unsigned long)addr) || !test_thread_flag(TIF_IA32);
[acme@jouet perf]$

> sparc64 the kernel and user live in physically separate virtual
> address spaces, rather than a shared one.  And the kernel lives at a
> virtual address that overlaps common userspace addresses.  So this
> test passes almost all the time when a user symbol lookup fails.
> 
> The consequence of this is that, if the unfound user virtual address
> in the sample doesn't match up to a kernel symbol either, we trigger
> things like this code in builtin-top.c:
> 
> 	if (al.sym == NULL && al.map != NULL) {
> 		const char *msg = "Kernel samples will not be resolved.\n";
> 		/*
> 		 * As we do lazy loading of symtabs we only will know if the
> 		 * specified vmlinux file is invalid when we actually have a
> 		 * hit in kernel space and then try to load it. So if we get
> 		 * here and there are _no_ symbols in the DSO backing the
> 		 * kernel map, bail out.
> 		 *
> 		 * We may never get here, for instance, if we use -K/
> 		 * --hide-kernel-symbols, even if the user specifies an
> 		 * invalid --vmlinux ;-)
> 		 */
> 		if (!machine->kptr_restrict_warned && !top->vmlinux_warned &&
> 		    __map__is_kernel(al.map) && map__has_symbols(al.map)) {
> 			if (symbol_conf.vmlinux_name) {
> 				char serr[256];
> 				dso__strerror_load(al.map->dso, serr, sizeof(serr));
> 				ui__warning("The %s file can't be used: %s\n%s",
> 					    symbol_conf.vmlinux_name, serr, msg);
> 			} else {
> 				ui__warning("A vmlinux file was not found.\n%s",
> 					    msg);
> 			}
> 
> 			if (use_browser <= 0)
> 				sleep(5);
> 			top->vmlinux_warned = true;
> 		}
> 	}
> 
> When I fire up a compilation on sparc, this triggers immediately.
> 
> I'm trying to figure out what the "backup to kernel map" code is
> accomplishing.
 
> I see some language in the current code and in the changes that have
> happened in this area talking about vdso.  Does that really happen?
> 
> The vdso is mapped into userspace virtual addresses, not kernel ones.
> 
> More history.  This didn't cause problems on sparc some time ago,
> because the kernel IP check used to be "ip < 0" :-) Sparc kernel
> addresses are not negative.  But now with machine__kernel_ip(), which
> works using the symbol table determined kernel address range, it does
> trigger.
> 
> What it all boils down to is that on architectures like sparc,
> machine__kernel_ip() should always return false in this scenerio, and
> therefore this kind of logic:
> 
> 		if (cpumode == PERF_RECORD_MISC_USER && machine &&
> 		    mg != &machine->kmaps &&
> 		    machine__kernel_ip(machine, al->addr)) {
> 
> is basically invalid.  PERF_RECORD_MISC_USER implies no kernel address
> can possibly match for the sample/event in question (no matter how
> hard you try!) :-)
> 
> At the very least there should be an arch way to disable this logic,
> or even formalize in some way the situation.  Have some kind of
> "user_kernel_shared_address_space" boolean, and then
> machine__kernel_ip() can take a cpumode parameter and it can thus say:
> 
> 	if (cpumode == PERF_RECORD_MISC_USER &&
> 	    !user_kernel_shared_address_space)
> 		return false;
> 
> Comments?

I think your solution is valid and fixes the problem for architectures with
this property, but we need some way of saying hey, I know I'm not in sparc, check
that like on x86 to cope with things like:

        kernel = machine__kernel_ip(machine, start);
        if (kernel)
                *cpumode = PERF_RECORD_MISC_KERNEL;
        else
                *cpumode = PERF_RECORD_MISC_USER;

That we have in builtin-script.c to handle addresses in branch stacks where
cpumode isn't available and is being derived from the address itself. I have to
look at this deeper to see how much change would be needed to do this in some
other fashion, i.e. is the cpumode really only derivable from looking just at
the address in the branch stack?

tools/perf/util/intel-bts.c does the same trick:

static int intel_bts_get_next_insn(struct intel_bts_queue *btsq, u64 ip)
{
        struct machine *machine = btsq->bts->machine;
        struct thread *thread;
        struct addr_location al;
        unsigned char buf[INTEL_PT_INSN_BUF_SZ];
        ssize_t len;
        int x86_64;
        uint8_t cpumode;
        int err = -1;

        if (machine__kernel_ip(machine, ip))
                cpumode = PERF_RECORD_MISC_KERNEL;
        else
                cpumode = PERF_RECORD_MISC_USER;



------------------------------------------------------------------------------------------

But I think we should have it as a property of 'struct machine', because we may
be processing on, say, x86, a perf.data file recorded on a Sparc machine, so we
need to save this property on the perf.data file, humm, or we can derive that
from data already there, like the quick patch below. I'll cache that property
on machine->user_kernel_shared_address_space, to avoid having to do the
strcmp() lots of times.

Does that document the hack further? Defining the
machine__user_kernel_shared_address_space() function right besides the
machine__kernel_ip() inline should help as well?

- Arnaldo

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 0cd42150f712..243dd9241339 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -1573,6 +1573,7 @@ struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr,
 		 * in the whole kernel symbol list.
 		 */
 		if (cpumode == PERF_RECORD_MISC_USER && machine &&
+		    !machine__user_kernel_shared_address_space(machine) &&
 		    mg != &machine->kmaps &&
 		    machine__kernel_ip(machine, al->addr)) {
 			mg = &machine->kmaps;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index d856b85862e2..b8645f16710b 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -6,6 +6,7 @@
 #include <linux/rbtree.h>
 #include "map.h"
 #include "dso.h"
+#include "env.h"
 #include "event.h"
 #include "rwsem.h"
 
@@ -99,6 +100,12 @@ static inline bool machine__kernel_ip(struct machine *machine, u64 ip)
 	return ip >= kernel_start;
 }
 
+static inline bool machine__user_kernel_shared_address_space(struct machine *machine)
+{
+	const char *arch = perf_env__arch(machine->env);
+	return arch && strcmp(arch, "sparc") == 0;
+}
+
 struct thread *machine__find_thread(struct machine *machine, pid_t pid,
 				    pid_t tid);
 struct comm *machine__thread_exec_comm(struct machine *machine,

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: perf's handling of unfindable user symbols...
  2018-10-15 22:25 ` Arnaldo Carvalho de Melo
@ 2018-10-15 23:02   ` David Miller
  2018-10-16 18:45     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2018-10-15 23:02 UTC (permalink / raw)
  To: acme; +Cc: linux-kernel, acme

From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Mon, 15 Oct 2018 19:25:46 -0300

> But I think we should have it as a property of 'struct machine', because we may
> be processing on, say, x86, a perf.data file recorded on a Sparc machine, so we
> need to save this property on the perf.data file, humm, or we can derive that
> from data already there, like the quick patch below. I'll cache that property
> on machine->user_kernel_shared_address_space, to avoid having to do the
> strcmp() lots of times.
> 
> Does that document the hack further? Defining the
> machine__user_kernel_shared_address_space() function right besides the
> machine__kernel_ip() inline should help as well?

Your patch looks fine.

But, more deeply, the VDSO thing itself makes no sense to me.

Why would we use the kernel map for something that is mapped into
userspace and uses the user space virtual addresse range?

As it is used by user applications, the VDSO isn't mapped into the
kernel virtual address range, therefore no PC from userspace executing
the VDSO will have a kernel range address.

We will see normal userspace virtual addresses instead.  Test this
assertion, if you like :-)

So I am suggesting that we remove the hack, and don't try to use the
kernel map for resolving the IP of user mode events.  If that is a
valid change, we can toss all of this weird stuff that tries to
interpret an address based upon what "range" it falls into.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf's handling of unfindable user symbols...
  2018-10-15 23:02   ` David Miller
@ 2018-10-16 18:45     ` Arnaldo Carvalho de Melo
  2018-10-16 19:02       ` David Miller
  2018-10-17  8:22       ` Masami Hiramatsu
  0 siblings, 2 replies; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-10-16 18:45 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, acme, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Masami Hiramatsu

Adding some people to the CC list.

Em Mon, Oct 15, 2018 at 04:02:46PM -0700, David Miller escreveu:
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> Date: Mon, 15 Oct 2018 19:25:46 -0300
 
> > But I think we should have it as a property of 'struct machine', because we may
> > be processing on, say, x86, a perf.data file recorded on a Sparc machine, so we
> > need to save this property on the perf.data file, humm, or we can derive that
> > from data already there, like the quick patch below. I'll cache that property
> > on machine->user_kernel_shared_address_space, to avoid having to do the
> > strcmp() lots of times.
 
> > Does that document the hack further? Defining the
> > machine__user_kernel_shared_address_space() function right besides the
> > machine__kernel_ip() inline should help as well?
 
> Your patch looks fine.
> 
> But, more deeply, the VDSO thing itself makes no sense to me.
 
> Why would we use the kernel map for something that is mapped into
> userspace and uses the user space virtual addresse range?
 
> As it is used by user applications, the VDSO isn't mapped into the
> kernel virtual address range, therefore no PC from userspace executing
> the VDSO will have a kernel range address.
 
> We will see normal userspace virtual addresses instead.  Test this
> assertion, if you like :-)
 
> So I am suggesting that we remove the hack, and don't try to use the
> kernel map for resolving the IP of user mode events.  If that is a
> valid change, we can toss all of this weird stuff that tries to
> interpret an address based upon what "range" it falls into.

Exec summary: yeah, drop that hack, I agree, patch at the end of the
message.

So, I thought something had changed and in the past we would somehow
find that address in the kallsyms, but I couldn't find anything to back
that up, the patch introducing this is over a decade old, lots of things
changed, so I was just thinking I was missing something.

I tried a gtod busy loop to generate vdso activity and added a 'perf
probe' at that branch, on x86_64 to see if it ever gets hit:

Made thread__find_map() noinline, as 'perf probe' in lines of inline
functions seems to not be working, only at function start. (Masami?)

[root@jouet ~]# perf probe -x ~/bin/perf -L thread__find_map:57
<thread__find_map@/home/acme/git/perf/tools/perf/util/event.c:57>
     57                 if (cpumode == PERF_RECORD_MISC_USER && machine &&
     58                     mg != &machine->kmaps &&
     59                     machine__kernel_ip(machine, al->addr)) {
     60                         mg = &machine->kmaps;
     61                         load_map = true;
     62                         goto try_again;
                        }
                } else {
                        /*
                         * Kernel maps might be changed when loading
                         * symbols so loading
                         * must be done prior to using kernel maps.
                         */
     69                 if (load_map)
     70                         map__load(al->map);
     71                 al->addr = al->map->map_ip(al->map, al->addr);

[root@jouet ~]# perf probe -x ~/bin/perf thread__find_map:60
Added new event:
  probe_perf:thread__find_map (on thread__find_map:60 in /home/acme/bin/perf)

You can now use it in all perf tools, such as:

	perf record -e probe_perf:thread__find_map -aR sleep 1

[root@jouet ~]#

Then used this to see if, system wide, those probe points were being hit:

[root@jouet ~]# perf trace -e *perf:thread*/max-stack=8/
^C[root@jouet ~]#

No hits when running 'perf top' and:

[root@jouet c]# cat gtod.c
#include <sys/time.h>

int main(void)
{
	struct timeval tv;

	while (1)
		gettimeofday(&tv, 0);

	return 0;
}
[root@jouet c]# ./gtod 
^C

Pressed 'P' in 'perf top' and the [vdso] samples are there:

  62.84%  [vdso]                    [.] __vdso_gettimeofday
   8.13%  gtod                      [.] main
   7.51%  [vdso]                    [.] 0x0000000000000914
   5.78%  [vdso]                    [.] 0x0000000000000917
   5.43%  gtod                      [.] _init
   2.71%  [vdso]                    [.] 0x000000000000092d
   0.35%  [kernel]                  [k] native_io_delay
   0.33%  libc-2.26.so              [.] __memmove_avx_unaligned_erms
   0.20%  [vdso]                    [.] 0x000000000000091d
   0.17%  [i2c_i801]                [k] i801_access
   0.06%  firefox                   [.] free
   0.06%  libglib-2.0.so.0.5400.3   [.] g_source_iter_next
   0.05%  [vdso]                    [.] 0x0000000000000919
   0.05%  libpthread-2.26.so        [.] __pthread_mutex_lock
   0.05%  libpixman-1.so.0.34.0     [.] 0x000000000006d3a7
   0.04%  [kernel]                  [k] entry_SYSCALL_64_trampoline
   0.04%  libxul.so                 [.] style::dom_apis::query_selector_slow
   0.04%  [kernel]                  [k] module_get_kallsym
   0.04%  firefox                   [.] malloc
   0.04%  [vdso]                    [.] 0x0000000000000910

I added a 'perf probe' to thread__find_map:69, and that surely got tons
of hits, i.e. for every map found, just to make sure the 'perf probe'
command was really working.

In the process I noticed a bug, we're only have records for '[vdso]' for
pre-existing commands, i.e. ones that are running when we start 'perf top',
when we will generate the PERF_RECORD_MMAP by looking at /perf/PID/maps.

I.e. like this, for preexisting processes with a vdso map, again,
tracing for all the system, only pre-existing processes get a [vdso] map
(when having one):

[root@jouet ~]# perf probe -x ~/bin/perf __machine__addnew_vdso
Added new event:
  probe_perf:__machine__addnew_vdso (on __machine__addnew_vdso in /home/acme/bin/perf)

You can now use it in all perf tools, such as:

	perf record -e probe_perf:__machine__addnew_vdso -aR sleep 1

[root@jouet ~]# perf trace -e probe_perf:__machine__addnew_vdso/max-stack=8/
     0.000 probe_perf:__machine__addnew_vdso:(568eb3)
                                       __machine__addnew_vdso (/home/acme/bin/perf)
                                       map__new (/home/acme/bin/perf)
                                       machine__process_mmap2_event (/home/acme/bin/perf)
                                       machine__process_event (/home/acme/bin/perf)
                                       perf_event__process (/home/acme/bin/perf)
                                       perf_tool__process_synth_event (/home/acme/bin/perf)
                                       perf_event__synthesize_mmap_events (/home/acme/bin/perf)
                                       __event__synthesize_thread (/home/acme/bin/perf)

The kernel doesn't seem to be generating a PERF_RECORD_MMAP for vDSOs...  And
we can't do this in 'perf record' because we don't process event by event, just
dump things from the ring buffer to a file...

For 'perf top', since we process the PERF_RECORD_MMAPs, we can piggyback and
read the smaps file to hack around this limitation somehow... Peter?

Anyway, two bugs found in this exercise...

The patch is the obvious one and with it we also continue to resolve
vdso symbols (for pre-existing processes).

- Arnaldo

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 0988eb3b844b..bc646185f8d9 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -1561,26 +1561,9 @@ struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr,
 
 		return NULL;
 	}
-try_again:
+
 	al->map = map_groups__find(mg, al->addr);
-	if (al->map == NULL) {
-		/*
-		 * If this is outside of all known maps, and is a negative
-		 * address, try to look it up in the kernel dso, as it might be
-		 * a vsyscall or vdso (which executes in user-mode).
-		 *
-		 * XXX This is nasty, we should have a symbol list in the
-		 * "[vdso]" dso, but for now lets use the old trick of looking
-		 * in the whole kernel symbol list.
-		 */
-		if (cpumode == PERF_RECORD_MISC_USER && machine &&
-		    mg != &machine->kmaps &&
-		    machine__kernel_ip(machine, al->addr)) {
-			mg = &machine->kmaps;
-			load_map = true;
-			goto try_again;
-		}
-	} else {
+	if (al->map != NULL) {
 		/*
 		 * Kernel maps might be changed when loading symbols so loading
 		 * must be done prior to using kernel maps.



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: perf's handling of unfindable user symbols...
  2018-10-16 18:45     ` Arnaldo Carvalho de Melo
@ 2018-10-16 19:02       ` David Miller
  2018-10-16 19:37         ` Arnaldo Carvalho de Melo
  2018-10-17  8:22       ` Masami Hiramatsu
  1 sibling, 1 reply; 10+ messages in thread
From: David Miller @ 2018-10-16 19:02 UTC (permalink / raw)
  To: acme; +Cc: linux-kernel, acme, peterz, mingo, jolsa, namhyung, mhiramat

From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Tue, 16 Oct 2018 15:45:06 -0300

> Exec summary: yeah, drop that hack, I agree, patch at the end of the
> message.
> 
> So, I thought something had changed and in the past we would somehow
> find that address in the kallsyms, but I couldn't find anything to back
> that up, the patch introducing this is over a decade old, lots of things
> changed, so I was just thinking I was missing something.
> 
> I tried a gtod busy loop to generate vdso activity and added a 'perf
> probe' at that branch, on x86_64 to see if it ever gets hit:

Good, thanks for doing the detailed checking!

> In the process I noticed a bug, we're only have records for '[vdso]' for
> pre-existing commands, i.e. ones that are running when we start 'perf top',
> when we will generate the PERF_RECORD_MMAP by looking at /perf/PID/maps.

Hmmm.  vdso mappings are installed by __install_special_mapping()
which should be emitting proper mmap events by calling
perf_event_mmap(vma).

Maybe the event is emitted too early?  It doesn't look like it.  These
are emitted after load_elf_binary() iterates over the PHDRs and
mmap()'s those areas of the binary, and we definitely see those events
properly.

> The kernel doesn't seem to be generating a PERF_RECORD_MMAP for vDSOs...  And
> we can't do this in 'perf record' because we don't process event by event, just
> dump things from the ring buffer to a file...

It should be, see above.

> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 0988eb3b844b..bc646185f8d9 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -1561,26 +1561,9 @@ struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr,
>  
>  		return NULL;
>  	}
> -try_again:
> +
>  	al->map = map_groups__find(mg, al->addr);
> -	if (al->map == NULL) {
> -		/*
> -		 * If this is outside of all known maps, and is a negative
> -		 * address, try to look it up in the kernel dso, as it might be
> -		 * a vsyscall or vdso (which executes in user-mode).
> -		 *
> -		 * XXX This is nasty, we should have a symbol list in the
> -		 * "[vdso]" dso, but for now lets use the old trick of looking
> -		 * in the whole kernel symbol list.
> -		 */
> -		if (cpumode == PERF_RECORD_MISC_USER && machine &&
> -		    mg != &machine->kmaps &&
> -		    machine__kernel_ip(machine, al->addr)) {
> -			mg = &machine->kmaps;
> -			load_map = true;
> -			goto try_again;
> -		}
> -	} else {
> +	if (al->map != NULL) {
>  		/*
>  		 * Kernel maps might be changed when loading symbols so loading
>  		 * must be done prior to using kernel maps.
> 
> 

Acked-by: David S. Miller <davem@davemloft.net>

:-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf's handling of unfindable user symbols...
  2018-10-16 19:02       ` David Miller
@ 2018-10-16 19:37         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-10-16 19:37 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, acme, peterz, mingo, jolsa, namhyung, mhiramat, williams

Em Tue, Oct 16, 2018 at 12:02:30PM -0700, David Miller escreveu:
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> Date: Tue, 16 Oct 2018 15:45:06 -0300
> 
> > Exec summary: yeah, drop that hack, I agree, patch at the end of the
> > message.

> > So, I thought something had changed and in the past we would somehow
> > find that address in the kallsyms, but I couldn't find anything to back
> > that up, the patch introducing this is over a decade old, lots of things
> > changed, so I was just thinking I was missing something.

> > I tried a gtod busy loop to generate vdso activity and added a 'perf
> > probe' at that branch, on x86_64 to see if it ever gets hit:
 
> Good, thanks for doing the detailed checking!
 
> > In the process I noticed a bug, we're only have records for '[vdso]' for
> > pre-existing commands, i.e. ones that are running when we start 'perf top',
> > when we will generate the PERF_RECORD_MMAP by looking at /perf/PID/maps.
> 
> Hmmm.  vdso mappings are installed by __install_special_mapping()

Yeah, the kernel is ok, tooling seems b0rken, will check.

[root@jouet ~]# perf record ~acme/c/gtod
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.076 MB perf.data (1499 samples) ]

[root@jouet ~]# perf report -D | grep PERF_RECORD_MMAP2
71293612401913 0x11b48 [0x70]: PERF_RECORD_MMAP2 25484/25484: [0x400000(0x1000) @ 0 fd:02 1137 541179306]: r-xp /home/acme/c/gtod
71293612419012 0x11be0 [0x70]: PERF_RECORD_MMAP2 25484/25484: [0x7fa4a2783000(0x227000) @ 0 fd:00 3146370 854107250]: r-xp /usr/lib64/ld-2.26.so
71293612432110 0x11c50 [0x60]: PERF_RECORD_MMAP2 25484/25484: [0x7ffcdb53a000(0x2000) @ 0 00:00 0 0]: r-xp [vdso]
71293612509944 0x11cb0 [0x70]: PERF_RECORD_MMAP2 25484/25484: [0x7fa4a23cd000(0x3b6000) @ 0 fd:00 3149723 262067164]: r-xp /usr/lib64/libc-2.26.so
[root@jouet ~]#

And using tools/perf/python/twatch.py enabling mmap = 1:

<SNIP>
cpu: 1, pid: 25743, tid: 25743 { type: comm, pid: 25743, tid: 25743, comm: gtod }
cpu: 2, pid: 25742, tid: 25742 { type: mmap, pid: 25742, tid: 25742, start: 0x55d0a916e000, length: 0x219000, offset: 0, filename: /usr/bin/procmail }
cpu: 1, pid: 25743, tid: 25743 { type: mmap, pid: 25743, tid: 25743, start: 0x400000, length: 0x1000, offset: 0, filename: /home/acme/c/gtod
}
cpu: 2, pid: 25742, tid: 25742 { type: mmap, pid: 25742, tid: 25742, start: 0x7fa6d666e000, length: 0x227000, offset: 0, filename: /usr/lib64/ld-2.26.so }
cpu: 1, pid: 25743, tid: 25743 { type: mmap, pid: 25743, tid: 25743, start: 0x7f7774286000, length: 0x227000, offset: 0, filename: /usr/lib64/ld-2.26.so }
cpu: 2, pid: 25742, tid: 25742 { type: mmap, pid: 25742, tid: 25742, start: 0x7ffd673d6000, length: 0x2000, offset: 0, filename: [vdso] }
cpu: 1, pid: 25743, tid: 25743 { type: mmap, pid: 25743, tid: 25743, start: 0x7fff8f7ad000, length: 0x2000, offset: 0, filename: [vdso] }
<SNIP>

> which should be emitting proper mmap events by calling
> perf_event_mmap(vma).
> Maybe the event is emitted too early?  It doesn't look like it.  These
> are emitted after load_elf_binary() iterates over the PHDRs and
> mmap()'s those areas of the binary, and we definitely see those events
> properly.
> 
> > The kernel doesn't seem to be generating a PERF_RECORD_MMAP for vDSOs...  And
> > we can't do this in 'perf record' because we don't process event by event, just
> > dump things from the ring buffer to a file...
 
> It should be, see above.

Yeah, I'll check why is that the tooling side doesn't seem to be
catching those...

> > diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> > index 0988eb3b844b..bc646185f8d9 100644
> > --- a/tools/perf/util/event.c
> > +++ b/tools/perf/util/event.c
> > @@ -1561,26 +1561,9 @@ struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr,
> >  
> >  		return NULL;
> >  	}
> > -try_again:
> > +
> >  	al->map = map_groups__find(mg, al->addr);
> > -	if (al->map == NULL) {
> > -		/*
> > -		 * If this is outside of all known maps, and is a negative
> > -		 * address, try to look it up in the kernel dso, as it might be
> > -		 * a vsyscall or vdso (which executes in user-mode).
> > -		 *
> > -		 * XXX This is nasty, we should have a symbol list in the
> > -		 * "[vdso]" dso, but for now lets use the old trick of looking
> > -		 * in the whole kernel symbol list.
> > -		 */
> > -		if (cpumode == PERF_RECORD_MISC_USER && machine &&
> > -		    mg != &machine->kmaps &&
> > -		    machine__kernel_ip(machine, al->addr)) {
> > -			mg = &machine->kmaps;
> > -			load_map = true;
> > -			goto try_again;
> > -		}
> > -	} else {
> > +	if (al->map != NULL) {
> >  		/*
> >  		 * Kernel maps might be changed when loading symbols so loading
> >  		 * must be done prior to using kernel maps.
> > 
> > 
> 
> Acked-by: David S. Miller <davem@davemloft.net>

Reported-by:
Suggested-by:

:-)

OK, will get this with a nice cover letter and stash into my perf/urgent
branch.

- Arnaldo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf's handling of unfindable user symbols...
  2018-10-16 18:45     ` Arnaldo Carvalho de Melo
  2018-10-16 19:02       ` David Miller
@ 2018-10-17  8:22       ` Masami Hiramatsu
  2018-10-17 12:28         ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 10+ messages in thread
From: Masami Hiramatsu @ 2018-10-17  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: David Miller, linux-kernel, acme, Peter Zijlstra, Ingo Molnar,
	Jiri Olsa, Namhyung Kim, Masami Hiramatsu

On Tue, 16 Oct 2018 15:45:06 -0300
Arnaldo Carvalho de Melo <acme@redhat.com> wrote:

> 
> So, I thought something had changed and in the past we would somehow
> find that address in the kallsyms, but I couldn't find anything to back
> that up, the patch introducing this is over a decade old, lots of things
> changed, so I was just thinking I was missing something.
> 
> I tried a gtod busy loop to generate vdso activity and added a 'perf
> probe' at that branch, on x86_64 to see if it ever gets hit:
> 
> Made thread__find_map() noinline, as 'perf probe' in lines of inline
> functions seems to not be working, only at function start. (Masami?)

Thank you for reporting it. Hmm, what happened when you did it?
I checked some points, but seems no problem.
(Would you see no line number? or defined event didn't work?)

Thank you,



-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf's handling of unfindable user symbols...
  2018-10-17  8:22       ` Masami Hiramatsu
@ 2018-10-17 12:28         ` Arnaldo Carvalho de Melo
  2018-10-18  9:35           ` Masami Hiramatsu
  2018-10-18  9:42           ` Masami Hiramatsu
  0 siblings, 2 replies; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-10-17 12:28 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Arnaldo Carvalho de Melo, David Miller, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim

Em Wed, Oct 17, 2018 at 05:22:26PM +0900, Masami Hiramatsu escreveu:
> On Tue, 16 Oct 2018 15:45:06 -0300
> Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> > So, I thought something had changed and in the past we would somehow
> > find that address in the kallsyms, but I couldn't find anything to back
> > that up, the patch introducing this is over a decade old, lots of things
> > changed, so I was just thinking I was missing something.

> > I tried a gtod busy loop to generate vdso activity and added a 'perf
> > probe' at that branch, on x86_64 to see if it ever gets hit:

> > Made thread__find_map() noinline, as 'perf probe' in lines of inline
> > functions seems to not be working, only at function start. (Masami?)
 
> Thank you for reporting it. Hmm, what happened when you did it?
> I checked some points, but seems no problem.
> (Would you see no line number? or defined event didn't work?)

So, the problem is with a function in the perf tool that ends up inlined
in five places, then when I ask for a line inside of this function to be
probed, it creates 4 probes, one in each of the function it inlines,
see:

[root@jouet ~]# perf probe -x ~/bin/perf -L thread__find_map:43
<thread__find_map@/home/acme/git/perf/tools/perf/util/event.c:43>
     43                 return NULL;
                }
         
     46         al->map = map_groups__find(mg, al->addr);
     47         if (al->map != NULL) {
                        /*
                         * Kernel maps might be changed when loading symbols so loading
                         * must be done prior to using kernel maps.
                         */
                        if (load_map)
     53                         map__load(al->map);
     54                 al->addr = al->map->map_ip(al->map, al->addr);
                }
         
     57         return al->map;
     58  }
         
         struct symbol *thread__find_symbol(struct thread *thread, u8 cpumode,
                                           u64 addr, struct addr_location *al)

[root@jouet ~]# 

So I know that everytime we look for a mmap in a thread and we find that map,
we get to line 54, lets try to add it:

[root@jouet ~]# perf probe -x ~/bin/perf thread__find_map:54
Added new events:
  probe_perf:thread__find_map (on thread__find_map:54 in /home/acme/bin/perf)
  probe_perf:thread__find_map_1 (on thread__find_map:54 in /home/acme/bin/perf)
  probe_perf:thread__find_map_2 (on thread__find_map:54 in /home/acme/bin/perf)
  probe_perf:thread__find_map_3 (on thread__find_map:54 in /home/acme/bin/perf)
  probe_perf:thread__find_map_4 (on thread__find_map:54 in /home/acme/bin/perf)

You can now use it in all perf tools, such as:

	perf record -e probe_perf:thread__find_map_4 -aR sleep 1

[root@jouet ~]#

Now I run 'perf top' and then, on another terminal, run this to get system wide
events to get a few of those probes:

[root@jouet ~]# perf trace -a -e *perf:*/max-stack=7/ sleep 0.02
     0.000 probe_perf:thread__find_map_3:(4be2e3)
                                       machine__resolve (/home/acme/bin/perf)
                                       perf_top__mmap_read_idx (/home/acme/bin/perf)
                                       perf_top__mmap_read (/home/acme/bin/perf)
                                       cmd_top (/home/acme/bin/perf)
                                       run_builtin (/home/acme/bin/perf)
                                       handle_internal_command (/home/acme/bin/perf)
                                       main (/home/acme/bin/perf)
     0.023 probe_perf:thread__find_map_3:(4be2e3)
                                       machine__resolve (/home/acme/bin/perf)
                                       perf_top__mmap_read_idx (/home/acme/bin/perf)
                                       perf_top__mmap_read (/home/acme/bin/perf)
                                       cmd_top (/home/acme/bin/perf)
                                       run_builtin (/home/acme/bin/perf)
                                       handle_internal_command (/home/acme/bin/perf)
                                       main (/home/acme/bin/perf)
     0.048 probe_perf:thread__find_map_3:(4be2e3)
                                       machine__resolve (/home/acme/bin/perf)
                                       perf_top__mmap_read_idx (/home/acme/bin/perf)
                                       perf_top__mmap_read (/home/acme/bin/perf)
                                       cmd_top (/home/acme/bin/perf)
                                       run_builtin (/home/acme/bin/perf)
                                       handle_internal_command (/home/acme/bin/perf)
                                       main (/home/acme/bin/perf)
<SNIP>
[root@jouet ~]# 

So it now I'm not being able to reproduce... Erm, nevermind then, I'll report
back if I notice this again... :-)

Thanks for checking, sorry for the noise.

- Arnaldo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf's handling of unfindable user symbols...
  2018-10-17 12:28         ` Arnaldo Carvalho de Melo
@ 2018-10-18  9:35           ` Masami Hiramatsu
  2018-10-18  9:42           ` Masami Hiramatsu
  1 sibling, 0 replies; 10+ messages in thread
From: Masami Hiramatsu @ 2018-10-18  9:35 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Arnaldo Carvalho de Melo, David Miller, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim

On Wed, 17 Oct 2018 09:28:55 -0300
Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Em Wed, Oct 17, 2018 at 05:22:26PM +0900, Masami Hiramatsu escreveu:
> > On Tue, 16 Oct 2018 15:45:06 -0300
> > Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> > > So, I thought something had changed and in the past we would somehow
> > > find that address in the kallsyms, but I couldn't find anything to back
> > > that up, the patch introducing this is over a decade old, lots of things
> > > changed, so I was just thinking I was missing something.
> 
> > > I tried a gtod busy loop to generate vdso activity and added a 'perf
> > > probe' at that branch, on x86_64 to see if it ever gets hit:
> 
> > > Made thread__find_map() noinline, as 'perf probe' in lines of inline
> > > functions seems to not be working, only at function start. (Masami?)
>  
> > Thank you for reporting it. Hmm, what happened when you did it?
> > I checked some points, but seems no problem.
> > (Would you see no line number? or defined event didn't work?)
> 
> So, the problem is with a function in the perf tool that ends up inlined
> in five places, then when I ask for a line inside of this function to be
> probed, it creates 4 probes, one in each of the function it inlines,
> see:
> 
> [root@jouet ~]# perf probe -x ~/bin/perf -L thread__find_map:43
> <thread__find_map@/home/acme/git/perf/tools/perf/util/event.c:43>
>      43                 return NULL;
>                 }
>          
>      46         al->map = map_groups__find(mg, al->addr);
>      47         if (al->map != NULL) {
>                         /*
>                          * Kernel maps might be changed when loading symbols so loading
>                          * must be done prior to using kernel maps.
>                          */
>                         if (load_map)
>      53                         map__load(al->map);
>      54                 al->addr = al->map->map_ip(al->map, al->addr);
>                 }
>          
>      57         return al->map;
>      58  }
>          
>          struct symbol *thread__find_symbol(struct thread *thread, u8 cpumode,
>                                            u64 addr, struct addr_location *al)
> 
> [root@jouet ~]# 
> 
> So I know that everytime we look for a mmap in a thread and we find that map,
> we get to line 54, lets try to add it:
> 
> [root@jouet ~]# perf probe -x ~/bin/perf thread__find_map:54
> Added new events:
>   probe_perf:thread__find_map (on thread__find_map:54 in /home/acme/bin/perf)
>   probe_perf:thread__find_map_1 (on thread__find_map:54 in /home/acme/bin/perf)
>   probe_perf:thread__find_map_2 (on thread__find_map:54 in /home/acme/bin/perf)
>   probe_perf:thread__find_map_3 (on thread__find_map:54 in /home/acme/bin/perf)
>   probe_perf:thread__find_map_4 (on thread__find_map:54 in /home/acme/bin/perf)
> 
> You can now use it in all perf tools, such as:
> 
> 	perf record -e probe_perf:thread__find_map_4 -aR sleep 1
> 
> [root@jouet ~]#

Hmm, it seems that the line is optimized and spread into 5 places (instructions).
"perf probe -x ~/bin/perf -D thread__find_map:54" will show you the actual address
where the probes are. Or, you can just dump the /sys/kernel/debug/tracing/uprobe_events.

> Now I run 'perf top' and then, on another terminal, run this to get system wide
> events to get a few of those probes:
> 
> [root@jouet ~]# perf trace -a -e *perf:*/max-stack=7/ sleep 0.02
>      0.000 probe_perf:thread__find_map_3:(4be2e3)
>                                        machine__resolve (/home/acme/bin/perf)
>                                        perf_top__mmap_read_idx (/home/acme/bin/perf)
>                                        perf_top__mmap_read (/home/acme/bin/perf)
>                                        cmd_top (/home/acme/bin/perf)
>                                        run_builtin (/home/acme/bin/perf)
>                                        handle_internal_command (/home/acme/bin/perf)
>                                        main (/home/acme/bin/perf)
>      0.023 probe_perf:thread__find_map_3:(4be2e3)
>                                        machine__resolve (/home/acme/bin/perf)
>                                        perf_top__mmap_read_idx (/home/acme/bin/perf)
>                                        perf_top__mmap_read (/home/acme/bin/perf)
>                                        cmd_top (/home/acme/bin/perf)
>                                        run_builtin (/home/acme/bin/perf)
>                                        handle_internal_command (/home/acme/bin/perf)
>                                        main (/home/acme/bin/perf)
>      0.048 probe_perf:thread__find_map_3:(4be2e3)
>                                        machine__resolve (/home/acme/bin/perf)
>                                        perf_top__mmap_read_idx (/home/acme/bin/perf)
>                                        perf_top__mmap_read (/home/acme/bin/perf)
>                                        cmd_top (/home/acme/bin/perf)
>                                        run_builtin (/home/acme/bin/perf)
>                                        handle_internal_command (/home/acme/bin/perf)
>                                        main (/home/acme/bin/perf)
> <SNIP>
> [root@jouet ~]# 
> 
> So it now I'm not being able to reproduce... Erm, nevermind then, I'll report
> back if I notice this again... :-)

Thanks!
I guess it might be broken. I will investigate it.

> 
> Thanks for checking, sorry for the noise.

Issue reports are always welcome :)

> 
> - Arnaldo


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf's handling of unfindable user symbols...
  2018-10-17 12:28         ` Arnaldo Carvalho de Melo
  2018-10-18  9:35           ` Masami Hiramatsu
@ 2018-10-18  9:42           ` Masami Hiramatsu
  1 sibling, 0 replies; 10+ messages in thread
From: Masami Hiramatsu @ 2018-10-18  9:42 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Arnaldo Carvalho de Melo, David Miller, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim

On Wed, 17 Oct 2018 09:28:55 -0300
Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Em Wed, Oct 17, 2018 at 05:22:26PM +0900, Masami Hiramatsu escreveu:
> > On Tue, 16 Oct 2018 15:45:06 -0300
> > Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> > > So, I thought something had changed and in the past we would somehow
> > > find that address in the kallsyms, but I couldn't find anything to back
> > > that up, the patch introducing this is over a decade old, lots of things
> > > changed, so I was just thinking I was missing something.
> 
> > > I tried a gtod busy loop to generate vdso activity and added a 'perf
> > > probe' at that branch, on x86_64 to see if it ever gets hit:
> 
> > > Made thread__find_map() noinline, as 'perf probe' in lines of inline
> > > functions seems to not be working, only at function start. (Masami?)
>  
> > Thank you for reporting it. Hmm, what happened when you did it?
> > I checked some points, but seems no problem.
> > (Would you see no line number? or defined event didn't work?)
> 
> So, the problem is with a function in the perf tool that ends up inlined
> in five places, then when I ask for a line inside of this function to be
> probed, it creates 4 probes, one in each of the function it inlines,
> see:
> 
> [root@jouet ~]# perf probe -x ~/bin/perf -L thread__find_map:43
> <thread__find_map@/home/acme/git/perf/tools/perf/util/event.c:43>
>      43                 return NULL;
>                 }
>          
>      46         al->map = map_groups__find(mg, al->addr);
>      47         if (al->map != NULL) {
>                         /*
>                          * Kernel maps might be changed when loading symbols so loading
>                          * must be done prior to using kernel maps.
>                          */
>                         if (load_map)
>      53                         map__load(al->map);
>      54                 al->addr = al->map->map_ip(al->map, al->addr);
>                 }
>          
>      57         return al->map;
>      58  }
>          
>          struct symbol *thread__find_symbol(struct thread *thread, u8 cpumode,
>                                            u64 addr, struct addr_location *al)
> 
> [root@jouet ~]# 
> 
> So I know that everytime we look for a mmap in a thread and we find that map,
> we get to line 54, lets try to add it:
> 
> [root@jouet ~]# perf probe -x ~/bin/perf thread__find_map:54
> Added new events:
>   probe_perf:thread__find_map (on thread__find_map:54 in /home/acme/bin/perf)
>   probe_perf:thread__find_map_1 (on thread__find_map:54 in /home/acme/bin/perf)
>   probe_perf:thread__find_map_2 (on thread__find_map:54 in /home/acme/bin/perf)
>   probe_perf:thread__find_map_3 (on thread__find_map:54 in /home/acme/bin/perf)
>   probe_perf:thread__find_map_4 (on thread__find_map:54 in /home/acme/bin/perf)
> 
> You can now use it in all perf tools, such as:
> 
> 	perf record -e probe_perf:thread__find_map_4 -aR sleep 1
> 
> [root@jouet ~]#

Hmm, it seems that the line is optimized and spread into 5 places (instructions).
"perf probe -x ~/bin/perf -D thread__find_map:54" will show you the actual address
where the probes are. Or, you can just dump the /sys/kernel/debug/tracing/uprobe_events.

> Now I run 'perf top' and then, on another terminal, run this to get system wide
> events to get a few of those probes:
> 
> [root@jouet ~]# perf trace -a -e *perf:*/max-stack=7/ sleep 0.02
>      0.000 probe_perf:thread__find_map_3:(4be2e3)
>                                        machine__resolve (/home/acme/bin/perf)
>                                        perf_top__mmap_read_idx (/home/acme/bin/perf)
>                                        perf_top__mmap_read (/home/acme/bin/perf)
>                                        cmd_top (/home/acme/bin/perf)
>                                        run_builtin (/home/acme/bin/perf)
>                                        handle_internal_command (/home/acme/bin/perf)
>                                        main (/home/acme/bin/perf)
>      0.023 probe_perf:thread__find_map_3:(4be2e3)
>                                        machine__resolve (/home/acme/bin/perf)
>                                        perf_top__mmap_read_idx (/home/acme/bin/perf)
>                                        perf_top__mmap_read (/home/acme/bin/perf)
>                                        cmd_top (/home/acme/bin/perf)
>                                        run_builtin (/home/acme/bin/perf)
>                                        handle_internal_command (/home/acme/bin/perf)
>                                        main (/home/acme/bin/perf)
>      0.048 probe_perf:thread__find_map_3:(4be2e3)
>                                        machine__resolve (/home/acme/bin/perf)
>                                        perf_top__mmap_read_idx (/home/acme/bin/perf)
>                                        perf_top__mmap_read (/home/acme/bin/perf)
>                                        cmd_top (/home/acme/bin/perf)
>                                        run_builtin (/home/acme/bin/perf)
>                                        handle_internal_command (/home/acme/bin/perf)
>                                        main (/home/acme/bin/perf)
> <SNIP>
> [root@jouet ~]# 
> 
> So it now I'm not being able to reproduce... Erm, nevermind then, I'll report
> back if I notice this again... :-)

OK, I will wait for your report.

> 
> Thanks for checking, sorry for the noise.

Issue reports are always welcome :)

Thank you,

> 
> - Arnaldo


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-10-18  9:42 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-14  7:42 perf's handling of unfindable user symbols David Miller
2018-10-15 22:25 ` Arnaldo Carvalho de Melo
2018-10-15 23:02   ` David Miller
2018-10-16 18:45     ` Arnaldo Carvalho de Melo
2018-10-16 19:02       ` David Miller
2018-10-16 19:37         ` Arnaldo Carvalho de Melo
2018-10-17  8:22       ` Masami Hiramatsu
2018-10-17 12:28         ` Arnaldo Carvalho de Melo
2018-10-18  9:35           ` Masami Hiramatsu
2018-10-18  9:42           ` Masami Hiramatsu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).