* [PATCH 0/2] perf probe fixes for ppc64le
@ 2016-04-06 12:32 Naveen N. Rao
  2016-04-06 12:32 ` [PATCH 1/2] perf/powerpc: Fix kprobe and kretprobe handling with kallsyms Naveen N. Rao
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Naveen N. Rao @ 2016-04-06 12:32 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Mark Wielaard, Thiago Jung Bauermann, Arnaldo Carvalho de Melo,
	Masami Hiramatsu, Michael Ellerman, Ananth N Mavinakayanahalli

This patchset fixes three issues found with perf probe on ppc64le:
1. 'perf test kallsyms' failure on ppc64le (reported by Michael
Ellerman). This was due to the symbols being fixed up during symbol
table load. This is fixed in patch 2 by delaying symbol fixup until
later.
2. perf probe function offset was being calculated from the local entry
point (LEP), which does not match user expectation when trying to look
at function disassembly output (reported by Ananth N). This is fixed for
kallsyms in patch 1 and for symbol table in patch 2.
3. perf probe failure with kretprobe when using kallsyms. This was
failing as we were specifying an offset. This is fixed in patch 1.

A few examples demonstrating the issues and the fix:

Example for issue (2):
--------------------
    # objdump -d vmlinux | grep -A8 \<_do_fork\>:
    c0000000000b6a00 <_do_fork>:
    c0000000000b6a00:	f7 00 4c 3c 	addis   r2,r12,247
    c0000000000b6a04:	00 86 42 38 	addi    r2,r2,-31232
    c0000000000b6a08:	a6 02 08 7c 	mflr    r0
    c0000000000b6a0c:	d0 ff 41 fb 	std     r26,-48(r1)
    c0000000000b6a10:	26 80 90 7d 	mfocrf  r12,8
    c0000000000b6a14:	d8 ff 61 fb 	std     r27,-40(r1)
    c0000000000b6a18:	e0 ff 81 fb 	std     r28,-32(r1)
    c0000000000b6a1c:	e8 ff a1 fb 	std     r29,-24(r1)
    # perf probe -v _do_fork+4
    probe-definition(0): _do_fork+4 
    symbol:_do_fork file:(null) line:0 offset:4 return:0 lazy:(null)
    0 arguments
    Looking at the vmlinux_path (8 entries long)
    Using /proc/kcore for kernel object code
    Using /proc/kallsyms for symbols
    Opening /sys/kernel/debug/tracing//kprobe_events write=1
    Writing event: p:probe/_do_fork _text+748044
    Added new event:
      probe:_do_fork       (on _do_fork+4)

    You can now use it in all perf tools, such as:

	    perf record -e probe:_do_fork -aR sleep 1

    # printf "%x\n" 748044
    b6a0c
    ^^^^^
This is 4 bytes past the LEP (GEP + 8), not 4 bytes past the GEP as the
user asked for. With this behavior, there is also no way to probe
between the GEP and the LEP.

With this patchset:
    # perf probe -v _do_fork+4
    probe-definition(0): _do_fork+4 
    symbol:_do_fork file:(null) line:0 offset:4 return:0 lazy:(null)
    0 arguments
    Looking at the vmlinux_path (8 entries long)
    Using /proc/kcore for kernel object code
    Using /proc/kallsyms for symbols
    Opening /sys/kernel/debug/tracing//kprobe_events write=1
    Writing event: p:probe/_do_fork _text+748036
    Added new event:
      probe:_do_fork       (on _do_fork+4)

    You can now use it in all perf tools, such as:

	    perf record -e probe:_do_fork -aR sleep 1

    # perf probe -v _do_fork
    probe-definition(0): _do_fork 
    symbol:_do_fork file:(null) line:0 offset:0 return:0 lazy:(null)
    0 arguments
    Looking at the vmlinux_path (8 entries long)
    Using /proc/kcore for kernel object code
    Using /proc/kallsyms for symbols
    Opening /sys/kernel/debug/tracing//kprobe_events write=1
    Writing event: p:probe/_do_fork _text+748040
    Added new event:
      probe:_do_fork       (on _do_fork)

    You can now use it in all perf tools, such as:

	    perf record -e probe:_do_fork -aR sleep 1

We offset to the LEP only if the bare function entry is specified;
otherwise, the user-supplied offset is applied from the GEP.
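
The rule can be sketched in a few lines of C. This is only an
illustration of the behavior described in this cover letter, not the
perf code itself: the function names and SKETCH_LEP_OFFSET are
hypothetical, and the fixed 8-byte GEP-to-LEP distance is the kallsyms
assumption used in patch 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption (kallsyms case): the LEP is 8 bytes after the GEP. */
#define SKETCH_LEP_OFFSET 8

/*
 * Behavior before this patchset: the probe is always placed relative
 * to the LEP, even when the user supplied an explicit offset.
 * gep_off is the offset of the function's GEP from _text.
 */
uint64_t sketch_probe_offset_before(uint64_t gep_off, uint64_t user_offset)
{
	return gep_off + SKETCH_LEP_OFFSET + user_offset;
}

/*
 * Behavior with this patchset: an explicit offset or a return probe is
 * taken relative to the GEP; only a bare function name is moved to the
 * LEP.
 */
uint64_t sketch_probe_offset_after(uint64_t gep_off, uint64_t user_offset,
				   bool retprobe)
{
	if (user_offset || retprobe)
		return gep_off + user_offset;

	return gep_off + SKETCH_LEP_OFFSET;
}
```

With the GEP of _do_fork at _text+748032 (0xb6a00), the "before"
function yields 748044 for _do_fork+4, while the "after" function
yields 748036 for _do_fork+4 and 748040 for bare _do_fork, matching the
"Writing event" lines above.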

Example for issue (3):
---------------------
Before patch:
    # perf probe -v _do_fork:%return
    probe-definition(0): _do_fork:%return 
    symbol:_do_fork file:(null) line:0 offset:0 return:1 lazy:(null)
    0 arguments
    Looking at the vmlinux_path (8 entries long)
    Using /proc/kcore for kernel object code
    Using /proc/kallsyms for symbols
    Opening /sys/kernel/debug/tracing//kprobe_events write=1
    Writing event: r:probe/_do_fork _do_fork+8
    Failed to write event: Invalid argument
      Error: Failed to add events. Reason: Invalid argument (Code: -22)

After patch:
    # perf probe _do_fork:%return
    Added new event:
      probe:_do_fork       (on _do_fork%return)

    You can now use it in all perf tools, such as:

	    perf record -e probe:_do_fork -aR sleep 1

Cc: Mark Wielaard <mjw@redhat.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>

Naveen N. Rao (2):
  perf/powerpc: Fix kprobe and kretprobe handling with kallsyms
  tools/perf: Fix kallsyms perf test on ppc64le

 tools/perf/arch/powerpc/util/sym-handling.c | 41 ++++++++++++++++++++---------
 tools/perf/util/probe-event.c               |  5 ++--
 tools/perf/util/probe-event.h               |  3 ++-
 tools/perf/util/symbol-elf.c                |  7 ++---
 tools/perf/util/symbol.h                    |  3 ++-
 5 files changed, 40 insertions(+), 19 deletions(-)

-- 
2.7.4

* [PATCH 1/2] perf/powerpc: Fix kprobe and kretprobe handling with kallsyms
  2016-04-06 12:32 [PATCH 0/2] perf probe fixes for ppc64le Naveen N. Rao
@ 2016-04-06 12:32 ` Naveen N. Rao
  2016-04-07  4:30   ` Ananth N Mavinakayanahalli
  2016-04-06 12:32 ` [PATCH 2/2] tools/perf: Fix kallsyms perf test on ppc64le Naveen N. Rao
  2016-04-07  8:19 ` [PATCH 0/2] perf probe fixes for ppc64le Balbir Singh
  2 siblings, 1 reply; 12+ messages in thread
From: Naveen N. Rao @ 2016-04-06 12:32 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Mark Wielaard, Thiago Jung Bauermann, Arnaldo Carvalho de Melo,
	Masami Hiramatsu, Michael Ellerman

So far, we have treated probe point offsets as offsets from the LEP.
However, userspace tools (objdump/readelf) always show disassembly and
offsets from the function GEP. This confuses users, as we end up probing
at an address different from the one they expect from the function
disassembly shown by readelf/objdump. Fix this by changing how perf
adjusts the probe address.

If only the function name is provided, we assume the user needs the LEP.
Otherwise, if an offset is specified, we assume that the user knows the
exact address to probe based on function disassembly, and so we place
the probe at that offset from the GEP.

Finally, kretprobe was also broken with kallsyms as we were trying to
specify an offset. This patch also fixes that issue.

Cc: Mark Wielaard <mjw@redhat.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 tools/perf/arch/powerpc/util/sym-handling.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/sym-handling.c b/tools/perf/arch/powerpc/util/sym-handling.c
index bbc1a50..c5b4756 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -71,12 +71,21 @@ void arch__fix_tev_from_maps(struct perf_probe_event *pev,
 			     struct probe_trace_event *tev, struct map *map)
 {
 	/*
-	 * ppc64 ABIv2 local entry point is currently always 2 instructions
-	 * (8 bytes) after the global entry point.
+	 * When probing at a function entry point, we normally always want the
+	 * LEP since that catches calls to the function through both the GEP and
+	 * the LEP. Hence, we would like to probe at an offset of 8 bytes if
+	 * the user only specified the function entry.
+	 *
+	 * However, if the user specifies an offset, we fall back to using the
+	 * GEP since all userspace applications (objdump/readelf) show function
+	 * disassembly with offsets from the GEP.
+	 *
+	 * In addition, we shouldn't specify an offset for kretprobes.
 	 */
-	if (!pev->uprobes && map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS) {
-		tev->point.address += PPC64LE_LEP_OFFSET;
+	if (pev->point.offset || pev->point.retprobe)
+		return;
+
+	if (!pev->uprobes && map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS)
 		tev->point.offset += PPC64LE_LEP_OFFSET;
-	}
 }
 #endif
-- 
2.7.4

* [PATCH 2/2] tools/perf: Fix kallsyms perf test on ppc64le
  2016-04-06 12:32 [PATCH 0/2] perf probe fixes for ppc64le Naveen N. Rao
  2016-04-06 12:32 ` [PATCH 1/2] perf/powerpc: Fix kprobe and kretprobe handling with kallsyms Naveen N. Rao
@ 2016-04-06 12:32 ` Naveen N. Rao
  2016-04-06 14:32   ` Ananth N Mavinakayanahalli
  2016-04-07  8:19 ` [PATCH 0/2] perf probe fixes for ppc64le Balbir Singh
  2 siblings, 1 reply; 12+ messages in thread
From: Naveen N. Rao @ 2016-04-06 12:32 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Mark Wielaard, Thiago Jung Bauermann, Ananth N Mavinakayanahalli,
	Arnaldo Carvalho de Melo, Masami Hiramatsu

ppc64le functions have a Global Entry Point (GEP) and a Local Entry
Point (LEP). While placing a probe, we always prefer the LEP since it
catches function calls through both the GEP and the LEP. In order to do
this, we fix up the function entry points during ELF symbol table lookup
to point to the LEPs. This works, but breaks 'perf test kallsyms' since
the symbols loaded from the symbol table (pointing to the LEP) do not
match the symbols in kallsyms.

To fix this, we do not adjust all the symbols during symbol table load,
but only adjust the probe trace point.
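
For reference, the per-symbol LEP offset comes from the st_other field
of the ELF symbol, which this patch saves in the symbol at load time and
decodes only when the probe is placed. A minimal sketch of that
decoding follows. The SKETCH_-prefixed macros are local copies of what I
understand the <elf.h> definitions of STO_PPC64_LOCAL_BIT,
STO_PPC64_LOCAL_MASK and PPC64_LOCAL_ENTRY_OFFSET to be, renamed so the
example does not depend on the system header; sketch_lep_offset() is a
hypothetical helper, not a function in perf.

```c
#include <stdint.h>

/*
 * Assumed to mirror the ELFv2 definitions in <elf.h>: bits 5-7 of
 * st_other encode the distance from the GEP to the LEP.
 */
#define SKETCH_STO_PPC64_LOCAL_BIT	5
#define SKETCH_STO_PPC64_LOCAL_MASK	(7 << SKETCH_STO_PPC64_LOCAL_BIT)
#define SKETCH_PPC64_LOCAL_ENTRY_OFFSET(other)				\
	(((1 << (((other) & SKETCH_STO_PPC64_LOCAL_MASK) >>		\
		 SKETCH_STO_PPC64_LOCAL_BIT)) >> 2) << 2)

/*
 * Hypothetical helper: number of bytes from the GEP to the LEP for a
 * symbol with the given st_other value. Returns 0 when the symbol has
 * no separate local entry point.
 */
unsigned int sketch_lep_offset(uint8_t st_other)
{
	return SKETCH_PPC64_LOCAL_ENTRY_OFFSET(st_other);
}
```

A field value of 3 (st_other == 0x60) decodes to 8 bytes, which is the
usual two-instruction r2 setup; field values of 0 and 1 decode to 0, in
which case the probe offset is left untouched.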

Cc: Mark Wielaard <mjw@redhat.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 tools/perf/arch/powerpc/util/sym-handling.c | 24 ++++++++++++++++--------
 tools/perf/util/probe-event.c               |  5 +++--
 tools/perf/util/probe-event.h               |  3 ++-
 tools/perf/util/symbol-elf.c                |  7 ++++---
 tools/perf/util/symbol.h                    |  3 ++-
 5 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/sym-handling.c b/tools/perf/arch/powerpc/util/sym-handling.c
index c5b4756..2f72aec 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -19,12 +19,6 @@ bool elf__needs_adjust_symbols(GElf_Ehdr ehdr)
 	       ehdr.e_type == ET_DYN;
 }
 
-#if defined(_CALL_ELF) && _CALL_ELF == 2
-void arch__elf_sym_adjust(GElf_Sym *sym)
-{
-	sym->st_value += PPC64_LOCAL_ENTRY_OFFSET(sym->st_other);
-}
-#endif
 #endif
 
 #if !defined(_CALL_ELF) || _CALL_ELF != 2
@@ -65,11 +59,21 @@ bool arch__prefers_symtab(void)
 	return true;
 }
 
+#ifdef HAVE_LIBELF_SUPPORT
+void arch__sym_update(struct symbol *s, GElf_Sym *sym)
+{
+	s->arch_sym = sym->st_other;
+}
+#endif
+
 #define PPC64LE_LEP_OFFSET	8
 
 void arch__fix_tev_from_maps(struct perf_probe_event *pev,
-			     struct probe_trace_event *tev, struct map *map)
+			     struct probe_trace_event *tev, struct map *map,
+			     struct symbol *sym)
 {
+	int lep_offset;
+
 	/*
 	 * When probing at a function entry point, we normally always want the
 	 * LEP since that catches calls to the function through both the GEP and
@@ -82,10 +86,14 @@ void arch__fix_tev_from_maps(struct perf_probe_event *pev,
 	 *
 	 * In addition, we shouldn't specify an offset for kretprobes.
 	 */
-	if (pev->point.offset || pev->point.retprobe)
+	if (pev->point.offset || pev->point.retprobe || !map || !sym)
 		return;
 
+	lep_offset = PPC64_LOCAL_ENTRY_OFFSET(sym->arch_sym);
+
 	if (!pev->uprobes && map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS)
 		tev->point.offset += PPC64LE_LEP_OFFSET;
+	else if (lep_offset)
+		tev->point.offset += lep_offset;
 }
 #endif
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 8319fbb..d786a49 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2498,7 +2498,8 @@ static int find_probe_functions(struct map *map, char *name,
 
 void __weak arch__fix_tev_from_maps(struct perf_probe_event *pev __maybe_unused,
 				struct probe_trace_event *tev __maybe_unused,
-				struct map *map __maybe_unused) { }
+				struct map *map __maybe_unused,
+				struct symbol *sym __maybe_unused) { }
 
 /*
  * Find probe function addresses from map.
@@ -2624,7 +2625,7 @@ static int find_probe_trace_events_from_map(struct perf_probe_event *pev,
 					strdup_or_goto(pev->args[i].type,
 							nomem_out);
 		}
-		arch__fix_tev_from_maps(pev, tev, map);
+		arch__fix_tev_from_maps(pev, tev, map, sym);
 	}
 	if (ret == skipped) {
 		ret = -ENOENT;
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index e54e7b0..9bbc0c1 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -154,7 +154,8 @@ int show_available_vars(struct perf_probe_event *pevs, int npevs,
 int show_available_funcs(const char *module, struct strfilter *filter, bool user);
 bool arch__prefers_symtab(void);
 void arch__fix_tev_from_maps(struct perf_probe_event *pev,
-			     struct probe_trace_event *tev, struct map *map);
+			     struct probe_trace_event *tev, struct map *map,
+			     struct symbol *sym);
 
 /* If there is no space to write, returns -E2BIG. */
 int e_snprintf(char *str, size_t size, const char *format, ...)
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index bc229a7..e6c032e 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -777,7 +777,8 @@ static bool want_demangle(bool is_kernel_sym)
 	return is_kernel_sym ? symbol_conf.demangle_kernel : symbol_conf.demangle;
 }
 
-void __weak arch__elf_sym_adjust(GElf_Sym *sym __maybe_unused) { }
+void __weak arch__sym_update(struct symbol *s __maybe_unused,
+		GElf_Sym *sym __maybe_unused) { }
 
 int dso__load_sym(struct dso *dso, struct map *map,
 		  struct symsrc *syms_ss, struct symsrc *runtime_ss,
@@ -954,8 +955,6 @@ int dso__load_sym(struct dso *dso, struct map *map,
 		    (sym.st_value & 1))
 			--sym.st_value;
 
-		arch__elf_sym_adjust(&sym);
-
 		if (dso->kernel || kmodule) {
 			char dso_name[PATH_MAX];
 
@@ -1089,6 +1088,8 @@ new_symbol:
 		if (!f)
 			goto out_elf_end;
 
+		arch__sym_update(f, &sym);
+
 		if (filter && filter(curr_map, f))
 			symbol__delete(f);
 		else {
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index c8b7544..f0e62e8 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -55,6 +55,7 @@ struct symbol {
 	u16		namelen;
 	u8		binding;
 	bool		ignore;
+	u8		arch_sym;
 	char		name[0];
 };
 
@@ -310,7 +311,7 @@ int setup_intlist(struct intlist **list, const char *list_str,
 
 #ifdef HAVE_LIBELF_SUPPORT
 bool elf__needs_adjust_symbols(GElf_Ehdr ehdr);
-void arch__elf_sym_adjust(GElf_Sym *sym);
+void arch__sym_update(struct symbol *s, GElf_Sym *sym);
 #endif
 
 #define SYMBOL_A 0
-- 
2.7.4

* Re: [PATCH 2/2] tools/perf: Fix kallsyms perf test on ppc64le
  2016-04-06 12:32 ` [PATCH 2/2] tools/perf: Fix kallsyms perf test on ppc64le Naveen N. Rao
@ 2016-04-06 14:32   ` Ananth N Mavinakayanahalli
  0 siblings, 0 replies; 12+ messages in thread
From: Ananth N Mavinakayanahalli @ 2016-04-06 14:32 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: linux-kernel, linuxppc-dev, Mark Wielaard, Thiago Jung Bauermann,
	Arnaldo Carvalho de Melo, Masami Hiramatsu

On Wed, Apr 06, 2016 at 06:02:58PM +0530, Naveen N. Rao wrote:
> ppc64le functions have a Global Entry Point (GEP) and a Local Entry
> Point (LEP). While placing a probe, we always prefer the LEP since it
> catches function calls through both the GEP and the LEP. In order to do
> this, we fixup the function entry points during elf symbol table lookup
> to point to the LEPs. This works, but breaks 'perf test kallsyms' since
> the symbols loaded from the symbol table (pointing to the LEP) do not
> match the symbols in kallsyms.
> 
> To fix this, we do not adjust all the symbols during symbol table load,
> but only adjust the probe trace point.
> 
> Cc: Mark Wielaard <mjw@redhat.com>
> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Reported-by: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>

Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>

* Re: [PATCH 1/2] perf/powerpc: Fix kprobe and kretprobe handling with kallsyms
  2016-04-06 12:32 ` [PATCH 1/2] perf/powerpc: Fix kprobe and kretprobe handling with kallsyms Naveen N. Rao
@ 2016-04-07  4:30   ` Ananth N Mavinakayanahalli
  2016-04-07  6:32     ` Naveen N. Rao
  0 siblings, 1 reply; 12+ messages in thread
From: Ananth N Mavinakayanahalli @ 2016-04-07  4:30 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: linux-kernel, linuxppc-dev, Thiago Jung Bauermann,
	Arnaldo Carvalho de Melo, Masami Hiramatsu, Mark Wielaard

On Wed, Apr 06, 2016 at 06:02:57PM +0530, Naveen N. Rao wrote:

> +	if (!pev->uprobes && map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS)
>  		tev->point.offset += PPC64LE_LEP_OFFSET;

uprobes check against kallsyms? Am I missing something here?

Ananth

* Re: [PATCH 1/2] perf/powerpc: Fix kprobe and kretprobe handling with kallsyms
  2016-04-07  4:30   ` Ananth N Mavinakayanahalli
@ 2016-04-07  6:32     ` Naveen N. Rao
  0 siblings, 0 replies; 12+ messages in thread
From: Naveen N. Rao @ 2016-04-07  6:32 UTC (permalink / raw)
  To: Ananth N Mavinakayanahalli
  Cc: linux-kernel, linuxppc-dev, Thiago Jung Bauermann,
	Arnaldo Carvalho de Melo, Masami Hiramatsu, Mark Wielaard

On 2016/04/07 10:00AM, Ananth N wrote:
> On Wed, Apr 06, 2016 at 06:02:57PM +0530, Naveen N. Rao wrote:
> 
> > +	if (!pev->uprobes && map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS)
> >  		tev->point.offset += PPC64LE_LEP_OFFSET;
> 
> uprobes check against kallsyms? Am I missing something here?

Ah yes. That check shouldn't be necessary since symtab_type would be 
different anyway. I will remove that check.

Thanks for the review!
- Naveen

* Re: [PATCH 0/2] perf probe fixes for ppc64le
  2016-04-06 12:32 [PATCH 0/2] perf probe fixes for ppc64le Naveen N. Rao
  2016-04-06 12:32 ` [PATCH 1/2] perf/powerpc: Fix kprobe and kretprobe handling with kallsyms Naveen N. Rao
  2016-04-06 12:32 ` [PATCH 2/2] tools/perf: Fix kallsyms perf test on ppc64le Naveen N. Rao
@ 2016-04-07  8:19 ` Balbir Singh
  2016-04-07  9:26   ` Naveen N. Rao
  2 siblings, 1 reply; 12+ messages in thread
From: Balbir Singh @ 2016-04-07  8:19 UTC (permalink / raw)
  To: Naveen N. Rao, linux-kernel, linuxppc-dev
  Cc: Mark Wielaard, Arnaldo Carvalho de Melo, Masami Hiramatsu,
	Thiago Jung Bauermann


On 06/04/16 22:32, Naveen N. Rao wrote:
> This patchset fixes three issues found with perf probe on ppc64le:
> 1. 'perf test kallsyms' failure on ppc64le (reported by Michael
> Ellerman). This was due to the symbols being fixed up during symbol
> table load. This is fixed in patch 2 by delaying symbol fixup until
> later.
> 2. perf probe function offset was being calculated from the local entry
> point (LEP), which does not match user expectation when trying to look
> at function disassembly output (reported by Ananth N). This is fixed for
> kallsyms in patch 1 and for symbol table in patch 2.

I think the bit where the offset is w.r.t. the LEP when using a name, but
w.r.t. the GEP when using function+offset can be confusing. Do we really
need probe points between GEP and LEP? All the GEP does is set up r2. The
use case could be more generic, but please clarify.

> 3. perf probe failure with kretprobe when using kallsyms. This was
> failing as we were specifying an offset. This is fixed in patch 1.
> 

Balbir Singh.

* Re: [PATCH 0/2] perf probe fixes for ppc64le
  2016-04-07  8:19 ` [PATCH 0/2] perf probe fixes for ppc64le Balbir Singh
@ 2016-04-07  9:26   ` Naveen N. Rao
  2016-04-08  6:57     ` Balbir Singh
  0 siblings, 1 reply; 12+ messages in thread
From: Naveen N. Rao @ 2016-04-07  9:26 UTC (permalink / raw)
  To: Balbir Singh
  Cc: linux-kernel, linuxppc-dev, Arnaldo Carvalho de Melo,
	Masami Hiramatsu, Thiago Jung Bauermann, Mark Wielaard

On 2016/04/07 06:19PM, Balbir Singh wrote:
> 
> On 06/04/16 22:32, Naveen N. Rao wrote:
> > This patchset fixes three issues found with perf probe on ppc64le:
> > 1. 'perf test kallsyms' failure on ppc64le (reported by Michael
> > Ellerman). This was due to the symbols being fixed up during symbol
> > table load. This is fixed in patch 2 by delaying symbol fixup until
> > later.
> > 2. perf probe function offset was being calculated from the local entry
> > point (LEP), which does not match user expectation when trying to look
> > at function disassembly output (reported by Ananth N). This is fixed for
> > kallsyms in patch 1 and for symbol table in patch 2.
> 
> I think the bit where the offset is w.r.t LEP when using a name, but w.r.t
> GEP when using function+offset can be confusing.

Thanks for your review!

The rationale for this is actually from the end-user perspective. The 
two use cases we are considering are:
1. User just wants to probe at function entry point:
	# perf probe _do_fork

In this case, the user most definitely needs the local entry point, 
without which the probe won't be hit. So, for this case, we 
automatically insert the probe at the LEP.

[We want to alter perf probe behavior in this case only, but we were
incorrectly changing the behavior of perf in the scenario below as
well.]

2. User wants to probe at a specific location. In this case, the user 
most likely starts by looking at the function disassembly. For instance:
	# objdump -S -d vmlinux.bak | grep -A100 \<_do_fork\>:
	c0000000000b6a00 <_do_fork>:
		      unsigned long stack_start,
		      unsigned long stack_size,
		      int __user *parent_tidptr,
		      int __user *child_tidptr,
		      unsigned long tls)
	{
	c0000000000b6a00:	f7 00 4c 3c 	addis   r2,r12,247
	c0000000000b6a04:	00 86 42 38 	addi    r2,r2,-31232
	c0000000000b6a08:	a6 02 08 7c 	mflr    r0
	c0000000000b6a0c:	d0 ff 41 fb 	std     r26,-48(r1)
	c0000000000b6a10:	26 80 90 7d 	mfocrf  r12,8
	...<snip>...
		if (!(clone_flags & CLONE_UNTRACED)) {
	c0000000000b6a54:	e3 4f c7 7b 	rldicl. r7,r30,41,63
	c0000000000b6a58:	2c 00 82 40 	bne     c0000000000b6a84 <_do_fork+0x84>
			if (clone_flags & CLONE_VFORK)
	c0000000000b6a5c:	e3 97 c8 7b 	rldicl. r8,r30,50,63
	c0000000000b6a60:	a0 01 82 41 	beq     c0000000000b6c00 <_do_fork+0x200>
	c0000000000b6a64:	20 00 20 39 	li      r9,32
				trace = PTRACE_EVENT_VFORK;
	c0000000000b6a68:	02 00 80 3b 	li      r28,2
	c0000000000b6a6c:	10 02 4d e9 	ld      r10,528(r13)

If the user wants to probe at _do_fork+0x54, they would run:
	# perf probe _do_fork+0x54

With the earlier approach, we would insert the probe at _do_fork+0x5c 
(0x54 from the LEP) instead, which is incorrect.

In reality, user would probably just use debuginfo:
	# perf probe -L _do_fork
	<_do_fork@/root/linus/kernel/fork.c:0>
	      0  long _do_fork(unsigned long clone_flags,
			      unsigned long stack_start,
			      unsigned long stack_size,
			      int __user *parent_tidptr,
			      int __user *child_tidptr,
			      unsigned long tls)
	      6  {
			struct task_struct *p;
	      8         int trace = 0;
			long nr;
		 
			/*
			 * Determine whether and which event to report to ptracer.  When
			 * called from kernel_thread or CLONE_UNTRACED is explicitly
			 * requested, no event is reported; otherwise, report if the event
			 * for the type of forking is enabled.
			 */
	     17         if (!(clone_flags & CLONE_UNTRACED)) {
	     18                 if (clone_flags & CLONE_VFORK)
	     19                         trace = PTRACE_EVENT_VFORK;
	     20                 else if ((clone_flags & CSIGNAL) != SIGCHLD)
	     21                         trace = PTRACE_EVENT_CLONE;

	# perf probe _do_fork:17

In this case, perf chooses the right address based on DWARF. This
patchset makes the behavior of perf without debuginfo match that.

> Do we really need probe
> points between GEP and LEP? All the GEP does is set up r2. The use case
> could be more generic, but please clarify.

There could be scenarios where having a probe point between GEP and LEP 
is useful - for instance, if we are only interested in calls to an 
in-kernel function from an external module. However, this is a secondary 
consideration and the more important consideration was to be consistent 
with userspace tooling (readelf/objdump) while choosing the address to 
probe.

- Naveen

* Re: [PATCH 0/2] perf probe fixes for ppc64le
  2016-04-07  9:26   ` Naveen N. Rao
@ 2016-04-08  6:57     ` Balbir Singh
  2016-04-09 13:42       ` Naveen N. Rao
  0 siblings, 1 reply; 12+ messages in thread
From: Balbir Singh @ 2016-04-08  6:57 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: linux-kernel, linuxppc-dev, Arnaldo Carvalho de Melo,
	Masami Hiramatsu, Thiago Jung Bauermann, Mark Wielaard

On Thu, 2016-04-07 at 14:56 +0530, Naveen N. Rao wrote:
> On 2016/04/07 06:19PM, Balbir Singh wrote:
> > 
> > 
> > On 06/04/16 22:32, Naveen N. Rao wrote:
> > > 
> > > This patchset fixes three issues found with perf probe on ppc64le:
> > > 1. 'perf test kallsyms' failure on ppc64le (reported by Michael
> > > Ellerman). This was due to the symbols being fixed up during symbol
> > > table load. This is fixed in patch 2 by delaying symbol fixup until
> > > later.
> > > 2. perf probe function offset was being calculated from the local entry
> > > point (LEP), which does not match user expectation when trying to look
> > > at function disassembly output (reported by Ananth N). This is fixed for
> > > kallsyms in patch 1 and for symbol table in patch 2.
> > I think the bit where the offset is w.r.t LEP when using a name, but w.r.t
> > GEP when using function+offset can be confusing.
> Thanks for your review!
> 
> The rationale for this is actually from the end-user perspective. The 
> two use cases we are considering are:
> 1. User just wants to probe at function entry point:
> 	# perf probe _do_fork
> 
> In this case, the user most definitely needs the local entry point, 
> without which the probe won't be hit. So, for this case, we 
> automatically insert the probe at the LEP.
> 
> [We really only want to alter perf probe behavior in this case only, but 
> we were incorrectly changing the behavior of perf with the below 
> scenario as well.]
> 
> 2. User wants to probe at a specific location. In this case, the user 
> most likely starts by looking at the function disassembly. For instance:
> 	# objdump -S -d vmlinux.bak | grep -A100 \<_do_fork\>:
> 	c0000000000b6a00 <_do_fork>:
> 		      unsigned long stack_start,
> 		      unsigned long stack_size,
> 		      int __user *parent_tidptr,
> 		      int __user *child_tidptr,
> 		      unsigned long tls)
> 	{
> 	c0000000000b6a00:	f7 00 4c 3c 	addis   r2,r12,247
> 	c0000000000b6a04:	00 86 42 38 	addi    r2,r2,-31232
> 	c0000000000b6a08:	a6 02 08 7c 	mflr    r0
> 	c0000000000b6a0c:	d0 ff 41 fb 	std     r26,-48(r1)
> 	c0000000000b6a10:	26 80 90 7d 	mfocrf  r12,8
> 	...<snip>...
> 		if (!(clone_flags & CLONE_UNTRACED)) {
> 	c0000000000b6a54:	e3 4f c7 7b 	rldicl. r7,r30,41,63
> 	c0000000000b6a58:	2c 00 82 40 	bne     c0000000000b6a84 <_do_fork+0x84>
> 			if (clone_flags & CLONE_VFORK)
> 	c0000000000b6a5c:	e3 97 c8 7b 	rldicl. r8,r30,50,63
> 	c0000000000b6a60:	a0 01 82 41 	beq     c0000000000b6c00 <_do_fork+0x200>
> 	c0000000000b6a64:	20 00 20 39 	li      r9,32
> 				trace = PTRACE_EVENT_VFORK;
> 	c0000000000b6a68:	02 00 80 3b 	li      r28,2
> 	c0000000000b6a6c:	10 02 4d e9 	ld      r10,528(r13)
> 
> If the user wants to probe at _do_fork+0x54, he'd do:
> 	# perf probe _do_fork+0x54
> 
> With the earlier approach, we would insert the probe at _do_fork+0x5c 
> (0x54 from the LEP) instead, which is incorrect.
> 
> In reality, user would probably just use debuginfo:
> 	# perf probe -L _do_fork
> 	<_do_fork@/root/linus/kernel/fork.c:0>
> 	      0  long _do_fork(unsigned long clone_flags,
> 			      unsigned long stack_start,
> 			      unsigned long stack_size,
> 			      int __user *parent_tidptr,
> 			      int __user *child_tidptr,
> 			      unsigned long tls)
> 	      6  {
> 			struct task_struct *p;
> 	      8         int trace = 0;
> 			long nr;
> 		 
> 			/*
> 			 * Determine whether and which event to report to ptracer.  When
> 			 * called from kernel_thread or CLONE_UNTRACED is explicitly
> 			 * requested, no event is reported; otherwise, report if the event
> 			 * for the type of forking is enabled.
> 			 */
> 	     17         if (!(clone_flags & CLONE_UNTRACED)) {
> 	     18                 if (clone_flags & CLONE_VFORK)
> 	     19                         trace = PTRACE_EVENT_VFORK;
> 	     20                 else if ((clone_flags & CSIGNAL) != SIGCHLD)
> 	     21                         trace = PTRACE_EVENT_CLONE;
> 
> 	# perf probe _do_fork:17
> 
> In this case, perf chooses the right address based on DWARF. The current 
> patchset matches the behavior of perf without debuginfo with this.


I agree. What worries me is that 'perf probe _do_fork' sets a breakpoint
at a later address than 'perf probe _do_fork+0x4'. I am not sure if
there is an easy solution to the problem.

Balbir

* Re: [PATCH 0/2] perf probe fixes for ppc64le
  2016-04-08  6:57     ` Balbir Singh
@ 2016-04-09 13:42       ` Naveen N. Rao
  2016-04-11  4:41         ` Michael Ellerman
  0 siblings, 1 reply; 12+ messages in thread
From: Naveen N. Rao @ 2016-04-09 13:42 UTC (permalink / raw)
  To: Balbir Singh
  Cc: linux-kernel, linuxppc-dev, Arnaldo Carvalho de Melo,
	Masami Hiramatsu, Thiago Jung Bauermann, Mark Wielaard

On 2016/04/08 04:57PM, Balbir Singh wrote:
> On Thu, 2016-04-07 at 14:56 +0530, Naveen N. Rao wrote:
> > On 2016/04/07 06:19PM, Balbir Singh wrote:
> > > 
> > > 
> > > On 06/04/16 22:32, Naveen N. Rao wrote:
> > > > 
> > > > This patchset fixes three issues found with perf probe on ppc64le:
> > > > 1. 'perf test kallsyms' failure on ppc64le (reported by Michael
> > > > Ellerman). This was due to the symbols being fixed up during symbol
> > > > table load. This is fixed in patch 2 by delaying symbol fixup until
> > > > later.
> > > > 2. perf probe function offset was being calculated from the local entry
> > > > point (LEP), which does not match user expectation when trying to look
> > > > at function disassembly output (reported by Ananth N). This is fixed for
> > > > kallsyms in patch 1 and for symbol table in patch 2.
> > > I think the bit where the offset is w.r.t LEP when using a name, but w.r.t
> > > GEP when using function+offset can be confusing.
> > Thanks for your review!
> > 
> > The rationale for this is actually from the end-user perspective. The 
> > two use cases we are considering are:
> > 1. User just wants to probe at function entry point:
> > 	# perf probe _do_fork
> > 
> > In this case, the user most definitely needs the local entry point, 
> > without which the probe won't be hit. So, for this case, we 
> > automatically insert the probe at the LEP.
> > 
> > [We really only want to alter perf probe behavior in this case only, but 
> > we were incorrectly changing the behavior of perf with the below 
> > scenario as well.]
> > 
> > 2. User wants to probe at a specific location. In this case, the user 
> > most likely starts by looking at the function disassembly. For instance:
> > 	# objdump -S -d vmlinux.bak | grep -A100 \<_do_fork\>:
> > 	c0000000000b6a00 <_do_fork>:
> > 		      unsigned long stack_start,
> > 		      unsigned long stack_size,
> > 		      int __user *parent_tidptr,
> > 		      int __user *child_tidptr,
> > 		      unsigned long tls)
> > 	{
> > 	c0000000000b6a00:	f7 00 4c 3c 	addis   r2,r12,247
> > 	c0000000000b6a04:	00 86 42 38 	addi    r2,r2,-31232
> > 	c0000000000b6a08:	a6 02 08 7c 	mflr    r0
> > 	c0000000000b6a0c:	d0 ff 41 fb 	std     r26,-48(r1)
> > 	c0000000000b6a10:	26 80 90 7d 	mfocrf  r12,8
> > 	...<snip>...
> > 		if (!(clone_flags & CLONE_UNTRACED)) {
> > 	c0000000000b6a54:	e3 4f c7 7b 	rldicl. r7,r30,41,63
> > 	c0000000000b6a58:	2c 00 82 40 	bne     c0000000000b6a84 <_do_fork+0x84>
> > 			if (clone_flags & CLONE_VFORK)
> > 	c0000000000b6a5c:	e3 97 c8 7b 	rldicl. r8,r30,50,63
> > 	c0000000000b6a60:	a0 01 82 41 	beq     c0000000000b6c00 <_do_fork+0x200>
> > 	c0000000000b6a64:	20 00 20 39 	li      r9,32
> > 				trace = PTRACE_EVENT_VFORK;
> > 	c0000000000b6a68:	02 00 80 3b 	li      r28,2
> > 	c0000000000b6a6c:	10 02 4d e9 	ld      r10,528(r13)
> > 
> > If the user wants to probe at _do_fork+0x54, he'd do:
> > 	# perf probe _do_fork+0x54
> > 
> > With the earlier approach, we would insert the probe at _do_fork+0x5c 
> > (0x54 from the LEP) instead, which is incorrect.
> > 
> > In reality, user would probably just use debuginfo:
> > 	# perf probe -L _do_fork
> > 	<_do_fork@/root/linus/kernel/fork.c:0>
> > 	      0  long _do_fork(unsigned long clone_flags,
> > 			      unsigned long stack_start,
> > 			      unsigned long stack_size,
> > 			      int __user *parent_tidptr,
> > 			      int __user *child_tidptr,
> > 			      unsigned long tls)
> > 	      6  {
> > 			struct task_struct *p;
> > 	      8         int trace = 0;
> > 			long nr;
> > 		 
> > 			/*
> > 			 * Determine whether and which event to report to ptracer.  When
> > 			 * called from kernel_thread or CLONE_UNTRACED is explicitly
> > 			 * requested, no event is reported; otherwise, report if the event
> > 			 * for the type of forking is enabled.
> > 			 */
> > 	     17         if (!(clone_flags & CLONE_UNTRACED)) {
> > 	     18                 if (clone_flags & CLONE_VFORK)
> > 	     19                         trace = PTRACE_EVENT_VFORK;
> > 	     20                 else if ((clone_flags & CSIGNAL) != SIGCHLD)
> > 	     21                         trace = PTRACE_EVENT_CLONE;
> > 
> > 	# perf probe _do_fork:17
> > 
> > In this case, perf chooses the right address based on DWARF. The current 
> > patchset matches the behavior of perf without debuginfo with this.
> 
> 
> I agree. What worries me is that perf probe _do_fork sets a breakpoint
> after perf probe _do_fork+0x4. I am not sure there is an easy solution
> to the problem.

I suppose this boils down to the quirkiness of ABIv2. Though, in 
reality, I don't think most users will notice. As I stated above, users 
will most likely start with the disassembly or debuginfo and this patch 
ensures there are actually no surprises there.
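
To make the arithmetic concrete, here is a minimal sketch using the
_do_fork addresses from the disassembly quoted above. It is only an
illustration, not perf's code: the helper names are hypothetical, and the
8-byte LEP offset is an assumption taken from the two TOC-setup
instructions at the start of _do_fork.

```python
# Assumption: the LEP sits two instructions (8 bytes) past the GEP,
# as in the _do_fork disassembly quoted in this thread.
LEP_OFFSET = 8

DO_FORK_GEP = 0xc0000000000b6a00  # address of <_do_fork> from objdump


def probe_address_before(gep, offset=None):
    """Earlier behavior (hypothetical helper): everything counted from the LEP."""
    return gep + LEP_OFFSET + (offset or 0)


def probe_address_after(gep, offset=None):
    """Behavior with this patchset (hypothetical helper).

    A bare function name resolves to the LEP; an explicit offset is
    counted from the GEP, matching the disassembly.
    """
    if offset is None:
        return gep + LEP_OFFSET
    return gep + offset


for spec, off in (("_do_fork", None), ("_do_fork+0x4", 0x4), ("_do_fork+0x54", 0x54)):
    before = probe_address_before(DO_FORK_GEP, off)
    after = probe_address_after(DO_FORK_GEP, off)
    print(f"{spec:14s} before: {before:#x}  after: {after:#x}")
```

With the patchset, _do_fork+0x54 resolves to the address objdump shows
for that offset, while plain _do_fork still resolves 8 bytes in. That is
why a probe on the bare name ends up after one on _do_fork+0x4.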

- Naveen


* Re: [PATCH 0/2] perf probe fixes for ppc64le
  2016-04-09 13:42       ` Naveen N. Rao
@ 2016-04-11  4:41         ` Michael Ellerman
  2016-04-11 13:43           ` Naveen N. Rao
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Ellerman @ 2016-04-11  4:41 UTC (permalink / raw)
  To: Naveen N. Rao, Balbir Singh
  Cc: linux-kernel, Mark Wielaard, Arnaldo Carvalho de Melo,
	Masami Hiramatsu, Thiago Jung Bauermann, linuxppc-dev

On Sat, 2016-04-09 at 19:12 +0530, Naveen N. Rao wrote:
> 
> I suppose this boils down to the quirkiness of ABIv2. Though, in 
> reality, I don't think most users will notice. As I stated above, users 
> will most likely start with the disassembly or debuginfo and this patch 
> ensures there are actually no surprises there.

Yeah it's unfortunate that we have to handle these two cases differently.

But I think you've chosen the right trade off.

When we are just given the name we *must not* use the global entry point,
otherwise the probes will often not hit - because most calls go to the local
entry point and skip the global entry point entirely.

When we're given a name and offset, it's less confusing if we use the global
entry point as the base for the offset calculation.
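
As a rough illustration of the first point, the toy model below counts
how often a probe would fire at each entry point. Everything in it is an
assumption made for the sketch: the 8-byte LEP offset, the helper names,
and the made-up mix of nine local calls to one global call. It only
models the fact that a call through the global entry point falls through
to the local entry point, while a local call never executes the
instructions before it.

```python
# Assumption: two TOC-setup instructions (8 bytes) precede the LEP.
LEP_OFFSET = 8


def offsets_executed(entry):
    """Offsets of the first few instructions run by one call.

    A call via the global entry point starts at offset 0 and falls
    through to the LEP; a local call starts directly at the LEP.
    """
    start = 0 if entry == "global" else LEP_OFFSET
    return set(range(start, 0x20, 4))


def hits(probe_offset, calls):
    """Number of calls that execute the instruction at probe_offset."""
    return sum(probe_offset in offsets_executed(entry) for entry in calls)


# Made-up call mix: most callers use the local entry point.
calls = ["local"] * 9 + ["global"]

print("probe at GEP hits:", hits(0, calls))
print("probe at LEP hits:", hits(LEP_OFFSET, calls))
```

In this model the probe at the LEP sees every call, while the probe at
the GEP sees only the single call that came in through the global entry
point.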

So for the concept:

Acked-by: Michael Ellerman <mpe@ellerman.id.au>

I don't really know this part of the perf code well enough to give you an ack
for the actual changes, so I'll leave that to the perf maintainers.

cheers


* Re: [PATCH 0/2] perf probe fixes for ppc64le
  2016-04-11  4:41         ` Michael Ellerman
@ 2016-04-11 13:43           ` Naveen N. Rao
  0 siblings, 0 replies; 12+ messages in thread
From: Naveen N. Rao @ 2016-04-11 13:43 UTC (permalink / raw)
  To: Michael Ellerman, Arnaldo Carvalho de Melo
  Cc: Balbir Singh, linux-kernel, Mark Wielaard, Masami Hiramatsu,
	Thiago Jung Bauermann, linuxppc-dev, Ananth N Mavinakayanahalli

On 2016/04/11 02:41PM, Michael Ellerman wrote:
> On Sat, 2016-04-09 at 19:12 +0530, Naveen N. Rao wrote:
> > 
> > I suppose this boils down to the quirkiness of ABIv2. Though, in 
> > reality, I don't think most users will notice. As I stated above, users 
> > will most likely start with the disassembly or debuginfo and this patch 
> > ensures there are actually no surprises there.
> 
> Yeah it's unfortunate that we have to handle these two cases differently.
> 
> But I think you've chosen the right trade off.
> 
> When we are just given the name we *must not* use the global entry point,
> otherwise the probes will often not hit - because most calls go to the local
> entry point and skip the global entry point entirely.
> 
> When we're given a name and offset, it's less confusing if we use the global
> entry point as the base for the offset calculation.
> 
> So for the concept:
> 
> Acked-by: Michael Ellerman <mpe@ellerman.id.au>

Thanks, Michael. That helps.

> 
> I don't really know this part of the perf code well enough to give you an ack
> for the actual changes, so I'll leave that to the perf maintainers.

Sure.

Arnaldo,
I will send a v2 soon with a bit more testing to make sure this covers 
all scenarios properly (I am also trying to see if we can address 
debuginfo-based probing properly).

- Naveen


Thread overview: 12+ messages
2016-04-06 12:32 [PATCH 0/2] perf probe fixes for ppc64le Naveen N. Rao
2016-04-06 12:32 ` [PATCH 1/2] perf/powerpc: Fix kprobe and kretprobe handling with kallsyms Naveen N. Rao
2016-04-07  4:30   ` Ananth N Mavinakayanahalli
2016-04-07  6:32     ` Naveen N. Rao
2016-04-06 12:32 ` [PATCH 2/2] tools/perf: Fix kallsyms perf test on ppc64le Naveen N. Rao
2016-04-06 14:32   ` Ananth N Mavinakayanahalli
2016-04-07  8:19 ` [PATCH 0/2] perf probe fixes for ppc64le Balbir Singh
2016-04-07  9:26   ` Naveen N. Rao
2016-04-08  6:57     ` Balbir Singh
2016-04-09 13:42       ` Naveen N. Rao
2016-04-11  4:41         ` Michael Ellerman
2016-04-11 13:43           ` Naveen N. Rao
