linux-kernel.vger.kernel.org archive mirror
* [PATCH] [RFC] ARM: perf: allow tracing with kernel tracepoints events
@ 2014-05-16 15:01 Jean Pihet
  2014-05-19 15:39 ` Will Deacon
  0 siblings, 1 reply; 10+ messages in thread
From: Jean Pihet @ 2014-05-16 15:01 UTC
  To: linux-arm-kernel, Will Deacon, linux-kernel; +Cc: Jiri Olsa, Jean Pihet

When tracing with tracepoint events the IP and CPSR are set to 0,
preventing the perf code from resolving the symbols:

./perf record -e kmem:kmalloc cal
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.007 MB perf.data (~321 samples) ]

./perf report
Overhead Command Shared Object Symbol
........ ....... ............. ...........
40.78%   cal     [unknown]     [.]00000000
31.6%    cal     [unknown]     [.]00000000

The examination of the gathered samples (perf report -D) shows the IP
is set to 0 and that the samples are considered as user space samples,
while the IP should be set from the registers and the samples should be
considered as kernel samples.

The fix is to implement perf_arch_fetch_caller_regs for ARM, which
fills the necessary registers: ip, lr, sp and cpsr (used to check
the user mode property of the samples).

Heavily inspired from arch/arm/include/asm/kexec.h.

Reported by Sneha Priya on linaro-dev, cf.
http://lists.linaro.org/pipermail/linaro-dev/2014-May/017151.html

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Reported-by: Sneha Priya <sneha.cse@hotmail.com>
---
 arch/arm/include/asm/perf_event.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/arm/include/asm/perf_event.h b/arch/arm/include/asm/perf_event.h
index 7558775..d466e39 100644
--- a/arch/arm/include/asm/perf_event.h
+++ b/arch/arm/include/asm/perf_event.h
@@ -26,6 +26,19 @@ struct pt_regs;
 extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
 extern unsigned long perf_misc_flags(struct pt_regs *regs);
 #define perf_misc_flags(regs)	perf_misc_flags(regs)
+
+#define perf_arch_fetch_caller_regs(regs, __ip) {	\
+	instruction_pointer(regs)= (__ip);		\
+	__asm__ __volatile__ (				\
+		"mov	%[_ARM_sp], sp\n\t"		\
+		"str	lr, %[_ARM_lr]\n\t"		\
+		"mrs	%[_ARM_cpsr], cpsr\n\t"		\
+		: [_ARM_cpsr] "=r" (regs->ARM_cpsr),	\
+		  [_ARM_sp] "=r" (regs->ARM_sp),	\
+		  [_ARM_lr] "=o" (regs->ARM_lr)		\
+		: : "memory"				\
+	);						\
+}
 #endif
 
 #endif /* __ARM_PERF_EVENT_H__ */
-- 
1.8.1.2


* Re: [PATCH] [RFC] ARM: perf: allow tracing with kernel tracepoints events
  2014-05-16 15:01 [PATCH] [RFC] ARM: perf: allow tracing with kernel tracepoints events Jean Pihet
@ 2014-05-19 15:39 ` Will Deacon
  2014-05-19 15:58   ` Jean Pihet
  0 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2014-05-19 15:39 UTC
  To: Jean Pihet; +Cc: linux-arm-kernel, linux-kernel, Jiri Olsa

Hi Jean,

On Fri, May 16, 2014 at 04:01:16PM +0100, Jean Pihet wrote:
> When tracing with tracepoint events the IP and CPSR are set to 0,
> preventing the perf code from resolving the symbols:
> 
> ./perf record -e kmem:kmalloc cal
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.007 MB perf.data (~321 samples) ]
> 
> ./perf report
> Overhead Command Shared Object Symbol
> ........ ....... ............. ...........
> 40.78%   cal     [unknown]     [.]00000000
> 31.6%    cal     [unknown]     [.]00000000
> 
> The examination of the gathered samples (perf report -D) shows the IP
> is set to 0 and that the samples are considered as user space samples,
> while the IP should be set from the registers and the samples should be
> considered as kernel samples.
> 
> The fix is to implement perf_arch_fetch_caller_regs for ARM, which
> fills the necessary registers: ip, lr, sp and cpsr (used to check
> the user mode property of the samples).
> 
> Heavily inspired from arch/arm/include/asm/kexec.h.
> 
> Reported by Sneha Priya on linaro-dev, cf.
> http://lists.linaro.org/pipermail/linaro-dev/2014-May/017151.html
> 
> Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
> Cc: Will Deacon <will.deacon@arm.com>
> Reported-by: Sneha Priya <sneha.cse@hotmail.com>
> ---
>  arch/arm/include/asm/perf_event.h | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/arm/include/asm/perf_event.h b/arch/arm/include/asm/perf_event.h
> index 7558775..d466e39 100644
> --- a/arch/arm/include/asm/perf_event.h
> +++ b/arch/arm/include/asm/perf_event.h
> @@ -26,6 +26,19 @@ struct pt_regs;
>  extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
>  extern unsigned long perf_misc_flags(struct pt_regs *regs);
>  #define perf_misc_flags(regs)	perf_misc_flags(regs)
> +
> +#define perf_arch_fetch_caller_regs(regs, __ip) {	\
> +	instruction_pointer(regs)= (__ip);		\
> +	__asm__ __volatile__ (				\
> +		"mov	%[_ARM_sp], sp\n\t"		\
> +		"str	lr, %[_ARM_lr]\n\t"		\
> +		"mrs	%[_ARM_cpsr], cpsr\n\t"		\
> +		: [_ARM_cpsr] "=r" (regs->ARM_cpsr),	\
> +		  [_ARM_sp] "=r" (regs->ARM_sp),	\
> +		  [_ARM_lr] "=o" (regs->ARM_lr)		\
> +		: : "memory"				\
> +	);						\
> +}

Why do we need to save lr? If it's for unwinding, what about fp? Also, why
do you have a "memory" clobber and why is this block marked volatile?

Will

* Re: [PATCH] [RFC] ARM: perf: allow tracing with kernel tracepoints events
  2014-05-19 15:39 ` Will Deacon
@ 2014-05-19 15:58   ` Jean Pihet
  2014-06-17 17:11     ` [PATCH] " Jean Pihet
  0 siblings, 1 reply; 10+ messages in thread
From: Jean Pihet @ 2014-05-19 15:58 UTC
  To: Will Deacon; +Cc: linux-arm-kernel, linux-kernel, Jiri Olsa

Hi Will,

On 19 May 2014 17:39, Will Deacon <will.deacon@arm.com> wrote:
> Hi Jean,
>
> On Fri, May 16, 2014 at 04:01:16PM +0100, Jean Pihet wrote:
>> When tracing with tracepoint events the IP and CPSR are set to 0,
>> preventing the perf code from resolving the symbols:
>>
>> ./perf record -e kmem:kmalloc cal
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.007 MB perf.data (~321 samples) ]
>>
>> ./perf report
>> Overhead Command Shared Object Symbol
>> ........ ....... ............. ...........
>> 40.78%   cal     [unknown]     [.]00000000
>> 31.6%    cal     [unknown]     [.]00000000
>>
>> The examination of the gathered samples (perf report -D) shows the IP
>> is set to 0 and that the samples are considered as user space samples,
>> while the IP should be set from the registers and the samples should be
>> considered as kernel samples.
>>
>> The fix is to implement perf_arch_fetch_caller_regs for ARM, which
>> fills the necessary registers: ip, lr, sp and cpsr (used to check
>> the user mode property of the samples).
>>
>> Heavily inspired from arch/arm/include/asm/kexec.h.
>>
>> Reported by Sneha Priya on linaro-dev, cf.
>> http://lists.linaro.org/pipermail/linaro-dev/2014-May/017151.html
>>
>> Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Reported-by: Sneha Priya <sneha.cse@hotmail.com>
>> ---
>>  arch/arm/include/asm/perf_event.h | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/perf_event.h b/arch/arm/include/asm/perf_event.h
>> index 7558775..d466e39 100644
>> --- a/arch/arm/include/asm/perf_event.h
>> +++ b/arch/arm/include/asm/perf_event.h
>> @@ -26,6 +26,19 @@ struct pt_regs;
>>  extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
>>  extern unsigned long perf_misc_flags(struct pt_regs *regs);
>>  #define perf_misc_flags(regs)        perf_misc_flags(regs)
>> +
>> +#define perf_arch_fetch_caller_regs(regs, __ip) {    \
>> +     instruction_pointer(regs)= (__ip);              \
>> +     __asm__ __volatile__ (                          \
>> +             "mov    %[_ARM_sp], sp\n\t"             \
>> +             "str    lr, %[_ARM_lr]\n\t"             \
>> +             "mrs    %[_ARM_cpsr], cpsr\n\t"         \
>> +             : [_ARM_cpsr] "=r" (regs->ARM_cpsr),    \
>> +               [_ARM_sp] "=r" (regs->ARM_sp),        \
>> +               [_ARM_lr] "=o" (regs->ARM_lr)         \
>> +             : : "memory"                            \
>> +     );                                              \
>> +}
>
> Why do we need to save lr? If it's for unwinding, what about fp? Also, why
> do you have a "memory" clobber and why is this block marked volatile?
These are all valid questions, hence the RFC state of the patch.

Here is the comment about the macro from include/linux/perf_event.h:
/*
 * Take a snapshot of the regs. Skip ip and frame pointer to
 * the nth caller. We only need a few of the regs:
 * - ip for PERF_SAMPLE_IP
 * - cs for user_mode() tests
 * - bp for callchains
 * - eflags, for future purposes, just in case
 */
static inline void perf_fetch_caller_regs(struct pt_regs *regs)
...

So, is it OK to provide a version that saves ip, cpsr (for
user_mode()), lr and fp (for callchain)?
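
As a rough user-space illustration of the snapshot that comment describes,
here is a minimal sketch assuming a GCC- or Clang-compatible compiler. The
struct caller_snapshot type and the snapshot_caller() helper are hypothetical
names, not kernel API; only the two builtins are real compiler features. It
records a return address to stand in for ip and a frame address to seed a
frame-pointer walk:

```c
#include <stdint.h>

/* Hypothetical, reduced register snapshot; not the kernel's struct pt_regs. */
struct caller_snapshot {
	uintptr_t ip;	/* address the sample would be attributed to */
	uintptr_t fp;	/* starting point for a frame-pointer unwind */
};

/*
 * noinline keeps this function in its own frame, so the builtins
 * describe this call rather than whatever it was inlined into.
 */
__attribute__((noinline))
static void snapshot_caller(struct caller_snapshot *snap)
{
	/* Return address of this call, i.e. a location inside the caller. */
	snap->ip = (uintptr_t)__builtin_return_address(0);
	/* Frame address of this function. */
	snap->fp = (uintptr_t)__builtin_frame_address(0);
}
```

The ARM macro in the patch fills sp and cpsr as well, using inline assembly;
this sketch has no portable equivalent for those two.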

The clobber and volatile are from my copy/paste from the kexec code.
The memory clobber is there because we are touching the regs struct in
memory. Just tell me if those are overkill, I will remove them.

Thanks for reviewing,
Jean

>
> Will

* [PATCH] ARM: perf: allow tracing with kernel tracepoints events
  2014-05-19 15:58   ` Jean Pihet
@ 2014-06-17 17:11     ` Jean Pihet
  2014-06-18 12:53       ` Will Deacon
  0 siblings, 1 reply; 10+ messages in thread
From: Jean Pihet @ 2014-06-17 17:11 UTC
  To: Will Deacon, linux-arm-kernel, linaro-kernel, Sneha Priya, linux-kernel
  Cc: Jean Pihet

When tracing with tracepoint events the IP and CPSR are set to 0,
preventing the perf code from resolving the symbols:

./perf record -e kmem:kmalloc cal
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.007 MB perf.data (~321 samples) ]

./perf report
Overhead Command Shared Object Symbol
........ ....... ............. ...........
40.78%   cal     [unknown]     [.]00000000
31.6%    cal     [unknown]     [.]00000000

The examination of the gathered samples (perf report -D) shows the IP
is set to 0 and that the samples are considered as user space samples,
while the IP should be set from the registers and the samples should be
considered as kernel samples.

The fix is to implement perf_arch_fetch_caller_regs for ARM, which
fills the necessary registers used for the callchain unwinding and
to determine the user/kernel space property of the samples: ip, sp, fp
and cpsr.

Tested with perf record and tracepoints filtering (-e <tracepoint>), with
unwinding using fp (--call-graph fp) and dwarf info (--call-graph dwarf).

Reported by Sneha Priya on linaro-dev, cf.
http://lists.linaro.org/pipermail/linaro-dev/2014-May/017151.html

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Reported-by: Sneha Priya <sneha.cse@hotmail.com>
---
 arch/arm/include/asm/perf_event.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm/include/asm/perf_event.h b/arch/arm/include/asm/perf_event.h
index 7558775..5e31d46 100644
--- a/arch/arm/include/asm/perf_event.h
+++ b/arch/arm/include/asm/perf_event.h
@@ -26,6 +26,25 @@ struct pt_regs;
 extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
 extern unsigned long perf_misc_flags(struct pt_regs *regs);
 #define perf_misc_flags(regs)	perf_misc_flags(regs)
+
+/*
+ * Take a snapshot of the regs.
+ * We only need a few of the regs:
+ * - ip for PERF_SAMPLE_IP
+ * - sp, fp for callchains
+ * - cpsr for user_mode() tests
+ */
+#define perf_arch_fetch_caller_regs(regs, __ip) {	\
+	instruction_pointer(regs)= (__ip);		\
+	__asm__ (					\
+		"mov	%[_ARM_sp], sp		\n\t"	\
+		"mov	%[_ARM_fp], fp		\n\t"	\
+		"mrs	%[_ARM_cpsr], cpsr	\n\t"	\
+		: [_ARM_sp]   "=r" (regs->ARM_sp),	\
+		  [_ARM_fp]   "=r" (regs->ARM_fp),	\
+		  [_ARM_cpsr] "=r" (regs->ARM_cpsr)	\
+	);						\
+}
 #endif
 
 #endif /* __ARM_PERF_EVENT_H__ */
-- 
1.8.1.2


* Re: [PATCH] ARM: perf: allow tracing with kernel tracepoints events
  2014-06-17 17:11     ` [PATCH] " Jean Pihet
@ 2014-06-18 12:53       ` Will Deacon
  2014-06-20  8:10         ` Jean Pihet
  0 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2014-06-18 12:53 UTC
  To: Jean Pihet; +Cc: linux-arm-kernel, linaro-kernel, Sneha Priya, linux-kernel

Hi Jean,

On Tue, Jun 17, 2014 at 06:11:05PM +0100, Jean Pihet wrote:
> When tracing with tracepoint events the IP and CPSR are set to 0,
> preventing the perf code from resolving the symbols:
> 
> ./perf record -e kmem:kmalloc cal
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.007 MB perf.data (~321 samples) ]
> 
> ./perf report
> Overhead Command Shared Object Symbol
> ........ ....... ............. ...........
> 40.78%   cal     [unknown]     [.]00000000
> 31.6%    cal     [unknown]     [.]00000000
> 
> The examination of the gathered samples (perf report -D) shows the IP
> is set to 0 and that the samples are considered as user space samples,
> while the IP should be set from the registers and the samples should be
> considered as kernel samples.
> 
> The fix is to implement perf_arch_fetch_caller_regs for ARM, which
> fills the necessary registers used for the callchain unwinding and
> to determine the user/kernel space property of the samples: ip, sp, fp
> and cpsr.

Surely it's only the CPSR that identifies whether it's user or kernel?

> Tested with perf record and tracepoints filtering (-e <tracepoint>), with
> unwinding using fp (--call-graph fp) and dwarf info (--call-graph dwarf).

Whilst the old APCS unwinding only needs PC, FP and SP, is this definitely
true for exidx and DWARF-based unwinding? Given that libunwind ends up
running a state machine for the latter, can we guarantee that we won't hit
instructions that require access to other general purpose registers?

Will

* Re: [PATCH] ARM: perf: allow tracing with kernel tracepoints events
  2014-06-18 12:53       ` Will Deacon
@ 2014-06-20  8:10         ` Jean Pihet
  2014-06-25  9:01           ` Will Deacon
  0 siblings, 1 reply; 10+ messages in thread
From: Jean Pihet @ 2014-06-20  8:10 UTC
  To: Will Deacon; +Cc: linux-arm-kernel, linaro-kernel, Sneha Priya, linux-kernel

Hi Will,

On 18 June 2014 14:53, Will Deacon <will.deacon@arm.com> wrote:
> Hi Jean,
>
> On Tue, Jun 17, 2014 at 06:11:05PM +0100, Jean Pihet wrote:
>> When tracing with tracepoint events the IP and CPSR are set to 0,
>> preventing the perf code from resolving the symbols:
>>
>> ./perf record -e kmem:kmalloc cal
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.007 MB perf.data (~321 samples) ]
>>
>> ./perf report
>> Overhead Command Shared Object Symbol
>> ........ ....... ............. ...........
>> 40.78%   cal     [unknown]     [.]00000000
>> 31.6%    cal     [unknown]     [.]00000000
>>
>> The examination of the gathered samples (perf report -D) shows the IP
>> is set to 0 and that the samples are considered as user space samples,
>> while the IP should be set from the registers and the samples should be
>> considered as kernel samples.
>>
>> The fix is to implement perf_arch_fetch_caller_regs for ARM, which
>> fills the necessary registers used for the callchain unwinding and
>> to determine the user/kernel space property of the samples: ip, sp, fp
>> and cpsr.
>
> Surely it's only the CPSR that identifies whether it's user or kernel?
Yes, user_mode() is used to determine the user/kernel property of the
samples. user_mode is defined as (((regs)->ARM_cpsr & 0xf) == 0) in
ptrace.h.
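
To make the consequence concrete, here is a minimal sketch of that test
applied to a zeroed CPSR. The struct sample_regs type and the
sample_user_mode() helper are hypothetical stand-ins for struct pt_regs and
user_mode(); the mode values 0x10 (USR) and 0x13 (SVC) are the architectural
CPSR mode encodings:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pt_regs; only the CPSR is modelled. */
struct sample_regs {
	uint32_t cpsr;
};

/* Architectural CPSR mode field values for the two modes of interest. */
#define USR_MODE	0x10u
#define SVC_MODE	0x13u

/* Same test as quoted above: the low four mode bits are zero in user mode. */
static bool sample_user_mode(const struct sample_regs *regs)
{
	return (regs->cpsr & 0xfu) == 0;
}
```

Since a CPSR of 0 has all four low bits clear, the check treats an unfilled
register set as user mode, which matches the misattributed samples in the
report. A kernel (SVC) CPSR has low bits 0x3 and is classified as kernel.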

>
>> Tested with perf record and tracepoints filtering (-e <tracepoint>), with
>> unwinding using fp (--call-graph fp) and dwarf info (--call-graph dwarf).
>
> Whilst the old APCS unwinding only needs PC, FP and SP, is this definitely
> true for exidx and DWARF-based unwinding? Given that libunwind ends up
> running a state machine for the latter, can we guarantee that we won't hit
> instructions that require access to other general purpose registers?
Yes. dwarf unwinding does not need anything extra. Once seeded all the
rest is extracted from the dwarf trace info.

I am currently stress testing the change, let me come back to you with
the results.

Thx,
Jean

>
> Will

* Re: [PATCH] ARM: perf: allow tracing with kernel tracepoints events
  2014-06-20  8:10         ` Jean Pihet
@ 2014-06-25  9:01           ` Will Deacon
  2014-06-25 14:54             ` Jean Pihet
  0 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2014-06-25  9:01 UTC
  To: Jean Pihet; +Cc: linux-arm-kernel, linaro-kernel, Sneha Priya, linux-kernel

On Fri, Jun 20, 2014 at 09:10:35AM +0100, Jean Pihet wrote:
> Hi Will,

Hi Jean,

> On 18 June 2014 14:53, Will Deacon <will.deacon@arm.com> wrote:
> > On Tue, Jun 17, 2014 at 06:11:05PM +0100, Jean Pihet wrote:
> >> Tested with perf record and tracepoints filtering (-e <tracepoint>), with
> >> unwinding using fp (--call-graph fp) and dwarf info (--call-graph dwarf).
> >
> > Whilst the old APCS unwinding only needs PC, FP and SP, is this definitely
> > true for exidx and DWARF-based unwinding? Given that libunwind ends up
> > running a state machine for the latter, can we guarantee that we won't hit
> > instructions that require access to other general purpose registers?
> Yes. dwarf unwinding does not need anything extra. Once seeded all the
> rest is extracted from the dwarf trace info.

Ok, but what if the LR isn't saved on the stack, for example? What if the
code you're trying to unwind is hand-written assembly annotated with CFI
directives?

Will

* Re: [PATCH] ARM: perf: allow tracing with kernel tracepoints events
  2014-06-25  9:01           ` Will Deacon
@ 2014-06-25 14:54             ` Jean Pihet
  2014-06-26  9:00               ` Will Deacon
  0 siblings, 1 reply; 10+ messages in thread
From: Jean Pihet @ 2014-06-25 14:54 UTC
  To: Will Deacon; +Cc: linux-arm-kernel, linaro-kernel, Sneha Priya, linux-kernel

Hi Will,

On 25 June 2014 11:01, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Jun 20, 2014 at 09:10:35AM +0100, Jean Pihet wrote:
>> Hi Will,
>
> Hi Jean,
>
>> On 18 June 2014 14:53, Will Deacon <will.deacon@arm.com> wrote:
>> > On Tue, Jun 17, 2014 at 06:11:05PM +0100, Jean Pihet wrote:
>> >> Tested with perf record and tracepoints filtering (-e <tracepoint>), with
>> >> unwinding using fp (--call-graph fp) and dwarf info (--call-graph dwarf).
>> >
>> > Whilst the old APCS unwinding only needs PC, FP and SP, is this definitely
>> > true for exidx and DWARF-based unwinding? Given that libunwind ends up
>> > running a state machine for the latter, can we guarantee that we won't hit
>> > instructions that require access to other general purpose registers?
>> Yes. dwarf unwinding does not need anything extra. Once seeded all the
>> rest is extracted from the dwarf trace info.
>
> Ok, but what if the LR isn't saved on the stack, for example? What if the
> code you're trying to unwind is hand-written assembly annotated with CFI
> directives?
In that case the unwinding is not possible unless the
hand-crafted asm is compatible with the requested unwinding method
(fp, dwarf etc.). Do you expect problems there? If so, can you give
more details?

>
> Will

Jean

* Re: [PATCH] ARM: perf: allow tracing with kernel tracepoints events
  2014-06-25 14:54             ` Jean Pihet
@ 2014-06-26  9:00               ` Will Deacon
  2014-06-27 14:53                 ` Jean Pihet
  0 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2014-06-26  9:00 UTC
  To: Jean Pihet; +Cc: linux-arm-kernel, linaro-kernel, Sneha Priya, linux-kernel

On Wed, Jun 25, 2014 at 03:54:14PM +0100, Jean Pihet wrote:
> Hi Will,

Hello,

> On 25 June 2014 11:01, Will Deacon <will.deacon@arm.com> wrote:
> > On Fri, Jun 20, 2014 at 09:10:35AM +0100, Jean Pihet wrote:
> >> On 18 June 2014 14:53, Will Deacon <will.deacon@arm.com> wrote:
> >> > On Tue, Jun 17, 2014 at 06:11:05PM +0100, Jean Pihet wrote:
> >> >> Tested with perf record and tracepoints filtering (-e <tracepoint>), with
> >> >> unwinding using fp (--call-graph fp) and dwarf info (--call-graph dwarf).
> >> >
> >> > Whilst the old APCS unwinding only needs PC, FP and SP, is this definitely
> >> > true for exidx and DWARF-based unwinding? Given that libunwind ends up
> >> > running a state machine for the latter, can we guarantee that we won't hit
> >> > instructions that require access to other general purpose registers?
> >> Yes. dwarf unwinding does not need anything extra. Once seeded all the
> >> rest is extracted from the dwarf trace info.
> >
> > Ok, but what if the LR isn't saved on the stack, for example? What if the
> > code you're trying to unwind is hand-written assembly annotated with CFI
> > directives?
> Then in that case the unwinding is not possible unless the
> hand-crafted asm is compatible with the requested unwinding method
> (fp, dwarf etc.). Do you expect problems there, if so can you give
> more details?

To use a readily available AArch64 example, take a look at
__kernel_gettimeofday in arch/arm64/kernel/vdso/gettimeofday.S

It starts by moving the link register into x2, so that it can later call
__do_get_tspec without clobbering it. Furthermore, it doesn't make use of
the stack at all.

How can you unwind that using your current code?

Will

* Re: [PATCH] ARM: perf: allow tracing with kernel tracepoints events
  2014-06-26  9:00               ` Will Deacon
@ 2014-06-27 14:53                 ` Jean Pihet
  0 siblings, 0 replies; 10+ messages in thread
From: Jean Pihet @ 2014-06-27 14:53 UTC
  To: Will Deacon; +Cc: linux-arm-kernel, linaro-kernel, Sneha Priya, linux-kernel

Hi Will,

On 26 June 2014 11:00, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Jun 25, 2014 at 03:54:14PM +0100, Jean Pihet wrote:
>> Hi Will,
>
> Hello,
>
>> On 25 June 2014 11:01, Will Deacon <will.deacon@arm.com> wrote:
>> > On Fri, Jun 20, 2014 at 09:10:35AM +0100, Jean Pihet wrote:
>> >> On 18 June 2014 14:53, Will Deacon <will.deacon@arm.com> wrote:
>> >> > On Tue, Jun 17, 2014 at 06:11:05PM +0100, Jean Pihet wrote:
>> >> >> Tested with perf record and tracepoints filtering (-e <tracepoint>), with
>> >> >> unwinding using fp (--call-graph fp) and dwarf info (--call-graph dwarf).
>> >> >
>> >> > Whilst the old APCS unwinding only needs PC, FP and SP, is this definitely
>> >> > true for exidx and DWARF-based unwinding? Given that libunwind ends up
>> >> > running a state machine for the latter, can we guarantee that we won't hit
>> >> > instructions that require access to other general purpose registers?
>> >> Yes. dwarf unwinding does not need anything extra. Once seeded all the
>> >> rest is extracted from the dwarf trace info.
>> >
>> > Ok, but what if the LR isn't saved on the stack, for example? What if the
>> > code you're trying to unwind is hand-written assembly annotated with CFI
>> > directives?
>> Then in that case the unwinding is not possible unless the
>> hand-crafted asm is compatible with the requested unwinding method
>> (fp, dwarf etc.). Do you expect problems there, if so can you give
>> more details?
>
> To use a readily available AArch64 example, take a look at
> __kernel_gettimeofday in arch/arm64/kernel/vdso/gettimeofday.S
>
> It starts by moving the link register into x2, so that it can later call
> __do_get_tspec without clobbering it. Furthermore, it doesn't make use of
> the stack at all.
>
> How can you unwind that using your current code?
That is interesting. In that case that particular function will not be
seen in the call chain, since lr and fp are the ones from the caller. I
have not tried this on a real case. It would be nice to try it out, and
I can do that as soon as I am back on ARM64.

Note: I was debugging a deadlock in perf doing call chain unwinding
and tracepoint triggering. A new patch set is on its way.

Thx & regards,
Jean

>
> Will
