linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
@ 2019-11-27 15:44 Antonio Borneo
  2019-12-04 14:21 ` Steven Rostedt
  2019-12-04 16:04 ` Joel Fernandes
  0 siblings, 2 replies; 13+ messages in thread
From: Antonio Borneo @ 2019-11-27 15:44 UTC (permalink / raw)
  To: Steven Rostedt, Ingo Molnar
  Cc: Antonio Borneo, Joel Fernandes, Joel Fernandes, linux-kernel,
	linux-stm32

This tracing event class is the only instance in kernel that logs
in the trace buffer the instruction pointer as offset to _stext,
instead of logging the full pointer.
This looks like a nice optimization for 64 bits platforms, where a
32 bit offset can take less space than a full 64 bits pointer. But
the symbol _stext is incorrectly resolved as zero in the expansion
of TP_printk(), which then prints only the hex offset instead of
the name of the caller function. Plus, on arm arch the kernel
modules are loaded at address lower than _stext, causing the u32
offset arithmetics to overflow and wrap at 32 bits.
I did not identified a 64 bit arch where the modules are loaded at
offset from _stext that exceed u32 range, but I also did not
identified any constraint to feel safe with a u32 offset.

Log directly the instruction pointer instead of the offset to
_stext.

Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
Fixes: d59158162e03 ("tracing: Add support for preempt and irq enable/disable events")
---
 include/trace/events/preemptirq.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/trace/events/preemptirq.h b/include/trace/events/preemptirq.h
index 95fba0471e5b..d548a6aafa18 100644
--- a/include/trace/events/preemptirq.h
+++ b/include/trace/events/preemptirq.h
@@ -18,18 +18,18 @@ DECLARE_EVENT_CLASS(preemptirq_template,
 	TP_ARGS(ip, parent_ip),
 
 	TP_STRUCT__entry(
-		__field(u32, caller_offs)
-		__field(u32, parent_offs)
+		__field(unsigned long, caller_ip)
+		__field(unsigned long, parent_ip)
 	),
 
 	TP_fast_assign(
-		__entry->caller_offs = (u32)(ip - (unsigned long)_stext);
-		__entry->parent_offs = (u32)(parent_ip - (unsigned long)_stext);
+		__entry->caller_ip = ip;
+		__entry->parent_ip = parent_ip;
 	),
 
 	TP_printk("caller=%pS parent=%pS",
-		  (void *)((unsigned long)(_stext) + __entry->caller_offs),
-		  (void *)((unsigned long)(_stext) + __entry->parent_offs))
+		  (void *)__entry->caller_ip,
+		  (void *)__entry->parent_ip)
 );
 
 #ifdef CONFIG_TRACE_IRQFLAGS
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2019-11-27 15:44 [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events Antonio Borneo
@ 2019-12-04 14:21 ` Steven Rostedt
  2019-12-04 16:04   ` Joel Fernandes
  2019-12-21 23:47   ` Joel Fernandes
  2019-12-04 16:04 ` Joel Fernandes
  1 sibling, 2 replies; 13+ messages in thread
From: Steven Rostedt @ 2019-12-04 14:21 UTC (permalink / raw)
  To: Antonio Borneo
  Cc: Ingo Molnar, Joel Fernandes, Joel Fernandes, linux-kernel, linux-stm32


Joel,

Any comments on this patch?

-- Steve

On Wed, 27 Nov 2019 16:44:28 +0100
Antonio Borneo <antonio.borneo@st.com> wrote:

> This tracing event class is the only instance in kernel that logs
> in the trace buffer the instruction pointer as offset to _stext,
> instead of logging the full pointer.
> This looks like a nice optimization for 64 bits platforms, where a
> 32 bit offset can take less space than a full 64 bits pointer. But
> the symbol _stext is incorrectly resolved as zero in the expansion
> of TP_printk(), which then prints only the hex offset instead of
> the name of the caller function. Plus, on arm arch the kernel
> modules are loaded at address lower than _stext, causing the u32
> offset arithmetics to overflow and wrap at 32 bits.
> I did not identified a 64 bit arch where the modules are loaded at
> offset from _stext that exceed u32 range, but I also did not
> identified any constraint to feel safe with a u32 offset.
> 
> Log directly the instruction pointer instead of the offset to
> _stext.
> 
> Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
> Fixes: d59158162e03 ("tracing: Add support for preempt and irq enable/disable events")
> ---
>  include/trace/events/preemptirq.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/include/trace/events/preemptirq.h b/include/trace/events/preemptirq.h
> index 95fba0471e5b..d548a6aafa18 100644
> --- a/include/trace/events/preemptirq.h
> +++ b/include/trace/events/preemptirq.h
> @@ -18,18 +18,18 @@ DECLARE_EVENT_CLASS(preemptirq_template,
>  	TP_ARGS(ip, parent_ip),
>  
>  	TP_STRUCT__entry(
> -		__field(u32, caller_offs)
> -		__field(u32, parent_offs)
> +		__field(unsigned long, caller_ip)
> +		__field(unsigned long, parent_ip)
>  	),
>  
>  	TP_fast_assign(
> -		__entry->caller_offs = (u32)(ip - (unsigned long)_stext);
> -		__entry->parent_offs = (u32)(parent_ip - (unsigned long)_stext);
> +		__entry->caller_ip = ip;
> +		__entry->parent_ip = parent_ip;
>  	),
>  
>  	TP_printk("caller=%pS parent=%pS",
> -		  (void *)((unsigned long)(_stext) + __entry->caller_offs),
> -		  (void *)((unsigned long)(_stext) + __entry->parent_offs))
> +		  (void *)__entry->caller_ip,
> +		  (void *)__entry->parent_ip)
>  );
>  
>  #ifdef CONFIG_TRACE_IRQFLAGS


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2019-11-27 15:44 [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events Antonio Borneo
  2019-12-04 14:21 ` Steven Rostedt
@ 2019-12-04 16:04 ` Joel Fernandes
  2019-12-07  0:00   ` Antonio Borneo
  1 sibling, 1 reply; 13+ messages in thread
From: Joel Fernandes @ 2019-12-04 16:04 UTC (permalink / raw)
  To: Antonio Borneo; +Cc: Steven Rostedt, Ingo Molnar, linux-kernel, linux-stm32

On Wed, Nov 27, 2019 at 04:44:28PM +0100, Antonio Borneo wrote:
> This tracing event class is the only instance in kernel that logs
> in the trace buffer the instruction pointer as offset to _stext,
> instead of logging the full pointer.
> This looks like a nice optimization for 64 bits platforms, where a
> 32 bit offset can take less space than a full 64 bits pointer. But
> the symbol _stext is incorrectly resolved as zero in the expansion

I didn't get this. If _stext is 0 on any platform, then your patch doesn't
really do anything because the offset will be equal to the ip.

Could you provide an example with real numbers showing the overflow?

> of TP_printk(), which then prints only the hex offset instead of
> the name of the caller function. Plus, on arm arch the kernel
> modules are loaded at address lower than _stext, causing the u32
> offset arithmetics to overflow and wrap at 32 bits.

If we use signed 32-bit, will that solve the module issue?

> I did not identified a 64 bit arch where the modules are loaded at
> offset from _stext that exceed u32 range, but I also did not
> identified any constraint to feel safe with a u32 offset.
> 
> Log directly the instruction pointer instead of the offset to
> _stext.

I am not comfortable with this patch at the moment, mainly because it will
increase the size of this rather high frequency event. But I'm not saying
there isn't an issue on 32-bit. Let's discuss more.

thanks,

 - Joel


> 
> Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
> Fixes: d59158162e03 ("tracing: Add support for preempt and irq enable/disable events")
> ---
>  include/trace/events/preemptirq.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/include/trace/events/preemptirq.h b/include/trace/events/preemptirq.h
> index 95fba0471e5b..d548a6aafa18 100644
> --- a/include/trace/events/preemptirq.h
> +++ b/include/trace/events/preemptirq.h
> @@ -18,18 +18,18 @@ DECLARE_EVENT_CLASS(preemptirq_template,
>  	TP_ARGS(ip, parent_ip),
>  
>  	TP_STRUCT__entry(
> -		__field(u32, caller_offs)
> -		__field(u32, parent_offs)
> +		__field(unsigned long, caller_ip)
> +		__field(unsigned long, parent_ip)
>  	),
>  
>  	TP_fast_assign(
> -		__entry->caller_offs = (u32)(ip - (unsigned long)_stext);
> -		__entry->parent_offs = (u32)(parent_ip - (unsigned long)_stext);
> +		__entry->caller_ip = ip;
> +		__entry->parent_ip = parent_ip;
>  	),
>  
>  	TP_printk("caller=%pS parent=%pS",
> -		  (void *)((unsigned long)(_stext) + __entry->caller_offs),
> -		  (void *)((unsigned long)(_stext) + __entry->parent_offs))
> +		  (void *)__entry->caller_ip,
> +		  (void *)__entry->parent_ip)
>  );
>  
>  #ifdef CONFIG_TRACE_IRQFLAGS
> -- 
> 2.24.0
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2019-12-04 14:21 ` Steven Rostedt
@ 2019-12-04 16:04   ` Joel Fernandes
  2019-12-21 23:47   ` Joel Fernandes
  1 sibling, 0 replies; 13+ messages in thread
From: Joel Fernandes @ 2019-12-04 16:04 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Antonio Borneo, Ingo Molnar, linux-kernel, linux-stm32

On Wed, Dec 04, 2019 at 09:21:15AM -0500, Steven Rostedt wrote:
> 
> Joel,
> 
> Any comments on this patch?

Just replied, thanks for the ping!

 - Joel

> 
> -- Steve
> 
> On Wed, 27 Nov 2019 16:44:28 +0100
> Antonio Borneo <antonio.borneo@st.com> wrote:
> 
> > This tracing event class is the only instance in kernel that logs
> > in the trace buffer the instruction pointer as offset to _stext,
> > instead of logging the full pointer.
> > This looks like a nice optimization for 64 bits platforms, where a
> > 32 bit offset can take less space than a full 64 bits pointer. But
> > the symbol _stext is incorrectly resolved as zero in the expansion
> > of TP_printk(), which then prints only the hex offset instead of
> > the name of the caller function. Plus, on arm arch the kernel
> > modules are loaded at address lower than _stext, causing the u32
> > offset arithmetics to overflow and wrap at 32 bits.
> > I did not identified a 64 bit arch where the modules are loaded at
> > offset from _stext that exceed u32 range, but I also did not
> > identified any constraint to feel safe with a u32 offset.
> > 
> > Log directly the instruction pointer instead of the offset to
> > _stext.
> > 
> > Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
> > Fixes: d59158162e03 ("tracing: Add support for preempt and irq enable/disable events")
> > ---
> >  include/trace/events/preemptirq.h | 12 ++++++------
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/include/trace/events/preemptirq.h b/include/trace/events/preemptirq.h
> > index 95fba0471e5b..d548a6aafa18 100644
> > --- a/include/trace/events/preemptirq.h
> > +++ b/include/trace/events/preemptirq.h
> > @@ -18,18 +18,18 @@ DECLARE_EVENT_CLASS(preemptirq_template,
> >  	TP_ARGS(ip, parent_ip),
> >  
> >  	TP_STRUCT__entry(
> > -		__field(u32, caller_offs)
> > -		__field(u32, parent_offs)
> > +		__field(unsigned long, caller_ip)
> > +		__field(unsigned long, parent_ip)
> >  	),
> >  
> >  	TP_fast_assign(
> > -		__entry->caller_offs = (u32)(ip - (unsigned long)_stext);
> > -		__entry->parent_offs = (u32)(parent_ip - (unsigned long)_stext);
> > +		__entry->caller_ip = ip;
> > +		__entry->parent_ip = parent_ip;
> >  	),
> >  
> >  	TP_printk("caller=%pS parent=%pS",
> > -		  (void *)((unsigned long)(_stext) + __entry->caller_offs),
> > -		  (void *)((unsigned long)(_stext) + __entry->parent_offs))
> > +		  (void *)__entry->caller_ip,
> > +		  (void *)__entry->parent_ip)
> >  );
> >  
> >  #ifdef CONFIG_TRACE_IRQFLAGS
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2019-12-04 16:04 ` Joel Fernandes
@ 2019-12-07  0:00   ` Antonio Borneo
  2019-12-19 18:45     ` Valentin Schneider
  0 siblings, 1 reply; 13+ messages in thread
From: Antonio Borneo @ 2019-12-07  0:00 UTC (permalink / raw)
  To: Joel Fernandes; +Cc: Steven Rostedt, Ingo Molnar, linux-kernel, linux-stm32

On Wed, 2019-12-04 at 11:04 -0500, Joel Fernandes wrote:
> On Wed, Nov 27, 2019 at 04:44:28PM +0100, Antonio Borneo wrote:
> > This tracing event class is the only instance in kernel that logs
> > in the trace buffer the instruction pointer as offset to _stext,
> > instead of logging the full pointer.
> > This looks like a nice optimization for 64 bits platforms, where a
> > 32 bit offset can take less space than a full 64 bits pointer. But
> > the symbol _stext is incorrectly resolved as zero in the expansion
> 
> I didn't get this. If _stext is 0 on any platform, then your patch doesn't
> really do anything because the offset will be equal to the ip.
> 
> Could you provide an example with real numbers showing the overflow?
> 

Thanks for your reply.

Easy to replicate, e.g. with a vanilla v5.4 on x86_64
make x86_64_defconfig
echo CONFIG_PREEMPTIRQ_EVENTS=y >> .config
make oldconfig; make

At login:
trace-cmd record -e preemptirq:irq_enable sleep 1
trace-cmd report trace.dat | head
and I incorrectly get only the offsets to _stext:
cpus=1
           sleep-197   [000]   280.378022: irq_enable:           caller=0x1afaS parent=0xc00091S
           sleep-197   [000]   280.378026: irq_enable:           caller=0x54d34S parent=0x78e00000S
           sleep-197   [000]   280.378029: irq_enable:           caller=0x1afaS parent=0xc00a84S
           sleep-197   [000]   280.378032: irq_enable:           caller=0x263dS parent=0x78e00000S
...

With this patch I get the expected/right result:
cpus=1
           sleep-210   [000]   525.177583: irq_enable:           caller=trace_hardirqs_on_thunk parent=entry_SYSCALL_64_after_hwframe
           sleep-210   [000]   525.177587: irq_enable:           caller=__do_page_fault parent=(nil)S
           sleep-210   [000]   525.177590: irq_enable:           caller=trace_hardirqs_on_thunk parent=swapgs_restore_regs_and_return_to_usermode
           sleep-210   [000]   525.177593: irq_enable:           caller=do_syscall_64 parent=(nil)S
...

The same correct result can be obtained by hardcoding _stext in TP_printk() (of course, disabling KASLR to use fixed _stext), e.g. on x86_64:
        TP_printk("caller=%pS parent=%pS",
-                 (void *)((unsigned long)(_stext) + __entry->caller_offs),
-                 (void *)((unsigned long)(_stext) + __entry->parent_offs))
+                 (void *)(0xffffffff81000000UL + __entry->caller_offs),
+                 (void *)(0xffffffff81000000UL + __entry->parent_offs))
 );

The problem is caused by having the macros TP_fast_assign() and TP_printk() expanded in different way.
The simpler expansion of the former macro resolves correctly the symbol _stext.
The more complex expansion of the latter macro passes through some stringification that at the end incorrectly expands _stext as 0.
At first I thought to fix the macro expansion of TP_printk(), but then I realized that:
- on 32 bit systems there is no gain in coding 32 bit offset instead of 32 bit pointer
- on 64 bit systems I cannot find a constraint that guarantees that modules are loaded within 32 bit offset from _stext
then, this patch!

> > of TP_printk(), which then prints only the hex offset instead of
> > the name of the caller function. Plus, on arm arch the kernel
> > modules are loaded at address lower than _stext, causing the u32
> > offset arithmetics to overflow and wrap at 32 bits.
> 
> If we use signed 32-bit, will that solve the module issue?
> 

Of course, on arm 32 bit using a signed offset would solve the 32 bit wraparound.

> > I did not identified a 64 bit arch where the modules are loaded at
> > offset from _stext that exceed u32 range, but I also did not
> > identified any constraint to feel safe with a u32 offset.
> > 
> > Log directly the instruction pointer instead of the offset to
> > _stext.
> 
> I am not comfortable with this patch at the moment, mainly because it will
> increase the size of this rather high frequency event. But I'm not saying
> there isn't an issue on 32-bit. Let's discuss more.
> 

Sure, the pressure on the trace buffer will increase on a 64 bit system, but in general it also has more memory than an embedded 32 bit system.
The key point is, instead, can we feel safe with a 32 bit offset in a 64 bit system? Is there any guarantee a module is loaded within 32 bits offset from _stext?

Antonio

> thanks,
> 
>  - Joel
> 
> 
> > Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
> > Fixes: d59158162e03 ("tracing: Add support for preempt and irq enable/disable events")
> > ---
> >  include/trace/events/preemptirq.h | 12 ++++++------
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/include/trace/events/preemptirq.h b/include/trace/events/preemptirq.h
> > index 95fba0471e5b..d548a6aafa18 100644
> > --- a/include/trace/events/preemptirq.h
> > +++ b/include/trace/events/preemptirq.h
> > @@ -18,18 +18,18 @@ DECLARE_EVENT_CLASS(preemptirq_template,
> >  	TP_ARGS(ip, parent_ip),
> >  
> >  	TP_STRUCT__entry(
> > -		__field(u32, caller_offs)
> > -		__field(u32, parent_offs)
> > +		__field(unsigned long, caller_ip)
> > +		__field(unsigned long, parent_ip)
> >  	),
> >  
> >  	TP_fast_assign(
> > -		__entry->caller_offs = (u32)(ip - (unsigned long)_stext);
> > -		__entry->parent_offs = (u32)(parent_ip - (unsigned long)_stext);
> > +		__entry->caller_ip = ip;
> > +		__entry->parent_ip = parent_ip;
> >  	),
> >  
> >  	TP_printk("caller=%pS parent=%pS",
> > -		  (void *)((unsigned long)(_stext) + __entry->caller_offs),
> > -		  (void *)((unsigned long)(_stext) + __entry->parent_offs))
> > +		  (void *)__entry->caller_ip,
> > +		  (void *)__entry->parent_ip)
> >  );
> >  
> >  #ifdef CONFIG_TRACE_IRQFLAGS
> > -- 
> > 2.24.0
> > 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2019-12-07  0:00   ` Antonio Borneo
@ 2019-12-19 18:45     ` Valentin Schneider
  2019-12-21 23:27       ` Joel Fernandes
  0 siblings, 1 reply; 13+ messages in thread
From: Valentin Schneider @ 2019-12-19 18:45 UTC (permalink / raw)
  To: Antonio Borneo, Joel Fernandes
  Cc: Steven Rostedt, Ingo Molnar, linux-kernel, linux-stm32, ambroise.vincent

On 07/12/2019 00:00, Antonio Borneo wrote:
> At login:
> trace-cmd record -e preemptirq:irq_enable sleep 1
> trace-cmd report trace.dat | head
> and I incorrectly get only the offsets to _stext:
> cpus=1
>            sleep-197   [000]   280.378022: irq_enable:           caller=0x1afaS parent=0xc00091S
>            sleep-197   [000]   280.378026: irq_enable:           caller=0x54d34S parent=0x78e00000S
>            sleep-197   [000]   280.378029: irq_enable:           caller=0x1afaS parent=0xc00a84S
>            sleep-197   [000]   280.378032: irq_enable:           caller=0x263dS parent=0x78e00000S
> ...
> 
> With this patch I get the expected/right result:
> cpus=1
>            sleep-210   [000]   525.177583: irq_enable:           caller=trace_hardirqs_on_thunk parent=entry_SYSCALL_64_after_hwframe
>            sleep-210   [000]   525.177587: irq_enable:           caller=__do_page_fault parent=(nil)S
>            sleep-210   [000]   525.177590: irq_enable:           caller=trace_hardirqs_on_thunk parent=swapgs_restore_regs_and_return_to_usermode
>            sleep-210   [000]   525.177593: irq_enable:           caller=do_syscall_64 parent=(nil)S
> ...

FWIW Ambroise (in Cc) and I were staring at some irq enable/disable traces,
and encountered exactly this on a Pixel3 running a mainline kernel. With
Antonio's patch we got the expected result, so you can have:

Tested-by: Valentin Schneider <valentin.schneider@arm.com>

>> I am not comfortable with this patch at the moment, mainly because it will
>> increase the size of this rather high frequency event. But I'm not saying
>> there isn't an issue on 32-bit. Let's discuss more.
>>
> 
> Sure, the pressure on the trace buffer will increase on a 64 bit system, but in general it also has more memory than an embedded 32 bit system.
> The key point is, instead, can we feel safe with a 32 bit offset in a 64 bit system? Is there any guarantee a module is loaded within 32 bits offset from _stext?
> 

You're better placed than I to know if and when this is the case, but I
wouldn't use this in a production environment. To me this is more of a
debugging tool (and a really good one at that), so I am less concerned
about the memory overhead.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2019-12-19 18:45     ` Valentin Schneider
@ 2019-12-21 23:27       ` Joel Fernandes
  0 siblings, 0 replies; 13+ messages in thread
From: Joel Fernandes @ 2019-12-21 23:27 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Antonio Borneo, Steven Rostedt, Ingo Molnar, linux-kernel,
	linux-stm32, ambroise.vincent

On Thu, Dec 19, 2019 at 06:45:25PM +0000, Valentin Schneider wrote:
> On 07/12/2019 00:00, Antonio Borneo wrote:
> > At login:
> > trace-cmd record -e preemptirq:irq_enable sleep 1
> > trace-cmd report trace.dat | head
> > and I incorrectly get only the offsets to _stext:
> > cpus=1
> >            sleep-197   [000]   280.378022: irq_enable:           caller=0x1afaS parent=0xc00091S
> >            sleep-197   [000]   280.378026: irq_enable:           caller=0x54d34S parent=0x78e00000S
> >            sleep-197   [000]   280.378029: irq_enable:           caller=0x1afaS parent=0xc00a84S
> >            sleep-197   [000]   280.378032: irq_enable:           caller=0x263dS parent=0x78e00000S
> > ...
> > 
> > With this patch I get the expected/right result:
> > cpus=1
> >            sleep-210   [000]   525.177583: irq_enable:           caller=trace_hardirqs_on_thunk parent=entry_SYSCALL_64_after_hwframe
> >            sleep-210   [000]   525.177587: irq_enable:           caller=__do_page_fault parent=(nil)S
> >            sleep-210   [000]   525.177590: irq_enable:           caller=trace_hardirqs_on_thunk parent=swapgs_restore_regs_and_return_to_usermode
> >            sleep-210   [000]   525.177593: irq_enable:           caller=do_syscall_64 parent=(nil)S
> > ...
> 
> FWIW Ambroise (in Cc) and I were staring at some irq enable/disable traces,
> and encountered exactly this on a Pixel3 running a mainline kernel. With
> Antonio's patch we got the expected result, so you can have:
> 
> Tested-by: Valentin Schneider <valentin.schneider@arm.com>

In BCC, we use the binary trace record of this tracepoint and don't interpret
the text output.  There we lookup _stext value (I believe from kallsyms) and
then apply the offset to that and do another look up for the final symbol.

Would it be possible for Ambroise to work more on this patch so that the
trace record size is not doubled?

> >> I am not comfortable with this patch at the moment, mainly because it will
> >> increase the size of this rather high frequency event. But I'm not saying
> >> there isn't an issue on 32-bit. Let's discuss more.
> >>
> > 
> > Sure, the pressure on the trace buffer will increase on a 64 bit system, but in general it also has more memory than an embedded 32 bit system.
> > The key point is, instead, can we feel safe with a 32 bit offset in a 64 bit system? Is there any guarantee a module is loaded within 32 bits offset from _stext?
> > 
> 
> You're better placed than I to know if and when this is the case, but I
> wouldn't use this in a production environment. To me this is more of a
> debugging tool (and a really good one at that), so I am less concerned
> about the memory overhead.

The issue is with poor trace buffer utilization, you could overflow the trace
buffer more easily, and lose trace data. The reason I came up with the stext
delta approach was precisely to address this and it worked when I was using
this tracepoint. Let me find some time this evening to work more on
Ambroise's patch.

thanks,

 - Joel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2019-12-04 14:21 ` Steven Rostedt
  2019-12-04 16:04   ` Joel Fernandes
@ 2019-12-21 23:47   ` Joel Fernandes
  2019-12-23 20:13     ` Steven Rostedt
  1 sibling, 1 reply; 13+ messages in thread
From: Joel Fernandes @ 2019-12-21 23:47 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Antonio Borneo, Ingo Molnar, linux-kernel, linux-stm32

On Wed, Dec 04, 2019 at 09:21:15AM -0500, Steven Rostedt wrote:
> 
> Joel,
> 
> Any comments on this patch?

Steve, it looks like this issue happens with trace-cmd not knowing what
_stext is. If I do cat trace_pipe , then I don't see the issue as _stext is
looked up correctly but the reporter of the bug is using trace-cmd. Is there
a way to solve this within trace-cmd? Not knowing much about trace-cmd
internals, I will have to defer to you on this though..

Other than this, I need to make the offset to _stext as s32 instead of u32
type so that the problem of the symbol location being before _stext does not
cause overflow.

Lastly, I am not super convinced that we need to store the full pointer just
to handle a case where the offset of the symbol might be more than +-2G from
_stext. Once we see such issue, then we can handle it. But right now the size
of the trace buffer is utilized better by just storing the offset IMHO.

thanks,

 - Joel


> 
> -- Steve
> 
> On Wed, 27 Nov 2019 16:44:28 +0100
> Antonio Borneo <antonio.borneo@st.com> wrote:
> 
> > This tracing event class is the only instance in kernel that logs
> > in the trace buffer the instruction pointer as offset to _stext,
> > instead of logging the full pointer.
> > This looks like a nice optimization for 64 bits platforms, where a
> > 32 bit offset can take less space than a full 64 bits pointer. But
> > the symbol _stext is incorrectly resolved as zero in the expansion
> > of TP_printk(), which then prints only the hex offset instead of
> > the name of the caller function. Plus, on arm arch the kernel
> > modules are loaded at address lower than _stext, causing the u32
> > offset arithmetics to overflow and wrap at 32 bits.
> > I did not identified a 64 bit arch where the modules are loaded at
> > offset from _stext that exceed u32 range, but I also did not
> > identified any constraint to feel safe with a u32 offset.
> > 
> > Log directly the instruction pointer instead of the offset to
> > _stext.
> > 
> > Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
> > Fixes: d59158162e03 ("tracing: Add support for preempt and irq enable/disable events")
> > ---
> >  include/trace/events/preemptirq.h | 12 ++++++------
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/include/trace/events/preemptirq.h b/include/trace/events/preemptirq.h
> > index 95fba0471e5b..d548a6aafa18 100644
> > --- a/include/trace/events/preemptirq.h
> > +++ b/include/trace/events/preemptirq.h
> > @@ -18,18 +18,18 @@ DECLARE_EVENT_CLASS(preemptirq_template,
> >  	TP_ARGS(ip, parent_ip),
> >  
> >  	TP_STRUCT__entry(
> > -		__field(u32, caller_offs)
> > -		__field(u32, parent_offs)
> > +		__field(unsigned long, caller_ip)
> > +		__field(unsigned long, parent_ip)
> >  	),
> >  
> >  	TP_fast_assign(
> > -		__entry->caller_offs = (u32)(ip - (unsigned long)_stext);
> > -		__entry->parent_offs = (u32)(parent_ip - (unsigned long)_stext);
> > +		__entry->caller_ip = ip;
> > +		__entry->parent_ip = parent_ip;
> >  	),
> >  
> >  	TP_printk("caller=%pS parent=%pS",
> > -		  (void *)((unsigned long)(_stext) + __entry->caller_offs),
> > -		  (void *)((unsigned long)(_stext) + __entry->parent_offs))
> > +		  (void *)__entry->caller_ip,
> > +		  (void *)__entry->parent_ip)
> >  );
> >  
> >  #ifdef CONFIG_TRACE_IRQFLAGS
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2019-12-21 23:47   ` Joel Fernandes
@ 2019-12-23 20:13     ` Steven Rostedt
  2019-12-23 20:18       ` Steven Rostedt
                         ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Steven Rostedt @ 2019-12-23 20:13 UTC (permalink / raw)
  To: Joel Fernandes; +Cc: Antonio Borneo, Ingo Molnar, linux-kernel, linux-stm32

On Sat, 21 Dec 2019 18:47:41 -0500
Joel Fernandes <joel@joelfernandes.org> wrote:

> On Wed, Dec 04, 2019 at 09:21:15AM -0500, Steven Rostedt wrote:
> > 
> > Joel,
> > 
> > Any comments on this patch?  
> 
> Steve, it looks like this issue happens with trace-cmd not knowing what
> _stext is. If I do cat trace_pipe , then I don't see the issue as _stext is
> looked up correctly but the reporter of the bug is using trace-cmd. Is there
> a way to solve this within trace-cmd? Not knowing much about trace-cmd
> internals, I will have to defer to you on this though..
> 
> Other than this, I need to make the offset to _stext as s32 instead of u32
> type so that the problem of the symbol location being before _stext does not
> cause overflow.
> 
> Lastly, I am not super convinced that we need to store the full pointer just
> to handle a case where the offset of the symbol might be more than +-2G from
> _stext. Once we see such issue, then we can handle it. But right now the size
> of the trace buffer is utilized better by just storing the offset IMHO.
>

Does this fix it for you?

-- Steve

diff --git a/lib/traceevent/event-parse.c b/lib/traceevent/event-parse.c
index 4fd3907e..dc705dd2 100644
--- a/lib/traceevent/event-parse.c
+++ b/lib/traceevent/event-parse.c
@@ -3595,6 +3595,45 @@ tep_find_event_by_name(struct tep_handle *tep,
 	return event;
 }
 
+static unsigned long long test_for_symbol(struct tep_handle *tep,
+					  struct tep_print_arg *arg)
+{
+	unsigned long long val = 0;
+	struct func_list *item = tep->funclist;
+	char *func;
+	int i;
+
+	if (isdigit(arg->atom.atom[0]))
+		return 0;
+
+	for (i = 0; i < (int)tep->func_count; i++) {
+		unsigned long long addr;
+		const char *name;
+
+		if (tep->func_map) {
+			addr = tep->func_map[i].addr;
+			name = tep->func_map[i].func;
+		} else if (item) {
+			addr = item->addr;
+			name = item->func;
+			item = item->next;
+		} else
+			break;
+
+		if (strcmp(arg->atom.atom, name) == 0) {
+			val = addr;
+			break;
+		}
+	}
+
+	func = realloc(arg->atom.atom, 32);
+	if (func) {
+		snprintf(func, 32, "%lld", val);
+		arg->atom.atom = func;
+	}
+	return val;
+}
+
 static unsigned long long
 eval_num_arg(void *data, int size, struct tep_event *event, struct tep_print_arg *arg)
 {
@@ -3611,7 +3650,10 @@ eval_num_arg(void *data, int size, struct tep_event *event, struct tep_print_arg
 		/* ?? */
 		return 0;
 	case TEP_PRINT_ATOM:
-		return strtoull(arg->atom.atom, NULL, 0);
+		val = strtoull(arg->atom.atom, NULL, 0);
+		if (!val)
+			val = test_for_symbol(tep, arg);
+		return val;
 	case TEP_PRINT_FIELD:
 		if (!arg->field.field) {
 			arg->field.field = tep_find_any_field(event, arg->field.name);

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2019-12-23 20:13     ` Steven Rostedt
@ 2019-12-23 20:18       ` Steven Rostedt
  2020-01-02 19:53       ` Joel Fernandes
  2020-01-07  9:21       ` Antonio Borneo
  2 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2019-12-23 20:18 UTC (permalink / raw)
  To: Joel Fernandes; +Cc: Antonio Borneo, Ingo Molnar, linux-kernel, linux-stm32

On Mon, 23 Dec 2019 15:13:01 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> Does this fix it for you?

I created this BZ so it does not get forgotten about over the New Year
break.

  https://bugzilla.kernel.org/show_bug.cgi?id=205953

-- Steve


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2019-12-23 20:13     ` Steven Rostedt
  2019-12-23 20:18       ` Steven Rostedt
@ 2020-01-02 19:53       ` Joel Fernandes
  2020-01-07  9:21       ` Antonio Borneo
  2 siblings, 0 replies; 13+ messages in thread
From: Joel Fernandes @ 2020-01-02 19:53 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Antonio Borneo, Ingo Molnar, linux-kernel, linux-stm32

On Mon, Dec 23, 2019 at 03:13:01PM -0500, Steven Rostedt wrote:
> On Sat, 21 Dec 2019 18:47:41 -0500
> Joel Fernandes <joel@joelfernandes.org> wrote:
> 
> > On Wed, Dec 04, 2019 at 09:21:15AM -0500, Steven Rostedt wrote:
> > > 
> > > Joel,
> > > 
> > > Any comments on this patch?  
> > 
> > Steve, it looks like this issue happens with trace-cmd not knowing what
> > _stext is. If I do cat trace_pipe , then I don't see the issue as _stext is
> > looked up correctly but the reporter of the bug is using trace-cmd. Is there
> > a way to solve this within trace-cmd? Not knowing much about trace-cmd
> > internals, I will have to defer to you on this though..
> > 
> > Other than this, I need to make the offset to _stext as s32 instead of u32
> > type so that the problem of the symbol location being before _stext does not
> > cause overflow.
> > 
> > Lastly, I am not super convinced that we need to store the full pointer just
> > to handle a case where the offset of the symbol might be more than +-2G from
> > _stext. Once we see such issue, then we can handle it. But right now the size
> > of the trace buffer is utilized better by just storing the offset IMHO.
> >
> 
> Does this fix it for you?

I am guessing this question is for Antonio to try Steve's patch with trace-cmd.
Meanwhile I posted a patch to fix the offset issue by changing the u32 to s32.

thanks,

 - Joel

> 
> -- Steve
> 
> diff --git a/lib/traceevent/event-parse.c b/lib/traceevent/event-parse.c
> index 4fd3907e..dc705dd2 100644
> --- a/lib/traceevent/event-parse.c
> +++ b/lib/traceevent/event-parse.c
> @@ -3595,6 +3595,45 @@ tep_find_event_by_name(struct tep_handle *tep,
>  	return event;
>  }
>  
> +static unsigned long long test_for_symbol(struct tep_handle *tep,
> +					  struct tep_print_arg *arg)
> +{
> +	unsigned long long val = 0;
> +	struct func_list *item = tep->funclist;
> +	char *func;
> +	int i;
> +
> +	if (isdigit(arg->atom.atom[0]))
> +		return 0;
> +
> +	for (i = 0; i < (int)tep->func_count; i++) {
> +		unsigned long long addr;
> +		const char *name;
> +
> +		if (tep->func_map) {
> +			addr = tep->func_map[i].addr;
> +			name = tep->func_map[i].func;
> +		} else if (item) {
> +			addr = item->addr;
> +			name = item->func;
> +			item = item->next;
> +		} else
> +			break;
> +
> +		if (strcmp(arg->atom.atom, name) == 0) {
> +			val = addr;
> +			break;
> +		}
> +	}
> +
> +	func = realloc(arg->atom.atom, 32);
> +	if (func) {
> +		snprintf(func, 32, "%lld", val);
> +		arg->atom.atom = func;
> +	}
> +	return val;
> +}
> +
>  static unsigned long long
>  eval_num_arg(void *data, int size, struct tep_event *event, struct tep_print_arg *arg)
>  {
> @@ -3611,7 +3650,10 @@ eval_num_arg(void *data, int size, struct tep_event *event, struct tep_print_arg
>  		/* ?? */
>  		return 0;
>  	case TEP_PRINT_ATOM:
> -		return strtoull(arg->atom.atom, NULL, 0);
> +		val = strtoull(arg->atom.atom, NULL, 0);
> +		if (!val)
> +			val = test_for_symbol(tep, arg);
> +		return val;
>  	case TEP_PRINT_FIELD:
>  		if (!arg->field.field) {
>  			arg->field.field = tep_find_any_field(event, arg->field.name);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2019-12-23 20:13     ` Steven Rostedt
  2019-12-23 20:18       ` Steven Rostedt
  2020-01-02 19:53       ` Joel Fernandes
@ 2020-01-07  9:21       ` Antonio Borneo
  2020-01-07 14:22         ` Steven Rostedt
  2 siblings, 1 reply; 13+ messages in thread
From: Antonio Borneo @ 2020-01-07  9:21 UTC (permalink / raw)
  To: Steven Rostedt, Joel Fernandes; +Cc: Ingo Molnar, linux-kernel, linux-stm32

On Mon, 2019-12-23 at 15:13 -0500, Steven Rostedt wrote:
> On Sat, 21 Dec 2019 18:47:41 -0500
> Joel Fernandes <joel@joelfernandes.org> wrote:
> 
> > On Wed, Dec 04, 2019 at 09:21:15AM -0500, Steven Rostedt wrote:
> > > Joel,
> > > 
> > > Any comments on this patch?  
> > 
> > Steve, it looks like this issue happens with trace-cmd not knowing what
> > _stext is. If I do cat trace_pipe , then I don't see the issue as _stext is
> > looked up correctly but the reporter of the bug is using trace-cmd. Is there
> > a way to solve this within trace-cmd? Not knowing much about trace-cmd
> > internals, I will have to defer to you on this though..
> > 
> > Other than this, I need to make the offset to _stext as s32 instead of u32
> > type so that the problem of the symbol location being before _stext does not
> > cause overflow.
> > 
> > Lastly, I am not super convinced that we need to store the full pointer just
> > to handle a case where the offset of the symbol might be more than +-2G from
> > _stext. Once we see such issue, then we can handle it. But right now the size
> > of the trace buffer is utilized better by just storing the offset IMHO.
> > 
> 
> Does this fix it for you?
> 
> -- Steve
> 

Steve,

this patch fixes the issue for me, thanks!
Tested-by: Antonio Borneo <antonio.borneo@st.com>

There are other cases where the trace buffer is under stress, e.g. during function tracing.
Would it be useful to only store the offset in such cases too?

Regards,
Antonio



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
  2020-01-07  9:21       ` Antonio Borneo
@ 2020-01-07 14:22         ` Steven Rostedt
  0 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2020-01-07 14:22 UTC (permalink / raw)
  To: Antonio Borneo; +Cc: Joel Fernandes, Ingo Molnar, linux-kernel, linux-stm32

On Tue, 7 Jan 2020 10:21:55 +0100
Antonio Borneo <antonio.borneo@st.com> wrote:

> There are other cases where the trace buffer is under stress, e.g. during function tracing.
> Would it be useful to only store the offset in such cases too?

It could be possible but may require another event type and an option,
as it may break existing tooling.

-- Steve


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-01-07 14:22 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-27 15:44 [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events Antonio Borneo
2019-12-04 14:21 ` Steven Rostedt
2019-12-04 16:04   ` Joel Fernandes
2019-12-21 23:47   ` Joel Fernandes
2019-12-23 20:13     ` Steven Rostedt
2019-12-23 20:18       ` Steven Rostedt
2020-01-02 19:53       ` Joel Fernandes
2020-01-07  9:21       ` Antonio Borneo
2020-01-07 14:22         ` Steven Rostedt
2019-12-04 16:04 ` Joel Fernandes
2019-12-07  0:00   ` Antonio Borneo
2019-12-19 18:45     ` Valentin Schneider
2019-12-21 23:27       ` Joel Fernandes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).