linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Antonio Borneo <antonio.borneo@st.com>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>, <linux-kernel@vger.kernel.org>,
	<linux-stm32@st-md-mailman.stormreply.com>
Subject: Re: [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events
Date: Sat, 7 Dec 2019 01:00:51 +0100	[thread overview]
Message-ID: <6b5b74cede87a61d5f04f2a95bc4389830467fe9.camel@st.com> (raw)
In-Reply-To: <20191204160407.GB17404@google.com>

On Wed, 2019-12-04 at 11:04 -0500, Joel Fernandes wrote:
> On Wed, Nov 27, 2019 at 04:44:28PM +0100, Antonio Borneo wrote:
> > This tracing event class is the only instance in kernel that logs
> > in the trace buffer the instruction pointer as offset to _stext,
> > instead of logging the full pointer.
> > This looks like a nice optimization for 64 bits platforms, where a
> > 32 bit offset can take less space than a full 64 bits pointer. But
> > the symbol _stext is incorrectly resolved as zero in the expansion
> 
> I didn't get this. If _stext is 0 on any platform, then your patch doesn't
> really do anything because the offset will be equal to the ip.
> 
> Could you provide an example with real numbers showing the overflow?
> 

Thanks for your reply.

Easy to replicate, e.g. with a vanilla v5.4 on x86_64
make x86_64_defconfig
echo CONFIG_PREEMPTIRQ_EVENTS=y >> .config
make oldconfig; make

At login:
trace-cmd record -e preemptirq:irq_enable sleep 1
trace-cmd report trace.dat | head
and I incorrectly get only the offsets to _stext:
cpus=1
           sleep-197   [000]   280.378022: irq_enable:           caller=0x1afaS parent=0xc00091S
           sleep-197   [000]   280.378026: irq_enable:           caller=0x54d34S parent=0x78e00000S
           sleep-197   [000]   280.378029: irq_enable:           caller=0x1afaS parent=0xc00a84S
           sleep-197   [000]   280.378032: irq_enable:           caller=0x263dS parent=0x78e00000S
...

With this patch I get the expected/right result:
cpus=1
           sleep-210   [000]   525.177583: irq_enable:           caller=trace_hardirqs_on_thunk parent=entry_SYSCALL_64_after_hwframe
           sleep-210   [000]   525.177587: irq_enable:           caller=__do_page_fault parent=(nil)S
           sleep-210   [000]   525.177590: irq_enable:           caller=trace_hardirqs_on_thunk parent=swapgs_restore_regs_and_return_to_usermode
           sleep-210   [000]   525.177593: irq_enable:           caller=do_syscall_64 parent=(nil)S
...

The same correct result can be obtained by hardcoding _stext in TP_printk() (of course, disabling KASLR to use fixed _stext), e.g. on x86_64:
        TP_printk("caller=%pS parent=%pS",
-                 (void *)((unsigned long)(_stext) + __entry->caller_offs),
-                 (void *)((unsigned long)(_stext) + __entry->parent_offs))
+                 (void *)(0xffffffff81000000UL + __entry->caller_offs),
+                 (void *)(0xffffffff81000000UL + __entry->parent_offs))
 );

The problem is caused by having the macros TP_fast_assign() and TP_printk() expanded in different way.
The simpler expansion of the former macro resolves correctly the symbol _stext.
The more complex expansion of the latter macro passes through some stringification that at the end incorrectly expands _stext as 0.
At first I thought to fix the macro expansion of TP_printk(), but then I realized that:
- on 32 bit systems there is no gain in coding 32 bit offset instead of 32 bit pointer
- on 64 bit systems I cannot find a constraint that guarantees that modules are loaded within 32 bit offset from _stext
then, this patch!

> > of TP_printk(), which then prints only the hex offset instead of
> > the name of the caller function. Plus, on arm arch the kernel
> > modules are loaded at address lower than _stext, causing the u32
> > offset arithmetics to overflow and wrap at 32 bits.
> 
> If we use signed 32-bit, will that solve the module issue?
> 

Of course, on arm 32 bit using a signed offset would solve the 32 bit wraparound.

> > I did not identified a 64 bit arch where the modules are loaded at
> > offset from _stext that exceed u32 range, but I also did not
> > identified any constraint to feel safe with a u32 offset.
> > 
> > Log directly the instruction pointer instead of the offset to
> > _stext.
> 
> I am not comfortable with this patch at the moment, mainly because it will
> increase the size of this rather high frequency event. But I'm not saying
> there isn't an issue on 32-bit. Let's discuss more.
> 

Sure, the pressure on the trace buffer will increase on a 64 bit system, but in general it also has more memory than an embedded 32 bit system.
The key point is, instead, can we feel safe with a 32 bit offset in a 64 bit system? Is there any guarantee a module is loaded within 32 bits offset from _stext?

Antonio

> thanks,
> 
>  - Joel
> 
> 
> > Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
> > Fixes: d59158162e03 ("tracing: Add support for preempt and irq enable/disable events")
> > ---
> >  include/trace/events/preemptirq.h | 12 ++++++------
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/include/trace/events/preemptirq.h b/include/trace/events/preemptirq.h
> > index 95fba0471e5b..d548a6aafa18 100644
> > --- a/include/trace/events/preemptirq.h
> > +++ b/include/trace/events/preemptirq.h
> > @@ -18,18 +18,18 @@ DECLARE_EVENT_CLASS(preemptirq_template,
> >  	TP_ARGS(ip, parent_ip),
> >  
> >  	TP_STRUCT__entry(
> > -		__field(u32, caller_offs)
> > -		__field(u32, parent_offs)
> > +		__field(unsigned long, caller_ip)
> > +		__field(unsigned long, parent_ip)
> >  	),
> >  
> >  	TP_fast_assign(
> > -		__entry->caller_offs = (u32)(ip - (unsigned long)_stext);
> > -		__entry->parent_offs = (u32)(parent_ip - (unsigned long)_stext);
> > +		__entry->caller_ip = ip;
> > +		__entry->parent_ip = parent_ip;
> >  	),
> >  
> >  	TP_printk("caller=%pS parent=%pS",
> > -		  (void *)((unsigned long)(_stext) + __entry->caller_offs),
> > -		  (void *)((unsigned long)(_stext) + __entry->parent_offs))
> > +		  (void *)__entry->caller_ip,
> > +		  (void *)__entry->parent_ip)
> >  );
> >  
> >  #ifdef CONFIG_TRACE_IRQFLAGS
> > -- 
> > 2.24.0
> > 


  reply	other threads:[~2019-12-07  0:01 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-27 15:44 [PATCH] tracing: Fix printing ptrs in preempt/irq enable/disable events Antonio Borneo
2019-12-04 14:21 ` Steven Rostedt
2019-12-04 16:04   ` Joel Fernandes
2019-12-21 23:47   ` Joel Fernandes
2019-12-23 20:13     ` Steven Rostedt
2019-12-23 20:18       ` Steven Rostedt
2020-01-02 19:53       ` Joel Fernandes
2020-01-07  9:21       ` Antonio Borneo
2020-01-07 14:22         ` Steven Rostedt
2019-12-04 16:04 ` Joel Fernandes
2019-12-07  0:00   ` Antonio Borneo [this message]
2019-12-19 18:45     ` Valentin Schneider
2019-12-21 23:27       ` Joel Fernandes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6b5b74cede87a61d5f04f2a95bc4389830467fe9.camel@st.com \
    --to=antonio.borneo@st.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-stm32@st-md-mailman.stormreply.com \
    --cc=mingo@redhat.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).