linux-trace-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Dorau, Lukasz" <lukasz.dorau@intel.com>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>,
	"Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	"linux-trace-users@vger.kernel.org"
	<linux-trace-users@vger.kernel.org>,
	"Slusarz, Marcin" <marcin.slusarz@intel.com>,
	"Jelinek, Sarah" <sarah.jelinek@intel.com>,
	"Chernookyi, Vitalii" <vitalii.chernookyi@intel.com>,
	"Buella, Gabor" <gabor.buella@intel.com>
Subject: RE: Why return probes of some syscalls sometimes are not called?
Date: Mon, 13 Mar 2017 09:09:47 +0000	[thread overview]
Message-ID: <D9FFE20C522965449E182ACE73889AEB48682689@irsmsx105.ger.corp.intel.com> (raw)
In-Reply-To: <20170310115120.aa5edf952b42ba43d764605b@kernel.org>

On Friday, March 10, 2017 11:51 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> On Thu, 9 Mar 2017 18:00:54 +0100
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> > On Thu, 9 Mar 2017 13:44:38 +0000
> > "Dorau, Lukasz" <lukasz.dorau@intel.com> wrote:
> >
> > > Hi,
> > >
> > > Could someone explain me why return probes of some syscalls (for example:
> futex, poll, epoll_wait) sometimes are not called?
> > >
> > > It can be reproduced using the following bash script:
> > > https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> > >
> > > Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora
> 24):
> > >
> > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > >
> > > The whole log is attached below.
> > >
> > > Lukasz
> > >
> > > ---
> > > # ./test_kprobes.sh 60
> > >
> > > Will trace using following kprobe_events:
> > > r:kprobes/r_futex sys_futex
> > > p:kprobes/p_futex sys_futex
> > > r:kprobes/r_poll sys_poll
> > > p:kprobes/p_poll sys_poll
> > > r:kprobes/r_epoll_wait sys_epoll_wait
> > > p:kprobes/p_epoll_wait sys_epoll_wait
> > > r:kprobes/r_select sys_select
> > > p:kprobes/p_select sys_select
> > > r:kprobes/r_fork sys_fork
> > > p:kprobes/p_fork sys_fork
> > > r:kprobes/r_vfork sys_vfork
> > > p:kprobes/p_vfork sys_vfork
> > > r:kprobes/r_mmap sys_mmap
> > > p:kprobes/p_mmap sys_mmap
> > > r:kprobes/r_open sys_open
> > > p:kprobes/p_open sys_open
> > > r:kprobes/r_close sys_close
> > > p:kprobes/p_close sys_close
> > > r:kprobes/r_write sys_write
> > > p:kprobes/p_write sys_write
> > > r:kprobes/r_read sys_read
> > > p:kprobes/p_read sys_read
> > >
> > > Results (60 sec):
> > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > > select:      p 13355    r 13351    (0% did not return (4))
> > > fork:        p 0        r 0        (OK)
> > > vfork:       p 0        r 0        (OK)
> > > mmap:        p 4328     r 4328     (OK)
> > > open:        p 4579     r 4579     (OK)
> > > close:       p 7163     r 7163     (OK)
> > > write:       p 22769    r 22769    (OK)
> > > read:        p 40014    r 40014    (OK)
> > >
> >
> > OK, I just dumped kprobe_profile after the test,
> >
> > futex:       p 7775     r 698      (91% did not return (7077))
> > poll:        p 18409    r 3523     (80% did not return (14886))
> > epoll_wait:  p 5122     r 2389     (53% did not return (2733))
> > select:      p 5411     r 5286     (2% did not return (125))
> > fork:        p 0        r 0        (OK)
> > vfork:       p 0        r 0        (OK)
> > mmap:        p 179      r 179      (OK)
> > open:        p 921      r 921      (OK)
> > close:       p 1005     r 1005     (OK)
> > write:       p 5632     r 5631     (0% did not return (1))
> > read:        p 10879    r 10878    (0% did not return (1))
> >   r_futex                                                 1091               0
> >   p_futex                                                 8986               0
> >   r_poll                                                  4276               0
> >   p_poll                                                 20610               0
> >   r_epoll_wait                                            2849               0
> >   p_epoll_wait                                            5905               0
> >   r_select                                               10172               0
> >   p_select                                               10312               0
> >   r_fork                                                     0               0
> >   p_fork                                                     0               0
> >   r_vfork                                                    0               0
> >   p_vfork                                                    0               0
> >   r_mmap                                                  2987               0
> >   p_mmap                                                  2884               0
> >   r_open                                                  4002               0
> >   p_open                                                  3855               0
> >   r_close                                                 4262               0
> >   p_close                                                 4122               0
> >   r_write                                                14281               0
> >   p_write                                                14276               0
> >   r_read                                                 41494               0
> >   p_read                                                 41363               0
> >
> > OK, there are no miss-counted event. So at least kprobe and kretprobe working
> correctly.
> > I also tried to use static tracepoints for that.
> > https://gist.github.com/mhiramat/04d782deaf110a6b3391e2ecb923325e
> >
> > ----
> > $ sudo bash test_kprobes.sh 10
> > Disabling all current probes ...
> > Removing old probe points ...
> > Adding new probe points ...
> > Enabling new probe points ...
> > Clearing old trace log ...
> >
> > Tracing for 10 seconds ... done
> > Saving trace log ... done
> >
> > Results (10 sec):
> > futex:       p 9856     r 1258     (87% did not return (8598))
> > poll:        p 7531     r 4031     (46% did not return (3500))
> > epoll_wait:  p 1996     r 1617     (18% did not return (379))
> > select:      p 273      r 257      (5% did not return (16))
> > Results (10 sec):
> > futex:       p 9851     r 9851     (OK)
> > poll:        p 7531     r 7531     (OK)
> > epoll_wait:  p 1996     r 1996     (OK)
> > select:      p 273      r 273      (OK)
> > ----
> > Of course these events are collected in "different places" so
> > it could be different. But yeah, it seems sys_* function itself
> > dosen't return but jumps into ret_from_syscall etc.
> 
> Please try attached patch (that is my miss that forgot to accumulate missed
> count of kretprobe).
> 

Thanks for the patch. I ran two tests using the following script:
https://gist.github.com/ldorau/36d87640ff0659a483342d490b66f034
The results are following:

1) The first test - during 60-seconds test the system was almost idle
   (I did not touch the keyboard or mouse)

KProbes results (60 sec):
futex:       p 1373     r 422      (69% did not return (951))
poll:        p 3304     r 1575     (52% did not return (1729))
epoll_wait:  p 1062     r 943      (11% did not return (119))
select:      p 5978     r 5977     (0% did not return (1))
mmap:        p 264      r 264      (OK)
open:        p 1056     r 1056     (OK)
close:       p 1198     r 1198     (OK)
write:       p 4786     r 4786     (OK)
read:        p 8939     r 8939     (OK)
exit:        p 1        r 0    
fork:        p 0        r 0        (OK)
vfork:       p 0        r 0        (OK)
clone:       p 5        r 5        (OK)
execve:      p 4        r 4        (OK)

Tracepoints results (60 sec):
futex:       p 1372     r 1372     (OK)
poll:        p 3304     r 3303     (0% did not return (1))
epoll_wait:  p 1062     r 1062     (OK)
select:      p 5978     r 5978     (OK)
mmap:        p 264      r 264      (OK)
open:        p 1056     r 1056     (OK)
close:       p 1198     r 1198     (OK)
write:       p 4786     r 4786     (OK)
read:        p 8939     r 8940     (OK)
exit:        p 1        r 0    

2) The second test - during 60-seconds test I was normally using the system -
   - I launched an internet browser, edited a file...

KProbes results (60 sec):
futex:       p 35394    r 689      (98% did not return (34705))
poll:        p 23449    r 896      (96% did not return (22553))
epoll_wait:  p 43927    r 4973     (88% did not return (38954))
select:      p 4874     r 4873     (0% did not return (1))
mmap:        p 3165     r 3165     (OK)
open:        p 3680     r 3678     (0% did not return (2))
close:       p 6704     r 6704     (OK)
write:       p 11769    r 11769    (OK)
read:        p 22645    r 22647    (OK)
exit:        p 30       r 0        
fork:        p 0        r 0        (OK)
vfork:       p 0        r 0        (OK)
clone:       p 170      r 170      (OK)
execve:      p 2        r 2        (OK)

Tracepoints results (60 sec):
futex:       p 35359    r 35268    (0% did not return (91))
poll:        p 23448    r 23444    (0% did not return (4))
epoll_wait:  p 43926    r 43909    (0% did not return (17))
select:      p 4874     r 4874     (OK)
mmap:        p 3165     r 3165     (OK)
open:        p 3680     r 3827     (OK)
close:       p 6704     r 6704     (OK)
write:       p 11769    r 11770    (OK)
read:        p 22645    r 22648    (OK)
exit:        p 30       r 0        

Lukasz



> I've focused on futex and it showed there were some "misses" of kretprobe.
> 
> ----
> Disabling all current probes ...
> Removing old probe points ...
> Adding new probe points ...
> Enabling new probe points ...
> Clearing old trace log ...
> 
> Tracing for 10 seconds ... done
> Saving trace log ... done
> 
> Results (10 sec):
> futex:       p 190      r 74       (61% did not return (116))
> Results (10 sec):
> futex:       p 190      r 190      (OK)
> 
> $ cat /sys/kernel/debug/tracing/kprobe_profile
>   r_futex                                                  176             127
>   p_futex                                                  298               0
> -----
> 
> Actually, since the kretprobe is not well designed for the usecase
> probing on the sleep-able functions, it just fails if it invoked
> more than the number of CPUs in parallel.
> It is currently a limitation of kretprobe, but I'll solve it.
> 
> Thanks,
> 
> --
> Masami Hiramatsu <mhiramat@kernel.org>

  reply	other threads:[~2017-03-13  9:09 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-09 13:44 Why return probes of some syscalls sometimes are not called? Dorau, Lukasz
2017-03-09 13:58 ` Mathieu Desnoyers
2017-03-09 14:44   ` Steven Rostedt
2017-03-09 14:57     ` Steven Rostedt
     [not found]     ` <1231722663.1343.1489071698914.JavaMail.zimbra@efficios.com>
2017-03-09 15:20       ` Steven Rostedt
     [not found]         ` <D9FFE20C522965449E182ACE73889AEB486818F2@irsmsx105.ger.corp.intel.com>
2017-03-09 15:38           ` Steven Rostedt
2017-03-09 17:00 ` Masami Hiramatsu
2017-03-09 18:19   ` Chernookyi, Vitalii
2017-03-10 10:51   ` Masami Hiramatsu
2017-03-13  9:09     ` Dorau, Lukasz [this message]
2017-03-16 10:01       ` Masami Hiramatsu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D9FFE20C522965449E182ACE73889AEB48682689@irsmsx105.ger.corp.intel.com \
    --to=lukasz.dorau@intel.com \
    --cc=ananth@linux.vnet.ibm.com \
    --cc=anil.s.keshavamurthy@intel.com \
    --cc=davem@davemloft.net \
    --cc=gabor.buella@intel.com \
    --cc=linux-trace-users@vger.kernel.org \
    --cc=marcin.slusarz@intel.com \
    --cc=mhiramat@kernel.org \
    --cc=sarah.jelinek@intel.com \
    --cc=vitalii.chernookyi@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).