From: "Dorau, Lukasz" <lukasz.dorau@intel.com>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>,
"Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>,
"David S. Miller" <davem@davemloft.net>,
"linux-trace-users@vger.kernel.org"
<linux-trace-users@vger.kernel.org>,
"Slusarz, Marcin" <marcin.slusarz@intel.com>,
"Jelinek, Sarah" <sarah.jelinek@intel.com>,
"Chernookyi, Vitalii" <vitalii.chernookyi@intel.com>,
"Buella, Gabor" <gabor.buella@intel.com>
Subject: RE: Why return probes of some syscalls sometimes are not called?
Date: Mon, 13 Mar 2017 09:09:47 +0000 [thread overview]
Message-ID: <D9FFE20C522965449E182ACE73889AEB48682689@irsmsx105.ger.corp.intel.com> (raw)
In-Reply-To: <20170310115120.aa5edf952b42ba43d764605b@kernel.org>
On Friday, March 10, 2017 11:51 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> On Thu, 9 Mar 2017 18:00:54 +0100
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> > On Thu, 9 Mar 2017 13:44:38 +0000
> > "Dorau, Lukasz" <lukasz.dorau@intel.com> wrote:
> >
> > > Hi,
> > >
> > > Could someone explain me why return probes of some syscalls (for example:
> futex, poll, epoll_wait) sometimes are not called?
> > >
> > > It can be reproduced using the following bash script:
> > > https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> > >
> > > Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora
> 24):
> > >
> > > futex: p 56904 r 5489 (90% did not return (51415))
> > > poll: p 43466 r 7703 (82% did not return (35763))
> > > epoll_wait: p 73366 r 23551 (67% did not return (49815))
> > >
> > > The whole log is attached below.
> > >
> > > Lukasz
> > >
> > > ---
> > > # ./test_kprobes.sh 60
> > >
> > > Will trace using following kprobe_events:
> > > r:kprobes/r_futex sys_futex
> > > p:kprobes/p_futex sys_futex
> > > r:kprobes/r_poll sys_poll
> > > p:kprobes/p_poll sys_poll
> > > r:kprobes/r_epoll_wait sys_epoll_wait
> > > p:kprobes/p_epoll_wait sys_epoll_wait
> > > r:kprobes/r_select sys_select
> > > p:kprobes/p_select sys_select
> > > r:kprobes/r_fork sys_fork
> > > p:kprobes/p_fork sys_fork
> > > r:kprobes/r_vfork sys_vfork
> > > p:kprobes/p_vfork sys_vfork
> > > r:kprobes/r_mmap sys_mmap
> > > p:kprobes/p_mmap sys_mmap
> > > r:kprobes/r_open sys_open
> > > p:kprobes/p_open sys_open
> > > r:kprobes/r_close sys_close
> > > p:kprobes/p_close sys_close
> > > r:kprobes/r_write sys_write
> > > p:kprobes/p_write sys_write
> > > r:kprobes/r_read sys_read
> > > p:kprobes/p_read sys_read
> > >
> > > Results (60 sec):
> > > futex: p 56904 r 5489 (90% did not return (51415))
> > > poll: p 43466 r 7703 (82% did not return (35763))
> > > epoll_wait: p 73366 r 23551 (67% did not return (49815))
> > > select: p 13355 r 13351 (0% did not return (4))
> > > fork: p 0 r 0 (OK)
> > > vfork: p 0 r 0 (OK)
> > > mmap: p 4328 r 4328 (OK)
> > > open: p 4579 r 4579 (OK)
> > > close: p 7163 r 7163 (OK)
> > > write: p 22769 r 22769 (OK)
> > > read: p 40014 r 40014 (OK)
> > >
> >
> > OK, I just dumped kprobe_profile after the test,
> >
> > futex: p 7775 r 698 (91% did not return (7077))
> > poll: p 18409 r 3523 (80% did not return (14886))
> > epoll_wait: p 5122 r 2389 (53% did not return (2733))
> > select: p 5411 r 5286 (2% did not return (125))
> > fork: p 0 r 0 (OK)
> > vfork: p 0 r 0 (OK)
> > mmap: p 179 r 179 (OK)
> > open: p 921 r 921 (OK)
> > close: p 1005 r 1005 (OK)
> > write: p 5632 r 5631 (0% did not return (1))
> > read: p 10879 r 10878 (0% did not return (1))
> > r_futex 1091 0
> > p_futex 8986 0
> > r_poll 4276 0
> > p_poll 20610 0
> > r_epoll_wait 2849 0
> > p_epoll_wait 5905 0
> > r_select 10172 0
> > p_select 10312 0
> > r_fork 0 0
> > p_fork 0 0
> > r_vfork 0 0
> > p_vfork 0 0
> > r_mmap 2987 0
> > p_mmap 2884 0
> > r_open 4002 0
> > p_open 3855 0
> > r_close 4262 0
> > p_close 4122 0
> > r_write 14281 0
> > p_write 14276 0
> > r_read 41494 0
> > p_read 41363 0
> >
> > OK, there are no miss-counted event. So at least kprobe and kretprobe working
> correctly.
> > I also tried to use static tracepoints for that.
> > https://gist.github.com/mhiramat/04d782deaf110a6b3391e2ecb923325e
> >
> > ----
> > $ sudo bash test_kprobes.sh 10
> > Disabling all current probes ...
> > Removing old probe points ...
> > Adding new probe points ...
> > Enabling new probe points ...
> > Clearing old trace log ...
> >
> > Tracing for 10 seconds ... done
> > Saving trace log ... done
> >
> > Results (10 sec):
> > futex: p 9856 r 1258 (87% did not return (8598))
> > poll: p 7531 r 4031 (46% did not return (3500))
> > epoll_wait: p 1996 r 1617 (18% did not return (379))
> > select: p 273 r 257 (5% did not return (16))
> > Results (10 sec):
> > futex: p 9851 r 9851 (OK)
> > poll: p 7531 r 7531 (OK)
> > epoll_wait: p 1996 r 1996 (OK)
> > select: p 273 r 273 (OK)
> > ----
> > Of course these events are collected in "different places" so
> > it could be different. But yeah, it seems sys_* function itself
> > dosen't return but jumps into ret_from_syscall etc.
>
> Please try attached patch (that is my miss that forgot to accumulate missed
> count of kretprobe).
>
Thanks for the patch. I ran two tests using the following script:
https://gist.github.com/ldorau/36d87640ff0659a483342d490b66f034
The results are following:
1) The first test - during 60-seconds test the system was almost idle
(I did not touch the keyboard or mouse)
KProbes results (60 sec):
futex: p 1373 r 422 (69% did not return (951))
poll: p 3304 r 1575 (52% did not return (1729))
epoll_wait: p 1062 r 943 (11% did not return (119))
select: p 5978 r 5977 (0% did not return (1))
mmap: p 264 r 264 (OK)
open: p 1056 r 1056 (OK)
close: p 1198 r 1198 (OK)
write: p 4786 r 4786 (OK)
read: p 8939 r 8939 (OK)
exit: p 1 r 0
fork: p 0 r 0 (OK)
vfork: p 0 r 0 (OK)
clone: p 5 r 5 (OK)
execve: p 4 r 4 (OK)
Tracepoints results (60 sec):
futex: p 1372 r 1372 (OK)
poll: p 3304 r 3303 (0% did not return (1))
epoll_wait: p 1062 r 1062 (OK)
select: p 5978 r 5978 (OK)
mmap: p 264 r 264 (OK)
open: p 1056 r 1056 (OK)
close: p 1198 r 1198 (OK)
write: p 4786 r 4786 (OK)
read: p 8939 r 8940 (OK)
exit: p 1 r 0
2) The second test - during 60-seconds test I was normally using the system -
- I launched an internet browser, edited a file...
KProbes results (60 sec):
futex: p 35394 r 689 (98% did not return (34705))
poll: p 23449 r 896 (96% did not return (22553))
epoll_wait: p 43927 r 4973 (88% did not return (38954))
select: p 4874 r 4873 (0% did not return (1))
mmap: p 3165 r 3165 (OK)
open: p 3680 r 3678 (0% did not return (2))
close: p 6704 r 6704 (OK)
write: p 11769 r 11769 (OK)
read: p 22645 r 22647 (OK)
exit: p 30 r 0
fork: p 0 r 0 (OK)
vfork: p 0 r 0 (OK)
clone: p 170 r 170 (OK)
execve: p 2 r 2 (OK)
Tracepoints results (60 sec):
futex: p 35359 r 35268 (0% did not return (91))
poll: p 23448 r 23444 (0% did not return (4))
epoll_wait: p 43926 r 43909 (0% did not return (17))
select: p 4874 r 4874 (OK)
mmap: p 3165 r 3165 (OK)
open: p 3680 r 3827 (OK)
close: p 6704 r 6704 (OK)
write: p 11769 r 11770 (OK)
read: p 22645 r 22648 (OK)
exit: p 30 r 0
Lukasz
> I've focused on futex and it showed there were some "misses" of kretprobe.
>
> ----
> Disabling all current probes ...
> Removing old probe points ...
> Adding new probe points ...
> Enabling new probe points ...
> Clearing old trace log ...
>
> Tracing for 10 seconds ... done
> Saving trace log ... done
>
> Results (10 sec):
> futex: p 190 r 74 (61% did not return (116))
> Results (10 sec):
> futex: p 190 r 190 (OK)
>
> $ cat /sys/kernel/debug/tracing/kprobe_profile
> r_futex 176 127
> p_futex 298 0
> -----
>
> Actually, since the kretprobe is not well designed for the usecase
> probing on the sleep-able functions, it just fails if it invoked
> more than the number of CPUs in parallel.
> It is currently a limitation of kretprobe, but I'll solve it.
>
> Thanks,
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>
next prev parent reply other threads:[~2017-03-13 9:09 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-09 13:44 Why return probes of some syscalls sometimes are not called? Dorau, Lukasz
2017-03-09 13:58 ` Mathieu Desnoyers
2017-03-09 14:44 ` Steven Rostedt
2017-03-09 14:57 ` Steven Rostedt
[not found] ` <1231722663.1343.1489071698914.JavaMail.zimbra@efficios.com>
2017-03-09 15:20 ` Steven Rostedt
[not found] ` <D9FFE20C522965449E182ACE73889AEB486818F2@irsmsx105.ger.corp.intel.com>
2017-03-09 15:38 ` Steven Rostedt
2017-03-09 17:00 ` Masami Hiramatsu
2017-03-09 18:19 ` Chernookyi, Vitalii
2017-03-10 10:51 ` Masami Hiramatsu
2017-03-13 9:09 ` Dorau, Lukasz [this message]
2017-03-16 10:01 ` Masami Hiramatsu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=D9FFE20C522965449E182ACE73889AEB48682689@irsmsx105.ger.corp.intel.com \
--to=lukasz.dorau@intel.com \
--cc=ananth@linux.vnet.ibm.com \
--cc=anil.s.keshavamurthy@intel.com \
--cc=davem@davemloft.net \
--cc=gabor.buella@intel.com \
--cc=linux-trace-users@vger.kernel.org \
--cc=marcin.slusarz@intel.com \
--cc=mhiramat@kernel.org \
--cc=sarah.jelinek@intel.com \
--cc=vitalii.chernookyi@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).