linux-trace-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Why return probes of some syscalls sometimes are not called?
@ 2017-03-09 13:44 Dorau, Lukasz
  2017-03-09 13:58 ` Mathieu Desnoyers
  2017-03-09 17:00 ` Masami Hiramatsu
  0 siblings, 2 replies; 11+ messages in thread
From: Dorau, Lukasz @ 2017-03-09 13:44 UTC (permalink / raw)
  To: Ananth N Mavinakayanahalli, Keshavamurthy, Anil S,
	David S. Miller, Masami Hiramatsu, linux-trace-users
  Cc: Slusarz, Marcin, Jelinek, Sarah, Chernookyi, Vitalii, Buella, Gabor

Hi,

Could someone explain me why return probes of some syscalls (for example: futex, poll, epoll_wait) sometimes are not called?

It can be reproduced using the following bash script:
https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec

Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora 24):

futex:       p 56904    r 5489     (90% did not return (51415))
poll:        p 43466    r 7703     (82% did not return (35763))
epoll_wait:  p 73366    r 23551    (67% did not return (49815))

The whole log is attached below.

Lukasz

---
# ./test_kprobes.sh 60

Will trace using following kprobe_events:
r:kprobes/r_futex sys_futex
p:kprobes/p_futex sys_futex
r:kprobes/r_poll sys_poll
p:kprobes/p_poll sys_poll
r:kprobes/r_epoll_wait sys_epoll_wait
p:kprobes/p_epoll_wait sys_epoll_wait
r:kprobes/r_select sys_select
p:kprobes/p_select sys_select
r:kprobes/r_fork sys_fork
p:kprobes/p_fork sys_fork
r:kprobes/r_vfork sys_vfork
p:kprobes/p_vfork sys_vfork
r:kprobes/r_mmap sys_mmap
p:kprobes/p_mmap sys_mmap
r:kprobes/r_open sys_open
p:kprobes/p_open sys_open
r:kprobes/r_close sys_close
p:kprobes/p_close sys_close
r:kprobes/r_write sys_write
p:kprobes/p_write sys_write
r:kprobes/r_read sys_read
p:kprobes/p_read sys_read

Results (60 sec):
futex:       p 56904    r 5489     (90% did not return (51415))
poll:        p 43466    r 7703     (82% did not return (35763))
epoll_wait:  p 73366    r 23551    (67% did not return (49815))
select:      p 13355    r 13351    (0% did not return (4))
fork:        p 0        r 0        (OK)
vfork:       p 0        r 0        (OK)
mmap:        p 4328     r 4328     (OK)
open:        p 4579     r 4579     (OK)
close:       p 7163     r 7163     (OK)
write:       p 22769    r 22769    (OK)
read:        p 40014    r 40014    (OK)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Why return probes of some syscalls sometimes are not called?
  2017-03-09 13:44 Why return probes of some syscalls sometimes are not called? Dorau, Lukasz
@ 2017-03-09 13:58 ` Mathieu Desnoyers
  2017-03-09 14:44   ` Steven Rostedt
  2017-03-09 17:00 ` Masami Hiramatsu
  1 sibling, 1 reply; 11+ messages in thread
From: Mathieu Desnoyers @ 2017-03-09 13:58 UTC (permalink / raw)
  To: Dorau, Lukasz
  Cc: Ananth N Mavinakayanahalli, Keshavamurthy, Anil S,
	David S. Miller, Masami Hiramatsu, linux-trace-users, Slusarz,
	Marcin, Jelinek, Sarah, Chernookyi, Vitalii, Buella, Gabor

----- On Mar 9, 2017, at 8:44 AM, Dorau, Lukasz lukasz.dorau@intel.com wrote:

> Hi,
> 
> Could someone explain me why return probes of some syscalls (for example: futex,
> poll, epoll_wait) sometimes are not called?
> 
> It can be reproduced using the following bash script:
> https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> 
> Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora 24):
> 
> futex:       p 56904    r 5489     (90% did not return (51415))
> poll:        p 43466    r 7703     (82% did not return (35763))
> epoll_wait:  p 73366    r 23551    (67% did not return (49815))

Most likely scenario: those processes are still blocked on those
system calls when your tracing ends.

AFAIU, another possible (less frequent) scenario: a process gets
killed with SIGKILL while blocked on the signal.

Thanks,

Mathieu

> 
> The whole log is attached below.
> 
> Lukasz
> 
> ---
> # ./test_kprobes.sh 60
> 
> Will trace using following kprobe_events:
> r:kprobes/r_futex sys_futex
> p:kprobes/p_futex sys_futex
> r:kprobes/r_poll sys_poll
> p:kprobes/p_poll sys_poll
> r:kprobes/r_epoll_wait sys_epoll_wait
> p:kprobes/p_epoll_wait sys_epoll_wait
> r:kprobes/r_select sys_select
> p:kprobes/p_select sys_select
> r:kprobes/r_fork sys_fork
> p:kprobes/p_fork sys_fork
> r:kprobes/r_vfork sys_vfork
> p:kprobes/p_vfork sys_vfork
> r:kprobes/r_mmap sys_mmap
> p:kprobes/p_mmap sys_mmap
> r:kprobes/r_open sys_open
> p:kprobes/p_open sys_open
> r:kprobes/r_close sys_close
> p:kprobes/p_close sys_close
> r:kprobes/r_write sys_write
> p:kprobes/p_write sys_write
> r:kprobes/r_read sys_read
> p:kprobes/p_read sys_read
> 
> Results (60 sec):
> futex:       p 56904    r 5489     (90% did not return (51415))
> poll:        p 43466    r 7703     (82% did not return (35763))
> epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> select:      p 13355    r 13351    (0% did not return (4))
> fork:        p 0        r 0        (OK)
> vfork:       p 0        r 0        (OK)
> mmap:        p 4328     r 4328     (OK)
> open:        p 4579     r 4579     (OK)
> close:       p 7163     r 7163     (OK)
> write:       p 22769    r 22769    (OK)
> read:        p 40014    r 40014    (OK)
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-trace-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Why return probes of some syscalls sometimes are not called?
  2017-03-09 13:58 ` Mathieu Desnoyers
@ 2017-03-09 14:44   ` Steven Rostedt
  2017-03-09 14:57     ` Steven Rostedt
       [not found]     ` <1231722663.1343.1489071698914.JavaMail.zimbra@efficios.com>
  0 siblings, 2 replies; 11+ messages in thread
From: Steven Rostedt @ 2017-03-09 14:44 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Dorau, Lukasz, Ananth N Mavinakayanahalli, Keshavamurthy, Anil S,
	David S. Miller, Masami Hiramatsu, linux-trace-users, Slusarz,
	Marcin, Jelinek, Sarah, Chernookyi, Vitalii, Buella, Gabor

On Thu, 9 Mar 2017 13:58:29 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> ----- On Mar 9, 2017, at 8:44 AM, Dorau, Lukasz lukasz.dorau@intel.com wrote:
> 
> > Hi,
> > 
> > Could someone explain me why return probes of some syscalls (for example: futex,
> > poll, epoll_wait) sometimes are not called?
> > 
> > It can be reproduced using the following bash script:
> > https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> > 
> > Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora 24):
> > 
> > futex:       p 56904    r 5489     (90% did not return (51415))
> > poll:        p 43466    r 7703     (82% did not return (35763))
> > epoll_wait:  p 73366    r 23551    (67% did not return (49815))  
> 
> Most likely scenario: those processes are still blocked on those
> system calls when your tracing ends.

This is very common but those numbers are very high. I doubt there's 51
thousand threads blocked on a futex when tracing ended.

> 
> AFAIU, another possible (less frequent) scenario: a process gets
> killed with SIGKILL while blocked on the signal.
> 

This could be.

> > 
> > Results (60 sec):
> > futex:       p 56904    r 5489     (90% did not return (51415))
> > poll:        p 43466    r 7703     (82% did not return (35763))
> > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > select:      p 13355    r 13351    (0% did not return (4))

All these are common system calls that tasks simply sleep on. But it
would take a nasty kill to have them not return back to the program to
clean up nicely. Another possibility is that these actually have another
way out from the kernel that isn't caught by tracing. I'll take a look.

-- Steve



> > fork:        p 0        r 0        (OK)
> > vfork:       p 0        r 0        (OK)
> > mmap:        p 4328     r 4328     (OK)
> > open:        p 4579     r 4579     (OK)
> > close:       p 7163     r 7163     (OK)
> > write:       p 22769    r 22769    (OK)
> > read:        p 40014    r 40014    (OK)
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-trace-users" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html  
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Why return probes of some syscalls sometimes are not called?
  2017-03-09 14:44   ` Steven Rostedt
@ 2017-03-09 14:57     ` Steven Rostedt
       [not found]     ` <1231722663.1343.1489071698914.JavaMail.zimbra@efficios.com>
  1 sibling, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2017-03-09 14:57 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Dorau, Lukasz, Ananth N Mavinakayanahalli, Keshavamurthy, Anil S,
	David S. Miller, Masami Hiramatsu, linux-trace-users, Slusarz,
	Marcin, Jelinek, Sarah, Chernookyi, Vitalii, Buella, Gabor

On Thu, 9 Mar 2017 09:44:55 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Thu, 9 Mar 2017 13:58:29 +0000 (UTC)
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> 
> > ----- On Mar 9, 2017, at 8:44 AM, Dorau, Lukasz lukasz.dorau@intel.com wrote:
> >   
> > > Hi,
> > > 
> > > Could someone explain me why return probes of some syscalls (for example: futex,
> > > poll, epoll_wait) sometimes are not called?
> > > 
> > > It can be reproduced using the following bash script:
> > > https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> > > 
> > > Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora 24):
> > > 
> > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))    
> > 
> > Most likely scenario: those processes are still blocked on those
> > system calls when your tracing ends.  
> 
> This is very common but those numbers are very high. I doubt there's 51
> thousand threads blocked on a futex when tracing ended.
> 
> > 
> > AFAIU, another possible (less frequent) scenario: a process gets
> > killed with SIGKILL while blocked on the signal.
> >   
> 
> This could be.
> 
> > > 
> > > Results (60 sec):
> > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > > select:      p 13355    r 13351    (0% did not return (4))  
> 
> All these are common system calls that tasks simply sleep on. But it
> would take a nasty kill to have them not return back to the program to
> clean up nicely. Another possibility is that these actually have another
> way out from the kernel that isn't caught by tracing. I'll take a look.
> 

BTW, what happens if you change your script to use the syscall
tracepoints instead? As syscalls have an entry and exit tracepoint. Do
the results change?

-- Steve

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Why return probes of some syscalls sometimes are not called?
       [not found]     ` <1231722663.1343.1489071698914.JavaMail.zimbra@efficios.com>
@ 2017-03-09 15:20       ` Steven Rostedt
       [not found]         ` <D9FFE20C522965449E182ACE73889AEB486818F2@irsmsx105.ger.corp.intel.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2017-03-09 15:20 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Lukasz Dorau, Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	David S. Miller, Masami Hiramatsu, linux-trace-users,
	Marcin Slusarz, Sarah Jelinek, Vitalii Chernookyi, Gabor Buella

On Thu, 9 Mar 2017 15:01:38 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

 
> Another more likely scenario is if a multithreaded process
> has many threads blocked (e.g. on a futex), and one of the threads
> exits cleanly or forks. I suspect the kernel will just tear down the
> other threads without hitting syscall exit.

Maybe, if this happens multiple times, it could cause those kinds of
numbers. A fork was done zero times according to the numbers. Maybe
exit should be measured too.

-- Steve

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Why return probes of some syscalls sometimes are not called?
       [not found]         ` <D9FFE20C522965449E182ACE73889AEB486818F2@irsmsx105.ger.corp.intel.com>
@ 2017-03-09 15:38           ` Steven Rostedt
  0 siblings, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2017-03-09 15:38 UTC (permalink / raw)
  To: Dorau, Lukasz
  Cc: Mathieu Desnoyers, Ananth N Mavinakayanahalli, Keshavamurthy,
	Anil S, David S. Miller, Masami Hiramatsu, linux-trace-users,
	Slusarz, Marcin, Jelinek, Sarah, Chernookyi, Vitalii, Buella,
	Gabor

On Thu, 9 Mar 2017 15:30:42 +0000
"Dorau, Lukasz" <lukasz.dorau@intel.com> wrote:

> On Thursday, March 9, 2017 4:21 PM Steven Rostedt wrote:
> > On Thu, 9 Mar 2017 15:01:38 +0000 (UTC)
> > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> > 
> >   
> > > Another more likely scenario is if a multithreaded process
> > > has many threads blocked (e.g. on a futex), and one of the threads
> > > exits cleanly or forks. I suspect the kernel will just tear down the
> > > other threads without hitting syscall exit.  
> > 
> > Maybe, if this happens multiple times, it could cause those kinds of
> > numbers. A fork was done zero times according to the numbers. Maybe
> > exit should be measured too.
> >   
> 
> I have run the test again with exit:
> 
> Results (60 sec):
> futex:       p 2968     r 401      (86% did not return (2567))
> poll:        p 13251    r 3957     (70% did not return (9294))
> epoll_wait:  p 1485     r 556      (62% did not return (929))
> select:      p 8063     r 8066     (OK)
> fork:        p 0        r 0        (OK)
> vfork:       p 4        r 4        (OK)
> clone:       p 6962     r 6962     (OK)
> exit:        p 3        r 0        (100% did not return (3))

I expected exit not to have a return, that's the property of exit. But
it shows that the above poll and friends is not because of exit and
fork.

Can you change your script to use syscall tracepoints instead of
retprobes?

-- Steve

> mmap:        p 71740    r 71740    (OK)
> open:        p 43822    r 43825    (OK)
> close:       p 79805    r 79806    (OK)
> write:       p 31080    r 31078    (0% did not return (2))
> read:        p 108456   r 108449   (0% did not return (7))
> 
> Lukasz


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Why return probes of some syscalls sometimes are not called?
  2017-03-09 13:44 Why return probes of some syscalls sometimes are not called? Dorau, Lukasz
  2017-03-09 13:58 ` Mathieu Desnoyers
@ 2017-03-09 17:00 ` Masami Hiramatsu
  2017-03-09 18:19   ` Chernookyi, Vitalii
  2017-03-10 10:51   ` Masami Hiramatsu
  1 sibling, 2 replies; 11+ messages in thread
From: Masami Hiramatsu @ 2017-03-09 17:00 UTC (permalink / raw)
  To: Dorau, Lukasz
  Cc: Ananth N Mavinakayanahalli, Keshavamurthy, Anil S,
	David S. Miller, linux-trace-users, Slusarz, Marcin, Jelinek,
	Sarah, Chernookyi, Vitalii, Buella, Gabor

On Thu, 9 Mar 2017 13:44:38 +0000
"Dorau, Lukasz" <lukasz.dorau@intel.com> wrote:

> Hi,
> 
> Could someone explain me why return probes of some syscalls (for example: futex, poll, epoll_wait) sometimes are not called?
> 
> It can be reproduced using the following bash script:
> https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> 
> Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora 24):
> 
> futex:       p 56904    r 5489     (90% did not return (51415))
> poll:        p 43466    r 7703     (82% did not return (35763))
> epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> 
> The whole log is attached below.
> 
> Lukasz
> 
> ---
> # ./test_kprobes.sh 60
> 
> Will trace using following kprobe_events:
> r:kprobes/r_futex sys_futex
> p:kprobes/p_futex sys_futex
> r:kprobes/r_poll sys_poll
> p:kprobes/p_poll sys_poll
> r:kprobes/r_epoll_wait sys_epoll_wait
> p:kprobes/p_epoll_wait sys_epoll_wait
> r:kprobes/r_select sys_select
> p:kprobes/p_select sys_select
> r:kprobes/r_fork sys_fork
> p:kprobes/p_fork sys_fork
> r:kprobes/r_vfork sys_vfork
> p:kprobes/p_vfork sys_vfork
> r:kprobes/r_mmap sys_mmap
> p:kprobes/p_mmap sys_mmap
> r:kprobes/r_open sys_open
> p:kprobes/p_open sys_open
> r:kprobes/r_close sys_close
> p:kprobes/p_close sys_close
> r:kprobes/r_write sys_write
> p:kprobes/p_write sys_write
> r:kprobes/r_read sys_read
> p:kprobes/p_read sys_read
> 
> Results (60 sec):
> futex:       p 56904    r 5489     (90% did not return (51415))
> poll:        p 43466    r 7703     (82% did not return (35763))
> epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> select:      p 13355    r 13351    (0% did not return (4))
> fork:        p 0        r 0        (OK)
> vfork:       p 0        r 0        (OK)
> mmap:        p 4328     r 4328     (OK)
> open:        p 4579     r 4579     (OK)
> close:       p 7163     r 7163     (OK)
> write:       p 22769    r 22769    (OK)
> read:        p 40014    r 40014    (OK)
> 

OK, I just dumped kprobe_profile after the test,

futex:       p 7775     r 698      (91% did not return (7077))
poll:        p 18409    r 3523     (80% did not return (14886))
epoll_wait:  p 5122     r 2389     (53% did not return (2733))
select:      p 5411     r 5286     (2% did not return (125))
fork:        p 0        r 0        (OK)
vfork:       p 0        r 0        (OK)
mmap:        p 179      r 179      (OK)
open:        p 921      r 921      (OK)
close:       p 1005     r 1005     (OK)
write:       p 5632     r 5631     (0% did not return (1))
read:        p 10879    r 10878    (0% did not return (1))
  r_futex                                                 1091               0
  p_futex                                                 8986               0
  r_poll                                                  4276               0
  p_poll                                                 20610               0
  r_epoll_wait                                            2849               0
  p_epoll_wait                                            5905               0
  r_select                                               10172               0
  p_select                                               10312               0
  r_fork                                                     0               0
  p_fork                                                     0               0
  r_vfork                                                    0               0
  p_vfork                                                    0               0
  r_mmap                                                  2987               0
  p_mmap                                                  2884               0
  r_open                                                  4002               0
  p_open                                                  3855               0
  r_close                                                 4262               0
  p_close                                                 4122               0
  r_write                                                14281               0
  p_write                                                14276               0
  r_read                                                 41494               0
  p_read                                                 41363               0

OK, there are no miss-counted event. So at least kprobe and kretprobe working correctly.
I also tried to use static tracepoints for that.
https://gist.github.com/mhiramat/04d782deaf110a6b3391e2ecb923325e

----
$ sudo bash test_kprobes.sh 10
Disabling all current probes ...
Removing old probe points ...
Adding new probe points ...
Enabling new probe points ...
Clearing old trace log ...

Tracing for 10 seconds ... done
Saving trace log ... done

Results (10 sec):
futex:       p 9856     r 1258     (87% did not return (8598))
poll:        p 7531     r 4031     (46% did not return (3500))
epoll_wait:  p 1996     r 1617     (18% did not return (379))
select:      p 273      r 257      (5% did not return (16))
Results (10 sec):
futex:       p 9851     r 9851     (OK)
poll:        p 7531     r 7531     (OK)
epoll_wait:  p 1996     r 1996     (OK)
select:      p 273      r 273      (OK)
----
Of course these events are collected in "different places" so
it could be different. But yeah, it seems sys_* function itself
dosen't return but jumps into ret_from_syscall etc.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: Why return probes of some syscalls sometimes are not called?
  2017-03-09 17:00 ` Masami Hiramatsu
@ 2017-03-09 18:19   ` Chernookyi, Vitalii
  2017-03-10 10:51   ` Masami Hiramatsu
  1 sibling, 0 replies; 11+ messages in thread
From: Chernookyi, Vitalii @ 2017-03-09 18:19 UTC (permalink / raw)
  To: Masami Hiramatsu, Dorau, Lukasz
  Cc: Ananth N Mavinakayanahalli, Keshavamurthy, Anil S,
	David S. Miller, linux-trace-users, Slusarz, Marcin, Jelinek,
	Sarah, Buella, Gabor

Dear,

> Of course these events are collected in "different places" so
> it could be different. But yeah, it seems sys_* function itself
> dosen't return but jumps into ret_from_syscall etc.

In case of sys_futex and sys_poll there should be different reason of this issue, because most likely next calls could not have different return path:

/* test-futex.c */
Int main() {
    Return syscall(SYS_futex, 1, 2, 3, 4, 5, 6);
}

/* test-poll.c */
Int main() {
    Return syscall(SYS_poll, 0x102, 0x103, 0x104, 0x105, 0x106, 0x107);
}

With best regards,

Vitaly

-----Original Message-----
From: Masami Hiramatsu [mailto:mhiramat@kernel.org] 
Sent: Thursday, March 9, 2017 6:01 PM
To: Dorau, Lukasz <lukasz.dorau@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>; Keshavamurthy, Anil S <anil.s.keshavamurthy@intel.com>; David S. Miller <davem@davemloft.net>; linux-trace-users@vger.kernel.org; Slusarz, Marcin <marcin.slusarz@intel.com>; Jelinek, Sarah <sarah.jelinek@intel.com>; Chernookyi, Vitalii <vitalii.chernookyi@intel.com>; Buella, Gabor <gabor.buella@intel.com>
Subject: Re: Why return probes of some syscalls sometimes are not called?

On Thu, 9 Mar 2017 13:44:38 +0000
"Dorau, Lukasz" <lukasz.dorau@intel.com> wrote:

> Hi,
> 
> Could someone explain me why return probes of some syscalls (for example: futex, poll, epoll_wait) sometimes are not called?
> 
> It can be reproduced using the following bash script:
> https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> 
> Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora 24):
> 
> futex:       p 56904    r 5489     (90% did not return (51415))
> poll:        p 43466    r 7703     (82% did not return (35763))
> epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> 
> The whole log is attached below.
> 
> Lukasz
> 
> ---
> # ./test_kprobes.sh 60
> 
> Will trace using following kprobe_events:
> r:kprobes/r_futex sys_futex
> p:kprobes/p_futex sys_futex
> r:kprobes/r_poll sys_poll
> p:kprobes/p_poll sys_poll
> r:kprobes/r_epoll_wait sys_epoll_wait
> p:kprobes/p_epoll_wait sys_epoll_wait
> r:kprobes/r_select sys_select
> p:kprobes/p_select sys_select
> r:kprobes/r_fork sys_fork
> p:kprobes/p_fork sys_fork
> r:kprobes/r_vfork sys_vfork
> p:kprobes/p_vfork sys_vfork
> r:kprobes/r_mmap sys_mmap
> p:kprobes/p_mmap sys_mmap
> r:kprobes/r_open sys_open
> p:kprobes/p_open sys_open
> r:kprobes/r_close sys_close
> p:kprobes/p_close sys_close
> r:kprobes/r_write sys_write
> p:kprobes/p_write sys_write
> r:kprobes/r_read sys_read
> p:kprobes/p_read sys_read
> 
> Results (60 sec):
> futex:       p 56904    r 5489     (90% did not return (51415))
> poll:        p 43466    r 7703     (82% did not return (35763))
> epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> select:      p 13355    r 13351    (0% did not return (4))
> fork:        p 0        r 0        (OK)
> vfork:       p 0        r 0        (OK)
> mmap:        p 4328     r 4328     (OK)
> open:        p 4579     r 4579     (OK)
> close:       p 7163     r 7163     (OK)
> write:       p 22769    r 22769    (OK)
> read:        p 40014    r 40014    (OK)
> 

OK, I just dumped kprobe_profile after the test,

futex:       p 7775     r 698      (91% did not return (7077))
poll:        p 18409    r 3523     (80% did not return (14886))
epoll_wait:  p 5122     r 2389     (53% did not return (2733))
select:      p 5411     r 5286     (2% did not return (125))
fork:        p 0        r 0        (OK)
vfork:       p 0        r 0        (OK)
mmap:        p 179      r 179      (OK)
open:        p 921      r 921      (OK)
close:       p 1005     r 1005     (OK)
write:       p 5632     r 5631     (0% did not return (1))
read:        p 10879    r 10878    (0% did not return (1))
  r_futex                                                 1091               0
  p_futex                                                 8986               0
  r_poll                                                  4276               0
  p_poll                                                 20610               0
  r_epoll_wait                                            2849               0
  p_epoll_wait                                            5905               0
  r_select                                               10172               0
  p_select                                               10312               0
  r_fork                                                     0               0
  p_fork                                                     0               0
  r_vfork                                                    0               0
  p_vfork                                                    0               0
  r_mmap                                                  2987               0
  p_mmap                                                  2884               0
  r_open                                                  4002               0
  p_open                                                  3855               0
  r_close                                                 4262               0
  p_close                                                 4122               0
  r_write                                                14281               0
  p_write                                                14276               0
  r_read                                                 41494               0
  p_read                                                 41363               0

OK, there are no miss-counted event. So at least kprobe and kretprobe working correctly.
I also tried to use static tracepoints for that.
https://gist.github.com/mhiramat/04d782deaf110a6b3391e2ecb923325e

----
$ sudo bash test_kprobes.sh 10
Disabling all current probes ...
Removing old probe points ...
Adding new probe points ...
Enabling new probe points ...
Clearing old trace log ...

Tracing for 10 seconds ... done
Saving trace log ... done

Results (10 sec):
futex:       p 9856     r 1258     (87% did not return (8598))
poll:        p 7531     r 4031     (46% did not return (3500))
epoll_wait:  p 1996     r 1617     (18% did not return (379))
select:      p 273      r 257      (5% did not return (16))
Results (10 sec):
futex:       p 9851     r 9851     (OK)
poll:        p 7531     r 7531     (OK)
epoll_wait:  p 1996     r 1996     (OK)
select:      p 273      r 273      (OK)
----
Of course these events are collected in "different places" so
it could be different. But yeah, it seems sys_* function itself
dosen't return but jumps into ret_from_syscall etc.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>
--------------------------------------------------------------------

Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.

Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek
przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by
others is strictly prohibited.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Why return probes of some syscalls sometimes are not called?
  2017-03-09 17:00 ` Masami Hiramatsu
  2017-03-09 18:19   ` Chernookyi, Vitalii
@ 2017-03-10 10:51   ` Masami Hiramatsu
  2017-03-13  9:09     ` Dorau, Lukasz
  1 sibling, 1 reply; 11+ messages in thread
From: Masami Hiramatsu @ 2017-03-10 10:51 UTC (permalink / raw)
  To: Dorau, Lukasz
  Cc: Ananth N Mavinakayanahalli, Keshavamurthy, Anil S,
	David S. Miller, linux-trace-users, Slusarz, Marcin, Jelinek,
	Sarah, Chernookyi, Vitalii, Buella, Gabor

[-- Attachment #1: Type: text/plain, Size: 6801 bytes --]

On Thu, 9 Mar 2017 18:00:54 +0100
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> On Thu, 9 Mar 2017 13:44:38 +0000
> "Dorau, Lukasz" <lukasz.dorau@intel.com> wrote:
> 
> > Hi,
> > 
> > Could someone explain me why return probes of some syscalls (for example: futex, poll, epoll_wait) sometimes are not called?
> > 
> > It can be reproduced using the following bash script:
> > https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> > 
> > Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora 24):
> > 
> > futex:       p 56904    r 5489     (90% did not return (51415))
> > poll:        p 43466    r 7703     (82% did not return (35763))
> > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > 
> > The whole log is attached below.
> > 
> > Lukasz
> > 
> > ---
> > # ./test_kprobes.sh 60
> > 
> > Will trace using following kprobe_events:
> > r:kprobes/r_futex sys_futex
> > p:kprobes/p_futex sys_futex
> > r:kprobes/r_poll sys_poll
> > p:kprobes/p_poll sys_poll
> > r:kprobes/r_epoll_wait sys_epoll_wait
> > p:kprobes/p_epoll_wait sys_epoll_wait
> > r:kprobes/r_select sys_select
> > p:kprobes/p_select sys_select
> > r:kprobes/r_fork sys_fork
> > p:kprobes/p_fork sys_fork
> > r:kprobes/r_vfork sys_vfork
> > p:kprobes/p_vfork sys_vfork
> > r:kprobes/r_mmap sys_mmap
> > p:kprobes/p_mmap sys_mmap
> > r:kprobes/r_open sys_open
> > p:kprobes/p_open sys_open
> > r:kprobes/r_close sys_close
> > p:kprobes/p_close sys_close
> > r:kprobes/r_write sys_write
> > p:kprobes/p_write sys_write
> > r:kprobes/r_read sys_read
> > p:kprobes/p_read sys_read
> > 
> > Results (60 sec):
> > futex:       p 56904    r 5489     (90% did not return (51415))
> > poll:        p 43466    r 7703     (82% did not return (35763))
> > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > select:      p 13355    r 13351    (0% did not return (4))
> > fork:        p 0        r 0        (OK)
> > vfork:       p 0        r 0        (OK)
> > mmap:        p 4328     r 4328     (OK)
> > open:        p 4579     r 4579     (OK)
> > close:       p 7163     r 7163     (OK)
> > write:       p 22769    r 22769    (OK)
> > read:        p 40014    r 40014    (OK)
> > 
> 
> OK, I just dumped kprobe_profile after the test,
> 
> futex:       p 7775     r 698      (91% did not return (7077))
> poll:        p 18409    r 3523     (80% did not return (14886))
> epoll_wait:  p 5122     r 2389     (53% did not return (2733))
> select:      p 5411     r 5286     (2% did not return (125))
> fork:        p 0        r 0        (OK)
> vfork:       p 0        r 0        (OK)
> mmap:        p 179      r 179      (OK)
> open:        p 921      r 921      (OK)
> close:       p 1005     r 1005     (OK)
> write:       p 5632     r 5631     (0% did not return (1))
> read:        p 10879    r 10878    (0% did not return (1))
>   r_futex                                                 1091               0
>   p_futex                                                 8986               0
>   r_poll                                                  4276               0
>   p_poll                                                 20610               0
>   r_epoll_wait                                            2849               0
>   p_epoll_wait                                            5905               0
>   r_select                                               10172               0
>   p_select                                               10312               0
>   r_fork                                                     0               0
>   p_fork                                                     0               0
>   r_vfork                                                    0               0
>   p_vfork                                                    0               0
>   r_mmap                                                  2987               0
>   p_mmap                                                  2884               0
>   r_open                                                  4002               0
>   p_open                                                  3855               0
>   r_close                                                 4262               0
>   p_close                                                 4122               0
>   r_write                                                14281               0
>   p_write                                                14276               0
>   r_read                                                 41494               0
>   p_read                                                 41363               0
> 
> OK, there are no miss-counted event. So at least kprobe and kretprobe working correctly.
> I also tried to use static tracepoints for that.
> https://gist.github.com/mhiramat/04d782deaf110a6b3391e2ecb923325e
> 
> ----
> $ sudo bash test_kprobes.sh 10
> Disabling all current probes ...
> Removing old probe points ...
> Adding new probe points ...
> Enabling new probe points ...
> Clearing old trace log ...
> 
> Tracing for 10 seconds ... done
> Saving trace log ... done
> 
> Results (10 sec):
> futex:       p 9856     r 1258     (87% did not return (8598))
> poll:        p 7531     r 4031     (46% did not return (3500))
> epoll_wait:  p 1996     r 1617     (18% did not return (379))
> select:      p 273      r 257      (5% did not return (16))
> Results (10 sec):
> futex:       p 9851     r 9851     (OK)
> poll:        p 7531     r 7531     (OK)
> epoll_wait:  p 1996     r 1996     (OK)
> select:      p 273      r 273      (OK)
> ----
> Of course these events are collected in "different places" so
> it could be different. But yeah, it seems sys_* function itself
> dosen't return but jumps into ret_from_syscall etc.

Please try attached patch (that is my miss that forgot to accumulate missed
count of kretprobe).

I've focused on futex and it showed there were some "misses" of kretprobe.

----
Disabling all current probes ...
Removing old probe points ...
Adding new probe points ...
Enabling new probe points ...
Clearing old trace log ...

Tracing for 10 seconds ... done
Saving trace log ... done

Results (10 sec):
futex:       p 190      r 74       (61% did not return (116))
Results (10 sec):
futex:       p 190      r 190      (OK)

$ cat /sys/kernel/debug/tracing/kprobe_profile
  r_futex                                                  176             127
  p_futex                                                  298               0
-----

Actually, since the kretprobe is not well designed for the usecase
probing on the sleep-able functions, it just fails if it invoked
more than the number of CPUs in parallel.
It is currently a limitation of kretprobe, but I'll solve it.

Thanks,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

[-- Attachment #2: trace-kprobes-show-sum-of --]
[-- Type: application/octet-stream, Size: 696 bytes --]

trace: kprobes: Show sum of probe/retprobe nmissed count

From: Masami Hiramatsu <mhiramat@kernel.org>

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 kernel/trace/trace_kprobe.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 12fb540..ee7c6db 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -896,7 +896,7 @@ static int probes_profile_seq_show(struct seq_file *m, void *v)
 	seq_printf(m, "  %-44s %15lu %15lu\n",
 		   trace_event_name(&tk->tp.call),
 		   trace_kprobe_nhit(tk),
-		   tk->rp.kp.nmissed);
+		   tk->rp.kp.nmissed + tk->rp.nmissed);
 
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* RE: Why return probes of some syscalls sometimes are not called?
  2017-03-10 10:51   ` Masami Hiramatsu
@ 2017-03-13  9:09     ` Dorau, Lukasz
  2017-03-16 10:01       ` Masami Hiramatsu
  0 siblings, 1 reply; 11+ messages in thread
From: Dorau, Lukasz @ 2017-03-13  9:09 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ananth N Mavinakayanahalli, Keshavamurthy, Anil S,
	David S. Miller, linux-trace-users, Slusarz, Marcin, Jelinek,
	Sarah, Chernookyi, Vitalii, Buella, Gabor

On Friday, March 10, 2017 11:51 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> On Thu, 9 Mar 2017 18:00:54 +0100
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> > On Thu, 9 Mar 2017 13:44:38 +0000
> > "Dorau, Lukasz" <lukasz.dorau@intel.com> wrote:
> >
> > > Hi,
> > >
> > > Could someone explain me why return probes of some syscalls (for example:
> futex, poll, epoll_wait) sometimes are not called?
> > >
> > > It can be reproduced using the following bash script:
> > > https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> > >
> > > Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora
> 24):
> > >
> > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > >
> > > The whole log is attached below.
> > >
> > > Lukasz
> > >
> > > ---
> > > # ./test_kprobes.sh 60
> > >
> > > Will trace using following kprobe_events:
> > > r:kprobes/r_futex sys_futex
> > > p:kprobes/p_futex sys_futex
> > > r:kprobes/r_poll sys_poll
> > > p:kprobes/p_poll sys_poll
> > > r:kprobes/r_epoll_wait sys_epoll_wait
> > > p:kprobes/p_epoll_wait sys_epoll_wait
> > > r:kprobes/r_select sys_select
> > > p:kprobes/p_select sys_select
> > > r:kprobes/r_fork sys_fork
> > > p:kprobes/p_fork sys_fork
> > > r:kprobes/r_vfork sys_vfork
> > > p:kprobes/p_vfork sys_vfork
> > > r:kprobes/r_mmap sys_mmap
> > > p:kprobes/p_mmap sys_mmap
> > > r:kprobes/r_open sys_open
> > > p:kprobes/p_open sys_open
> > > r:kprobes/r_close sys_close
> > > p:kprobes/p_close sys_close
> > > r:kprobes/r_write sys_write
> > > p:kprobes/p_write sys_write
> > > r:kprobes/r_read sys_read
> > > p:kprobes/p_read sys_read
> > >
> > > Results (60 sec):
> > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > > select:      p 13355    r 13351    (0% did not return (4))
> > > fork:        p 0        r 0        (OK)
> > > vfork:       p 0        r 0        (OK)
> > > mmap:        p 4328     r 4328     (OK)
> > > open:        p 4579     r 4579     (OK)
> > > close:       p 7163     r 7163     (OK)
> > > write:       p 22769    r 22769    (OK)
> > > read:        p 40014    r 40014    (OK)
> > >
> >
> > OK, I just dumped kprobe_profile after the test,
> >
> > futex:       p 7775     r 698      (91% did not return (7077))
> > poll:        p 18409    r 3523     (80% did not return (14886))
> > epoll_wait:  p 5122     r 2389     (53% did not return (2733))
> > select:      p 5411     r 5286     (2% did not return (125))
> > fork:        p 0        r 0        (OK)
> > vfork:       p 0        r 0        (OK)
> > mmap:        p 179      r 179      (OK)
> > open:        p 921      r 921      (OK)
> > close:       p 1005     r 1005     (OK)
> > write:       p 5632     r 5631     (0% did not return (1))
> > read:        p 10879    r 10878    (0% did not return (1))
> >   r_futex                                                 1091               0
> >   p_futex                                                 8986               0
> >   r_poll                                                  4276               0
> >   p_poll                                                 20610               0
> >   r_epoll_wait                                            2849               0
> >   p_epoll_wait                                            5905               0
> >   r_select                                               10172               0
> >   p_select                                               10312               0
> >   r_fork                                                     0               0
> >   p_fork                                                     0               0
> >   r_vfork                                                    0               0
> >   p_vfork                                                    0               0
> >   r_mmap                                                  2987               0
> >   p_mmap                                                  2884               0
> >   r_open                                                  4002               0
> >   p_open                                                  3855               0
> >   r_close                                                 4262               0
> >   p_close                                                 4122               0
> >   r_write                                                14281               0
> >   p_write                                                14276               0
> >   r_read                                                 41494               0
> >   p_read                                                 41363               0
> >
> > OK, there are no miss-counted event. So at least kprobe and kretprobe working
> correctly.
> > I also tried to use static tracepoints for that.
> > https://gist.github.com/mhiramat/04d782deaf110a6b3391e2ecb923325e
> >
> > ----
> > $ sudo bash test_kprobes.sh 10
> > Disabling all current probes ...
> > Removing old probe points ...
> > Adding new probe points ...
> > Enabling new probe points ...
> > Clearing old trace log ...
> >
> > Tracing for 10 seconds ... done
> > Saving trace log ... done
> >
> > Results (10 sec):
> > futex:       p 9856     r 1258     (87% did not return (8598))
> > poll:        p 7531     r 4031     (46% did not return (3500))
> > epoll_wait:  p 1996     r 1617     (18% did not return (379))
> > select:      p 273      r 257      (5% did not return (16))
> > Results (10 sec):
> > futex:       p 9851     r 9851     (OK)
> > poll:        p 7531     r 7531     (OK)
> > epoll_wait:  p 1996     r 1996     (OK)
> > select:      p 273      r 273      (OK)
> > ----
> > Of course these events are collected in "different places" so
> > it could be different. But yeah, it seems sys_* function itself
> > dosen't return but jumps into ret_from_syscall etc.
> 
> Please try attached patch (that is my miss that forgot to accumulate missed
> count of kretprobe).
> 

Thanks for the patch. I ran two tests using the following script:
https://gist.github.com/ldorau/36d87640ff0659a483342d490b66f034
The results are following:

1) The first test - during 60-seconds test the system was almost idle
   (I did not touch the keyboard or mouse)

KProbes results (60 sec):
futex:       p 1373     r 422      (69% did not return (951))
poll:        p 3304     r 1575     (52% did not return (1729))
epoll_wait:  p 1062     r 943      (11% did not return (119))
select:      p 5978     r 5977     (0% did not return (1))
mmap:        p 264      r 264      (OK)
open:        p 1056     r 1056     (OK)
close:       p 1198     r 1198     (OK)
write:       p 4786     r 4786     (OK)
read:        p 8939     r 8939     (OK)
exit:        p 1        r 0    
fork:        p 0        r 0        (OK)
vfork:       p 0        r 0        (OK)
clone:       p 5        r 5        (OK)
execve:      p 4        r 4        (OK)

Tracepoints results (60 sec):
futex:       p 1372     r 1372     (OK)
poll:        p 3304     r 3303     (0% did not return (1))
epoll_wait:  p 1062     r 1062     (OK)
select:      p 5978     r 5978     (OK)
mmap:        p 264      r 264      (OK)
open:        p 1056     r 1056     (OK)
close:       p 1198     r 1198     (OK)
write:       p 4786     r 4786     (OK)
read:        p 8939     r 8940     (OK)
exit:        p 1        r 0    

2) The second test - during 60-seconds test I was normally using the system -
   - I launched an internet browser, edited a file...

KProbes results (60 sec):
futex:       p 35394    r 689      (98% did not return (34705))
poll:        p 23449    r 896      (96% did not return (22553))
epoll_wait:  p 43927    r 4973     (88% did not return (38954))
select:      p 4874     r 4873     (0% did not return (1))
mmap:        p 3165     r 3165     (OK)
open:        p 3680     r 3678     (0% did not return (2))
close:       p 6704     r 6704     (OK)
write:       p 11769    r 11769    (OK)
read:        p 22645    r 22647    (OK)
exit:        p 30       r 0        
fork:        p 0        r 0        (OK)
vfork:       p 0        r 0        (OK)
clone:       p 170      r 170      (OK)
execve:      p 2        r 2        (OK)

Tracepoints results (60 sec):
futex:       p 35359    r 35268    (0% did not return (91))
poll:        p 23448    r 23444    (0% did not return (4))
epoll_wait:  p 43926    r 43909    (0% did not return (17))
select:      p 4874     r 4874     (OK)
mmap:        p 3165     r 3165     (OK)
open:        p 3680     r 3827     (OK)
close:       p 6704     r 6704     (OK)
write:       p 11769    r 11770    (OK)
read:        p 22645    r 22648    (OK)
exit:        p 30       r 0        

Lukasz



> I've focused on futex and it showed there were some "misses" of kretprobe.
> 
> ----
> Disabling all current probes ...
> Removing old probe points ...
> Adding new probe points ...
> Enabling new probe points ...
> Clearing old trace log ...
> 
> Tracing for 10 seconds ... done
> Saving trace log ... done
> 
> Results (10 sec):
> futex:       p 190      r 74       (61% did not return (116))
> Results (10 sec):
> futex:       p 190      r 190      (OK)
> 
> $ cat /sys/kernel/debug/tracing/kprobe_profile
>   r_futex                                                  176             127
>   p_futex                                                  298               0
> -----
> 
> Actually, since the kretprobe is not well designed for the usecase
> probing on the sleep-able functions, it just fails if it invoked
> more than the number of CPUs in parallel.
> It is currently a limitation of kretprobe, but I'll solve it.
> 
> Thanks,
> 
> --
> Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Why return probes of some syscalls sometimes are not called?
  2017-03-13  9:09     ` Dorau, Lukasz
@ 2017-03-16 10:01       ` Masami Hiramatsu
  0 siblings, 0 replies; 11+ messages in thread
From: Masami Hiramatsu @ 2017-03-16 10:01 UTC (permalink / raw)
  To: Dorau, Lukasz
  Cc: Ananth N Mavinakayanahalli, Keshavamurthy, Anil S,
	David S. Miller, linux-trace-users, Slusarz, Marcin, Jelinek,
	Sarah, Chernookyi, Vitalii, Buella, Gabor

On Mon, 13 Mar 2017 09:09:47 +0000
"Dorau, Lukasz" <lukasz.dorau@intel.com> wrote:

> On Friday, March 10, 2017 11:51 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > On Thu, 9 Mar 2017 18:00:54 +0100
> > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > 
> > > On Thu, 9 Mar 2017 13:44:38 +0000
> > > "Dorau, Lukasz" <lukasz.dorau@intel.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > Could someone explain me why return probes of some syscalls (for example:
> > futex, poll, epoll_wait) sometimes are not called?
> > > >
> > > > It can be reproduced using the following bash script:
> > > > https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> > > >
> > > > Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora
> > 24):
> > > >
> > > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > > >
> > > > The whole log is attached below.
> > > >
> > > > Lukasz
> > > >
> > > > ---
> > > > # ./test_kprobes.sh 60
> > > >
> > > > Will trace using following kprobe_events:
> > > > r:kprobes/r_futex sys_futex
> > > > p:kprobes/p_futex sys_futex
> > > > r:kprobes/r_poll sys_poll
> > > > p:kprobes/p_poll sys_poll
> > > > r:kprobes/r_epoll_wait sys_epoll_wait
> > > > p:kprobes/p_epoll_wait sys_epoll_wait
> > > > r:kprobes/r_select sys_select
> > > > p:kprobes/p_select sys_select
> > > > r:kprobes/r_fork sys_fork
> > > > p:kprobes/p_fork sys_fork
> > > > r:kprobes/r_vfork sys_vfork
> > > > p:kprobes/p_vfork sys_vfork
> > > > r:kprobes/r_mmap sys_mmap
> > > > p:kprobes/p_mmap sys_mmap
> > > > r:kprobes/r_open sys_open
> > > > p:kprobes/p_open sys_open
> > > > r:kprobes/r_close sys_close
> > > > p:kprobes/p_close sys_close
> > > > r:kprobes/r_write sys_write
> > > > p:kprobes/p_write sys_write
> > > > r:kprobes/r_read sys_read
> > > > p:kprobes/p_read sys_read
> > > >
> > > > Results (60 sec):
> > > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > > > select:      p 13355    r 13351    (0% did not return (4))
> > > > fork:        p 0        r 0        (OK)
> > > > vfork:       p 0        r 0        (OK)
> > > > mmap:        p 4328     r 4328     (OK)
> > > > open:        p 4579     r 4579     (OK)
> > > > close:       p 7163     r 7163     (OK)
> > > > write:       p 22769    r 22769    (OK)
> > > > read:        p 40014    r 40014    (OK)
> > > >
> > >
> > > OK, I just dumped kprobe_profile after the test,
> > >
> > > futex:       p 7775     r 698      (91% did not return (7077))
> > > poll:        p 18409    r 3523     (80% did not return (14886))
> > > epoll_wait:  p 5122     r 2389     (53% did not return (2733))
> > > select:      p 5411     r 5286     (2% did not return (125))
> > > fork:        p 0        r 0        (OK)
> > > vfork:       p 0        r 0        (OK)
> > > mmap:        p 179      r 179      (OK)
> > > open:        p 921      r 921      (OK)
> > > close:       p 1005     r 1005     (OK)
> > > write:       p 5632     r 5631     (0% did not return (1))
> > > read:        p 10879    r 10878    (0% did not return (1))
> > >   r_futex                                                 1091               0
> > >   p_futex                                                 8986               0
> > >   r_poll                                                  4276               0
> > >   p_poll                                                 20610               0
> > >   r_epoll_wait                                            2849               0
> > >   p_epoll_wait                                            5905               0
> > >   r_select                                               10172               0
> > >   p_select                                               10312               0
> > >   r_fork                                                     0               0
> > >   p_fork                                                     0               0
> > >   r_vfork                                                    0               0
> > >   p_vfork                                                    0               0
> > >   r_mmap                                                  2987               0
> > >   p_mmap                                                  2884               0
> > >   r_open                                                  4002               0
> > >   p_open                                                  3855               0
> > >   r_close                                                 4262               0
> > >   p_close                                                 4122               0
> > >   r_write                                                14281               0
> > >   p_write                                                14276               0
> > >   r_read                                                 41494               0
> > >   p_read                                                 41363               0
> > >
> > > OK, there are no miss-counted event. So at least kprobe and kretprobe working
> > correctly.
> > > I also tried to use static tracepoints for that.
> > > https://gist.github.com/mhiramat/04d782deaf110a6b3391e2ecb923325e
> > >
> > > ----
> > > $ sudo bash test_kprobes.sh 10
> > > Disabling all current probes ...
> > > Removing old probe points ...
> > > Adding new probe points ...
> > > Enabling new probe points ...
> > > Clearing old trace log ...
> > >
> > > Tracing for 10 seconds ... done
> > > Saving trace log ... done
> > >
> > > Results (10 sec):
> > > futex:       p 9856     r 1258     (87% did not return (8598))
> > > poll:        p 7531     r 4031     (46% did not return (3500))
> > > epoll_wait:  p 1996     r 1617     (18% did not return (379))
> > > select:      p 273      r 257      (5% did not return (16))
> > > Results (10 sec):
> > > futex:       p 9851     r 9851     (OK)
> > > poll:        p 7531     r 7531     (OK)
> > > epoll_wait:  p 1996     r 1996     (OK)
> > > select:      p 273      r 273      (OK)
> > > ----
> > > Of course these events are collected in "different places" so
> > > it could be different. But yeah, it seems sys_* function itself
> > > dosen't return but jumps into ret_from_syscall etc.
> > 
> > Please try attached patch (that is my miss that forgot to accumulate missed
> > count of kretprobe).
> > 
> 
> Thanks for the patch. I ran two tests using the following script:
> https://gist.github.com/ldorau/36d87640ff0659a483342d490b66f034
> The results are following:
> 
> 1) The first test - during 60-seconds test the system was almost idle
>    (I did not touch the keyboard or mouse)
> 
> KProbes results (60 sec):
> futex:       p 1373     r 422      (69% did not return (951))
> poll:        p 3304     r 1575     (52% did not return (1729))
> epoll_wait:  p 1062     r 943      (11% did not return (119))
> select:      p 5978     r 5977     (0% did not return (1))
> mmap:        p 264      r 264      (OK)
> open:        p 1056     r 1056     (OK)
> close:       p 1198     r 1198     (OK)
> write:       p 4786     r 4786     (OK)
> read:        p 8939     r 8939     (OK)
> exit:        p 1        r 0    
> fork:        p 0        r 0        (OK)
> vfork:       p 0        r 0        (OK)
> clone:       p 5        r 5        (OK)
> execve:      p 4        r 4        (OK)
> 
> Tracepoints results (60 sec):
> futex:       p 1372     r 1372     (OK)
> poll:        p 3304     r 3303     (0% did not return (1))
> epoll_wait:  p 1062     r 1062     (OK)
> select:      p 5978     r 5978     (OK)
> mmap:        p 264      r 264      (OK)
> open:        p 1056     r 1056     (OK)
> close:       p 1198     r 1198     (OK)
> write:       p 4786     r 4786     (OK)
> read:        p 8939     r 8940     (OK)
> exit:        p 1        r 0    
> 
> 2) The second test - during 60-seconds test I was normally using the system -
>    - I launched an internet browser, edited a file...
> 
> KProbes results (60 sec):
> futex:       p 35394    r 689      (98% did not return (34705))
> poll:        p 23449    r 896      (96% did not return (22553))
> epoll_wait:  p 43927    r 4973     (88% did not return (38954))
> select:      p 4874     r 4873     (0% did not return (1))
> mmap:        p 3165     r 3165     (OK)
> open:        p 3680     r 3678     (0% did not return (2))
> close:       p 6704     r 6704     (OK)
> write:       p 11769    r 11769    (OK)
> read:        p 22645    r 22647    (OK)
> exit:        p 30       r 0        
> fork:        p 0        r 0        (OK)
> vfork:       p 0        r 0        (OK)
> clone:       p 170      r 170      (OK)
> execve:      p 2        r 2        (OK)
> 
> Tracepoints results (60 sec):
> futex:       p 35359    r 35268    (0% did not return (91))
> poll:        p 23448    r 23444    (0% did not return (4))
> epoll_wait:  p 43926    r 43909    (0% did not return (17))
> select:      p 4874     r 4874     (OK)
> mmap:        p 3165     r 3165     (OK)
> open:        p 3680     r 3827     (OK)
> close:       p 6704     r 6704     (OK)
> write:       p 11769    r 11770    (OK)
> read:        p 22645    r 22648    (OK)
> exit:        p 30       r 0        
> 

I see. The patch was to identify the reason why the return event was
lost. Not for fixing the issue.
To fix this issue, I need some more work.

Thank you,

> Lukasz
> 
> 
> 
> > I've focused on futex and it showed there were some "misses" of kretprobe.
> > 
> > ----
> > Disabling all current probes ...
> > Removing old probe points ...
> > Adding new probe points ...
> > Enabling new probe points ...
> > Clearing old trace log ...
> > 
> > Tracing for 10 seconds ... done
> > Saving trace log ... done
> > 
> > Results (10 sec):
> > futex:       p 190      r 74       (61% did not return (116))
> > Results (10 sec):
> > futex:       p 190      r 190      (OK)
> > 
> > $ cat /sys/kernel/debug/tracing/kprobe_profile
> >   r_futex                                                  176             127
> >   p_futex                                                  298               0
> > -----
> > 
> > Actually, since the kretprobe is not well designed for the usecase
> > probing on the sleep-able functions, it just fails if it invoked
> > more than the number of CPUs in parallel.
> > It is currently a limitation of kretprobe, but I'll solve it.
> > 
> > Thanks,
> > 
> > --
> > Masami Hiramatsu <mhiramat@kernel.org>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-03-16 10:11 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-09 13:44 Why return probes of some syscalls sometimes are not called? Dorau, Lukasz
2017-03-09 13:58 ` Mathieu Desnoyers
2017-03-09 14:44   ` Steven Rostedt
2017-03-09 14:57     ` Steven Rostedt
     [not found]     ` <1231722663.1343.1489071698914.JavaMail.zimbra@efficios.com>
2017-03-09 15:20       ` Steven Rostedt
     [not found]         ` <D9FFE20C522965449E182ACE73889AEB486818F2@irsmsx105.ger.corp.intel.com>
2017-03-09 15:38           ` Steven Rostedt
2017-03-09 17:00 ` Masami Hiramatsu
2017-03-09 18:19   ` Chernookyi, Vitalii
2017-03-10 10:51   ` Masami Hiramatsu
2017-03-13  9:09     ` Dorau, Lukasz
2017-03-16 10:01       ` Masami Hiramatsu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).