lttng-dev.lists.lttng.org archive mirror
* Re: Capturing User-Level Function Calls/Returns
       [not found] <e9c7400ff0075f3beba2863c4432a905@ut.ac.ir>
@ 2020-07-15 18:28 ` Steven Rostedt
  2020-07-15 18:28   ` [lttng-dev] " Steven Rostedt via lttng-dev
                     ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Steven Rostedt @ 2020-07-15 18:28 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Linux-trace Users, lttng-dev, Mathieu Desnoyers,
	Jérémie Galarneau, Namhyung Kim

On Wed, 15 Jul 2020 20:37:16 +0430
ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:

> Hi,
> What is the most efficient way to capture occurrence of a function 
> call/return of a binary program in userspace?
> It seems the answer is Uprobes. 1) Am I right?
> But Uprobes use "int" instruction which leads to a switch into kernel 
> mode. 2) Wouldn't it be better to avoid this transition?
> I'm looking forward to your reply and will be happy to read your 
> opinions.
> Regards.


Hi, I believe LTTng has utilities that can help you trace user space
programs.

I think there's also a user-space, ftrace-like utility that Namhyung
was working on, but I don't know how far along its development is.

-- Steve

* Re: Capturing User-Level Function Calls/Returns
  2020-07-15 18:28 ` Capturing User-Level Function Calls/Returns Steven Rostedt
  2020-07-15 18:28   ` [lttng-dev] " Steven Rostedt via lttng-dev
@ 2020-07-15 18:45   ` Mathieu Desnoyers
  2020-07-15 18:45     ` [lttng-dev] " Mathieu Desnoyers via lttng-dev
  2020-07-15 21:39     ` ahmadkhorrami
  2020-07-16  1:04   ` Namhyung Kim
  2 siblings, 2 replies; 24+ messages in thread
From: Mathieu Desnoyers @ 2020-07-15 18:45 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: linux-trace-users, lttng-dev, Jeremie Galarneau, Namhyung Kim, rostedt

----- On Jul 15, 2020, at 2:28 PM, rostedt rostedt@goodmis.org wrote:

> On Wed, 15 Jul 2020 20:37:16 +0430
> ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:
> 
>> Hi,
>> What is the most efficient way to capture occurrence of a function
>> call/return of a binary program in userspace?
>> It seems the answer is Uprobes. 1) Am I right?
>> But Uprobes use "int" instruction which leads to a switch into kernel
>> mode. 2) Wouldn't it be better to avoid this transition?
>> I'm looking forward to your reply and will be happy to read your
>> opinions.
>> Regards.
> 
> 
> Hi, I believe LTTng has utilities that can help you trace user space
> programs.

Indeed, it is documented here:

https://lttng.org/docs/#doc-liblttng-ust-cyg-profile
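
For readers unfamiliar with the mechanism behind that helper: it relies
on the compiler's -finstrument-functions option, which makes GCC and
Clang call two hooks around every instrumented function. A minimal
sketch of such hooks follows. The counters and report_counts() are
hypothetical names used only for illustration; this is not LTTng's
implementation, which records tracepoint events rather than counting.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counters, for illustration only. */
static unsigned long enter_count;
static unsigned long exit_count;

/*
 * With -finstrument-functions, the compiler emits a call to this hook at
 * the entry of every instrumented function. The attribute keeps the hook
 * itself from being instrumented, which would otherwise recurse.
 */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;   /* address of the function being entered */
    (void)call_site; /* address the call came from */
    enter_count++;
}

/* Counterpart hook, called just before an instrumented function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    exit_count++;
}

/* Print the totals collected so far (hypothetical helper). */
__attribute__((no_instrument_function))
void report_counts(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "entries=%lu exits=%lu\n", enter_count, exit_count);
}
```

Building a program together with this file and -finstrument-functions
would bump the counters on every call and return, which also shows why
this route needs the target to be recompiled.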

If your program is generating function entry/exit at a very
high rate (which goes beyond your available I/O throughput and
lasts longer than the memory you have available for ring buffers),
you will also probably want to use the "blocking-timeout" option
documented at:

https://lttng.org/docs/#doc-enabling-disabling-channels

Thanks,

Mathieu

> 
> I think there's also a users ftrace like utility that Namhyung was
> working on. But I don't know where in the development that is.
> 
> -- Steve

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

* Re: Capturing User-Level Function Calls/Returns
  2020-07-15 18:45   ` Mathieu Desnoyers
  2020-07-15 18:45     ` [lttng-dev] " Mathieu Desnoyers via lttng-dev
@ 2020-07-15 21:39     ` ahmadkhorrami
  2020-07-15 21:39       ` [lttng-dev] " ahmadkhorrami via lttng-dev
  2020-07-15 21:48       ` Steven Rostedt
  1 sibling, 2 replies; 24+ messages in thread
From: ahmadkhorrami @ 2020-07-15 21:39 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-trace-users, lttng-dev, Jeremie Galarneau, Namhyung Kim,
	rostedt, linux-trace-users-owner

Hi Steven and Mathieu,
Firstly, many thanks! This method seems to be the most efficient method. 
But, IIUC, what you suggest requires source code compilation. I need an 
efficient dynamic method that, given the function address, captures its 
occurrence and stores some information from the execution context. Is 
there anything better than Uprobes perhaps with no trap into the kernel? 
Why do we need traps?
Regards.

On 2020-07-15 23:15, Mathieu Desnoyers wrote:

> ----- On Jul 15, 2020, at 2:28 PM, rostedt rostedt@goodmis.org wrote:
>
>> On Wed, 15 Jul 2020 20:37:16 +0430
>> ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:
>>
>>> Hi,
>>> What is the most efficient way to capture occurrence of a function
>>> call/return of a binary program in userspace?
>>> It seems the answer is Uprobes. 1) Am I right?
>>> But Uprobes use "int" instruction which leads to a switch into kernel
>>> mode. 2) Wouldn't it be better to avoid this transition?
>>> I'm looking forward to your reply and will be happy to read your
>>> opinions.
>>> Regards.
>>
>> Hi, I believe LTTng has utilities that can help you trace user space
>> programs.
>
> Indeed, it is documented here:
>
> https://lttng.org/docs/#doc-liblttng-ust-cyg-profile
>
> If your program is generating function entry/exit at a very
> high rate (which goes beyond your available I/O throughput and
> lasts longer than the memory you have available for ring buffers),
> you will also probably want to use the "blocking-timeout" option
> documented at:
>
> https://lttng.org/docs/#doc-enabling-disabling-channels
>
> Thanks,
>
> Mathieu
>
>> I think there's also a users ftrace like utility that Namhyung was
>> working on. But I don't know where in the development that is.
>>
>> -- Steve

* Re: Capturing User-Level Function Calls/Returns
  2020-07-15 21:39     ` ahmadkhorrami
  2020-07-15 21:39       ` [lttng-dev] " ahmadkhorrami via lttng-dev
@ 2020-07-15 21:48       ` Steven Rostedt
  2020-07-15 21:48         ` [lttng-dev] " Steven Rostedt via lttng-dev
                           ` (2 more replies)
  1 sibling, 3 replies; 24+ messages in thread
From: Steven Rostedt @ 2020-07-15 21:48 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Mathieu Desnoyers, linux-trace-users, lttng-dev,
	Jeremie Galarneau, Namhyung Kim, linux-trace-users-owner

On Thu, 16 Jul 2020 02:09:50 +0430
ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:

> Hi Steven and Mathieu,
> Firstly, many thanks! This method seems to be the most efficient method. 
> But, IIUC, what you suggest requires source code compilation. I need an 
> efficient dynamic method that, given the function address, captures its 
> occurrence and stores some information from the execution context. Is 
> there anything better than Uprobes perhaps with no trap into the kernel? 
> Why do we need traps?
> Regards.

Without recompiling, how would that be implemented?

You would need to insert a jump on top of existing code and still be
able to preserve that code. What a trap-based probe does is insert an
int3, which traps into the kernel; the kernel then emulates the
instruction that the int3 was placed on and also calls some code that
can trace the current state.

To do it in user land, you would need to find a way to replace the code
at the location you want to trace with a jump to the tracing
infrastructure, which would also have to emulate the code that the jump
was inserted on top of. On x86, that jump needs to be 5 bytes long
(covering 5 bytes of text to emulate), whereas an int3 is a single
byte.

Thus, you either recompile and insert nops where you want to place your
jumps, or you trap using int3 and do the work from within the kernel.
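
To make the size difference concrete, here is a minimal sketch of the
two encodings on x86: int3 is the single byte 0xCC, while a near jump is
opcode 0xE9 followed by a 32-bit displacement measured from the end of
the jump instruction. encode_jmp_rel32() is a hypothetical helper, not
part of any tracer, and it only fills a buffer; it does not patch live
code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    INT3_LEN = 1,      /* int3 is the single byte 0xCC */
    JMP_REL32_LEN = 5  /* 0xE9 followed by a 4-byte displacement */
};

static const uint8_t INT3_OPCODE = 0xCC;

/*
 * Encode "jmp rel32" as it would appear at address `from`, targeting
 * `to`. The displacement is relative to the address of the instruction
 * that follows the jump. Returns 0 on success, or -1 if the target is
 * farther away than a signed 32-bit displacement can reach.
 */
int encode_jmp_rel32(uint8_t out[JMP_REL32_LEN], uint64_t from, uint64_t to)
{
    assert(out != NULL);

    int64_t disp = (int64_t)(to - (from + JMP_REL32_LEN));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t rel = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;
    for (int i = 0; i < 4; i++)
        out[1 + i] = (uint8_t)(rel >> (8 * i)); /* little-endian bytes */
    return 0;
}
```

For example, a jump placed at 0x1000 that targets 0x2000 encodes as
E9 FB 0F 00 00. Those five bytes overwrite whatever instructions
happened to be there, which is the problem described above.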

-- Steve

* Re: Capturing User-Level Function Calls/Returns
  2020-07-15 21:48       ` Steven Rostedt
  2020-07-15 21:48         ` [lttng-dev] " Steven Rostedt via lttng-dev
@ 2020-07-15 22:25         ` ahmadkhorrami
  2020-07-15 22:25           ` [lttng-dev] " ahmadkhorrami via lttng-dev
  2020-07-16  1:06         ` Michel Dagenais via lttng-dev
  2 siblings, 1 reply; 24+ messages in thread
From: ahmadkhorrami @ 2020-07-15 22:25 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mathieu Desnoyers, linux-trace-users, lttng-dev,
	Jeremie Galarneau, Namhyung Kim, linux-trace-users-owner

So, the only barrier to the user-level implementation is the problem 
with instruction sizes. That's an enlightening point. Thanks for the 
detailed answer!
Thanks, everybody, especially Steven and Mathieu.

Regards.

On 2020-07-16 02:18, Steven Rostedt wrote:

> On Thu, 16 Jul 2020 02:09:50 +0430
> ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:
> 
>> Hi Steven and Mathieu,
>> Firstly, many thanks! This method seems to be the most efficient 
>> method.
>> But, IIUC, what you suggest requires source code compilation. I need 
>> an
>> efficient dynamic method that, given the function address, captures 
>> its
>> occurrence and stores some information from the execution context. Is
>> there anything better than Uprobes perhaps with no trap into the 
>> kernel?
>> Why do we need traps?
>> Regards.
> 
> Without recompiling, how would that be implemented?
> 
> You would need to insert a jump on top of existing code and still be
> able to preserve that code. What a trap-based probe does is insert an
> int3, which traps into the kernel; the kernel then emulates the
> instruction that the int3 was placed on and also calls some code that
> can trace the current state.
> 
> To do it in user land, you would need to find a way to replace the code
> at the location you want to trace with a jump to the tracing
> infrastructure, which would also have to emulate the code that the jump
> was inserted on top of. On x86, that jump needs to be 5 bytes long
> (covering 5 bytes of text to emulate), whereas an int3 is a single
> byte.
> 
> Thus, you either recompile and insert nops where you want to place your
> jumps, or you trap using int3 and do the work from within the kernel.
> 
> -- Steve

* Re: Capturing User-Level Function Calls/Returns
  2020-07-15 18:28 ` Capturing User-Level Function Calls/Returns Steven Rostedt
  2020-07-15 18:28   ` [lttng-dev] " Steven Rostedt via lttng-dev
  2020-07-15 18:45   ` Mathieu Desnoyers
@ 2020-07-16  1:04   ` Namhyung Kim
  2020-07-16  1:04     ` [lttng-dev] " Namhyung Kim via lttng-dev
  2020-07-16 16:07     ` ahmadkhorrami
  2 siblings, 2 replies; 24+ messages in thread
From: Namhyung Kim @ 2020-07-16  1:04 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: ahmadkhorrami, Linux-trace Users, lttng-dev, Mathieu Desnoyers,
	Jérémie Galarneau

Hi all,

On Thu, Jul 16, 2020 at 3:28 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed, 15 Jul 2020 20:37:16 +0430
> ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:
>
> > Hi,
> > What is the most efficient way to capture occurrence of a function
> > call/return of a binary program in userspace?
> > It seems the answer is Uprobes. 1) Am I right?
> > But Uprobes use "int" instruction which leads to a switch into kernel
> > mode. 2) Wouldn't it be better to avoid this transition?
> > I'm looking forward to your reply and will be happy to read your
> > opinions.
> > Regards.
>
>
> Hi, I believe LTTng has utilities that can help you trace user space
> programs.
>
> I think there's also a users ftrace like utility that Namhyung was
> working on. But I don't know where in the development that is.

It's at https://github.com/namhyung/uftrace

Basically, it also requires recompilation to add mcount calls to each
function. But it now also supports dynamic tracing without any
recompilation. :) That part is still experimental and has some
limitations, but the idea is to copy the first 5 bytes (on x86_64)
somewhere else and replace them with a call instruction.
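
The idea described above can be sketched roughly as follows. This is
not uftrace's code: patch_with_call(), restore_site() and struct
patch_site are hypothetical names, the sketch works on an ordinary byte
buffer rather than live program text, and it assumes the caller already
knows that the first 5 bytes end on an instruction boundary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PATCH_LEN 5 /* "call rel32": opcode 0xE8 + 4-byte displacement */

/* Hypothetical record of what was overwritten at one patched function. */
struct patch_site {
    uint8_t saved[PATCH_LEN];
};

/*
 * Save the first PATCH_LEN bytes of `code`, then overwrite them with a
 * call to `tramp_addr`. `code_addr` is the address those bytes would
 * have once loaded. Returns 0 on success, or -1 if the buffer is too
 * short or the trampoline is out of rel32 range.
 */
int patch_with_call(uint8_t *code, size_t code_len, uint64_t code_addr,
                    uint64_t tramp_addr, struct patch_site *site)
{
    assert(code != NULL && site != NULL);
    if (code_len < PATCH_LEN)
        return -1;

    int64_t disp = (int64_t)(tramp_addr - (code_addr + PATCH_LEN));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    memcpy(site->saved, code, PATCH_LEN); /* keep the original bytes */

    uint32_t rel = (uint32_t)(int32_t)disp;
    code[0] = 0xE8;
    for (int i = 0; i < 4; i++)
        code[1 + i] = (uint8_t)(rel >> (8 * i)); /* little-endian bytes */
    return 0;
}

/* Put the saved bytes back, undoing patch_with_call(). */
void restore_site(uint8_t *code, const struct patch_site *site)
{
    assert(code != NULL && site != NULL);
    memcpy(code, site->saved, PATCH_LEN);
}
```

A real tool would also have to run the saved instructions from their
new location (fixing up anything that is position-dependent) and make
the text page writable before patching; both are left out here.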

Thanks
Namhyung

* Re: Capturing User-Level Function Calls/Returns
  2020-07-15 21:48       ` Steven Rostedt
  2020-07-15 21:48         ` [lttng-dev] " Steven Rostedt via lttng-dev
  2020-07-15 22:25         ` ahmadkhorrami
@ 2020-07-16  1:06         ` Michel Dagenais via lttng-dev
  2020-07-16  1:06           ` [lttng-dev] " Michel Dagenais via lttng-dev
                             ` (3 more replies)
  2 siblings, 4 replies; 24+ messages in thread
From: Michel Dagenais via lttng-dev @ 2020-07-16  1:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: ahmadkhorrami, linux-trace-users-owner, linux-trace-users,
	lttng-dev, Namhyung Kim


> Without recompiling, how would that be implemented?

As you mentioned, this is possible when "jump patching" 5-byte
instructions. Fast tracepoints in GDB and in kprobes do it. Kprobes go
further and patch sequences of instructions (when the target
instruction is shorter than 5 bytes), provided there is no incoming
branch into the middle of the sequence. You can go even further, for
instance by using 3-byte jumps to a trampoline installed in alignment
nops. If you combine different strategies like this, you can eventually
reach an almost 100% success rate for "jump patching" tracepoints. This
gets quite hairy, though. The short story is that, as far as I know,
there is currently no tool that does this easily and reliably in user
space.

https://onlinelibrary.wiley.com/doi/abs/10.1002/spe.2746
https://dl.acm.org/doi/pdf/10.1145/3062341.3062344
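
The "no incoming branch into the middle of the sequence" condition can
be sketched as a simple check. This is only an illustration under
stated assumptions: the instruction lengths and the branch-target
offsets (both relative to the probe address) are assumed to come from a
disassembler, and covered_len() and can_jump_patch() are hypothetical
names, not functions from kprobes or GDB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return how many bytes are covered by the whole instructions needed to
 * make room for a patch of `patch_len` bytes, or 0 if the instructions
 * run out first. insn_len[i] is the length of the i-th instruction
 * starting at the probe address.
 */
size_t covered_len(const size_t *insn_len, size_t n_insn, size_t patch_len)
{
    assert(insn_len != NULL || n_insn == 0);

    size_t covered = 0;
    for (size_t i = 0; i < n_insn && covered < patch_len; i++)
        covered += insn_len[i];
    return covered >= patch_len ? covered : 0;
}

/*
 * A patch is considered safe here only if no known branch target lands
 * strictly inside the covered bytes. A branch to offset 0 is fine: it
 * simply reaches the inserted jump.
 */
bool can_jump_patch(const size_t *insn_len, size_t n_insn,
                    const size_t *target_off, size_t n_targets,
                    size_t patch_len)
{
    size_t covered = covered_len(insn_len, n_insn, patch_len);
    if (covered == 0)
        return false;

    for (size_t i = 0; i < n_targets; i++)
        if (target_off[i] > 0 && target_off[i] < covered)
            return false;
    return true;
}
```

With instruction lengths 1, 3, 1 and a 5-byte patch, a branch target at
offset 4 makes the site unsafe, while targets at offsets 0 and 5 do
not. Indirect branches, whose targets are not known statically, are one
reason the real problem is harder than this check suggests.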

If you can afford a more invasive tool that requires a lot of memory
and stops your application for quite some time, you can look at
approaches like dyninst, which decompile the binary, insert
instrumentation code and reassemble the code.

https://dyninst.org/

> You would need to insert a jump on top of existing code and still be
> able to preserve that code. What a trap-based probe does is insert an
> int3, which traps into the kernel; the kernel then emulates the
> instruction that the int3 was placed on and also calls some code that
> can trace the current state.
> 
> To do it in user land, you would need to find a way to replace the code
> at the location you want to trace with a jump to the tracing
> infrastructure, which would also have to emulate the code that the jump
> was inserted on top of. On x86, that jump needs to be 5 bytes long
> (covering 5 bytes of text to emulate), whereas an int3 is a single
> byte.
> 
> Thus, you either recompile and insert nops where you want to place your
> jumps, or you trap using int3 and do the work from within the kernel.
> 
> -- Steve
> _______________________________________________
> lttng-dev mailing list
> lttng-dev@lists.lttng.org
> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

* Re: [lttng-dev] Capturing User-Level Function Calls/Returns
  2020-07-16  1:06         ` Michel Dagenais via lttng-dev
  2020-07-16  1:06           ` [lttng-dev] " Michel Dagenais via lttng-dev
@ 2020-07-16  1:49           ` Frank Ch. Eigler
  2020-07-16  1:49             ` Frank Ch. Eigler via lttng-dev
  2020-07-16 16:26             ` ahmadkhorrami
  2020-07-16 16:20           ` ahmadkhorrami
  2020-07-16 16:34           ` ahmadkhorrami
  3 siblings, 2 replies; 24+ messages in thread
From: Frank Ch. Eigler @ 2020-07-16  1:49 UTC (permalink / raw)
  To: Michel Dagenais
  Cc: Steven Rostedt, ahmadkhorrami, linux-trace-users-owner,
	linux-trace-users, lttng-dev, Namhyung Kim

Hi -

> If you can afford a more invasive tool, that requires a lot of
> memory and stops your application for quite some time, you can look
> at approaches like dyninst that decompile the binary, insert
> instrumentation code and reassemble the code.

> https://dyninst.org/

For the record, systemtap includes a pure-userspace backend that is
built on dyninst.


% cat foo.c
#include <stdio.h>

int foo() {
  printf("foo\n");
  return 1;
}

int main() {
  foo();
}

% gcc -g foo.c

% stap --runtime=dyninst -e '
   probe process.function("*").{call,return} { println(pp()) }
' -c a.out

foo
process("/home/fche/a.out").function("main@/home/fche/foo.c:8").call
process("/home/fche/a.out").function("foo@/home/fche/foo.c:3").call
process("/home/fche/a.out").function("foo@/home/fche/foo.c:3").return
process("/home/fche/a.out").function("main@/home/fche/foo.c:8").return


- FChE

* Re: Capturing User-Level Function Calls/Returns
  2020-07-16  1:04   ` Namhyung Kim
  2020-07-16  1:04     ` [lttng-dev] " Namhyung Kim via lttng-dev
@ 2020-07-16 16:07     ` ahmadkhorrami
  2020-07-16 16:07       ` [lttng-dev] " ahmadkhorrami via lttng-dev
  1 sibling, 1 reply; 24+ messages in thread
From: ahmadkhorrami @ 2020-07-16 16:07 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Steven Rostedt, Linux-trace Users, lttng-dev, Mathieu Desnoyers,
	Jérémie Galarneau, linux-trace-users-owner

Hi Namhyung,
This seems really interesting and is what I am looking for. Can it 
capture all function entries/exits? I mean, does it fully handle 
variable-length instructions in dynamic mode?
In any case, thanks! I hope it becomes stable as soon as 
possible, so that everyone can use it.
Regards.

On 2020-07-16 05:34, Namhyung Kim wrote:

> Hi all,
> 
> On Thu, Jul 16, 2020 at 3:28 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>> On Wed, 15 Jul 2020 20:37:16 +0430
>> ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:
>> 
>>> Hi,
>>> What is the most efficient way to capture occurrence of a function
>>> call/return of a binary program in userspace?
>>> It seems the answer is Uprobes. 1) Am I right?
>>> But Uprobes use "int" instruction which leads to a switch into kernel
>>> mode. 2) Wouldn't it be better to avoid this transition?
>>> I'm looking forward to your reply and will be happy to read your
>>> opinions.
>>> Regards.
>> 
>> Hi, I believe LTTng has utilities that can help you trace user space
>> programs.
>> 
>> I think there's also a users ftrace like utility that Namhyung was
>> working on. But I don't know where in the development that is.
> 
> It's in https://github.com/namhyung/uftrace
> 
> Basically it also requires recompilation to add mcount calls to each
> function. But it now also supports dynamic tracing without any
> recompilation. :)
> It's still experimental and has some limitations, but the idea is to copy
> the first 5 bytes (on x86_64) somewhere and replace them with a call
> instruction.
> 
> Thanks
> Namhyung
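
As a rough illustration of the byte-level idea quoted above (copy the
first 5 bytes aside, then overwrite them with a call), here is a minimal
C sketch. It is not uftrace's code: the names encode_call_rel32 and
patch_entry are hypothetical, and it works on a plain byte buffer rather
than on live, executable memory. The only hard fact it relies on is the
x86-64 encoding of a near call: opcode 0xE8 followed by a 32-bit
little-endian displacement measured from the end of the instruction.

```c
#include <stdint.h>
#include <string.h>

/* An x86-64 near call is opcode 0xE8 plus a 32-bit displacement. */
#define CALL_REL32_LEN 5

/*
 * Hypothetical helper: encode "call target" as it would sit at address
 * `site`. Returns 0 on success, -1 if the target is further away than a
 * signed 32-bit displacement can express.
 */
int encode_call_rel32(uint8_t out[CALL_REL32_LEN], uint64_t site, uint64_t target)
{
    /* The displacement is relative to the address of the next instruction. */
    int64_t disp = (int64_t)(target - (site + CALL_REL32_LEN));

    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t rel = (uint32_t)(int32_t)disp;
    out[0] = 0xE8;
    out[1] = (uint8_t)(rel & 0xFF);          /* little-endian byte order */
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
    return 0;
}

/*
 * Hypothetical helper: save the first 5 bytes of `code` into `saved`,
 * then overwrite them with a call to `target`. `site` is the address the
 * buffer would have in the traced process. `code` is left untouched on
 * failure.
 */
int patch_entry(uint8_t *code, uint8_t saved[CALL_REL32_LEN],
                uint64_t site, uint64_t target)
{
    uint8_t call[CALL_REL32_LEN];

    if (encode_call_rel32(call, site, target) != 0)
        return -1;

    memcpy(saved, code, CALL_REL32_LEN);   /* keep the original bytes */
    memcpy(code, call, CALL_REL32_LEN);    /* install the call */
    return 0;
}
```

The sketch deliberately ignores the hard parts: the saved bytes may end
in the middle of an instruction, and a real tool also has to make the
page writable and patch it safely while other threads may be running.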

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [lttng-dev] Capturing User-Level Function Calls/Returns
  2020-07-16  1:06         ` Michel Dagenais via lttng-dev
  2020-07-16  1:06           ` [lttng-dev] " Michel Dagenais via lttng-dev
  2020-07-16  1:49           ` Frank Ch. Eigler
@ 2020-07-16 16:20           ` ahmadkhorrami
  2020-07-16 16:20             ` ahmadkhorrami via lttng-dev
  2020-07-16 16:34           ` ahmadkhorrami
  3 siblings, 1 reply; 24+ messages in thread
From: ahmadkhorrami @ 2020-07-16 16:20 UTC (permalink / raw)
  To: Michel Dagenais
  Cc: Steven Rostedt, linux-trace-users-owner, linux-trace-users,
	lttng-dev, Namhyung Kim

Hi Michel,

Thanks for the detailed answer! DBI tools are really interesting, but I 
want to do this during normal execution and on multiple programs running 
simultaneously. This is not meant to be conventional tracing with 
multiple re-executions. I want to extract some information about the 
execution state at runtime and pass it to the lower levels of the 
software stack so they can make smarter choices. Fortunately, only a 
few functions need to be traced. But any reduction in wasted cycles 
helps, especially cycles lost to privilege-level transitions.

Regards.


On 2020-07-16 05:36, Michel Dagenais wrote:

>> Without recompiling, how would that be implemented?
> 
> As you mentioned, this is possible when "jump patching" 5-byte 
> instructions. Fast tracepoints in GDB and in kprobe do it. Kprobe goes 
> further and patches sequences of instructions (because the target 
> instruction is less than 5 bytes) if there is no incoming branch into 
> the middle of the sequence. You can go even further, for instance using 
> 3-byte jumps to a trampoline installed in alignment nops. If you 
> combine different strategies like this, you can eventually reach an 
> almost 100% success rate for "jump patching" tracepoints. This gets 
> quite hairy though. However, the short story is that there is currently 
> no tool as far as I know that does that easily and reliably in user 
> space.
> 
> https://onlinelibrary.wiley.com/doi/abs/10.1002/spe.2746
> https://dl.acm.org/doi/pdf/10.1145/3062341.3062344
> 
> If you can afford a more invasive tool, that requires a lot of memory 
> and stops your application for quite some time, you can look at 
> approaches like dyninst that decompile the binary, insert 
> instrumentation code and reassemble the code.
> 
> https://dyninst.org/
> 
>> You would need to insert a jump on top of code, and still be able to
>> preserve that code. What a trap does is insert an int3, which traps
>> into the kernel; the kernel then emulates the code that the int3 was
>> on, and also calls some code that can trace the current state.
>> 
>> To do it in user land, you would need to find a way to replace the code
>> at the location you want to trace with a jump to the tracing
>> infrastructure, which must also be able to emulate the code that the
>> jump was inserted on top of. On x86, that jump needs to be 5
>> bytes long (covering 5 bytes of text to emulate), whereas an int3 is a
>> single byte.
>> 
>> Thus, you either recompile and insert nops where you want to place your
>> jumps, or you trap using int3, which can do the work from within the
>> kernel.
>> 
>> -- Steve

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [lttng-dev] Capturing User-Level Function Calls/Returns
  2020-07-16  1:49           ` Frank Ch. Eigler
  2020-07-16  1:49             ` Frank Ch. Eigler via lttng-dev
@ 2020-07-16 16:26             ` ahmadkhorrami
  2020-07-16 16:26               ` ahmadkhorrami via lttng-dev
  1 sibling, 1 reply; 24+ messages in thread
From: ahmadkhorrami @ 2020-07-16 16:26 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Michel Dagenais, Steven Rostedt, linux-trace-users-owner,
	linux-trace-users, lttng-dev, Namhyung Kim

Hi Frank,

Thanks for the point!

Regards.


On 2020-07-16 06:19, Frank Ch. Eigler wrote:

> Hi -
> 
>> If you can afford a more invasive tool, that requires a lot of
>> memory and stops your application for quite some time, you can look
>> at approaches like dyninst that decompile the binary, insert
>> instrumentation code and reassemble the code.
> 
>> https://dyninst.org/
> 
> For the record, systemtap includes a backend that uses dyninst as a
> pure userspace backend.
> 
> % cat foo.c
> #include <stdio.h>
> 
> int foo() {
>   printf("foo\n");
>   return 1;
> }
> 
> int main() {
>   foo();
> }
> 
> % gcc -g foo.c
> 
> % stap --runtime=dyninst -e '
>    probe process.function("*").{call,return} { println(pp()) }
> ' -c a.out
> 
> foo
> process("/home/fche/a.out").function("main@/home/fche/foo.c:8").call
> process("/home/fche/a.out").function("foo@/home/fche/foo.c:3").call
> process("/home/fche/a.out").function("foo@/home/fche/foo.c:3").return
> process("/home/fche/a.out").function("main@/home/fche/foo.c:8").return
> 
> - FChE

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [lttng-dev] Capturing User-Level Function Calls/Returns
  2020-07-16  1:06         ` Michel Dagenais via lttng-dev
                             ` (2 preceding siblings ...)
  2020-07-16 16:20           ` ahmadkhorrami
@ 2020-07-16 16:34           ` ahmadkhorrami
  2020-07-16 16:34             ` ahmadkhorrami via lttng-dev
  3 siblings, 1 reply; 24+ messages in thread
From: ahmadkhorrami @ 2020-07-16 16:34 UTC (permalink / raw)
  To: Michel Dagenais
  Cc: Steven Rostedt, linux-trace-users-owner, linux-trace-users,
	lttng-dev, Namhyung Kim

Hi Michel,

Thanks for the detailed answer! DBI tools are really interesting, but I 
want to do this during normal execution and on multiple programs running 
simultaneously. This is not meant to be conventional tracing with 
multiple re-executions. I want to extract some information about the 
execution state at runtime and pass it to the lower levels of the 
software stack so they can make smarter choices. Fortunately, only a 
few functions need to be traced. But any reduction in wasted cycles 
helps, especially cycles lost to privilege-level transitions.

Regards.


On 2020-07-16 05:36, Michel Dagenais wrote:

>> Without recompiling, how would that be implemented?
> 
> As you mentioned, this is possible when "jump patching" 5-byte 
> instructions. Fast tracepoints in GDB and in kprobe do it. Kprobe goes 
> further and patches sequences of instructions (because the target 
> instruction is less than 5 bytes) if there is no incoming branch into 
> the middle of the sequence. You can go even further, for instance using 
> 3-byte jumps to a trampoline installed in alignment nops. If you 
> combine different strategies like this, you can eventually reach an 
> almost 100% success rate for "jump patching" tracepoints. This gets 
> quite hairy though. However, the short story is that there is currently 
> no tool as far as I know that does that easily and reliably in user 
> space.
> 
> https://onlinelibrary.wiley.com/doi/abs/10.1002/spe.2746
> https://dl.acm.org/doi/pdf/10.1145/3062341.3062344
> 
> If you can afford a more invasive tool, that requires a lot of memory 
> and stops your application for quite some time, you can look at 
> approaches like dyninst that decompile the binary, insert 
> instrumentation code and reassemble the code.
> 
> https://dyninst.org/
> 
>> You would need to insert a jump on top of code, and still be able to
>> preserve that code. What a trap does is insert an int3, which traps
>> into the kernel; the kernel then emulates the code that the int3 was
>> on, and also calls some code that can trace the current state.
>> 
>> To do it in user land, you would need to find a way to replace the code
>> at the location you want to trace with a jump to the tracing
>> infrastructure, which must also be able to emulate the code that the
>> jump was inserted on top of. On x86, that jump needs to be 5
>> bytes long (covering 5 bytes of text to emulate), whereas an int3 is a
>> single byte.
>> 
>> Thus, you either recompile and insert nops where you want to place your
>> jumps, or you trap using int3, which can do the work from within the
>> kernel.
>> 
>> -- Steve
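
To make the instruction-size problem in the quoted text concrete, here
is a small C sketch of the first check a jump patcher has to make: how
many whole instructions must be moved out of the way so that a 5-byte
jump fits. The function name bytes_to_relocate is hypothetical, and the
instruction lengths are assumed to come from a disassembler; nothing
here is taken from GDB, kprobes or dyninst.

```c
#include <stddef.h>
#include <stdint.h>

/* Length of an x86 "jmp rel32": opcode 0xE9 plus a 32-bit displacement. */
#define JUMP_REL32_LEN 5

/*
 * insn_len[i] is the length in bytes of the i-th instruction at the
 * probe site (assumed input, e.g. from a disassembler).
 *
 * Returns the number of bytes that must be relocated so that a 5-byte
 * jump can be written without cutting an instruction in half, or 0 if
 * the listed instructions do not cover 5 bytes (or a length is invalid).
 */
size_t bytes_to_relocate(const uint8_t *insn_len, size_t count)
{
    size_t covered = 0;

    for (size_t i = 0; i < count; i++) {
        if (insn_len[i] == 0)
            return 0;                 /* a zero-length instruction is invalid */
        covered += insn_len[i];
        if (covered >= JUMP_REL32_LEN)
            return covered;           /* always ends on an instruction boundary */
    }
    return 0;                         /* not enough room for the jump */
}
```

For a typical prologue of push %rbp (1 byte), mov %rsp,%rbp (3 bytes)
and sub $0x10,%rsp (4 bytes), the lengths {1, 3, 4} give 8: the jump
overwrites three instructions, not one. The sketch does not cover the
other condition mentioned in the quote, namely that no branch elsewhere
in the program may target the middle of the relocated sequence.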

^ permalink raw reply	[flat|nested] 24+ messages in thread

Thread overview: 24+ messages
     [not found] <e9c7400ff0075f3beba2863c4432a905@ut.ac.ir>
2020-07-15 18:28 ` Capturing User-Level Function Calls/Returns Steven Rostedt
2020-07-15 18:28   ` [lttng-dev] " Steven Rostedt via lttng-dev
2020-07-15 18:45   ` Mathieu Desnoyers
2020-07-15 18:45     ` [lttng-dev] " Mathieu Desnoyers via lttng-dev
2020-07-15 21:39     ` ahmadkhorrami
2020-07-15 21:39       ` [lttng-dev] " ahmadkhorrami via lttng-dev
2020-07-15 21:48       ` Steven Rostedt
2020-07-15 21:48         ` [lttng-dev] " Steven Rostedt via lttng-dev
2020-07-15 22:25         ` ahmadkhorrami
2020-07-15 22:25           ` [lttng-dev] " ahmadkhorrami via lttng-dev
2020-07-16  1:06         ` Michel Dagenais via lttng-dev
2020-07-16  1:06           ` [lttng-dev] " Michel Dagenais via lttng-dev
2020-07-16  1:49           ` Frank Ch. Eigler
2020-07-16  1:49             ` Frank Ch. Eigler via lttng-dev
2020-07-16 16:26             ` ahmadkhorrami
2020-07-16 16:26               ` ahmadkhorrami via lttng-dev
2020-07-16 16:20           ` ahmadkhorrami
2020-07-16 16:20             ` ahmadkhorrami via lttng-dev
2020-07-16 16:34           ` ahmadkhorrami
2020-07-16 16:34             ` ahmadkhorrami via lttng-dev
2020-07-16  1:04   ` Namhyung Kim
2020-07-16  1:04     ` [lttng-dev] " Namhyung Kim via lttng-dev
2020-07-16 16:07     ` ahmadkhorrami
2020-07-16 16:07       ` [lttng-dev] " ahmadkhorrami via lttng-dev
