kvm.vger.kernel.org archive mirror
* [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
@ 2009-04-02 17:24 Masami Hiramatsu
  2009-04-03 11:26 ` Ingo Molnar
  0 siblings, 1 reply; 19+ messages in thread
From: Masami Hiramatsu @ 2009-04-02 17:24 UTC (permalink / raw)
  To: Ingo Molnar, Frederic Weisbecker, Steven Rostedt,
	Ananth N Mavinakayanahalli, Andre
  Cc: kvm, systemtap-ml, LKML

Hi,

Here are the patches for the kprobe-based event tracer for x86, version 4.

This version supports only x86(-32/-64) (If someone is interested in
porting this to other architectures, he just needs to port
kprobes/kretprobes and ptrace enhancement[PATCH 2/6]).

I added an x86 insn decoder in this version. It might be better
integrated with KVM's decoder, and the kprobes x86 code should be
rewritten with it.


This can be applied on the linux-2.6-tip tree.

This patchset includes the following changes:
- Fix kernel_trap_sp() on x86 according to systemtap runtime. [1/6]
- Add arch-dep register and stack fetching functions [2/6]
- Add x86 instruction decoder [3/6]
- Check insertion point safety in kprobe [4/6]
- Add kprobe-tracer plugin [5/6]
- Support fetching various status (register/stack/memory/etc.) [6/6]

Done items:
- Add kernel_trap_sp() and fetch_*() on other archs.
- Support name-based register fetching (ax, bx, and so on)
- Support indirect memory fetch from registers etc.
- Check insertion point safety by using instruction decoder.

Future items:
- .init function tracing support.
- Support primitive types(long, ulong, int, uint, etc) for args.


kprobe-based event tracer
---------------------------

This tracer is similar to the events tracer, which is based on the Tracepoint
infrastructure. Instead of Tracepoint, this tracer is based on kprobes (kprobe
and kretprobe). It can probe anywhere kprobes can probe (that is, all
function bodies except for __kprobes functions).

Unlike the function tracer, this tracer can probe instructions inside of
kernel functions. It allows you to check which instruction has been executed.

Unlike the Tracepoint based events tracer, this tracer can add new probe points
on the fly.

Similar to the events tracer, this tracer doesn't need to be activated via
current_tracer; instead, just set probe points via
/debug/tracing/kprobe_probes.

Synopsis of kprobe_probes:
  p SYMBOL[+offs|-offs]|MEMADDR [FETCHARGS]     : set a probe
  r SYMBOL[+0] [FETCHARGS]                      : set a return probe

 FETCHARGS:
  %REG  : Fetch register REG
  sN    : Fetch Nth entry of stack (N >= 0)
  @ADDR : Fetch memory at ADDR (ADDR should be in kernel)
  @SYM[+|-offs] : Fetch memory at SYM +|- offs (SYM should be a data symbol)
  aN    : Fetch function argument. (N >= 0)(*)
  rv    : Fetch return value.(**)
  ra    : Fetch return address.(**)
  +|-offs(FETCHARG) : fetch memory at FETCHARG +|- offs address.(***)

  (*) aN may not be correct for asmlinkage functions or in the middle of a
      function body.
  (**) only for return probe.
  (***) this is useful for fetching a field of data structures.
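
The FETCHARGS forms listed above can be told apart by their leading
characters. The following is only a rough sketch of such a classifier, not the
parser from the patch: the names fetch_kind and classify_fetcharg() are
hypothetical, and the sketch does not check that rv/ra are used only on return
probes or that the offset in +|-offs(FETCHARG) is numeric.

```c
#include <ctype.h>
#include <string.h>

/* Kinds of fetch argument listed in the synopsis (names are hypothetical). */
enum fetch_kind {
	FETCH_INVALID,
	FETCH_REG,      /* %REG */
	FETCH_STACK,    /* sN */
	FETCH_MEM,      /* @ADDR or @SYM[+|-offs] */
	FETCH_ARG,      /* aN */
	FETCH_RETVAL,   /* rv */
	FETCH_RETADDR,  /* ra */
	FETCH_INDIRECT, /* +|-offs(FETCHARG) */
};

/* Return 1 if s is a non-empty string of decimal digits. */
static int all_digits(const char *s)
{
	if (*s == '\0')
		return 0;
	for (; *s; s++)
		if (!isdigit((unsigned char)*s))
			return 0;
	return 1;
}

/* Classify one FETCHARG token by its leading characters. */
static enum fetch_kind classify_fetcharg(const char *arg)
{
	size_t len = strlen(arg);
	const char *open;

	if (len == 0)
		return FETCH_INVALID;
	if (strcmp(arg, "rv") == 0)
		return FETCH_RETVAL;
	if (strcmp(arg, "ra") == 0)
		return FETCH_RETADDR;

	switch (arg[0]) {
	case '%':
	case '@':
		/* A register or address/symbol name must follow the prefix. */
		if (arg[1] == '\0')
			return FETCH_INVALID;
		return arg[0] == '%' ? FETCH_REG : FETCH_MEM;
	case 's':
		return all_digits(arg + 1) ? FETCH_STACK : FETCH_INVALID;
	case 'a':
		return all_digits(arg + 1) ? FETCH_ARG : FETCH_INVALID;
	case '+':
	case '-':
		/* Needs an offset, then "(", a nested FETCHARG, and a final ")". */
		open = strchr(arg, '(');
		if (open && open > arg + 1 && arg[len - 1] == ')' &&
		    open + 1 < arg + len - 1)
			return FETCH_INDIRECT;
		return FETCH_INVALID;
	default:
		return FETCH_INVALID;
	}
}
```

With this sketch, "+4(%ax)" is classified as an indirect fetch, while a token
that is not in the synopsis, such as "rp", is rejected.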

E.g.
  echo p do_sys_open a0 a1 a2 a3 > /debug/tracing/kprobe_probes

 This sets a kprobe at the top of the do_sys_open() function, recording
its 1st to 4th arguments.

  echo r do_sys_open rv ra >> /debug/tracing/kprobe_probes

 This sets a kretprobe on the return point of the do_sys_open() function,
recording the return value and return address.

  echo > /debug/tracing/kprobe_probes

 This clears all probe points. You can see the traced information via
/debug/tracing/trace.

  cat /debug/tracing/trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
           <...>-2376  [001]   262.389131: do_sys_open: @do_sys_open+0 0xffffff9c 0x98db83e 0x8880 0x0
           <...>-2376  [001]   262.391166: sys_open: <-do_sys_open+0 0x5 0xc06e8ebb
           <...>-2376  [001]   264.384876: do_sys_open: @do_sys_open+0 0xffffff9c 0x98db83e 0x8880 0x0
           <...>-2376  [001]   264.386880: sys_open: <-do_sys_open+0 0x5 0xc06e8ebb
           <...>-2084  [001]   265.380330: do_sys_open: @do_sys_open+0 0xffffff9c 0x804be3e 0x0 0x1b6
           <...>-2084  [001]   265.380399: sys_open: <-do_sys_open+0 0x3 0xc06e8ebb

 @SYMBOL means that the kernel hit a probe, and <-SYMBOL means that the kernel
returned from SYMBOL (e.g. "sys_open: <-do_sys_open+0" means the kernel
returned from do_sys_open to sys_open).


 Documentation/ftrace.txt      |   70 ++++
 arch/x86/include/asm/insn.h   |  130 +++++++
 arch/x86/include/asm/ptrace.h |   70 ++++-
 arch/x86/kernel/kprobes.c     |   51 +++
 arch/x86/kernel/ptrace.c      |   59 +++
 arch/x86/lib/Makefile         |    1 +
 arch/x86/lib/insn.c           |  627 ++++++++++++++++++++++++++++++++
 kernel/trace/Kconfig          |    9 +
 kernel/trace/Makefile         |    1 +
 kernel/trace/trace_kprobe.c   |  789 +++++++++++++++++++++++++++++++++++++++++
 10 files changed, 1805 insertions(+), 2 deletions(-)

Thank you,


-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-02 17:24 [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer Masami Hiramatsu
@ 2009-04-03 11:26 ` Ingo Molnar
  2009-04-03 11:32   ` Andi Kleen
  2009-04-03 11:50   ` Avi Kivity
  0 siblings, 2 replies; 19+ messages in thread
From: Ingo Molnar @ 2009-04-03 11:26 UTC (permalink / raw)
  To: Masami Hiramatsu, H. Peter Anvin, Avi Kivity
  Cc: Frederic Weisbecker, Steven Rostedt, Ananth N Mavinakayanahalli,
	Andrew Morton, Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML


* Masami Hiramatsu <mhiramat@redhat.com> wrote:

> Hi,
> 
> Here are the patches of kprobe-based event tracer for x86, version 4.
> 
> This version supports only x86(-32/-64) (If someone is interested in
> porting this to other architectures, he just needs to port
> kprobes/kretprobes and ptrace enhancement[PATCH 2/6]).
> 
> I added x86 insn decoder on this version. It might be better
> integrated with KVM's decoder, and kprobes x86 code should be
> rewritten with it.
> 
> 
> This can be applied on the linux-2.6-tip tree.
> 
> This patchset includes following changes:
> - Fix kernel_trap_sp() on x86 according to systemtap runtime. [1/6]
> - Add arch-dep register and stack fetching functions [2/6]
> - Add x86 instruction decoder [3/6]
> - Check insertion point safety in kprobe [4/6]
> - Add kprobe-tracer plugin [5/6]
> - Support fetching various status (register/stack/memory/etc.) [6/6]

ok, the structure and concept looks quite good now, really nice!

I'm wondering about something i suggested many moons ago: to look 
into the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).

I remember there were some issues with that (one problem being that 
the KVM decoder is a special-purpose thing covering specific range 
of execution environments - not a near-full integer-ops decoder like 
the one we are aiming for here) - are there any other fundamental 
problems beyond 'it has to be done' ?

Conceptually we want just a single piece of decoder logic in 
arch/x86/. If the KVM folks are cool with it we could factor out the 
KVM one into arch/x86/lib/. But ... if there are compelling reasons 
to leave the KVM one alone in its limited environment we can do that 
too.

Avi, Peter, what's your take on this?

	Ingo


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 11:26 ` Ingo Molnar
@ 2009-04-03 11:32   ` Andi Kleen
  2009-04-03 11:50   ` Avi Kivity
  1 sibling, 0 replies; 19+ messages in thread
From: Andi Kleen @ 2009-04-03 11:32 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Masami Hiramatsu, H. Peter Anvin, Avi Kivity,
	Frederic Weisbecker, Steven Rostedt, Ananth N Mavinakayanahalli,
	Andrew Morton, Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML

> I'm wondering about something i suggested many moons ago: to look 
> into the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).

Hi Ingo,
Me and Masami just discussed this a few emails ago in this thread:)

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 11:26 ` Ingo Molnar
  2009-04-03 11:32   ` Andi Kleen
@ 2009-04-03 11:50   ` Avi Kivity
  2009-04-03 12:12     ` Ingo Molnar
  2009-04-03 14:21     ` Masami Hiramatsu
  1 sibling, 2 replies; 19+ messages in thread
From: Avi Kivity @ 2009-04-03 11:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Masami Hiramatsu, H. Peter Anvin, Frederic Weisbecker,
	Steven Rostedt, Ananth N Mavinakayanahalli, Andrew Morton,
	Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML

Ingo Molnar wrote:
> ok, the structure and concept looks quite good now, really nice!
>
> I'm wondering about something i suggested many moons ago: to look 
> into the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).
>
> I remember there were some issues with that (one problem being that 
> the KVM decoder is a special-purpose thing covering specific range 
> of execution environments - not a near-full integer-ops decoder like 
> the one we are aiming for here) - are there any other fundamental 
> problems beyond 'it has to be done' ?
>
> Conceptually we want just a single piece of decoder logic in 
> arch/x86/. If the KVM folks are cool with it we could factor out the 
> KVM one into arch/x86/lib/. But ... if there are compelling reasons 
> to leave the KVM one alone in its limited environment we can do that 
> too.
>   

kvm has three requirements not needed by kprobes:
- it wants to execute instructions, not just decode them, including 
generating faults where appropriate
- it is performance critical
- it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously

If an arch/x86/ decoder/emulator gives me these I'll gladly switch to 
it.  x86_emulate.c is high on my list of most disliked code.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 11:50   ` Avi Kivity
@ 2009-04-03 12:12     ` Ingo Molnar
  2009-04-03 12:17       ` Avi Kivity
  2009-04-03 12:25       ` Andi Kleen
  2009-04-03 14:21     ` Masami Hiramatsu
  1 sibling, 2 replies; 19+ messages in thread
From: Ingo Molnar @ 2009-04-03 12:12 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Masami Hiramatsu, H. Peter Anvin, Frederic Weisbecker,
	Steven Rostedt, Ananth N Mavinakayanahalli, Andrew Morton,
	Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML


* Avi Kivity <avi@redhat.com> wrote:

> Ingo Molnar wrote:
>> ok, the structure and concept looks quite good now, really nice!
>>
>> I'm wondering about something i suggested many moons ago: to look into 
>> the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).
>>
>> I remember there were some issues with that (one problem being 
>> that the KVM decoder is a special-purpose thing covering specific 
>> range of execution environments - not a near-full integer-ops 
>> decoder like the one we are aiming for here) - are there any 
>> other fundamental problems beyond 'it has to be done' ?
>>
>> Conceptually we want just a single piece of decoder logic in 
>> arch/x86/. If the KVM folks are cool with it we could factor out 
>> the KVM one into arch/x86/lib/. But ... if there are compelling 
>> reasons to leave the KVM one alone in its limited environment we 
>> can do that too.
>
> kvm has three requirements not needed by kprobes:
> - it wants to execute instructions, not just decode them, including  
>   generating faults where appropriate
> - it is performance critical
> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
>
> If an arch/x86/ decoder/emulator gives me these I'll gladly switch 
> to it.  x86_emulate.c is high on my list of most disliked code.

Well, this has to be driven from the KVM side as the kprobes use 
will only be for decoding so if it's modified from the kprobes side 
the KVM-only functionality might regress.

So ... we can do the library decoder for kprobes purposes, and 
someone versed in the KVM emulator can then combine the two.

	Ingo


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 12:12     ` Ingo Molnar
@ 2009-04-03 12:17       ` Avi Kivity
  2009-04-03 12:26         ` Ingo Molnar
  2009-04-03 12:25       ` Andi Kleen
  1 sibling, 1 reply; 19+ messages in thread
From: Avi Kivity @ 2009-04-03 12:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Masami Hiramatsu, H. Peter Anvin, Frederic Weisbecker,
	Steven Rostedt, Ananth N Mavinakayanahalli, Andrew Morton,
	Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML

Ingo Molnar wrote:
>> kvm has three requirements not needed by kprobes:
>> - it wants to execute instructions, not just decode them, including  
>>   generating faults where appropriate
>> - it is performance critical
>> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
>>
>> If an arch/x86/ decoder/emulator gives me these I'll gladly switch 
>> to it.  x86_emulate.c is high on my list of most disliked code.
>>     
>
> Well, this has to be driven from the KVM side as the kprobes use 
> will only be for decoding so if it's modified from the kprobes side 
> the KVM-only functionality might regress.
>
> So ... we can do the library decoder for kprobes purposes, and 
> someone versed in the KVM emulator can then combine the two.
>   

Problem is, anyone versed in the kvm emulator will want to run as far 
away from this work as possible.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 12:12     ` Ingo Molnar
  2009-04-03 12:17       ` Avi Kivity
@ 2009-04-03 12:25       ` Andi Kleen
  1 sibling, 0 replies; 19+ messages in thread
From: Andi Kleen @ 2009-04-03 12:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Avi Kivity, Masami Hiramatsu, H. Peter Anvin,
	Frederic Weisbecker, Steven Rostedt, Ananth N Mavinakayanahalli,
	Andrew Morton, Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML

> So ... we can do the library decoder for kprobes purposes, and 
> someone versed in the KVM emulator can then combine the two.

The KVM (or rather Xen, that is where it comes from) decoder is already
a "library decoder". That is it does nearly everything
through callbacks, and if you don't want some functionality
you can nop the callbacks. Nearly, because some
direct KVM references have crept in recently (e.g. to vcpus),
but those could probably be removed again without too much effort.
There are not many of them.
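
As a rough illustration of the callback style described above, the sketch
below routes a memory access through a table of function pointers so that a
caller can either record it or ignore it. Everything here is hypothetical:
emu_ops_sketch, access_log and emulate_store32() are invented for this example
and are not the x86_emulate.c interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback table; not the actual x86_emulate.c interface. */
struct emu_ops_sketch {
	int (*read_mem)(void *ctx, uint32_t addr, void *val, size_t bytes);
	int (*write_mem)(void *ctx, uint32_t addr, const void *val, size_t bytes);
};

/* What a tracing user of the library might record about memory accesses. */
struct access_log {
	uint32_t last_addr;
	size_t last_bytes;
	unsigned int writes;
};

/* Callback that records the store instead of performing it. */
static int log_write(void *ctx, uint32_t addr, const void *val, size_t bytes)
{
	struct access_log *log = ctx;

	(void)val;
	log->last_addr = addr;
	log->last_bytes = bytes;
	log->writes++;
	return 0;
}

/* "Nop" callback for callers that do not want this functionality at all. */
static int nop_write(void *ctx, uint32_t addr, const void *val, size_t bytes)
{
	(void)ctx;
	(void)addr;
	(void)val;
	(void)bytes;
	return 0;
}

/* Toy stand-in for emulating a 32-bit store to base + disp. */
static int emulate_store32(const struct emu_ops_sketch *ops, void *ctx,
			   uint32_t base, int32_t disp, uint32_t value)
{
	/* All memory traffic goes through the caller-supplied callback. */
	return ops->write_mem(ctx, base + (uint32_t)disp, &value, sizeof(value));
}
```

A caller that only wants to observe accesses would pass log_write; one that
does not care would pass nop_write. The emulation code itself stays unchanged.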

Also doing another interpreter is a lot of work and a lot of testing,
so basing it on something that is already well tested is probably
a good idea.

-/dev/null/Andi
-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 12:17       ` Avi Kivity
@ 2009-04-03 12:26         ` Ingo Molnar
  2009-04-03 12:33           ` Avi Kivity
  2009-04-03 13:16           ` Vegard Nossum
  0 siblings, 2 replies; 19+ messages in thread
From: Ingo Molnar @ 2009-04-03 12:26 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Masami Hiramatsu, H. Peter Anvin, Frederic Weisbecker,
	Steven Rostedt, Ananth N Mavinakayanahalli, Andrew Morton,
	Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML


* Avi Kivity <avi@redhat.com> wrote:

> Ingo Molnar wrote:
>>> kvm has three requirements not needed by kprobes:
>>> - it wants to execute instructions, not just decode them, including   
>>>   generating faults where appropriate
>>> - it is performance critical
>>> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
>>>
>>> If an arch/x86/ decoder/emulator gives me these I'll gladly switch  
>>> to it.  x86_emulate.c is high on my list of most disliked code.
>>>     
>>
>> Well, this has to be driven from the KVM side as the kprobes use 
>> will only be for decoding so if it's modified from the kprobes 
>> side the KVM-only functionality might regress.
>>
>> So ... we can do the library decoder for kprobes purposes, and 
>> someone versed in the KVM emulator can then combine the two.
>
> Problem is, anyone versed in the kvm emulator will want to run as 
> far away from this work as possible.

Are you suggesting that the KVM emulator should never have been 
merged in the first place? ;-)

Anyway, we'll make sure the kprobes/library decoder is as clean as 
possible - so it ought to be hackable and extensible without the 
risk of permanent brain damage. Mmiotrace and kmemcheck have decoding 
smarts too, and i think the sw-breakpoint injection code of KGDB 
could use it as well - so there's broader utility in all this.

	Ingo


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 12:26         ` Ingo Molnar
@ 2009-04-03 12:33           ` Avi Kivity
  2009-04-03 13:16           ` Vegard Nossum
  1 sibling, 0 replies; 19+ messages in thread
From: Avi Kivity @ 2009-04-03 12:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Masami Hiramatsu, H. Peter Anvin, Frederic Weisbecker,
	Steven Rostedt, Ananth N Mavinakayanahalli, Andrew Morton,
	Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML

Ingo Molnar wrote:
>> Problem is, anyone versed in the kvm emulator will want to run as 
>> far away from this work as possible.
>>     
>
> Are you suggesting that the KVM emulator should never have been 
> merged in the first place? ;-)
>   

Truth always comes out eventually.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 12:26         ` Ingo Molnar
  2009-04-03 12:33           ` Avi Kivity
@ 2009-04-03 13:16           ` Vegard Nossum
  2009-04-03 13:40             ` Avi Kivity
  2009-04-03 13:52             ` Masami Hiramatsu
  1 sibling, 2 replies; 19+ messages in thread
From: Vegard Nossum @ 2009-04-03 13:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Avi Kivity, Masami Hiramatsu, H. Peter Anvin,
	Frederic Weisbecker, Steven Rostedt, Ananth N Mavinakayanahalli,
	Andrew Morton, Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML,
	Pekka Paalanen

2009/4/3 Ingo Molnar <mingo@elte.hu>:
>
> * Avi Kivity <avi@redhat.com> wrote:
>
>> Ingo Molnar wrote:
>>>> kvm has three requirements not needed by kprobes:
>>>> - it wants to execute instructions, not just decode them, including
>>>>   generating faults where appropriate
>>>> - it is performance critical
>>>> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
>>>>
>>>> If an arch/x86/ decoder/emulator gives me these I'll gladly switch
>>>> to it.  x86_emulate.c is high on my list of most disliked code.
>>>>
>>>
>>> Well, this has to be driven from the KVM side as the kprobes use
>>> will only be for decoding so if it's modified from the kprobes
>>> side the KVM-only functionality might regress.
>>>
>>> So ... we can do the library decoder for kprobes purposes, and
>>> someone versed in the KVM emulator can then combine the two.
>>
>> Problem is, anyone versed in the kvm emulator will want to run as
>> far away from this work as possible.
>
> Are you suggesting that the KVM emulator should never have been
> merged in the first place? ;-)
>
> Anyway, we'll make sure the kprobes/library decoder is as clean as
> possible - so it ought to be hackable and extensible without the
> risk of permanent brain damage. Mmiotrace and kmemcheck has decoding
> smarts too, and i think the sw-breakpoint injection code of KGDB
> could use it as well - so there's broader utility in all this.

(Sorry in advance for jumping in -- my post may be irrelevant)

For the record, kmemcheck requirements for an instruction decoder are these:

For any instruction with memory operands, we need to know which are
the operands (so for movl %eax, (%ebx) we need to combine the
instruction with a struct pt_regs to get the actual address
dereferenced, i.e. the contents of %ebx), and their sizes (for movzbl,
the source operand is 8 bits, destination operand is 32 bits). For
things like movsb, we need to be able to get both %esi and %edi.

mmiotrace additionally needs to know what the actual values
read/written were, for instructions that read/write to memory (again,
combined with a struct pt_regs).

Maybe this doesn't really say much, since this is what a generic
instruction decoder would be able to do anyway. But kmemcheck and
mmiotrace both have very special-purpose decoders. I don't really know
what other decoders look like, but what I would wish for is this: Some
macros for iterating the operands, where each operand has a type (e.g.
input (for reads), output (for writes), target (for jumps), immediate
address, immediate value, etc.), a size (in bits), and a way to
evaluate the operand. So eval(op, regs) for op=%eax, it will return
regs->eax; for op=4(%eax), it will return regs->eax + 4; for op=4 it
will return 4, etc.
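
The eval(op, regs) idea above could look roughly like the sketch below. This
is only an illustration under simplifying assumptions: regs_sketch is a
two-register stand-in for struct pt_regs, the operand structure and all names
are hypothetical, and the read/write/jump-target classification is left out.

```c
#include <stdint.h>

/* Simplified stand-in for struct pt_regs (32-bit, two registers only). */
struct regs_sketch {
	uint32_t eax;
	uint32_t ebx;
};

enum op_kind {
	OP_REG, /* %eax: value held in a register */
	OP_MEM, /* 4(%eax): address formed from a register plus displacement */
	OP_IMM, /* 4: immediate value */
};

enum op_reg { REG_EAX, REG_EBX };

struct operand {
	enum op_kind kind;
	enum op_reg reg;        /* used by OP_REG and OP_MEM */
	int32_t disp;           /* displacement (OP_MEM) or immediate (OP_IMM) */
	unsigned int size_bits; /* operand size in bits: 8, 16 or 32 */
};

static uint32_t read_reg(enum op_reg reg, const struct regs_sketch *regs)
{
	return reg == REG_EAX ? regs->eax : regs->ebx;
}

/* Returns register contents, an effective address, or an immediate. */
static uint32_t eval(const struct operand *op, const struct regs_sketch *regs)
{
	switch (op->kind) {
	case OP_REG:
		return read_reg(op->reg, regs);
	case OP_MEM:
		return read_reg(op->reg, regs) + (uint32_t)op->disp;
	case OP_IMM:
		return (uint32_t)op->disp;
	}
	return 0;
}
```

For op=%eax this returns regs->eax, for op=4(%eax) it returns regs->eax + 4,
and for op=4 it returns 4, matching the three cases described above.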

Both kmemcheck and mmiotrace could gain SMP support with instruction
emulation, though it is strictly not necessary. In that case, though,
we would not want to emulate fault handling, etc. (i.e. the fault
should always be generated by the CPU itself).

Please do put me on Cc for future discussions, though.


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 13:16           ` Vegard Nossum
@ 2009-04-03 13:40             ` Avi Kivity
  2009-04-03 13:52             ` Masami Hiramatsu
  1 sibling, 0 replies; 19+ messages in thread
From: Avi Kivity @ 2009-04-03 13:40 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Ingo Molnar, Masami Hiramatsu, H. Peter Anvin,
	Frederic Weisbecker, Steven Rostedt, Ananth N Mavinakayanahalli,
	Andrew Morton, Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML,
	Pekka Paalanen

Vegard Nossum wrote:
> For the record, kmemcheck requirements for an instruction decoder are these:
>
> For any instruction with memory operands, we need to know which are
> the operands (so for movl %eax, (%ebx) we need to combine the
> instruction with a struct pt_regs to get the actual address
> dereferenced, i.e. the contents of %ebx), and their sizes (for movzbl,
> the source operand is 8 bits, destination operand is 32 bits). For
> things like movsb, we need to be able to get both %esi and %edi.
>
>   

The kvm emulator does all of this.

> mmiotrace additionally needs to know what the actual values
> read/written were, for instructions that read/write to memory (again,
> combined with a struct pt_regs).
>   

And this.

> Maybe this doesn't really say much, since this is what a generic
> instruction decoder would be able to do anyway. But kmemcheck and
> mmiotrace both have very special-purpose decoders. I don't really know
> what other decoders look like, but what I would wish for is this: Some
> macros for iterating the operands, where each operand has a type (e.g.
> input (for reads), output (for writes), target (for jumps), immediate
> address, immediate value, etc.), a size (in bits), and a way to
> evaluate the operand. So eval(op, regs) for op=%eax, it will return
> regs->eax; for op=4(%eax), it will return regs->eax + 4; for op=4 it
> will return 4, etc.
>   

You can do something like this by executing the instruction and 
observing what memory it touches through the callbacks.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 13:16           ` Vegard Nossum
  2009-04-03 13:40             ` Avi Kivity
@ 2009-04-03 13:52             ` Masami Hiramatsu
  2009-04-05 19:37               ` Pekka Paalanen
  1 sibling, 1 reply; 19+ messages in thread
From: Masami Hiramatsu @ 2009-04-03 13:52 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Ingo Molnar, Avi Kivity, H. Peter Anvin, Frederic Weisbecker,
	Steven Rostedt, Ananth N Mavinakayanahalli, Andrew Morton,
	Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML,
	Pekka Paalanen

Vegard Nossum wrote:
> 2009/4/3 Ingo Molnar <mingo@elte.hu>:
>> * Avi Kivity <avi@redhat.com> wrote:
>>
>>> Ingo Molnar wrote:
>>>>> kvm has three requirements not needed by kprobes:
>>>>> - it wants to execute instructions, not just decode them, including
>>>>>   generating faults where appropriate
>>>>> - it is performance critical
>>>>> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
>>>>>
>>>>> If an arch/x86/ decoder/emulator gives me these I'll gladly switch
>>>>> to it.  x86_emulate.c is high on my list of most disliked code.
>>>>>
>>>> Well, this has to be driven from the KVM side as the kprobes use
>>>> will only be for decoding so if it's modified from the kprobes
>>>> side the KVM-only functionality might regress.
>>>>
>>>> So ... we can do the library decoder for kprobes purposes, and
>>>> someone versed in the KVM emulator can then combine the two.
>>> Problem is, anyone versed in the kvm emulator will want to run as
>>> far away from this work as possible.
>> Are you suggesting that the KVM emulator should never have been
>> merged in the first place? ;-)
>>
>> Anyway, we'll make sure the kprobes/library decoder is as clean as
>> possible - so it ought to be hackable and extensible without the
>> risk of permanent brain damage. Mmiotrace and kmemcheck has decoding
>> smarts too, and i think the sw-breakpoint injection code of KGDB
>> could use it as well - so there's broader utility in all this.
> 
> (Sorry in advance for jumping in -- my post may be irrelevant)

Thank you for clarifying your needs :-)

> For the record, kmemcheck requirements for an instruction decoder are these:
> 
> For any instruction with memory operands, we need to know which are
> the operands (so for movl %eax, (%ebx) we need to combine the
> instruction with a struct pt_regs to get the actual address
> dereferenced, i.e. the contents of %ebx), and their sizes (for movzbl,
> the source operand is 8 bits, destination operand is 32 bits). For
> things like movsb, we need to be able to get both %esi and %edi.

The new decoder can give you the value of mod/rm (insn.modrm), the operand
size (insn.opnd_bytes), and the immediate size (insn.immediate.nbytes).
To find out which register is used, you can decode modrm with the MODRM_*()
macros.
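
To make the mod/rm decoding step concrete, here is a small standalone sketch
of splitting a ModRM byte into its fields and naming the base register. The
helper names below are hypothetical stand-ins, not the MODRM_*() macros from
the patch, and the sketch assumes 32-bit addressing without REX prefixes.

```c
#include <stddef.h>
#include <stdint.h>

/* ModRM byte layout: mod = bits 7..6, reg = bits 5..3, rm = bits 2..0. */
static inline unsigned int modrm_mod(uint8_t modrm) { return (modrm >> 6) & 0x3; }
static inline unsigned int modrm_reg(uint8_t modrm) { return (modrm >> 3) & 0x7; }
static inline unsigned int modrm_rm(uint8_t modrm)  { return modrm & 0x7; }

/* 32-bit general register names in encoding order (no REX extension). */
static const char *const reg32_names[8] = {
	"eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi",
};

/*
 * Return the base register name of a simple register-indirect memory
 * operand, or NULL when the encoding is register-direct (mod == 3),
 * needs a SIB byte (rm == 4), or is disp32-only (mod == 0, rm == 5).
 */
static const char *modrm_base_reg(uint8_t modrm)
{
	unsigned int mod = modrm_mod(modrm);
	unsigned int rm = modrm_rm(modrm);

	if (mod == 3 || rm == 4 || (mod == 0 && rm == 5))
		return NULL;
	return reg32_names[rm];
}
```

For example, movl %eax, (%ebx) is encoded as 89 03. The ModRM byte 0x03 has
mod = 0, reg = 0 (%eax) and rm = 3, so the base register is %ebx, whose
contents would then be read from struct pt_regs.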

> mmiotrace additionally needs to know what the actual values
> read/written were, for instructions that read/write to memory (again,
> combined with a struct pt_regs).

The decoder doesn't use any locks/shared memory, so you can
use it in interrupt context, with pt_regs.

> Maybe this doesn't really say much, since this is what a generic
> instruction decoder would be able to do anyway. But kmemcheck and
> mmiotrace both have very special-purpose decoders. I don't really know
> what other decoders look like, but what I would wish for is this: Some
> macros for iterating the operands, where each operand has a type (e.g.
> input (for reads), output (for writes), target (for jumps), immediate
> address, immediate value, etc.), a size (in bits), and a way to
> evaluate the operand. So eval(op, regs) for op=%eax, it will return
> regs->eax; for op=4(%eax), it will return regs->eax + 4; for op=4 it
> will return 4, etc.

Hmm, it's an interesting idea. I think operands can be classified by
evaluating the opcode and mod/rm.

> Both kmemcheck and mmiotrace could gain SMP support with instruction
> emulation, though it is strictly not necessary. In that case, though,
> we would not want to emulate fault handling, etc. (i.e. the fault
> should always be generated by the CPU itself).
> 
> Please do put me on Cc for future discussions, though.

Of course, thank you!

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 11:50   ` Avi Kivity
  2009-04-03 12:12     ` Ingo Molnar
@ 2009-04-03 14:21     ` Masami Hiramatsu
  2009-04-03 14:23       ` Ingo Molnar
  2009-04-03 14:30       ` Avi Kivity
  1 sibling, 2 replies; 19+ messages in thread
From: Masami Hiramatsu @ 2009-04-03 14:21 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Ingo Molnar, H. Peter Anvin, Frederic Weisbecker, Steven Rostedt,
	Ananth N Mavinakayanahalli, Andrew Morton, Andi Kleen,
	Jim Keniston, kvm, systemtap-ml, LKML, Vegard Nossum

Avi Kivity wrote:
> Ingo Molnar wrote:
>> ok, the structure and concept looks quite good now, really nice!
>>
>> I'm wondering about something i suggested many moons ago: to look into
>> the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).
>>
>> I remember there were some issues with that (one problem being that
>> the KVM decoder is a special-purpose thing covering specific range of
>> execution environments - not a near-full integer-ops decoder like the
>> one we are aiming for here) - are there any other fundamental problems
>> beyond 'it has to be done' ?
>>
>> Conceptually we want just a single piece of decoder logic in
>> arch/x86/. If the KVM folks are cool with it we could factor out the
>> KVM one into arch/x86/lib/. But ... if there are compelling reasons to
>> leave the KVM one alone in its limited environment we can do that too.
>>   
> 
> kvm has three requirements not needed by kprobes:
> - it wants to execute instructions, not just decode them, including
> generating faults where appropriate
> - it is performance critical
> - it needs to support 16-bit, 32-bit, and 64-bit instructions
> simultaneously

Hmm, I'd like to know whether kvm actually aims to emulate all kinds of
instructions. If so, I might find some bugs in x86_emulate.c.
However, I don't know all of the bugs. To find all of them, we would have to
port x86_emulate.c to user space, decode binaries with it, and
compare its output with another decoder, as Jim did with insn.c.

https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html


Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 14:21     ` Masami Hiramatsu
@ 2009-04-03 14:23       ` Ingo Molnar
  2009-04-03 16:55         ` Masami Hiramatsu
  2009-04-03 14:30       ` Avi Kivity
  1 sibling, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2009-04-03 14:23 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Avi Kivity, H. Peter Anvin, Frederic Weisbecker, Steven Rostedt,
	Ananth N Mavinakayanahalli, Andrew Morton, Andi Kleen,
	Jim Keniston, kvm, systemtap-ml, LKML, Vegard Nossum


* Masami Hiramatsu <mhiramat@redhat.com> wrote:

> Hmm, I'd like to know actually kvm aims to emulate all kinds of 
> instructions. If so, I might find some bugs in x86_emulate.c. 
> However, I don't know all bugs. To find all of them, we have to 
> port x86_emulate.c to user-space, decode binaries with it, and 
> compare its output with another decoder, as Jim had done with 
> insn.c.
> 
> https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html

btw., i'd suggest we put a build-time check for this into the kernel 
version as well. For example: decode the vmlinux via objdump, run 
it through your decoder as well, and compare the results. Put it under 
a CONFIG_DEBUG_X86_DECODER_TEST kind of (default-off) build-time 
self-test.

This would ensure that the kernel we are running is fully supported 
by the decoder - even as GCC/GAS starts using new instructions, etc. 
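
A minimal sketch of what such a comparison could look like, assuming the
usual `objdump -d` line layout (address, tab, hex bytes, tab, mnemonic).
This is not Jim's script: the names insns_from_objdump, find_mismatches and
toy_decode_length are hypothetical, and the toy decoder only stands in for
a user-space build of the real decoder. It checks instruction lengths only.

```python
import re

# One code line of `objdump -d`: "<addr>:\t<hex bytes>\t<mnemonic>".
# Long instructions spill onto continuation lines that have bytes but no mnemonic.
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\t((?:[0-9a-f]{2} )+)\s*(?:\t(.*))?$")


def insns_from_objdump(text):
    """Return [(address, raw_bytes)], one entry per disassembled instruction."""
    insns = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # labels, section headers, blank lines
        addr = int(m.group(1), 16)
        raw = bytes.fromhex(m.group(2))
        if m.group(3) is None and insns:
            # Continuation line: append its bytes to the previous instruction.
            prev_addr, prev_raw = insns[-1]
            insns[-1] = (prev_addr, prev_raw + raw)
        else:
            insns.append((addr, raw))
    return insns


def find_mismatches(insns, decode_length):
    """Return (address, objdump_length, decoder_length) where the two disagree."""
    return [(addr, len(raw), decode_length(raw))
            for addr, raw in insns
            if decode_length(raw) != len(raw)]


def toy_decode_length(raw):
    """Hypothetical stand-in for the decoder under test; knows three opcodes."""
    known = {0x55: 1, 0xC3: 1, 0xE8: 5}
    return known.get(raw[0], 0)  # 0 means "could not decode"


SAMPLE = (
    "0000000000000000 <f>:\n"
    "   0:\t55                   \tpush   %rbp\n"
    "   1:\te8 00 00 00 00       \tcallq  6 <f+0x6>\n"
    "   6:\t48 c7 84 24 88 00 00 \tmovq   $0x0,0x88(%rsp)\n"
    "   d:\t00 00 00 00 00 \n"
    "  12:\tc3                   \tretq\n"
)

print(find_mismatches(insns_from_objdump(SAMPLE), toy_decode_length))
```

Any entry in the returned list would be an instruction in the kernel image
that the decoder under test does not handle the way objdump does.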

How does this sound to you?

	Ingo


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 14:21     ` Masami Hiramatsu
  2009-04-03 14:23       ` Ingo Molnar
@ 2009-04-03 14:30       ` Avi Kivity
  1 sibling, 0 replies; 19+ messages in thread
From: Avi Kivity @ 2009-04-03 14:30 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, H. Peter Anvin, Frederic Weisbecker, Steven Rostedt,
	Ananth N Mavinakayanahalli, Andrew Morton, Andi Kleen,
	Jim Keniston, kvm, systemtap-ml, LKML, Vegard Nossum

Masami Hiramatsu wrote:
> Hmm, I'd like to know actually kvm aims to emulate all kinds of
> instructions. 

We're less interested in fpu/sse.  The interesting instructions are 
those used for page table management, mmio, and real mode execution.

> If so, I might find some bugs in x86_emulate.c.
> However, I don't know all bugs. To find all of them, we have to
> port x86_emulate.c to user-space, decode binaries with it, and
> compare its output with another decoder, as Jim had done with insn.c.
>
>   

That would be very useful.


-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 14:23       ` Ingo Molnar
@ 2009-04-03 16:55         ` Masami Hiramatsu
  2009-04-03 17:59           ` Jim Keniston
  0 siblings, 1 reply; 19+ messages in thread
From: Masami Hiramatsu @ 2009-04-03 16:55 UTC (permalink / raw)
  To: Ingo Molnar, Jim Keniston
  Cc: Avi Kivity, H. Peter Anvin, Frederic Weisbecker, Steven Rostedt,
	Ananth N Mavinakayanahalli, Andrew Morton, Andi Kleen, kvm,
	systemtap-ml, LKML, Vegard Nossum

Ingo Molnar wrote:
> * Masami Hiramatsu <mhiramat@redhat.com> wrote:
> 
>> Hmm, I'd like to know actually kvm aims to emulate all kinds of 
>> instructions. If so, I might find some bugs in x86_emulate.c. 
>> However, I don't know all bugs. To find all of them, we have to 
>> port x86_emulate.c to user-space, decode binaries with it, and 
>> compare its output with another decoder, as Jim had done with 
>> insn.c.
>>
>> https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html
> 
> btw., i'd suggest we put a build time check for this into the kernel 
> version as well. For example to decode the vmlinux via objdump, run 
> it through your decoder as well and compare the results. Put under a 
> CONFIG_DEBUG_X86_DECODER_TEST kind of (default-off) build-time 
> self-test.
> 
> This would ensure that the kernel we are running is fully supported 
> by the decoder - even as GCC/GAS starts using new instructions, etc. 
> 
> How does this sound to you?

Thanks! That is a good idea.
Jim, do you think you could port your script into the kernel tree?

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 16:55         ` Masami Hiramatsu
@ 2009-04-03 17:59           ` Jim Keniston
  0 siblings, 0 replies; 19+ messages in thread
From: Jim Keniston @ 2009-04-03 17:59 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Avi Kivity, H. Peter Anvin, Frederic Weisbecker,
	Steven Rostedt, Ananth N Mavinakayanahalli, Andrew Morton,
	Andi Kleen, kvm, systemtap-ml, LKML, Vegard Nossum

On Fri, 2009-04-03 at 12:55 -0400, Masami Hiramatsu wrote:
> Ingo Molnar wrote:
> > * Masami Hiramatsu <mhiramat@redhat.com> wrote:
> > 
> >> Hmm, I'd like to know actually kvm aims to emulate all kinds of 
> >> instructions. If so, I might find some bugs in x86_emulate.c. 
> >> However, I don't know all bugs. To find all of them, we have to 
> >> port x86_emulate.c to user-space, decode binaries with it, and 
> >> compare its output with another decoder, as Jim had done with 
> >> insn.c.
> >>
> >> https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html
> > 
> > btw., i'd suggest we put a build time check for this into the kernel 
> > version as well. For example to decode the vmlinux via objdump, run 
> > it through your decoder as well and compare the results. Put under a 
> > CONFIG_DEBUG_X86_DECODER_TEST kind of (default-off) build-time 
> > self-test.
> > 
> > This would ensure that the kernel we are running is fully supported 
> > by the decoder - even as GCC/GAS starts using new instructions, etc. 
> > 
> > How does this sound to you?
> 
> Thanks! That is a good idea.
> Jim, would you think you can port your script into kernel tree?
...

I'd be happy to do what's needed to make it happen, and maintain it in
the face of x86 changes.  The script itself is practically nothing (~100
lines of awk and C), but what I don't know about the kernel build is a
lot, so I'd need some help from a kernel-build expert.

Jim



* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-03 13:52             ` Masami Hiramatsu
@ 2009-04-05 19:37               ` Pekka Paalanen
  2009-04-06  7:53                 ` Avi Kivity
  0 siblings, 1 reply; 19+ messages in thread
From: Pekka Paalanen @ 2009-04-05 19:37 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Vegard Nossum, Ingo Molnar, Avi Kivity, H. Peter Anvin,
	Frederic Weisbecker, Steven Rostedt, Ananth N Mavinakayanahalli,
	Andrew Morton, Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML

On Fri, 03 Apr 2009 09:52:09 -0400
Masami Hiramatsu <mhiramat@redhat.com> wrote:

> Vegard Nossum wrote:
> > 2009/4/3 Ingo Molnar <mingo@elte.hu>:
> >> * Avi Kivity <avi@redhat.com> wrote:
> >>
> >>> Ingo Molnar wrote:
> >>>>> kvm has three requirements not needed by kprobes:
> >>>>> - it wants to execute instructions, not just decode them, including
> >>>>>   generating faults where appropriate
> >>>>> - it is performance critical
> >>>>> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
> >>>>>
> >>>>> If an arch/x86/ decoder/emulator gives me these I'll gladly switch
> >>>>> to it.  x86_emulate.c is high on my list of most disliked code.
> >>>>>
> >>>> Well, this has to be driven from the KVM side as the kprobes use
> >>>> will only be for decoding so if it's modified from the kprobes
> >>>> side the KVM-only functionality might regress.
> >>>>
> >>>> So ... we can do the library decoder for kprobes purposes, and
> >>>> someone versed in the KVM emulator can then combine the two.
> >>> Problem is, anyone versed in the kvm emulator will want to run as
> >>> far away from this work as possible.
> >> Are you suggesting that the KVM emulator should never have been
> >> merged in the first place? ;-)
> >>
> >> Anyway, we'll make sure the kprobes/library decoder is as clean as
> >> possible - so it ought to be hackable and extensible without the
> >> risk of permanent brain damage. Mmiotrace and kmemcheck have decoding
> >> smarts too, and i think the sw-breakpoint injection code of KGDB
> >> could use it as well - so there's broader utility in all this.
> > 
> > (Sorry in advance for jumping in -- my post may be irrelevant)
> 
> Thank you for clarifying your needs :-)
> 
> > For the record, kmemcheck requirements for an instruction decoder are these:
> > 
> > For any instruction with memory operands, we need to know which are
> > the operands (so for movl %eax, (%ebx) we need to combine the
> > instruction with a struct pt_regs to get the actual address
> > dereferenced, i.e. the contents of %ebx), and their sizes (for movzbl,
> > the source operand is 8 bits, destination operand is 32 bits). For
> > things like movsb, we need to be able to get both %esi and %edi.
> 
> The new decoder can give you the value of mod/rm (insn.modrm), the operand
> size (insn.opnd_bytes), and the immediate size (insn.immediate.nbytes).
> To find out which register is used, you can decode modrm with the MODRM_*()
> macros.
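
For illustration, a minimal sketch of how a ModRM byte splits into its
three architectural fields. It does not reproduce the MODRM_*() macros from
the patch; modrm_fields and REG32 are hypothetical names, and the register
table assumes 32-bit operands with no REX prefix.

```python
# 32-bit general-purpose registers in their x86 encoding order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def modrm_fields(modrm):
    """Split a ModRM byte into (mod, reg, rm)."""
    mod = (modrm >> 6) & 0x3   # bits 7-6: addressing mode
    reg = (modrm >> 3) & 0x7   # bits 5-3: register operand or opcode extension
    rm = modrm & 0x7           # bits 2-0: register or memory base
    return mod, reg, rm


# "movl %eax, (%ebx)" is encoded as 89 03, so the ModRM byte is 0x03.
mod, reg, rm = modrm_fields(0x03)
print(mod, REG32[reg], REG32[rm])
```

Here mod == 0 selects a memory operand with no displacement, reg names the
source register and rm names the base register of the destination.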
> 
> > mmiotrace additionally needs to know what the actual values
> > read/written were, for instructions that read/write to memory (again,
> > combined with a struct pt_regs).
> 
> The decoder doesn't use any locks/shared memory, so you can
> use it in interrupt context, with pt_regs.
> 
> > Maybe this doesn't really say much, since this is what a generic
> > instruction decoder would be able to do anyway. But kmemcheck and
> > mmiotrace both have very special-purpose decoders. I don't really know
> > what other decoders look like, but what I would wish for is this: Some
> > macros for iterating the operands, where each operand has a type (e.g.
> > input (for reads), output (for writes), target (for jumps), immediate
> > address, immediate value, etc.), a size (in bits), and a way to
> > evaluate the operand. So eval(op, regs) for op=%eax, it will return
> > regs->eax; for op=4(%eax), it will return regs->eax + 4; for op=4 it
> > will return 4, etc.
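
A minimal sketch of the eval(op, regs) idea described above, under loud
assumptions: Operand and eval_operand are hypothetical names, a plain dict
stands in for struct pt_regs, and only base-plus-displacement and immediate
operands are modelled (no SIB, no segment handling).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Operand:
    kind: str                   # e.g. "input", "output", "target", "immediate"
    size_bits: int              # operand size in bits
    base: Optional[str] = None  # register name such as "eax", or None
    disp: int = 0               # displacement, or the value of an immediate


def eval_operand(op, regs):
    """Return regs[base] + disp, or just disp when there is no register."""
    if op.base is None:
        return op.disp
    return regs[op.base] + op.disp


regs = {"eax": 0x1000}  # stand-in for a struct pt_regs snapshot
operands = [
    Operand("input", 32, base="eax"),          # %eax
    Operand("input", 32, base="eax", disp=4),  # 4(%eax)
    Operand("immediate", 32, disp=4),          # 4
]
for op in operands:
    print(op.kind, op.size_bits, hex(eval_operand(op, regs)))
```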
> 
> Hmm, it's an interesting idea. I think operand classification can be done
> by evaluating the opcode and mod/rm.
> 
> > Both kmemcheck and mmiotrace could gain SMP support with instruction
> > emulation, though it is strictly not necessary. In that case, though,
> > we would not want to emulate fault handling, etc. (i.e. the fault
> > should always be generated by the CPU itself).

Not just emulation but address diversion, i.e. modifying the operation
(not the text) before executing it. Mmiotrace could do something like
this:
1. a blob calls ioremap
2. mmiotrace maps the MMIO area privately
3. the blob receives a dummy map from ioremap that will generate
a page fault
4. the blob accesses the dummy map and raises a page fault
5. pf handler detects the dummy map
6. mmiotrace pf handler emulates the instruction and replaces the
dummy address with the real MMIO address.
7. mmiotrace records the operation and the datum
8. go to step 4, or whatever

This means mmiotrace would not have to fiddle with the page
tables and page presence bits like it does now. As said, this
would make mmiotrace SMP-proof, and also eliminate the die notifier
(used for the instruction single stepping trap).
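
The steps above can be sketched as a small user-space simulation. This is
only an illustration of the bookkeeping, not mmiotrace code: MmioTracer and
its methods are hypothetical names, a dict stands in for the device
registers, and the dummy base address is arbitrary.

```python
class MmioTracer:
    def __init__(self):
        self.dummy_to_real = {}       # dummy base -> (real base, length)
        self.trace = []               # recorded (op, real address, value)
        self.next_dummy = 0xDEAD0000  # arbitrary range that always "faults"

    def ioremap(self, real_base, length):
        """Steps 1-3: keep the real mapping private, hand out a dummy base."""
        dummy = self.next_dummy
        self.next_dummy += length
        self.dummy_to_real[dummy] = (real_base, length)
        return dummy

    def divert(self, addr):
        """Step 5: translate a faulting dummy address, or None if not ours."""
        for dummy, (real, length) in self.dummy_to_real.items():
            if dummy <= addr < dummy + length:
                return real + (addr - dummy)
        return None

    def handle_fault(self, op, addr, device, value=None):
        """Steps 6-7: do the access at the real address and record it."""
        real = self.divert(addr)
        if real is None:
            raise LookupError("fault outside any dummy map")
        if op == "write":
            device[real] = value
        else:
            value = device.get(real, 0)
        self.trace.append((op, real, value))
        return value


tracer = MmioTracer()
device = {}                                       # stand-in for MMIO registers
base = tracer.ioremap(0xFE000000, 0x1000)         # the blob only sees `base`
tracer.handle_fault("write", base + 8, device, 0x55)
tracer.handle_fault("read", base + 8, device)
print(tracer.trace)
```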

IMO a big step from a hack to a tool. Getting rid of the custom
instruction parser in mmiotrace would be a good step in itself.

Avi Kivity noted that the KVM emulator does almost everything. Does
it also allow address diversion?

I haven't looked at the KVM emulator since something like 2.6.25 or
so, and I probably don't have time to work with it anyway, but
I am very interested to hear how things evolve.


Thanks.

-- 
Pekka Paalanen
http://www.iki.fi/pq/


* Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
  2009-04-05 19:37               ` Pekka Paalanen
@ 2009-04-06  7:53                 ` Avi Kivity
  0 siblings, 0 replies; 19+ messages in thread
From: Avi Kivity @ 2009-04-06  7:53 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Masami Hiramatsu, Vegard Nossum, Ingo Molnar, H. Peter Anvin,
	Frederic Weisbecker, Steven Rostedt, Ananth N Mavinakayanahalli,
	Andrew Morton, Andi Kleen, Jim Keniston, kvm, systemtap-ml, LKML

Pekka Paalanen wrote:
> Not just emulation but address diversion, i.e. modifying the operation
> (not the text) before executing it. Mmiotrace could do something like
> this:
> 1. a blob calls ioremap
> 2. mmiotrace maps the MMIO area privately
> 3. the blob receives a dummy map from ioremap, that will generate
> page fault
> 4. the blob accesses the dummy map and raises a page fault
> 5. pf handler detects the dummy map
> 6. mmiotrace pf handler emulates the instruction and replaces the
> dummy address with the real MMIO address.
> 7. mmiotrace records the operation and the datum
> 8. go to step 4, or whatever
>
> This means mmiotrace would not have to fiddle with the page
> tables and page presence bits like it does now. As said, this
> would make mmiotrace SMP-proof, and also eliminate the die notifier
> (used for the instruction single stepping trap).
>
> IMO a big step from a hack to a tool. Getting rid of the custom
> instruction parser in mmiotrace would be a good step in itself.
>
> Avi Kivity noted, that the KVM emulator does almost everything. Does
> it allow also address diversion?
>   

Operand access is by means of a callback, so yes.  In kvm's use, it's 
used to access guest memory, so it modifies the addresses before reading 
or writing.
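
A minimal sketch of callback-based operand access, assuming nothing about
the real x86_emulate.c interface: emulate_store, emulate_load and
make_callbacks are hypothetical names. The point is only that the
emulation routines never touch memory directly, so the caller decides how
an address is translated.

```python
def make_callbacks(backing, translate):
    """Build read/write callbacks that translate every address first."""
    def read_mem(addr):
        return backing[translate(addr)]

    def write_mem(addr, value):
        backing[translate(addr)] = value

    return read_mem, write_mem


def emulate_store(regs, src, base, disp, write_mem):
    """Emulate 'mov %src, disp(%base)' using only the write callback."""
    write_mem(regs[base] + disp, regs[src])


def emulate_load(regs, dst, base, disp, read_mem):
    """Emulate 'mov disp(%base), %dst' using only the read callback."""
    regs[dst] = read_mem(regs[base] + disp)


backing = {}  # stand-in for the memory actually being accessed
read_mem, write_mem = make_callbacks(backing, lambda addr: addr + 0x1000)
regs = {"eax": 7, "ebx": 0x20, "ecx": 0}
emulate_store(regs, "eax", "ebx", 4, write_mem)
emulate_load(regs, "ecx", "ebx", 4, read_mem)
print(backing, regs["ecx"])
```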

-- 
error compiling committee.c: too many arguments to function


Thread overview: 19+ messages
2009-04-02 17:24 [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer Masami Hiramatsu
2009-04-03 11:26 ` Ingo Molnar
2009-04-03 11:32   ` Andi Kleen
2009-04-03 11:50   ` Avi Kivity
2009-04-03 12:12     ` Ingo Molnar
2009-04-03 12:17       ` Avi Kivity
2009-04-03 12:26         ` Ingo Molnar
2009-04-03 12:33           ` Avi Kivity
2009-04-03 13:16           ` Vegard Nossum
2009-04-03 13:40             ` Avi Kivity
2009-04-03 13:52             ` Masami Hiramatsu
2009-04-05 19:37               ` Pekka Paalanen
2009-04-06  7:53                 ` Avi Kivity
2009-04-03 12:25       ` Andi Kleen
2009-04-03 14:21     ` Masami Hiramatsu
2009-04-03 14:23       ` Ingo Molnar
2009-04-03 16:55         ` Masami Hiramatsu
2009-04-03 17:59           ` Jim Keniston
2009-04-03 14:30       ` Avi Kivity
