* [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux
@ 2014-03-28 14:44 Jovi Zhangwei
  2014-03-28 14:44 ` [PATCH v2 01/29] ktap: add tools/ktap/README.md file Jovi Zhangwei
                   ` (29 more replies)
  0 siblings, 30 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:44 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Hi All,

The following set of patches adds the ktap tracing tool.

v2:
- move kernel module into kernel/trace/ktap/, reviewed by GregKH.
- move ktap include files into include/uapi/ktap/
- minor cleanups in ktap.c, reviewed by Andi Kleen.

ktap is a new script-based dynamic tracing tool for Linux.
It uses a scripting language and lets the user trace the system dynamically.

Feature highlights:
* a simple but powerful scripting language
* register-based interpreter (heavily optimized) in the Linux kernel
* small and lightweight
* does not depend on the GCC toolchain for each script run
* easy to use in embedded environments without debugging info
* support for tracepoint, kprobe, uprobe, function trace, timer, and more
* supported on x86, ARM, PowerPC, MIPS
* runs safely in a sandbox

Simple examples:
1). simplest one-liner command to enable all tracepoints
        ktap -e "trace *:* { print(argstr) }"

2). syscall tracing on target process
        ktap -e "trace syscalls:* { print(argstr) }" -- ls

3). simple syscall tracing
        ktap -e "trace syscalls:* { print(cpu, pid, execname, argstr) }"

4). ftrace
        ktap -e "trace ftrace:function /ip==mutex*/ { print(argstr) }"

5). syscall tracing in histogram style
        var s = {}

        trace syscalls:sys_enter_* {
                s[probename] += 1
        }

        trace_end {
                print_hist(s)
        }

6). kprobe tracing
        trace probe:do_sys_open dfd=%di fname=%dx flags=%cx mode=+4($stack) {
                print("entry:", execname, argstr)
        }

        trace probe:do_sys_open%return fd=$retval {
                print("exit:", execname, argstr)
        }

7). uprobe tracing
        trace probe:/lib/libc.so.6:malloc {
                print("entry:", execname, argstr)
        }

        trace probe:/lib/libc.so.6:malloc%return {
                print("exit:", execname, argstr)
        }

8). stapsdt tracing (userspace static marker)
        trace sdt:/lib64/libc.so.6:lll_futex_wake {
                print("lll_futex_wake", execname, argstr)
        }

        or:

        # trace all static markers in libc
        trace sdt:/lib64/libc.so.6:* {
                print(execname, argstr)
        }

9). timer
        tick-1ms {
                printf("time fired on one cpu\n");
        }

        profile-2s {
                printf("time fired on every cpu\n");
        }
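
As a variation on example 5, the same table idiom should also work with a
different key. The sketch below counts syscalls per process name rather than
per probe. It is a sketch only: it assumes execname (used in example 3) is
accepted as a table key in the same way as probename.

```
var s = {}

trace syscalls:sys_enter_* {
        s[execname] += 1
}

trace_end {
        print_hist(s)
}
```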


This patchset contains:

1. README and Tutorial
2. ktap kernel module and userspace binary
3. sample scripts
4. test suite

The code can be pulled from:
        http://github.com/ktap/linux.git upstream

The merit of putting this software in the kernel tree is
that it makes feedback from users more likely,
which helps polish the code.

Thank you.

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>

Jovi Zhangwei (29):
  ktap: add tools/ktap/README.md file
  ktap: add ktap tutorial(tools/ktap/doc/tutorial.md)
  ktap: add sample scripts(tools/ktap/samples/*)
  ktap: add basic ktap types definition(include/uapi/ktap/ktap_types.h)
  ktap: add bytecode definition(include/uapi/ktap/ktap_bc.h)
  ktap: add ktap_arch.h and error header file(include/uapi/ktap/)
  ktap: add kernel module main entry(kernel/trace/ktap/ktap.[c|h])
  ktap: add bytecode reader(kernel/trace/ktap/kp_bcread.[c|h])
  ktap: add bytecode execution engine(kernel/trace/ktap/kp_vm.[c|h])
  ktap: add string handling
    code(kernel/trace/ktap/kp_[str|mempool].[c|h])
  ktap: add table handling code(kernel/trace/ktap/kp_tab.[c|h])
  ktap: add generic object handling code(kernel/trace/ktap/kp_obj.[c|h])
  ktap: add ring buffer handling
    code(kernel/trace/ktap/kp_transport.[c|h])
  ktap: add events management(kernel/trace/ktap/kp_events.[c|h])
  ktap: add built-in functions and library(kernel/trace/ktap/lib_*.c)
  ktap: add amalgamation build(kernel/trace/ktap/amalg.c)
  ktap: add Makefile for kernel module(kernel/trace/ktap/Makefile)
  ktap: add Kconfig(kernel/trace/ktap/Kconfig)
  ktap: add main file for ktap binary(tools/ktap/kp_main.c)
  ktap: add compiler(tools/ktap/kp_[lex|parse].[c|h])
  ktap: add symbol handling code(tools/ktap/symbol.[c|h])
  ktap: add events parse code(tools/ktap/kp_parse_events.c)
  ktap: add ring buffer reader(tools/ktap/kp_reader.c)
  ktap: add bytecode writer(tools/ktap/kp_bcwrite.c)
  ktap: add userspace util(tools/ktap/kp_util.c)
  ktap: add userspace binary Makefile(tools/ktap/Makefile)
  ktap: add testsuite and benchmark(tools/ktap/test/*)
  ktap: add vim syntax file(tools/ktap/vim/*)
  ktap: add COPYRIGHT file(tools/ktap/COPYRIGHT)

 include/uapi/ktap/ktap_arch.h                      |   33 +
 include/uapi/ktap/ktap_bc.h                        |  369 +++
 include/uapi/ktap/ktap_err.h                       |   11 +
 include/uapi/ktap/ktap_errmsg.h                    |  135 +
 include/uapi/ktap/ktap_types.h                     |  462 +++
 kernel/trace/ktap/Kconfig                          |   21 +
 kernel/trace/ktap/Makefile                         |   50 +
 kernel/trace/ktap/amalg.c                          |   37 +
 kernel/trace/ktap/kp_bcread.c                      |  429 +++
 kernel/trace/ktap/kp_bcread.h                      |    6 +
 kernel/trace/ktap/kp_events.c                      |  832 ++++++
 kernel/trace/ktap/kp_events.h                      |   71 +
 kernel/trace/ktap/kp_mempool.c                     |   94 +
 kernel/trace/ktap/kp_mempool.h                     |    8 +
 kernel/trace/ktap/kp_obj.c                         |  281 ++
 kernel/trace/ktap/kp_obj.h                         |   19 +
 kernel/trace/ktap/kp_str.c                         |  360 +++
 kernel/trace/ktap/kp_str.h                         |   13 +
 kernel/trace/ktap/kp_tab.c                         |  842 ++++++
 kernel/trace/ktap/kp_tab.h                         |   59 +
 kernel/trace/ktap/kp_transport.c                   |  649 ++++
 kernel/trace/ktap/kp_transport.h                   |   13 +
 kernel/trace/ktap/kp_vm.c                          | 1754 +++++++++++
 kernel/trace/ktap/kp_vm.h                          |   43 +
 kernel/trace/ktap/ktap.c                           |  255 ++
 kernel/trace/ktap/ktap.h                           |  176 ++
 kernel/trace/ktap/lib_ansi.c                       |  142 +
 kernel/trace/ktap/lib_base.c                       |  407 +++
 kernel/trace/ktap/lib_kdebug.c                     |  195 ++
 kernel/trace/ktap/lib_net.c                        |  107 +
 kernel/trace/ktap/lib_table.c                      |   58 +
 kernel/trace/ktap/lib_timer.c                      |  210 ++
 tools/ktap/COPYRIGHT                               |   63 +
 tools/ktap/Makefile                                |  130 +
 tools/ktap/README.md                               |  149 +
 tools/ktap/doc/tutorial.md                         |  666 +++++
 tools/ktap/kp_bcwrite.c                            |  375 +++
 tools/ktap/kp_lex.c                                |  552 ++++
 tools/ktap/kp_lex.h                                |   94 +
 tools/ktap/kp_main.c                               |  443 +++
 tools/ktap/kp_parse.c                              | 3139 ++++++++++++++++++++
 tools/ktap/kp_parse.h                              |    4 +
 tools/ktap/kp_parse_events.c                       |  798 +++++
 tools/ktap/kp_reader.c                             |  106 +
 tools/ktap/kp_symbol.c                             |  360 +++
 tools/ktap/kp_symbol.h                             |   50 +
 tools/ktap/kp_util.c                               |  646 ++++
 tools/ktap/kp_util.h                               |  120 +
 tools/ktap/samples/ansi/ansi_color_demo.kp         |   22 +
 tools/ktap/samples/basic/backtrace.kp              |    6 +
 tools/ktap/samples/basic/event_trigger.kp          |   27 +
 tools/ktap/samples/basic/event_trigger_ftrace.kp   |   27 +
 tools/ktap/samples/basic/ftrace.kp                 |    8 +
 tools/ktap/samples/basic/function_time.kp          |   62 +
 tools/ktap/samples/basic/kretprobe.kp              |    6 +
 tools/ktap/samples/basic/memcpy_memset.kp          |   23 +
 tools/ktap/samples/game/tetris.kp                  |  297 ++
 tools/ktap/samples/helloworld.kp                   |    3 +
 tools/ktap/samples/interrupt/hardirq_time.kp       |   25 +
 tools/ktap/samples/interrupt/softirq_time.kp       |   24 +
 tools/ktap/samples/io/kprobes-do-sys-open.kp       |   20 +
 tools/ktap/samples/io/traceio.kp                   |   61 +
 tools/ktap/samples/mem/kmalloc-stack.kp            |   12 +
 tools/ktap/samples/mem/kmem_count.kp               |   29 +
 tools/ktap/samples/network/tcp_ipaddr.kp           |   20 +
 tools/ktap/samples/profiling/function_profiler.kp  |   41 +
 .../profiling/kprobe_all_kernel_functions.kp       |   13 +
 tools/ktap/samples/profiling/stack_profile.kp      |   27 +
 tools/ktap/samples/schedule/sched_transition.kp    |    5 +
 tools/ktap/samples/schedule/schedtimes.kp          |  131 +
 tools/ktap/samples/syscalls/errinfo.kp             |  145 +
 tools/ktap/samples/syscalls/execve.kp              |    9 +
 tools/ktap/samples/syscalls/opensnoop.kp           |   31 +
 tools/ktap/samples/syscalls/sctop.kp               |   13 +
 tools/ktap/samples/syscalls/syscalls.kp            |    6 +
 tools/ktap/samples/syscalls/syscalls_count.kp      |   54 +
 .../samples/syscalls/syscalls_count_by_proc.kp     |   22 +
 tools/ktap/samples/syscalls/syslatl.kp             |   33 +
 tools/ktap/samples/syscalls/syslist.kp             |   31 +
 tools/ktap/samples/tracepoints/eventcount.kp       |  210 ++
 .../ktap/samples/tracepoints/eventcount_by_proc.kp |   57 +
 tools/ktap/samples/tracepoints/raw_tracepoint.kp   |   15 +
 tools/ktap/samples/tracepoints/tracepoints.kp      |    6 +
 tools/ktap/samples/userspace/gcc_unwind.kp         |    9 +
 tools/ktap/samples/userspace/glibc_func_hist.kp    |   44 +
 tools/ktap/samples/userspace/glibc_sdt.kp          |   11 +
 tools/ktap/samples/userspace/glibc_trace.kp        |   11 +
 tools/ktap/samples/userspace/malloc_free.kp        |   20 +
 tools/ktap/samples/userspace/malloc_size_hist.kp   |   22 +
 tools/ktap/samples/userspace/pthread.kp            |    8 +
 tools/ktap/test/README                             |   69 +
 tools/ktap/test/arithmetic.t                       |  109 +
 tools/ktap/test/benchmark/cmp_neq.sh               |  158 +
 tools/ktap/test/benchmark/cmp_profile.sh           |   54 +
 tools/ktap/test/benchmark/cmp_table.sh             |  112 +
 tools/ktap/test/benchmark/sembench.c               |  556 ++++
 tools/ktap/test/cli-arg.t                          |   25 +
 tools/ktap/test/concat.t                           |   21 +
 tools/ktap/test/count.t                            |   25 +
 tools/ktap/test/deadloop.t                         |   37 +
 tools/ktap/test/fibonacci.t                        |   42 +
 tools/ktap/test/function.t                         |   78 +
 tools/ktap/test/if.t                               |   32 +
 tools/ktap/test/kprobe.t                           |   82 +
 tools/ktap/test/kretprobe.t                        |   35 +
 tools/ktap/test/len.t                              |   27 +
 tools/ktap/test/lib/Test/ktap.pm                   |  128 +
 tools/ktap/test/looping.t                          |   46 +
 tools/ktap/test/one-liner.t                        |   48 +
 tools/ktap/test/pairs.t                            |   52 +
 tools/ktap/test/stack_overflow.t                   |   22 +
 tools/ktap/test/syntax-err.t                       |   19 +
 tools/ktap/test/table.t                            |   81 +
 tools/ktap/test/time.t                             |   59 +
 tools/ktap/test/timer.t                            |   65 +
 tools/ktap/test/tracepoint.t                       |   53 +
 tools/ktap/test/util/reindex                       |   61 +
 tools/ktap/test/zerodivide.t                       |   21 +
 tools/ktap/vim/ftdetect/ktap.vim                   |    3 +
 tools/ktap/vim/syntax/ktap.vim                     |  106 +
 120 files changed, 19708 insertions(+)
 create mode 100644 include/uapi/ktap/ktap_arch.h
 create mode 100644 include/uapi/ktap/ktap_bc.h
 create mode 100644 include/uapi/ktap/ktap_err.h
 create mode 100644 include/uapi/ktap/ktap_errmsg.h
 create mode 100644 include/uapi/ktap/ktap_types.h
 create mode 100644 kernel/trace/ktap/Kconfig
 create mode 100644 kernel/trace/ktap/Makefile
 create mode 100644 kernel/trace/ktap/amalg.c
 create mode 100644 kernel/trace/ktap/kp_bcread.c
 create mode 100644 kernel/trace/ktap/kp_bcread.h
 create mode 100644 kernel/trace/ktap/kp_events.c
 create mode 100644 kernel/trace/ktap/kp_events.h
 create mode 100644 kernel/trace/ktap/kp_mempool.c
 create mode 100644 kernel/trace/ktap/kp_mempool.h
 create mode 100644 kernel/trace/ktap/kp_obj.c
 create mode 100644 kernel/trace/ktap/kp_obj.h
 create mode 100644 kernel/trace/ktap/kp_str.c
 create mode 100644 kernel/trace/ktap/kp_str.h
 create mode 100644 kernel/trace/ktap/kp_tab.c
 create mode 100644 kernel/trace/ktap/kp_tab.h
 create mode 100644 kernel/trace/ktap/kp_transport.c
 create mode 100644 kernel/trace/ktap/kp_transport.h
 create mode 100644 kernel/trace/ktap/kp_vm.c
 create mode 100644 kernel/trace/ktap/kp_vm.h
 create mode 100644 kernel/trace/ktap/ktap.c
 create mode 100644 kernel/trace/ktap/ktap.h
 create mode 100644 kernel/trace/ktap/lib_ansi.c
 create mode 100644 kernel/trace/ktap/lib_base.c
 create mode 100644 kernel/trace/ktap/lib_kdebug.c
 create mode 100644 kernel/trace/ktap/lib_net.c
 create mode 100644 kernel/trace/ktap/lib_table.c
 create mode 100644 kernel/trace/ktap/lib_timer.c
 create mode 100644 tools/ktap/COPYRIGHT
 create mode 100644 tools/ktap/Makefile
 create mode 100644 tools/ktap/README.md
 create mode 100644 tools/ktap/doc/tutorial.md
 create mode 100644 tools/ktap/kp_bcwrite.c
 create mode 100644 tools/ktap/kp_lex.c
 create mode 100644 tools/ktap/kp_lex.h
 create mode 100644 tools/ktap/kp_main.c
 create mode 100644 tools/ktap/kp_parse.c
 create mode 100644 tools/ktap/kp_parse.h
 create mode 100644 tools/ktap/kp_parse_events.c
 create mode 100644 tools/ktap/kp_reader.c
 create mode 100644 tools/ktap/kp_symbol.c
 create mode 100644 tools/ktap/kp_symbol.h
 create mode 100644 tools/ktap/kp_util.c
 create mode 100644 tools/ktap/kp_util.h
 create mode 100644 tools/ktap/samples/ansi/ansi_color_demo.kp
 create mode 100644 tools/ktap/samples/basic/backtrace.kp
 create mode 100644 tools/ktap/samples/basic/event_trigger.kp
 create mode 100644 tools/ktap/samples/basic/event_trigger_ftrace.kp
 create mode 100644 tools/ktap/samples/basic/ftrace.kp
 create mode 100644 tools/ktap/samples/basic/function_time.kp
 create mode 100644 tools/ktap/samples/basic/kretprobe.kp
 create mode 100644 tools/ktap/samples/basic/memcpy_memset.kp
 create mode 100644 tools/ktap/samples/game/tetris.kp
 create mode 100644 tools/ktap/samples/helloworld.kp
 create mode 100644 tools/ktap/samples/interrupt/hardirq_time.kp
 create mode 100644 tools/ktap/samples/interrupt/softirq_time.kp
 create mode 100644 tools/ktap/samples/io/kprobes-do-sys-open.kp
 create mode 100644 tools/ktap/samples/io/traceio.kp
 create mode 100644 tools/ktap/samples/mem/kmalloc-stack.kp
 create mode 100644 tools/ktap/samples/mem/kmem_count.kp
 create mode 100644 tools/ktap/samples/network/tcp_ipaddr.kp
 create mode 100644 tools/ktap/samples/profiling/function_profiler.kp
 create mode 100644 tools/ktap/samples/profiling/kprobe_all_kernel_functions.kp
 create mode 100644 tools/ktap/samples/profiling/stack_profile.kp
 create mode 100644 tools/ktap/samples/schedule/sched_transition.kp
 create mode 100644 tools/ktap/samples/schedule/schedtimes.kp
 create mode 100644 tools/ktap/samples/syscalls/errinfo.kp
 create mode 100644 tools/ktap/samples/syscalls/execve.kp
 create mode 100644 tools/ktap/samples/syscalls/opensnoop.kp
 create mode 100644 tools/ktap/samples/syscalls/sctop.kp
 create mode 100644 tools/ktap/samples/syscalls/syscalls.kp
 create mode 100644 tools/ktap/samples/syscalls/syscalls_count.kp
 create mode 100644 tools/ktap/samples/syscalls/syscalls_count_by_proc.kp
 create mode 100644 tools/ktap/samples/syscalls/syslatl.kp
 create mode 100644 tools/ktap/samples/syscalls/syslist.kp
 create mode 100644 tools/ktap/samples/tracepoints/eventcount.kp
 create mode 100644 tools/ktap/samples/tracepoints/eventcount_by_proc.kp
 create mode 100644 tools/ktap/samples/tracepoints/raw_tracepoint.kp
 create mode 100644 tools/ktap/samples/tracepoints/tracepoints.kp
 create mode 100644 tools/ktap/samples/userspace/gcc_unwind.kp
 create mode 100644 tools/ktap/samples/userspace/glibc_func_hist.kp
 create mode 100644 tools/ktap/samples/userspace/glibc_sdt.kp
 create mode 100644 tools/ktap/samples/userspace/glibc_trace.kp
 create mode 100644 tools/ktap/samples/userspace/malloc_free.kp
 create mode 100644 tools/ktap/samples/userspace/malloc_size_hist.kp
 create mode 100644 tools/ktap/samples/userspace/pthread.kp
 create mode 100644 tools/ktap/test/README
 create mode 100644 tools/ktap/test/arithmetic.t
 create mode 100644 tools/ktap/test/benchmark/cmp_neq.sh
 create mode 100644 tools/ktap/test/benchmark/cmp_profile.sh
 create mode 100644 tools/ktap/test/benchmark/cmp_table.sh
 create mode 100644 tools/ktap/test/benchmark/sembench.c
 create mode 100644 tools/ktap/test/cli-arg.t
 create mode 100644 tools/ktap/test/concat.t
 create mode 100644 tools/ktap/test/count.t
 create mode 100644 tools/ktap/test/deadloop.t
 create mode 100644 tools/ktap/test/fibonacci.t
 create mode 100644 tools/ktap/test/function.t
 create mode 100644 tools/ktap/test/if.t
 create mode 100644 tools/ktap/test/kprobe.t
 create mode 100644 tools/ktap/test/kretprobe.t
 create mode 100644 tools/ktap/test/len.t
 create mode 100644 tools/ktap/test/lib/Test/ktap.pm
 create mode 100644 tools/ktap/test/looping.t
 create mode 100644 tools/ktap/test/one-liner.t
 create mode 100644 tools/ktap/test/pairs.t
 create mode 100644 tools/ktap/test/stack_overflow.t
 create mode 100644 tools/ktap/test/syntax-err.t
 create mode 100644 tools/ktap/test/table.t
 create mode 100644 tools/ktap/test/time.t
 create mode 100644 tools/ktap/test/timer.t
 create mode 100644 tools/ktap/test/tracepoint.t
 create mode 100755 tools/ktap/test/util/reindex
 create mode 100644 tools/ktap/test/zerodivide.t
 create mode 100644 tools/ktap/vim/ftdetect/ktap.vim
 create mode 100644 tools/ktap/vim/syntax/ktap.vim

-- 
1.8.3.1



* [PATCH v2 01/29] ktap: add tools/ktap/README.md file
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
@ 2014-03-28 14:44 ` Jovi Zhangwei
  2014-03-28 14:44 ` [PATCH v2 02/29] ktap: add ktap tutorial(tools/ktap/doc/tutorial.md) Jovi Zhangwei
                   ` (28 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:44 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/README.md | 149 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 149 insertions(+)
 create mode 100644 tools/ktap/README.md

diff --git a/tools/ktap/README.md b/tools/ktap/README.md
new file mode 100644
index 0000000..66dc2d1
--- /dev/null
+++ b/tools/ktap/README.md
@@ -0,0 +1,149 @@
+# ktap
+
+A New Scripting Dynamic Tracing Tool For Linux  
+[www.ktap.org][homepage]
+
+ktap is a new scripting dynamic tracing tool for Linux.
+It uses a scripting language and lets users trace the Linux kernel dynamically.
+ktap is designed to give operational insights with interoperability
+that allows users to tune, troubleshoot and extend the kernel and applications.
+It's similar to Linux SystemTap and Solaris DTrace.
+
+ktap's design differs from mainstream Linux dynamic tracing languages in that
+it is based on bytecode: it doesn't depend on GCC, doesn't require compiling a
+kernel module for each script, and is safe to use in production environments,
+which also meets the embedded ecosystem's tracing needs.
+
+More information can be found at [ktap homepage][homepage].
+
+[homepage]: http://www.ktap.org
+
+## Highlights
+
+* a simple but powerful scripting language
+* register-based interpreter (heavily optimized) in the Linux kernel
+* small and lightweight
+* does not depend on the GCC toolchain for each script run
+* easy to use in embedded environments without debugging info
+* support for tracepoint, kprobe, uprobe, function trace, timer, and more
+* supported on x86, ARM, PowerPC, MIPS
+* runs safely in a sandbox
+
+
+## Building & Running
+
+1. Clone ktap from GitHub
+
+        $ git clone http://github.com/ktap/ktap.git
+2. Compile ktap
+
+        $ cd ktap
+        $ make       # generate the ktapvm kernel module and ktap binary
+3. Load the ktapvm kernel module (make sure debugfs is mounted)
+
+        $ make load  # needs root or sudo access
+4. Run ktap
+
+        $ ./ktap samples/helloworld.kp
+
+
+## Examples
+
+1. simplest one-liner command to enable all tracepoints
+
+        ktap -e "trace *:* { print(argstr) }"
+2. syscall tracing on target process
+
+        ktap -e "trace syscalls:* { print(argstr) }" -- ls
+3. ftrace (requires kernel 3.3 or newer, compiled with CONFIG_FUNCTION_TRACER)
+
+        ktap -e "trace ftrace:function { print(argstr) }"
+
+        ktap -e "trace ftrace:function /ip==mutex*/ { print(argstr) }"
+4. simple syscall tracing
+
+        trace syscalls:* {
+                print(cpu, pid, execname, argstr)
+        }
+5. syscall tracing in histogram style
+
+        var s = {}
+
+        trace syscalls:sys_enter_* {
+                s[probename] += 1
+        }
+
+        trace_end {
+                print_hist(s)
+        }
+6. kprobe tracing
+
+        trace probe:do_sys_open dfd=%di fname=%dx flags=%cx mode=+4($stack) {
+                print("entry:", execname, argstr)
+        }
+
+        trace probe:do_sys_open%return fd=$retval {
+                print("exit:", execname, argstr)
+        }
+7. uprobe tracing
+
+        trace probe:/lib/libc.so.6:malloc {
+                print("entry:", execname, argstr)
+        }
+
+        trace probe:/lib/libc.so.6:malloc%return {
+                print("exit:", execname, argstr)
+        }
+8. stapsdt tracing (userspace static marker)
+
+        trace sdt:/lib64/libc.so.6:lll_futex_wake {
+                print("lll_futex_wake", execname, argstr)
+        }
+
+        or:
+
+        # trace all static markers in libc
+        trace sdt:/lib64/libc.so.6:* {
+                print(execname, argstr)
+        }
+9. timer
+
+        tick-1ms {
+                printf("time fired on one cpu\n");
+        }
+
+        profile-2s {
+                printf("time fired on every cpu\n");
+        }
+
+More examples can be found in the [samples][samples_dir] directory.
+
+[samples_dir]: https://github.com/ktap/ktap/tree/master/samples
+
+## Mailing list
+
+ktap@freelists.org  
+You can subscribe to the ktap mailing list at the link below (subscribe before posting):
+http://www.freelists.org/list/ktap
+
+
+## Copyright and License
+
+ktap is licensed under GPL v2
+
+Copyright (C) 2012-2014, Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+All rights reserved.
+
+
+## Contribution
+
+ktap is still under active development, so contributions are welcome.
+You are encouraged to report bugs, provide feedback, send feature requests,
+or hack on it.
+
+
+## See More
+
+More info can be found in the [documentation][tutorial].
+
+[tutorial]: http://www.ktap.org/doc/tutorial.html
-- 
1.8.1.4



* [PATCH v2 02/29] ktap: add ktap tutorial(tools/ktap/doc/tutorial.md)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
  2014-03-28 14:44 ` [PATCH v2 01/29] ktap: add tools/ktap/README.md file Jovi Zhangwei
@ 2014-03-28 14:44 ` Jovi Zhangwei
  2014-03-28 14:44 ` [PATCH v2 03/29] ktap: add sample scripts(tools/ktap/samples/*) Jovi Zhangwei
                   ` (27 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:44 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

This is detailed documentation for ktap users. It contains:

- Basic introduction
- Requirements
- Language Syntax basics
- Built-in functions, libraries and variables
- Simple samples
- Design decisions
- References

This tutorial is still being updated; it should serve as a good
guide to ktap.

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/doc/tutorial.md | 666 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 666 insertions(+)
 create mode 100644 tools/ktap/doc/tutorial.md

diff --git a/tools/ktap/doc/tutorial.md b/tools/ktap/doc/tutorial.md
new file mode 100644
index 0000000..d9712cd
--- /dev/null
+++ b/tools/ktap/doc/tutorial.md
@@ -0,0 +1,666 @@
+% The ktap Tutorial
+
+# Introduction
+
+ktap is a new script-based dynamic tracing tool for Linux
+http://www.ktap.org
+
+ktap is a new script-based dynamic tracing tool for Linux.
+It uses a scripting language and lets the user trace the Linux kernel dynamically.
+ktap is designed to give operational insights with interoperability
+that allows users to tune, troubleshoot and extend the kernel and applications.
+It's similar to Linux SystemTap and Solaris DTrace.
+
+ktap's design differs from mainstream Linux dynamic tracing languages in that
+it is based on bytecode: it doesn't depend on GCC, doesn't require compiling a
+kernel module for each script, and is safe to use in production environments,
+which also meets the embedded ecosystem's tracing needs.
+
+Feature highlights:
+
+* a simple but powerful scripting language
+* register-based interpreter (heavily optimized) in the Linux kernel
+* small and lightweight
+* does not depend on the GCC toolchain for each script run
+* easy to use in embedded environments without debugging info
+* support for tracepoint, kprobe, uprobe, function trace, timer, and more
+* supported on x86, ARM, PowerPC, MIPS
+* runs safely in a sandbox
+
+# Getting started
+
+Requirements
+
+* Linux 3.1 or later (patches are required for earlier versions)
+* `CONFIG_EVENT_TRACING` enabled
+* `CONFIG_PERF_EVENTS` enabled
+* `CONFIG_DEBUG_FS` enabled
+
+     Make sure debugfs is mounted before running `insmod ktapvm`.
+
+     To mount debugfs: `mount -t debugfs none /sys/kernel/debug/`
+* libelf (optional)
+     Install elfutils-libelf-devel on RHEL-based distros, or libelf-dev on
+     Debian-based distros.
+     Use `make NO_LIBELF=1` to build without libelf support.
+     libelf is required for resolving symbols to addresses in DSO, and for SDT.
+
+Note that these options are normally enabled in Linux distributions
+such as RHEL, Fedora, and Ubuntu.
+
+1. Clone ktap from GitHub
+
+        $ git clone http://github.com/ktap/ktap.git
+2. Compile ktap
+
+        $ cd ktap
+        $ make       # generate the ktapvm kernel module and ktap binary
+3. Load the ktapvm kernel module (make sure debugfs is mounted)
+
+        $ make load  # needs root or sudo access
+4. Run ktap
+
+        $ ./ktap samples/helloworld.kp
+
+
+# Language basics
+
+## Syntax basics
+
+ktap's syntax is designed with the C language syntax in mind. This is for lowering the entry barrier for C programmers who are working on the kernel or other systems software.
+
+* Variable declarations
+
+    The biggest syntax difference from C is that ktap is a dynamically typed language, so you don't need any variable type declarations; just use the variable.
+* Functions
+
+    All functions in ktap should be declared with the `function` keyword.
+* Comments
+
+    Comments in ktap start with `#`. Long comments are not supported right now.
+* Others
+
+    Semicolons (`;`) are optional at the end of statements. ktap uses a free-form syntax, so you may end a statement with `;` or leave it out.
+
+ktap uses `nil` as `NULL`. The result of an arithmetic operation on `nil` is also `nil`.
+
+ktap does not have array structures, and it does not have any pointer operations.
+
+## Control structures
+
+ktap's `if`/`else` statement is the same as the C language's.
+
+There are three kinds of for-loop in ktap:
+
+1. a Lua-like style:
+
+        for (i = init, limit, step) { body }
+2. the same form as in C:
+
+        for (i = init; i < limit; i += step) { body }
+3. Lua's table iteration style:
+
+        for (k, v in pairs(t)) { body } # loop over all elements of the table
+
+Note that, unlike C, ktap does not have the `continue` keyword.
+
+## Data structures
+
+Associative arrays are heavily used in ktap; they are also called "tables".
+
+Table declarations:
+
+    t = {}
+
+How to use tables:
+
+    t[1] = 1
+    t[1] = "xxx"
+    t["key"] = 10
+    t["key"] = "value"
+
+    for (k, v in pairs(t)) { body }   # loop over all elements of the table
+
+# Built-in functions and libraries
+
+## Built-in functions
+
+**print (...)**
+
+Receives any number of arguments, and prints their values. print is not intended for formatted output, but only as a quick way to show values, typically for debugging.
+
+For formatted output, use `printf` instead.
+
+**printf (fmt, ...)**
+
+Similar to C's `printf`, for formatted string output.
+
+**pairs (t)**
+
+Returns three values: the next function, the table t, and nil, so that the construction
+
+    for (k, v in pairs(t)) { body }
+
+will iterate through all the key-value pairs in the table `t`.
+
+**len (t) / len (s)**
+
+If the argument is a string, returns the length of the string.
+
+If the argument is a table, returns the number of key-value pairs in the table.
+
+**in_interrupt ()**
+
+Checks whether the current context is an interrupt context.
+
+**exit ()**
+
+Quits the ktap program, similar to the `exit` syscall.
+
+**arch ()**
+
+Returns the machine architecture, such as `x86` or `arm`.
+
+**kernel_v ()**
+
+Returns the Linux kernel version string, such as `3.9`.
+
+**user_string (addr)**
+
+Accepts a userspace address, reads the string data from userspace, and returns it as a ktap string value.
+
+**print_hist (t)**
+
+Accepts a table and outputs it to the user as a histogram.
+
+
+## Libraries
+
+### Kdebug Library
+
+**kdebug.trace_by_id (eventdef_info, eventfun)**
+
+This function is the underlying interface for the higher level tracing primitives.
+
+Note that the `eventdef_info` argument is just a C pointer value pointing to a userspace memory block holding the real `eventdef_info` structure. The structure definition is as follows:
+
+    struct ktap_eventdesc {
+        int nr;       /* number of ids in id_arr */
+        int *id_arr;  /* array of event ids */
+        char *filter;
+    };
+
+Those `id`s are read from `/sys/kernel/debug/tracing/events/$SYS/$EVENT/id`.
+
+The second argument, `eventfun`, is a ktap function object:
+
+    function eventfun () { action }
+
+**kdebug.trace_end (endfunc)**
+
+This function registers a function to be invoked when tracing ends. ktap waits until the user presses `CTRL-C` to stop tracing, then calls the argument, the `endfunc` function. The user can output tracing results in that function, or do other things.
+
+Users usually do not need to use the `kdebug` library directly; they can just use the `trace`/`trace_end` keywords provided by the language.
+
+### Timer Library
+
+### Table Library
+
+**table.new (narr, nrec)**
+
+pre-allocates a table with `narr` array entries and `nrec` records.
+
+# Linux tracing basics
+
+tracepoints, probe, timer, filters, ring buffer
+
+# Tracing semantics in ktap
+
+## Tracing block
+
+**trace EVENTDEF /FILTER/ { ACTION }**
+
+This is the basic tracing block in ktap. You need to use a specific `EVENTDEF` string, and your own event function.
+
+There are five types of `EVENTDEF`: tracepoints, ftrace function events, kprobes, uprobes, and SDT probes.
+
+- tracepoint:
+
+	EventDef               Description
+	--------------------   -------------------------------
+	syscalls:*             trace all syscalls events
+	syscalls:sys_enter_*   trace all syscalls entry events
+	kmem:*                 trace all kmem related events
+	sched:*                trace all sched related events
+	sched:sched_switch     trace sched_switch tracepoint
+	\*:\*                  trace all tracepoints in system
+
+	All tracepoint events are based on
+	
+	    /sys/kernel/debug/tracing/events/$SYS/$EVENT
+
+- ftrace (kernel 3.3+, and must be compiled with `CONFIG_FUNCTION_TRACER`)
+
+	EventDef               Description
+	--------------------   -------------------------------
+	ftrace:function        trace kernel functions based on ftrace
+
+	Use a filter (/ip==*/) to trace specific functions.
+	The functions must be listed in /sys/kernel/debug/tracing/available_filter_functions
+
+> ***Note*** on the function event
+> 
+> perf has supported the ftrace:function tracepoint since Linux 3.3 (see the
+> commit below). Because ktap is based on perf callbacks, this feature
+> requires kernel 3.3 or newer.
+> 
+>     commit ced39002f5ea736b716ae233fb68b26d59783912
+>     Author: Jiri Olsa <jolsa@redhat.com>
+>     Date:   Wed Feb 15 15:51:52 2012 +0100
+>
+>     ftrace, perf: Add support to use function tracepoint in perf 
+>
+
+- kprobe:
+
+	EventDef               Description
+	--------------------   -----------------------------------
+	probe:schedule         trace schedule function
+	probe:schedule%return  trace schedule function return
+	probe:SyS_write        trace SyS_write function
+	probe:vfs*             trace all functions matching vfs*
+
+	kprobe functions must be listed in /proc/kallsyms
+
+- uprobe:
+
+	EventDef                               Description
+	------------------------------------   ---------------------------
+	probe:/lib64/libc.so.6:malloc          trace malloc function
+	probe:/lib64/libc.so.6:malloc%return   trace malloc function return
+	probe:/lib64/libc.so.6:free            trace free function
+	probe:/lib64/libc.so.6:0x82000         trace function with file offset 0x82000
+	probe:/lib64/libc.so.6:*               trace all libc functions
+
+	Symbol resolution needs libelf support.
+
+- sdt:
+
+	EventDef                               Description
+	------------------------------------   --------------------------
+	sdt:/lib64/libc.so.6:lll_futex_wake    trace stapsdt lll_futex_wake
+	sdt:/lib64/libc.so.6:*                 trace all static markers in libc
+
+	SDT resolution needs libelf support.
+
+
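+**trace_end { ACTION }**
+
+The `trace_end` block runs its `ACTION` when tracing ends, for example after the user presses `CTRL-C`. It is typically used to output the collected results.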
+
+## Tracing Built-in variables
+
+**arg0..9**
+
+Evaluates to argument 0 to 9 of the event object. If fewer than ten arguments are passed to the current probe, the remaining variables return nil.
+
+> ***Note*** on arg offset
+>
+> The arg offset (0..9) is determined by the event format shown in debugfs.
+>
+>     # cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
+>     name: sched_switch
+>     ID: 268
+>     format:
+>         field:char prev_comm[32];         <- arg0
+>         field:pid_t prev_pid;             <- arg1
+>         field:int prev_prio;              <- arg2
+>         field:long prev_state;            <- arg3
+>         field:char next_comm[32];         <- arg4
+>         field:pid_t next_pid;             <- arg5
+>         field:int next_prio;              <- arg6
+>
+> As shown above, the tracepoint event `sched:sched_switch` takes 7 arguments, from `arg0` to `arg6`.
+>
+> For syscall events, `arg0` is the syscall number, not the first argument of the syscall function. Use `arg1` as the first argument of the syscall function.
+> For example:
+>
+>     SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
+>                                         <arg1>             <arg2>       <arg3>
+>
+> kprobe and uprobe events are similar: their `arg0` is always `__probe_ip`,
+> not the first argument given by the user. For example:
+>
+>     # ktap -e 'trace probe:/lib64/libc.so.6:malloc size=%di'
+>
+>     # cat /sys/kernel/debug/tracing/events/ktap_uprobes_3796/malloc/format
+>         field:unsigned long __probe_ip;   <- arg0
+>         field:u64 size;                   <- arg1
+
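+As an illustration of the offsets described in the note above, here is a sketch that prints the first and third arguments of the `read` syscall (the output labels are arbitrary):
+
+    trace syscalls:sys_enter_read {
+        # arg0 is the syscall number; arg1 is fd, arg3 is count
+        print(execname, "fd:", arg1, "count:", arg3)
+    }
+
+The same numbering applies to a probe with user-defined arguments. Assuming libc lives at `/lib64/libc.so.6` and the first argument is passed in `%di` (x86_64), this sketch builds a histogram of `malloc` request sizes (the table name `h` is illustrative):
+
+    var h = {}
+
+    trace probe:/lib64/libc.so.6:malloc size=%di {
+        # arg0 is __probe_ip; arg1 is the user-defined `size`
+        h[arg1] += 1
+    }
+
+    trace_end {
+        print_hist(h)
+    }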
+
+**cpu**
+
+returns the current CPU id.
+
+**pid**
+
+returns the current process's pid.
+
+**tid**
+
+returns the current thread id.
+
+**uid**
+
+returns the current process's uid.
+
+**execname**
+
+returns the current process executable's name in a string.
+
+**argstr**
+
+Event string representation. You can print it with `print(argstr)`, turning the
+event into a human-readable string. The result is mostly the same as each
+entry in `/sys/kernel/debug/tracing/trace`.
+
+**probename**
+
+Event name. Each event has a name associated with it.
+(DTrace also has a `probename` keyword.)
+
+## Timer syntax
+
+**tick-Ns        { ACTION }**
+
+**tick-Nsec      { ACTION }**
+
+**tick-Nms       { ACTION }**
+
+**tick-Nmsec     { ACTION }**
+
+**tick-Nus       { ACTION }**
+
+**tick-Nusec     { ACTION }**
+
+**profile-Ns     { ACTION }**
+
+**profile-Nsec   { ACTION }**
+
+**profile-Nms    { ACTION }**
+
+**profile-Nmsec  { ACTION }**
+
+**profile-Nus    { ACTION }**
+
+**profile-Nusec  { ACTION }**
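+
+As a sketch of how a timer block can combine with a tracing block (the counter name `count` is an assumption for illustration), the following prints the approximate number of syscalls seen during each one-second interval. As the timer example in the Examples section notes, a `tick` timer fires on one CPU, while a `profile` timer fires on every CPU.
+
+    var count = 0
+
+    trace syscalls:sys_enter_* {
+        count = count + 1
+    }
+
+    tick-1s {
+        print("syscalls in the last second:", count)
+        count = 0
+    }
+
+Using `tick` rather than `profile` here keeps the output to one line per second, regardless of the number of CPUs.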
+
+architecture overview picture reference(pnp format)
+
+one-liners
+
+simple event tracing
+
+# Advanced tracing pattern
+
+* Aggregations/histograms
+* Thread locals
+* Flame graphs
+
+# Overhead/Performance
+
+* ktap has a much shorter startup time than SystemTap (try the helloworld script).
+* ktap has a smaller memory footprint than SystemTap.
+* Some scripts show that ktap has slightly lower overhead than SystemTap.
+  (See more performance comparisons between ktap and stap in test/benchmark/.
+  stap wins on numeric computation, and ktap wins on table operations;
+  ktap normally consumes less memory than stap.)
+
+# FAQ
+
+**Q: Why use a bytecode design?**
+
+A: Using bytecode is a clean and lightweight solution,
+   you do not need the GCC toolchain to compile every script; all you
+   need is a ktapvm kernel module and the userspace tool called "ktap".
+   Since its language uses a virtual machine design, it has great portability.
+   Suppose you are working on a multi-arch cluster; if you want to run
+   a tracing script on each board, you will not need to cross-compile your tracing
+   scripts for all the boards. You can just use the `ktap` tool
+   to run scripts right away.
+
+   The bytecode-based design also makes execution safer than the native code
+   generation approach.
+
+   SystemTap is not widely used in embedded Linux systems, mainly because of the design decisions in its architecture. That design is natural for Red Hat and IBM, because they focus on servers rather than embedded systems.
+
+**Q: What are the differences from SystemTap and DTrace?**
+
+A: For SystemTap, the answer is already given in the previous question:
+   SystemTap chooses the translator design, sacrificing usability for runtime performance.
+   The dependency on the GCC toolchain when running scripts is the problem that ktap wants to solve.
+
+   DTrace shares the same design decision of using bytecode, so basically
+   DTrace and ktap are more alike. There have been some projects aimed at porting
+   DTrace from Solaris to Linux, but these efforts are still under way and are progressing relatively slowly. DTrace
+   has its roots in Solaris, and there are many large differences between Solaris's
+   tracing infrastructure and Linux's.
+
+   DTrace is based on the D language, a subset of C. It is a restricted
+   language (for example, it has no for-loops), for safe use in production systems.
+   It seems that DTrace for Linux only supports the x86 architecture and does not work on
+   PowerPC, ARM, or MIPS, so it is currently not suited for embedded Linux.
+
+   DTrace uses CTF as input for debuginfo handling, compared to vmlinux for
+   SystemTap.
+
+   On the license side, DTrace is released under the CDDL, which is incompatible with
+   the GPL. (This is why it is impossible to upstream DTrace into mainline.)
+
+**Q: Why use a dynamically-typed language instead of a statically-typed language?**
+
+A: It's hard to say which one is better than the other. Dynamically-typed
+   languages bring efficiency and fast prototyping, but lose type
+   checking at the compile phase, so mistakes easily slip through to runtime. They also need many runtime checks. In contrast, statically-typed languages win on programming safety and performance. Statically-typed languages would also suit interoperation with the kernel, as the kernel is written mainly in C. Note that SystemTap and DTrace both use statically-typed languages.
+
+   ktap chooses a dynamically-typed language for its initial implementation.
+
+**Q: Why do we need ktap for event tracing? There is already a built-in ftrace.**
+
+A: This is also a common question for all dynamic tracing tools, not only ktap.
+   ktap provides more flexibility than the built-in tracing infrastructure. Suppose you need to print a global variable at a tracepoint hit, or you want to print a backtrace. Furthermore, you want to store some info into an associative array, and display it as a histogram when tracing ends. `ftrace` cannot handle all these requirements. Overall, ktap provides you with great flexibility to script your own trace needs.
+
+**Q: How about the performance? Is ktap slow?**
+
+A: ktap is not slow. The bytecode is very high-level, based on LuaJIT. The language's virtual machine is register-based (compared to the stack-based JVM and CLR), with a small number of instructions. The table data structure is heavily optimized in ktapvm. ktap uses per-cpu allocation in many places, without a global locking scheme. It is very fast when executing tracepoint callbacks. Performance benchmarks show that the overhead of associative array operations is smaller than SystemTap's.
+
+   ktap's performance will continue to be optimized.
+
+**Q: Why not port a higher-level language, like Python or Java, directly into the kernel?**
+
+A: I care a great deal about the size of the VM and its memory footprint. The Python VM is too large for embedding into the kernel, and Python has many advanced features which we do not really need.
+
+   There are also some problems when porting those languages into the kernel. Kernel programming is very different from userspace programming: there are no floating-point numbers, sleeping code needs special handling, infinite loops are not allowed, threads must be managed differently, and so on. So it is impossible to port large language implementations over to the kernel environment with trivial effort.
+
+**Q: What is the status of ktap now?**
+
+A: Basically it works on x86-32, x86-64, PowerPC, and ARM. It could also work on
+   other hardware architectures, but this is not tested yet. (I don't have enough hardware to test.)
+   If you find any bugs, please fix them or report them to me.
+
+**Q: How can I hack on ktap? I want to write some extensions for ktap.**
+
+A: Patches welcome! Volunteers welcome!
+   You can write your own libraries to fulfill your specific needs,
+   or write scripts for fun.
+
+**Q: What's the plan for ktap? Is there a roadmap?**
+
+A: The current plan is to deliver stable ktapvm kernel modules, more ktap scripts, and more bugfixes.
+
+# References
+
+* [Linux Performance Analysis and Tools][REF1]
+* [DTrace Blog][REF2]
+* [DTrace User Guide][REF3]
+* [LWN: ktap -- yet another kernel tracer][REF4]
+* [LWN: Ktap almost gets into 3.13][REF5]
+* [staging: ktap: add to the kernel tree][REF6]
+* [ktap introduction in LinuxCon Japan 2013][REF7] (content is out of date)
+* [ktap Examples by Brendan Gregg][REF8]
+* [What Linux can learn from Solaris performance, and vice-versa][REF9]
+
+[REF1]: http://www.brendangregg.com/Slides/SCaLE_Linux_Performance2013.pdf
+[REF2]: http://dtrace.org/blogs/
+[REF3]: http://docs.huihoo.com/opensolaris/dtrace-user-guide/html/index.html
+[REF4]: http://lwn.net/Articles/551314/
+[REF5]: http://lwn.net/Articles/572788/
+[REF6]: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c63a164271f81220ff4966d41218a9101f3d0ec4
+[REF7]: http://events.linuxfoundation.org/sites/events/files/lcjpcojp13_zhangwei.pdf
+[REF8]: http://www.brendangregg.com/ktap.html
+[REF9]: http://www.slideshare.net/brendangregg/what-linux-can-learn-from-solaris-performance-and-viceversa
+
+# History
+
+* ktap was invented in 2012
+* First RFC sent to LKML on 2012.12.31
+* The code was released on GitHub on 2013.01.18
+* ktap released v0.1 on 2013.05.21
+* ktap released v0.2 on 2013.07.31
+* ktap released v0.3 on 2013.10.29
+* ktap released v0.4 on 2013.12.09
+
+For more release info, please look at RELEASES.txt in the project root directory.
+
+# Examples
+
+1. simplest one-liner command to enable all tracepoints
+
+        ktap -e "trace *:* { print(argstr) }"
+2. syscall tracing on target process
+
+        ktap -e "trace syscalls:* { print(argstr) }" -- ls
+3. ftrace (kernel 3.3 or newer, compiled with CONFIG_FUNCTION_TRACER)
+
+        ktap -e "trace ftrace:function { print(argstr) }"
+
+        ktap -e "trace ftrace:function /ip==mutex*/ { print(argstr) }"
+4. simple syscall tracing
+
+        trace syscalls:* {
+                print(cpu, pid, execname, argstr)
+        }
+5. syscall tracing in histogram style
+
+        var s = {}
+
+        trace syscalls:sys_enter_* {
+                s[probename] += 1
+        }
+
+        trace_end {
+                print_hist(s)
+        }
+6. kprobe tracing
+
+        trace probe:do_sys_open dfd=%di fname=%dx flags=%cx mode=+4($stack) {
+                print("entry:", execname, argstr)
+        }
+
+        trace probe:do_sys_open%return fd=$retval {
+                print("exit:", execname, argstr)
+        }
+7. uprobe tracing
+
+        trace probe:/lib/libc.so.6:malloc {
+                print("entry:", execname, argstr)
+        }
+
+        trace probe:/lib/libc.so.6:malloc%return {
+                print("exit:", execname, argstr)
+        }
+8. stapsdt tracing (userspace static marker)
+
+        trace sdt:/lib64/libc.so.6:lll_futex_wake {
+                print("lll_futex_wake", execname, argstr)
+        }
+
+        or:
+
+        # trace all static markers in libc
+        trace sdt:/lib64/libc.so.6:* {
+                print(execname, argstr)
+        }
+9. timer
+
+        tick-1ms {
+                printf("timer fired on one cpu\n");
+        }
+
+        profile-2s {
+                printf("timer fired on every cpu\n");
+        }
+
+More examples can be found in the [samples][samples_dir] directory.
+
+[samples_dir]: https://github.com/ktap/ktap/tree/master/samples
+
+# Appendix
+
+Here is the complete syntax of ktap in extended BNF.
+(based on Lua syntax: http://www.lua.org/manual/5.1/manual.html#5.1)
+
+        chunk ::= {stat [';']} [laststat [';']]
+
+        block ::= chunk
+
+        stat ::=  varlist '=' explist | 
+                 functioncall | 
+                 { block } | 
+                 while exp { block } | 
+                 repeat block until exp | 
+                 if exp { block {elseif exp { block }} [else block] } | 
+                 for Name '=' exp ',' exp [',' exp] { block } | 
+                 for namelist in explist { block } | 
+                 function funcname funcbody | 
+                 function Name funcbody | 
+                 var namelist ['=' explist] 
+
+        laststat ::= return [explist] | break
+
+        funcname ::= Name {'.' Name} [':' Name]
+
+        varlist ::= var {',' var}
+
+        var ::=  Name | prefixexp '[' exp ']'| prefixexp '.' Name 
+
+        namelist ::= Name {',' Name}
+
+        explist ::= {exp ','} exp
+
+        exp ::=  nil | false | true | Number | String | '...' | function | 
+                 prefixexp | tableconstructor | exp binop exp | unop exp 
+
+        prefixexp ::= var | functioncall | '(' exp ')'
+
+        functioncall ::=  prefixexp args | prefixexp ':' Name args 
+
+        args ::=  '(' [explist] ')' | tableconstructor | String 
+
+        function ::= function funcbody
+
+        funcbody ::= '(' [parlist] ')' { block }
+
+        parlist ::= namelist [',' '...'] | '...'
+
+        tableconstructor ::= '{' [fieldlist] '}'
+
+        fieldlist ::= field {fieldsep field} [fieldsep]
+
+        field ::= '[' exp ']' '=' exp | Name '=' exp | exp
+
+        fieldsep ::= ',' | ';'
+
+        binop ::= '+' | '-' | '*' | '/' | '^' | '%' | '..' | 
+                  '<' | '<=' | '>' | '>=' | '==' | '!=' | 
+                  and | or
+
+        unop ::= '-'
+
-- 
1.8.1.4



* [PATCH v2 03/29] ktap: add sample scripts(tools/ktap/samples/*)
From: Jovi Zhangwei @ 2014-03-28 14:44 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

The samples directory is organized by subsystem,
similar to the DTraceToolkit (http://www.brendangregg.com/dtracetoolkit.html).

It contains:
- helloworld.kp: simple hello world program
- ansi: ansi library for screen display
- basic: some simple examples
- game: tetris game written in ktap
- interrupt: collect hardirq and softirq time
- io: open and io related scripts
- mem: kernel memory allocation scripts
- network: networking scripts
- profiling: stack profile and function profile
- schedule: scheduler scripts.
        schedtimes.kp was inspired by SystemTap's schedtimes.stp
- syscalls: tracing syscalls
        syslatl.kp/opensnoop.kp/syslist.kp were contributed by Brendan Gregg (from the DTraceToolkit)
- tracepoints: kernel tracepoint count and histogram
- userspace: uprobe (including SDT) scripts

Besides these samples, "ktap Examples" posted by Brendan Gregg
(http://www.brendangregg.com/ktap.html) also has many sample
scripts, and explains how to make flame graphs with ktap.

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/samples/ansi/ansi_color_demo.kp         |  22 ++
 tools/ktap/samples/basic/backtrace.kp              |   6 +
 tools/ktap/samples/basic/event_trigger.kp          |  27 ++
 tools/ktap/samples/basic/event_trigger_ftrace.kp   |  27 ++
 tools/ktap/samples/basic/ftrace.kp                 |   8 +
 tools/ktap/samples/basic/function_time.kp          |  62 +++++
 tools/ktap/samples/basic/kretprobe.kp              |   6 +
 tools/ktap/samples/basic/memcpy_memset.kp          |  23 ++
 tools/ktap/samples/game/tetris.kp                  | 297 +++++++++++++++++++++
 tools/ktap/samples/helloworld.kp                   |   3 +
 tools/ktap/samples/interrupt/hardirq_time.kp       |  25 ++
 tools/ktap/samples/interrupt/softirq_time.kp       |  24 ++
 tools/ktap/samples/io/kprobes-do-sys-open.kp       |  20 ++
 tools/ktap/samples/io/traceio.kp                   |  61 +++++
 tools/ktap/samples/mem/kmalloc-stack.kp            |  12 +
 tools/ktap/samples/mem/kmem_count.kp               |  29 ++
 tools/ktap/samples/network/tcp_ipaddr.kp           |  20 ++
 tools/ktap/samples/profiling/function_profiler.kp  |  41 +++
 .../profiling/kprobe_all_kernel_functions.kp       |  13 +
 tools/ktap/samples/profiling/stack_profile.kp      |  27 ++
 tools/ktap/samples/schedule/sched_transition.kp    |   5 +
 tools/ktap/samples/schedule/schedtimes.kp          | 131 +++++++++
 tools/ktap/samples/syscalls/errinfo.kp             | 145 ++++++++++
 tools/ktap/samples/syscalls/execve.kp              |   9 +
 tools/ktap/samples/syscalls/opensnoop.kp           |  31 +++
 tools/ktap/samples/syscalls/sctop.kp               |  13 +
 tools/ktap/samples/syscalls/syscalls.kp            |   6 +
 tools/ktap/samples/syscalls/syscalls_count.kp      |  54 ++++
 .../samples/syscalls/syscalls_count_by_proc.kp     |  22 ++
 tools/ktap/samples/syscalls/syslatl.kp             |  33 +++
 tools/ktap/samples/syscalls/syslist.kp             |  31 +++
 tools/ktap/samples/tracepoints/eventcount.kp       | 210 +++++++++++++++
 .../ktap/samples/tracepoints/eventcount_by_proc.kp |  57 ++++
 tools/ktap/samples/tracepoints/raw_tracepoint.kp   |  15 ++
 tools/ktap/samples/tracepoints/tracepoints.kp      |   6 +
 tools/ktap/samples/userspace/gcc_unwind.kp         |   9 +
 tools/ktap/samples/userspace/glibc_func_hist.kp    |  44 +++
 tools/ktap/samples/userspace/glibc_sdt.kp          |  11 +
 tools/ktap/samples/userspace/glibc_trace.kp        |  11 +
 tools/ktap/samples/userspace/malloc_free.kp        |  20 ++
 tools/ktap/samples/userspace/malloc_size_hist.kp   |  22 ++
 tools/ktap/samples/userspace/pthread.kp            |   8 +
 42 files changed, 1646 insertions(+)
 create mode 100644 tools/ktap/samples/ansi/ansi_color_demo.kp
 create mode 100644 tools/ktap/samples/basic/backtrace.kp
 create mode 100644 tools/ktap/samples/basic/event_trigger.kp
 create mode 100644 tools/ktap/samples/basic/event_trigger_ftrace.kp
 create mode 100644 tools/ktap/samples/basic/ftrace.kp
 create mode 100644 tools/ktap/samples/basic/function_time.kp
 create mode 100644 tools/ktap/samples/basic/kretprobe.kp
 create mode 100644 tools/ktap/samples/basic/memcpy_memset.kp
 create mode 100644 tools/ktap/samples/game/tetris.kp
 create mode 100644 tools/ktap/samples/helloworld.kp
 create mode 100644 tools/ktap/samples/interrupt/hardirq_time.kp
 create mode 100644 tools/ktap/samples/interrupt/softirq_time.kp
 create mode 100644 tools/ktap/samples/io/kprobes-do-sys-open.kp
 create mode 100644 tools/ktap/samples/io/traceio.kp
 create mode 100644 tools/ktap/samples/mem/kmalloc-stack.kp
 create mode 100644 tools/ktap/samples/mem/kmem_count.kp
 create mode 100644 tools/ktap/samples/network/tcp_ipaddr.kp
 create mode 100644 tools/ktap/samples/profiling/function_profiler.kp
 create mode 100644 tools/ktap/samples/profiling/kprobe_all_kernel_functions.kp
 create mode 100644 tools/ktap/samples/profiling/stack_profile.kp
 create mode 100644 tools/ktap/samples/schedule/sched_transition.kp
 create mode 100644 tools/ktap/samples/schedule/schedtimes.kp
 create mode 100644 tools/ktap/samples/syscalls/errinfo.kp
 create mode 100644 tools/ktap/samples/syscalls/execve.kp
 create mode 100644 tools/ktap/samples/syscalls/opensnoop.kp
 create mode 100644 tools/ktap/samples/syscalls/sctop.kp
 create mode 100644 tools/ktap/samples/syscalls/syscalls.kp
 create mode 100644 tools/ktap/samples/syscalls/syscalls_count.kp
 create mode 100644 tools/ktap/samples/syscalls/syscalls_count_by_proc.kp
 create mode 100644 tools/ktap/samples/syscalls/syslatl.kp
 create mode 100644 tools/ktap/samples/syscalls/syslist.kp
 create mode 100644 tools/ktap/samples/tracepoints/eventcount.kp
 create mode 100644 tools/ktap/samples/tracepoints/eventcount_by_proc.kp
 create mode 100644 tools/ktap/samples/tracepoints/raw_tracepoint.kp
 create mode 100644 tools/ktap/samples/tracepoints/tracepoints.kp
 create mode 100644 tools/ktap/samples/userspace/gcc_unwind.kp
 create mode 100644 tools/ktap/samples/userspace/glibc_func_hist.kp
 create mode 100644 tools/ktap/samples/userspace/glibc_sdt.kp
 create mode 100644 tools/ktap/samples/userspace/glibc_trace.kp
 create mode 100644 tools/ktap/samples/userspace/malloc_free.kp
 create mode 100644 tools/ktap/samples/userspace/malloc_size_hist.kp
 create mode 100644 tools/ktap/samples/userspace/pthread.kp

diff --git a/tools/ktap/samples/ansi/ansi_color_demo.kp b/tools/ktap/samples/ansi/ansi_color_demo.kp
new file mode 100644
index 0000000..2998fe0
--- /dev/null
+++ b/tools/ktap/samples/ansi/ansi_color_demo.kp
@@ -0,0 +1,22 @@
+#!/usr/bin/env ktap
+
+#This script demonstrates how to use ktap to output colored text.
+
+ansi.clear_screen()
+
+ansi.set_color(32)
+printf("this line should be Green color\n")
+
+ansi.set_color(31)
+printf("this line should be Red color\n")
+
+ansi.set_color2(34, 43)
+printf("this line should be Blue color, with Yellow background\n")
+
+ansi.reset_color()
+ansi.set_color3(34, 46, 4)
+printf("this line should be Blue color, with Cyan background, underline single attribute\n")
+
+ansi.reset_color()
+ansi.new_line()
+
diff --git a/tools/ktap/samples/basic/backtrace.kp b/tools/ktap/samples/basic/backtrace.kp
new file mode 100644
index 0000000..4c13c1a
--- /dev/null
+++ b/tools/ktap/samples/basic/backtrace.kp
@@ -0,0 +1,6 @@
+#!/usr/bin/env ktap
+
+trace sched:sched_switch {
+	print(stack())
+}
+
diff --git a/tools/ktap/samples/basic/event_trigger.kp b/tools/ktap/samples/basic/event_trigger.kp
new file mode 100644
index 0000000..7bf4720
--- /dev/null
+++ b/tools/ktap/samples/basic/event_trigger.kp
@@ -0,0 +1,27 @@
+#!/usr/bin/env ktap
+
+#This ktap script will output all tracepoint events between
+#sys_enter_open and sys_exit_open, on one cpu.
+
+var soft_disabled = 1
+var this_cpu = 0
+
+trace syscalls:sys_enter_open {
+	print(argstr)
+	soft_disabled = 0
+	this_cpu = cpu
+}
+
+trace *:* {
+	if (soft_disabled == 0 && cpu == this_cpu) {
+		print(argstr)
+	}
+}
+
+trace syscalls:sys_exit_open {
+	print(argstr)
+	if (cpu == this_cpu) {
+		exit()
+	}
+}
+
diff --git a/tools/ktap/samples/basic/event_trigger_ftrace.kp b/tools/ktap/samples/basic/event_trigger_ftrace.kp
new file mode 100644
index 0000000..f2ebfa5
--- /dev/null
+++ b/tools/ktap/samples/basic/event_trigger_ftrace.kp
@@ -0,0 +1,27 @@
+#!/usr/bin/env ktap
+
+#This ktap script will output all function calls between
+#sys_enter_open and sys_exit_open, on one cpu.
+
+var soft_disabled = 1
+var this_cpu = 0
+
+trace syscalls:sys_enter_open {
+	print(argstr)
+	soft_disabled = 0
+	this_cpu = cpu
+}
+
+trace ftrace:function {
+	if (soft_disabled == 0 && cpu == this_cpu) {
+		print(argstr)
+	}
+}
+
+trace syscalls:sys_exit_open {
+	print(argstr)
+	if (cpu == this_cpu) {
+		exit()
+	}
+}
+
diff --git a/tools/ktap/samples/basic/ftrace.kp b/tools/ktap/samples/basic/ftrace.kp
new file mode 100644
index 0000000..22cff5d
--- /dev/null
+++ b/tools/ktap/samples/basic/ftrace.kp
@@ -0,0 +1,8 @@
+#!/usr/bin/env ktap
+
+#Description: output all mutex* function events
+
+trace ftrace:function /ip==mutex*/ {
+	print(cpu, pid, execname, argstr)
+}
+
diff --git a/tools/ktap/samples/basic/function_time.kp b/tools/ktap/samples/basic/function_time.kp
new file mode 100644
index 0000000..51d0f89
--- /dev/null
+++ b/tools/ktap/samples/basic/function_time.kp
@@ -0,0 +1,62 @@
+#!/usr/bin/env ktap
+
+#Demo for thread-local variable
+#
+#Note this kind of function time tracing handles concurrency, but it is
+#not aware of recursion. Users need to be aware of this limitation, so
+#don't use this script to trace functions which could be called recursively.
+
+var self = {}
+var count_max = 0
+var count_min = 0
+var count_num = 0
+var total_time = 0
+
+printf("measure time(us) of function vfs_read\n");
+
+trace probe:vfs_read {
+	if (execname == "ktap") {
+		return
+	}
+
+	self[tid] = gettimeofday_us()
+}
+
+trace probe:vfs_read%return {
+	if (execname == "ktap") {
+		return
+	}
+
+	if (self[tid] == nil) {
+		return
+	}
+
+	var duration = gettimeofday_us() - self[tid]
+	if (duration > count_max) {
+		count_max = duration
+	}
+	var min = count_min
+	if (min == 0 || duration < min) {
+		count_min = duration
+	}
+
+	count_num = count_num + 1
+	total_time = total_time + duration
+
+	self[tid] = nil
+}
+
+trace_end {
+	var avg
+	if (count_num == 0) {
+		avg = 0
+	} else {
+		avg = total_time/count_num
+	}
+
+	printf("avg\tmax\tmin\n");
+	printf("-------------------\n")
+	printf("%d\t%d\t%d\n", avg, count_max, count_min)
+}
+
+
diff --git a/tools/ktap/samples/basic/kretprobe.kp b/tools/ktap/samples/basic/kretprobe.kp
new file mode 100644
index 0000000..e03a7a2
--- /dev/null
+++ b/tools/ktap/samples/basic/kretprobe.kp
@@ -0,0 +1,6 @@
+#!/usr/bin/env ktap
+
+trace probe:vfs_read%return fd=$retval {
+	print(execname, argstr);
+}
+
diff --git a/tools/ktap/samples/basic/memcpy_memset.kp b/tools/ktap/samples/basic/memcpy_memset.kp
new file mode 100644
index 0000000..18008cc
--- /dev/null
+++ b/tools/ktap/samples/basic/memcpy_memset.kp
@@ -0,0 +1,23 @@
+#!/usr/bin/env ktap
+
+# This script collects and outputs kernel memcpy/memset size histograms.
+
+var h_memcpy = {}
+var h_memset = {}
+
+trace probe:memcpy size=%dx {
+	h_memcpy[arg1] += 1
+}
+
+trace probe:memset size=%dx {
+	h_memset[arg1] += 1
+}
+
+trace_end {
+	print("memcpy size hist:")
+	print_hist(h_memcpy)
+
+	print()
+	print("memset size hist:")
+	print_hist(h_memset)
+}
diff --git a/tools/ktap/samples/game/tetris.kp b/tools/ktap/samples/game/tetris.kp
new file mode 100644
index 0000000..f7fbbf2
--- /dev/null
+++ b/tools/ktap/samples/game/tetris.kp
@@ -0,0 +1,297 @@
+#!/usr/bin/env ktap
+ 
+#
+# Tetris KTAP Script
+#
+# Copyright (C) 2013/OCT/05 Tadaki SAKAI
+#
+# based on stapgames (Systemtap Game Collection)
+#   https://github.com/mhiramat/stapgames/blob/master/games/tetris.stp
+#
+#   - Requirements
+#     Kernel Configuration: CONFIG_KPROBE_EVENT=y
+#                           CONFIG_EVENT_TRACING=y
+#                           CONFIG_PERF_EVENTS=y
+#                           CONFIG_DEBUG_FS=y
+#     CPU Architecture : x86_64
+#
+#   - Setup
+#     $ sudo mount -t debugfs none /sys/kernel/debug/
+#
+#     $ git clone https://github.com/ktap/ktap
+#     $ cd ktap
+#     $ make 2>&1 | tee ../make.log
+#     $ sudo make load
+#     $ sudo sh -c 'echo 50000 > /sys/module/ktapvm/parameters/max_exec_count'
+#
+#   - Run Tetris
+#     $ sudo ./ktap samples/game/tetris.kp
+#
+ 
+#
+# global value
+#
+
+var empty = -1
+
+var key_code = 0
+var point = 0
+var block_number = 0
+var height = 0
+var height_update = 0
+
+var destination_position = {}
+ 
+var block_data0 = {}
+var block_data1 = {}
+var block_data2 = {}
+var block_data3 = {}
+var block_data4 = {}
+var block_data5 = {}
+var block_data6 = {}
+var block_table = {}
+ 
+#
+# utils
+#
+ 
+function rand(max) {
+	var r = gettimeofday_us()
+	if (r < 0) {
+		r = r * -1
+	}
+	return r % max
+}
+
+var display_buffer = {}
+
+function update_display() {
+	var tmp
+	for (i = 0, 239, 1) {
+		if ((i % 12 - 11) != 0) {
+			tmp = ""
+		} else {
+			tmp = "\n"
+		}
+
+		if (display_buffer[240 + i] == empty) {
+			printf("  %s", tmp)
+		} else {
+			var color = display_buffer[240 + i] + 40
+			ansi.set_color2(color, color)
+			printf("  %s", tmp)
+			ansi.reset_color()
+		}
+
+		# clear the display buffer
+		display_buffer[240 + i] = display_buffer[i]
+	}
+
+	printf("%d\n",point)
+}
+
+ 
+
+#
+# Initialize
+#
+ 
+# Create blocks
+# block is represented by the position from the center.
+# Every block has "L" part in the center except for a bar.
+block_data0[0] = -11 # non-"L" part for each block
+block_data1[0] = -24
+block_data2[0] = 2
+block_data3[0] = 13
+block_data4[0] = -13
+block_data5[0] = -1
+block_data6[0] = 2
+	
+block_table[0] = block_data0
+block_table[1] = block_data1
+block_table[2] = block_data2
+block_table[3] = block_data3
+block_table[4] = block_data4
+block_table[5] = block_data5
+block_table[6] = block_data6
+ 
+for (i = 0, len(block_table) - 1, 1) {
+	# common "L" part
+	block_table[i][1] = 0
+	block_table[i][2] = 1
+	block_table[i][3] = -12
+}
+ 
+block_table[6][3] = -1 # bar is not common
+# Position: each row has 12 columns,
+# and (x, y) is represented by h = x + y * 12.
+height = 17 # First block position (center)
+
+for (i = 0, 240, 1) {
+	var tmp
+	# Wall and Floor (sentinel)
+	if (((i % 12) < 2) || (i > 228)) {
+		tmp = 7 # White
+	} else {
+		tmp = empty
+	}
+	display_buffer[i - 1] = tmp
+	display_buffer[240 + i - 1] = tmp
+}
+
+block_number = rand(7)
+
+ansi.clear_screen()
+ 
+ 
+#
+# Key Input
+#
+ 
+trace probe:kbd_event handle=%di event_type=%si event_code=%dx value=%cx {
+	# Can only run on x86_64
+	#
+	# Registers follow the x86_64 calling convention:
+	#
+	# x86_64:
+	#	%rcx	4th argument
+	#	%rdx	3rd argument
+	#	%rsi	2nd argument
+	#	%rdi	1st argument
+ 
+	var event_code = arg4
+	var value = arg5
+ 
+	if (value != 0) {
+		if ((event_code - 4) != 0) {
+			key_code = event_code
+		}
+	}
+}
+ 
+ 
+#
+# timer
+#
+ 
+tick-200ms {
+	ansi.clear_screen()
+ 
+	var f = 0 # move/rotate flag
+	var d
+ 
+	if (key_code != 0) { # if key is pressed
+		if(key_code != 103) { #move left or right
+			# d: movement direction
+			if ((key_code - 105) != 0) {
+				if ((key_code - 106) != 0) {
+					d = 0
+				} else {
+					d = 1
+				}
+			} else {
+				d = -1
+			}
+ 
+			for (i = 0, 3, 1) { # check if the block can be moved
+				# destination is free
+				if (display_buffer[height +
+					block_table[block_number][i] + d] 
+				    != empty) {
+					f = 1
+				}
+			}
+			# move if destinations of every block are free
+			if (f == 0) {
+				height = height + d
+			} 
+		} else { # rotate
+			for (i = 0, 3, 1) { # check if block can be rotated
+				# each block position
+				var p = block_table[block_number][i]
+ 
+				# destination x pos(p/12 rounded)
+				var v = (p * 2 + 252) / 24 - 10
+				var w = p - v * 12 # destination y pos
+ 
+				# destination position
+				destination_position[i] = w * 12 - v
+ 
+				# check if destination is free
+				if (display_buffer[height +
+				    destination_position[i]] != empty) {
+					f = 1
+				}
+			}
+ 
+			if (f == 0) {
+				# rotate if destinations of every block
+				# are free
+				for (i = 0, 3, 1) {
+					block_table[block_number][i] = 
+						destination_position[i] 
+				}
+			}
+		}
+	}
+	key_code = 0 # clear the input key
+ 
+	f = 0
+	for (i = 0, 3, 1) { # drop 1 row
+		# check if destination is free
+		var p = height + block_table[block_number][i]
+		if (display_buffer[12 + p] != empty) {
+			f = 1
+		}
+ 
+		# copy the moving block to display buffer
+		display_buffer[240 + p] = block_number
+	}
+
+	if ((f == 1) && (height == 17)) {
+		update_display()
+		exit() # exit if a block is stuck at the initial position
+	}
+ 
+	height_update = !height_update
+	if (height_update != 0) {
+		if(f != 0) { # the block can't drop anymore
+			for (i = 0, 3, 1) {
+				# fix the block
+				display_buffer[height + 
+				  block_table[block_number][i]] = block_number
+			}
+			# determine the next block
+			block_number = rand(7)
+			height = 17 # move the block back to the initial position
+		} else {
+			height = height + 12 # drop the block 1 row
+		}
+	}
+ 
+	var k = 1
+	for (i = 18, 0, -1) { #check if line is filled
+		# search for filled line
+		var j = 10
+		while ((j > 0) && 
+		       (display_buffer[i * 12 + j] != empty)) {
+			j = j - 1
+		}
+ 
+		if (j == 0) { # filled!
+			# add points: 1 line - 1 point, ..., tetris - 10 points
+			point = point + k
+			k = k + 1
+ 
+			# drop every upper block
+			j = (i + 1) * 12
+			i = i + 1
+			while (j > 2 * 12) {
+				j = j - 1
+				display_buffer[j] = display_buffer[j - 12] 
+			}
+		}
+	}
+ 
+	update_display()
+}
diff --git a/tools/ktap/samples/helloworld.kp b/tools/ktap/samples/helloworld.kp
new file mode 100644
index 0000000..5673c15
--- /dev/null
+++ b/tools/ktap/samples/helloworld.kp
@@ -0,0 +1,3 @@
+#!/usr/bin/env ktap
+
+print("Hello World! I am ktap")
diff --git a/tools/ktap/samples/interrupt/hardirq_time.kp b/tools/ktap/samples/interrupt/hardirq_time.kp
new file mode 100644
index 0000000..f305a41
--- /dev/null
+++ b/tools/ktap/samples/interrupt/hardirq_time.kp
@@ -0,0 +1,25 @@
+#!/usr/bin/env ktap
+
+#this script outputs the accumulated time consumed by each hardirq
+
+var s = {}
+var map = {}
+
+trace irq:irq_handler_entry {
+	map[cpu] = gettimeofday_us()
+}
+
+trace irq:irq_handler_exit {
+	var entry_time = map[cpu]
+	if (entry_time == nil) {
+		return;
+	}
+
+	s[arg0] += gettimeofday_us() - entry_time
+	map[cpu] = nil
+}
+
+trace_end {
+	print_hist(s)
+}
+
diff --git a/tools/ktap/samples/interrupt/softirq_time.kp b/tools/ktap/samples/interrupt/softirq_time.kp
new file mode 100644
index 0000000..561733a
--- /dev/null
+++ b/tools/ktap/samples/interrupt/softirq_time.kp
@@ -0,0 +1,24 @@
+#!/usr/bin/env ktap
+
+#this script outputs the accumulated time consumed by each softirq line
+var s = {}
+var map = {}
+
+trace irq:softirq_entry {
+	map[cpu] = gettimeofday_us()
+}
+
+trace irq:softirq_exit {
+	var entry_time = map[cpu]
+	if (entry_time == nil) {
+		return;
+	}
+
+	s[arg0] += gettimeofday_us() - entry_time
+	map[cpu] = nil
+}
+
+trace_end {
+	print_hist(s)
+}
+
diff --git a/tools/ktap/samples/io/kprobes-do-sys-open.kp b/tools/ktap/samples/io/kprobes-do-sys-open.kp
new file mode 100644
index 0000000..692f6c4
--- /dev/null
+++ b/tools/ktap/samples/io/kprobes-do-sys-open.kp
@@ -0,0 +1,20 @@
+#!/usr/bin/env ktap
+
+#Can only run on x86_64
+#
+#Registers follow the x86_64 calling convention:
+#
+#x86_64:
+#	%rcx	4th argument
+#	%rdx	3rd argument
+#	%rsi	2nd argument
+#	%rdi	1st argument
+
+trace probe:do_sys_open dfd=%di filename=%si flags=%dx mode=%cx {
+	printf("[do_sys_open entry]: (%s) open file (%s)\n",
+		execname,  user_string(arg2))
+}
+
+trace probe:do_sys_open%return fd=$retval {
+	printf("[do_sys_open exit]:  return fd (%d)\n", arg2)
+}
diff --git a/tools/ktap/samples/io/traceio.kp b/tools/ktap/samples/io/traceio.kp
new file mode 100644
index 0000000..1ef6588
--- /dev/null
+++ b/tools/ktap/samples/io/traceio.kp
@@ -0,0 +1,61 @@
+#! /usr/bin/env ktap
+
+# Based on systemtap traceio.stp
+
+var reads = {}
+var writes = {}
+var total_io = {}
+
+trace syscalls:sys_exit_read {
+	reads[execname] += arg1
+	total_io[execname] += arg1
+}
+
+trace syscalls:sys_exit_write {
+	writes[execname] += arg1
+	total_io[execname] += arg1
+}
+
+function humanread_digit(bytes) {
+	if (bytes > 1024*1024*1024) {
+		return bytes/1024/1024/1024
+	} elseif (bytes > 1024*1024) {
+		return bytes/1024/1024
+	} elseif (bytes > 1024) {
+		return bytes/1024
+	} else {
+		return bytes
+	}
+}
+
+function humanread_x(bytes) {
+	if (bytes > 1024*1024*1024) {
+		return " GiB"
+	} elseif (bytes > 1024*1024) {
+		return " MiB"
+	} elseif (bytes > 1024) {
+		return " KiB"
+	} else {
+		return "   B"
+	}
+}
+
+tick-1s {
+	ansi.clear_screen()
+	for (exec, _ in pairs(total_io)) {
+		var readnum = reads[exec]
+		var writenum = writes[exec]
+
+		if (readnum == nil) {
+			readnum = 0
+		}
+		if (writenum == nil) {
+			writenum = 0
+		}
+		printf("%15s r: %12d%s w: %12d%s\n", exec,
+			humanread_digit(readnum), humanread_x(readnum),
+			humanread_digit(writenum), humanread_x(writenum))
+	}
+	printf("\n")
+}
+
diff --git a/tools/ktap/samples/mem/kmalloc-stack.kp b/tools/ktap/samples/mem/kmalloc-stack.kp
new file mode 100644
index 0000000..8406461
--- /dev/null
+++ b/tools/ktap/samples/mem/kmalloc-stack.kp
@@ -0,0 +1,12 @@
+#!/usr/bin/env ktap
+
+var s = {}
+
+trace kmem:kmalloc {
+	s[stack()] += 1
+}
+
+tick-60s {
+	print_hist(s)
+}
+
diff --git a/tools/ktap/samples/mem/kmem_count.kp b/tools/ktap/samples/mem/kmem_count.kp
new file mode 100644
index 0000000..ab5c8a9
--- /dev/null
+++ b/tools/ktap/samples/mem/kmem_count.kp
@@ -0,0 +1,29 @@
+#!/usr/bin/env ktap
+
+var count1 = 0
+trace kmem:kmalloc {
+	count1 += 1
+}
+
+var count2 = 0
+trace kmem:kfree {
+	count2 += 1
+}
+
+var count3 = 0
+trace kmem:mm_page_alloc {
+	count3 += 1
+}
+
+var count4 = 0
+trace kmem:mm_page_free {
+	count4 += 1
+}
+
+trace_end {
+	print("\n")
+	print("kmem:kmalloc:\t", count1)
+	print("kmem:kfree:\t", count2)
+	print("kmem:mm_page_alloc:", count3)
+	print("kmem:mm_page_free:", count4)
+}
diff --git a/tools/ktap/samples/network/tcp_ipaddr.kp b/tools/ktap/samples/network/tcp_ipaddr.kp
new file mode 100644
index 0000000..6363aef
--- /dev/null
+++ b/tools/ktap/samples/network/tcp_ipaddr.kp
@@ -0,0 +1,20 @@
+#!/usr/bin/env ktap
+
+#This script prints the source and destination IP addresses of received TCP messages
+#
+#Tested in x86_64
+#
+#function tcp_recvmsg prototype:
+#int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
+#                size_t len, int nonblock, int flags, int *addr_len)
+#
+
+var ip_sock_saddr = net.ip_sock_saddr
+var ip_sock_daddr = net.ip_sock_daddr
+var format_ip_addr = net.format_ip_addr
+
+trace probe:tcp_recvmsg sock=%si {
+	var saddr = format_ip_addr(ip_sock_saddr(arg1))
+	var daddr = format_ip_addr(ip_sock_daddr(arg1))
+	printf("%s -> %s\n", daddr, saddr)
+}
diff --git a/tools/ktap/samples/profiling/function_profiler.kp b/tools/ktap/samples/profiling/function_profiler.kp
new file mode 100644
index 0000000..1414936
--- /dev/null
+++ b/tools/ktap/samples/profiling/function_profiler.kp
@@ -0,0 +1,41 @@
+#!/usr/bin/env ktap
+
+#kernel function profiler
+#You can use this script to find out which functions are called frequently,
+#without enabling CONFIG_FUNCTION_PROFILER in the kernel.
+
+var s = {}
+
+trace ftrace:function {
+	s[ipof(arg0)] += 1
+}
+
+trace_end {
+	print_hist(s)
+}
+
+#sample output
+#^C
+#                          value ------------- Distribution ------------- count
+#               sub_preempt_count | @@@@@                                  34904
+#               add_preempt_count | @@@@@                                  33435
+#              nsecs_to_jiffies64 | @@@                                    19919
+# irqtime_account_process_tick... | @                                      9970
+#               account_idle_time | @                                      9880
+#                  _raw_spin_lock |                                        5100
+#                _raw_spin_unlock |                                        5021
+#     _raw_spin_unlock_irqrestore |                                        4235
+#          _raw_spin_lock_irqsave |                                        4232
+#                 __rcu_read_lock |                                        3373
+#               __rcu_read_unlock |                                        3373
+#                  lookup_address |                                        2392
+#             pfn_range_is_mapped |                                        2384
+#      update_cfs_rq_blocked_load |                                        1983
+#                        idle_cpu |                                        1808
+#                       ktime_get |                                        1394
+#            _raw_spin_unlock_irq |                                        1270
+#              _raw_spin_lock_irq |                                        1091
+#                     update_curr |                                        950
+#             irqtime_account_irq |                                        950
+#                             ... |
+#
diff --git a/tools/ktap/samples/profiling/kprobe_all_kernel_functions.kp b/tools/ktap/samples/profiling/kprobe_all_kernel_functions.kp
new file mode 100644
index 0000000..d42f7de
--- /dev/null
+++ b/tools/ktap/samples/profiling/kprobe_all_kernel_functions.kp
@@ -0,0 +1,13 @@
+#!/usr/bin/env ktap
+
+#enable kprobes on all available kernel functions in /proc/kallsyms.
+#
+#This script is very dangerous: it will cause soft lockups and
+#make your system extremely slow, although the root cause is not
+#in ktap itself.
+#
+#DON'T use this script, at least for now.
+
+trace probe:* {
+	print(argstr)
+}
diff --git a/tools/ktap/samples/profiling/stack_profile.kp b/tools/ktap/samples/profiling/stack_profile.kp
new file mode 100644
index 0000000..3595a65
--- /dev/null
+++ b/tools/ktap/samples/profiling/stack_profile.kp
@@ -0,0 +1,27 @@
+#!/usr/bin/env ktap
+
+# This ktap script samples the system stack trace every 10us;
+# you can use the generated output to make a flame graph.
+#
+# Flame Graphs:
+# http://dtrace.org/blogs/brendan/2012/03/17/linux-kernel-performance-flame-graphs/
+#
+#
+# TODO: use aggregation instead of table.
+
+#pre-allocate 2000 record entries, enlarge it if it's not enough
+var s = table.new(0, 2000)
+
+profile-10us {
+	#skip 12 stack entries, and dump all remaining entries.
+	s[stack(-1, 12)] += 1
+}
+
+tick-60s {
+	exit()
+}
+
+trace_end {
+	print_hist(s, 100000)
+}
+
diff --git a/tools/ktap/samples/schedule/sched_transition.kp b/tools/ktap/samples/schedule/sched_transition.kp
new file mode 100644
index 0000000..bad90e8
--- /dev/null
+++ b/tools/ktap/samples/schedule/sched_transition.kp
@@ -0,0 +1,5 @@
+#!/usr/bin/env ktap
+
+trace sched:sched_switch {
+	printf("%s ... ", arg0)
+}
diff --git a/tools/ktap/samples/schedule/schedtimes.kp b/tools/ktap/samples/schedule/schedtimes.kp
new file mode 100644
index 0000000..19de1bf
--- /dev/null
+++ b/tools/ktap/samples/schedule/schedtimes.kp
@@ -0,0 +1,131 @@
+#!/usr/bin/env ktap
+
+#schedtimes.kp
+#Initially inspired by Systemtap schedtimes.stp
+#and less buggy than Systemtap's version
+#
+#Note that the time value is associated with the pid, not strictly with the execname.
+#Sometimes you will find sleep time reported for a command such as "ls"; the reason
+#is that the sleep time belongs to the parent process bash, so keep this in mind.
+
+var RUNNING = 0
+var QUEUED = 1
+var SLEEPING = 2
+var DEAD = 64
+
+var run_time = {}
+var queued_time = {}
+var sleep_time = {}
+var io_wait_time = {}
+
+var pid_state = {}
+var pid_names = {}
+var prev_timestamp = {}
+var io_wait = {}
+
+trace sched:sched_switch {
+	var prev_comm = arg0
+	var prev_pid = arg1
+	var prev_state = arg3
+	var next_comm = arg4
+	var next_pid = arg5
+	var t = gettimeofday_us()
+
+	if (pid_state[prev_pid] == nil) {
+		#do nothing
+	} elseif (pid_state[prev_pid] == RUNNING) {
+		run_time[prev_pid] += t - prev_timestamp[prev_pid]
+	} elseif (pid_state[prev_pid] == QUEUED) {
+		#found this:
+		#sched_wakeup comm=foo
+		#sched_switch prev_comm=foo
+		run_time[prev_pid] += t - prev_timestamp[prev_pid]
+	}
+
+	pid_names[prev_pid] = prev_comm
+	prev_timestamp[prev_pid] = t
+
+	if (prev_state == DEAD) {
+		pid_state[prev_pid] = DEAD
+	} elseif (prev_state > 0) {
+		if (in_iowait() == 1) {
+			io_wait[prev_pid] = 1
+		}
+		pid_state[prev_pid] = SLEEPING
+	} elseif (prev_state == 0) {
+		pid_state[prev_pid] = QUEUED
+	}
+
+	if (pid_state[next_pid] == nil) {
+		pid_state[next_pid] = RUNNING
+	} elseif (pid_state[next_pid] == QUEUED) {
+		queued_time[next_pid] += t - prev_timestamp[next_pid]
+		pid_state[next_pid] = RUNNING
+	}
+
+	pid_names[next_pid] = next_comm
+	prev_timestamp[next_pid] = t
+}
+
+trace sched:sched_wakeup, sched:sched_wakeup_new {
+	var comm = arg0
+	var wakeup_pid = arg1
+	var success = arg3
+	var t = gettimeofday_us()
+
+	if (pid_state[wakeup_pid] == nil) {
+		#do nothing
+	} elseif (pid_state[wakeup_pid] == SLEEPING) {
+		var duration = t - prev_timestamp[wakeup_pid]
+
+		sleep_time[wakeup_pid] += duration
+		if (io_wait[wakeup_pid] == 1) {
+			io_wait_time[wakeup_pid] += duration
+			io_wait[wakeup_pid] = 0
+		}
+	} elseif (pid_state[wakeup_pid] == RUNNING) {
+		return
+	}
+
+	pid_names[wakeup_pid] = comm
+	prev_timestamp[wakeup_pid] = t
+	pid_state[wakeup_pid] = QUEUED
+}
+
+trace_end {
+	var t = gettimeofday_us()
+	
+	for (_pid, _state in pairs(pid_state)) {
+		var duration = t - prev_timestamp[_pid]
+		if (_state == SLEEPING) {
+			sleep_time[_pid] += duration
+		} elseif (_state == QUEUED) {
+			queued_time[_pid] += duration
+		} elseif (_state == RUNNING) {
+			run_time[_pid] += duration
+		}
+	}
+
+	printf ("%16s: %6s %10s %10s %10s %10s %10s\n\n",
+		"execname", "pid", "run(us)", "sleep(us)", "io_wait(us)",
+		"queued(us)", "total(us)")
+
+	for (_pid, _time in pairs(run_time)) {
+		if (sleep_time[_pid] == nil) {
+			sleep_time[_pid] = 0
+		}
+		if (queued_time[_pid] == nil) {
+			queued_time[_pid] = 0
+		}
+
+		if (io_wait_time[_pid] == nil) {
+			io_wait_time[_pid] = 0
+		}
+
+		printf("%16s: %6d %10d %10d %10d %10d %10d\n",
+			pid_names[_pid], _pid, run_time[_pid],
+			sleep_time[_pid], io_wait_time[_pid],
+			queued_time[_pid], run_time[_pid] + sleep_time[_pid] +
+			queued_time[_pid]);
+	}
+}
diff --git a/tools/ktap/samples/syscalls/errinfo.kp b/tools/ktap/samples/syscalls/errinfo.kp
new file mode 100644
index 0000000..5a0f7a6
--- /dev/null
+++ b/tools/ktap/samples/syscalls/errinfo.kp
@@ -0,0 +1,145 @@
+#!/usr/bin/env ktap
+
+#errdesc taken from include/uapi/asm-generic/errno*.h
+var errdesc = {
+	[1] = "Operation not permitted",		#EPERM
+	[2] = "No such file or directory",		#ENOENT
+	[3] = "No such process",			#ESRCH
+	[4] = "Interrupted system call",		#EINTR
+	[5] = "I/O error",				#EIO
+	[6] = "No such device or address",		#ENXIO
+	[7] = "Argument list too long",			#E2BIG
+	[8] = "Exec format error",			#ENOEXEC
+	[9] = "Bad file number",			#EBADF
+	[10] = "No child processes",			#ECHILD
+	[11] = "Try again",				#EAGAIN
+	[12] = "Out of memory",				#ENOMEM
+	[13] = "Permission denied",			#EACCES
+	[14] = "Bad address",				#EFAULT
+	[15] = "Block device required",			#ENOTBLK
+	[16] = "Device or resource busy",		#EBUSY
+	[17] = "File exists",				#EEXIST
+	[18] = "Cross-device link",			#EXDEV
+	[19] = "No such device",			#ENODEV
+	[20] = "Not a directory",			#ENOTDIR
+	[21] = "Is a directory",			#EISDIR
+	[22] = "Invalid argument",			#EINVAL
+	[23] = "File table overflow",			#ENFILE
+	[24] = "Too many open files",			#EMFILE
+	[25] = "Not a typewriter",			#ENOTTY
+	[26] = "Text file busy",			#ETXTBSY
+	[27] = "File too large",			#EFBIG
+	[28] = "No space left on device",		#ENOSPC
+	[29] = "Illegal seek",				#ESPIPE
+	[30] = "Read-only file system",			#EROFS
+	[31] = "Too many links",			#EMLINK
+	[32] = "Broken pipe",				#EPIPE
+	[33] = "Math argument out of domain of func",	#EDOM
+	[34] = "Math result not representable",		#ERANGE
+
+	[35] = "Resource deadlock would occur",		#EDEADLK
+	[36] = "File name too long", 			#ENAMETOOLONG
+	[37] = "No record locks available",		#ENOLCK
+	[38] = "Function not implemented",		#ENOSYS
+	[39] = "Directory not empty",			#ENOTEMPTY		
+	[40] = "Too many symbolic links encountered",	#ELOOP
+	[42] = "No message of desired type",		#ENOMSG
+	[43] = "Identifier removed",			#EIDRM
+	[44] = "Channel number out of range",		#ECHRNG
+	[45] = "Level 2 not synchronized",		#EL2NSYNC
+	[46] = "Level 3 halted",			#EL3HLT
+	[47] = "Level 3 reset",				#EL3RST			
+	[48] = "Link number out of range",		#ELNRNG
+	[49] = "Protocol driver not attached",		#EUNATCH
+	[50] = "No CSI structure available",		#ENOCSI
+	[51] = "Level 2 halted",			#EL2HLT
+	[52] = "Invalid exchange",			#EBADE
+	[53] = "Invalid request descriptor",		#EBADR
+	[54] = "Exchange full",				#EXFULL
+	[55] = "No anode",				#ENOANO
+	[56] = "Invalid request code",			#EBADRQC
+	[57] = "Invalid slot",				#EBADSLT
+
+	[59] = "Bad font file format",			#EBFONT
+	[60] = "Device not a stream",			#ENOSTR
+	[61] = "No data available",			#ENODATA
+	[62] = "Timer expired",				#ETIME
+	[63] = "Out of streams resources",		#ENOSR
+	[64] = "Machine is not on the network",		#ENONET
+	[65] = "Package not installed",			#ENOPKG
+	[66] = "Object is remote",			#EREMOTE
+	[67] = "Link has been severed",			#ENOLINK
+	[68] = "Advertise error",			#EADV
+	[69] = "Srmount error",				#ESRMNT
+	[70] = "Communication error on send",		#ECOMM
+	[71] = "Protocol error",			#EPROTO
+	[72] = "Multihop attempted",			#EMULTIHOP
+	[73] = "RFS specific error",			#EDOTDOT
+	[74] = "Not a data message",			#EBADMSG
+	[75] = "Value too large for defined data type",	#EOVERFLOW
+	[76] = "Name not unique on network",		#ENOTUNIQ
+	[77] = "File descriptor in bad state",		#EBADFD
+	[78] = "Remote address changed",		#EREMCHG
+	[79] = "Can not access a needed shared library", #ELIBACC
+	[80] = "Accessing a corrupted shared library",	#ELIBBAD
+	[81] = ".lib section in a.out corrupted",	#ELIBSCN
+	[82] = "Attempting to link in too many shared libraries", #ELIBMAX
+	[83] = "Cannot exec a shared library directly",	#ELIBEXEC
+	[84] = "Illegal byte sequence",			#EILSEQ
+	[85] = "Interrupted system call should be restarted", #ERESTART
+	[86] = "Streams pipe error",			#ESTRPIPE
+	[87] = "Too many users",			#EUSERS
+	[88] = "Socket operation on non-socket",	#ENOTSOCK
+	[89] = "Destination address required",		#EDESTADDRREQ
+	[90] = "Message too long",			#EMSGSIZE
+	[91] = "Protocol wrong type for socket",	#EPROTOTYPE
+	[92] = "Protocol not available",		#ENOPROTOOPT
+	[93] = "Protocol not supported",		#EPROTONOSUPPORT
+	[94] = "Socket type not supported",		#ESOCKTNOSUPPORT
+	[95] = "Operation not supported on transport endpoint", #EOPNOTSUPP
+	[96] = "Protocol family not supported",		#EPFNOSUPPORT
+	[97] = "Address family not supported by protocol", #EAFNOSUPPORT
+	[98] = "Address already in use",		#EADDRINUSE
+	[99] = "Cannot assign requested address",	#EADDRNOTAVAIL
+	[100] = "Network is down",			#ENETDOWN
+	[101] = "Network is unreachable",		#ENETUNREACH
+	[102] = "Network dropped connection because of reset",	#ENETRESET
+	[103] = "Software caused connection abort",	#ECONNABORTED
+	[104] = "Connection reset by peer",		#ECONNRESET
+	[105] = "No buffer space available",		#ENOBUFS
+	[106] = "Transport endpoint is already connected", #EISCONN
+	[107] = "Transport endpoint is not connected",	#ENOTCONN
+	[108] = "Cannot send after transport endpoint shutdown", #ESHUTDOWN
+	[109] = "Too many references: cannot splice",	#ETOOMANYREFS
+	[110] = "Connection timed out",			#ETIMEDOUT
+	[111] = "Connection refused",			#ECONNREFUSED
+	[112] = "Host is down",				#EHOSTDOWN
+	[113] = "No route to host",			#EHOSTUNREACH
+	[114] = "Operation already in progress",	#EALREADY
+	[115] = "Operation now in progress",		#EINPROGRESS
+	[116] = "Stale NFS file handle",		#ESTALE
+	[117] = "Structure needs cleaning",		#EUCLEAN
+	[118] = "Not a XENIX named type file",		#ENOTNAM
+	[119] = "No XENIX semaphores available",	#ENAVAIL
+	[120] = "Is a named type file",			#EISNAM
+	[121] = "Remote I/O error",			#EREMOTEIO
+	[122] = "Quota exceeded",			#EDQUOT
+	[123] = "No medium found",			#ENOMEDIUM
+	[124] = "Wrong medium type",			#EMEDIUMTYPE
+	[125] = "Operation Canceled",			#ECANCELED
+	[126] = "Required key not available",		#ENOKEY
+	[127] = "Key has expired",			#EKEYEXPIRED
+	[128] = "Key has been revoked",			#EKEYREVOKED
+	[129] = "Key was rejected by service",		#EKEYREJECTED
+	[130] = "Owner died",				#EOWNERDEAD
+	[131] = "State not recoverable",		#ENOTRECOVERABLE
+
+}
+
+trace syscalls:sys_exit_* {
+	if (arg1 < 0) {
+		var errno = -arg1
+		printf("%-15s%-20s\t%d\t%-30s\n",
+			execname, probename, errno, errdesc[errno])
+	}
+}
diff --git a/tools/ktap/samples/syscalls/execve.kp b/tools/ktap/samples/syscalls/execve.kp
new file mode 100644
index 0000000..8b4115e
--- /dev/null
+++ b/tools/ktap/samples/syscalls/execve.kp
@@ -0,0 +1,9 @@
+#!/usr/bin/env ktap
+
+#This script traces the filename of each process execution
+#only tested in x86-64
+
+trace probe:do_execve filename=%di {
+	printf("[do_execve entry]: (%s) name=%s\n", execname,
+						    kernel_string(arg1))
+}
diff --git a/tools/ktap/samples/syscalls/opensnoop.kp b/tools/ktap/samples/syscalls/opensnoop.kp
new file mode 100644
index 0000000..5e561b2
--- /dev/null
+++ b/tools/ktap/samples/syscalls/opensnoop.kp
@@ -0,0 +1,31 @@
+#!/usr/local/bin/ktap -q
+#
+# opensnoop.kp	trace open syscalls with pathnames and basic info
+#
+# 23-Nov-2013	Brendan Gregg	Created this
+
+var path = {}
+
+printf("%5s %6s %-12s %3s %3s %s\n", "UID", "PID", "COMM", "FD", "ERR", "PATH");
+
+trace syscalls:sys_enter_open {
+	path[tid] = user_string(arg1)
+}
+
+trace syscalls:sys_exit_open {
+	var fd
+	var errno
+
+	if (arg1 < 0) {
+		fd = 0
+		errno = -arg1
+	} else {
+		fd = arg1
+		errno = 0
+	}
+
+	printf("%5d %6d %-12s %3d %3d %s\n", uid, pid, execname, fd,
+	    errno, path[tid])
+
+	path[tid] = 0
+}
diff --git a/tools/ktap/samples/syscalls/sctop.kp b/tools/ktap/samples/syscalls/sctop.kp
new file mode 100644
index 0000000..bd33d21
--- /dev/null
+++ b/tools/ktap/samples/syscalls/sctop.kp
@@ -0,0 +1,13 @@
+#! /usr/bin/env ktap
+
+var s = {}
+
+trace syscalls:sys_enter_* {
+	s[probename] += 1
+}
+
+tick-5s {
+	ansi.clear_screen()
+	print_hist(s)
+	delete(s)
+}
diff --git a/tools/ktap/samples/syscalls/syscalls.kp b/tools/ktap/samples/syscalls/syscalls.kp
new file mode 100644
index 0000000..4eb332a
--- /dev/null
+++ b/tools/ktap/samples/syscalls/syscalls.kp
@@ -0,0 +1,6 @@
+#!/usr/bin/env ktap
+
+trace syscalls:* {
+	print(cpu, pid, execname, argstr)
+}
+
diff --git a/tools/ktap/samples/syscalls/syscalls_count.kp b/tools/ktap/samples/syscalls/syscalls_count.kp
new file mode 100644
index 0000000..a355eef
--- /dev/null
+++ b/tools/ktap/samples/syscalls/syscalls_count.kp
@@ -0,0 +1,54 @@
+#!/usr/bin/env ktap
+
+var s = {}
+
+trace syscalls:sys_enter_* {
+	s[probename] += 1
+}
+
+trace_end {
+	print_hist(s)
+}
+
+#Result:
+#
+#[root@jovi ktap]# ./ktap samples/syscalls_count.kp
+#^C
+#                          value ------------- Distribution ------------- count
+#        sys_enter_rt_sigprocmask |@@@@@@                                 326
+#                  sys_enter_read |@@@@@                                  287
+#                 sys_enter_close |@@@@                                   236
+#                  sys_enter_open |@@@@                                   222
+#                sys_enter_stat64 |@@                                     132
+#                sys_enter_select |@@                                     123
+#          sys_enter_rt_sigaction |@@                                     107
+#                  sys_enter_poll |@                                      72
+#                 sys_enter_write |@                                      70
+#            sys_enter_mmap_pgoff |@                                      58
+#               sys_enter_fstat64 |                                       41
+#             sys_enter_nanosleep |                                       23
+#                sys_enter_access |                                       20
+#              sys_enter_mprotect |                                       18
+#               sys_enter_geteuid |                                       17
+#               sys_enter_getegid |                                       16
+#                sys_enter_getuid |                                       16
+#                sys_enter_getgid |                                       16
+#                   sys_enter_brk |                                       15
+#               sys_enter_waitpid |                                       11
+#                  sys_enter_time |                                       10
+#                 sys_enter_ioctl |                                       9
+#                sys_enter_munmap |                                       9
+#               sys_enter_fcntl64 |                                       7
+#                  sys_enter_dup2 |                                       7
+#                 sys_enter_clone |                                       6
+#            sys_enter_exit_group |                                       6
+#                sys_enter_execve |                                       4
+#                  sys_enter_pipe |                                       3
+#          sys_enter_gettimeofday |                                       3
+#              sys_enter_getdents |                                       2
+#             sys_enter_getgroups |                                       2
+#              sys_enter_statfs64 |                                       2
+#                 sys_enter_lseek |                                       2
+#                sys_enter_openat |                                       1
+#              sys_enter_newuname |                                       1
+
diff --git a/tools/ktap/samples/syscalls/syscalls_count_by_proc.kp b/tools/ktap/samples/syscalls/syscalls_count_by_proc.kp
new file mode 100644
index 0000000..fdd0eff
--- /dev/null
+++ b/tools/ktap/samples/syscalls/syscalls_count_by_proc.kp
@@ -0,0 +1,22 @@
+#!/usr/bin/env ktap
+
+var s = {}
+
+trace syscalls:sys_enter_* {
+	s[execname] += 1
+}
+
+trace_end {
+	print_hist(s)
+}
+
+#Result:
+#
+#[root@jovi ktap]# ./ktap samples/syscalls_count_by_proc.kp
+#^C
+#                          value ------------- Distribution ------------- count
+#                            sshd |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@      196
+#                          iscsid |@@@@                                   24
+#                        sendmail |@                                      9
+
+
diff --git a/tools/ktap/samples/syscalls/syslatl.kp b/tools/ktap/samples/syscalls/syslatl.kp
new file mode 100644
index 0000000..e9cd91d
--- /dev/null
+++ b/tools/ktap/samples/syscalls/syslatl.kp
@@ -0,0 +1,33 @@
+#!/usr/bin/env ktap
+#
+# syslatl.kp	syscall latency linear aggregation
+#
+# 10-Nov-2013	Brendan Gregg	Created this
+
+var step = 10	# number of ms per step
+
+var self = {}
+var lats = {}
+var max = 0
+
+trace syscalls:sys_enter_* {
+	self[tid] = gettimeofday_us()
+}
+
+trace syscalls:sys_exit_* {
+	if (self[tid] == nil) { return }
+	var delta = (gettimeofday_us() - self[tid]) / (step * 1000)
+	if (delta > max) { max = delta }
+	lats[delta] += 1
+	self[tid] = nil
+}
+
+trace_end {
+	printf("   %8s %8s\n", "LAT(ms)+", "COUNT");
+	for (i = 0, max, 1) {
+		if (lats[i] == nil) {
+			lats[i] = 0
+		}
+		printf("   %8d %8d\n", i * step, lats[i]);
+	}
+}
diff --git a/tools/ktap/samples/syscalls/syslist.kp b/tools/ktap/samples/syscalls/syslist.kp
new file mode 100644
index 0000000..09f173d
--- /dev/null
+++ b/tools/ktap/samples/syscalls/syslist.kp
@@ -0,0 +1,31 @@
+#!/usr/bin/env ktap
+#
+# syslist.kp    syscall latency as a list with counts
+#
+# 10-Nov-2013   Brendan Gregg   Created this
+
+var self = {}
+var lats = {}
+var order = {}  # a workaround for key sorting
+
+trace syscalls:sys_enter_* {
+	self[tid] = gettimeofday_us()
+}
+
+trace syscalls:sys_exit_* {
+	if (self[tid] == nil) { return }
+	var delta = gettimeofday_us() - self[tid]
+	lats[delta] += 1
+	order[delta] = delta
+	self[tid] = nil
+}
+
+trace_end {
+	printf("   %8s %8s\n", "LAT(us)", "COUNT");
+	#TODO: use a simpler way to sort keys
+
+	#for (lat, dummy in sort_pairs(order, cmp)) {
+	#    printf("   %8d %8d\n", lat, lats[lat]);
+	#}
+	print_hist(lats)
+}
diff --git a/tools/ktap/samples/tracepoints/eventcount.kp b/tools/ktap/samples/tracepoints/eventcount.kp
new file mode 100644
index 0000000..a9e197b
--- /dev/null
+++ b/tools/ktap/samples/tracepoints/eventcount.kp
@@ -0,0 +1,210 @@
+#!/usr/bin/env ktap
+
+# show hit counts of all tracepoints in histogram style
+
+var s = {}
+
+trace *:* {
+	s[probename] += 1
+}
+
+trace_end {
+	print_hist(s)
+}
+
+#Results:
+#^C
+#
+#                          value ------------- Distribution ------------- count
+#                 rcu_utilization |@@@@@                                  225289
+#                        cpu_idle |@@@                                    120168
+#                    sched_wakeup |@@                                     91950
+#                    timer_cancel |@@                                     91232
+#                     timer_start |@@                                     91201
+#                sched_stat_sleep |@@                                     90981
+#               timer_expire_exit |@@                                     90634
+#              timer_expire_entry |@@                                     90625
+#                  hrtimer_cancel |@                                      75411
+#                   hrtimer_start |@                                      74946
+#                   softirq_raise |@                                      63117
+#                    softirq_exit |@                                      63109
+#                   softirq_entry |@                                      63094
+#                    sched_switch |@                                      62331
+#                 sched_stat_wait |@                                      60491
+#             hrtimer_expire_exit |@                                      47538
+#            hrtimer_expire_entry |@                                      47530
+#              sched_stat_runtime |                                       2780
+#                 kmem_cache_free |                                       2684
+#                kmem_cache_alloc |                                       2415
+#                           kfree |                                       2288
+#                        sys_exit |                                       2145
+#                       sys_enter |                                       2145
+#         sys_exit_rt_sigprocmask |                                       1000
+#        sys_enter_rt_sigprocmask |                                       1000
+#                      timer_init |                                       912
+#              sched_stat_blocked |                                       685
+#                         kmalloc |                                       667
+#           workqueue_execute_end |                                       621
+#         workqueue_execute_start |                                       621
+#                sys_enter_select |                                       566
+#                 sys_exit_select |                                       566
+#                  sys_enter_read |                                       526
+#                   sys_exit_read |                                       526
+#                    mm_page_free |                                       478
+#                   mm_page_alloc |                                       427
+#            mm_page_free_batched |                                       382
+#                   net_dev_queue |                                       296
+#                    net_dev_xmit |                                       296
+#                     consume_skb |                                       296
+#                  sys_exit_write |                                       290
+#                 sys_enter_write |                                       290
+#                       kfree_skb |                                       289
+#           kmem_cache_alloc_node |                                       269
+#                    kmalloc_node |                                       263
+#                 sys_enter_close |                                       249
+#                  sys_exit_close |                                       249
+#                    hrtimer_init |                                       248
+#               netif_receive_skb |                                       242
+#                  sys_enter_open |                                       237
+#                   sys_exit_open |                                       237
+#                       napi_poll |                                       226
+#              sched_migrate_task |                                       207
+#                   sys_exit_poll |                                       173
+#                  sys_enter_poll |                                       173
+#            workqueue_queue_work |                                       152
+#         workqueue_activate_work |                                       152
+#                sys_enter_stat64 |                                       133
+#                 sys_exit_stat64 |                                       133
+#           sys_exit_rt_sigaction |                                       133
+#          sys_enter_rt_sigaction |                                       133
+#               irq_handler_entry |                                       125
+#                irq_handler_exit |                                       125
+#       mm_page_alloc_zone_locked |                                       99
+#             sys_exit_mmap_pgoff |                                       66
+#            sys_enter_mmap_pgoff |                                       66
+#                sys_exit_fstat64 |                                       54
+#               sys_enter_fstat64 |                                       54
+#             sys_enter_nanosleep |                                       51
+#              sys_exit_nanosleep |                                       51
+#                 block_bio_queue |                                       46
+#                 block_bio_remap |                                       46
+#              block_bio_complete |                                       46
+#                  mix_pool_bytes |                                       44
+#              mm_page_pcpu_drain |                                       31
+#                   sys_exit_time |                                       23
+#                  sys_enter_time |                                       23
+#                 sys_exit_access |                                       20
+#                sys_enter_access |                                       20
+#           mix_pool_bytes_nolock |                                       18
+#              sys_enter_mprotect |                                       18
+#               sys_exit_mprotect |                                       18
+#               sys_enter_geteuid |                                       17
+#                sys_exit_geteuid |                                       17
+#                sys_enter_munmap |                                       17
+#                 sys_exit_munmap |                                       17
+#                     block_getrq |                                       16
+#                sys_enter_getuid |                                       16
+#                sys_enter_getgid |                                       16
+#                 sys_exit_getgid |                                       16
+#                 sys_exit_getuid |                                       16
+#                  block_rq_issue |                                       16
+#         scsi_dispatch_cmd_start |                                       16
+#               block_rq_complete |                                       16
+#          scsi_dispatch_cmd_done |                                       16
+#               sys_enter_getegid |                                       16
+#                sys_exit_getegid |                                       16
+#                 block_rq_insert |                                       16
+#         skb_copy_datagram_iovec |                                       15
+#                   sys_enter_brk |                                       15
+#                    sys_exit_brk |                                       15
+#             credit_entropy_bits |                                       14
+#                   wbc_writepage |                                       14
+#                  sys_exit_clone |                                       12
+#              block_touch_buffer |                                       12
+#              sched_process_wait |                                       11
+#               sys_enter_waitpid |                                       11
+#                sys_exit_waitpid |                                       11
+#               writeback_written |                                       10
+#                 writeback_start |                                       10
+#              writeback_queue_io |                                       10
+#     ext4_es_lookup_extent_enter |                                       9
+#                 sys_enter_ioctl |                                       9
+#                  sys_exit_ioctl |                                       9
+#       ext4_ext_map_blocks_enter |                                       9
+#        ext4_ext_map_blocks_exit |                                       9
+#      ext4_es_lookup_extent_exit |                                       9
+#           ext4_es_insert_extent |                                       9
+#            ext4_ext_show_extent |                                       8
+#                 extract_entropy |                                       8
+#ext4_es_find_delayed_extent_exit |                                       8
+# ext4_es_find_delayed_extent_... |                                       8
+#         writeback_pages_written |                                       7
+#                   sys_exit_dup2 |                                       7
+#                  sys_enter_dup2 |                                       7
+#                 signal_generate |                                       7
+#               sys_enter_fcntl64 |                                       7
+#                sys_exit_fcntl64 |                                       7
+#              global_dirty_state |                                       7
+#     writeback_dirty_inode_start |                                       7
+#             block_bio_backmerge |                                       7
+#           writeback_dirty_inode |                                       7
+#                sched_wakeup_new |                                       6
+#              sched_process_free |                                       6
+#            sys_enter_exit_group |                                       6
+#                    task_newtask |                                       6
+#                 sys_enter_clone |                                       6
+#              sched_process_fork |                                       6
+#              sched_process_exit |                                       6
+#           sys_exit_gettimeofday |                                       5
+#                  signal_deliver |                                       5
+#          sys_enter_gettimeofday |                                       5
+#          writeback_single_inode |                                       4
+#                sys_enter_execve |                                       4
+#                     task_rename |                                       4
+#              sched_process_exec |                                       4
+#              block_dirty_buffer |                                       4
+#                 sys_exit_execve |                                       4
+#                    block_unplug |                                       4
+#               sched_stat_iowait |                                       4
+#    writeback_single_inode_start |                                       4
+#                      block_plug |                                       4
+#           writeback_write_inode |                                       3
+#                  sys_enter_pipe |                                       3
+#            writeback_dirty_page |                                       3
+#     writeback_write_inode_start |                                       3
+#           ext4_mark_inode_dirty |                                       3
+#              ext4_journal_start |                                       3
+#                   sys_exit_pipe |                                       3
+#           jbd2_drop_transaction |                                       2
+#             jbd2_commit_locking |                                       2
+#            jbd2_commit_flushing |                                       2
+#               jbd2_handle_start |                                       2
+#                  jbd2_run_stats |                                       2
+#               sys_exit_getdents |                                       2
+#           jbd2_checkpoint_stats |                                       2
+#             sys_enter_getgroups |                                       2
+#               jbd2_start_commit |                                       2
+#                 jbd2_end_commit |                                       2
+#              ext4_da_writepages |                                       2
+#               jbd2_handle_stats |                                       2
+#              sys_enter_statfs64 |                                       2
+#               sys_exit_statfs64 |                                       2
+#              sys_exit_getgroups |                                       2
+#                  sys_exit_lseek |                                       2
+#                 sys_enter_lseek |                                       2
+#              sys_enter_getdents |                                       2
+#             ext4_da_write_pages |                                       2
+#             jbd2_commit_logging |                                       2
+#             ext4_request_blocks |                                       1
+#                 sys_exit_openat |                                       1
+#     ext4_discard_preallocations |                                       1
+#              ext4_mballoc_alloc |                                       1
+#                sys_enter_openat |                                       1
+#       ext4_da_writepages_result |                                       1
+#            ext4_allocate_blocks |                                       1
+#              sys_enter_newuname |                                       1
+#    ext4_da_update_reserve_space |                                       1
+# ext4_get_reserved_cluster_alloc |                                       1
+#               sys_exit_newuname |                                       1
+#           writeback_wake_thread |                                       1
+
diff --git a/tools/ktap/samples/tracepoints/eventcount_by_proc.kp b/tools/ktap/samples/tracepoints/eventcount_by_proc.kp
new file mode 100644
index 0000000..1b95f19
--- /dev/null
+++ b/tools/ktap/samples/tracepoints/eventcount_by_proc.kp
@@ -0,0 +1,57 @@
+#!/usr/bin/env ktap
+
+# showing tracepoint event counts per process in histogram style
+
+var s = {}
+
+trace *:* {
+	s[execname] += 1
+}
+
+trace_end {
+	print_hist(s)
+}
+
+#Results:
+#^C
+#                          value ------------- Distribution ------------- count
+#                       swapper/0 |@@@@@@@@@@@@                           354378
+#                       swapper/1 |@@@@@@@@@@                             284984
+#                              ps |@@@@                                   115697
+#                        ksmtuned |@@@                                    95857
+#                          iscsid |@@                                     80008
+#                             awk |@                                      30354
+#                      irqbalance |                                       16530
+#                       rcu_sched |                                       15892
+#                        sendmail |                                       14463
+#                     kworker/0:1 |                                       10540
+#                    kworker/u4:2 |                                       9250
+#                     kworker/1:2 |                                       7943
+#                           sleep |                                       7555
+#                           crond |                                       3911
+#                     ksoftirqd/0 |                                       3817
+#                            sshd |                                       2849
+#                 systemd-journal |                                       2209
+#                     migration/1 |                                       1601
+#                     migration/0 |                                       1350
+#                        dhclient |                                       1343
+#                 nm-dhcp-client. |                                       1208
+#                     ksoftirqd/1 |                                       1064
+#                      watchdog/1 |                                       966
+#                      watchdog/0 |                                       964
+#                      khugepaged |                                       776
+#                     dbus-daemon |                                       611
+#                         rpcbind |                                       607
+#                           gdbus |                                       529
+#                  NetworkManager |                                       399
+#                     jbd2/dm-1-8 |                                       378
+#                   modem-manager |                                       184
+#                  abrt-watch-log |                                       157
+#                         polkitd |                                       156
+#                   rs:main Q:Reg |                                       153
+#                    avahi-daemon |                                       151
+#                        rsyslogd |                                       102
+#                         systemd |                                       96
+#                    kworker/0:1H |                                       45
+#                          smartd |                                       30
+
diff --git a/tools/ktap/samples/tracepoints/raw_tracepoint.kp b/tools/ktap/samples/tracepoints/raw_tracepoint.kp
new file mode 100644
index 0000000..cffda45
--- /dev/null
+++ b/tools/ktap/samples/tracepoints/raw_tracepoint.kp
@@ -0,0 +1,15 @@
+#!/usr/bin/env ktap
+
+#This script uses kdebug.tracepoint, not the 'trace' keyword, which uses the perf backend.
+#
+#The overhead of kdebug.tracepoint is much lower than that of normal perf
+#backend tracing.
+
+kdebug.tracepoint("sys_enter_open", function () {
+	printf("sys_enter_open: (%s) open file (%s)\n",
+		execname,  user_string(arg1))
+})
+
+kdebug.tracepoint("sys_exit_open", function () {
+	printf("sys_exit_open: return fd: (%d)\n", arg1)
+})
diff --git a/tools/ktap/samples/tracepoints/tracepoints.kp b/tools/ktap/samples/tracepoints/tracepoints.kp
new file mode 100644
index 0000000..3ff29b5
--- /dev/null
+++ b/tools/ktap/samples/tracepoints/tracepoints.kp
@@ -0,0 +1,6 @@
+#!/usr/bin/env ktap
+
+trace *:* {
+	print(cpu, pid, execname, argstr)
+}
+
diff --git a/tools/ktap/samples/userspace/gcc_unwind.kp b/tools/ktap/samples/userspace/gcc_unwind.kp
new file mode 100644
index 0000000..48db46e
--- /dev/null
+++ b/tools/ktap/samples/userspace/gcc_unwind.kp
@@ -0,0 +1,9 @@
+#!/usr/bin/env ktap
+
+#only tested on x86-64 systems,
+#if you run this script on x86_32, change the libgcc path.
+
+trace sdt:/lib/x86_64-linux-gnu/libgcc_s.so.1:unwind {
+	print(execname, argstr)
+}
+
diff --git a/tools/ktap/samples/userspace/glibc_func_hist.kp b/tools/ktap/samples/userspace/glibc_func_hist.kp
new file mode 100644
index 0000000..bb3dd9f
--- /dev/null
+++ b/tools/ktap/samples/userspace/glibc_func_hist.kp
@@ -0,0 +1,44 @@
+#!/usr/bin/env ktap
+
+#This ktap script traces all glibc functions, with histogram output
+
+#only tested in x86-64 system,
+#if you run this script in x86_32, change the libc path.
+
+var s = {}
+
+trace probe:/lib64/libc.so.6:* {
+	s[probename] += 1
+}
+
+trace_end {
+	print_hist(s)
+}
+
+# Example result:
+#[root@localhost ktap]# ./ktap ./glibc_func_hist.kp
+#Tracing... Ctrl-C to end.
+#^C
+#                          value ------------- Distribution ------------- count
+#                   _IO_sputbackc |                                       1536
+#                  __strncmp_sse2 |                                       1522
+#                    __GI_strncmp |                                       1522
+#                     __GI_memcpy |                                       1446
+#                   __memcpy_sse2 |                                       1446
+#        _dl_mcount_wrapper_check |                                       1433
+#   __GI__dl_mcount_wrapper_check |                                       1433
+# __gconv_transform_utf8_internal |                                       1429
+#                       __mbrtowc |                                       1425
+#                        mbrtoc32 |                                       1425
+#                  __GI___mbrtowc |                                       1425
+#                         mbrtowc |                                       1425
+#                    __GI_mbrtowc |                                       1425
+#                         strtouq |                                       1274
+#                        strtoull |                                       1274
+#                         strtoul |                                       1274
+#          __ctype_get_mb_cur_max |                                       984
+#         ____strtoull_l_internal |                                       970
+#     __GI_____strtoul_l_internal |                                       970
+#              __GI__IO_sputbackc |                                       960
+#                             ... |
+
diff --git a/tools/ktap/samples/userspace/glibc_sdt.kp b/tools/ktap/samples/userspace/glibc_sdt.kp
new file mode 100644
index 0000000..e396901
--- /dev/null
+++ b/tools/ktap/samples/userspace/glibc_sdt.kp
@@ -0,0 +1,11 @@
+#!/usr/bin/env ktap
+
+#This ktap script traces all sdt notes in glibc
+
+#only tested in x86-64 system,
+#if you run this script in x86_32, change the libc path.
+
+trace sdt:/lib64/libc.so.6:* {
+	print(execname, argstr)
+}
+
diff --git a/tools/ktap/samples/userspace/glibc_trace.kp b/tools/ktap/samples/userspace/glibc_trace.kp
new file mode 100644
index 0000000..9b8d16e
--- /dev/null
+++ b/tools/ktap/samples/userspace/glibc_trace.kp
@@ -0,0 +1,11 @@
+#!/usr/bin/env ktap
+
+#This ktap script traces all functions in glibc
+
+#only tested in x86-64 system,
+#if you run this script in x86_32, change the libc path.
+
+trace probe:/lib64/libc.so.6:* {
+	print(execname, argstr)
+}
+
diff --git a/tools/ktap/samples/userspace/malloc_free.kp b/tools/ktap/samples/userspace/malloc_free.kp
new file mode 100644
index 0000000..884d99a
--- /dev/null
+++ b/tools/ktap/samples/userspace/malloc_free.kp
@@ -0,0 +1,20 @@
+#!/usr/bin/env ktap
+
+#only tested in x86-64 system,
+#if you run this script in x86_32, change the libc path.
+
+trace probe:/lib64/libc.so.6:malloc {
+	print("malloc entry:", execname)
+}
+
+trace probe:/lib64/libc.so.6:malloc%return {
+	print("malloc exit:", execname)
+}
+
+trace probe:/lib64/libc.so.6:free {
+	print("free entry:", execname)
+}
+
+trace probe:/lib64/libc.so.6:free%return {
+	print("free exit:", execname)
+}
diff --git a/tools/ktap/samples/userspace/malloc_size_hist.kp b/tools/ktap/samples/userspace/malloc_size_hist.kp
new file mode 100644
index 0000000..41bd202
--- /dev/null
+++ b/tools/ktap/samples/userspace/malloc_size_hist.kp
@@ -0,0 +1,22 @@
+#!/usr/bin/env ktap
+
+# Aggregate system or process malloc size
+
+# only tested in x86-64 system,
+# if you run this script in x86_32, change the libc path and register name.
+#
+# Examples:
+#
+# ktap malloc_size_hist.kp
+# ktap malloc_size_hist.kp -- ls
+
+var s = {}
+
+trace probe:/lib64/libc.so.6:malloc size=%di {
+	#arg1 is argument "size" of malloc function
+	s[arg1] += 1
+}
+
+trace_end {
+	print_hist(s)
+}
diff --git a/tools/ktap/samples/userspace/pthread.kp b/tools/ktap/samples/userspace/pthread.kp
new file mode 100644
index 0000000..be693ac
--- /dev/null
+++ b/tools/ktap/samples/userspace/pthread.kp
@@ -0,0 +1,8 @@
+#!/usr/bin/env ktap
+
+# This script traces pthread_mutex* related calls in libpthread
+# Tested in x86_64
+
+trace probe:/lib64/libpthread-2.17.so:pthread_mutex_* {
+	print(execname, argstr)
+}
-- 
1.8.1.4



* [PATCH v2 04/29] ktap: add basic ktap types definition(include/uapi/ktap/ktap_types.h)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (2 preceding siblings ...)
  2014-03-28 14:44 ` [PATCH v2 03/29] ktap: add sample scripts(tools/ktap/samples/*) Jovi Zhangwei
@ 2014-03-28 14:44 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 05/29] ktap: add bytecode definition(include/uapi/ktap/ktap_bc.h) Jovi Zhangwei
                   ` (25 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:44 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

The file 'include/uapi/ktap/ktap_types.h' is the cornerstone of the ktap
runtime and userspace tool; it contains the key structure definitions for ktap.

1. ktap_val_t
   ktap_val_t is the ktap value representation on the stack, and is also
   used for object references. Each value has a type, which can be:
        nil, false, true, number, string, upval, proto, func,
        table, eventstr, kstrace, kip, uip.

   The ktap_val_t structure occupies 16 bytes on x86_64.

2. ktap_number
   The number type in ktap; it is a 'long'.

3. ktap_str_t
   String type. All strings in ktap are interned, which means ktap keeps
   a single copy of any string. Whenever a new string appears, ktap
   checks whether it already has a copy of that string and, if so,
   reuses that copy. Interning makes operations like string comparison
   and table indexing very fast, but it slows down string creation.

   Strings are allocated from the mempool (runtime/kp_mempool.c), not
   by kmalloc, because we don't want to allocate strings dynamically in
   probe context.

   The actual string buffer is appended after ktap_str_t.

4. ktap_proto_t
   Prototype structure for a ktap function.

5. ktap_upval_t
   The concept of upval is taken from Lua.
        var a = 1
        function f() {
                print(a)   <--- 'a' is upval in 'f'.
        }

6. ktap_func_t
   An instance of ktap_proto_t, also called a 'closure'; it is
   created for each function prototype at runtime.
   ktap_func_t contains an upvals array for accessing variables in
   enclosing scopes.

7. ktap_tab_t
   ktap table (associative array) structure; it contains an array part
   and a hash part.
   Tables are pre-allocated by vmalloc and cannot be resized; an
   overflow is reported when the entries are full.

8. ktap_state_t
   ktap context structure.
   There are several contexts in the ktap runtime:
   1). mainthread context
      The mainthread context is the context of the ktap thread.
   2). probe context
      The ktap runtime pre-allocates a percpu probe context.

9. ktap_global_state_t
   Global state of the ktap runtime; it is accessed via G(ks).
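
To make the 16-byte figure in item 1 concrete, here is a minimal sketch of
a tagged value. It is not the definition from this patch: every sketch_*,
sk_* and SK_* name is hypothetical, and the payload union is reduced to a
pointer and a number. On an LP64 target such as x86_64 the 8-byte union,
the 4-byte tag and 4 bytes of padding add up to 16 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags; the real header defines its own constants. */
enum sketch_type { SK_NIL, SK_FALSE, SK_TRUE, SK_NUMBER, SK_STRING };

typedef long sketch_number;	/* mirrors "it is a 'long'" from item 2 */

/* Simplified tagged value: a payload union plus a type tag. */
typedef struct sketch_val {
	union {
		void *p;		/* object reference or light pointer */
		sketch_number n;	/* number payload */
	} val;
	int type;			/* tells which union member is live */
} sketch_val_t;

/* Build a number value. */
static inline sketch_val_t sk_number(sketch_number n)
{
	sketch_val_t v;

	v.val.n = n;
	v.type = SK_NUMBER;
	return v;
}

/* Build a nil value; the payload is unused but zeroed for tidiness. */
static inline sketch_val_t sk_nil(void)
{
	sketch_val_t v;

	v.val.p = NULL;
	v.type = SK_NIL;
	return v;
}

/* Type check: only the tag is inspected, never the payload. */
static inline int sk_is_number(const sketch_val_t *v)
{
	return v->type == SK_NUMBER;
}
```

A caller would create a value with sk_number(42) and test it with
sk_is_number() before reading val.n.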

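The interning scheme from item 3 can be sketched roughly as below, under
several stated assumptions: the sk_* names, the pool sizes, the linear
lookup and the FNV-1a hash are all illustrative choices and are not taken
from the ktap sources. Unlike ktap_str_t, the sketch keeps a pointer to
the characters instead of appending them after the header. It does share
the main property: storage comes from a fixed pool, so nothing is
allocated dynamically, and equal strings yield the same pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_POOL_SIZE	4096	/* hypothetical pool size in bytes */
#define SK_MAX_STRINGS	64	/* hypothetical cap on distinct strings */

typedef struct sk_str {
	unsigned int hash;
	int len;		/* number of characters, excluding NUL */
	const char *data;	/* points into sk_pool */
} sk_str_t;

static char sk_pool[SK_POOL_SIZE];		/* fixed character storage */
static size_t sk_pool_used;
static sk_str_t sk_strings[SK_MAX_STRINGS];	/* one entry per string */
static int sk_nstrings;

/* FNV-1a, chosen only for brevity; ktap's hash function may differ. */
static unsigned int sk_hash(const char *s, size_t len)
{
	unsigned int h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= (unsigned char)s[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Return the unique entry for 's', creating it on first use.
 * Returns NULL when the pool or the entry array is exhausted.
 */
const sk_str_t *sk_intern(const char *s)
{
	size_t len = strlen(s);
	unsigned int h = sk_hash(s, len);
	sk_str_t *e;
	char *dst;
	int i;

	/* Reuse an existing copy if there is one. */
	for (i = 0; i < sk_nstrings; i++) {
		e = &sk_strings[i];
		if (e->hash == h && (size_t)e->len == len &&
		    memcmp(e->data, s, len) == 0)
			return e;
	}

	if (sk_nstrings == SK_MAX_STRINGS ||
	    len + 1 > SK_POOL_SIZE - sk_pool_used)
		return NULL;

	/* Copy the characters (with NUL) into the fixed pool. */
	dst = sk_pool + sk_pool_used;
	memcpy(dst, s, len + 1);
	sk_pool_used += len + 1;

	e = &sk_strings[sk_nstrings++];
	e->hash = h;
	e->len = (int)len;
	e->data = dst;
	return e;
}
```

With this in place, comparing two interned strings is a pointer
comparison, which is why string comparison and table indexing become
cheap while creation pays for the lookup.
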
Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 include/uapi/ktap/ktap_types.h | 462 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 462 insertions(+)
 create mode 100644 include/uapi/ktap/ktap_types.h

diff --git a/include/uapi/ktap/ktap_types.h b/include/uapi/ktap/ktap_types.h
new file mode 100644
index 0000000..4509397
--- /dev/null
+++ b/include/uapi/ktap/ktap_types.h
@@ -0,0 +1,462 @@
+/*
+ * ktap types definition.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ * Copyright (C) 2005-2014 Mike Pall.
+ * Copyright (C) 1994-2008 Lua.org, PUC-Rio.
+ */
+
+#ifndef __KTAP_TYPES_H__
+#define __KTAP_TYPES_H__
+
+#ifdef __KERNEL__
+#include <linux/perf_event.h>
+#else
+typedef unsigned char u8;
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+typedef int ptrdiff_t;
+#endif
+
+#include "ktap_bc.h"
+
+/* Various VM limits. */
+#define KP_MAX_MEMPOOL_SIZE	10000	/* Max. mempool size(Kbytes). */
+#define KP_MAX_STR	512		/* Max. string length. */
+#define KP_MAX_STRNUM	9999		/* Max. string number. */
+
+#define KP_MAX_STRTAB	(1<<26)		/* Max. string table size. */
+#define KP_MAX_HBITS	26		/* Max. hash bits. */
+#define KP_MAX_ABITS	28		/* Max. bits of array key. */
+#define KP_MAX_ASIZE	((1<<(KP_MAX_ABITS-1))+1)  /* Max. array part size. */
+#define KP_MAX_COLOSIZE	16		/* Max. elems for colocated array. */
+
+#define KP_MAX_LINE	1000		/* Max. source code line number. */
+#define KP_MAX_XLEVEL	200		/* Max. syntactic nesting level. */
+#define KP_MAX_BCINS	(1<<26)		/* Max. # of bytecode instructions. */
+#define KP_MAX_SLOTS	250		/* Max. # of slots in a ktap func. */
+#define KP_MAX_LOCVAR	200		/* Max. # of local variables. */
+#define KP_MAX_UPVAL	60		/* Max. # of upvalues. */
+
+#define KP_MAX_CACHED_CFUNCTION	128	/* Max. cached global cfunction */
+
+#define KP_MAX_STACK_DEPTH	50	/* Max. stack depth */
+
+/*
+ * The first argument type of kdebug.trace_by_id()
+ * The value is a userspace memory pointer.
+ * Maybe embed it into the trunk file in future.
+ */
+typedef struct ktap_eventdesc {
+	int nr;  /* number of event ids */
+	int *id_arr; /* event id array */
+	char *filter;
+} ktap_eventdesc_t;
+
+
+/* ktap option for each script */
+typedef struct ktap_option {
+	char *trunk; /* __user */
+	int trunk_len;
+	int argc;
+	char **argv; /* __user */
+	int verbose;
+	int trace_pid;
+	int workload;
+	int trace_cpu;
+	int print_timestamp;
+	int quiet;
+	int dry_run;
+} ktap_option_t;
+
+/*
+ * Ioctls that can be done on a ktap fd:
+ * todo: use _IO macro in include/uapi/asm-generic/ioctl.h
+ */
+#define KTAP_CMD_IOC_RUN		('$' + 1)
+#define KTAP_CMD_IOC_EXIT		('$' + 3)
+
+#define KTAP_VERSION_MAJOR       "0"
+#define KTAP_VERSION_MINOR       "4"
+
+#define KTAP_VERSION    "ktap " KTAP_VERSION_MAJOR "." KTAP_VERSION_MINOR
+#define KTAP_AUTHOR    "Jovi Zhangwei <jovi.zhangwei@gmail.com>"
+#define KTAP_COPYRIGHT  KTAP_VERSION "  Copyright (C) 2012-2014, " KTAP_AUTHOR
+
+#define MYINT(s)        (s[0] - '0')
+#define VERSION         (MYINT(KTAP_VERSION_MAJOR) * 16 + MYINT(KTAP_VERSION_MINOR))
+
+typedef long ktap_number;
+typedef int ktap_instr_t;
+typedef union ktap_obj ktap_obj_t;
+
+struct ktap_state;
+typedef int (*ktap_cfunction) (struct ktap_state *ks);
+
+/* ktap_val_t is the basic value type on the ktap stack; it references all objects */
+typedef struct ktap_val {
+	union {
+		ktap_obj_t *gc;		/* collectable objects, str/tab/... */
+		void *p;		/* light userdata */
+		ktap_cfunction f;	/* light C functions */
+		ktap_number n;		/* numbers */
+		struct {
+			uint16_t depth;	/* stack depth */
+			uint16_t skip;	/* skip stack entries */
+		} stack;
+	} val;
+	union {
+		int type;		/* type for above val */
+		const unsigned int *pcr;/* Overlaps PC for ktap frames.*/
+	};
+} ktap_val_t;
+
+typedef ktap_val_t *StkId;
+
+#define GCHeader ktap_obj_t *nextgc; u8 gct;
+
+typedef struct ktap_str {
+	GCHeader;
+	u8 reserved;  /* Used by lexer for fast lookup of reserved words. */
+	u8 extra;
+	unsigned int hash;
+	int len;  /* number of characters in string */
+} ktap_str_t;
+
+typedef struct ktap_upval {
+	GCHeader;
+	uint8_t closed;	/* Set if closed (i.e. uv->v == &uv->tv). */
+	uint8_t immutable;	/* Immutable value. */
+	union {
+		ktap_val_t tv; /* If closed: the value itself. */
+		struct { /* If open: double linked list, anchored at thread. */
+			struct ktap_upval *prev;
+			struct ktap_upval *next;
+		};
+	};
+	ktap_val_t *v;  /* Points to stack slot (open) or above (closed). */
+} ktap_upval_t;
+
+
+typedef struct ktap_func {
+	GCHeader;
+	u8 nupvalues;
+	BCIns *pc;
+	struct ktap_proto *p;
+	struct ktap_upval *upvals[1];  /* list of upvalues */
+} ktap_func_t;
+
+typedef struct ktap_proto {
+	GCHeader;
+	uint8_t numparams;	/* Number of parameters. */
+	uint8_t framesize;	/* Fixed frame size. */
+	int sizebc;		/* Number of bytecode instructions. */
+	ktap_obj_t *gclist;
+	void *k;	/* Split constant array (points to the middle). */
+	void *uv;	/* Upvalue list. local slot|0x8000 or parent uv idx. */
+	int sizekgc;	/* Number of collectable constants. */
+	int sizekn;	/* Number of lua_Number constants. */
+	int sizept;	/* Total size including colocated arrays. */
+	uint8_t sizeuv;	/* Number of upvalues. */
+	uint8_t flags;	/* Miscellaneous flags (see below). */
+
+	/* --- The following fields are for debugging/tracebacks only --- */
+	ktap_str_t *chunkname;	/* Chunk name this function was defined in. */
+	BCLine firstline;	/* First line of the function definition. */
+	BCLine numline;	/* Number of lines for the function definition. */
+	void *lineinfo;	/* Compressed map from bytecode ins. to source line. */
+	void *uvinfo;	/* Upvalue names. */
+	void *varinfo;	/* Names and compressed extents of local variables. */
+} ktap_proto_t;
+
+/* Flags for prototype. */
+#define PROTO_CHILD		0x01	/* Has child prototypes. */
+#define PROTO_VARARG		0x02	/* Vararg function. */
+#define PROTO_FFI		0x04	/* Uses BC_KCDATA for FFI datatypes. */
+#define PROTO_NOJIT		0x08	/* JIT disabled for this function. */
+#define PROTO_ILOOP		0x10	/* Patched bytecode with ILOOP etc. */
+/* Only used during parsing. */
+#define PROTO_HAS_RETURN	0x20	/* Already emitted a return. */
+#define PROTO_FIXUP_RETURN	0x40	/* Need to fixup emitted returns. */
+/* Top bits used for counting created closures. */
+#define PROTO_CLCOUNT		0x20	/* Base of saturating 3 bit counter. */
+#define PROTO_CLC_BITS		3
+#define PROTO_CLC_POLY		(3*PROTO_CLCOUNT)  /* Polymorphic threshold. */
+
+#define PROTO_UV_LOCAL		0x8000	/* Upvalue for local slot. */
+#define PROTO_UV_IMMUTABLE	0x4000	/* Immutable upvalue. */
+
+#define proto_kgc(pt, idx)	(((ktap_obj_t *)(pt)->k)[idx])
+#define proto_bc(pt)		((BCIns *)((char *)(pt) + sizeof(ktap_proto_t)))
+#define proto_bcpos(pt, pc)	((BCPos)((pc) - proto_bc(pt)))
+#define proto_uv(pt)		((uint16_t *)(pt)->uv)
+
+#define proto_chunkname(pt)	((pt)->chunkname)
+#define proto_lineinfo(pt)	((const void *)(pt)->lineinfo)
+#define proto_uvinfo(pt)	((const uint8_t *)(pt)->uvinfo)
+#define proto_varinfo(pt)	((const uint8_t *)(pt)->varinfo)
+
+
+typedef struct ktap_node_t {
+	ktap_val_t val; /* Value object. Must be first field. */
+	ktap_val_t key; /* Key object. */
+	struct ktap_node_t *next;  /* hash chain */
+} ktap_node_t;
+
+/* ktap_tab */
+typedef struct ktap_tab {
+	GCHeader;
+#ifdef __KERNEL__
+	arch_spinlock_t lock;
+#endif
+	ktap_val_t *array;    /* Array part. */
+	ktap_node_t *node;    /* Hash part. */
+	ktap_node_t *freetop; /* any free position is before this position */
+
+	uint32_t asize;		/* Size of array part (keys [0, asize-1]). */
+	uint32_t hmask;		/* log2 of size of `node' array */
+
+	uint32_t hnum;		/* number of all nodes */
+} ktap_tab_t;
+
+typedef struct ktap_stats {
+	int mem_allocated;
+	int nr_mem_allocate;
+	int events_hits;
+	int events_missed;
+} ktap_stats_t;
+
+#define KTAP_STATS(ks)	this_cpu_ptr(G(ks)->stats)
+
+
+#define KTAP_RUNNING	0 /* normal running state */
+#define KTAP_TRACE_END	1 /* running in trace_end function */
+#define KTAP_EXIT	2 /* normal exit, set when call exit() */
+#define KTAP_ERROR	3 /* error state, called by kp_error */
+
+typedef struct ktap_global_state {
+	void *mempool;		/* string memory pool */
+	void *mp_freepos;	/* free position in memory pool */
+	int mp_size;		/* memory pool size */
+#ifdef __KERNEL__
+	arch_spinlock_t mp_lock;/* mempool lock */
+#endif
+
+	ktap_str_t **strhash;	/* String hash table (hash chain anchors). */
+	int strmask;		/* String hash mask (size of hash table-1). */
+	int strnum;		/* Number of strings in hash table. */
+#ifdef __KERNEL__
+	arch_spinlock_t str_lock; /* string operation lock */
+#endif
+
+	ktap_val_t registry;
+	ktap_tab_t *gtab;	/* global table contains cfunction and args */
+	ktap_obj_t *allgc; /* list of all collectable objects */
+	ktap_upval_t uvhead; /* head of list of all open upvalues */
+
+	struct ktap_state *mainthread; /*main state */
+	int state; /* status of ktapvm, KTAP_RUNNING, KTAP_TRACE_END, etc */
+#ifdef __KERNEL__
+	/* reserved global percpu data */
+	void __percpu *percpu_state[PERF_NR_CONTEXTS];
+	void __percpu *percpu_print_buffer[PERF_NR_CONTEXTS];
+	void __percpu *percpu_temp_buffer[PERF_NR_CONTEXTS];
+
+	/* for recursion tracing check */
+	int __percpu *recursion_context[PERF_NR_CONTEXTS];
+
+	ktap_option_t *parm; /* ktap options */
+	pid_t trace_pid;
+	struct task_struct *trace_task;
+	cpumask_var_t cpumask;
+	struct ring_buffer *buffer;
+	struct dentry *trace_pipe_dentry;
+	struct task_struct *task;
+	int trace_enabled;
+	int wait_user; /* flag to indicate waiting for user to consume content */
+
+	struct list_head timers; /* timer list */
+	struct ktap_stats __percpu *stats; /* memory allocation stats */
+	struct list_head events_head; /* probe event list */
+
+	ktap_func_t *trace_end_closure; /* trace_end closure */
+
+	/* C function table for fast call */
+	int nr_builtin_cfunction;
+	ktap_cfunction gfunc_tbl[KP_MAX_CACHED_CFUNCTION];
+#endif
+} ktap_global_state_t;
+
+
+typedef struct ktap_state {
+	ktap_global_state_t *g;	/* global state */
+	int stop;		/* don't enter tracing handler if stop is 1 */
+	StkId top;		/* stack top */
+	StkId func;		/* execute light C function */
+	StkId stack_last;	/* last stack pointer */
+	StkId stack;		/* ktap stack, percpu pre-reserved */
+	ktap_upval_t *openupval;/* opened upvals list */
+
+#ifdef __KERNEL__
+	/* currently fired event, which is allocated on the stack */
+	struct ktap_event_data *current_event;
+#endif
+} ktap_state_t;
+
+#define G(ks)   (ks->g)
+
+/*
+ * Union of all collectable objects
+ */
+union ktap_obj {
+	struct { GCHeader } gch;
+	struct ktap_str ts;
+	struct ktap_func fn;
+	struct ktap_tab h;
+	struct ktap_proto pt;
+	struct ktap_upval uv;
+	struct ktap_state th;  /* thread */
+};
+
+#define gch(o)			(&(o)->gch)
+
+/* macros to convert a ktap_obj_t into a specific value */
+#define gco2ts(o)		(&((o)->ts))
+#define gco2uv(o)		(&((o)->uv))
+#define obj2gco(v)		((ktap_obj_t *)(v))
+
+/* predefined values in the registry */
+#define KTAP_RIDX_GLOBALS	1
+#define KTAP_RIDX_LAST		KTAP_RIDX_GLOBALS
+
+/* ktap object types */
+#define KTAP_TNIL		(~0u)
+#define KTAP_TFALSE		(~1u)
+#define KTAP_TTRUE		(~2u)
+#define KTAP_TNUM		(~3u)
+#define KTAP_TLIGHTUD		(~4u)
+#define KTAP_TSTR		(~5u)
+#define KTAP_TUPVAL		(~6u)
+#define KTAP_TPROTO		(~7u)
+#define KTAP_TFUNC		(~8u)
+#define KTAP_TCFUNC		(~9u)
+#define KTAP_TCDATA		(~10u)
+#define KTAP_TTAB		(~11u)
+#define KTAP_TUDATA		(~12u)
+
+/* Specific types */
+#define KTAP_TEVENTSTR		(~13u) /* argstr */
+#define KTAP_TKSTACK		(~14u) /* stack(), not interned to string yet */
+#define KTAP_TKIP		(~15u) /* kernel function ip address */
+#define KTAP_TUIP		(~16u) /* userspace function ip address */
+
+/* This is just the canonical number type used in some places. */
+#define KTAP_TNUMX		(~17u)
+
+
+#define itype(o)		((o)->type)
+#define setitype(o, t)		((o)->type = (t))
+
+#define val_(o)			((o)->val)
+#define gcvalue(o)		(val_(o).gc)
+
+#define nvalue(o)		(val_(o).n)
+#define boolvalue(o)		(KTAP_TFALSE - (o)->type)
+#define hvalue(o)		(&val_(o).gc->h)
+#define phvalue(o)		(&val_(o).gc->ph)
+#define clvalue(o)		(&val_(o).gc->fn)
+#define ptvalue(o)		(&val_(o).gc->pt)
+
+#define getstr(ts)		((const char *)((ts) + 1))
+#define rawtsvalue(o)		(&val_(o).gc->ts)
+#define svalue(o)		getstr(rawtsvalue(o))
+
+#define pvalue(o)		(&val_(o).p)
+#define fvalue(o)		(val_(o).f)
+
+#define is_nil(o)		(itype(o) == KTAP_TNIL)
+#define is_false(o)		(itype(o) == KTAP_TFALSE)
+#define is_true(o)		(itype(o) == KTAP_TTRUE)
+#define is_bool(o)		(is_false(o) || is_true(o))
+#define is_string(o)		(itype(o) == KTAP_TSTR)
+#define is_number(o)		(itype(o) == KTAP_TNUM)
+#define is_table(o)		(itype(o) == KTAP_TTAB)
+#define is_proto(o)		(itype(o) == KTAP_TPROTO)
+#define is_function(o)		(itype(o) == KTAP_TFUNC)
+#define is_cfunc(o)		(itype(o) == KTAP_TCFUNC)
+#define is_eventstr(o)		(itype(o) == KTAP_TEVENTSTR)
+#define is_kip(o)		(itype(o) == KTAP_TKIP)
+
+#define set_nil(o)		((o)->type = KTAP_TNIL)
+#define set_bool(o, x)		((o)->type = KTAP_TFALSE-(uint32_t)(x))
+
+static inline void set_number(ktap_val_t *o, ktap_number n)
+{
+	setitype(o, KTAP_TNUM);
+	o->val.n = n;
+}
+
+static inline void set_string(ktap_val_t *o, const ktap_str_t *str)
+{
+	setitype(o, KTAP_TSTR);
+	o->val.gc = (ktap_obj_t *)str;
+}
+
+static inline void set_table(ktap_val_t *o, ktap_tab_t *tab)
+{
+	setitype(o, KTAP_TTAB);
+	o->val.gc = (ktap_obj_t *)tab;
+}
+
+static inline void set_proto(ktap_val_t *o, ktap_proto_t *pt)
+{
+	setitype(o, KTAP_TPROTO);
+	o->val.gc = (ktap_obj_t *)pt;
+}
+
+static inline void set_kstack(ktap_val_t *o, uint16_t depth, uint16_t skip)
+{
+	setitype(o, KTAP_TKSTACK);
+	o->val.stack.depth = depth;
+	o->val.stack.skip = skip;
+}
+
+static inline void set_func(ktap_val_t *o, ktap_func_t *fn)
+{
+	setitype(o, KTAP_TFUNC);
+	o->val.gc = (ktap_obj_t *)fn;
+}
+
+static inline void set_cfunc(ktap_val_t *o, ktap_cfunction fn)
+{
+	setitype(o, KTAP_TCFUNC);
+	o->val.f = fn;
+}
+
+static inline void set_eventstr(ktap_val_t *o)
+{
+	setitype(o, KTAP_TEVENTSTR);
+}
+
+static inline void set_ip(ktap_val_t *o, unsigned long addr)
+{
+	setitype(o, KTAP_TKIP);
+	o->val.n = addr;
+}
+
+
+#define set_obj(o1, o2)		{ *(o1) = *(o2); }
+
+#define incr_top(ks)		{ks->top++;}
+
+/*
+ * KTAP_QL describes how error messages quote program elements.
+ * CHANGE it if you want a different appearance.
+ */
+#define KTAP_QL(x)      "'" x "'"
+#define KTAP_QS         KTAP_QL("%s")
+
+#endif /* __KTAP_TYPES_H__ */
+
-- 
1.8.1.4



* [PATCH v2 05/29] ktap: add bytecode definition(include/uapi/ktap/ktap_bc.h)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (3 preceding siblings ...)
  2014-03-28 14:44 ` [PATCH v2 04/29] ktap: add basic ktap types definition(include/uapi/ktap/ktap_types.h) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 06/29] ktap: add ktap_arch.h and error header file(include/uapi/ktap/) Jovi Zhangwei
                   ` (24 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

ktap bytecode is based on LuaJIT, a fast JIT compiler for the Lua language.

A detailed introduction to LuaJIT bytecode is available at
http://wiki.luajit.org/Bytecode-2.0

A single bytecode instruction is 32 bit wide and has an 8 bit
opcode field and several operand fields of 8 or 16 bit.
Instructions come in one of two formats:

 +----+----+----+----+
 | B  | C  | A  | OP | Format ABC
 +----+----+----+----+
 |    D    | A  | OP | Format AD
 +--------------------
 MSB               LSB

In-memory instructions are always stored in host byte order.

E.g. 0xbbccaa1e is the instruction with opcode 0x1e (ADDVV),
with operands A = 0xaa, B = 0xbb and C = 0xcc.
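
The field layout above can be checked with a small userspace sketch. The
helpers below use the same shifts and masks as the bc_* and BCINS_* macros
in ktap_bc.h, but they are written as inline functions for illustration,
and bc_format_selfcheck() is a hypothetical name that is not part of ktap.
The opcode value 0x1e is taken from the example as-is; the number actually
assigned to an opcode depends on its position in BCDEF.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BCIns;	/* one 32-bit bytecode instruction */

/* Field extraction: same shifts and masks as the bc_* macros. */
static inline uint32_t bc_op(BCIns i) { return i & 0xff; }
static inline uint32_t bc_a(BCIns i)  { return (i >> 8) & 0xff; }
static inline uint32_t bc_b(BCIns i)  { return i >> 24; }
static inline uint32_t bc_c(BCIns i)  { return (i >> 16) & 0xff; }
static inline uint32_t bc_d(BCIns i)  { return i >> 16; }

/* Composition: same layout as BCINS_ABC and BCINS_AD. */
static inline BCIns bcins_abc(uint32_t o, uint32_t a, uint32_t b, uint32_t c)
{
	return o | (a << 8) | (b << 24) | (c << 16);
}

static inline BCIns bcins_ad(uint32_t o, uint32_t a, uint32_t d)
{
	return o | (a << 8) | (d << 16);
}

/* Hypothetical self-check (not part of ktap) for the example instruction. */
static inline void bc_format_selfcheck(void)
{
	const BCIns ins = 0xbbccaa1eu;

	assert(bc_op(ins) == 0x1e);
	assert(bc_a(ins) == 0xaa);
	assert(bc_b(ins) == 0xbb);
	assert(bc_c(ins) == 0xcc);
	/* In format AD the B and C bytes are read as one 16-bit operand. */
	assert(bc_d(ins) == 0xbbcc);
	assert(bcins_abc(0x1e, 0xaa, 0xbb, 0xcc) == ins);
}
```

Both formats share the OP and A bytes, so the same word can be read either
way; which reading applies depends on the operand modes of the opcode.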

Bytecodes:

1). Comparison ops
ISLT, ISGE, ISLE, ISGT, ISEQV, ISNEV, ISEQS, ISNES, ISEQN,
ISNEN, ISEQP, ISNEP

2). Unary Test and Copy ops
ISTC, ISFC, IST, ISF

3). Unary ops
MOV, NOT, UNM, LEN

4). Binary ops
ADDVN, SUBVN, MULVN, DIVVN, MODVN, ADDNV, SUBNV, MULNV, DIVNV, MODNV,
ADDVV, SUBVV, MULVV, DIVVV, MODVV, POW, CAT

5). Constant ops
KSTR, KCDATA, KSHORT, KNUM, KPRI, KNIL

6). Upvalue and Function ops
UGET, USETV, USETS, USETN, USETP, UCLO, FNEW

7). Table ops
TNEW, TDUP, GGET, GSET, TGETV, TGETS, TGETB, TSETV, TSETS, TSETB, TSETM

8). Calls and Vararg Handling (ktap does not support variable arguments yet)
CALLM, CALL, CALLMT, CALLT, ITERC, ITERN, VARG, ISNEXT

9). Returns (ktap does not support RETM yet)
RETM, RET, RET0, RET1

10). Loops and branches (the J* loop bytecodes are not used in ktap)
FORI, JFORI, FORL, IFORL, JFORL, ITERL, IITERL, JITERL, LOOP, ILOOP, JLOOP, JMP

11). Function headers (none of these are used in ktap)
FUNCF, IFUNCF, JFUNCF, FUNCV, JFUNCV, FUNCC, FUNCCW

12). ktap specific bytecodes
VARGN, VARGSTR, VPROBENAME, VPID, VTID, VUID, VCPU, VEXECNAME, GFUNC
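
The header generates the opcode enum from a single list macro (BCDEF), and
the order of that list matters because some opcodes are derived from others
by arithmetic on the opcode number. The sketch below shows the technique
with a hypothetical four-entry subset; every DEMO_* and demo_* name is made
up for illustration. It uses C11 _Static_assert to enforce the ordering,
which is only one possible choice: in this patch KP_STATIC_ASSERT is still
an empty TODO macro.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical subset of BCDEF, kept in the same relative order. */
#define DEMO_BCDEF(_) \
	_(ISLT) \
	_(ISGE) \
	_(ISLE) \
	_(ISGT)

/* Expand the list once into enum constants... */
typedef enum {
#define DEMO_BCENUM(name)	DEMO_BC_##name,
	DEMO_BCDEF(DEMO_BCENUM)
#undef DEMO_BCENUM
	DEMO_BC__MAX
} DemoBCOp;

/* ...and once more into a name table indexed by opcode. */
static const char *const demo_bc_names[] = {
#define DEMO_BCNAME(name)	#name,
	DEMO_BCDEF(DEMO_BCNAME)
#undef DEMO_BCNAME
};

/* Flipping bit 0 negates a comparison: ISLT <-> ISGE, ISLE <-> ISGT. */
static inline DemoBCOp demo_bc_negate(DemoBCOp op)
{
	return (DemoBCOp)((int)op ^ 1);
}

/* Compile-time checks that the list order keeps these relations true. */
_Static_assert(((int)DEMO_BC_ISLT ^ 1) == (int)DEMO_BC_ISGE,
	       "ISLT and ISGE must differ only in bit 0");
_Static_assert(((int)DEMO_BC_ISLE ^ 1) == (int)DEMO_BC_ISGT,
	       "ISLE and ISGT must differ only in bit 0");
_Static_assert(((int)DEMO_BC_ISLT ^ 3) == (int)DEMO_BC_ISGT,
	       "ISLT ^ 3 must give the comparison with operands swapped");
```

Adding or removing an entry in the middle of the list shifts every later
opcode number, so the enum and the name table stay in sync automatically,
while the static assertions catch a reordering that would break the
XOR relations.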

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 include/uapi/ktap/ktap_bc.h | 369 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 369 insertions(+)
 create mode 100644 include/uapi/ktap/ktap_bc.h

diff --git a/include/uapi/ktap/ktap_bc.h b/include/uapi/ktap/ktap_bc.h
new file mode 100644
index 0000000..aac286a
--- /dev/null
+++ b/include/uapi/ktap/ktap_bc.h
@@ -0,0 +1,369 @@
+/*
+ * Bytecode instruction format.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ * Copyright (C) 2005-2014 Mike Pall.
+ */
+
+#ifndef __KTAP_BC_H__
+#define __KTAP_BC_H__
+
+#include "ktap_arch.h"
+
+/*TODO*/
+#define KP_STATIC_ASSERT(cond)
+#define kp_assert(cond)
+
+/* Types for handling bytecodes. */
+typedef uint32_t BCIns;  /* Bytecode instruction. */
+typedef uint32_t BCPos;  /* Bytecode position. */
+typedef uint32_t BCReg;  /* Bytecode register. */
+typedef int32_t BCLine;  /* Bytecode line number. */
+
+/*
+ * Bytecode instruction format, 32 bit wide, fields of 8 or 16 bit:
+ *
+ * +----+----+----+----+
+ * | B  | C  | A  | OP | Format ABC
+ * +----+----+----+----+
+ * |    D    | A  | OP | Format AD
+ * +--------------------
+ * MSB               LSB
+ *
+ * In-memory instructions are always stored in host byte order.
+ */
+
+/* Operand ranges and related constants. */
+#define BCMAX_A		0xff
+#define BCMAX_B		0xff
+#define BCMAX_C		0xff
+#define BCMAX_D		0xffff
+#define BCBIAS_J	0x8000
+#define NO_REG		BCMAX_A
+#define NO_JMP		(~(BCPos)0)
+
+/* Macros to get instruction fields. */
+#define bc_op(i)	((BCOp)((i)&0xff))
+#define bc_a(i)		((BCReg)(((i)>>8)&0xff))
+#define bc_b(i)		((BCReg)((i)>>24))
+#define bc_c(i)		((BCReg)(((i)>>16)&0xff))
+#define bc_d(i)		((BCReg)((i)>>16))
+#define bc_j(i)		((ptrdiff_t)bc_d(i)-BCBIAS_J)
+
+/* Macros to set instruction fields. */
+#define setbc_byte(p, x, ofs) \
+	((uint8_t *)(p))[KP_ENDIAN_SELECT(ofs, 3 - ofs)] = (uint8_t)(x)
+#define setbc_op(p, x)	setbc_byte(p, (x), 0)
+#define setbc_a(p, x)	setbc_byte(p, (x), 1)
+#define setbc_b(p, x)	setbc_byte(p, (x), 3)
+#define setbc_c(p, x)	setbc_byte(p, (x), 2)
+#define setbc_d(p, x) \
+	((uint16_t *)(p))[KP_ENDIAN_SELECT(1, 0)] = (uint16_t)(x)
+#define setbc_j(p, x)	setbc_d(p, (BCPos)((int32_t)(x)+BCBIAS_J))
+
+/* Macros to compose instructions. */
+#define BCINS_ABC(o, a, b, c) \
+	(((BCIns)(o))|((BCIns)(a)<<8)|((BCIns)(b)<<24)|((BCIns)(c)<<16))
+#define BCINS_AD(o, a, d) \
+	(((BCIns)(o))|((BCIns)(a)<<8)|((BCIns)(d)<<16))
+#define BCINS_AJ(o, a, j)	BCINS_AD(o, a, (BCPos)((int32_t)(j)+BCBIAS_J))
+
+/*
+ * Bytecode instruction definition. Order matters, see below.
+ *
+ * (name, filler, Amode, Bmode, Cmode or Dmode, metamethod)
+ *
+ * The opcode name suffixes specify the type for RB/RC or RD:
+ * V = variable slot
+ * S = string const
+ * N = number const
+ * P = primitive type (~itype)
+ * B = unsigned byte literal
+ * M = multiple args/results
+ */
+#define BCDEF(_) \
+	/* Comparison ops. ORDER OPR. */ \
+	_(ISLT,	var,	___,	var,	lt) \
+	_(ISGE,	var,	___,	var,	lt) \
+	_(ISLE,	var,	___,	var,	le) \
+	_(ISGT,	var,	___,	var,	le) \
+	\
+	_(ISEQV,	var,	___,	var,	eq) \
+	_(ISNEV,	var,	___,	var,	eq) \
+	_(ISEQS,	var,	___,	str,	eq) \
+	_(ISNES,	var,	___,	str,	eq) \
+	_(ISEQN,	var,	___,	num,	eq) \
+	_(ISNEN,	var,	___,	num,	eq) \
+	_(ISEQP,	var,	___,	pri,	eq) \
+	_(ISNEP,	var,	___,	pri,	eq) \
+	\
+	/* Unary test and copy ops. */ \
+	_(ISTC,	dst,	___,	var,	___) \
+	_(ISFC,	dst,	___,	var,	___) \
+	_(IST,	___,	___,	var,	___) \
+	_(ISF,	___,	___,	var,	___) \
+	_(ISTYPE,	var,	___,	lit,	___) \
+	_(ISNUM,	var,	___,	lit,	___) \
+	\
+	/* Unary ops. */ \
+	_(MOV,	dst,	___,	var,	___) \
+	_(NOT,	dst,	___,	var,	___) \
+	_(UNM,	dst,	___,	var,	unm) \
+	\
+	/* Binary ops. ORDER OPR. VV last, POW must be next. */ \
+	_(ADDVN,	dst,	var,	num,	add) \
+	_(SUBVN,	dst,	var,	num,	sub) \
+	_(MULVN,	dst,	var,	num,	mul) \
+	_(DIVVN,	dst,	var,	num,	div) \
+	_(MODVN,	dst,	var,	num,	mod) \
+	\
+	_(ADDNV,	dst,	var,	num,	add) \
+	_(SUBNV,	dst,	var,	num,	sub) \
+	_(MULNV,	dst,	var,	num,	mul) \
+	_(DIVNV,	dst,	var,	num,	div) \
+	_(MODNV,	dst,	var,	num,	mod) \
+	\
+	_(ADDVV,	dst,	var,	var,	add) \
+	_(SUBVV,	dst,	var,	var,	sub) \
+	_(MULVV,	dst,	var,	var,	mul) \
+	_(DIVVV,	dst,	var,	var,	div) \
+	_(MODVV,	dst,	var,	var,	mod) \
+	\
+	_(POW,	dst,	var,	var,	pow) \
+	_(CAT,	dst,	rbase,	rbase,	concat) \
+	\
+	/* Constant ops. */ \
+	_(KSTR,	dst,	___,	str,	___) \
+	_(KCDATA,	dst,	___,	cdata,	___) \
+	_(KSHORT,	dst,	___,	lits,	___) \
+	_(KNUM,	dst,	___,	num,	___) \
+	_(KPRI,	dst,	___,	pri,	___) \
+	_(KNIL,	base,	___,	base,	___) \
+	\
+	/* Upvalue and function ops. */ \
+	_(UGET,	dst,	___,	uv,	___) \
+	_(USETV,	uv,	___,	var,	___) \
+	_(UINCV,	uv,	___,	var,	___) \
+	_(USETS,	uv,	___,	str,	___) \
+	_(USETN,	uv,	___,	num,	___) \
+	_(UINCN,	uv,	___,	num,	___) \
+	_(USETP,	uv,	___,	pri,	___) \
+	_(UCLO,	rbase,	___,	jump,	___) \
+	_(FNEW,	dst,	___,	func,	gc) \
+	\
+	/* Table ops. */ \
+	_(TNEW,	dst,	___,	lit,	gc) \
+	_(TDUP,	dst,	___,	tab,	gc) \
+	_(GGET,	dst,	___,	str,	index) \
+	_(GSET,	var,	___,	str,	newindex) \
+	_(GINC,	var,	___,	str,	newindex) \
+	_(TGETV,	dst,	var,	var,	index) \
+	_(TGETS,	dst,	var,	str,	index) \
+	_(TGETB,	dst,	var,	lit,	index) \
+	_(TGETR,	dst,	var,	var,	index) \
+	_(TSETV,	var,	var,	var,	newindex) \
+	_(TINCV,	var,	var,	var,	newindex) \
+	_(TSETS,	var,	var,	str,	newindex) \
+	_(TINCS,	var,	var,	str,	newindex) \
+	_(TSETB,	var,	var,	lit,	newindex) \
+	_(TINCB,	var,	var,	lit,	newindex) \
+	_(TSETM,	base,	___,	num,	newindex) \
+	_(TSETR,	var,	var,	var,	newindex) \
+	\
+	/* Calls and vararg handling. T = tail call. */ \
+	_(CALLM,	base,	lit,	lit,	call) \
+	_(CALL,	base,	lit,	lit,	call) \
+	_(CALLMT,	base,	___,	lit,	call) \
+	_(CALLT,	base,	___,	lit,	call) \
+	_(ITERC,	base,	lit,	lit,	call) \
+	_(ITERN,	base,	lit,	lit,	call) \
+	_(VARG,	base,	lit,	lit,	___) \
+	_(ISNEXT,	base,	___,	jump,	___) \
+	\
+	/* Returns. */ \
+	_(RETM,	base,	___,	lit,	___) \
+	_(RET,	rbase,	___,	lit,	___) \
+	_(RET0,	rbase,	___,	lit,	___) \
+	_(RET1,	rbase,	___,	lit,	___) \
+	\
+	/* Loops and branches. I/J = interp/JIT, I/C/L = init/call/loop. */ \
+	_(FORI,	base,	___,	jump,	___) \
+	_(JFORI,	base,	___,	jump,	___) \
+	\
+	_(FORL,	base,	___,	jump,	___) \
+	_(IFORL,	base,	___,	jump,	___) \
+	_(JFORL,	base,	___,	lit,	___) \
+	\
+	_(ITERL,	base,	___,	jump,	___) \
+	_(IITERL,	base,	___,	jump,	___) \
+	_(JITERL,	base,	___,	lit,	___) \
+	\
+	_(LOOP,	rbase,	___,	jump,	___) \
+	_(ILOOP,	rbase,	___,	jump,	___) \
+	_(JLOOP,	rbase,	___,	lit,	___) \
+	\
+	_(JMP,	rbase,	___,	jump,	___) \
+	\
+	/*Function headers. I/J = interp/JIT, F/V/C = fixarg/vararg/C func.*/ \
+	_(FUNCF,	rbase,	___,	___,	___) \
+	_(IFUNCF,	rbase,	___,	___,	___) \
+	_(JFUNCF,	rbase,	___,	lit,	___) \
+	_(FUNCV,	rbase,	___,	___,	___) \
+	_(IFUNCV,	rbase,	___,	___,	___) \
+	_(JFUNCV,	rbase,	___,	lit,	___) \
+	_(FUNCC,	rbase,	___,	___,	___) \
+	_(FUNCCW,	rbase,	___,	___,	___) \
+	\
+	/* specific purpose bc. */	\
+	_(VARGN,	dst, ___,	lit,	___) \
+	_(VARGSTR,	dst, ___,	lit,	___) \
+	_(VPROBENAME,	dst, ___,	lit,	___) \
+	_(VPID,		dst, ___,	lit,	___) \
+	_(VTID,		dst, ___,	lit,	___) \
+	_(VUID,		dst, ___,	lit,	___) \
+	_(VCPU,		dst, ___,	lit,	___) \
+	_(VEXECNAME,	dst, ___,	lit,	___) \
+	\
+	_(GFUNC,	dst, ___,	___,	___) /*load global C function*/
+
+/* Bytecode opcode numbers. */
+typedef enum {
+#define BCENUM(name, ma, mb, mc, mt)	BC_##name,
+	BCDEF(BCENUM)
+#undef BCENUM
+	BC__MAX
+} BCOp;
+
+KP_STATIC_ASSERT((int)BC_ISEQV+1 == (int)BC_ISNEV);
+KP_STATIC_ASSERT(((int)BC_ISEQV^1) == (int)BC_ISNEV);
+KP_STATIC_ASSERT(((int)BC_ISEQS^1) == (int)BC_ISNES);
+KP_STATIC_ASSERT(((int)BC_ISEQN^1) == (int)BC_ISNEN);
+KP_STATIC_ASSERT(((int)BC_ISEQP^1) == (int)BC_ISNEP);
+KP_STATIC_ASSERT(((int)BC_ISLT^1) == (int)BC_ISGE);
+KP_STATIC_ASSERT(((int)BC_ISLE^1) == (int)BC_ISGT);
+KP_STATIC_ASSERT(((int)BC_ISLT^3) == (int)BC_ISGT);
+KP_STATIC_ASSERT((int)BC_IST-(int)BC_ISTC == (int)BC_ISF-(int)BC_ISFC);
+KP_STATIC_ASSERT((int)BC_CALLT-(int)BC_CALL == (int)BC_CALLMT-(int)BC_CALLM);
+KP_STATIC_ASSERT((int)BC_CALLMT + 1 == (int)BC_CALLT);
+KP_STATIC_ASSERT((int)BC_RETM + 1 == (int)BC_RET);
+KP_STATIC_ASSERT((int)BC_FORL + 1 == (int)BC_IFORL);
+KP_STATIC_ASSERT((int)BC_FORL + 2 == (int)BC_JFORL);
+KP_STATIC_ASSERT((int)BC_ITERL + 1 == (int)BC_IITERL);
+KP_STATIC_ASSERT((int)BC_ITERL + 2 == (int)BC_JITERL);
+KP_STATIC_ASSERT((int)BC_LOOP + 1 == (int)BC_ILOOP);
+KP_STATIC_ASSERT((int)BC_LOOP + 2 == (int)BC_JLOOP);
+KP_STATIC_ASSERT((int)BC_FUNCF + 1 == (int)BC_IFUNCF);
+KP_STATIC_ASSERT((int)BC_FUNCF + 2 == (int)BC_JFUNCF);
+KP_STATIC_ASSERT((int)BC_FUNCV + 1 == (int)BC_IFUNCV);
+KP_STATIC_ASSERT((int)BC_FUNCV + 2 == (int)BC_JFUNCV);
+
+/* This solves a circular dependency problem, change as needed. */
+#define FF_next_N	4
+
+/* Stack slots used by FORI/FORL, relative to operand A. */
+enum {
+	FORL_IDX, FORL_STOP, FORL_STEP, FORL_EXT
+};
+
+/* Bytecode operand modes. ORDER BCMode */
+typedef enum {
+	/* Mode A must be <= 7 */
+	BCMnone, BCMdst, BCMbase, BCMvar, BCMrbase, BCMuv,
+	BCMlit, BCMlits, BCMpri, BCMnum, BCMstr, BCMtab, BCMfunc,
+	BCMjump, BCMcdata,
+	BCM_max
+} BCMode;
+
+#define BCM___	BCMnone
+
+#define bcmode_a(op)	((BCMode)(bc_mode[op] & 7))
+#define bcmode_b(op)	((BCMode)((bc_mode[op] >> 3) & 15))
+#define bcmode_c(op)	((BCMode)((bc_mode[op] >> 7) & 15))
+#define bcmode_d(op)	bcmode_c(op)
+#define bcmode_hasd(op)	((bc_mode[op] & (15 << 3)) == (BCMnone << 3))
+#define bcmode_mm(op)	((MMS)(bc_mode[op] >> 11))
+
+#define BCMODE(name, ma, mb, mc, mm) \
+	(BCM##ma | (BCM##mb << 3) | (BCM##mc << 7)|(MM_##mm << 11)),
+#define BCMODE_FF	0
+
+static inline int bc_isret(BCOp op)
+{
+	return (op == BC_RETM || op == BC_RET || op == BC_RET0 ||
+		op == BC_RET1);
+}
+
+/*
+ * Metamethod definition
+ * Note: ktap does not currently use any Lua metamethods.
+ */
+typedef enum {
+	MM_lt,
+	MM_le,
+	MM_eq,
+	MM_unm,
+	MM_add,
+	MM_sub,
+	MM_mul,
+	MM_div,
+	MM_mod,
+	MM_pow,
+	MM_concat,
+	MM_gc,
+	MM_index,
+	MM_newindex,
+	MM_call,
+	MM__MAX,
+	MM____ = MM__MAX
+} MMS;
+
+
+/* -- Bytecode dump format ------------------------------------------------ */
+
+/*
+** dump   = header proto+ 0U
+** header = ESC 'L' 'J' versionB flagsU [namelenU nameB*]
+** proto  = lengthU pdata
+** pdata  = phead bcinsW* uvdataH* kgc* knum* [debugB*]
+** phead  = flagsB numparamsB framesizeB numuvB numkgcU numknU numbcU
+**          [debuglenU [firstlineU numlineU]]
+** kgc    = kgctypeU { ktab | (loU hiU) | (rloU rhiU iloU ihiU) | strB* }
+** knum   = intU0 | (loU1 hiU)
+** ktab   = narrayU nhashU karray* khash*
+** karray = ktabk
+** khash  = ktabk ktabk
+** ktabk  = ktabtypeU { intU | (loU hiU) | strB* }
+**
+** B = 8 bit, H = 16 bit, W = 32 bit, U = ULEB128 of W, U0/U1 = ULEB128 of W+1
+*/
+
+/* Bytecode dump header. */
+#define BCDUMP_HEAD1		0x15
+#define BCDUMP_HEAD2		0x22
+#define BCDUMP_HEAD3		0x06
+
+/* If you perform *any* kind of private modifications to the bytecode itself
+** or to the dump format, you *must* set BCDUMP_VERSION to 0x80 or higher.
+*/
+#define BCDUMP_VERSION		1
+
+/* Compatibility flags. */
+#define BCDUMP_F_BE		0x01
+#define BCDUMP_F_STRIP		0x02
+#define BCDUMP_F_FFI		0x04
+
+#define BCDUMP_F_KNOWN		(BCDUMP_F_FFI*2-1)
+
+/* Type codes for the GC constants of a prototype. Plus length for strings. */
+enum {
+	BCDUMP_KGC_CHILD, BCDUMP_KGC_TAB, BCDUMP_KGC_I64, BCDUMP_KGC_U64,
+	BCDUMP_KGC_COMPLEX, BCDUMP_KGC_STR
+};
+
+/* Type codes for the keys/values of a constant table. */
+enum {
+	BCDUMP_KTAB_NIL, BCDUMP_KTAB_FALSE, BCDUMP_KTAB_TRUE,
+	BCDUMP_KTAB_INT, BCDUMP_KTAB_NUM, BCDUMP_KTAB_STR
+};
+
+#endif /* __KTAP_BC_H__ */
-- 
1.8.1.4



* [PATCH v2 06/29] ktap: add ktap_arch.h and error header file(include/uapi/ktap/)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (4 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 05/29] ktap: add bytecode definition(include/uapi/ktap/ktap_bc.h) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 07/29] ktap: add kernel module main entry(kernel/trace/ktap/ktap.[c|h]) Jovi Zhangwei
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Simple arch-related definitions and error message definitions.
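
The ERRDEF enum in ktap_err.h gives each error code the running sum of the
preceding message lengths (each plus one for a terminating NUL). A minimal
sketch of why that is useful follows. It assumes the messages are also
concatenated into one string table, which this patch does not show, so the
table and the lookup helper are assumptions. All DEMO_* and demo_* names
are hypothetical, and a list macro stands in for including ktap_errmsg.h.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for ktap_errmsg.h: three of its entries. */
#define DEMO_ERRLIST(ERRDEF) \
	ERRDEF(ERRMEM,	"not enough memory") \
	ERRDEF(ERRERR,	"error in error handling") \
	ERRDEF(STROV,	"string length overflow")

/*
 * Same shape as the enum in ktap_err.h: NAME is the offset of the message,
 * NAME_ is the offset of its terminating NUL, so the next NAME starts one
 * byte after it.
 */
typedef enum {
#define DEMO_ERRENUM(name, msg) \
	DEMO_ERR_##name, DEMO_ERR_##name##_ = DEMO_ERR_##name + sizeof(msg)-1,
	DEMO_ERRLIST(DEMO_ERRENUM)
#undef DEMO_ERRENUM
	DEMO_ERR__MAX
} DemoErrMsg;

/* Assumed companion table: all messages back to back, NUL-separated. */
static const char demo_err_allmsg[] =
#define DEMO_ERRSTR(name, msg)	msg "\0"
	DEMO_ERRLIST(DEMO_ERRSTR)
#undef DEMO_ERRSTR
;

/* An error code is simply a byte offset into the table. */
static inline const char *demo_err2msg(DemoErrMsg em)
{
	return demo_err_allmsg + (int)em;
}
```

With this layout a lookup is a single pointer addition, and no separate
array of message pointers is needed.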

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 include/uapi/ktap/ktap_arch.h   |  33 ++++++++++
 include/uapi/ktap/ktap_err.h    |  11 ++++
 include/uapi/ktap/ktap_errmsg.h | 135 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 179 insertions(+)
 create mode 100644 include/uapi/ktap/ktap_arch.h
 create mode 100644 include/uapi/ktap/ktap_err.h
 create mode 100644 include/uapi/ktap/ktap_errmsg.h

diff --git a/include/uapi/ktap/ktap_arch.h b/include/uapi/ktap/ktap_arch.h
new file mode 100644
index 0000000..aeb7036
--- /dev/null
+++ b/include/uapi/ktap/ktap_arch.h
@@ -0,0 +1,33 @@
+#ifndef __KTAP_ARCH__
+#define __KTAP_ARCH__
+
+#ifdef __KERNEL__
+#include <linux/types.h>
+#include <asm/byteorder.h>
+
+#if defined(__LITTLE_ENDIAN)
+#define KP_LE				1
+#define KP_BE				0
+#define KP_ENDIAN_SELECT(le, be)        le
+#elif defined(__BIG_ENDIAN)
+#define KP_LE				0
+#define KP_BE				1
+#define KP_ENDIAN_SELECT(le, be)        be
+#endif
+
+#else /* __KERNEL__ */
+
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define KP_LE				1
+#define KP_BE				0
+#define KP_ENDIAN_SELECT(le, be)        le
+#elif __BYTE_ORDER == __BIG_ENDIAN
+#define KP_LE				0
+#define KP_BE				1
+#define KP_ENDIAN_SELECT(le, be)        be
+#else
+#error "could not determine byte order"
+#endif
+
+#endif
+#endif
diff --git a/include/uapi/ktap/ktap_err.h b/include/uapi/ktap/ktap_err.h
new file mode 100644
index 0000000..b7e1a31
--- /dev/null
+++ b/include/uapi/ktap/ktap_err.h
@@ -0,0 +1,11 @@
+#ifndef __KTAP_ERR_H__
+#define __KTAP_ERR_H__
+
+typedef enum {
+#define ERRDEF(name, msg) \
+	KP_ERR_##name, KP_ERR_##name##_ = KP_ERR_##name + sizeof(msg)-1,
+#include "ktap_errmsg.h"
+	KP_ERR__MAX
+} ErrMsg;
+
+#endif
diff --git a/include/uapi/ktap/ktap_errmsg.h b/include/uapi/ktap/ktap_errmsg.h
new file mode 100644
index 0000000..fd7a081
--- /dev/null
+++ b/include/uapi/ktap/ktap_errmsg.h
@@ -0,0 +1,135 @@
+/*
+ * VM error messages.
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ * Copyright (C) 2005-2014 Mike Pall.
+ */
+
+/* Basic error handling. */
+ERRDEF(ERRMEM,	"not enough memory")
+ERRDEF(ERRERR,	"error in error handling")
+
+/* Allocations. */
+ERRDEF(STROV,	"string length overflow")
+ERRDEF(UDATAOV,	"userdata length overflow")
+ERRDEF(STKOV,	"stack overflow")
+ERRDEF(STKOVM,	"stack overflow (%s)")
+ERRDEF(TABOV,	"table overflow")
+
+/* Table indexing. */
+ERRDEF(NANIDX,	"table index is NaN")
+ERRDEF(NILIDX,	"table index is nil")
+ERRDEF(NEXTIDX,	"invalid key to " KTAP_QL("next"))
+
+/* Metamethod resolving. */
+ERRDEF(BADCALL,	"attempt to call a %s value")
+ERRDEF(BADOPRT,	"attempt to %s %s " KTAP_QS " (a %s value)")
+ERRDEF(BADOPRV,	"attempt to %s a %s value")
+ERRDEF(BADCMPT,	"attempt to compare %s with %s")
+ERRDEF(BADCMPV,	"attempt to compare two %s values")
+ERRDEF(GETLOOP,	"loop in gettable")
+ERRDEF(SETLOOP,	"loop in settable")
+ERRDEF(OPCALL,	"call")
+ERRDEF(OPINDEX,	"index")
+ERRDEF(OPARITH,	"perform arithmetic on")
+ERRDEF(OPCAT,	"concatenate")
+ERRDEF(OPLEN,	"get length of")
+
+/* Type checks. */
+ERRDEF(BADSELF,	"calling " KTAP_QS " on bad self (%s)")
+ERRDEF(BADARG,	"bad argument #%d to " KTAP_QS " (%s)")
+ERRDEF(BADTYPE,	"%s expected, got %s")
+ERRDEF(BADVAL,	"invalid value")
+ERRDEF(NOVAL,	"value expected")
+ERRDEF(NOCORO,	"coroutine expected")
+ERRDEF(NOTABN,	"nil or table expected")
+ERRDEF(NOLFUNC,	"ktap function expected")
+ERRDEF(NOFUNCL,	"function or level expected")
+ERRDEF(NOSFT,	"string/function/table expected")
+ERRDEF(NOPROXY,	"boolean or proxy expected")
+ERRDEF(FORINIT,	KTAP_QL("for") " initial value must be a number")
+ERRDEF(FORLIM,	KTAP_QL("for") " limit must be a number")
+ERRDEF(FORSTEP,	KTAP_QL("for") " step must be a number")
+
+/* C API checks. */
+ERRDEF(NOENV,	"no calling environment")
+ERRDEF(CYIELD,	"attempt to yield across C-call boundary")
+ERRDEF(BADLU,	"bad light userdata pointer")
+
+/* Standard library function errors. */
+ERRDEF(ASSERT,	"assertion failed!")
+ERRDEF(PROTMT,	"cannot change a protected metatable")
+ERRDEF(UNPACK,	"too many results to unpack")
+ERRDEF(RDRSTR,	"reader function must return a string")
+ERRDEF(PRTOSTR,	KTAP_QL("tostring") " must return a string to " KTAP_QL("print"))
+ERRDEF(IDXRNG,	"index out of range")
+ERRDEF(BASERNG,	"base out of range")
+ERRDEF(LVLRNG,	"level out of range")
+ERRDEF(INVLVL,	"invalid level")
+ERRDEF(INVOPT,	"invalid option")
+ERRDEF(INVOPTM,	"invalid option " KTAP_QS)
+ERRDEF(INVFMT,	"invalid format")
+ERRDEF(SETFENV,	KTAP_QL("setfenv") " cannot change environment of given object")
+ERRDEF(CORUN,	"cannot resume running coroutine")
+ERRDEF(CODEAD,	"cannot resume dead coroutine")
+ERRDEF(COSUSP,	"cannot resume non-suspended coroutine")
+ERRDEF(TABINS,	"wrong number of arguments to " KTAP_QL("insert"))
+ERRDEF(TABCAT,	"invalid value (%s) at index %d in table for " KTAP_QL("concat"))
+ERRDEF(TABSORT,	"invalid order function for sorting")
+ERRDEF(IOCLFL,	"attempt to use a closed file")
+ERRDEF(IOSTDCL,	"standard file is closed")
+ERRDEF(OSUNIQF,	"unable to generate a unique filename")
+ERRDEF(OSDATEF,	"field " KTAP_QS " missing in date table")
+ERRDEF(STRDUMP,	"unable to dump given function")
+ERRDEF(STRSLC,	"string slice too long")
+ERRDEF(STRPATB,	"missing " KTAP_QL("[") " after " KTAP_QL("%f") " in pattern")
+ERRDEF(STRPATC,	"invalid pattern capture")
+ERRDEF(STRPATE,	"malformed pattern (ends with " KTAP_QL("%") ")")
+ERRDEF(STRPATM,	"malformed pattern (missing " KTAP_QL("]") ")")
+ERRDEF(STRPATU,	"unbalanced pattern")
+ERRDEF(STRPATX,	"pattern too complex")
+ERRDEF(STRCAPI,	"invalid capture index")
+ERRDEF(STRCAPN,	"too many captures")
+ERRDEF(STRCAPU,	"unfinished capture")
+ERRDEF(STRFMT,	"invalid option " KTAP_QS " to " KTAP_QL("format"))
+ERRDEF(STRGSRV,	"invalid replacement value (a %s)")
+ERRDEF(BADMODN,	"name conflict for module " KTAP_QS)
+ERRDEF(JITOPT,	"unknown or malformed optimization flag " KTAP_QS)
+
+/* Lexer/parser errors. */
+ERRDEF(XMODE,	"attempt to load chunk with wrong mode")
+ERRDEF(XNEAR,	"%s near " KTAP_QS)
+ERRDEF(XLINES,	"chunk has too many lines")
+ERRDEF(XLEVELS,	"chunk has too many syntax levels")
+ERRDEF(XNUMBER,	"malformed number")
+ERRDEF(XLSTR,	"unfinished long string")
+ERRDEF(XLCOM,	"unfinished long comment")
+ERRDEF(XSTR,	"unfinished string")
+ERRDEF(XESC,	"invalid escape sequence")
+ERRDEF(XLDELIM,	"invalid long string delimiter")
+ERRDEF(XTOKEN,	KTAP_QS " expected")
+ERRDEF(XJUMP,	"control structure too long")
+ERRDEF(XSLOTS,	"function or expression too complex")
+ERRDEF(XLIMC,	"chunk has more than %d local variables")
+ERRDEF(XLIMM,	"main function has more than %d %s")
+ERRDEF(XLIMF,	"function at line %d has more than %d %s")
+ERRDEF(XMATCH,	KTAP_QS " expected (to close " KTAP_QS " at line %d)")
+ERRDEF(XFIXUP,	"function too long for return fixup")
+ERRDEF(XPARAM,	"<name> or " KTAP_QL("...") " expected")
+ERRDEF(XAMBIG,	"ambiguous syntax (function call x new statement)")
+ERRDEF(XFUNARG,	"function arguments expected")
+ERRDEF(XSYMBOL,	"unexpected symbol")
+ERRDEF(XDOTS,	"cannot use " KTAP_QL("...") " outside a vararg function")
+ERRDEF(XSYNTAX,	"syntax error")
+ERRDEF(XFOR,	KTAP_QL("=") " or " KTAP_QL("in") " expected")
+ERRDEF(XBREAK,	"no loop to break")
+ERRDEF(XLUNDEF,	"undefined label " KTAP_QS)
+ERRDEF(XLDUP,	"duplicate label " KTAP_QS)
+ERRDEF(XGSCOPE,	"<goto %s> jumps into the scope of local " KTAP_QS)
+ERRDEF(XEVENTDEF,"cannot parse eventdef " KTAP_QS)
+
+/* Bytecode reader errors. */
+ERRDEF(BCFMT,	"cannot load incompatible bytecode")
+ERRDEF(BCBAD,	"cannot load malformed bytecode")
+
+#undef ERRDEF
+
-- 
1.8.1.4



* [PATCH v2 07/29] ktap: add kernel module main entry(kernel/trace/ktap/ktap.[c|h])
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (5 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 06/29] ktap: add ktap_arch.h and error header file(include/uapi/ktap/) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 08/29] ktap: add bytecode reader(kernel/trace/ktap/kp_bcread.[c|h]) Jovi Zhangwei
                   ` (22 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

ktap.c is the main entry point of the ktapvm kernel module.
It initializes the module and creates the 'ktap' debugfs directory.

The userspace tool sends ioctl commands to the
'/sys/kernel/debug/ktap/ktapvm' file to control the ktap runtime.

The module reads the bytecode trunk, validates it and then executes it:

kp_vm_new_state
    kp_bcread
        kp_vm_validate_code
            kp_vm_call_proto
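
The call chain above can be modelled in userspace as a staged pipeline
with a single cleanup path. The following is only a minimal sketch under
stated assumptions: read_bytecode(), validate() and run_pipeline() are
hypothetical stand-ins, not the ktap functions, and the acceptance rules
inside them are arbitrary. It shows one way to hand a distinct error code
back for every failing stage. In ktap_main() as posted, ret appears to
stay 0 when kp_bcread() returns NULL or when validation fails, so those
failures are reported only through kp_error().

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TRUNK_LEN 4096	/* same limit that load_trunk() enforces */

struct state { int exited; };	/* stand-in for ktap_state_t */
struct proto { size_t len; };	/* stand-in for ktap_proto_t */

static struct proto the_proto;

/* Hypothetical reader: accepts any buffer that starts with 0x1b. */
static struct proto *read_bytecode(const unsigned char *buf, size_t len)
{
	if (len == 0 || buf[0] != 0x1b)
		return NULL;
	the_proto.len = len;
	return &the_proto;
}

/* Hypothetical validator: rejects odd lengths (an arbitrary rule). */
static int validate(const struct proto *pt)
{
	return (pt->len & 1) ? -1 : 0;
}

/* Copy in, read, validate; every failure sets ret before cleanup. */
static int run_pipeline(struct state *st, const unsigned char *src, size_t len)
{
	unsigned char *buf;
	struct proto *pt;
	int ret = 0;

	if (len == 0 || len > MAX_TRUNK_LEN) {
		ret = -EINVAL;
		goto out;
	}

	buf = malloc(len);
	if (!buf) {
		ret = -ENOMEM;
		goto out;
	}
	memcpy(buf, src, len);

	pt = read_bytecode(buf, len);
	free(buf);		/* the raw buffer is not needed after parsing */
	if (!pt) {
		ret = -ENOEXEC;
		goto out;
	}

	if (validate(pt)) {
		ret = -EINVAL;
		goto out;
	}

	/* The VM would be entered here. */
out:
	st->exited = 1;		/* single cleanup path, like kp_vm_exit() */
	return ret;
}
```

A caller passes a state, a source buffer and its length; a zero return
means every stage succeeded, and any negative errno identifies the stage
that rejected the input.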

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 kernel/trace/ktap/ktap.c | 255 +++++++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/ktap/ktap.h | 176 ++++++++++++++++++++++++++++++++
 2 files changed, 431 insertions(+)
 create mode 100644 kernel/trace/ktap/ktap.c
 create mode 100644 kernel/trace/ktap/ktap.h

diff --git a/kernel/trace/ktap/ktap.c b/kernel/trace/ktap/ktap.c
new file mode 100644
index 0000000..855af09
--- /dev/null
+++ b/kernel/trace/ktap/ktap.c
@@ -0,0 +1,255 @@
+/*
+ * ktap.c - ktapvm kernel module main entry
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+/*
+ * This file is the first to be compiled; add CONFIG_ checks here.
+ * See Requirements in doc/tutorial.md
+ */
+
+#include <linux/version.h>
+#include <linux/module.h>
+#include <linux/errno.h>
+#include <linux/file.h>
+#include <linux/slab.h>
+#include <linux/fcntl.h>
+#include <linux/sched.h>
+#include <linux/poll.h>
+#include <linux/anon_inodes.h>
+#include <linux/debugfs.h>
+#include <linux/vmalloc.h>
+#include <uapi/ktap/ktap_types.h>
+#include "ktap.h"
+#include "kp_bcread.h"
+#include "kp_vm.h"
+
+/*
+ * gettimeofday_ns: common helper function
+ * TODO: make getnstimeofday() safe to call in probe context; it takes a
+ * seqlock internally.
+ * (SystemTap fixes this by introducing its own timekeeping code.)
+ */
+long gettimeofday_ns(void)
+{
+	struct timespec now;
+
+	getnstimeofday(&now);
+	return now.tv_sec * NSEC_PER_SEC + now.tv_nsec;
+}
+
+static int load_trunk(ktap_option_t *parm, unsigned long **buff)
+{
+	unsigned long *vmstart;
+
+	if (parm->trunk_len > 4096)
+		return -EINVAL;
+
+	vmstart = vmalloc(parm->trunk_len);
+	if (!vmstart)
+		return -ENOMEM;
+
+	if (copy_from_user(vmstart, (void __user *)parm->trunk,
+			   parm->trunk_len)) {
+		vfree(vmstart);
+		return -EFAULT;
+	}
+
+	*buff = vmstart;
+	return 0;
+}
+
+static struct dentry *kp_dir_dentry;
+
+/* Ktap Main Entry */
+static int ktap_main(struct file *file, ktap_option_t *parm)
+{
+	unsigned long *buff = NULL;
+	ktap_state_t *ks;
+	ktap_proto_t *pt;
+	long start_time, delta_time;
+	int ret;
+
+	start_time = gettimeofday_ns();
+
+	ks = kp_vm_new_state(parm, kp_dir_dentry);
+	if (unlikely(!ks))
+		return -ENOEXEC;
+
+	file->private_data = ks;
+
+	ret = load_trunk(parm, &buff);
+	if (ret) {
+		kp_error(ks, "cannot load file\n");
+		goto out;
+	}
+
+	pt = kp_bcread(ks, (unsigned char *)buff, parm->trunk_len);
+
+	vfree(buff);
+
+	if (pt) {
+		/* validate byte code */
+		if (kp_vm_validate_code(ks, pt, ks->stack))
+			goto out;
+
+		delta_time = (gettimeofday_ns() - start_time) / NSEC_PER_USEC;
+		kp_verbose_printf(ks, "booting time: %d (us)\n", delta_time);
+
+		/* enter vm */
+		kp_vm_call_proto(ks, pt);
+	}
+
+ out:
+	kp_vm_exit(ks);
+	return ret;
+}
+
+static long ktap_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	ktap_option_t parm;
+
+	switch (cmd) {
+	case KTAP_CMD_IOC_RUN:
+		/* must be root to run ktap script (at least for now) */
+		if (!capable(CAP_SYS_ADMIN))
+			return -EACCES;
+
+		if (copy_from_user(&parm, (void __user *)arg,
+				   sizeof(ktap_option_t)))
+			return -EFAULT;
+
+		return ktap_main(file, &parm);
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static const struct file_operations ktap_fops = {
+	.llseek                 = no_llseek,
+	.unlocked_ioctl         = ktap_ioctl,
+};
+
+static long ktapvm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	int new_fd, err;
+	struct file *new_file;
+
+	new_fd = get_unused_fd();
+	if (new_fd < 0)
+		return new_fd;
+
+	new_file = anon_inode_getfile("[ktap]", &ktap_fops, NULL, O_RDWR);
+	if (IS_ERR(new_file)) {
+		err = PTR_ERR(new_file);
+		put_unused_fd(new_fd);
+		return err;
+	}
+
+	file->private_data = NULL;
+	fd_install(new_fd, new_file);
+	return new_fd;
+}
+
+static const struct file_operations ktapvm_fops = {
+	.owner  = THIS_MODULE,
+	.unlocked_ioctl         = ktapvm_ioctl,
+};
+
+int (*kp_ftrace_profile_set_filter)(struct perf_event *event, int event_id,
+				    const char *filter_str);
+
+struct syscall_metadata **syscalls_metadata;
+
+/* TODO: remove this function in the future */
+static int __init init_dummy_kernel_functions(void)
+{
+	unsigned long *addr;
+
+	/*
+	 * ktap needs the symbol ftrace_profile_set_filter to set event
+	 * filters; export it in the future.
+	 */
+#ifdef CONFIG_PPC64
+	kp_ftrace_profile_set_filter =
+		(void *)kallsyms_lookup_name(".ftrace_profile_set_filter");
+#else
+	kp_ftrace_profile_set_filter =
+		(void *)kallsyms_lookup_name("ftrace_profile_set_filter");
+#endif
+	if (!kp_ftrace_profile_set_filter) {
+		pr_err("ktap: cannot lookup ftrace_profile_set_filter "
+			"in kallsyms\n");
+		return -1;
+	}
+
+	/* use syscalls_metadata for syscall event handling */
+	addr = (void *)kallsyms_lookup_name("syscalls_metadata");
+	if (!addr) {
+		pr_err("ktap: cannot lookup syscalls_metadata in kallsyms\n");
+		return -1;
+	}
+
+	syscalls_metadata = (struct syscall_metadata **)*addr;
+	return 0;
+}
+
+static int __init init_ktap(void)
+{
+	struct dentry *ktapvm_dentry;
+
+	if (init_dummy_kernel_functions())
+		return -1;
+
+	kp_dir_dentry = debugfs_create_dir("ktap", NULL);
+	if (!kp_dir_dentry) {
+		pr_err("ktap: debugfs_create_dir failed\n");
+		return -1;
+	}
+
+	ktapvm_dentry = debugfs_create_file("ktapvm", 0444, kp_dir_dentry, NULL,
+					    &ktapvm_fops);
+
+	if (!ktapvm_dentry) {
+		pr_err("ktapvm: cannot create ktapvm file\n");
+		debugfs_remove_recursive(kp_dir_dentry);
+		return -1;
+	}
+
+	return 0;
+}
+
+static void __exit exit_ktap(void)
+{
+	debugfs_remove_recursive(kp_dir_dentry);
+}
+
+module_init(init_ktap);
+module_exit(exit_ktap);
+
+MODULE_AUTHOR("Jovi Zhangwei <jovi.zhangwei@gmail.com>");
+MODULE_DESCRIPTION("ktap");
+MODULE_LICENSE("GPL");
+
+int kp_max_loop_count = 100000;
+module_param_named(max_loop_count, kp_max_loop_count, int, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(max_loop_count, "max loop execution count");
+
diff --git a/kernel/trace/ktap/ktap.h b/kernel/trace/ktap/ktap.h
new file mode 100644
index 0000000..90d1468
--- /dev/null
+++ b/kernel/trace/ktap/ktap.h
@@ -0,0 +1,176 @@
+#ifndef __KTAP_H__
+#define __KTAP_H__
+
+#include <linux/version.h>
+#include <linux/hardirq.h>
+#include <linux/trace_seq.h>
+
+/* for built-in library C function register */
+typedef struct ktap_libfunc {
+        const char *name; /* function name */
+        ktap_cfunction func; /* function pointer */
+} ktap_libfunc_t;
+
+long gettimeofday_ns(void); /* common helper function */
+int kp_lib_init_base(ktap_state_t *ks);
+int kp_lib_init_kdebug(ktap_state_t *ks);
+int kp_lib_init_timer(ktap_state_t *ks);
+int kp_lib_init_table(ktap_state_t *ks);
+int kp_lib_init_ansi(ktap_state_t *ks);
+int kp_lib_init_net(ktap_state_t *ks);
+
+void kp_exit_timers(ktap_state_t *ks);
+void kp_freeupval(ktap_state_t *ks, ktap_upval_t *uv);
+
+extern int (*kp_ftrace_profile_set_filter)(struct perf_event *event,
+					   int event_id,
+					   const char *filter_str);
+
+extern struct syscall_metadata **syscalls_metadata;
+
+/* get from kernel/trace/trace.h */
+static __always_inline int trace_get_context_bit(void)
+{
+	int bit;
+
+	if (in_interrupt()) {
+		if (in_nmi())
+			bit = 0;
+		else if (in_irq())
+			bit = 1;
+		else
+			bit = 2;
+	} else
+		bit = 3;
+
+	return bit;
+}
+
+static __always_inline int get_recursion_context(ktap_state_t *ks)
+{
+	int rctx = trace_get_context_bit();
+	int *val = __this_cpu_ptr(G(ks)->recursion_context[rctx]);
+
+	if (*val)
+		return -1;
+
+	*val = true;
+	return rctx;
+}
+
+static inline void put_recursion_context(ktap_state_t *ks, int rctx)
+{
+	int *val = __this_cpu_ptr(G(ks)->recursion_context[rctx]);
+	*val = false;
+}
+
+static inline void *kp_this_cpu_state(ktap_state_t *ks, int rctx)
+{
+	return this_cpu_ptr(G(ks)->percpu_state[rctx]);
+}
+
+static inline void *kp_this_cpu_print_buffer(ktap_state_t *ks)
+{
+	return this_cpu_ptr(G(ks)->percpu_print_buffer[trace_get_context_bit()]);
+}
+
+static inline void *kp_this_cpu_temp_buffer(ktap_state_t *ks)
+{
+	return this_cpu_ptr(G(ks)->percpu_temp_buffer[trace_get_context_bit()]);
+}
+
+#define kp_verbose_printf(ks, ...) \
+	if (G(ks)->parm->verbose)	\
+		kp_printf(ks, "[verbose] "__VA_ARGS__);
+
+/* argument operation macro */
+#define kp_arg(ks, idx)	((ks)->func + (idx))
+#define kp_arg_nr(ks)	((int)(ks->top - (ks->func + 1)))
+
+#define kp_arg_check(ks, idx, type)				\
+	do {							\
+		if (unlikely(itype(kp_arg(ks, idx)) != type)) {	\
+			kp_error(ks, "wrong type of argument %d\n", idx);\
+			return -1;				\
+		}						\
+	} while (0)
+
+#define kp_arg_checkstring(ks, idx)				\
+	({							\
+		ktap_val_t *o = kp_arg(ks, idx);		\
+		if (unlikely(!is_string(o))) {			\
+			kp_error(ks, "wrong type of argument %d\n", idx); \
+			return -1;				\
+		}						\
+		svalue(o);					\
+	})
+
+#define kp_arg_checkfunction(ks, idx)				\
+	({							\
+		ktap_val_t *o = kp_arg(ks, idx);		\
+		if (unlikely(!is_function(o))) {			\
+			kp_error(ks, "wrong type of argument %d\n", idx); \
+			return -1;				\
+		}						\
+		clvalue(o);					\
+	})
+
+#define kp_arg_checknumber(ks, idx)				\
+	({							\
+		ktap_val_t *o = kp_arg(ks, idx);		\
+		if (unlikely(!is_number(o))) {			\
+			kp_error(ks, "wrong type of argument %d\n", idx); \
+			return -1;				\
+		}						\
+		nvalue(o);					\
+	})
+
+#define kp_arg_checkoptnumber(ks, idx, def)			\
+	({							\
+		ktap_number n;					\
+		if (idx > kp_arg_nr(ks)) {				\
+			n = def;				\
+		} else {					\
+			ktap_val_t *o = kp_arg(ks, idx);	\
+			if (unlikely(!is_number(o))) {		\
+				kp_error(ks, "wrong type of argument %d\n", \
+					     idx);		\
+				return -1;			\
+			}					\
+			n = nvalue(o);				\
+		}						\
+		n;						\
+	})
+
+#define kp_error(ks, args...)			\
+	do {					\
+		kp_printf(ks, "error: "args);	\
+		kp_vm_try_to_exit(ks);		\
+		G(ks)->state = KTAP_ERROR;	\
+	} while(0)
+
+
+#define SPRINT_SYMBOL	sprint_symbol_no_offset
+
+extern int kp_max_loop_count;
+
+void kp_printf(ktap_state_t *ks, const char *fmt, ...);
+void __kp_puts(ktap_state_t *ks, const char *str);
+void __kp_bputs(ktap_state_t *ks, const char *str);
+
+#define kp_puts(ks, str) ({						\
+	static const char *trace_printk_fmt				\
+		__attribute__((section("__trace_printk_fmt"))) =	\
+		__builtin_constant_p(str) ? str : NULL;			\
+									\
+	if (__builtin_constant_p(str))					\
+		__kp_bputs(ks, trace_printk_fmt);		\
+	else								\
+		__kp_puts(ks, str);		\
+})
+
+#define err2msg(em)     (kp_err_allmsg+(int)(em))
+extern const char *kp_err_allmsg;
+
+#endif /* __KTAP_H__ */
+
-- 
1.8.1.4



* [PATCH v2 08/29] ktap: add bytecode reader(kernel/trace/ktap/kp_bcread.[c|h])
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (6 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 07/29] ktap: add kernel module main entry(kernel/trace/ktap/ktap.[c|h]) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-30  2:47   ` Andi Kleen
  2014-03-28 14:45 ` [PATCH v2 09/29] ktap: add bytecode execution engine(kernel/trace/ktap/kp_vm.[c|h]) Jovi Zhangwei
                   ` (21 subsequent siblings)
  29 siblings, 1 reply; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Exposed function:
ktap_proto_t *kp_bcread(ktap_state_t *ks, unsigned char *buff, int len)

kp_bcread() reads bytecode from buff and returns the ktap top-level
function prototype, or NULL if the dump cannot be loaded.
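
The reader walks the buffer with a cursor and an end pointer. As an
illustration only, here is a minimal userspace sketch of such a cursor;
struct reader, rd_init(), rd_mem(), rd_byte(), rd_u32() and swap32() are
hypothetical names, not part of this patch. Unlike the posted code, which
asserts after advancing the pointer, this variant checks the remaining
length before every read and returns an error, so a truncated dump can
never move the cursor past the end of the buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct reader {
	const uint8_t *p;	/* next unread byte */
	const uint8_t *pe;	/* one past the last valid byte */
	int swap;		/* non-zero if the dump endianness differs */
};

static void rd_init(struct reader *r, const void *buf, size_t len, int swap)
{
	r->p = buf;
	r->pe = r->p + len;
	r->swap = swap;
}

/* Return a pointer to len bytes and advance, or NULL on overrun. */
static const uint8_t *rd_mem(struct reader *r, size_t len)
{
	const uint8_t *p = r->p;

	if ((size_t)(r->pe - r->p) < len)
		return NULL;	/* cursor is left untouched on failure */
	r->p += len;
	return p;
}

/* Read one byte; returns 0 on success, -1 on overrun. */
static int rd_byte(struct reader *r, uint8_t *out)
{
	const uint8_t *p = rd_mem(r, 1);

	if (!p)
		return -1;
	*out = *p;
	return 0;
}

/* Portable 32-bit byte swap. */
static uint32_t swap32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Read a fixed-width 32-bit value, swapping if the dump requires it. */
static int rd_u32(struct reader *r, uint32_t *out)
{
	const uint8_t *p = rd_mem(r, sizeof(*out));
	uint32_t v;

	if (!p)
		return -1;
	memcpy(&v, p, sizeof(v));	/* avoids unaligned access */
	*out = r->swap ? swap32(v) : v;
	return 0;
}
```

With this shape every caller has to propagate the error, which costs a
few extra branches but keeps length fields taken from the dump from
driving reads outside the copied buffer.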

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 kernel/trace/ktap/kp_bcread.c | 429 ++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/ktap/kp_bcread.h |   6 +
 2 files changed, 435 insertions(+)
 create mode 100644 kernel/trace/ktap/kp_bcread.c
 create mode 100644 kernel/trace/ktap/kp_bcread.h

diff --git a/kernel/trace/ktap/kp_bcread.c b/kernel/trace/ktap/kp_bcread.c
new file mode 100644
index 0000000..b51b622
--- /dev/null
+++ b/kernel/trace/ktap/kp_bcread.c
@@ -0,0 +1,429 @@
+/*
+ * Bytecode reader.
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * Adapted from luajit and lua interpreter.
+ * Copyright (C) 2005-2014 Mike Pall.
+ * Copyright (C) 1994-2008 Lua.org, PUC-Rio.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <uapi/ktap/ktap_types.h>
+#include <uapi/ktap/ktap_bc.h>
+#include <uapi/ktap/ktap_err.h>
+#include "ktap.h"
+#include "kp_obj.h"
+#include "kp_vm.h"
+#include "kp_str.h"
+#include "kp_tab.h"
+
+
+/* Context for bytecode reader. */
+typedef struct BCReadCtx {
+	ktap_state_t *ks;
+	int flags;
+	char *start;
+	char *p;
+	char *pe;
+	ktap_str_t *chunkname;
+	ktap_val_t *savetop;
+} BCReadCtx;
+
+
+#define bcread_flags(ctx)	(ctx)->flags
+#define bcread_swap(ctx) \
+	((bcread_flags(ctx) & BCDUMP_F_BE) != KP_BE*BCDUMP_F_BE)
+#define bcread_oldtop(ctx)		(ctx)->savetop
+#define bcread_savetop(ctx)	(ctx)->savetop = (ctx)->ks->top;
+
+static inline uint32_t bswap(uint32_t x)
+{
+	return (uint32_t)__builtin_bswap32((int32_t)x);
+}
+
+/* -- Input buffer handling ----------------------------------------------- */
+
+/* Throw reader error. */
+static void bcread_error(BCReadCtx *ctx, ErrMsg em)
+{
+	kp_error(ctx->ks, "%s\n", err2msg(em));
+}
+
+/* Return memory block from buffer. */
+static inline uint8_t *bcread_mem(BCReadCtx *ctx, int len)
+{
+	uint8_t *p = (uint8_t *)ctx->p;
+	ctx->p += len;
+	kp_assert(ctx->p <= ctx->pe);
+	return p;
+}
+
+/* Copy memory block from buffer. */
+static void bcread_block(BCReadCtx *ctx, void *q, int len)
+{
+	memcpy(q, bcread_mem(ctx, len), len);
+}
+
+/* Read byte from buffer. */
+static inline uint32_t bcread_byte(BCReadCtx *ctx)
+{
+	kp_assert(ctx->p < ctx->pe);
+	return (uint32_t)(uint8_t)*ctx->p++;
+}
+
+/* Read fixed-width 32-bit value from buffer. */
+static inline uint32_t bcread_uint32(BCReadCtx *ctx)
+{
+	uint32_t v;
+	bcread_block(ctx, &v, sizeof(uint32_t));
+	kp_assert(ctx->p <= ctx->pe);
+	return v;
+}
+
+/* -- Bytecode reader ----------------------------------------------------- */
+
+/* Read debug info of a prototype. */
+static void bcread_dbg(BCReadCtx *ctx, ktap_proto_t *pt, int sizedbg)
+{
+	void *lineinfo = (void *)proto_lineinfo(pt);
+
+	bcread_block(ctx, lineinfo, sizedbg);
+	/* Swap lineinfo if the endianness differs. */
+	if (bcread_swap(ctx) && pt->numline >= 256) {
+		int i, n = pt->sizebc-1;
+		if (pt->numline < 65536) {
+			uint16_t *p = (uint16_t *)lineinfo;
+			for (i = 0; i < n; i++)
+				p[i] = (uint16_t)((p[i] >> 8)|(p[i] << 8));
+		} else {
+			uint32_t *p = (uint32_t *)lineinfo;
+			for (i = 0; i < n; i++)
+				p[i] = bswap(p[i]);
+		}
+	}
+}
+
+/* Find pointer to varinfo. */
+static const void *bcread_varinfo(ktap_proto_t *pt)
+{
+	const uint8_t *p = proto_uvinfo(pt);
+	int n = pt->sizeuv;
+	if (n)
+		while (*p++ || --n) ;
+	return p;
+}
+
+/* Read a single constant key/value of a template table. */
+static int bcread_ktabk(BCReadCtx *ctx, ktap_val_t *o)
+{
+	int tp = bcread_uint32(ctx);
+	if (tp >= BCDUMP_KTAB_STR) {
+		int len = tp - BCDUMP_KTAB_STR;
+		const char *p = (const char *)bcread_mem(ctx, len);
+		ktap_str_t *ts = kp_str_new(ctx->ks, p, len);
+		if (unlikely(!ts))
+			return -ENOMEM;
+
+		set_string(o, ts);
+	} else if (tp == BCDUMP_KTAB_NUM) {
+		set_number(o, *(ktap_number *)bcread_mem(ctx,
+					sizeof(ktap_number)));
+	} else {
+		kp_assert(tp <= BCDUMP_KTAB_TRUE);
+		setitype(o, ~tp);
+	}
+	return 0;
+}
+
+/* Read a template table. */
+static ktap_tab_t *bcread_ktab(BCReadCtx *ctx)
+{
+	int narray = bcread_uint32(ctx);
+	int nhash = bcread_uint32(ctx);
+
+	ktap_tab_t *t = kp_tab_new(ctx->ks, narray, hsize2hbits(nhash));
+	if (!t)
+		return NULL;
+
+	if (narray) {  /* Read array entries. */
+		int i;
+		ktap_val_t *o = t->array;
+		for (i = 0; i < narray; i++, o++) 
+			if (bcread_ktabk(ctx, o))
+				return NULL;
+	}
+	if (nhash) {  /* Read hash entries. */
+		int i;
+		for (i = 0; i < nhash; i++) {
+			ktap_val_t key;
+			ktap_val_t val;
+			if (bcread_ktabk(ctx, &key))
+				return NULL;
+			kp_assert(!is_nil(&key));
+			if (bcread_ktabk(ctx, &val))
+				return NULL;
+			kp_tab_set(ctx->ks, t, &key, &val);
+		}
+	}
+	return t;
+}
+
+/* Read GC constants(string, table, child proto) of a prototype. */
+static int bcread_kgc(BCReadCtx *ctx, ktap_proto_t *pt, int sizekgc)
+{
+	ktap_obj_t **kr = (ktap_obj_t **)pt->k - (ptrdiff_t)sizekgc;
+	int i;
+
+	for (i = 0; i < sizekgc; i++, kr++) {
+		int tp = bcread_uint32(ctx);
+		if (tp >= BCDUMP_KGC_STR) {
+			int len = tp - BCDUMP_KGC_STR;
+			const char *p = (const char *)bcread_mem(ctx, len);
+			*kr =(ktap_obj_t *)kp_str_new(ctx->ks, p, len);
+			if (unlikely(!*kr))
+				return -1;
+		} else if (tp == BCDUMP_KGC_TAB) {
+			*kr = (ktap_obj_t *)bcread_ktab(ctx);
+			if (unlikely(!*kr))
+				return -1;
+		} else if (tp == BCDUMP_KGC_CHILD){
+			ktap_state_t *ks = ctx->ks;
+			if (ks->top <= bcread_oldtop(ctx)) {
+				bcread_error(ctx, KP_ERR_BCBAD);
+				return -1;
+			}
+			ks->top--;
+			*kr = (ktap_obj_t *)ptvalue(ks->top);
+		} else {
+			bcread_error(ctx, KP_ERR_BCBAD);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+/* Read number constants of a prototype. */
+static void bcread_knum(BCReadCtx *ctx, ktap_proto_t *pt, int sizekn)
+{
+	int i;
+	ktap_val_t *o = pt->k;
+
+	for (i = 0; i < sizekn; i++, o++) {
+		set_number(o, *(ktap_number *)bcread_mem(ctx,
+					sizeof(ktap_number)));
+	}
+}
+
+/* Read bytecode instructions. */
+static void bcread_bytecode(BCReadCtx *ctx, ktap_proto_t *pt, int sizebc)
+{
+	BCIns *bc = proto_bc(pt);
+	bc[0] = BCINS_AD((pt->flags & PROTO_VARARG) ? BC_FUNCV : BC_FUNCF,
+			  pt->framesize, 0);
+	bcread_block(ctx, bc+1, (sizebc-1)*(int)sizeof(BCIns));
+	/* Swap bytecode instructions if the endianness differs. */
+	if (bcread_swap(ctx)) {
+		int i;
+		for (i = 1; i < sizebc; i++) bc[i] = bswap(bc[i]);
+	}
+}
+
+/* Read upvalue refs. */
+static void bcread_uv(BCReadCtx *ctx, ktap_proto_t *pt, int sizeuv)
+{
+	if (sizeuv) {
+		uint16_t *uv = proto_uv(pt);
+		bcread_block(ctx, uv, sizeuv*2);
+		/* Swap upvalue refs if the endianness differs. */
+		if (bcread_swap(ctx)) {
+			int i;
+			for (i = 0; i < sizeuv; i++)
+				uv[i] = (uint16_t)((uv[i] >> 8)|(uv[i] << 8));
+		}
+	}
+}
+
+/* Read a prototype. */
+static ktap_proto_t *bcread_proto(BCReadCtx *ctx)
+{
+	ktap_proto_t *pt;
+	int framesize, numparams, flags;
+	int sizeuv, sizekgc, sizekn, sizebc, sizept;
+	int ofsk, ofsuv, ofsdbg;
+	int sizedbg = 0;
+	BCLine firstline = 0, numline = 0;
+
+	/* Read prototype header. */
+	flags = bcread_byte(ctx);
+	numparams = bcread_byte(ctx);
+	framesize = bcread_byte(ctx);
+	sizeuv = bcread_byte(ctx);
+	sizekgc = bcread_uint32(ctx);
+	sizekn = bcread_uint32(ctx);
+	sizebc = bcread_uint32(ctx) + 1;
+	if (!(bcread_flags(ctx) & BCDUMP_F_STRIP)) {
+		sizedbg = bcread_uint32(ctx);
+		if (sizedbg) {
+			firstline = bcread_uint32(ctx);
+			numline = bcread_uint32(ctx);
+		}
+	}
+
+	/* Calculate total size of prototype including all colocated arrays. */
+	sizept = (int)sizeof(ktap_proto_t) + sizebc * (int)sizeof(BCIns) +
+			sizekgc * (int)sizeof(ktap_obj_t *);
+	sizept = (sizept + (int)sizeof(ktap_val_t)-1) &
+			~((int)sizeof(ktap_val_t)-1);
+	ofsk = sizept; sizept += sizekn*(int)sizeof(ktap_val_t);
+	ofsuv = sizept; sizept += ((sizeuv+1)&~1)*2;
+	ofsdbg = sizept; sizept += sizedbg;
+
+	/* Allocate prototype object and initialize its fields. */
+	pt = (ktap_proto_t *)kp_obj_new(ctx->ks, (int)sizept);
+	pt->gct = ~KTAP_TPROTO;
+	pt->numparams = (uint8_t)numparams;
+	pt->framesize = (uint8_t)framesize;
+	pt->sizebc = sizebc;
+	pt->k = (char *)pt + ofsk;
+	pt->uv = (char *)pt + ofsuv;
+	pt->sizekgc = 0;  /* Set to zero until fully initialized. */
+	pt->sizekn = sizekn;
+	pt->sizept = sizept;
+	pt->sizeuv = (uint8_t)sizeuv;
+	pt->flags = (uint8_t)flags;
+	pt->chunkname = ctx->chunkname;
+
+	/* Close potentially uninitialized gap between bc and kgc. */
+	*(uint32_t *)((char *)pt + ofsk - sizeof(ktap_obj_t *)*(sizekgc+1))
+									= 0;
+
+	/* Read bytecode instructions and upvalue refs. */
+	bcread_bytecode(ctx, pt, sizebc);
+	bcread_uv(ctx, pt, sizeuv);
+
+	/* Read constants. */
+	if (bcread_kgc(ctx, pt, sizekgc))
+		return NULL;
+	pt->sizekgc = sizekgc;
+	bcread_knum(ctx, pt, sizekn);
+
+	/* Read and initialize debug info. */
+	pt->firstline = firstline;
+	pt->numline = numline;
+	if (sizedbg) {
+		int sizeli = (sizebc-1) << (numline < 256 ? 0 :
+					numline < 65536 ? 1 : 2);
+		pt->lineinfo = (char *)pt + ofsdbg;
+		pt->uvinfo = (char *)pt + ofsdbg + sizeli;
+		bcread_dbg(ctx, pt, sizedbg);
+		pt->varinfo = (void *)bcread_varinfo(pt);
+	} else {
+		pt->lineinfo = NULL;
+		pt->uvinfo = NULL;
+		pt->varinfo = NULL;
+	}
+	return pt;
+}
+
+/* Read and check header of bytecode dump. */
+static int bcread_header(BCReadCtx *ctx)
+{
+	uint32_t flags;
+
+	if (bcread_byte(ctx) != BCDUMP_HEAD1 ||
+		bcread_byte(ctx) != BCDUMP_HEAD2 ||
+		bcread_byte(ctx) != BCDUMP_HEAD3 ||
+		bcread_byte(ctx) != BCDUMP_VERSION)
+		return -1;
+
+	bcread_flags(ctx) = flags = bcread_byte(ctx);
+
+	if ((flags & ~(BCDUMP_F_KNOWN)) != 0)
+		return -1;
+
+	if ((flags & BCDUMP_F_FFI)) {
+		return -1;
+	}
+
+	if ((flags & BCDUMP_F_STRIP)) {
+		ctx->chunkname = kp_str_newz(ctx->ks, "striped");
+	} else {
+		int len = bcread_uint32(ctx);
+		ctx->chunkname = kp_str_new(ctx->ks,
+				(const char *)bcread_mem(ctx, len), len);
+	}
+
+	if (unlikely(!ctx->chunkname))
+		return -1;
+
+	return 0;
+}
+
+/* Read a bytecode dump. */
+ktap_proto_t *kp_bcread(ktap_state_t *ks, unsigned char *buff, int len)
+{
+	BCReadCtx ctx;
+
+	ctx.ks = ks;
+	ctx.p = buff;
+	ctx.pe = buff + len;
+
+	ctx.start = buff;
+
+	bcread_savetop(&ctx);
+	/* Check for a valid bytecode dump header. */
+	if (bcread_header(&ctx)) {
+		bcread_error(&ctx, KP_ERR_BCFMT);
+		return NULL;
+	}
+
+	for (;;) {  /* Process all prototypes in the bytecode dump. */
+		ktap_proto_t *pt;
+		int len;
+		const char *startp;
+		/* Read length. */
+		if (ctx.p < ctx.pe && ctx.p[0] == 0) {  /* Shortcut EOF. */
+			ctx.p++;
+			break;
+		}
+		len = bcread_uint32(&ctx);
+		if (!len)
+			break;  /* EOF */
+		startp = ctx.p;
+		pt = bcread_proto(&ctx);
+		if (!pt)
+			return NULL;
+		if (ctx.p != startp + len) {
+			bcread_error(&ctx, KP_ERR_BCBAD);
+			return NULL;
+		}
+		set_proto(ks->top, pt);
+		incr_top(ks);
+	}
+	if ((int32_t)(2*(uint32_t)(ctx.pe - ctx.p)) > 0 ||
+			ks->top-1 != bcread_oldtop(&ctx)) {
+		bcread_error(&ctx, KP_ERR_BCBAD);
+		return NULL;
+	}
+
+	/* Pop off last prototype. */
+	ks->top--;
+	return ptvalue(ks->top);
+}
+
diff --git a/kernel/trace/ktap/kp_bcread.h b/kernel/trace/ktap/kp_bcread.h
new file mode 100644
index 0000000..ea2dde2
--- /dev/null
+++ b/kernel/trace/ktap/kp_bcread.h
@@ -0,0 +1,6 @@
+#ifndef __KTAP_BCREAD_H__
+#define __KTAP_BCREAD_H__
+
+ktap_proto_t *kp_bcread(ktap_state_t *ks, unsigned char *buff, int len);
+
+#endif /* __KTAP_BCREAD_H__ */
-- 
1.8.1.4



* [PATCH v2 09/29] ktap: add bytecode execution engine(kernel/trace/ktap/kp_vm.[c|h])
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (7 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 08/29] ktap: add bytecode reader(kernel/trace/ktap/kp_bcread.[c|h]) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 10/29] ktap: add string handling code(kernel/trace/ktap/kp_[str|mempool].[c|h]) Jovi Zhangwei
                   ` (20 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

kp_vm.c is the ktap runtime execution engine; it executes
all bytecode and handles the core VM machinery.

Exposed functions:

1). kp_vm_new_state:    allocate and initialize the main context
2). kp_vm_exit:         exit the ktap main thread
3). kp_vm_register_lib: registration interface for libraries and built-ins
4). kp_vm_validate_code:validate bytecode before execution
5). kp_vm_call:         dispatch and execute bytecode, called from probe context
6). kp_vm_call_proto:   run a new proto, called at startup

The ktap runtime pre-allocates a context (ktap_state_t) for each CPU and
each probe context (process, irq, softirq, nmi), and pre-allocates
ktap stack memory for each ktap_state.
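
To make the pre-allocation scheme concrete, here is a small userspace
model of it. This is a sketch under stated assumptions, not the kernel
code: MODEL_NR_CPUS, struct vm_state, state_get() and state_put() are
hypothetical names, plain arrays stand in for per-CPU variables, and the
CPU number is passed in explicitly. The context numbering follows
trace_get_context_bit() in ktap.h (nmi 0, irq 1, softirq 2, process 3).

```c
#include <stdbool.h>
#include <stddef.h>

enum probe_ctx { CTX_NMI, CTX_IRQ, CTX_SOFTIRQ, CTX_PROCESS, CTX_MAX };

#define MODEL_NR_CPUS 4		/* hypothetical CPU count for the model */

struct vm_state { int depth; };	/* stand-in for ktap_state_t */

/* One pre-allocated state and one busy flag per (cpu, context) pair. */
static struct vm_state states[MODEL_NR_CPUS][CTX_MAX];
static bool in_use[MODEL_NR_CPUS][CTX_MAX];

/*
 * Claim the state for this cpu and context. Returns NULL if a probe is
 * already running in the same context on the same cpu (recursion).
 */
static struct vm_state *state_get(int cpu, enum probe_ctx ctx)
{
	if (in_use[cpu][ctx])
		return NULL;
	in_use[cpu][ctx] = true;
	return &states[cpu][ctx];
}

/* Release the state claimed by state_get(). */
static void state_put(int cpu, enum probe_ctx ctx)
{
	in_use[cpu][ctx] = false;
}
```

Because an interrupt that arrives while a process-context probe is
running lands in a different slot, it gets its own state and stack and
no allocation or locking is needed on the probe path. Only a probe that
re-enters the same context on the same CPU is refused.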

The key function is kp_vm_call, which dispatches and executes bytecode.
Computed goto is used for fast dispatch instead of a big switch.
Every bytecode checks its operand types at runtime and returns on a
type mismatch.

Hot loops are checked when loop-related bytecode executes, to forbid
infinite loops; the limit can be set by a module parameter.
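
For readers unfamiliar with the technique, the following toy interpreter
sketches computed-goto dispatch combined with a loop budget. It is an
illustration only: the opcodes, struct insn and vm_run() are hypothetical
and unrelated to the ktap bytecode, and the exact accounting in kp_vm.c
may differ. It relies on the GNU C "labels as values" extension, so it
needs GCC or Clang.

```c
#include <stddef.h>

enum { OP_SETCNT, OP_ADD, OP_LOOP, OP_HALT, OP_NR };

struct insn {
	int op;		/* one of the OP_* values */
	int arg;	/* immediate operand or jump target index */
};

/*
 * Run a program. Returns 0 and stores the accumulator in *result, or -1
 * if the number of backward jumps exceeds max_loop_count. Opcodes are
 * assumed to be valid, as they would be after a validation pass.
 */
static int vm_run(const struct insn *code, int max_loop_count, long *result)
{
	/* Dispatch table indexed by opcode (GNU C labels as values). */
	static void *const dispatch[OP_NR] = {
		[OP_SETCNT] = &&do_setcnt,
		[OP_ADD]    = &&do_add,
		[OP_LOOP]   = &&do_loop,
		[OP_HALT]   = &&do_halt,
	};
	const struct insn *pc = code;
	long acc = 0, cnt = 0;
	int loops = 0;

/* Jump straight to the handler of the next instruction. */
#define DISPATCH() goto *dispatch[pc->op]

	DISPATCH();

do_setcnt:
	cnt = pc->arg;		/* load the loop counter */
	pc++;
	DISPATCH();
do_add:
	acc += pc->arg;		/* add an immediate to the accumulator */
	pc++;
	DISPATCH();
do_loop:
	if (--cnt > 0) {
		if (++loops > max_loop_count)
			return -1;	/* loop budget exhausted */
		pc = code + pc->arg;	/* jump back to the loop body */
	} else {
		pc++;
	}
	DISPATCH();
do_halt:
	*result = acc;
	return 0;
#undef DISPATCH
}
```

Each handler ends with its own indirect jump instead of returning to a
shared switch, which is what makes this form of dispatch attractive for
interpreters. The budget check sits only on the backward jump, so
straight-line code pays nothing for it.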

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 kernel/trace/ktap/kp_vm.c | 1754 +++++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/ktap/kp_vm.h |   43 ++
 2 files changed, 1797 insertions(+)
 create mode 100644 kernel/trace/ktap/kp_vm.c
 create mode 100644 kernel/trace/ktap/kp_vm.h

diff --git a/kernel/trace/ktap/kp_vm.c b/kernel/trace/ktap/kp_vm.c
new file mode 100644
index 0000000..acb1d21
--- /dev/null
+++ b/kernel/trace/ktap/kp_vm.c
@@ -0,0 +1,1754 @@
+/*
+ * kp_vm.c - ktap script virtual machine in Linux kernel
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * Adapted from luajit and lua interpreter.
+ * Copyright (C) 2005-2014 Mike Pall.
+ * Copyright (C) 1994-2008 Lua.org, PUC-Rio.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <linux/slab.h>
+#include <linux/ftrace_event.h>
+#include <linux/signal.h>
+#include <linux/sched.h>
+#include <linux/uaccess.h>
+#include <uapi/ktap/ktap_types.h>
+#include <uapi/ktap/ktap_bc.h>
+#include "ktap.h"
+#include "kp_obj.h"
+#include "kp_str.h"
+#include "kp_mempool.h"
+#include "kp_tab.h"
+#include "kp_transport.h"
+#include "kp_vm.h"
+#include "kp_events.h"
+
+#define KTAP_MIN_RESERVED_STACK_SIZE 20
+#define KTAP_STACK_SIZE		120 /* enlarge this value for big stack */
+#define KTAP_STACK_SIZE_BYTES	(KTAP_STACK_SIZE * sizeof(ktap_val_t))
+
+#define KTAP_PERCPU_BUFFER_SIZE	(3 * PAGE_SIZE)
+
+static ktap_cfunction gfunc_get(ktap_state_t *ks, int idx);
+static int gfunc_getidx(ktap_global_state_t *g, ktap_cfunction cfunc);
+
+static ktap_str_t *str_concat(ktap_state_t *ks, StkId top, int start, int end)
+{
+	int i, len = 0;
+	ktap_str_t *ts;
+	char *ptr, *buffer;
+
+	for (i = start; i <= end; i++) {
+		if (!is_string(top + i)) {
+			kp_error(ks, "cannot concat non-string\n");
+			return NULL;
+		}
+
+		len += rawtsvalue(top + i)->len;
+	}
+
+	if (len >= KTAP_PERCPU_BUFFER_SIZE) {
+		kp_error(ks, "string concatenation too long\n");
+		return NULL;
+	}
+
+	preempt_disable_notrace();
+
+	buffer = kp_this_cpu_print_buffer(ks);
+	ptr = buffer;
+
+	for (i = start; i <= end; i++) {
+		int len = rawtsvalue(top + i)->len;
+		strncpy(ptr, svalue(top + i), len);
+		ptr += len;
+	}
+	ts = kp_str_new(ks, buffer, len);
+
+	preempt_enable_notrace();
+
+	return ts;
+}
+
+static ktap_upval_t *findupval(ktap_state_t *ks, StkId slot)
+{
+	ktap_global_state_t *g = G(ks);
+	ktap_upval_t **pp = &ks->openupval;
+	ktap_upval_t *p;
+	ktap_upval_t *uv;
+
+	while (*pp != NULL && (p = *pp)->v >= slot) {
+		if (p->v == slot) {  /* found a corresponding upvalue? */
+			return p;
+		}
+		pp = (ktap_upval_t **)&p->nextgc;
+	}
+
+	/* not found: create a new one */
+	uv = (ktap_upval_t *)kp_malloc(ks, sizeof(ktap_upval_t));
+	if (!uv)
+		return NULL;
+	uv->gct = ~KTAP_TUPVAL;
+	uv->closed = 0; /* still open */
+	uv->v = slot;  /* current value lives in the stack */
+	/* Insert into sorted list of open upvalues. */
+	uv->nextgc = (ktap_obj_t *)*pp;
+	*pp = uv;
+	uv->prev = &g->uvhead;  /* double link it in `uvhead' list */
+	uv->next = g->uvhead.next;
+	uv->next->prev = uv;
+	g->uvhead.next = uv;
+	return uv;
+}
+
+static void unlinkupval(ktap_upval_t *uv)
+{
+	uv->next->prev = uv->prev;  /* remove from `uvhead' list */
+	uv->prev->next = uv->next;
+}
+
+void kp_freeupval(ktap_state_t *ks, ktap_upval_t *uv)
+{
+	if (!uv->closed)  /* is it open? */
+		unlinkupval(uv);  /* remove from open list */
+	kp_free(ks, uv);  /* free upvalue */
+}
+
+/* close upvals */
+static void func_closeuv(ktap_state_t *ks, StkId level)
+{
+	ktap_upval_t *uv;
+	ktap_global_state_t *g = G(ks);
+	while (ks->openupval != NULL &&
+		(uv = ks->openupval)->v >= level) {
+		ktap_obj_t *o = obj2gco(uv);
+		/* remove from `open' list */
+		ks->openupval = (ktap_upval_t *)uv->nextgc;
+		unlinkupval(uv);  /* remove upvalue from 'uvhead' list */
+		set_obj(&uv->tv, uv->v);  /* move value to upvalue slot */
+		uv->v = &uv->tv;  /* now current value lives here */
+		uv->closed = 1;
+		gch(o)->nextgc = g->allgc; /* link upvalue into 'allgc' list */
+		g->allgc = o;
+	}
+}
+
+#define SIZE_KTAP_FUNC(n) (sizeof(ktap_func_t) - sizeof(ktap_obj_t *) + \
+			   sizeof(ktap_obj_t *) * (n))
+static ktap_func_t *func_new_empty(ktap_state_t *ks, ktap_proto_t *pt)
+{
+	ktap_func_t *fn;
+
+	/* only mainthread can create new function */
+	if (ks != G(ks)->mainthread) {
+		kp_error(ks, "only mainthread can create function\n");
+		return NULL;
+	}
+
+	fn = (ktap_func_t *)kp_obj_new(ks, SIZE_KTAP_FUNC(pt->sizeuv));
+	if (!fn)
+		return NULL;
+	fn->gct = ~KTAP_TFUNC;
+	fn->nupvalues = 0; /* Set to zero until upvalues are initialized. */
+	fn->pc = proto_bc(pt);
+	fn->p = pt;
+
+	return fn;
+}
+
+static ktap_func_t *func_new(ktap_state_t *ks, ktap_proto_t *pt,
+			     ktap_func_t *parent, ktap_val_t *base)
+{
+	ktap_func_t *fn;
+	int nuv = pt->sizeuv, i;
+
+	fn = func_new_empty(ks, pt);
+	if (!fn)
+		return NULL;
+
+	fn->nupvalues = nuv;
+	for (i = 0; i < nuv; i++) {
+		uint32_t v = proto_uv(pt)[i];
+		ktap_upval_t *uv;
+
+		if (v & PROTO_UV_LOCAL) {
+			uv = findupval(ks, base + (v & 0xff));
+			if (!uv)
+				return NULL;
+			uv->immutable = ((v / PROTO_UV_IMMUTABLE) & 1);
+		} else {
+			uv = parent->upvals[v];
+		}
+		fn->upvals[i] = uv;
+	}
+	return fn;
+}
+
+static inline int checkstack(ktap_state_t *ks, int n)
+{
+	if (unlikely(ks->stack_last - ks->top <= n)) {
+		kp_error(ks, "stack overflow, please enlarge stack size\n");
+		return -1;
+	}
+	return 0;
+}
+
+static StkId adjust_varargs(ktap_state_t *ks, ktap_proto_t *p, int actual)
+{
+	int i;
+	int nfixargs = p->numparams;
+	StkId base, fixed;
+
+	/* move fixed parameters to final position */
+	fixed = ks->top - actual;  /* first fixed argument */
+	base = ks->top;  /* final position of first argument */
+
+	for (i = 0; i < nfixargs; i++) {
+		set_obj(ks->top++, fixed + i);
+		set_nil(fixed + i);
+	}
+
+	return base;
+}
+
+static void poscall(ktap_state_t *ks, StkId func, StkId first_result,
+		   int wanted)
+{
+	int i;
+
+	for (i = wanted; i != 0 && first_result < ks->top; i--)
+		set_obj(func++, first_result++);
+
+	while (i-- > 0)
+		set_nil(func++);
+}
+
+void kp_vm_call_proto(ktap_state_t *ks, ktap_proto_t *pt)
+{
+	ktap_func_t *fn;
+
+	fn = func_new_empty(ks, pt);
+	if (!fn)
+		return;
+	set_func(ks->top++, fn);
+	kp_vm_call(ks, ks->top - 1, 0);
+}
+
+/*
+ * Hot loop detection
+ *
+ * Check for hot loops in three cases:
+ * 1. jmp -x: this happens in 'while (expr) { ... }'
+ * 2. FORPREP-FORLOOP
+ * 3. TFORCALL-TFORLOOP
+ */
+static __always_inline int check_hot_loop(ktap_state_t *ks, int loop_count)
+{
+	if (unlikely(loop_count == kp_max_loop_count)) {
+		kp_error(ks, "loop execution count exceeds max limit (%d)\n",
+			     kp_max_loop_count);
+		return -1;
+	}
+
+	return 0;
+}
+
+#define dojump(i, e) { pc += (int)bc_d(i) - BCBIAS_J + e; }
+#define donextjump  { instr = *pc; dojump(instr, 1); }
+
+#define NUMADD(a, b)    ((a) + (b))
+#define NUMSUB(a, b)    ((a) - (b))
+#define NUMMUL(a, b)    ((a) * (b))
+#define NUMDIV(a, b)    ((a) / (b))
+#define NUMUNM(a)       (-(a))
+#define NUMEQ(a, b)     ((a) == (b))
+#define NUMLT(a, b)     ((a) < (b))
+#define NUMLE(a, b)     ((a) <= (b))
+#define NUMMOD(a, b)    ((a) % (b))
+
+#define arith_VV(ks, op) { \
+	ktap_val_t *rb = RB; \
+	ktap_val_t *rc = RC; \
+	if (is_number(rb) && is_number(rc)) { \
+		ktap_number nb = nvalue(rb), nc = nvalue(rc); \
+		set_number(RA, op(nb, nc)); \
+	} else {	\
+		kp_puts(ks, "Error: Cannot make arith operation\n");	\
+		return;	\
+	} }
+
+#define arith_VN(ks, op) { \
+	ktap_val_t *rb = RB; \
+	if (is_number(rb)) { \
+		ktap_number nb = nvalue(rb);\
+		ktap_number nc = nvalue((ktap_val_t *)kbase + bc_c(instr));\
+		set_number(RA, op(nb, nc)); \
+	} else {	\
+		kp_puts(ks, "Error: Cannot make arith operation\n");	\
+		return;	\
+	} }
+
+#define arith_NV(ks, op) { \
+	ktap_val_t *rb = RB; \
+	if (is_number(rb)) { \
+		ktap_number nb = nvalue(rb);\
+		ktap_number nc = nvalue((ktap_val_t *)kbase + bc_c(instr));\
+		set_number(RA, op(nc, nb)); \
+	} else {	\
+		kp_puts(ks, "Error: Cannot make arith operation\n");	\
+		return;	\
+	} }
+
+
+static const char * const bc_names[] = {
+#define BCNAME(name, ma, mb, mc, mt)       #name,
+	BCDEF(BCNAME)
+#undef BCNAME
+	NULL
+};
+
+
+/*
+ * ktap bytecode interpreter routine
+ *
+ *
+ * kp_vm_call can only be used to:
+ * 1). call a ktap function, not a light C function
+ * 2). call a function that accepts a fixed number of arguments
+ */
+void kp_vm_call(ktap_state_t *ks, StkId func, int nresults)
+{
+	int loop_count = 0;
+	ktap_func_t *fn;
+	ktap_proto_t *pt;
+	ktap_obj_t **kbase;
+	unsigned int instr, op;
+	const unsigned int *pc;
+	StkId base; /* stack pointer */
+	int multres = 0; /* temp variable */
+	ktap_tab_t *gtab = G(ks)->gtab;
+
+	/* use computed goto for opcode dispatch */
+
+	static void *dispatch_table[] = {
+#define BCNAME(name, ma, mb, mc, mt)       &&DO_BC_##name,
+		BCDEF(BCNAME)
+#undef BCNAME
+	};
+
+#define DISPATCH()				\
+	do {					\
+		instr = *(pc++);		\
+		op = bc_op(instr);		\
+		goto *dispatch_table[op];	\
+	} while (0)
+
+#define RA	(base + bc_a(instr))
+#define RB	(base + bc_b(instr))
+#define RC	(base + bc_c(instr))
+#define RD	(base + bc_d(instr))
+#define RKD	((ktap_val_t *)kbase + bc_d(instr))
+
+	/* TODO: fix argument number mismatch, example: sort cmp closure */
+
+	fn = clvalue(func);
+	pt = fn->p;
+	kbase = fn->p->k;
+	base = func + 1;
+	pc = proto_bc(pt) + 1;
+	ks->top = base + pt->framesize;
+	func->pcr = 0; /* no previous frame */
+
+	/* main loop of interpreter */
+	DISPATCH();
+
+	while (1) {
+	DO_BC_ISLT: /* Jump if A < D */
+		if (!is_number(RA) || !is_number(RD)) {
+			kp_error(ks, "compare with non-number\n");
+			return;
+		}
+
+		if (nvalue(RA) >= nvalue(RD))
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISGE: /* Jump if A >= D */
+		if (!is_number(RA) || !is_number(RD)) {
+			kp_error(ks, "compare with non-number\n");
+			return;
+		}
+
+		if (nvalue(RA) < nvalue(RD))
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISLE: /* Jump if A <= D */
+		if (!is_number(RA) || !is_number(RD)) {
+			kp_error(ks, "compare with non-number\n");
+			return;
+		}
+
+		if (nvalue(RA) > nvalue(RD))
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISGT: /* Jump if A > D */
+		if (!is_number(RA) || !is_number(RD)) {
+			kp_error(ks, "compare with non-number\n");
+			return;
+		}
+
+		if (nvalue(RA) <= nvalue(RD))
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISEQV: /* Jump if A = D */
+		if (!kp_obj_equal(RA, RD))
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISNEV: /* Jump if A != D */
+		if (kp_obj_equal(RA, RD))
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISEQS: { /* Jump if A = D */
+		int idx = ~bc_d(instr);
+
+		if (!is_string(RA) ||
+				rawtsvalue(RA) != (ktap_str_t *)kbase[idx])
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+		}
+	DO_BC_ISNES: { /* Jump if A != D */
+		int idx = ~bc_d(instr);
+
+		if (is_string(RA) &&
+			rawtsvalue(RA) == (ktap_str_t *)kbase[idx])
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+		}
+	DO_BC_ISEQN: /* Jump if A = D */
+		if (!is_number(RA) || nvalue(RA) != nvalue(RKD))
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISNEN: /* Jump if A != D */
+		if (is_number(RA) && nvalue(RA) == nvalue(RKD))
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISEQP: /* Jump if A = D */
+		if (itype(RA) != ~bc_d(instr))
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISNEP: /* Jump if A != D */
+		if (itype(RA) == ~bc_d(instr))
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISTC: /* Copy D to A and jump, if D is true */
+		if (itype(RD) == KTAP_TNIL || itype(RD) == KTAP_TFALSE)
+			pc++;
+		else {
+			set_obj(RA, RD);
+			donextjump;
+		}
+		DISPATCH();
+	DO_BC_ISFC: /* Copy D to A and jump, if D is false */
+		if (itype(RD) != KTAP_TNIL && itype(RD) != KTAP_TFALSE)
+			pc++;
+		else {
+			set_obj(RA, RD);
+			donextjump;
+		}
+		DISPATCH();
+	DO_BC_IST: /* Jump if D is true */
+		if (itype(RD) == KTAP_TNIL || itype(RD) == KTAP_TFALSE)
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISF: /* Jump if D is false */
+		/* only nil and false are considered false,
+		 * all other values are true */
+		if (itype(RD) != KTAP_TNIL && itype(RD) != KTAP_TFALSE)
+			pc++;
+		else
+			donextjump;
+		DISPATCH();
+	DO_BC_ISTYPE: /* generated by genlibbc, not compiler; not used now */
+	DO_BC_ISNUM:
+		return;
+	DO_BC_MOV: /* Copy D to A */
+		set_obj(RA, RD);
+		DISPATCH();
+	DO_BC_NOT: /* Set A to boolean not of D */
+		if (itype(RD) == KTAP_TNIL || itype(RD) == KTAP_TFALSE)
+			setitype(RA, KTAP_TTRUE);
+		else
+			setitype(RA, KTAP_TFALSE);
+
+		DISPATCH();
+	DO_BC_UNM: /* Set A to -D (unary minus) */
+		if (!is_number(RD)) {
+			kp_error(ks, "use '-' operator on non-number\n");
+			return;
+		}
+
+		set_number(RA, -nvalue(RD));
+		DISPATCH();
+	DO_BC_ADDVN: /* A = B + C */
+		arith_VN(ks, NUMADD);
+		DISPATCH();
+	DO_BC_SUBVN: /* A = B - C */
+		arith_VN(ks, NUMSUB);
+		DISPATCH();
+	DO_BC_MULVN: /* A = B * C */
+		arith_VN(ks, NUMMUL);
+		DISPATCH();
+	DO_BC_DIVVN: /* A = B / C */
+		/* check for division by zero */
+		if (!nvalue((ktap_val_t *)kbase + bc_c(instr))) {
+			kp_error(ks, "divide 0 arith operation\n");
+			return;
+		}
+		arith_VN(ks, NUMDIV);
+		DISPATCH();
+	DO_BC_MODVN: /* A = B % C */
+		/* check for modulo by zero */
+		if (!nvalue((ktap_val_t *)kbase + bc_c(instr))) {
+			kp_error(ks, "mod 0 arith operation\n");
+			return;
+		}
+		arith_VN(ks, NUMMOD);
+		DISPATCH();
+	DO_BC_ADDNV: /* A = C + B */
+		arith_NV(ks, NUMADD);
+		DISPATCH();
+	DO_BC_SUBNV: /* A = C - B */
+		arith_NV(ks, NUMSUB);
+		DISPATCH();
+	DO_BC_MULNV: /* A = C * B */
+		arith_NV(ks, NUMMUL);
+		DISPATCH();
+	DO_BC_DIVNV: /* A = C / B */
+		/* check for division by zero */
+		if (!nvalue(RB)) {
+			kp_error(ks, "divide 0 arith operation\n");
+			return;
+		}
+		arith_NV(ks, NUMDIV);
+		DISPATCH();
+	DO_BC_MODNV: /* A = C % B */
+		/* check for modulo by zero */
+		if (!nvalue(RB)) {
+			kp_error(ks, "mod 0 arith operation\n");
+			return;
+		}
+		arith_NV(ks, NUMMOD);
+		DISPATCH();
+	DO_BC_ADDVV: /* A = B + C */
+		arith_VV(ks, NUMADD);
+		DISPATCH();
+	DO_BC_SUBVV: /* A = B - C */
+		arith_VV(ks, NUMSUB);
+		DISPATCH();
+	DO_BC_MULVV: /* A = B * C */
+		arith_VV(ks, NUMMUL);
+		DISPATCH();
+	DO_BC_DIVVV: /* A = B / C */
+		arith_VV(ks, NUMDIV);
+		DISPATCH();
+	DO_BC_MODVV: /* A = B % C */
+		arith_VV(ks, NUMMOD);
+		DISPATCH();
+	DO_BC_POW: /* A = B ^ C, rejected */
+		return;
+	DO_BC_CAT: { /* A = B .. ~ .. C */
+		/* The CAT instruction concatenates all values in
+		 * variable slots B to C inclusive. */
+		ktap_str_t *ts = str_concat(ks, base, bc_b(instr),
+					    bc_c(instr));
+		if (!ts)
+			return;
+
+		set_string(RA, ts);
+		DISPATCH();
+		}
+	DO_BC_KSTR: { /* Set A to string constant D */
+		int idx = ~bc_d(instr);
+		set_string(RA, (ktap_str_t *)kbase[idx]);
+		DISPATCH();
+		}
+	DO_BC_KCDATA: /* not used now */
+		DISPATCH();
+	DO_BC_KSHORT: /* Set A to 16 bit signed integer D */
+		set_number(RA, bc_d(instr));
+		DISPATCH();
+	DO_BC_KNUM: /* Set A to number constant D */
+		set_number(RA, nvalue(RKD));
+		DISPATCH();
+	DO_BC_KPRI: /* Set A to primitive D */
+		setitype(RA, ~bc_d(instr));
+		DISPATCH();
+	DO_BC_KNIL: { /* Set slots A to D to nil */
+		int i;
+		for (i = 0; i <= bc_d(instr) - bc_a(instr); i++) {
+			set_nil(RA + i);
+		}
+		DISPATCH();
+		}
+	DO_BC_UGET: /* Set A to upvalue D */
+		set_obj(RA, fn->upvals[bc_d(instr)]->v);
+		DISPATCH();
+	DO_BC_USETV: /* Set upvalue A to D */
+		set_obj(fn->upvals[bc_a(instr)]->v, RD);
+		DISPATCH();
+	DO_BC_UINCV: { /* upvalues[A] += D */
+		ktap_val_t *v = fn->upvals[bc_a(instr)]->v;
+		if (unlikely(!is_number(RD) || !is_number(v))) {
+			kp_error(ks, "use '+=' on non-number\n");
+			return;
+		}
+		set_number(v, nvalue(v) + nvalue(RD));
+		DISPATCH();
+		}
+	DO_BC_USETS: { /* Set upvalue A to string constant D */
+		int idx = ~bc_d(instr);
+		set_string(fn->upvals[bc_a(instr)]->v,
+				(ktap_str_t *)kbase[idx]);
+		DISPATCH();
+		}
+	DO_BC_USETN: /* Set upvalue A to number constant D */
+		set_number(fn->upvals[bc_a(instr)]->v, nvalue(RKD));
+		DISPATCH();
+	DO_BC_UINCN: { /* upvalues[A] += D */
+		ktap_val_t *v = fn->upvals[bc_a(instr)]->v;
+		if (unlikely(!is_number(v))) {
+			kp_error(ks, "use '+=' on non-number\n");
+			return;
+		}
+		set_number(v, nvalue(v) + nvalue(RKD));
+		DISPATCH();
+		}
+	DO_BC_USETP: /* Set upvalue A to primitive D */
+		setitype(fn->upvals[bc_a(instr)]->v, ~bc_d(instr));
+		DISPATCH();
+	DO_BC_UCLO: /* Close upvalues for slots >= rbase and jump to target D */
+		if (ks->openupval != NULL)
+			func_closeuv(ks, RA);
+		dojump(instr, 0);
+		DISPATCH();
+	DO_BC_FNEW: {
+		/* Create new closure from prototype D and store it in A */
+		int idx = ~bc_d(instr);
+		ktap_func_t *subfn = func_new(ks, (ktap_proto_t *)kbase[idx],
+					      fn, base);
+		if (unlikely(!subfn))
+			return;
+		set_func(RA, subfn);
+		DISPATCH();
+		}
+	DO_BC_TNEW: { /* Set A to new table with size D */
+		/*
+		 * Preallocate the default narr and nrec;
+		 * op_b and op_c are not used.
+		 * This may allocate extra memory for some static tables.
+		 */
+		ktap_tab_t *t = kp_tab_new_ah(ks, 0, 0);
+		if (unlikely(!t))
+			return;
+		set_table(RA, t);
+		DISPATCH();
+		}
+	DO_BC_TDUP: { /* Set A to duplicated template table D */
+		int idx = ~bc_d(instr);
+		ktap_tab_t *t = kp_tab_dup(ks, (ktap_tab_t *)kbase[idx]);
+		if (!t)
+			return;
+		set_table(RA, t);
+		DISPATCH();
+		}
+	DO_BC_GGET: { /* A = _G[D] */
+		int idx = ~bc_d(instr);
+		kp_tab_getstr(gtab, (ktap_str_t *)kbase[idx], RA);
+		DISPATCH();
+		}
+	DO_BC_GSET: /* _G[D] = A, rejected. */
+	DO_BC_GINC: /* _G[D] += A, rejected. */
+		return;
+	DO_BC_TGETV: /* A = B[C] */
+		if (unlikely(!is_table(RB))) {
+			kp_error(ks, "get key from non-table\n");
+			return;
+		}
+
+		kp_tab_get(ks, hvalue(RB), RC, RA);
+		DISPATCH();
+	DO_BC_TGETS: { /* A = B[C] */
+		int idx = ~bc_c(instr);
+
+		if (unlikely(!is_table(RB))) {
+			kp_error(ks, "get key from non-table\n");
+			return;
+		}
+		kp_tab_getstr(hvalue(RB), (ktap_str_t *)kbase[idx], RA);
+		DISPATCH();
+		}
+	DO_BC_TGETB: { /* A = B[C] */
+		/* 8 bit literal C operand as an unsigned integer
+		 * index (0..255) */
+		uint8_t idx = bc_c(instr);
+
+		if (unlikely(!is_table(RB))) {
+			kp_error(ks, "get key from non-table\n");
+			return;
+		}
+		kp_tab_getint(hvalue(RB), idx, RA);
+		DISPATCH();
+		}
+	DO_BC_TGETR: /* generated by genlibbc, not compiler, not used */
+		return;
+	DO_BC_TSETV: /* B[C] = A */
+		if (unlikely(!is_table(RB))) {
+			kp_error(ks, "set key to non-table\n");
+			return;
+		}
+		kp_tab_set(ks, hvalue(RB), RC, RA);
+		DISPATCH();
+	DO_BC_TINCV: /* B[C] += A */
+		if (unlikely(!is_table(RB))) {
+			kp_error(ks, "set key to non-table\n");
+			return;
+		}
+		if (unlikely(!is_number(RA))) {
+			kp_error(ks, "use '+=' on non-number\n");
+			return;
+		}
+		kp_tab_incr(ks, hvalue(RB), RC, nvalue(RA));
+		DISPATCH();
+	DO_BC_TSETS: { /* B[C] = A */
+		int idx = ~bc_c(instr);
+
+		if (unlikely(!is_table(RB))) {
+			kp_error(ks, "set key to non-table\n");
+			return;
+		}
+		kp_tab_setstr(ks, hvalue(RB), (ktap_str_t *)kbase[idx], RA);
+		DISPATCH();
+		}
+	DO_BC_TINCS: { /* B[C] += A */
+		int idx = ~bc_c(instr);
+
+		if (unlikely(!is_table(RB))) {
+			kp_error(ks, "set key to non-table\n");
+			return;
+		}
+		if (unlikely(!is_number(RA))) {
+			kp_error(ks, "use '+=' on non-number\n");
+			return;
+		}
+		kp_tab_incrstr(ks, hvalue(RB), (ktap_str_t *)kbase[idx],
+				nvalue(RA));
+		DISPATCH();
+		}
+	DO_BC_TSETB: { /* B[C] = A */
+		/* 8 bit literal C operand as an unsigned integer
+		 * index (0..255) */
+		uint8_t idx = bc_c(instr);
+
+		if (unlikely(!is_table(RB))) {
+			kp_error(ks, "set key to non-table\n");
+			return;
+		}
+		kp_tab_setint(ks, hvalue(RB), idx, RA);
+		DISPATCH();
+		}
+	DO_BC_TINCB: { /* B[C] += A */
+		uint8_t idx = bc_c(instr);
+
+		if (unlikely(!is_table(RB))) {
+			kp_error(ks, "set key to non-table\n");
+			return;
+		}
+		if (unlikely(!is_number(RA))) {
+			kp_error(ks, "use '+=' on non-number\n");
+			return;
+		}
+		kp_tab_incrint(ks, hvalue(RB), idx, nvalue(RA));
+		DISPATCH();
+		}
+	DO_BC_TSETM: /* don't support */
+		return;
+	DO_BC_TSETR: /* generated by genlibbc, not compiler, not used */
+		return;
+	DO_BC_CALLM:
+	DO_BC_CALL: { /* b: return_number + 1; c: argument + 1 */
+		int c = bc_c(instr);
+		int nresults = bc_b(instr) - 1;
+		StkId oldtop = ks->top;
+		StkId newfunc = RA;
+
+		if (op == BC_CALL && c != 0)
+			ks->top = RA + c;
+		else if (op == BC_CALLM)
+			ks->top = RA + c + multres;
+
+		if (itype(newfunc) == KTAP_TCFUNC) { /* light C function */
+			ktap_cfunction f = fvalue(newfunc);
+			int n;
+
+			if (unlikely(checkstack(ks,
+					KTAP_MIN_RESERVED_STACK_SIZE)))
+				return;
+
+			ks->func = newfunc;
+			n = (*f)(ks);
+			if (unlikely(n < 0)) /* error occurred */
+				return;
+			poscall(ks, newfunc, ks->top - n, nresults);
+
+			ks->top = oldtop;
+			multres = n + 1; /* set to multres */
+			DISPATCH();
+		} else if (itype(newfunc) == KTAP_TFUNC) { /* ktap function */
+			int n;
+
+			func = newfunc;
+			pt = clvalue(func)->p;
+
+			if (unlikely(checkstack(ks, pt->framesize)))
+				return;
+
+			/* get number of real arguments */
+			n = (int)(ks->top - func) - 1;
+
+			/* complete missing arguments */
+			for (; n < pt->numparams; n++)
+				set_nil(ks->top++);
+
+			base = (!(pt->flags & PROTO_VARARG)) ? func + 1 :
+						adjust_varargs(ks, pt, n);
+
+			fn = clvalue(func);
+			pt = fn->p;
+			kbase = pt->k;
+			func->pcr = pc - 1; /* save pc */
+			ks->top = base + pt->framesize;
+			pc = proto_bc(pt) + 1; /* starting point */
+			DISPATCH();
+		} else {
+			kp_error(ks, "attempt to call nil function\n");
+			return;
+		}
+		}
+	DO_BC_CALLMT: /* don't support */
+		return;
+	DO_BC_CALLT: { /* Tailcall: return A(A+1, ..., A+D-1) */
+		StkId nfunc = RA;
+
+		if (itype(nfunc) == KTAP_TCFUNC) { /* light C function */
+			kp_error(ks, "don't support callt for C function\n");
+			return;
+		} else if (itype(nfunc) == KTAP_TFUNC) { /* ktap function */
+			int aux;
+
+			/*
+			 * tail call: put called frame (n) in place of
+			 * caller one (o)
+			 */
+			StkId ofunc = func; /* caller function */
+			/* last stack slot filled by 'precall' */
+			StkId lim = nfunc + 1 + clvalue(nfunc)->p->numparams;
+
+			fn = clvalue(nfunc);
+			ofunc->val = nfunc->val;
+
+			/* move new frame into old one */
+			for (aux = 1; nfunc + aux < lim; aux++)
+				set_obj(ofunc + aux, nfunc + aux);
+
+			pt = fn->p;
+			kbase = pt->k;
+			ks->top = base + pt->framesize;
+			pc = proto_bc(pt) + 1; /* starting point */
+			DISPATCH();
+		} else {
+			kp_error(ks, "attempt to call nil function\n");
+			return;
+		}
+		}
+	DO_BC_ITERC: /* don't support it now */
+		return;
+	DO_BC_ITERN: /* Specialized ITERC, if iterator function A-3 is next() */
+		/* detect hot loop */
+		if (unlikely(check_hot_loop(ks, loop_count++) < 0))
+			return;
+
+		if (kp_tab_next(ks, hvalue(RA - 2), RA)) {
+			donextjump; /* Get jump target from ITERL */
+		} else {
+			pc++; /* jump to ITERL + 1 */
+		}
+		DISPATCH();
+	DO_BC_VARG: /* don't support */
+		return;
+	DO_BC_ISNEXT: /* Verify ITERN specialization and jump */
+		if (!is_cfunc(RA - 3) || !is_table(RA - 2) || !is_nil(RA - 1)
+			|| fvalue(RA - 3) != (ktap_cfunction)kp_tab_next) {
+			/* Despecialize bytecode if any of the checks fail. */
+			setbc_op(pc - 1, BC_JMP);
+			dojump(instr, 0);
+			setbc_op(pc, BC_ITERC);
+		} else {
+			dojump(instr, 0);
+			set_nil(RA); /* init control variable */
+		}
+		DISPATCH();
+	DO_BC_RETM: /* don't support return multiple values */
+	DO_BC_RET:
+		return;
+	DO_BC_RET0:
+		/* if it's called from external invocation, just return */
+		if (!func->pcr)
+			return;
+
+		pc = func->pcr; /* restore PC */
+
+		multres = bc_d(instr);
+		set_nil(func);
+
+		base = func - bc_a(*pc);
+		func = base - 1;
+		fn = clvalue(func);
+		kbase = fn->p->k;
+		ks->top = base + fn->p->framesize;
+		pc++;
+
+		DISPATCH();
+	DO_BC_RET1:
+		/* if it's called from external invocation, just return */
+		if (!func->pcr)
+			return;
+
+		pc = func->pcr; /* restore PC */
+
+		multres = bc_d(instr);
+		set_obj(base - 1, RA); /* move result */
+
+		base = func - bc_a(*pc);
+		func = base - 1;
+		fn = clvalue(func);
+		kbase = fn->p->k;
+		ks->top = base + fn->p->framesize;
+		pc++;
+
+		DISPATCH();
+	DO_BC_FORI: { /* Numeric 'for' loop init */
+		ktap_number idx;
+		ktap_number limit;
+		ktap_number step;
+
+		if (unlikely(!is_number(RA) || !is_number(RA + 1) ||
+				!is_number(RA + 2))) {
+			kp_error(ks, KTAP_QL("for")
+				 " init/limit/step value must be a number\n");
+			return;
+		}
+
+		idx = nvalue(RA);
+		limit = nvalue(RA + 1);
+		step = nvalue(RA + 2);
+
+		if (NUMLT(0, step) ? NUMLE(idx, limit) : NUMLE(limit, idx)) {
+			set_number(RA + 3, nvalue(RA));
+		} else {
+			dojump(instr, 0);
+		}
+		DISPATCH();
+		}
+	DO_BC_JFORI: /* not used */
+		return;
+	DO_BC_FORL: { /* Numeric 'for' loop */
+		ktap_number step = nvalue(RA + 2);
+		/* increment index */
+		ktap_number idx = NUMADD(nvalue(RA), step);
+		ktap_number limit = nvalue(RA + 1);
+		if (NUMLT(0, step) ? NUMLE(idx, limit) : NUMLE(limit, idx)) {
+			dojump(instr, 0); /* jump back */
+			set_number(RA, idx);  /* update internal index... */
+			set_number(RA + 3, idx);  /* ...and external index */
+		}
+
+		if (unlikely(check_hot_loop(ks, loop_count++) < 0))
+			return;
+
+		DISPATCH();
+		}
+	DO_BC_IFORL: /* not used */
+	DO_BC_JFORL:
+	DO_BC_ITERL:
+	DO_BC_IITERL:
+	DO_BC_JITERL:
+		return;
+	DO_BC_LOOP: /* Generic loop */
+		/* ktap uses this bytecode to detect hot loops */
+		if (unlikely(check_hot_loop(ks, loop_count++) < 0))
+			return;
+		DISPATCH();
+	DO_BC_ILOOP: /* not used */
+	DO_BC_JLOOP:
+		return;
+	DO_BC_JMP: /* Jump */
+		dojump(instr, 0);
+		DISPATCH();
+	DO_BC_FUNCF: /* function header, not used */
+	DO_BC_IFUNCF:
+	DO_BC_JFUNCF:
+	DO_BC_FUNCV:
+	DO_BC_IFUNCV:
+	DO_BC_JFUNCV:
+	DO_BC_FUNCC:
+	DO_BC_FUNCCW:
+		return;
+	DO_BC_VARGN: /* arg0 .. arg9 */
+		if (unlikely(!ks->current_event)) {
+			kp_error(ks, "invalid event context\n");
+			return;
+		}
+
+		kp_event_getarg(ks, RA, bc_d(instr));
+		DISPATCH();
+	DO_BC_VARGSTR: { /* argstr */
+		/*
+		 * If you pass argstr directly to print/printf, no extra
+		 * string is generated, so don't worry about the string pool
+		 * size in this case:
+		 *     print(argstr)
+		 *
+		 * If you use argstr as a table key as below, it may overflow
+		 * the string pool, so use it with care:
+		 *     table[argstr] = V
+		 *
+		 * If you assign argstr to an upvalue or table value as below,
+		 * no string is written, only the type KTAP_TEVENTSTR; the
+		 * value is interpreted when it is printed in a valid event
+		 * context, and an error is reported on context mismatch.
+		 *     table[V] = argstr
+		 *     upval = argstr
+		 *
+		 * If you want to save the real string of argstr, use it as
+		 * below; again, mind the string pool size in this case.
+		 *     table[V] = stringof(argstr)
+		 *     upval = stringof(argstr)
+		 */
+		struct ktap_event_data *e = ks->current_event;
+
+		if (unlikely(!e)) {
+			kp_error(ks, "invalid event context\n");
+			return;
+		}
+
+		if (e->argstr) /* argstr has been stringified */
+			set_string(RA, e->argstr);
+		else
+			set_eventstr(RA);
+		DISPATCH();
+		}
+	DO_BC_VPROBENAME: { /* probename */
+		struct ktap_event_data *e = ks->current_event;
+
+		if (unlikely(!e)) {
+			kp_error(ks, "invalid event context\n");
+			return;
+		}
+		set_string(RA, e->event->name);
+		DISPATCH();
+		}
+	DO_BC_VPID: /* pid */
+		set_number(RA, (int)current->pid);
+		DISPATCH();
+	DO_BC_VTID: /* tid */
+		set_number(RA, (int)task_pid_vnr(current));
+		DISPATCH();
+	DO_BC_VUID: { /* uid */
+		uid_t uid = from_kuid_munged(current_user_ns(), current_uid());
+		set_number(RA, (int)uid);
+		DISPATCH();
+		}
+	DO_BC_VCPU: /* cpu */
+		set_number(RA, smp_processor_id());
+		DISPATCH();
+	DO_BC_VEXECNAME: { /* execname */
+		ktap_str_t *ts = kp_str_newz(ks, current->comm);
+		if (unlikely(!ts))
+			return;
+		set_string(RA, ts);
+		DISPATCH();
+		}
+	DO_BC_GFUNC: { /* Call built-in C function, patched by BC_GGET */
+		ktap_cfunction cfunc = gfunc_get(ks, bc_d(instr));
+		set_cfunc(RA, cfunc);
+		DISPATCH();
+		}
+	}
+}
+
+/*
+ * Validate bytecode and perform static analysis.
+ *
+ * TODO: more type checking before real running.
+ */
+int kp_vm_validate_code(ktap_state_t *ks, ktap_proto_t *pt, ktap_val_t *base)
+{
+	const unsigned int *pc = proto_bc(pt) + 1;
+	unsigned int instr, op;
+	ktap_obj_t **kbase = pt->k;
+	ktap_tab_t *gtab = G(ks)->gtab;
+	int i;
+
+#define RA	(base + bc_a(instr))
+#define RB	(base + bc_b(instr))
+#define RC	(base + bc_c(instr))
+#define RD	(base + bc_d(instr))
+
+	if (pt->framesize > KP_MAX_SLOTS) {
+		kp_error(ks, "exceed max frame size %d\n", pt->framesize);
+		return -1;
+	}
+
+	if (base + pt->framesize > ks->stack_last) {
+		kp_error(ks, "stack overflow\n");
+		return -1;
+	}
+
+	for (i = 0; i < pt->sizebc - 1; i++) {
+		instr = *pc++;
+		op = bc_op(instr);
+
+
+		if (op >= BC__MAX) {
+			kp_error(ks, "unknown byte code %d\n", op);
+			return -1;
+		}
+
+		switch (op) {
+		case BC_FNEW: {
+			int idx = ~bc_d(instr);
+			ktap_proto_t *newpt = (ktap_proto_t *)kbase[idx];
+			if (kp_vm_validate_code(ks, newpt, RA + 1))
+				return -1;
+
+			break;
+			}
+		case BC_RETM: case BC_RET:
+			kp_error(ks, "don't support return multiple values\n");
+			return -1;
+		case BC_GSET: case BC_GINC: { /* _G[D] = A, _G[D] += A */
+			int idx = ~bc_d(instr);
+			ktap_str_t *ts = (ktap_str_t *)kbase[idx];
+			kp_error(ks, "cannot set global variable '%s'\n",
+					getstr(ts));
+			return -1;
+			}
+		case BC_GGET: {
+			int idx = ~bc_d(instr);
+			ktap_str_t *ts = (ktap_str_t *)kbase[idx];
+			ktap_val_t val;
+			kp_tab_getstr(gtab, ts, &val);
+			if (is_nil(&val)) {
+				kp_error(ks, "undefined global variable"
+						" '%s'\n", getstr(ts));
+				return -1;
+			} else if (is_cfunc(&val)) {
+				int idx = gfunc_getidx(G(ks), fvalue(&val));
+				if (idx >= 0) {
+					/* patch BC_GGET bytecode to BC_GFUNC */
+					setbc_op(pc - 1, BC_GFUNC);
+					setbc_d(pc - 1, idx);
+				}
+			}
+			break;
+			}
+		case BC_ITERC:
+			kp_error(ks, "ktap only supports the pairs iterator\n");
+			return -1;
+		case BC_POW:
+			kp_error(ks, "ktap doesn't support pow arithmetic\n");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+/* return cfunction by idx */
+static ktap_cfunction gfunc_get(ktap_state_t *ks, int idx)
+{
+	return G(ks)->gfunc_tbl[idx];
+}
+
+/* get cfunction index; the index allows fast cfunction lookup at runtime */
+static int gfunc_getidx(ktap_global_state_t *g, ktap_cfunction cfunc)
+{
+	int nr = g->nr_builtin_cfunction;
+	ktap_cfunction *gfunc_tbl = g->gfunc_tbl;
+	int i;
+
+	for (i = 0; i < nr; i++) {
+		if (gfunc_tbl[i] == cfunc)
+			return i;
+	}
+
+	return -1;
+}
+
+static void gfunc_add(ktap_state_t *ks, ktap_cfunction cfunc)
+{
+	int nr = G(ks)->nr_builtin_cfunction;
+
+	if (nr == KP_MAX_CACHED_CFUNCTION) {
+		kp_error(ks, "please enlarge KP_MAX_CACHED_CFUNCTION %d\n",
+				KP_MAX_CACHED_CFUNCTION);
+		return;
+	}
+	G(ks)->gfunc_tbl[nr] = cfunc;
+	G(ks)->nr_builtin_cfunction++;
+}
+
+/* register a library of C functions */
+int kp_vm_register_lib(ktap_state_t *ks, const char *libname,
+		       const ktap_libfunc_t *funcs)
+{
+	ktap_tab_t *gtab = G(ks)->gtab;
+	ktap_tab_t *target_tbl;
+	int i;
+
+	/* libname is NULL when registering baselib functions */
+	if (libname == NULL)
+		target_tbl = gtab;
+	else {
+		ktap_val_t key, val;
+		ktap_str_t *ts = kp_str_newz(ks, libname);
+		if (!ts)
+			return -ENOMEM;
+
+		/* count the functions contained in this library */
+		for (i = 0; funcs[i].name != NULL; i++) {
+		}
+
+		target_tbl = kp_tab_new_ah(ks, 0, i + 1);
+		if (!target_tbl)
+			return -ENOMEM;
+
+		set_string(&key, ts);
+		set_table(&val, target_tbl);
+		kp_tab_set(ks, gtab, &key, &val);
+	}
+
+	/* TODO: beware of name clashes between foo() and tbl.foo() */
+	for (i = 0; funcs[i].name != NULL; i++) {
+		ktap_str_t *func_name = kp_str_newz(ks, funcs[i].name);
+		ktap_val_t fn;
+
+		if (unlikely(!func_name))
+			return -ENOMEM;
+
+		set_cfunc(&fn, funcs[i].func);
+		kp_tab_setstr(ks, target_tbl, func_name, &fn);
+
+		gfunc_add(ks, funcs[i].func);
+	}
+
+	return 0;
+}
+
+static int init_registry(ktap_state_t *ks)
+{
+	ktap_tab_t *registry = kp_tab_new_ah(ks, 2, 0);
+	ktap_val_t gtbl;
+	ktap_tab_t *t;
+
+	if (!registry)
+		return -1;
+
+	set_table(&G(ks)->registry, registry);
+
+	/* assume there will be at most 1024 global variables */
+	t = kp_tab_new_ah(ks, 0, 1024);
+	if (!t)
+		return -1;
+
+	set_table(&gtbl, t);
+	kp_tab_setint(ks, registry, KTAP_RIDX_GLOBALS, &gtbl);
+	G(ks)->gtab = t;
+
+	return 0;
+}
+
+static int init_arguments(ktap_state_t *ks, int argc, char __user **user_argv)
+{
+	ktap_tab_t *gtbl = G(ks)->gtab;
+	ktap_tab_t *arg_tbl = kp_tab_new_ah(ks, argc, 1);
+	ktap_val_t arg_tblval;
+	ktap_val_t arg_tsval;
+	ktap_str_t *argts = kp_str_newz(ks, "arg");
+	char **argv;
+	int i, ret;
+
+	if (!arg_tbl)
+		return -1;
+
+	if (unlikely(!argts))
+		return -ENOMEM;
+
+	set_string(&arg_tsval, argts);
+	set_table(&arg_tblval, arg_tbl);
+	kp_tab_set(ks, gtbl, &arg_tsval, &arg_tblval);
+
+	if (!argc)
+		return 0;
+
+	if (argc > 1024)
+		return -EINVAL;
+
+	argv = kzalloc(argc * sizeof(char *), GFP_KERNEL);
+	if (!argv)
+		return -ENOMEM;
+
+	if (copy_from_user(argv, user_argv, argc * sizeof(char *))) {
+		kfree(argv);
+		return -EFAULT;
+	}
+
+	ret = 0;
+	for (i = 0; i < argc; i++) {
+		ktap_val_t val;
+		char __user *ustr = argv[i];
+		char *kstr;
+		int len;
+		int res;
+
+		len = strlen_user(ustr);
+		if (len > 0x1000) {
+			ret = -EINVAL;
+			break;
+		}
+
+		kstr = kmalloc(len + 1, GFP_KERNEL);
+		if (!kstr) {
+			ret = -ENOMEM;
+			break;
+		}
+
+		if (strncpy_from_user(kstr, ustr, len) < 0) {
+			kfree(kstr);
+			ret = -EFAULT;
+			break;
+		}
+
+		kstr[len] = '\0';
+
+		if (!kstrtoint(kstr, 10, &res)) {
+			set_number(&val, res);
+		} else {
+			ktap_str_t *ts = kp_str_newz(ks, kstr);
+			if (unlikely(!ts)) {
+				kfree(kstr);
+				ret = -ENOMEM;
+				break;
+			}
+
+			set_string(&val, ts);
+		}
+
+		kp_tab_setint(ks, arg_tbl, i, &val);
+
+		kfree(kstr);
+	}
+
+	kfree(argv);
+	return ret;
+}
+
+static void free_preserved_data(ktap_state_t *ks)
+{
+	int cpu, i, j;
+
+	/* free stack for each allocated ktap_state */
+	for_each_possible_cpu(cpu) {
+		for (j = 0; j < PERF_NR_CONTEXTS; j++) {
+			void *percpu_state = G(ks)->percpu_state[j];
+			ktap_state_t *pks;
+
+			if (!percpu_state)
+				break;
+			pks = per_cpu_ptr(percpu_state, cpu);
+			if (!pks)
+				break;
+			kfree(pks->stack);
+		}
+	}
+
+	/* free percpu ktap_state */
+	for (i = 0; i < PERF_NR_CONTEXTS; i++) {
+		if (G(ks)->percpu_state[i])
+			free_percpu(G(ks)->percpu_state[i]);
+	}
+
+	/* free percpu ktap print buffer */
+	for (i = 0; i < PERF_NR_CONTEXTS; i++) {
+		if (G(ks)->percpu_print_buffer[i])
+			free_percpu(G(ks)->percpu_print_buffer[i]);
+	}
+
+	/* free percpu ktap temp buffer */
+	for (i = 0; i < PERF_NR_CONTEXTS; i++) {
+		if (G(ks)->percpu_temp_buffer[i])
+			free_percpu(G(ks)->percpu_temp_buffer[i]);
+	}
+
+	/* free percpu ktap recursion context flag */
+	for (i = 0; i < PERF_NR_CONTEXTS; i++)
+		if (G(ks)->recursion_context[i])
+			free_percpu(G(ks)->recursion_context[i]);
+}
+
+#define ALLOC_PERCPU(size)  __alloc_percpu(size, __alignof__(char))
+static int init_preserved_data(ktap_state_t *ks)
+{
+	void __percpu *data;
+	int cpu, i, j;
+
+	/* init percpu ktap_state */
+	for (i = 0; i < PERF_NR_CONTEXTS; i++) {
+		data = ALLOC_PERCPU(sizeof(ktap_state_t));
+		if (!data)
+			goto fail;
+		G(ks)->percpu_state[i] = data;
+	}
+
+	/* init stack for each allocated ktap_state */
+	for_each_possible_cpu(cpu) {
+		for (j = 0; j < PERF_NR_CONTEXTS; j++) {
+			void *percpu_state = G(ks)->percpu_state[j];
+			ktap_state_t *pks;
+
+			if (!percpu_state)
+				break;
+			pks = per_cpu_ptr(percpu_state, cpu);
+			if (!pks)
+				break;
+			pks->stack = kzalloc(KTAP_STACK_SIZE_BYTES, GFP_KERNEL);
+			if (!pks->stack)
+				goto fail;
+
+			pks->stack_last = pks->stack + KTAP_STACK_SIZE;
+			G(pks) = G(ks);
+		}
+	}
+
+	/* init percpu ktap print buffer */
+	for (i = 0; i < PERF_NR_CONTEXTS; i++) {
+		data = ALLOC_PERCPU(KTAP_PERCPU_BUFFER_SIZE);
+		if (!data)
+			goto fail;
+		G(ks)->percpu_print_buffer[i] = data;
+	}
+
+	/* init percpu ktap temp buffer */
+	for (i = 0; i < PERF_NR_CONTEXTS; i++) {
+		data = ALLOC_PERCPU(KTAP_PERCPU_BUFFER_SIZE);
+		if (!data)
+			goto fail;
+		G(ks)->percpu_temp_buffer[i] = data;
+	}
+
+	/* init percpu ktap recursion context flag */
+	for (i = 0; i < PERF_NR_CONTEXTS; i++) {
+		data = alloc_percpu(int);
+		if (!data)
+			goto fail;
+		G(ks)->recursion_context[i] = data;
+	}
+
+	return 0;
+
+ fail:
+	free_preserved_data(ks);
+	return -ENOMEM;
+}
+
+/*
+ * Wait for the ktapio thread to read all content in the ring buffer.
+ *
+ * We poll here to sync with the ktapio thread. Note that we cannot
+ * use a semaphore, completion or other sync primitive, because the
+ * ktapio thread can be killed by SIGKILL at any time, and there is
+ * no safe way to up a semaphore or wake a waitqueue before it exits.
+ *
+ * We also cannot use the current->signal->wait_chldexit waitqueue to sync
+ * on exit, because the mainthread and ktapio thread share a thread group.
+ *
+ * The ktap mainthread must wait for the ktapio thread to exit; otherwise
+ * the ktapio thread will oops when it accesses ktap structures.
+ */
+static void wait_user_completion(ktap_state_t *ks)
+{
+	struct task_struct *tsk = G(ks)->task;
+	G(ks)->wait_user = 1;
+
+	while (1) {
+		set_current_state(TASK_INTERRUPTIBLE);
+		/* sleep for 100 msecs, and try again. */
+		schedule_timeout(HZ / 10);
+
+		if (get_nr_threads(tsk) == 1)
+			break;
+	}
+}
+
+static void sleep_loop(ktap_state_t *ks,
+			int (*actor)(ktap_state_t *ks, void *arg), void *arg)
+{
+	while (!ks->stop) {
+		set_current_state(TASK_INTERRUPTIBLE);
+		/* sleep for 100 msecs, and try again. */
+		schedule_timeout(HZ / 10);
+
+		if (actor(ks, arg))
+			return;
+	}
+}
+
+static int sl_wait_task_pause_actor(ktap_state_t *ks, void *arg)
+{
+	struct task_struct *task = (struct task_struct *)arg;
+
+	if (task->state)
+		return 1;
+	else
+		return 0;
+}
+
+static int sl_wait_task_exit_actor(ktap_state_t *ks, void *arg)
+{
+	struct task_struct *task = (struct task_struct *)arg;
+
+	if (signal_pending(current)) {
+		flush_signals(current);
+
+		/* print a newline, since CTRL+C is echoed as ^C */
+		kp_puts(ks, "\n");
+		return 1;
+	}
+
+	/* stop waiting if the target task has exited */
+	if (task && task->state == TASK_DEAD)
+		return 1;
+
+	return 0;
+}
+
+/* wait until interrupted by the user or killed by a signal */
+static void wait_user_interrupt(ktap_state_t *ks)
+{
+	struct task_struct *task = G(ks)->trace_task;
+
+	if (G(ks)->state == KTAP_EXIT || G(ks)->state == KTAP_ERROR)
+		return;
+
+	/* let tracing go now. */
+	ks->stop = 0;
+
+	if (G(ks)->parm->workload) {
+		/* make sure workload is in pause state
+		 * so it won't miss the signal */
+		sleep_loop(ks, sl_wait_task_pause_actor, task);
+		/* tell workload process to start executing */
+		send_sig(SIGINT, G(ks)->trace_task, 0);
+	}
+
+	if (!G(ks)->parm->quiet)
+		kp_printf(ks, "Tracing... Hit Ctrl-C to end.\n");
+
+	sleep_loop(ks, sl_wait_task_exit_actor, task);
+}
+
+/*
+ * ktap exit, free all resources.
+ */
+void kp_vm_exit(ktap_state_t *ks)
+{
+	if (!list_empty(&G(ks)->events_head) ||
+	    !list_empty(&G(ks)->timers))
+		wait_user_interrupt(ks);
+
+	kp_exit_timers(ks);
+	kp_events_exit(ks);
+
+	/* free all resources acquired by ktap */
+	kp_str_freeall(ks);
+	kp_mempool_destroy(ks);
+
+	func_closeuv(ks, 0); /* close all open upvals, let below call free it */
+	kp_obj_freeall(ks);
+
+	kp_vm_exit_thread(ks);
+	kp_free(ks, ks->stack);
+
+	free_preserved_data(ks);
+	free_cpumask_var(G(ks)->cpumask);
+
+	wait_user_completion(ks);
+
+	/* must be called after wait_user_completion */
+	if (G(ks)->trace_task)
+		put_task_struct(G(ks)->trace_task);
+
+	kp_transport_exit(ks);
+	kp_free(ks, ks); /* free self */
+}
+
+/*
+ * ktap mainthread initialization
+ */
+ktap_state_t *kp_vm_new_state(ktap_option_t *parm, struct dentry *dir)
+{
+	ktap_state_t *ks;
+	ktap_global_state_t *g;
+	pid_t pid;
+	int cpu;
+
+	ks = kzalloc(sizeof(ktap_state_t) + sizeof(ktap_global_state_t),
+		     GFP_KERNEL);
+	if (!ks)
+		return NULL;
+
+	G(ks) = (ktap_global_state_t *)(ks + 1);
+	g = G(ks);
+	g->mainthread = ks;
+	g->task = current;
+	g->parm = parm;
+	g->str_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
+	g->strmask = ~(int)0;
+	g->uvhead.prev = &g->uvhead;
+	g->uvhead.next = &g->uvhead;
+	g->state = KTAP_RUNNING;
+	INIT_LIST_HEAD(&(g->timers));
+	INIT_LIST_HEAD(&(g->events_head));
+
+	if (kp_transport_init(ks, dir))
+		goto out;
+
+	ks->stack = kp_malloc(ks, KTAP_STACK_SIZE_BYTES);
+	if (!ks->stack)
+		goto out;
+
+	ks->stack_last = ks->stack + KTAP_STACK_SIZE;
+	ks->top = ks->stack;
+
+	pid = (pid_t)parm->trace_pid;
+	if (pid != -1) {
+		struct task_struct *task;
+
+		rcu_read_lock();
+		task = pid_task(find_vpid(pid), PIDTYPE_PID);
+		if (!task) {
+			kp_error(ks, "cannot find pid %d\n", pid);
+			rcu_read_unlock();
+			goto out;
+		}
+		g->trace_task = task;
+		get_task_struct(task);
+		rcu_read_unlock();
+	}
+
+	if (!alloc_cpumask_var(&g->cpumask, GFP_KERNEL))
+		goto out;
+
+	cpumask_copy(g->cpumask, cpu_online_mask);
+
+	cpu = parm->trace_cpu;
+	if (cpu != -1) {
+		if (!cpu_online(cpu)) {
+			kp_error(ks, "ktap: cpu %d is not online\n", cpu);
+			goto out;
+		}
+
+		cpumask_clear(g->cpumask);
+		cpumask_set_cpu(cpu, g->cpumask);
+	}
+
+	if (kp_mempool_init(ks, KP_MAX_MEMPOOL_SIZE))
+		goto out;
+
+	if (kp_str_resize(ks, 1024 - 1)) /* set string hashtable size */
+		goto out;
+
+	if (init_registry(ks))
+		goto out;
+	if (init_arguments(ks, parm->argc, parm->argv))
+		goto out;
+
+	/* init libraries */
+	if (kp_lib_init_base(ks))
+		goto out;
+	if (kp_lib_init_kdebug(ks))
+		goto out;
+	if (kp_lib_init_timer(ks))
+		goto out;
+	if (kp_lib_init_ansi(ks))
+		goto out;
+	if (kp_lib_init_table(ks))
+		goto out;
+
+	if (kp_lib_init_net(ks))
+		goto out;
+
+	if (init_preserved_data(ks))
+		goto out;
+
+	if (kp_events_init(ks))
+		goto out;
+
+	return ks;
+
+ out:
+	g->state = KTAP_ERROR;
+	kp_vm_exit(ks);
+	return NULL;
+}
+
diff --git a/kernel/trace/ktap/kp_vm.h b/kernel/trace/ktap/kp_vm.h
new file mode 100644
index 0000000..a01e969
--- /dev/null
+++ b/kernel/trace/ktap/kp_vm.h
@@ -0,0 +1,43 @@
+#ifndef __KTAP_VM_H__
+#define __KTAP_VM_H__
+
+#include "kp_obj.h"
+
+void kp_vm_call_proto(ktap_state_t *ks, ktap_proto_t *pt);
+void kp_vm_call(ktap_state_t *ks, StkId func, int nresults);
+int kp_vm_validate_code(ktap_state_t *ks, ktap_proto_t *pt, ktap_val_t *base);
+void kp_vm_exit(ktap_state_t *ks);
+ktap_state_t *kp_vm_new_state(ktap_option_t *parm, struct dentry *dir);
+void kp_optimize_code(ktap_state_t *ks, int level, ktap_proto_t *f);
+int kp_vm_register_lib(ktap_state_t *ks, const char *libname,
+		       const ktap_libfunc_t *funcs);
+
+
+static __always_inline
+ktap_state_t *kp_vm_new_thread(ktap_state_t *mainthread, int rctx)
+{
+	ktap_state_t *ks;
+
+	ks = kp_this_cpu_state(mainthread, rctx);
+	ks->top = ks->stack;
+	return ks;
+}
+
+static __always_inline
+void kp_vm_exit_thread(ktap_state_t *ks)
+{
+}
+
+/*
+ * This function only tells ktapvm that this thread wants to exit;
+ * the mainthread handles the real exit work later.
+ */
+static __always_inline
+void kp_vm_try_to_exit(ktap_state_t *ks)
+{
+	G(ks)->mainthread->stop = 1;
+	G(ks)->state = KTAP_EXIT;
+}
+
+
+#endif /* __KTAP_VM_H__ */
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 10/29] ktap: add string handling code(kernel/trace/ktap/kp_[str|mempool].[c|h])
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (8 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 09/29] ktap: add bytecode execution engine(kernel/trace/ktap/kp_vm.[c|h]) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-30  3:50   ` Andi Kleen
  2014-03-28 14:45 ` [PATCH v2 11/29] ktap: add table handling code(kernel/trace/ktap/kp_tab.[c|h]) Jovi Zhangwei
                   ` (19 subsequent siblings)
  29 siblings, 1 reply; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Exposed functions:

1). kp_str_new:
        Return an interned string, or NULL if memory runs out or the
        maximum number of strings (default 9999) is exceeded.
        It allocates memory from the mempool, not kmalloc, because we
        don't want to allocate strings dynamically in probe context.

2). kp_str_resize:
        Initialize the interned-string hash table g->strhash.

3). kp_str_fmt:
        Format a string into a trace_seq; called from the 'printf'
        built-in function.

4). kp_mempool_init/kp_mempool_destroy/kp_mempool_alloc:
        The mempool currently serves string allocation only.
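
As an illustration of the mempool idea, here is a minimal user-space
sketch of a bump allocator. It is not the patch's code: struct pool and
the pool_* names are hypothetical, malloc stands in for vmalloc, and the
locking that kp_mempool_alloc does is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for the ktap mempool. */
struct pool {
	char *base;	/* start of the pre-allocated region */
	size_t size;	/* total bytes in the region */
	size_t used;	/* bytes handed out so far */
};

/* Pre-allocate the whole region once, outside any hot path. */
static int pool_init(struct pool *p, size_t size)
{
	assert(size > 0);
	p->base = malloc(size);
	if (!p->base)
		return -1;
	p->size = size;
	p->used = 0;
	return 0;
}

/* Bump allocation: hand out the next `size` bytes, or NULL when full. */
static void *pool_alloc(struct pool *p, size_t size)
{
	void *addr;

	if (size > p->size - p->used)	/* bounds check without overflow */
		return NULL;
	addr = p->base + p->used;
	p->used += size;
	return addr;
}

/* Single allocations are never freed; the region is released at once. */
static void pool_destroy(struct pool *p)
{
	free(p->base);
	p->base = NULL;
	p->size = 0;
	p->used = 0;
}
```

Since nothing is freed until teardown, an allocation is a bounds check
plus a pointer bump, which is what makes it usable from probe context.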

All strings in ktap are interned, which means ktap keeps
a single copy of each string. Whenever a new string appears, ktap
checks whether it already has a copy of that string and, if so,
reuses that copy. Interning makes operations like string
comparison and table indexing very fast, but it slows down string creation.
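
To make the trade-off concrete, here is a small user-space sketch of
interning. It is only an illustration: struct istr, intern() and the
djb2-style hash are assumptions made for the example, not the kp_str.c
implementation, and malloc replaces the mempool.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64	/* power of two, so (hash & mask) picks a bucket */

struct istr {
	struct istr *next;	/* chain of strings in the same bucket */
	size_t len;
	char data[];		/* NUL-terminated copy of the bytes */
};

static struct istr *buckets[NBUCKETS];

/* Simple illustrative hash (djb2 style); not the hash used by ktap. */
static unsigned int hash_bytes(const char *s, size_t len)
{
	unsigned int h = 5381;
	size_t i;

	for (i = 0; i < len; i++)
		h = h * 33 + (unsigned char)s[i];
	return h;
}

/* Return the single shared copy of the string, creating it on first use. */
static const struct istr *intern(const char *s, size_t len)
{
	unsigned int b = hash_bytes(s, len) & (NBUCKETS - 1);
	struct istr *e;

	for (e = buckets[b]; e; e = e->next)
		if (e->len == len && !memcmp(e->data, s, len))
			return e;	/* already interned: reuse it */

	e = malloc(sizeof(*e) + len + 1);
	if (!e)
		return NULL;
	e->len = len;
	memcpy(e->data, s, len);
	e->data[len] = '\0';
	e->next = buckets[b];
	buckets[b] = e;
	return e;
}
```

Creation pays for a hash and a chain walk, but afterwards two interned
strings are equal exactly when their pointers are equal, so comparison
and table lookup reduce to a pointer compare.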

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 kernel/trace/ktap/kp_mempool.c |  94 +++++++++++
 kernel/trace/ktap/kp_mempool.h |   8 +
 kernel/trace/ktap/kp_str.c     | 360 +++++++++++++++++++++++++++++++++++++++++
 kernel/trace/ktap/kp_str.h     |  13 ++
 4 files changed, 475 insertions(+)
 create mode 100644 kernel/trace/ktap/kp_mempool.c
 create mode 100644 kernel/trace/ktap/kp_mempool.h
 create mode 100644 kernel/trace/ktap/kp_str.c
 create mode 100644 kernel/trace/ktap/kp_str.h

diff --git a/kernel/trace/ktap/kp_mempool.c b/kernel/trace/ktap/kp_mempool.c
new file mode 100644
index 0000000..37fb8e4
--- /dev/null
+++ b/kernel/trace/ktap/kp_mempool.c
@@ -0,0 +1,94 @@
+/*
+ * kp_mempool.c - ktap memory pool, used for string allocation
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <uapi/ktap/ktap_types.h>
+#include "kp_obj.h"
+#include "kp_str.h"
+
+#include <linux/ctype.h>
+#include <linux/module.h>
+#include "ktap.h"
+
+
+/*
+ * Allocate memory from the mempool; the allocated memory is not freed
+ * until ktap exits.
+ * TODO: lock-free allocation
+ */
+void *kp_mempool_alloc(ktap_state_t *ks, int size)
+{
+	ktap_global_state_t *g = G(ks);
+	void *mempool = g->mempool;
+	void *freepos;
+	void *addr;
+	unsigned long flags;
+
+	local_irq_save(flags);
+	arch_spin_lock(&g->mp_lock);
+	freepos = g->mp_freepos; /* read under the lock to avoid races */
+	if (unlikely((unsigned long)((char *)freepos + size) >
+		     (unsigned long)((char *)mempool + g->mp_size))) {
+		addr = NULL;
+		goto out;
+	}
+
+	addr = freepos;
+	g->mp_freepos = (char *)freepos + size;
+ out:
+
+	arch_spin_unlock(&g->mp_lock);
+	local_irq_restore(flags);
+	return addr;
+}
+
+/*
+ * destroy mempool.
+ */
+void kp_mempool_destroy(ktap_state_t *ks)
+{
+	ktap_global_state_t *g = G(ks);
+
+	if (!g->mempool)
+		return;
+
+	vfree(g->mempool);
+	g->mempool = NULL;
+	g->mp_freepos = NULL;
+	g->mp_size = 0;
+}
+
+/*
+ * pre-allocate size Kbytes memory pool.
+ */
+int kp_mempool_init(ktap_state_t *ks, int size)
+{
+	ktap_global_state_t *g = G(ks);
+
+	g->mempool = vmalloc(size * 1024);
+	if (!g->mempool)
+		return -ENOMEM;
+
+	g->mp_freepos = g->mempool;
+	g->mp_size = size * 1024;
+	g->mp_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
+	return 0;
+}
+
diff --git a/kernel/trace/ktap/kp_mempool.h b/kernel/trace/ktap/kp_mempool.h
new file mode 100644
index 0000000..3eabf5e
--- /dev/null
+++ b/kernel/trace/ktap/kp_mempool.h
@@ -0,0 +1,8 @@
+#ifndef __KTAP_MEMPOOL_H__
+#define __KTAP_MEMPOOL_H__
+
+void *kp_mempool_alloc(ktap_state_t *ks, int size);
+void kp_mempool_destroy(ktap_state_t *ks);
+int kp_mempool_init(ktap_state_t *ks, int size);
+
+#endif /* __KTAP_MEMPOOL_H__ */
diff --git a/kernel/trace/ktap/kp_str.c b/kernel/trace/ktap/kp_str.c
new file mode 100644
index 0000000..9d2e741
--- /dev/null
+++ b/kernel/trace/ktap/kp_str.c
@@ -0,0 +1,360 @@
+/*
+ * kp_str.c - ktap string data structure manipulation
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * Adapted from luajit and lua interpreter.
+ * Copyright (C) 2005-2014 Mike Pall.
+ * Copyright (C) 1994-2008 Lua.org, PUC-Rio.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <uapi/ktap/ktap_types.h>
+#include "kp_obj.h"
+#include "kp_str.h"
+#include "kp_mempool.h"
+
+#include <linux/ctype.h>
+#include <linux/module.h>
+#include <linux/kallsyms.h>
+#include "ktap.h"
+#include "kp_transport.h"
+#include "kp_vm.h"
+#include "kp_events.h"
+
+/* Fast string data comparison. Caveat: unaligned access to 1st string! */
+static __always_inline int str_fastcmp(const char *a, const char *b, int len)
+{
+	int i = 0;
+
+	kp_assert(len > 0);
+	kp_assert((((uintptr_t)a + len - 1)&(PAGE_SIZE - 1)) <= PAGE_SIZE - 4);
+
+	do {  /* Note: innocuous access up to end of string + 3. */
+		uint32_t v = *(uint32_t *)(a + i) ^ *(const uint32_t *)(b + i);
+		if (v) {
+			i -= len;
+#if KP_LE
+			return (int32_t)i >= -3 ? (v << (32 + (i << 3))) : 1;
+#else
+			return (int32_t)i >= -3 ? (v >> (32 + (i << 3))) : 1;
+#endif
+		}
+		i += 4;
+	} while (i < len);
+	return 0;
+}
+
+
+/* TODO: change hash algo */
+
+#define STRING_HASHLIMIT	5
+static __always_inline unsigned int kp_str_hash(const char *str, size_t len)
+{
+	unsigned int h = 201236 ^ len;
+	size_t step = (len >> STRING_HASHLIMIT) + 1;
+	size_t l1;
+
+	for (l1 = len; l1 >= step; l1 -= step)
+		h = h ^ ((h<<5) + (h>>2) + (u8)(str[l1 - 1]));
+
+	return h;
+}
+
+
+/*
+ * resizes the string table
+ */
+int kp_str_resize(ktap_state_t *ks, int newmask)
+{
+	ktap_global_state_t *g = G(ks);
+	ktap_str_t **newhash;
+
+	newhash = kp_zalloc(ks, (newmask + 1) * sizeof(ktap_str_t *));
+	if (!newhash)
+		return -ENOMEM;
+
+	g->strmask = newmask;
+	g->strhash = newhash;
+	return 0;
+}
+
+/*
+ * Intern a string and return string object.
+ */
+ktap_str_t *kp_str_new(ktap_state_t *ks, const char *str, size_t len)
+{
+	ktap_global_state_t *g = G(ks);
+	ktap_str_t *s;
+	ktap_obj_t *o;
+	unsigned int h = kp_str_hash(str, len);
+	unsigned long flags;
+
+	if (len >= KP_MAX_STR)
+		return NULL;
+
+	local_irq_save(flags);
+	arch_spin_lock(&g->str_lock);
+
+	o = (ktap_obj_t *)g->strhash[h & g->strmask];
+	if (likely((((uintptr_t)str+len-1) & (PAGE_SIZE-1)) <= PAGE_SIZE-4)) {
+		while (o != NULL) {
+			ktap_str_t *sx = (ktap_str_t *)o;
+			if (sx->len == len &&
+			    !str_fastcmp(str, getstr(sx), len)) {
+				arch_spin_unlock(&g->str_lock);
+				local_irq_restore(flags);
+				return sx; /* Return existing string. */
+			}
+			o = gch(o)->nextgc;
+		}
+	} else { /* Slow path: end of string is too close to a page boundary */
+		while (o != NULL) {
+			ktap_str_t *sx = (ktap_str_t *)o;
+			if (sx->len == len &&
+			    !memcmp(str, getstr(sx), len)) {
+				arch_spin_unlock(&g->str_lock);
+				local_irq_restore(flags);
+				return sx; /* Return existing string. */
+			}
+			o = gch(o)->nextgc;
+		}
+	}
+
+	/* create a new string; allocate it from the mempool, not kmalloc. */
+	s = kp_mempool_alloc(ks, sizeof(ktap_str_t) + len + 1);
+	if (unlikely(!s))
+		goto out;
+	s->gct = ~KTAP_TSTR;
+	s->len = len;
+	s->hash = h;
+	s->reserved = 0;
+	memcpy(s + 1, str, len);
+	((char *)(s + 1))[len] = '\0';  /* ending 0 */
+
+	/* Add it to string hash table */
+	h &= g->strmask;
+	s->nextgc = (ktap_obj_t *)g->strhash[h];
+	g->strhash[h] = s;
+	if (g->strnum++ > KP_MAX_STRNUM) {
+		kp_error(ks, "exceed max string number %d\n", KP_MAX_STRNUM);
+		s = NULL;
+	}
+
+ out:
+	arch_spin_unlock(&g->str_lock);
+	local_irq_restore(flags);
+	return s; /* Return newly interned string. */
+}
+
+void kp_str_freeall(ktap_state_t *ks)
+{
+	/* no need to free strings here, the mempool handles that */
+	kp_free(ks, G(ks)->strhash);
+}
+
+/* kp_str_fmt - printf implementation */
+
+/* macro to `unsign' a character */
+#define uchar(c)	((unsigned char)(c))
+
+#define L_ESC		'%'
+
+/* valid flags in a format specification */
+#define FLAGS	"-+ #0"
+
+#define INTFRMLEN	"ll"
+#define INTFRM_T	long long
+
+/*
+ * maximum size of each format specification (such as '%-099.99d')
+ * (+10 accounts for %99.99x plus margin of error)
+ */
+#define MAX_FORMAT	(sizeof(FLAGS) + sizeof(INTFRMLEN) + 10)
+
+static const char *scanformat(ktap_state_t *ks, const char *strfrmt, char *form)
+{
+	const char *p = strfrmt;
+	while (*p != '\0' && strchr(FLAGS, *p) != NULL)
+		p++;  /* skip flags */
+
+	if ((size_t)(p - strfrmt) >= sizeof(FLAGS)/sizeof(char)) {
+		kp_error(ks, "invalid format (repeated flags)\n");
+		return NULL;
+	}
+
+	if (isdigit(uchar(*p)))
+		p++;  /* skip width */
+
+	if (isdigit(uchar(*p)))
+		p++;  /* (2 digits at most) */
+
+	if (*p == '.') {
+		p++;
+		if (isdigit(uchar(*p)))
+			p++;  /* skip precision */
+		if (isdigit(uchar(*p)))
+			p++;  /* (2 digits at most) */
+	}
+
+	if (isdigit(uchar(*p))) {
+		kp_error(ks, "invalid format (width or precision too long)\n");
+		return NULL;
+	}
+
+	*(form++) = '%';
+	memcpy(form, strfrmt, (p - strfrmt + 1) * sizeof(char));
+	form += p - strfrmt + 1;
+	*form = '\0';
+	return p;
+}
+
+
+/*
+ * add length modifier into formats
+ */
+static void addlenmod(char *form, const char *lenmod)
+{
+	size_t l = strlen(form);
+	size_t lm = strlen(lenmod);
+	char spec = form[l - 1];
+
+	strcpy(form + l - 1, lenmod);
+	form[l + lm - 1] = spec;
+	form[l + lm] = '\0';
+}
+
+
+static void arg_error(ktap_state_t *ks, int narg, const char *extramsg)
+{
+	kp_error(ks, "bad argument #%d: (%s)\n", narg, extramsg);
+}
+
+int kp_str_fmt(ktap_state_t *ks, struct trace_seq *seq)
+{
+	int arg = 1;
+	size_t sfl;
+	ktap_val_t *arg_fmt = kp_arg(ks, 1);
+	int argnum = kp_arg_nr(ks);
+	const char *strfrmt, *strfrmt_end;
+
+	strfrmt = svalue(arg_fmt);
+	sfl = rawtsvalue(arg_fmt)->len;
+	strfrmt_end = strfrmt + sfl;
+
+	while (strfrmt < strfrmt_end) {
+		if (*strfrmt != L_ESC)
+			trace_seq_putc(seq, *strfrmt++);
+		else if (*++strfrmt == L_ESC)
+			trace_seq_putc(seq, *strfrmt++);
+		else { /* format item */
+			char form[MAX_FORMAT];
+
+			if (++arg > argnum) {
+				arg_error(ks, arg, "no value");
+				return -1;
+			}
+
+			strfrmt = scanformat(ks, strfrmt, form);
+			switch (*strfrmt++) {
+			case 'c':
+				kp_arg_checknumber(ks, arg);
+
+				trace_seq_printf(seq, form,
+						 nvalue(kp_arg(ks, arg)));
+				break;
+			case 'd':  case 'i': {
+				ktap_number n;
+				INTFRM_T ni;
+
+				kp_arg_checknumber(ks, arg);
+
+				n = nvalue(kp_arg(ks, arg));
+				ni = (INTFRM_T)n;
+				addlenmod(form, INTFRMLEN);
+				trace_seq_printf(seq, form, ni);
+				break;
+			}
+			case 'p': {
+				char str[KSYM_SYMBOL_LEN];
+
+				kp_arg_checknumber(ks, arg);
+
+				SPRINT_SYMBOL(str, nvalue(kp_arg(ks, arg)));
+				_trace_seq_puts(seq, str);
+				break;
+			}
+			case 'o':  case 'u':  case 'x':  case 'X': {
+				ktap_number n;
+				unsigned INTFRM_T ni;
+
+				kp_arg_checknumber(ks, arg);
+
+				n = nvalue(kp_arg(ks, arg));
+				ni = (unsigned INTFRM_T)n;
+				addlenmod(form, INTFRMLEN);
+				trace_seq_printf(seq, form, ni);
+				break;
+			}
+			case 's': {
+				ktap_val_t *v = kp_arg(ks, arg);
+				const char *s;
+				size_t l;
+
+				if (is_nil(v)) {
+					_trace_seq_puts(seq, "nil");
+					return 0;
+				}
+
+				if (is_eventstr(v)) {
+					const char *str = kp_event_tostr(ks);
+					if (!str)
+						return -1;
+					/* reuse the string fetched above */
+					_trace_seq_puts(seq, str);
+					return 0;
+				}
+
+				kp_arg_checkstring(ks, arg);
+
+				s = svalue(v);
+				l = rawtsvalue(v)->len;
+				if (!strchr(form, '.') && l >= 100) {
+					/*
+					 * no precision and string is too long
+					 * to be formatted;
+					 * keep original string
+					 */
+					_trace_seq_puts(seq, s);
+					break;
+				} else {
+					trace_seq_printf(seq, form, s);
+					break;
+				}
+			}
+			default: /* also covers `nLlh' */
+				kp_error(ks, "invalid option " KTAP_QL("%%%c")
+					     " to " KTAP_QL("format") "\n",
+					     *(strfrmt - 1));
+				return -1;
+			}
+		}
+	}
+
+	return 0;
+}
+
diff --git a/kernel/trace/ktap/kp_str.h b/kernel/trace/ktap/kp_str.h
new file mode 100644
index 0000000..4510e4a
--- /dev/null
+++ b/kernel/trace/ktap/kp_str.h
@@ -0,0 +1,13 @@
+#ifndef __KTAP_STR_H__
+#define __KTAP_STR_H__
+
+int kp_str_resize(ktap_state_t *ks, int newmask);
+void kp_str_freeall(ktap_state_t *ks);
+ktap_str_t *kp_str_new(ktap_state_t *ks, const char *str, size_t len);
+
+#define kp_str_newz(ks, s)	(kp_str_new(ks, s, strlen(s)))
+
+#include <linux/trace_seq.h>
+int kp_str_fmt(ktap_state_t *ks, struct trace_seq *seq);
+
+#endif /* __KTAP_STR_H__ */
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 11/29] ktap: add table handling code(kernel/trace/ktap/kp_tab.[c|h])
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (9 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 10/29] ktap: add string handling code(kernel/trace/ktap/kp_[str|mempool].[c|h]) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 12/29] ktap: add generic object handling code(kernel/trace/ktap/kp_obj.[c|h]) Jovi Zhangwei
                   ` (18 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

A ktap table is an associative array: any value can serve as a key,
and a value of any type can be stored, for example:

var s = {}
s[1] = 1  <-- store number key and number value
s["key"] = "value"  <-- store string key and string value
s["key2"] = 1  <-- store string key and number value

Exposed functions:

1). kp_tab_new
        return a newly allocated table
2). kp_tab_dup
        return a newly allocated table copied from a template table
3). kp_tab_free
        free a table
4). kp_tab_clear
        clear the entries of a table without freeing the ktap_tab_t structure
5). kp_tab_getint/kp_tab_setint/kp_tab_incrint
        get/set/increment with an int key
6). kp_tab_getstr/kp_tab_setstr/kp_tab_incrstr
        get/set/increment with a string key
7). kp_tab_get/kp_tab_set/kp_tab_incr
        get/set/increment with a key of any type
8). kp_tab_next
        table iterator
9). kp_tab_len
        return the length of a table
10). kp_tab_print_hist
        print a histogram of a table (values must be numbers)

All table operations are protected by a spinlock.
A table cannot be resized; the ktap runtime reports an error once a
table overflows.
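
The fixed-size behaviour can be pictured with the following user-space
sketch. It is an assumption-laden illustration, not kp_tab.c: the names
(struct tab, tab_incr, TAB_SLOTS) are hypothetical, it uses linear
probing rather than the chained nodes in the patch, and the spinlock is
omitted.

```c
#include <stdint.h>
#include <string.h>

#define TAB_SLOTS 8	/* fixed capacity; hypothetical and deliberately tiny */

struct slot {
	int used;
	uint32_t key;
	long val;
};

struct tab {
	struct slot slots[TAB_SLOTS];
};

static void tab_init(struct tab *t)
{
	memset(t, 0, sizeof(*t));
}

/* Add `delta` to t[key]; return 0 on success, -1 if the table is full. */
static int tab_incr(struct tab *t, uint32_t key, long delta)
{
	uint32_t i, pos = key % TAB_SLOTS;

	for (i = 0; i < TAB_SLOTS; i++) {
		struct slot *s = &t->slots[(pos + i) % TAB_SLOTS];

		if (!s->used) {		/* first free slot: insert here */
			s->used = 1;
			s->key = key;
			s->val = delta;
			return 0;
		}
		if (s->key == key) {	/* existing key: update in place */
			s->val += delta;
			return 0;
		}
	}
	return -1;	/* no resize: report the overflow to the caller */
}

/* Return t[key], or 0 if the key is absent. */
static long tab_get(const struct tab *t, uint32_t key)
{
	uint32_t i, pos = key % TAB_SLOTS;

	for (i = 0; i < TAB_SLOTS; i++) {
		const struct slot *s = &t->slots[(pos + i) % TAB_SLOTS];

		if (!s->used)
			return 0;
		if (s->key == key)
			return s->val;
	}
	return 0;
}
```

Refusing to grow keeps every insert bounded and allocation-free, at the
cost of an error once the preallocated slots run out.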

ktap table performance is good: benchmarks show that ktap table
operations are faster than systemtap's, especially
for string keys (because all strings are interned in ktap).

Note that a table is not the same as an aggregation in SystemTap and
DTrace; aggregations use percpu buffers. ktap will introduce
aggregations soon, with DTrace aggregation syntax.
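
For readers unfamiliar with the distinction, the sketch below shows the
general percpu aggregation technique in user-space C. It is not a
planned ktap interface: NR_CPUS_SKETCH, agg_count() and agg_total() are
hypothetical names, and the caller passes the CPU number explicitly
instead of the kernel's percpu accessors.

```c
#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for the sketch */
#define NR_KEYS 3		/* hypothetical number of aggregation keys */

/* One private row of counters per CPU, so updates never contend. */
static long percpu_count[NR_CPUS_SKETCH][NR_KEYS];

/* Hot path: bump the counter in the row of the CPU we are running on. */
static void agg_count(int cpu, int key)
{
	percpu_count[cpu][key]++;
}

/* Reporting path: fold every CPU's row into a single total. */
static long agg_total(int key)
{
	long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += percpu_count[cpu][key];
	return sum;
}
```

A table serializes every update on one lock, while an aggregation defers
the merge to report time, which is why the two are kept separate.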

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 kernel/trace/ktap/kp_tab.c | 842 +++++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/ktap/kp_tab.h |  59 ++++
 2 files changed, 901 insertions(+)
 create mode 100644 kernel/trace/ktap/kp_tab.c
 create mode 100644 kernel/trace/ktap/kp_tab.h

diff --git a/kernel/trace/ktap/kp_tab.c b/kernel/trace/ktap/kp_tab.c
new file mode 100644
index 0000000..2d37dfe
--- /dev/null
+++ b/kernel/trace/ktap/kp_tab.c
@@ -0,0 +1,842 @@
+/*
+ * kp_tab.c - Table handling.
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * Adapted from luajit and lua interpreter.
+ * Copyright (C) 2005-2014 Mike Pall.
+ * Copyright (C) 1994-2008 Lua.org, PUC-Rio.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/kallsyms.h>
+#include <linux/slab.h>
+#include <linux/sort.h>
+#include <uapi/ktap/ktap_types.h>
+#include "ktap.h"
+#include "kp_vm.h"
+#include "kp_obj.h"
+#include "kp_str.h"
+#include "kp_events.h"
+#include "kp_tab.h"
+
+#define tab_lock_init(t)						\
+	do {								\
+		(t)->lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;	\
+	} while (0)
+#define tab_lock(t)						\
+	do {								\
+		local_irq_save(flags);					\
+		arch_spin_lock(&(t)->lock);				\
+	} while (0)
+#define tab_unlock(t)						\
+	do {								\
+		arch_spin_unlock(&(t)->lock);				\
+		local_irq_restore(flags);				\
+	} while (0)
+
+
+const ktap_val_t kp_niltv = { {NULL}, {KTAP_TNIL} };
+#define niltv  (&kp_niltv)
+
+/* -- Object hashing ------------------------------------------------------ */
+
+/* Hash values are masked with the table hash mask and used as an index. */
+static __always_inline
+ktap_node_t *hashmask(const ktap_tab_t *t, uint32_t hash)
+{
+	ktap_node_t *n = t->node;
+	return &n[hash & t->hmask];
+}
+
+/* String hashes are precomputed when they are interned. */
+#define hashstr(t, s)		hashmask(t, (s)->hash)
+
+#define hashlohi(t, lo, hi)	hashmask((t), hashrot((lo), (hi)))
+#define hashnum(t, o)		hashlohi((t), (o)->val.n & 0xffffffff, 0)
+#define hashgcref(t, o)		hashlohi((t),	\
+				((unsigned long)(o)->val.gc & 0xffffffff), \
+				((unsigned long)(o)->val.gc & 0xffffffff) + HASH_BIAS)
+
+
+/* Hash an arbitrary key and return its anchor position in the hash table. */
+static ktap_node_t *hashkey(const ktap_tab_t *t, const ktap_val_t *key)
+{
+	kp_assert(!tvisint(key));
+	if (is_string(key))
+		return hashstr(t, rawtsvalue(key));
+	else if (is_number(key))
+		return hashnum(t, key);
+	else if (is_bool(key))
+		return hashmask(t, boolvalue(key));
+	else
+		return hashgcref(t, key);
+}
+
+/* -- Table creation and destruction -------------------------------------- */
+
+/* Create new hash part for table. */
+static __always_inline
+int newhpart(ktap_state_t *ks, ktap_tab_t *t, uint32_t hbits)
+{
+	uint32_t hsize;
+	ktap_node_t *node;
+	kp_assert(hbits != 0);
+
+	if (hbits > KP_MAX_HBITS) {
+		kp_error(ks, "table overflow\n");
+		return -1;
+	}
+	hsize = 1u << hbits;
+	node = vmalloc(hsize * sizeof(ktap_node_t));
+	if (!node)
+		return -ENOMEM;
+	t->freetop = &node[hsize];
+	t->node = node;
+	t->hmask = hsize-1;
+
+	return 0;
+}
+
+/*
+ * Q: Why all of these copies of t->hmask, t->node etc. to local variables?
+ * A: Because alias analysis for C is _really_ tough.
+ *    Even state-of-the-art C compilers won't produce good code without this.
+ */
+
+/* Clear hash part of table. */
+static __always_inline void clearhpart(ktap_tab_t *t)
+{
+	uint32_t i, hmask = t->hmask;
+	ktap_node_t *node = t->node;
+	kp_assert(t->hmask != 0);
+
+	for (i = 0; i <= hmask; i++) {
+		ktap_node_t *n = &node[i];
+		n->next = NULL;
+		set_nil(&n->key);
+		set_nil(&n->val);
+	}
+
+	t->hnum = 0;
+}
+
+/* Clear array part of table. */
+static __always_inline void clearapart(ktap_tab_t *t)
+{
+	uint32_t i, asize = t->asize;
+	ktap_val_t *array = t->array;
+	for (i = 0; i < asize; i++)
+		set_nil(&array[i]);
+}
+
+/* Create a new table. Note: the slots are not initialized (yet). */
+static ktap_tab_t *newtab(ktap_state_t *ks, uint32_t asize, uint32_t hbits)
+{
+	ktap_tab_t *t;
+
+	t = (ktap_tab_t *)kp_obj_new(ks, sizeof(ktap_tab_t));
+	t->gct = ~KTAP_TTAB;
+	t->array = NULL;
+	t->asize = 0;  /* In case the array allocation fails. */
+	t->hmask = 0;
+
+	tab_lock_init(t);
+
+	if (asize > 0) {
+		if (asize > KP_MAX_ASIZE) {
+			kp_error(ks, "table overflow\n");
+			return NULL;
+		}
+
+		t->array = vmalloc(asize * sizeof(ktap_val_t));
+		if (!t->array)
+			return NULL;
+		t->asize = asize;
+	}
+	if (hbits)
+		if (newhpart(ks, t, hbits)) {
+			vfree(t->array);
+			return NULL;
+		}
+	return t;
+}
+
+/* Create a new table.
+ *
+ * The array size is non-inclusive. E.g. asize=128 creates array slots
+ * for 0..127, but not for 128. If you need slots 1..128, pass asize=129
+ * (slot 0 is wasted in this case).
+ *
+ * The hash size is given in hash bits. hbits=0 means no hash part.
+ * hbits=1 creates 2 hash slots, hbits=2 creates 4 hash slots and so on.
+ */
+ktap_tab_t *kp_tab_new(ktap_state_t *ks, uint32_t asize, uint32_t hbits)
+{
+	ktap_tab_t *t = newtab(ks, asize, hbits);
+	if (!t)
+		return NULL;
+
+	clearapart(t);
+	if (t->hmask > 0)
+		clearhpart(t);
+	return t;
+}
+
+#define TABLE_NARR_ENTRIES	255 /* PAGE_SIZE / sizeof(ktap_value) - 1 */
+#define TABLE_NREC_ENTRIES	2048 /* (PAGE_SIZE * 20) / sizeof(ktap_tnode)*/
+
+ktap_tab_t *kp_tab_new_ah(ktap_state_t *ks, int32_t a, int32_t h)
+{
+	if (a == 0 && h == 0) {
+		a = TABLE_NARR_ENTRIES;
+		h = TABLE_NREC_ENTRIES;
+	}
+
+	return kp_tab_new(ks, (uint32_t)(a > 0 ? a+1 : 0), hsize2hbits(h));
+}
+
+/* Duplicate a table. */
+ktap_tab_t *kp_tab_dup(ktap_state_t *ks, const ktap_tab_t *kt)
+{
+	ktap_tab_t *t;
+	uint32_t asize, hmask;
+	int i;
+
+	/* allocate default table size */
+	t = kp_tab_new_ah(ks, 0, 0);
+	if (!t)
+		return NULL;
+
+	asize = kt->asize;
+	if (asize > 0) {
+		ktap_val_t *array = t->array;
+		ktap_val_t *karray = kt->array;
+		if (asize < 64) {
+			/* An inlined loop beats memcpy for < 512 bytes. */
+			uint32_t i;
+			for (i = 0; i < asize; i++)
+				set_obj(&array[i], &karray[i]);
+		} else {
+			memcpy(array, karray, asize*sizeof(ktap_val_t));
+		}
+	}
+
+	hmask = kt->hmask;
+	for (i = 0; i <= hmask; i++) {
+		ktap_node_t *knode = &kt->node[i];
+		if (is_nil(&knode->key))
+			continue;
+		kp_tab_set(ks, t, &knode->key, &knode->val);
+	}
+	return t;
+}
+
+/* Clear a table. */
+void kp_tab_clear(ktap_tab_t *t)
+{
+	clearapart(t);
+	if (t->hmask > 0) {
+		ktap_node_t *node = t->node;
+		t->freetop = &node[t->hmask+1];
+		clearhpart(t);
+	}
+}
+
+/* Free a table. */
+void kp_tab_free(ktap_state_t *ks, ktap_tab_t *t)
+{
+	if (t->hmask > 0)
+		vfree(t->node);
+	if (t->asize > 0)
+		vfree(t->array);
+	kp_free(ks, t);
+}
+
+/* -- Table getters ------------------------------------------------------- */
+
+static const ktap_val_t *tab_getinth(ktap_tab_t *t, uint32_t key)
+{
+	ktap_val_t k;
+	ktap_node_t *n;
+
+	set_number(&k, (ktap_number)key);
+	n = hashnum(t, &k);
+	do {
+		if (is_number(&n->key) && nvalue(&n->key) == key) {
+			return &n->val;
+		}
+	} while ((n = n->next));
+	return niltv;
+}
+
+static __always_inline
+const ktap_val_t *tab_getint(ktap_tab_t *t, uint32_t key)
+{
+	return ((key < t->asize) ? arrayslot(t, key) :
+				   tab_getinth(t, key));
+}
+
+void kp_tab_getint(ktap_tab_t *t, uint32_t key, ktap_val_t *val)
+{
+	unsigned long flags;
+
+	tab_lock(t);
+	set_obj(val, tab_getint(t, key));
+	tab_unlock(t);
+}
+
+static const ktap_val_t *tab_getstr(ktap_tab_t *t, ktap_str_t *key)
+{
+	ktap_node_t *n = hashstr(t, key);
+	do {
+		if (is_string(&n->key) && rawtsvalue(&n->key) == key)
+			return &n->val;
+	} while ((n = n->next));
+	return niltv;
+}
+
+void kp_tab_getstr(ktap_tab_t *t, ktap_str_t *key, ktap_val_t *val)
+{
+	unsigned long flags;
+
+	tab_lock(t);
+	set_obj(val,  tab_getstr(t, key));
+	tab_unlock(t);
+}
+
+static const ktap_val_t *tab_get(ktap_state_t *ks, ktap_tab_t *t,
+				 const ktap_val_t *key)
+{
+	if (is_string(key)) {
+		return tab_getstr(t, rawtsvalue(key));
+	} else if (is_number(key)) {
+		ktap_number nk = nvalue(key);
+		uint32_t k = (uint32_t)nk;
+		if (nk == (ktap_number)k) {
+			return tab_getint(t, k);
+		} else {
+			goto genlookup;	/* Else use the generic lookup. */
+		}
+	} else if (is_eventstr(key)) {
+		const ktap_str_t *ts;
+
+		if (!ks->current_event) {
+			kp_error(ks,
+			"cannot stringify event str in invalid context\n");
+			return niltv;
+		}
+
+		ts = kp_event_stringify(ks);
+		if (!ts)
+			return niltv;
+
+		return tab_getstr(t, (ktap_str_t *)ts);
+	} else if (!is_nil(key)) {
+		ktap_node_t *n;
+ genlookup:
+		n = hashkey(t, key);
+		do {
+			if (kp_obj_equal(&n->key, key))
+				return &n->val;
+		} while ((n = n->next));
+	}
+	return niltv;
+}
+
+void kp_tab_get(ktap_state_t *ks, ktap_tab_t *t, const ktap_val_t *key,
+		ktap_val_t *val)
+{
+	unsigned long flags;
+
+	tab_lock(t);
+	set_obj(val, tab_get(ks, t, key));
+	tab_unlock(t);
+}
+
+/* -- Table setters ------------------------------------------------------- */
+
+/* Insert new key. Use Brent's variation to optimize the chain length. */
+static ktap_val_t *kp_tab_newkey(ktap_state_t *ks, ktap_tab_t *t,
+				 const ktap_val_t *key)
+{
+	ktap_node_t *n = hashkey(t, key);
+
+	if (!is_nil(&n->val) || t->hmask == 0) {
+		ktap_node_t *nodebase = t->node;
+		ktap_node_t *collide, *freenode = t->freetop;
+
+		kp_assert(freenode >= nodebase &&
+			  freenode <= nodebase+t->hmask+1);
+		do {
+			if (freenode == nodebase) {  /* No free node found? */
+				kp_error(ks, "table overflow\n");
+				return NULL;
+			}
+		} while (!is_nil(&(--freenode)->key));
+
+		t->freetop = freenode;
+		collide = hashkey(t, &n->key);
+		if (collide != n) {  /* Colliding node not the main node? */
+			while (collide->next != n)
+				/* Find predecessor. */
+				collide = collide->next;
+			collide->next = freenode;  /* Relink chain. */
+			/* Copy colliding node into free node and
+			 * free main node. */
+			freenode->val = n->val;
+			freenode->key = n->key;
+			freenode->next = n->next;
+			n->next = NULL;
+			set_nil(&n->val);
+			/* Rechain pseudo-resurrected string keys with
+			 * colliding hashes. */
+			while (freenode->next) {
+				ktap_node_t *nn = freenode->next;
+				if (is_string(&nn->key) && !is_nil(&nn->val) &&
+					hashstr(t, rawtsvalue(&nn->key)) == n) {
+					freenode->next = nn->next;
+					nn->next = n->next;
+					n->next = nn;
+				} else {
+					freenode = nn;
+				}
+			}
+		} else {  /* Otherwise use free node. */
+			freenode->next = n->next;  /* Insert into chain. */
+			n->next = freenode;
+			n = freenode;
+		}
+	}
+	set_obj(&n->key, key);
+	t->hnum++;
+	return &n->val;
+}
+
+static ktap_val_t *tab_setinth(ktap_state_t *ks, ktap_tab_t *t, uint32_t key)
+{
+	ktap_val_t k;
+	ktap_node_t *n;
+
+	set_number(&k, (ktap_number)key);
+	n = hashnum(t, &k);
+	do {
+		if (is_number(&n->key) && nvalue(&n->key) == key)
+			return &n->val;
+	} while ((n = n->next));
+	return kp_tab_newkey(ks, t, &k);
+}
+
+static __always_inline
+ktap_val_t *tab_setint(ktap_state_t *ks, ktap_tab_t *t, uint32_t key)
+{
+	return ((key < t->asize) ? arrayslot(t, key) :
+				   tab_setinth(ks, t, key));
+}
+
+void kp_tab_setint(ktap_state_t *ks, ktap_tab_t *t,
+		   uint32_t key, const ktap_val_t *val)
+{
+	ktap_val_t *v;
+	unsigned long flags;
+
+	tab_lock(t);
+	v = tab_setint(ks, t, key);
+	if (likely(v))
+		set_obj(v, val);
+	tab_unlock(t);
+}
+
+void kp_tab_incrint(ktap_state_t *ks, ktap_tab_t *t, uint32_t key,
+		    ktap_number n)
+{
+	ktap_val_t *v;
+	unsigned long flags;
+
+	tab_lock(t);
+	v = tab_setint(ks, t, key);
+	if (unlikely(!v))
+		goto out;
+
+	if (likely(is_number(v)))
+		set_number(v, nvalue(v) + n);
+	else if (is_nil(v))
+		set_number(v, n);
+	else
+		kp_error(ks, "use '+=' operator on non-number value\n");
+
+ out:
+	tab_unlock(t);
+}
+
+static ktap_val_t *tab_setstr(ktap_state_t *ks, ktap_tab_t *t,
+			      const ktap_str_t *key)
+{
+	ktap_val_t k;
+	ktap_node_t *n = hashstr(t, key);
+	do {
+		if (is_string(&n->key) && rawtsvalue(&n->key) == key)
+			return &n->val;
+	} while ((n = n->next));
+	set_string(&k, key);
+	return kp_tab_newkey(ks, t, &k);
+}
+
+void kp_tab_setstr(ktap_state_t *ks, ktap_tab_t *t, const ktap_str_t *key,
+		   const ktap_val_t *val)
+{
+	ktap_val_t *v;
+	unsigned long flags;
+
+	tab_lock(t);
+	v = tab_setstr(ks, t, key);
+	if (likely(v))
+		set_obj(v, val);
+	tab_unlock(t);
+}
+
+void kp_tab_incrstr(ktap_state_t *ks, ktap_tab_t *t, const ktap_str_t *key,
+		    ktap_number n)
+{
+	ktap_val_t *v;
+	unsigned long flags;
+
+	tab_lock(t);
+	v = tab_setstr(ks, t, key);
+	if (unlikely(!v))
+		goto out;
+
+	if (likely(is_number(v)))
+		set_number(v, nvalue(v) + n);
+	else if (is_nil(v))
+		set_number(v, n);
+	else
+		kp_error(ks, "use '+=' operator on non-number value\n");
+ out:
+	tab_unlock(t);
+}
+
+static ktap_val_t *tab_set(ktap_state_t *ks, ktap_tab_t *t,
+			   const ktap_val_t *key)
+{
+	ktap_node_t *n;
+
+	if (is_string(key)) {
+		return tab_setstr(ks, t, rawtsvalue(key));
+	} else if (is_number(key)) {
+		ktap_number nk = nvalue(key);
+		uint32_t k = (uint32_t)nk;
+		if (nk == (ktap_number)k)
+			return tab_setint(ks, t, k);
+	} else if (itype(key) == KTAP_TKSTACK) {
+		/* change stack into string */
+		ktap_str_t *bt = kp_obj_kstack2str(ks, key->val.stack.depth,
+						       key->val.stack.skip);
+		if (!bt)
+			return NULL;
+		return tab_setstr(ks, t, bt);
+	} else if (is_eventstr(key)) {
+		const ktap_str_t *ts;
+
+		if (!ks->current_event) {
+			kp_error(ks,
+			"cannot stringify event str in invalid context\n");
+			return NULL;
+		}
+
+		ts = kp_event_stringify(ks);
+		if (!ts)
+			return NULL;
+
+		return tab_setstr(ks, t, ts);
+		/* Else use the generic lookup. */
+	} else if (is_nil(key)) {
+		kp_error(ks, "table nil index\n");
+		return NULL;
+	}
+	n = hashkey(t, key);
+	do {
+		if (kp_obj_equal(&n->key, key))
+			return &n->val;
+	} while ((n = n->next));
+	return kp_tab_newkey(ks, t, key);
+}
+
+void kp_tab_set(ktap_state_t *ks, ktap_tab_t *t,
+		const ktap_val_t *key, const ktap_val_t *val)
+{
+	ktap_val_t *v;
+	unsigned long flags;
+
+	tab_lock(t);
+	v = tab_set(ks, t, key);
+	if (likely(v))
+		set_obj(v, val);
+	tab_unlock(t);
+}
+
+void kp_tab_incr(ktap_state_t *ks, ktap_tab_t *t, ktap_val_t *key,
+		 ktap_number n)
+{
+	ktap_val_t *v;
+	unsigned long flags;
+
+	tab_lock(t);
+	v = tab_set(ks, t, key);
+	if (unlikely(!v))
+		goto out;
+
+	if (likely(is_number(v)))
+		set_number(v, nvalue(v) + n);
+	else if (is_nil(v))
+		set_number(v, n);
+	else
+		kp_error(ks, "use '+=' operator on non-number value\n");
+ out:
+	tab_unlock(t);
+}
+
+
+/* -- Table traversal ----------------------------------------------------- */
+
+/* Get the traversal index of a key. */
+static uint32_t keyindex(ktap_state_t *ks, ktap_tab_t *t,
+			 const ktap_val_t *key)
+{
+	if (is_number(key)) {
+		ktap_number nk = nvalue(key);
+		uint32_t k = (uint32_t)nk;
+		/* Array key indexes: [0..t->asize-1] */
+		if ((uint32_t)k < t->asize && nk == (ktap_number)k)
+			return (uint32_t)k;
+	}
+
+	if (!is_nil(key)) {
+		ktap_node_t *n = hashkey(t, key);
+		do {
+			if (kp_obj_equal(&n->key, key))
+				return t->asize + (uint32_t)(n - (t->node));
+			/* Hash key indexes: [t->asize..t->asize+t->hmask] */
+		} while ((n = n->next));
+		kp_error(ks, "table next index\n");
+		return 0;  /* unreachable */
+	}
+	return ~0u;  /* A nil key starts the traversal. */
+}
+
+/* Advance to the next step in a table traversal. */
+int kp_tab_next(ktap_state_t *ks, ktap_tab_t *t, ktap_val_t *key)
+{
+	unsigned long flags;
+	uint32_t i;
+
+	tab_lock(t);
+	i = keyindex(ks, t, key);  /* Find predecessor key index. */
+
+	/* First traverse the array keys. */
+	for (i++; i < t->asize; i++)
+		if (!is_nil(arrayslot(t, i))) {
+			set_number(key, i);
+			set_obj(key + 1, arrayslot(t, i));
+			tab_unlock(t);
+			return 1;
+		}
+	/* Then traverse the hash keys. */
+	for (i -= t->asize; i <= t->hmask; i++) {
+		ktap_node_t *n = &t->node[i];
+		if (!is_nil(&n->val)) {
+			set_obj(key, &n->key);
+			set_obj(key + 1, &n->val);
+			tab_unlock(t);
+			return 1;
+		}
+	}
+	tab_unlock(t);
+	return 0;  /* End of traversal. */
+}
+
+/* -- Table length calculation -------------------------------------------- */
+
+int kp_tab_len(ktap_state_t *ks, ktap_tab_t *t)
+{
+	unsigned long flags;
+	int i, len = 0;
+
+	tab_lock(t);
+	for (i = 0; i < t->asize; i++) {
+		ktap_val_t *v = &t->array[i];
+
+		if (is_nil(v))
+			continue;
+		len++;
+	}
+
+	for (i = 0; i <= t->hmask; i++) {
+		ktap_node_t *n = &t->node[i];
+
+		if (is_nil(&n->key))
+			continue;
+
+		len++;
+	}
+	tab_unlock(t);
+	return len;
+}
+
+static void string_convert(char *output, const char *input)
+{
+	if (strlen(input) >= 32) {
+		strncpy(output, input, 32-4);
+		memset(output + 32-4, '.', 3);
+	} else
+		memcpy(output, input, strlen(input));
+}
+
+typedef struct ktap_node2 {
+	ktap_val_t key;
+	ktap_val_t val;
+} ktap_node2_t;
+
+static int hist_record_cmp(const void *i, const void *j)
+{
+	ktap_number n1 = nvalue(&((const ktap_node2_t *)i)->val);
+	ktap_number n2 = nvalue(&((const ktap_node2_t *)j)->val);
+
+	if (n1 == n2)
+		return 0;
+	else if (n1 < n2)
+		return 1;
+	else
+		return -1;
+}
+
+/* todo: make histdump faster; only the top n entries need sorting, not all */
+
+/* print_hist: key should be number/string/ip, value must be number */
+static void tab_histdump(ktap_state_t *ks, ktap_tab_t *t, int shownums)
+{
+	long start_time, delta_time;
+	uint32_t i, asize = t->asize;
+	ktap_val_t *array = t->array;
+	uint32_t hmask = t->hmask;
+	ktap_node_t *node = t->node;
+	ktap_node2_t *sort_mem;
+	char dist_str[39];
+	int total = 0, sum = 0;
+
+	start_time = gettimeofday_ns();
+
+	sort_mem = kmalloc((t->asize + t->hnum) * sizeof(ktap_node2_t),
+				GFP_KERNEL);
+	if (!sort_mem)
+		return;
+
+	/* copy all values in table into sort_mem. */
+	for (i = 0; i < asize; i++) {
+		ktap_val_t *val = &array[i];
+		if (is_nil(val))
+			continue;
+
+		if (!is_number(val)) {
+			kp_error(ks, "print_hist can only print numbers\n");
+			goto out;
+		}
+
+		set_number(&sort_mem[total].key, i);
+		set_obj(&sort_mem[total].val, val);
+		sum += nvalue(val);
+		total++;
+	}
+
+	for (i = 0; i <= hmask; i++) {
+		ktap_node_t *n = &node[i];
+		ktap_val_t *val = &n->val;
+
+		if (is_nil(val))
+			continue;
+
+		if (!is_number(val)) {
+			kp_error(ks, "print_hist can only print numbers\n");
+			goto out;
+		}
+
+		set_obj(&sort_mem[total].key, &n->key);
+		set_obj(&sort_mem[total].val, val);
+		sum += nvalue(val);
+		total++;
+	}
+
+	/* sort */
+	sort(sort_mem, total, sizeof(ktap_node2_t), hist_record_cmp, NULL);
+
+	dist_str[sizeof(dist_str) - 1] = '\0';
+
+	for (i = 0; i < total; i++) {
+		ktap_val_t *key = &sort_mem[i].key;
+		ktap_number num = nvalue(&sort_mem[i].val);
+		int ratio;
+
+		if (!--shownums)
+			break;
+
+		memset(dist_str, ' ', sizeof(dist_str) - 1);
+		ratio = sum ? (num * (sizeof(dist_str) - 1)) / sum : 0;
+		memset(dist_str, '@', ratio);
+
+		if (is_string(key)) {
+			//char buf[32] = {0};
+
+			//string_convert(buf, svalue(key));
+			if (rawtsvalue(key)->len > 32) {
+				kp_puts(ks, svalue(key));
+				kp_printf(ks, "%s\n%d\n\n", dist_str, num);
+			} else {
+				kp_printf(ks, "%31s |%s%-7d\n", svalue(key),
+								dist_str, num);
+			}
+		} else if (is_number(key)) {
+			kp_printf(ks, "%31d |%s%-7d\n", nvalue(key),
+						dist_str, num);
+		} else if (is_kip(key)) {
+			char str[KSYM_SYMBOL_LEN];
+			char buf[32] = {0};
+
+			SPRINT_SYMBOL(str, nvalue(key));
+			string_convert(buf, str);
+			kp_printf(ks, "%31s |%s%-7d\n", buf, dist_str, num);
+		}
+	}
+
+	if (!shownums && total)
+		kp_printf(ks, "%31s |\n", "...");
+
+ out:
+	kfree(sort_mem);
+
+	delta_time = (gettimeofday_ns() - start_time) / NSEC_PER_USEC;
+	kp_verbose_printf(ks, "tab_histdump time: %ld (us)\n", delta_time);
+}
+
+#define DISTRIBUTION_STR "------------- Distribution -------------"
+void kp_tab_print_hist(ktap_state_t *ks, ktap_tab_t *t, int n)
+{
+	kp_printf(ks, "%31s%s%s\n", "value ", DISTRIBUTION_STR, " count");
+	tab_histdump(ks, t, n);
+}
+
diff --git a/kernel/trace/ktap/kp_tab.h b/kernel/trace/ktap/kp_tab.h
new file mode 100644
index 0000000..2a7576d
--- /dev/null
+++ b/kernel/trace/ktap/kp_tab.h
@@ -0,0 +1,59 @@
+#ifndef __KTAP_TAB_H__
+#define __KTAP_TAB_H__
+
+/* Hash constants. Tuned using a brute force search. */
+#define HASH_BIAS       (-0x04c11db7)
+#define HASH_ROT1       14
+#define HASH_ROT2       5
+#define HASH_ROT3       13
+
+/* Every half-decent C compiler transforms this into a rotate instruction. */
+#define kp_rol(x, n)    (((x)<<(n)) | ((x)>>(-(int)(n)&(8*sizeof(x)-1))))
+#define kp_ror(x, n)    (((x)<<(-(int)(n)&(8*sizeof(x)-1))) | ((x)>>(n)))
+
+/* Scramble the bits of numbers and pointers. */
+static __always_inline uint32_t hashrot(uint32_t lo, uint32_t hi)
+{
+	/* Prefer variant that compiles well for a 2-operand CPU. */
+	lo ^= hi; hi = kp_rol(hi, HASH_ROT1);
+	lo -= hi; hi = kp_rol(hi, HASH_ROT2);
+	hi ^= lo; hi -= kp_rol(lo, HASH_ROT3);
+	return hi;
+}
+
+
+#define FLS(x)       ((uint32_t)(__builtin_clz(x)^31))
+#define hsize2hbits(s)  ((s) ? ((s)==1 ? 1 : 1+FLS((uint32_t)((s)-1))) : 0)
+
+#define arrayslot(t, i)         (&(t)->array[(i)])
+
+void kp_tab_set(ktap_state_t *ks, ktap_tab_t *t, const ktap_val_t *key,
+		const ktap_val_t *val);
+void kp_tab_setstr(ktap_state_t *ks, ktap_tab_t *t,
+		   const ktap_str_t *key, const ktap_val_t *val);
+void kp_tab_incrstr(ktap_state_t *ks, ktap_tab_t *t, const ktap_str_t *key,
+		    ktap_number n);
+void kp_tab_get(ktap_state_t *ks, ktap_tab_t *t, const ktap_val_t *key,
+		ktap_val_t *val);
+void kp_tab_getstr(ktap_tab_t *t, ktap_str_t *key, ktap_val_t *val);
+
+void kp_tab_getint(ktap_tab_t *t, uint32_t key, ktap_val_t *val);
+void kp_tab_setint(ktap_state_t *ks, ktap_tab_t *t,
+		   uint32_t key, const ktap_val_t *val);
+void kp_tab_incrint(ktap_state_t *ks, ktap_tab_t *t, uint32_t key,
+		    ktap_number n);
+ktap_tab_t *kp_tab_new(ktap_state_t *ks, uint32_t asize, uint32_t hbits);
+ktap_tab_t *kp_tab_new_ah(ktap_state_t *ks, int32_t a, int32_t h);
+ktap_tab_t *kp_tab_dup(ktap_state_t *ks, const ktap_tab_t *kt);
+
+void kp_tab_free(ktap_state_t *ks, ktap_tab_t *t);
+int kp_tab_len(ktap_state_t *ks, ktap_tab_t *t);
+void kp_tab_dump(ktap_state_t *ks, ktap_tab_t *t);
+void kp_tab_clear(ktap_tab_t *t);
+void kp_tab_print_hist(ktap_state_t *ks, ktap_tab_t *t, int n);
+int kp_tab_next(ktap_state_t *ks, ktap_tab_t *t, StkId key);
+int kp_tab_sort_next(ktap_state_t *ks, ktap_tab_t *t, StkId key);
+void kp_tab_sort(ktap_state_t *ks, ktap_tab_t *t, ktap_func_t *cmp_func);
+void kp_tab_incr(ktap_state_t *ks, ktap_tab_t *t, ktap_val_t *key,
+		ktap_number n);
+#endif /* __KTAP_TAB_H__ */
-- 
1.8.1.4
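
A note on the table sizing in the patch above: kp_tab_new() takes a
non-inclusive array size and a hash size in bits, and kp_tab_new_ah()
converts a requested slot count with hsize2hbits(). The sketch below
restates that mapping as plain userspace C so the rounding is easy to
check. The helper names (fls_index, hsize_to_hbits, hbits_to_slots,
asize_for) are illustrative only and are not part of the patch; it
assumes a compiler that provides __builtin_clz, as the FLS() macro does.

```c
#include <stdint.h>

/* Index of the highest set bit of a non-zero 32-bit value (0..31). */
static uint32_t fls_index(uint32_t x)
{
	return (uint32_t)(__builtin_clz(x) ^ 31);
}

/* Same mapping as hsize2hbits(): requested slot count -> hash bits. */
static uint32_t hsize_to_hbits(uint32_t s)
{
	if (s == 0)
		return 0;
	if (s == 1)
		return 1;
	return 1 + fls_index(s - 1);
}

/* Number of hash slots for a given hbits (0 means no hash part). */
static uint32_t hbits_to_slots(uint32_t hbits)
{
	return hbits ? (1u << hbits) : 0;
}

/* Array size kp_tab_new_ah() passes on: one extra slot so that
 * index a itself is addressable. */
static uint32_t asize_for(int32_t a)
{
	return (uint32_t)(a > 0 ? a + 1 : 0);
}
```

With the defaults TABLE_NARR_ENTRIES=255 and TABLE_NREC_ENTRIES=2048,
this gives an array size of 256 and 11 hash bits (2048 slots). Slot
counts that are not a power of two round up to the next power of two.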



* [PATCH v2 12/29] ktap: add generic object handling code(kernel/trace/ktap/kp_obj.[c|h])
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (10 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 11/29] ktap: add table handling code(kernel/trace/ktap/kp_tab.[c|h]) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-30  3:56   ` Andi Kleen
  2014-03-28 14:45 ` [PATCH v2 13/29] ktap: add ring buffer handling code(kernel/trace/ktap/kp_transport.[c|h]) Jovi Zhangwei
                   ` (17 subsequent siblings)
  29 siblings, 1 reply; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

kp_obj.c includes common ktap object operations,
such as object dump and object length.
It also includes generic memory allocation code.

Exposed functions:
1). kp_malloc/kp_zalloc/kp_free

2). kp_obj_dump/kp_obj_show

3). kp_obj_rawequal

4). kp_obj_len

5). kp_obj_new
        allocate a new object; all objects are linked into G(ks)->allgc.

6). kp_obj_kstack2str
        convert kernel stack to string.

7). kp_obj_freeall
        free all objects; called from kp_vm_exit.

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
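
Note on the allgc list (not part of the commit message): kp_obj_new
pushes each new object onto the head of G(ks)->allgc, and
kp_obj_freeall walks that list to release everything. A minimal
userspace sketch of the same pattern follows. The struct and function
names (struct obj, struct gstate, obj_new, obj_freeall) are
hypothetical stand-ins, malloc/free replace kp_malloc/kp_free, and the
per-type dispatch done by free_gclist is left out.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a ktap object with its GC header. */
struct obj {
	struct obj *nextgc;   /* intrusive link, like gch(o)->nextgc */
	int payload;
};

/* Hypothetical stand-in for the global state holding the list head. */
struct gstate {
	struct obj *allgc;    /* head of the list of all allocated objects */
};

/* Allocate an object and push it onto the front of the allgc list. */
static struct obj *obj_new(struct gstate *g, int payload)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	o->payload = payload;
	o->nextgc = g->allgc;
	g->allgc = o;
	return o;
}

/* Walk the list, free every object, and return how many were freed. */
static int obj_freeall(struct gstate *g)
{
	struct obj *o = g->allgc;
	int freed = 0;

	while (o) {
		struct obj *next = o->nextgc;  /* read the link before freeing */

		free(o);
		freed++;
		o = next;
	}
	g->allgc = NULL;
	return freed;
}
```

Because insertion is at the head, objects are freed in reverse
allocation order. The sketch is single-threaded; the patch restricts
kp_obj_new to the main thread for the same reason.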
 kernel/trace/ktap/kp_obj.c | 281 +++++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/ktap/kp_obj.h |  19 +++
 2 files changed, 300 insertions(+)
 create mode 100644 kernel/trace/ktap/kp_obj.c
 create mode 100644 kernel/trace/ktap/kp_obj.h

diff --git a/kernel/trace/ktap/kp_obj.c b/kernel/trace/ktap/kp_obj.c
new file mode 100644
index 0000000..03e25fc
--- /dev/null
+++ b/kernel/trace/ktap/kp_obj.c
@@ -0,0 +1,281 @@
+/*
+ * kp_obj.c - ktap object generic operation
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * Adapted from luajit and lua interpreter.
+ * Copyright (C) 2005-2014 Mike Pall.
+ * Copyright (C) 1994-2008 Lua.org, PUC-Rio.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <linux/stacktrace.h>
+#include <linux/module.h>
+#include <linux/kallsyms.h>
+#include <linux/slab.h>
+#include <uapi/ktap/ktap_types.h>
+#include "kp_obj.h"
+#include "kp_str.h"
+#include "kp_tab.h"
+#include "ktap.h"
+#include "kp_vm.h"
+#include "kp_transport.h"
+
+/* Error message strings. */
+const char *kp_err_allmsg =
+#define ERRDEF(name, msg)       msg "\0"
+#include <uapi/ktap/ktap_errmsg.h>
+;
+
+/* memory allocation flag */
+#define KTAP_ALLOC_FLAGS ((GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN) \
+			 & ~__GFP_WAIT)
+
+void *kp_malloc(ktap_state_t *ks, int size)
+{
+	void *addr;
+
+	addr = kmalloc(size, KTAP_ALLOC_FLAGS);
+	if (unlikely(!addr)) {
+		kp_error(ks, "kmalloc failed\n");
+	}
+	return addr;
+}
+
+void *kp_zalloc(ktap_state_t *ks, int size)
+{
+	void *addr;
+
+	addr = kzalloc(size, KTAP_ALLOC_FLAGS);
+	if (unlikely(!addr))
+		kp_error(ks, "kzalloc failed\n");
+	return addr;
+}
+
+void kp_free(ktap_state_t *ks, void *addr)
+{
+	kfree(addr);
+}
+
+
+void kp_obj_dump(ktap_state_t *ks, const ktap_val_t *v)
+{
+	switch (itype(v)) {
+	case KTAP_TNIL:
+		kp_puts(ks, "NIL");
+		break;
+	case KTAP_TTRUE:
+		kp_printf(ks, "true");
+		break;
+	case KTAP_TFALSE:
+		kp_printf(ks, "false");
+		break;
+	case KTAP_TNUM:
+		kp_printf(ks, "NUM %ld", nvalue(v));
+		break;
+	case KTAP_TLIGHTUD:
+		kp_printf(ks, "LIGHTUD 0x%lx", (unsigned long)pvalue(v));
+		break;
+	case KTAP_TFUNC:
+		kp_printf(ks, "FUNCTION 0x%lx", (unsigned long)fvalue(v));
+		break;
+	case KTAP_TSTR:
+		kp_printf(ks, "STR #%s", svalue(v));
+		break;
+	case KTAP_TTAB:
+		kp_printf(ks, "TABLE 0x%lx", (unsigned long)hvalue(v));
+		break;
+	default:
+		kp_printf(ks, "GCVALUE 0x%lx", (unsigned long)gcvalue(v));
+		break;
+	}
+}
+
+void kp_obj_show(ktap_state_t *ks, const ktap_val_t *v)
+{
+	switch (itype(v)) {
+	case KTAP_TNIL:
+		kp_puts(ks, "nil");
+		break;
+	case KTAP_TTRUE:
+		kp_puts(ks, "true");
+		break;
+	case KTAP_TFALSE:
+		kp_puts(ks, "false");
+		break;
+	case KTAP_TNUM:
+		kp_printf(ks, "%ld", nvalue(v));
+		break;
+	case KTAP_TLIGHTUD:
+		kp_printf(ks, "lightud 0x%lx", (unsigned long)pvalue(v));
+		break;
+	case KTAP_TCFUNC:
+		kp_printf(ks, "cfunction 0x%lx", (unsigned long)fvalue(v));
+		break;
+	case KTAP_TFUNC:
+		kp_printf(ks, "function 0x%lx", (unsigned long)gcvalue(v));
+		break;
+	case KTAP_TSTR:
+		kp_puts(ks, svalue(v));
+		break;
+	case KTAP_TTAB:
+		kp_printf(ks, "table 0x%lx", (unsigned long)hvalue(v));
+		break;
+	case KTAP_TEVENTSTR:
+		/* check event context */
+		if (!ks->current_event) {
+			kp_error(ks,
+			"cannot stringify event str in invalid context\n");
+			return;
+		}
+
+		kp_transport_event_write(ks, ks->current_event);
+		break;
+	case KTAP_TKSTACK:
+		kp_transport_print_kstack(ks, v->val.stack.depth,
+					      v->val.stack.skip);
+		break;
+	default:
+		kp_error(ks, "print unknown value type: %d\n", itype(v));
+		break;
+	}
+}
+
+
+/*
+ * equality of ktap values.
+ */
+int kp_obj_rawequal(const ktap_val_t *t1, const ktap_val_t *t2)
+{
+	switch (itype(t1)) {
+	case KTAP_TNIL:
+	case KTAP_TTRUE:
+	case KTAP_TFALSE:
+		return 1;
+	case KTAP_TNUM:
+		return nvalue(t1) == nvalue(t2);
+	case KTAP_TLIGHTUD:
+		return pvalue(t1) == pvalue(t2);
+	case KTAP_TFUNC:
+		return fvalue(t1) == fvalue(t2);
+	case KTAP_TSTR:
+		return rawtsvalue(t1) == rawtsvalue(t2);
+	case KTAP_TTAB:
+		return hvalue(t1) == hvalue(t2);
+	default:
+		return gcvalue(t1) == gcvalue(t2);
+	}
+
+	return 0;
+}
+
+/*
+ * ktap does not use Lua's length operator for tables;
+ * # is no longer the length operator in ktap.
+ */
+int kp_obj_len(ktap_state_t *ks, const ktap_val_t *v)
+{
+	switch(itype(v)) {
+	case KTAP_TTAB:
+		return kp_tab_len(ks, hvalue(v));
+	case KTAP_TSTR:
+		return rawtsvalue(v)->len;
+	default:
+		kp_printf(ks, "cannot get length of type %d\n", v->type);
+		return -1;
+	}
+	return 0;
+}
+
+/* need to protect allgc field? */
+ktap_obj_t *kp_obj_new(ktap_state_t *ks, size_t size)
+{
+	ktap_obj_t *o, **list;
+
+	if (ks != G(ks)->mainthread) {
+		kp_error(ks, "kp_obj_new can only be called in mainthread\n");
+		return NULL;
+	}
+
+	o = kp_malloc(ks, size);
+	if (unlikely(!o))
+		return NULL;
+
+	list = &G(ks)->allgc;
+	gch(o)->nextgc = *list;
+	*list = o;
+
+	return o;
+}
+
+
+/* this function may be time consuming, move out from table set/get? */
+ktap_str_t *kp_obj_kstack2str(ktap_state_t *ks, uint16_t depth, uint16_t skip)
+{
+	struct stack_trace trace;
+	unsigned long *bt;
+	char *btstr, *p;
+	int i;
+
+	bt = kp_this_cpu_print_buffer(ks); /* use print percpu buffer */
+	trace.nr_entries = 0;
+	trace.skip = skip;
+	trace.max_entries = depth;
+	trace.entries = (unsigned long *)(bt + 1);
+	save_stack_trace(&trace);
+
+	/* convert backtrace to string */
+	p = btstr = kp_this_cpu_temp_buffer(ks);
+	for (i = 0; i < trace.nr_entries; i++) {
+		unsigned long addr = trace.entries[i];
+
+		if (addr == ULONG_MAX)
+			break;
+
+		p += sprint_symbol(p, addr);
+		*p++ = '\n';
+	}
+
+	return kp_str_new(ks, btstr, p - btstr);
+}
+
+static void free_gclist(ktap_state_t *ks, ktap_obj_t *o)
+{
+	while (o) {
+		ktap_obj_t *next;
+
+		next = gch(o)->nextgc;
+		switch (gch(o)->gct) {
+		case ~KTAP_TTAB:
+			kp_tab_free(ks, (ktap_tab_t *)o);
+			break;
+		case ~KTAP_TUPVAL:
+			kp_freeupval(ks, (ktap_upval_t *)o);
+			break;
+		default:
+			kp_free(ks, o);
+		}
+		o = next;
+	}
+}
+
+void kp_obj_freeall(ktap_state_t *ks)
+{
+	free_gclist(ks, G(ks)->allgc);
+	G(ks)->allgc = NULL;
+}
+
diff --git a/kernel/trace/ktap/kp_obj.h b/kernel/trace/ktap/kp_obj.h
new file mode 100644
index 0000000..b24aab0
--- /dev/null
+++ b/kernel/trace/ktap/kp_obj.h
@@ -0,0 +1,19 @@
+#ifndef __KTAP_OBJ_H__
+#define __KTAP_OBJ_H__
+
+void *kp_malloc(ktap_state_t *ks, int size);
+void *kp_zalloc(ktap_state_t *ks, int size);
+void kp_free(ktap_state_t *ks, void *addr);
+
+void kp_obj_dump(ktap_state_t *ks, const ktap_val_t *v);
+void kp_obj_show(ktap_state_t *ks, const ktap_val_t *v);
+int kp_obj_len(ktap_state_t *ks, const ktap_val_t *rb);
+ktap_obj_t *kp_obj_new(ktap_state_t *ks, size_t size);
+int kp_obj_rawequal(const ktap_val_t *t1, const ktap_val_t *t2);
+ktap_str_t *kp_obj_kstack2str(ktap_state_t *ks, uint16_t depth, uint16_t skip);
+void kp_obj_freeall(ktap_state_t *ks);
+
+#define kp_obj_equal(o1, o2) \
+	(((o1)->type == (o2)->type) && kp_obj_rawequal(o1, o2))
+
+#endif /* __KTAP_OBJ_H__ */
-- 
1.8.1.4



* [PATCH v2 13/29] ktap: add ring buffer handling code(kernel/trace/ktap/kp_transport.[c|h])
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (11 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 12/29] ktap: add generic object handling code(kernel/trace/ktap/kp_obj.[c|h]) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-30  3:58   ` Andi Kleen
  2014-03-28 14:45 ` [PATCH v2 14/29] ktap: add events management(kernel/trace/ktap/kp_events.[c|h]) Jovi Zhangwei
                   ` (16 subsequent siblings)
  29 siblings, 1 reply; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

ktap transport functionality is based on the ftrace ring buffer.

All content generated by the ktap runtime is sent to userspace
through this ring buffer.

The concept of ktap transport is similar to ftrace's: for example,
it writes binary event data into the ring buffer when the user calls
'print(argstr)' or 'print(stack())', and the content is parsed on
the consumer side, not the producer side.

A userspace reader thread consumes the ring buffer content through
'/sys/kernel/debug/ktap/trace_pipe_$pid'.

Much of the code in this file is duplicated from the kernel's trace_output.c.

Exposed functions:
1). kp_transport_init
2). kp_transport_exit
3). __kp_bputs/__kp_puts/kp_printf
4). kp_transport_write
5). kp_transport_event_write
        write an event's content into the ring buffer; called by: 'print(argstr)'
6). kp_transport_print_kstack
        write a kernel stack into the ring buffer; called by: 'print(stack())'

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
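
Note on consumer-side formatting (not part of the commit message): the
producer stores raw binary records, and text is generated only when the
reader drains the buffer. The sketch below shows that split in plain
userspace C. Everything in it (struct rec, struct ring, ring_write,
ring_read_text, the REC_* types) is hypothetical and much simpler than
the ftrace ring buffer: fixed-size records, a single thread, no per-CPU
buffers and no locking.

```c
#include <stdio.h>
#include <string.h>

enum rec_type { REC_NUM = 1, REC_STR = 2 };  /* hypothetical record types */

/* A raw record as the producer would store it. */
struct rec {
	int type;
	long num;
	char str[16];
};

#define RING_SLOTS 4

struct ring {
	struct rec slot[RING_SLOTS];
	unsigned int head;  /* next slot to write */
	unsigned int tail;  /* next slot to read */
};

/* Producer side: store raw data only; returns 0 if the ring is full. */
static int ring_write(struct ring *r, const struct rec *rec)
{
	if (r->head - r->tail == RING_SLOTS)
		return 0;
	r->slot[r->head % RING_SLOTS] = *rec;
	r->head++;
	return 1;
}

/* Consumer side: pop one record and format it as text. Returns the
 * snprintf() length, or -1 if the ring is empty. */
static int ring_read_text(struct ring *r, char *out, size_t len)
{
	const struct rec *rec;
	int n;

	if (r->head == r->tail)
		return -1;
	rec = &r->slot[r->tail % RING_SLOTS];
	if (rec->type == REC_NUM)
		n = snprintf(out, len, "%ld", rec->num);
	else
		n = snprintf(out, len, "%s", rec->str);
	r->tail++;
	return n;
}
```

Keeping the formatting on the reader side keeps the write path short,
which matters because the producer runs in probe context.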
 kernel/trace/ktap/kp_transport.c | 649 +++++++++++++++++++++++++++++++++++++++
 kernel/trace/ktap/kp_transport.h |  13 +
 2 files changed, 662 insertions(+)
 create mode 100644 kernel/trace/ktap/kp_transport.c
 create mode 100644 kernel/trace/ktap/kp_transport.h

diff --git a/kernel/trace/ktap/kp_transport.c b/kernel/trace/ktap/kp_transport.c
new file mode 100644
index 0000000..5baf586
--- /dev/null
+++ b/kernel/trace/ktap/kp_transport.c
@@ -0,0 +1,649 @@
+/*
+ * kp_transport.c - ktap transport functionality
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <linux/debugfs.h>
+#include <linux/ftrace_event.h>
+#include <linux/stacktrace.h>
+#include <linux/clocksource.h>
+#include <asm/uaccess.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/kallsyms.h>
+#include <uapi/ktap/ktap_types.h>
+#include "ktap.h"
+#include "kp_events.h"
+#include "kp_transport.h"
+
+struct ktap_trace_iterator {
+	struct ring_buffer	*buffer;
+	int			print_timestamp;
+	void			*private;
+
+	struct trace_iterator	iter;
+};
+
+enum ktap_trace_type {
+	__TRACE_FIRST_TYPE = 0,
+
+	TRACE_FN = 1, /* must be same as ftrace definition in kernel */
+	TRACE_PRINT,
+	TRACE_BPUTS,
+	TRACE_STACK,
+	TRACE_USER_STACK,
+
+	__TRACE_LAST_TYPE,
+};
+
+#define KTAP_TRACE_ITER(iter)	\
+	container_of(iter, struct ktap_trace_iterator, iter)
+
+static
+ssize_t _trace_seq_to_user(struct trace_seq *s, char __user *ubuf, size_t cnt)
+{
+	int len;
+	int ret;
+
+	if (!cnt)
+		return 0;
+
+	if (s->len <= s->readpos)
+		return -EBUSY;
+
+	len = s->len - s->readpos;
+	if (cnt > len)
+		cnt = len;
+	ret = copy_to_user(ubuf, s->buffer + s->readpos, cnt);
+	if (ret == cnt)
+		return -EFAULT;
+
+	cnt -= ret;
+
+	s->readpos += cnt;
+	return cnt;
+}
+
+int _trace_seq_puts(struct trace_seq *s, const char *str)
+{
+	int len = strlen(str);
+
+	if (s->full)
+		return 0;
+
+	if (len > ((PAGE_SIZE - 1) - s->len)) {
+		s->full = 1;
+		return 0;
+	}
+
+	memcpy(s->buffer + s->len, str, len);
+	s->len += len;
+
+	return len;
+}
+
+static int trace_empty(struct trace_iterator *iter)
+{
+	struct ktap_trace_iterator *ktap_iter = KTAP_TRACE_ITER(iter);
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		if (!ring_buffer_empty_cpu(ktap_iter->buffer, cpu))
+			return 0;
+	}
+
+	return 1;
+}
+
+static void trace_consume(struct trace_iterator *iter)
+{
+	struct ktap_trace_iterator *ktap_iter = KTAP_TRACE_ITER(iter);
+
+	ring_buffer_consume(ktap_iter->buffer, iter->cpu, &iter->ts,
+			    &iter->lost_events);
+}
+
+unsigned long long ns2usecs(cycle_t nsec)
+{
+	nsec += 500;
+	do_div(nsec, 1000);
+	return nsec;
+}
+
+static int trace_print_timestamp(struct trace_iterator *iter)
+{
+	struct trace_seq *s = &iter->seq;
+	unsigned long long t;
+	unsigned long secs, usec_rem;
+
+	t = ns2usecs(iter->ts);
+	usec_rem = do_div(t, USEC_PER_SEC);
+	secs = (unsigned long)t;
+
+	return trace_seq_printf(s, "%5lu.%06lu: ", secs, usec_rem);
+}
+
+/* todo: export kernel function ftrace_find_event in future, and make faster */
+static struct trace_event *(*ftrace_find_event)(int type);
+
+static enum print_line_t print_trace_fmt(struct trace_iterator *iter)
+{
+	struct ktap_trace_iterator *ktap_iter = KTAP_TRACE_ITER(iter);
+	struct trace_entry *entry = iter->ent;
+	struct trace_event *ev;
+
+	ev = ftrace_find_event(entry->type);
+
+	if (ktap_iter->print_timestamp && !trace_print_timestamp(iter))
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	if (ev) {
+		int ret = ev->funcs->trace(iter, 0, ev);
+
+		/* overwrite the trailing '\n' */
+		iter->seq.buffer[iter->seq.len - 1] = '\0';
+		iter->seq.len--;
+		return ret;
+	}
+
+	return TRACE_TYPE_PARTIAL_LINE;
+}
+
+static enum print_line_t print_trace_stack(struct trace_iterator *iter)
+{
+	struct trace_entry *entry = iter->ent;
+	struct stack_trace trace;
+	char str[KSYM_SYMBOL_LEN];
+	int i;
+
+	trace.entries = (unsigned long *)(entry + 1);
+	trace.nr_entries = (iter->ent_size - sizeof(*entry)) /
+			   sizeof(unsigned long);
+
+	if (!_trace_seq_puts(&iter->seq, "<stack trace>\n"))
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	for (i = 0; i < trace.nr_entries; i++) {
+		unsigned long p = trace.entries[i];
+
+		if (p == ULONG_MAX)
+			break;
+
+		sprint_symbol(str, p);
+		if (!trace_seq_printf(&iter->seq, " => %s\n", str))
+			return TRACE_TYPE_PARTIAL_LINE;
+	}
+
+	return TRACE_TYPE_HANDLED;
+}
+
+struct ktap_ftrace_entry {
+	struct trace_entry entry;
+	unsigned long ip;
+	unsigned long parent_ip;
+};
+
+static enum print_line_t print_trace_fn(struct trace_iterator *iter)
+{
+	struct ktap_trace_iterator *ktap_iter = KTAP_TRACE_ITER(iter);
+	struct ktap_ftrace_entry *field = (struct ktap_ftrace_entry *)iter->ent;
+	char str[KSYM_SYMBOL_LEN];
+
+	if (ktap_iter->print_timestamp && !trace_print_timestamp(iter))
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	sprint_symbol(str, field->ip);
+	if (!_trace_seq_puts(&iter->seq, str))
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	if (!_trace_seq_puts(&iter->seq, " <- "))
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	sprint_symbol(str, field->parent_ip);
+	if (!_trace_seq_puts(&iter->seq, str))
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	return TRACE_TYPE_HANDLED;
+}
+
+static enum print_line_t print_trace_bputs(struct trace_iterator *iter)
+{
+	if (!_trace_seq_puts(&iter->seq,
+			    (const char *)(*(unsigned long *)(iter->ent + 1))))
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	return TRACE_TYPE_HANDLED;
+}
+
+static enum print_line_t print_trace_line(struct trace_iterator *iter)
+{
+	struct trace_entry *entry = iter->ent;
+	char *str = (char *)(entry + 1);
+
+	if (entry->type == TRACE_PRINT) {
+		if (!trace_seq_printf(&iter->seq, "%s", str))
+			return TRACE_TYPE_PARTIAL_LINE;
+
+		return TRACE_TYPE_HANDLED;
+	}
+
+	if (entry->type == TRACE_BPUTS)
+		return print_trace_bputs(iter);
+
+	if (entry->type == TRACE_STACK)
+		return print_trace_stack(iter);
+
+	if (entry->type == TRACE_FN)
+		return print_trace_fn(iter);
+
+	return print_trace_fmt(iter);
+}
+
+static struct trace_entry *
+peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts,
+		unsigned long *lost_events)
+{
+	struct ktap_trace_iterator *ktap_iter = KTAP_TRACE_ITER(iter);
+	struct ring_buffer_event *event;
+
+	event = ring_buffer_peek(ktap_iter->buffer, cpu, ts, lost_events);
+	if (event) {
+		iter->ent_size = ring_buffer_event_length(event);
+		return ring_buffer_event_data(event);
+	}
+
+	return NULL;
+}
+
+static struct trace_entry *
+__find_next_entry(struct trace_iterator *iter, int *ent_cpu,
+		  unsigned long *missing_events, u64 *ent_ts)
+{
+	struct ktap_trace_iterator *ktap_iter = KTAP_TRACE_ITER(iter);
+	struct ring_buffer *buffer = ktap_iter->buffer;
+	struct trace_entry *ent, *next = NULL;
+	unsigned long lost_events = 0, next_lost = 0;
+	u64 next_ts = 0, ts;
+	int next_cpu = -1;
+	int next_size = 0;
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		if (ring_buffer_empty_cpu(buffer, cpu))
+			continue;
+
+		ent = peek_next_entry(iter, cpu, &ts, &lost_events);
+		/*
+		 * Pick the entry with the smallest timestamp:
+		 */
+		if (ent && (!next || ts < next_ts)) {
+			next = ent;
+			next_cpu = cpu;
+			next_ts = ts;
+			next_lost = lost_events;
+			next_size = iter->ent_size;
+		}
+	}
+
+	iter->ent_size = next_size;
+
+	if (ent_cpu)
+		*ent_cpu = next_cpu;
+
+	if (ent_ts)
+		*ent_ts = next_ts;
+
+	if (missing_events)
+		*missing_events = next_lost;
+
+	return next;
+}
+
+/* Find the next real entry, and increment the iterator to the next entry */
+static void *trace_find_next_entry_inc(struct trace_iterator *iter)
+{
+	iter->ent = __find_next_entry(iter, &iter->cpu,
+				      &iter->lost_events, &iter->ts);
+	if (iter->ent)
+		iter->idx++;
+
+	return iter->ent ? iter : NULL;
+}
+
+static void poll_wait_pipe(void)
+{
+	set_current_state(TASK_INTERRUPTIBLE);
+	/* sleep for 100 msecs, and try again. */
+	schedule_timeout(HZ / 10);
+}
+
+static int tracing_wait_pipe(struct file *filp)
+{
+	struct trace_iterator *iter = filp->private_data;
+	struct ktap_trace_iterator *ktap_iter = KTAP_TRACE_ITER(iter);
+	ktap_state_t *ks = ktap_iter->private;
+
+	while (trace_empty(iter)) {
+
+		if ((filp->f_flags & O_NONBLOCK)) {
+			return -EAGAIN;
+		}
+
+		mutex_unlock(&iter->mutex);
+
+		poll_wait_pipe();
+
+		mutex_lock(&iter->mutex);
+
+		if (G(ks)->wait_user && trace_empty(iter))
+			return -EINTR;
+	}
+
+	return 1;
+}
+
+static ssize_t
+tracing_read_pipe(struct file *filp, char __user *ubuf, size_t cnt,
+		  loff_t *ppos)
+{
+	struct trace_iterator *iter = filp->private_data;
+	ssize_t sret;
+
+	/* return any leftover data */
+	sret = _trace_seq_to_user(&iter->seq, ubuf, cnt);
+	if (sret != -EBUSY)
+		return sret;
+	/*
+	 * Avoid more than one consumer on a single file descriptor
+	 * This is just a matter of traces coherency, the ring buffer itself
+	 * is protected.
+	 */
+	mutex_lock(&iter->mutex);
+
+waitagain:
+	sret = tracing_wait_pipe(filp);
+	if (sret <= 0)
+		goto out;
+
+	/* stop when tracing is finished */
+	if (trace_empty(iter)) {
+		sret = 0;
+		goto out;
+	}
+
+	if (cnt >= PAGE_SIZE)
+		cnt = PAGE_SIZE - 1;
+
+	/* reset all but tr, trace, and overruns */
+	memset(&iter->seq, 0,
+	       sizeof(struct trace_iterator) -
+	       offsetof(struct trace_iterator, seq));
+	iter->pos = -1;
+
+	while (trace_find_next_entry_inc(iter) != NULL) {
+		enum print_line_t ret;
+		int len = iter->seq.len;
+
+		ret = print_trace_line(iter);
+		if (ret == TRACE_TYPE_PARTIAL_LINE) {
+			/* don't print partial lines */
+			iter->seq.len = len;
+			break;
+		}
+		if (ret != TRACE_TYPE_NO_CONSUME)
+			trace_consume(iter);
+
+		if (iter->seq.len >= cnt)
+			break;
+
+		/*
+		 * Setting the full flag means we reached the trace_seq buffer
+		 * size and we should leave by partial output condition above.
+		 * One of the trace_seq_* functions is not used properly.
+		 */
+		WARN_ONCE(iter->seq.full, "full flag set for trace type %d",
+			  iter->ent->type);
+	}
+
+	/* Now copy what we have to the user */
+	sret = _trace_seq_to_user(&iter->seq, ubuf, cnt);
+	if (iter->seq.readpos >= iter->seq.len)
+		trace_seq_init(&iter->seq);
+
+	/*
+	 * If there was nothing to send to user, in spite of consuming trace
+	 * entries, go back to wait for more entries.
+	 */
+	if (sret == -EBUSY)
+		goto waitagain;
+
+out:
+	mutex_unlock(&iter->mutex);
+
+	return sret;
+}
+
+static int tracing_open_pipe(struct inode *inode, struct file *filp)
+{
+	struct ktap_trace_iterator *ktap_iter;
+	ktap_state_t *ks = inode->i_private;
+
+	/* create a buffer to store the information to pass to userspace */
+	ktap_iter = kzalloc(sizeof(*ktap_iter), GFP_KERNEL);
+	if (!ktap_iter)
+		return -ENOMEM;
+
+	ktap_iter->private = ks;
+	ktap_iter->buffer = G(ks)->buffer;
+	ktap_iter->print_timestamp = G(ks)->parm->print_timestamp;
+	mutex_init(&ktap_iter->iter.mutex);
+	filp->private_data = &ktap_iter->iter;
+
+	nonseekable_open(inode, filp);
+
+	return 0;
+}
+
+static int tracing_release_pipe(struct inode *inode, struct file *file)
+{
+	struct trace_iterator *iter = file->private_data;
+	struct ktap_trace_iterator *ktap_iter = KTAP_TRACE_ITER(iter);
+
+	mutex_destroy(&iter->mutex);
+	kfree(ktap_iter);
+	return 0;
+}
+
+static const struct file_operations tracing_pipe_fops = {
+	.open		= tracing_open_pipe,
+	.read		= tracing_read_pipe,
+	.splice_read	= NULL,
+	.release	= tracing_release_pipe,
+	.llseek		= no_llseek,
+};
+
+/*
+ * preempt disabled in ring_buffer_lock_reserve
+ *
+ * The implementation is similar to the function __ftrace_trace_stack.
+ */
+void kp_transport_print_kstack(ktap_state_t *ks, uint16_t depth, uint16_t skip)
+{
+	struct ring_buffer *buffer = G(ks)->buffer;
+	struct ring_buffer_event *event;
+	struct trace_entry *entry;
+	int size;
+
+	size = depth * sizeof(unsigned long);
+	event = ring_buffer_lock_reserve(buffer, sizeof(*entry) + size);
+	if (!event) {
+		KTAP_STATS(ks)->events_missed += 1;
+		return;
+	} else {
+		struct stack_trace trace;
+
+		entry = ring_buffer_event_data(event);
+		tracing_generic_entry_update(entry, 0, 0);
+		entry->type = TRACE_STACK;
+
+		trace.nr_entries = 0;
+		trace.skip = skip;
+		trace.max_entries = depth;
+		trace.entries = (unsigned long *)(entry + 1);
+		save_stack_trace(&trace);
+
+		ring_buffer_unlock_commit(buffer, event);
+	}
+}
+
+void kp_transport_event_write(ktap_state_t *ks, struct ktap_event_data *e)
+{
+	struct ring_buffer *buffer = G(ks)->buffer;
+	struct ring_buffer_event *event;
+	struct trace_entry *ev_entry = e->data->raw->data;
+	struct trace_entry *entry;
+	int entry_size = e->data->raw->size;
+
+	event = ring_buffer_lock_reserve(buffer, entry_size +
+					 sizeof(struct ftrace_event_call *));
+	if (!event) {
+		KTAP_STATS(ks)->events_missed += 1;
+		return;
+	} else {
+		entry = ring_buffer_event_data(event);
+
+		memcpy(entry, ev_entry, entry_size);
+
+		ring_buffer_unlock_commit(buffer, event);
+	}
+}
+
+void kp_transport_write(ktap_state_t *ks, const void *data, size_t length)
+{
+	struct ring_buffer *buffer = G(ks)->buffer;
+	struct ring_buffer_event *event;
+	struct trace_entry *entry;
+	int size;
+
+	size = sizeof(struct trace_entry) + length;
+
+	event = ring_buffer_lock_reserve(buffer, size);
+	if (!event) {
+		KTAP_STATS(ks)->events_missed += 1;
+		return;
+	} else {
+		entry = ring_buffer_event_data(event);
+
+		tracing_generic_entry_update(entry, 0, 0);
+		entry->type = TRACE_PRINT;
+		memcpy(entry + 1, data, length);
+
+		ring_buffer_unlock_commit(buffer, event);
+	}
+}
+
+/* general print function */
+void kp_printf(ktap_state_t *ks, const char *fmt, ...)
+{
+	char buff[1024];
+	va_list args;
+	int len;
+
+	va_start(args, fmt);
+	len = vscnprintf(buff, 1024, fmt, args);
+	va_end(args);
+
+	buff[len] = '\0';
+	kp_transport_write(ks, buff, len + 1);
+}
+
+void __kp_puts(ktap_state_t *ks, const char *str)
+{
+	kp_transport_write(ks, str, strlen(str) + 1);
+}
+
+void __kp_bputs(ktap_state_t *ks, const char *str)
+{
+	struct ring_buffer *buffer = G(ks)->buffer;
+	struct ring_buffer_event *event;
+	struct trace_entry *entry;
+	int size;
+
+	size = sizeof(struct trace_entry) + sizeof(unsigned long *);
+
+	event = ring_buffer_lock_reserve(buffer, size);
+	if (!event) {
+		KTAP_STATS(ks)->events_missed += 1;
+		return;
+	} else {
+		entry = ring_buffer_event_data(event);
+
+		tracing_generic_entry_update(entry, 0, 0);
+		entry->type = TRACE_BPUTS;
+		*(unsigned long *)(entry + 1) = (unsigned long)str;
+
+		ring_buffer_unlock_commit(buffer, event);
+	}
+}
+
+void kp_transport_exit(ktap_state_t *ks)
+{
+	if (G(ks)->buffer)
+		ring_buffer_free(G(ks)->buffer);
+	debugfs_remove(G(ks)->trace_pipe_dentry);
+}
+
+#define TRACE_BUF_SIZE_DEFAULT	1441792UL /* 16384 * 88 (sizeof(entry)) */
+
+int kp_transport_init(ktap_state_t *ks, struct dentry *dir)
+{
+	struct ring_buffer *buffer;
+	struct dentry *dentry;
+	char filename[32] = {0};
+
+#ifdef CONFIG_PPC64
+	ftrace_find_event = (void *)kallsyms_lookup_name(".ftrace_find_event");
+#else
+	ftrace_find_event = (void *)kallsyms_lookup_name("ftrace_find_event");
+#endif
+	if (!ftrace_find_event) {
+		pr_err("ktap: cannot lookup ftrace_find_event in kallsyms\n");
+		return -EINVAL;
+	}
+
+	buffer = ring_buffer_alloc(TRACE_BUF_SIZE_DEFAULT, RB_FL_OVERWRITE);
+	if (!buffer)
+		return -ENOMEM;
+
+	sprintf(filename, "trace_pipe_%d", (int)task_tgid_vnr(current));
+
+	dentry = debugfs_create_file(filename, 0444, dir,
+				     ks, &tracing_pipe_fops);
+	if (!dentry) {
+		pr_err("ktapvm: cannot create trace_pipe file in debugfs\n");
+		ring_buffer_free(buffer);
+		return -1;
+	}
+
+	G(ks)->buffer = buffer;
+	G(ks)->trace_pipe_dentry = dentry;
+
+	return 0;
+}
+
diff --git a/kernel/trace/ktap/kp_transport.h b/kernel/trace/ktap/kp_transport.h
new file mode 100644
index 0000000..bc7c892
--- /dev/null
+++ b/kernel/trace/ktap/kp_transport.h
@@ -0,0 +1,13 @@
+#ifndef __KTAP_TRANSPORT_H__
+#define __KTAP_TRANSPORT_H__
+
+void kp_transport_write(ktap_state_t *ks, const void *data, size_t length);
+void kp_transport_event_write(ktap_state_t *ks, struct ktap_event_data *e);
+void kp_transport_print_kstack(ktap_state_t *ks, uint16_t depth, uint16_t skip);
+void *kp_transport_reserve(ktap_state_t *ks, size_t length);
+void kp_transport_exit(ktap_state_t *ks);
+int kp_transport_init(ktap_state_t *ks, struct dentry *dir);
+
+int _trace_seq_puts(struct trace_seq *s, const char *str);
+
+#endif /* __KTAP_TRANSPORT_H__ */
-- 
1.8.1.4



* [PATCH v2 14/29] ktap: add events management(kernel/trace/ktap/kp_events.[c|h])
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

kp_events.c handles ktap event management (registration, destruction,
event callbacks).

This file is the core event management interface between ktap and the kernel.

Exposed functions:
1). kp_events_init/kp_events_exit

2). kp_event_create_kprobe
        create kprobe event, for example:
                kdebug.kprobe("SyS_futex", function () {})

3). kp_event_create_tracepoint
        create tracepoint event, for example:
                kdebug.tracepoint("sys_futex_enter", function () {})

4). kp_event_create
        create a perf backend event, for example:
                trace sched:sched_switch { print(argstr) }

        It calls the kernel function 'perf_event_create_kernel_counter' to
        register the event (tracepoint/kprobe/uprobe).

5). kp_event_getarg
        get an argument of the event, from arg0 to arg9;
        it can only be called in probe context.
                trace sched:sched_switch { print(arg0, arg1) }

6). kp_event_stringify/kp_event_tostr
        stringify argstr. When argstr is stored as a table key,
        it has to be stringified first, like below:
                var s={} trace sched:sched_switch { s[argstr] += 1 }
        (This is a rare use case, but ktap supports it.)
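
To illustrate item 5, here is a minimal userspace sketch of how an argN
lookup can decode a field from a raw event record using a per-field type and
byte offset. It mirrors the size-based classification that init_event_fields()
in this patch performs, but the names field_desc, classify_field and
read_number are hypothetical and exist only for this illustration; the real
code works on struct ktap_event_field and trace entries inside the kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-ins for ktap's field descriptors. */
enum field_type { FIELD_INT, FIELD_LONG, FIELD_STRING, FIELD_INVALID };

struct field_desc {
	enum field_type type;
	size_t offset;	/* byte offset of the field inside the raw record */
};

/*
 * Classify a field in the same order as the patch: by size first
 * (4 or 8 bytes), then by a "char" prefix in the type name.
 */
static enum field_type classify_field(const char *type_name, size_t size)
{
	if (size == 4)
		return FIELD_INT;
	if (size == 8)
		return FIELD_LONG;
	if (!strncmp(type_name, "char", 4))
		return FIELD_STRING;
	return FIELD_INVALID;
}

/*
 * Read a numeric field from the record. memcpy is used instead of a
 * pointer cast so that unaligned offsets are safe in this sketch.
 * Non-numeric fields yield 0 here.
 */
static int64_t read_number(const void *record, const struct field_desc *f)
{
	const unsigned char *p = (const unsigned char *)record + f->offset;
	int32_t v32;
	int64_t v64;

	switch (f->type) {
	case FIELD_INT:
		memcpy(&v32, p, sizeof(v32));
		return v32;
	case FIELD_LONG:
		memcpy(&v64, p, sizeof(v64));
		return v64;
	default:
		return 0;
	}
}
```

The descriptors are computed once at registration time, so each argN access
in probe context is only an offset addition and a load.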

Note:
Why does ktap support 'kdebug.kprobe' and 'kdebug.tracepoint' when
it already supports perf backend events (trace xxx {})?

Because benchmarks show that the raw kprobe and tracepoint interface is
faster than perf backend tracing, by about 10% or more. It is also fairer
to compare with SystemTap using the raw tracing syntax rather than perf
backend tracing.

perf backend tracing has a long code path before it reaches the ktap
callback, and it needs to copy the event buffer first.
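
The difference between the two paths can be pictured with a toy userspace
model. This is not ktap or perf code, and it does not reproduce the benchmark
mentioned above; dispatch_buffered, dispatch_raw, add_handler and bytes_copied
are hypothetical names. The buffered path serializes the arguments into a
record before the handler runs, while the raw path hands them over directly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type: receives two probe arguments. */
typedef long (*handler_fn)(long a, long b);

/* Bookkeeping for the illustration only: bytes staged in a record. */
static size_t bytes_copied;

/*
 * "Buffered" path: write the arguments into a temporary record,
 * then decode them again before calling the handler.
 */
static long dispatch_buffered(handler_fn h, long a, long b)
{
	unsigned char record[2 * sizeof(long)];
	long ra, rb;

	memcpy(record, &a, sizeof(a));
	memcpy(record + sizeof(a), &b, sizeof(b));
	bytes_copied += sizeof(record);

	memcpy(&ra, record, sizeof(ra));
	memcpy(&rb, record + sizeof(ra), sizeof(rb));
	return h(ra, rb);
}

/* "Raw" path: pass the arguments straight to the handler, no staging. */
static long dispatch_raw(handler_fn h, long a, long b)
{
	return h(a, b);
}

/* Example handler used to compare both paths. */
static long add_handler(long a, long b)
{
	return a + b;
}
```

Both paths produce the same result; only the buffered one pays for the
intermediate copy, which is the extra work the note above refers to.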

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 kernel/trace/ktap/kp_events.c | 832 ++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/ktap/kp_events.h |  71 ++++
 2 files changed, 903 insertions(+)
 create mode 100644 kernel/trace/ktap/kp_events.c
 create mode 100644 kernel/trace/ktap/kp_events.h

diff --git a/kernel/trace/ktap/kp_events.c b/kernel/trace/ktap/kp_events.c
new file mode 100644
index 0000000..1aabe80
--- /dev/null
+++ b/kernel/trace/ktap/kp_events.c
@@ -0,0 +1,832 @@
+/*
+ * kp_events.c - ktap events management (registry, destroy, event callback)
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <linux/module.h>
+#include <linux/ctype.h>
+#include <linux/slab.h>
+#include <linux/version.h>
+#include <asm/syscall.h>
+#include <uapi/ktap/ktap_types.h>
+#include "ktap.h"
+#include "kp_obj.h"
+#include "kp_str.h"
+#include "kp_transport.h"
+#include "kp_vm.h"
+#include "kp_events.h"
+
+const char *kp_event_tostr(ktap_state_t *ks)
+{
+	struct ktap_event_data *e = ks->current_event;
+	struct ftrace_event_call *call;
+	struct trace_iterator *iter;
+	struct trace_event *ev;
+	enum print_line_t ret = TRACE_TYPE_NO_CONSUME;
+	static const char *dummy_msg = "argstr_not_available";
+
+	/* need to check that the current context is a valid tracing context */
+	if (!ks->current_event) {
+		kp_error(ks, "cannot stringify event str in invalid context\n");
+		return NULL;
+	}
+
+	/* check if stringified before */
+	if (ks->current_event->argstr)
+		return getstr(ks->current_event->argstr);
+
+	/* timer event and raw tracepoint don't have associated argstr */
+	if (e->event->type == KTAP_EVENT_TYPE_PERF && e->event->perf->tp_event)
+		call = e->event->perf->tp_event;
+	else
+		return dummy_msg;
+
+	/* Simulate the iterator */
+
+	/*
+	 * use a temp percpu buffer as the trace_iterator;
+	 * we cannot reuse print_buffer because we may be called from printf.
+	 */
+	iter = kp_this_cpu_temp_buffer(ks);
+
+	trace_seq_init(&iter->seq);
+	iter->ent = e->data->raw->data;
+
+	ev = &(call->event);
+	if (ev)
+		ret = ev->funcs->trace(iter, 0, ev);
+
+	if (ret != TRACE_TYPE_NO_CONSUME) {
+		struct trace_seq *s = &iter->seq;
+		int len = s->len >= PAGE_SIZE ? PAGE_SIZE - 1 : s->len;
+
+		s->buffer[len] = '\0';
+		return &s->buffer[0];
+	}
+
+	return dummy_msg;
+}
+
+/* return string repr of 'argstr' */
+const ktap_str_t *kp_event_stringify(ktap_state_t *ks)
+{
+	const char *str;
+	ktap_str_t *ts;
+
+	/* check if stringified before */
+	if (ks->current_event->argstr)
+		return ks->current_event->argstr;
+
+	str = kp_event_tostr(ks);
+	if (!str)
+		return NULL;
+
+	ts = kp_str_newz(ks, str);
+	ks->current_event->argstr = ts;
+	return ts;
+}
+
+/*
+ * This definition must be kept in sync with kernel/trace/trace.h
+ * TODO: export this struct in kernel
+ */
+struct ftrace_event_field {
+	struct list_head        link;
+	const char              *name;
+	const char              *type;
+	int                     filter_type;
+	int                     offset;
+	int                     size;
+	int                     is_signed;
+};
+
+static struct list_head *get_fields(struct ftrace_event_call *event_call)
+{
+	if (!event_call->class->get_fields)
+		return &event_call->class->fields;
+	return event_call->class->get_fields(event_call);
+}
+
+void kp_event_getarg(ktap_state_t *ks, ktap_val_t *ra, int idx)
+{
+	struct ktap_event_data *e = ks->current_event;
+	struct ktap_event *event = e->event;
+	struct ktap_event_field *event_fields = &event->fields[idx];
+
+	switch (event_fields->type)  {
+	case KTAP_EVENT_FIELD_TYPE_INT: {
+		struct trace_entry *entry = e->data->raw->data;
+		void *value = (unsigned char *)entry + event_fields->offset;
+		int n = *(int *)value;
+		set_number(ra, n);
+		return;
+		}
+	case KTAP_EVENT_FIELD_TYPE_LONG: {
+		struct trace_entry *entry = e->data->raw->data;
+		void *value = (unsigned char *)entry + event_fields->offset;
+		long n = *(long *)value;
+		set_number(ra, n);
+		return;
+		}
+	case KTAP_EVENT_FIELD_TYPE_STRING: {
+		struct trace_entry *entry = e->data->raw->data;
+		ktap_str_t *ts;
+		void *value = (unsigned char *)entry + event_fields->offset;
+		ts = kp_str_newz(ks, (char *)value);
+		if (ts)
+			set_string(ra, ts);
+		else
+			set_nil(ra);
+		return;
+		}
+	case KTAP_EVENT_FIELD_TYPE_CONST: {
+		set_number(ra, (ktap_number)event_fields->offset);
+		return;
+		}
+	case KTAP_EVENT_FIELD_TYPE_REGESTER: {
+		unsigned long *reg = (unsigned long *)((u8 *)e->regs +
+					event_fields->offset);
+		set_number(ra, *reg);
+		return;
+		}
+	case KTAP_EVENT_FIELD_TYPE_NIL:
+		set_nil(ra);
+		return;
+	case KTAP_EVENT_FIELD_TYPE_INVALID:
+		kp_error(ks, "the field type is not supported yet\n");
+		set_nil(ra);
+		return;
+	}
+}
+
+/* init all fields of event, for quick arg1..arg9 access */
+static int init_event_fields(ktap_state_t *ks, struct ktap_event *event)
+{
+	struct ftrace_event_call *event_call = event->perf->tp_event; 
+	struct ktap_event_field *event_fields = &event->fields[0];
+	struct ftrace_event_field *field;
+	struct list_head *head;
+	int idx = 0, n = 0;
+
+	/* only init fields for tracepoint, not timer event */
+	if (!event_call)
+		return 0;
+
+	/* intern probename */
+	event->name = kp_str_newz(ks, event_call->name);
+	if (unlikely(!event->name))
+		return -ENOMEM;
+
+	head = get_fields(event_call);
+	list_for_each_entry_reverse(field, head, link) {
+		if (n++ == 9) {
+			/*
+			 * Some events have more than 9 fields; just ignore
+			 * the remaining fields for now.
+			 *
+			 * TODO: support access to all tracepoint event fields
+			 *
+			 * Examples: mce:mce_record, ext4:ext4_writepages, ...
+			 */
+			return 0;
+		}
+
+		event_fields[idx].offset = field->offset;
+
+		if (field->size == 4) {
+			event_fields[idx].type = KTAP_EVENT_FIELD_TYPE_INT;
+			idx++;
+			continue;
+		} else if (field->size == 8) {
+			event_fields[idx].type = KTAP_EVENT_FIELD_TYPE_LONG;
+			idx++;
+			continue;
+		}
+		if (!strncmp(field->type, "char", 4)) {
+			event_fields[idx].type = KTAP_EVENT_FIELD_TYPE_STRING;
+			idx++;
+			continue;
+		}
+
+		/* TODO: add more type check */
+		event_fields[idx++].type = KTAP_EVENT_FIELD_TYPE_INVALID;
+	}
+
+	/* init all remaining fields as NIL */
+	while (idx < 9)
+		event_fields[idx++].type = KTAP_EVENT_FIELD_TYPE_NIL;
+
+	return 0;
+}
+
+static inline void call_probe_closure(ktap_state_t *mainthread,
+				      ktap_func_t *fn,
+				      struct ktap_event_data *e, int rctx)
+{
+	ktap_state_t *ks;
+	ktap_val_t *func;
+
+	ks = kp_vm_new_thread(mainthread, rctx);
+	set_func(ks->top, fn);
+	func = ks->top;
+	incr_top(ks);
+
+	ks->current_event = e;
+
+	kp_vm_call(ks, func, 0);
+
+	ks->current_event = NULL;
+	kp_vm_exit_thread(ks);
+}
+
+/*
+ * Callback tracing function for the perf event subsystem.
+ *
+ * To keep ktap reentrant, don't disable irqs in the callback function,
+ * same as perf and ftrace. Reentrancy requires some percpu data for
+ * context isolation (irq/softirq/nmi/process).
+ *
+ * The recursion check here is mainly meant to avoid corrupting
+ * ktap_state_t with a timer closure callback. Tracepoint recursion
+ * is already handled by the perf core.
+ *
+ * Note that the tracepoint handler is called under rcu_read_lock.
+ */
+static void perf_callback(struct perf_event *perf_event,
+			   struct perf_sample_data *data,
+			   struct pt_regs *regs)
+{
+	struct ktap_event *event;
+	struct ktap_event_data e;
+	ktap_state_t *ks;
+	int rctx;
+
+	event = perf_event->overflow_handler_context;
+	ks = event->ks;
+
+	if (unlikely(ks->stop))
+		return;
+
+	rctx = get_recursion_context(ks);
+	if (unlikely(rctx < 0))
+		return;
+
+	e.event = event;
+	e.data = data;
+	e.regs = regs;
+	e.argstr = NULL;
+
+	call_probe_closure(ks, event->fn, &e, rctx);
+
+	put_recursion_context(ks, rctx);
+}
+
+/*
+ * Generic ktap event creation function (based on perf callback),
+ * used for tracepoints/kprobe/uprobe/profile-timer/hw_breakpoint/pmu.
+ */
+int kp_event_create(ktap_state_t *ks, struct perf_event_attr *attr,
+		    struct task_struct *task, const char *filter,
+		    ktap_func_t *fn)
+{
+	struct ktap_event *event;
+	struct perf_event *perf_event;
+	void *callback = perf_callback;
+	int cpu, ret;
+
+	if (G(ks)->parm->dry_run)
+		callback = NULL;
+
+	/*
+	 * don't start tracing until ktap_wait, because:
+	 * 1). some events may hit before the filter is applied
+	 * 2). it is simpler to manage the tracing thread
+	 * 3). it avoids a race with mainthread.
+	 *
+	 * Another way to do this is to set attr.disabled to 1, then call
+	 * perf_event_enable after applying the filter; however, that function
+	 * was not exported in kernels older than 3.3, so we drop this method.
+	 */
+	ks->stop = 1;
+
+	for_each_cpu(cpu, G(ks)->cpumask) {
+		event = kzalloc(sizeof(struct ktap_event), GFP_KERNEL);
+		if (!event)
+			return -ENOMEM;
+
+		event->type = KTAP_EVENT_TYPE_PERF;
+		event->ks = ks;
+		event->fn = fn;
+		perf_event = perf_event_create_kernel_counter(attr, cpu, task,
+							      callback, event);
+		if (IS_ERR(perf_event)) {
+			int err = PTR_ERR(perf_event);
+			kp_error(ks, "unable to register perf event: "
+				     "[cpu: %d; id: %d; err: %d]\n",
+				     cpu, attr->config, err);
+			kfree(event);
+			return err;
+		}
+
+		if (attr->type == PERF_TYPE_TRACEPOINT) {
+			const char *name = perf_event->tp_event->name;
+			kp_verbose_printf(ks, "enable perf event: "
+					      "[cpu: %d; id: %d; name: %s; "
+					      "filter: %s; pid: %d]\n",
+					      cpu, attr->config, name, filter,
+					      task ? task_tgid_vnr(task) : -1);
+		} else if (attr->type == PERF_TYPE_SOFTWARE &&
+			 attr->config == PERF_COUNT_SW_CPU_CLOCK) {
+			kp_verbose_printf(ks, "enable profile event: "
+					      "[cpu: %d; sample_period: %d]\n",
+					      cpu, attr->sample_period);
+		} else {
+			kp_verbose_printf(ks, "unknown perf event type\n");
+		}
+
+		event->perf = perf_event;
+		INIT_LIST_HEAD(&event->list);
+		list_add_tail(&event->list, &G(ks)->events_head);
+
+		if (init_event_fields(ks, event)) {
+			kp_error(ks, "unable to init event fields id %d\n",
+					attr->config);
+			perf_event_release_kernel(event->perf);
+			list_del(&event->list);
+			kfree(event);
+			return -ENOMEM;
+		}
+
+		if (!filter)
+			continue;
+
+		ret = kp_ftrace_profile_set_filter(perf_event, attr->config,
+						   filter);
+		if (ret) {
+			kp_error(ks, "unable to set event filter: "
+				     "[id: %d; filter: %s; ret: %d]\n",
+				     attr->config, filter, ret);
+			perf_event_release_kernel(event->perf);
+			list_del(&event->list);
+			kfree(event);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * Ignore the function prototype here, just use the first argument.
+ */
+static void probe_callback(void *__data)
+{
+	struct ktap_event *event = __data;
+	ktap_state_t *ks = event->ks;
+	struct ktap_event_data e;
+	struct pt_regs regs; /* pt_regs may be large for the stack */
+	int rctx;
+
+	if (unlikely(ks->stop))
+		return;
+
+	rctx = get_recursion_context(ks);
+	if (unlikely(rctx < 0))
+		return;
+
+	perf_fetch_caller_regs(&regs);
+
+	e.event = event;
+	e.regs = &regs;
+	e.argstr = NULL;
+
+	call_probe_closure(ks, event->fn, &e, rctx);
+
+	put_recursion_context(ks, rctx);
+}
+
+/*
+ * syscall events handling
+ */
+
+static DEFINE_MUTEX(syscall_trace_lock);
+static DECLARE_BITMAP(enabled_enter_syscalls, NR_syscalls);
+static DECLARE_BITMAP(enabled_exit_syscalls, NR_syscalls);
+static int sys_refcount_enter;
+static int sys_refcount_exit;
+
+static int get_syscall_num(const char *name)
+{
+	int i;
+
+	for (i = 0; i < NR_syscalls; i++) {
+		if (syscalls_metadata[i] &&
+		    !strcmp(name, syscalls_metadata[i]->name + 4))
+			return i;
+	}
+	return -1;
+}
+
+static void trace_syscall_enter(void *data, struct pt_regs *regs, long id)
+{
+	struct ktap_event *event = data;
+	ktap_state_t *ks = event->ks;
+	struct ktap_event_data e;
+	int syscall_nr;
+	int rctx;
+
+	if (unlikely(ks->stop))
+		return;
+
+	syscall_nr = syscall_get_nr(current, regs);
+	if (unlikely(syscall_nr < 0))
+		return;
+	if (!test_bit(syscall_nr, enabled_enter_syscalls))
+		return;
+
+	rctx = get_recursion_context(ks);
+	if (unlikely(rctx < 0))
+		return;
+
+	e.event = event;
+	e.regs = regs;
+	e.argstr = NULL;
+
+	call_probe_closure(ks, event->fn, &e, rctx);
+
+	put_recursion_context(ks, rctx);
+}
+
+static void trace_syscall_exit(void *data, struct pt_regs *regs, long id)
+{
+	struct ktap_event *event = data;
+	ktap_state_t *ks = event->ks;
+	struct ktap_event_data e;
+	int syscall_nr;
+	int rctx;
+
+	syscall_nr = syscall_get_nr(current, regs);
+	if (unlikely(syscall_nr < 0))
+		return;
+	if (!test_bit(syscall_nr, enabled_exit_syscalls))
+		return;
+
+	if (unlikely(ks->stop))
+		return;
+
+	rctx = get_recursion_context(ks);
+	if (unlikely(rctx < 0))
+		return;
+
+	e.event = event;
+	e.regs = regs;
+	e.argstr = NULL;
+
+	call_probe_closure(ks, event->fn, &e, rctx);
+
+	put_recursion_context(ks, rctx);
+}
+
+/* called in dry-run mode, to compare overhead with a normal vm call */
+static void dry_run_callback(void *data, struct pt_regs *regs, long id)
+{
+
+}
+
+static void init_syscall_event_fields(struct ktap_event *event, int is_enter)
+{
+	struct ftrace_event_call *event_call;
+	struct ktap_event_field *event_fields = &event->fields[0];
+	struct syscall_metadata *meta = syscalls_metadata[event->syscall_nr];
+	int idx = 0;
+
+	event_call = is_enter ? meta->enter_event : meta->exit_event;
+
+	event_fields[0].type = KTAP_EVENT_FIELD_TYPE_CONST;
+	event_fields[0].offset = event->syscall_nr;
+
+	if (!is_enter) {
+#ifdef CONFIG_X86_64
+		event_fields[1].type = KTAP_EVENT_FIELD_TYPE_REGESTER;
+		event_fields[1].offset = offsetof(struct pt_regs, ax);
+#endif
+		return;
+	}
+
+	while (idx++ < meta->nb_args) {
+		event_fields[idx].type = KTAP_EVENT_FIELD_TYPE_REGESTER;
+#ifdef CONFIG_X86_64
+		switch (idx) {
+		case 1:
+			event_fields[idx].offset = offsetof(struct pt_regs, di);
+			break;
+		case 2:
+			event_fields[idx].offset = offsetof(struct pt_regs, si);
+			break;
+		case 3:
+			event_fields[idx].offset = offsetof(struct pt_regs, dx);
+			break;
+		case 4:
+			event_fields[idx].offset =
+						offsetof(struct pt_regs, r10);
+			break;
+		case 5:
+			event_fields[idx].offset = offsetof(struct pt_regs, r8);
+			break;
+		case 6:
+			event_fields[idx].offset = offsetof(struct pt_regs, r9);
+			break;
+		}
+#else
+#error "don't support syscall tracepoint event register access in this arch, " \
+	"use 'trace syscalls:* {}' instead"
+#endif
+	}
+
+	/* init all remaining fields as NIL */
+	while (idx < 9)
+		event_fields[idx++].type = KTAP_EVENT_FIELD_TYPE_NIL;
+}
+
+static int syscall_event_register(ktap_state_t *ks, const char *event_name,
+				  struct ktap_event *event)
+{
+	int syscall_nr = 0, is_enter = 0;
+	void *callback = NULL;
+	int ret = 0;
+
+	if (!strncmp(event_name, "sys_enter_", 10)) {
+		is_enter = 1;
+		event->type = KTAP_EVENT_TYPE_SYSCALL_ENTER;
+		syscall_nr = get_syscall_num(event_name + 10);
+		callback = trace_syscall_enter;
+	} else if (!strncmp(event_name, "sys_exit_", 9)) {
+		is_enter = 0;
+		event->type = KTAP_EVENT_TYPE_SYSCALL_EXIT;
+		syscall_nr = get_syscall_num(event_name + 9);
+		callback = trace_syscall_exit;
+	}
+
+	if (G(ks)->parm->dry_run)
+		callback = dry_run_callback;
+
+	if (syscall_nr < 0)
+		return -1;
+
+	event->syscall_nr = syscall_nr;
+
+	init_syscall_event_fields(event, is_enter);
+
+	mutex_lock(&syscall_trace_lock);
+	if (is_enter) {
+		if (!sys_refcount_enter)
+			ret = register_trace_sys_enter(callback, event);
+		if (!ret) {
+			set_bit(syscall_nr, enabled_enter_syscalls);
+			sys_refcount_enter++;
+		}
+	} else {
+		if (!sys_refcount_exit)
+			ret = register_trace_sys_exit(callback, event);
+		if (!ret) {
+			set_bit(syscall_nr, enabled_exit_syscalls);
+			sys_refcount_exit++;
+		}
+	}
+	mutex_unlock(&syscall_trace_lock);
+
+	return ret;
+}
+
+static int syscall_event_unregister(ktap_state_t *ks, struct ktap_event *event)
+{
+	int ret = 0;
+	void *callback;
+
+	if (event->type == KTAP_EVENT_TYPE_SYSCALL_ENTER)
+		callback = trace_syscall_enter;
+	else
+		callback = trace_syscall_exit;
+
+	if (G(ks)->parm->dry_run)
+		callback = dry_run_callback;
+
+	mutex_lock(&syscall_trace_lock);
+	if (event->type == KTAP_EVENT_TYPE_SYSCALL_ENTER) {
+		sys_refcount_enter--;
+		clear_bit(event->syscall_nr, enabled_enter_syscalls);
+		if (!sys_refcount_enter)
+			unregister_trace_sys_enter(callback, event);
+	} else {
+		sys_refcount_exit--;
+		clear_bit(event->syscall_nr, enabled_exit_syscalls);
+		if (!sys_refcount_exit)
+			unregister_trace_sys_exit(callback, event);
+	}
+	mutex_unlock(&syscall_trace_lock);
+
+	return ret;
+}
+
+/*
+ * Register tracepoint event directly, not based on perf callback
+ *
+ * This tracing method should be faster than the perf callback,
+ * because it won't need to write trace data into any temp buffer,
+ * and code path is much shorter than perf callback.
+ */
+int kp_event_create_tracepoint(ktap_state_t *ks, const char *event_name,
+			       ktap_func_t *fn)
+{
+	struct ktap_event *event;
+	void *callback = probe_callback;
+	int is_syscall = 0;
+	int ret;
+
+	if (G(ks)->parm->dry_run)
+		callback = NULL;
+
+	if (!strncmp(event_name, "sys_enter_", 10) ||
+	    !strncmp(event_name, "sys_exit_", 9))
+		is_syscall = 1;
+
+	event = kzalloc(sizeof(struct ktap_event), GFP_KERNEL);
+	if (!event)
+		return -ENOMEM;
+
+	event->ks = ks;
+	event->fn = fn;
+	event->name = kp_str_newz(ks, event_name);
+	if (unlikely(!event->name)) {
+		kfree(event);
+		return -ENOMEM;
+	}
+
+	INIT_LIST_HEAD(&event->list);
+	list_add_tail(&event->list, &G(ks)->events_head);
+
+	if (is_syscall) {
+		ret = syscall_event_register(ks, event_name, event);
+	} else {
+		event->type = KTAP_EVENT_TYPE_TRACEPOINT;
+		ret = tracepoint_probe_register(event_name, callback, event);
+	}
+
+	if (ret) {
+		kp_error(ks, "register tracepoint %s failed, ret: %d\n",
+				event_name, ret);
+		list_del(&event->list);
+		kfree(event);
+		return ret;
+	}
+	return 0;
+}
+
+/* kprobe handler */
+static int __kprobes pre_handler_kprobe(struct kprobe *p, struct pt_regs *regs)
+{
+	struct ktap_event *event = container_of(p, struct ktap_event, kp);
+	ktap_state_t *ks = event->ks;
+	struct ktap_event_data e;
+	int rctx;
+
+	if (unlikely(ks->stop))
+		return 0;
+
+	rctx = get_recursion_context(ks);
+	if (unlikely(rctx < 0))
+		return 0;
+
+	e.event = event;
+	e.regs = regs;
+	e.argstr = NULL;
+
+	call_probe_closure(ks, event->fn, &e, rctx);
+
+	put_recursion_context(ks, rctx);
+	return 0;
+}
+
+/*
+ * Register kprobe event directly, not based on perf callback
+ *
+ * This tracing method should be faster than the perf callback,
+ * because it won't need to write trace data into any temp buffer,
+ * and code path is much shorter than perf callback.
+ */
+int kp_event_create_kprobe(ktap_state_t *ks, const char *event_name,
+			   ktap_func_t *fn)
+{
+	struct ktap_event *event;
+	void *callback = pre_handler_kprobe;
+	int ret;
+
+	if (G(ks)->parm->dry_run)
+		callback = NULL;
+
+	event = kzalloc(sizeof(struct ktap_event), GFP_KERNEL);
+	if (!event)
+		return -ENOMEM;
+
+	event->ks = ks;
+	event->fn = fn;
+	event->name = kp_str_newz(ks, event_name);
+	if (unlikely(!event->name)) {
+		kfree(event);
+		return -ENOMEM;
+	}
+
+	INIT_LIST_HEAD(&event->list);
+	list_add_tail(&event->list, &G(ks)->events_head);
+
+	event->type = KTAP_EVENT_TYPE_KPROBE;
+
+	event->kp.symbol_name = event_name;
+	event->kp.pre_handler = callback;
+	ret = register_kprobe(&event->kp);
+	if (ret) {
+		kp_error(ks, "register kprobe event %s failed, ret: %d\n",
+				event_name, ret);
+		list_del(&event->list);
+		kfree(event);
+		return ret;
+	}
+	return 0;
+}
+
+
+static void events_destroy(ktap_state_t *ks)
+{
+	struct ktap_event *event;
+	struct list_head *tmp, *pos;
+	struct list_head *head = &G(ks)->events_head;
+
+	list_for_each(pos, head) {
+		event = container_of(pos, struct ktap_event,
+					   list);
+		if (event->type == KTAP_EVENT_TYPE_PERF)
+			perf_event_release_kernel(event->perf);
+		else if (event->type == KTAP_EVENT_TYPE_TRACEPOINT)
+			tracepoint_probe_unregister(getstr(event->name),
+						    probe_callback, event);
+		else if (event->type == KTAP_EVENT_TYPE_SYSCALL_ENTER ||
+			 event->type == KTAP_EVENT_TYPE_SYSCALL_EXIT )
+			syscall_event_unregister(ks, event);
+		else if (event->type == KTAP_EVENT_TYPE_KPROBE)
+			unregister_kprobe(&event->kp);
+	}
+	/*
+	 * Ensure our callback won't be called anymore. The buffers
+	 * will be freed after that.
+	 */
+	tracepoint_synchronize_unregister();
+
+	list_for_each_safe(pos, tmp, head) {
+		event = container_of(pos, struct ktap_event,
+					   list);
+		list_del(&event->list);
+		kfree(event);
+	}
+}
+
+void kp_events_exit(ktap_state_t *ks)
+{
+	if (!G(ks)->trace_enabled)
+		return;
+
+	events_destroy(ks);
+
+	/* call trace_end_closure after all events are unregistered */
+	if ((G(ks)->state != KTAP_ERROR) && G(ks)->trace_end_closure) {
+		G(ks)->state = KTAP_TRACE_END;
+		set_func(ks->top, G(ks)->trace_end_closure);
+		incr_top(ks);
+		kp_vm_call(ks, ks->top - 1, 0);
+		G(ks)->trace_end_closure = NULL;
+	}
+
+	G(ks)->trace_enabled = 0;
+}
+
+int kp_events_init(ktap_state_t *ks)
+{
+	G(ks)->trace_enabled = 1;
+	return 0;
+}
+
diff --git a/kernel/trace/ktap/kp_events.h b/kernel/trace/ktap/kp_events.h
new file mode 100644
index 0000000..b24f723
--- /dev/null
+++ b/kernel/trace/ktap/kp_events.h
@@ -0,0 +1,71 @@
+#ifndef __KTAP_EVENTS_H__
+#define __KTAP_EVENTS_H__
+
+#include <linux/ftrace_event.h>
+#include <trace/syscall.h>
+#include <trace/events/syscalls.h>
+#include <linux/syscalls.h>
+#include <linux/kprobes.h>
+
+enum KTAP_EVENT_FIELD_TYPE {
+	KTAP_EVENT_FIELD_TYPE_INVALID = 0, /* arg type not supported yet */
+
+	KTAP_EVENT_FIELD_TYPE_INT,
+	KTAP_EVENT_FIELD_TYPE_LONG,
+	KTAP_EVENT_FIELD_TYPE_STRING,
+
+	KTAP_EVENT_FIELD_TYPE_REGESTER,
+	KTAP_EVENT_FIELD_TYPE_CONST,
+	KTAP_EVENT_FIELD_TYPE_NIL /* arg does not exist */
+};
+
+struct ktap_event_field {
+	enum KTAP_EVENT_FIELD_TYPE type;
+	int offset;
+};
+
+enum KTAP_EVENT_TYPE {
+	KTAP_EVENT_TYPE_PERF,
+	KTAP_EVENT_TYPE_TRACEPOINT,
+	KTAP_EVENT_TYPE_SYSCALL_ENTER,
+	KTAP_EVENT_TYPE_SYSCALL_EXIT,
+	KTAP_EVENT_TYPE_KPROBE,
+};
+
+struct ktap_event {
+	struct list_head list;
+	int type;
+	ktap_state_t *ks;
+	ktap_func_t *fn;
+	struct perf_event *perf;
+	int syscall_nr; /* for syscall event */
+	struct ktap_event_field fields[9]; /* arg1..arg9 */
+	ktap_str_t *name; /* intern probename string */
+
+	struct kprobe kp; /* kprobe event */
+};
+
+/* this structure is allocated on the stack */
+struct ktap_event_data {
+	struct ktap_event *event;
+	struct perf_sample_data *data;
+	struct pt_regs *regs;
+	ktap_str_t *argstr; /* for cache argstr intern string */
+};
+
+int kp_events_init(ktap_state_t *ks);
+void kp_events_exit(ktap_state_t *ks);
+
+int kp_event_create(ktap_state_t *ks, struct perf_event_attr *attr,
+		    struct task_struct *task, const char *filter,
+		    ktap_func_t *fn);
+int kp_event_create_tracepoint(ktap_state_t *ks, const char *event_name,
+			       ktap_func_t *fn);
+
+int kp_event_create_kprobe(ktap_state_t *ks, const char *event_name,
+			   ktap_func_t *fn);
+void kp_event_getarg(ktap_state_t *ks, ktap_val_t *ra, int idx);
+const char *kp_event_tostr(ktap_state_t *ks);
+const ktap_str_t *kp_event_stringify(ktap_state_t *ks);
+
+#endif /* __KTAP_EVENTS_H__ */
-- 
1.8.1.4
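
Note on syscall_event_register()/syscall_event_unregister() above: all ktap
syscall events share a single sys_enter (or sys_exit) probe, tracked by a
refcount, while a bitmap records which syscall numbers are enabled. The
userspace sketch below models only that bookkeeping. All demo_ names are
hypothetical, attaching the probe is reduced to setting a flag, and there is
no locking here, whereas the patch holds syscall_trace_lock around these
updates.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_NR_SYSCALLS	512	/* hypothetical table size */
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NR_WORDS \
	((DEMO_NR_SYSCALLS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

static unsigned long demo_enabled[DEMO_NR_WORDS];	/* one bit per syscall */
static int demo_refcount;		/* number of registered events */
static int demo_probe_attached;		/* 1 while the shared probe is attached */

/* Register one syscall event; only the first user attaches the probe. */
static int demo_register(int nr)
{
	assert(nr >= 0 && nr < DEMO_NR_SYSCALLS);

	if (demo_refcount == 0)
		demo_probe_attached = 1;

	demo_enabled[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
	demo_refcount++;
	return 0;
}

/* Unregister one syscall event; the last user detaches the probe. */
static void demo_unregister(int nr)
{
	assert(nr >= 0 && nr < DEMO_NR_SYSCALLS);
	assert(demo_refcount > 0);

	demo_refcount--;
	demo_enabled[nr / DEMO_BITS_PER_LONG] &= ~(1UL << (nr % DEMO_BITS_PER_LONG));
	if (demo_refcount == 0)
		demo_probe_attached = 0;
}

/* What the shared probe would check before running a script closure. */
static int demo_is_enabled(int nr)
{
	assert(nr >= 0 && nr < DEMO_NR_SYSCALLS);
	return (demo_enabled[nr / DEMO_BITS_PER_LONG] >>
		(nr % DEMO_BITS_PER_LONG)) & 1UL;
}
```

The point of the split is that the refcount decides when the shared probe is
attached or detached, and the bitmap decides which syscalls that probe acts on.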



* [PATCH v2 15/29] ktap: add built-in functions and library(kernel/trace/ktap/lib_*.c)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (13 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 14/29] ktap: add events management(kernel/trace/ktap/kp_events.[c|h]) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 16/29] ktap: add amalgamation build(kernel/trace/ktap/amalg.c) Jovi Zhangwei
                   ` (14 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

ktap registers its built-in functions and libraries into a table.

1). Built-in functions(lib_base.c):

print, printf, print_hist, pairs, len, delete, stack,
print_trace_clock, num_cpus, arch, kernel_v, kernel_string,
user_string, stringof, ipof, gettimeofday_ns, gettimeofday_us,
gettimeofday_ms, gettimeofday_s, curr_taskinfo, in_iowait,
in_interrupt, exit.

2). ANSI library(lib_ansi.c):

ansi.clear_screen
ansi.set_color
ansi.set_color2
ansi.set_color3
ansi.reset_color
ansi.new_line

3). kdebug library(lib_kdebug.c):

kdebug.trace_by_id
kdebug.trace_end
kdebug.tracepoint
kdebug.kprobe

4). net library(lib_net.c):

net.ip_sock_saddr
net.ip_sock_daddr
net.format_ip_addr

5). table library(lib_table.c):

table.new

6). timer library(lib_timer.c):

timer.profile
timer.tick
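
As a rough illustration of the table idea, here is a userspace sketch of a
name-to-function table terminated by a {NULL} entry, the same shape as the
ansi_lib_funcs[] array in lib_ansi.c. Everything prefixed demo_ is
hypothetical, and the lookup helper is only an assumption about how such a
table can be consumed; it is not the kp_vm_register_lib() implementation,
which is not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ktap_state_t: just an output buffer. */
struct demo_state {
	char out[64];
};

typedef int (*demo_cfunc)(struct demo_state *);

/* Same shape as ktap_libfunc_t in the patch: a name plus a C function. */
struct demo_libfunc {
	const char *name;
	demo_cfunc func;
};

static int demo_reset_color(struct demo_state *s)
{
	strcpy(s->out, "\033[0;0m");	/* SGR reset, as in ansi.reset_color */
	return 0;
}

static int demo_new_line(struct demo_state *s)
{
	strcpy(s->out, "\n");
	return 0;
}

static const struct demo_libfunc demo_ansi_funcs[] = {
	{"reset_color", demo_reset_color},
	{"new_line", demo_new_line},
	{NULL}				/* sentinel: marks the end of the table */
};

/* Walk the table until the NULL-name sentinel; return the match or NULL. */
static demo_cfunc demo_lookup(const struct demo_libfunc *tbl, const char *name)
{
	assert(tbl && name);

	for (; tbl->name; tbl++)
		if (strcmp(tbl->name, name) == 0)
			return tbl->func;
	return NULL;
}
```

For example, demo_lookup(demo_ansi_funcs, "new_line") returns demo_new_line,
while a name that is not in the table returns NULL.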

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 kernel/trace/ktap/lib_ansi.c   | 142 ++++++++++++++
 kernel/trace/ktap/lib_base.c   | 407 +++++++++++++++++++++++++++++++++++++++++
 kernel/trace/ktap/lib_kdebug.c | 195 ++++++++++++++++++++
 kernel/trace/ktap/lib_net.c    | 107 +++++++++++
 kernel/trace/ktap/lib_table.c  |  58 ++++++
 kernel/trace/ktap/lib_timer.c  | 210 +++++++++++++++++++++
 6 files changed, 1119 insertions(+)
 create mode 100644 kernel/trace/ktap/lib_ansi.c
 create mode 100644 kernel/trace/ktap/lib_base.c
 create mode 100644 kernel/trace/ktap/lib_kdebug.c
 create mode 100644 kernel/trace/ktap/lib_net.c
 create mode 100644 kernel/trace/ktap/lib_table.c
 create mode 100644 kernel/trace/ktap/lib_timer.c

diff --git a/kernel/trace/ktap/lib_ansi.c b/kernel/trace/ktap/lib_ansi.c
new file mode 100644
index 0000000..04f0b9a
--- /dev/null
+++ b/kernel/trace/ktap/lib_ansi.c
@@ -0,0 +1,142 @@
+/*
+ * lib_ansi.c - ANSI escape sequences library
+ *
+ * http://en.wikipedia.org/wiki/ANSI_escape_code
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <uapi/ktap/ktap_types.h>
+#include "ktap.h"
+#include "kp_vm.h"
+
+/**
+ * function ansi.clear_screen - Move cursor to top left and clear screen.
+ *
+ * Description: Sends ansi code for moving cursor to top left and then the
+ * ansi code for clearing the screen from the cursor position to the end.
+ */
+
+static int kplib_ansi_clear_screen(ktap_state_t *ks)
+{
+	kp_printf(ks, "\033[1;1H\033[J");
+	return 0;
+}
+
+/**
+ * function ansi.set_color - Set the ansi Select Graphic Rendition mode.
+ * @fg: Foreground color to set.
+ *
+ * Description: Sends ansi code for Select Graphic Rendition mode for the
+ * given foreground color. Black (30), Blue (34), Green (32), Cyan (36),
+ * Red (31), Purple (35), Brown (33), Light Gray (37).
+ */
+
+static int kplib_ansi_set_color(ktap_state_t *ks)
+{
+	int fg = kp_arg_checknumber(ks, 1);
+
+	kp_printf(ks, "\033[%dm", fg);
+	return 0;
+}
+
+/**
+ * function ansi.set_color2 - Set the ansi Select Graphic Rendition mode.
+ * @fg: Foreground color to set.
+ * @bg: Background color to set.
+ *
+ * Description: Sends ansi code for Select Graphic Rendition mode for the
+ * given foreground color, Black (30), Blue (34), Green (32), Cyan (36),
+ * Red (31), Purple (35), Brown (33), Light Gray (37) and the given
+ * background color, Black (40), Red (41), Green (42), Yellow (43),
+ * Blue (44), Magenta (45), Cyan (46), White (47).
+ */
+static int kplib_ansi_set_color2(ktap_state_t *ks)
+{
+	int fg = kp_arg_checknumber(ks, 1);
+	int bg = kp_arg_checknumber(ks, 2);
+	
+	kp_printf(ks, "\033[%d;%dm", fg, bg);
+	return 0;
+}
+
+/**
+ * function ansi.set_color3 - Set the ansi Select Graphic Rendition mode.
+ * @fg: Foreground color to set.
+ * @bg: Background color to set.
+ * @attr: Color attribute to set.
+ *
+ * Description: Sends ansi code for Select Graphic Rendition mode for the
+ * given foreground color, Black (30), Blue (34), Green (32), Cyan (36),
+ * Red (31), Purple (35), Brown (33), Light Gray (37), the given
+ * background color, Black (40), Red (41), Green (42), Yellow (43),
+ * Blue (44), Magenta (45), Cyan (46), White (47) and the color attribute
+ * All attributes off (0), Intensity Bold (1), Underline Single (4),
+ * Blink Slow (5), Blink Rapid (6), Image Negative (7).
+ */
+static int kplib_ansi_set_color3(ktap_state_t *ks)
+{
+	int fg = kp_arg_checknumber(ks, 1);
+	int bg = kp_arg_checknumber(ks, 2);
+	int attr = kp_arg_checknumber(ks, 3);
+
+	if (attr)
+		kp_printf(ks, "\033[%d;%d;%dm", fg, bg, attr);
+	else
+		kp_printf(ks, "\033[%d;%dm", fg, bg);
+	
+	return 0;
+}
+
+/**
+ * function ansi.reset_color - Resets Select Graphic Rendition mode.
+ *
+ * Description: Sends ansi code to reset foreground, background and color
+ * attribute to default values.
+ */
+static int kplib_ansi_reset_color(ktap_state_t *ks)
+{
+	kp_printf(ks, "\033[0;0m");
+	return 0;
+}
+
+/**
+ * function ansi.new_line - Move cursor to new line.
+ *
+ * Description: Sends the ansi code for a new line.
+ */
+static int kplib_ansi_new_line(ktap_state_t *ks)
+{
+	kp_printf(ks, "\12");
+	return 0;
+}
+
+static const ktap_libfunc_t ansi_lib_funcs[] = {
+	{"clear_screen", kplib_ansi_clear_screen},
+	{"set_color", kplib_ansi_set_color},
+	{"set_color2", kplib_ansi_set_color2},
+	{"set_color3", kplib_ansi_set_color3},
+	{"reset_color", kplib_ansi_reset_color},
+	{"new_line", kplib_ansi_new_line},
+	{NULL}
+};
+
+int kp_lib_init_ansi(ktap_state_t *ks)
+{
+	return kp_vm_register_lib(ks, "ansi", ansi_lib_funcs); 
+}
diff --git a/kernel/trace/ktap/lib_base.c b/kernel/trace/ktap/lib_base.c
new file mode 100644
index 0000000..1765cc3
--- /dev/null
+++ b/kernel/trace/ktap/lib_base.c
@@ -0,0 +1,407 @@
+/*
+ * lib_base.c - base library
+ *
+ * Caveat: all kernel functions called by the ktap library have to be lock free,
+ * otherwise the system will deadlock.
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <linux/version.h>
+#include <linux/hardirq.h>
+#include <linux/module.h>
+#include <linux/kallsyms.h>
+#include <linux/sched.h>
+#include <linux/uaccess.h>
+#include <linux/utsname.h>
+#include <linux/time.h>
+#include <linux/clocksource.h>
+#include <linux/ring_buffer.h>
+#include <linux/stacktrace.h>
+#include <linux/cred.h>
+#include <linux/uidgid.h>
+#include <uapi/ktap/ktap_types.h>
+#include "ktap.h"
+#include "kp_obj.h"
+#include "kp_str.h"
+#include "kp_tab.h"
+#include "kp_transport.h"
+#include "kp_events.h"
+#include "kp_vm.h"
+
+static int kplib_print(ktap_state_t *ks)
+{
+	int i;
+	int n = kp_arg_nr(ks);
+
+	for (i = 1; i <= n; i++) {
+		ktap_val_t *arg = kp_arg(ks, i);
+		if (i > 1)
+			kp_puts(ks, "\t");
+		kp_obj_show(ks, arg);
+	}
+
+	kp_puts(ks, "\n");
+	return 0;
+}
+
+/* don't engage with intern string in printf, use buffer directly */
+static int kplib_printf(ktap_state_t *ks)
+{
+	struct trace_seq *seq;
+
+	preempt_disable_notrace();
+
+	seq = kp_this_cpu_print_buffer(ks);
+	trace_seq_init(seq);
+
+	if (kp_str_fmt(ks, seq))
+		goto out;
+
+	seq->buffer[seq->len] = '\0';
+	kp_transport_write(ks, seq->buffer, seq->len + 1);
+
+ out:
+	preempt_enable_notrace();
+	return 0;
+}
+
+#define HISTOGRAM_DEFAULT_TOP_NUM	20
+
+static int kplib_print_hist(ktap_state_t *ks)
+{
+	int n;
+
+	kp_arg_check(ks, 1, KTAP_TTAB);
+	n = kp_arg_checkoptnumber(ks, 2, HISTOGRAM_DEFAULT_TOP_NUM);
+
+	n = min(n, 1000);
+	n = max(n, HISTOGRAM_DEFAULT_TOP_NUM);
+
+	kp_tab_print_hist(ks, hvalue(kp_arg(ks, 1)), n);
+
+	return 0;
+}
+
+static int kplib_pairs(ktap_state_t *ks)
+{
+	kp_arg_check(ks, 1, KTAP_TTAB);
+
+	set_cfunc(ks->top++, (ktap_cfunction)kp_tab_next);
+	set_table(ks->top++, hvalue(kp_arg(ks, 1)));
+	set_nil(ks->top++);
+	return 3;
+}
+
+static int kplib_len(ktap_state_t *ks)
+{
+	int len = kp_obj_len(ks, kp_arg(ks, 1));
+
+	if (len < 0)
+		return -1;
+
+	set_number(ks->top, len);
+	incr_top(ks);
+	return 1;
+}
+
+static int kplib_delete(ktap_state_t *ks)
+{
+	kp_arg_check(ks, 1, KTAP_TTAB);
+	kp_tab_clear(hvalue(kp_arg(ks, 1)));
+	return 0;
+}
+
+#ifdef CONFIG_STACKTRACE
+static int kplib_stack(ktap_state_t *ks)
+{
+	uint16_t skip, depth = 10;
+
+	depth = kp_arg_checkoptnumber(ks, 1, 10); /* default as 10 */
+	depth = min_t(uint16_t, depth, KP_MAX_STACK_DEPTH);
+	skip = kp_arg_checkoptnumber(ks, 2, 10); /* default as 10 */
+
+	set_kstack(ks->top, depth, skip);
+	incr_top(ks);
+	return 1;
+}
+#else
+static int kplib_stack(ktap_state_t *ks)
+{
+	kp_error(ks, "Please enable CONFIG_STACKTRACE before call stack()\n");
+	return -1;
+}
+#endif
+
+
+extern unsigned long long ns2usecs(cycle_t nsec);
+static int kplib_print_trace_clock(ktap_state_t *ks)
+{
+	unsigned long long t;
+	unsigned long secs, usec_rem;
+	u64 timestamp;
+
+	/* use ring buffer's timestamp */
+	timestamp = ring_buffer_time_stamp(G(ks)->buffer, smp_processor_id());
+
+	t = ns2usecs(timestamp);
+	usec_rem = do_div(t, USEC_PER_SEC);
+	secs = (unsigned long)t;
+
+	kp_printf(ks, "%5lu.%06lu\n", secs, usec_rem);
+	return 0;
+}
+
+static int kplib_num_cpus(ktap_state_t *ks)
+{
+	set_number(ks->top, num_online_cpus());
+	incr_top(ks);
+	return 1;
+}
+
+/* TODO: intern string firstly */
+static int kplib_arch(ktap_state_t *ks)
+{
+	ktap_str_t *ts = kp_str_newz(ks, utsname()->machine);
+	if (unlikely(!ts))
+		return -1;
+
+	set_string(ks->top, ts);
+	incr_top(ks);
+	return 1;
+}
+
+/* TODO: intern string firstly */
+static int kplib_kernel_v(ktap_state_t *ks)
+{
+	ktap_str_t *ts = kp_str_newz(ks, utsname()->release);
+	if (unlikely(!ts))
+		return -1;
+
+	set_string(ks->top, ts);
+	incr_top(ks);
+	return 1;
+}
+
+static int kplib_kernel_string(ktap_state_t *ks)
+{
+	unsigned long addr = kp_arg_checknumber(ks, 1);
+	char str[256] = {0};
+	ktap_str_t *ts;
+	char *ret;
+
+	ret = strncpy((void *)str, (const void *)addr, 256);
+	(void) &ret;  /* Silence compiler warning. */
+	str[255] = '\0';
+
+	ts = kp_str_newz(ks, str);
+	if (unlikely(!ts))
+		return -1;
+
+	set_string(ks->top, ts);
+	incr_top(ks);
+	return 1;
+}
+
+static int kplib_user_string(ktap_state_t *ks)
+{
+	unsigned long addr = kp_arg_checknumber(ks, 1);
+	char str[256] = {0};
+	ktap_str_t *ts;
+	int ret;
+
+	pagefault_disable();
+	ret = __copy_from_user_inatomic((void *)str, (const void *)addr, 256);
+	(void) &ret;  /* Silence compiler warning. */
+	pagefault_enable();
+	str[255] = '\0';
+
+	ts = kp_str_newz(ks, str);
+	if (unlikely(!ts))
+		return -1;
+
+	set_string(ks->top, ts);
+	incr_top(ks);
+	return 1;
+}
+
+static int kplib_stringof(ktap_state_t *ks)
+{
+	ktap_val_t *v = kp_arg(ks, 1);
+	const ktap_str_t *ts = NULL;
+
+	if (itype(v) == KTAP_TEVENTSTR) {
+		ts = kp_event_stringify(ks);
+	} else if (itype(v) == KTAP_TKIP) {
+		char str[KSYM_SYMBOL_LEN];
+
+		SPRINT_SYMBOL(str, nvalue(v));
+		ts = kp_str_newz(ks, str);
+	}
+
+	if (unlikely(!ts))
+		return -1;
+
+	set_string(ks->top++, ts);
+	return 1;
+}
+
+static int kplib_ipof(ktap_state_t *ks)
+{
+	unsigned long addr = kp_arg_checknumber(ks, 1);
+
+	set_ip(ks->top++, addr);
+	return 1;
+}
+
+static int kplib_gettimeofday_ns(ktap_state_t *ks)
+{
+	set_number(ks->top, gettimeofday_ns());
+	incr_top(ks);
+
+	return 1;
+}
+
+static int kplib_gettimeofday_us(ktap_state_t *ks)
+{
+	set_number(ks->top, gettimeofday_ns() / NSEC_PER_USEC);
+	incr_top(ks);
+
+	return 1;
+}
+
+static int kplib_gettimeofday_ms(ktap_state_t *ks)
+{
+	set_number(ks->top, gettimeofday_ns() / NSEC_PER_MSEC);
+	incr_top(ks);
+
+	return 1;
+}
+
+static int kplib_gettimeofday_s(ktap_state_t *ks)
+{
+	set_number(ks->top, gettimeofday_ns() / NSEC_PER_SEC);
+	incr_top(ks);
+
+	return 1;
+}
+
+/*
+ * use gdb to get field offset of struct task_struct, for example:
+ *
+ * gdb vmlinux
+ * (gdb)p &(((struct task_struct *)0).prio)
+ */
+static int kplib_curr_taskinfo(ktap_state_t *ks)
+{
+	int offset = kp_arg_checknumber(ks, 1);
+	int fetch_bytes  = kp_arg_checkoptnumber(ks, 2, 4); /* fetch 4 bytes */
+
+	if (offset >= sizeof(struct task_struct)) {
+		set_nil(ks->top++);
+		kp_error(ks, "access out of bound value of task_struct\n");
+		return 1;
+	}
+
+#define RET_VALUE ((unsigned long)current + offset)
+
+	switch (fetch_bytes) {
+	case 4:
+		set_number(ks->top, *(unsigned int *)RET_VALUE);
+		break;
+	case 8:
+		set_number(ks->top, *(unsigned long *)RET_VALUE);
+		break;
+	default:
+		kp_error(ks, "unsupported fetch bytes in curr_taskinfo\n");
+		set_nil(ks->top);
+		break;
+	}
+
+#undef RET_VALUE
+
+	incr_top(ks);
+	return 1;
+}
+
+/*
+ * This built-in function mainly serves scripts/schedule/schedtimes.kp
+ */
+static int kplib_in_iowait(ktap_state_t *ks)
+{
+	set_number(ks->top, current->in_iowait);
+	incr_top(ks);
+
+	return 1;
+}
+
+static int kplib_in_interrupt(ktap_state_t *ks)
+{
+	int ret = in_interrupt();
+
+	set_number(ks->top, ret);
+	incr_top(ks);
+	return 1;
+}
+
+static int kplib_exit(ktap_state_t *ks)
+{
+	kp_vm_try_to_exit(ks);
+
+	/* do not execute bytecode any more in this thread */
+	return -1;
+}
+
+static const ktap_libfunc_t base_lib_funcs[] = {
+	{"print", kplib_print},
+	{"printf", kplib_printf},
+	{"print_hist", kplib_print_hist},
+
+	{"pairs", kplib_pairs},
+	{"len", kplib_len},
+	{"delete", kplib_delete},
+
+	{"stack", kplib_stack},
+	{"print_trace_clock", kplib_print_trace_clock},
+
+	{"num_cpus", kplib_num_cpus},
+	{"arch", kplib_arch},
+	{"kernel_v", kplib_kernel_v},
+	{"kernel_string", kplib_kernel_string},
+	{"user_string", kplib_user_string},
+	{"stringof", kplib_stringof},
+	{"ipof", kplib_ipof},
+
+	{"gettimeofday_ns", kplib_gettimeofday_ns},
+	{"gettimeofday_us", kplib_gettimeofday_us},
+	{"gettimeofday_ms", kplib_gettimeofday_ms},
+	{"gettimeofday_s", kplib_gettimeofday_s},
+
+	{"curr_taskinfo", kplib_curr_taskinfo},
+
+	{"in_iowait", kplib_in_iowait},
+	{"in_interrupt", kplib_in_interrupt},
+
+	{"exit", kplib_exit},
+	{NULL}
+};
+
+int kp_lib_init_base(ktap_state_t *ks)
+{
+	return kp_vm_register_lib(ks, NULL, base_lib_funcs); 
+}
diff --git a/kernel/trace/ktap/lib_kdebug.c b/kernel/trace/ktap/lib_kdebug.c
new file mode 100644
index 0000000..247fc51
--- /dev/null
+++ b/kernel/trace/ktap/lib_kdebug.c
@@ -0,0 +1,195 @@
+/*
+ * lib_kdebug.c - kdebug library support for ktap
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <linux/module.h>
+#include <linux/ctype.h>
+#include <linux/slab.h>
+#include <linux/version.h>
+#include <linux/ftrace_event.h>
+#include <uapi/ktap/ktap_types.h>
+#include "ktap.h"
+#include "kp_obj.h"
+#include "kp_str.h"
+#include "kp_transport.h"
+#include "kp_vm.h"
+#include "kp_events.h"
+
+/**
+ * function kdebug.trace_by_id
+ *
+ * @uaddr: userspace address of a ktap_eventdesc_t
+ * @closure
+ */
+static int kplib_kdebug_trace_by_id(ktap_state_t *ks)
+{
+	unsigned long uaddr = kp_arg_checknumber(ks, 1);
+	ktap_func_t *fn = kp_arg_checkfunction(ks, 2);
+	struct task_struct *task = G(ks)->trace_task;
+	ktap_eventdesc_t eventsdesc;
+	char *filter = NULL;
+	int *id_arr;
+	int i;
+
+	if (G(ks)->mainthread != ks) {
+		kp_error(ks,
+		    "kdebug.trace_by_id only can be called in mainthread\n");
+		return -1;
+	}
+
+	/* kdebug.trace_by_id cannot be called in trace_end state */
+	if (G(ks)->state != KTAP_RUNNING) {
+		kp_error(ks,
+		    "kdebug.trace_by_id only can be called in RUNNING state\n");
+		return -1;
+	}
+
+	/* copy ktap_eventdesc_t from userspace */
+	if (copy_from_user(&eventsdesc, (void *)uaddr,
+			     sizeof(ktap_eventdesc_t)))
+		return -1;
+
+	if (eventsdesc.filter) {
+		int len;
+
+		len = strlen_user(eventsdesc.filter);
+		if (len > 0x1000)
+			return -1;
+
+		filter = kmalloc(len + 1, GFP_KERNEL);
+		if (!filter)
+			return -1;
+
+		/* copy filter string from userspace */
+		if (strncpy_from_user(filter, eventsdesc.filter, len) < 0) {
+			kfree(filter);
+			return -1;
+		}
+	}
+
+	id_arr = kmalloc(eventsdesc.nr * sizeof(int), GFP_KERNEL);
+	if (!id_arr) {
+		kfree(filter);
+		return -1;
+	}
+
+	/* copy all event id from userspace */
+	if (copy_from_user(id_arr, eventsdesc.id_arr,
+			   eventsdesc.nr * sizeof(int))) {
+		kfree(filter);
+		kfree(id_arr);
+		return -1;
+	}
+
+	fn = clvalue(kp_arg(ks, 2));
+
+	for (i = 0; i < eventsdesc.nr; i++) {
+		struct perf_event_attr attr;
+
+		cond_resched();
+
+		if (signal_pending(current)) {
+			flush_signals(current);
+			kfree(filter);
+			kfree(id_arr);
+			return -1;
+		}
+
+		memset(&attr, 0, sizeof(attr));
+		attr.type = PERF_TYPE_TRACEPOINT;	
+		attr.config = id_arr[i];
+		attr.sample_type = PERF_SAMPLE_RAW | PERF_SAMPLE_TIME |
+				   PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD;
+		attr.sample_period = 1;
+		attr.size = sizeof(attr);
+		attr.disabled = 0;
+
+		/* register event one by one */
+		if (kp_event_create(ks, &attr, task, filter, fn))
+			break;
+	}
+
+	kfree(filter);
+	kfree(id_arr);
+	return 0;
+}
+
+static int kplib_kdebug_trace_end(ktap_state_t *ks)
+{
+	/* trace_end_closure will be called when the ktap main thread exits */
+	G(ks)->trace_end_closure = kp_arg_checkfunction(ks, 1);
+	return 0;
+}
+
+static int kplib_kdebug_tracepoint(ktap_state_t *ks)
+{
+	const char *event_name = kp_arg_checkstring(ks, 1);
+	ktap_func_t *fn = kp_arg_checkfunction(ks, 2);
+
+	if (G(ks)->mainthread != ks) {
+		kp_error(ks,
+		    "kdebug.tracepoint only can be called in mainthread\n");
+		return -1;
+	}
+
+	/* kdebug.tracepoint cannot be called in trace_end state */
+	if (G(ks)->state != KTAP_RUNNING) {
+		kp_error(ks,
+		    "kdebug.tracepoint only can be called in RUNNING state\n");
+		return -1;
+	}
+
+	return kp_event_create_tracepoint(ks, event_name, fn);
+}
+
+static int kplib_kdebug_kprobe(ktap_state_t *ks)
+{
+	const char *event_name = kp_arg_checkstring(ks, 1);
+	ktap_func_t *fn = kp_arg_checkfunction(ks, 2);
+
+	if (G(ks)->mainthread != ks) {
+		kp_error(ks,
+		    "kdebug.kprobe only can be called in mainthread\n");
+		return -1;
+	}
+
+	/* kdebug.kprobe cannot be called in trace_end state */
+	if (G(ks)->state != KTAP_RUNNING) {
+		kp_error(ks,
+		    "kdebug.kprobe only can be called in RUNNING state\n");
+		return -1;
+	}
+
+	return kp_event_create_kprobe(ks, event_name, fn);
+}
+static const ktap_libfunc_t kdebug_lib_funcs[] = {
+	{"trace_by_id", kplib_kdebug_trace_by_id},
+	{"trace_end", kplib_kdebug_trace_end},
+
+	{"tracepoint", kplib_kdebug_tracepoint},
+	{"kprobe", kplib_kdebug_kprobe},
+	{NULL}
+};
+
+int kp_lib_init_kdebug(ktap_state_t *ks)
+{
+	return kp_vm_register_lib(ks, "kdebug", kdebug_lib_funcs);
+}
+
diff --git a/kernel/trace/ktap/lib_net.c b/kernel/trace/ktap/lib_net.c
new file mode 100644
index 0000000..a34f4c2
--- /dev/null
+++ b/kernel/trace/ktap/lib_net.c
@@ -0,0 +1,107 @@
+/*
+ * lib_net.c - net library
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <net/inet_sock.h>
+#include <uapi/ktap/ktap_types.h>
+#include "ktap.h"
+#include "kp_obj.h"
+#include "kp_str.h"
+#include "kp_vm.h"
+
+/**
+ * Return the source IP address for a given sock
+ */
+static int kplib_net_ip_sock_saddr(ktap_state_t *ks)
+{
+	struct inet_sock *isk;
+	int family;
+
+	/* TODO: need to validate the address firstly */	
+
+	isk = (struct inet_sock *)kp_arg_checknumber(ks, 1);
+	family = isk->sk.__sk_common.skc_family;
+
+	if (family == AF_INET) {
+		set_number(ks->top, isk->inet_rcv_saddr);
+	} else {
+		kp_error(ks, "ip_sock_saddr only support ipv4 now\n");
+		set_nil(ks->top);
+	}
+
+	incr_top(ks);
+	return 1;
+}
+
+/**
+ * Return the destination IP address for a given sock
+ */
+static int kplib_net_ip_sock_daddr(ktap_state_t *ks)
+{
+	struct inet_sock *isk;
+	int family;
+
+	/* TODO: need to validate the address firstly */	
+
+	isk = (struct inet_sock *)kp_arg_checknumber(ks, 1);
+	family = isk->sk.__sk_common.skc_family;
+
+	if (family == AF_INET) {
+		set_number(ks->top, isk->inet_daddr);
+	} else {
+		kp_error(ks, "ip_sock_daddr only support ipv4 now\n");
+		set_nil(ks->top);
+	}
+
+	incr_top(ks);
+	return 1;
+
+}
+
+/**
+ * Returns a string representation for an IP address
+ */
+static int kplib_net_format_ip_addr(ktap_state_t *ks)
+{
+	__be32 ip = (__be32)kp_arg_checknumber(ks, 1);
+	ktap_str_t *ts;
+	char ipstr[32];
+
+	snprintf(ipstr, 32, "%pI4", &ip);
+	ts = kp_str_newz(ks, ipstr);
+	if (ts) {
+		set_string(ks->top, ts);
+		incr_top(ks);
+		return 1;
+	} else
+		return -1;
+}
+
+static const ktap_libfunc_t net_lib_funcs[] = {
+	{"ip_sock_saddr", kplib_net_ip_sock_saddr},
+	{"ip_sock_daddr", kplib_net_ip_sock_daddr},
+	{"format_ip_addr", kplib_net_format_ip_addr},
+	{NULL}
+};
+
+int kp_lib_init_net(ktap_state_t *ks)
+{
+	return kp_vm_register_lib(ks, "net", net_lib_funcs); 
+}
diff --git a/kernel/trace/ktap/lib_table.c b/kernel/trace/ktap/lib_table.c
new file mode 100644
index 0000000..470461c
--- /dev/null
+++ b/kernel/trace/ktap/lib_table.c
@@ -0,0 +1,58 @@
+/*
+ * lib_table.c - Table library
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <linux/ctype.h>
+#include <linux/slab.h>
+#include <linux/delay.h>
+#include <linux/sched.h>
+#include <uapi/ktap/ktap_types.h>
+#include "ktap.h"
+#include "kp_obj.h"
+#include "kp_vm.h"
+#include "kp_tab.h"
+
+static int kplib_table_new(ktap_state_t *ks)
+{
+	int narr = kp_arg_checkoptnumber(ks, 1, 0);
+	int nrec = kp_arg_checkoptnumber(ks, 2, 0);
+	ktap_tab_t *h;
+
+	h = kp_tab_new_ah(ks, narr, nrec);
+	if (!h) {
+		set_nil(ks->top);
+	} else {
+		set_table(ks->top, h);
+	}
+
+	incr_top(ks);
+	return 1;
+}
+
+static const ktap_libfunc_t table_lib_funcs[] = {
+	{"new",	kplib_table_new},
+	{NULL}
+};
+
+int kp_lib_init_table(ktap_state_t *ks)
+{
+	return kp_vm_register_lib(ks, "table", table_lib_funcs);
+}
+
diff --git a/kernel/trace/ktap/lib_timer.c b/kernel/trace/ktap/lib_timer.c
new file mode 100644
index 0000000..d1b6b77
--- /dev/null
+++ b/kernel/trace/ktap/lib_timer.c
@@ -0,0 +1,210 @@
+/*
+ * lib_timer.c - timer library support for ktap
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <linux/ctype.h>
+#include <linux/slab.h>
+#include <linux/delay.h>
+#include <linux/sched.h>
+#include <uapi/ktap/ktap_types.h>
+#include "ktap.h"
+#include "kp_obj.h"
+#include "kp_vm.h"
+#include "kp_events.h"
+
+struct ktap_hrtimer {
+	struct hrtimer timer;
+	ktap_state_t *ks;
+	ktap_func_t *fn;
+	u64 ns;
+	struct list_head list;
+};
+
+/*
+ * Currently ktap disallows tracing events inside a timer callback closure:
+ * doing so would corrupt ktap_state_t and the ktap stack, because the timer
+ * closure and the event closure share the same irq percpu ktap_state_t and
+ * stack. We could use a separate percpu ktap_state_t and stack for timers,
+ * but the extra memory cost would bring little value.
+ *
+ * So simply disable tracing in the timer closure;
+ * get_recursion_context()/put_recursion_context() is used for this purpose.
+ */
+static enum hrtimer_restart hrtimer_ktap_fn(struct hrtimer *timer)
+{
+	struct ktap_hrtimer *t;
+	ktap_state_t *ks;
+	int rctx;
+
+	rcu_read_lock_sched_notrace();
+
+	t = container_of(timer, struct ktap_hrtimer, timer);
+	rctx = get_recursion_context(t->ks);
+
+	ks = kp_vm_new_thread(t->ks, rctx);
+	set_func(ks->top, t->fn);
+	incr_top(ks);
+	kp_vm_call(ks, ks->top - 1, 0);
+	kp_vm_exit_thread(ks);
+
+	hrtimer_add_expires_ns(timer, t->ns);
+
+	put_recursion_context(ks, rctx);
+	rcu_read_unlock_sched_notrace();
+
+	return HRTIMER_RESTART;
+}
+
+static int set_tick_timer(ktap_state_t *ks, u64 period, ktap_func_t *fn)
+{
+	struct ktap_hrtimer *t;
+
+	t = kp_malloc(ks, sizeof(*t));
+	if (unlikely(!t))
+		return -ENOMEM;
+	t->ks = ks;
+	t->fn = fn;
+	t->ns = period;
+
+	INIT_LIST_HEAD(&t->list);
+	list_add(&t->list, &(G(ks)->timers));
+
+	hrtimer_init(&t->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	t->timer.function = hrtimer_ktap_fn;
+	hrtimer_start(&t->timer, ns_to_ktime(period), HRTIMER_MODE_REL);
+
+	return 0;
+}
+
+static int set_profile_timer(ktap_state_t *ks, u64 period, ktap_func_t *fn)
+{
+	struct perf_event_attr attr;
+
+	memset(&attr, 0, sizeof(attr));
+	attr.type = PERF_TYPE_SOFTWARE;
+	attr.config = PERF_COUNT_SW_CPU_CLOCK;
+	attr.sample_type = PERF_SAMPLE_RAW | PERF_SAMPLE_TIME |
+			   PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD;
+	attr.sample_period = period;
+	attr.size = sizeof(attr);
+	attr.disabled = 0;
+
+	return kp_event_create(ks, &attr, NULL, NULL, fn);
+}
+
+static int do_tick_profile(ktap_state_t *ks, int is_tick)
+{
+	const char *str = kp_arg_checkstring(ks, 1);
+	ktap_func_t *fn = kp_arg_checkfunction(ks, 2);
+	const char *tmp;
+	char interval_str[32] = {0};
+	char suffix[10] = {0};
+	int i = 0, ret, n;
+	int factor;
+
+	tmp = str;
+	while (isdigit(*tmp) && tmp - str < sizeof(interval_str) - 1)
+		tmp++;
+
+	strncpy(interval_str, str, tmp - str);
+	if (kstrtoint(interval_str, 10, &n))
+		goto error;
+
+	strncpy(suffix, tmp, 9);
+	while (suffix[i] != ' ' && suffix[i] != '\0')
+		i++;
+
+	suffix[i] = '\0';
+
+	if (!strcmp(suffix, "s") || !strcmp(suffix, "sec"))
+		factor = NSEC_PER_SEC;
+	else if (!strcmp(suffix, "ms") || !strcmp(suffix, "msec"))
+		factor = NSEC_PER_MSEC;
+	else if (!strcmp(suffix, "us") || !strcmp(suffix, "usec"))
+		factor = NSEC_PER_USEC;
+	else
+		goto error;
+
+	if (is_tick)
+		ret = set_tick_timer(ks, (u64)factor * n, fn);
+	else
+		ret = set_profile_timer(ks, (u64)factor * n, fn);
+
+	return ret;
+
+ error:
+	kp_error(ks, "cannot parse timer interval: %s\n", str);
+	return -1;
+}
+
+/*
+ * tick-n probes fire on only one CPU per interval.
+ * valid time suffixes: sec/s, msec/ms, usec/us
+ */
+static int kplib_timer_tick(ktap_state_t *ks)
+{
+	/* timer.tick cannot be called in trace_end state */
+	if (G(ks)->state != KTAP_RUNNING) {
+		kp_error(ks,
+			 "timer.tick can only be called in RUNNING state\n");
+		return -1;
+	}
+
+	return do_tick_profile(ks, 1);
+}
+
+/*
+ * A profile-n probe fires every fixed interval on every CPU
+ * valid time suffixes: sec/s, msec/ms, usec/us
+ */
+static int kplib_timer_profile(ktap_state_t *ks)
+{
+	/* timer.profile cannot be called in trace_end state */
+	if (G(ks)->state != KTAP_RUNNING) {
+		kp_error(ks,
+			 "timer.profile can only be called in RUNNING state\n");
+		return -1;
+	}
+
+	return do_tick_profile(ks, 0);
+}
+
+void kp_exit_timers(ktap_state_t *ks)
+{
+	struct ktap_hrtimer *t, *tmp;
+	struct list_head *timers_list = &(G(ks)->timers);
+
+	list_for_each_entry_safe(t, tmp, timers_list, list) {
+		hrtimer_cancel(&t->timer);
+		kp_free(ks, t);
+	}
+}
+
+static const ktap_libfunc_t timer_lib_funcs[] = {
+	{"profile",	kplib_timer_profile},
+	{"tick",	kplib_timer_tick},
+	{NULL}
+};
+
+int kp_lib_init_timer(ktap_state_t *ks)
+{
+	return kp_vm_register_lib(ks, "timer", timer_lib_funcs);
+}
+
-- 
1.8.1.4
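
Reviewer note: the interval parsing done by do_tick_profile() in the patch
above can be illustrated with a small userspace sketch. This is not the
patch's code: parse_interval_ns() is a hypothetical helper, it uses strtoul()
rather than kstrtoint(), and it requires the suffix to match exactly instead
of cutting it at the first space. Overflow of very large values is not
handled here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace helper (not part of the patch): parse an interval
 * such as "100ms" into nanoseconds. The accepted suffixes mirror the ones
 * documented for timer.tick/timer.profile: s/sec, ms/msec, us/usec.
 * Returns 0 on success, -1 on a malformed string.
 */
static int parse_interval_ns(const char *str, uint64_t *ns)
{
	static const struct {
		const char *suffix;
		uint64_t factor;
	} units[] = {
		{ "s", 1000000000ULL },	{ "sec", 1000000000ULL },
		{ "ms", 1000000ULL },	{ "msec", 1000000ULL },
		{ "us", 1000ULL },	{ "usec", 1000ULL },
	};
	char *end;
	unsigned long n;
	size_t i;

	assert(str != NULL && ns != NULL);

	/* The numeric part is mandatory and must come first. */
	if (!isdigit((unsigned char)str[0]))
		return -1;

	n = strtoul(str, &end, 10);

	/* Whatever follows the digits must be one of the known suffixes. */
	for (i = 0; i < sizeof(units) / sizeof(units[0]); i++) {
		if (strcmp(end, units[i].suffix) == 0) {
			*ns = (uint64_t)n * units[i].factor;
			return 0;
		}
	}

	return -1;	/* unknown or missing suffix */
}
```

With this sketch, "100ms" yields 100000000 ns, while "10" (no suffix) and
"5min" (unknown suffix) are rejected, which matches the error path taken by
the kernel code for unparsable intervals.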


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 16/29] ktap: add amalgamation build(kernel/trace/ktap/amalg.c)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (14 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 15/29] ktap: add built-in functions and library(kernel/trace/ktap/lib_*.c) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-31  2:17   ` Li Zefan
  2014-03-28 14:45 ` [PATCH v2 17/29] ktap: add Makefile for kernel module(kernel/trace/ktap/Makefile) Jovi Zhangwei
                   ` (13 subsequent siblings)
  29 siblings, 1 reply; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

This compiles the ktapvm as one huge C file and allows
GCC to generate faster and shorter code.

Non-amalgamation build on x86_64:
ktapvm.ko: 3.1M

Amalgamation build on x86_64:
ktapvm.ko: 1.1M

Users can enable or disable the amalgamation build in the Makefile.

(Why the size difference is so large still needs further analysis.)

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 kernel/trace/ktap/amalg.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)
 create mode 100644 kernel/trace/ktap/amalg.c

diff --git a/kernel/trace/ktap/amalg.c b/kernel/trace/ktap/amalg.c
new file mode 100644
index 0000000..9935ccf
--- /dev/null
+++ b/kernel/trace/ktap/amalg.c
@@ -0,0 +1,37 @@
+/*
+ * amalg.c - ktapvm kernel module amalgamation.
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include "ktap.c"
+#include "kp_obj.c"
+#include "kp_bcread.c"
+#include "kp_str.c"
+#include "kp_mempool.c"
+#include "kp_tab.c"
+#include "kp_transport.c"
+#include "kp_vm.c"
+#include "kp_events.c"
+#include "lib_base.c"
+#include "lib_ansi.c"
+#include "lib_kdebug.c"
+#include "lib_timer.c"
+#include "lib_table.c"
+#include "lib_net.c"
+
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 17/29] ktap: add Makefile for kernel module(kernel/trace/ktap/Makefile)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (15 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 16/29] ktap: add amalgamation build(kernel/trace/ktap/amalg.c) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 18/29] ktap: add Kconfig(kernel/trace/ktap/Kconfig) Jovi Zhangwei
                   ` (12 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

This Makefile compiles the kernel module and generates ktapvm.ko.

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 kernel/trace/ktap/Makefile | 50 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)
 create mode 100644 kernel/trace/ktap/Makefile

diff --git a/kernel/trace/ktap/Makefile b/kernel/trace/ktap/Makefile
new file mode 100644
index 0000000..6168ac2
--- /dev/null
+++ b/kernel/trace/ktap/Makefile
@@ -0,0 +1,50 @@
+# Define amalg to enable the amalgamation build. This compiles ktapvm as
+# one huge C file and allows GCC to generate faster and shorter code. Note
+# that this requires a lot of memory during the build.
+# The amalgamation build is recommended as the default.
+amalg = 1
+
+# Do not instrument the tracer itself:
+ifdef CONFIG_FUNCTION_TRACER
+ORIG_CFLAGS := $(KBUILD_CFLAGS)
+KBUILD_CFLAGS = $(subst -pg,,$(ORIG_CFLAGS))
+endif
+
+all: mod
+
+KTAP_LIBS = -lpthread
+
+LIB_OBJS += lib_base.o lib_kdebug.o lib_timer.o lib_ansi.o lib_table.o \
+		lib_net.o
+
+ifndef amalg
+RUNTIME_OBJS += ktap.o kp_bcread.o kp_obj.o kp_str.o kp_mempool.o \
+		kp_tab.o kp_vm.o kp_transport.o kp_events.o $(LIB_OBJS)
+else
+RUNTIME_OBJS += amalg.o
+endif
+
+obj-m		+= ktapvm.o
+ktapvm-y	:= $(RUNTIME_OBJS)
+
+KVERSION ?= $(shell uname -r)
+KERNEL_SRC ?= /lib/modules/$(KVERSION)/build
+PWD := $(shell pwd)
+mod:
+	$(MAKE) -C $(KERNEL_SRC) M=$(PWD) modules
+
+modules_install:
+	$(MAKE) -C $(KERNEL_SRC) M=$(PWD) modules_install
+
+load:
+	insmod ktapvm.ko
+
+unload:
+	rmmod ktapvm
+
+reload:
+	$(MAKE) unload; $(MAKE) load
+
+clean:
+	$(MAKE) -C $(KERNEL_SRC) M=$(PWD) clean
+
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 18/29] ktap: add Kconfig(kernel/trace/ktap/Kconfig)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (16 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 17/29] ktap: add Makefile for kernel module(kernel/trace/ktap/Makefile) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 19/29] ktap: add main file for ktap binary(tools/ktap/kp_main.c) Jovi Zhangwei
                   ` (11 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 kernel/trace/ktap/Kconfig | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
 create mode 100644 kernel/trace/ktap/Kconfig

diff --git a/kernel/trace/ktap/Kconfig b/kernel/trace/ktap/Kconfig
new file mode 100644
index 0000000..21f8d2e
--- /dev/null
+++ b/kernel/trace/ktap/Kconfig
@@ -0,0 +1,21 @@
+config KTAP
+	tristate "a programmable dynamic tracing tool for Linux"
+	depends on PERF_EVENTS && EVENT_TRACING
+	default n
+	help
+	  ktap is a new script-based dynamic tracing tool for Linux.
+	  It uses a scripting language and lets users trace the
+	  Linux kernel dynamically. ktap is designed to give
+	  operational insights with interoperability that allow
+	  users to tune, troubleshoot and extend kernel and application.
+	  It is similar to SystemTap on Linux and DTrace on Solaris.
+
+	  ktap has different design principles from the mainstream Linux
+	  dynamic tracing languages in that it is based on bytecode,
+	  so it doesn't depend on GCC and doesn't require compiling a
+	  kernel module for each script. It is safe to use in production
+	  environments and fulfills the embedded ecosystem's tracing needs.
+
+	  See ktap tutorial for more information:
+	      http://www.ktap.org/doc/tutorial.html
+
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 19/29] ktap: add main file for ktap binary(tools/ktap/kp_main.c)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (17 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 18/29] ktap: add Kconfig(kernel/trace/ktap/Kconfig) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 20/29] ktap: add compiler(tools/ktap/kp_[lex|parse].[c|h]) Jovi Zhangwei
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Main entry for userspace ktap tool.

[root@localhost ktap]# ./ktap
Usage: ktap [options] file [script args] -- cmd [args]
   or: ktap [options] -e one-liner  -- cmd [args]

Options and arguments:
  -o file        : send script output to file, instead of stderr
  -p pid         : specific tracing pid
  -C cpu         : cpu to monitor in system-wide
  -T             : show timestamp for event
  -V             : show version
  -v             : enable verbose mode
  -q             : suppress start tracing message
  -d             : dry run mode(register NULL callback to perf events)
  -s             : simple event tracing
  -b             : list byte codes
  -le [glob]     : list pre-defined events in system
  -lf DSO        : list available functions from DSO
  -lm DSO        : list available sdt notes from DSO
  file           : program read from script file
  -- cmd [args]  : workload to tracing

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/kp_main.c | 443 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 443 insertions(+)
 create mode 100644 tools/ktap/kp_main.c
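
Reviewer note: the usage text above separates ktap's own arguments from the
traced workload with "--". A minimal sketch of that split is shown below.
find_workload_separator() is a hypothetical helper, not a function from this
patch; parse_option() in the patch only checks that an argument begins with
"--", whereas this sketch requires an exact match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): return the index of the first
 * "--" separator in argv[1..argc-1], or -1 if there is none. Everything
 * after that index is the workload command line.
 */
static int find_workload_separator(int argc, char **argv)
{
	int i;

	assert(argv != NULL);

	for (i = 1; i < argc; i++) {
		if (strcmp(argv[i], "--") == 0)
			return i;
	}

	return -1;
}
```

For `ktap -e "trace syscalls:* { print(argstr) }" -- ls`, the separator is
found at index 3, so the workload argv starts at index 4 ("ls"). When no
separator is present, no workload is forked.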

diff --git a/tools/ktap/kp_main.c b/tools/ktap/kp_main.c
new file mode 100644
index 0000000..f4b9a7b
--- /dev/null
+++ b/tools/ktap/kp_main.c
@@ -0,0 +1,443 @@
+/*
+ * kp_main.c - ktap compiler and loader entry
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sched.h>
+#include <string.h>
+#include <signal.h>
+#include <stdarg.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <math.h>
+#include <linux/errno.h>
+
+#include "../../include/uapi/ktap/ktap_types.h"
+#include "kp_lex.h"
+#include "kp_parse.h"
+#include "kp_symbol.h"
+
+static void usage(const char *msg_fmt, ...)
+{
+	va_list ap;
+
+	va_start(ap, msg_fmt);
+	vfprintf(stderr, msg_fmt, ap);
+	va_end(ap);
+
+	fprintf(stderr,
+"Usage: ktap [options] file [script args] -- cmd [args]\n"
+"   or: ktap [options] -e one-liner  -- cmd [args]\n"
+"\n"
+"Options and arguments:\n"
+"  -o file        : send script output to file, instead of stderr\n"
+"  -p pid         : specific tracing pid\n"
+"  -C cpu         : cpu to monitor in system-wide\n"
+"  -T             : show timestamp for event\n"
+"  -V             : show version\n"
+"  -v             : enable verbose mode\n"
+"  -q             : suppress start tracing message\n"
+"  -d             : dry run mode(register NULL callback to perf events)\n"
+"  -s             : simple event tracing\n"
+"  -b             : list byte codes\n"
+"  -le [glob]     : list pre-defined events in system\n"
+#ifndef NO_LIBELF
+"  -lf DSO        : list available functions from DSO\n"
+"  -lm DSO        : list available sdt notes from DSO\n"
+#endif
+"  file           : program read from script file\n"
+"  -- cmd [args]  : workload to tracing\n");
+
+	exit(EXIT_FAILURE);
+}
+
+#define handle_error(str) do { perror(str); exit(-1); } while(0)
+
+ktap_option_t uparm;
+static int ktap_trunk_mem_size = 1024;
+
+static int kp_writer(const void* p, size_t sz, void* ud)
+{
+	if (uparm.trunk_len + sz > ktap_trunk_mem_size) {
+		int new_size = (uparm.trunk_len + sz) * 2;
+		uparm.trunk = realloc(uparm.trunk, new_size);
+		ktap_trunk_mem_size = new_size;
+	}
+
+	memcpy(uparm.trunk + uparm.trunk_len, p, sz);
+	uparm.trunk_len += sz;
+
+	return 0;
+}
+
+
+static int forks;
+static char **workload_argv;
+
+static int fork_workload(int ktap_fd)
+{
+	int pid;
+
+	pid = fork();
+	if (pid < 0)
+		handle_error("failed to fork");
+
+	if (pid > 0)
+		return pid;
+
+	signal(SIGTERM, SIG_DFL);
+
+	execvp("", workload_argv);
+
+	/*
+	 * Wait until ktapvm has prepared all tracing events.
+	 * TODO: make this more robust in the future.
+	 */
+	pause();
+
+	execvp(workload_argv[0], workload_argv);
+
+	perror(workload_argv[0]);
+	exit(-1);
+
+	return -1;
+}
+
+#define KTAPVM_PATH "/sys/kernel/debug/ktap/ktapvm"
+
+static char *output_filename;
+
+static int run_ktapvm()
+{
+	int ktapvm_fd, ktap_fd;
+	int ret;
+
+	ktapvm_fd = open(KTAPVM_PATH, O_RDONLY);
+	if (ktapvm_fd < 0)
+		handle_error("open " KTAPVM_PATH " failed");
+
+	ktap_fd = ioctl(ktapvm_fd, 0, NULL);
+	if (ktap_fd < 0)
+		handle_error("ioctl ktapvm failed");
+
+	kp_create_reader(output_filename);
+
+	if (forks) {
+		uparm.trace_pid = fork_workload(ktap_fd);
+		uparm.workload = 1;
+	}
+
+	ret = ioctl(ktap_fd, KTAP_CMD_IOC_RUN, &uparm);
+	switch (ret) {
+	case -EPERM:
+	case -EACCES:
+		fprintf(stderr, "You may not have permission to run ktap\n");
+		break;
+	}
+
+	close(ktap_fd);
+	close(ktapvm_fd);
+
+	return ret;
+}
+
+int verbose;
+static int quiet;
+static int dry_run;
+static int dump_bytecode;
+static char oneline_src[1024];
+static int trace_pid = -1;
+static int trace_cpu = -1;
+static int print_timestamp;
+
+#define SIMPLE_ONE_LINER_FMT	\
+	"trace %s { print(cpu(), tid(), execname(), argstr) }"
+
+static const char *script_file;
+static int script_args_start;
+static int script_args_end;
+
+#ifndef NO_LIBELF
+struct binary_base
+{
+	int type;
+	const char *binary;
+};
+static int print_symbol(const char *name, vaddr_t addr, void *arg)
+{
+	struct binary_base *base = (struct binary_base *)arg;
+	const char *type = base->type == FIND_SYMBOL ?
+		"probe" : "sdt";
+
+	printf("%s:%s:%s\n", type, base->binary, name);
+	return 0;
+}
+#endif
+
+static void parse_option(int argc, char **argv)
+{
+	char pid[32] = {0};
+	char cpu_str[32] = {0};
+	char *next_arg;
+	int i, j;
+
+	for (i = 1; i < argc; i++) {
+		if (argv[i][0] != '-') {
+			script_file = argv[i];
+			if (!script_file)
+				usage("");
+
+			script_args_start = i + 1;
+			script_args_end = argc;
+
+			for (j = i + 1; j < argc; j++) {
+				if (argv[j][0] == '-' && argv[j][1] == '-')
+					goto found_cmd;
+			}
+
+			return;
+		}
+
+		if (argv[i][0] == '-' && argv[i][1] == '-') {
+			j = i;
+			goto found_cmd;
+		}
+
+		next_arg = argv[i + 1];
+
+		switch (argv[i][1]) {
+		case 'o':
+			output_filename = malloc(strlen(next_arg) + 1);
+			if (!output_filename)
+				return;
+
+			strcpy(output_filename, next_arg);
+			i++;
+			break;
+		case 'e':
+			strncpy(oneline_src, next_arg, sizeof(oneline_src) - 1);
+			i++;
+			break;
+		case 'p':
+			strncpy(pid, next_arg, sizeof(pid) - 1);
+			trace_pid = atoi(pid);
+			i++;
+			break;
+		case 'C':
+			strncpy(cpu_str, next_arg, sizeof(cpu_str) - 1);
+			trace_cpu = atoi(cpu_str);
+			i++;
+			break;
+		case 'T':
+			print_timestamp = 1;
+			break;
+		case 'v':
+			verbose = 1;
+			break;
+		case 'q':
+			quiet = 1;
+			break;
+		case 'd':
+			dry_run = 1;
+			break;
+		case 's':
+			sprintf(oneline_src, SIMPLE_ONE_LINER_FMT, next_arg);
+			i++;
+			break;
+		case 'b':
+			dump_bytecode = 1;
+			break;
+		case 'l': /* list available events */
+			switch (argv[i][2]) {
+			case 'e': /* tracepoints */
+				list_available_events(next_arg);
+				exit(EXIT_SUCCESS);
+#ifndef NO_LIBELF
+			case 'f': /* functions in DSO */
+			case 'm': /* static marks in DSO */ {
+				const char *binary = next_arg;
+				int type = argv[i][2] == 'f' ?
+						FIND_SYMBOL : FIND_STAPSDT_NOTE;
+				struct binary_base base = {
+					.type = type,
+					.binary = binary,
+				};
+				int ret;
+
+				ret = parse_dso_symbols(binary, type,
+							print_symbol,
+							(void *)&base);
+				if (ret <= 0) {
+					fprintf(stderr,
+					"error: no symbols in binary %s\n",
+						binary);
+					exit(EXIT_FAILURE);
+				}
+				exit(EXIT_SUCCESS);
+			}
+#endif
+			default:
+				exit(EXIT_FAILURE);
+			}
+			break;
+		case 'V':
+			usage("%s\n\n", KTAP_VERSION);
+			break;
+		case '?':
+		case 'h':
+			usage("");
+			break;
+		default:
+			usage("wrong argument\n");
+			break;
+		}
+	}
+
+	return;
+
+ found_cmd:
+	script_args_end = j;
+	forks = 1;
+	workload_argv = &argv[j + 1];
+}
+
+static ktap_proto_t *parse(const char *chunkname, const char *src)
+{
+	LexState ls;
+
+	ls.chunkarg = chunkname ? chunkname : "?";
+	kp_lex_init();
+	kp_buf_init(&ls.sb);
+	kp_lex_setup(&ls, src);
+	return kp_parse(&ls);
+}
+
+static void compile(const char *input)
+{
+	ktap_proto_t *pt;
+	char *buff;
+	struct stat sb;
+	int fdin;
+
+	kp_str_resize();
+
+	if (oneline_src[0] != '\0') {
+		pt = parse(input, oneline_src);
+		goto dump;
+	}
+
+	fdin = open(input, O_RDONLY);
+	if (fdin < 0) {
+		fprintf(stderr, "open file %s failed\n", input);
+		exit(-1);
+	}
+
+	if (fstat(fdin, &sb) == -1)
+		handle_error("fstat failed");
+
+	buff = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fdin, 0);
+	if (buff == MAP_FAILED)
+		handle_error("mmap failed");
+
+	pt = parse(input, buff);
+
+	munmap(buff, sb.st_size);
+	close(fdin);
+
+ dump:
+	if (dump_bytecode) {
+		kp_dump_proto(pt);
+		exit(0);
+	}
+
+	/* bcwrite */
+	uparm.trunk = malloc(ktap_trunk_mem_size);
+	if (!uparm.trunk)
+		handle_error("malloc failed");
+
+	kp_bcwrite(pt, kp_writer, NULL, 0);
+}
+
+int main(int argc, char **argv)
+{
+	char **ktapvm_argv;
+	int new_index, i;
+	int ret;
+
+	if (argc == 1)
+		usage("");
+
+	parse_option(argc, argv);
+
+	if (oneline_src[0] != '\0')
+		script_file = "(command line)";
+
+	compile(script_file);
+
+	ktapvm_argv = (char **)malloc(sizeof(char *)*(script_args_end -
+					script_args_start + 1));
+	if (!ktapvm_argv) {
+		fprintf(stderr, "cannot allocate ktapvm_argv\n");
+		return -1;
+	}
+
+	ktapvm_argv[0] = malloc(strlen(script_file) + 1);
+	if (!ktapvm_argv[0]) {
+		fprintf(stderr, "cannot allocate memory\n");
+		return -1;
+	}
+	strcpy(ktapvm_argv[0], script_file);
+	ktapvm_argv[0][strlen(script_file)] = '\0';
+
+	/* pass rest argv into ktapvm */
+	new_index = 1;
+	for (i = script_args_start; i < script_args_end; i++) {
+		ktapvm_argv[new_index] = malloc(strlen(argv[i]) + 1);
+		if (!ktapvm_argv[new_index]) {
+			fprintf(stderr, "cannot allocate memory\n");
+			return -1;
+		}
+		strcpy(ktapvm_argv[new_index], argv[i]);
+		ktapvm_argv[new_index][strlen(argv[i])] = '\0';
+		new_index++;
+	}
+
+	uparm.argv = ktapvm_argv;
+	uparm.argc = new_index;
+	uparm.verbose = verbose;
+	uparm.trace_pid = trace_pid;
+	uparm.trace_cpu = trace_cpu;
+	uparm.print_timestamp = print_timestamp;
+	uparm.quiet = quiet;
+	uparm.dry_run = dry_run;
+
+	/* start running into kernel ktapvm */
+	ret = run_ktapvm();
+
+	cleanup_event_resources();
+	return ret;
+}
+
+
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 20/29] ktap: add compiler(tools/ktap/kp_[lex|parse].[c|h])
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (18 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 19/29] ktap: add main file for ktap binary(tools/ktap/kp_main.c) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 21/29] ktap: add symbol handling code(tools/ktap/symbol.[c|h]) Jovi Zhangwei
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

The ktap compiler is based on LuaJIT; it was originally forked from Lua.

The compiler uses single-pass compilation, so it is very fast, and
the resulting binary is very small.

[root@localhost ktap]# ll -h ktap
-rwxr-xr-x. 1 root root 83K Mar 27 06:09 ktap

The compiler is easy to hack on, and the code is easy to understand.

Note that some LuaJIT bytecodes are not used by ktap; they could
be removed in the future.

More compiler work coming soon:
1). aggregation
	@name[keys] = aggfunction(args)

2). multi-key table
	var s = {}
	s[key1, key2, key3] = value

3). C structure access

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/kp_lex.c   |  552 +++++++++
 tools/ktap/kp_lex.h   |   94 ++
 tools/ktap/kp_parse.c | 3139 +++++++++++++++++++++++++++++++++++++++++++++++++
 tools/ktap/kp_parse.h |    4 +
 4 files changed, 3789 insertions(+)
 create mode 100644 tools/ktap/kp_lex.c
 create mode 100644 tools/ktap/kp_lex.h
 create mode 100644 tools/ktap/kp_parse.c
 create mode 100644 tools/ktap/kp_parse.h
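
Reviewer note: lex_string() in kp_lex.c decodes '\xXX' escapes with a compact
nibble trick: it takes the low four bits of each character and adds 9 when
the character is a letter. The standalone sketch below spells that out. Both
helpers are hypothetical names, not functions from this patch, and the trick
assumes an ASCII-compatible character set.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_digit_value(int c)
{
	if (!isxdigit((unsigned char)c))
		return -1;

	/*
	 * In ASCII, '0'..'9' have a low nibble of 0..9, while 'a'..'f' and
	 * 'A'..'F' have a low nibble of 1..6, so letters need an extra 9.
	 */
	return (c & 15) + (isdigit((unsigned char)c) ? 0 : 9);
}

/*
 * Hypothetical helper: decode the two hex digits that follow "\x".
 * Returns 0..255 on success, or -1 if either character is not a hex digit.
 */
static int decode_hex_escape(const char *p)
{
	int hi, lo;

	assert(p != NULL);

	hi = hex_digit_value((unsigned char)p[0]);
	if (hi < 0)
		return -1;

	lo = hex_digit_value((unsigned char)p[1]);
	if (lo < 0)
		return -1;

	return (hi << 4) | lo;
}
```

For example, the digits "41" decode to 0x41 and "fF" to 255, while "g0" or a
truncated "4" are rejected, which corresponds to the err_xesc path in the
lexer.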

diff --git a/tools/ktap/kp_lex.c b/tools/ktap/kp_lex.c
new file mode 100644
index 0000000..e9597f1
--- /dev/null
+++ b/tools/ktap/kp_lex.c
@@ -0,0 +1,552 @@
+/*
+ * Lexical analyzer.
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * Adapted from luajit and lua interpreter.
+ * Copyright (C) 2005-2014 Mike Pall.
+ * Copyright (C) 1994-2008 Lua.org, PUC-Rio.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include "../../include/uapi/ktap/ktap_types.h"
+#include "../../include/uapi/ktap/ktap_err.h"
+#include "kp_util.h"
+#include "kp_lex.h"
+#include "kp_parse.h"
+
+/* lexer token names. */
+static const char *const tokennames[] = {
+#define TKSTR1(name)		#name,
+#define TKSTR2(name, sym)	#sym,
+TKDEF(TKSTR1, TKSTR2)
+#undef TKSTR1
+#undef TKSTR2
+  NULL
+};
+
+/* -- Buffer handling ----------------------------------------------------- */
+
+#define LEX_EOF			(-1)
+#define lex_iseol(ls)		(ls->c == '\n' || ls->c == '\r')
+
+/* Get next character. */
+static inline LexChar lex_next(LexState *ls)
+{
+	return (ls->c = ls->p < ls->pe ? (LexChar)(uint8_t)*ls->p++ : LEX_EOF);
+}
+
+/* Save character. */
+static inline void lex_save(LexState *ls, LexChar c)
+{
+	kp_buf_putb(&ls->sb, c);
+}
+
+/* Save previous character and get next character. */
+static inline LexChar lex_savenext(LexState *ls)
+{
+	lex_save(ls, ls->c);
+	return lex_next(ls);
+}
+
+/* Skip line break. Handles "\n", "\r", "\r\n" or "\n\r". */
+static void lex_newline(LexState *ls)
+{
+	LexChar old = ls->c;
+
+	kp_assert(lex_iseol(ls));
+	lex_next(ls);  /* Skip "\n" or "\r". */
+	if (lex_iseol(ls) && ls->c != old)
+		lex_next(ls);  /* Skip "\n\r" or "\r\n". */
+	if (++ls->linenumber >= KP_MAX_LINE)
+		kp_lex_error(ls, ls->tok, KP_ERR_XLINES);
+}
+
+/* -- Scanner for terminals ----------------------------------------------- */
+
+static int kp_str2d(const char *s, size_t len, ktap_number *result)
+{
+	char *endptr;
+
+	if (strpbrk(s, "nN"))  /* reject 'inf' and 'nan' */
+		return 0;
+	else
+		*result = (long)strtoul(s, &endptr, 0);
+
+	if (endptr == s)
+		return 0;  /* nothing recognized */
+	while (kp_char_isspace((unsigned char)(*endptr)))
+		endptr++;
+	return (endptr == s + len);  /* OK if no trailing characters */
+}
+
+
+/* Parse a number literal. */
+static void lex_number(LexState *ls, ktap_val_t *tv)
+{
+	LexChar c, xp = 'e';
+	ktap_number n = 0;
+
+	kp_assert(kp_char_isdigit(ls->c));
+	if ((c = ls->c) == '0' && (lex_savenext(ls) | 0x20) == 'x')
+		xp = 'p';
+	while (kp_char_isident(ls->c) || ls->c == '.' ||
+		((ls->c == '-' || ls->c == '+') && (c | 0x20) == xp)) {
+		c = ls->c;
+		lex_savenext(ls);
+	}
+	lex_save(ls, '\0');
+	if (!kp_str2d(sbufB(&ls->sb), sbuflen(&ls->sb) - 1, &n))
+		kp_lex_error(ls, ls->tok, KP_ERR_XNUMBER);
+	set_number(tv, n);
+}
+
+/* Skip equal signs for "[=...=[" and "]=...=]" and return their count. */
+static int lex_skipeq(LexState *ls)
+{
+	int count = 0;
+	LexChar s = ls->c;
+
+	kp_assert(s == '[' || s == ']');
+	while (lex_savenext(ls) == '=')
+		count++;
+	return (ls->c == s) ? count : (-count) - 1;
+}
+
+/* Parse a long string or long comment (tv set to NULL). */
+static void lex_longstring(LexState *ls, ktap_val_t *tv, int sep)
+{
+	lex_savenext(ls);  /* Skip second '['. */
+	if (lex_iseol(ls))  /* Skip initial newline. */
+		lex_newline(ls);
+	for (;;) {
+		switch (ls->c) {
+		case LEX_EOF:
+			kp_lex_error(ls, TK_eof,
+					tv ? KP_ERR_XLSTR : KP_ERR_XLCOM);
+			break;
+		case ']':
+			if (lex_skipeq(ls) == sep) {
+				lex_savenext(ls);  /* Skip second ']'. */
+				goto endloop;
+			}
+			break;
+		case '\n':
+		case '\r':
+			lex_save(ls, '\n');
+			lex_newline(ls);
+			if (!tv) /* Don't waste space for comments. */
+				kp_buf_reset(&ls->sb);
+			break;
+		default:
+			lex_savenext(ls);
+			break;
+		}
+	}
+ endloop:
+	if (tv) {
+		ktap_str_t *str = kp_parse_keepstr(ls,
+					sbufB(&ls->sb) + (2 + (int)sep),
+					sbuflen(&ls->sb) - 2*(2 + (int)sep));
+		set_string(tv, str);
+	}
+}
+
+/* Parse a string. */
+static void lex_string(LexState *ls, ktap_val_t *tv)
+{
+	LexChar delim = ls->c;  /* Delimiter is '\'' or '"'. */
+
+	lex_savenext(ls);
+	while (ls->c != delim) {
+		switch (ls->c) {
+		case LEX_EOF:
+			kp_lex_error(ls, TK_eof, KP_ERR_XSTR);
+			continue;
+		case '\n':
+		case '\r':
+			kp_lex_error(ls, TK_string, KP_ERR_XSTR);
+			continue;
+		case '\\': {
+			LexChar c = lex_next(ls);  /* Skip the '\\'. */
+			switch (c) {
+			case 'a': c = '\a'; break;
+			case 'b': c = '\b'; break;
+			case 'f': c = '\f'; break;
+			case 'n': c = '\n'; break;
+			case 'r': c = '\r'; break;
+			case 't': c = '\t'; break;
+			case 'v': c = '\v'; break;
+			case 'x':  /* Hexadecimal escape '\xXX'. */
+				c = (lex_next(ls) & 15u) << 4;
+				if (!kp_char_isdigit(ls->c)) {
+					if (!kp_char_isxdigit(ls->c))
+						goto err_xesc;
+					c += 9 << 4;
+				}
+				c += (lex_next(ls) & 15u);
+				if (!kp_char_isdigit(ls->c)) {
+					if (!kp_char_isxdigit(ls->c))
+						goto err_xesc;
+					c += 9;
+				}
+				break;
+			case 'z':  /* Skip whitespace. */
+				lex_next(ls);
+				while (kp_char_isspace(ls->c))
+					if (lex_iseol(ls))
+						lex_newline(ls);
+					else
+						lex_next(ls);
+				continue;
+			case '\n': case '\r':
+				lex_save(ls, '\n');
+				lex_newline(ls);
+				continue;
+			case '\\': case '\"': case '\'':
+				break;
+			case LEX_EOF:
+				continue;
+			default:
+				if (!kp_char_isdigit(c))
+					goto err_xesc;
+				c -= '0';  /* Decimal escape '\ddd'. */
+				if (kp_char_isdigit(lex_next(ls))) {
+					c = c*10 + (ls->c - '0');
+					if (kp_char_isdigit(lex_next(ls))) {
+						c = c*10 + (ls->c - '0');
+						if (c > 255) {
+ err_xesc:
+							kp_lex_error(ls,
+								TK_string,
+								KP_ERR_XESC);
+						}
+						lex_next(ls);
+					}
+				}
+				lex_save(ls, c);
+				continue;
+			}
+			lex_save(ls, c);
+			lex_next(ls);
+			continue;
+		}
+		default:
+			lex_savenext(ls);
+			break;
+		}
+	}
+	lex_savenext(ls);  /* Skip trailing delimiter. */
+	set_string(tv,
+		kp_parse_keepstr(ls, sbufB(&ls->sb)+1, sbuflen(&ls->sb)-2));
+}
+
+/* lex helper for parse_trace and parse_timer */
+void kp_lex_read_string_until(LexState *ls, int c)
+{
+	ktap_str_t *ts;
+
+	kp_buf_reset(&ls->sb);
+
+	while (ls->c == ' ')
+		lex_next(ls);
+
+	do {
+		lex_savenext(ls);
+	} while (ls->c != c && ls->c != LEX_EOF);
+
+	if (ls->c != c)
+		kp_lex_error(ls, ls->tok, KP_ERR_XTOKEN, c);
+
+	ts = kp_parse_keepstr(ls, sbufB(&ls->sb), sbuflen(&ls->sb));
+	ls->tok = TK_string;
+	set_string(&ls->tokval, ts);
+}
+
+
+/* -- Main lexical scanner ------------------------------------------------ */
+
+/* Get next lexical token. */
+static LexToken lex_scan(LexState *ls, ktap_val_t *tv)
+{
+	kp_buf_reset(&ls->sb);
+	for (;;) {
+		if (kp_char_isident(ls->c)) {
+			ktap_str_t *s;
+			if (kp_char_isdigit(ls->c)) {  /* Numeric literal. */
+				lex_number(ls, tv);
+				return TK_number;
+			}
+			/* Identifier or reserved word. */
+			do {
+				lex_savenext(ls);
+			} while (kp_char_isident(ls->c));
+			s = kp_parse_keepstr(ls, sbufB(&ls->sb),
+						sbuflen(&ls->sb));
+			set_string(tv, s);
+			if (s->reserved > 0)  /* Reserved word? */
+				return TK_OFS + s->reserved;
+			return TK_name;
+		}
+
+		switch (ls->c) {
+		case '\n':
+		case '\r':
+			lex_newline(ls);
+			continue;
+		case ' ':
+		case '\t':
+		case '\v':
+		case '\f':
+			lex_next(ls);
+			continue;
+
+		case '#':
+			while (!lex_iseol(ls) && ls->c != LEX_EOF)
+				lex_next(ls);
+			break;
+		case '-':
+			lex_next(ls);
+			if (ls->c != '-')
+				return '-';
+			lex_next(ls);
+			if (ls->c == '[') { /* Long comment "--[=*[...]=*]". */
+				int sep = lex_skipeq(ls);
+				/* `lex_skipeq' may dirty the buffer */
+				kp_buf_reset(&ls->sb);
+				if (sep >= 0) {
+					lex_longstring(ls, NULL, sep);
+					kp_buf_reset(&ls->sb);
+					continue;
+				}
+			}
+			/* Short comment "--.*\n". */
+			while (!lex_iseol(ls) && ls->c != LEX_EOF)
+				lex_next(ls);
+			continue;
+		case '[': {
+			int sep = lex_skipeq(ls);
+			if (sep >= 0) {
+				lex_longstring(ls, tv, sep);
+				return TK_string;
+			} else if (sep == -1) {
+				return '[';
+			} else {
+				kp_lex_error(ls, TK_string, KP_ERR_XLDELIM);
+				continue;
+			}
+		}
+		case '+': {
+			lex_next(ls);
+			if (ls->c != '=')
+				return '+';
+			else {
+				lex_next(ls);
+				return TK_incr;
+			}
+		}
+		case '=':
+			lex_next(ls);
+			if (ls->c != '=')
+				return '=';
+			else {
+				lex_next(ls);
+				return TK_eq;
+			}
+		case '<':
+			lex_next(ls);
+			if (ls->c != '=')
+				return '<';
+			else {
+				lex_next(ls);
+				return TK_le;
+			}
+		case '>':
+			lex_next(ls);
+			if (ls->c != '=')
+				return '>';
+			else {
+				lex_next(ls);
+				return TK_ge;
+			}
+		case '!':
+			lex_next(ls);
+			if (ls->c != '=')
+				return TK_not;
+			else {
+				lex_next(ls);
+				return TK_ne;
+			}
+		case ':':
+			lex_next(ls);
+			if (ls->c != ':')
+				return ':';
+			else {
+				lex_next(ls);
+				return TK_label;
+			}
+		case '"':
+		case '\'':
+			lex_string(ls, tv);
+			return TK_string;
+		case '.':
+			if (lex_savenext(ls) == '.') {
+				lex_next(ls);
+				if (ls->c == '.') {
+					lex_next(ls);
+					return TK_dots;   /* ... */
+				}
+				return TK_concat;   /* .. */
+			} else if (!kp_char_isdigit(ls->c)) {
+				return '.';
+			} else {
+				lex_number(ls, tv);
+				return TK_number;
+			}
+		case LEX_EOF:
+			return TK_eof;
+		case '&':
+			lex_next(ls);
+			if (ls->c != '&')
+				return '&';
+			else {
+				lex_next(ls);
+				return TK_and;
+			}
+		case '|':
+			lex_next(ls);
+			if (ls->c != '|')
+				return '|';
+			else {
+				lex_next(ls);
+				return TK_or;
+			}
+		default: {
+			LexChar c = ls->c;
+			lex_next(ls);
+			return c;  /* Single-char tokens (+ - / ...). */
+		}
+		}
+	}
+}
+
+/* -- Lexer API ----------------------------------------------------------- */
+
+/* Setup lexer state. */
+int kp_lex_setup(LexState *ls, const char *str)
+{
+	ls->fs = NULL;
+	ls->pe = ls->p = NULL;
+	ls->p = str;
+	ls->pe = str + strlen(str);
+	ls->vstack = NULL;
+	ls->sizevstack = 0;
+	ls->vtop = 0;
+	ls->bcstack = NULL;
+	ls->sizebcstack = 0;
+	ls->lookahead = TK_eof;  /* No look-ahead token. */
+	ls->linenumber = 1;
+	ls->lastline = 1;
+	lex_next(ls);  /* Read-ahead first char. */
+	if (ls->c == 0xef && ls->p + 2 <= ls->pe &&
+		(uint8_t)ls->p[0] == 0xbb &&
+		(uint8_t)ls->p[1] == 0xbf) {/* Skip UTF-8 BOM (if buffered). */
+		ls->p += 2;
+		lex_next(ls);
+	}
+	if (ls->c == '#') {  /* Skip POSIX #! header line. */
+		do {
+			lex_next(ls);
+			if (ls->c == LEX_EOF)
+				return 0;
+		} while (!lex_iseol(ls));
+		lex_newline(ls);
+	}
+	return 0;
+}
+
+/* Cleanup lexer state. */
+void kp_lex_cleanup(LexState *ls)
+{
+	free(ls->bcstack);
+	free(ls->vstack);
+	kp_buf_free(&ls->sb);
+}
+
+/* Advance to the next lexical token (consumes any lookahead). */
+void kp_lex_next(LexState *ls)
+{
+	ls->lastline = ls->linenumber;
+	if (ls->lookahead == TK_eof) {  /* No lookahead token? */
+		ls->tok = lex_scan(ls, &ls->tokval);  /* Get next token. */
+	} else {  /* Otherwise return lookahead token. */
+		ls->tok = ls->lookahead;
+		ls->lookahead = TK_eof;
+		ls->tokval = ls->lookaheadval;
+	}
+}
+
+/* Look ahead for the next token. */
+LexToken kp_lex_lookahead(LexState *ls)
+{
+	kp_assert(ls->lookahead == TK_eof);
+	ls->lookahead = lex_scan(ls, &ls->lookaheadval);
+	return ls->lookahead;
+}
+
+/* Convert token to string. */
+const char *kp_lex_token2str(LexState *ls, LexToken tok)
+{
+	if (tok > TK_OFS)
+		return tokennames[tok-TK_OFS-1];
+	else if (!kp_char_iscntrl(tok))
+		return kp_sprintf("%c", tok);
+	else
+		return kp_sprintf("char(%d)", tok);
+}
+
+/* Lexer error. */
+void kp_lex_error(LexState *ls, LexToken tok, ErrMsg em, ...)
+{
+	const char *tokstr;
+	va_list argp;
+
+	if (tok == 0) {
+		tokstr = NULL;
+	} else if (tok == TK_name || tok == TK_string || tok == TK_number) {
+		lex_save(ls, '\0');
+		tokstr = sbufB(&ls->sb);
+	} else {
+		tokstr = kp_lex_token2str(ls, tok);
+	}
+
+	va_start(argp, em);
+	kp_err_lex(ls->chunkname, tokstr, ls->linenumber, em, argp);
+	va_end(argp);
+}
+
+/* Initialize strings for reserved words. */
+void kp_lex_init(void)
+{
+	uint32_t i;
+
+	for (i = 0; i < TK_RESERVED; i++) {
+		ktap_str_t *s = kp_str_newz(tokennames[i]);
+		s->reserved = (uint8_t)(i+1);
+	}
+}
+
diff --git a/tools/ktap/kp_lex.h b/tools/ktap/kp_lex.h
new file mode 100644
index 0000000..d84babd
--- /dev/null
+++ b/tools/ktap/kp_lex.h
@@ -0,0 +1,94 @@
+/*
+ * Lexical analyzer.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * Adapted from luajit and lua interpreter.
+ * Copyright (C) 2005-2014 Mike Pall.
+ * Copyright (C) 1994-2008 Lua.org, PUC-Rio.
+ */
+
+#ifndef _KTAP_LEX_H
+#define _KTAP_LEX_H
+
+#include <stdarg.h>
+#include "../include/err.h"
+#include "../../include/uapi/ktap/ktap_bc.h"
+#include "kp_util.h"
+
+/* ktap lexer tokens. */
+#define TKDEF(_, __) \
+	_(trace) _(trace_end) _(argstr) _(probename) _(ffi) \
+	_(arg0) _(arg1) _(arg2) _(arg3) _(arg4) _(arg5) _(arg6) _(arg7) \
+	_(arg8) _(arg9) _(profile) _(tick) \
+	_(pid) _(tid) _(uid) _(cpu) _(execname) __(incr, +=) \
+	__(and, &&) _(break) _(do) _(else) _(elseif) _(end) _(false) \
+	_(for) _(function) _(goto) _(if) _(in) __(local, var) _(nil) \
+	__(not, !) __(or, ||) \
+	_(repeat) _(return) _(then) _(true) _(until) _(while) \
+	__(concat, ..) __(dots, ...) __(eq, ==) __(ge, >=) __(le, <=) \
+	__(ne, !=) __(label, ::) __(number, <number>) __(name, <name>) \
+	__(string, <string>) __(eof, <eof>)
+
+enum {
+	TK_OFS = 256,
+#define TKENUM1(name)		TK_##name,
+#define TKENUM2(name, sym)	TK_##name,
+	TKDEF(TKENUM1, TKENUM2)
+#undef TKENUM1
+#undef TKENUM2
+	TK_RESERVED = TK_while - TK_OFS
+};
+
+typedef int LexChar;	/* Lexical character. Unsigned ext. from char. */
+typedef int LexToken;	/* Lexical token. */
+
+/* Combined bytecode ins/line. Only used during bytecode generation. */
+typedef struct BCInsLine {
+	BCIns ins;		/* Bytecode instruction. */
+	BCLine line;		/* Line number for this bytecode. */
+} BCInsLine;
+
+/* Info for local variables. Only used during bytecode generation. */
+typedef struct VarInfo {
+	ktap_str_t *name;	/* Local variable name or goto/label name. */
+	BCPos startpc;	/* First point where the local variable is active. */
+	BCPos endpc;	/* First point where the local variable is dead. */
+	uint8_t slot;	/* Variable slot. */
+	uint8_t info;	/* Variable/goto/label info. */
+} VarInfo;
+
+/* lexer state. */
+typedef struct LexState {
+	struct FuncState *fs;	/* Current FuncState. Defined in kp_parse.c. */
+	ktap_val_t tokval;	/* Current token value. */
+	ktap_val_t lookaheadval;/* Lookahead token value. */
+	const char *p;	/* Current position in input buffer. */
+	const char *pe;	/* End of input buffer. */
+	LexChar c;		/* Current character. */
+	LexToken tok;		/* Current token. */
+	LexToken lookahead;	/* Lookahead token. */
+	SBuf sb;		/* String buffer for tokens. */
+	BCLine linenumber;	/* Input line counter. */
+	BCLine lastline;	/* Line of last token. */
+	ktap_str_t *chunkname;/* Current chunk name (interned string). */
+	const char *chunkarg;	/* Chunk name argument. */
+	const char *mode;/* Allow loading bytecode (b) and/or source text (t) */
+	VarInfo *vstack;/* Stack for names and extents of local variables. */
+	int sizevstack;	/* Size of variable stack. */
+	int vtop;	/* Top of variable stack. */
+	BCInsLine *bcstack;/* Stack for bytecode instructions/line numbers. */
+	int sizebcstack;/* Size of bytecode stack. */
+	uint32_t level;	/* Syntactical nesting level. */
+} LexState;
+
+int kp_lex_setup(LexState *ls, const char *str);
+void kp_lex_cleanup(LexState *ls);
+void kp_lex_next(LexState *ls);
+void kp_lex_read_string_until(LexState *ls, int c);
+LexToken kp_lex_lookahead(LexState *ls);
+const char *kp_lex_token2str(LexState *ls, LexToken tok);
+void kp_lex_error(LexState *ls, LexToken tok, ErrMsg em, ...);
+void kp_lex_init(void);
+
+#endif
diff --git a/tools/ktap/kp_parse.c b/tools/ktap/kp_parse.c
new file mode 100644
index 0000000..5c3916c
--- /dev/null
+++ b/tools/ktap/kp_parse.c
@@ -0,0 +1,3139 @@
+/*
+ * ktap parser (source code -> bytecode).
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * Adapted from luajit and lua interpreter.
+ * Copyright (C) 2005-2014 Mike Pall.
+ * Copyright (C) 1994-2008 Lua.org, PUC-Rio.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include "../../include/uapi/ktap/ktap_types.h"
+#include "../../include/uapi/ktap/ktap_err.h"
+#include "kp_util.h"
+#include "kp_lex.h"
+
+/* Fixed internal variable names. */
+#define VARNAMEDEF(_) \
+	_(FOR_IDX, "(for index)") \
+	_(FOR_STOP, "(for limit)") \
+	_(FOR_STEP, "(for step)") \
+	_(FOR_GEN, "(for generator)") \
+	_(FOR_STATE, "(for state)") \
+	_(FOR_CTL, "(for control)")
+
+enum {
+	VARNAME_END,
+#define VARNAMEENUM(name, str)  VARNAME_##name,
+	VARNAMEDEF(VARNAMEENUM)
+#undef VARNAMEENUM
+	VARNAME__MAX
+};
+
+/* -- Parser structures and definitions ----------------------------------- */
+
+/* Expression kinds. */
+typedef enum {
+	/* Constant expressions must be first and in this order: */
+	VKNIL,
+	VKFALSE,
+	VKTRUE,
+	VKSTR,	/* sval = string value */
+	VKNUM,	/* nval = number value */
+	VKLAST = VKNUM,
+	VKCDATA, /* nval = cdata value, not treated as a constant expression */
+	/* Non-constant expressions follow: */
+	VLOCAL,	/* info = local register, aux = vstack index */
+	VUPVAL,	/* info = upvalue index, aux = vstack index */
+	VGLOBAL,/* sval = string value */
+	VINDEXED,/* info = table register, aux = index reg/byte/string const */
+	VJMP,	/* info = instruction PC */
+	VRELOCABLE, /* info = instruction PC */
+	VNONRELOC, /* info = result register */
+	VCALL,	/* info = instruction PC, aux = base */
+	VVOID,
+
+	VARGN,
+	VARGSTR,
+	VARGNAME,
+	VPID,
+	VTID,
+	VUID,
+	VCPU,
+	VEXECNAME,
+	VMAX
+} ExpKind;
+
+/* Expression descriptor. */
+typedef struct ExpDesc {
+	union {
+		struct {
+			uint32_t info;	/* Primary info. */
+			uint32_t aux;	/* Secondary info. */
+		} s;
+		ktap_val_t nval;	/* Number value. */
+		ktap_str_t *sval;	/* String value. */
+	} u;
+	ExpKind k;
+	BCPos t;	/* True condition jump list. */
+	BCPos f;	/* False condition jump list. */
+} ExpDesc;
+
+/* Macros for expressions. */
+#define expr_hasjump(e)		((e)->t != (e)->f)
+
+#define expr_isk(e)		((e)->k <= VKLAST)
+#define expr_isk_nojump(e)	(expr_isk(e) && !expr_hasjump(e))
+#define expr_isnumk(e)		((e)->k == VKNUM)
+#define expr_isnumk_nojump(e)	(expr_isnumk(e) && !expr_hasjump(e))
+#define expr_isstrk(e)		((e)->k == VKSTR)
+
+#define expr_numtv(e)		(&(e)->u.nval)
+#define expr_numberV(e)		nvalue(expr_numtv((e)))
+
+/* Initialize expression. */
+static inline void expr_init(ExpDesc *e, ExpKind k, uint32_t info)
+{
+	e->k = k;
+	e->u.s.info = info;
+	e->f = e->t = NO_JMP;
+}
+
+/* Check number constant for +-0. */
+static int expr_numiszero(ExpDesc *e)
+{
+	ktap_val_t *o = expr_numtv(e);
+	return (nvalue(o) == 0);
+}
+
+/* Per-function linked list of scope blocks. */
+typedef struct FuncScope {
+	struct FuncScope *prev;	/* Link to outer scope. */
+	int vstart;		/* Start of block-local variables. */
+	uint8_t nactvar;	/* Number of active vars outside the scope. */
+	uint8_t flags;		/* Scope flags. */
+} FuncScope;
+
+#define FSCOPE_LOOP		0x01	/* Scope is a (breakable) loop. */
+#define FSCOPE_BREAK		0x02	/* Break used in scope. */
+#define FSCOPE_GOLA		0x04	/* Goto or label used in scope. */
+#define FSCOPE_UPVAL		0x08	/* Upvalue in scope. */
+#define FSCOPE_NOCLOSE		0x10	/* Do not close upvalues. */
+
+#define NAME_BREAK		((ktap_str_t *)(uintptr_t)1)
+
+/* Index into variable stack. */
+typedef uint16_t VarIndex;
+#define KP_MAX_VSTACK		(65536 - KP_MAX_UPVAL)
+
+/* Variable/goto/label info. */
+#define VSTACK_VAR_RW		0x01	/* R/W variable. */
+#define VSTACK_GOTO		0x02	/* Pending goto. */
+#define VSTACK_LABEL		0x04	/* Label. */
+
+/* Per-function state. */
+typedef struct FuncState {
+	ktap_tab_t *kt;		/* Hash table for constants. */
+	LexState *ls;		/* Lexer state. */
+	FuncScope *bl;		/* Current scope. */
+	struct FuncState *prev;	/* Enclosing function. */
+	BCPos pc;		/* Next bytecode position. */
+	BCPos lasttarget;	/* Bytecode position of last jump target. */
+	BCPos jpc;		/* Pending jump list to next bytecode. */
+	BCReg freereg;		/* First free register. */
+	BCReg nactvar;		/* Number of active local variables. */
+	BCReg nkn, nkgc;	/* Number of ktap_number/ktap_obj_t constants */
+	BCLine linedefined;	/* First line of the function definition. */
+	BCInsLine *bcbase;	/* Base of bytecode stack. */
+	BCPos bclim;		/* Limit of bytecode stack. */
+	int vbase;		/* Base of variable stack for this function. */
+	uint8_t flags;		/* Prototype flags. */
+	uint8_t numparams;	/* Number of parameters. */
+	uint8_t framesize;	/* Fixed frame size. */
+	uint8_t nuv;		/* Number of upvalues */
+	VarIndex varmap[KP_MAX_LOCVAR];/* Map from register to variable idx. */
+	VarIndex uvmap[KP_MAX_UPVAL];	/* Map from upvalue to variable idx. */
+	VarIndex uvtmp[KP_MAX_UPVAL];	/* Temporary upvalue map. */
+} FuncState;
+
+/* Binary and unary operators. ORDER OPR */
+typedef enum BinOpr {
+	OPR_ADD, OPR_SUB, OPR_MUL, OPR_DIV, OPR_MOD, OPR_POW, /* ORDER ARITH */
+	OPR_CONCAT,
+	OPR_NE, OPR_EQ,
+	OPR_LT, OPR_GE, OPR_LE, OPR_GT,
+	OPR_AND, OPR_OR,
+	OPR_NOBINOPR
+} BinOpr;
+
+KP_STATIC_ASSERT((int)BC_ISGE-(int)BC_ISLT == (int)OPR_GE-(int)OPR_LT);
+KP_STATIC_ASSERT((int)BC_ISLE-(int)BC_ISLT == (int)OPR_LE-(int)OPR_LT);
+KP_STATIC_ASSERT((int)BC_ISGT-(int)BC_ISLT == (int)OPR_GT-(int)OPR_LT);
+KP_STATIC_ASSERT((int)BC_SUBVV-(int)BC_ADDVV == (int)OPR_SUB-(int)OPR_ADD);
+KP_STATIC_ASSERT((int)BC_MULVV-(int)BC_ADDVV == (int)OPR_MUL-(int)OPR_ADD);
+KP_STATIC_ASSERT((int)BC_DIVVV-(int)BC_ADDVV == (int)OPR_DIV-(int)OPR_ADD);
+KP_STATIC_ASSERT((int)BC_MODVV-(int)BC_ADDVV == (int)OPR_MOD-(int)OPR_ADD);
+
+/* -- Error handling ------------------------------------------------------ */
+
+static void err_syntax(LexState *ls, ErrMsg em)
+{
+	kp_lex_error(ls, ls->tok, em);
+}
+
+static void err_token(LexState *ls, LexToken tok)
+{
+	kp_lex_error(ls, ls->tok, KP_ERR_XTOKEN, kp_lex_token2str(ls, tok));
+}
+
+static void err_limit(FuncState *fs, uint32_t limit, const char *what)
+{
+	if (fs->linedefined == 0)
+		kp_lex_error(fs->ls, 0, KP_ERR_XLIMM, limit, what);
+	else
+		kp_lex_error(fs->ls, 0, KP_ERR_XLIMF, fs->linedefined,
+				limit, what);
+}
+
+#define checklimit(fs, v, l, m)		if ((v) >= (l)) err_limit(fs, l, m)
+#define checklimitgt(fs, v, l, m)	if ((v) > (l)) err_limit(fs, l, m)
+#define checkcond(ls, c, em)		{ if (!(c)) err_syntax(ls, em); }
+
+/* -- Management of constants --------------------------------------------- */
+
+/* Return bytecode encoding for primitive constant. */
+#define const_pri(e)	((e)->k)
+
+#define tvhaskslot(o)	(is_number(o))
+#define tvkslot(o)	(nvalue(o))
+
+/* Add a number constant. */
+static BCReg const_num(FuncState *fs, ExpDesc *e)
+{
+	ktap_val_t *o;
+
+	kp_assert(expr_isnumk(e));
+	o = kp_tab_set(fs->kt, &e->u.nval);
+	if (tvhaskslot(o))
+		return tvkslot(o);
+	set_number(o, fs->nkn);
+	return fs->nkn++;
+}
+
+/* Add a GC object constant. */
+static BCReg const_gc(FuncState *fs, ktap_obj_t *gc, uint32_t itype)
+{
+	ktap_val_t key, *o;
+
+	setitype(&key, itype);
+	key.val.gc = gc;
+	o = kp_tab_set(fs->kt, &key);
+	if (tvhaskslot(o))
+		return tvkslot(o);
+	set_number(o, fs->nkgc);
+	return fs->nkgc++;
+}
+
+/* Add a string constant. */
+static BCReg const_str(FuncState *fs, ExpDesc *e)
+{
+	kp_assert(expr_isstrk(e) || e->k == VGLOBAL);
+	return const_gc(fs, obj2gco(e->u.sval), KTAP_TSTR);
+}
+
+/* Anchor string constant. */
+ktap_str_t *kp_parse_keepstr(LexState *ls, const char *str, size_t len)
+{
+	ktap_val_t v, *tv;
+	ktap_str_t *s = kp_str_new(str, len);
+
+	set_string(&v, s);
+	tv = kp_tab_set(ls->fs->kt, &v);
+	if (is_nil(tv))
+		set_bool(tv, 1);
+	return s;
+}
+
+/* -- Jump list handling -------------------------------------------------- */
+
+/* Get next element in jump list. */
+static BCPos jmp_next(FuncState *fs, BCPos pc)
+{
+	ptrdiff_t delta = bc_j(fs->bcbase[pc].ins);
+	if ((BCPos)delta == NO_JMP)
+		return NO_JMP;
+	else
+		return (BCPos)(((ptrdiff_t)pc+1)+delta);
+}
+
+/* Check if any of the instructions on the jump list produce no value. */
+static int jmp_novalue(FuncState *fs, BCPos list)
+{
+	for (; list != NO_JMP; list = jmp_next(fs, list)) {
+		BCIns p = fs->bcbase[list >= 1 ? list-1 : list].ins;
+		if (!(bc_op(p) == BC_ISTC || bc_op(p) == BC_ISFC ||
+			bc_a(p) == NO_REG))
+			return 1;
+	}
+	return 0;
+}
+
+/* Patch register of test instructions. */
+static int jmp_patchtestreg(FuncState *fs, BCPos pc, BCReg reg)
+{
+	BCInsLine *ilp = &fs->bcbase[pc >= 1 ? pc-1 : pc];
+	BCOp op = bc_op(ilp->ins);
+
+	if (op == BC_ISTC || op == BC_ISFC) {
+		if (reg != NO_REG && reg != bc_d(ilp->ins)) {
+			setbc_a(&ilp->ins, reg);
+		} else {/* Nothing to store or already in the right register */
+			setbc_op(&ilp->ins, op+(BC_IST-BC_ISTC));
+			setbc_a(&ilp->ins, 0);
+		}
+	} else if (bc_a(ilp->ins) == NO_REG) {
+		if (reg == NO_REG) {
+			ilp->ins =
+				BCINS_AJ(BC_JMP, bc_a(fs->bcbase[pc].ins), 0);
+		} else {
+			setbc_a(&ilp->ins, reg);
+			if (reg >= bc_a(ilp[1].ins))
+				setbc_a(&ilp[1].ins, reg+1);
+		}
+	} else {
+		return 0;  /* Cannot patch other instructions. */
+	}
+	return 1;
+}
+
+/* Drop values for all instructions on jump list. */
+static void jmp_dropval(FuncState *fs, BCPos list)
+{
+	for (; list != NO_JMP; list = jmp_next(fs, list))
+		jmp_patchtestreg(fs, list, NO_REG);
+}
+
+/* Patch jump instruction to target. */
+static void jmp_patchins(FuncState *fs, BCPos pc, BCPos dest)
+{
+	BCIns *jmp = &fs->bcbase[pc].ins;
+	BCPos offset = dest-(pc+1)+BCBIAS_J;
+
+	kp_assert(dest != NO_JMP);
+	if (offset > BCMAX_D)
+		err_syntax(fs->ls, KP_ERR_XJUMP);
+	setbc_d(jmp, offset);
+}
+
+/* Append to jump list. */
+static void jmp_append(FuncState *fs, BCPos *l1, BCPos l2)
+{
+	if (l2 == NO_JMP) {
+		return;
+	} else if (*l1 == NO_JMP) {
+		*l1 = l2;
+	} else {
+		BCPos list = *l1;
+		BCPos next;
+		/* Find last element. */
+		while ((next = jmp_next(fs, list)) != NO_JMP)
+			list = next;
+		jmp_patchins(fs, list, l2);
+	}
+}
+
+/* Patch jump list and preserve produced values. */
+static void jmp_patchval(FuncState *fs, BCPos list, BCPos vtarget,
+			 BCReg reg, BCPos dtarget)
+{
+	while (list != NO_JMP) {
+		BCPos next = jmp_next(fs, list);
+		if (jmp_patchtestreg(fs, list, reg)) {
+			/* Jump to target with value. */
+			jmp_patchins(fs, list, vtarget);
+		} else {
+			/* Jump to default target. */
+			jmp_patchins(fs, list, dtarget);
+		}
+		list = next;
+	}
+}
+
+/* Jump to following instruction. Append to list of pending jumps. */
+static void jmp_tohere(FuncState *fs, BCPos list)
+{
+	fs->lasttarget = fs->pc;
+	jmp_append(fs, &fs->jpc, list);
+}
+
+/* Patch jump list to target. */
+static void jmp_patch(FuncState *fs, BCPos list, BCPos target)
+{
+	if (target == fs->pc) {
+		jmp_tohere(fs, list);
+	} else {
+		kp_assert(target < fs->pc);
+		jmp_patchval(fs, list, target, NO_REG, target);
+	}
+}
+
+/* -- Bytecode register allocator ----------------------------------------- */
+
+/* Bump frame size. */
+static void bcreg_bump(FuncState *fs, BCReg n)
+{
+	BCReg sz = fs->freereg + n;
+
+	if (sz > fs->framesize) {
+		if (sz >= KP_MAX_SLOTS)
+			err_syntax(fs->ls, KP_ERR_XSLOTS);
+		fs->framesize = (uint8_t)sz;
+	}
+}
+
+/* Reserve registers. */
+static void bcreg_reserve(FuncState *fs, BCReg n)
+{
+	bcreg_bump(fs, n);
+	fs->freereg += n;
+}
+
+/* Free register. */
+static void bcreg_free(FuncState *fs, BCReg reg)
+{
+	if (reg >= fs->nactvar) {
+		fs->freereg--;
+		kp_assert(reg == fs->freereg);
+	}
+}
+
+/* Free register for expression. */
+static void expr_free(FuncState *fs, ExpDesc *e)
+{
+	if (e->k == VNONRELOC)
+		bcreg_free(fs, e->u.s.info);
+}
+
+/* -- Bytecode emitter ---------------------------------------------------- */
+
+/* Emit bytecode instruction. */
+static BCPos bcemit_INS(FuncState *fs, BCIns ins)
+{
+	BCPos pc = fs->pc;
+	LexState *ls = fs->ls;
+
+	jmp_patchval(fs, fs->jpc, pc, NO_REG, pc);
+	fs->jpc = NO_JMP;
+	if (pc >= fs->bclim) {
+		ptrdiff_t base = fs->bcbase - ls->bcstack;
+		checklimit(fs, ls->sizebcstack, KP_MAX_BCINS,
+				"bytecode instructions");
+		if (!ls->bcstack) {
+			ls->bcstack = malloc(sizeof(BCInsLine) * 20);
+			ls->sizebcstack = 20;
+		} else {
+			ls->bcstack = realloc(ls->bcstack,
+				ls->sizebcstack * sizeof(BCInsLine) * 2);
+			ls->sizebcstack = ls->sizebcstack * 2;
+		}
+		fs->bclim = (BCPos)(ls->sizebcstack - base);
+		fs->bcbase = ls->bcstack + base;
+	}
+	fs->bcbase[pc].ins = ins;
+	fs->bcbase[pc].line = ls->lastline;
+	fs->pc = pc+1;
+	return pc;
+}
+
+#define bcemit_ABC(fs, o, a, b, c)	bcemit_INS(fs, BCINS_ABC(o, a, b, c))
+#define bcemit_AD(fs, o, a, d)		bcemit_INS(fs, BCINS_AD(o, a, d))
+#define bcemit_AJ(fs, o, a, j)		bcemit_INS(fs, BCINS_AJ(o, a, j))
+
+#define bcptr(fs, e)			(&(fs)->bcbase[(e)->u.s.info].ins)
+
+/* -- Bytecode emitter for expressions ------------------------------------ */
+
+/* Discharge non-constant expression to any register. */
+static void expr_discharge(FuncState *fs, ExpDesc *e)
+{
+	BCIns ins;
+
+	if (e->k == VUPVAL) {
+		ins = BCINS_AD(BC_UGET, 0, e->u.s.info);
+	} else if (e->k == VGLOBAL) {
+		ins = BCINS_AD(BC_GGET, 0, const_str(fs, e));
+	} else if (e->k == VINDEXED) {
+		BCReg rc = e->u.s.aux;
+		if ((int32_t)rc < 0) {
+			ins = BCINS_ABC(BC_TGETS, 0, e->u.s.info, ~rc);
+		} else if (rc > BCMAX_C) {
+			ins = BCINS_ABC(BC_TGETB, 0, e->u.s.info,
+					rc-(BCMAX_C+1));
+		} else {
+			bcreg_free(fs, rc);
+			ins = BCINS_ABC(BC_TGETV, 0, e->u.s.info, rc);
+		}
+		bcreg_free(fs, e->u.s.info);
+	} else if (e->k == VCALL) {
+		e->u.s.info = e->u.s.aux;
+		e->k = VNONRELOC;
+		return;
+	} else if (e->k == VLOCAL) {
+		e->k = VNONRELOC;
+		return;
+	} else {
+		return;
+	}
+
+	e->u.s.info = bcemit_INS(fs, ins);
+	e->k = VRELOCABLE;
+}
+
+/* Emit bytecode to set a range of registers to nil. */
+static void bcemit_nil(FuncState *fs, BCReg from, BCReg n)
+{
+	if (fs->pc > fs->lasttarget) {  /* No jumps to current position? */
+		BCIns *ip = &fs->bcbase[fs->pc-1].ins;
+		BCReg pto, pfrom = bc_a(*ip);
+		/* Try to merge with the previous instruction. */
+		switch (bc_op(*ip)) {
+		case BC_KPRI:
+			if (bc_d(*ip) != ~KTAP_TNIL) break;
+			if (from == pfrom) {
+				if (n == 1)
+					return;
+			} else if (from == pfrom+1) {
+				from = pfrom;
+				n++;
+			} else {
+				break;
+			}
+			/* Replace KPRI. */
+			*ip = BCINS_AD(BC_KNIL, from, from+n-1);
+			return;
+		case BC_KNIL:
+			pto = bc_d(*ip);
+			/* Can we connect both ranges? */
+			if (pfrom <= from && from <= pto+1) {
+				if (from+n-1 > pto) {
+					/* Patch previous instruction range. */
+					setbc_d(ip, from+n-1);
+				}
+				return;
+			}
+			break;
+		default:
+			break;
+		}
+	}
+
+	/* Emit new instruction or replace old instruction. */
+	bcemit_INS(fs, n == 1 ? BCINS_AD(BC_KPRI, from, VKNIL) :
+				BCINS_AD(BC_KNIL, from, from+n-1));
+}
+
+/* Discharge an expression to a specific register. Ignore branches. */
+static void expr_toreg_nobranch(FuncState *fs, ExpDesc *e, BCReg reg)
+{
+	BCIns ins;
+
+	expr_discharge(fs, e);
+	if (e->k == VKSTR) {
+		ins = BCINS_AD(BC_KSTR, reg, const_str(fs, e));
+	} else if (e->k == VKNUM) {
+		ktap_number n = expr_numberV(e);
+		if (n >= 0 && n <= 0xffff) {
+			ins = BCINS_AD(BC_KSHORT, reg, (BCReg)(uint16_t)n);
+		} else
+			ins = BCINS_AD(BC_KNUM, reg, const_num(fs, e));
+	} else if (e->k == VRELOCABLE) {
+		setbc_a(bcptr(fs, e), reg);
+		goto noins;
+	} else if (e->k == VNONRELOC) {
+		if (reg == e->u.s.info)
+			goto noins;
+		ins = BCINS_AD(BC_MOV, reg, e->u.s.info);
+	} else if (e->k == VKNIL) {
+		bcemit_nil(fs, reg, 1);
+		goto noins;
+	} else if (e->k <= VKTRUE) {
+		ins = BCINS_AD(BC_KPRI, reg, const_pri(e));
+	} else if (e->k == VARGN) {
+		ins = BCINS_AD(BC_VARGN, reg, e->u.s.info);
+	} else if (e->k > VARGN && e->k < VMAX) {
+		ins = BCINS_AD(e->k - VARGN + BC_VARGN, reg, 0);
+	} else {
+		kp_assert(e->k == VVOID || e->k == VJMP);
+		return;
+	}
+	bcemit_INS(fs, ins);
+ noins:
+	e->u.s.info = reg;
+	e->k = VNONRELOC;
+}
+
+/* Forward declaration. */
+static BCPos bcemit_jmp(FuncState *fs);
+
+/* Discharge an expression to a specific register. */
+static void expr_toreg(FuncState *fs, ExpDesc *e, BCReg reg)
+{
+	expr_toreg_nobranch(fs, e, reg);
+	if (e->k == VJMP) {
+		/* Add it to the true jump list. */
+		jmp_append(fs, &e->t, e->u.s.info);
+	}
+	if (expr_hasjump(e)) {  /* Discharge expression with branches. */
+		BCPos jend, jfalse = NO_JMP, jtrue = NO_JMP;
+		if (jmp_novalue(fs, e->t) || jmp_novalue(fs, e->f)) {
+			BCPos jval = (e->k == VJMP) ? NO_JMP : bcemit_jmp(fs);
+			jfalse = bcemit_AD(fs, BC_KPRI, reg, VKFALSE);
+			bcemit_AJ(fs, BC_JMP, fs->freereg, 1);
+			jtrue = bcemit_AD(fs, BC_KPRI, reg, VKTRUE);
+			jmp_tohere(fs, jval);
+		}
+		jend = fs->pc;
+		fs->lasttarget = jend;
+		jmp_patchval(fs, e->f, jend, reg, jfalse);
+		jmp_patchval(fs, e->t, jend, reg, jtrue);
+	}
+	e->f = e->t = NO_JMP;
+	e->u.s.info = reg;
+	e->k = VNONRELOC;
+}
+
+/* Discharge an expression to the next free register. */
+static void expr_tonextreg(FuncState *fs, ExpDesc *e)
+{
+	expr_discharge(fs, e);
+	expr_free(fs, e);
+	bcreg_reserve(fs, 1);
+	expr_toreg(fs, e, fs->freereg - 1);
+}
+
+/* Discharge an expression to any register. */
+static BCReg expr_toanyreg(FuncState *fs, ExpDesc *e)
+{
+	expr_discharge(fs, e);
+	if (e->k == VNONRELOC) {
+		if (!expr_hasjump(e))
+			return e->u.s.info;  /* Already in a register. */
+		if (e->u.s.info >= fs->nactvar) {
+			/* Discharge to temp. register. */
+			expr_toreg(fs, e, e->u.s.info);
+			return e->u.s.info;
+		}
+	}
+	expr_tonextreg(fs, e);  /* Discharge to next register. */
+	return e->u.s.info;
+}
+
+/* Partially discharge expression to a value. */
+static void expr_toval(FuncState *fs, ExpDesc *e)
+{
+	if (expr_hasjump(e))
+		expr_toanyreg(fs, e);
+	else
+		expr_discharge(fs, e);
+}
+
+/* Emit store for LHS expression. */
+static void bcemit_store(FuncState *fs, ExpDesc *var, ExpDesc *e)
+{
+	BCIns ins;
+
+	if (var->k == VLOCAL) {
+		fs->ls->vstack[var->u.s.aux].info |= VSTACK_VAR_RW;
+		expr_free(fs, e);
+		expr_toreg(fs, e, var->u.s.info);
+		return;
+	} else if (var->k == VUPVAL) {
+		fs->ls->vstack[var->u.s.aux].info |= VSTACK_VAR_RW;
+		expr_toval(fs, e);
+		if (e->k <= VKTRUE)
+			ins = BCINS_AD(BC_USETP, var->u.s.info, const_pri(e));
+		else if (e->k == VKSTR)
+			ins = BCINS_AD(BC_USETS, var->u.s.info,
+					const_str(fs, e));
+		else if (e->k == VKNUM)
+			ins = BCINS_AD(BC_USETN, var->u.s.info,
+					const_num(fs, e));
+		else
+			ins = BCINS_AD(BC_USETV, var->u.s.info,
+					expr_toanyreg(fs, e));
+	} else if (var->k == VGLOBAL) {
+		BCReg ra = expr_toanyreg(fs, e);
+		ins = BCINS_AD(BC_GSET, ra, const_str(fs, var));
+	} else {
+		BCReg ra, rc;
+		kp_assert(var->k == VINDEXED);
+		ra = expr_toanyreg(fs, e);
+		rc = var->u.s.aux;
+		if ((int32_t)rc < 0) {
+			ins = BCINS_ABC(BC_TSETS, ra, var->u.s.info, ~rc);
+		} else if (rc > BCMAX_C) {
+			ins = BCINS_ABC(BC_TSETB, ra, var->u.s.info,
+				rc-(BCMAX_C+1));
+		} else {
+			/*
+			 * Free late allocated key reg to avoid assert on
+			 * free of value reg. This can only happen when
+			 * called from expr_table().
+			 */
+			kp_assert(e->k != VNONRELOC || ra < fs->nactvar ||
+					rc < ra || (bcreg_free(fs, rc),1));
+			ins = BCINS_ABC(BC_TSETV, ra, var->u.s.info, rc);
+		}
+	}
+	bcemit_INS(fs, ins);
+	expr_free(fs, e);
+}
+
+/* Emit store for '+=' expression. */
+static void bcemit_store_incr(FuncState *fs, ExpDesc *var, ExpDesc *e)
+{
+	BCIns ins;
+
+	if (var->k == VLOCAL) {
+		/* '+=' is rejected on locals ("var a=0; a+=1"); use 'a=a+1' */
+		err_syntax(fs->ls, KP_ERR_XSYMBOL);
+		return;
+	} else if (var->k == VUPVAL) {
+		fs->ls->vstack[var->u.s.aux].info |= VSTACK_VAR_RW;
+		expr_toval(fs, e);
+		if (e->k == VKNUM) {
+			ins = BCINS_AD(BC_UINCN, var->u.s.info,
+					const_num(fs, e));
+		} else if (e->k <= VKTRUE || e->k == VKSTR) {
+			err_syntax(fs->ls, KP_ERR_XSYMBOL);
+			return;
+		} else
+			ins = BCINS_AD(BC_UINCV, var->u.s.info,
+					expr_toanyreg(fs, e));
+	} else if (var->k == VGLOBAL) {
+		BCReg ra = expr_toanyreg(fs, e);
+		ins = BCINS_AD(BC_GINC, ra, const_str(fs, var));
+	} else {
+		BCReg ra, rc;
+		kp_assert(var->k == VINDEXED);
+		ra = expr_toanyreg(fs, e);
+		rc = var->u.s.aux;
+		if ((int32_t)rc < 0) {
+			ins = BCINS_ABC(BC_TINCS, ra, var->u.s.info, ~rc);
+		} else if (rc > BCMAX_C) {
+			ins = BCINS_ABC(BC_TINCB, ra, var->u.s.info,
+				rc-(BCMAX_C+1));
+		} else {
+			/*
+			 * Free late allocated key reg to avoid assert on
+			 * free of value reg. This can only happen when
+			 * called from expr_table().
+			 */
+			kp_assert(e->k != VNONRELOC || ra < fs->nactvar ||
+					rc < ra || (bcreg_free(fs, rc),1));
+			ins = BCINS_ABC(BC_TINCV, ra, var->u.s.info, rc);
+		}
+	}
+	bcemit_INS(fs, ins);
+	expr_free(fs, e);
+}
+
+
+/* Emit method lookup expression. */
+static void bcemit_method(FuncState *fs, ExpDesc *e, ExpDesc *key)
+{
+	BCReg idx, func, obj = expr_toanyreg(fs, e);
+
+	expr_free(fs, e);
+	func = fs->freereg;
+	bcemit_AD(fs, BC_MOV, func+1, obj);/* Copy object to first argument. */
+	kp_assert(expr_isstrk(key));
+	idx = const_str(fs, key);
+	if (idx <= BCMAX_C) {
+		bcreg_reserve(fs, 2);
+		bcemit_ABC(fs, BC_TGETS, func, obj, idx);
+	} else {
+		bcreg_reserve(fs, 3);
+		bcemit_AD(fs, BC_KSTR, func+2, idx);
+		bcemit_ABC(fs, BC_TGETV, func, obj, func+2);
+		fs->freereg--;
+	}
+	e->u.s.info = func;
+	e->k = VNONRELOC;
+}
+
+/* -- Bytecode emitter for branches --------------------------------------- */
+
+/* Emit unconditional branch. */
+static BCPos bcemit_jmp(FuncState *fs)
+{
+	BCPos jpc = fs->jpc;
+	BCPos j = fs->pc - 1;
+	BCIns *ip = &fs->bcbase[j].ins;
+
+	fs->jpc = NO_JMP;
+	if ((int32_t)j >= (int32_t)fs->lasttarget && bc_op(*ip) == BC_UCLO)
+		setbc_j(ip, NO_JMP);
+	else
+		j = bcemit_AJ(fs, BC_JMP, fs->freereg, NO_JMP);
+	jmp_append(fs, &j, jpc);
+	return j;
+}
+
+/* Invert branch condition of bytecode instruction. */
+static void invertcond(FuncState *fs, ExpDesc *e)
+{
+	BCIns *ip = &fs->bcbase[e->u.s.info - 1].ins;
+	setbc_op(ip, bc_op(*ip)^1);
+}
+
+/* Emit conditional branch. */
+static BCPos bcemit_branch(FuncState *fs, ExpDesc *e, int cond)
+{
+	BCPos pc;
+
+	if (e->k == VRELOCABLE) {
+		BCIns *ip = bcptr(fs, e);
+		if (bc_op(*ip) == BC_NOT) {
+			*ip = BCINS_AD(cond ? BC_ISF : BC_IST, 0, bc_d(*ip));
+			return bcemit_jmp(fs);
+		}
+	}
+	if (e->k != VNONRELOC) {
+		bcreg_reserve(fs, 1);
+		expr_toreg_nobranch(fs, e, fs->freereg-1);
+	}
+	bcemit_AD(fs, cond ? BC_ISTC : BC_ISFC, NO_REG, e->u.s.info);
+	pc = bcemit_jmp(fs);
+	expr_free(fs, e);
+	return pc;
+}
+
+/* Emit branch on true condition. */
+static void bcemit_branch_t(FuncState *fs, ExpDesc *e)
+{
+	BCPos pc;
+
+	expr_discharge(fs, e);
+	if (e->k == VKSTR || e->k == VKNUM || e->k == VKTRUE)
+		pc = NO_JMP;  /* Never jump. */
+	else if (e->k == VJMP)
+		invertcond(fs, e), pc = e->u.s.info;
+	else if (e->k == VKFALSE || e->k == VKNIL)
+		expr_toreg_nobranch(fs, e, NO_REG), pc = bcemit_jmp(fs);
+	else
+		pc = bcemit_branch(fs, e, 0);
+	jmp_append(fs, &e->f, pc);
+	jmp_tohere(fs, e->t);
+	e->t = NO_JMP;
+}
+
+/* Emit branch on false condition. */
+static void bcemit_branch_f(FuncState *fs, ExpDesc *e)
+{
+	BCPos pc;
+
+	expr_discharge(fs, e);
+	if (e->k == VKNIL || e->k == VKFALSE)
+		pc = NO_JMP;  /* Never jump. */
+	else if (e->k == VJMP)
+		pc = e->u.s.info;
+	else if (e->k == VKSTR || e->k == VKNUM || e->k == VKTRUE)
+		expr_toreg_nobranch(fs, e, NO_REG), pc = bcemit_jmp(fs);
+	else
+		pc = bcemit_branch(fs, e, 1);
+	jmp_append(fs, &e->t, pc);
+	jmp_tohere(fs, e->f);
+	e->f = NO_JMP;
+}
+
+/* -- Bytecode emitter for operators -------------------------------------- */
+
+static ktap_number number_foldarith(ktap_number x, ktap_number y, int op)
+{
+	switch (op) {
+	case OPR_ADD - OPR_ADD: return x + y;
+	case OPR_SUB - OPR_ADD: return x - y;
+	case OPR_MUL - OPR_ADD: return x * y;
+	case OPR_DIV - OPR_ADD: return x / y;
+	default: return x;
+	}
+}
+
+/* Try constant-folding of arithmetic operators. */
+static int foldarith(BinOpr opr, ExpDesc *e1, ExpDesc *e2)
+{
+	ktap_val_t o;
+	ktap_number n;
+
+	if (!expr_isnumk_nojump(e1) || !expr_isnumk_nojump(e2))
+		return 0;
+
+	if (opr == OPR_DIV && expr_numberV(e2) == 0)
+		return 0; /* do not attempt to divide by 0 */
+
+	if (opr == OPR_MOD)
+		return 0; /* ktap currently does not constant-fold mod */
+
+	n = number_foldarith(expr_numberV(e1), expr_numberV(e2),
+				(int)opr-OPR_ADD);
+	set_number(&o, n);
+	set_number(&e1->u.nval, n);
+	return 1;
+}
+
+/* Emit arithmetic operator. */
+static void bcemit_arith(FuncState *fs, BinOpr opr, ExpDesc *e1, ExpDesc *e2)
+{
+	BCReg rb, rc, t;
+	uint32_t op;
+
+	if (foldarith(opr, e1, e2))
+		return;
+	if (opr == OPR_POW) {
+		op = BC_POW;
+		rc = expr_toanyreg(fs, e2);
+		rb = expr_toanyreg(fs, e1);
+	} else {
+		op = opr-OPR_ADD+BC_ADDVV;
+		/*
+		 * Must discharge 2nd operand first since VINDEXED
+		 * might free regs.
+		 */
+		expr_toval(fs, e2);
+		if (expr_isnumk(e2) && (rc = const_num(fs, e2)) <= BCMAX_C)
+			op -= BC_ADDVV-BC_ADDVN;
+		else
+			rc = expr_toanyreg(fs, e2);
+		/* 1st operand discharged by bcemit_binop_left,
+		 * but need KNUM/KSHORT. */
+		kp_assert(expr_isnumk(e1) || e1->k == VNONRELOC);
+		expr_toval(fs, e1);
+		/* Avoid two consts to satisfy bytecode constraints. */
+		if (expr_isnumk(e1) && !expr_isnumk(e2) &&
+			(t = const_num(fs, e1)) <= BCMAX_B) {
+			rb = rc; rc = t; op -= BC_ADDVV-BC_ADDNV;
+		} else {
+			rb = expr_toanyreg(fs, e1);
+		}
+	}
+	/* Using expr_free might cause asserts if the order is wrong. */
+	if (e1->k == VNONRELOC && e1->u.s.info >= fs->nactvar)
+		fs->freereg--;
+	if (e2->k == VNONRELOC && e2->u.s.info >= fs->nactvar)
+		fs->freereg--;
+	e1->u.s.info = bcemit_ABC(fs, op, 0, rb, rc);
+	e1->k = VRELOCABLE;
+}
+
+/* Emit comparison operator. */
+static void bcemit_comp(FuncState *fs, BinOpr opr, ExpDesc *e1, ExpDesc *e2)
+{
+	ExpDesc *eret = e1;
+	BCIns ins;
+
+	expr_toval(fs, e1);
+	if (opr == OPR_EQ || opr == OPR_NE) {
+		BCOp op = opr == OPR_EQ ? BC_ISEQV : BC_ISNEV;
+		BCReg ra;
+
+		if (expr_isk(e1)) { /* Need constant in 2nd arg. */
+			e1 = e2;
+			e2 = eret;
+		}
+		ra = expr_toanyreg(fs, e1);  /* First arg must be in a reg. */
+		expr_toval(fs, e2);
+		switch (e2->k) {
+		case VKNIL: case VKFALSE: case VKTRUE:
+			ins = BCINS_AD(op+(BC_ISEQP-BC_ISEQV), ra,
+					const_pri(e2));
+			break;
+		case VKSTR:
+			ins = BCINS_AD(op+(BC_ISEQS-BC_ISEQV), ra,
+					const_str(fs, e2));
+			break;
+		case VKNUM:
+			ins = BCINS_AD(op+(BC_ISEQN-BC_ISEQV), ra,
+					const_num(fs, e2));
+			break;
+		default:
+			ins = BCINS_AD(op, ra, expr_toanyreg(fs, e2));
+			break;
+		}
+	} else {
+		uint32_t op = opr-OPR_LT+BC_ISLT;
+		BCReg ra, rd;
+		if ((op-BC_ISLT) & 1) {  /* GT -> LT, GE -> LE */
+			e1 = e2; e2 = eret;  /* Swap operands. */
+			op = ((op-BC_ISLT)^3)+BC_ISLT;
+			expr_toval(fs, e1);
+		}
+		rd = expr_toanyreg(fs, e2);
+		ra = expr_toanyreg(fs, e1);
+		ins = BCINS_AD(op, ra, rd);
+	}
+	/* Using expr_free might cause asserts if the order is wrong. */
+	if (e1->k == VNONRELOC && e1->u.s.info >= fs->nactvar)
+		fs->freereg--;
+	if (e2->k == VNONRELOC && e2->u.s.info >= fs->nactvar)
+		fs->freereg--;
+	bcemit_INS(fs, ins);
+	eret->u.s.info = bcemit_jmp(fs);
+	eret->k = VJMP;
+}
+
+/* Fixup left side of binary operator. */
+static void bcemit_binop_left(FuncState *fs, BinOpr op, ExpDesc *e)
+{
+	if (op == OPR_AND) {
+		bcemit_branch_t(fs, e);
+	} else if (op == OPR_OR) {
+		bcemit_branch_f(fs, e);
+	} else if (op == OPR_CONCAT) {
+		expr_tonextreg(fs, e);
+	} else if (op == OPR_EQ || op == OPR_NE) {
+		if (!expr_isk_nojump(e))
+			expr_toanyreg(fs, e);
+	} else {
+		if (!expr_isnumk_nojump(e))
+			expr_toanyreg(fs, e);
+	}
+}
+
+/* Emit binary operator. */
+static void bcemit_binop(FuncState *fs, BinOpr op, ExpDesc *e1, ExpDesc *e2)
+{
+	if (op <= OPR_POW) {
+		bcemit_arith(fs, op, e1, e2);
+	} else if (op == OPR_AND) {
+		kp_assert(e1->t == NO_JMP);  /* List must be closed. */
+		expr_discharge(fs, e2);
+		jmp_append(fs, &e2->f, e1->f);
+		*e1 = *e2;
+	} else if (op == OPR_OR) {
+		kp_assert(e1->f == NO_JMP);  /* List must be closed. */
+		expr_discharge(fs, e2);
+		jmp_append(fs, &e2->t, e1->t);
+		*e1 = *e2;
+	} else if (op == OPR_CONCAT) {
+		expr_toval(fs, e2);
+		if (e2->k == VRELOCABLE && bc_op(*bcptr(fs, e2)) == BC_CAT) {
+			kp_assert(e1->u.s.info == bc_b(*bcptr(fs, e2))-1);
+			expr_free(fs, e1);
+			setbc_b(bcptr(fs, e2), e1->u.s.info);
+			e1->u.s.info = e2->u.s.info;
+		} else {
+			expr_tonextreg(fs, e2);
+			expr_free(fs, e2);
+			expr_free(fs, e1);
+			e1->u.s.info = bcemit_ABC(fs, BC_CAT, 0, e1->u.s.info,
+								 e2->u.s.info);
+		}
+		e1->k = VRELOCABLE;
+	} else {
+		kp_assert(op == OPR_NE || op == OPR_EQ || op == OPR_LT ||
+			  op == OPR_GE || op == OPR_LE || op == OPR_GT);
+		bcemit_comp(fs, op, e1, e2);
+	}
+}
+
+/* Emit unary operator. */
+static void bcemit_unop(FuncState *fs, BCOp op, ExpDesc *e)
+{
+	if (op == BC_NOT) {
+		/* Swap true and false lists. */
+		{ BCPos temp = e->f; e->f = e->t; e->t = temp; }
+		jmp_dropval(fs, e->f);
+		jmp_dropval(fs, e->t);
+		expr_discharge(fs, e);
+		if (e->k == VKNIL || e->k == VKFALSE) {
+			e->k = VKTRUE;
+			return;
+		} else if (expr_isk(e)) {
+			e->k = VKFALSE;
+			return;
+		} else if (e->k == VJMP) {
+			invertcond(fs, e);
+			return;
+		} else if (e->k == VRELOCABLE) {
+			bcreg_reserve(fs, 1);
+			setbc_a(bcptr(fs, e), fs->freereg-1);
+			e->u.s.info = fs->freereg-1;
+			e->k = VNONRELOC;
+		} else {
+			kp_assert(e->k == VNONRELOC);
+		}
+	} else {
+		kp_assert(op == BC_UNM || op == BC_LEN);
+		/* Constant-fold negations. */
+		if (op == BC_UNM && !expr_hasjump(e)) {
+			/* Avoid folding to -0. */
+			if (expr_isnumk(e) && !expr_numiszero(e)) {
+				ktap_val_t *o = expr_numtv(e);
+				if (is_number(o))
+					set_number(o, -nvalue(o));
+				return;
+			}
+		}
+		expr_toanyreg(fs, e);
+	}
+	expr_free(fs, e);
+	e->u.s.info = bcemit_AD(fs, op, 0, e->u.s.info);
+	e->k = VRELOCABLE;
+}
+
+/* -- Lexer support ------------------------------------------------------- */
+
+/* Check and consume optional token. */
+static int lex_opt(LexState *ls, LexToken tok)
+{
+	if (ls->tok == tok) {
+		kp_lex_next(ls);
+		return 1;
+	}
+	return 0;
+}
+
+/* Check and consume token. */
+static void lex_check(LexState *ls, LexToken tok)
+{
+	if (ls->tok != tok)
+		err_token(ls, tok);
+	kp_lex_next(ls);
+}
+
+/* Check for matching token. */
+static void lex_match(LexState *ls, LexToken what, LexToken who, BCLine line)
+{
+	if (!lex_opt(ls, what)) {
+		if (line == ls->linenumber) {
+			err_token(ls, what);
+		} else {
+			const char *swhat = kp_lex_token2str(ls, what);
+			const char *swho = kp_lex_token2str(ls, who);
+			kp_lex_error(ls, ls->tok, KP_ERR_XMATCH, swhat, swho,
+								line);
+		}
+	}
+}
+
+/* Check for string token. */
+static ktap_str_t *lex_str(LexState *ls)
+{
+	ktap_str_t *s;
+
+	if (ls->tok != TK_name)
+		err_token(ls, TK_name);
+	s = rawtsvalue(&ls->tokval);
+	kp_lex_next(ls);
+	return s;
+}
+
+/* -- Variable handling --------------------------------------------------- */
+
+#define var_get(ls, fs, i)	((ls)->vstack[(fs)->varmap[(i)]])
+
+/* Define a new local variable. */
+static void var_new(LexState *ls, BCReg n, ktap_str_t *name)
+{
+	FuncState *fs = ls->fs;
+	int vtop = ls->vtop;
+
+	checklimit(fs, fs->nactvar+n, KP_MAX_LOCVAR, "local variables");
+	if (vtop >= ls->sizevstack) {
+		if (ls->sizevstack >= KP_MAX_VSTACK)
+			kp_lex_error(ls, 0, KP_ERR_XLIMC, KP_MAX_VSTACK);
+		if (!ls->vstack) {
+			ls->vstack = malloc(sizeof(VarInfo) * 20);
+			ls->sizevstack = 20;
+		} else {
+			ls->vstack = realloc(ls->vstack,
+				ls->sizevstack * sizeof(VarInfo) * 2);
+			ls->sizevstack = ls->sizevstack * 2;
+		}
+	}
+	kp_assert((uintptr_t)name < VARNAME__MAX ||
+			kp_tab_getstr(fs->kt, name) != NULL);
+	ls->vstack[vtop].name = name;
+	fs->varmap[fs->nactvar+n] = (uint16_t)vtop;
+	ls->vtop = vtop+1;
+}
+
+#define var_new_lit(ls, n, v) \
+	var_new(ls, (n), kp_parse_keepstr(ls, "" v, sizeof(v)-1))
+
+#define var_new_fixed(ls, n, vn) \
+	var_new(ls, (n), (ktap_str_t *)(uintptr_t)(vn))
+
+/* Add local variables. */
+static void var_add(LexState *ls, BCReg nvars)
+{
+	FuncState *fs = ls->fs;
+	BCReg nactvar = fs->nactvar;
+
+	while (nvars--) {
+		VarInfo *v = &var_get(ls, fs, nactvar);
+		v->startpc = fs->pc;
+		v->slot = nactvar++;
+		v->info = 0;
+	}
+	fs->nactvar = nactvar;
+}
+
+/* Remove local variables. */
+static void var_remove(LexState *ls, BCReg tolevel)
+{
+	FuncState *fs = ls->fs;
+	while (fs->nactvar > tolevel)
+		var_get(ls, fs, --fs->nactvar).endpc = fs->pc;
+}
+
+/* Lookup local variable name. */
+static BCReg var_lookup_local(FuncState *fs, ktap_str_t *n)
+{
+	int i;
+
+	for (i = fs->nactvar-1; i >= 0; i--) {
+		if (n == var_get(fs->ls, fs, i).name)
+			return (BCReg)i;
+	}
+	return (BCReg)-1;  /* Not found. */
+}
+
+/* Lookup or add upvalue index. */
+static int var_lookup_uv(FuncState *fs, int vidx, ExpDesc *e)
+{
+	int i, n = fs->nuv;
+
+	for (i = 0; i < n; i++)
+		if (fs->uvmap[i] == vidx)
+			return i;  /* Already exists. */
+
+	/* Otherwise create a new one. */
+	checklimit(fs, fs->nuv, KP_MAX_UPVAL, "upvalues");
+	kp_assert(e->k == VLOCAL || e->k == VUPVAL);
+	fs->uvmap[n] = (uint16_t)vidx;
+	fs->uvtmp[n] = (uint16_t)(e->k == VLOCAL ? vidx :
+			KP_MAX_VSTACK+e->u.s.info);
+	fs->nuv = n+1;
+	return n;
+}
+
+/* Forward declaration. */
+static void fscope_uvmark(FuncState *fs, BCReg level);
+
+/* Recursively lookup variables in enclosing functions. */
+static int var_lookup_(FuncState *fs, ktap_str_t *name, ExpDesc *e,
+			 int first)
+{
+	if (fs) {
+		BCReg reg = var_lookup_local(fs, name);
+		if ((int32_t)reg >= 0) {  /* Local in this function? */
+			expr_init(e, VLOCAL, reg);
+			if (!first) {
+				/* Scope now has an upvalue. */
+				fscope_uvmark(fs, reg);
+			}
+			return (int)(e->u.s.aux = (uint32_t)fs->varmap[reg]);
+		} else {
+			/* Var in outer func? */
+			int vidx = var_lookup_(fs->prev, name, e, 0);
+			if ((int32_t)vidx >= 0) {
+				/* Yes, make it an upvalue here. */
+				e->u.s.info =
+					(uint8_t)var_lookup_uv(fs, vidx, e);
+				e->k = VUPVAL;
+				return vidx;
+			}
+		}
+	} else {  /* Not found in any function, must be a global. */
+		expr_init(e, VGLOBAL, 0);
+		e->u.sval = name;
+	}
+	return (int)-1;  /* Global. */
+}
+
+/* Lookup variable name. */
+#define var_lookup(ls, e) \
+	var_lookup_((ls)->fs, lex_str(ls), (e), 1)
+
+/* -- Goto and label handling --------------------------------------------- */
+
+/* Add a new goto or label. */
+static int gola_new(LexState *ls, ktap_str_t *name, uint8_t info, BCPos pc)
+{
+	FuncState *fs = ls->fs;
+	int vtop = ls->vtop;
+
+	if (vtop >= ls->sizevstack) {
+		if (ls->sizevstack >= KP_MAX_VSTACK)
+			kp_lex_error(ls, 0, KP_ERR_XLIMC, KP_MAX_VSTACK);
+		if (!ls->vstack) {
+			ls->vstack = malloc(sizeof(VarInfo) * 20);
+			ls->sizevstack = 20;
+		} else {
+			ls->vstack = realloc(ls->vstack,
+					ls->sizevstack * sizeof(VarInfo) * 2);
+			ls->sizevstack = ls->sizevstack * 2;
+		}
+	}
+	kp_assert(name == NAME_BREAK ||
+		  kp_tab_getstr(fs->kt, name) != NULL);
+	ls->vstack[vtop].name = name;
+	ls->vstack[vtop].startpc = pc;
+	ls->vstack[vtop].slot = (uint8_t)fs->nactvar;
+	ls->vstack[vtop].info = info;
+	ls->vtop = vtop+1;
+	return vtop;
+}
+
+#define gola_isgoto(v)		((v)->info & VSTACK_GOTO)
+#define gola_islabel(v)		((v)->info & VSTACK_LABEL)
+#define gola_isgotolabel(v)	((v)->info & (VSTACK_GOTO|VSTACK_LABEL))
+
+/* Patch goto to jump to label. */
+static void gola_patch(LexState *ls, VarInfo *vg, VarInfo *vl)
+{
+	FuncState *fs = ls->fs;
+	BCPos pc = vg->startpc;
+
+	vg->name = NULL; /* Invalidate pending goto. */
+	setbc_a(&fs->bcbase[pc].ins, vl->slot);
+	jmp_patch(fs, pc, vl->startpc);
+}
+
+/* Patch goto to close upvalues. */
+static void gola_close(LexState *ls, VarInfo *vg)
+{
+	FuncState *fs = ls->fs;
+	BCPos pc = vg->startpc;
+	BCIns *ip = &fs->bcbase[pc].ins;
+	kp_assert(gola_isgoto(vg));
+	kp_assert(bc_op(*ip) == BC_JMP || bc_op(*ip) == BC_UCLO);
+	setbc_a(ip, vg->slot);
+	if (bc_op(*ip) == BC_JMP) {
+		BCPos next = jmp_next(fs, pc);
+		if (next != NO_JMP)
+			jmp_patch(fs, next, pc);  /* Jump to UCLO. */
+		setbc_op(ip, BC_UCLO);  /* Turn into UCLO. */
+		setbc_j(ip, NO_JMP);
+	}
+}
+
+/* Resolve pending forward gotos for label. */
+static void gola_resolve(LexState *ls, FuncScope *bl, int idx)
+{
+	VarInfo *vg = ls->vstack + bl->vstart;
+	VarInfo *vl = ls->vstack + idx;
+	for (; vg < vl; vg++)
+		if (vg->name == vl->name && gola_isgoto(vg)) {
+			if (vg->slot < vl->slot) {
+				ktap_str_t *name =
+					var_get(ls, ls->fs, vg->slot).name;
+				kp_assert((uintptr_t)name >= VARNAME__MAX);
+				ls->linenumber =
+					ls->fs->bcbase[vg->startpc].line;
+				kp_assert(vg->name != NAME_BREAK);
+				kp_lex_error(ls, 0, KP_ERR_XGSCOPE,
+					     getstr(vg->name), getstr(name));
+			}
+			gola_patch(ls, vg, vl);
+		}
+}
+
+/* Fixup remaining gotos and labels for scope. */
+static void gola_fixup(LexState *ls, FuncScope *bl)
+{
+	VarInfo *v = ls->vstack + bl->vstart;
+	VarInfo *ve = ls->vstack + ls->vtop;
+
+	for (; v < ve; v++) {
+		ktap_str_t *name = v->name;
+		/* Only consider remaining valid gotos/labels. */
+		if (name != NULL) {
+			if (gola_islabel(v)) {
+				VarInfo *vg;
+				/* Invalidate label that goes out of scope. */
+				v->name = NULL;
+				/* Resolve pending backward gotos. */
+				for (vg = v+1; vg < ve; vg++)
+					if (vg->name == name &&
+						gola_isgoto(vg)) {
+						if ((bl->flags&FSCOPE_UPVAL) &&
+							 vg->slot > v->slot)
+							gola_close(ls, vg);
+						gola_patch(ls, vg, v);
+					}
+			} else if (gola_isgoto(v)) {
+				/* Propagate goto or break to outer scope. */
+				if (bl->prev) {
+					bl->prev->flags |= (name == NAME_BREAK) ? FSCOPE_BREAK : FSCOPE_GOLA;
+					v->slot = bl->nactvar;
+					if ((bl->flags & FSCOPE_UPVAL))
+						gola_close(ls, v);
+				} else {
+					ls->linenumber =
+						ls->fs->bcbase[v->startpc].line;
+					if (name == NAME_BREAK)
+						kp_lex_error(ls, 0, KP_ERR_XBREAK);
+					else
+						kp_lex_error(ls, 0, KP_ERR_XLUNDEF, getstr(name));
+				}
+			}
+		}
+	}
+}
+
+/* Find existing label. */
+static VarInfo *gola_findlabel(LexState *ls, ktap_str_t *name)
+{
+	VarInfo *v = ls->vstack + ls->fs->bl->vstart;
+	VarInfo *ve = ls->vstack + ls->vtop;
+
+	for (; v < ve; v++)
+		if (v->name == name && gola_islabel(v))
+			return v;
+	return NULL;
+}
+
+/* -- Scope handling ------------------------------------------------------ */
+
+/* Begin a scope. */
+static void fscope_begin(FuncState *fs, FuncScope *bl, int flags)
+{
+	bl->nactvar = (uint8_t)fs->nactvar;
+	bl->flags = flags;
+	bl->vstart = fs->ls->vtop;
+	bl->prev = fs->bl;
+	fs->bl = bl;
+	kp_assert(fs->freereg == fs->nactvar);
+}
+
+/* End a scope. */
+static void fscope_end(FuncState *fs)
+{
+	FuncScope *bl = fs->bl;
+	LexState *ls = fs->ls;
+
+	fs->bl = bl->prev;
+	var_remove(ls, bl->nactvar);
+	fs->freereg = fs->nactvar;
+	kp_assert(bl->nactvar == fs->nactvar);
+	if ((bl->flags & (FSCOPE_UPVAL|FSCOPE_NOCLOSE)) == FSCOPE_UPVAL)
+		bcemit_AJ(fs, BC_UCLO, bl->nactvar, 0);
+	if ((bl->flags & FSCOPE_BREAK)) {
+		if ((bl->flags & FSCOPE_LOOP)) {
+			int idx = gola_new(ls, NAME_BREAK, VSTACK_LABEL,
+						fs->pc);
+			ls->vtop = idx;  /* Drop break label immediately. */
+			gola_resolve(ls, bl, idx);
+			return;
+		}  /* else: need the fixup step to propagate the breaks. */
+	} else if (!(bl->flags & FSCOPE_GOLA)) {
+		return;
+	}
+	gola_fixup(ls, bl);
+}
+
+/* Mark scope as having an upvalue. */
+static void fscope_uvmark(FuncState *fs, BCReg level)
+{
+	FuncScope *bl;
+
+	for (bl = fs->bl; bl && bl->nactvar > level; bl = bl->prev);
+	if (bl)
+		bl->flags |= FSCOPE_UPVAL;
+}
+
+/* -- Function state management ------------------------------------------- */
+
+/* Fixup bytecode for prototype. */
+static void fs_fixup_bc(FuncState *fs, ktap_proto_t *pt, BCIns *bc, int n)
+{
+	BCInsLine *base = fs->bcbase;
+	int i;
+
+	pt->sizebc = n;
+	bc[0] = BCINS_AD((fs->flags & PROTO_VARARG) ? BC_FUNCV : BC_FUNCF,
+			 fs->framesize, 0);
+	for (i = 1; i < n; i++)
+		bc[i] = base[i].ins;
+}
+
+/* Fixup upvalues for child prototype, step #2. */
+static void fs_fixup_uv2(FuncState *fs, ktap_proto_t *pt)
+{
+	VarInfo *vstack = fs->ls->vstack;
+	uint16_t *uv = pt->uv;
+	int i, n = pt->sizeuv;
+
+	for (i = 0; i < n; i++) {
+		VarIndex vidx = uv[i];
+		if (vidx >= KP_MAX_VSTACK)
+			uv[i] = vidx - KP_MAX_VSTACK;
+		else if ((vstack[vidx].info & VSTACK_VAR_RW))
+			uv[i] = vstack[vidx].slot | PROTO_UV_LOCAL;
+		else
+			uv[i] = vstack[vidx].slot | PROTO_UV_LOCAL |
+					PROTO_UV_IMMUTABLE;
+	}
+}
+
+/* Fixup constants for prototype. */
+static void fs_fixup_k(FuncState *fs, ktap_proto_t *pt, void *kptr)
+{
+	ktap_tab_t *kt;
+	ktap_node_t *node;
+	int i, hmask;
+
+	checklimitgt(fs, fs->nkn, BCMAX_D+1, "constants");
+	checklimitgt(fs, fs->nkgc, BCMAX_D+1, "constants");
+
+	pt->k = kptr;
+	pt->sizekn = fs->nkn;
+	pt->sizekgc = fs->nkgc;
+	kt = fs->kt;
+	node = kt->node;
+	hmask = kt->hmask;
+	for (i = 0; i <= hmask; i++) {
+		ktap_node_t *n = &node[i];
+
+		if (tvhaskslot(&n->val)) {
+			ptrdiff_t kidx = (ptrdiff_t)tvkslot(&n->val);
+			/* Numbers go at k[kidx]; GC objects at k[~kidx] (below k). */
+			if (is_number(&n->key)) {
+				ktap_val_t *tv = &((ktap_val_t *)kptr)[kidx];
+				*tv = n->key;
+			} else {
+				ktap_obj_t *o = n->key.val.gc;
+				ktap_obj_t **v = (ktap_obj_t **)kptr;
+				v[~kidx] = o;
+				if (is_proto(&n->key))
+					fs_fixup_uv2(fs, (ktap_proto_t *)o);
+			}
+		}
+	}
+}
+
+/* Fixup upvalues for prototype, step #1. */
+static void fs_fixup_uv1(FuncState *fs, ktap_proto_t *pt, uint16_t *uv)
+{
+	pt->uv = uv;
+	pt->sizeuv = fs->nuv;
+	memcpy(uv, fs->uvtmp, fs->nuv*sizeof(VarIndex));
+}
+
+#ifndef KTAP_DISABLE_LINEINFO
+/* Prepare lineinfo for prototype. */
+static size_t fs_prep_line(FuncState *fs, BCLine numline)
+{
+	return (fs->pc-1) << (numline < 256 ? 0 : numline < 65536 ? 1 : 2);
+}
+
+/* Fixup lineinfo for prototype. */
+static void fs_fixup_line(FuncState *fs, ktap_proto_t *pt,
+			  void *lineinfo, BCLine numline)
+{
+	BCInsLine *base = fs->bcbase + 1;
+	BCLine first = fs->linedefined;
+	int i = 0, n = fs->pc-1;
+
+	pt->firstline = fs->linedefined;
+	pt->numline = numline;
+	pt->lineinfo = lineinfo;
+	if (numline < 256) {
+		uint8_t *li = (uint8_t *)lineinfo;
+		do {
+			BCLine delta = base[i].line - first;
+			kp_assert(delta >= 0 && delta < 256);
+			li[i] = (uint8_t)delta;
+		} while (++i < n);
+	} else if (numline < 65536) {
+		uint16_t *li = (uint16_t *)lineinfo;
+		do {
+			BCLine delta = base[i].line - first;
+			kp_assert(delta >= 0 && delta < 65536);
+			li[i] = (uint16_t)delta;
+		} while (++i < n);
+	} else {
+		uint32_t *li = (uint32_t *)lineinfo;
+		do {
+			BCLine delta = base[i].line - first;
+			kp_assert(delta >= 0);
+			li[i] = (uint32_t)delta;
+		} while (++i < n);
+	}
+}
+
+/* Prepare variable info for prototype. */
+static size_t fs_prep_var(LexState *ls, FuncState *fs, size_t *ofsvar)
+{
+	VarInfo *vs = ls->vstack, *ve;
+	int i, n;
+	BCPos lastpc;
+
+	kp_buf_reset(&ls->sb);  /* Copy to temp. string buffer. */
+	/* Store upvalue names. */
+	for (i = 0, n = fs->nuv; i < n; i++) {
+		ktap_str_t *s = vs[fs->uvmap[i]].name;
+		int len = s->len+1;
+		char *p = kp_buf_more(&ls->sb, len);
+		p = kp_buf_wmem(p, getstr(s), len);
+		setsbufP(&ls->sb, p);
+	}
+
+	*ofsvar = sbuflen(&ls->sb);
+	lastpc = 0;
+	/* Store local variable names and compressed ranges. */
+	for (ve = vs + ls->vtop, vs += fs->vbase; vs < ve; vs++) {
+		if (!gola_isgotolabel(vs)) {
+			ktap_str_t *s = vs->name;
+			BCPos startpc;
+			char *p;
+			if ((uintptr_t)s < VARNAME__MAX) {
+				p = kp_buf_more(&ls->sb, 1 + 2*5);
+				*p++ = (char)(uintptr_t)s;
+			} else {
+				int len = s->len+1;
+				p = kp_buf_more(&ls->sb, len + 2*5);
+				p = kp_buf_wmem(p, getstr(s), len);
+			}
+			startpc = vs->startpc;
+			p = strfmt_wuleb128(p, startpc-lastpc);
+			p = strfmt_wuleb128(p, vs->endpc-startpc);
+			setsbufP(&ls->sb, p);
+			lastpc = startpc;
+		}
+	}
+
+	kp_buf_putb(&ls->sb, '\0');  /* Terminator for varinfo. */
+	return sbuflen(&ls->sb);
+}
+
+/* Fixup variable info for prototype. */
+static void fs_fixup_var(LexState *ls, ktap_proto_t *pt, uint8_t *p,
+			 size_t ofsvar)
+{
+	pt->uvinfo = p;
+	pt->varinfo = (char *)p + ofsvar;
+	/* Copy from temp. buffer. */
+	memcpy(p, sbufB(&ls->sb), sbuflen(&ls->sb));
+}
+#else
+
+/* Initialize with empty debug info, if disabled. */
+#define fs_prep_line(fs, numline)		(UNUSED(numline), 0)
+#define fs_fixup_line(fs, pt, li, numline) \
+	((pt)->firstline = (pt)->numline = 0, (pt)->lineinfo = NULL)
+#define fs_prep_var(ls, fs, ofsvar)		(UNUSED(ofsvar), 0)
+#define fs_fixup_var(ls, pt, p, ofsvar) \
+	((pt)->uvinfo = NULL, (pt)->varinfo = NULL)
+
+#endif
+
+/* Check if bytecode op returns. */
+static int bcopisret(BCOp op)
+{
+	switch (op) {
+	case BC_CALLMT: case BC_CALLT:
+	case BC_RETM: case BC_RET: case BC_RET0: case BC_RET1:
+		return 1;
+	default:
+		return 0;
+	}
+}
+
+/* Fixup return instruction for prototype. */
+static void fs_fixup_ret(FuncState *fs)
+{
+	BCPos lastpc = fs->pc;
+
+	if (lastpc <= fs->lasttarget ||
+		!bcopisret(bc_op(fs->bcbase[lastpc-1].ins))) {
+		if ((fs->bl->flags & FSCOPE_UPVAL))
+			bcemit_AJ(fs, BC_UCLO, 0, 0);
+		bcemit_AD(fs, BC_RET0, 0, 1);  /* Need final return. */
+	}
+	fs->bl->flags |= FSCOPE_NOCLOSE;  /* Handled above. */
+	fscope_end(fs);
+	kp_assert(fs->bl == NULL);
+	/* May need to fixup returns encoded before first function
+	 * was created. */
+	if (fs->flags & PROTO_FIXUP_RETURN) {
+		BCPos pc;
+		for (pc = 1; pc < lastpc; pc++) {
+			BCIns ins = fs->bcbase[pc].ins;
+			BCPos offset;
+			switch (bc_op(ins)) {
+			case BC_CALLMT: case BC_CALLT:
+			case BC_RETM: case BC_RET: case BC_RET0: case BC_RET1:
+				/* Copy original instruction. */
+				offset = bcemit_INS(fs, ins);
+				fs->bcbase[offset].line = fs->bcbase[pc].line;
+				offset = offset-(pc+1)+BCBIAS_J;
+				if (offset > BCMAX_D)
+					err_syntax(fs->ls, KP_ERR_XFIXUP);
+				/* Replace with UCLO plus branch. */
+				fs->bcbase[pc].ins = BCINS_AD(BC_UCLO, 0,
+								offset);
+				break;
+			case BC_UCLO:
+				return;  /* We're done. */
+			default:
+				break;
+			}
+		}
+	}
+}
+
+/* Finish a FuncState and return the new prototype. */
+static ktap_proto_t *fs_finish(LexState *ls, BCLine line)
+{
+	FuncState *fs = ls->fs;
+	BCLine numline = line - fs->linedefined;
+	size_t sizept, ofsk, ofsuv, ofsli, ofsdbg, ofsvar;
+	ktap_proto_t *pt;
+
+	/* Apply final fixups. */
+	fs_fixup_ret(fs);
+
+	/* Calculate total size of prototype including all colocated arrays. */
+	sizept = sizeof(ktap_proto_t) + fs->pc*sizeof(BCIns) +
+			fs->nkgc*sizeof(ktap_obj_t *);
+	sizept = (sizept + sizeof(ktap_val_t)-1) & ~(sizeof(ktap_val_t)-1);
+	ofsk = sizept; sizept += fs->nkn*sizeof(ktap_val_t);
+	ofsuv = sizept; sizept += ((fs->nuv+1)&~1)*2;
+	ofsli = sizept; sizept += fs_prep_line(fs, numline);
+	ofsdbg = sizept; sizept += fs_prep_var(ls, fs, &ofsvar);
+
+	/* Allocate prototype and initialize its fields. */
+	pt = (ktap_proto_t *)malloc((int)sizept);
+	pt->gct = ~KTAP_TPROTO;
+	pt->sizept = (int)sizept;
+	pt->flags =
+		(uint8_t)(fs->flags & ~(PROTO_HAS_RETURN|PROTO_FIXUP_RETURN));
+	pt->numparams = fs->numparams;
+	pt->framesize = fs->framesize;
+	pt->chunkname = ls->chunkname;
+
+	/* Close potentially uninitialized gap between bc and kgc. */
+	*(uint32_t *)((char *)pt + ofsk - sizeof(ktap_obj_t *)*(fs->nkgc+1)) = 0;
+	fs_fixup_bc(fs, pt, (BCIns *)((char *)pt + sizeof(ktap_proto_t)), fs->pc);
+	fs_fixup_k(fs, pt, (void *)((char *)pt + ofsk));
+	fs_fixup_uv1(fs, pt, (uint16_t *)((char *)pt + ofsuv));
+	fs_fixup_line(fs, pt, (void *)((char *)pt + ofsli), numline);
+	fs_fixup_var(ls, pt, (uint8_t *)((char *)pt + ofsdbg), ofsvar);
+
+	ls->vtop = fs->vbase;  /* Reset variable stack. */
+	ls->fs = fs->prev;
+	kp_assert(ls->fs != NULL || ls->tok == TK_eof);
+	return pt;
+}
+
+/* Initialize a new FuncState. */
+static void fs_init(LexState *ls, FuncState *fs)
+{
+	fs->prev = ls->fs; ls->fs = fs;  /* Append to list. */
+	fs->ls = ls;
+	fs->vbase = ls->vtop;
+	fs->pc = 0;
+	fs->lasttarget = 0;
+	fs->jpc = NO_JMP;
+	fs->freereg = 0;
+	fs->nkgc = 0;
+	fs->nkn = 0;
+	fs->nactvar = 0;
+	fs->nuv = 0;
+	fs->bl = NULL;
+	fs->flags = 0;
+	fs->framesize = 1;  /* Minimum frame size. */
+	fs->kt = kp_tab_new();
+}
+
+/* -- Expressions --------------------------------------------------------- */
+
+/* Forward declaration. */
+static void expr(LexState *ls, ExpDesc *v);
+
+/* Return string expression. */
+static void expr_str(LexState *ls, ExpDesc *e)
+{
+	expr_init(e, VKSTR, 0);
+	e->u.sval = lex_str(ls);
+}
+
+#define checku8(x)     ((x) == (int32_t)(uint8_t)(x))
+
+/* Return index expression. */
+static void expr_index(FuncState *fs, ExpDesc *t, ExpDesc *e)
+{
+	/* Already called: expr_toval(fs, e). */
+	t->k = VINDEXED;
+	if (expr_isnumk(e)) {
+		ktap_number n = expr_numberV(e);
+		int32_t k = (int)n;
+		if (checku8(k) && n == (ktap_number)k) {
+			/* 256..511: const byte key */
+			t->u.s.aux = BCMAX_C+1+(uint32_t)k;
+			return;
+		}
+	} else if (expr_isstrk(e)) {
+		BCReg idx = const_str(fs, e);
+		if (idx <= BCMAX_C) {
+			/* -256..-1: const string key */
+			t->u.s.aux = ~idx;
+			return;
+		}
+	}
+	t->u.s.aux = expr_toanyreg(fs, e);  /* 0..255: register */
+}
+
+/* Parse index expression with named field. */
+static void expr_field(LexState *ls, ExpDesc *v)
+{
+	FuncState *fs = ls->fs;
+	ExpDesc key;
+
+	expr_toanyreg(fs, v);
+	kp_lex_next(ls);  /* Skip dot or colon. */
+	expr_str(ls, &key);
+	expr_index(fs, v, &key);
+}
+
+/* Parse index expression with brackets. */
+static void expr_bracket(LexState *ls, ExpDesc *v)
+{
+	kp_lex_next(ls);  /* Skip '['. */
+	expr(ls, v);
+	expr_toval(ls->fs, v);
+	lex_check(ls, ']');
+}
+
+/* Get value of constant expression. */
+static void expr_kvalue(ktap_val_t *v, ExpDesc *e)
+{
+	if (e->k <= VKTRUE) {
+		setitype(v, ~(uint32_t)e->k);
+	} else if (e->k == VKSTR) {
+		set_string(v, e->u.sval);
+	} else {
+		kp_assert(tvisnumber(expr_numtv(e)));
+		*v = *expr_numtv(e);
+	}
+}
+
+#define FLS(x)       ((uint32_t)(__builtin_clz(x)^31))
+#define hsize2hbits(s) ((s) ? ((s)==1 ? 1 : 1+FLS((uint32_t)((s)-1))) : 0)
+
+
+/* Parse table constructor expression. */
+static void expr_table(LexState *ls, ExpDesc *e)
+{
+	FuncState *fs = ls->fs;
+	BCLine line = ls->linenumber;
+	ktap_tab_t *t = NULL;
+	int vcall = 0, needarr = 0, fixt = 0;
+	uint32_t narr = 1;  /* First array index. */
+	uint32_t nhash = 0;  /* Number of hash entries. */
+	BCReg freg = fs->freereg;
+	BCPos pc = bcemit_AD(fs, BC_TNEW, freg, 0);
+
+	expr_init(e, VNONRELOC, freg);
+	bcreg_reserve(fs, 1);
+	freg++;
+	lex_check(ls, '{');
+	while (ls->tok != '}') {
+		ExpDesc key, val;
+		vcall = 0;
+		if (ls->tok == '[') {
+			expr_bracket(ls, &key);  /* Already calls expr_toval. */
+			if (!expr_isk(&key))
+				expr_index(fs, e, &key);
+			if (expr_isnumk(&key) && expr_numiszero(&key))
+				needarr = 1;
+			else
+				nhash++;
+			lex_check(ls, '=');
+		} else if ((ls->tok == TK_name) &&
+				kp_lex_lookahead(ls) == '=') {
+			expr_str(ls, &key);
+			lex_check(ls, '=');
+			nhash++;
+		} else {
+			expr_init(&key, VKNUM, 0);
+			set_number(&key.u.nval, (int)narr);
+			narr++;
+			needarr = vcall = 1;
+		}
+		expr(ls, &val);
+		if (expr_isk(&key) && key.k != VKNIL &&
+			(key.k == VKSTR || expr_isk_nojump(&val))) {
+			ktap_val_t k, *v;
+			if (!t) {  /* Create template table on demand. */
+				BCReg kidx;
+				t = kp_tab_new();
+				kidx = const_gc(fs, obj2gco(t), KTAP_TTAB);
+				fs->bcbase[pc].ins = BCINS_AD(BC_TDUP, freg-1,
+								 kidx);
+			}
+			vcall = 0;
+			expr_kvalue(&k, &key);
+			v = kp_tab_set(t, &k);
+			/* Add const key/value to template table. */
+			if (expr_isk_nojump(&val)) {
+				expr_kvalue(v, &val);
+			} else {
+				/* Otherwise create dummy string key (avoids kp_tab_newkey). */
+				set_table(v, t);  /* Preserve key with table itself as value. */
+				fixt = 1;  /* Fix this later, after all resizes. */
+				goto nonconst;
+			}
+		} else {
+ nonconst:
+			if (val.k != VCALL) {
+				expr_toanyreg(fs, &val);
+				vcall = 0;
+			}
+			if (expr_isk(&key))
+				expr_index(fs, e, &key);
+			bcemit_store(fs, e, &val);
+		}
+		fs->freereg = freg;
+		if (!lex_opt(ls, ',') && !lex_opt(ls, ';'))
+			break;
+	}
+	lex_match(ls, '}', '{', line);
+	if (vcall) {
+		BCInsLine *ilp = &fs->bcbase[fs->pc-1];
+		ExpDesc en;
+		kp_assert(bc_a(ilp->ins) == freg &&
+			bc_op(ilp->ins) == (narr > 256 ? BC_TSETV : BC_TSETB));
+		expr_init(&en, VKNUM, 0);
+		set_number(&en.u.nval, narr - 1);
+		if (narr > 256) { fs->pc--; ilp--; }
+		ilp->ins = BCINS_AD(BC_TSETM, freg, const_num(fs, &en));
+		setbc_b(&ilp[-1].ins, 0);
+	}
+	if (pc == fs->pc-1) {  /* Make expr relocable if possible. */
+		e->u.s.info = pc;
+		fs->freereg--;
+		e->k = VRELOCABLE;
+	} else {
+		e->k = VNONRELOC;  /* May have been changed by expr_index. */
+	}
+	if (!t) {  /* Construct TNEW RD: hhhhhaaaaaaaaaaa. */
+		BCIns *ip = &fs->bcbase[pc].ins;
+		if (!needarr) narr = 0;
+		else if (narr < 3) narr = 3;
+		else if (narr > 0x7ff) narr = 0x7ff;
+		setbc_d(ip, narr|(hsize2hbits(nhash)<<11));
+	} else {
+		if (fixt) {  /* Fix value for dummy keys in template table. */
+			ktap_node_t *node = t->node;
+			uint32_t i, hmask = t->hmask;
+			for (i = 0; i <= hmask; i++) {
+				ktap_node_t *n = &node[i];
+				if (is_table(&n->val)) {
+					kp_assert(tabV(&n->val) == t);
+					/* Turn value into nil. */
+					set_nil(&n->val);
+				}
+			}
+		}
+	}
+}
+
+/* Parse function parameters. */
+static BCReg parse_params(LexState *ls, int needself)
+{
+	FuncState *fs = ls->fs;
+	BCReg nparams = 0;
+	lex_check(ls, '(');
+	if (needself)
+		var_new_lit(ls, nparams++, "self");
+	if (ls->tok != ')') {
+		do {
+			if (ls->tok == TK_name) {
+				var_new(ls, nparams++, lex_str(ls));
+			} else if (ls->tok == TK_dots) {
+				kp_lex_next(ls);
+				fs->flags |= PROTO_VARARG;
+				break;
+			} else {
+				err_syntax(ls, KP_ERR_XPARAM);
+			}
+		} while (lex_opt(ls, ','));
+	}
+	var_add(ls, nparams);
+	kp_assert(fs->nactvar == nparams);
+	bcreg_reserve(fs, nparams);
+	lex_check(ls, ')');
+	return nparams;
+}
+
+/* Forward declaration. */
+static void parse_chunk(LexState *ls);
+
+/* Parse body of a function. */
+static void parse_body(LexState *ls, ExpDesc *e, int needself, BCLine line)
+{
+	FuncState fs, *pfs = ls->fs;
+	FuncScope bl;
+	ktap_proto_t *pt;
+	ptrdiff_t oldbase = pfs->bcbase - ls->bcstack;
+
+	fs_init(ls, &fs);
+	fscope_begin(&fs, &bl, 0);
+	fs.linedefined = line;
+	fs.numparams = (uint8_t)parse_params(ls, needself);
+	fs.bcbase = pfs->bcbase + pfs->pc;
+	fs.bclim = pfs->bclim - pfs->pc;
+	bcemit_AD(&fs, BC_FUNCF, 0, 0);  /* Placeholder. */
+	lex_check(ls, '{');
+	parse_chunk(ls);
+	lex_check(ls, '}');
+	pt = fs_finish(ls, (ls->lastline = ls->linenumber));
+	pfs->bcbase = ls->bcstack + oldbase;  /* May have been reallocated. */
+	pfs->bclim = (BCPos)(ls->sizebcstack - oldbase);
+	/* Store new prototype in the constant array of the parent. */
+	expr_init(e, VRELOCABLE,
+		bcemit_AD(pfs, BC_FNEW, 0,
+			  const_gc(pfs, (ktap_obj_t *)pt, KTAP_TPROTO)));
+	if (!(pfs->flags & PROTO_CHILD)) {
+		if (pfs->flags & PROTO_HAS_RETURN)
+			pfs->flags |= PROTO_FIXUP_RETURN;
+		pfs->flags |= PROTO_CHILD;
+	}
+	//kp_lex_next(ls);
+}
+
+/* Parse closure body without parameters (trace/trace_end/profile/tick). */
+static void parse_body_no_args(LexState *ls, ExpDesc *e, int needself,
+				BCLine line)
+{
+	FuncState fs, *pfs = ls->fs;
+	FuncScope bl;
+	ktap_proto_t *pt;
+	ptrdiff_t oldbase = pfs->bcbase - ls->bcstack;
+
+	fs_init(ls, &fs);
+	fscope_begin(&fs, &bl, 0);
+	fs.linedefined = line;
+	fs.numparams = 0;
+	fs.bcbase = pfs->bcbase + pfs->pc;
+	fs.bclim = pfs->bclim - pfs->pc;
+	bcemit_AD(&fs, BC_FUNCF, 0, 0);  /* Placeholder. */
+	lex_check(ls, '{');
+	parse_chunk(ls);
+	lex_check(ls, '}');
+	pt = fs_finish(ls, (ls->lastline = ls->linenumber));
+	pfs->bcbase = ls->bcstack + oldbase;  /* May have been reallocated. */
+	pfs->bclim = (BCPos)(ls->sizebcstack - oldbase);
+	/* Store new prototype in the constant array of the parent. */
+	expr_init(e, VRELOCABLE,
+		bcemit_AD(pfs, BC_FNEW, 0,
+			  const_gc(pfs, (ktap_obj_t *)pt, KTAP_TPROTO)));
+	if (!(pfs->flags & PROTO_CHILD)) {
+		if (pfs->flags & PROTO_HAS_RETURN)
+			pfs->flags |= PROTO_FIXUP_RETURN;
+		pfs->flags |= PROTO_CHILD;
+	}
+	//kp_lex_next(ls);
+}
+
+
+/* Parse expression list. Last expression is left open. */
+static BCReg expr_list(LexState *ls, ExpDesc *v)
+{
+	BCReg n = 1;
+
+	expr(ls, v);
+	while (lex_opt(ls, ',')) {
+		expr_tonextreg(ls->fs, v);
+		expr(ls, v);
+		n++;
+	}
+	return n;
+}
+
+/* Parse function argument list. */
+static void parse_args(LexState *ls, ExpDesc *e)
+{
+	FuncState *fs = ls->fs;
+	ExpDesc args;
+	BCIns ins;
+	BCReg base;
+	BCLine line = ls->linenumber;
+
+	if (ls->tok == '(') {
+		if (line != ls->lastline)
+			err_syntax(ls, KP_ERR_XAMBIG);
+		kp_lex_next(ls);
+		if (ls->tok == ')') {  /* f(). */
+			args.k = VVOID;
+		} else {
+			expr_list(ls, &args);
+			/* f(a, b, g()) or f(a, b, ...). */
+			if (args.k == VCALL) {
+				/* Pass on multiple results. */
+				setbc_b(bcptr(fs, &args), 0);
+			}
+		}
+		lex_match(ls, ')', '(', line);
+	} else if (ls->tok == '{') {
+		expr_table(ls, &args);
+	} else if (ls->tok == TK_string) {
+		expr_init(&args, VKSTR, 0);
+		args.u.sval = rawtsvalue(&ls->tokval);
+		kp_lex_next(ls);
+	} else {
+		err_syntax(ls, KP_ERR_XFUNARG);
+		return;  /* Silence compiler. */
+	}
+
+	kp_assert(e->k == VNONRELOC);
+	base = e->u.s.info;  /* Base register for call. */
+	if (args.k == VCALL) {
+		ins = BCINS_ABC(BC_CALLM, base, 2, args.u.s.aux - base - 1);
+	} else {
+		if (args.k != VVOID)
+			expr_tonextreg(fs, &args);
+		ins = BCINS_ABC(BC_CALL, base, 2, fs->freereg - base);
+	}
+	expr_init(e, VCALL, bcemit_INS(fs, ins));
+	e->u.s.aux = base;
+	fs->bcbase[fs->pc - 1].line = line;
+	fs->freereg = base+1;  /* Leave one result by default. */
+}
+
+/* Parse primary expression. */
+static void expr_primary(LexState *ls, ExpDesc *v)
+{
+	FuncState *fs = ls->fs;
+
+	/* Parse prefix expression. */
+	if (ls->tok == '(') {
+		BCLine line = ls->linenumber;
+		kp_lex_next(ls);
+		expr(ls, v);
+		lex_match(ls, ')', '(', line);
+		expr_discharge(ls->fs, v);
+	} else if (ls->tok == TK_name) {
+		var_lookup(ls, v);
+	} else {
+		err_syntax(ls, KP_ERR_XSYMBOL);
+	}
+
+	for (;;) {  /* Parse multiple expression suffixes. */
+		if (ls->tok == '.') {
+			expr_field(ls, v);
+		} else if (ls->tok == '[') {
+			ExpDesc key;
+			expr_toanyreg(fs, v);
+			expr_bracket(ls, &key);
+			expr_index(fs, v, &key);
+		} else if (ls->tok == ':') {
+			ExpDesc key;
+			kp_lex_next(ls);
+			expr_str(ls, &key);
+			bcemit_method(fs, v, &key);
+			parse_args(ls, v);
+		} else if (ls->tok == '(' || ls->tok == TK_string ||
+				ls->tok == '{') {
+			expr_tonextreg(fs, v);
+			parse_args(ls, v);
+		} else {
+			break;
+		}
+	}
+}
+
+/* Parse simple expression. */
+static void expr_simple(LexState *ls, ExpDesc *v)
+{
+	switch (ls->tok) {
+	case TK_number:
+		expr_init(v, VKNUM, 0);
+		set_obj(&v->u.nval, &ls->tokval);
+		break;
+	case TK_string:
+		expr_init(v, VKSTR, 0);
+		v->u.sval = rawtsvalue(&ls->tokval);
+		break;
+	case TK_nil:
+		expr_init(v, VKNIL, 0);
+		break;
+	case TK_true:
+		expr_init(v, VKTRUE, 0);
+		break;
+	case TK_false:
+		expr_init(v, VKFALSE, 0);
+		break;
+	case TK_dots: {  /* Vararg. */
+		FuncState *fs = ls->fs;
+		BCReg base;
+		checkcond(ls, fs->flags & PROTO_VARARG, KP_ERR_XDOTS);
+		bcreg_reserve(fs, 1);
+		base = fs->freereg-1;
+		expr_init(v, VCALL, bcemit_ABC(fs, BC_VARG, base, 2,
+					       fs->numparams));
+		v->u.s.aux = base;
+		break;
+	}
+	case '{':  /* Table constructor. */
+		expr_table(ls, v);
+		return;
+	case TK_function:
+		kp_lex_next(ls);
+		parse_body(ls, v, 0, ls->linenumber);
+		return;
+	case TK_argstr:
+		expr_init(v, VARGSTR, 0);
+		break;
+	case TK_probename:
+		expr_init(v, VARGNAME, 0);
+		break;
+	case TK_arg0: case TK_arg1: case TK_arg2: case TK_arg3: case TK_arg4:
+	case TK_arg5: case TK_arg6: case TK_arg7: case TK_arg8: case TK_arg9:
+		expr_init(v, VARGN, ls->tok - TK_arg0);
+		break;
+	case TK_pid:
+		expr_init(v, VPID, 0);
+		break;
+	case TK_tid:
+		expr_init(v, VTID, 0);
+		break;
+	case TK_uid:
+		expr_init(v, VUID, 0);
+		break;
+	case TK_cpu:
+		expr_init(v, VCPU, 0);
+		break;
+	case TK_execname:
+		expr_init(v, VEXECNAME, 0);
+		break;
+	default:
+		expr_primary(ls, v);
+		return;
+	}
+	kp_lex_next(ls);
+}
+
+/* Manage syntactic levels to avoid blowing up the stack. */
+static void synlevel_begin(LexState *ls)
+{
+	if (++ls->level >= KP_MAX_XLEVEL)
+		kp_lex_error(ls, 0, KP_ERR_XLEVELS);
+}
+
+#define synlevel_end(ls)	((ls)->level--)
+
+/* Convert token to binary operator. */
+static BinOpr token2binop(LexToken tok)
+{
+	switch (tok) {
+	case '+':	return OPR_ADD;
+	case '-':	return OPR_SUB;
+	case '*':	return OPR_MUL;
+	case '/':	return OPR_DIV;
+	case '%':	return OPR_MOD;
+	case '^':	return OPR_POW;
+	case TK_concat: return OPR_CONCAT;
+	case TK_ne:	return OPR_NE;
+	case TK_eq:	return OPR_EQ;
+	case '<':	return OPR_LT;
+	case TK_le:	return OPR_LE;
+	case '>':	return OPR_GT;
+	case TK_ge:	return OPR_GE;
+	case TK_and:	return OPR_AND;
+	case TK_or:	return OPR_OR;
+	default:	return OPR_NOBINOPR;
+	}
+}
+
+/* Priorities for each binary operator. ORDER OPR. */
+static const struct {
+	uint8_t left;	/* Left priority. */
+	uint8_t right;	/* Right priority. */
+} priority[] = {
+	{6,6}, {6,6}, {7,7}, {7,7}, {7,7},	/* ADD SUB MUL DIV MOD */
+	{10,9}, {5,4},			/* POW CONCAT (right associative) */
+	{3,3}, {3,3},				/* EQ NE */
+	{3,3}, {3,3}, {3,3}, {3,3},		/* LT GE GT LE */
+	{2,2}, {1,1}				/* AND OR */
+};
+
+#define UNARY_PRIORITY		8  /* Priority for unary operators. */
+
+/* Forward declaration. */
+static BinOpr expr_binop(LexState *ls, ExpDesc *v, uint32_t limit);
+
+/* Parse unary expression. */
+static void expr_unop(LexState *ls, ExpDesc *v)
+{
+	BCOp op;
+	if (ls->tok == TK_not) {
+		op = BC_NOT;
+	} else if (ls->tok == '-') {
+		op = BC_UNM;
+#if 0 /* ktap doesn't support the Lua length operator '#'. */
+	} else if (ls->tok == '#') {
+		op = BC_LEN;
+#endif
+	} else {
+		expr_simple(ls, v);
+		return;
+	}
+	kp_lex_next(ls);
+	expr_binop(ls, v, UNARY_PRIORITY);
+	bcemit_unop(ls->fs, op, v);
+}
+
+/* Parse binary expressions with priority higher than the limit. */
+static BinOpr expr_binop(LexState *ls, ExpDesc *v, uint32_t limit)
+{
+	BinOpr op;
+
+	synlevel_begin(ls);
+	expr_unop(ls, v);
+	op = token2binop(ls->tok);
+	while (op != OPR_NOBINOPR && priority[op].left > limit) {
+		ExpDesc v2;
+		BinOpr nextop;
+		kp_lex_next(ls);
+		bcemit_binop_left(ls->fs, op, v);
+		/* Parse binary expression with higher priority. */
+		nextop = expr_binop(ls, &v2, priority[op].right);
+		bcemit_binop(ls->fs, op, v, &v2);
+		op = nextop;
+	}
+	synlevel_end(ls);
+	return op;  /* Return unconsumed binary operator (if any). */
+}
+
+/* Parse expression. */
+static void expr(LexState *ls, ExpDesc *v)
+{
+	expr_binop(ls, v, 0);  /* Priority 0: parse whole expression. */
+}
+
+/* Assign expression to the next register. */
+static void expr_next(LexState *ls)
+{
+	ExpDesc e;
+	expr(ls, &e);
+	expr_tonextreg(ls->fs, &e);
+}
+
+/* Parse conditional expression. */
+static BCPos expr_cond(LexState *ls)
+{
+	ExpDesc v;
+
+	lex_check(ls, '(');
+	expr(ls, &v);
+	if (v.k == VKNIL)
+		v.k = VKFALSE;
+	bcemit_branch_t(ls->fs, &v);
+	lex_check(ls, ')');
+	return v.f;
+}
+
+/* -- Assignments --------------------------------------------------------- */
+
+/* List of LHS variables. */
+typedef struct LHSVarList {
+	ExpDesc v;			/* LHS variable. */
+	struct LHSVarList *prev;	/* Link to previous LHS variable. */
+} LHSVarList;
+
+/* Eliminate write-after-read hazards for local variable assignment. */
+static void assign_hazard(LexState *ls, LHSVarList *lh, const ExpDesc *v)
+{
+	FuncState *fs = ls->fs;
+	BCReg reg = v->u.s.info; /* Check against this variable. */
+	BCReg tmp = fs->freereg; /* Rename to this temp. register (if needed). */
+	int hazard = 0;
+
+	for (; lh; lh = lh->prev) {
+		if (lh->v.k == VINDEXED) {
+			if (lh->v.u.s.info == reg) {  /* t[i], t = 1, 2 */
+				hazard = 1;
+				lh->v.u.s.info = tmp;
+			}
+			if (lh->v.u.s.aux == reg) {  /* t[i], i = 1, 2 */
+				hazard = 1;
+				lh->v.u.s.aux = tmp;
+			}
+		}
+	}
+	if (hazard) {
+		/* Rename conflicting variable. */
+		bcemit_AD(fs, BC_MOV, tmp, reg);
+		bcreg_reserve(fs, 1);
+	}
+}
+
+/* Adjust LHS/RHS of an assignment. */
+static void assign_adjust(LexState *ls, BCReg nvars, BCReg nexps, ExpDesc *e)
+{
+	FuncState *fs = ls->fs;
+	int32_t extra = (int32_t)nvars - (int32_t)nexps;
+
+	if (e->k == VCALL) {
+		extra++;  /* Compensate for the VCALL itself. */
+		if (extra < 0)
+			extra = 0;
+		setbc_b(bcptr(fs, e), extra+1);  /* Fixup call results. */
+		if (extra > 1)
+			bcreg_reserve(fs, (BCReg)extra-1);
+	} else {
+		if (e->k != VVOID)
+			expr_tonextreg(fs, e);  /* Close last expression. */
+		if (extra > 0) {  /* Leftover LHS are set to nil. */
+			BCReg reg = fs->freereg;
+			bcreg_reserve(fs, (BCReg)extra);
+			bcemit_nil(fs, reg, (BCReg)extra);
+		}
+	}
+}
+
+/* Recursively parse assignment statement. */
+static void parse_assignment(LexState *ls, LHSVarList *lh, BCReg nvars)
+{
+	ExpDesc e;
+
+	checkcond(ls, VLOCAL <= lh->v.k && lh->v.k <= VINDEXED,
+			KP_ERR_XSYNTAX);
+	if (lex_opt(ls, ',')) {  /* Collect LHS list and recurse upwards. */
+		LHSVarList vl;
+		vl.prev = lh;
+		expr_primary(ls, &vl.v);
+		if (vl.v.k == VLOCAL)
+			assign_hazard(ls, lh, &vl.v);
+		checklimit(ls->fs, ls->level + nvars, KP_MAX_XLEVEL,
+				"variable names");
+		parse_assignment(ls, &vl, nvars+1);
+	} else {  /* Parse RHS. */
+		BCReg nexps;
+		int assign_incr = 1;
+
+		if (lex_opt(ls, '='))
+			assign_incr = 0;
+		else if (lex_opt(ls, TK_incr))
+			assign_incr = 1;
+		else
+			err_syntax(ls, KP_ERR_XSYMBOL);
+
+		nexps = expr_list(ls, &e);
+		if (nexps == nvars) {
+			if (e.k == VCALL) {
+				/* Vararg assignment. */
+				if (bc_op(*bcptr(ls->fs, &e)) == BC_VARG) {
+					ls->fs->freereg--;
+					e.k = VRELOCABLE;
+				} else {  /* Multiple call results. */
+					/* Base of call is not relocatable. */
+					e.u.s.info = e.u.s.aux;
+					e.k = VNONRELOC;
+				}
+			}
+			if (assign_incr == 0)
+				bcemit_store(ls->fs, &lh->v, &e);
+			else
+				bcemit_store_incr(ls->fs, &lh->v, &e);
+			return;
+		}
+		assign_adjust(ls, nvars, nexps, &e);
+		if (nexps > nvars) {
+			/* Drop leftover regs. */
+			ls->fs->freereg -= nexps - nvars;
+		}
+	}
+	/* Assign RHS to LHS and recurse downwards. */
+	expr_init(&e, VNONRELOC, ls->fs->freereg-1);
+	bcemit_store(ls->fs, &lh->v, &e);
+}
+
+/* Parse call statement or assignment. */
+static void parse_call_assign(LexState *ls)
+{
+	FuncState *fs = ls->fs;
+	LHSVarList vl;
+
+	expr_primary(ls, &vl.v);
+	if (vl.v.k == VCALL) {  /* Function call statement. */
+		setbc_b(bcptr(fs, &vl.v), 1);  /* No results. */
+	} else {  /* Start of an assignment. */
+		vl.prev = NULL;
+		parse_assignment(ls, &vl, 1);
+	}
+}
+
+/* Parse 'var' statement ('local' in Lua). */
+static void parse_local(LexState *ls)
+{
+	if (lex_opt(ls, TK_function)) {  /* Local function declaration. */
+		ExpDesc v, b;
+		FuncState *fs = ls->fs;
+		var_new(ls, 0, lex_str(ls));
+		expr_init(&v, VLOCAL, fs->freereg);
+		v.u.s.aux = fs->varmap[fs->freereg];
+		bcreg_reserve(fs, 1);
+		var_add(ls, 1);
+		parse_body(ls, &b, 0, ls->linenumber);
+		/* bcemit_store(fs, &v, &b) without setting VSTACK_VAR_RW. */
+		expr_free(fs, &b);
+		expr_toreg(fs, &b, v.u.s.info);
+		/* The upvalue is in scope, but the local is only valid
+		 * after the store. */
+		var_get(ls, fs, fs->nactvar - 1).startpc = fs->pc;
+	} else {  /* Local variable declaration. */
+		ExpDesc e;
+		BCReg nexps, nvars = 0;
+		do {  /* Collect LHS. */
+			var_new(ls, nvars++, lex_str(ls));
+		} while (lex_opt(ls, ','));
+		if (lex_opt(ls, '=')) {  /* Optional RHS. */
+			nexps = expr_list(ls, &e);
+		} else {  /* Or implicitly set to nil. */
+			e.k = VVOID;
+			nexps = 0;
+		}
+		assign_adjust(ls, nvars, nexps, &e);
+		var_add(ls, nvars);
+	}
+}
+
+/* Parse 'function' statement. */
+static void parse_func(LexState *ls, BCLine line)
+{
+	FuncState *fs = ls->fs;
+	ExpDesc v, b;
+
+	kp_lex_next(ls);  /* Skip 'function'. */
+
+	/* function is declared as local */
+#if 1
+	var_new(ls, 0, lex_str(ls));
+	expr_init(&v, VLOCAL, fs->freereg);
+	v.u.s.aux = fs->varmap[fs->freereg];
+	bcreg_reserve(fs, 1);
+	var_add(ls, 1);
+	parse_body(ls, &b, 0, ls->linenumber);
+	/* bcemit_store(fs, &v, &b) without setting VSTACK_VAR_RW. */
+	expr_free(fs, &b);
+	expr_toreg(fs, &b, v.u.s.info);
+	/* The upvalue is in scope, but the local is only valid
+	 * after the store. */
+	var_get(ls, fs, fs->nactvar - 1).startpc = fs->pc;
+
+#else
+	int needself = 0;
+
+	/* Parse function name. */
+	var_lookup(ls, &v);
+	while (ls->tok == '.')  /* Multiple dot-separated fields. */
+		expr_field(ls, &v);
+	if (ls->tok == ':') {  /* Optional colon to signify method call. */
+		needself = 1;
+		expr_field(ls, &v);
+	}
+	parse_body(ls, &b, needself, line);
+	fs = ls->fs;
+	bcemit_store(fs, &v, &b);
+	fs->bcbase[fs->pc - 1].line = line;  /* Set line for the store. */
+#endif
+}
+
+/* -- Control transfer statements ----------------------------------------- */
+
+/* Check for end of block. */
+static int parse_isend(LexToken tok)
+{
+	switch (tok) {
+	case TK_else: case TK_elseif: case TK_end: case TK_until: case TK_eof:
+	case '}':
+		return 1;
+	default:
+		return 0;
+	}
+}
+
+/* Parse 'return' statement. */
+static void parse_return(LexState *ls)
+{
+	BCIns ins;
+	FuncState *fs = ls->fs;
+
+	kp_lex_next(ls);  /* Skip 'return'. */
+	fs->flags |= PROTO_HAS_RETURN;
+	if (parse_isend(ls->tok) || ls->tok == ';') {  /* Bare return. */
+		ins = BCINS_AD(BC_RET0, 0, 1);
+	} else {  /* Return with one or more values. */
+		ExpDesc e;  /* Receives the _last_ expression in the list. */
+		BCReg nret = expr_list(ls, &e);
+		if (nret == 1) {  /* Return one result. */
+			if (e.k == VCALL) {  /* Check for tail call. */
+				BCIns *ip = bcptr(fs, &e);
+				/* It doesn't pay off to add BC_VARGT just
+				 * for 'return ...'. */
+				if (bc_op(*ip) == BC_VARG)
+					goto notailcall;
+				fs->pc--;
+				ins = BCINS_AD(bc_op(*ip)-BC_CALL+BC_CALLT,
+						bc_a(*ip), bc_c(*ip));
+			} else { /* Can return the result from any register. */
+				ins = BCINS_AD(BC_RET1,
+					expr_toanyreg(fs, &e), 2);
+			}
+		} else {
+			if (e.k == VCALL) {  /* Append all results from a call. */
+ notailcall:
+				setbc_b(bcptr(fs, &e), 0);
+				ins = BCINS_AD(BC_RETM, fs->nactvar,
+						e.u.s.aux - fs->nactvar);
+			} else {
+				/* Force contiguous registers. */
+				expr_tonextreg(fs, &e);
+				ins = BCINS_AD(BC_RET, fs->nactvar, nret+1);
+			}
+		}
+	}
+	if (fs->flags & PROTO_CHILD) {
+		/* May need to close upvalues first. */
+		bcemit_AJ(fs, BC_UCLO, 0, 0);
+	}
+	bcemit_INS(fs, ins);
+}
+
+/* Parse 'break' statement. */
+static void parse_break(LexState *ls)
+{
+	ls->fs->bl->flags |= FSCOPE_BREAK;
+	gola_new(ls, NAME_BREAK, VSTACK_GOTO, bcemit_jmp(ls->fs));
+}
+
+/* Parse label. */
+static void parse_label(LexState *ls)
+{
+	FuncState *fs = ls->fs;
+	ktap_str_t *name;
+	int idx;
+
+	fs->lasttarget = fs->pc;
+	fs->bl->flags |= FSCOPE_GOLA;
+	kp_lex_next(ls);  /* Skip '::'. */
+	name = lex_str(ls);
+	if (gola_findlabel(ls, name))
+		kp_lex_error(ls, 0, KP_ERR_XLDUP, getstr(name));
+	idx = gola_new(ls, name, VSTACK_LABEL, fs->pc);
+	lex_check(ls, TK_label);
+	/* Recursively parse trailing statements: labels and ';'. */
+	for (;;) {
+		if (ls->tok == TK_label) {
+			synlevel_begin(ls);
+			parse_label(ls);
+			synlevel_end(ls);
+		} else if (ls->tok == ';') {
+			kp_lex_next(ls);
+		} else {
+			break;
+		}
+	}
+	/* Trailing label is considered to be outside of scope. */
+	if (parse_isend(ls->tok) && ls->tok != TK_until)
+		ls->vstack[idx].slot = fs->bl->nactvar;
+	gola_resolve(ls, fs->bl, idx);
+}
+
+/* -- Blocks, loops and conditional statements ---------------------------- */
+
+/* Parse a block. */
+static void parse_block(LexState *ls)
+{
+	FuncState *fs = ls->fs;
+	FuncScope bl;
+
+	fscope_begin(fs, &bl, 0);
+	parse_chunk(ls);
+	fscope_end(fs);
+}
+
+/* Parse 'while' statement. */
+static void parse_while(LexState *ls, BCLine line)
+{
+	FuncState *fs = ls->fs;
+	BCPos start, loop, condexit;
+	FuncScope bl;
+
+	kp_lex_next(ls);  /* Skip 'while'. */
+	start = fs->lasttarget = fs->pc;
+	condexit = expr_cond(ls);
+	fscope_begin(fs, &bl, FSCOPE_LOOP);
+	//lex_check(ls, TK_do);
+	lex_check(ls, '{');
+	loop = bcemit_AD(fs, BC_LOOP, fs->nactvar, 0);
+	parse_block(ls);
+	jmp_patch(fs, bcemit_jmp(fs), start);
+	//lex_match(ls, TK_end, TK_while, line);
+	lex_check(ls, '}');
+	fscope_end(fs);
+	jmp_tohere(fs, condexit);
+	jmp_patchins(fs, loop, fs->pc);
+}
+
+/* Parse 'repeat' statement. */
+static void parse_repeat(LexState *ls, BCLine line)
+{
+	FuncState *fs = ls->fs;
+	BCPos loop = fs->lasttarget = fs->pc;
+	BCPos condexit;
+	FuncScope bl1, bl2;
+
+	fscope_begin(fs, &bl1, FSCOPE_LOOP);  /* Breakable loop scope. */
+	fscope_begin(fs, &bl2, 0);  /* Inner scope. */
+	kp_lex_next(ls);  /* Skip 'repeat'. */
+	bcemit_AD(fs, BC_LOOP, fs->nactvar, 0);
+	parse_chunk(ls);
+	lex_match(ls, TK_until, TK_repeat, line);
+	/* Parse condition (still inside inner scope). */
+	condexit = expr_cond(ls);
+	/* No upvalues? Just end inner scope. */
+	if (!(bl2.flags & FSCOPE_UPVAL)) {
+		fscope_end(fs);
+	} else {
+		/* Otherwise generate: cond: UCLO+JMP out,
+		 * !cond: UCLO+JMP loop. */
+		parse_break(ls);  /* Break from loop and close upvalues. */
+		jmp_tohere(fs, condexit);
+		fscope_end(fs);  /* End inner scope and close upvalues. */
+		condexit = bcemit_jmp(fs);
+	}
+	jmp_patch(fs, condexit, loop);  /* Jump backwards if !cond. */
+	jmp_patchins(fs, loop, fs->pc);
+	fscope_end(fs);  /* End loop scope. */
+}
+
+/* Parse numeric 'for'. */
+static void parse_for_num(LexState *ls, ktap_str_t *varname, BCLine line)
+{
+	FuncState *fs = ls->fs;
+	BCReg base = fs->freereg;
+	FuncScope bl;
+	BCPos loop, loopend;
+
+	/* Hidden control variables. */
+	var_new_fixed(ls, FORL_IDX, VARNAME_FOR_IDX);
+	var_new_fixed(ls, FORL_STOP, VARNAME_FOR_STOP);
+	var_new_fixed(ls, FORL_STEP, VARNAME_FOR_STEP);
+	/* Visible copy of index variable. */
+	var_new(ls, FORL_EXT, varname);
+	lex_check(ls, '=');
+	expr_next(ls);
+	lex_check(ls, ',');
+	expr_next(ls);
+	if (lex_opt(ls, ',')) {
+		expr_next(ls);
+	} else {
+		/* Default step is 1. */
+		bcemit_AD(fs, BC_KSHORT, fs->freereg, 1);
+		bcreg_reserve(fs, 1);
+	}
+	var_add(ls, 3);  /* Hidden control variables. */
+	//lex_check(ls, TK_do);
+	lex_check(ls, ')');
+	lex_check(ls, '{');
+	loop = bcemit_AJ(fs, BC_FORI, base, NO_JMP);
+	fscope_begin(fs, &bl, 0);  /* Scope for visible variables. */
+	var_add(ls, 1);
+	bcreg_reserve(fs, 1);
+	parse_block(ls);
+	fscope_end(fs);
+	/* Perform loop inversion. Loop control instructions are at the end. */
+	loopend = bcemit_AJ(fs, BC_FORL, base, NO_JMP);
+	fs->bcbase[loopend].line = line;  /* Fix line for control ins. */
+	jmp_patchins(fs, loopend, loop+1);
+	jmp_patchins(fs, loop, fs->pc);
+}
+
+/*
+ * Try to predict whether the iterator is next() and specialize the bytecode.
+ * Detecting next() and pairs() by name is simplistic, but quite effective.
+ * The interpreter backs off if the check for the closure fails at runtime.
+ */
+static int predict_next(LexState *ls, FuncState *fs, BCPos pc)
+{
+	BCIns ins = fs->bcbase[pc].ins;
+	ktap_str_t *name;
+	const ktap_val_t *o;
+
+	switch (bc_op(ins)) {
+	case BC_MOV:
+		name = var_get(ls, fs, bc_d(ins)).name;
+		break;
+	case BC_UGET:
+		name = ls->vstack[fs->uvmap[bc_d(ins)]].name;
+		break;
+	case BC_GGET:
+		/* There's no inverse index (yet), so lookup the strings. */
+		o = kp_tab_getstr(fs->kt, kp_str_newz("pairs"));
+		if (o && tvhaskslot(o) && tvkslot(o) == bc_d(ins))
+			return 1;
+		o = kp_tab_getstr(fs->kt, kp_str_newz("next"));
+		if (o && tvhaskslot(o) && tvkslot(o) == bc_d(ins))
+			return 1;
+		return 0;
+	default:
+		return 0;
+	}
+
+	return (name->len == 5 && !strcmp(getstr(name), "pairs")) ||
+		(name->len == 4 && !strcmp(getstr(name), "next"));
+}
+
+/* Parse 'for' iterator. */
+static void parse_for_iter(LexState *ls, ktap_str_t *indexname)
+{
+	FuncState *fs = ls->fs;
+	ExpDesc e;
+	BCReg nvars = 0;
+	BCLine line;
+	BCReg base = fs->freereg + 3;
+	BCPos loop, loopend, exprpc = fs->pc;
+	FuncScope bl;
+	int isnext;
+
+	/* Hidden control variables. */
+	var_new_fixed(ls, nvars++, VARNAME_FOR_GEN);
+	var_new_fixed(ls, nvars++, VARNAME_FOR_STATE);
+	var_new_fixed(ls, nvars++, VARNAME_FOR_CTL);
+
+	/* Visible variables returned from iterator. */
+	var_new(ls, nvars++, indexname);
+	while (lex_opt(ls, ','))
+		var_new(ls, nvars++, lex_str(ls));
+	lex_check(ls, TK_in);
+	line = ls->linenumber;
+	assign_adjust(ls, 3, expr_list(ls, &e), &e);
+	/* The iterator needs another 3 slots (func + 2 args). */
+	bcreg_bump(fs, 3);
+	isnext = (nvars <= 5 && predict_next(ls, fs, exprpc));
+	var_add(ls, 3);  /* Hidden control variables. */
+	//lex_check(ls, TK_do);
+	lex_check(ls, ')');
+	lex_check(ls, '{');
+	loop = bcemit_AJ(fs, isnext ? BC_ISNEXT : BC_JMP, base, NO_JMP);
+	fscope_begin(fs, &bl, 0);  /* Scope for visible variables. */
+	var_add(ls, nvars-3);
+	bcreg_reserve(fs, nvars-3);
+	parse_block(ls);
+	fscope_end(fs);
+	/* Perform loop inversion. Loop control instructions are at the end. */
+	jmp_patchins(fs, loop, fs->pc);
+	bcemit_ABC(fs, isnext ? BC_ITERN : BC_ITERC, base, nvars-3+1, 2+1);
+	loopend = bcemit_AJ(fs, BC_ITERL, base, NO_JMP);
+	fs->bcbase[loopend-1].line = line;  /* Fix line for control ins. */
+	fs->bcbase[loopend].line = line;
+	jmp_patchins(fs, loopend, loop+1);
+}
+
+/* Parse 'for' statement. */
+static void parse_for(LexState *ls, BCLine line)
+{
+	FuncState *fs = ls->fs;
+	ktap_str_t *varname;
+	FuncScope bl;
+
+	fscope_begin(fs, &bl, FSCOPE_LOOP);
+	kp_lex_next(ls);  /* Skip 'for'. */
+	lex_check(ls, '(');
+	varname = lex_str(ls);  /* Get first variable name. */
+	if (ls->tok == '=')
+		parse_for_num(ls, varname, line);
+	else if (ls->tok == ',' || ls->tok == TK_in)
+		parse_for_iter(ls, varname);
+	else
+		err_syntax(ls, KP_ERR_XFOR);
+	//lex_check(ls, '}');
+	//lex_match(ls, TK_end, TK_for, line);
+	lex_match(ls, '}', TK_for, line);
+	fscope_end(fs);  /* Resolve break list. */
+}
+
+/* Parse condition and 'then' block. */
+static BCPos parse_then(LexState *ls)
+{
+	BCPos condexit;
+	kp_lex_next(ls);  /* Skip 'if' or 'elseif'. */
+	condexit = expr_cond(ls);
+	lex_check(ls, '{');
+	parse_block(ls);
+	lex_check(ls, '}');
+	return condexit;
+}
+
+/* Parse 'if' statement. */
+static void parse_if(LexState *ls, BCLine line)
+{
+	FuncState *fs = ls->fs;
+	BCPos flist;
+	BCPos escapelist = NO_JMP;
+	flist = parse_then(ls);
+	while (ls->tok == TK_elseif) {  /* Parse multiple 'elseif' blocks. */
+		jmp_append(fs, &escapelist, bcemit_jmp(fs));
+		jmp_tohere(fs, flist);
+		flist = parse_then(ls);
+	}
+	if (ls->tok == TK_else) {  /* Parse optional 'else' block. */
+		jmp_append(fs, &escapelist, bcemit_jmp(fs));
+		jmp_tohere(fs, flist);
+		kp_lex_next(ls);  /* Skip 'else'. */
+		lex_check(ls, '{');
+		parse_block(ls);
+		lex_check(ls, '}');
+	} else {
+		jmp_append(fs, &escapelist, flist);
+	}
+	jmp_tohere(fs, escapelist);
+	//lex_match(ls, TK_end, TK_if, line);
+}
+
+/* Parse 'trace' and 'trace_end' statements. */
+static void parse_trace(LexState *ls)
+{
+	ExpDesc v, key, args;
+	ktap_str_t *kdebug_str = kp_str_newz("kdebug");
+	ktap_str_t *probe_str = kp_str_newz("trace_by_id");
+	ktap_str_t *probe_end_str = kp_str_newz("trace_end");
+	FuncState *fs = ls->fs;
+	int token = ls->tok;
+	BCIns ins;
+	BCReg base;
+	BCLine line = ls->linenumber;
+
+	if (token == TK_trace)
+		kp_lex_read_string_until(ls, '{');
+	else
+		kp_lex_next(ls);  /* skip "trace_end" keyword */
+
+	/* kdebug */
+	expr_init(&v, VGLOBAL, 0);
+	v.u.sval = kdebug_str;
+	expr_toanyreg(fs, &v);
+
+	/* fieldsel: kdebug.trace_by_id or kdebug.trace_end */
+	expr_init(&key, VKSTR, 0);
+	key.u.sval = token == TK_trace ? probe_str : probe_end_str;
+	expr_index(fs, &v, &key);
+
+	/* funcargs */
+	expr_tonextreg(fs, &v);
+
+	if (token == TK_trace) {
+		ktap_eventdesc_t *evdef_info;
+		const char *str;
+
+		/* argument: EVENTDEF string */
+		lex_check(ls, TK_string);
+		str = svalue(&ls->tokval);
+		evdef_info = kp_parse_events(str);
+		if (!evdef_info)
+			kp_lex_error(ls, 0, KP_ERR_XEVENTDEF, str);
+
+
+		/* pass a userspace pointer to kernel */
+		expr_init(&args, VKNUM, 0);
+		set_number(&args.u.nval, (ktap_number)evdef_info);
+
+		expr_tonextreg(fs, &args);
+	}
+
+	/* argument: callback function */
+	parse_body_no_args(ls, &args, 0, ls->linenumber);
+
+	expr_tonextreg(fs, &args);
+
+	base = v.u.s.info;  /* base register for call */
+	ins = BCINS_ABC(BC_CALL, base, 2, fs->freereg - base);
+
+	expr_init(&v, VCALL, bcemit_INS(fs, ins));
+	v.u.s.aux = base;
+	fs->bcbase[fs->pc - 1].line = line;
+	fs->freereg = base+1;  /* Leave one result by default. */
+
+	setbc_b(bcptr(fs, &v), 1);  /* No results. */
+}
+
+
+/* Parse 'profile' and 'tick' statements. */
+static void parse_timer(LexState *ls)
+{
+	FuncState *fs = ls->fs;
+	ExpDesc v, key, args;
+	ktap_str_t *token_str = rawtsvalue(&ls->tokval);
+	ktap_str_t *interval_str;
+	BCLine line = ls->linenumber;
+	BCIns ins;
+	BCReg base;
+
+	kp_lex_next(ls);  /* skip '-' */
+
+	kp_lex_read_string_until(ls, '{');
+	interval_str = rawtsvalue(&ls->tokval);
+	lex_check(ls, TK_string);
+
+	/* timer */
+	expr_init(&v, VGLOBAL, 0);
+	v.u.sval = kp_str_newz("timer");
+	expr_toanyreg(fs, &v);
+
+	/* fieldsel: timer.profile, timer.tick */
+	expr_init(&key, VKSTR, 0);
+	key.u.sval = token_str;
+	expr_index(fs, &v, &key);
+
+	/* funcargs */
+	expr_tonextreg(fs, &v);
+
+	/* argument: interval string */
+	expr_init(&args, VKSTR, 0);
+	args.u.sval = interval_str;
+
+	expr_tonextreg(fs, &args);
+
+	/* argument: callback function */
+	parse_body_no_args(ls, &args, 0, ls->linenumber);
+
+	expr_tonextreg(fs, &args);
+
+	base = v.u.s.info;  /* base register for call */
+	ins = BCINS_ABC(BC_CALL, base, 2, fs->freereg - base);
+
+	expr_init(&v, VCALL, bcemit_INS(fs, ins));
+	v.u.s.aux = base;
+	fs->bcbase[fs->pc - 1].line = line;
+	fs->freereg = base+1;  /* Leave one result by default. */
+
+	setbc_b(bcptr(fs, &v), 1);  /* No results. */
+}
+
+/* -- Parse statements ---------------------------------------------------- */
+
+/* Parse a statement. Returns 1 if it must be the last one in a chunk. */
+static int parse_stmt(LexState *ls)
+{
+	BCLine line = ls->linenumber;
+	switch (ls->tok) {
+	case TK_if:
+		parse_if(ls, line);
+		break;
+	case TK_while:
+		parse_while(ls, line);
+		break;
+	case TK_do:
+		kp_lex_next(ls);
+		parse_block(ls);
+		lex_match(ls, TK_end, TK_do, line);
+		break;
+	case TK_for:
+		parse_for(ls, line);
+		break;
+	case TK_repeat:
+		parse_repeat(ls, line);
+		break;
+	case TK_function:
+		parse_func(ls, line);
+		break;
+	case TK_local:
+		kp_lex_next(ls);
+		parse_local(ls);
+		break;
+	case TK_return:
+		parse_return(ls);
+		return 1;  /* Must be last. */
+	case TK_break:
+		kp_lex_next(ls);
+		parse_break(ls);
+		return 0;  /* Need not be last. */
+	case ';':
+		kp_lex_next(ls);
+		break;
+	case TK_label:
+		parse_label(ls);
+		break;
+	case TK_trace:
+	case TK_trace_end:
+		parse_trace(ls);
+		break;
+	case TK_profile:
+	case TK_tick:
+		parse_timer(ls);
+		break;
+	default:
+		parse_call_assign(ls);
+		break;
+	}
+	return 0;
+}
+
+/* A chunk is a list of statements optionally separated by semicolons. */
+static void parse_chunk(LexState *ls)
+{
+	int islast = 0;
+
+	synlevel_begin(ls);
+	while (!islast && !parse_isend(ls->tok)) {
+		islast = parse_stmt(ls);
+		lex_opt(ls, ';');
+		kp_assert(ls->fs->framesize >= ls->fs->freereg &&
+			ls->fs->freereg >= ls->fs->nactvar);
+		/* Free registers after each stmt. */
+		ls->fs->freereg = ls->fs->nactvar;
+	}
+	synlevel_end(ls);
+}
+
+/* Entry point of bytecode parser. */
+ktap_proto_t *kp_parse(LexState *ls)
+{
+	FuncState fs;
+	FuncScope bl;
+	ktap_proto_t *pt;
+
+	ls->chunkname = kp_str_newz(ls->chunkarg);
+	ls->level = 0;
+	fs_init(ls, &fs);
+	fs.linedefined = 0;
+	fs.numparams = 0;
+	fs.bcbase = NULL;
+	fs.bclim = 0;
+	fs.flags |= PROTO_VARARG;  /* Main chunk is always a vararg func. */
+	fscope_begin(&fs, &bl, 0);
+	bcemit_AD(&fs, BC_FUNCV, 0, 0);  /* Placeholder. */
+	kp_lex_next(ls);  /* Read-ahead first token. */
+	parse_chunk(ls);
+	if (ls->tok != TK_eof)
+		err_token(ls, TK_eof);
+	pt = fs_finish(ls, ls->linenumber);
+	kp_assert(fs.prev == NULL);
+	kp_assert(ls->fs == NULL);
+	kp_assert(pt->sizeuv == 0);
+	return pt;
+}
+
diff --git a/tools/ktap/kp_parse.h b/tools/ktap/kp_parse.h
new file mode 100644
index 0000000..90d27cb
--- /dev/null
+++ b/tools/ktap/kp_parse.h
@@ -0,0 +1,4 @@
+
+ktap_proto_t *kp_parse(LexState *ls);
+ktap_str_t *kp_parse_keepstr(LexState *ls, const char *str, size_t l);
+
-- 
1.8.1.4



* [PATCH v2 21/29] ktap: add symbol handling code(tools/ktap/symbol.[c|h])
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (19 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 20/29] ktap: add compiler(tools/ktap/kp_[lex|parse].[c|h]) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 22/29] ktap: add events parse code(tools/ktap/kp_parse_events.c) Jovi Zhangwei
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

This file is used for uprobe (including SDT) symbol lookup,
for example:

trace probe:/lib64/libc.so.6:malloc {
        print("malloc entry:", execname)
}

trace sdt:/lib64/libc.so.6:* {
        print(execname, argstr)
}

It needs libelf library support.
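
The lookup hands every symbol it finds to a callback of type symbol_actor
(see kp_symbol.h). The sketch below illustrates that contract only: it
assumes a hypothetical in-memory table in place of the libelf walk, and
walk_symbols, fake_sym, match_ctx and count_matching are illustrative
names that are not part of this patch.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t vaddr_t;	/* stand-in for GElf_Addr in this sketch */
typedef int (*symbol_actor)(const char *name, vaddr_t addr, void *arg);

/* Hypothetical replacement for an ELF symbol table entry. */
struct fake_sym {
	const char *name;
	vaddr_t value;
};

/*
 * Hypothetical walker following the convention documented in kp_symbol.h:
 * a non-zero actor result aborts the walk and is returned as-is,
 * otherwise the number of visited symbols is returned.
 */
static int walk_symbols(const struct fake_sym *syms, size_t n,
			vaddr_t load_address, symbol_actor actor, void *arg)
{
	int count = 0;
	size_t i;

	assert(actor != NULL);

	for (i = 0; i < n; i++) {
		/* uprobe offsets are relative to the load address */
		int ret = actor(syms[i].name, syms[i].value - load_address, arg);

		if (ret)
			return ret;
		count++;
	}

	return count;
}

struct match_ctx {
	const char *pattern;	/* glob pattern, e.g. "malloc*" */
	int hits;
};

/* Example actor: count and print the symbols matching a glob pattern. */
static int count_matching(const char *name, vaddr_t addr, void *arg)
{
	struct match_ctx *ctx = arg;

	if (fnmatch(ctx->pattern, name, 0) == 0) {
		printf("%s at offset 0x%llx\n", name, (unsigned long long)addr);
		ctx->hits++;
	}

	return 0;	/* keep walking */
}
```

Keeping the matching logic in the actor lets the same walker serve both
plain function symbols and SDT notes.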

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/kp_symbol.c | 360 +++++++++++++++++++++++++++++++++++++++++++++++++
 tools/ktap/kp_symbol.h |  50 +++++++
 2 files changed, 410 insertions(+)
 create mode 100644 tools/ktap/kp_symbol.c
 create mode 100644 tools/ktap/kp_symbol.h

diff --git a/tools/ktap/kp_symbol.c b/tools/ktap/kp_symbol.c
new file mode 100644
index 0000000..1d5b73e
--- /dev/null
+++ b/tools/ktap/kp_symbol.c
@@ -0,0 +1,360 @@
+/*
+ * kp_symbol.c
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2013 Azat Khuzhin <a3at.mail@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <string.h>
+#include <linux/limits.h>
+
+#include <libelf.h>
+
+#include "../../include/uapi/ktap/ktap_types.h"
+#include "kp_symbol.h"
+
+const char *dbg_link_name = ".gnu_debuglink";
+const char *dbg_bin_dir = "/usr/lib/debug";
+
+static Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
+				    GElf_Shdr *shp, const char *name)
+{
+	Elf_Scn *scn = NULL;
+
+	/* Elf is corrupted/truncated, avoid calling elf_strptr. */
+	if (!elf_rawdata(elf_getscn(elf, ep->e_shstrndx), NULL))
+		return NULL;
+
+	while ((scn = elf_nextscn(elf, scn)) != NULL) {
+		char *str;
+
+		gelf_getshdr(scn, shp);
+		str = elf_strptr(elf, ep->e_shstrndx, shp->sh_name);
+		if (str && !strcmp(name, str))
+			break;
+	}
+
+	return scn;
+}
+
+/**
+ * Find p_vaddr of the PT_LOAD header at file offset 0; @return 0 or -1.
+ */
+static int find_load_address(Elf *elf, vaddr_t *load_address)
+{
+	GElf_Phdr phdr;
+	size_t i, phdrnum;
+
+	if (elf_getphdrnum(elf, &phdrnum))
+		return -1;
+
+	for (i = 0; i < phdrnum; i++) {
+		if (gelf_getphdr(elf, i, &phdr) == NULL)
+			return -1;
+
+		if (phdr.p_type != PT_LOAD || phdr.p_offset != 0)
+			continue;
+
+		*load_address = phdr.p_vaddr;
+		return 0;
+	}
+
+	/* cannot find load address */
+	return -1;
+}
+
+static size_t elf_symbols(GElf_Shdr shdr)
+{
+	return shdr.sh_entsize ? shdr.sh_size / shdr.sh_entsize : 0;
+}
+
+static int dso_symbols(Elf *elf, symbol_actor actor, void *arg)
+{
+	Elf_Data *elf_data = NULL;
+	Elf_Scn *scn = NULL;
+	GElf_Sym sym;
+	GElf_Shdr shdr;
+	int symbols_count = 0;
+	vaddr_t load_address;
+
+	if (find_load_address(elf, &load_address))
+		return -1;
+
+	while ((scn = elf_nextscn(elf, scn))) {
+		int i;
+
+		gelf_getshdr(scn, &shdr);
+
+		if (shdr.sh_type != SHT_SYMTAB)
+			continue;
+
+		elf_data = elf_getdata(scn, elf_data);
+
+		for (i = 0; i < elf_symbols(shdr); i++) {
+			char *name;
+			vaddr_t addr;
+			int ret;
+
+			gelf_getsym(elf_data, i, &sym);
+
+			if (GELF_ST_TYPE(sym.st_info) != STT_FUNC)
+				continue;
+
+			name = elf_strptr(elf, shdr.sh_link, sym.st_name);
+			addr = sym.st_value - load_address;
+
+			ret = actor(name, addr, arg);
+			if (ret)
+				return ret;
+
+			++symbols_count;
+		}
+	}
+
+	return symbols_count;
+}
+
+#define SDT_NOTE_TYPE 3
+#define SDT_NOTE_COUNT 3
+#define SDT_NOTE_SCN ".note.stapsdt"
+#define SDT_NOTE_NAME "stapsdt"
+
+static vaddr_t sdt_note_addr(Elf *elf, const char *data, size_t len, int type)
+{
+	vaddr_t vaddr;
+
+	/*
+	 * Three addresses need to be obtained :
+	 * Marker location, address of base section and semaphore location
+	 */
+	union {
+		Elf64_Addr a64[3];
+		Elf32_Addr a32[3];
+	} buf;
+
+	/*
+	 * dst and src are required for translation from file to memory
+	 * representation
+	 */
+	Elf_Data dst = {
+		.d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
+		.d_size = gelf_fsize(elf, ELF_T_ADDR, SDT_NOTE_COUNT, EV_CURRENT),
+		.d_off = 0, .d_align = 0
+	};
+
+	Elf_Data src = {
+		.d_buf = (void *) data, .d_type = ELF_T_ADDR,
+		.d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
+		.d_align = 0
+	};
+
+	/* Check the type of each of the notes */
+	if (type != SDT_NOTE_TYPE)
+		return 0;
+
+	if (len < dst.d_size + SDT_NOTE_COUNT)
+		return 0;
+
+	/* Translation from file representation to memory representation */
+	if (gelf_xlatetom(elf, &dst, &src,
+			  elf_getident(elf, NULL)[EI_DATA]) == NULL)
+		return 0; /* TODO */
+
+	memcpy(&vaddr, &buf, sizeof(vaddr));
+
+	return vaddr;
+}
+
+static const char *sdt_note_name(Elf *elf, GElf_Nhdr *nhdr, const char *data)
+{
+	const char *provider = data + gelf_fsize(elf,
+		ELF_T_ADDR, SDT_NOTE_COUNT, EV_CURRENT);
+	const char *name = (const char *)memchr(provider, '\0',
+		data + nhdr->n_descsz - provider);
+
+	if (name++ == NULL)
+		return NULL;
+
+	return name;
+}
+
+static const char *sdt_note_data(const Elf_Data *data, size_t off)
+{
+	return ((data->d_buf) + off);
+}
+
+static int dso_sdt_notes(Elf *elf, symbol_actor actor, void *arg)
+{
+	GElf_Ehdr ehdr;
+	Elf_Scn *scn = NULL;
+	Elf_Data *data;
+	GElf_Shdr shdr;
+	size_t shstrndx;
+	size_t next;
+	GElf_Nhdr nhdr;
+	size_t name_off, desc_off, offset;
+	vaddr_t vaddr = 0;
+	int symbols_count = 0;
+
+	if (gelf_getehdr(elf, &ehdr) == NULL)
+		return 0;
+	if (elf_getshdrstrndx(elf, &shstrndx) != 0)
+		return 0;
+
+	/*
+	 * Look for section type = SHT_NOTE, flags = no SHF_ALLOC
+	 * and name = .note.stapsdt
+	 */
+	scn = elf_section_by_name(elf, &ehdr, &shdr, SDT_NOTE_SCN);
+	if (!scn)
+		return 0;
+	if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC))
+		return 0;
+
+	data = elf_getdata(scn, NULL);
+
+	for (offset = 0;
+		(next = gelf_getnote(data, offset, &nhdr, &name_off, &desc_off)) > 0;
+		offset = next) {
+		const char *name;
+		int ret;
+
+		if (nhdr.n_namesz != sizeof(SDT_NOTE_NAME) ||
+		    memcmp(data->d_buf + name_off, SDT_NOTE_NAME,
+			    sizeof(SDT_NOTE_NAME)))
+			continue;
+
+		name = sdt_note_name(elf, &nhdr, sdt_note_data(data, desc_off));
+		if (!name)
+			continue;
+
+		vaddr = sdt_note_addr(elf, sdt_note_data(data, desc_off),
+					nhdr.n_descsz, nhdr.n_type);
+		if (!vaddr)
+			continue;
+
+		ret = actor(name, vaddr, arg);
+		if (ret)
+			return ret;
+
+		++symbols_count;
+	}
+
+	return symbols_count;
+}
+
+int dso_follow_debuglink(Elf *elf,
+			 const char *orig_exec,
+			 int type,
+			 symbol_actor actor,
+			 void *arg)
+{
+	GElf_Ehdr ehdr;
+	size_t shstrndx, orig_exec_dir_len;
+	GElf_Shdr shdr;
+	Elf_Scn *dbg_link_scn;
+	Elf_Data *dbg_link_scn_data;
+	char *dbg_link, *dbg_bin, *last_slash;
+	int symbols_count;
+
+	/* First try to find the .gnu_debuglink section in the binary. */
+	if (gelf_getehdr(elf, &ehdr) == NULL)
+		return 0;
+	if (elf_getshdrstrndx(elf, &shstrndx) != 0)
+		return 0;
+
+	dbg_link_scn = elf_section_by_name(elf, &ehdr, &shdr, dbg_link_name);
+	if (dbg_link_scn == NULL)
+		return 0;
+
+	/* Debug link section found, read its content (only the first string
+	   is used, no checksum checking atm). This is the debug binary name. */
+	dbg_link_scn_data = elf_getdata(dbg_link_scn, NULL);
+	if (dbg_link_scn_data == NULL ||
+	    dbg_link_scn_data->d_size <= 0 ||
+	    dbg_link_scn_data->d_buf == NULL)
+		return 0;
+
+	/* Now compose debug executable name */
+	dbg_link = (char *)(dbg_link_scn_data->d_buf);
+	dbg_bin = malloc(strlen(dbg_bin_dir) + 1 +
+			 strlen(orig_exec) + 1 +
+			 strlen(dbg_link) + 1);
+	if (!dbg_bin)
+		return 0;
+
+	orig_exec_dir_len = PATH_MAX;
+	last_slash = strrchr(orig_exec, '/');
+	if (last_slash != NULL)
+		orig_exec_dir_len = last_slash - orig_exec;
+
+	sprintf(dbg_bin, "%s/%.*s/%s",
+		dbg_bin_dir, (int)orig_exec_dir_len, orig_exec, dbg_link);
+
+	/* Retry symbol search with the debug binary */
+	symbols_count = parse_dso_symbols(dbg_bin, type, actor, arg);
+
+	free(dbg_bin);
+
+	return symbols_count;
+}
+
+int parse_dso_symbols(const char *exec, int type, symbol_actor actor, void *arg)
+{
+	int symbols_count = 0;
+	Elf *elf;
+	int fd;
+
+	if (elf_version(EV_CURRENT) == EV_NONE)
+		return -1;
+
+	fd = open(exec, O_RDONLY);
+	if (fd < 0)
+		return -1;
+
+	elf = elf_begin(fd, ELF_C_READ, NULL);
+	if (elf) {
+		switch (type) {
+		case FIND_SYMBOL:
+			symbols_count = dso_symbols(elf, actor, arg);
+			if (symbols_count != 0)
+				break;
+			/* If no symbols found, try in the debuglink binary. */
+			symbols_count = dso_follow_debuglink(elf,
+							     exec,
+							     type,
+							     actor,
+							     arg);
+			break;
+		case FIND_STAPSDT_NOTE:
+			symbols_count = dso_sdt_notes(elf, actor, arg);
+			break;
+		}
+
+		elf_end(elf);
+	}
+
+	close(fd);
+	return symbols_count;
+}
diff --git a/tools/ktap/kp_symbol.h b/tools/ktap/kp_symbol.h
new file mode 100644
index 0000000..650e785
--- /dev/null
+++ b/tools/ktap/kp_symbol.h
@@ -0,0 +1,50 @@
+/*
+ * kp_symbol.h - extract symbols from DSO.
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2013 Azat Khuzhin <a3at.mail@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+
+#define FIND_SYMBOL 1
+#define FIND_STAPSDT_NOTE 2
+
+#ifndef NO_LIBELF
+
+#include <gelf.h>
+#include <sys/queue.h>
+
+typedef GElf_Addr vaddr_t;
+typedef int (*symbol_actor)(const char *name, vaddr_t addr, void *arg);
+
+/**
+ * Parse all DSO symbols/SDT notes and call an actor for each
+ * of them.
+ *
+ * @exec - path to DSO
+ * @type - see FIND_*
+ * @actor - actor to call (callback)
+ * @arg - argument for @actor
+ *
+ * @return
+ * A negative value on error;
+ * 0 if no symbols were found;
+ * otherwise the number of DSO symbols found.
+ */
+int
+parse_dso_symbols(const char *exec, int type, symbol_actor actor, void *arg);
+#endif
-- 
1.8.1.4



* [PATCH v2 22/29] ktap: add events parse code(tools/ktap/kp_parse_events.c)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (20 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 21/29] ktap: add symbol handling code(tools/ktap/symbol.[c|h]) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 23/29] ktap: add ring buffer reader(tools/ktap/kp_reader.c) Jovi Zhangwei
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Function 'kp_parse_events' parses the event string passed by the user
and returns an event description structure. It covers:

1). tracepoint
        Look up the tracepoint name under '/sys/kernel/debug/tracing/events/'
        and read its event id.

2). kprobe
        Look up the symbol name in '/proc/kallsyms', write the event to
        '/sys/kernel/debug/tracing/kprobe_events', then read the event id.

3). uprobe
        Look up the symbol name through libelf, write the event to
        '/sys/kernel/debug/tracing/uprobe_events', then read the event id.

4). SDT
        Same as uprobe.
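
For kprobes, the step above comes down to formatting one definition line
for the kprobe_events control file. A minimal sketch of that formatting
follows, assuming the ktap_kprobes_<pid>/<symbol> naming used in this patch
and leaving out fetch arguments; sanitize_event_name and build_kprobe_cmd
are hypothetical helper names, not functions from the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Replace every character that is not valid in an event name with '_':
 * letters and '_' are allowed anywhere, digits everywhere but first.
 */
static void sanitize_event_name(char *name)
{
	char *p;

	for (p = name; *p != '\0'; p++) {
		int ok = isalpha((unsigned char)*p) || *p == '_' ||
			 (p != name && isdigit((unsigned char)*p));

		if (!ok)
			*p = '_';
	}
}

/*
 * Format "p:GROUP/EVENT 0xADDR" for an entry probe, or
 * "r:GROUP/ret_EVENT SYMBOL" for a return probe.
 * Returns 0 on success, -1 if the line does not fit into buf.
 */
static int build_kprobe_cmd(char *buf, size_t len, int pid, int ret_probe,
			    const char *symbol, unsigned long addr)
{
	char event[64];
	int n;

	assert(buf != NULL && symbol != NULL);

	/* work on a bounded copy so the original symbol stays intact */
	snprintf(event, sizeof(event), "%s", symbol);
	sanitize_event_name(event);

	if (ret_probe)
		n = snprintf(buf, len, "r:ktap_kprobes_%d/ret_%s %s",
			     pid, event, symbol);
	else
		n = snprintf(buf, len, "p:ktap_kprobes_%d/%s 0x%lx",
			     pid, event, addr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Checking the snprintf result makes truncation visible to the caller instead
of silently writing a shortened probe definition.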

All event ids are assembled into a ktap_eventdesc_t structure, which is
finally passed to the 'kdebug.trace_by_id' function.
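
Event ids are first collected in a bitmap, so an id matched by several
patterns is counted only once, and are then flattened into an array. The
sketch below shows that idea under the assumption of a fixed-size map (the
patch grows its map on demand); struct idset and the idset_* functions are
illustrative names only.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define IDSET_MAX_ID 8192	/* assumed upper bound for this sketch */

struct idset {
	unsigned char bits[IDSET_MAX_ID / CHAR_BIT];
	int nr;			/* number of distinct ids in the set */
};

static void idset_init(struct idset *s)
{
	assert(s != NULL);
	memset(s, 0, sizeof(*s));
}

static int idset_test(const struct idset *s, int id)
{
	return (s->bits[id / CHAR_BIT] >> (id % CHAR_BIT)) & 1;
}

/* Returns 0 on success, -1 if id is outside the supported range. */
static int idset_add(struct idset *s, int id)
{
	if (id < 0 || id >= IDSET_MAX_ID)
		return -1;

	if (!idset_test(s, id)) {
		s->bits[id / CHAR_BIT] |= 1u << (id % CHAR_BIT);
		s->nr++;
	}

	return 0;
}

/*
 * Flatten the set into a sorted array of s->nr ids.
 * The caller frees the result; NULL means empty set or allocation failure.
 */
static int *idset_to_array(const struct idset *s)
{
	int *arr;
	int i, j = 0;

	if (s->nr == 0)
		return NULL;

	arr = malloc(sizeof(*arr) * s->nr);
	if (!arr)
		return NULL;

	for (i = 0; i < IDSET_MAX_ID; i++) {
		if (idset_test(s, i))
			arr[j++] = i;
	}

	return arr;
}
```

A bitmap costs one bit per possible id and yields the ids in ascending
order without a separate sort or duplicate check.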

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/kp_parse_events.c | 798 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 798 insertions(+)
 create mode 100644 tools/ktap/kp_parse_events.c

diff --git a/tools/ktap/kp_parse_events.c b/tools/ktap/kp_parse_events.c
new file mode 100644
index 0000000..50fea2f
--- /dev/null
+++ b/tools/ktap/kp_parse_events.c
@@ -0,0 +1,798 @@
+/*
+ * kp_parse_events.c - ktap events parser
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <dirent.h>
+#include <fcntl.h>
+#include <ctype.h>
+
+#include "../../include/uapi/ktap/ktap_types.h"
+#include "../../include/uapi/ktap/ktap_bc.h"
+#include "kp_symbol.h"
+#include "kp_util.h"
+
+#define TRACING_EVENTS_DIR "/sys/kernel/debug/tracing/events"
+
+static u8 *idmap;
+static int idmap_size = 1024; /* set init size */
+static int id_nr;
+
+static int idmap_init(void)
+{
+	idmap = malloc(idmap_size);
+	if (!idmap)
+		return -1;
+
+	memset(idmap, 0, idmap_size);
+	return 0;
+}
+
+static void idmap_free(void)
+{
+	id_nr = 0;
+	free(idmap);
+}
+
+static inline int idmap_is_set(int id)
+{
+	return idmap[id / 8] & (1 << (id % 8));
+}
+
+static void idmap_set(int id)
+{
+	if (id >= idmap_size * 8) {
+		int newsize = id / 8 + 100; /* room for 800 extra ids */
+		idmap = realloc(idmap, newsize);
+		memset(idmap + idmap_size, 0, newsize - idmap_size);
+		idmap_size = newsize;
+	}
+
+	if (!idmap_is_set(id))
+		id_nr++;
+
+	idmap[id / 8] = idmap[id / 8] | (1 << (id % 8));
+}
+
+static void idmap_clear(int id)
+{
+	if (!idmap_is_set(id))
+		return;
+
+	id_nr--;
+	idmap[id / 8] = idmap[id / 8] & ~ (1 << (id % 8));
+}
+
+static int idmap_get_max_id(void)
+{
+	return idmap_size * 8;
+}
+
+static int *get_id_array(void)
+{
+	int *id_array;
+	int i, j = 0;
+
+	id_array = malloc(sizeof(int) * id_nr);
+	if (!id_array)
+		return NULL;
+
+	for (i = 0; i < idmap_get_max_id(); i++) {
+		if (idmap_is_set(i))
+			id_array[j++] = i;
+	}
+
+	return id_array;
+}
+
+static int add_event(char *evtid_path)
+{
+	char id_buf[24] = {0};
+	int id, fd;
+
+	fd = open(evtid_path, O_RDONLY);
+	if (fd < 0) {
+		/*
+		 * some tracepoints have no id file (e.g. ftrace);
+		 * return success here and don't print an error.
+		 */
+		verbose_printf("warning: cannot open file %s\n", evtid_path);
+		return 0;
+	}
+
+	if (read(fd, id_buf, sizeof(id_buf) - 1) < 0) {
+		fprintf(stderr, "read file error %s\n", evtid_path);
+		close(fd);
+		return -1;
+	}
+
+	id = atoll(id_buf);
+
+	idmap_set(id);
+
+	close(fd);
+	return 0;
+}
+
+static int add_tracepoint(const char *sys_name, const char *evt_name)
+{
+	char evtid_path[PATH_MAX] = {0};
+
+	snprintf(evtid_path, PATH_MAX, "%s/%s/%s/id", TRACING_EVENTS_DIR,
+					sys_name, evt_name);
+	return add_event(evtid_path);
+}
+
+static int parse_events_add_tracepoint(char *sys, char *event)
+{
+	process_available_tracepoints(sys, event, add_tracepoint);
+	return 0;
+}
+
+enum {
+	KPROBE_EVENT,
+	UPROBE_EVENT,
+};
+
+struct probe_list {
+	struct probe_list *next;
+	int type;
+	char event[64];
+};
+
+static struct probe_list *probe_list_head; /* for cleanup resources */
+
+/*
+ * Some symbol names cannot be written to uprobe_events in debugfs as-is,
+ * e.g. the symbol "check_one_fd.part.0" in glibc.
+ * For those symbols, we change the format to:
+ * "check_one_fd.part.0" -> "check_one_fd_part_0"
+ */
+static char *format_symbol_name(const char *old_symbol)
+{
+	char *new_name = strdup(old_symbol);
+	char *name = new_name;
+	int changed = 0;
+
+	if (!isalpha(*name) && *name != '_') {
+		*name = '_';
+		changed = 1;
+	}
+
+	while (*++name != '\0') {
+		if (!isalpha(*name) && !isdigit(*name) && *name != '_') {
+			*name = '_';
+			changed = 1;
+			continue;
+		}
+	}
+
+	if (changed)
+		fprintf(stderr,
+			"Warning: symbol \"%s\" transformed to event \"%s\"\n",
+			old_symbol, new_name);
+
+	/* this is a good name */
+	return new_name;
+}
+
+
+#define KPROBE_EVENTS_PATH "/sys/kernel/debug/tracing/kprobe_events"
+
+/**
+ * @return 0 on success, otherwise -1
+ */
+static int
+write_kprobe_event(int fd, int ret_probe, const char *symbol,
+		   unsigned long start, char *fetch_args)
+{
+	char probe_event[128] = {0};
+	char event[64] = {0};
+	struct probe_list *pl;
+	char event_id_path[128] = {0};
+	char *symbol_name;
+	int id_fd, ret;
+
+	/* Some symbol names cannot be written to kprobe_events as-is */
+	symbol_name = format_symbol_name(symbol);
+
+	if (!fetch_args)
+		fetch_args = " ";
+
+	if (ret_probe) {
+		snprintf(event, 64, "ktap_kprobes_%d/ret_%s",
+			 getpid(), symbol_name);
+		/* Return probe point must be a symbol */
+		snprintf(probe_event, 128, "r:%s %s %s",
+			 event, symbol, fetch_args);
+	} else {
+		snprintf(event, 64, "ktap_kprobes_%d/%s",
+			 getpid(), symbol_name);
+		snprintf(probe_event, 128, "p:%s 0x%lx %s",
+			 event, start, fetch_args);
+	}
+
+	sprintf(event_id_path, "/sys/kernel/debug/tracing/events/%s/id", event);
+	/* if the event id already exists, don't write to kprobe_events again */
+	id_fd = open(event_id_path, O_RDONLY);
+	if (id_fd >= 0) {
+		close(id_fd);
+
+		/* remember to add the event id to the id map */
+		ret = add_event(event_id_path);
+		if (ret)
+			goto error;
+
+		goto out;
+	}
+
+	verbose_printf("write kprobe event %s\n", probe_event);
+
+	if (write(fd, probe_event, strlen(probe_event)) <= 0) {
+		fprintf(stderr, "Cannot write %s to %s\n", probe_event,
+				KPROBE_EVENTS_PATH);
+		goto error;
+	}
+
+	/* add to cleanup list */
+	pl = malloc(sizeof(struct probe_list));
+	if (!pl)
+		goto error;
+
+	pl->type = KPROBE_EVENT;
+	pl->next = probe_list_head;
+	memcpy(pl->event, event, 64);
+	probe_list_head = pl;
+
+	ret = add_event(event_id_path);
+	if (ret < 0)
+		goto error;
+
+ out:
+	free(symbol_name);
+	return 0;
+
+ error:
+	free(symbol_name);
+	return -1;
+}
+
+static unsigned long kprobes_text_start;
+static unsigned long kprobes_text_end;
+
+static void init_kprobe_prohibited_area(void)
+{
+	static int once = 0;
+
+	if (once > 0)
+		return;
+
+	once = 1;
+	kprobes_text_start     = find_kernel_symbol("__kprobes_text_start");
+	kprobes_text_end       = find_kernel_symbol("__kprobes_text_end");
+}
+
+static int check_kprobe_addr_prohibited(unsigned long addr)
+{
+	if (addr >= kprobes_text_start && addr <= kprobes_text_end)
+		return -1;
+
+	return 0;
+}
+
+struct probe_cb_base {
+	int fd;
+	int ret_probe;
+	const char *event;
+	char *binary;
+	char *symbol;
+	char *fetch_args;
+};
+
+static int kprobe_symbol_actor(void *arg, const char *name, char type,
+			       unsigned long start)
+{
+	struct probe_cb_base *base = (struct probe_cb_base *)arg;
+
+	/* only text functions can be probed */
+	if (type != 't' && type != 'T')
+		return -1;
+
+	if (!strglobmatch(name, base->symbol))
+		return -1;
+
+	if (check_kprobe_addr_prohibited(start))
+		return -1;
+
+	/* ignore the return code of the debugfs write */
+	write_kprobe_event(base->fd, base->ret_probe, name, start,
+			   base->fetch_args);
+
+	return 0; /* success */
+}
+
+static int parse_events_add_kprobe(char *event)
+{
+	char *symbol, *end;
+	struct probe_cb_base base;
+	int fd, ret;
+
+	fd = open(KPROBE_EVENTS_PATH, O_WRONLY);
+	if (fd < 0) {
+		fprintf(stderr, "Cannot open %s\n", KPROBE_EVENTS_PATH);
+		return -1;
+	}
+
+	end = strpbrk(event, "% ");
+	if (end)
+		symbol = strndup(event, end - event);
+	else
+		symbol = strdup(event);
+
+	base.fd = fd;
+	base.ret_probe = !!strstr(event, "%return");
+	base.symbol = symbol;
+	base.fetch_args = strchr(event, ' ');
+
+	init_kprobe_prohibited_area();
+
+	ret = kallsyms_parse(&base, kprobe_symbol_actor);
+	if (ret <= 0) {
+		fprintf(stderr, "cannot parse symbol \"%s\"\n", symbol);
+		ret = -1;
+	} else {
+		ret = 0;
+	}
+
+	free(symbol);
+	close(fd);
+
+	return ret;
+}
+
+#define UPROBE_EVENTS_PATH "/sys/kernel/debug/tracing/uprobe_events"
+
+/**
+ * @return 0 on success, otherwise -1
+ */
+static int
+write_uprobe_event(int fd, int ret_probe, const char *binary,
+		   const char *symbol, unsigned long addr,
+		   char *fetch_args)
+{
+	char probe_event[128] = {0};
+	char event[64] = {0};
+	struct probe_list *pl;
+	char event_id_path[128] = {0};
+	char *symbol_name;
+	int id_fd, ret;
+
+	/* Some symbol names cannot be written to uprobe_events as-is */
+	symbol_name = format_symbol_name(symbol);
+
+	if (!fetch_args)
+		fetch_args = " ";
+
+	if (ret_probe) {
+		snprintf(event, 64, "ktap_uprobes_%d/ret_%s",
+			 getpid(), symbol_name);
+		snprintf(probe_event, 128, "r:%s %s:0x%lx %s",
+			 event, binary, addr, fetch_args);
+	} else {
+		snprintf(event, 64, "ktap_uprobes_%d/%s",
+			 getpid(), symbol_name);
+		snprintf(probe_event, 128, "p:%s %s:0x%lx %s",
+			 event, binary, addr, fetch_args);
+	}
+
+	sprintf(event_id_path, "/sys/kernel/debug/tracing/events/%s/id", event);
+	/* if the event id already exists, don't write to uprobe_events again */
+	id_fd = open(event_id_path, O_RDONLY);
+	if (id_fd >= 0) {
+		close(id_fd);
+
+		/* remember to add the event id to the id map */
+		ret = add_event(event_id_path);
+		if (ret)
+			goto error;
+
+		goto out;
+	}
+
+	verbose_printf("write uprobe event %s\n", probe_event);
+
+	if (write(fd, probe_event, strlen(probe_event)) <= 0) {
+		fprintf(stderr, "Cannot write %s to %s\n", probe_event,
+				UPROBE_EVENTS_PATH);
+		goto error;
+	}
+
+	/* add to cleanup list */
+	pl = malloc(sizeof(struct probe_list));
+	if (!pl)
+		goto error;
+
+	pl->type = UPROBE_EVENT;
+	pl->next = probe_list_head;
+	memcpy(pl->event, event, 64);
+	probe_list_head = pl;
+
+	ret = add_event(event_id_path);
+	if (ret < 0)
+		goto error;
+
+ out:
+	free(symbol_name);
+	return 0;
+
+ error:
+	free(symbol_name);
+	return -1;
+}
+
+/**
+ * TODO: avoid copy-paste stuff
+ *
+ * @return 0 on success, non-zero on failure
+ */
+#ifdef NO_LIBELF
+static int parse_events_resolve_symbol(int fd, char *event, int type)
+{
+	char *colon, *binary, *fetch_args;
+	unsigned long symbol_address;
+
+	colon = strchr(event, ':');
+	if (!colon)
+		return -1;
+
+	symbol_address = strtol(colon + 1 /* skip ":" */, NULL, 0);
+
+	fetch_args = strchr(event, ' ');
+
+	/*
+	 * We already have the address, so there is nothing to resolve.
+	 */
+	if (symbol_address) {
+		int ret;
+
+		binary = strndup(event, colon - event);
+		ret = write_uprobe_event(fd, !!strstr(event, "%return"), binary,
+					 "NULL", symbol_address, fetch_args);
+		free(binary);
+		return ret;
+	}
+
+	fprintf(stderr, "error: cannot resolve event \"%s\" without libelf, "
+			"please recompile ktap with NO_LIBELF disabled\n",
+			event);
+	exit(EXIT_FAILURE);
+	return -1;
+}
+
+#else
+static int uprobe_symbol_actor(const char *name, vaddr_t addr, void *arg)
+{
+	struct probe_cb_base *base = (struct probe_cb_base *)arg;
+	int ret;
+
+	if (!strglobmatch(name, base->symbol))
+		return 0;
+
+	verbose_printf("uprobe: binary: \"%s\" symbol \"%s\" "
+			"resolved to 0x%lx\n",
+			base->binary, base->symbol, (unsigned long)addr);
+
+	ret = write_uprobe_event(base->fd, base->ret_probe, base->binary,
+				 name, addr, base->fetch_args);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int parse_events_resolve_symbol(int fd, char *event, int type)
+{
+	char *colon, *end;
+	vaddr_t symbol_address;
+	int ret;
+	struct probe_cb_base base = {
+		.fd = fd,
+		.event = event
+	};
+
+	colon = strchr(event, ':');
+	if (!colon)
+		return 0;
+
+	base.ret_probe = !!strstr(event, "%return");
+	symbol_address = strtol(colon + 1 /* skip ":" */, NULL, 0);
+	base.binary = strndup(event, colon - event);
+
+	base.fetch_args = strchr(event, ' ');
+
+	/*
+	 * We already have the address, so there is nothing to resolve.
+	 */
+	if (symbol_address) {
+		int ret;
+		ret = write_uprobe_event(fd, base.ret_probe, base.binary,
+					 "NULL", symbol_address,
+					 base.fetch_args);
+		free(base.binary);
+		return ret;
+	}
+
+	end = strpbrk(colon + 1, "% ");
+	if (end)
+		base.symbol = strndup(colon + 1, end - 1 - colon);
+	else
+		base.symbol = strdup(colon + 1);
+
+	ret = parse_dso_symbols(base.binary, type, uprobe_symbol_actor,
+				(void *)&base);
+	if (!ret) {
+		fprintf(stderr, "error: cannot find symbol %s in binary %s\n",
+			base.symbol, base.binary);
+		ret = -1;
+	} else if (ret > 0) {
+		/* no error occurred while parsing symbols */
+		ret = 0;
+	}
+
+	free(base.binary);
+	free(base.symbol);
+
+	return ret;
+}
+#endif
+
+static int parse_events_add_uprobe(char *old_event, int type)
+{
+	int ret;
+	int fd;
+
+	fd = open(UPROBE_EVENTS_PATH, O_WRONLY);
+	if (fd < 0) {
+		fprintf(stderr, "Cannot open %s\n", UPROBE_EVENTS_PATH);
+		return -1;
+	}
+
+	ret = parse_events_resolve_symbol(fd, old_event, type);
+
+	close(fd);
+	return ret;
+}
+
+static int parse_events_add_probe(char *old_event)
+{
+	char *separator;
+
+	separator = strchr(old_event, ':');
+	if (!separator || (separator == old_event))
+		return parse_events_add_kprobe(old_event);
+	else
+		return parse_events_add_uprobe(old_event, FIND_SYMBOL);
+}
+
+static int parse_events_add_sdt(char *old_event)
+{
+	return parse_events_add_uprobe(old_event, FIND_STAPSDT_NOTE);
+}
+
+static void strim(char *s)
+{
+	size_t size;
+	char *end;
+
+	size = strlen(s);
+	if (!size)
+		return;
+
+	end = s + size - 1;
+	while (end >= s && isspace(*end))
+		end--;
+
+	*(end + 1) = '\0';
+}
+
+static int get_sys_event_filter_str(char *start,
+				    char **sys, char **event, char **filter)
+{
+	char *separator, *separator2, *ptr, *end;
+
+	while (*start == ' ')
+		start++;
+
+	/* find sys */
+	separator = strchr(start, ':');
+	if (!separator || (separator == start)) {
+		return -1;
+	}
+
+	ptr = malloc(separator - start + 1);
+	if (!ptr)
+		return -1;
+
+	strncpy(ptr, start, separator - start);
+	ptr[separator - start] = '\0';
+
+	strim(ptr);
+	*sys = ptr;
+
+	if (!strcmp(*sys, "probe") && (*(separator + 1) == '/')) {
+		/* it's uprobe event */
+		separator2 = strchr(separator + 1, ':');
+		if (!separator2)
+			return -1;
+	} else
+		separator2 = separator;
+
+	/* find filter */
+	end = start + strlen(start);
+	while (*--end == ' ') {
+	}
+
+	if (*end == '/') {
+		char *filter_start;
+
+		filter_start = strchr(separator2, '/');
+		if (filter_start == end)
+			return -1;
+
+		ptr = malloc(end - filter_start);
+		if (!ptr)
+			return -1;
+
+		memcpy(ptr, filter_start + 1, end - filter_start - 1);
+		ptr[end - filter_start - 1] = '\0';
+
+		*filter = ptr;
+
+		end = filter_start;
+	} else {
+		*filter = NULL;
+		end++;
+	}
+
+	/* find event */
+	ptr = malloc(end - separator);
+	if (!ptr)
+		return -1;
+
+	memcpy(ptr, separator + 1, end - separator - 1);
+	ptr[end - separator - 1] = '\0';
+
+	strim(ptr);
+	*event = ptr;
+
+	return 0;
+}
+
+static char *get_next_eventdef(char *str)
+{
+	char *separator;
+
+	separator = strchr(str, ',');
+	if (!separator)
+		return str + strlen(str);
+
+	*separator = '\0';
+	return separator + 1;
+}
+
+ktap_eventdesc_t *kp_parse_events(const char *eventdef)
+{
+	char *str = strdup(eventdef);
+	char *sys, *event, *filter, *next;
+	ktap_eventdesc_t *evdef_info;
+	int ret;
+
+	idmap_init();
+
+ parse_next_eventdef:
+	next = get_next_eventdef(str);
+
+	if (get_sys_event_filter_str(str, &sys, &event, &filter))
+		goto error;
+
+	verbose_printf("parse_eventdef: sys[%s], event[%s], filter[%s]\n",
+		       sys, event, filter);
+
+	if (!strcmp(sys, "probe"))
+		ret = parse_events_add_probe(event);
+	else if (!strcmp(sys, "sdt"))
+		ret = parse_events_add_sdt(event);
+	else
+		ret = parse_events_add_tracepoint(sys, event);
+
+	if (ret)
+		goto error;
+
+	/* don't trace ftrace:function when all tracepoints are enabled */
+	if (!strcmp(sys, "*"))
+		idmap_clear(1);
+
+
+	if (filter && *next != '\0') {
+		fprintf(stderr, "Error: an eventdef can only have one filter\n");
+		goto error;
+	}
+
+	str = next;
+	if (*next != '\0')
+		goto parse_next_eventdef;
+
+	evdef_info = malloc(sizeof(*evdef_info));
+	if (!evdef_info)
+		goto error;
+
+	evdef_info->nr = id_nr;
+	evdef_info->id_arr = get_id_array();
+	evdef_info->filter = filter;
+
+	idmap_free();
+	return evdef_info;
+ error:
+	idmap_free();
+	cleanup_event_resources();
+	return NULL;
+}
+
+void cleanup_event_resources(void)
+{
+	struct probe_list *pl;
+	const char *path;
+	char probe_event[128] = {0};
+	int fd, ret;
+
+	for (pl = probe_list_head; pl; pl = pl->next) {
+		if (pl->type == KPROBE_EVENT)
+			path = KPROBE_EVENTS_PATH;
+		else if (pl->type == UPROBE_EVENT)
+			path = UPROBE_EVENTS_PATH;
+		else {
+			fprintf(stderr, "Cannot cleanup event type %d\n",
+					pl->type);
+			continue;
+		}
+
+		snprintf(probe_event, 128, "-:%s", pl->event);
+
+		fd = open(path, O_WRONLY);
+		if (fd < 0) {
+			fprintf(stderr, "Cannot open %s\n", path);
+			continue;
+		}
+
+		ret = write(fd, probe_event, strlen(probe_event));
+		if (ret <= 0) {
+			fprintf(stderr, "Cannot write %s to %s\n", probe_event,
+					path);
+			close(fd);
+			continue;
+		}
+
+		close(fd);
+	}
+}
+
-- 
1.8.1.4
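
For readers following get_sys_event_filter_str() in the patch above: it
splits one event definition of the form "sys:event /filter/" into three
strings. The sketch below shows the same idea in a standalone form. It is
not the ktap code: split_eventdef(), copy_trimmed() and struct eventdef are
hypothetical names, the buffers are fixed-size instead of malloc'ed, the
filter is trimmed, and the uprobe form "probe:/path:symbol" is deliberately
not handled.

```c
#include <string.h>

/* Hypothetical container, not part of ktap. */
struct eventdef {
	char sys[64];
	char event[64];
	char filter[64];	/* empty string when no filter is given */
};

/* Copy [s, e) into dst without surrounding spaces; -1 if it does not fit. */
static int copy_trimmed(char *dst, size_t dstsz, const char *s, const char *e)
{
	while (s < e && *s == ' ')
		s++;
	while (e > s && e[-1] == ' ')
		e--;
	if ((size_t)(e - s) >= dstsz)
		return -1;
	memcpy(dst, s, (size_t)(e - s));
	dst[e - s] = '\0';
	return 0;
}

/* Split "sys:event /filter/"; returns 0 on success, -1 on malformed input. */
static int split_eventdef(const char *def, struct eventdef *out)
{
	const char *colon = strchr(def, ':');
	const char *end = def + strlen(def);
	const char *fstart;

	if (!colon)
		return -1;

	/* ignore trailing spaces */
	while (end > colon + 1 && end[-1] == ' ')
		end--;

	out->filter[0] = '\0';
	if (end > colon + 1 && end[-1] == '/') {
		/* a trailing '/' closes a filter; find the '/' that opens it */
		fstart = strchr(colon + 1, '/');
		if (fstart == end - 1)
			return -1;	/* only one '/': malformed */
		if (copy_trimmed(out->filter, sizeof(out->filter),
				 fstart + 1, end - 1))
			return -1;
		end = fstart;	/* the event name stops where the filter starts */
	}

	if (copy_trimmed(out->sys, sizeof(out->sys), def, colon))
		return -1;
	if (out->sys[0] == '\0')
		return -1;	/* nothing before ':' */
	return copy_trimmed(out->event, sizeof(out->event), colon + 1, end);
}
```

Calling split_eventdef("syscalls:sys_enter_open /pid == 1/", &d) would fill
d.sys, d.event and d.filter; a definition without a filter leaves d.filter
empty.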



* [PATCH v2 23/29] ktap: add ring buffer reader(tools/ktap/kp_reader.c)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (21 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 22/29] ktap: add events parse code(tools/ktap/kp_parse_events.c) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 24/29] ktap: add bytecode writer(tools/ktap/kp_bcwrite.c) Jovi Zhangwei
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

This is the ktap ring buffer consumer: a thread that reads content from the
'/sys/kernel/debug/ktap/trace_pipe_<pid>' debugfs file.
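
The reader thread in this patch copies whatever read() returns to the
output descriptor with a single write() call and ignores the result. On
pipes and terminals write() may accept fewer bytes than requested or be
interrupted by a signal, so a consumer that must not lose trace data
usually wraps it in a loop. A minimal sketch, assuming a hypothetical
helper named write_all() that is not part of this patch:

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of ktap: write all len bytes to fd,
 * retrying after short writes and EINTR. Returns 0 on success, -1 on error.
 */
static int write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;		/* real error, errno is set */
		}
		buf += n;			/* skip what was written */
		len -= (size_t)n;
	}
	return 0;
}
```

In the read loop this would be used as write_all(out_fd, buf, len), with
the loop stopping when it returns -1.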

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/kp_reader.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)
 create mode 100644 tools/ktap/kp_reader.c

diff --git a/tools/ktap/kp_reader.c b/tools/ktap/kp_reader.c
new file mode 100644
index 0000000..103940e
--- /dev/null
+++ b/tools/ktap/kp_reader.c
@@ -0,0 +1,106 @@
+/*
+ * reader.c - ring buffer reader in userspace
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/poll.h>
+#include <sys/signal.h>
+#include <fcntl.h>
+#include <pthread.h>
+
+#define MAX_BUFLEN  131072
+#define PATH_MAX 128
+
+#define handle_error(str) do { perror(str); exit(-1); } while(0)
+
+void sigfunc(int signo)
+{
+	/* should not reach here */
+}
+
+static void block_sigint()
+{
+	sigset_t mask;
+
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGINT);
+
+	pthread_sigmask(SIG_BLOCK, &mask, NULL);
+}
+
+static void *reader_thread(void *data)
+{
+	char buf[MAX_BUFLEN];
+	char filename[PATH_MAX];
+	const char *output = data;
+	int failed = 0, fd, out_fd, len;
+
+	block_sigint();
+
+	if (output) {
+		out_fd = open(output, O_CREAT | O_WRONLY | O_TRUNC,
+					S_IRUSR|S_IWUSR);
+		if (out_fd < 0) {
+			fprintf(stderr, "Cannot open output file %s\n", output);
+			return NULL;
+		}
+	} else
+		out_fd = 1;
+
+	sprintf(filename, "/sys/kernel/debug/ktap/trace_pipe_%d", getpid());
+
+ open_again:
+	fd = open(filename, O_RDONLY);
+	if (fd < 0) {
+		usleep(10000);
+
+		if (failed++ == 10) {
+			fprintf(stderr, "Cannot open file %s\n", filename);
+			return NULL;
+		}
+		goto open_again;
+	}
+
+	while ((len = read(fd, buf, sizeof(buf))) > 0)
+		write(out_fd, buf, len);
+
+	close(fd);
+	close(out_fd);
+
+	return NULL;
+}
+
+int kp_create_reader(const char *output)
+{
+	pthread_t reader;
+
+	signal(SIGINT, sigfunc);
+
+	if (pthread_create(&reader, NULL, reader_thread, (void *)output) != 0)
+		handle_error("pthread_create reader_thread failed\n");
+
+	return 0;
+}
+
-- 
1.8.1.4



* [PATCH v2 24/29] ktap: add bytecode writer(tools/ktap/kp_bcwrite.c)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (22 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 23/29] ktap: add ring buffer reader(tools/ktap/kp_reader.c) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 25/29] ktap: add userspace util(tools/ktap/kp_util.c) Jovi Zhangwei
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Bytecode writer and listing.

[root@localhost ktap]# ./ktap -b -e 'var s = {} s["key"] = 1'

-- BYTECODE -- (command line):0-1
0001    TNEW    0       0
0002    KSHORT  1       1
0003    TSETS   1       0       0         ; "key"
0004    RET0    0       1
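
On the writer side, bcwrite_proto() in this patch reserves four bytes at
the start of each prototype ("Leave room for final size") and fills them
in only after the body has been emitted, because the size is not known up
front. The sketch below isolates that pattern. write_sized_record() is a
hypothetical name used for illustration only, and the size is stored in
host byte order, as bcwrite_uint32() does.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of ktap: emit [uint32 size][payload] into
 * out, writing the size last. Returns the total number of bytes written.
 * The caller must provide at least len + 4 bytes of space.
 */
static size_t write_sized_record(char *out, const void *payload, uint32_t len)
{
	char *body = out + sizeof(uint32_t);	/* leave room for the size */
	char *p = body;
	uint32_t size;

	memcpy(p, payload, len);		/* emit the body */
	p += len;

	size = (uint32_t)(p - body);		/* size excludes the size field */
	memcpy(out, &size, sizeof(size));	/* backfill, host byte order */
	return (size_t)(p - out);
}
```

A reader can then skip a record it does not understand by reading the
four-byte size and advancing by that many bytes.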

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/kp_bcwrite.c | 375 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 375 insertions(+)
 create mode 100644 tools/ktap/kp_bcwrite.c

diff --git a/tools/ktap/kp_bcwrite.c b/tools/ktap/kp_bcwrite.c
new file mode 100644
index 0000000..ae5f948
--- /dev/null
+++ b/tools/ktap/kp_bcwrite.c
@@ -0,0 +1,375 @@
+/*
+ * Bytecode writer
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * Copyright (C) 1994-2013 Lua.org, PUC-Rio.
+ *  - Part of the code in this file was originally copied from Lua.
+ *  - lua's MIT license is compatible with GPL.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "../../include/uapi/ktap/ktap_types.h"
+#include "kp_util.h"
+
+/* Context for bytecode writer. */
+typedef struct BCWriteCtx {
+	SBuf sb;		/* Output buffer. */
+	ktap_proto_t *pt;	/* Root prototype. */
+	ktap_writer wfunc;	/* Writer callback. */
+	void *wdata;		/* Writer callback data. */
+	int strip;		/* Strip debug info. */
+	int status;		/* Status from writer callback. */
+} BCWriteCtx;
+
+
+static char *bcwrite_uint32(char *p, uint32_t v)
+{
+	memcpy(p, &v, sizeof(uint32_t));
+	p += sizeof(uint32_t);
+	return p;
+}
+
+/* -- Bytecode writer ----------------------------------------------------- */
+
+/* Write a single constant key/value of a template table. */
+static void bcwrite_ktabk(BCWriteCtx *ctx, const ktap_val_t *o, int narrow)
+{
+	char *p = kp_buf_more(&ctx->sb, 1+10);
+	if (is_string(o)) {
+		const ktap_str_t *str = rawtsvalue(o);
+		int len = str->len;
+		p = kp_buf_more(&ctx->sb, 5+len);
+		p = bcwrite_uint32(p, BCDUMP_KTAB_STR+len);
+		p = kp_buf_wmem(p, getstr(str), len);
+	} else if (is_number(o)) {
+		p = bcwrite_uint32(p, BCDUMP_KTAB_NUM);
+		p = kp_buf_wmem(p, &nvalue(o), sizeof(ktap_number));
+	} else {
+		kp_assert(tvispri(o));
+		p = bcwrite_uint32(p, BCDUMP_KTAB_NIL+~itype(o));
+	}
+	setsbufP(&ctx->sb, p);
+}
+
+/* Write a template table. */
+static void bcwrite_ktab(BCWriteCtx *ctx, char *p, const ktap_tab_t *t)
+{
+	int narray = 0, nhash = 0;
+	if (t->asize > 0) {  /* Determine max. length of array part. */
+		ptrdiff_t i;
+		ktap_val_t *array = t->array;
+		for (i = (ptrdiff_t)t->asize-1; i >= 0; i--)
+			if (!is_nil(&array[i]))
+				break;
+		narray = (int)(i+1);
+	}
+	if (t->hmask > 0) {  /* Count number of used hash slots. */
+		int i, hmask = t->hmask;
+		ktap_node_t *node = t->node;
+		for (i = 0; i <= hmask; i++)
+			nhash += !is_nil(&node[i].val);
+	}
+	/* Write number of array slots and hash slots. */
+	p = bcwrite_uint32(p, narray);
+	p = bcwrite_uint32(p, nhash);
+	setsbufP(&ctx->sb, p);
+	if (narray) {  /* Write array entries (may contain nil). */
+		int i;
+		ktap_val_t *o = t->array;
+		for (i = 0; i < narray; i++, o++)
+			bcwrite_ktabk(ctx, o, 1);
+	}
+	if (nhash) {  /* Write hash entries. */
+		int i = nhash;
+		ktap_node_t *node = t->node + t->hmask;
+		for (;; node--)
+			if (!is_nil(&node->val)) {
+				bcwrite_ktabk(ctx, &node->key, 0);
+				bcwrite_ktabk(ctx, &node->val, 1);
+				if (--i == 0)
+					break;
+			}
+	}
+}
+
+/* Write GC constants of a prototype. */
+static void bcwrite_kgc(BCWriteCtx *ctx, ktap_proto_t *pt)
+{
+	int i, sizekgc = pt->sizekgc;
+	ktap_obj_t **kr = (ktap_obj_t **)pt->k - (ptrdiff_t)sizekgc;
+
+	for (i = 0; i < sizekgc; i++, kr++) {
+		ktap_obj_t *o = *kr;
+		int tp, need = 1;
+		char *p;
+
+		/* Determine constant type and needed size. */
+		if (o->gch.gct == ~KTAP_TSTR) {
+			tp = BCDUMP_KGC_STR + ((ktap_str_t *)o)->len;
+			need = 5 + ((ktap_str_t *)o)->len;
+		} else if (o->gch.gct == ~KTAP_TPROTO) {
+			kp_assert((pt->flags & PROTO_CHILD));
+			tp = BCDUMP_KGC_CHILD;
+		} else {
+			kp_assert(o->gch.gct == ~KTAP_TTAB);
+			tp = BCDUMP_KGC_TAB;
+			need = 1+2*5;
+		}
+
+		/* Write constant type. */
+		p = kp_buf_more(&ctx->sb, need);
+		p = bcwrite_uint32(p, tp);
+		/* Write constant data (if any). */
+		if (tp >= BCDUMP_KGC_STR) {
+			p = kp_buf_wmem(p, getstr((ktap_str_t *)o),
+					((ktap_str_t *)o)->len);
+		} else if (tp == BCDUMP_KGC_TAB) {
+			bcwrite_ktab(ctx, p, (ktap_tab_t *)o);
+			continue;
+		}
+		setsbufP(&ctx->sb, p);
+	}
+}
+
+/* Write number constants of a prototype. */
+static void bcwrite_knum(BCWriteCtx *ctx, ktap_proto_t *pt)
+{
+	int i, sizekn = pt->sizekn;
+	const ktap_val_t *o = (ktap_val_t *)pt->k;
+	char *p = kp_buf_more(&ctx->sb, 10*sizekn);
+
+	for (i = 0; i < sizekn; i++, o++) {
+		if (is_number(o))
+			p = kp_buf_wmem(p, &nvalue(o), sizeof(ktap_number));
+	}
+	setsbufP(&ctx->sb, p);
+}
+
+/* Write bytecode instructions. */
+static char *bcwrite_bytecode(BCWriteCtx *ctx, char *p, ktap_proto_t *pt)
+{
+	int nbc = pt->sizebc-1;  /* Omit the [JI]FUNC* header. */
+
+	p = kp_buf_wmem(p, proto_bc(pt)+1, nbc*(int)sizeof(BCIns));
+	return p;
+}
+
+/* Write prototype. */
+static void bcwrite_proto(BCWriteCtx *ctx, ktap_proto_t *pt)
+{
+	int sizedbg = 0;
+	char *p;
+
+	/* Recursively write children of prototype. */
+	if (pt->flags & PROTO_CHILD) {
+		ptrdiff_t i, n = pt->sizekgc;
+		ktap_obj_t **kr = (ktap_obj_t **)pt->k - 1;
+		for (i = 0; i < n; i++, kr--) {
+			ktap_obj_t *o = *kr;
+			if (o->gch.gct == ~KTAP_TPROTO)
+				bcwrite_proto(ctx, (ktap_proto_t *)o);
+		}
+	}
+
+	/* Start writing the prototype info to a buffer. */
+	p = kp_buf_need(&ctx->sb,
+		5+4+6*5+(pt->sizebc-1)*(int)sizeof(BCIns)+pt->sizeuv*2);
+	p += 4;  /* Leave room for final size. */
+
+	/* Write prototype header. */
+	*p++ = (pt->flags & (PROTO_CHILD|PROTO_VARARG|PROTO_FFI));
+	*p++ = pt->numparams;
+	*p++ = pt->framesize;
+	*p++ = pt->sizeuv;
+	p = bcwrite_uint32(p, pt->sizekgc);
+	p = bcwrite_uint32(p, pt->sizekn);
+	p = bcwrite_uint32(p, pt->sizebc-1);
+	if (!ctx->strip) {
+		if (proto_lineinfo(pt))
+			sizedbg = pt->sizept -
+				(int)((char *)proto_lineinfo(pt) - (char *)pt);
+		p = bcwrite_uint32(p, sizedbg);
+		if (sizedbg) {
+			p = bcwrite_uint32(p, pt->firstline);
+			p = bcwrite_uint32(p, pt->numline);
+		}
+	}
+
+	/* Write bytecode instructions and upvalue refs. */
+	p = bcwrite_bytecode(ctx, p, pt);
+	p = kp_buf_wmem(p, proto_uv(pt), pt->sizeuv*2);
+	setsbufP(&ctx->sb, p);
+
+	/* Write constants. */
+	bcwrite_kgc(ctx, pt);
+	bcwrite_knum(ctx, pt);
+
+	/* Write debug info, if not stripped. */
+	if (sizedbg) {
+		p = kp_buf_more(&ctx->sb, sizedbg);
+		p = kp_buf_wmem(p, proto_lineinfo(pt), sizedbg);
+		setsbufP(&ctx->sb, p);
+	}
+
+	/* Pass buffer to writer function. */
+	if (ctx->status == 0) {
+		int n = sbuflen(&ctx->sb) - 4;
+		char *q = sbufB(&ctx->sb);
+		p = bcwrite_uint32(q, n);  /* Fill in final size. */
+		kp_assert(p == sbufB(&ctx->sb) + 4);
+		ctx->status = ctx->wfunc(q, n + 4, ctx->wdata);
+	}
+}
+
+/* Write header of bytecode dump. */
+static void bcwrite_header(BCWriteCtx *ctx)
+{
+	ktap_str_t *chunkname = proto_chunkname(ctx->pt);
+	const char *name = getstr(chunkname);
+	int len = chunkname->len;
+	char *p = kp_buf_need(&ctx->sb, 5+5+len);
+	*p++ = BCDUMP_HEAD1;
+	*p++ = BCDUMP_HEAD2;
+	*p++ = BCDUMP_HEAD3;
+	*p++ = BCDUMP_VERSION;
+	*p++ = (ctx->strip ? BCDUMP_F_STRIP : 0) + (KP_BE ? BCDUMP_F_BE : 0);
+
+	if (!ctx->strip) {
+		p = bcwrite_uint32(p, len);
+		p = kp_buf_wmem(p, name, len);
+	}
+	ctx->status = ctx->wfunc(sbufB(&ctx->sb),
+		(int)(p - sbufB(&ctx->sb)), ctx->wdata);
+}
+
+/* Write footer of bytecode dump. */
+static void bcwrite_footer(BCWriteCtx *ctx)
+{
+	if (ctx->status == 0) {
+		uint8_t zero = 0;
+		ctx->status = ctx->wfunc(&zero, 1, ctx->wdata);
+	}
+}
+
+/* Write bytecode for a prototype. */
+int kp_bcwrite(ktap_proto_t *pt, ktap_writer writer, void *data, int strip)
+{
+	BCWriteCtx ctx;
+
+	ctx.pt = pt;
+	ctx.wfunc = writer;
+	ctx.wdata = data;
+	ctx.strip = strip;
+	ctx.status = 0;
+
+	kp_buf_init(&ctx.sb);
+	kp_buf_need(&ctx.sb, 1024);  /* Avoids resize for most prototypes. */
+	bcwrite_header(&ctx);
+	bcwrite_proto(&ctx, ctx.pt);
+	bcwrite_footer(&ctx);
+
+	kp_buf_free(&ctx.sb);
+	return ctx.status;
+}
+
+/* -- Bytecode dump ----------------------------------------------------- */
+
+static const char * const bc_names[] = {
+#define BCNAME(name, ma, mb, mc, mt)       #name,
+	BCDEF(BCNAME)
+#undef BCNAME
+  NULL
+};
+
+static const uint16_t bc_mode[] = {
+	BCDEF(BCMODE)
+};
+
+static void dump_bytecode(ktap_proto_t *pt)
+{
+	int nbc = pt->sizebc - 1; /* Omit the FUNC* header. */
+	BCIns *ins = proto_bc(pt) + 1;
+	ktap_obj_t **kbase = pt->k;
+	int i;
+
+	printf("-- BYTECODE -- %s:%d-%d\n", getstr(pt->chunkname),
+		pt->firstline, pt->firstline + pt->numline);
+
+	for (i = 0; i < nbc; i++, ins++) {
+		int op = bc_op(*ins);
+
+		printf("%04d\t%s", i + 1, bc_names[op]);
+
+		printf("\t%d", bc_a(*ins));
+		if (bcmode_b(op) != BCMnone)
+			printf("\t%d", bc_b(*ins));
+
+		if (bcmode_hasd(op))
+			printf("\t%d", bc_d(*ins));
+		else
+			printf("\t%d", bc_c(*ins));
+
+		if (bcmode_b(op) == BCMstr || bcmode_c(op) == BCMstr) {
+			printf("\t  ; ");
+			if (bcmode_d(op) == BCMstr) {
+				int idx = ~bc_d(*ins);
+				printf("\"%s\"", getstr((ktap_str_t *)kbase[idx]));
+			}
+		}
+		printf("\n");
+	}
+}
+
+static int function_nr = 0;
+
+void kp_dump_proto(ktap_proto_t *pt)
+{
+	printf("\n----------------------------------------------------\n");
+	printf("function proto %d:\n", function_nr++);
+	printf("numparams: %d\n", pt->numparams);
+	printf("framesize: %d\n", pt->framesize);
+	printf("sizebc: %d\n", pt->sizebc);
+	printf("sizekgc: %d\n", pt->sizekgc);
+	printf("sizekn: %d\n", pt->sizekn);
+	printf("sizept: %d\n", pt->sizept);
+	printf("sizeuv: %d\n", pt->sizeuv);
+	printf("firstline: %d\n", pt->firstline);
+	printf("numline: %d\n", pt->numline);
+
+	printf("has child proto: %d\n", pt->flags & PROTO_CHILD);
+	printf("has vararg: %d\n", pt->flags & PROTO_VARARG);
+	printf("has ILOOP: %d\n", pt->flags & PROTO_ILOOP);
+
+	dump_bytecode(pt);
+
+	/* Recursively dump children of prototype. */
+	if (pt->flags & PROTO_CHILD) {
+		ptrdiff_t i, n = pt->sizekgc;
+		ktap_obj_t **kr = (ktap_obj_t **)pt->k - 1;
+		for (i = 0; i < n; i++, kr--) {
+			ktap_obj_t *o = *kr;
+			if (o->gch.gct == ~KTAP_TPROTO)
+				kp_dump_proto((ktap_proto_t *)o);
+		}
+	}
+}
+
-- 
1.8.1.4



* [PATCH v2 25/29] ktap: add userspace util(tools/ktap/kp_util.c)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (23 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 24/29] ktap: add bytecode writer(tools/ktap/kp_bcwrite.c) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 26/29] ktap: add userspace binary Makefile(tools/ktap/Makefile) Jovi Zhangwei
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
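
Note on strfmt_wuleb128() in kp_util.c: ULEB128 stores an unsigned integer
seven bits at a time, least significant group first, and sets the high bit
of every byte except the last. The sketch below repeats the encoder from
this patch (using unsigned char) and adds a decoder for a round trip. The
decoder, uleb128_decode(), is a hypothetical counterpart that is not in
this patch, and it assumes well-formed input of at most five bytes.

```c
#include <stdint.h>

/* Encode v as ULEB128 at p; returns the position after the last byte. */
static unsigned char *uleb128_encode(unsigned char *p, uint32_t v)
{
	for (; v >= 0x80; v >>= 7)
		*p++ = (unsigned char)((v & 0x7f) | 0x80); /* more bytes follow */
	*p++ = (unsigned char)v;		/* final byte, high bit clear */
	return p;
}

/*
 * Hypothetical inverse, not part of ktap: decode one ULEB128 value from p
 * into *out; returns the position after the last byte consumed.
 */
static const unsigned char *uleb128_decode(const unsigned char *p,
					   uint32_t *out)
{
	uint32_t v = 0;
	int shift = 0;

	while (*p & 0x80) {			/* continuation bit set */
		v |= (uint32_t)(*p++ & 0x7f) << shift;
		shift += 7;
	}
	v |= (uint32_t)*p++ << shift;		/* last group */
	*out = v;
	return p;
}
```

For example, 300 is encoded as the two bytes 0xAC 0x02, and any value below
128 takes a single byte.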
 tools/ktap/kp_util.c | 646 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/ktap/kp_util.h | 120 ++++++++++
 2 files changed, 766 insertions(+)
 create mode 100644 tools/ktap/kp_util.c
 create mode 100644 tools/ktap/kp_util.h

diff --git a/tools/ktap/kp_util.c b/tools/ktap/kp_util.c
new file mode 100644
index 0000000..2794a58
--- /dev/null
+++ b/tools/ktap/kp_util.c
@@ -0,0 +1,646 @@
+/*
+ * util.c
+ *
+ * This file is part of ktap by Jovi Zhangwei.
+ *
+ * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+ *
+ * Adapted from luajit and lua interpreter.
+ * Copyright (C) 2005-2014 Mike Pall.
+ * Copyright (C) 1994-2008 Lua.org, PUC-Rio.
+ *
+ * ktap is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * ktap is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <math.h>
+#include <ctype.h>
+#include "../../include/uapi/ktap/ktap_types.h"
+#include "../../include/uapi/ktap/ktap_bc.h"
+#include "kp_util.h"
+
+/* Error message strings. */
+const char *kp_err_allmsg =
+#define ERRDEF(name, msg)       msg "\0"
+#include "../../include/uapi/ktap/ktap_errmsg.h"
+;
+
+const uint8_t kp_char_bits[257] = {
+    0,
+    1,  1,  1,  1,  1,  1,  1,  1,  1,  3,  3,  3,  3,  3,  1,  1,
+    1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
+    2,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,
+  152,152,152,152,152,152,152,152,152,152,  4,  4,  4,  4,  4,  4,
+    4,176,176,176,176,176,176,160,160,160,160,160,160,160,160,160,
+  160,160,160,160,160,160,160,160,160,160,160,  4,  4,  4,  4,132,
+    4,208,208,208,208,208,208,192,192,192,192,192,192,192,192,192,
+  192,192,192,192,192,192,192,192,192,192,192,  4,  4,  4,  4,  1,
+  128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,
+  128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,
+  128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,
+  128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,
+  128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,
+  128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,
+  128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,
+  128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
+};
+
+void kp_buf_init(SBuf *sb)
+{
+	sb->b = (char *)malloc(200);
+	sb->p = sb->b;
+	sb->e = sb->b + 200;
+}
+
+void kp_buf_reset(SBuf *sb)
+{
+	sb->p = sb->b;
+}
+
+void kp_buf_free(SBuf *sb)
+{
+	free(sbufB(sb));
+}
+
+char *kp_buf_more(SBuf *sb, int sz)
+{
+	char *b;
+	int old_len = sbuflen(sb);
+
+	if (sz > sbufleft(sb)) {
+		b = realloc(sbufB(sb), sbuflen(sb) * 2);
+		sb->b = b;
+		sb->p = b + old_len;
+		sb->e = b + old_len * 2;
+	}
+
+	return sbufP(sb);
+}
+
+char *kp_buf_need(SBuf *sb, int sz)
+{
+	char *b;
+	int old_len = sbuflen(sb);
+
+	if (sz > sbufsz(sb)) {
+		b = realloc(sbufB(sb), sz);
+		sb->b = b;
+		sb->p = b + old_len;
+		sb->e = b + sz;
+	}
+
+	return sbufB(sb);
+}
+
+char *kp_buf_wmem(char *p, const void *q, int len)
+{
+	return (char *)memcpy(p, q, len) + len;
+}
+
+void kp_buf_putb(SBuf *sb, int c)
+{
+	char *p = kp_buf_more(sb, 1);
+	*p++ = (char)c;
+	setsbufP(sb, p);
+}
+
+ktap_str_t *kp_buf_str(SBuf *sb)
+{
+	return kp_str_new(sbufB(sb), sbuflen(sb));
+}
+
+/* Write ULEB128 to buffer. */
+char *strfmt_wuleb128(char *p, uint32_t v)
+{
+	for (; v >= 0x80; v >>= 7)
+		*p++ = (char)((v & 0x7f) | 0x80);
+	*p++ = (char)v;
+	return p;
+}
+
+void kp_err_lex(ktap_str_t *src, const char *tok, BCLine line,
+		ErrMsg em, va_list argp)
+{
+	const char *msg;
+
+	msg = kp_sprintfv(err2msg(em), argp);
+	msg = kp_sprintf("%s:%d: %s", getstr(src), line, msg);
+	if (tok)
+		msg = kp_sprintf(err2msg(KP_ERR_XNEAR), msg, tok);
+	fprintf(stderr, "%s: %s\n", err2msg(KP_ERR_XSYNTAX), msg);
+	exit(-1);
+}
+
+void *kp_reallocv(void *block, size_t osize, size_t nsize)
+{
+	return realloc(block, nsize);
+}
+
+static const ktap_val_t kp_niltv = { {NULL}, {KTAP_TNIL} } ;
+#define niltv  (&kp_niltv)
+
+#define gnode(t,i)	(&(t)->node[i])
+#define gkey(n)		(&(n)->key)
+#define gval(n)		(&(n)->val)
+
+const ktap_val_t *kp_tab_get(ktap_tab_t *t, const ktap_val_t *key)
+{
+	int i;
+
+	switch (itype(key)) {
+	case KTAP_TNIL:
+		return niltv;
+	case KTAP_TNUM:
+		for (i = 0; i <= t->hmask; i++) {
+			ktap_val_t *v = gkey(gnode(t, i));
+			if (is_number(v) && nvalue(key) == nvalue(v))
+				return gval(gnode(t, i));
+		}
+		break;
+	case KTAP_TSTR:
+		for (i = 0; i <= t->hmask; i++) {
+			ktap_val_t *v = gkey(gnode(t, i));
+			if (is_string(v) && (rawtsvalue(key) == rawtsvalue(v)))
+				return gval(gnode(t, i));
+		}
+		break;
+	default:
+		for (i = 0; i <= t->hmask; i++) {
+			if (kp_obj_equal(key, gkey(gnode(t, i))))
+				return gval(gnode(t, i));
+		}
+		break;
+	}
+
+	return niltv;
+}
+
+const ktap_val_t *kp_tab_getstr(ktap_tab_t *t, const ktap_str_t *ts)
+{
+	int i;
+
+	for (i = 0; i <= t->hmask; i++) {
+		ktap_val_t *v = gkey(gnode(t, i));
+		if (is_string(v) && (ts == rawtsvalue(v)))
+			return gval(gnode(t, i));
+	}
+
+	return niltv;
+}
+
+void kp_tab_setvalue(ktap_tab_t *t, const ktap_val_t *key, ktap_val_t *val)
+{
+	const ktap_val_t *v = kp_tab_get(t, key);
+
+	if (v != niltv) {
+		set_obj((ktap_val_t *)v, val);
+	} else {
+		if (t->freetop == t->node) {
+			int size = (t->hmask + 1) * sizeof(ktap_node_t);
+			t->node = realloc(t->node, size * 2);
+			memset(t->node + t->hmask + 1, 0, size);
+			t->freetop = t->node + (t->hmask + 1) * 2;
+			t->hmask = (t->hmask + 1) * 2 - 1;
+		}
+
+		ktap_node_t *n = --t->freetop;
+		set_obj(gkey(n), key);
+		set_obj(gval(n), val);
+	}
+}
+
+ktap_val_t *kp_tab_set(ktap_tab_t *t, const ktap_val_t *key)
+{
+	const ktap_val_t *v = kp_tab_get(t, key);
+
+	if (v != niltv) {
+		return (ktap_val_t *)v;
+	} else {
+		if (t->freetop == t->node) {
+			int size = (t->hmask + 1) * sizeof(ktap_node_t);
+			t->node = realloc(t->node, size * 2);
+			memset(t->node + t->hmask + 1, 0, size);
+			t->freetop = t->node + (t->hmask + 1) * 2;
+			t->hmask = (t->hmask + 1) * 2 - 1;
+		}
+
+		ktap_node_t *n = --t->freetop;
+		set_obj(gkey(n), key);
+		set_nil(gval(n));
+		return gval(n);
+	}
+}
+
+
+ktap_tab_t *kp_tab_new(void)
+{
+	int hsize, i;
+
+	ktap_tab_t *t = malloc(sizeof(ktap_tab_t));
+	t->gct = ~KTAP_TTAB;
+	hsize = 1024;
+	t->hmask = hsize - 1;
+	t->node = (ktap_node_t *)malloc(hsize * sizeof(ktap_node_t));
+	t->freetop = &t->node[hsize];
+	t->asize = 0;
+
+	for (i = 0; i <= t->hmask; i++) {
+		set_nil(&t->node[i].val);
+		set_nil(&t->node[i].key);
+	}
+	return t;
+}
+
+/* simple interned string array; use a hash table in the future */
+static ktap_str_t **strtab;
+static int strtab_size = 1000; /* initial size */
+static int strtab_nr;
+
+void kp_str_resize(void)
+{
+	int size = strtab_size * sizeof(ktap_str_t *);
+
+	strtab = malloc(size);
+	if (!strtab) {
+		fprintf(stderr, "cannot allocate stringtable\n");
+		exit(-1);
+	}
+
+	memset(strtab, 0, size);
+	strtab_nr = 0;
+}
+
+static ktap_str_t *stringtable_search(const char *str, int len)
+{
+	int i;
+
+	for (i = 0; i < strtab_nr; i++) {
+		ktap_str_t *s = strtab[i];
+		if ((len == s->len) && !memcmp(str, getstr(s), len))
+			return s;
+	}
+
+	return NULL;
+}
+
+static void stringtable_insert(ktap_str_t *ts)
+{
+	strtab[strtab_nr++] = ts;
+
+	if (strtab_nr == strtab_size) {
+		int size = strtab_size * sizeof(ktap_str_t *);
+		strtab = realloc(strtab, size * 2);
+		memset(strtab + strtab_size, 0, size);
+		strtab_size *= 2;
+	}
+}
+
+static ktap_str_t *createstrobj(const char *str, size_t l)
+{
+	ktap_str_t *ts;
+	size_t totalsize;  /* total size of TString object */
+
+	totalsize = sizeof(ktap_str_t) + ((l + 1) * sizeof(char));
+	ts = (ktap_str_t *)malloc(totalsize);
+	ts->gct = ~KTAP_TSTR;
+	ts->len = l;
+	ts->reserved = 0;
+	ts->extra = 0;
+	memcpy(ts + 1, str, l * sizeof(char));
+	((char *)(ts + 1))[l] = '\0';  /* ending 0 */
+	return ts;
+}
+
+ktap_str_t *kp_str_new(const char *str, size_t l)
+{
+	ktap_str_t *ts = stringtable_search(str, l);
+
+	if (ts)
+		return ts;
+
+	ts = createstrobj(str, l);
+	stringtable_insert(ts);
+	return ts;
+}
+
+ktap_str_t *kp_str_newz(const char *str)
+{
+	return kp_str_new(str, strlen(str));
+}
+
+/*
+ * todo: memory leak here
+ */
+char *kp_sprintf(const char *fmt, ...)
+{
+	char *msg = malloc(128);
+
+	va_list argp;
+	va_start(argp, fmt);
+	vsnprintf(msg, 128, fmt, argp);	/* bounded by the 128-byte buffer */
+	va_end(argp);
+	return msg;
+}
+
+const char *kp_sprintfv(const char *fmt, va_list argp)
+{
+	char *msg = malloc(128);
+
+	vsnprintf(msg, 128, fmt, argp);	/* bounded by the 128-byte buffer */
+	return msg;
+}
+
+int kp_obj_equal(const ktap_val_t *t1, const ktap_val_t *t2)
+{
+	switch (itype(t1)) {
+	case KTAP_TNIL:
+		return 1;
+	case KTAP_TNUM:
+		return nvalue(t1) == nvalue(t2);
+	case KTAP_TTRUE:
+	case KTAP_TFALSE:
+		return itype(t1) == itype(t2);
+	case KTAP_TLIGHTUD:
+		return pvalue(t1) == pvalue(t2);
+	case KTAP_TFUNC:
+		return fvalue(t1) == fvalue(t2);
+	case KTAP_TSTR:
+		return rawtsvalue(t1) == rawtsvalue(t2);
+	default:
+		return gcvalue(t1) == gcvalue(t2);
+	}
+
+	return 0;
+}
+
+/*
+ * strglobmatch is copied from perf(linux/tools/perf/util/string.c)
+ */
+
+/* Character class matching */
+static bool __match_charclass(const char *pat, char c, const char **npat)
+{
+	bool complement = false, ret = true;
+
+	if (*pat == '!') {
+		complement = true;
+		pat++;
+	}
+	if (*pat++ == c)	/* First character is special */
+		goto end;
+
+	while (*pat && *pat != ']') {	/* Matching */
+		if (*pat == '-' && *(pat + 1) != ']') {	/* Range */
+			if (*(pat - 1) <= c && c <= *(pat + 1))
+				goto end;
+			if (*(pat - 1) > *(pat + 1))
+				goto error;
+			pat += 2;
+		} else if (*pat++ == c)
+			goto end;
+	}
+	if (!*pat)
+		goto error;
+	ret = false;
+
+end:
+	while (*pat && *pat != ']')	/* Searching closing */
+		pat++;
+	if (!*pat)
+		goto error;
+	*npat = pat + 1;
+	return complement ? !ret : ret;
+
+error:
+	return false;
+}
+
+/* Glob/lazy pattern matching */
+static bool __match_glob(const char *str, const char *pat, bool ignore_space)
+{
+	while (*str && *pat && *pat != '*') {
+		if (ignore_space) {
+			/* Ignore spaces for lazy matching */
+			if (isspace(*str)) {
+				str++;
+				continue;
+			}
+			if (isspace(*pat)) {
+				pat++;
+				continue;
+			}
+		}
+		if (*pat == '?') {	/* Matches any single character */
+			str++;
+			pat++;
+			continue;
+		} else if (*pat == '[')	/* Character classes/Ranges */
+			if (__match_charclass(pat + 1, *str, &pat)) {
+				str++;
+				continue;
+			} else
+				return false;
+		else if (*pat == '\\') /* Escaped char match as normal char */
+			pat++;
+		if (*str++ != *pat++)
+			return false;
+	}
+	/* Check wild card */
+	if (*pat == '*') {
+		while (*pat == '*')
+			pat++;
+		if (!*pat)	/* Tail wild card matches all */
+			return true;
+		while (*str)
+			if (__match_glob(str++, pat, ignore_space))
+				return true;
+	}
+	return !*str && !*pat;
+}
+
+/**
+ * strglobmatch - glob expression pattern matching
+ * @str: the target string to match
+ * @pat: the pattern string to match
+ *
+ * This returns true if @str matches @pat. @pat can include wildcards
+ * ('*','?') and character classes ([CHARS]; complementation and ranges are
+ * also supported). It also supports the escape character ('\') so that
+ * special characters can be matched as normal characters.
+ *
+ * Note: if @pat syntax is broken, this always returns false.
+ */
+bool strglobmatch(const char *str, const char *pat)
+{
+	return __match_glob(str, pat, false);
+}
+
+#define handle_error(str) do { perror(str); exit(-1); } while(0)
+
+#define KALLSYMS_PATH "/proc/kallsyms"
+/*
+ * read kernel symbol from /proc/kallsyms
+ */
+int kallsyms_parse(void *arg,
+		   int(*process_symbol)(void *arg, const char *name,
+		   char type, unsigned long start))
+{
+	FILE *file;
+	char *line = NULL;
+	int ret = 0;
+	int found = 0;
+
+	file = fopen(KALLSYMS_PATH, "r");
+	if (file == NULL)
+		handle_error("open " KALLSYMS_PATH " failed");
+
+	while (!feof(file)) {
+		char *symbol_addr, *symbol_name;
+		char symbol_type;
+		unsigned long start;
+		int line_len;
+		size_t n;
+
+		line_len = getline(&line, &n, file);
+		if (line_len < 0 || !line)
+			break;
+
+		line[--line_len] = '\0'; /* \n */
+
+		symbol_addr = strtok(line, " \t");
+		start = strtoul(symbol_addr, NULL, 16);
+
+		symbol_type = *strtok(NULL, " \t");
+		symbol_name = strtok(NULL, " \t");
+
+		ret = process_symbol(arg, symbol_name, symbol_type, start);
+		if (!ret)
+			found = 1;
+	}
+
+	free(line);
+	fclose(file);
+
+	return found;
+}
+
+struct ksym_addr_t {
+	const char *name;
+	unsigned long addr;
+};
+
+static int symbol_cmp(void *arg, const char *name, char type,
+		      unsigned long start)
+{
+	struct ksym_addr_t *base = arg;
+
+	if (strcmp(base->name, name) == 0) {
+		base->addr = start;
+		return 1;
+	}
+
+	return 0;
+}
+
+unsigned long find_kernel_symbol(const char *symbol)
+{
+	int ret;
+	struct ksym_addr_t arg = {
+		.name = symbol,
+		.addr = 0
+	};
+
+	ret = kallsyms_parse(&arg, symbol_cmp);
+	if (ret < 0 || arg.addr == 0) {
+		fprintf(stderr, "cannot read kernel symbol \"%s\" in %s\n",
+			symbol, KALLSYMS_PATH);
+		exit(EXIT_FAILURE);
+	}
+
+	return arg.addr;
+}
+
+
+#define AVAILABLE_EVENTS_PATH "/sys/kernel/debug/tracing/available_events"
+
+void list_available_events(const char *match)
+{
+	FILE *file;
+	char *line = NULL;
+
+	file = fopen(AVAILABLE_EVENTS_PATH, "r");
+	if (file == NULL)
+		handle_error("open " AVAILABLE_EVENTS_PATH " failed");
+
+	while (!feof(file)) {
+		int line_len;
+		size_t n;
+
+		line_len = getline(&line, &n, file);
+		if (line_len < 0 || !line)
+			break;
+
+		if (!match || strglobmatch(line, match))
+			printf("%s", line);
+	}
+
+	free(line);
+	fclose(file);
+}
+
+void process_available_tracepoints(const char *sys, const char *event,
+				   int (*process)(const char *sys,
+						  const char *event))
+{
+	char *line = NULL;
+	FILE *file;
+	char str[128] = {0};
+
+	/* append '\n': lines returned by getline() keep their trailing newline */
+	snprintf(str, sizeof(str), "%s:%s\n", sys, event);
+
+	file = fopen(AVAILABLE_EVENTS_PATH, "r");
+	if (file == NULL)
+		handle_error("open " AVAILABLE_EVENTS_PATH " failed");
+
+	while (!feof(file)) {
+		int line_len;
+		size_t n;
+
+		line_len = getline(&line, &n, file);
+		if (line_len < 0 || !line)
+			break;
+
+		if (strglobmatch(line, str)) {
+			char match_sys[64] = {0};
+			char match_event[64] = {0};
+			char *sep;
+
+			sep = strchr(line, ':');
+			memcpy(match_sys, line, sep - line);
+			memcpy(match_event, sep + 1,
+					    line_len - (sep - line) - 2);
+
+			if (process(match_sys, match_event))
+				break;
+		}
+	}
+
+	free(line);
+	fclose(file);
+}
+
diff --git a/tools/ktap/kp_util.h b/tools/ktap/kp_util.h
new file mode 100644
index 0000000..b0a5935
--- /dev/null
+++ b/tools/ktap/kp_util.h
@@ -0,0 +1,120 @@
+#ifndef __KTAP_UTIL_H__
+#define __KTAP_UTIL_H__
+
+#include "../../include/uapi/ktap/ktap_bc.h"
+#include "../../include/uapi/ktap/ktap_err.h"
+
+typedef int bool;
+#define false 0
+#define true 1
+
+/* Resizable string buffer. */
+typedef struct SBuf {
+	char *p; /* String buffer pointer. */
+	char *e; /* String buffer end pointer. */
+	char *b; /* String buffer base. */
+} SBuf;
+
+/* Accessors for the resizable string buffer (struct SBuf above). */
+#define sbufB(sb)	((char *)(sb)->b)
+#define sbufP(sb)	((char *)(sb)->p)
+#define sbufE(sb)	((char *)(sb)->e)
+#define sbufsz(sb)	((int)(sbufE((sb)) - sbufB((sb))))
+#define sbuflen(sb)	((int)(sbufP((sb)) - sbufB((sb))))
+#define sbufleft(sb)	((int)(sbufE((sb)) - sbufP((sb))))
+#define setsbufP(sb, q) ((sb)->p = (q))
+
+void kp_buf_init(SBuf *sb);
+void kp_buf_reset(SBuf *sb);
+void kp_buf_free(SBuf *sb);
+char *kp_buf_more(SBuf *sb, int sz);
+char *kp_buf_need(SBuf *sb, int sz);
+char *kp_buf_wmem(char *p, const void *q, int len);
+void kp_buf_putb(SBuf *sb, int c);
+ktap_str_t *kp_buf_str(SBuf *sb);
+
+
+#define KP_CHAR_CNTRL	0x01
+#define KP_CHAR_SPACE	0x02
+#define KP_CHAR_PUNCT	0x04
+#define KP_CHAR_DIGIT	0x08
+#define KP_CHAR_XDIGIT	0x10
+#define KP_CHAR_UPPER	0x20
+#define KP_CHAR_LOWER	0x40
+#define KP_CHAR_IDENT	0x80
+#define KP_CHAR_ALPHA	(KP_CHAR_LOWER|KP_CHAR_UPPER)
+#define KP_CHAR_ALNUM	(KP_CHAR_ALPHA|KP_CHAR_DIGIT)
+#define KP_CHAR_GRAPH	(KP_CHAR_ALNUM|KP_CHAR_PUNCT)
+
+/* Only pass -1 or 0..255 to these macros. Never pass a signed char! */
+#define kp_char_isa(c, t)	((kp_char_bits+1)[(c)] & t)
+#define kp_char_iscntrl(c)	kp_char_isa((c), KP_CHAR_CNTRL)
+#define kp_char_isspace(c)	kp_char_isa((c), KP_CHAR_SPACE)
+#define kp_char_ispunct(c)	kp_char_isa((c), KP_CHAR_PUNCT)
+#define kp_char_isdigit(c)	kp_char_isa((c), KP_CHAR_DIGIT)
+#define kp_char_isxdigit(c)	kp_char_isa((c), KP_CHAR_XDIGIT)
+#define kp_char_isupper(c)	kp_char_isa((c), KP_CHAR_UPPER)
+#define kp_char_islower(c)	kp_char_isa((c), KP_CHAR_LOWER)
+#define kp_char_isident(c)	kp_char_isa((c), KP_CHAR_IDENT)
+#define kp_char_isalpha(c)	kp_char_isa((c), KP_CHAR_ALPHA)
+#define kp_char_isalnum(c)	kp_char_isa((c), KP_CHAR_ALNUM)
+#define kp_char_isgraph(c)	kp_char_isa((c), KP_CHAR_GRAPH)
+
+#define kp_char_toupper(c)	((c) - (kp_char_islower(c) >> 1))
+#define kp_char_tolower(c)	((c) + kp_char_isupper(c))
+
+extern const char *kp_err_allmsg;
+#define err2msg(em)     (kp_err_allmsg+(int)(em))
+
+extern const uint8_t kp_char_bits[257];
+
+
+char *strfmt_wuleb128(char *p, uint32_t v);
+void kp_err_lex(ktap_str_t *src, const char *tok, BCLine line,
+		ErrMsg em, va_list argp);
+char *kp_sprintf(const char *fmt, ...);
+const char *kp_sprintfv(const char *fmt, va_list argp);
+
+void *kp_reallocv(void *block, size_t osize, size_t nsize);
+
+void kp_str_resize(void);
+ktap_str_t *kp_str_newz(const char *str);
+ktap_str_t *kp_str_new(const char *str, size_t l);
+
+ktap_tab_t *kp_tab_new();
+const ktap_val_t *kp_tab_get(ktap_tab_t *t, const ktap_val_t *key);
+const ktap_val_t *kp_tab_getstr(ktap_tab_t *t, const ktap_str_t *ts);
+void kp_tab_setvalue(ktap_tab_t *t, const ktap_val_t *key, ktap_val_t *val);
+ktap_val_t *kp_tab_set(ktap_tab_t *t, const ktap_val_t *key);
+
+int kp_obj_equal(const ktap_val_t *t1, const ktap_val_t *t2);
+
+bool strglobmatch(const char *str, const char *pat);
+int kallsyms_parse(void *arg,
+		   int(*process_symbol)(void *arg, const char *name,
+		   char type, unsigned long start));
+
+unsigned long find_kernel_symbol(const char *symbol);
+void list_available_events(const char *match);
+void process_available_tracepoints(const char *sys, const char *event,
+				   int (*process)(const char *sys,
+						  const char *event));
+int kallsyms_parse(void *arg,
+                   int(*process_symbol)(void *arg, const char *name,
+                   char type, unsigned long start));
+
+ktap_eventdesc_t *kp_parse_events(const char *eventdef);
+void cleanup_event_resources(void);
+
+extern int verbose;
+#define verbose_printf(...) \
+	do { if (verbose)	\
+		printf("[verbose] " __VA_ARGS__); } while (0)
+
+
+void kp_dump_proto(ktap_proto_t *pt);
+typedef int (*ktap_writer)(const void* p, size_t sz, void* ud);
+int kp_bcwrite(ktap_proto_t *pt, ktap_writer writer, void *data, int strip);
+
+int kp_create_reader(const char *output);
+#endif
-- 
1.8.1.4



* [PATCH v2 26/29] ktap: add userspace binary Makefile(tools/ktap/Makefile)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (24 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 25/29] ktap: add userspace util(tools/ktap/kp_util.c) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 27/29] ktap: add testsuite and benchmark(tools/ktap/test/*) Jovi Zhangwei
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Makefile for userspace binary.
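
The KTAP-CFLAGS target is forced on every run but rewrites its stamp file
only when the tracked flags change, so objects depending on it are rebuilt
when NO_LIBELF is toggled and left alone otherwise. A minimal shell sketch of
that idea, outside of make; the helper name update_stamp and the temporary
directory are illustrative assumptions, not part of this patch:

```shell
#!/bin/sh
# Illustrative sketch: rewrite a stamp file only when its content would change.
# update_stamp is a hypothetical helper, not something the Makefile defines.
update_stamp() {
	stamp=$1
	flags=$2
	if [ "x$flags" != "x$(cat "$stamp" 2>/dev/null)" ]; then
		echo "$flags" > "$stamp"
		echo "updated"
	else
		echo "unchanged"
	fi
}

dir=$(mktemp -d)
first=$(update_stamp "$dir/KTAP-CFLAGS" "KTAP")            # no stamp file yet
second=$(update_stamp "$dir/KTAP-CFLAGS" "KTAP")           # same flags again
third=$(update_stamp "$dir/KTAP-CFLAGS" "KTAP NO_LIBELF")  # flags changed
echo "$first $second $third"
rm -rf "$dir"
```

Because the stamp file's timestamp moves only on a real change, make sees the
dependent objects as stale only when the build configuration differs.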

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/Makefile | 130 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 130 insertions(+)
 create mode 100644 tools/ktap/Makefile

diff --git a/tools/ktap/Makefile b/tools/ktap/Makefile
new file mode 100644
index 0000000..38aa113
--- /dev/null
+++ b/tools/ktap/Makefile
@@ -0,0 +1,130 @@
+#
+# Define NO_LIBELF if you do not want libelf dependency (e.g. cross-builds)
+# (this will also disable resolving symbols in DSOs)
+#
+
+INC = ../../include/uapi/ktap
+KTAP_LIBS = -lpthread
+KTAPC_CFLAGS = -Wall -O2
+
+all: ktap
+
+# try-cc
+# Usage: option = $(call try-cc, source-to-build, cc-options, msg)
+ifneq ($(V),1)
+TRY_CC_OUTPUT= > /dev/null 2>&1
+endif
+TRY_CC_MSG=echo "    CHK $(3)" 1>&2;
+
+try-cc = $(shell sh -c							\
+         'TMP="/tmp/.$$$$";						\
+          $(TRY_CC_MSG)							\
+          echo "$(1)" |							\
+          $(CC) -x c - $(2) -o "$$TMP" $(TRY_CC_OUTPUT) && echo y;	\
+          rm -f "$$TMP"')
+
+
+define SOURCE_LIBELF
+#include <libelf.h>
+
+int main(void)
+{
+        Elf *elf = elf_begin(0, ELF_C_READ, 0);
+        return (long)elf;
+}
+endef
+
+FLAGS_LIBELF = -lelf
+
+ifdef NO_LIBELF
+	KTAPC_CFLAGS += -DNO_LIBELF
+else
+ifneq ($(call try-cc,$(SOURCE_LIBELF),$(FLAGS_LIBELF),libelf),y)
+    $(warning No libelf found, disables symbol resolving, please install elfutils-libelf-devel/libelf-dev);
+    NO_LIBELF := 1
+    KTAPC_CFLAGS += -DNO_LIBELF
+else
+    KTAP_LIBS += -lelf
+endif
+endif
+
+kp_main.o: kp_main.c $(INC)/* KTAP-CFLAGS
+	$(QUIET_CC)$(CC) $(DEBUGINFO_FLAG) $(KTAPC_CFLAGS) -o $@ -c $<
+kp_lex.o: kp_lex.c $(INC)/* KTAP-CFLAGS
+	$(QUIET_CC)$(CC) $(DEBUGINFO_FLAG) $(KTAPC_CFLAGS) -o $@ -c $<
+kp_parse.o: kp_parse.c $(INC)/* KTAP-CFLAGS
+	$(QUIET_CC)$(CC) $(DEBUGINFO_FLAG) $(KTAPC_CFLAGS) -o $@ -c $<
+kp_bcwrite.o: kp_bcwrite.c $(INC)/* KTAP-CFLAGS
+	$(QUIET_CC)$(CC) $(DEBUGINFO_FLAG) $(KTAPC_CFLAGS) -o $@ -c $<
+kp_reader.o: kp_reader.c $(INC)/* KTAP-CFLAGS
+	$(QUIET_CC)$(CC) $(DEBUGINFO_FLAG) $(KTAPC_CFLAGS) -o $@ -c $<
+kp_util.o: kp_util.c $(INC)/* KTAP-CFLAGS
+	$(QUIET_CC)$(CC) $(DEBUGINFO_FLAG) $(KTAPC_CFLAGS) -o $@ -c $<
+kp_parse_events.o: kp_parse_events.c $(INC)/* KTAP-CFLAGS
+	$(QUIET_CC)$(CC) $(DEBUGINFO_FLAG) $(KTAPC_CFLAGS) -o $@ -c $<
+ifndef NO_LIBELF
+kp_symbol.o: kp_symbol.c KTAP-CFLAGS
+	$(QUIET_CC)$(CC) $(DEBUGINFO_FLAG) $(KTAPC_CFLAGS) -o $@ -c $<
+endif
+
+KTAPOBJS =
+KTAPOBJS += kp_main.o
+KTAPOBJS += kp_lex.o
+KTAPOBJS += kp_parse.o
+KTAPOBJS += kp_bcwrite.o
+KTAPOBJS += kp_reader.o
+KTAPOBJS += kp_util.o
+KTAPOBJS += kp_parse_events.o
+ifndef NO_LIBELF
+KTAPOBJS += kp_symbol.o
+endif
+
+ktap: $(KTAPOBJS) KTAP-CFLAGS
+	$(QUIET_LINK)$(CC) $(KTAPC_CFLAGS) -o $@ $(KTAPOBJS) $(KTAP_LIBS)
+
+install: ktap
+	install -c ktap /usr/bin/
+	mkdir -p ~/.vim/ftdetect
+	mkdir -p ~/.vim/syntax
+	cp vim/ftdetect/ktap.vim ~/.vim/ftdetect/
+	cp vim/syntax/ktap.vim ~/.vim/syntax/
+
+test: FORCE
+	#start testing
+	prove -j4 -r test/
+
+clean:
+	$(RM) ktap *.o KTAP-CFLAGS
+
+
+PHONY += FORCE
+FORCE:
+
+TRACK_FLAGS = KTAP
+ifdef NO_LIBELF
+TRACK_FLAGS += NO_LIBELF
+endif
+
+KTAP-CFLAGS: FORCE
+	@FLAGS='$(TRACK_FLAGS)'; \
+	if test x"$$FLAGS" != x"`cat KTAP-CFLAGS 2>/dev/null`" ; then \
+		echo "$$FLAGS" >KTAP-CFLAGS; \
+	fi
+
+#generate tags/etags/cscope index for editor.
+define all_sources
+        (find . -name '*.[ch]' -print)
+endef
+
+.PHONY: tags
+tags:
+	$(all_sources) | xargs ctags
+
+.PHONY: etags
+etags:
+	$(all_sources) | xargs etags
+
+.PHONY: cscope
+cscope:
+	$(all_sources) > cscope.files
+	cscope -k -b
-- 
1.8.1.4



* [PATCH v2 27/29] ktap: add testsuite and benchmark(tools/ktap/test/*)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (25 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 26/29] ktap: add userspace binary Makefile(tools/ktap/Makefile) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 28/29] ktap: add vim syntax file(tools/ktap/vim/*) Jovi Zhangwei
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

The ktap test suite is based on the perl-prove framework.
More information is available in test/README.

The test framework is contributed by Yichun Zhang (agentzh)

The Makefile's "test" target runs the suite in parallel by default:
        prove -j4 -r test/

There are also several benchmark scripts that compare the performance
of ktap and stap.

The benchmarks show that:
1). ktap number computation and comparison overhead is higher than stap's,
    by roughly 10% or more.

2). Perf backend tracing overhead is higher than raw tracepoint/kprobe.

3). ktap table operation overhead is lower than stap's, by roughly 10% or more.

(These benchmark results only reflect measurements on my machine.)
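
Each benchmark script follows the same pattern: start a tracer in the
background, run the workload three times, then kill the tracer. A minimal
sketch of that pattern, with "sleep" standing in for the tracer and "true"
for the workload (both stand-ins are assumptions for illustration; the real
scripts run ktap or stap against sembench):

```shell
#!/bin/sh
# Stand-ins, not the real benchmark: "sleep 30" plays the tracer and
# "true" plays the sembench workload.
TRACER="sleep 30"
COMMAND="true"

$TRACER &
tracer_pid=$!		# remember exactly which tracer we started

runs=0
for i in 1 2 3; do
	$COMMAND && runs=$((runs + 1))
done

# Stop the tracer we started and reap it.
kill "$tracer_pid"
wait "$tracer_pid" 2>/dev/null

echo "runs=$runs"
```

Tracking the tracer through $! is a design choice of this sketch: unlike
pidof or pkill, it cannot hit an unrelated ktap or stap instance that happens
to be running on the same machine.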

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/test/README                   |  69 ++++
 tools/ktap/test/arithmetic.t             | 109 ++++++
 tools/ktap/test/benchmark/cmp_neq.sh     | 158 +++++++++
 tools/ktap/test/benchmark/cmp_profile.sh |  54 +++
 tools/ktap/test/benchmark/cmp_table.sh   | 112 +++++++
 tools/ktap/test/benchmark/sembench.c     | 556 +++++++++++++++++++++++++++++++
 tools/ktap/test/cli-arg.t                |  25 ++
 tools/ktap/test/concat.t                 |  21 ++
 tools/ktap/test/count.t                  |  25 ++
 tools/ktap/test/deadloop.t               |  37 ++
 tools/ktap/test/fibonacci.t              |  42 +++
 tools/ktap/test/function.t               |  78 +++++
 tools/ktap/test/if.t                     |  32 ++
 tools/ktap/test/kprobe.t                 |  82 +++++
 tools/ktap/test/kretprobe.t              |  35 ++
 tools/ktap/test/len.t                    |  27 ++
 tools/ktap/test/lib/Test/ktap.pm         | 128 +++++++
 tools/ktap/test/looping.t                |  46 +++
 tools/ktap/test/one-liner.t              |  48 +++
 tools/ktap/test/pairs.t                  |  52 +++
 tools/ktap/test/stack_overflow.t         |  22 ++
 tools/ktap/test/syntax-err.t             |  19 ++
 tools/ktap/test/table.t                  |  81 +++++
 tools/ktap/test/time.t                   |  59 ++++
 tools/ktap/test/timer.t                  |  65 ++++
 tools/ktap/test/tracepoint.t             |  53 +++
 tools/ktap/test/util/reindex             |  61 ++++
 tools/ktap/test/zerodivide.t             |  21 ++
 28 files changed, 2117 insertions(+)
 create mode 100644 tools/ktap/test/README
 create mode 100644 tools/ktap/test/arithmetic.t
 create mode 100644 tools/ktap/test/benchmark/cmp_neq.sh
 create mode 100644 tools/ktap/test/benchmark/cmp_profile.sh
 create mode 100644 tools/ktap/test/benchmark/cmp_table.sh
 create mode 100644 tools/ktap/test/benchmark/sembench.c
 create mode 100644 tools/ktap/test/cli-arg.t
 create mode 100644 tools/ktap/test/concat.t
 create mode 100644 tools/ktap/test/count.t
 create mode 100644 tools/ktap/test/deadloop.t
 create mode 100644 tools/ktap/test/fibonacci.t
 create mode 100644 tools/ktap/test/function.t
 create mode 100644 tools/ktap/test/if.t
 create mode 100644 tools/ktap/test/kprobe.t
 create mode 100644 tools/ktap/test/kretprobe.t
 create mode 100644 tools/ktap/test/len.t
 create mode 100644 tools/ktap/test/lib/Test/ktap.pm
 create mode 100644 tools/ktap/test/looping.t
 create mode 100644 tools/ktap/test/one-liner.t
 create mode 100644 tools/ktap/test/pairs.t
 create mode 100644 tools/ktap/test/stack_overflow.t
 create mode 100644 tools/ktap/test/syntax-err.t
 create mode 100644 tools/ktap/test/table.t
 create mode 100644 tools/ktap/test/time.t
 create mode 100644 tools/ktap/test/timer.t
 create mode 100644 tools/ktap/test/tracepoint.t
 create mode 100755 tools/ktap/test/util/reindex
 create mode 100644 tools/ktap/test/zerodivide.t

diff --git a/tools/ktap/test/README b/tools/ktap/test/README
new file mode 100644
index 0000000..5a628e1
--- /dev/null
+++ b/tools/ktap/test/README
@@ -0,0 +1,69 @@
+This directory contains the test suite for ktap.
+
+Prerequisites
+-------------
+
+One needs to install perl and CPAN modules Test::Base and IPC::Run
+before running the tests. After perl is installed, the "cpan" utility
+can be used to install the CPAN modules required:
+
+    cpan Test::Base IPC::Run
+
+Alternatively you can just install the pre-built binary packages
+provided by your Linux distribution vendor. For example, on Fedora, you
+just need to run
+
+    yum install perl-Test-Base perl-IPC-Run
+
+Running tests
+-------------
+
+You are required to run the tests from the root directory of this project.
+
+You can run the whole test suite like this:
+
+    prove -r test/
+
+To utilize multiple CPU cores while running the tests, you can also
+spawn multiple processes to run the test files in parallel,
+as in
+
+    prove -j4 -r test/
+
+Then 4 processes will be spawned to run the tests at the same time.
+
+To run individual .t test files, just specify their file paths
+explicitly:
+
+    prove test/cli-arg.t test/one-liner.t
+
+If you just want to run an individual test case in a particular .t
+file, then just add the line
+
+    --- ONLY
+
+to the end of the test block you want to run and run that .t file
+normally with the "prove" utility.
+
+Similarly, if you want to skip a particular test block, add the line
+
+    --- SKIP
+
+to that test block.
+
+Test file formatting
+--------------------
+
+We provide a "reindex" tool that automatically re-formats
+the .t test files, so that you do not have to manually get the test
+serial numbers exactly right (like "TEST 1: ", "TEST 2: ", etc.),
+nor manually keep 3 blank lines between adjacent test blocks. For
+example,
+
+    ./test/util/reindex test/cli-arg.t
+
+or re-format all the .t files:
+
+    ./test/util/reindex test/*.t
+
+Always run this tool before committing your newly edited tests.
diff --git a/tools/ktap/test/arithmetic.t b/tools/ktap/test/arithmetic.t
new file mode 100644
index 0000000..e56daf0
--- /dev/null
+++ b/tools/ktap/test/arithmetic.t
@@ -0,0 +1,109 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: arithmetic
+--- src
+if (1 > 2) {
+	print("failed")
+}
+
+if (200 < 100) {
+	print("failed")
+}
+
+if (1 == nil) {
+	print("failed")
+}
+
+if (1 != nil) {
+	print("1 != nil")
+}
+
+if (nil == 1) {
+	print("failed")
+}
+
+if (nil != 1) {
+	print("nil != 1")
+}
+
+if (1 == "test") {
+	print("failed")
+}
+
+if (1 != "test") {
+	print("1 != 'test'")
+}
+
+if ("test" == 1) {
+	print("failed")
+}
+
+if ("test" != 1) {
+	print("'test' != 1")
+}
+
+if ("1234" == "1") {
+	print("failed")
+}
+
+if ("1234" != "1") {
+	print("'1234' != '1'")
+}
+
+
+
+var a = 4
+var b = 5
+
+if ((a + b) != 9) {
+	print("failed")
+}
+
+if ((a - b) != -1) {
+	print("failed")
+}
+
+if ((a * b) != 20) {
+	print("failed")
+}
+
+if ((a % b) != 4) {
+	print("failed")
+}
+
+if ((a / b) != 0) {
+	print("failed")
+}
+
+
+
+#the checks below are only valid on 64-bit systems
+
+var c = 0x1234567812345678
+var d = 0x2
+
+if (c + d != 0x123456781234567a) {
+	print("failed")
+}
+
+if (-1 != 0xffffffffffffffff) {
+	print("failed")
+}
+
+--- out
+1 != nil
+nil != 1
+1 != 'test'
+'test' != 1
+'1234' != '1'
+
+--- err
+
+
diff --git a/tools/ktap/test/benchmark/cmp_neq.sh b/tools/ktap/test/benchmark/cmp_neq.sh
new file mode 100644
index 0000000..da69131
--- /dev/null
+++ b/tools/ktap/test/benchmark/cmp_neq.sh
@@ -0,0 +1,158 @@
+#!/bin/bash
+
+# This script compares number equality performance between ktap and stap.
+# It also compares different ktap tracing interfaces.
+#
+# 1. ktap -e 'trace syscalls:sys_enter_futex {}'
+# 2. ktap -e 'kdebug.tracepoint("sys_enter_futex", function () {})'
+# 3. ktap -e 'trace probe:SyS_futex uaddr=%di {}'
+# 4. ktap -e 'kdebug.kprobe("SyS_futex", function () {})'
+# 5. stap -e 'probe syscall.futex {}'
+# 6. ktap -d -e 'trace syscalls:sys_enter_futex {}'
+# 7. ktap -d -e 'kdebug.tracepoint("sys_enter_futex", function () {})'
+# 8. ktap -e 'trace syscalls:sys_enter_futex /kernel_buildin_filter/ {}'
+
+#Result:
+#ktap number computation and comparison overhead is higher than stap's,
+#by roughly 10% or more (4 vs. 5 above); ktap is not very slow.
+#
+#Perf backend tracing overhead is high because it needs to copy a temp buffer,
+#and its code path is much longer than a direct callback (1 vs. 4 above).
+
+gcc -o sembench sembench.c -O2 -lpthread
+
+COMMAND="./sembench -t 200 -w 20 -r 30 -o 2"
+
+#------------------------------------------------------------#
+
+echo -e "without tracing:"
+#$COMMAND; $COMMAND; $COMMAND
+
+#------------------------------------------------------------#
+
+../../ktap -q -e 'trace syscalls:sys_enter_futex {
+	var uaddr = arg2
+        if (uaddr == 0x100 || uaddr == 0x200 || uaddr == 0x300 ||
+	    uaddr == 0x400 || uaddr == 0x500 || uaddr == 0x600 ||
+	    uaddr == 0x700 || uaddr == 0x800 || uaddr == 0x900 ||
+	    uaddr == 0x1000) {
+                printf("%x %x\n", arg1, arg2)
+        }}' &
+
+echo -e "\nktap tracing: trace syscalls:sys_enter_futex { if (arg2 == 0x100 || arg2 == 0x200 ... }"
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+#------------------------------------------------------------#
+
+../../ktap -q -e 'kdebug.tracepoint("sys_enter_futex", function () {
+	var arg = arg2
+        if (arg == 0x100 || arg == 0x200 || arg == 0x300 || arg == 0x400 ||
+            arg == 0x500 || arg == 0x600 || arg == 0x700 || arg == 0x800 ||
+            arg == 0x900 || arg == 0x1000) {
+                printf("%x %x\n", arg1, arg2)
+        }})' &
+
+echo -e '\nktap tracing: kdebug.tracepoint("sys_enter_futex", function (xxx) {})'
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+#------------------------------------------------------------#
+
+../../ktap -q -e 'trace probe:SyS_futex uaddr=%di {
+	var arg = arg1
+        if (arg == 0x100 || arg == 0x200 || arg == 0x300 || arg == 0x400 ||
+            arg == 0x500 || arg == 0x600 || arg == 0x700 || arg == 0x800 ||
+            arg == 0x900 || arg == 0x1000) {
+                printf("%x\n", arg1)
+        }}' &
+echo -e '\nktap tracing: trace probe:SyS_futex uaddr=%di {...}'
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+
+#------------------------------------------------------------#
+../../ktap -q -e 'kdebug.kprobe("SyS_futex", function () {
+	var uaddr = 1
+        if (uaddr == 0x100 || uaddr == 0x200 || uaddr == 0x300 ||
+	    uaddr == 0x400 || uaddr == 0x500 || uaddr == 0x600 ||
+	    uaddr == 0x700 || uaddr == 0x800 || uaddr == 0x900 ||
+	    uaddr == 0x1000) {
+                printf("%x\n", uaddr)
+	}})' &
+echo -e '\nktap tracing: kdebug.kprobe("SyS_futex", function () {})'
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+#------------------------------------------------------------#
+
+stap -e 'probe syscall.futex {
+	uaddr = $uaddr
+        if (uaddr == 0x100 || uaddr == 0x200 || uaddr == 0x300 ||
+	    uaddr == 0x400 || uaddr == 0x500 || uaddr == 0x600 ||
+	    uaddr == 0x700 || uaddr == 0x800 || uaddr == 0x900 ||
+	    uaddr == 0x1000) {
+                printf("%x\n", uaddr)
+        }}' &
+
+echo -e "\nstap tracing: probe syscall.futex { if (uaddr == 0x100 || uaddr == 0x200 ... }"
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof stap`
+disown $pid; kill -9 $pid; sleep 1
+
+#------------------------------------------------------------#
+
+
+../../ktap -d -q -e 'trace syscalls:sys_enter_futex {
+	var uaddr = arg2
+        if (uaddr == 0x100 || uaddr == 0x200 || uaddr == 0x300 ||
+	    uaddr == 0x400 || uaddr == 0x500 || uaddr == 0x600 ||
+	    uaddr == 0x700 || uaddr == 0x800 || uaddr == 0x900 ||
+	    uaddr == 0x1000) {
+                printf("%x %x\n", arg1, arg2)
+        }}' &
+
+echo -e "\nktap tracing dry-run: trace syscalls:sys_enter_futex { if (arg2 == 0x100 || arg2 == 0x200 ... }"
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+
+#------------------------------------------------------------#
+
+../../ktap -d -q -e 'kdebug.tracepoint("sys_enter_futex", function () {
+	var arg = arg2
+        if (arg == 0x100 || arg == 0x200 || arg == 0x300 || arg == 0x400 ||
+            arg == 0x500 || arg == 0x600 || arg == 0x700 || arg == 0x800 ||
+            arg == 0x900 || arg == 0x1000) {
+                printf("%x %x\n", arg1, arg2)
+        }})' &
+
+echo -e '\nktap tracing dry-run: kdebug.tracepoint("sys_enter_futex", function (xxx) {})'
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+
+#------------------------------------------------------------#
+
+../../ktap -q -e 'trace syscalls:sys_enter_futex /
+	uaddr == 0x100 || uaddr == 0x200 || uaddr == 0x300 || uaddr == 0x400 ||
+	uaddr == 0x500 || uaddr == 0x600 || uaddr == 0x700 || uaddr == 0x800 ||
+	uaddr == 0x900 || uaddr == 0x1000/ {
+		printf("%x %x\n", arg1, arg2)
+	}' &
+
+echo -e "\nktap tracing: trace syscalls:sys_enter_futex /uaddr == 0x100 || uaddr == 0x200 .../ {}"
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+#------------------------------------------------------------#
+
+rm -rf ./sembench
+
diff --git a/tools/ktap/test/benchmark/cmp_profile.sh b/tools/ktap/test/benchmark/cmp_profile.sh
new file mode 100644
index 0000000..c400d8d
--- /dev/null
+++ b/tools/ktap/test/benchmark/cmp_profile.sh
@@ -0,0 +1,54 @@
+#!/bin/bash
+
+# This script compares stack profiling performance between ktap and stap.
+#
+# 1. ktap -e 'profile-1000us { s[stack(-1, 12)] += 1 }'
+# 2. stap -e 'probe timer.profile { s[backtrace()] += 1 }'
+# 3. stap -e 'probe timer.profile { s[backtrace()] <<< 1 }'
+
+#Result:
+#Currently the stack profiling overhead is nearly the same for ktap and stap.
+#
+#ktap resolves the kernel stack to a string at runtime, which is very time
+#consuming; this should be optimized in the future.
+
+
+gcc -o sembench sembench.c -O2 -lpthread
+
+COMMAND="./sembench -t 200 -w 20 -r 30 -o 2"
+
+#------------------------------------------------------------#
+
+echo -e "without tracing:"
+$COMMAND; $COMMAND; $COMMAND
+
+#------------------------------------------------------------#
+
+../../ktap -q -e 'var s = table.new(0, 20000) profile-1000us { s[stack(-1, 12)] += 1 }' &
+
+echo -e "\nktap tracing: profile-1000us { s[stack(-1, 12)] += 1 }"
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+#------------------------------------------------------------#
+
+stap -o /dev/null -e 'global s[20000]; probe timer.profile { s[backtrace()] += 1 }' &
+
+echo -e "\nstap tracing: probe timer.profile { s[backtrace()] += 1 }"
+$COMMAND; $COMMAND; $COMMAND
+pkill stap
+
+#------------------------------------------------------------#
+
+stap -o /dev/null -e 'global s[20000]; probe timer.profile { s[backtrace()] <<< 1 }' &
+
+echo -e "\nstap tracing: probe timer.profile { s[backtrace()] <<< 1 }"
+$COMMAND; $COMMAND; $COMMAND
+pkill stap
+
+#------------------------------------------------------------#
+
+
+rm -rf ./sembench
+
diff --git a/tools/ktap/test/benchmark/cmp_table.sh b/tools/ktap/test/benchmark/cmp_table.sh
new file mode 100644
index 0000000..6e3f12f
--- /dev/null
+++ b/tools/ktap/test/benchmark/cmp_table.sh
@@ -0,0 +1,112 @@
+#!/bin/bash
+
+# This script compares table performance between ktap and stap.
+#
+# 1. ktap -e 'trace syscalls:sys_enter_futex { s[execname] += 1 }'
+# 2. ktap -e 'kdebug.tracepoint("sys_enter_futex", function () { s[execname] += 1 })'
+# 3. ktap -e 'kdebug.kprobe("SyS_futex", function () { s[execname] += 1 })'
+# 4. stap -e 'probe syscall.futex { s[execname()] += 1 }'
+# 5. ktap -e 'kdebug.kprobe("SyS_futex", function () { s[probename] += 1 })'
+# 6. stap -e 'probe syscall.futex { s[name] += 1 }'
+# 7. ktap -e 'kdebug.kprobe("SyS_futex", function () { s["constant_string_key"] += 1 })'
+# 8. stap -e 'probe syscall.futex { s["constant_string_key"] += 1 }'
+
+#Result:
+#Currently ktap table operation overhead is smaller than stap.
+
+
+gcc -o sembench sembench.c -O2 -lpthread
+
+COMMAND="./sembench -t 200 -w 20 -r 30 -o 2"
+
+#------------------------------------------------------------#
+
+echo -e "without tracing:"
+$COMMAND; $COMMAND; $COMMAND
+
+#------------------------------------------------------------#
+
+../../ktap -q -e 'var s = {} trace syscalls:sys_enter_futex { s[execname] += 1 }' &
+
+echo -e "\nktap tracing: trace syscalls:sys_enter_futex { s[execname] += 1 }"
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+#------------------------------------------------------------#
+
+../../ktap -q -e 'var s = {} kdebug.tracepoint("sys_enter_futex", function () {
+	s[execname] += 1 })' &
+
+echo -e '\nktap tracing: kdebug.tracepoint("sys_enter_futex", function () { s[execname] += 1})'
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+#------------------------------------------------------------#
+
+../../ktap -q -e 'var s = {} kdebug.kprobe("SyS_futex", function () {
+	s[execname] += 1 })' &
+
+echo -e '\nktap tracing: kdebug.kprobe("SyS_futex", function () { s[execname] += 1 })'
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+#------------------------------------------------------------#
+
+stap -e 'global s; probe syscall.futex { s[execname()] += 1 }' &
+
+echo -e "\nstap tracing: probe syscall.futex { s[execname()] += 1 }"
+$COMMAND; $COMMAND; $COMMAND
+pkill stap
+
+#------------------------------------------------------------#
+
+../../ktap -q -e 'var s = {} kdebug.kprobe("SyS_futex", function () {
+	s[probename] += 1 })' &
+
+echo -e '\nktap tracing: kdebug.kprobe("SyS_futex", function () { s[probename] += 1 })'
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+#------------------------------------------------------------#
+
+stap -e 'global s; probe syscall.futex { s[name] += 1 }' &
+
+echo -e "\nstap tracing: probe syscall.futex { s[name] += 1 }"
+$COMMAND; $COMMAND; $COMMAND
+pkill stap
+
+#------------------------------------------------------------#
+
+../../ktap -q -e 'var s = {} s["const_string_key"] = 0 kdebug.kprobe("SyS_futex", function () {
+	s["const_string_key"] += 1 })' &
+
+echo -e '\nktap tracing: kdebug.kprobe("SyS_futex", function () { s["const_string_key"] += 1 })'
+$COMMAND; $COMMAND; $COMMAND
+pid=`pidof ktap`
+disown $pid; kill -9 $pid; sleep 1
+
+#------------------------------------------------------------#
+
+stap -e 'global s; probe syscall.futex { s["const_string_key"] += 1 }' &
+
+echo -e '\nstap tracing: probe syscall.futex { s["const_string_key"] += 1 }'
+$COMMAND; $COMMAND; $COMMAND
+pkill stap
+
+#------------------------------------------------------------#
+
+stap -o /dev/null -e 'global s; probe syscall.futex { s["const_string_key"] <<< 1 }' &
+
+echo -e '\nstap tracing: probe syscall.futex { s["const_string_key"] <<< 1 }'
+$COMMAND; $COMMAND; $COMMAND
+pkill stap
+
+#------------------------------------------------------------#
+
+
+rm -rf ./sembench
+
diff --git a/tools/ktap/test/benchmark/sembench.c b/tools/ktap/test/benchmark/sembench.c
new file mode 100644
index 0000000..5dfccd5
--- /dev/null
+++ b/tools/ktap/test/benchmark/sembench.c
@@ -0,0 +1,556 @@
+/*
+ * copyright Oracle 2007.  Licensed under GPLv2
+ * To compile: gcc -Wall -o sembench sembench.c -lpthread
+ *
+ * usage: sembench -t thread count -w wakenum -r runtime -o op
+ * op can be: 0 (ipc sem) 1 (nanosleep) 2 (futexes)
+ *
+ * example:
+ *	sembench -t 1024 -w 512 -r 60 -o 2
+ * runs 1024 threads, waking up 512 at a time, running for 60 seconds using
+ * futex locking.
+ *
+ */
+#define  _GNU_SOURCE
+#define _POSIX_C_SOURCE 199309
+#include <fcntl.h>
+#include <sched.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/sem.h>
+#include <sys/ipc.h>
+#include <sys/types.h>
+#include <sys/mman.h>
+#include <pthread.h>
+#include <unistd.h>
+#include <string.h>
+#include <time.h>
+#include <sys/time.h>
+#include <sys/syscall.h>
+#include <errno.h>
+
+#define VERSION "0.2"
+
+/* futexes have been around since 2.5.something, but it still seems I
+ * need to make my own syscall.  Sigh.
+ */
+#define FUTEX_WAIT              0
+#define FUTEX_WAKE              1
+#define FUTEX_FD                2
+#define FUTEX_REQUEUE           3
+#define FUTEX_CMP_REQUEUE       4
+#define FUTEX_WAKE_OP           5
+static inline int futex (int *uaddr, int op, int val,
+			 const struct timespec *timeout,
+			 int *uaddr2, int val3)
+{
+	return syscall(__NR_futex, uaddr, op, val, timeout, uaddr2, val3);
+}
+
+static void smp_mb(void)
+{
+	__sync_synchronize();
+}
+
+static int all_done = 0;
+static int timeout_test = 0;
+
+#define SEMS_PERID 250
+
+struct sem_operations;
+
+struct lockinfo {
+	unsigned long id;
+	unsigned long index;
+	int data;
+	pthread_t tid;
+	struct lockinfo *next;
+	struct sem_operations *ops;
+	unsigned long ready;
+};
+
+struct sem_wakeup_info {
+	int wakeup_count;
+	struct sembuf sb[SEMS_PERID];
+};
+
+struct sem_operations {
+	void (*wait)(struct lockinfo *l);
+	int (*wake)(struct sem_wakeup_info *wi, int num_semids, int num);
+	void (*setup)(struct sem_wakeup_info **wi, int num_semids);
+	void (*cleanup)(int num_semids);
+	char *name;
+};
+
+int *semid_lookup = NULL;
+
+pthread_mutex_t worklist_mutex = PTHREAD_MUTEX_INITIALIZER;
+static unsigned long total_burns = 0;
+static unsigned long min_burns = ~0UL;
+static unsigned long max_burns = 0;
+
+/* currently running threads */
+static int thread_count = 0;
+
+struct lockinfo *worklist = NULL;
+static int workers_started = 0;
+
+/* total threads started */
+static int num_threads = 2048;
+
+static void worklist_add(struct lockinfo *l)
+{
+	smp_mb();
+	l->ready = 1;
+}
+
+static struct lockinfo *worklist_rm(void)
+{
+	static int last_index = 0;
+	int i;
+	struct lockinfo *l;
+
+	for (i = 0; i < num_threads; i++) {
+		int test = (last_index + i) % num_threads;
+
+		l = worklist + test;
+		smp_mb();
+		if (l->ready) {
+			l->ready = 0;
+			last_index = test;
+			return l;
+		}
+	}
+	return NULL;
+}
+
+/* ipc semaphore post & wait */
+void wait_ipc_sem(struct lockinfo *l)
+{
+	struct sembuf sb;
+	int ret;
+	struct timespec *tvp = NULL;
+	struct timespec tv = { 0, 1 };
+
+	sb.sem_num = l->index;
+	sb.sem_flg = 0;
+
+	sb.sem_op = -1;
+	l->data = 1;
+
+	if (timeout_test && (l->id % 5) == 0)
+		tvp = &tv;
+
+	worklist_add(l);
+	ret = semtimedop(semid_lookup[l->id], &sb, 1, tvp);
+
+	while(l->data != 0 && tvp) {
+		struct timespec tv2 = { 0, 500 };
+		nanosleep(&tv2, NULL);
+	}
+
+	if (l->data != 0) {
+		if (tvp)
+			return;
+		fprintf(stderr, "wakeup without data update\n");
+		exit(1);
+	}
+	if (ret) {
+		if (errno == EAGAIN && tvp)
+			return;
+		perror("semtimed op");
+		exit(1);
+	}
+}
+
+int ipc_wake_some(struct sem_wakeup_info *wi, int num_semids, int num)
+{
+	int i;
+	int ret;
+	struct lockinfo *l;
+	int found = 0;
+
+	for (i = 0; i < num_semids; i++) {
+		wi[i].wakeup_count = 0;
+	}
+	while(num > 0) {
+		struct sembuf *sb;
+		l = worklist_rm();
+		if (!l)
+			break;
+		if (l->data != 1)
+			fprintf(stderr, "warning, lockinfo data was %d\n",
+				l->data);
+		l->data = 0;
+		sb = wi[l->id].sb + wi[l->id].wakeup_count;
+		sb->sem_num = l->index;
+		sb->sem_op = 1;
+		sb->sem_flg = IPC_NOWAIT;
+		wi[l->id].wakeup_count++;
+		found++;
+		num--;
+	}
+	if (!found)
+		return 0;
+	for (i = 0; i < num_semids; i++) {
+		int wakeup_total;
+		int cur;
+		int offset = 0;
+		if (!wi[i].wakeup_count)
+			continue;
+		wakeup_total = wi[i].wakeup_count;
+		while(wakeup_total > 0) {
+			cur = wakeup_total > 64 ? 64 : wakeup_total;
+			ret = semtimedop(semid_lookup[i], wi[i].sb + offset,
+					 cur, NULL);
+			if (ret) {
+				perror("semtimedop");
+				exit(1);
+			}
+			offset += cur;
+			wakeup_total -= cur;
+		}
+	}
+	return found;
+}
+
+void setup_ipc_sems(struct sem_wakeup_info **wi, int num_semids)
+{
+	int i;
+	*wi = malloc(sizeof(**wi) * num_semids);
+	semid_lookup = malloc(num_semids * sizeof(int));
+	for(i = 0; i < num_semids; i++) {
+		semid_lookup[i] = semget(IPC_PRIVATE, SEMS_PERID,
+					 IPC_CREAT | 0777);
+		if (semid_lookup[i] < 0) {
+			perror("semget");
+			exit(1);
+		}
+	}
+	sleep(10);
+}
+
+void cleanup_ipc_sems(int num)
+{
+	int i;
+	for (i = 0; i < num; i++) {
+		semctl(semid_lookup[i], 0, IPC_RMID);
+	}
+}
+
+struct sem_operations ipc_sem_ops = {
+	.wait = wait_ipc_sem,
+	.wake = ipc_wake_some,
+	.setup = setup_ipc_sems,
+	.cleanup = cleanup_ipc_sems,
+	.name = "ipc sem operations",
+};
+
+/* futex post & wait */
+void wait_futex_sem(struct lockinfo *l)
+{
+	int ret;
+	l->data = 1;
+	worklist_add(l);
+	while(l->data == 1) {
+		ret = futex(&l->data, FUTEX_WAIT, 1, NULL, NULL, 0);
+		/*
+		if (ret && ret != EWOULDBLOCK) {
+			perror("futex wait");
+			exit(1);
+		}*/
+	}
+}
+
+int futex_wake_some(struct sem_wakeup_info *wi, int num_semids, int num)
+{
+	int i;
+	int ret;
+	struct lockinfo *l;
+	int found = 0;
+
+	for (i = 0; i < num; i++) {
+		l = worklist_rm();
+		if (!l)
+			break;
+		if (l->data != 1)
+			fprintf(stderr, "warning, lockinfo data was %d\n",
+				l->data);
+		l->data = 0;
+		ret = futex(&l->data, FUTEX_WAKE, 1, NULL, NULL, 0);
+		if (ret < 0) {
+			perror("futex wake");
+			exit(1);
+		}
+		found++;
+	}
+	return found;
+}
+
+void setup_futex_sems(struct sem_wakeup_info **wi, int num_semids)
+{
+	return;
+}
+
+void cleanup_futex_sems(int num)
+{
+	return;
+}
+
+struct sem_operations futex_sem_ops = {
+	.wait = wait_futex_sem,
+	.wake = futex_wake_some,
+	.setup = setup_futex_sems,
+	.cleanup = cleanup_futex_sems,
+	.name = "futex sem operations",
+};
+
+/* nanosleep sems here */
+void wait_nanosleep_sem(struct lockinfo *l)
+{
+	int ret;
+	struct timespec tv = { 0, 1000000 };
+	int count = 0;
+
+	l->data = 1;
+	worklist_add(l);
+	while(l->data) {
+		ret = nanosleep(&tv, NULL);
+		if (ret) {
+			perror("nanosleep");
+			exit(1);
+		}
+		count++;
+	}
+}
+
+int nanosleep_wake_some(struct sem_wakeup_info *wi, int num_semids, int num)
+{
+	int i;
+	struct lockinfo *l;
+
+	for (i = 0; i < num; i++) {
+		l = worklist_rm();
+		if (!l)
+			break;
+		if (l->data != 1)
+			fprintf(stderr, "warning, lockinfo data was %d\n",
+				l->data);
+		l->data = 0;
+	}
+	return i;
+}
+
+void setup_nanosleep_sems(struct sem_wakeup_info **wi, int num_semids)
+{
+	return;
+}
+
+void cleanup_nanosleep_sems(int num)
+{
+	return;
+}
+
+struct sem_operations nanosleep_sem_ops = {
+	.wait = wait_nanosleep_sem,
+	.wake = nanosleep_wake_some,
+	.setup = setup_nanosleep_sems,
+	.cleanup = cleanup_nanosleep_sems,
+	.name = "nano sleep sem operations",
+};
+
+void *worker(void *arg)
+{
+	struct lockinfo *l = (struct lockinfo *)arg;
+	int burn_count = 0;
+	pthread_t tid = pthread_self();
+	size_t pagesize = getpagesize();
+	char *buf = malloc(pagesize);
+
+	if (!buf) {
+		perror("malloc");
+		exit(1);
+	}
+
+	l->tid = tid;
+	workers_started = 1;
+	smp_mb();
+
+	while(!all_done) {
+		l->ops->wait(l);
+		if (all_done)
+			break;
+		burn_count++;
+	}
+	pthread_mutex_lock(&worklist_mutex);
+	total_burns += burn_count;
+	if (burn_count < min_burns)
+		min_burns = burn_count;
+	if (burn_count > max_burns)
+		max_burns = burn_count;
+	thread_count--;
+	pthread_mutex_unlock(&worklist_mutex);
+	return (void *)0;
+}
+
+void print_usage(void)
+{
+	printf("usage: sembench [-t threads] [-w wake incr] [-r runtime]\n");
+	printf("                [-o num] (0=ipc, 1=nanosleep, 2=futex)\n");
+	exit(1);
+}
+
+#define NUM_OPERATIONS 3
+struct sem_operations *allops[NUM_OPERATIONS] = { &ipc_sem_ops,
+						&nanosleep_sem_ops,
+						&futex_sem_ops};
+
+int main(int ac, char **av) {
+	int ret;
+	int i;
+	int semid = 0;
+	int sem_num = 0;
+	int burn_count = 0;
+	struct sem_wakeup_info *wi = NULL;
+	struct timeval start;
+	struct timeval now;
+	int num_semids = 0;
+	int wake_num = 256;
+	int run_secs = 30;
+	int pagesize = getpagesize();
+	char *buf = malloc(pagesize);
+	struct sem_operations *ops = allops[0];
+	cpu_set_t cpu_mask;
+	cpu_set_t target_mask;
+	int target_cpu = 0;
+	int max_cpu = -1;
+
+	if (!buf) {
+		perror("malloc");
+		exit(1);
+	}
+	for (i = 1; i < ac; i++) {
+		if (strcmp(av[i], "-t") == 0) {
+			if (i == ac -1)
+				print_usage();
+			num_threads = atoi(av[i+1]);
+			i++;
+		} else if (strcmp(av[i], "-w") == 0) {
+			if (i == ac -1)
+				print_usage();
+			wake_num = atoi(av[i+1]);
+			i++;
+		} else if (strcmp(av[i], "-r") == 0) {
+			if (i == ac -1)
+				print_usage();
+			run_secs = atoi(av[i+1]);
+			i++;
+		} else if (strcmp(av[i], "-o") == 0) {
+			int index;
+			if (i == ac -1)
+				print_usage();
+			index = atoi(av[i+1]);
+			if (index < 0 || index >= NUM_OPERATIONS) {
+				fprintf(stderr, "invalid operations %d\n",
+					index);
+				exit(1);
+			}
+			ops = allops[index];
+			i++;
+		} else if (strcmp(av[i], "-T") == 0) {
+			timeout_test = 1;
+		} else if (strcmp(av[i], "-h") == 0) {
+			print_usage();
+		}
+	}
+	num_semids = (num_threads + SEMS_PERID - 1) / SEMS_PERID;
+	ops->setup(&wi, num_semids);
+
+	ret = sched_getaffinity(0, sizeof(cpu_set_t), &cpu_mask);
+	if (ret) {
+		perror("sched_getaffinity");
+		exit(1);
+	}
+	for (i = 0; i < CPU_SETSIZE; i++)
+		if (CPU_ISSET(i, &cpu_mask))
+			max_cpu = i;
+	if (max_cpu == -1) {
+		fprintf(stderr, "sched_getaffinity returned empty mask\n");
+		exit(1);
+	}
+
+	CPU_ZERO(&target_mask);
+
+	/* calloc zero-fills, so every lockinfo starts with ready == 0 */
+	worklist = calloc(num_threads, sizeof(*worklist));
+	if (!worklist) {
+		perror("calloc");
+		exit(1);
+	}
+
+	for (i = 0; i < num_threads; i++) {
+		struct lockinfo *l;
+		pthread_t tid;
+		thread_count++;
+		l = worklist + i;
+		l->id = semid;
+		l->index = sem_num++;
+		l->ops = ops;
+		if (sem_num >= SEMS_PERID) {
+			semid++;
+			sem_num = 0;
+		}
+		ret = pthread_create(&tid, NULL, worker, (void *)l);
+		if (ret) {
+			perror("pthread_create");
+			exit(1);
+		}
+
+		while (!CPU_ISSET(target_cpu, &cpu_mask)) {
+			target_cpu++;
+			if (target_cpu > max_cpu)
+				target_cpu = 0;
+		}
+		CPU_SET(target_cpu, &target_mask);
+		ret = pthread_setaffinity_np(tid, sizeof(cpu_set_t),
+					     &target_mask);
+		CPU_CLR(target_cpu, &target_mask);
+		target_cpu++;
+
+		ret = pthread_detach(tid);
+		if (ret) {
+			perror("pthread_detach");
+			exit(1);
+		}
+	}
+	while(!workers_started) {
+		smp_mb();
+		usleep(200);
+	}
+	gettimeofday(&start, NULL);
+	//fprintf(stderr, "main loop going\n");
+	while(1) {
+		ops->wake(wi, num_semids, wake_num);
+		burn_count++;
+		gettimeofday(&now, NULL);
+		if (now.tv_sec - start.tv_sec >= run_secs)
+			break;
+	}
+	//fprintf(stderr, "all done\n");
+	all_done = 1;
+	while(thread_count > 0) {
+		ops->wake(wi, num_semids, wake_num);
+		usleep(200);
+	}
+	//printf("%d threads, waking %d at a time\n", num_threads, wake_num);
+	//printf("using %s\n", ops->name);
+	//printf("main thread burns: %d\n", burn_count);
+	//printf("worker burn count total %lu min %lu max %lu avg %lu\n",
+	//       total_burns, min_burns, max_burns, total_burns / num_threads);
+	printf("%d seconds: %lu worker burns per second\n",
+		(int)(now.tv_sec - start.tv_sec),
+		total_burns / (now.tv_sec - start.tv_sec));
+	ops->cleanup(num_semids);
+	return 0;
+}
+
diff --git a/tools/ktap/test/cli-arg.t b/tools/ktap/test/cli-arg.t
new file mode 100644
index 0000000..4bf3f6c
--- /dev/null
+++ b/tools/ktap/test/cli-arg.t
@@ -0,0 +1,25 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: sanity
+--- args: 1 testing "2 3 4"
+--- src
+printf("arg 0: %s\n", arg[0])
+printf("arg 1: %d\n", arg[1])
+printf("arg 2: %s\n", arg[2])
+printf("arg 3: %s\n", arg[2])
+
+--- out_like chop
+^arg 0: /tmp/\S+\.kp
+arg 1: 1
+arg 2: testing
+arg 3: testing$
+
+--- err
+
diff --git a/tools/ktap/test/concat.t b/tools/ktap/test/concat.t
new file mode 100644
index 0000000..12e33ee
--- /dev/null
+++ b/tools/ktap/test/concat.t
@@ -0,0 +1,21 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: string concat
+--- src
+var a = "123"
+var b = "456"
+
+print(a..b)
+
+--- out
+123456
+--- err
+
+
diff --git a/tools/ktap/test/count.t b/tools/ktap/test/count.t
new file mode 100644
index 0000000..972bf86
--- /dev/null
+++ b/tools/ktap/test/count.t
@@ -0,0 +1,25 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: count
+--- src
+var t = {}
+
+t["key"] += 1
+print(t["key"])
+
+t["key"] += 1
+print(t["key"])
+
+--- out
+1
+2
+--- err
+
+
diff --git a/tools/ktap/test/deadloop.t b/tools/ktap/test/deadloop.t
new file mode 100644
index 0000000..3fc4f97
--- /dev/null
+++ b/tools/ktap/test/deadloop.t
@@ -0,0 +1,37 @@
+# vi: ft= et ts=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: exit dead loop
+--- src
+tick-1s {
+	exit()
+}
+
+tick-3s {
+	print("dead loop not exited")
+}
+
+while (1) {}
+
+--- out_like
+error: loop execute count exceed max limit(.*)
+--- err
+
+
+
+=== TEST 2: dead loop killed by signal
+--- src
+
+while (1) {}
+
+--- out_like
+error: loop execute count exceed max limit(.*)
+
+--- err
+
diff --git a/tools/ktap/test/fibonacci.t b/tools/ktap/test/fibonacci.t
new file mode 100644
index 0000000..f92d244
--- /dev/null
+++ b/tools/ktap/test/fibonacci.t
@@ -0,0 +1,42 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: regular recursive fibonacci
+--- src
+function fib(n) {
+	if (n < 2) {
+		return n
+	}
+	return fib(n-1) + fib(n-2)
+}
+
+print(fib(20))
+--- out
+6765
+--- err
+
+
+
+=== TEST 2: tail recursive fibonacci
+--- src
+function fib(n) {
+	function f(iter, res, next) {
+		if (iter == 0) {
+			return res;
+		}
+		return f(iter-1, next, res+next)
+	}
+	return f(n, 0, 1)
+}
+
+print(fib(20))
+--- out
+6765
+--- err
+
diff --git a/tools/ktap/test/function.t b/tools/ktap/test/function.t
new file mode 100644
index 0000000..cd44ccb
--- /dev/null
+++ b/tools/ktap/test/function.t
@@ -0,0 +1,78 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: function
+--- src
+### basic function call ###
+function f1(a, b) {
+	return a + b
+}
+
+print(f1(2, 3))
+
+### return string ###
+function f2() {
+	return "function return"
+}
+
+print(f2())
+
+### closure testing ### 
+function f4() {
+	var f5 = function(a, b) {
+		return a * b
+	}
+	return f5
+}
+
+var f = f4()
+print(f(9, 9))
+
+### closure with lexical variable ###
+var i = 1
+function f6() {
+	i = 5
+	var f7 = function(a, b) {
+		return a * b + i
+	}
+	return f7
+}
+
+f = f6()
+print(f(9, 9))
+
+i = 6
+print(f(9, 9))
+
+### tail call
+### stack should not overflow in tail call mechanism
+var a = 0
+function f8(i) {
+	if (i == 1000000) {
+		a = 1000000
+		return
+	}
+	# must add return here, otherwise stack overflow
+	return f8(i+1)
+}
+
+f8(0)
+print(a)
+
+--- out
+5
+function return
+81
+86
+87
+1000000
+
+--- err
+
+
diff --git a/tools/ktap/test/if.t b/tools/ktap/test/if.t
new file mode 100644
index 0000000..05989f2
--- /dev/null
+++ b/tools/ktap/test/if.t
@@ -0,0 +1,32 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: test if
+--- src
+
+if (false) {
+	print("failed")
+}
+
+if (nil) {
+	print("failed")
+}
+
+# ktap treats only false and nil as false; the number 0 is true,
+# the same as in Lua.
+# This might change in the future to behave more like C.
+if (0) {
+	print("number 0 is true")
+}
+
+--- out
+number 0 is true
+--- err
+
+
diff --git a/tools/ktap/test/kprobe.t b/tools/ktap/test/kprobe.t
new file mode 100644
index 0000000..4ea342e
--- /dev/null
+++ b/tools/ktap/test/kprobe.t
@@ -0,0 +1,82 @@
+# vi: ft= et ts=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: kprobe
+--- opts: -q
+--- src
+
+var n = 0
+trace probe:schedule {
+	n = n + 1
+}
+
+# share same event id with previous one
+trace probe:schedule {
+}
+
+# test event filter
+trace probe:do_sys_open dfd=%di filename=%si flags=%dx mode=%cx /dfd==1/ { }
+
+tick-1s {
+	print(n==0)
+	exit()
+}
+--- out
+false
+--- err
+
+
+
+=== TEST 2: kretprobe
+--- opts: -q
+--- src
+var n = 0
+trace probe:__schedule%return {
+	n = n + 1
+}
+
+tick-1s {
+	print(n==0)
+	exit()
+}
+
+--- out
+false
+--- err
+
+
+=== TEST 3: can only be called in mainthread
+--- opts: -q
+--- src
+
+trace probe:schedule {
+	trace *:* {
+	}
+}
+
+--- out
+error: only mainthread can create function
+--- err
+
+
+=== TEST 4: can not be called in trace_end context
+--- opts: -q
+--- src
+
+trace_end {
+	trace *:* {
+	}
+}
+
+--- out
+error: kdebug.trace_by_id only can be called in RUNNING state
+--- err
+
+
+
diff --git a/tools/ktap/test/kretprobe.t b/tools/ktap/test/kretprobe.t
new file mode 100644
index 0000000..2ee76ec
--- /dev/null
+++ b/tools/ktap/test/kretprobe.t
@@ -0,0 +1,35 @@
+# vi: ft= et ts=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: kprobe
+--- opts: -q
+--- src
+
+var n = 0
+trace probe:schedule {
+	n = n + 1
+}
+
+# share same event id with previous one
+trace probe:schedule {
+}
+
+# test event filter
+trace probe:do_sys_open dfd=%di filename=%si flags=%dx mode=%cx /dfd==1/ { }
+
+tick-1s {
+	print(n==0)
+	exit()
+}
+
+--- out
+false
+--- err
+
+
diff --git a/tools/ktap/test/len.t b/tools/ktap/test/len.t
new file mode 100644
index 0000000..9de5253
--- /dev/null
+++ b/tools/ktap/test/len.t
@@ -0,0 +1,27 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: len
+--- src
+var a = "123456789"
+
+print(len(a))
+
+var b = {}
+b[0] = 0
+b[1] = 1
+b["keys"] = "values"
+
+print(len(b))
+
+--- out
+9
+3
+--- err
+
diff --git a/tools/ktap/test/lib/Test/ktap.pm b/tools/ktap/test/lib/Test/ktap.pm
new file mode 100644
index 0000000..94c551f
--- /dev/null
+++ b/tools/ktap/test/lib/Test/ktap.pm
@@ -0,0 +1,128 @@
+# Copyright (C) Yichun Zhang (agentzh)
+
+package Test::ktap;
+
+use Test::Base -Base;
+use POSIX ();
+use IPC::Run ();
+
+our @EXPORT = qw( run_tests );
+
+sub run_tests () {
+    for my $block (Test::Base::blocks()) {
+        run_test($block);
+    }
+}
+
+sub bail_out (@) {
+    Test::More::BAIL_OUT(@_);
+}
+
+sub parse_cmd ($) {
+    my $cmd = shift;
+    my @cmd;
+    while (1) {
+        if ($cmd =~ /\G\s*"(.*?)"/gmsc) {
+            push @cmd, $1;
+
+        } elsif ($cmd =~ /\G\s*'(.*?)'/gmsc) {
+            push @cmd, $1;
+
+        } elsif ($cmd =~ /\G\s*(\S+)/gmsc) {
+            push @cmd, $1;
+
+        } else {
+            last;
+        }
+    }
+    return @cmd;
+}
+
+sub run_test ($) {
+    my $block = shift;
+    my $name = $block->name;
+
+    my $timeout = $block->timeout() || 10;
+    my $opts = $block->opts;
+    my $args = $block->args;
+
+    my $cmd = "./ktap";
+
+    if (defined $opts) {
+        $cmd .= " $opts";
+    }
+
+    my $kpfile;
+    if (defined $block->src) {
+        $kpfile = POSIX::tmpnam() . ".kp";
+        open my $out, ">$kpfile" or
+            bail_out("cannot open $kpfile for writing: $!");
+        print $out ($block->src);
+        close $out;
+        $cmd .= " $kpfile"
+    }
+
+    if (defined $args) {
+        $cmd .= " $args";
+    }
+
+    #warn "CMD: $cmd\n";
+
+    my @cmd = parse_cmd($cmd);
+
+    my ($out, $err);
+
+    eval {
+        IPC::Run::run(\@cmd, \undef, \$out, \$err,
+                      IPC::Run::timeout($timeout));
+    };
+    if ($@) {
+        # timed out
+        if ($@ =~ /timeout/) {
+            if (!defined $block->expect_timeout) {
+                fail("$name: ktap process timed out");
+            }
+        } else {
+            fail("$name: failed to run command [$cmd]: $@");
+        }
+    }
+
+    my $ret = ($? >> 8);
+
+    if (defined $kpfile) {
+        unlink $kpfile;
+    }
+
+    if (defined $block->out) {
+        is $out, $block->out, "$name - stdout eq okay";
+    }
+
+    my $regex = $block->out_like;
+    if (defined $regex) {
+        if (!ref $regex) {
+            $regex = qr/$regex/ms;
+        }
+        like $out, $regex, "$name - stdout like okay";
+    }
+
+    if (defined $block->err) {
+        is $err, $block->err, "$name - stderr eq okay";
+    }
+
+    $regex = $block->err_like;
+    if (defined $regex) {
+        if (!ref $regex) {
+            $regex = qr/$regex/ms;
+        }
+        like $err, $regex, "$name - stderr like okay";
+    }
+
+    my $exp_ret = $block->ret;
+    if (!defined $exp_ret) {
+        $exp_ret = 0;
+    }
+    is $ret, $exp_ret, "$name - exit code okay";
+}
+
+1;
+# vi: et
diff --git a/tools/ktap/test/looping.t b/tools/ktap/test/looping.t
new file mode 100644
index 0000000..3f61118
--- /dev/null
+++ b/tools/ktap/test/looping.t
@@ -0,0 +1,46 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: looping
+--- src
+
+### basic while-loop testing
+var a = 1
+while (a < 1000) {
+	a = a + 1
+}
+
+print(a)
+
+### break testing
+### Note that ktap doesn't have a continue keyword
+var a = 1
+while (a < 1000) {
+	if (a == 10) {
+		break
+	}
+	a = a + 1
+}
+
+print(a)
+
+### for-loop testing
+var b = 0
+for (c = 0, 1000, 1) {
+	b = b + 1
+}
+
+print(b)
+
+--- out
+1000
+10
+1001
+--- err
+
diff --git a/tools/ktap/test/one-liner.t b/tools/ktap/test/one-liner.t
new file mode 100644
index 0000000..9998b1a
--- /dev/null
+++ b/tools/ktap/test/one-liner.t
@@ -0,0 +1,48 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: print
+--- args: -e 'print("one-liner testing")'
+--- out
+one-liner testing
+--- err
+
+
+
+=== TEST 2: exit
+--- args: -e 'exit() print("failed")'
+--- out
+--- err
+
+
+
+=== TEST 3: syscalls in "ls"
+--- args: -e 'trace syscalls:* { print(argstr) }' -- ls
+--- out_like
+sys_mprotect -> 0x0
+.*?
+sys_close\(fd: \d+\)
+--- err
+
+
+
+=== TEST 4: trace ktap syscalls
+--- args: -e 'trace syscalls:* { print(argstr) }' -- ./ktap -e 'print("trace ktap by self")'
+--- out_like
+sys_mprotect -> 0x0
+.*?
+sys_close\(fd: \d+\)
+--- err
+
+=== TEST 5: trace ktap function calls
+--- args: -q -e 'trace probe:kp_* {print(argstr)}' -- ./ktap samples/helloworld.kp
+--- out_like
+kp_vm_new_state: (.*)
+.*?
+
diff --git a/tools/ktap/test/pairs.t b/tools/ktap/test/pairs.t
new file mode 100644
index 0000000..bcf57cf
--- /dev/null
+++ b/tools/ktap/test/pairs.t
@@ -0,0 +1,52 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: pairs
+--- src
+
+var t = {}
+t[1] = 101
+t[2] = 102
+t[3] = 103
+t["key_1"] = "value_1"
+t["key_2"] = "value_2"
+t["key_3"] = "value_3"
+
+var n = 0
+
+for (k, v in pairs(t)) {
+	n = n + 1
+
+	if (k == 1 && v != 101) {
+		print("failed")
+	}
+	if (k == 2 && v != 102) {
+		print("failed")
+	}
+	if (k == 3 && v != 103) {
+		print("failed")
+	}
+	if (k == "key_1" && v != "value_1") {
+		print("failed")
+	}
+	if (k == "key_2" && v != "value_2") {
+		print("failed")
+	}
+	if (k == "key_3" && v != "value_3") {
+		print("failed")
+	}
+}
+
+if (n != len(t)) {
+	print("failed")
+}
+
+--- out
+--- err
+
diff --git a/tools/ktap/test/stack_overflow.t b/tools/ktap/test/stack_overflow.t
new file mode 100644
index 0000000..702d389
--- /dev/null
+++ b/tools/ktap/test/stack_overflow.t
@@ -0,0 +1,22 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: stack overflow
+--- src
+function f(a) {
+	        return 1 + f(a+1)
+}
+
+print(f(0))
+
+--- out_like
+(.*)stack overflow(.*)
+--- err
+
+
diff --git a/tools/ktap/test/syntax-err.t b/tools/ktap/test/syntax-err.t
new file mode 100644
index 0000000..b400c2f
--- /dev/null
+++ b/tools/ktap/test/syntax-err.t
@@ -0,0 +1,19 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: bad assignment (unexpected eof)
+--- src
+a =
+
+--- out
+--- err_like
+unexpected symbol near '<eof>'
+
+--- ret: 255
+
diff --git a/tools/ktap/test/table.t b/tools/ktap/test/table.t
new file mode 100644
index 0000000..f7c52d8
--- /dev/null
+++ b/tools/ktap/test/table.t
@@ -0,0 +1,81 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: table
+--- src
+
+### table testing ###
+var x = {}
+x[1] = "1"
+if (x[1] != "1") {
+	print("failed")
+}
+
+x[1] = 22222222222222222222222222222222222222222
+if (x[1] != 22222222222222222222222222222222222222222) {
+	print("failed")
+}
+
+x[1] = "jovi"
+if (x[1] != "jovi") {
+	print("failed")
+}
+
+x[11111111111111111111111111111111] = "jovi"
+if (x[11111111111111111111111111111111] != "jovi") {
+	print("failed")
+}
+
+x["jovi"] = 1
+if (x["jovi"] != 1) {
+	print("failed")
+}
+
+x["long string....................................."] = 1
+if (x["long string....................................."] != 1) {
+	print("failed")
+}
+
+# issue: subx must be declared first, otherwise the kernel will oops
+var subx = {}
+subx["test"] = "this is test"
+x["test"] = subx
+if (x["test"]["test"] != "this is test") {
+	print("failed")
+}
+
+var tbl = table.new(9999, 0)
+var i = 1
+while (i < 10000) {
+	tbl[i] = i
+	i = i + 1
+}
+
+var i = 1
+while (i < 10000) {
+	if (tbl[i] != i) {
+		print("failed")
+	}
+	i = i + 1
+}
+
+#### table initialization
+var days = {"Sunday", "Monday", "Tuesday", "Wednesday",
+		"Thursday", "Friday", "Saturday"}
+
+if (days[2] != "Monday") {
+	print("failed")
+}
+
+
+--- out
+--- err
+
+
+
diff --git a/tools/ktap/test/time.t b/tools/ktap/test/time.t
new file mode 100644
index 0000000..eb1d5fe
--- /dev/null
+++ b/tools/ktap/test/time.t
@@ -0,0 +1,59 @@
+# vi: ft= ts=4 sw=4 et
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+our $SecPattern = time();
+$SecPattern =~ s{(\d)\d$}{ my $a = $1; my $b = $a + 1; "[$a$b]\\d" }e;
+
+#warn $SecPattern;
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: gettimeofday_s
+--- src
+var begin = gettimeofday_s()
+printf("sec: %d\n", begin)
+printf("elapsed: %d\n", begin - gettimeofday_s())
+
+--- out_like eval
+qr/^sec: $::SecPattern
+elapsed: 0$/
+
+--- err
+
+
+
+=== TEST 2: gettimeofday_ms
+--- src
+printf("%d\n", gettimeofday_ms())
+
+--- out_like eval
+qr/^$::SecPattern\d{3}$/
+
+--- err
+
+
+
+=== TEST 3: gettimeofday_us
+--- src
+printf("%d", gettimeofday_us())
+
+--- out_like eval
+qr/^$::SecPattern\d{6}$/
+
+--- err
+
+
+
+=== TEST 4: gettimeofday_ns
+--- src
+printf("%d", gettimeofday_ns())
+
+--- out_like eval
+qr/^$::SecPattern\d{9}$/
+
+--- err
+
diff --git a/tools/ktap/test/timer.t b/tools/ktap/test/timer.t
new file mode 100644
index 0000000..2be2be2
--- /dev/null
+++ b/tools/ktap/test/timer.t
@@ -0,0 +1,65 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: timer
+--- opts: -q
+--- src
+
+var n1 = 0
+var n2 = 0
+
+tick-1s {
+	n1 = n1 + 1
+}
+
+tick-1s {
+	n2 = n2 + 1
+}
+
+tick-4s {
+	if (n1 == 0 || n2 == 0) {
+		print("failed")
+	}
+	exit()
+}
+
+--- out
+--- err
+
+
+=== TEST 2: cannot call timer.tick in trace_end context
+--- opts: -q
+--- src
+
+trace_end {
+	tick-1s {
+		print("error")
+	}
+}
+
+--- out
+error: timer.tick only can be called in RUNNING state
+--- err
+
+
+=== TEST 3: cannot call timer.profile in trace_end context
+--- opts: -q
+--- src
+
+trace_end {
+	profile-1s {
+		print("error")
+	}
+}
+
+--- out
+error: timer.profile only can be called in RUNNING state
+
+--- err
+
diff --git a/tools/ktap/test/tracepoint.t b/tools/ktap/test/tracepoint.t
new file mode 100644
index 0000000..f504da1
--- /dev/null
+++ b/tools/ktap/test/tracepoint.t
@@ -0,0 +1,53 @@
+# vi: ft= et ts=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: tracepoint
+--- opts: -q
+--- src
+
+var n = 0
+
+trace sched:* {
+	n = n + 1
+}
+
+tick-1s {
+	if (n == 0) {
+		print("failed")
+	}
+	exit()
+}
+
+--- out
+--- err
+
+
+=== TEST 2: enable all tracepoints in dry-run mode
+--- opts: -q -d
+--- src
+
+trace *:* {}
+
+--- out
+--- err
+--- expect_timeout
+--- timeout: 10
+
+
+=== TEST 3: test kdebug.tracepoint
+--- opts: -q
+--- src
+
+kdebug.tracepoint("sys_enter_open", function () {})
+tick-1s {
+	exit()
+}
+
+--- out
+--- err
diff --git a/tools/ktap/test/util/reindex b/tools/ktap/test/util/reindex
new file mode 100755
index 0000000..e4e1b4e
--- /dev/null
+++ b/tools/ktap/test/util/reindex
@@ -0,0 +1,61 @@
+#!/usr/bin/env perl
+
+# reindex
+# reindex .t files for Test::Base based test files
+# Copyright (C) Yichun Zhang (agentzh)
+
+use strict;
+use warnings;
+
+use Getopt::Std;
+
+my %opts;
+getopts('hb:', \%opts);
+if ($opts{h} or ! @ARGV) {
+    die "Usage: reindex [-b 0] t/*.t\n";
+}
+
+my $init = $opts{b};
+$init = 1 if not defined $init;
+
+my @files = map glob, @ARGV;
+for my $file (@files) {
+    next if -d $file or $file !~ /\.t_?$/;
+    reindex($file);
+}
+
+sub reindex {
+    my $file = $_[0];
+    open my $in, $file or
+        die "Can't open $file for reading: $!";
+    my @lines;
+    my $counter = $init;
+    my $changed;
+    while (<$in>) {
+        s/\r$//;
+        my $num;
+        s/ ^ === \s+ TEST \s+ (\d+)/$num=$1; "=== TEST " . $counter++/xie;
+        next if !defined $num;
+        if ($num != $counter-1) {
+            $changed++;
+        }
+    } continue {
+        push @lines, $_;
+    }
+    close $in;
+    my $text = join '', @lines;
+    $text =~ s/(?x) \n+ === \s+ TEST/\n\n\n\n=== TEST/ixsg;
+    $text =~ s/__(DATA|END)__\n+=== TEST/__${1}__\n\n=== TEST/;
+    #$text =~ s/\n+$/\n\n/s;
+    if (! $changed and $text eq join '', @lines) {
+        warn "reindex: $file:\tskipped.\n";
+        return;
+    }
+    open my $out, "> $file" or
+        die "Can't open $file for writing: $!";
+    binmode $out;
+    print $out $text;
+    close $out;
+
+    warn "reindex: $file:\tdone.\n";
+}
diff --git a/tools/ktap/test/zerodivide.t b/tools/ktap/test/zerodivide.t
new file mode 100644
index 0000000..daf1ff6
--- /dev/null
+++ b/tools/ktap/test/zerodivide.t
@@ -0,0 +1,21 @@
+# vi: ft= et tw=4 sw=4
+
+use lib 'test/lib';
+use Test::ktap 'no_plan';
+
+run_tests();
+
+__DATA__
+
+=== TEST 1: zero divide
+--- src
+
+var a = 1/0
+#should not go here
+print("failed")
+
+--- out_like
+(.*)divide 0(.*)
+--- err
+
+
-- 
1.8.1.4



* [PATCH v2 28/29] ktap: add vim syntax file(tools/ktap/vim/*)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (26 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 27/29] ktap: add testsuite and benchmark(tools/ktap/test/*) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-28 14:45 ` [PATCH v2 29/29] ktap: add COPYRIGHT file(tools/ktap/COPYRIGHT) Jovi Zhangwei
  2014-03-30  1:00 ` [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Andi Kleen
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

Add vim filetype detection and syntax highlighting for ktap scripts (*.kp).
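
A minimal sketch of installing these files for a single user. The helper
below is hypothetical and not part of this patch; it assumes vim loads
ftdetect/ and syntax/ from the given runtime directory (typically ~/.vim):

```shell
#!/bin/sh
# Hypothetical helper (not part of this patch): copy the ktap vim files
# from a source tree into a vim runtime directory.
#   $1: directory holding ftdetect/ and syntax/ (e.g. tools/ktap/vim)
#   $2: vim runtime directory (e.g. "$HOME/.vim")
install_ktap_vim() {
    src=$1
    dest=$2
    for sub in ftdetect syntax; do
        # Create the target subdirectory, then copy the matching file.
        mkdir -p "$dest/$sub" || return 1
        cp "$src/$sub/ktap.vim" "$dest/$sub/ktap.vim" || return 1
    done
}
```

For example, from the top of the kernel tree:
install_ktap_vim tools/ktap/vim "$HOME/.vim"
Files ending in .kp should then be detected as filetype ktap.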

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/vim/ftdetect/ktap.vim |   3 ++
 tools/ktap/vim/syntax/ktap.vim   | 106 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 109 insertions(+)
 create mode 100644 tools/ktap/vim/ftdetect/ktap.vim
 create mode 100644 tools/ktap/vim/syntax/ktap.vim

diff --git a/tools/ktap/vim/ftdetect/ktap.vim b/tools/ktap/vim/ftdetect/ktap.vim
new file mode 100644
index 0000000..439f54c
--- /dev/null
+++ b/tools/ktap/vim/ftdetect/ktap.vim
@@ -0,0 +1,3 @@
+augroup filetype
+  au BufNewFile,BufRead *.kp   set filetype=ktap
+augroup end
diff --git a/tools/ktap/vim/syntax/ktap.vim b/tools/ktap/vim/syntax/ktap.vim
new file mode 100644
index 0000000..a375b79
--- /dev/null
+++ b/tools/ktap/vim/syntax/ktap.vim
@@ -0,0 +1,106 @@
+" Vim syntax file
+" Language:     ktap
+" Maintainer:	Jovi Zhangwei <jovi.zhangwei@gmail.com>
+" First Author:	Jovi Zhangwei <jovi.zhangwei@gmail.com>
+" Last Change:	2013 Dec 19
+
+" For version 5.x: Clear all syntax items
+" For version 6.x: Quit when a syntax file was already loaded
+if version < 600
+  syn clear
+elseif exists("b:current_syntax")
+  finish
+endif
+
+setlocal iskeyword=@,48-57,_,$
+
+syn keyword ktapStatement break continue return
+syn keyword ktapRepeat while for in
+syn keyword ktapConditional if else elseif
+syn keyword ktapDeclaration trace trace_end
+syn keyword ktapIdentifier var
+syn keyword ktapFunction function
+syn match   ktapBraces "[{}\[\]]"
+syn match   ktapParens "[()]"
+syn keyword ktapReserved argstr probename arg0 arg1 arg2 arg3 arg4 arg5 arg6 arg7 arg8 arg9
+syn keyword ktapReserved cpu pid tid uid execname
+
+
+syn region ktapTraceDec start="\<trace\>"lc=5 end="{"me=s-1 contains=ktapString,ktapNumber
+syn region ktapTraceDec start="\<trace_end\>"lc=9 end="{"me=s-1 contains=ktapString,ktapNumber
+syn match ktapTrace contained "\<\w\+\>" containedin=ktapTraceDec
+
+syn region ktapFuncDec start="\<function\>"lc=8 end=":\|("me=s-1 contains=ktapString,ktapNumber
+syn match ktapFuncCall contained "\<\w\+\ze\(\s\|\n\)*("
+syn match ktapFunc contained "\<\w\+\>" containedin=ktapFuncDec,ktapFuncCall
+
+syn match ktapStat contained "@\<\w\+\ze\(\s\|\n\)*("
+
+" decimal number
+syn match ktapNumber "\<\d\+\>"
+" octal number
+syn match ktapNumber "\<0\o\+\>" contains=ktapOctalZero
+" Flag the first zero of an octal number as something special
+syn match ktapOctalZero contained "\<0"
+" flag an octal number with wrong digits
+syn match ktapOctalError "\<0\o*[89]\d*"
+" hex number
+syn match ktapNumber "\<0x\x\+\>"
+" numeric arguments
+syn match ktapNumber "\<\$\d\+\>"
+syn match ktapNumber "\<\$#"
+
+syn region ktapString oneline start=+"+ skip=+\\"+ end=+"+ 
+" string arguments
+syn match ktapString "@\d\+\>"
+syn match ktapString "@#"
+syn region ktapString2 matchgroup=ktapString start="\[\z(=*\)\[" end="\]\z1\]" contains=@Spell
+
+" syn keyword ktapTodo contained TODO FIXME XXX
+
+syn match ktapComment "#.*"
+
+" treat ^#! as special
+syn match ktapSharpBang "^#!.*"
+
+
+syn keyword ktapFunc printf print print_hist stack
+syn keyword ktapFunc gettimeofday_us
+syn keyword ktapFunc pairs
+
+
+" Define the default highlighting.
+" For version 5.7 and earlier: only when not done already
+" For version 5.8 and later: only when an item doesn't have highlighting yet
+if version >= 508 || !exists("did_ktap_syntax_inits")
+  if version < 508
+    let did_ktap_syntax_inits = 1
+    command -nargs=+ HiLink hi link <args>
+  else
+    command -nargs=+ HiLink hi def link <args>
+  endif
+
+  HiLink ktapNumber		Number
+  HiLink ktapOctalZero		PreProc " c.vim does it this way...
+  HiLink ktapOctalError		Error
+  HiLink ktapString		String
+  HiLink ktapString2		String
+  HiLink ktapTodo		Todo
+  HiLink ktapComment		Comment
+  HiLink ktapSharpBang		PreProc
+  HiLink ktapStatement		Statement
+  HiLink ktapConditional	Conditional
+  HiLink ktapRepeat		Repeat
+  HiLink ktapTrace		Function
+  HiLink ktapFunc		Function
+  HiLink ktapStat		Function
+  HiLink ktapFunction		Function
+  HiLink ktapBraces		Function
+  HiLink ktapDeclaration	Typedef
+  HiLink ktapIdentifier		Identifier
+  HiLink ktapReserved		Keyword
+
+  delcommand HiLink
+endif
+
+let b:current_syntax = "ktap"
-- 
1.8.1.4



* [PATCH v2 29/29] ktap: add COPYRIGHT file(tools/ktap/COPYRIGHT)
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (27 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 28/29] ktap: add vim syntax file(tools/ktap/vim/*) Jovi Zhangwei
@ 2014-03-28 14:45 ` Jovi Zhangwei
  2014-03-30  1:00 ` [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Andi Kleen
  29 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-28 14:45 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Greg Kroah-Hartman,
	Frederic Weisbecker, Andi Kleen, Jovi Zhangwei

ktap is based on luajit and lua, so carry their copyright notices
and MIT license in the ktap tree.

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
---
 tools/ktap/COPYRIGHT | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)
 create mode 100644 tools/ktap/COPYRIGHT

diff --git a/tools/ktap/COPYRIGHT b/tools/ktap/COPYRIGHT
new file mode 100644
index 0000000..9552521
--- /dev/null
+++ b/tools/ktap/COPYRIGHT
@@ -0,0 +1,63 @@
+
+Copyright (C) 2012-2014, Jovi Zhangwei <jovi.zhangwei@gmail.com>.
+All rights reserved.
+
+Licensed under the GPL License, Version 2.0
+
+===============================================================================
+
+* ktap code is based on luajit(compiler & bytecode), so carry luajit
+  copyright notices in below. 
+
+LuaJIT -- a Just-In-Time Compiler for Lua. http://luajit.org/
+
+Copyright (C) 2005-2014 Mike Pall.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
+
+[ MIT license: http://www.opensource.org/licenses/mit-license.php ]
+
+===============================================================================
+
+* Some ktap code is based on lua programming language initially,
+  so carry lua own copyright notices and license terms:
+  (lua's MIT license is compatible with GPL.
+   ktap can redistribute as GPL v2, without violate with lua license,
+   this was confirmed with official lua team)
+
+Copyright (C) 1994~2013 Lua.org, PUC-Rio.
+ 
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+
-- 
1.8.1.4



* Re: [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux
  2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
                   ` (28 preceding siblings ...)
  2014-03-28 14:45 ` [PATCH v2 29/29] ktap: add COPYRIGHT file(tools/ktap/COPYRIGHT) Jovi Zhangwei
@ 2014-03-30  1:00 ` Andi Kleen
  2014-03-30  9:18   ` Jovi Zhangwei
  29 siblings, 1 reply; 46+ messages in thread
From: Andi Kleen @ 2014-03-30  1:00 UTC (permalink / raw)
  To: Jovi Zhangwei
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker, Andi Kleen


For now I would suggest concentrating on the kernel ring 0 parts only.
Split the user space part into a separate patchkit that is posted 
on a separate schedule.

It's hard to make progress with too large patchkits.

-Andi



* Re: [PATCH v2 08/29] ktap: add bytecode reader(kernel/trace/ktap/kp_bcread.[c|h])
  2014-03-28 14:45 ` [PATCH v2 08/29] ktap: add bytecode reader(kernel/trace/ktap/kp_bcread.[c|h]) Jovi Zhangwei
@ 2014-03-30  2:47   ` Andi Kleen
  2014-03-30  8:02     ` Jovi Zhangwei
  0 siblings, 1 reply; 46+ messages in thread
From: Andi Kleen @ 2014-03-30  2:47 UTC (permalink / raw)
  To: Jovi Zhangwei
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker, Andi Kleen

> +/* Read debug info of a prototype. */
> +static void bcread_dbg(BCReadCtx *ctx, ktap_proto_t *pt, int sizedbg)
> +{
> +	void *lineinfo = (void *)proto_lineinfo(pt);
> +
> +	bcread_block(ctx, lineinfo, sizedbg);
> +	/* Swap lineinfo if the endianess differs. */


Why does this care about endianness? Can't that be handled in user
space? And why would user space generate a different endianness than
the host's?

> +	for (i = 0; i < sizekgc; i++, kr++) {
> +		int tp = bcread_uint32(ctx);
> +		if (tp >= BCDUMP_KGC_STR) {

The signedness handling all over this file is scary.
What happens if the user puts in negative values or near-overflow
values?

Most likely a lot of these checks should be unsigned
and need to be audited again (and ideally fuzzed too).

> +
> +	/* Allocate prototype object and initialize its fields. */
> +	pt = (ktap_proto_t *)kp_obj_new(ctx->ks, (int)sizept);

Error check?

Lots of other similar cases.


-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH v2 10/29] ktap: add string handling code(kernel/trace/ktap/kp_[str|mempool].[c|h])
  2014-03-28 14:45 ` [PATCH v2 10/29] ktap: add string handling code(kernel/trace/ktap/kp_[str|mempool].[c|h]) Jovi Zhangwei
@ 2014-03-30  3:50   ` Andi Kleen
  2014-03-30  9:12     ` Jovi Zhangwei
  0 siblings, 1 reply; 46+ messages in thread
From: Andi Kleen @ 2014-03-30  3:50 UTC (permalink / raw)
  To: Jovi Zhangwei
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker, Andi Kleen


It's not clear to me why a kernel script language needs
all that complicated string interning code.

What kind of scripts would create so many strings that
it would be worth it?

I think it would be better to replace it with a really
simple non-interning dynamic string type.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH v2 12/29] ktap: add generic object handling code(kernel/trace/ktap/kp_obj.[c|h])
  2014-03-28 14:45 ` [PATCH v2 12/29] ktap: add generic object handling code(kernel/trace/ktap/kp_obj.[c|h]) Jovi Zhangwei
@ 2014-03-30  3:56   ` Andi Kleen
  2014-03-30  8:14     ` Jovi Zhangwei
  0 siblings, 1 reply; 46+ messages in thread
From: Andi Kleen @ 2014-03-30  3:56 UTC (permalink / raw)
  To: Jovi Zhangwei
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker, Andi Kleen

> + * You should have received a copy of the GNU General Public License along with
> + * this program; if not, write to the Free Software Foundation, Inc.,
> + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.

We're not supposed to use the address anymore.

> +/* memory allocation flag */
> +#define KTAP_ALLOC_FLAGS ((GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN) \
> +			 & ~__GFP_WAIT)
> +
> +void *kp_malloc(ktap_state_t *ks, int size)
> +{
> +	void *addr;
> +
> +	addr = kmalloc(size, KTAP_ALLOC_FLAGS);
> +	if (unlikely(!addr)) {
> +		kp_error(ks, "kmalloc failed\n");
> +	}
> +	return addr;

Please remove this pointless wrapper. Similar for the functions below.
Just use kmalloc etc. directly.

> +	case KTAP_TNUM:
> +		kp_printf(ks, "NUM %ld", nvalue(v));

Similar here. That's all printk


-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH v2 13/29] ktap: add ring buffer handling code(kernel/trace/ktap/kp_transport.[c|h])
  2014-03-28 14:45 ` [PATCH v2 13/29] ktap: add ring buffer handling code(kernel/trace/ktap/kp_transport.[c|h]) Jovi Zhangwei
@ 2014-03-30  3:58   ` Andi Kleen
  2014-03-30  7:40     ` Jovi Zhangwei
  0 siblings, 1 reply; 46+ messages in thread
From: Andi Kleen @ 2014-03-30  3:58 UTC (permalink / raw)
  To: Jovi Zhangwei
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker, Andi Kleen

> 
> A lot of code in this file is duplicated with kernel trace_output.c.

Please modify trace_output instead to avoid this.

-Andi



* Re: [PATCH v2 13/29] ktap: add ring buffer handling code(kernel/trace/ktap/kp_transport.[c|h])
  2014-03-30  3:58   ` Andi Kleen
@ 2014-03-30  7:40     ` Jovi Zhangwei
  0 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-30  7:40 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ingo Molnar, Steven Rostedt, LKML, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker

On Sun, Mar 30, 2014 at 11:58 AM, Andi Kleen <andi@firstfloor.org> wrote:
>>
>> A lot of code in this file is duplicated with kernel trace_output.c.
>
> Please modify trace_output instead to avoid this.
>
Yeah, the ktap transport functionality is based on the ftrace ring
buffer, and the buffer is read through the trace pipe.

I will try to find a way to reuse the ftrace trace pipe reading
code as much as possible.

Thanks.

Jovi


* Re: [PATCH v2 08/29] ktap: add bytecode reader(kernel/trace/ktap/kp_bcread.[c|h])
  2014-03-30  2:47   ` Andi Kleen
@ 2014-03-30  8:02     ` Jovi Zhangwei
  2014-03-30 17:17       ` Andi Kleen
  0 siblings, 1 reply; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-30  8:02 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ingo Molnar, Steven Rostedt, LKML, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker

On Sun, Mar 30, 2014 at 10:47 AM, Andi Kleen <andi@firstfloor.org> wrote:
>> +/* Read debug info of a prototype. */
>> +static void bcread_dbg(BCReadCtx *ctx, ktap_proto_t *pt, int sizedbg)
>> +{
>> +     void *lineinfo = (void *)proto_lineinfo(pt);
>> +
>> +     bcread_block(ctx, lineinfo, sizedbg);
>> +     /* Swap lineinfo if the endianess differs. */
>
>
> Why does this care about endianness? Can't that be handled in user
> space? And why would user space generate a different endianness than
> the host's?
>
That was originally designed for portability: bytecode can be run
everywhere without compiling the script file, which helps when
compilation is a heavy task for some embedded platforms, even though
ktap compilation is extremely fast.

I wonder whether there will be a bytecode portability requirement in
the future?

>> +     for (i = 0; i < sizekgc; i++, kr++) {
>> +             int tp = bcread_uint32(ctx);
>> +             if (tp >= BCDUMP_KGC_STR) {
>
> The signedness handling all over this file is scary.
> What happens if the user puts in negative values or near-overflow
> values?
>
> Most likely a lot of these checks should be unsigned
> and need to be audited again (and ideally fuzzed too).
>
>> +
>> +     /* Allocate prototype object and initialize its fields. */
>> +     pt = (ktap_proto_t *)kp_obj_new(ctx->ks, (int)sizept);
>
> Error check?
>
> Lots of other similar cases.
>
I will add more checks to kp_bcread.c and fix this in the next version.
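
As an illustration of the kind of check discussed above, here is a
minimal userspace sketch of reading a count from untrusted bytecode
with unsigned types and explicit bounds checks. The names (struct
reader, read_u32, read_count) and the limit parameter are assumptions
made for this example; they are not taken from kp_bcread.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader state; not the BCReadCtx from the patch. */
struct reader {
	const uint8_t *p;	/* current read position */
	const uint8_t *end;	/* one past the last valid byte */
};

/* Read a 32-bit value in host byte order; 0 on success, -1 if truncated. */
static int read_u32(struct reader *r, uint32_t *out)
{
	if ((size_t)(r->end - r->p) < sizeof(*out))
		return -1;
	memcpy(out, r->p, sizeof(*out));
	r->p += sizeof(*out);
	return 0;
}

/* Read an element count and validate it against a caller-supplied limit. */
static int read_count(struct reader *r, uint32_t limit, uint32_t *count)
{
	uint32_t n;

	if (read_u32(r, &n) < 0)
		return -1;
	if (n > limit)	/* unsigned compare: 0xffffffff cannot pass as -1 */
		return -1;
	*count = n;
	return 0;
}
```

Keeping the value unsigned from the read through to the comparison
means a hostile 0xffffffff is rejected as too large rather than being
reinterpreted as a negative number that slips past a signed check.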

Thanks.

Jovi


* Re: [PATCH v2 12/29] ktap: add generic object handling code(kernel/trace/ktap/kp_obj.[c|h])
  2014-03-30  3:56   ` Andi Kleen
@ 2014-03-30  8:14     ` Jovi Zhangwei
  0 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-30  8:14 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ingo Molnar, Steven Rostedt, LKML, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker

On Sun, Mar 30, 2014 at 11:56 AM, Andi Kleen <andi@firstfloor.org> wrote:
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program; if not, write to the Free Software Foundation, Inc.,
>> + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
>
> We're not supposed to use the address anymore.
>
I will update it.

>> +/* memory allocation flag */
>> +#define KTAP_ALLOC_FLAGS ((GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN) \
>> +                      & ~__GFP_WAIT)
>> +
>> +void *kp_malloc(ktap_state_t *ks, int size)
>> +{
>> +     void *addr;
>> +
>> +     addr = kmalloc(size, KTAP_ALLOC_FLAGS);
>> +     if (unlikely(!addr)) {
>> +             kp_error(ks, "kmalloc failed\n");
>> +     }
>> +     return addr;
>
> Please remove this pointless wrapper. Similar for the functions below.
> Just use kmalloc etc. directly.
>
Reasonable, that saves an extra function call.

>> +     case KTAP_TNUM:
>> +             kp_printf(ks, "NUM %ld", nvalue(v));
>
> Similar here. That's all printk
>
Hmm, kp_printf is not printk; there is no printk in ktap. All output
is dumped to the ring buffer, because we cannot use printk in probe
context.

And we allow multiple ktap instances to run at the same time, so each
ktap instance has its own ring buffer; that's why we have to pass the
context variable "ks" into kp_printf.
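
To make the per-instance point concrete, here is a small userspace
sketch of a printf-style helper that writes into a buffer owned by the
instance passed in. It is only an analogy: struct inst and inst_printf
are hypothetical names, and a flat char array stands in for the ftrace
ring buffer that ktap actually uses.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-instance state; stands in for ktap_state_t. */
struct inst {
	char buf[256];	/* stand-in for this instance's ring buffer */
	size_t used;	/* bytes written so far */
};

/* Append formatted output to the buffer owned by this instance only. */
static int inst_printf(struct inst *ks, const char *fmt, ...)
{
	size_t room = sizeof(ks->buf) - ks->used;
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(ks->buf + ks->used, room, fmt, ap);
	va_end(ap);

	if (n < 0 || (size_t)n >= room) {
		ks->buf[ks->used] = '\0';	/* drop output that does not fit */
		return -1;
	}
	ks->used += (size_t)n;
	return n;
}
```

Two instances calling inst_printf never see each other's output,
which is the property a global printk-style sink cannot provide.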

Thanks.

Jovi


* Re: [PATCH v2 10/29] ktap: add string handling code(kernel/trace/ktap/kp_[str|mempool].[c|h])
  2014-03-30  3:50   ` Andi Kleen
@ 2014-03-30  9:12     ` Jovi Zhangwei
  2014-03-30 17:19       ` Andi Kleen
  0 siblings, 1 reply; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-30  9:12 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ingo Molnar, Steven Rostedt, LKML, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker

On Sun, Mar 30, 2014 at 11:50 AM, Andi Kleen <andi@firstfloor.org> wrote:
>
> It's not clear to me why a kernel script language needs
> all that complicated string interning code.
>
> What kind of scripts would create so many strings that
> it would be worth it?
>
> I think it would be better to replace it with a really
> simple non-interning dynamic string type.
>
Basically I think string interning is very useful in ktap, and
the implementation is not complicated (the kp_str_new function
is very simple).

String interning makes string comparison and table indexing
extremely fast: just pointer equality, no strcmp. Table indexing
is heavily used in dynamic tracing tools such as ktap/stap/dtrace.

With string interning there is no need to copy the whole string
each time a string key is used in an associative array (table)
(stap/dtrace need to copy it), and no need to compute the
string hash every time a string is used as a table key.
(Things become easier if we need to support multi-key
tables; ktap doesn't need to pre-allocate strings in the table.)
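
The idea can be sketched in a few lines of userspace C. This is not
the kp_str_new implementation; struct istr, intern, istr_eq, the fixed
bucket count, and the FNV-1a hash are all assumptions chosen to keep
the example short.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64	/* illustrative size, fixed for simplicity */

struct istr {
	struct istr *next;	/* hash chain */
	uint32_t hash;		/* computed once, at intern time */
	char data[];		/* NUL-terminated contents */
};

static struct istr *buckets[NBUCKETS];

/* FNV-1a, used here only as an example hash function. */
static uint32_t hash_str(const char *s)
{
	uint32_t h = 2166136261u;

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/* Return the unique istr for s, creating it on first use; NULL on OOM. */
static const struct istr *intern(const char *s)
{
	uint32_t h = hash_str(s);
	struct istr **head = &buckets[h % NBUCKETS];
	size_t len = strlen(s);
	struct istr *e;

	for (e = *head; e; e = e->next)
		if (e->hash == h && strcmp(e->data, s) == 0)
			return e;	/* already interned */

	e = malloc(sizeof(*e) + len + 1);
	if (!e)
		return NULL;
	e->hash = h;
	memcpy(e->data, s, len + 1);
	e->next = *head;
	*head = e;
	return e;
}

/* Interned strings are equal exactly when their pointers are equal. */
static int istr_eq(const struct istr *a, const struct istr *b)
{
	return a == b;
}
```

The hash and the copy are paid once per distinct string; after that, a
table keyed by interned strings can reuse the stored hash and compare
keys with a single pointer comparison.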

See test/benchmark/cmp_table.sh: that script compares
table operations between ktap and stap. The result is very
encouraging: ktap table operation overhead is quite a bit lower
than stap's, especially when using constant string keys.

But I partly agree with you, because in some cases we don't
want/need to intern all strings, for example:
    trace xxx:yyy {
        var str = cast("char *", arg1)
        print(str)
    }

In the above case, arg1 is a long kernel string and there is no
table insert, so there is definitely no need to intern it; we need
to add KTAP_TRAWSTR to represent these values.

The simple design of ktap makes it very flexible in supporting
different kinds of value types. :)

Thanks.

Jovi


* Re: [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux
  2014-03-30  1:00 ` [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Andi Kleen
@ 2014-03-30  9:18   ` Jovi Zhangwei
  0 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-30  9:18 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ingo Molnar, Steven Rostedt, LKML, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker

On Sun, Mar 30, 2014 at 9:00 AM, Andi Kleen <andi@firstfloor.org> wrote:
>
> For now I would suggest concentrating on the kernel ring 0 parts only.
> Split the user space part into a separate patchkit that is posted
> on a separate schedule.
>
> It's hard to make progress with too large patchkits.
>
Agreed, we can focus only on the kernel module for now; the userspace
part is much simpler, just a one-pass compiler.

I will send only the kernel module part in the next version.

Thanks for this suggestion.

Jovi


* Re: [PATCH v2 08/29] ktap: add bytecode reader(kernel/trace/ktap/kp_bcread.[c|h])
  2014-03-30  8:02     ` Jovi Zhangwei
@ 2014-03-30 17:17       ` Andi Kleen
  2014-03-31  2:05         ` Jovi Zhangwei
  0 siblings, 1 reply; 46+ messages in thread
From: Andi Kleen @ 2014-03-30 17:17 UTC (permalink / raw)
  To: Jovi Zhangwei
  Cc: Andi Kleen, Ingo Molnar, Steven Rostedt, LKML, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker

> That was originally designed for portability: bytecode can be run
> everywhere without compiling the script file, which helps when
> compilation is a heavy task for some embedded platforms, even though
> ktap compilation is extremely fast.
>
> I wonder whether there will be a bytecode portability requirement in
> the future?

Portability should be on the source level.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH v2 10/29] ktap: add string handling code(kernel/trace/ktap/kp_[str|mempool].[c|h])
  2014-03-30  9:12     ` Jovi Zhangwei
@ 2014-03-30 17:19       ` Andi Kleen
  2014-03-31  2:35         ` Jovi Zhangwei
  0 siblings, 1 reply; 46+ messages in thread
From: Andi Kleen @ 2014-03-30 17:19 UTC (permalink / raw)
  To: Jovi Zhangwei
  Cc: Andi Kleen, Ingo Molnar, Steven Rostedt, LKML, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker

> See test/benchmark/cmp_table.sh: that script compares

Is that a realistic tracing scenario?

> table operations between ktap and stap. The result is very
> encouraging: ktap table operation overhead is quite a bit lower
> than stap's, especially when using constant string keys.

Ok fair enough.

> 
> But I partly agree with you, because in some cases we don't
> want/need to intern all strings, for example:
>     trace xxx:yyy {
>         var str = cast("char *", arg1)
>         print(str)
>     }
>
> In the above case, arg1 is a long kernel string and there is no
> table insert, so there is definitely no need to intern it; we need
> to add KTAP_TRAWSTR to represent these values.

Please don't make it more complicated. If there's a good rationale
for interning it's ok to use it always.

It would be better to find ways to simplify things.

-Andi


* Re: [PATCH v2 08/29] ktap: add bytecode reader(kernel/trace/ktap/kp_bcread.[c|h])
  2014-03-30 17:17       ` Andi Kleen
@ 2014-03-31  2:05         ` Jovi Zhangwei
  0 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-31  2:05 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ingo Molnar, Steven Rostedt, LKML, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker

On Mon, Mar 31, 2014 at 1:17 AM, Andi Kleen <andi@firstfloor.org> wrote:
>> That was originally designed for portability: bytecode can be run
>> everywhere without compiling the script file, which helps when
>> compilation is a heavy task for some embedded platforms, even though
>> ktap compilation is extremely fast.
>>
>> I wonder whether there will be a bytecode portability requirement in
>> the future?
>
> Portability should be on the source level.
>
>
That's fine, I will add an endianness flag to the bytecode header; the
kernel will reject loading if the endianness doesn't match.
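
A minimal sketch of such a check, assuming a one-bit flag in the
header, might look like the following. The header layout, the
HDR_FLAG_BE bit, and the function names are hypothetical; the actual
ktap bytecode header is not shown in this thread.

```c
#include <stdint.h>

#define HDR_FLAG_BE 0x01	/* hypothetical flag: bytecode is big-endian */

/* Hypothetical header layout, for illustration only. */
struct bc_header {
	uint8_t magic[3];
	uint8_t version;
	uint8_t flags;
};

/* Detect host byte order at run time. */
static int host_is_big_endian(void)
{
	const uint16_t probe = 1;

	return *(const uint8_t *)&probe == 0;
}

/* Return 0 if the bytecode matches host byte order, -1 to reject it. */
static int check_endianness(const struct bc_header *hdr)
{
	int bc_is_be = (hdr->flags & HDR_FLAG_BE) != 0;

	return bc_is_be == host_is_big_endian() ? 0 : -1;
}
```

Rejecting a mismatch at load time removes the need for any byte
swapping in the reader, which keeps the in-kernel parser smaller.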

Thanks.

Jovi


* Re: [PATCH v2 16/29] ktap: add amalgamation build(kernel/trace/ktap/amalg.c)
  2014-03-28 14:45 ` [PATCH v2 16/29] ktap: add amalgamation build(kernel/trace/ktap/amalg.c) Jovi Zhangwei
@ 2014-03-31  2:17   ` Li Zefan
  2014-03-31  3:22     ` Jovi Zhangwei
  0 siblings, 1 reply; 46+ messages in thread
From: Li Zefan @ 2014-03-31  2:17 UTC (permalink / raw)
  To: Jovi Zhangwei
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker, Andi Kleen

On 2014/3/28 22:45, Jovi Zhangwei wrote:
> This compiles the ktapvm as one huge C file and allows
> GCC to generate faster and shorter code.
> 
> No amalgamation build in x86_64:
> ktapvm.ko: 3.1M
> 
> amalgamation build in x86_64:
> ktapvm.ko: 1.1M
> 
> User can set use amalgamation build or not in Makefile.
> 
> (Need to analyze further why have so big differences)
> 

Let's drop this patch for now to make the patchset smaller?

> Signed-off-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
> ---
>  kernel/trace/ktap/amalg.c | 37 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
>  create mode 100644 kernel/trace/ktap/amalg.c
> 
> diff --git a/kernel/trace/ktap/amalg.c b/kernel/trace/ktap/amalg.c
> new file mode 100644
> index 0000000..9935ccf
> --- /dev/null
> +++ b/kernel/trace/ktap/amalg.c
> @@ -0,0 +1,37 @@
> +/*
> + * amalg.c - ktapvm kernel module amalgamation.
> + *
> + * This file is part of ktap by Jovi Zhangwei.
> + *
> + * Copyright (C) 2012-2014 Jovi Zhangwei <jovi.zhangwei@gmail.com>.
> + *
> + * ktap is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * ktap is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; if not, write to the Free Software Foundation, Inc.,
> + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
> + */
> +
> +#include "ktap.c"
> +#include "kp_obj.c"
> +#include "kp_bcread.c"
> +#include "kp_str.c"
> +#include "kp_mempool.c"
> +#include "kp_tab.c"
> +#include "kp_transport.c"
> +#include "kp_vm.c"
> +#include "kp_events.c"
> +#include "lib_base.c"
> +#include "lib_ansi.c"
> +#include "lib_kdebug.c"
> +#include "lib_timer.c"
> +#include "lib_table.c"
> +#include "lib_net.c"
> +
> 



* Re: [PATCH v2 10/29] ktap: add string handling code(kernel/trace/ktap/kp_[str|mempool].[c|h])
  2014-03-30 17:19       ` Andi Kleen
@ 2014-03-31  2:35         ` Jovi Zhangwei
  0 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-31  2:35 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ingo Molnar, Steven Rostedt, LKML, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker

On Mon, Mar 31, 2014 at 1:19 AM, Andi Kleen <andi@firstfloor.org> wrote:
>> See test/benchmark/cmp_table.sh: that script compares
>
> Is that a realistic tracing scenario?
>
Yes, it's quite common to use string keys in dynamic tracing tools;
for example, see samples/userspace/glibc_func_hist.kp:

    var s = {}

    trace probe:/lib64/libc.so.6:* {
        s[probename] += 1
    }

    trace_end {
        print_hist(s)
    }


Result:

Tracing... Hit Ctrl-C to end.
^C
                         value ------------- Distribution ------------- count
                  _IO_sputbackc |@@                                    108344
             __GI__IO_sputbackc |@@                                    107768
             _IO_default_xsputn |                                      46639
        __GI__IO_default_xsputn |                                      46624
                           free |                                      36871
                    __libc_free |                                      36841
                          cfree |                                      36841
                         __free |                                      36811
                        __cfree |                                      36811
               __GI___libc_free |                                      36804
        ____strtoull_l_internal |                                      28670
    __GI_____strtoul_l_internal |                                      28670
   __GI_____strtoull_l_internal |                                      28518
         ____strtoul_l_internal |                                      28518
                      strchrnul |                                      27763
                    __strchrnul |                                      27741
                       _IO_putc |                                      27589
                  __GI__IO_putc |                                      27589
                           putc |                                      27589
                            ... |

The above script outputs a histogram of glibc function calls, so you
can see which functions are called most frequently; a very useful script.

'probename' returns the probe name string, which is then inserted into
the table as a key. The magic of the above script is that there is no
string copy and no string hashing in probe context, because the
probename string is interned.


>> table operations between ktap and stap. The result is very
>> encouraging: ktap table operation overhead is quite a bit lower
>> than stap's, especially when using constant string keys.
>
> Ok fair enough.
>
>>
>> But I partly agree with you, because in some cases we don't
>> want/need to intern all strings, for example:
>>     trace xxx:yyy {
>>         var str = cast("char *", arg1)
>>         print(str)
>>     }
>>
>> In the above case, arg1 is a long kernel string and there is no
>> table insert, so there is definitely no need to intern it; we need
>> to add KTAP_TRAWSTR to represent these values.
>
> Please don't make it more complicated. If there's a good rationale
> for interning it's ok to use it always.
>
> It would be better to find ways to simplify things.
>
Definitely, the reason I implemented ktap based on lua is lua's
simplicity and efficiency.

The whole bytecode design and value type system is very simple; it can
provide a completely safe sandbox for kernel scripting, and is easy to
extend to fulfill our needs (like multi-key tables and aggregation).

Thanks.

Jovi


* Re: [PATCH v2 16/29] ktap: add amalgamation build(kernel/trace/ktap/amalg.c)
  2014-03-31  2:17   ` Li Zefan
@ 2014-03-31  3:22     ` Jovi Zhangwei
  0 siblings, 0 replies; 46+ messages in thread
From: Jovi Zhangwei @ 2014-03-31  3:22 UTC (permalink / raw)
  To: Li Zefan
  Cc: Ingo Molnar, Steven Rostedt, LKML, Masami Hiramatsu,
	Greg Kroah-Hartman, Frederic Weisbecker, Andi Kleen

On Mon, Mar 31, 2014 at 10:17 AM, Li Zefan <lizefan@huawei.com> wrote:
> On 2014/3/28 22:45, Jovi Zhangwei wrote:
>> This compiles the ktapvm as one huge C file and allows
>> GCC to generate faster and shorter code.
>>
>> No amalgamation build in x86_64:
>> ktapvm.ko: 3.1M
>>
>> amalgamation build in x86_64:
>> ktapvm.ko: 1.1M
>>
>> User can set use amalgamation build or not in Makefile.
>>
>> (Need to analyze further why have so big differences)
>>
>
> Let's drop this patch for now to make the patchset smaller?
>
Sure, I will analyze the size difference and find the root cause offline.

Thanks for this suggestion.

Jovi.


end of thread, last message ~2014-03-31  3:22 UTC

Thread overview: 46+ messages
2014-03-28 14:44 [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Jovi Zhangwei
2014-03-28 14:44 ` [PATCH v2 01/29] ktap: add tools/ktap/README.md file Jovi Zhangwei
2014-03-28 14:44 ` [PATCH v2 02/29] ktap: add ktap tutorial(tools/ktap/doc/tutorial.md) Jovi Zhangwei
2014-03-28 14:44 ` [PATCH v2 03/29] ktap: add sample scripts(tools/ktap/samples/*) Jovi Zhangwei
2014-03-28 14:44 ` [PATCH v2 04/29] ktap: add basic ktap types definition(include/uapi/ktap/ktap_types.h) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 05/29] ktap: add bytecode definition(include/uapi/ktap/ktap_bc.h) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 06/29] ktap: add ktap_arch.h and error header file(include/uapi/ktap/) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 07/29] ktap: add kernel module main entry(kernel/trace/ktap/ktap.[c|h]) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 08/29] ktap: add bytecode reader(kernel/trace/ktap/kp_bcread.[c|h]) Jovi Zhangwei
2014-03-30  2:47   ` Andi Kleen
2014-03-30  8:02     ` Jovi Zhangwei
2014-03-30 17:17       ` Andi Kleen
2014-03-31  2:05         ` Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 09/29] ktap: add bytecode execution engine(kernel/trace/ktap/kp_vm.[c|h]) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 10/29] ktap: add string handling code(kernel/trace/ktap/kp_[str|mempool].[c|h]) Jovi Zhangwei
2014-03-30  3:50   ` Andi Kleen
2014-03-30  9:12     ` Jovi Zhangwei
2014-03-30 17:19       ` Andi Kleen
2014-03-31  2:35         ` Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 11/29] ktap: add table handling code(kernel/trace/ktap/kp_tab.[c|h]) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 12/29] ktap: add generic object handling code(kernel/trace/ktap/kp_obj.[c|h]) Jovi Zhangwei
2014-03-30  3:56   ` Andi Kleen
2014-03-30  8:14     ` Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 13/29] ktap: add ring buffer handling code(kernel/trace/ktap/kp_transport.[c|h]) Jovi Zhangwei
2014-03-30  3:58   ` Andi Kleen
2014-03-30  7:40     ` Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 14/29] ktap: add events management(kernel/trace/ktap/kp_events.[c|h]) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 15/29] ktap: add built-in functions and library(kernel/trace/ktap/lib_*.c) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 16/29] ktap: add amalgamation build(kernel/trace/ktap/amalg.c) Jovi Zhangwei
2014-03-31  2:17   ` Li Zefan
2014-03-31  3:22     ` Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 17/29] ktap: add Makefile for kernel module(kernel/trace/ktap/Makefile) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 18/29] ktap: add Kconfig(kernel/trace/ktap/Kconfig) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 19/29] ktap: add main file for ktap binary(tools/ktap/kp_main.c) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 20/29] ktap: add compiler(tools/ktap/kp_[lex|parse].[c|h]) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 21/29] ktap: add symbol handling code(tools/ktap/symbol.[c|h]) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 22/29] ktap: add events parse code(tools/ktap/kp_parse_events.c) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 23/29] ktap: add ring buffer reader(tools/ktap/kp_reader.c) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 24/29] ktap: add bytecode writer(tools/ktap/kp_bcwrite.c) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 25/29] ktap: add userspace util(tools/ktap/kp_util.c) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 26/29] ktap: add userspace binary Makefile(tools/ktap/Makefile) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 27/29] ktap: add testsuite and benchmark(tools/ktap/test/*) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 28/29] ktap: add vim syntax file(tools/ktap/vim/*) Jovi Zhangwei
2014-03-28 14:45 ` [PATCH v2 29/29] ktap: add COPYRIGHT file(tools/ktap/COPYRIGHT) Jovi Zhangwei
2014-03-30  1:00 ` [RFC PATCH v2 00/29] ktap: A lightweight dynamic tracing tool for Linux Andi Kleen
2014-03-30  9:18   ` Jovi Zhangwei
