* [RFC PATCH 00/11] Early kprobe: enable kprobes very early
@ 2015-01-07  7:34 Wang Nan
  2015-01-07  7:35 ` [RFC PATCH 01/11] ARM: kprobes: directly modify code if kprobe is not initialized Wang Nan
                   ` (13 more replies)
  0 siblings, 14 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:34 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

This patch series introduces early kprobe, a mechanism that allows users
to trace events very early during boot. It should be useful for optimizing
system boot. It can also be used by BSP developers to hook their
platform-specific procedures into kernel boot stages after setup_arch().

This patch series provides x86 and ARM support for early kprobes. The ARM
portion is based on my OPTPROBES for ARM 32 patches (ARM: kprobes: OPTPROBES
and other improvements), which have not been accepted yet.

Kprobes is very useful for tracing events. However, it can only be used
after the system is fully initialized. When debugging the kernel boot
stage (for example, checking memory consumption during boot, analyzing
boot-phase process creation, or optimizing boot speed), specific tools
must be created. Sometimes we even have to modify kernel code.

Early kprobes is my answer to this problem. By utilizing OPTPROBES, which
converts probed instructions into branches instead of breakpoints, kprobes
can be used even before exception handlers are set up. By adding cmdline
options, one can insert kprobes to trace the kernel boot stage without
modifying code.

BSP developers can also benefit from it. For example, when booting an
SoC equipped with an unstoppable watchdog like the IMP706, watchdog-kicking
code must be inserted in several places to keep the watchdog from resetting
the system before watchdogd is brought up (especially during memory
initialization, which is the most time-consuming portion of booting).
With early kprobes, BSP developers can keep such code in their private
directory without disturbing arch-independent code.
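
As a minimal sketch of this use case (board_watchdog_kick() and the
choice of probed symbol are hypothetical, not part of this series; it
only assumes the early registration path the series adds):

 #include <linux/kprobes.h>

 /* board_watchdog_kick() is an assumed BSP helper, not a real API */
 static int wdt_kick_handler(struct kprobe *p, struct pt_regs *regs)
 {
 	board_watchdog_kick();	/* feed the watchdog on every hit */
 	return 0;		/* let execution continue normally */
 }

 static struct kprobe wdt_kp = {
 	.symbol_name	= "__alloc_pages_nodemask",	/* hot during mem init */
 	.pre_handler	= wdt_kick_handler,
 };

 /* called from platform code right after setup_arch(); with this
  * series, register_kprobe() takes the early path automatically */
 static void __init board_early_probes_init(void)
 {
 	register_kprobe(&wdt_kp);
 }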

In this patch series, early kprobes simply print messages when the
probed instructions are hit. My further plan is to connect the 'ekprobe='
cmdline parameters to '/sys/kernel/debug/tracing/kprobe_events', allowing
kprobe events to be installed from the kernel cmdline, and to dump early
kprobe messages into the ring buffer instead of printing them.
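
For reference, the post-boot interface this would eventually feed into
accepts standard kprobe_events definitions like the following (the event
name is arbitrary; this is shown for illustration, not something this
series implements yet):

 echo 'p:myprobe __free_pages' > /sys/kernel/debug/tracing/kprobe_events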

Patches 1 - 4 are architecture-dependent code. They allow text
modification before kprobes_initialized is set, and statically allocate
resources in vmlinux.lds. Currently only x86 and ARM are supported.

Patches 5 - 8 define the required flags and macros.

Patch 9 is the core logic of early kprobes. When register_kprobe() is
called before kprobes_initialized, it marks the probed kprobe with
'KPROBE_FLAG_EARLY' and allocates resources from slots reserved at link
time. After the kprobe subsystem is fully initialized, it converts early
kprobes into normal kprobes.

Patch 10 enables the 'ekprobe=' cmdline option, which allows probes to be
set up from the cmdline. However, currently the kprobe handler is only a
simple printk.

Patch 11 introduces the Kconfig options required to actually enable early
kprobes.

Usage of early kprobes is as follows:

Boot the kernel with 'ekprobe=' on the cmdline, like:

... rdinit=/sbin/init ekprobe=0xc00f3c2c ekprobe=__free_pages ...

During boot, the kernel will print a trace using printk:

 ...
 Hit early kprobe at __alloc_pages_nodemask+0x4
 Hit early kprobe at __free_pages+0x0
 Hit early kprobe at __alloc_pages_nodemask+0x4
 Hit early kprobe at __free_pages+0x0
 Hit early kprobe at __free_pages+0x0
 Hit early kprobe at __alloc_pages_nodemask+0x4
 ...

After the kprobe subsystem is fully initialized, early kprobes will be
converted to normal kprobes, and can be turned off using:

 echo 0 > /sys/kernel/debug/kprobes/enabled

And re-enabled using:

 echo 1 > /sys/kernel/debug/kprobes/enabled

Also, optimization can be turned off using:

 echo 0 > /proc/sys/debug/kprobes-optimization

There's no way to remove a specific early kprobe yet. I'd like to convert
early kprobes into kprobe events in further patches, so that they can be
removed entirely through the event interface.

Wang Nan (11):
  ARM: kprobes: directly modify code if kprobe is not initialized.
  ARM: kprobes: introduce early kprobes related code area.
  x86: kprobes: directly modify code if kprobe is not initialized.
  x86: kprobes: introduce early kprobes related code area.
  kprobes: Add a KPROBE_FLAG_EARLY for early kprobes.
  kprobes: make kprobes_initialized globally visible.
  kprobes: introduce macros for allocating early kprobe resources.
  kprobes: allow __alloc_insn_slot() to allocate from early kprobe slots.
  kprobes: core logic of early kprobes.
  kprobes: enable 'ekprobe=' cmdline option for early kprobes.
  kprobes: add CONFIG_EARLY_KPROBES option.

 arch/Kconfig                      |  12 ++
 arch/arm/include/asm/kprobes.h    |  29 ++++-
 arch/arm/kernel/vmlinux.lds.S     |   2 +
 arch/arm/probes/kprobes/opt-arm.c |  11 +-
 arch/x86/include/asm/insn.h       |   7 +-
 arch/x86/include/asm/kprobes.h    |  44 +++++--
 arch/x86/kernel/kprobes/opt.c     |   7 +-
 arch/x86/kernel/vmlinux.lds.S     |   2 +
 include/linux/kprobes.h           | 109 ++++++++++++++++++
 kernel/kprobes.c                  | 237 ++++++++++++++++++++++++++++++++++++--
 10 files changed, 437 insertions(+), 23 deletions(-)

-- 
1.8.4



* [RFC PATCH 01/11] ARM: kprobes: directly modify code if kprobe is not initialized.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
@ 2015-01-07  7:35 ` Wang Nan
  2015-01-07 17:15   ` Christopher Covington
  2015-01-13 15:34   ` Masami Hiramatsu
  2015-01-07  7:35 ` [RFC PATCH 02/11] ARM: kprobes: introduce early kprobes related code area Wang Nan
                   ` (12 subsequent siblings)
  13 siblings, 2 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:35 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

If a kprobe is optimized before the kprobe subsystem is initialized,
only one core should be running and the probed instruction is not armed
with a breakpoint, so simply patching the text is okay.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 arch/arm/probes/kprobes/opt-arm.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/arm/probes/kprobes/opt-arm.c b/arch/arm/probes/kprobes/opt-arm.c
index 15b37c0..a021474 100644
--- a/arch/arm/probes/kprobes/opt-arm.c
+++ b/arch/arm/probes/kprobes/opt-arm.c
@@ -325,8 +325,17 @@ void __kprobes arch_optimize_kprobes(struct list_head *oplist)
 		 * Similar to __arch_disarm_kprobe, operations which
 		 * removing breakpoints must be wrapped by stop_machine
 		 * to avoid racing.
+		 *
+		 * If this function is called before kprobes initialized,
+		 * the kprobe should be an early kprobe, the instruction
+		 * is not armed with breakpoint. There should be only
+		 * one core now, so directly __patch_text is enough.
 		 */
-		kprobes_remove_breakpoint(op->kp.addr, insn);
+		if (unlikely(!kprobes_initialized)) {
+			BUG_ON(!(op->kp.flags & KPROBE_FLAG_EARLY));
+			__patch_text(op->kp.addr, insn);
+		} else
+			kprobes_remove_breakpoint(op->kp.addr, insn);
 
 		list_del_init(&op->list);
 	}
-- 
1.8.4



* [RFC PATCH 02/11] ARM: kprobes: introduce early kprobes related code area.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
  2015-01-07  7:35 ` [RFC PATCH 01/11] ARM: kprobes: directly modify code if kprobe is not initialized Wang Nan
@ 2015-01-07  7:35 ` Wang Nan
  2015-01-07  7:35 ` [RFC PATCH 03/11] x86: kprobes: directly modify code if kprobe is not initialized Wang Nan
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:35 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

Introduce a code area inside the text section in ARM's vmlinux.lds. The
executable area used by early kprobes will be allocated from there.
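
The reserved layout produced by the EARLY_KPROBES_CODES_AREA macro in
this patch (sizes shown symbolically, summarizing the diff below):

 __early_kprobes_code_area_start:
 	MAX_OPTINSN_SIZE * CONFIG_NR_EARLY_KPROBES_SLOTS bytes
 __early_kprobes_code_area_end:
 __early_kprobes_insn_slot_start:
 	MAX_INSN_SIZE * KPROBE_OPCODE_SIZE * CONFIG_NR_EARLY_KPROBES_SLOTS bytes
 __early_kprobes_insn_slot_end: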

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 arch/arm/include/asm/kprobes.h | 29 +++++++++++++++++++++++++++--
 arch/arm/kernel/vmlinux.lds.S  |  2 ++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kprobes.h b/arch/arm/include/asm/kprobes.h
index 3ea9be5..99f16dc 100644
--- a/arch/arm/include/asm/kprobes.h
+++ b/arch/arm/include/asm/kprobes.h
@@ -17,16 +17,40 @@
 #define _ARM_KPROBES_H
 
 #include <linux/types.h>
-#include <linux/ptrace.h>
-#include <linux/notifier.h>
 
 #define __ARCH_WANT_KPROBES_INSN_SLOT
 #define MAX_INSN_SIZE			2
 
+#ifdef __ASSEMBLY__
+
+#define KPROBE_OPCODE_SIZE	4
+#define MAX_OPTINSN_SIZE (optprobe_template_end - optprobe_template_entry)
+
+#ifdef CONFIG_EARLY_KPROBES
+#define EARLY_KPROBES_CODES_AREA					\
+	. = ALIGN(8);							\
+	VMLINUX_SYMBOL(__early_kprobes_code_area_start) = .;		\
+	. = . + MAX_OPTINSN_SIZE * CONFIG_NR_EARLY_KPROBES_SLOTS;	\
+	VMLINUX_SYMBOL(__early_kprobes_code_area_end) = .;		\
+	. = ALIGN(8);							\
+	VMLINUX_SYMBOL(__early_kprobes_insn_slot_start) = .;		\
+	. = . + MAX_INSN_SIZE * KPROBE_OPCODE_SIZE * CONFIG_NR_EARLY_KPROBES_SLOTS;\
+	VMLINUX_SYMBOL(__early_kprobes_insn_slot_end) = .;
+
+#else
+#define EARLY_KPROBES_CODES_AREA
+#endif
+
+#else
+
+#include <linux/ptrace.h>
+#include <linux/notifier.h>
+
 #define flush_insn_slot(p)		do { } while (0)
 #define kretprobe_blacklist_size	0
 
 typedef u32 kprobe_opcode_t;
+#define KPROBE_OPCODE_SIZE	sizeof(kprobe_opcode_t)
 struct kprobe;
 #include <asm/probes.h>
 
@@ -83,4 +107,5 @@ struct arch_optimized_insn {
 	 */
 };
 
+#endif /* __ASSEMBLY__ */
 #endif /* _ARM_KPROBES_H */
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index b31aa73..12ff76d 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -11,6 +11,7 @@
 #ifdef CONFIG_ARM_KERNMEM_PERMS
 #include <asm/pgtable.h>
 #endif
+#include <asm/kprobes.h>
 	
 #define PROC_INFO							\
 	. = ALIGN(4);							\
@@ -108,6 +109,7 @@ SECTIONS
 			SCHED_TEXT
 			LOCK_TEXT
 			KPROBES_TEXT
+			EARLY_KPROBES_CODES_AREA
 			IDMAP_TEXT
 #ifdef CONFIG_MMU
 			*(.fixup)
-- 
1.8.4



* [RFC PATCH 03/11] x86: kprobes: directly modify code if kprobe is not initialized.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
  2015-01-07  7:35 ` [RFC PATCH 01/11] ARM: kprobes: directly modify code if kprobe is not initialized Wang Nan
  2015-01-07  7:35 ` [RFC PATCH 02/11] ARM: kprobes: introduce early kprobes related code area Wang Nan
@ 2015-01-07  7:35 ` Wang Nan
  2015-01-07  7:35 ` [RFC PATCH 04/11] x86: kprobes: introduce early kprobes related code area Wang Nan
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:35 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

When early kprobes are registered, SMP has not been enabled yet, so the
synchronization in text_poke_bp() is not required. A simple memcpy is
enough.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 arch/x86/kernel/kprobes/opt.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 0dd8d08..dc5fccb 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -397,8 +397,11 @@ void arch_optimize_kprobes(struct list_head *oplist)
 		insn_buf[0] = RELATIVEJUMP_OPCODE;
 		*(s32 *)(&insn_buf[1]) = rel;
 
-		text_poke_bp(op->kp.addr, insn_buf, RELATIVEJUMP_SIZE,
-			     op->optinsn.insn);
+		if (unlikely(!kprobes_initialized))
+			memcpy(op->kp.addr, insn_buf, RELATIVEJUMP_SIZE);
+		else
+			text_poke_bp(op->kp.addr, insn_buf, RELATIVEJUMP_SIZE,
+				     op->optinsn.insn);
 
 		list_del_init(&op->list);
 	}
-- 
1.8.4



* [RFC PATCH 04/11] x86: kprobes: introduce early kprobes related code area.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
                   ` (2 preceding siblings ...)
  2015-01-07  7:35 ` [RFC PATCH 03/11] x86: kprobes: directly modify code if kprobe is not initialized Wang Nan
@ 2015-01-07  7:35 ` Wang Nan
  2015-01-07  7:35 ` [RFC PATCH 05/11] kprobes: Add a KPROBE_FLAG_EARLY for early kprobes Wang Nan
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:35 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

This patch introduces EARLY_KPROBES_CODES_AREA in the x86 vmlinux linker
script for early kprobes.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 arch/x86/include/asm/insn.h    |  7 ++++---
 arch/x86/include/asm/kprobes.h | 45 ++++++++++++++++++++++++++++++++++--------
 arch/x86/kernel/vmlinux.lds.S  |  2 ++
 3 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
index 47f29b1..ea6f318 100644
--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -20,6 +20,9 @@
  * Copyright (C) IBM Corporation, 2009
  */
 
+#define MAX_INSN_SIZE	16
+
+#ifndef __ASSEMBLY__
 /* insn_attr_t is defined in inat.h */
 #include <asm/inat.h>
 
@@ -69,8 +72,6 @@ struct insn {
 	const insn_byte_t *next_byte;
 };
 
-#define MAX_INSN_SIZE	16
-
 #define X86_MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6)
 #define X86_MODRM_REG(modrm) (((modrm) & 0x38) >> 3)
 #define X86_MODRM_RM(modrm) ((modrm) & 0x07)
@@ -197,5 +198,5 @@ static inline int insn_offset_immediate(struct insn *insn)
 {
 	return insn_offset_displacement(insn) + insn->displacement.nbytes;
 }
-
+#endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_INSN_H */
diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index 4421b5d..017f4bb 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -21,23 +21,52 @@
  *
  * See arch/x86/kernel/kprobes.c for x86 kprobes history.
  */
-#include <linux/types.h>
-#include <linux/ptrace.h>
-#include <linux/percpu.h>
-#include <asm/insn.h>
 
 #define  __ARCH_WANT_KPROBES_INSN_SLOT
 
-struct pt_regs;
-struct kprobe;
+#include <linux/types.h>
+#include <asm/insn.h>
 
-typedef u8 kprobe_opcode_t;
 #define BREAKPOINT_INSTRUCTION	0xcc
 #define RELATIVEJUMP_OPCODE 0xe9
 #define RELATIVEJUMP_SIZE 5
 #define RELATIVECALL_OPCODE 0xe8
 #define RELATIVE_ADDR_SIZE 4
 #define MAX_STACK_SIZE 64
+#define MAX_OPTIMIZED_LENGTH (MAX_INSN_SIZE + RELATIVE_ADDR_SIZE)
+
+#ifdef __ASSEMBLY__
+
+#define KPROBE_OPCODE_SIZE     1
+#define MAX_OPTINSN_SIZE ((optprobe_template_end - optprobe_template_entry) + \
+	MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)
+
+#ifdef CONFIG_EARLY_KPROBES
+# define EARLY_KPROBES_CODES_AREA					\
+	. = ALIGN(8);							\
+	VMLINUX_SYMBOL(__early_kprobes_code_area_start) = .;		\
+	. = . + MAX_OPTINSN_SIZE * CONFIG_NR_EARLY_KPROBES_SLOTS;	\
+	VMLINUX_SYMBOL(__early_kprobes_code_area_end) = .;		\
+	. = ALIGN(8);							\
+	VMLINUX_SYMBOL(__early_kprobes_insn_slot_start) = .;		\
+	. = . + MAX_INSN_SIZE * KPROBE_OPCODE_SIZE *			\
+		CONFIG_NR_EARLY_KPROBES_SLOTS;				\
+	VMLINUX_SYMBOL(__early_kprobes_insn_slot_end) = .;
+#else
+# define EARLY_KPROBES_CODES_AREA
+#endif
+
+#else
+
+#include <linux/ptrace.h>
+#include <linux/percpu.h>
+
+
+struct pt_regs;
+struct kprobe;
+
+typedef u8 kprobe_opcode_t;
+#define KPROBE_OPCODE_SIZE     sizeof(kprobe_opcode_t)
 #define MIN_STACK_SIZE(ADDR)					       \
 	(((MAX_STACK_SIZE) < (((unsigned long)current_thread_info()) + \
 			      THREAD_SIZE - (unsigned long)(ADDR)))    \
@@ -52,7 +81,6 @@ extern __visible kprobe_opcode_t optprobe_template_entry;
 extern __visible kprobe_opcode_t optprobe_template_val;
 extern __visible kprobe_opcode_t optprobe_template_call;
 extern __visible kprobe_opcode_t optprobe_template_end;
-#define MAX_OPTIMIZED_LENGTH (MAX_INSN_SIZE + RELATIVE_ADDR_SIZE)
 #define MAX_OPTINSN_SIZE 				\
 	(((unsigned long)&optprobe_template_end -	\
 	  (unsigned long)&optprobe_template_entry) +	\
@@ -117,4 +145,5 @@ extern int kprobe_exceptions_notify(struct notifier_block *self,
 				    unsigned long val, void *data);
 extern int kprobe_int3_handler(struct pt_regs *regs);
 extern int kprobe_debug_handler(struct pt_regs *regs);
+#endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_KPROBES_H */
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 00bf300..69f3f0e 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -26,6 +26,7 @@
 #include <asm/page_types.h>
 #include <asm/cache.h>
 #include <asm/boot.h>
+#include <asm/kprobes.h>
 
 #undef i386     /* in case the preprocessor is a 32bit one */
 
@@ -100,6 +101,7 @@ SECTIONS
 		SCHED_TEXT
 		LOCK_TEXT
 		KPROBES_TEXT
+		EARLY_KPROBES_CODES_AREA
 		ENTRY_TEXT
 		IRQENTRY_TEXT
 		*(.fixup)
-- 
1.8.4



* [RFC PATCH 05/11] kprobes: Add a KPROBE_FLAG_EARLY for early kprobes.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
                   ` (3 preceding siblings ...)
  2015-01-07  7:35 ` [RFC PATCH 04/11] x86: kprobes: introduce early kprobes related code area Wang Nan
@ 2015-01-07  7:35 ` Wang Nan
  2015-01-07  7:35 ` [RFC PATCH 06/11] kprobes: make kprobes_initialized globally visible Wang Nan
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:35 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

Introduce a KPROBE_FLAG_EARLY for further expansion. KPROBE_FLAG_EARLY
indicates that a kprobe is installed at a very early stage and that its
resources should be allocated statically.
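
As a minimal sketch (mirroring how patch 1 uses the flag on ARM; not
part of this patch), code that must treat statically allocated early
kprobes specially can simply test the flag:

 if (unlikely(op->kp.flags & KPROBE_FLAG_EARLY)) {
 	/* single CPU, breakpoint never armed: patch the text directly */
 	__patch_text(op->kp.addr, insn);
 }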

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 include/linux/kprobes.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 1ab5475..fa0de88 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -128,6 +128,7 @@ struct kprobe {
 				   * this flag is only for optimized_kprobe.
 				   */
 #define KPROBE_FLAG_FTRACE	8 /* probe is using ftrace */
+#define KPROBE_FLAG_EARLY	16 /* early kprobe */
 
 /* Has this kprobe gone ? */
 static inline int kprobe_gone(struct kprobe *p)
-- 
1.8.4



* [RFC PATCH 06/11] kprobes: make kprobes_initialized globally visible.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
                   ` (4 preceding siblings ...)
  2015-01-07  7:35 ` [RFC PATCH 05/11] kprobes: Add a KPROBE_FLAG_EARLY for early kprobes Wang Nan
@ 2015-01-07  7:35 ` Wang Nan
  2015-01-07  7:35 ` [RFC PATCH 07/11] kprobes: introduce macros for allocating early kprobe resources Wang Nan
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:35 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

Following patches will enable kprobe registration very early, before the
kprobe subsystem is initialized. Arch code can use this variable to apply
special treatment to such kprobes.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 include/linux/kprobes.h | 2 ++
 kernel/kprobes.c        | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index fa0de88..b0265f9 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -50,6 +50,8 @@
 #define KPROBE_REENTER		0x00000004
 #define KPROBE_HIT_SSDONE	0x00000008
 
+extern int kprobes_initialized;
+
 #else /* CONFIG_KPROBES */
 typedef int kprobe_opcode_t;
 struct arch_specific_insn {
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 9fbe0c3..4591cae 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -67,7 +67,7 @@
 	addr = ((kprobe_opcode_t *)(kallsyms_lookup_name(name)))
 #endif
 
-static int kprobes_initialized;
+int kprobes_initialized;
 static struct hlist_head kprobe_table[KPROBE_TABLE_SIZE];
 static struct hlist_head kretprobe_inst_table[KPROBE_TABLE_SIZE];
 
-- 
1.8.4



* [RFC PATCH 07/11] kprobes: introduce macros for allocating early kprobe resources.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
                   ` (5 preceding siblings ...)
  2015-01-07  7:35 ` [RFC PATCH 06/11] kprobes: make kprobes_initialized globally visible Wang Nan
@ 2015-01-07  7:35 ` Wang Nan
  2015-01-07 10:45   ` Wang Nan
  2015-01-07  7:35 ` [RFC PATCH 08/11] kprobes: allow __alloc_insn_slot() to allocate from early kprobe slots Wang Nan
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:35 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

Introduce macros to generate a common allocator for early kprobe
related resources.

All early kprobe related resources are statically allocated at link
time, one per early kprobe slot. For each type of resource, a bitmap is
used to track allocation. __DEFINE_EKPROBE_ALLOC_OPS defines the alloc
and free handlers; the range of the resource and the bitmap must be
supplied when allocating and freeing. DEFINE_EKPROBE_ALLOC_OPS
additionally defines the bitmap and the backing array.
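
A minimal usage sketch ('struct foo' and the name 'foo' are placeholders
for illustration; DEFINE_EKPROBE_ALLOC_OPS generates a static array, a
bitmap, and the ek_alloc_foo()/ek_free_foo() wrappers around them):

 DEFINE_EKPROBE_ALLOC_OPS(struct foo, foo, static);

 static void example(void)
 {
 	struct foo *f = ek_alloc_foo();	/* NULL once all slots are taken */

 	if (f) {
 		/* ... use the statically reserved slot ... */
 		ek_free_foo(f);	/* returns 1 if f belonged to the pool */
 	}
 }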

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 include/linux/kprobes.h | 69 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index b0265f9..9a18188 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -270,6 +270,75 @@ extern void show_registers(struct pt_regs *regs);
 extern void kprobes_inc_nmissed_count(struct kprobe *p);
 extern bool arch_within_kprobe_blacklist(unsigned long addr);
 
+#ifdef CONFIG_EARLY_KPROBES
+
+#define NR_EARLY_KPROBES_SLOTS	CONFIG_NR_EARLY_KPROBES_SLOTS
+#define ALIGN_UP(v, a)	(((v) + ((a) - 1)) & ~((a) - 1))
+#define EARLY_KPROBES_BITMAP_SZ	ALIGN_UP(NR_EARLY_KPROBES_SLOTS, BITS_PER_LONG)
+
+#define __ek_in_range(v, s, e)	(((v) >= (s)) && ((v) < (e)))
+#define __ek_buf_sz(s, e)	((void *)(e) - (void *)(s))
+#define __ek_elem_sz_b(s, e)	(__ek_buf_sz(s, e) / NR_EARLY_KPROBES_SLOTS)
+#define __ek_elem_sz(s, e)	(__ek_elem_sz_b(s, e) / sizeof(s[0]))
+#define __ek_elem_idx(v, s, e)	(__ek_buf_sz(s, v) / __ek_elem_sz_b(s, e))
+#define __ek_get_elem(i, s, e)	(&((s)[__ek_elem_sz(s, e) * (i)]))
+#define __DEFINE_EKPROBE_ALLOC_OPS(__t, __name)				\
+static inline __t *__ek_alloc_##__name(__t *__s, __t *__e, unsigned long *__b)\
+{									\
+	int __i = find_next_zero_bit(__b, NR_EARLY_KPROBES_SLOTS, 0);	\
+	if (__i >= NR_EARLY_KPROBES_SLOTS)				\
+		return NULL;						\
+	set_bit(__i, __b);						\
+	return __ek_get_elem(__i, __s, __e);				\
+}									\
+static inline int __ek_free_##__name(__t *__v, __t *__s, __t *__e, unsigned long *__b)	\
+{									\
+	if (!__ek_in_range(__v, __s, __e))				\
+		return 0;						\
+	clear_bit(__ek_elem_idx(__v, __s, __e), __b);			\
+	return 1;							\
+}
+
+#define DEFINE_EKPROBE_ALLOC_OPS(__t, __name, __static)			\
+__static __t __ek_##__name##_slots[NR_EARLY_KPROBES_SLOTS];		\
+__static unsigned long __ek_##__name##_bitmap[EARLY_KPROBES_BITMAP_SZ];	\
+__DEFINE_EKPROBE_ALLOC_OPS(__t, __name)					\
+static inline __t *ek_alloc_##__name(void)				\
+{									\
+	return __ek_alloc_##__name(&((__ek_##__name##_slots)[0]),	\
+			&((__ek_##__name##_slots)[NR_EARLY_KPROBES_SLOTS]),\
+			(__ek_##__name##_bitmap));			\
+}									\
+static inline int ek_free_##__name(__t *__s)				\
+{									\
+	return __ek_free_##__name(__s, &((__ek_##__name##_slots)[0]),	\
+			&((__ek_##__name##_slots)[NR_EARLY_KPROBES_SLOTS]),\
+			(__ek_##__name##_bitmap));			\
+}
+
+
+#else
+#define __DEFINE_EKPROBE_ALLOC_OPS(__t, __name)				\
+static inline __t *__ek_alloc_##__name(__t *__s, __t *__e, unsigned long *__b)\
+{									\
+	return NULL;							\
+}									\
+static inline int __ek_free_##__name(__t *__v, __t *__s, __t *__e, unsigned long *__b)\
+{									\
+	return 0;							\
+}
+
+#define DEFINE_EKPROBE_ALLOC_OPS(__t, __name, __static)			\
+static inline __t *ek_alloc_##__name(void)				\
+{									\
+	return NULL;							\
+}									\
+static inline void ek_free_##__name(__t *__s)				\
+{									\
+}
+
+#endif
+
 struct kprobe_insn_cache {
 	struct mutex mutex;
 	void *(*alloc)(void);	/* allocate insn page */
-- 
1.8.4



* [RFC PATCH 08/11] kprobes: allow __alloc_insn_slot() to allocate from early kprobe slots.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
                   ` (6 preceding siblings ...)
  2015-01-07  7:35 ` [RFC PATCH 07/11] kprobes: introduce macros for allocating early kprobe resources Wang Nan
@ 2015-01-07  7:35 ` Wang Nan
  2015-01-07  7:36 ` [RFC PATCH 09/11] kprobes: core logic of early kprobes Wang Nan
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:35 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

Introduce early_slots_start/end and a bitmap in struct
kprobe_insn_cache, then use the previously introduced macro to generate
the allocator. This patch makes get/free_insn_slot() and
get/free_optinsn_slot() transparent to early kprobes.
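
The resulting dispatch, as a sketch of the logic in the diff below (not
new code):

 /* in __get_insn_slot(): */
 if (unlikely(!kprobes_initialized))
 	return __get_insn_slot_early(c);  /* bitmap over the static area */
 /* otherwise fall through to the normal mutex-protected page allocator */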

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 include/linux/kprobes.h | 33 +++++++++++++++++++++++++++++++++
 kernel/kprobes.c        | 14 ++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 9a18188..27a27ed 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -316,6 +316,10 @@ static inline int ek_free_##__name(__t *__s)				\
 			(__ek_##__name##_bitmap));			\
 }
 
+extern kprobe_opcode_t __early_kprobes_code_area_start[];
+extern kprobe_opcode_t __early_kprobes_code_area_end[];
+extern kprobe_opcode_t __early_kprobes_insn_slot_start[];
+extern kprobe_opcode_t __early_kprobes_insn_slot_end[];
 
 #else
 #define __DEFINE_EKPROBE_ALLOC_OPS(__t, __name)				\
@@ -339,6 +343,8 @@ static inline void ek_free_##__name(__t *__s)				\
 
 #endif
 
+__DEFINE_EKPROBE_ALLOC_OPS(kprobe_opcode_t, opcode)
+
 struct kprobe_insn_cache {
 	struct mutex mutex;
 	void *(*alloc)(void);	/* allocate insn page */
@@ -346,8 +352,35 @@ struct kprobe_insn_cache {
 	struct list_head pages; /* list of kprobe_insn_page */
 	size_t insn_size;	/* size of instruction slot */
 	int nr_garbage;
+#ifdef CONFIG_EARLY_KPROBES
+# define slots_start(c)	((c)->early_slots_start)
+# define slots_end(c)	((c)->early_slots_end)
+# define slots_bitmap(c)	((c)->early_slots_bitmap)
+	kprobe_opcode_t *early_slots_start;
+	kprobe_opcode_t *early_slots_end;
+	unsigned long early_slots_bitmap[EARLY_KPROBES_BITMAP_SZ];
+#else
+# define slots_start(c)	NULL
+# define slots_end(c)	NULL
+# define slots_bitmap(c)	NULL
+#endif
 };
 
+static inline kprobe_opcode_t *
+__get_insn_slot_early(struct kprobe_insn_cache *c)
+{
+	return __ek_alloc_opcode(slots_start(c),
+			slots_end(c), slots_bitmap(c));
+}
+
+static inline int
+__free_insn_slot_early(struct kprobe_insn_cache *c,
+		kprobe_opcode_t *slot)
+{
+	return __ek_free_opcode(slot, slots_start(c),
+			slots_end(c), slots_bitmap(c));
+}
+
 extern kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c);
 extern void __free_insn_slot(struct kprobe_insn_cache *c,
 			     kprobe_opcode_t *slot, int dirty);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 4591cae..1882bfa 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -137,6 +137,10 @@ struct kprobe_insn_cache kprobe_insn_slots = {
 	.pages = LIST_HEAD_INIT(kprobe_insn_slots.pages),
 	.insn_size = MAX_INSN_SIZE,
 	.nr_garbage = 0,
+#ifdef CONFIG_EARLY_KPROBES
+	.early_slots_start = __early_kprobes_insn_slot_start,
+	.early_slots_end = __early_kprobes_insn_slot_end,
+#endif
 };
 static int collect_garbage_slots(struct kprobe_insn_cache *c);
 
@@ -149,6 +153,9 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	struct kprobe_insn_page *kip;
 	kprobe_opcode_t *slot = NULL;
 
+	if (unlikely(!kprobes_initialized))
+		return __get_insn_slot_early(c);
+
 	mutex_lock(&c->mutex);
  retry:
 	list_for_each_entry(kip, &c->pages, list) {
@@ -249,6 +256,9 @@ void __free_insn_slot(struct kprobe_insn_cache *c,
 {
 	struct kprobe_insn_page *kip;
 
+	if (unlikely(__free_insn_slot_early(c, slot)))
+		return;
+
 	mutex_lock(&c->mutex);
 	list_for_each_entry(kip, &c->pages, list) {
 		long idx = ((long)slot - (long)kip->insns) /
@@ -280,6 +290,10 @@ struct kprobe_insn_cache kprobe_optinsn_slots = {
 	.pages = LIST_HEAD_INIT(kprobe_optinsn_slots.pages),
 	/* .insn_size is initialized later */
 	.nr_garbage = 0,
+#ifdef CONFIG_EARLY_KPROBES
+	.early_slots_start = __early_kprobes_code_area_start,
+	.early_slots_end = __early_kprobes_code_area_end,
+#endif
 };
 #endif
 #endif
-- 
1.8.4



* [RFC PATCH 09/11] kprobes: core logic of early kprobes.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
                   ` (7 preceding siblings ...)
  2015-01-07  7:35 ` [RFC PATCH 08/11] kprobes: allow __alloc_insn_slot() to allocate from early kprobe slots Wang Nan
@ 2015-01-07  7:36 ` Wang Nan
  2015-01-07  7:36 ` [RFC PATCH 10/11] kprobes: enable 'ekprobe=' cmdline option for early kprobes Wang Nan
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:36 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

This patch is the main logic of early kprobes.

If register_kprobe() is called before kprobes_initialized, an early
kprobe is allocated. It utilizes the existing OPTPROBES mechanism to
replace the target instruction with a branch instead of a breakpoint,
because interrupt handlers may not be initialized yet.

All resources required by early kprobes are allocated statically.
CONFIG_NR_EARLY_KPROBES_SLOTS controls the number of possible early
kprobes.
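
The lifecycle, summarizing the code below:

 /*
  * register_kprobe(p)             (before kprobes_initialized)
  *   -> register_early_kprobe(p)  static slot, optprobe branch planted
  *                                via arch_optimize_kprobes(), probe
  *                                hashed on early_kprobe_hlist
  *
  * init_kprobes()
  *   -> convert_early_kprobes()   re-checks address safety and moves
  *                                each probe onto the regular kprobe_table
  */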

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 include/linux/kprobes.h |   4 ++
 kernel/kprobes.c        | 151 ++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 149 insertions(+), 6 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 27a27ed..a54947d 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -434,6 +434,10 @@ extern int proc_kprobes_optimization_handler(struct ctl_table *table,
 					     size_t *length, loff_t *ppos);
 #endif
 
+struct early_kprobe_slot {
+	struct optimized_kprobe op;
+};
+
 #endif /* CONFIG_OPTPROBES */
 #ifdef CONFIG_KPROBES_ON_FTRACE
 extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 1882bfa..9c3ea9b 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -71,6 +71,10 @@ int kprobes_initialized;
 static struct hlist_head kprobe_table[KPROBE_TABLE_SIZE];
 static struct hlist_head kretprobe_inst_table[KPROBE_TABLE_SIZE];
 
+#ifdef CONFIG_EARLY_KPROBES
+static HLIST_HEAD(early_kprobe_hlist);
+#endif
+
 /* NOTE: change this value only with kprobe_mutex held */
 static bool kprobes_all_disarmed;
 
@@ -320,7 +324,12 @@ struct kprobe *get_kprobe(void *addr)
 	struct hlist_head *head;
 	struct kprobe *p;
 
-	head = &kprobe_table[hash_ptr(addr, KPROBE_HASH_BITS)];
+#ifdef CONFIG_EARLY_KPROBES
+	if (unlikely(!kprobes_initialized))
+		head = &early_kprobe_hlist;
+	else
+#endif
+		head = &kprobe_table[hash_ptr(addr, KPROBE_HASH_BITS)];
 	hlist_for_each_entry_rcu(p, head, hlist) {
 		if (p->addr == addr)
 			return p;
@@ -377,14 +386,18 @@ void opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
 NOKPROBE_SYMBOL(opt_pre_handler);
 
 /* Free optimized instructions and optimized_kprobe */
+static int ek_free_early_kprobe(struct early_kprobe_slot *slot);
 static void free_aggr_kprobe(struct kprobe *p)
 {
 	struct optimized_kprobe *op;
+	struct early_kprobe_slot *ep;
 
 	op = container_of(p, struct optimized_kprobe, kp);
 	arch_remove_optimized_kprobe(op);
 	arch_remove_kprobe(p);
-	kfree(op);
+	ep = container_of(op, struct early_kprobe_slot, op);
+	if (likely(!ek_free_early_kprobe(ep)))
+		kfree(op);
 }
 
 /* Return true(!0) if the kprobe is ready for optimization. */
@@ -601,9 +614,15 @@ static void optimize_kprobe(struct kprobe *p)
 	struct optimized_kprobe *op;
 
 	/* Check if the kprobe is disabled or not ready for optimization. */
-	if (!kprobe_optready(p) || !kprobes_allow_optimization ||
-	    (kprobe_disabled(p) || kprobes_all_disarmed))
-		return;
+	if (unlikely(!kprobes_initialized)) {
+		BUG_ON(!(p->flags & KPROBE_FLAG_EARLY));
+		if (!kprobe_optready(p) || kprobe_disabled(p))
+			return;
+	} else {
+		if (!kprobe_optready(p) || !kprobes_allow_optimization ||
+		    (kprobe_disabled(p) || kprobes_all_disarmed))
+			return;
+	}
 
 	/* Both of break_handler and post_handler are not supported. */
 	if (p->break_handler || p->post_handler)
@@ -625,7 +644,10 @@ static void optimize_kprobe(struct kprobe *p)
 		list_del_init(&op->list);
 	else {
 		list_add(&op->list, &optimizing_list);
-		kick_kprobe_optimizer();
+		if (unlikely(!kprobes_initialized))
+			arch_optimize_kprobes(&optimizing_list);
+		else
+			kick_kprobe_optimizer();
 	}
 }
 
@@ -1491,6 +1513,8 @@ out:
 	return ret;
 }
 
+static int register_early_kprobe(struct kprobe *p);
+
 int register_kprobe(struct kprobe *p)
 {
 	int ret;
@@ -1504,6 +1528,14 @@ int register_kprobe(struct kprobe *p)
 		return PTR_ERR(addr);
 	p->addr = addr;
 
+	if (unlikely(!kprobes_initialized)) {
+		p->flags |= KPROBE_FLAG_EARLY;
+		return register_early_kprobe(p);
+	}
+
+	WARN(p->flags & KPROBE_FLAG_EARLY,
+		"register early kprobe after kprobes initialized\n");
+
 	ret = check_kprobe_rereg(p);
 	if (ret)
 		return ret;
@@ -2136,6 +2168,8 @@ static struct notifier_block kprobe_module_nb = {
 extern unsigned long __start_kprobe_blacklist[];
 extern unsigned long __stop_kprobe_blacklist[];
 
+static void convert_early_kprobes(void);
+
 static int __init init_kprobes(void)
 {
 	int i, err = 0;
@@ -2184,6 +2218,7 @@ static int __init init_kprobes(void)
 	if (!err)
 		err = register_module_notifier(&kprobe_module_nb);
 
+	convert_early_kprobes();
 	kprobes_initialized = (err == 0);
 
 	if (!err)
@@ -2477,3 +2512,107 @@ module_init(init_kprobes);
 
 /* defined in arch/.../kernel/kprobes.c */
 EXPORT_SYMBOL_GPL(jprobe_return);
+
+#ifdef CONFIG_EARLY_KPROBES
+DEFINE_EKPROBE_ALLOC_OPS(struct early_kprobe_slot, early_kprobe, static);
+
+static int register_early_kprobe(struct kprobe *p)
+{
+	struct early_kprobe_slot *slot;
+	int err;
+
+	if (p->break_handler || p->post_handler)
+		return -EINVAL;
+	if (p->flags & KPROBE_FLAG_DISABLED)
+		return -EINVAL;
+
+	slot = ek_alloc_early_kprobe();
+	if (!slot) {
+		pr_err("No enough early kprobe slots.\n");
+		return -ENOMEM;
+	}
+
+	p->flags &= KPROBE_FLAG_DISABLED;
+	p->flags |= KPROBE_FLAG_EARLY;
+	p->nmissed = 0;
+
+	err = arch_prepare_kprobe(p);
+	if (err) {
+		pr_err("arch_prepare_kprobe failed\n");
+		goto free_slot;
+	}
+
+	INIT_LIST_HEAD(&p->list);
+	INIT_HLIST_NODE(&p->hlist);
+	INIT_LIST_HEAD(&slot->op.list);
+	slot->op.kp.addr = p->addr;
+	slot->op.kp.flags = p->flags | KPROBE_FLAG_EARLY;
+
+	err = arch_prepare_optimized_kprobe(&slot->op, p);
+	if (err) {
+		pr_err("Failed to prepare optimized kprobe.\n");
+		goto remove_optimized;
+	}
+
+	if (!arch_prepared_optinsn(&slot->op.optinsn)) {
+		pr_err("Failed to prepare optinsn.\n");
+		err = -ENOMEM;
+		goto remove_optimized;
+	}
+
+	hlist_add_head_rcu(&p->hlist, &early_kprobe_hlist);
+	init_aggr_kprobe(&slot->op.kp, p);
+	optimize_kprobe(&slot->op.kp);
+	return 0;
+
+remove_optimized:
+	arch_remove_optimized_kprobe(&slot->op);
+free_slot:
+	ek_free_early_kprobe(slot);
+	return err;
+}
+
+static void
+convert_early_kprobe(struct kprobe *kp)
+{
+	struct module *probed_mod;
+	int err;
+
+	BUG_ON(!kprobe_aggrprobe(kp));
+
+	err = check_kprobe_address_safe(kp, &probed_mod);
+	if (err)
+		panic("Insert kprobe at %p is not safe!", kp->addr);
+
+	/*
+	 * FIXME:
+	 * convert kprobe to ftrace if CONFIG_KPROBES_ON_FTRACE is on
+	 * and kp is on ftrace location.
+	 */
+
+	mutex_lock(&kprobe_mutex);
+	hlist_del_rcu(&kp->hlist);
+
+	INIT_HLIST_NODE(&kp->hlist);
+	hlist_add_head_rcu(&kp->hlist,
+		       &kprobe_table[hash_ptr(kp->addr, KPROBE_HASH_BITS)]);
+	mutex_unlock(&kprobe_mutex);
+
+	if (probed_mod)
+		module_put(probed_mod);
+}
+
+static void
+convert_early_kprobes(void)
+{
+	struct kprobe *p;
+	struct hlist_node *tmp;
+
+	hlist_for_each_entry_safe(p, tmp, &early_kprobe_hlist, hlist)
+		convert_early_kprobe(p);
+};
+#else
+static int register_early_kprobe(struct kprobe *p) { return -ENOSYS; }
+static int ek_free_early_kprobe(struct early_kprobe_slot *slot) { return 0; }
+static void convert_early_kprobes(void) {};
+#endif
-- 
1.8.4



* [RFC PATCH 10/11] kprobes: enable 'ekprobe=' cmdline option for early kprobes.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
                   ` (8 preceding siblings ...)
  2015-01-07  7:36 ` [RFC PATCH 09/11] kprobes: core logic of early kprobes Wang Nan
@ 2015-01-07  7:36 ` Wang Nan
  2015-01-07  7:36 ` [RFC PATCH 11/11] kprobes: add CONFIG_EARLY_KPROBES option Wang Nan
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:36 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

This patch shows the basic usage of early kprobes. By adding kernel
cmdline options such as 'ekprobe=__alloc_pages_nodemask' or
'ekprobe=0xc00f3c2c', early kprobes are installed. When the probed
instructions are hit, a message is printed.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 kernel/kprobes.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 9c3ea9b..73f9b7f 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2611,8 +2611,79 @@ convert_early_kprobes(void)
 	hlist_for_each_entry_safe(p, tmp, &early_kprobe_hlist, hlist)
 		convert_early_kprobe(p);
 };
+
+static int early_kprobe_pre_handler(struct kprobe *p, struct pt_regs *regs)
+{
+	const char *sym = NULL;
+	char *modname, namebuf[KSYM_NAME_LEN];
+	unsigned long offset = 0;
+
+	sym = kallsyms_lookup((unsigned long)p->addr, NULL,
+			&offset, &modname, namebuf);
+	if (sym)
+		pr_info("Hit early kprobe at %s+0x%lx%s%s\n",
+				sym, offset,
+				(modname ? " " : ""),
+				(modname ? modname : ""));
+	else
+		pr_info("Hit early kprobe at %p\n", p->addr);
+	return 0;
+}
+
+DEFINE_EKPROBE_ALLOC_OPS(struct kprobe, early_kprobe_setup, static);
+static int __init early_kprobe_setup(char *p)
+{
+	unsigned long long addr;
+	struct kprobe *kp;
+	int len = strlen(p);
+	int err;
+
+	if (len <= 0) {
+		pr_err("early kprobe: wrong param: %s\n", p);
+		return 0;
+	}
+
+	if ((p[0] == '0') && (p[1] == 'x')) {
+		err = kstrtoull(p, 16, &addr);
+		if (err) {
+			pr_err("early kprobe: wrong address: %p\n", p);
+			return 0;
+		}
+	} else {
+		addr = kallsyms_lookup_name(p);
+		if (!addr) {
+			pr_err("early kprobe: wrong symbol: %s\n", p);
+			return 0;
+		}
+	}
+
+	if ((addr < (unsigned long)_text) ||
+			(addr >= (unsigned long)_etext))
+		pr_err("early kprobe: address of %p out of range\n", p);
+
+	kp = ek_alloc_early_kprobe_setup();
+	if (kp == NULL) {
+		pr_err("early kprobe: no enough early kprobe slot\n");
+		return 0;
+	}
+	kp->addr = (void *)(unsigned long)(addr);
+	kp->pre_handler = early_kprobe_pre_handler;
+	err = register_kprobe(kp);
+	if (err) {
+		pr_err("early kprobe: register early kprobe %s failed\n", p);
+		ek_free_early_kprobe_setup(kp);
+	}
+	return 0;
+}
 #else
 static int register_early_kprobe(struct kprobe *p) { return -ENOSYS; }
 static int ek_free_early_kprobe(struct early_kprobe_slot *slot) { return 0; }
 static void convert_early_kprobes(void) {};
+
+static int __init early_kprobe_setup(char *p)
+{
+	return 0;
+}
 #endif
+
+early_param("ekprobe", early_kprobe_setup);
-- 
1.8.4



* [RFC PATCH 11/11] kprobes: add CONFIG_EARLY_KPROBES option.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
                   ` (9 preceding siblings ...)
  2015-01-07  7:36 ` [RFC PATCH 10/11] kprobes: enable 'ekprobe=' cmdline option for early kprobes Wang Nan
@ 2015-01-07  7:36 ` Wang Nan
  2015-01-07  7:42 ` Wang Nan
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:36 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

Enable early kprobes in Kconfig.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 arch/Kconfig | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 05d7a8a..06dff4b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -46,6 +46,18 @@ config KPROBES
 	  for kernel debugging, non-intrusive instrumentation and testing.
 	  If in doubt, say "N".
 
+config EARLY_KPROBES
+	depends on KPROBES && OPTPROBES
+	def_bool y
+
+config NR_EARLY_KPROBES_SLOTS
+	int "Number of possible early kprobes"
+	range 1 64
+	default 16
+	depends on EARLY_KPROBES
+	help
+	  Number of early kprobes slots.
+
 config JUMP_LABEL
        bool "Optimize very unlikely/likely branches"
        depends on HAVE_ARCH_JUMP_LABEL
-- 
1.8.4



* [RFC PATCH 11/11] kprobes: add CONFIG_EARLY_KPROBES option.
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
                   ` (10 preceding siblings ...)
  2015-01-07  7:36 ` [RFC PATCH 11/11] kprobes: add CONFIG_EARLY_KPROBES option Wang Nan
@ 2015-01-07  7:42 ` Wang Nan
  2015-01-13 15:58 ` [RFC PATCH 00/11] Early kprobe: enable kprobes very early Masami Hiramatsu
  2015-01-16 17:48 ` [RFC PATCH 00/11] Early kprobe: enable kprobes very early Steven Rostedt
  13 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07  7:42 UTC (permalink / raw)
  To: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy
  Cc: lizefan, linux-arm-kernel, linux-kernel

Enable early kprobes in Kconfig.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 arch/Kconfig | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 05d7a8a..06dff4b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -46,6 +46,18 @@ config KPROBES
 	  for kernel debugging, non-intrusive instrumentation and testing.
 	  If in doubt, say "N".
 
+config EARLY_KPROBES
+	depends on KPROBES && OPTPROBES
+	def_bool y
+
+config NR_EARLY_KPROBES_SLOTS
+	int "Number of possible early kprobes"
+	range 1 64
+	default 16
+	depends on EARLY_KPROBES
+	help
+	  Number of early kprobes slots.
+
 config JUMP_LABEL
        bool "Optimize very unlikely/likely branches"
        depends on HAVE_ARCH_JUMP_LABEL
-- 
1.8.4



* Re: [RFC PATCH 07/11] kprobes: introduce macros for allocating early kprobe resources.
  2015-01-07  7:35 ` [RFC PATCH 07/11] kprobes: introduce macros for allocating early kprobe resources Wang Nan
@ 2015-01-07 10:45   ` Wang Nan
  0 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-01-07 10:45 UTC (permalink / raw)
  To: Hillf Danton
  Cc: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy, lizefan, linux-arm-kernel,
	linux-kernel

On 2015/1/7 15:35, Wang Nan wrote:
> Introduce macros to generate a common allocator for early kprobe
> related resources.
> 
> All early kprobe related resources are statically allocated at link
> time, one per early kprobe slot. For each type of resource, a bitmap is
> used to track allocation. __DEFINE_EKPROBE_ALLOC_OPS defines the alloc
> and free handlers; the range of the resource and the bitmap must be
> supplied when allocating and freeing. DEFINE_EKPROBE_ALLOC_OPS
> additionally defines the bitmap and the backing array.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> ---
>  include/linux/kprobes.h | 69 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 69 insertions(+)
> 
> diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
> index b0265f9..9a18188 100644
> --- a/include/linux/kprobes.h
> +++ b/include/linux/kprobes.h
> @@ -270,6 +270,75 @@ extern void show_registers(struct pt_regs *regs);
>  extern void kprobes_inc_nmissed_count(struct kprobe *p);
>  extern bool arch_within_kprobe_blacklist(unsigned long addr);
>  
> +#ifdef CONFIG_EARLY_KPROBES
> +
> +#define NR_EARLY_KPROBES_SLOTS	CONFIG_NR_EARLY_KPROBES_SLOTS
> +#define ALIGN_UP(v, a)	(((v) + ((a) - 1)) & ~((a) - 1))
> +#define EARLY_KPROBES_BITMAP_SZ	ALIGN_UP(NR_EARLY_KPROBES_SLOTS, BITS_PER_LONG)
> +
> +#define __ek_in_range(v, s, e)	(((v) >= (s)) && ((v) < (e)))
> +#define __ek_buf_sz(s, e)	((void *)(e) - (void *)(s))
> +#define __ek_elem_sz_b(s, e)	(__ek_buf_sz(s, e) / NR_EARLY_KPROBES_SLOTS)
> +#define __ek_elem_sz(s, e)	(__ek_elem_sz_b(s, e) / sizeof(s[0]))
> +#define __ek_elem_idx(v, s, e)	(__ek_buf_sz(s, v) / __ek_elem_sz_b(s, e))
> +#define __ek_get_elem(i, s, e)	(&((s)[__ek_elem_sz(s, e) * (i)]))
> +#define __DEFINE_EKPROBE_ALLOC_OPS(__t, __name)				\
> +static inline __t *__ek_alloc_##__name(__t *__s, __t *__e, unsigned long *__b)\
> +{									\
> +	int __i = find_next_zero_bit(__b, NR_EARLY_KPROBES_SLOTS, 0);	\
> +	if (__i >= NR_EARLY_KPROBES_SLOTS)				\
> +		return NULL;						\
> +	set_bit(__i, __b);						\
> +	return __ek_get_elem(__i, __s, __e);				\
> +}									\
> +static inline int __ek_free_##__name(__t *__v, __t *__s, __t *__e, unsigned long *__b)	\
> +{									\
> +	if (!__ek_in_range(__v, __s, __e))				\
> +		return 0;						\
> +	clear_bit(__ek_elem_idx(__v, __s, __e), __b);			\
> +	return 1;							\
> +}
> +
> +#define DEFINE_EKPROBE_ALLOC_OPS(__t, __name, __static)			\
> +__static __t __ek_##__name##_slots[NR_EARLY_KPROBES_SLOTS];		\
> +__static unsigned long __ek_##__name##_bitmap[EARLY_KPROBES_BITMAP_SZ];	\
> +__DEFINE_EKPROBE_ALLOC_OPS(__t, __name)					\
> +static inline __t *ek_alloc_##__name(void)				\
> +{									\
> +	return __ek_alloc_##__name(&((__ek_##__name##_slots)[0]),	\
> +			&((__ek_##__name##_slots)[NR_EARLY_KPROBES_SLOTS]),\
> +			(__ek_##__name##_bitmap));			\
> +}									\
> +static inline int ek_free_##__name(__t *__s)				\
> +{									\
> +	return __ek_free_##__name(__s, &((__ek_##__name##_slots)[0]),	\
> +			&((__ek_##__name##_slots)[NR_EARLY_KPROBES_SLOTS]),\
> +			(__ek_##__name##_bitmap));			\
> +}
> +
> +
> +#else
> +#define __DEFINE_EKPROBE_ALLOC_OPS(__t, __name)				\
> +static inline __t *__ek_alloc_##__name(__t *__s, __t *__e, unsigned long *__b)\
> +{									\
> +	return NULL;							\
> +}									\
> +static inline int __ek_free_##__name(__t *__v, __t *__s, __t *__e, unsigned long *__b)\
> +{									\
> +	return 0;							\
> +}
> +
> +#define DEFINE_EKPROBE_ALLOC_OPS(__t, __name, __static)			\
> +static inline __t *ek_alloc_##__name(void)				\
> +{									\
> +	return NULL;							\
> +}									\
> +static inline void ek_free_##__name(__t *__s)				\
> +{									\
> +}
> +

Sorry, there is a small problem here. It should be:
static inline int ek_free_##__name(__t *__s)				\
{									\
	return 0;							\
}

> +#endif
> +
>  struct kprobe_insn_cache {
>  	struct mutex mutex;
>  	void *(*alloc)(void);	/* allocate insn page */
> 




* Re: [RFC PATCH 01/11] ARM: kprobes: directly modify code if kprobe is not initialized.
  2015-01-07  7:35 ` [RFC PATCH 01/11] ARM: kprobes: directly modify code if kprobe is not initialized Wang Nan
@ 2015-01-07 17:15   ` Christopher Covington
  2015-01-13 15:34   ` Masami Hiramatsu
  1 sibling, 0 replies; 21+ messages in thread
From: Christopher Covington @ 2015-01-07 17:15 UTC (permalink / raw)
  To: Wang Nan, linux, mingo, x86, masami.hiramatsu.pt,
	anil.s.keshavamurthy, davem, ananth, dave.long, tixy
  Cc: lizefan, linux-kernel, linux-arm-kernel

On 01/07/2015 02:35 AM, Wang Nan wrote:
> If a kprobe is optimized before the kprobe subsystem is initialized,
> only one core should be running and the probed instruction is not armed
> with a breakpoint, so simply patching the text is okay.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> ---
>  arch/arm/probes/kprobes/opt-arm.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/probes/kprobes/opt-arm.c b/arch/arm/probes/kprobes/opt-arm.c
> index 15b37c0..a021474 100644
> --- a/arch/arm/probes/kprobes/opt-arm.c
> +++ b/arch/arm/probes/kprobes/opt-arm.c
> @@ -325,8 +325,17 @@ void __kprobes arch_optimize_kprobes(struct list_head *oplist)
>  		 * Similar to __arch_disarm_kprobe, operations which
>  		 * removing breakpoints must be wrapped by stop_machine
>  		 * to avoid racing.
> +		 *
> +		 * If this function is called before kprobes initialized,
> +		 * the kprobe should be an early kprobe, the instruction
> +		 * is not armed with breakpoint. There should be only
> +		 * one core now, so directly __patch_text is enough.
>  		 */
> -		kprobes_remove_breakpoint(op->kp.addr, insn);
> +		if (unlikely(!kprobes_initialized)) {
> +			BUG_ON(!(op->kp.flags & KPROBE_FLAG_EARLY));
> +			__patch_text(op->kp.addr, insn);
> +		} else
> +			kprobes_remove_breakpoint(op->kp.addr, insn);

"...if only one branch of a conditional statement is a single statement ...
use braces in both branches".

https://www.kernel.org/doc/Documentation/CodingStyle

>  
>  		list_del_init(&op->list);
>  	}
> 

Regards,
Chris

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


* Re: [RFC PATCH 01/11] ARM: kprobes: directly modify code if kprobe is not initialized.
  2015-01-07  7:35 ` [RFC PATCH 01/11] ARM: kprobes: directly modify code if kprobe is not initialized Wang Nan
  2015-01-07 17:15   ` Christopher Covington
@ 2015-01-13 15:34   ` Masami Hiramatsu
  1 sibling, 0 replies; 21+ messages in thread
From: Masami Hiramatsu @ 2015-01-13 15:34 UTC (permalink / raw)
  To: Wang Nan
  Cc: linux, mingo, x86, anil.s.keshavamurthy, davem, ananth,
	dave.long, tixy, lizefan, linux-arm-kernel, linux-kernel

(2015/01/07 16:35), Wang Nan wrote:
> If a kprobe is optimized before the kprobe subsystem is initialized,
> only one core should be running and the probed instruction is not armed
> with a breakpoint, so simply patching the text is okay.

This patch looks very hacky. If kprobes is not initialized, why can
anyone optimize kprobes? I think you must introduce an early kprobes
init routine and set the init flag at that point.

Thank you,

> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> ---
>  arch/arm/probes/kprobes/opt-arm.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/probes/kprobes/opt-arm.c b/arch/arm/probes/kprobes/opt-arm.c
> index 15b37c0..a021474 100644
> --- a/arch/arm/probes/kprobes/opt-arm.c
> +++ b/arch/arm/probes/kprobes/opt-arm.c
> @@ -325,8 +325,17 @@ void __kprobes arch_optimize_kprobes(struct list_head *oplist)
>  		 * Similar to __arch_disarm_kprobe, operations which
>  		 * removing breakpoints must be wrapped by stop_machine
>  		 * to avoid racing.
> +		 *
> +		 * If this function is called before kprobes initialized,
> +		 * the kprobe should be an early kprobe, the instruction
> +		 * is not armed with breakpoint. There should be only
> +		 * one core now, so directly __patch_text is enough.
>  		 */
> -		kprobes_remove_breakpoint(op->kp.addr, insn);
> +		if (unlikely(!kprobes_initialized)) {
> +			BUG_ON(!(op->kp.flags & KPROBE_FLAG_EARLY));
> +			__patch_text(op->kp.addr, insn);
> +		} else
> +			kprobes_remove_breakpoint(op->kp.addr, insn);
>  
>  		list_del_init(&op->list);
>  	}
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: [RFC PATCH 00/11] Early kprobe: enable kprobes very early
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes very early Wang Nan
                   ` (11 preceding siblings ...)
  2015-01-07  7:42 ` Wang Nan
@ 2015-01-13 15:58 ` Masami Hiramatsu
  2015-02-06 10:30   ` [RFC PATCH] x86: kprobes: enable optimize relative call insn Wang Nan
  2015-01-16 17:48 ` [RFC PATCH 00/11] Early kprobe: enable kprobes at very early Steven Rostedt
  13 siblings, 1 reply; 21+ messages in thread
From: Masami Hiramatsu @ 2015-01-13 15:58 UTC (permalink / raw)
  To: Wang Nan
  Cc: linux, mingo, x86, anil.s.keshavamurthy, davem, ananth,
	dave.long, tixy, lizefan, linux-arm-kernel, linux-kernel,
	Steven Rostedt

(2015/01/07 16:34), Wang Nan wrote:
> This patch series introduces early kprobe, a mechanism that allows users
> to trace events very early during boot. It should be useful for optimizing
> system boot. It can also be used by BSP developers to hook their
> platform-specific procedures into kernel boot stages after setup_arch().

Good work!! :)

> This patch series provides x86 and ARM support for early kprobes. The ARM
> portion is based on my OPTPROBES for ARM 32 patches (ARM: kprobes: OPTPROBES
> and other improvements), which have not been accepted yet.
> 
> Kprobes is very useful for tracing events. However, it can only be used
> after the system is fully initialized. When debugging the kernel boot
> stage (for example, checking memory consumption during boot, analyzing
> boot-phase process creation, or optimizing boot speed), specific tools
> must be created. Sometimes we even have to modify kernel code.
> 
> Early kprobes is my answer to this problem. By utilizing OPTPROBES, which
> converts probed instructions into branches instead of breakpoints, kprobes
> can be used even before exception handlers are set up. By adding cmdline
> options, one can insert kprobes to trace the kernel boot stage without
> modifying code.

Hmm, for arm32 this strategy is good, but on x86 not so many instructions
can be optimized. I doubt that we really need to use it before initializing
exception handlers. Since exceptions can happen at any early point, we
need to initialize them at a very early stage.

> BSP developers can also benefit from it. For example, when booting an
> SoC equipped with an unstoppable watchdog like the IMP706, watchdog-kicking
> code must be inserted in several places to keep the watchdog from resetting
> the system before watchdogd is brought up (especially during memory
> initialization, which is the most time-consuming portion of booting).
> With early kprobes, BSP developers can keep such code in their private
> directory without disturbing arch-independent code.
> 
> In this patch series, early kprobes simply print messages when the
> probed instructions are hit. My further plan is to connect the 'ekprobe='
> cmdline parameters to '/sys/kernel/debug/tracing/kprobe_events', allowing
> kprobe events to be installed from the kernel cmdline, and to dump early
> kprobe messages into the ring buffer instead of printing them.

Yeah, I really need this early-ftrace (event-trace) feature to
trace the booting kernel, even without kprobe events.

> Patch 1 - 4 are architecture dependent code, allow text modification
> before kprobes_initialized is setup, and alloc resources statically from
> vmlinux.lds. Currently only x86 and ARM are supported.
> 
> Patch 5 - 8 define required flags and macros.
> 
> Patch 9 is the core logic of early kprobes. When register_kprobe() is
> called before kprobes_initialized, it marks the probed kprobes as
> 'KPROBE_FLAG_EARLY' and allocs resources from slots which is reserved
> during linking. After kprobe is fully initialized, it converts early
> kprobes to normal kprobes.
> 
> Patch 10 enables cmdline option 'ekprobe=', allows setup probe at
> cmdline. However, currently the kprobe handler is only a simple printk.
> 
> Patch 11 introduces required Kconfig options to actually enable early
> kprobes.

BTW, did you ensure all patches in the series are "bisect-clean"?
It seems some early patches in the series depend on later ones.

> 
> Usage of early kprobe is as follow:
> 
> Booting kernel with cmdline 'ekprobe=', like:
> 
> ... rdinit=/sbin/init ekprobe=0xc00f3c2c ekprobe=__free_pages ...
> 
> During boot, kernel will print trace using printk:
> 
>  ...
>  Hit early kprobe at __alloc_pages_nodemask+0x4
>  Hit early kprobe at __free_pages+0x0
>  Hit early kprobe at __alloc_pages_nodemask+0x4
>  Hit early kprobe at __free_pages+0x0
>  Hit early kprobe at __free_pages+0x0
>  Hit early kprobe at __alloc_pages_nodemask+0x4
>  ...
> 
> After the system is fully initialized, early kprobes will be converted to
> normal kprobes, and can be turned off using:

I think they should just be removed automatically instead of being converted.

Thank you!

> 
>  echo 0 > /sys/kernel/debug/kprobes/enabled
> 
> And reenabled using:
> 
>  echo 1 > /sys/kernel/debug/kprobes/enabled
> 
> Also, optimization can be turned off using:
> 
>  echo 0 > /proc/sys/debug/kprobes-optimization
> 
> There's no way to remove a specific early kprobe now. I'd like to convert
> early kprobes into kprobe events in further patches, and then they can be
> removed entirely through the event interface.
> 
> Wang Nan (11):
>   ARM: kprobes: directly modify code if kprobe is not initialized.
>   ARM: kprobes: introduce early kprobes related code area.
>   x86: kprobes: directly modify code if kprobe is not initialized.
>   x86: kprobes: introduce early kprobes related code area.
>   kprobes: Add an KPROBE_FLAG_EARLY for early kprobe.
>   kprobes: makes kprobes_initialized globally visible.
>   kprobes: introduces macros for allocating early kprobe resources.
>   kprobes: allows __alloc_insn_slot() from early kprobes slots.
>   kprobes: core logic of early kprobes.
>   kprobes: enable 'ekprobe=' cmdline option for early kprobes.
>   kprobes: add CONFIG_EARLY_KPROBES option.
> 
>  arch/Kconfig                      |  12 ++
>  arch/arm/include/asm/kprobes.h    |  29 ++++-
>  arch/arm/kernel/vmlinux.lds.S     |   2 +
>  arch/arm/probes/kprobes/opt-arm.c |  11 +-
>  arch/x86/include/asm/insn.h       |   7 +-
>  arch/x86/include/asm/kprobes.h    |  44 +++++--
>  arch/x86/kernel/kprobes/opt.c     |   7 +-
>  arch/x86/kernel/vmlinux.lds.S     |   2 +
>  include/linux/kprobes.h           | 109 ++++++++++++++++++
>  kernel/kprobes.c                  | 237 ++++++++++++++++++++++++++++++++++++--
>  10 files changed, 437 insertions(+), 23 deletions(-)
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 00/11] Early kprobe: enable kprobes at very early
  2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes at very early Wang Nan
                   ` (12 preceding siblings ...)
  2015-01-13 15:58 ` [RFC PATCH 00/11] Early kprobe: enable kprobes at very early Masami Hiramatsu
@ 2015-01-16 17:48 ` Steven Rostedt
  13 siblings, 0 replies; 21+ messages in thread
From: Steven Rostedt @ 2015-01-16 17:48 UTC (permalink / raw)
  To: Wang Nan
  Cc: linux, mingo, x86, masami.hiramatsu.pt, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy, lizefan, linux-arm-kernel,
	linux-kernel

On Wed, Jan 07, 2015 at 03:34:46PM +0800, Wang Nan wrote:
> 
> Patch 10 enables cmdline option 'ekprobe=', allows setup probe at
> cmdline. However, currently the kprobe handler is only a simple printk.
> 
> Patch 11 introduces required Kconfig options to actually enable early
> kprobes.
> 
> Usage of early kprobe is as follow:
> 
> Booting kernel with cmdline 'ekprobe=', like:
> 
> ... rdinit=/sbin/init ekprobe=0xc00f3c2c ekprobe=__free_pages ...

Perhaps you should specify what the probe will do. For now it is only printk.
For example:

  ekprobe=printk,__free_pages

That is, here you are attaching a printk to __free_pages.

Later, when you implement tracing, you could have:

 ekprobe=trace,__free_pages

where it will be sent to the ring buffer.

This will maintain backward compatibility when you add new features,
instead of getting something like printk working now, having people
depend on it, and then breaking them when you switch over to tracing.
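
For illustration, a minimal sketch of how such a prefixed option could
be parsed (parse_ekprobe() and ekprobe_add() are hypothetical names
here, not from the posted patches):

	/* Parse "ekprobe=<action>,<symbol|addr>".  The action name
	 * picks the handler, so "trace" can be added later without
	 * breaking command lines that already use "printk". */
	static int __init parse_ekprobe(char *buf)
	{
		char *sym = strchr(buf, ',');

		if (!sym)			/* no action given */
			return ekprobe_add("printk", buf);
		*sym++ = '\0';			/* buf = action */
		return ekprobe_add(buf, sym);	/* sym = symbol or addr */
	}
	early_param("ekprobe", parse_ekprobe);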

Also, if you plan on converting early kprobes to normal kprobes during
boot, then it should just be kprobes=...; why add the 'e'? Being early
is just an implementation detail, not something that should be expressed
by the users of the facility.

-- Steve


> 
> During boot, kernel will print trace using printk:
> 
>  ...
>  Hit early kprobe at __alloc_pages_nodemask+0x4
>  Hit early kprobe at __free_pages+0x0
>  Hit early kprobe at __alloc_pages_nodemask+0x4
>  Hit early kprobe at __free_pages+0x0
>  Hit early kprobe at __free_pages+0x0
>  Hit early kprobe at __alloc_pages_nodemask+0x4
>  ...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH] x86: kprobes: enable optimize relative call insn
  2015-01-13 15:58 ` [RFC PATCH 00/11] Early kprobe: enable kprobes at very early Masami Hiramatsu
@ 2015-02-06 10:30   ` Wang Nan
  2015-02-07 10:08     ` Masami Hiramatsu
  0 siblings, 1 reply; 21+ messages in thread
From: Wang Nan @ 2015-02-06 10:30 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, hpa, x86, linux-kernel, linux, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy, lizefan, linux-arm-kernel,
	rostedt

In reply to Masami Hiramatsu's question on my previous early kprobe
patch series at:

http://lists.infradead.org/pipermail/linux-arm-kernel/2015-January/315771.html

that on x86 the range of application of early kprobes is limited by the
set of optimizable instructions, I made this patch, which enables
optimizing relative call instructions by introducing a specific template
for them. Such instructions make up about 7% of the kernel. In addition,
when ftrace is enabled, every function entry is one, so early kprobes
will be much more useful than before.
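
To make the template easier to follow, here is a walkthrough of the
64-bit sequence for a probed "call target" at address addr (this is my
reading of the code below; synthesize_set_arg1() patches the two movabs
immediates over the ASM_NOP5 pairs):

	/*
	 *	pushq  %rdi		stack: [saved_rdi]
	 *	movabs $target, %rdi	patched into the 1st NOP slot
	 *	pushq  %rdi		stack: [target][saved_rdi]
	 *	movabs $addr+5, %rdi	patched into the 2nd NOP slot
	 *	xchgq  %rdi, 8(%rsp)	%rdi restored; stack: [target][addr+5]
	 *	retq			jumps to target; the callee will
	 *				return to addr+5, just past the
	 *				original call instruction
	 */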

The relationship between ftrace and kprobes is interesting. Under normal
circumstances, kprobes utilizes ftrace. However, in the early case there
is no way to tell whether the probed instruction is an ftrace entry.
Another possible approach is to move part of the ftrace init ahead.
However, allowing more instructions to be optimized should also be good
for performance.

Masami, I'd like to hear your opinion on it. Do you think this patch is
also useful for the normal cases?

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 arch/x86/include/asm/kprobes.h | 17 +++++++--
 arch/x86/kernel/kprobes/opt.c  | 82 ++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 94 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index 017f4bb..3627694 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -31,6 +31,7 @@
 #define RELATIVEJUMP_OPCODE 0xe9
 #define RELATIVEJUMP_SIZE 5
 #define RELATIVECALL_OPCODE 0xe8
+#define RELATIVECALL_SIZE 5
 #define RELATIVE_ADDR_SIZE 4
 #define MAX_STACK_SIZE 64
 #define MAX_OPTIMIZED_LENGTH (MAX_INSN_SIZE + RELATIVE_ADDR_SIZE)
@@ -38,8 +39,10 @@
 #ifdef __ASSEMBLY__
 
 #define KPROBE_OPCODE_SIZE     1
+#define OPT_CALL_TEMPLATE_SIZE (optprobe_call_template_end - \
+		optprobe_call_template_entry)
 #define MAX_OPTINSN_SIZE ((optprobe_template_end - optprobe_template_entry) + \
-	MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)
+	MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE + OPT_CALL_TEMPLATE_SIZE)
 
 #ifdef CONFIG_EARLY_KPROBES
 # define EARLY_KPROBES_CODES_AREA					\
@@ -81,10 +84,20 @@ extern __visible kprobe_opcode_t optprobe_template_entry;
 extern __visible kprobe_opcode_t optprobe_template_val;
 extern __visible kprobe_opcode_t optprobe_template_call;
 extern __visible kprobe_opcode_t optprobe_template_end;
+
+extern __visible kprobe_opcode_t optprobe_call_template_entry;
+extern __visible kprobe_opcode_t optprobe_call_template_val_destaddr;
+extern __visible kprobe_opcode_t optprobe_call_template_val_retaddr;
+extern __visible kprobe_opcode_t optprobe_call_template_end;
+
+#define OPT_CALL_TEMPLATE_SIZE				\
+	((unsigned long)&optprobe_call_template_end -	\
+	 (unsigned long)&optprobe_call_template_entry)
 #define MAX_OPTINSN_SIZE 				\
 	(((unsigned long)&optprobe_template_end -	\
 	  (unsigned long)&optprobe_template_entry) +	\
-	 MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)
+	 MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE +	\
+	 OPT_CALL_TEMPLATE_SIZE)
 
 extern const int kretprobe_blacklist_size;
 
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index dc5fccb..05dd06f 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -39,6 +39,23 @@
 
 #include "common.h"
 
+static inline bool
+is_relcall(u8 *addr)
+{
+	return (*(u8 *)(addr) == RELATIVECALL_OPCODE);
+}
+
+static inline void *
+get_relcall_target(u8 *addr)
+{
+	struct __arch_relative_insn {
+		u8 op;
+		s32 raddr;
+	} __packed *insn;
+	insn = (struct __arch_relative_insn *)addr;
+	return (void *)((unsigned long)addr + RELATIVECALL_SIZE + insn->raddr);
+}
+
 unsigned long __recover_optprobed_insn(kprobe_opcode_t *buf, unsigned long addr)
 {
 	struct optimized_kprobe *op;
@@ -89,6 +106,48 @@ static void synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
 }
 
 asm (
+#ifdef CONFIG_X86_64
+			".global optprobe_call_template_entry\n"
+			"optprobe_call_template_entry:"
+			"pushq %rdi\n"
+			".global optprobe_call_template_val_destaddr\n"
+			"optprobe_call_template_val_destaddr:"
+			ASM_NOP5
+			ASM_NOP5
+			"pushq %rdi\n"
+			".global optprobe_call_template_val_retaddr\n"
+			"optprobe_call_template_val_retaddr:"
+			ASM_NOP5
+			ASM_NOP5
+			"xchgq %rdi, 8(%rsp)\n"
+			"retq\n"
+#else /* CONFIG_X86_32 */
+			".global optprobe_call_template_entry\n"
+			"optprobe_call_template_entry:"
+			"push %edi\n"
+			".global optprobe_call_template_val_destaddr\n"
+			"optprobe_call_template_val_destaddr:"
+			ASM_NOP5
+			"push %edi\n"
+			".global optprobe_call_template_val_retaddr\n"
+			"optprobe_call_template_val_retaddr:"
+			ASM_NOP5
+			"xchg %edi, 4(%esp)\n"
+			"ret\n"
+#endif
+			".global optprobe_call_template_end\n"
+			"optprobe_call_template_end:\n"
+);
+
+#define __OPTCALL_TMPL_MOVE_DESTADDR_IDX \
+	((long)&optprobe_call_template_val_destaddr - (long)&optprobe_call_template_entry)
+#define __OPTCALL_TMPL_MOVE_RETADDR_IDX \
+	((long)&optprobe_call_template_val_retaddr - (long)&optprobe_call_template_entry)
+#define __OPTCALL_TMPL_END_IDX \
+	((long)&optprobe_call_template_end - (long)&optprobe_call_template_entry)
+#define OPTCALL_TMPL_SIZE	__OPTCALL_TMPL_END_IDX
+
+asm (
 			".global optprobe_template_entry\n"
 			"optprobe_template_entry:\n"
 #ifdef CONFIG_X86_64
@@ -135,6 +194,10 @@ asm (
 #define TMPL_END_IDX \
 	((long)&optprobe_template_end - (long)&optprobe_template_entry)
 
+#define TMPL_OPTCALL_MOVE_DESTADDR_IDX	(TMPL_END_IDX + __OPTCALL_TMPL_MOVE_DESTADDR_IDX)
+#define TMPL_OPTCALL_MOVE_RETADDR_IDX	(TMPL_END_IDX + __OPTCALL_TMPL_MOVE_RETADDR_IDX)
+#define TMPL_OPTCALL_END_IDX	(TMPL_END_IDX + __OPTCALL_TMPL_END_IDX)
+
 #define INT3_SIZE sizeof(kprobe_opcode_t)
 
 /* Optimized kprobe call back function: called from optinsn */
@@ -175,6 +238,12 @@ static int copy_optimized_instructions(u8 *dest, u8 *src)
 {
 	int len = 0, ret;
 
+	if (is_relcall(src)) {
+		memcpy(dest, &optprobe_call_template_entry,
+				OPTCALL_TMPL_SIZE);
+		return OPTCALL_TMPL_SIZE;
+	}
+
 	while (len < RELATIVEJUMP_SIZE) {
 		ret = __copy_instruction(dest + len, src + len);
 		if (!ret || !can_boost(dest + len))
@@ -365,9 +434,16 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
 	/* Set probe function call */
 	synthesize_relcall(buf + TMPL_CALL_IDX, optimized_callback);
 
-	/* Set returning jmp instruction at the tail of out-of-line buffer */
-	synthesize_reljump(buf + TMPL_END_IDX + op->optinsn.size,
-			   (u8 *)op->kp.addr + op->optinsn.size);
+	if (!is_relcall(op->kp.addr)) {
+		/* Set returning jmp instruction at the tail of out-of-line buffer */
+		synthesize_reljump(buf + TMPL_END_IDX + op->optinsn.size,
+				(u8 *)op->kp.addr + op->optinsn.size);
+	} else {
+		synthesize_set_arg1(buf + TMPL_OPTCALL_MOVE_DESTADDR_IDX,
+				(unsigned long)(get_relcall_target(op->kp.addr)));
+		synthesize_set_arg1(buf + TMPL_OPTCALL_MOVE_RETADDR_IDX,
+				(unsigned long)(op->kp.addr + RELATIVECALL_SIZE));
+	}
 
 	flush_icache_range((unsigned long) buf,
 			   (unsigned long) buf + TMPL_END_IDX +
-- 
1.8.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH] x86: kprobes: enable optimize relative call insn
  2015-02-06 10:30   ` [RFC PATCH] x86: kprobes: enable optimize relative call insn Wang Nan
@ 2015-02-07 10:08     ` Masami Hiramatsu
  2015-02-07 10:42       ` Wang Nan
  0 siblings, 1 reply; 21+ messages in thread
From: Masami Hiramatsu @ 2015-02-07 10:08 UTC (permalink / raw)
  To: Wang Nan
  Cc: Ingo Molnar, hpa, x86, linux-kernel, linux, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy, lizefan, linux-arm-kernel,
	rostedt

(2015/02/06 19:30), Wang Nan wrote:
> In reply to Masami Hiramatsu's question on my previous early kprobe
> patch series at:
> 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-January/315771.html
> 
> that on x86 the range of application of early kprobes is limited by the
> set of optimizable instructions, I made this patch, which enables
> optimizing relative call instructions by introducing a specific template
> for them. Such instructions make up about 7% of the kernel. In addition,
> when ftrace is enabled, every function entry is one, so early kprobes
> will be much more useful than before.

Sorry, I couldn't understand this part. If you put a kprobe on an ftrace
site after ftrace is enabled, it uses ftrace directly instead of int3 or
a jump. Anyway, ftrace-site instructions should be controlled by ftrace,
not kprobes.
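
For reference, the existing redirection is roughly this (a sketch of
the logic in kernel/kprobes.c; probe_uses_ftrace() is an invented
helper name):

	static bool probe_uses_ftrace(struct kprobe *p)
	{
		/* ftrace_location() reports whether the address is a
		 * recorded mcount/fentry call site */
		if (ftrace_location((unsigned long)p->addr)) {
			/* driven by the ftrace handler; no int3 or
			 * jump is planted at the call site */
			p->flags |= KPROBE_FLAG_FTRACE;
			return true;
		}
		return false;
	}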

> 
> The relationship between ftrace and kprobes is interesting. Under normal
> circumstances, kprobes utilizes ftrace. However, in the early case there
> is no way to tell whether the probed instruction is an ftrace entry.
> Another possible approach is to move part of the ftrace init ahead.
> However, allowing more instructions to be optimized should also be good
> for performance.
> 
> Masami, I'd like to hear your opinion on it. Do you think this patch is
> also useful for the normal cases?

Expanding "optimizability" is good, but I don't like add new asm-templates
which reduces maintainability less. Perhaps, we'd better start with reviewing
can_boost table again...

Thank you,

> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> ---
>  arch/x86/include/asm/kprobes.h | 17 +++++++--
>  arch/x86/kernel/kprobes/opt.c  | 82 ++++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 94 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
> index 017f4bb..3627694 100644
> --- a/arch/x86/include/asm/kprobes.h
> +++ b/arch/x86/include/asm/kprobes.h
> @@ -31,6 +31,7 @@
>  #define RELATIVEJUMP_OPCODE 0xe9
>  #define RELATIVEJUMP_SIZE 5
>  #define RELATIVECALL_OPCODE 0xe8
> +#define RELATIVECALL_SIZE 5
>  #define RELATIVE_ADDR_SIZE 4
>  #define MAX_STACK_SIZE 64
>  #define MAX_OPTIMIZED_LENGTH (MAX_INSN_SIZE + RELATIVE_ADDR_SIZE)
> @@ -38,8 +39,10 @@
>  #ifdef __ASSEMBLY__
>  
>  #define KPROBE_OPCODE_SIZE     1
> +#define OPT_CALL_TEMPLATE_SIZE (optprobe_call_template_end - \
> +		optprobe_call_template_entry)
>  #define MAX_OPTINSN_SIZE ((optprobe_template_end - optprobe_template_entry) + \
> -	MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)
> +	MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE + OPT_CALL_TEMPLATE_SIZE)
>  
>  #ifdef CONFIG_EARLY_KPROBES
>  # define EARLY_KPROBES_CODES_AREA					\
> @@ -81,10 +84,20 @@ extern __visible kprobe_opcode_t optprobe_template_entry;
>  extern __visible kprobe_opcode_t optprobe_template_val;
>  extern __visible kprobe_opcode_t optprobe_template_call;
>  extern __visible kprobe_opcode_t optprobe_template_end;
> +
> +extern __visible kprobe_opcode_t optprobe_call_template_entry;
> +extern __visible kprobe_opcode_t optprobe_call_template_val_destaddr;
> +extern __visible kprobe_opcode_t optprobe_call_template_val_retaddr;
> +extern __visible kprobe_opcode_t optprobe_call_template_end;
> +
> +#define OPT_CALL_TEMPLATE_SIZE				\
> +	((unsigned long)&optprobe_call_template_end -	\
> +	 (unsigned long)&optprobe_call_template_entry)
>  #define MAX_OPTINSN_SIZE 				\
>  	(((unsigned long)&optprobe_template_end -	\
>  	  (unsigned long)&optprobe_template_entry) +	\
> -	 MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)
> +	 MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE +	\
> +	 OPT_CALL_TEMPLATE_SIZE)
>  
>  extern const int kretprobe_blacklist_size;
>  
> diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
> index dc5fccb..05dd06f 100644
> --- a/arch/x86/kernel/kprobes/opt.c
> +++ b/arch/x86/kernel/kprobes/opt.c
> @@ -39,6 +39,23 @@
>  
>  #include "common.h"
>  
> +static inline bool
> +is_relcall(u8 *addr)
> +{
> +	return (*(u8 *)(addr) == RELATIVECALL_OPCODE);
> +}
> +
> +static inline void *
> +get_relcall_target(u8 *addr)
> +{
> +	struct __arch_relative_insn {
> +		u8 op;
> +		s32 raddr;
> +	} __packed *insn;
> +	insn = (struct __arch_relative_insn *)addr;
> +	return (void *)((unsigned long)addr + RELATIVECALL_SIZE + insn->raddr);
> +}
> +
>  unsigned long __recover_optprobed_insn(kprobe_opcode_t *buf, unsigned long addr)
>  {
>  	struct optimized_kprobe *op;
> @@ -89,6 +106,48 @@ static void synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
>  }
>  
>  asm (
> +#ifdef CONFIG_X86_64
> +			".global optprobe_call_template_entry\n"
> +			"optprobe_call_template_entry:"
> +			"pushq %rdi\n"
> +			".global optprobe_call_template_val_destaddr\n"
> +			"optprobe_call_template_val_destaddr:"
> +			ASM_NOP5
> +			ASM_NOP5
> +			"pushq %rdi\n"
> +			".global optprobe_call_template_val_retaddr\n"
> +			"optprobe_call_template_val_retaddr:"
> +			ASM_NOP5
> +			ASM_NOP5
> +			"xchgq %rdi, 8(%rsp)\n"
> +			"retq\n"
> +#else /* CONFIG_X86_32 */
> +			".global optprobe_call_template_entry\n"
> +			"optprobe_call_template_entry:"
> +			"push %edi\n"
> +			".global optprobe_call_template_val_destaddr\n"
> +			"optprobe_call_template_val_destaddr:"
> +			ASM_NOP5
> +			"push %edi\n"
> +			".global optprobe_call_template_val_retaddr\n"
> +			"optprobe_call_template_val_retaddr:"
> +			ASM_NOP5
> +			"xchg %edi, 4(%esp)\n"
> +			"ret\n"
> +#endif
> +			".global optprobe_call_template_end\n"
> +			"optprobe_call_template_end:\n"
> +);
> +
> +#define __OPTCALL_TMPL_MOVE_DESTADDR_IDX \
> +	((long)&optprobe_call_template_val_destaddr - (long)&optprobe_call_template_entry)
> +#define __OPTCALL_TMPL_MOVE_RETADDR_IDX \
> +	((long)&optprobe_call_template_val_retaddr - (long)&optprobe_call_template_entry)
> +#define __OPTCALL_TMPL_END_IDX \
> +	((long)&optprobe_call_template_end - (long)&optprobe_call_template_entry)
> +#define OPTCALL_TMPL_SIZE	__OPTCALL_TMPL_END_IDX
> +
> +asm (
>  			".global optprobe_template_entry\n"
>  			"optprobe_template_entry:\n"
>  #ifdef CONFIG_X86_64
> @@ -135,6 +194,10 @@ asm (
>  #define TMPL_END_IDX \
>  	((long)&optprobe_template_end - (long)&optprobe_template_entry)
>  
> +#define TMPL_OPTCALL_MOVE_DESTADDR_IDX	(TMPL_END_IDX + __OPTCALL_TMPL_MOVE_DESTADDR_IDX)
> +#define TMPL_OPTCALL_MOVE_RETADDR_IDX	(TMPL_END_IDX + __OPTCALL_TMPL_MOVE_RETADDR_IDX)
> +#define TMPL_OPTCALL_END_IDX	(TMPL_END_IDX + __OPTCALL_TMPL_END_IDX)
> +
>  #define INT3_SIZE sizeof(kprobe_opcode_t)
>  
>  /* Optimized kprobe call back function: called from optinsn */
> @@ -175,6 +238,12 @@ static int copy_optimized_instructions(u8 *dest, u8 *src)
>  {
>  	int len = 0, ret;
>  
> +	if (is_relcall(src)) {
> +		memcpy(dest, &optprobe_call_template_entry,
> +				OPTCALL_TMPL_SIZE);
> +		return OPTCALL_TMPL_SIZE;
> +	}
> +
>  	while (len < RELATIVEJUMP_SIZE) {
>  		ret = __copy_instruction(dest + len, src + len);
>  		if (!ret || !can_boost(dest + len))
> @@ -365,9 +434,16 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
>  	/* Set probe function call */
>  	synthesize_relcall(buf + TMPL_CALL_IDX, optimized_callback);
>  
> -	/* Set returning jmp instruction at the tail of out-of-line buffer */
> -	synthesize_reljump(buf + TMPL_END_IDX + op->optinsn.size,
> -			   (u8 *)op->kp.addr + op->optinsn.size);
> +	if (!is_relcall(op->kp.addr)) {
> +		/* Set returning jmp instruction at the tail of out-of-line buffer */
> +		synthesize_reljump(buf + TMPL_END_IDX + op->optinsn.size,
> +				(u8 *)op->kp.addr + op->optinsn.size);
> +	} else {
> +		synthesize_set_arg1(buf + TMPL_OPTCALL_MOVE_DESTADDR_IDX,
> +				(unsigned long)(get_relcall_target(op->kp.addr)));
> +		synthesize_set_arg1(buf + TMPL_OPTCALL_MOVE_RETADDR_IDX,
> +				(unsigned long)(op->kp.addr + RELATIVECALL_SIZE));
> +	}
>  
>  	flush_icache_range((unsigned long) buf,
>  			   (unsigned long) buf + TMPL_END_IDX +
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH] x86: kprobes: enable optimize relative call insn
  2015-02-07 10:08     ` Masami Hiramatsu
@ 2015-02-07 10:42       ` Wang Nan
  0 siblings, 0 replies; 21+ messages in thread
From: Wang Nan @ 2015-02-07 10:42 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, hpa, x86, linux-kernel, linux, anil.s.keshavamurthy,
	davem, ananth, dave.long, tixy, lizefan, linux-arm-kernel,
	rostedt

On 2015/2/7 18:08, Masami Hiramatsu wrote:
> (2015/02/06 19:30), Wang Nan wrote:
>> In reply to Masami Hiramatsu's question on my previous early kprobe
>> patch series at:
>>
>> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-January/315771.html
>>
>> that on x86 the range of application of early kprobes is limited by the
>> set of optimizable instructions, I made this patch, which enables
>> optimizing relative call instructions by introducing a specific template
>> for them. Such instructions make up about 7% of the kernel. In addition,
>> when ftrace is enabled, every function entry is one, so early kprobes
>> will be much more useful than before.
> 
> Sorry, I couldn't understand this part. If you put a kprobe on an ftrace
> site after ftrace is enabled, it uses ftrace directly instead of int3 or
> a jump. Anyway, ftrace-site instructions should be controlled by ftrace,
> not kprobes.
> 

I agree with you. Let ftrace control ftrace things.

The goal of this patch is to expand the usefulness of early kprobes. I
posted the early kprobe patch series several weeks ago.

Early kprobes rely on kprobeopt: it directly optimizes the probed
instruction to avoid an exception, so we can use it before the exception
handlers are installed, and definitely before ftrace_init() (that function
converts function entries from CALL to NOP when ftrace is on).
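
Roughly, the early registration path looks like this (a simplified
sketch; early_kprobe_optimize() and register_kprobe_normal() are
stand-in names, error handling omitted):

	int register_kprobe(struct kprobe *p)
	{
		if (unlikely(!kprobes_initialized)) {
			p->flags |= KPROBE_FLAG_EARLY;
			/* take a statically reserved slot and patch the
			 * probed insn directly into a jump -- no int3,
			 * so no exception handler is required */
			return early_kprobe_optimize(p);
		}
		/* usual path: int3 breakpoint, optimized later */
		return register_kprobe_normal(p);
	}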

As you pointed out, on x86 this strategy is not good because not so many
instructions can be optimized. In my opinion, the most intolerable
limitation is that we are unable to optimize the 'call' instruction, so we
cannot probe function entries before ftrace_init().

With this patch my early kprobes are able to probe function entries. I
thought you might allow 'CALL' to be a special case that deserves a new
template for early kprobes. Anyway, thank you for your quick response, so
I won't spend more time on it.

My next plan is to split ftrace_init() into two parts: converting 'CALL'
to 'NOP' before early kprobes are registered, and doing everything else
at its current place.
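
Something along these lines (only a sketch; the function name is
invented, and the real ftrace_init() would have to skip patching the
sites already converted):

	extern unsigned long __start_mcount_loc[], __stop_mcount_loc[];

	/* New very-early phase: only the CALL -> NOP rewrite, so that
	 * early kprobes can optimize function entries afterwards. */
	void __init ftrace_nop_mcount_sites(void)
	{
		unsigned long *p;

		for (p = __start_mcount_loc; p < __stop_mcount_loc; p++)
			text_poke_early((void *)*p,
					ideal_nops[NOP_ATOMIC5],
					MCOUNT_INSN_SIZE);
	}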

>>
>> The relationship between ftrace and kprobes is interesting. Under normal
>> circumstances, kprobes utilizes ftrace. However, in the early case there
>> is no way to tell whether the probed instruction is an ftrace entry.
>> Another possible approach is to move part of the ftrace init ahead.
>> However, allowing more instructions to be optimized should also be good
>> for performance.
>>
>> Masami, I'd like to hear your opinion on it. Do you think this patch is
>> also useful for the normal cases?
> 
> Expanding "optimizability" is good, but I don't like adding new asm
> templates, which reduces maintainability. Perhaps we'd better start by
> reviewing the can_boost table again...
> 
> Thank you,
> 
>>
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> ---
>>  arch/x86/include/asm/kprobes.h | 17 +++++++--
>>  arch/x86/kernel/kprobes/opt.c  | 82 ++++++++++++++++++++++++++++++++++++++++--
>>  2 files changed, 94 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
>> index 017f4bb..3627694 100644
>> --- a/arch/x86/include/asm/kprobes.h
>> +++ b/arch/x86/include/asm/kprobes.h
>> @@ -31,6 +31,7 @@
>>  #define RELATIVEJUMP_OPCODE 0xe9
>>  #define RELATIVEJUMP_SIZE 5
>>  #define RELATIVECALL_OPCODE 0xe8
>> +#define RELATIVECALL_SIZE 5
>>  #define RELATIVE_ADDR_SIZE 4
>>  #define MAX_STACK_SIZE 64
>>  #define MAX_OPTIMIZED_LENGTH (MAX_INSN_SIZE + RELATIVE_ADDR_SIZE)
>> @@ -38,8 +39,10 @@
>>  #ifdef __ASSEMBLY__
>>  
>>  #define KPROBE_OPCODE_SIZE     1
>> +#define OPT_CALL_TEMPLATE_SIZE (optprobe_call_template_end - \
>> +		optprobe_call_template_entry)
>>  #define MAX_OPTINSN_SIZE ((optprobe_template_end - optprobe_template_entry) + \
>> -	MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)
>> +	MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE + OPT_CALL_TEMPLATE_SIZE)
>>  
>>  #ifdef CONFIG_EARLY_KPROBES
>>  # define EARLY_KPROBES_CODES_AREA					\
>> @@ -81,10 +84,20 @@ extern __visible kprobe_opcode_t optprobe_template_entry;
>>  extern __visible kprobe_opcode_t optprobe_template_val;
>>  extern __visible kprobe_opcode_t optprobe_template_call;
>>  extern __visible kprobe_opcode_t optprobe_template_end;
>> +
>> +extern __visible kprobe_opcode_t optprobe_call_template_entry;
>> +extern __visible kprobe_opcode_t optprobe_call_template_val_destaddr;
>> +extern __visible kprobe_opcode_t optprobe_call_template_val_retaddr;
>> +extern __visible kprobe_opcode_t optprobe_call_template_end;
>> +
>> +#define OPT_CALL_TEMPLATE_SIZE				\
>> +	((unsigned long)&optprobe_call_template_end -	\
>> +	 (unsigned long)&optprobe_call_template_entry)
>>  #define MAX_OPTINSN_SIZE 				\
>>  	(((unsigned long)&optprobe_template_end -	\
>>  	  (unsigned long)&optprobe_template_entry) +	\
>> -	 MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)
>> +	 MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE +	\
>> +	 OPT_CALL_TEMPLATE_SIZE)
>>  
>>  extern const int kretprobe_blacklist_size;
>>  
>> diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
>> index dc5fccb..05dd06f 100644
>> --- a/arch/x86/kernel/kprobes/opt.c
>> +++ b/arch/x86/kernel/kprobes/opt.c
>> @@ -39,6 +39,23 @@
>>  
>>  #include "common.h"
>>  
>> +static inline bool
>> +is_relcall(u8 *addr)
>> +{
>> +	return (*(u8 *)(addr) == RELATIVECALL_OPCODE);
>> +}
>> +
>> +static inline void *
>> +get_relcall_target(u8 *addr)
>> +{
>> +	struct __arch_relative_insn {
>> +		u8 op;
>> +		s32 raddr;
>> +	} __packed *insn;
>> +	insn = (struct __arch_relative_insn *)addr;
>> +	return (void *)((unsigned long)addr + RELATIVECALL_SIZE + insn->raddr);
>> +}
>> +
>>  unsigned long __recover_optprobed_insn(kprobe_opcode_t *buf, unsigned long addr)
>>  {
>>  	struct optimized_kprobe *op;
>> @@ -89,6 +106,48 @@ static void synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
>>  }
>>  
>>  asm (
>> +#ifdef CONFIG_X86_64
>> +			".global optprobe_call_template_entry\n"
>> +			"optprobe_call_template_entry:"
>> +			"pushq %rdi\n"
>> +			".global optprobe_call_template_val_destaddr\n"
>> +			"optprobe_call_template_val_destaddr:"
>> +			ASM_NOP5
>> +			ASM_NOP5
>> +			"pushq %rdi\n"
>> +			".global optprobe_call_template_val_retaddr\n"
>> +			"optprobe_call_template_val_retaddr:"
>> +			ASM_NOP5
>> +			ASM_NOP5
>> +			"xchgq %rdi, 8(%rsp)\n"
>> +			"retq\n"
>> +#else /* CONFIG_X86_32 */
>> +			".global optprobe_call_template_entry\n"
>> +			"optprobe_call_template_entry:"
>> +			"push %edi\n"
>> +			".global optprobe_call_template_val_destaddr\n"
>> +			"optprobe_call_template_val_destaddr:"
>> +			ASM_NOP5
>> +			"push %edi\n"
>> +			".global optprobe_call_template_val_retaddr\n"
>> +			"optprobe_call_template_val_retaddr:"
>> +			ASM_NOP5
>> +			"xchg %edi, 4(%esp)\n"
>> +			"ret\n"
>> +#endif
>> +			".global optprobe_call_template_end\n"
>> +			"optprobe_call_template_end:\n"
>> +);
>> +
>> +#define __OPTCALL_TMPL_MOVE_DESTADDR_IDX \
>> +	((long)&optprobe_call_template_val_destaddr - (long)&optprobe_call_template_entry)
>> +#define __OPTCALL_TMPL_MOVE_RETADDR_IDX \
>> +	((long)&optprobe_call_template_val_retaddr - (long)&optprobe_call_template_entry)
>> +#define __OPTCALL_TMPL_END_IDX \
>> +	((long)&optprobe_call_template_end - (long)&optprobe_call_template_entry)
>> +#define OPTCALL_TMPL_SIZE	__OPTCALL_TMPL_END_IDX
>> +
>> +asm (
>>  			".global optprobe_template_entry\n"
>>  			"optprobe_template_entry:\n"
>>  #ifdef CONFIG_X86_64
>> @@ -135,6 +194,10 @@ asm (
>>  #define TMPL_END_IDX \
>>  	((long)&optprobe_template_end - (long)&optprobe_template_entry)
>>  
>> +#define TMPL_OPTCALL_MOVE_DESTADDR_IDX	(TMPL_END_IDX + __OPTCALL_TMPL_MOVE_DESTADDR_IDX)
>> +#define TMPL_OPTCALL_MOVE_RETADDR_IDX	(TMPL_END_IDX + __OPTCALL_TMPL_MOVE_RETADDR_IDX)
>> +#define TMPL_OPTCALL_END_IDX	(TMPL_END_IDX + __OPTCALL_TMPL_END_IDX)
>> +
>>  #define INT3_SIZE sizeof(kprobe_opcode_t)
>>  
>>  /* Optimized kprobe call back function: called from optinsn */
>> @@ -175,6 +238,12 @@ static int copy_optimized_instructions(u8 *dest, u8 *src)
>>  {
>>  	int len = 0, ret;
>>  
>> +	if (is_relcall(src)) {
>> +		memcpy(dest, &optprobe_call_template_entry,
>> +				OPTCALL_TMPL_SIZE);
>> +		return OPTCALL_TMPL_SIZE;
>> +	}
>> +
>>  	while (len < RELATIVEJUMP_SIZE) {
>>  		ret = __copy_instruction(dest + len, src + len);
>>  		if (!ret || !can_boost(dest + len))
>> @@ -365,9 +434,16 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
>>  	/* Set probe function call */
>>  	synthesize_relcall(buf + TMPL_CALL_IDX, optimized_callback);
>>  
>> -	/* Set returning jmp instruction at the tail of out-of-line buffer */
>> -	synthesize_reljump(buf + TMPL_END_IDX + op->optinsn.size,
>> -			   (u8 *)op->kp.addr + op->optinsn.size);
>> +	if (!is_relcall(op->kp.addr)) {
>> +		/* Set returning jmp instruction at the tail of out-of-line buffer */
>> +		synthesize_reljump(buf + TMPL_END_IDX + op->optinsn.size,
>> +				(u8 *)op->kp.addr + op->optinsn.size);
>> +	} else {
>> +		synthesize_set_arg1(buf + TMPL_OPTCALL_MOVE_DESTADDR_IDX,
>> +				(unsigned long)(get_relcall_target(op->kp.addr)));
>> +		synthesize_set_arg1(buf + TMPL_OPTCALL_MOVE_RETADDR_IDX,
>> +				(unsigned long)(op->kp.addr + RELATIVECALL_SIZE));
>> +	}
>>  
>>  	flush_icache_range((unsigned long) buf,
>>  			   (unsigned long) buf + TMPL_END_IDX +
>>
> 
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2015-02-07 10:47 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-07  7:34 [RFC PATCH 00/11] Early kprobe: enable kprobes at very early Wang Nan
2015-01-07  7:35 ` [RFC PATCH 01/11] ARM: kprobes: directly modify code if kprobe is not initialized Wang Nan
2015-01-07 17:15   ` Christopher Covington
2015-01-13 15:34   ` Masami Hiramatsu
2015-01-07  7:35 ` [RFC PATCH 02/11] ARM: kprobes: introduce early kprobes related code area Wang Nan
2015-01-07  7:35 ` [RFC PATCH 03/11] x86: kprobes: directly modify code if kprobe is not initialized Wang Nan
2015-01-07  7:35 ` [RFC PATCH 04/11] x86: kprobes: introduce early kprobes related code area Wang Nan
2015-01-07  7:35 ` [RFC PATCH 05/11] kprobes: Add an KPROBE_FLAG_EARLY for early kprobe Wang Nan
2015-01-07  7:35 ` [RFC PATCH 06/11] kprobes: makes kprobes_initialized globally visible Wang Nan
2015-01-07  7:35 ` [RFC PATCH 07/11] kprobes: introduces macros for allocating early kprobe resources Wang Nan
2015-01-07 10:45   ` Wang Nan
2015-01-07  7:35 ` [RFC PATCH 08/11] kprobes: allows __alloc_insn_slot() from early kprobes slots Wang Nan
2015-01-07  7:36 ` [RFC PATCH 09/11] kprobes: core logic of early kprobes Wang Nan
2015-01-07  7:36 ` [RFC PATCH 10/11] kprobes: enable 'ekprobe=' cmdline option for early kprobes Wang Nan
2015-01-07  7:36 ` [RFC PATCH 11/11] kprobes: add CONFIG_EARLY_KPROBES option Wang Nan
2015-01-07  7:42 ` Wang Nan
2015-01-13 15:58 ` [RFC PATCH 00/11] Early kprobe: enable kprobes at very early Masami Hiramatsu
2015-02-06 10:30   ` [RFC PATCH] x86: kprobes: enable optimize relative call insn Wang Nan
2015-02-07 10:08     ` Masami Hiramatsu
2015-02-07 10:42       ` Wang Nan
2015-01-16 17:48 ` [RFC PATCH 00/11] Early kprobe: enable kprobes at very early Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).