* [PATCH 0/9][RFC] ftrace: ftrace location lookup speedup, and clean ups
@ 2012-05-02 19:24 Steven Rostedt
  2012-05-02 19:24 ` [PATCH 1/9][RFC] ftrace: Sort all function addresses, not just per page Steven Rostedt
                   ` (8 more replies)
  0 siblings, 9 replies; 25+ messages in thread
From: Steven Rostedt @ 2012-05-02 19:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin


These patches are based on a previous pull request:

  http://comments.gmane.org/gmane.linux.kernel/1289098

(Hmm, LKML.org is missing April 27th and 28th, as that pull request
 was on the 27th)

This code makes the search for an ftrace location done by kprobes
much faster, as it now does a binary search instead of a linear one.
It also consolidates the two functions ftrace_location() and
ftrace_text_reserved(), as one looks up a single addr and the other
a range. The two functions still exist, but use the same search.
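
For illustration only (this is not one of the patches), the lookup idea
boils down to the following userspace sketch using libc qsort()/bsearch();
the record type, the addresses and the 5-byte size are made-up stand-ins
for dyn_ftrace and MCOUNT_INSN_SIZE:

#include <stdio.h>
#include <stdlib.h>

struct rec { unsigned long ip; };	/* toy stand-in for dyn_ftrace */
#define INSN_SIZE 5UL			/* stand-in for MCOUNT_INSN_SIZE */

static int cmp_ips(const void *a, const void *b)	/* for sorting */
{
	const struct rec *ra = a, *rb = b;

	return (ra->ip > rb->ip) - (ra->ip < rb->ip);
}

static int cmp_loc(const void *a, const void *b)	/* for the lookup */
{
	const struct rec *key = a, *rec = b;

	if (key->ip < rec->ip)
		return -1;
	if (key->ip >= rec->ip + INSN_SIZE)
		return 1;
	return 0;			/* ip hits this record's nop */
}

int main(void)
{
	struct rec recs[] = { { 0x400 }, { 0x100 }, { 0x300 }, { 0x200 } };
	struct rec key = { 0x302 };
	struct rec *found;

	qsort(recs, 4, sizeof(recs[0]), cmp_ips);
	found = bsearch(&key, recs, 4, sizeof(recs[0]), cmp_loc);
	printf("0x%lx -> %s\n", key.ip, found ? "traced location" : "not found");
	return 0;
}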

Some more clean ups are also made to the ftrace code.

These haven't been through all my tests yet, but if they survive, these
will be the patches I use for my next git pull request.

I would like these to make it into 3.5. Please comment if you have concerns.

Thanks!

-- Steve

Steven Rostedt (9):
      ftrace: Sort all function addresses, not just per page
      ftrace: Remove extra helper functions
      ftrace: Speed up search by skipping pages by address
      ftrace: Consolidate ftrace_location() and ftrace_text_reserved()
      ftrace: Return record ip addr for ftrace_location()
      kprobes: Allow probe on ftrace reserved text (but move it)
      ftrace: Make ftrace_modify_all_code() global for archs to use
      ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code()
      ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER

----
 arch/x86/kernel/ftrace.c          |   15 +--
 include/asm-generic/vmlinux.lds.h |    2 +-
 include/linux/ftrace.h            |    7 +-
 include/linux/kprobes.h           |    1 +
 kernel/kprobes.c                  |   13 ++-
 kernel/trace/Kconfig              |    1 -
 kernel/trace/ftrace.c             |  198 ++++++++++++++++++++-----------------
 7 files changed, 129 insertions(+), 108 deletions(-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/9][RFC] ftrace: Sort all function addresses, not just per page
  2012-05-02 19:24 [PATCH 0/9][RFC] ftrace: ftrace location lookup speedup, and clean ups Steven Rostedt
@ 2012-05-02 19:24 ` Steven Rostedt
  2012-05-02 19:24 ` [PATCH 2/9][RFC] ftrace: Remove extra helper functions Steven Rostedt
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2012-05-02 19:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0001-ftrace-Sort-all-function-addresses-not-just-per-page.patch --]
[-- Type: text/plain, Size: 2569 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Instead of just sorting the ips of the functions per ftrace page,
sort the entire list before adding them to the ftrace pages.

This will allow the bsearch algorithm to be sped up, as the search
can now also go by page, not just by record within a page.
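
To illustrate why this matters (made-up addresses, three records per page):

  Before (each page sorted only internally, in mcount_loc order):
    page 0: 0x1000 0x4000 0x7000
    page 1: 0x2000 0x5000 0x8000   <- page ranges overlap; every page
                                      must still be searched

  After (the whole mcount_loc table is sorted first):
    page 0: 0x1000 0x2000 0x4000
    page 1: 0x5000 0x7000 0x8000   <- page ranges are disjoint and ordered;
                                      a later patch can skip whole pages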

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/asm-generic/vmlinux.lds.h |    2 +-
 kernel/trace/ftrace.c             |   34 ++++++++++++++++++++++------------
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 8aeadf6..4e2e1cc 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -486,8 +486,8 @@
 	CPU_DISCARD(init.data)						\
 	MEM_DISCARD(init.data)						\
 	KERNEL_CTORS()							\
-	*(.init.rodata)							\
 	MCOUNT_REC()							\
+	*(.init.rodata)							\
 	FTRACE_EVENTS()							\
 	TRACE_SYSCALLS()						\
 	DEV_DISCARD(init.rodata)					\
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 0fa92f6..6a19e81 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3688,15 +3688,27 @@ static __init int ftrace_init_dyn_debugfs(struct dentry *d_tracer)
 	return 0;
 }
 
-static void ftrace_swap_recs(void *a, void *b, int size)
+static int ftrace_cmp_ips(const void *a, const void *b)
 {
-	struct dyn_ftrace *reca = a;
-	struct dyn_ftrace *recb = b;
-	struct dyn_ftrace t;
+	const unsigned long *ipa = a;
+	const unsigned long *ipb = b;
 
-	t = *reca;
-	*reca = *recb;
-	*recb = t;
+	if (*ipa > *ipb)
+		return 1;
+	if (*ipa < *ipb)
+		return -1;
+	return 0;
+}
+
+static void ftrace_swap_ips(void *a, void *b, int size)
+{
+	unsigned long *ipa = a;
+	unsigned long *ipb = b;
+	unsigned long t;
+
+	t = *ipa;
+	*ipa = *ipb;
+	*ipb = t;
 }
 
 static int ftrace_process_locs(struct module *mod,
@@ -3715,6 +3727,9 @@ static int ftrace_process_locs(struct module *mod,
 	if (!count)
 		return 0;
 
+	sort(start, count, sizeof(*start),
+	     ftrace_cmp_ips, ftrace_swap_ips);
+
 	pg = ftrace_allocate_pages(count);
 	if (!pg)
 		return -ENOMEM;
@@ -3762,11 +3777,6 @@ static int ftrace_process_locs(struct module *mod,
 	/* These new locations need to be initialized */
 	ftrace_new_pgs = pg;
 
-	/* Make each individual set of pages sorted by ips */
-	for (; pg; pg = pg->next)
-		sort(pg->records, pg->index, sizeof(struct dyn_ftrace),
-		     ftrace_cmp_recs, ftrace_swap_recs);
-
 	/*
 	 * We only need to disable interrupts on start up
 	 * because we are modifying code that an interrupt
-- 
1.7.9.5



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/9][RFC] ftrace: Remove extra helper functions
  2012-05-02 19:24 [PATCH 0/9][RFC] ftrace: ftrace location lookup speedup, and clean ups Steven Rostedt
  2012-05-02 19:24 ` [PATCH 1/9][RFC] ftrace: Sort all function addresses, not just per page Steven Rostedt
@ 2012-05-02 19:24 ` Steven Rostedt
  2012-05-02 19:24 ` [PATCH 3/9][RFC] ftrace: Speed up search by skipping pages by address Steven Rostedt
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2012-05-02 19:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0002-ftrace-Remove-extra-helper-functions.patch --]
[-- Type: text/plain, Size: 3279 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

The ftrace_record_ip() and ftrace_alloc_dyn_node() functions date from
the time of the ftrace daemon. Although they are still used, they
make things a bit more complex than necessary.

Move the code into the one function that uses it, and remove the
helper functions.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c |   61 +++++++++++++++++++------------------------------
 1 file changed, 24 insertions(+), 37 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 6a19e81..4e608f8 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1520,35 +1520,6 @@ static void ftrace_hash_rec_enable(struct ftrace_ops *ops,
 	__ftrace_hash_rec_update(ops, filter_hash, 1);
 }
 
-static struct dyn_ftrace *ftrace_alloc_dyn_node(unsigned long ip)
-{
-	if (ftrace_pages->index == ftrace_pages->size) {
-		/* We should have allocated enough */
-		if (WARN_ON(!ftrace_pages->next))
-			return NULL;
-		ftrace_pages = ftrace_pages->next;
-	}
-
-	return &ftrace_pages->records[ftrace_pages->index++];
-}
-
-static struct dyn_ftrace *
-ftrace_record_ip(unsigned long ip)
-{
-	struct dyn_ftrace *rec;
-
-	if (ftrace_disabled)
-		return NULL;
-
-	rec = ftrace_alloc_dyn_node(ip);
-	if (!rec)
-		return NULL;
-
-	rec->ip = ip;
-
-	return rec;
-}
-
 static void print_ip_ins(const char *fmt, unsigned char *p)
 {
 	int i;
@@ -3715,7 +3686,9 @@ static int ftrace_process_locs(struct module *mod,
 			       unsigned long *start,
 			       unsigned long *end)
 {
+	struct ftrace_page *start_pg;
 	struct ftrace_page *pg;
+	struct dyn_ftrace *rec;
 	unsigned long count;
 	unsigned long *p;
 	unsigned long addr;
@@ -3730,8 +3703,8 @@ static int ftrace_process_locs(struct module *mod,
 	sort(start, count, sizeof(*start),
 	     ftrace_cmp_ips, ftrace_swap_ips);
 
-	pg = ftrace_allocate_pages(count);
-	if (!pg)
+	start_pg = ftrace_allocate_pages(count);
+	if (!start_pg)
 		return -ENOMEM;
 
 	mutex_lock(&ftrace_lock);
@@ -3744,7 +3717,7 @@ static int ftrace_process_locs(struct module *mod,
 	if (!mod) {
 		WARN_ON(ftrace_pages || ftrace_pages_start);
 		/* First initialization */
-		ftrace_pages = ftrace_pages_start = pg;
+		ftrace_pages = ftrace_pages_start = start_pg;
 	} else {
 		if (!ftrace_pages)
 			goto out;
@@ -3755,11 +3728,11 @@ static int ftrace_process_locs(struct module *mod,
 				ftrace_pages = ftrace_pages->next;
 		}
 
-		ftrace_pages->next = pg;
-		ftrace_pages = pg;
+		ftrace_pages->next = start_pg;
 	}
 
 	p = start;
+	pg = start_pg;
 	while (p < end) {
 		addr = ftrace_call_adjust(*p++);
 		/*
@@ -3770,12 +3743,26 @@ static int ftrace_process_locs(struct module *mod,
 		 */
 		if (!addr)
 			continue;
-		if (!ftrace_record_ip(addr))
-			break;
+
+		if (pg->index == pg->size) {
+			/* We should have allocated enough */
+			if (WARN_ON(!pg->next))
+				break;
+			pg = pg->next;
+		}
+
+		rec = &pg->records[pg->index++];
+		rec->ip = addr;
 	}
 
+	/* We should have used all pages */
+	WARN_ON(pg->next);
+
+	/* Assign the last page to ftrace_pages */
+	ftrace_pages = pg;
+
 	/* These new locations need to be initialized */
-	ftrace_new_pgs = pg;
+	ftrace_new_pgs = start_pg;
 
 	/*
 	 * We only need to disable interrupts on start up
-- 
1.7.9.5



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/9][RFC] ftrace: Speed up search by skipping pages by address
  2012-05-02 19:24 [PATCH 0/9][RFC] ftrace: ftrace location lookup speedup, and clean ups Steven Rostedt
  2012-05-02 19:24 ` [PATCH 1/9][RFC] ftrace: Sort all function addresses, not just per page Steven Rostedt
  2012-05-02 19:24 ` [PATCH 2/9][RFC] ftrace: Remove extra helper functions Steven Rostedt
@ 2012-05-02 19:24 ` Steven Rostedt
  2012-05-02 19:24 ` [PATCH 4/9][RFC] ftrace: Consolidate ftrace_location() and ftrace_text_reserved() Steven Rostedt
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2012-05-02 19:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0003-ftrace-Speed-up-search-by-skipping-pages-by-address.patch --]
[-- Type: text/plain, Size: 1977 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

As all records in a page of the ftrace table are sorted, we can
speed up the search algorithm by checking whether the address to look for
falls between the first and last record ip on the page.

This speeds up both the ftrace_location() and ftrace_text_reserved()
algorithms, as it can skip full pages when the search address is
not in them.

Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c |   22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 4e608f8..333f890 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1411,6 +1411,8 @@ int ftrace_location(unsigned long ip)
 	key.ip = ip;
 
 	for (pg = ftrace_pages_start; pg; pg = pg->next) {
+		if (ip < pg->records[0].ip || ip > pg->records[pg->index - 1].ip)
+			continue;
 		rec = bsearch(&key, pg->records, pg->index,
 			      sizeof(struct dyn_ftrace),
 			      ftrace_cmp_recs);
@@ -1571,16 +1573,24 @@ void ftrace_bug(int failed, unsigned long ip)
 
 
 /* Return 1 if the address range is reserved for ftrace */
-int ftrace_text_reserved(void *start, void *end)
+int ftrace_text_reserved(void *s, void *e)
 {
 	struct dyn_ftrace *rec;
 	struct ftrace_page *pg;
+	unsigned long start = (unsigned long)s;
+	unsigned long end = (unsigned long)e;
+	int i;
 
-	do_for_each_ftrace_rec(pg, rec) {
-		if (rec->ip <= (unsigned long)end &&
-		    rec->ip + MCOUNT_INSN_SIZE > (unsigned long)start)
-			return 1;
-	} while_for_each_ftrace_rec();
+	for (pg = ftrace_pages_start; pg; pg = pg->next) {
+		if (end < pg->records[0].ip ||
+		    start >= (pg->records[pg->index - 1].ip + MCOUNT_INSN_SIZE))
+			continue;
+		for (i = 0; i < pg->index; i++) {
+			rec = &pg->records[i];
+			if (rec->ip <= end && rec->ip + MCOUNT_INSN_SIZE > start)
+				return 1;
+		}
+	}
 	return 0;
 }
 
-- 
1.7.9.5



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/9][RFC] ftrace: Consolidate ftrace_location() and ftrace_text_reserved()
  2012-05-02 19:24 [PATCH 0/9][RFC] ftrace: ftrace location lookup speedup, and clean ups Steven Rostedt
                   ` (2 preceding siblings ...)
  2012-05-02 19:24 ` [PATCH 3/9][RFC] ftrace: Speed up search by skipping pages by address Steven Rostedt
@ 2012-05-02 19:24 ` Steven Rostedt
  2012-05-02 19:24 ` [PATCH 5/9][RFC] ftrace: Return record ip addr for ftrace_location() Steven Rostedt
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2012-05-02 19:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0004-ftrace-Consolidate-ftrace_location-and-ftrace_text_r.patch --]
[-- Type: text/plain, Size: 4359 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Both ftrace_location() and ftrace_text_reserved() do basically the same thing.
They search to see if an address is in the ftrace table (that is, if it contains
an address that may change from a nop to a call to ftrace_caller). The difference
is that ftrace_location() searches for a single address, while
ftrace_text_reserved() searches a range.

This also makes the ftrace_text_reserved() faster as it now uses a bsearch()
instead of linearly searching all the addresses within a page.
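
As a worked example of the new comparator below (addresses are made up;
MCOUNT_INSN_SIZE is 5 on x86):

	record: rec->ip = 0x1000, so it covers [0x1000, 0x1005)

	key ip = 0x1003, flags = 0x1003  ->  match     (single addr inside the nop)
	key ip = 0x0ff0, flags = 0x1002  ->  match     (range overlaps the nop)
	key ip = 0x0ff0, flags = 0x0fff  ->  "less"    (range ends before the nop)
	key ip = 0x1005, flags = 0x1010  ->  "greater" (range starts after the nop)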

Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c |   80 ++++++++++++++++++++++++-------------------------
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 333f890..51e07f7 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1383,35 +1383,28 @@ ftrace_ops_test(struct ftrace_ops *ops, unsigned long ip)
 
 static int ftrace_cmp_recs(const void *a, const void *b)
 {
-	const struct dyn_ftrace *reca = a;
-	const struct dyn_ftrace *recb = b;
+	const struct dyn_ftrace *key = a;
+	const struct dyn_ftrace *rec = b;
 
-	if (reca->ip > recb->ip)
-		return 1;
-	if (reca->ip < recb->ip)
+	if (key->flags < rec->ip)
 		return -1;
+	if (key->ip >= rec->ip + MCOUNT_INSN_SIZE)
+		return 1;
 	return 0;
 }
 
-/**
- * ftrace_location - return true if the ip giving is a traced location
- * @ip: the instruction pointer to check
- *
- * Returns 1 if @ip given is a pointer to a ftrace location.
- * That is, the instruction that is either a NOP or call to
- * the function tracer. It checks the ftrace internal tables to
- * determine if the address belongs or not.
- */
-int ftrace_location(unsigned long ip)
+static int ftrace_location_range(unsigned long start, unsigned long end)
 {
 	struct ftrace_page *pg;
 	struct dyn_ftrace *rec;
 	struct dyn_ftrace key;
 
-	key.ip = ip;
+	key.ip = start;
+	key.flags = end;	/* overload flags, as it is unsigned long */
 
 	for (pg = ftrace_pages_start; pg; pg = pg->next) {
-		if (ip < pg->records[0].ip || ip > pg->records[pg->index - 1].ip)
+		if (end < pg->records[0].ip ||
+		    start >= (pg->records[pg->index - 1].ip + MCOUNT_INSN_SIZE))
 			continue;
 		rec = bsearch(&key, pg->records, pg->index,
 			      sizeof(struct dyn_ftrace),
@@ -1423,6 +1416,36 @@ int ftrace_location(unsigned long ip)
 	return 0;
 }
 
+/**
+ * ftrace_location - return true if the ip giving is a traced location
+ * @ip: the instruction pointer to check
+ *
+ * Returns 1 if @ip given is a pointer to a ftrace location.
+ * That is, the instruction that is either a NOP or call to
+ * the function tracer. It checks the ftrace internal tables to
+ * determine if the address belongs or not.
+ */
+int ftrace_location(unsigned long ip)
+{
+	return ftrace_location_range(ip, ip);
+}
+
+/**
+ * ftrace_text_reserved - return true if range contains an ftrace location
+ * @start: start of range to search
+ * @end: end of range to search (inclusive). @end points to the last byte to check.
+ *
+ * Returns 1 if @start and @end contains a ftrace location.
+ * That is, the instruction that is either a NOP or call to
+ * the function tracer. It checks the ftrace internal tables to
+ * determine if the address belongs or not.
+ */
+int ftrace_text_reserved(void *start, void *end)
+{
+	return ftrace_location_range((unsigned long)start,
+				     (unsigned long)end);
+}
+
 static void __ftrace_hash_rec_update(struct ftrace_ops *ops,
 				     int filter_hash,
 				     bool inc)
@@ -1571,29 +1594,6 @@ void ftrace_bug(int failed, unsigned long ip)
 	}
 }
 
-
-/* Return 1 if the address range is reserved for ftrace */
-int ftrace_text_reserved(void *s, void *e)
-{
-	struct dyn_ftrace *rec;
-	struct ftrace_page *pg;
-	unsigned long start = (unsigned long)s;
-	unsigned long end = (unsigned long)e;
-	int i;
-
-	for (pg = ftrace_pages_start; pg; pg = pg->next) {
-		if (end < pg->records[0].ip ||
-		    start >= (pg->records[pg->index - 1].ip + MCOUNT_INSN_SIZE))
-			continue;
-		for (i = 0; i < pg->index; i++) {
-			rec = &pg->records[i];
-			if (rec->ip <= end && rec->ip + MCOUNT_INSN_SIZE > start)
-				return 1;
-		}
-	}
-	return 0;
-}
-
 static int ftrace_check_record(struct dyn_ftrace *rec, int enable, int update)
 {
 	unsigned long flag = 0UL;
-- 
1.7.9.5



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/9][RFC] ftrace: Return record ip addr for ftrace_location()
  2012-05-02 19:24 [PATCH 0/9][RFC] ftrace: ftrace location lookup speedup, and clean ups Steven Rostedt
                   ` (3 preceding siblings ...)
  2012-05-02 19:24 ` [PATCH 4/9][RFC] ftrace: Consolidate ftrace_location() and ftrace_text_reserved() Steven Rostedt
@ 2012-05-02 19:24 ` Steven Rostedt
  2012-05-02 19:24 ` [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it) Steven Rostedt
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2012-05-02 19:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0005-ftrace-Return-record-ip-addr-for-ftrace_location.patch --]
[-- Type: text/plain, Size: 3053 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

ftrace_location() is passed an addr, and returns 1 if the addr is
on an ftrace nop (or a call to ftrace_caller), and 0 otherwise.

To let kprobes know whether it should move a breakpoint or not, it
must instead return the actual addr that is the start of the ftrace nop.
This way a kprobe placed on the location of an ftrace nop can
instead be placed on the instruction after the nop. Even if the
probe addr is on the second or later byte of the nop, it can
simply be moved forward.
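
The intended caller is the kprobes change later in this series; roughly
(a sketch of the caller side, see patch 6 below):

	ftrace_addr = ftrace_location((unsigned long)addr);
	if (ftrace_addr)
		addr = (kprobe_opcode_t *)(ftrace_addr + MCOUNT_INSN_SIZE);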

Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/ftrace.h |    2 +-
 kernel/trace/ftrace.c  |   16 ++++++++++------
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 0b55903..9310993 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -295,7 +295,7 @@ struct dyn_ftrace *ftrace_rec_iter_record(struct ftrace_rec_iter *iter);
 int ftrace_update_record(struct dyn_ftrace *rec, int enable);
 int ftrace_test_record(struct dyn_ftrace *rec, int enable);
 void ftrace_run_stop_machine(int command);
-int ftrace_location(unsigned long ip);
+unsigned long ftrace_location(unsigned long ip);
 
 extern ftrace_func_t ftrace_trace_function;
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 51e07f7..7e3c398 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1393,7 +1393,7 @@ static int ftrace_cmp_recs(const void *a, const void *b)
 	return 0;
 }
 
-static int ftrace_location_range(unsigned long start, unsigned long end)
+static unsigned long ftrace_location_range(unsigned long start, unsigned long end)
 {
 	struct ftrace_page *pg;
 	struct dyn_ftrace *rec;
@@ -1410,7 +1410,7 @@ static int ftrace_location_range(unsigned long start, unsigned long end)
 			      sizeof(struct dyn_ftrace),
 			      ftrace_cmp_recs);
 		if (rec)
-			return 1;
+			return rec->ip;
 	}
 
 	return 0;
@@ -1420,12 +1420,12 @@ static int ftrace_location_range(unsigned long start, unsigned long end)
  * ftrace_location - return true if the ip giving is a traced location
  * @ip: the instruction pointer to check
  *
- * Returns 1 if @ip given is a pointer to a ftrace location.
+ * Returns rec->ip if @ip given is a pointer to a ftrace location.
  * That is, the instruction that is either a NOP or call to
  * the function tracer. It checks the ftrace internal tables to
  * determine if the address belongs or not.
  */
-int ftrace_location(unsigned long ip)
+unsigned long ftrace_location(unsigned long ip)
 {
 	return ftrace_location_range(ip, ip);
 }
@@ -1442,8 +1442,12 @@ int ftrace_location(unsigned long ip)
  */
 int ftrace_text_reserved(void *start, void *end)
 {
-	return ftrace_location_range((unsigned long)start,
-				     (unsigned long)end);
+	unsigned long ret;
+
+	ret = ftrace_location_range((unsigned long)start,
+				    (unsigned long)end);
+
+	return (int)!!ret;
 }
 
 static void __ftrace_hash_rec_update(struct ftrace_ops *ops,
-- 
1.7.9.5



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-02 19:24 [PATCH 0/9][RFC] ftrace: ftrace location lookup speedup, and clean ups Steven Rostedt
                   ` (4 preceding siblings ...)
  2012-05-02 19:24 ` [PATCH 5/9][RFC] ftrace: Return record ip addr for ftrace_location() Steven Rostedt
@ 2012-05-02 19:24 ` Steven Rostedt
  2012-05-02 20:40   ` Frank Ch. Eigler
  2012-05-02 19:24 ` [PATCH 7/9][RFC] ftrace: Make ftrace_modify_all_code() global for archs to use Steven Rostedt
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2012-05-02 19:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0006-kprobes-Allow-probe-on-ftrace-reserved-text-but-move.patch --]
[-- Type: text/plain, Size: 3034 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

If a probe is placed on an ftrace nop (or ftrace_caller), simply move the
probe to the next instruction instead of rejecting it. This will allow
kprobes not to be affected by ftrace using the -mfentry gcc option, which
will put the ftrace nop at the beginning of the function. As a very common
use of kprobes is to add a probe at the very beginning of a function,
we need a way to handle the case where ftrace takes the first instruction.

Added KPROBE_FLAG_MOVED (as suggested by Masami) that is set when the address
is moved to get around an ftrace nop.
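
For illustration, this is how a kprobes user could observe the move
(hypothetical sketch; the symbol name and handler are just placeholders,
not part of this series):

	static struct kprobe kp = {
		.symbol_name	= "do_fork",		/* placeholder symbol */
		.pre_handler	= my_pre_handler,	/* placeholder handler */
	};

	ret = register_kprobe(&kp);
	if (!ret && (kp.flags & KPROBE_FLAG_MOVED))
		pr_info("probe moved past the ftrace nop to %p\n", kp.addr);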

Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/ftrace.h  |    2 ++
 include/linux/kprobes.h |    1 +
 kernel/kprobes.c        |   13 ++++++++++++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 9310993..211ae45 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -414,6 +414,8 @@ static inline int ftrace_text_reserved(void *start, void *end)
 #define ftrace_set_notrace(ops, buf, len, reset) ({ -ENODEV; })
 #define ftrace_free_filter(ops) do { } while (0)
 
+static inline unsigned long ftrace_location(unsigned long ip) { return 0; }
+
 static inline ssize_t ftrace_filter_write(struct file *file, const char __user *ubuf,
 			    size_t cnt, loff_t *ppos) { return -ENODEV; }
 static inline ssize_t ftrace_notrace_write(struct file *file, const char __user *ubuf,
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index b6e1f8c..23cf41e 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -128,6 +128,7 @@ struct kprobe {
 				   * NOTE:
 				   * this flag is only for optimized_kprobe.
 				   */
+#define KPROBE_FLAG_MOVED	8 /* probe was moved past ftrace nop */
 
 /* Has this kprobe gone ? */
 static inline int kprobe_gone(struct kprobe *p)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index c62b854..952619b 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1319,10 +1319,22 @@ int __kprobes register_kprobe(struct kprobe *p)
 	struct kprobe *old_p;
 	struct module *probed_mod;
 	kprobe_opcode_t *addr;
+	unsigned long ftrace_addr;
 
 	addr = kprobe_addr(p);
 	if (IS_ERR(addr))
 		return PTR_ERR(addr);
+
+	/*
+	 * If the address is located on a ftrace nop, set the
+	 * breakpoint to the following instruction.
+	 */
+	ftrace_addr = ftrace_location((unsigned long)addr);
+	if (unlikely(ftrace_addr)) {
+		addr = (kprobe_opcode_t *)(ftrace_addr + MCOUNT_INSN_SIZE);
+		p->flags |= KPROBE_FLAG_MOVED;
+	}
+
 	p->addr = addr;
 
 	ret = check_kprobe_rereg(p);
@@ -1333,7 +1345,6 @@ int __kprobes register_kprobe(struct kprobe *p)
 	preempt_disable();
 	if (!kernel_text_address((unsigned long) p->addr) ||
 	    in_kprobes_functions((unsigned long) p->addr) ||
-	    ftrace_text_reserved(p->addr, p->addr) ||
 	    jump_label_text_reserved(p->addr, p->addr)) {
 		ret = -EINVAL;
 		goto cannot_probe;
-- 
1.7.9.5



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 7/9][RFC] ftrace: Make ftrace_modify_all_code() global for archs to use
  2012-05-02 19:24 [PATCH 0/9][RFC] ftrace: ftrace location lookup speedup, and clean ups Steven Rostedt
                   ` (5 preceding siblings ...)
  2012-05-02 19:24 ` [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it) Steven Rostedt
@ 2012-05-02 19:24 ` Steven Rostedt
  2012-05-02 19:24 ` [PATCH 8/9][RFC] ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() Steven Rostedt
  2012-05-02 19:24 ` [PATCH 9/9][RFC] ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER Steven Rostedt
  8 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2012-05-02 19:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0007-ftrace-Make-ftrace_modify_all_code-global-for-archs-.patch --]
[-- Type: text/plain, Size: 1989 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Rename __ftrace_modify_code() to ftrace_modify_all_code() and make
it global for all archs to use. This will remove the duplication
of code, as archs that can modify code without stop_machine()
can use it directly outside of the stop_machine() call.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/ftrace.h |    2 ++
 kernel/trace/ftrace.c  |   21 +++++++++++++--------
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 211ae45..873ceea 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -319,6 +319,8 @@ extern void ftrace_caller(void);
 extern void ftrace_call(void);
 extern void mcount_call(void);
 
+void ftrace_modify_all_code(int command);
+
 #ifndef FTRACE_ADDR
 #define FTRACE_ADDR ((unsigned long)ftrace_caller)
 #endif
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 7e3c398..a4ef17d 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1811,22 +1811,27 @@ int __weak ftrace_arch_code_modify_post_process(void)
 	return 0;
 }
 
-static int __ftrace_modify_code(void *data)
+void ftrace_modify_all_code(int command)
 {
-	int *command = data;
-
-	if (*command & FTRACE_UPDATE_CALLS)
+	if (command & FTRACE_UPDATE_CALLS)
 		ftrace_replace_code(1);
-	else if (*command & FTRACE_DISABLE_CALLS)
+	else if (command & FTRACE_DISABLE_CALLS)
 		ftrace_replace_code(0);
 
-	if (*command & FTRACE_UPDATE_TRACE_FUNC)
+	if (command & FTRACE_UPDATE_TRACE_FUNC)
 		ftrace_update_ftrace_func(ftrace_trace_function);
 
-	if (*command & FTRACE_START_FUNC_RET)
+	if (command & FTRACE_START_FUNC_RET)
 		ftrace_enable_ftrace_graph_caller();
-	else if (*command & FTRACE_STOP_FUNC_RET)
+	else if (command & FTRACE_STOP_FUNC_RET)
 		ftrace_disable_ftrace_graph_caller();
+}
+
+static int __ftrace_modify_code(void *data)
+{
+	int *command = data;
+
+	ftrace_modify_all_code(*command);
 
 	return 0;
 }
-- 
1.7.9.5



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 8/9][RFC] ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code()
  2012-05-02 19:24 [PATCH 0/9][RFC] ftrace: ftrace location lookup speedup, and clean ups Steven Rostedt
                   ` (6 preceding siblings ...)
  2012-05-02 19:24 ` [PATCH 7/9][RFC] ftrace: Make ftrace_modify_all_code() global for archs to use Steven Rostedt
@ 2012-05-02 19:24 ` Steven Rostedt
  2012-05-02 19:24 ` [PATCH 9/9][RFC] ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER Steven Rostedt
  8 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2012-05-02 19:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0008-ftrace-x86-Have-x86-ftrace-use-the-ftrace_modify_all.patch --]
[-- Type: text/plain, Size: 2675 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

To remove duplicate code, have the ftrace arch_ftrace_update_code()
use the generic ftrace_modify_all_code(). This requires that the
default ftrace_replace_code() become a weak function so that an
arch may override it.
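
For readers not familiar with weak symbols, this is how the override
works in plain C (standalone sketch, not kernel code):

/* generic.c: the default implementation, marked weak */
void __attribute__((weak)) hook(void)
{
	/* generic fallback, used when no other definition is linked in */
}

/* arch.c: a strong definition here silently replaces the weak one */
void hook(void)
{
	/* arch-specific version */
}

At link time the strong definition wins; if arch.c is not linked in,
the weak default is used.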

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/ftrace.c |   15 ++-------------
 include/linux/ftrace.h   |    1 +
 kernel/trace/ftrace.c    |    4 ++--
 3 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index cf2d03e..06cfc67 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -435,7 +435,7 @@ static void run_sync(void)
 		local_irq_disable();
 }
 
-static void ftrace_replace_code(int enable)
+void ftrace_replace_code(int enable)
 {
 	struct ftrace_rec_iter *iter;
 	struct dyn_ftrace *rec;
@@ -493,18 +493,7 @@ void arch_ftrace_update_code(int command)
 {
 	modifying_ftrace_code++;
 
-	if (command & FTRACE_UPDATE_CALLS)
-		ftrace_replace_code(1);
-	else if (command & FTRACE_DISABLE_CALLS)
-		ftrace_replace_code(0);
-
-	if (command & FTRACE_UPDATE_TRACE_FUNC)
-		ftrace_update_ftrace_func(ftrace_trace_function);
-
-	if (command & FTRACE_START_FUNC_RET)
-		ftrace_enable_ftrace_graph_caller();
-	else if (command & FTRACE_STOP_FUNC_RET)
-		ftrace_disable_ftrace_graph_caller();
+	ftrace_modify_all_code(command);
 
 	modifying_ftrace_code--;
 }
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 873ceea..760845f 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -314,6 +314,7 @@ ftrace_set_early_filter(struct ftrace_ops *ops, char *buf, int enable);
 /* defined in arch */
 extern int ftrace_ip_converted(unsigned long ip);
 extern int ftrace_dyn_arch_init(void *data);
+extern void ftrace_replace_code(int enable);
 extern int ftrace_update_ftrace_func(ftrace_func_t func);
 extern void ftrace_caller(void);
 extern void ftrace_call(void);
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a4ef17d..9e9eebd 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1683,7 +1683,7 @@ __ftrace_replace_code(struct dyn_ftrace *rec, int enable)
 	return -1; /* unknow ftrace bug */
 }
 
-static void ftrace_replace_code(int update)
+void __weak ftrace_replace_code(int enable)
 {
 	struct dyn_ftrace *rec;
 	struct ftrace_page *pg;
@@ -1693,7 +1693,7 @@ static void ftrace_replace_code(int update)
 		return;
 
 	do_for_each_ftrace_rec(pg, rec) {
-		failed = __ftrace_replace_code(rec, update);
+		failed = __ftrace_replace_code(rec, enable);
 		if (failed) {
 			ftrace_bug(failed, rec->ip);
 			/* Stop processing */
-- 
1.7.9.5



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 9/9][RFC] ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER
  2012-05-02 19:24 [PATCH 0/9][RFC] ftrace: ftrace location lookup speedup, and clean ups Steven Rostedt
                   ` (7 preceding siblings ...)
  2012-05-02 19:24 ` [PATCH 8/9][RFC] ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() Steven Rostedt
@ 2012-05-02 19:24 ` Steven Rostedt
  8 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2012-05-02 19:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0009-ftrace-Remove-selecting-FRAME_POINTER-with-FUNCTION_.patch --]
[-- Type: text/plain, Size: 1816 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

The function tracer will enable the -pg option with gcc, which requires
frame pointers on some architectures. When FRAME_POINTER is defined in the
kernel config, it adds the gcc option -fno-omit-frame-pointer, which causes
problems on some architectures. For those architectures, the FRAME_POINTER
select was not set.

When FUNCTION_TRACER was selected on these architectures that can not have
-fno-omit-frame-pointer, the -pg option was still set. But when
FRAME_POINTER is not selected, the kernel config adds the gcc option
-fomit-frame-pointer. Adding this option is incompatible with -pg,
even on archs that do not need frame pointers with -pg.

The answer to this was simply not to add either -fno-omit-frame-pointer
or -fomit-frame-pointer on those archs that want function tracing
but do not set FRAME_POINTER.

As it turns out, the same can be done for archs that do require frame
pointers for function tracing: if gcc requires frame pointers with
-pg, it will simply add them. The best thing to do is to not select
FRAME_POINTER when function tracing is selected, and let gcc add it
if needed.

Only add -fno-omit-frame-pointer when something else selects
FRAME_POINTER, but do not add -fomit-frame-pointer if function tracing
is selected.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/Kconfig |    1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index a1d2849..d81a1a5 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -141,7 +141,6 @@ if FTRACE
 config FUNCTION_TRACER
 	bool "Kernel Function Tracer"
 	depends on HAVE_FUNCTION_TRACER
-	select FRAME_POINTER if !ARM_UNWIND && !PPC && !S390 && !MICROBLAZE
 	select KALLSYMS
 	select GENERIC_TRACER
 	select CONTEXT_SWITCH_TRACER
-- 
1.7.9.5



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-02 19:24 ` [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it) Steven Rostedt
@ 2012-05-02 20:40   ` Frank Ch. Eigler
  2012-05-02 23:40     ` Steven Rostedt
  0 siblings, 1 reply; 25+ messages in thread
From: Frank Ch. Eigler @ 2012-05-02 20:40 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin, fche


rostedt wrote:

> [...]  Added KPROBE_FLAG_MOVED (as suggested by Masami) that is set
> when the address is moved to get around an ftrace nop. [...]

Steve, perhaps my earlier comments on this got lost during the mailing
list outage.

The gist is that a KPROBE_FLAG_MOVED being set this way accomplishes
very little since nothing is looking for that flag.  Instead, you
should patch {arch/*}/kernel/kprobe.c kprobe_handler() to subtract
MCOUNT_INSN_SIZE back out from pt_regs->ip if KPROBE_FLAG_MOVED was
set.  That way, kprobes clients need not perceive the int3 movement.
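
Roughly, for x86 say (an untested sketch, just to illustrate; the exact
place in the int3 path would need care):

	if (p->flags & KPROBE_FLAG_MOVED)
		regs->ip -= MCOUNT_INSN_SIZE;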

- FChE

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-02 20:40   ` Frank Ch. Eigler
@ 2012-05-02 23:40     ` Steven Rostedt
  2012-05-07 11:37       ` Masami Hiramatsu
  0 siblings, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2012-05-02 23:40 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

On Wed, 2012-05-02 at 16:40 -0400, Frank Ch. Eigler wrote:
> rostedt wrote:
> 
> > [...]  Added KPROBE_FLAG_MOVED (as suggested by Masami) that is set
> > when the address is moved to get around an ftrace nop. [...]
> 
> Steve, perhaps my earlier comments on this got lost during the mailing
> list outage.

I saw it, but it didn't really specify what you wanted. Here's your
comment:


> I suspect Masami intended that this flag is later used during int3
> processing to subtract MCOUNT_INSN_SIZE back out from the pt_regs->ip
> during kprobe_handler() if this flag was set.

This is what I thought too, but to me it sounded like Masami could do
the work. I was just setting up a flag to make it possible.

> 
> The gist is that a KPROBE_FLAG_MOVED being set this way accomplishes
> very little since nothing is looking for that flag.  Instead, you
> should patch {arch/*}/kernel/kprobe.c kprobe_handler() to subtract
> MCOUNT_INSN_SIZE back out from pt_regs->ip if KPROBE_FLAG_MOVED was
> set.  That way, kprobes clients need not perceive the int3 movement.

I basically thought that Masami wanted me to add the flag, and then
others could look for this and do the adjustment. I'm not the kprobes
author. I was just adding a flag that Masami and others could use to do
such updates.

I'm not sure if the adjustment is fine with everyone, as it may cause
repercussions that I don't know about. 

Perhaps that could be another patch (want to write it?)

-- Steve



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-02 23:40     ` Steven Rostedt
@ 2012-05-07 11:37       ` Masami Hiramatsu
  2012-05-07 11:59         ` Masami Hiramatsu
  2012-05-07 12:43         ` Steven Rostedt
  0 siblings, 2 replies; 25+ messages in thread
From: Masami Hiramatsu @ 2012-05-07 11:37 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

(2012/05/03 8:40), Steven Rostedt wrote:
> On Wed, 2012-05-02 at 16:40 -0400, Frank Ch. Eigler wrote:
>> rostedt wrote:
>>
>>> [...]  Added KPROBE_FLAG_MOVED (as suggested by Masami) that is set
>>> when the address is moved to get around an ftrace nop. [...]
>>
>> Steve, perhaps my earlier comments on this got lost during the mailing
>> list outage.
> 
> I saw it, but it didn't really specify what you wanted. Here's your
> comment:
> 
> 
>> I suspect Masami intended that this flag is later used during int3
>> processing to subtract MCOUNT_INSN_SIZE back out from the pt_regs->ip
>> during kprobe_handler() if this flag was set.
> 
> This is what I thought too, but to me it sounded like Masami could do
> the work. I was just setting up a flag to make it possible.
> 
>>
>> The gist is that a KPROBE_FLAG_MOVED being set this way accomplishes
>> very little since nothing is looking for that flag.  Instead, you
>> should patch {arch/*}/kernel/kprobe.c kprobe_handler() to subtract
>> MCOUNT_INSN_SIZE back out from pt_regs->ip if KPROBE_FLAG_MOVED was
>> set.  That way, kprobes clients need not perceive the int3 movement.
> 
> I basically thought that Masami wanted me to add the flag, and then
> others could look for this and do the adjustment. I'm not the kprobes
> author. I was just adding a flag that Masami and others could use to do
> such updates.

Right, that was what I thought. Since the kp->addr is changed when
kprobe is set, kprobes itself don't need to adjust the pt_regs->ip.
I mean, struct kprobe itself puts a probe on the next to the mcount
entry, even if the caller tries to put a probe on the mcount entry.

This change may be unintended and caller will doubt that why the
kp->addr is automatically changed. So this KPROBE_FLAG_MOVED gives
a hint for the caller who knows the original intended probed address.

> I'm not sure if the adjustment is fine with everyone, as it may cause
> repercussions that I don't know about. 

Yeah, that's a point. if the adjustment is transparently done, there
is no problem. But it changes kp->addr when registering a probe.
If adjustment is done, following code (still) doesn't work.

---
int func(struct kprobe *kp, struct pt_regs *regs)
{
	BUG_ON(kp->addr != regs->ip);
	/* or */
	store_probed_address(kp->addr);	/* since regs->ip depends on x86*/
}

kp->handler = func;
kp->addr = <somewhere on ftrace>
register_kprobe(kp);
---

but if adjustment is not done, at least, kprobes behavior itself
looks same. (but just be moved if probed on ftrace)

Yeah, I know systemtap people likes regs->ip to be adjusted, but
there may be someone who use raw kprobes.

> Perhaps that could be another patch (want to write it?)

Oh, so I think we need to show the new flag on debugfs for
someone who want to know why the probe has been moved. :)


By the way, there is another way to do that transparently which
we add a "real_addr" member to struct kprobes and put the real
probed address to the member (without changing kp->addr). This
will keep API compatibility.
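
Something like this (a rough sketch only; the member name and placement
are illustrative):

	struct kprobe {
		...
		kprobe_opcode_t	*addr;		/* what the caller asked for */
		kprobe_opcode_t	*real_addr;	/* where the breakpoint really is,
						 * possibly moved past the ftrace nop */
		...
	};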

Thank you,


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-07 11:37       ` Masami Hiramatsu
@ 2012-05-07 11:59         ` Masami Hiramatsu
  2012-05-07 12:44           ` Steven Rostedt
  2012-05-07 12:43         ` Steven Rostedt
  1 sibling, 1 reply; 25+ messages in thread
From: Masami Hiramatsu @ 2012-05-07 11:59 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

(2012/05/07 20:37), Masami Hiramatsu wrote:
> (2012/05/03 8:40), Steven Rostedt wrote:
>> On Wed, 2012-05-02 at 16:40 -0400, Frank Ch. Eigler wrote:
>>> rostedt wrote:
>>>
>>>> [...]  Added KPROBE_FLAG_MOVED (as suggested by Masami) that is set
>>>> when the address is moved to get around an ftrace nop. [...]
>>>
>>> Steve, perhaps my earlier comments on this got lost during the mailing
>>> list outage.
>>
>> I saw it, but it didn't really specify what you wanted. Here's your
>> comment:
>>
>>
>>> I suspect Masami intended that this flag is later used during int3
>>> processing to subtract MCOUNT_INSN_SIZE back out from the pt_regs->ip
>>> during kprobe_handler() if this flag was set.
>>
>> This is what I thought too, but to me it sounded like Masami could do
>> the work. I was just setting up a flag to make it possible.
>>
>>>
>>> The gist is that a KPROBE_FLAG_MOVED being set this way accomplishes
>>> very little since nothing is looking for that flag.  Instead, you
>>> should patch {arch/*}/kernel/kprobe.c kprobe_handler() to subtract
>>> MCOUNT_INSN_SIZE back out from pt_regs->ip if KPROBE_FLAG_MOVED was
>>> set.  That way, kprobes clients need not perceive the int3 movement.
>>
>> I basically thought that Masami wanted me to add the flag, and then
>> others could look for this and do the adjustment. I'm not the kprobes
>> author. I was just adding a flag that Masami and others could use to do
>> such updates.
> 
> Right, that was what I thought. Since the kp->addr is changed when
> kprobe is set, kprobes itself don't need to adjust the pt_regs->ip.
> I mean, struct kprobe itself puts a probe on the next to the mcount
> entry, even if the caller tries to put a probe on the mcount entry.
> 
> This change may be unintended and caller will doubt that why the
> kp->addr is automatically changed. So this KPROBE_FLAG_MOVED gives
> a hint for the caller who knows the original intended probed address.
> 
>> I'm not sure if the adjustment is fine with everyone, as it may cause
>> repercussions that I don't know about. 
> 
> Yeah, that's a point. if the adjustment is transparently done, there
> is no problem. But it changes kp->addr when registering a probe.
> If adjustment is done, following code (still) doesn't work.
> 
> ---
> int func(struct kprobe *kp, struct pt_regs *regs)
> {
> 	BUG_ON(kp->addr != regs->ip);
> 	/* or */
> 	store_probed_address(kp->addr);	/* since regs->ip depends on x86*/
> }
> 
> kp->handler = func;
> kp->addr = <somewhere on ftrace>
> register_kprobe(kp);
> ---
> 
> but if adjustment is not done, at least, kprobes behavior itself
> looks same. (but just be moved if probed on ftrace)
> 
> Yeah, I know systemtap people likes regs->ip to be adjusted, but
> there may be someone who use raw kprobes.
> 
>> Perhaps that could be another patch (want to write it?)
> 
> Oh, so I think we need to show the new flag on debugfs for
> someone who want to know why the probe has been moved. :)

Hmm, I hit another good idea. :)

Add an optional flag for kprobes, like KPROBE_FLAG_ALLOWMOVE; only if
it is set does kprobes move a probe that is on ftrace text and adjust
pt_regs (on archs which support both dynamic ftrace and kprobes).
If it is not set, the probe is rejected.

This will not break any backward compatibility and also encapsulates
the arch-dependent address adjustment. (And it can also be done as
separate patches.)

BTW, Steven, is this series already put on some git repository?
I'd like to pull it to work on that.

Thank you,


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-07 11:37       ` Masami Hiramatsu
  2012-05-07 11:59         ` Masami Hiramatsu
@ 2012-05-07 12:43         ` Steven Rostedt
  2012-05-08  3:08           ` Masami Hiramatsu
  1 sibling, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2012-05-07 12:43 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

On Mon, 2012-05-07 at 20:37 +0900, Masami Hiramatsu wrote:

> By the way, there is another way to do that transparently which
> we add a "real_addr" member to struct kprobes and put the real
> probed address to the member (without changing kp->addr). This
> will keep API compatibility.

I like this a lot. Perhaps we don't need a flag at all, and just use
these two addrs instead?

-- Steve



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-07 11:59         ` Masami Hiramatsu
@ 2012-05-07 12:44           ` Steven Rostedt
  2012-05-07 12:51             ` Steven Rostedt
  0 siblings, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2012-05-07 12:44 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

On Mon, 2012-05-07 at 20:59 +0900, Masami Hiramatsu wrote:

> BTW, Steven, is this series already put on some git repository?
> I'd like to pull it to work on that.

It's currently only on my local machines because I rebase it a lot. I
can push it out for you to play with, but it is not in its final form
(yet) and may be rebased once again. Thus, don't rely on it ;-)

-- Steve



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-07 12:44           ` Steven Rostedt
@ 2012-05-07 12:51             ` Steven Rostedt
  0 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2012-05-07 12:51 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

On Mon, 2012-05-07 at 08:44 -0400, Steven Rostedt wrote:
> On Mon, 2012-05-07 at 20:59 +0900, Masami Hiramatsu wrote:
> 
> > BTW, Steven, is this series already put on some git repository?
> > I'd like to pull it to work on that.
> 
> It's currently only on my local machines because I rebase it a lot. I
> can push it out for you to play with, but it is not in its final form
> (yet) and may be rebased once again. Thus, don't rely on it ;-)
> 

I just pushed this branch to my repo:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git

tip/perf/next

-- Steve



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Re: Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-07 12:43         ` Steven Rostedt
@ 2012-05-08  3:08           ` Masami Hiramatsu
  2012-05-08 13:04             ` Steven Rostedt
  0 siblings, 1 reply; 25+ messages in thread
From: Masami Hiramatsu @ 2012-05-08  3:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

(2012/05/07 21:43), Steven Rostedt wrote:
> On Mon, 2012-05-07 at 20:37 +0900, Masami Hiramatsu wrote:
> 
>> By the way, there is another way to do that transparently which
>> we add a "real_addr" member to struct kprobes and put the real
>> probed address to the member (without changing kp->addr). This
>> will keep API compatibility.
> 
> I like this a lot. Perhaps we don't need a flag at all, and just use
> these two addrs instead?

Yes, just replacing all "*p->addr" and "kp.addr" with real_addr
and copying addr to real_addr when registering. That's the basic
idea.

I just concern about the cost balance... this is feasible, but
we need to change many arch-dependent parts. Do we really need to
pay that cost just for the backward compatibility?

I mean, the kprobes itself can be changed because we don't ensure
the stability of kernel APIs. If so, it looks enough to change the
behavior of kprobes and give an upgrade hint for users (current patch),
isn't it?

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Re: Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-08  3:08           ` Masami Hiramatsu
@ 2012-05-08 13:04             ` Steven Rostedt
  2012-05-09  5:53               ` Masami Hiramatsu
  0 siblings, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2012-05-08 13:04 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

On Tue, 2012-05-08 at 12:08 +0900, Masami Hiramatsu wrote:
> (2012/05/07 21:43), Steven Rostedt wrote:
> > On Mon, 2012-05-07 at 20:37 +0900, Masami Hiramatsu wrote:
> > 
> >> By the way, there is another way to do that transparently which
> >> we add a "real_addr" member to struct kprobes and put the real
> >> probed address to the member (without changing kp->addr). This
> >> will keep API compatibility.
> > 
> > I like this a lot. Perhaps we don't need a flag at all, and just use
> > these two addrs instead?
> 
> Yes, just replacing all "*p->addr" and "kp.addr" with real_addr
> and copying addr to real_addr when registering. That's the basic
> idea.
> 
> I just concern about the cost balance... this is feasible, but
> we need to change many arch-dependent parts. Do we really need to
> pay that cost just for the backward compatibility?
> 
> I mean, the kprobes itself can be changed because we don't ensure
> the stability of kernel APIs. If so, it looks enough to change the
> behavior of kprobes and give an upgrade hint for users (current patch),
> isn't it?
> 
> Thank you,

I guess the question is what's best long term. That's what I would like
to do. If a flag is "good enough" for both now and long term, then
that's fine with me. But if we find that it would be better to have a
"real_addr" then we should do it now and bite the bullet with all archs.

Otherwise, we'll have all the archs doing something special with the
MOVE flag and that would cause even more pain to update it later.

I also like the real addr because it helps with the optimize probes. We
only need to search one location. This doesn't matter with this patch
set, but with the code I have that uses ftrace hooks. One solution with
that is to have the optimize code see that the probe was moved, (or its
real addr was on a ftrace nop) and then just use the ftrace code on
optimization instead of normal optimizations (replacing with a jump).

Note, the big difference with using ftrace optimization and normal
kprobe jump optimization is that the ftrace one can be used on a preempt
kernel. But this code is still under development. I want to get a
solution for the current code (this patch set) now. It would be nice if
it was ready for 3.5.


-- Steve



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-08 13:04             ` Steven Rostedt
@ 2012-05-09  5:53               ` Masami Hiramatsu
  2012-05-09  8:12                 ` Masami Hiramatsu
  0 siblings, 1 reply; 25+ messages in thread
From: Masami Hiramatsu @ 2012-05-09  5:53 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

(2012/05/08 22:04), Steven Rostedt wrote:
> I guess the question is what's best long term. That's what I would like
> to do. If a flag is "good enough" for both now and long term, then
> that's fine with me. But if we find that it would be better to have a
> "real_addr" then we should do it now and bite the bullet with all archs.

Well, I was not sure that the moving probe address method was the
short-term solution. Maybe that was wrong.

>
> Otherwise, we'll have all the archs doing something special with the
> MOVE flag and that would cause even more pain to update it later.

Just a comment. If a user finds that the MOVE flag is set, then they
can choose to;
- reject the probing request which is on the ftrace text, or
- store the original IP in another variable and use that instead of
  kp->regs.
So they don't need to adjust the address for each arch. :)


> I also like the real addr because it helps with the optimize probes. We
> only need to search one location. This doesn't matter with this patch
> set, but with the code I have that uses ftrace hooks. One solution with
> that is to have the optimize code see that the probe was moved, (or its
> real addr was on a ftrace nop) and then just use the ftrace code on
  ^^^^^^^^^ would you mean addr? :)
> optimization instead of normal optimizations (replacing with a jump).

OK, I misunderstood. I thought that ftrace-optimization could replace
the moving-probe-address solution, but it can't.
For example, a jprobe, which puts a probe on the entry of a function and
changes the IP to a special handler, cannot be optimized even with ftrace.
Thus, we still need to move the probe address to the next instruction.

So, I agree with you. We need real_addr solution for transparent
moving the probepoint.

> Note, the big difference with using ftrace optimization and normal
> kprobe jump optimization is that the ftrace one can be used on a preempt
> kernel. But this code is still under development. I want to get a
> solution for the current code (this patch set) now. It would be nice if
> it was ready for 3.5.

I doubt that we can really do this. If this is possible, I can make
jump optimization work with preemptive kernel.

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-09  5:53               ` Masami Hiramatsu
@ 2012-05-09  8:12                 ` Masami Hiramatsu
  2012-05-09  9:02                   ` Masami Hiramatsu
  0 siblings, 1 reply; 25+ messages in thread
From: Masami Hiramatsu @ 2012-05-09  8:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

(2012/05/09 14:53), Masami Hiramatsu wrote:
> (2012/05/08 22:04), Steven Rostedt wrote:
>> Note, the big difference with using ftrace optimization and normal
>> kprobe jump optimization is that the ftrace one can be used on a preempt
>> kernel. But this code is still under development. I want to get a
>> solution for the current code (this patch set) now. It would be nice if
>> it was ready for 3.5.
> 
> I doubt that we can really do this. If this is possible, I can make
> jump optimization work with preemptive kernel.

Now I'm clear that it can be done, but only on an ftrace nop.
What I was concerned about was the timing of releasing the out-of-line
execution buffer, but with the ftrace nop we don't need to do that.

Theoretically, that will be done in the following steps:

1. call mcount (w/ push call-site address+5)
2. mcount calls the ftrace handlers
3. mcount makes a pt_regs on the stack
 - on x86-32, regs->cs should be the call-site address+5, since
   regs->sp must be the top of the original stack.
 - on x86-64, it is enough that regs->sp points to the top of the
   original stack.
4. mcount stores call-site address+5 in %ax
5. mcount calls the new optprobe handler (which returns call-site address+5)
6. mcount restores the registers
7. mcount returns to the call-site address+5.

Between steps 6 and 7, the original optprobe has to do out-of-line
execution, but this ftrace-based one doesn't. In that case, we need
neither to allocate nor to release an out-of-line execution buffer
for this probe.
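
A rough C-level sketch of the handler in step 5 (the function name,
signature, and use of MCOUNT_INSN_SIZE are all assumptions for
illustration; the real interface is exactly what is being worked out
here):

#include <linux/kprobes.h>
#include <linux/ftrace.h>       /* MCOUNT_INSN_SIZE */

/* Hypothetical step-5 handler: called from mcount with call-site
 * address+5 and the pt_regs built in step 3; it runs the kprobe
 * pre-handler and returns the address mcount should return to. */
static unsigned long optprobe_ftrace_handler(unsigned long ret_addr,
                                             struct pt_regs *regs)
{
        /* the probe sits on the ftrace call site, i.e. ret_addr - 5 */
        struct kprobe *p = get_kprobe((void *)(ret_addr - MCOUNT_INSN_SIZE));

        if (p && !kprobe_disabled(p) && !kprobe_running()) {
                regs->ip = (unsigned long)p->addr;
                __this_cpu_write(current_kprobe, p);
                opt_pre_handler(p, regs);
                __this_cpu_write(current_kprobe, NULL);
        }

        return ret_addr;        /* step 7: back to call-site address+5 */
}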

Thank you!

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-09  8:12                 ` Masami Hiramatsu
@ 2012-05-09  9:02                   ` Masami Hiramatsu
  2012-05-09 14:50                     ` Steven Rostedt
  2012-05-10 11:54                     ` [RFC PATCH -tip ] x86/kprobes: kprobes call optimization Masami Hiramatsu
  0 siblings, 2 replies; 25+ messages in thread
From: Masami Hiramatsu @ 2012-05-09  9:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

(2012/05/09 17:12), Masami Hiramatsu wrote:
> (2012/05/09 14:53), Masami Hiramatsu wrote:
>> (2012/05/08 22:04), Steven Rostedt wrote:
>>> Note, the big difference with using ftrace optimization and normal
>>> kprobe jump optimization is that the ftrace one can be used on a preempt
>>> kernel. But this code is still under development. I want to get a
>>> solution for the current code (this patch set) now. It would be nice if
>>> it was ready for 3.5.
>>
>> I doubt that we can really do this. If this is possible, I can make
>> jump optimization work with preemptive kernel.
> 
> Now I'm clear that it can be done only with ftrace-nop.
> What I concerned with is out-of-line execution buffer releasing
> timing. but with the ftrace-nop, we don't need to do that.
> 
> Theoretically, that will be done in following steps;
> 
> 1. call mcount (w/ push call-site address+5)
> 2. mcount calls ftrace handlers
> 3. mcount makes pt_regs on the stack
>  - on x86-32, regs->cs should be the call-site address+5, since
>    regs->sp must be the top of original stack.

Oops, no: the flags are stored in regs->cs and regs->flags holds the
call-site address. So we have to fix that up in the optprobe handler.

>  - on x86-64, it is enough that regs->sp points the top of original
>    stack.
> 4. mcount store call-site address+5 to %ax
> 5. mcount calls new-optprobe handler (which returns call-site address+5)
> 6. mcount restores registers
> 7. mcount returns to the call-site address+5.
> 
> Between 6 and 7, original opt-probe has to do out-of-line execution,
> but this ftrace-based one doesn't need to do that. In that case,
> we don't need to allocate nor to release out-of-line execution buffer
> for this probe.

And I've got a concrete idea for kprobes call optimization, which
uses a call instead of a jump. That will make ftrace-based optimization
very easy. :)

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it)
  2012-05-09  9:02                   ` Masami Hiramatsu
@ 2012-05-09 14:50                     ` Steven Rostedt
  2012-05-10 11:54                     ` [RFC PATCH -tip ] x86/kprobes: kprobes call optimization Masami Hiramatsu
  1 sibling, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2012-05-09 14:50 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

On Wed, 2012-05-09 at 18:02 +0900, Masami Hiramatsu wrote:

> And I got an concrete idea of kprobes call-optimization which
> uses call instead of jump. That will make ftrace-based optimization
> very easy. :)

I just uploaded my devel branch. If you want to make suggestions, or
even modify and add to what I have, feel free:

   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git

rfc/kprobes/ftrace-v2

-- Steve



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH -tip ] x86/kprobes: kprobes call optimization
  2012-05-09  9:02                   ` Masami Hiramatsu
  2012-05-09 14:50                     ` Steven Rostedt
@ 2012-05-10 11:54                     ` Masami Hiramatsu
  2012-05-11  8:29                       ` Masami Hiramatsu
  1 sibling, 1 reply; 25+ messages in thread
From: Masami Hiramatsu @ 2012-05-10 11:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frank Ch. Eigler, linux-kernel, Ingo Molnar, Andrew Morton,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt,
	Steven Rostedt, Masami Hiramatsu

Note: this code is still under development, but showing it
should be useful for the discussion.

Use a relative call instead of a relative jump for optimizing
kprobes. This reduces the x86-dependent magic code and the runtime
memory size.

The current jump-based optimization copies the regions below
into an out-of-line buffer and adds some extra code to make
a detour path for each optimized probe.

ffffffff815f7009 T optprobe_template_entry
ffffffff815f7029 T optprobe_template_val
ffffffff815f7033 T optprobe_template_call
ffffffff815f7068 T optprobe_template_end

This actually consumes 0x5f == 95 bytes, plus the copied
instructions (20 bytes) and the jump back (5 bytes), i.e. about
120 bytes per optimized probe.

Since the call-based optimization can share the trampoline code
above across all probes, it requires just 25 bytes for each
optimized probe.

Steven, I saw your ftrace-based optimization code. This
will do a similar thing, but in a different way. I'd like
to reuse this code for getting pt_regs on both i386 and x86-64.
So, that will be done as follows (a rough sketch is below):

- the ftrace call site calls mcount
- mcount calls the ftrace handlers
- mcount checks whether there is a kprobe;
  if so, it jumps into the trampoline code
- the trampoline code works the same way when it is called from
  the probe point directly, but when it is called from mcount,
  it returns directly to the ftrace-caller site.
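
A hedged C-level sketch of that mcount-side check (the callback and the
way it hands over to the trampoline are assumptions for illustration,
not code from this patch):

#include <linux/ftrace.h>
#include <linux/kprobes.h>

/* Hypothetical ftrace callback on a kprobe'd call site: it only shows
 * the decision point; the hand-over to the shared trampoline (which
 * builds pt_regs and runs the kprobe handlers) is left as a comment. */
static void kprobe_ftrace_callback(unsigned long ip, unsigned long parent_ip)
{
        struct kprobe *p = get_kprobe((void *)ip);

        if (p && !kprobe_disabled(p)) {
                /* jump into the trampoline code described above */
        }
        /* otherwise return normally, back to the ftrace-caller site */
}

static struct ftrace_ops kprobe_ftrace_ops = {
        .func = kprobe_ftrace_callback,
};
/* registered somewhere in kprobes init (assumption):
 *      register_ftrace_function(&kprobe_ftrace_ops);
 */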

Thank you,

Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
---

 Documentation/kprobes.txt        |   39 +++++---
 arch/x86/include/asm/kprobes.h   |   15 +--
 arch/x86/kernel/kprobes-common.h |   31 ++++--
 arch/x86/kernel/kprobes-opt.c    |  191 ++++++++++++++++----------------------
 arch/x86/kernel/kprobes.c        |    4 -
 kernel/kprobes.c                 |    6 -
 6 files changed, 131 insertions(+), 155 deletions(-)

diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
index 0cfb00f..316d5d2 100644
--- a/Documentation/kprobes.txt
+++ b/Documentation/kprobes.txt
@@ -45,7 +45,7 @@ can speed up unregistration process when you have to unregister
 a lot of probes at once.
 
 The next four subsections explain how the different types of
-probes work and how jump optimization works.  They explain certain
+probes work and how call optimization works.  They explain certain
 things that you'll need to know in order to make the best use of
 Kprobes -- e.g., the difference between a pre_handler and
 a post_handler, and how to use the maxactive and nmissed fields of
@@ -163,12 +163,12 @@ In case probed function is entered but there is no kretprobe_instance
 object available, then in addition to incrementing the nmissed count,
 the user entry_handler invocation is also skipped.
 
-1.4 How Does Jump Optimization Work?
+1.4 How Does Call Optimization Work?
 
 If your kernel is built with CONFIG_OPTPROBES=y (currently this flag
 is automatically set 'y' on x86/x86-64, non-preemptive kernel) and
 the "debug.kprobes_optimization" kernel parameter is set to 1 (see
-sysctl(8)), Kprobes tries to reduce probe-hit overhead by using a jump
+sysctl(8)), Kprobes tries to reduce probe-hit overhead by using a call
 instruction instead of a breakpoint instruction at each probepoint.
 
 1.4.1 Init a Kprobe
@@ -182,9 +182,9 @@ probepoint, there'll be a probe there.
 
 Before optimizing a probe, Kprobes performs the following safety checks:
 
-- Kprobes verifies that the region that will be replaced by the jump
+- Kprobes verifies that the region that will be replaced by the call
 instruction (the "optimized region") lies entirely within one function.
-(A jump instruction is multiple bytes, and so may overlay multiple
+(A call instruction is multiple bytes, and so may overlay multiple
 instructions.)
 
 - Kprobes analyzes the entire function and verifies that there is no
@@ -204,9 +204,6 @@ the instruction can be executed out of line.
 
 Next, Kprobes prepares a "detour" buffer, which contains the following
 instruction sequence:
-- code to push the CPU's registers (emulating a breakpoint trap)
-- a call to the trampoline code which calls user's probe handlers.
-- code to restore registers
 - the instructions from the optimized region
 - a jump back to the original execution path.
 
@@ -231,7 +228,7 @@ the CPU's instruction pointer to the copied code in the detour buffer
 
 1.4.5 Optimization
 
-The Kprobe-optimizer doesn't insert the jump instruction immediately;
+The Kprobe-optimizer doesn't insert the call instruction immediately;
 rather, it calls synchronize_sched() for safety first, because it's
 possible for a CPU to be interrupted in the middle of executing the
 optimized region(*).  As you know, synchronize_sched() can ensure
@@ -240,10 +237,20 @@ was called are done, but only if CONFIG_PREEMPT=n.  So, this version
 of kprobe optimization supports only kernels with CONFIG_PREEMPT=n.(**)
 
 After that, the Kprobe-optimizer calls stop_machine() to replace
-the optimized region with a jump instruction to the detour buffer,
-using text_poke_smp().
+the optimized region with a call instruction to the trampoline code,
+by using text_poke_smp().
 
-1.4.6 Unoptimization
+1.4.6 Trampoline code
+
+The trampoline code does the following:
+- save all registers (to build a pt_regs on the stack)
+- pass the call address (which was on the top of the original stack)
+  and the pt_regs to the optprobe callback
+- the optprobe callback finds the kprobe at the call address and
+  returns the address of the detour buffer
+- restore all registers and return to that address
+
+1.4.7 Unoptimization
 
 When an optimized kprobe is unregistered, disabled, or blocked by
 another kprobe, it will be unoptimized.  If this happens before
@@ -571,7 +578,7 @@ reason, Kprobes doesn't support return probes (or kprobes or jprobes)
 on the x86_64 version of __switch_to(); the registration functions
 return -EINVAL.
 
-On x86/x86-64, since the Jump Optimization of Kprobes modifies
+On x86/x86-64, since the Call Optimization of Kprobes modifies
 instructions widely, there are some limitations to optimization. To
 explain it, we introduce some terminology. Imagine a 3-instruction
 sequence consisting of a two 2-byte instructions and one 3-byte
@@ -593,7 +600,7 @@ DCR: Detoured Code Region
 
 The instructions in DCR are copied to the out-of-line buffer
 of the kprobe, because the bytes in DCR are replaced by
-a 5-byte jump instruction. So there are several limitations.
+a 5-byte call instruction. So there are several limitations.
 
 a) The instructions in DCR must be relocatable.
 b) The instructions in DCR must not include a call instruction.
@@ -704,8 +711,8 @@ Appendix B: The kprobes sysctl interface
 /proc/sys/debug/kprobes-optimization: Turn kprobes optimization ON/OFF.
 
 When CONFIG_OPTPROBES=y, this sysctl interface appears and it provides
-a knob to globally and forcibly turn jump optimization (see section
-1.4) ON or OFF. By default, jump optimization is allowed (ON).
+a knob to globally and forcibly turn the call optimization (see section
+1.4) ON or OFF. By default, call optimization is allowed (ON).
 If you echo "0" to this file or set "debug.kprobes_optimization" to
 0 via sysctl, all optimized probes will be unoptimized, and any new
 probes registered after that will not be optimized.  Note that this
diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index 5478825..c6e0dec 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -36,6 +36,7 @@ typedef u8 kprobe_opcode_t;
 #define RELATIVEJUMP_OPCODE 0xe9
 #define RELATIVEJUMP_SIZE 5
 #define RELATIVECALL_OPCODE 0xe8
+#define RELATIVECALL_SIZE 5
 #define RELATIVE_ADDR_SIZE 4
 #define MAX_STACK_SIZE 64
 #define MIN_STACK_SIZE(ADDR)					       \
@@ -47,16 +48,10 @@ typedef u8 kprobe_opcode_t;
 
 #define flush_insn_slot(p)	do { } while (0)
 
-/* optinsn template addresses */
-extern kprobe_opcode_t optprobe_template_entry;
-extern kprobe_opcode_t optprobe_template_val;
-extern kprobe_opcode_t optprobe_template_call;
-extern kprobe_opcode_t optprobe_template_end;
-#define MAX_OPTIMIZED_LENGTH (MAX_INSN_SIZE + RELATIVE_ADDR_SIZE)
-#define MAX_OPTINSN_SIZE 				\
-	(((unsigned long)&optprobe_template_end -	\
-	  (unsigned long)&optprobe_template_entry) +	\
-	 MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)
+/* optprobe trampoline addresses */
+void optprobe_trampoline(void);
+#define MAX_OPTIMIZED_LENGTH	(MAX_INSN_SIZE + RELATIVE_ADDR_SIZE)
+#define MAX_OPTINSN_SIZE	(MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)
 
 extern const int kretprobe_blacklist_size;
 
diff --git a/arch/x86/kernel/kprobes-common.h b/arch/x86/kernel/kprobes-common.h
index 3230b68..fedc8a3 100644
--- a/arch/x86/kernel/kprobes-common.h
+++ b/arch/x86/kernel/kprobes-common.h
@@ -4,9 +4,8 @@
 /* Kprobes and Optprobes common header */
 
 #ifdef CONFIG_X86_64
-#define SAVE_REGS_STRING			\
-	/* Skip cs, ip, orig_ax. */		\
-	"	subq $24, %rsp\n"		\
+#define SAVE_REGS_STRING_OFF(offs)		\
+	"	subq $" #offs ", %rsp\n"	\
 	"	pushq %rdi\n"			\
 	"	pushq %rsi\n"			\
 	"	pushq %rdx\n"			\
@@ -22,7 +21,7 @@
 	"	pushq %r13\n"			\
 	"	pushq %r14\n"			\
 	"	pushq %r15\n"
-#define RESTORE_REGS_STRING			\
+#define RESTORE_REGS_STRING_OFF(offs)		\
 	"	popq %r15\n"			\
 	"	popq %r14\n"			\
 	"	popq %r13\n"			\
@@ -38,12 +37,15 @@
 	"	popq %rdx\n"			\
 	"	popq %rsi\n"			\
 	"	popq %rdi\n"			\
-	/* Skip orig_ax, ip, cs */		\
-	"	addq $24, %rsp\n"
+	"	addq $" #offs ", %rsp\n"
+
+/* skip cs, ip, orig_ax */
+#define SAVE_REGS_STRING SAVE_REGS_STRING_OFF(24)
+#define RESTORE_REGS_STRING RESTORE_REGS_STRING_OFF(24)
+
 #else
-#define SAVE_REGS_STRING			\
-	/* Skip cs, ip, orig_ax and gs. */	\
-	"	subl $16, %esp\n"		\
+#define SAVE_REGS_STRING_OFF(offs)		\
+	"	subl $" #offs ", %esp\n"	\
 	"	pushl %fs\n"			\
 	"	pushl %es\n"			\
 	"	pushl %ds\n"			\
@@ -54,7 +56,7 @@
 	"	pushl %edx\n"			\
 	"	pushl %ecx\n"			\
 	"	pushl %ebx\n"
-#define RESTORE_REGS_STRING			\
+#define RESTORE_REGS_STRING_OFF(offs)		\
 	"	popl %ebx\n"			\
 	"	popl %ecx\n"			\
 	"	popl %edx\n"			\
@@ -62,8 +64,13 @@
 	"	popl %edi\n"			\
 	"	popl %ebp\n"			\
 	"	popl %eax\n"			\
-	/* Skip ds, es, fs, gs, orig_ax, and ip. Note: don't pop cs here*/\
-	"	addl $24, %esp\n"
+	"	addl $" #offs ", %esp\n"
+
+/* skip cs, ip, orig_ax and gs. */
+#define SAVE_REGS_STRING SAVE_REGS_STRING_OFF(16)
+/* skip ds, es, fs, gs, orig_ax, and ip. Note: don't pop cs here*/
+#define RESTORE_REGS_STRING RESTORE_REGS_STRING_OFF(24)
+
 #endif
 
 /* Ensure if the instruction can be boostable */
diff --git a/arch/x86/kernel/kprobes-opt.c b/arch/x86/kernel/kprobes-opt.c
index c5e410e..0b0a382 100644
--- a/arch/x86/kernel/kprobes-opt.c
+++ b/arch/x86/kernel/kprobes-opt.c
@@ -1,5 +1,5 @@
 /*
- *  Kernel Probes Jump Optimization (Optprobes)
+ *  Kernel Probes Call Optimization (Optprobes)
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -46,9 +46,9 @@ unsigned long __recover_optprobed_insn(kprobe_opcode_t *buf, unsigned long addr)
 	long offs;
 	int i;
 
-	for (i = 0; i < RELATIVEJUMP_SIZE; i++) {
+	for (i = 0; i < RELATIVECALL_SIZE; i++) {
 		kp = get_kprobe((void *)addr - i);
-		/* This function only handles jump-optimized kprobe */
+		/* This function only handles call-optimized kprobe */
 		if (kp && kprobe_optimized(kp)) {
 			op = container_of(kp, struct optimized_kprobe, kp);
 			/* If op->list is not empty, op is under optimizing */
@@ -61,7 +61,7 @@ unsigned long __recover_optprobed_insn(kprobe_opcode_t *buf, unsigned long addr)
 found:
 	/*
 	 * If the kprobe can be optimized, original bytes which can be
-	 * overwritten by jump destination address. In this case, original
+	 * overwritten by call destination address. In this case, original
 	 * bytes must be recovered from op->optinsn.copied_insn buffer.
 	 */
 	memcpy(buf, (void *)addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
@@ -76,83 +76,69 @@ found:
 	return (unsigned long)buf;
 }
 
-/* Insert a move instruction which sets a pointer to eax/rdi (1st arg). */
-static void __kprobes synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
-{
-#ifdef CONFIG_X86_64
-	*addr++ = 0x48;
-	*addr++ = 0xbf;
-#else
-	*addr++ = 0xb8;
-#endif
-	*(unsigned long *)addr = val;
-}
-
-static void __used __kprobes kprobes_optinsn_template_holder(void)
+static void __used __kprobes optprobe_trampoline_holder(void)
 {
 	asm volatile (
-			".global optprobe_template_entry\n"
-			"optprobe_template_entry:\n"
+			".global optprobe_trampoline\n"
+			"optprobe_trampoline:\n"
+			/* When call to here, the stack has call address+5 */
 #ifdef CONFIG_X86_64
 			/* We don't bother saving the ss register */
 			"	pushq %rsp\n"
 			"	pushfq\n"
 			SAVE_REGS_STRING
-			"	movq %rsp, %rsi\n"
-			".global optprobe_template_val\n"
-			"optprobe_template_val:\n"
-			ASM_NOP5
-			ASM_NOP5
-			".global optprobe_template_call\n"
-			"optprobe_template_call:\n"
-			ASM_NOP5
-			/* Move flags to rsp */
-			"	movq 144(%rsp), %rdx\n"
-			"	movq %rdx, 152(%rsp)\n"
+			"	movq %rsp, %rsi\n"	/* pt_regs */
+			"	movq 160(%rsp), %rdi\n"	/* call address */
+			"	call optimized_callback\n"
+			/* Replace call address with stub address */
+			"       movq %rax, 160(%rsp)\n"
 			RESTORE_REGS_STRING
-			/* Skip flags entry */
-			"	addq $8, %rsp\n"
 			"	popfq\n"
+			/* Skip rsp entry */
+			"	addq $8, %rsp\n"
 #else /* CONFIG_X86_32 */
 			"	pushf\n"
-			SAVE_REGS_STRING
-			"	movl %esp, %edx\n"
-			".global optprobe_template_val\n"
-			"optprobe_template_val:\n"
-			ASM_NOP5
-			".global optprobe_template_call\n"
-			"optprobe_template_call:\n"
-			ASM_NOP5
+			/* skip only ip, orig_ax, and gs (flags saved on cs) */
+			SAVE_REGS_STRING_OFF(12)
+			"	movl 56(%esp), %eax\n"	/* call address */
+			/* recover flags from cs */
+			"	movl 52(%esp), %edx\n"
+			"	movl %edx, 56(%esp)\n"
+			"	movl %esp, %edx\n"	/* pt_regs */
+			"	call optimized_callback\n"
+			/* Move flags to cs */
+			"       movl 56(%esp), %edx\n"
+			"       movl %edx, 52(%esp)\n"
+			/* Replace flags with stub address */
+			"       movl %eax, 56(%esp)\n"
 			RESTORE_REGS_STRING
-			"	addl $4, %esp\n"	/* skip cs */
 			"	popf\n"
 #endif
-			".global optprobe_template_end\n"
-			"optprobe_template_end:\n");
+			"	ret\n");
 }
 
-#define TMPL_MOVE_IDX \
-	((long)&optprobe_template_val - (long)&optprobe_template_entry)
-#define TMPL_CALL_IDX \
-	((long)&optprobe_template_call - (long)&optprobe_template_entry)
-#define TMPL_END_IDX \
-	((long)&optprobe_template_end - (long)&optprobe_template_entry)
-
 #define INT3_SIZE sizeof(kprobe_opcode_t)
 
 /* Optimized kprobe call back function: called from optinsn */
-static void __kprobes optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
+static __used __kprobes kprobe_opcode_t *optimized_callback(kprobe_opcode_t *addr, struct pt_regs *regs)
 {
-	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+	struct optimized_kprobe *op;
+	struct kprobe_ctlblk *kcb;
 	unsigned long flags;
+	struct kprobe *p;
+
+	local_irq_save(flags);
+
+	kcb = get_kprobe_ctlblk();
+	p = get_kprobe(addr - RELATIVECALL_SIZE);
+	BUG_ON(!p);
 
 	/* This is possible if op is under delayed unoptimizing */
-	if (kprobe_disabled(&op->kp))
-		return;
+	if (kprobe_disabled(p))
+		goto end;
 
-	local_irq_save(flags);
 	if (kprobe_running()) {
-		kprobes_inc_nmissed_count(&op->kp);
+		kprobes_inc_nmissed_count(p);
 	} else {
 		/* Save skipped registers */
 #ifdef CONFIG_X86_64
@@ -161,22 +147,26 @@ static void __kprobes optimized_callback(struct optimized_kprobe *op, struct pt_
 		regs->cs = __KERNEL_CS | get_kernel_rpl();
 		regs->gs = 0;
 #endif
-		regs->ip = (unsigned long)op->kp.addr + INT3_SIZE;
+		regs->ip = (unsigned long)p->addr + INT3_SIZE;
 		regs->orig_ax = ~0UL;
 
-		__this_cpu_write(current_kprobe, &op->kp);
+		__this_cpu_write(current_kprobe, p);
 		kcb->kprobe_status = KPROBE_HIT_ACTIVE;
-		opt_pre_handler(&op->kp, regs);
+		opt_pre_handler(p, regs);
 		__this_cpu_write(current_kprobe, NULL);
 	}
+end:
+	op = container_of(p, struct optimized_kprobe, kp);
+	addr = (kprobe_opcode_t *)(op->optinsn.insn);
 	local_irq_restore(flags);
+	return addr;
 }
 
 static int __kprobes copy_optimized_instructions(u8 *dest, u8 *src)
 {
 	int len = 0, ret;
 
-	while (len < RELATIVEJUMP_SIZE) {
+	while (len < RELATIVECALL_SIZE) {
 		ret = __copy_instruction(dest + len, src + len);
 		if (!ret || !can_boost(dest + len))
 			return -EINVAL;
@@ -245,8 +235,8 @@ static int __kprobes can_optimize(unsigned long paddr)
 	    (paddr <  (unsigned long)__entry_text_end))
 		return 0;
 
-	/* Check there is enough space for a relative jump. */
-	if (size - offset < RELATIVEJUMP_SIZE)
+	/* Check there is enough space for a relative call. */
+	if (size - offset < RELATIVECALL_SIZE)
 		return 0;
 
 	/* Decode instructions */
@@ -325,7 +315,6 @@ int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
 {
 	u8 *buf;
 	int ret;
-	long rel;
 
 	if (!can_optimize((unsigned long)op->kp.addr))
 		return -EILSEQ;
@@ -334,70 +323,52 @@ int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
 	if (!op->optinsn.insn)
 		return -ENOMEM;
 
-	/*
-	 * Verify if the address gap is in 2GB range, because this uses
-	 * a relative jump.
-	 */
-	rel = (long)op->optinsn.insn - (long)op->kp.addr + RELATIVEJUMP_SIZE;
-	if (abs(rel) > 0x7fffffff)
-		return -ERANGE;
-
 	buf = (u8 *)op->optinsn.insn;
 
 	/* Copy instructions into the out-of-line buffer */
-	ret = copy_optimized_instructions(buf + TMPL_END_IDX, op->kp.addr);
+	ret = copy_optimized_instructions(buf, op->kp.addr);
 	if (ret < 0) {
 		__arch_remove_optimized_kprobe(op, 0);
 		return ret;
 	}
 	op->optinsn.size = ret;
 
-	/* Copy arch-dep-instance from template */
-	memcpy(buf, &optprobe_template_entry, TMPL_END_IDX);
-
-	/* Set probe information */
-	synthesize_set_arg1(buf + TMPL_MOVE_IDX, (unsigned long)op);
-
-	/* Set probe function call */
-	synthesize_relcall(buf + TMPL_CALL_IDX, optimized_callback);
-
 	/* Set returning jmp instruction at the tail of out-of-line buffer */
-	synthesize_reljump(buf + TMPL_END_IDX + op->optinsn.size,
+	synthesize_reljump(buf + op->optinsn.size,
 			   (u8 *)op->kp.addr + op->optinsn.size);
 
-	flush_icache_range((unsigned long) buf,
-			   (unsigned long) buf + TMPL_END_IDX +
+	flush_icache_range((unsigned long) buf, (unsigned long) buf +
 			   op->optinsn.size + RELATIVEJUMP_SIZE);
 	return 0;
 }
 
 #define MAX_OPTIMIZE_PROBES 256
-static struct text_poke_param *jump_poke_params;
-static struct jump_poke_buffer {
-	u8 buf[RELATIVEJUMP_SIZE];
-} *jump_poke_bufs;
+static struct text_poke_param *call_poke_params;
+static struct call_poke_buffer {
+	u8 buf[RELATIVECALL_SIZE];
+} *call_poke_bufs;
 
 static void __kprobes setup_optimize_kprobe(struct text_poke_param *tprm,
 					    u8 *insn_buf,
 					    struct optimized_kprobe *op)
 {
-	s32 rel = (s32)((long)op->optinsn.insn -
-			((long)op->kp.addr + RELATIVEJUMP_SIZE));
+	s32 rel = (s32)((long)&optprobe_trampoline -
+			((long)op->kp.addr + RELATIVECALL_SIZE));
 
 	/* Backup instructions which will be replaced by jump address */
 	memcpy(op->optinsn.copied_insn, op->kp.addr + INT3_SIZE,
 	       RELATIVE_ADDR_SIZE);
 
-	insn_buf[0] = RELATIVEJUMP_OPCODE;
+	insn_buf[0] = RELATIVECALL_OPCODE;
 	*(s32 *)(&insn_buf[1]) = rel;
 
 	tprm->addr = op->kp.addr;
 	tprm->opcode = insn_buf;
-	tprm->len = RELATIVEJUMP_SIZE;
+	tprm->len = RELATIVECALL_SIZE;
 }
 
 /*
- * Replace breakpoints (int3) with relative jumps.
+ * Replace breakpoints (int3) with relative calls.
  * Caller must call with locking kprobe_mutex and text_mutex.
  */
 void __kprobes arch_optimize_kprobes(struct list_head *oplist)
@@ -408,8 +379,8 @@ void __kprobes arch_optimize_kprobes(struct list_head *oplist)
 	list_for_each_entry_safe(op, tmp, oplist, list) {
 		WARN_ON(kprobe_disabled(&op->kp));
 		/* Setup param */
-		setup_optimize_kprobe(&jump_poke_params[c],
-				      jump_poke_bufs[c].buf, op);
+		setup_optimize_kprobe(&call_poke_params[c],
+				      call_poke_bufs[c].buf, op);
 		list_del_init(&op->list);
 		if (++c >= MAX_OPTIMIZE_PROBES)
 			break;
@@ -420,7 +391,7 @@ void __kprobes arch_optimize_kprobes(struct list_head *oplist)
 	 * However, since kprobes itself also doesn't support NMI/MCE
 	 * code probing, it's not a problem.
 	 */
-	text_poke_smp_batch(jump_poke_params, c);
+	text_poke_smp_batch(call_poke_params, c);
 }
 
 static void __kprobes setup_unoptimize_kprobe(struct text_poke_param *tprm,
@@ -433,11 +404,11 @@ static void __kprobes setup_unoptimize_kprobe(struct text_poke_param *tprm,
 
 	tprm->addr = op->kp.addr;
 	tprm->opcode = insn_buf;
-	tprm->len = RELATIVEJUMP_SIZE;
+	tprm->len = RELATIVECALL_SIZE;
 }
 
 /*
- * Recover original instructions and breakpoints from relative jumps.
+ * Recover original instructions and breakpoints from relative calls.
  * Caller must call with locking kprobe_mutex.
  */
 extern void arch_unoptimize_kprobes(struct list_head *oplist,
@@ -448,8 +419,8 @@ extern void arch_unoptimize_kprobes(struct list_head *oplist,
 
 	list_for_each_entry_safe(op, tmp, oplist, list) {
 		/* Setup param */
-		setup_unoptimize_kprobe(&jump_poke_params[c],
-					jump_poke_bufs[c].buf, op);
+		setup_unoptimize_kprobe(&call_poke_params[c],
+					call_poke_bufs[c].buf, op);
 		list_move(&op->list, done_list);
 		if (++c >= MAX_OPTIMIZE_PROBES)
 			break;
@@ -460,18 +431,18 @@ extern void arch_unoptimize_kprobes(struct list_head *oplist,
 	 * However, since kprobes itself also doesn't support NMI/MCE
 	 * code probing, it's not a problem.
 	 */
-	text_poke_smp_batch(jump_poke_params, c);
+	text_poke_smp_batch(call_poke_params, c);
 }
 
 /* Replace a relative jump with a breakpoint (int3).  */
 void __kprobes arch_unoptimize_kprobe(struct optimized_kprobe *op)
 {
-	u8 buf[RELATIVEJUMP_SIZE];
+	u8 buf[RELATIVECALL_SIZE];
 
 	/* Set int3 to first byte for kprobes */
 	buf[0] = BREAKPOINT_INSTRUCTION;
 	memcpy(buf + 1, op->optinsn.copied_insn, RELATIVE_ADDR_SIZE);
-	text_poke_smp(op->kp.addr, buf, RELATIVEJUMP_SIZE);
+	text_poke_smp(op->kp.addr, buf, RELATIVECALL_SIZE);
 }
 
 int  __kprobes
@@ -483,7 +454,7 @@ setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter)
 		/* This kprobe is really able to run optimized path. */
 		op = container_of(p, struct optimized_kprobe, kp);
 		/* Detour through copied instructions */
-		regs->ip = (unsigned long)op->optinsn.insn + TMPL_END_IDX;
+		regs->ip = (unsigned long)op->optinsn.insn;
 		if (!reenter)
 			reset_current_kprobe();
 		preempt_enable_no_resched();
@@ -495,16 +466,16 @@ setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter)
 int __kprobes arch_init_optprobes(void)
 {
 	/* Allocate code buffer and parameter array */
-	jump_poke_bufs = kmalloc(sizeof(struct jump_poke_buffer) *
+	call_poke_bufs = kmalloc(sizeof(struct call_poke_buffer) *
 				 MAX_OPTIMIZE_PROBES, GFP_KERNEL);
-	if (!jump_poke_bufs)
+	if (!call_poke_bufs)
 		return -ENOMEM;
 
-	jump_poke_params = kmalloc(sizeof(struct text_poke_param) *
+	call_poke_params = kmalloc(sizeof(struct text_poke_param) *
 				   MAX_OPTIMIZE_PROBES, GFP_KERNEL);
-	if (!jump_poke_params) {
-		kfree(jump_poke_bufs);
-		jump_poke_bufs = NULL;
+	if (!call_poke_params) {
+		kfree(call_poke_bufs);
+		call_poke_bufs = NULL;
 		return -ENOMEM;
 	}
 
diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index e2f751e..2fe982c 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -276,8 +276,8 @@ static int __kprobes can_probe(unsigned long paddr)
 		 * Check if the instruction has been modified by another
 		 * kprobe, in which case we replace the breakpoint by the
 		 * original instruction in our buffer.
-		 * Also, jump optimization will change the breakpoint to
-		 * relative-jump. Since the relative-jump itself is
+		 * Also, call optimization will change the breakpoint to
+		 * relative-call. Since the relative-call itself is
 		 * normally used, we just go through if there is no kprobe.
 		 */
 		__addr = recover_probed_instruction(buf, addr);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index c62b854..b1552dd 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -290,7 +290,7 @@ void __kprobes free_insn_slot(kprobe_opcode_t * slot, int dirty)
 static DEFINE_MUTEX(kprobe_optinsn_mutex); /* Protects kprobe_optinsn_slots */
 static struct kprobe_insn_cache kprobe_optinsn_slots = {
 	.pages = LIST_HEAD_INIT(kprobe_optinsn_slots.pages),
-	/* .insn_size is initialized later */
+	.insn_size = MAX_OPTINSN_SIZE,
 	.nr_garbage = 0,
 };
 /* Get a slot for optimized_kprobe buffer */
@@ -2002,10 +2002,6 @@ static int __init init_kprobes(void)
 	}
 
 #if defined(CONFIG_OPTPROBES)
-#if defined(__ARCH_WANT_KPROBES_INSN_SLOT)
-	/* Init kprobe_optinsn_slots */
-	kprobe_optinsn_slots.insn_size = MAX_OPTINSN_SIZE;
-#endif
 	/* By default, kprobes can be optimized */
 	kprobes_allow_optimization = true;
 #endif


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH -tip ] x86/kprobes: kprobes call optimization
  2012-05-10 11:54                     ` [RFC PATCH -tip ] x86/kprobes: kprobes call optimization Masami Hiramatsu
@ 2012-05-11  8:29                       ` Masami Hiramatsu
  0 siblings, 0 replies; 25+ messages in thread
From: Masami Hiramatsu @ 2012-05-11  8:29 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Steven Rostedt, Frank Ch. Eigler, linux-kernel, Ingo Molnar,
	Andrew Morton, Frederic Weisbecker, H. Peter Anvin,
	yrl.pp-manager.tt

(2012/05/10 20:54), Masami Hiramatsu wrote:
> Note: this code is still under development, but it will
> be useful for the discussion to show this.
> 
> Use relative call instead of relative jump for optimizing
> kprobes. This reduces x86-depend magic code and runtime
> memory size.

Hmm, I've found that this patch may increase the execution
time of probing by more than 20%. I think that is more overhead
than is acceptable...


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2012-05-11  8:29 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-02 19:24 [PATCH 0/9][RFC] ftrace: ftrace location lookup speedup, and clean ups Steven Rostedt
2012-05-02 19:24 ` [PATCH 1/9][RFC] ftrace: Sort all function addresses, not just per page Steven Rostedt
2012-05-02 19:24 ` [PATCH 2/9][RFC] ftrace: Remove extra helper functions Steven Rostedt
2012-05-02 19:24 ` [PATCH 3/9][RFC] ftrace: Speed up search by skipping pages by address Steven Rostedt
2012-05-02 19:24 ` [PATCH 4/9][RFC] ftrace: Consolidate ftrace_location() and ftrace_text_reserved() Steven Rostedt
2012-05-02 19:24 ` [PATCH 5/9][RFC] ftrace: Return record ip addr for ftrace_location() Steven Rostedt
2012-05-02 19:24 ` [PATCH 6/9][RFC] kprobes: Allow probe on ftrace reserved text (but move it) Steven Rostedt
2012-05-02 20:40   ` Frank Ch. Eigler
2012-05-02 23:40     ` Steven Rostedt
2012-05-07 11:37       ` Masami Hiramatsu
2012-05-07 11:59         ` Masami Hiramatsu
2012-05-07 12:44           ` Steven Rostedt
2012-05-07 12:51             ` Steven Rostedt
2012-05-07 12:43         ` Steven Rostedt
2012-05-08  3:08           ` Masami Hiramatsu
2012-05-08 13:04             ` Steven Rostedt
2012-05-09  5:53               ` Masami Hiramatsu
2012-05-09  8:12                 ` Masami Hiramatsu
2012-05-09  9:02                   ` Masami Hiramatsu
2012-05-09 14:50                     ` Steven Rostedt
2012-05-10 11:54                     ` [RFC PATCH -tip ] x86/kprobes: kprobes call optimization Masami Hiramatsu
2012-05-11  8:29                       ` Masami Hiramatsu
2012-05-02 19:24 ` [PATCH 7/9][RFC] ftrace: Make ftrace_modify_all_code() global for archs to use Steven Rostedt
2012-05-02 19:24 ` [PATCH 8/9][RFC] ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() Steven Rostedt
2012-05-02 19:24 ` [PATCH 9/9][RFC] ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER Steven Rostedt
