All of lore.kernel.org
 help / color / mirror / Atom feed
* [for-next][PATCH 0/7] tracing: Updates to filter probes and early ftrace init
@ 2017-04-05 16:21 Steven Rostedt
  2017-04-05 16:21 ` [for-next][PATCH 1/7] ftrace: Clean up __seq_open_private() return check Steven Rostedt
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Steven Rostedt @ 2017-04-05 16:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
for-next

Head SHA1: 696ced4fb1d76802f864d8848aa4716633f83c17


Alban Crequy (1):
      tracing/kprobes: expose maxactive for kretprobe in kprobe_events

Steven Rostedt (VMware) (6):
      ftrace: Clean up __seq_open_private() return check
      ftrace: Assign iter->hash to filter or notrace hashes on seq read
      ftrace: Return NULL at end of t_start() instead of calling t_hash_start()
      ftrace: Update func_pos in t_start() when all functions are enabled
      ftrace: Create separate t_func_next() to simplify the function / hash logic
      ftrace: Have init/main.c call ftrace directly to free init memory

----
 Documentation/trace/kprobetrace.txt                |  5 +-
 include/linux/ftrace.h                             |  6 +-
 init/main.c                                        |  1 +
 kernel/trace/ftrace.c                              | 96 +++++++++++++---------
 kernel/trace/trace_kprobe.c                        | 39 +++++++--
 mm/page_alloc.c                                    |  3 -
 .../ftrace/test.d/kprobe/kretprobe_maxactive.tc    | 39 +++++++++
 7 files changed, 137 insertions(+), 52 deletions(-)
 create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [for-next][PATCH 1/7] ftrace: Clean up __seq_open_private() return check
  2017-04-05 16:21 [for-next][PATCH 0/7] tracing: Updates to filter probes and early ftrace init Steven Rostedt
@ 2017-04-05 16:21 ` Steven Rostedt
  2017-04-05 16:21 ` [for-next][PATCH 2/7] ftrace: Assign iter->hash to filter or notrace hashes on seq read Steven Rostedt
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2017-04-05 16:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

[-- Attachment #1: 0001-ftrace-Clean-up-__seq_open_private-return-check.patch --]
[-- Type: text/plain, Size: 1668 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The return status check of __seq_open_private() is rather strange:

	iter = __seq_open_private();
	if (iter) {
		/* do stuff */
	}

	return iter ? 0 : -ENOMEM;

It makes much more sense to do the return of failure right away:

	iter = __seq_open_private();
	if (!iter)
		return -ENOMEM;

	/* do stuff */

	return 0;

This clean up will make updates to this code a bit nicer.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 0556a202c055..527c4d3e8d7f 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3355,12 +3355,13 @@ ftrace_avail_open(struct inode *inode, struct file *file)
 		return -ENODEV;
 
 	iter = __seq_open_private(file, &show_ftrace_seq_ops, sizeof(*iter));
-	if (iter) {
-		iter->pg = ftrace_pages_start;
-		iter->ops = &global_ops;
-	}
+	if (!iter)
+		return -ENOMEM;
 
-	return iter ? 0 : -ENOMEM;
+	iter->pg = ftrace_pages_start;
+	iter->ops = &global_ops;
+
+	return 0;
 }
 
 static int
@@ -3369,13 +3370,14 @@ ftrace_enabled_open(struct inode *inode, struct file *file)
 	struct ftrace_iterator *iter;
 
 	iter = __seq_open_private(file, &show_ftrace_seq_ops, sizeof(*iter));
-	if (iter) {
-		iter->pg = ftrace_pages_start;
-		iter->flags = FTRACE_ITER_ENABLED;
-		iter->ops = &global_ops;
-	}
+	if (!iter)
+		return -ENOMEM;
 
-	return iter ? 0 : -ENOMEM;
+	iter->pg = ftrace_pages_start;
+	iter->flags = FTRACE_ITER_ENABLED;
+	iter->ops = &global_ops;
+
+	return 0;
 }
 
 /**
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 2/7] ftrace: Assign iter->hash to filter or notrace hashes on seq read
  2017-04-05 16:21 [for-next][PATCH 0/7] tracing: Updates to filter probes and early ftrace init Steven Rostedt
  2017-04-05 16:21 ` [for-next][PATCH 1/7] ftrace: Clean up __seq_open_private() return check Steven Rostedt
@ 2017-04-05 16:21 ` Steven Rostedt
  2017-04-05 16:21 ` [for-next][PATCH 3/7] ftrace: Return NULL at end of t_start() instead of calling t_hash_start() Steven Rostedt
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2017-04-05 16:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

[-- Attachment #1: 0002-ftrace-Assign-iter-hash-to-filter-or-notrace-hashes-.patch --]
[-- Type: text/plain, Size: 2831 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Instead of testing if the hash to use is the filter_hash or the notrace_hash
at each iteration, do the test at open, and set the iter->hash to point to
the corresponding filter or notrace hash. Then use that directly instead of
testing which hash needs to be used each iteration.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 527c4d3e8d7f..3165b7f840e6 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3157,7 +3157,6 @@ static void *
 t_next(struct seq_file *m, void *v, loff_t *pos)
 {
 	struct ftrace_iterator *iter = m->private;
-	struct ftrace_ops *ops = iter->ops;
 	struct dyn_ftrace *rec = NULL;
 
 	if (unlikely(ftrace_disabled))
@@ -3181,11 +3180,8 @@ t_next(struct seq_file *m, void *v, loff_t *pos)
 		}
 	} else {
 		rec = &iter->pg->records[iter->idx++];
-		if (((iter->flags & FTRACE_ITER_FILTER) &&
-		     !(ftrace_lookup_ip(ops->func_hash->filter_hash, rec->ip))) ||
-
-		    ((iter->flags & FTRACE_ITER_NOTRACE) &&
-		     !ftrace_lookup_ip(ops->func_hash->notrace_hash, rec->ip)) ||
+		if (((iter->flags & (FTRACE_ITER_FILTER | FTRACE_ITER_NOTRACE)) &&
+		     !ftrace_lookup_ip(iter->hash, rec->ip)) ||
 
 		    ((iter->flags & FTRACE_ITER_ENABLED) &&
 		     !(rec->flags & FTRACE_FL_ENABLED))) {
@@ -3213,7 +3209,6 @@ static void reset_iter_read(struct ftrace_iterator *iter)
 static void *t_start(struct seq_file *m, loff_t *pos)
 {
 	struct ftrace_iterator *iter = m->private;
-	struct ftrace_ops *ops = iter->ops;
 	void *p = NULL;
 	loff_t l;
 
@@ -3233,10 +3228,8 @@ static void *t_start(struct seq_file *m, loff_t *pos)
 	 * off, we can short cut and just print out that all
 	 * functions are enabled.
 	 */
-	if ((iter->flags & FTRACE_ITER_FILTER &&
-	     ftrace_hash_empty(ops->func_hash->filter_hash)) ||
-	    (iter->flags & FTRACE_ITER_NOTRACE &&
-	     ftrace_hash_empty(ops->func_hash->notrace_hash))) {
+	if ((iter->flags & (FTRACE_ITER_FILTER | FTRACE_ITER_NOTRACE)) &&
+	    ftrace_hash_empty(iter->hash)) {
 		if (*pos > 0)
 			return t_hash_start(m, pos);
 		iter->flags |= FTRACE_ITER_PRINTALL;
@@ -3442,7 +3435,8 @@ ftrace_regex_open(struct ftrace_ops *ops, int flag,
 			ret = -ENOMEM;
 			goto out_unlock;
 		}
-	}
+	} else
+		iter->hash = hash;
 
 	if (file->f_mode & FMODE_READ) {
 		iter->pg = ftrace_pages_start;
@@ -4526,6 +4520,9 @@ int ftrace_regex_release(struct inode *inode, struct file *file)
 			free_ftrace_hash_rcu(old_hash);
 		}
 		mutex_unlock(&ftrace_lock);
+	} else {
+		/* For read only, the hash is the ops hash */
+		iter->hash = NULL;
 	}
 
 	mutex_unlock(&iter->ops->func_hash->regex_lock);
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 3/7] ftrace: Return NULL at end of t_start() instead of calling t_hash_start()
  2017-04-05 16:21 [for-next][PATCH 0/7] tracing: Updates to filter probes and early ftrace init Steven Rostedt
  2017-04-05 16:21 ` [for-next][PATCH 1/7] ftrace: Clean up __seq_open_private() return check Steven Rostedt
  2017-04-05 16:21 ` [for-next][PATCH 2/7] ftrace: Assign iter->hash to filter or notrace hashes on seq read Steven Rostedt
@ 2017-04-05 16:21 ` Steven Rostedt
  2017-04-05 16:21 ` [for-next][PATCH 4/7] ftrace: Update func_pos in t_start() when all functions are enabled Steven Rostedt
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2017-04-05 16:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

[-- Attachment #1: 0003-ftrace-Return-NULL-at-end-of-t_start-instead-of-call.patch --]
[-- Type: text/plain, Size: 747 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The loop in t_start() of calling t_next() will call t_hash_start() if the
pos is beyond the functions and enters the hash items. There's no reason to
check if p is NULL and call t_hash_start(), as that would be redundant.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 3165b7f840e6..421530831ddd 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3255,7 +3255,7 @@ static void *t_start(struct seq_file *m, loff_t *pos)
 	}
 
 	if (!p)
-		return t_hash_start(m, pos);
+		return NULL;
 
 	return iter;
 }
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 4/7] ftrace: Update func_pos in t_start() when all functions are enabled
  2017-04-05 16:21 [for-next][PATCH 0/7] tracing: Updates to filter probes and early ftrace init Steven Rostedt
                   ` (2 preceding siblings ...)
  2017-04-05 16:21 ` [for-next][PATCH 3/7] ftrace: Return NULL at end of t_start() instead of calling t_hash_start() Steven Rostedt
@ 2017-04-05 16:21 ` Steven Rostedt
  2017-04-05 16:21 ` [for-next][PATCH 5/7] ftrace: Create separate t_func_next() to simplify the function / hash logic Steven Rostedt
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2017-04-05 16:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

[-- Attachment #1: 0004-ftrace-Update-func_pos-in-t_start-when-all-functions.patch --]
[-- Type: text/plain, Size: 1745 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

If all functions are enabled, there's a comment displayed in the file to
denote that:

  # cd /sys/kernel/debug/tracing
  # cat set_ftrace_filter
 #### all functions enabled ####

If a function trigger is set, those are displayed as well:

  # echo schedule:traceoff >> /debug/tracing/set_ftrace_filter
  # cat set_ftrace_filter
 #### all functions enabled ####
 schedule:traceoff:unlimited

But if you read that file with dd, the output can change:

  # dd if=/debug/tracing/set_ftrace_filter bs=1
 #### all functions enabled ####
 32+0 records in
 32+0 records out
 32 bytes copied, 7.0237e-05 s, 456 kB/s

This is because the "pos" variable is updated for the comment, but func_pos
is not. "func_pos" is used by the triggers (or hashes) to know how many
functions were printed and it bases its index from the pos - func_pos.
func_pos should be 1 to count for the comment printed. But since it is not,
t_hash_start() thinks that one trigger was already printed.

The cat gets to t_hash_start() via t_next() and not t_start() which updates
both pos and func_pos.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 421530831ddd..d4b18ce9ba88 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3230,6 +3230,7 @@ static void *t_start(struct seq_file *m, loff_t *pos)
 	 */
 	if ((iter->flags & (FTRACE_ITER_FILTER | FTRACE_ITER_NOTRACE)) &&
 	    ftrace_hash_empty(iter->hash)) {
+		iter->func_pos = 1; /* Account for the message */
 		if (*pos > 0)
 			return t_hash_start(m, pos);
 		iter->flags |= FTRACE_ITER_PRINTALL;
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 5/7] ftrace: Create separate t_func_next() to simplify the function / hash logic
  2017-04-05 16:21 [for-next][PATCH 0/7] tracing: Updates to filter probes and early ftrace init Steven Rostedt
                   ` (3 preceding siblings ...)
  2017-04-05 16:21 ` [for-next][PATCH 4/7] ftrace: Update func_pos in t_start() when all functions are enabled Steven Rostedt
@ 2017-04-05 16:21 ` Steven Rostedt
  2017-04-05 16:21 ` [for-next][PATCH 6/7] ftrace: Have init/main.c call ftrace directly to free init memory Steven Rostedt
  2017-04-05 16:21 ` [for-next][PATCH 7/7] tracing/kprobes: expose maxactive for kretprobe in kprobe_events Steven Rostedt
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2017-04-05 16:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

[-- Attachment #1: 0005-ftrace-Create-separate-t_func_next-to-simplify-the-f.patch --]
[-- Type: text/plain, Size: 3114 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

I noticed that if I use dd to read the set_ftrace_filter file that the first
hash command is repeated.

 # cd /sys/kernel/debug/tracing
 # echo schedule > set_ftrace_filter
 # echo do_IRQ >> set_ftrace_filter
 # echo schedule:traceoff >> set_ftrace_filter
 # echo do_IRQ:traceoff >> set_ftrace_filter

 # cat set_ftrace_filter
 schedule
 do_IRQ
 schedule:traceoff:unlimited
 do_IRQ:traceoff:unlimited

 # dd if=set_ftrace_filter bs=1
 schedule
 do_IRQ
 schedule:traceoff:unlimited
 schedule:traceoff:unlimited
 do_IRQ:traceoff:unlimited
 98+0 records in
 98+0 records out
 98 bytes copied, 0.00265011 s, 37.0 kB/s

This is due to the way t_start() calls t_next() as well as the seq_file
calls t_next() and the state is slightly different between the two. Namely,
t_start() will call t_next() with a local "pos" variable.

By separating out the function listing from t_next() into its own function,
we can have better control of outputting the functions and the hash of
triggers. This simplifies the code.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 45 +++++++++++++++++++++++++++++++--------------
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index d4b18ce9ba88..aff7a2c08387 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3154,22 +3154,12 @@ t_hash_show(struct seq_file *m, struct ftrace_iterator *iter)
 }
 
 static void *
-t_next(struct seq_file *m, void *v, loff_t *pos)
+t_func_next(struct seq_file *m, loff_t *pos)
 {
 	struct ftrace_iterator *iter = m->private;
 	struct dyn_ftrace *rec = NULL;
 
-	if (unlikely(ftrace_disabled))
-		return NULL;
-
-	if (iter->flags & FTRACE_ITER_HASH)
-		return t_hash_next(m, pos);
-
 	(*pos)++;
-	iter->pos = iter->func_pos = *pos;
-
-	if (iter->flags & FTRACE_ITER_PRINTALL)
-		return t_hash_start(m, pos);
 
  retry:
 	if (iter->idx >= iter->pg->index) {
@@ -3192,13 +3182,40 @@ t_next(struct seq_file *m, void *v, loff_t *pos)
 	}
 
 	if (!rec)
-		return t_hash_start(m, pos);
+		return NULL;
 
+	iter->pos = iter->func_pos = *pos;
 	iter->func = rec;
 
 	return iter;
 }
 
+static void *
+t_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct ftrace_iterator *iter = m->private;
+	void *ret;
+
+	if (unlikely(ftrace_disabled))
+		return NULL;
+
+	if (iter->flags & FTRACE_ITER_HASH)
+		return t_hash_next(m, pos);
+
+	if (iter->flags & FTRACE_ITER_PRINTALL) {
+		/* next must increment pos, and t_hash_start does not */
+		(*pos)++;
+		return t_hash_start(m, pos);
+	}
+
+	ret = t_func_next(m, pos);
+
+	if (!ret)
+		return t_hash_start(m, pos);
+
+	return ret;
+}
+
 static void reset_iter_read(struct ftrace_iterator *iter)
 {
 	iter->pos = 0;
@@ -3250,13 +3267,13 @@ static void *t_start(struct seq_file *m, loff_t *pos)
 	iter->pg = ftrace_pages_start;
 	iter->idx = 0;
 	for (l = 0; l <= *pos; ) {
-		p = t_next(m, p, &l);
+		p = t_func_next(m, &l);
 		if (!p)
 			break;
 	}
 
 	if (!p)
-		return NULL;
+		return t_hash_start(m, pos);
 
 	return iter;
 }
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 6/7] ftrace: Have init/main.c call ftrace directly to free init memory
  2017-04-05 16:21 [for-next][PATCH 0/7] tracing: Updates to filter probes and early ftrace init Steven Rostedt
                   ` (4 preceding siblings ...)
  2017-04-05 16:21 ` [for-next][PATCH 5/7] ftrace: Create separate t_func_next() to simplify the function / hash logic Steven Rostedt
@ 2017-04-05 16:21 ` Steven Rostedt
  2017-04-05 16:21 ` [for-next][PATCH 7/7] tracing/kprobes: expose maxactive for kretprobe in kprobe_events Steven Rostedt
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2017-04-05 16:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, kernel test robot, Fengguang Wu

[-- Attachment #1: 0006-ftrace-Have-init-main.c-call-ftrace-directly-to-free.patch --]
[-- Type: text/plain, Size: 3681 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Relying on free_reserved_area() to call ftrace to free init memory proved to
not be sufficient. The issue is that on x86, when debug_pagealloc is
enabled, the init memory is not freed, but simply set as not present. Since
ftrace was uninformed of this, starting function tracing still tries to
update pages that are not present according to the page tables, causing
ftrace to bug, as well as killing the kernel itself.

Instead of relying on free_reserved_area(), have init/main.c call ftrace
directly just before it frees the init memory. Then it needs to use
__init_begin and __init_end to know where the init memory location is.
Looking at all archs (and testing what I can), it appears that this should
work for each of them.

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/ftrace.h | 6 +++---
 init/main.c            | 1 +
 kernel/trace/ftrace.c  | 7 ++++---
 mm/page_alloc.c        | 3 ---
 4 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 0276a2c487e6..ef7123219f14 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -147,9 +147,9 @@ struct ftrace_ops_hash {
 	struct mutex			regex_lock;
 };
 
-void ftrace_free_mem(void *start, void *end);
+void ftrace_free_init_mem(void);
 #else
-static inline void ftrace_free_mem(void *start, void *end) { }
+static inline void ftrace_free_init_mem(void) { }
 #endif
 
 /*
@@ -266,7 +266,7 @@ static inline int ftrace_nr_registered_ops(void)
 }
 static inline void clear_ftrace_function(void) { }
 static inline void ftrace_kill(void) { }
-static inline void ftrace_free_mem(void *start, void *end) { }
+static inline void ftrace_free_init_mem(void) { }
 #endif /* CONFIG_FUNCTION_TRACER */
 
 #ifdef CONFIG_STACK_TRACER
diff --git a/init/main.c b/init/main.c
index c0137b916aa1..0e8849f74561 100644
--- a/init/main.c
+++ b/init/main.c
@@ -962,6 +962,7 @@ static int __ref kernel_init(void *unused)
 	kernel_init_freeable();
 	/* need to finish all async __init code before freeing the memory */
 	async_synchronize_full();
+	ftrace_free_init_mem();
 	free_initmem();
 	mark_readonly();
 	system_state = SYSTEM_RUNNING;
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index aff7a2c08387..8efd9fe7aec0 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -36,6 +36,7 @@
 
 #include <trace/events/sched.h>
 
+#include <asm/sections.h>
 #include <asm/setup.h>
 
 #include "trace_output.h"
@@ -5279,10 +5280,10 @@ void ftrace_module_init(struct module *mod)
 }
 #endif /* CONFIG_MODULES */
 
-void ftrace_free_mem(void *start_ptr, void *end_ptr)
+void __init ftrace_free_init_mem(void)
 {
-	unsigned long start = (unsigned long)start_ptr;
-	unsigned long end = (unsigned long)end_ptr;
+	unsigned long start = (unsigned long)(&__init_begin);
+	unsigned long end = (unsigned long)(&__init_end);
 	struct ftrace_page **last_pg = &ftrace_pages_start;
 	struct ftrace_page *pg;
 	struct dyn_ftrace *rec;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index eee82bfb7cd8..178bf9c2a2cb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6606,9 +6606,6 @@ unsigned long free_reserved_area(void *start, void *end, int poison, char *s)
 	void *pos;
 	unsigned long pages = 0;
 
-	/* This may be .init text, inform ftrace to remove it */
-	ftrace_free_mem(start, end);
-
 	start = (void *)PAGE_ALIGN((unsigned long)start);
 	end = (void *)((unsigned long)end & PAGE_MASK);
 	for (pos = start; pos < end; pos += PAGE_SIZE, pages++) {
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 7/7] tracing/kprobes: expose maxactive for kretprobe in kprobe_events
  2017-04-05 16:21 [for-next][PATCH 0/7] tracing: Updates to filter probes and early ftrace init Steven Rostedt
                   ` (5 preceding siblings ...)
  2017-04-05 16:21 ` [for-next][PATCH 6/7] ftrace: Have init/main.c call ftrace directly to free init memory Steven Rostedt
@ 2017-04-05 16:21 ` Steven Rostedt
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2017-04-05 16:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Alban Crequy

[-- Attachment #1: 0007-tracing-kprobes-expose-maxactive-for-kretprobe-in-kp.patch --]
[-- Type: text/plain, Size: 7859 bytes --]

From: Alban Crequy <alban@kinvolk.io>

When a kretprobe is installed on a kernel function, there is a maximum
limit of how many calls in parallel it can catch (aka "maxactive"). A
kernel module could call register_kretprobe() and initialize maxactive
(see example in samples/kprobes/kretprobe_example.c).

But that is not exposed to userspace and it is currently not possible to
choose maxactive when writing to /sys/kernel/debug/tracing/kprobe_events

The default maxactive can be as low as 1 on single-core with a
non-preemptive kernel. This is too low and we need to increase it not
only for recursive functions, but for functions that sleep or resched.

This patch updates the format of the command that can be written to
kprobe_events so that maxactive can be optionally specified.

I need this for a bpf program attached to the kretprobe of
inet_csk_accept, which can sleep for a long time.

This patch includes a basic selftest:

> # ./ftracetest -v  test.d/kprobe/
> === Ftrace unit tests ===
> [1] Kprobe dynamic event - adding and removing	[PASS]
> [2] Kprobe dynamic event - busy event check	[PASS]
> [3] Kprobe dynamic event with arguments	[PASS]
> [4] Kprobes event arguments with types	[PASS]
> [5] Kprobe dynamic event with function tracer	[PASS]
> [6] Kretprobe dynamic event with arguments	[PASS]
> [7] Kretprobe dynamic event with maxactive	[PASS]
>
> # of passed:  7
> # of failed:  0
> # of unresolved:  0
> # of untested:  0
> # of unsupported:  0
> # of xfailed:  0
> # of undefined(test bug):  0

BugLink: https://github.com/iovisor/bcc/issues/1072
Link: http://lkml.kernel.org/r/1491215782-15490-1-git-send-email-alban@kinvolk.io

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Alban Crequy <alban@kinvolk.io>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 Documentation/trace/kprobetrace.txt                |  5 ++-
 kernel/trace/trace_kprobe.c                        | 39 ++++++++++++++++++----
 .../ftrace/test.d/kprobe/kretprobe_maxactive.tc    | 39 ++++++++++++++++++++++
 3 files changed, 76 insertions(+), 7 deletions(-)
 create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc

diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt
index 41ef9d8efe95..25f39601a7f7 100644
--- a/Documentation/trace/kprobetrace.txt
+++ b/Documentation/trace/kprobetrace.txt
@@ -23,7 +23,7 @@ current_tracer. Instead of that, add probe points via
 Synopsis of kprobe_events
 -------------------------
   p[:[GRP/]EVENT] [MOD:]SYM[+offs]|MEMADDR [FETCHARGS]	: Set a probe
-  r[:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS]		: Set a return probe
+  r[MAXACTIVE][:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS]	: Set a return probe
   -:[GRP/]EVENT						: Clear a probe
 
  GRP		: Group name. If omitted, use "kprobes" for it.
@@ -32,6 +32,9 @@ Synopsis of kprobe_events
  MOD		: Module name which has given SYM.
  SYM[+offs]	: Symbol+offset where the probe is inserted.
  MEMADDR	: Address where the probe is inserted.
+ MAXACTIVE	: Maximum number of instances of the specified function that
+		  can be probed simultaneously, or 0 for the default value
+		  as defined in Documentation/kprobes.txt section 1.3.1.
 
  FETCHARGS	: Arguments. Each probe can have up to 128 args.
   %REG		: Fetch register REG
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 5f688cc724f0..f01d49b576c0 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -25,6 +25,7 @@
 #include "trace_probe.h"
 
 #define KPROBE_EVENT_SYSTEM "kprobes"
+#define KRETPROBE_MAXACTIVE_MAX 4096
 
 /**
  * Kprobe event core functions
@@ -282,6 +283,7 @@ static struct trace_kprobe *alloc_trace_kprobe(const char *group,
 					     void *addr,
 					     const char *symbol,
 					     unsigned long offs,
+					     int maxactive,
 					     int nargs, bool is_return)
 {
 	struct trace_kprobe *tk;
@@ -309,6 +311,8 @@ static struct trace_kprobe *alloc_trace_kprobe(const char *group,
 	else
 		tk->rp.kp.pre_handler = kprobe_dispatcher;
 
+	tk->rp.maxactive = maxactive;
+
 	if (!event || !is_good_name(event)) {
 		ret = -EINVAL;
 		goto error;
@@ -598,8 +602,10 @@ static int create_trace_kprobe(int argc, char **argv)
 {
 	/*
 	 * Argument syntax:
-	 *  - Add kprobe: p[:[GRP/]EVENT] [MOD:]KSYM[+OFFS]|KADDR [FETCHARGS]
-	 *  - Add kretprobe: r[:[GRP/]EVENT] [MOD:]KSYM[+0] [FETCHARGS]
+	 *  - Add kprobe:
+	 *      p[:[GRP/]EVENT] [MOD:]KSYM[+OFFS]|KADDR [FETCHARGS]
+	 *  - Add kretprobe:
+	 *      r[MAXACTIVE][:[GRP/]EVENT] [MOD:]KSYM[+0] [FETCHARGS]
 	 * Fetch args:
 	 *  $retval	: fetch return value
 	 *  $stack	: fetch stack address
@@ -619,6 +625,7 @@ static int create_trace_kprobe(int argc, char **argv)
 	int i, ret = 0;
 	bool is_return = false, is_delete = false;
 	char *symbol = NULL, *event = NULL, *group = NULL;
+	int maxactive = 0;
 	char *arg;
 	unsigned long offset = 0;
 	void *addr = NULL;
@@ -637,8 +644,28 @@ static int create_trace_kprobe(int argc, char **argv)
 		return -EINVAL;
 	}
 
-	if (argv[0][1] == ':') {
-		event = &argv[0][2];
+	event = strchr(&argv[0][1], ':');
+	if (event) {
+		event[0] = '\0';
+		event++;
+	}
+	if (is_return && isdigit(argv[0][1])) {
+		ret = kstrtouint(&argv[0][1], 0, &maxactive);
+		if (ret) {
+			pr_info("Failed to parse maxactive.\n");
+			return ret;
+		}
+		/* kretprobes instances are iterated over via a list. The
+		 * maximum should stay reasonable.
+		 */
+		if (maxactive > KRETPROBE_MAXACTIVE_MAX) {
+			pr_info("Maxactive is too big (%d > %d).\n",
+				maxactive, KRETPROBE_MAXACTIVE_MAX);
+			return -E2BIG;
+		}
+	}
+
+	if (event) {
 		if (strchr(event, '/')) {
 			group = event;
 			event = strchr(group, '/') + 1;
@@ -718,8 +745,8 @@ static int create_trace_kprobe(int argc, char **argv)
 				 is_return ? 'r' : 'p', addr);
 		event = buf;
 	}
-	tk = alloc_trace_kprobe(group, event, addr, symbol, offset, argc,
-			       is_return);
+	tk = alloc_trace_kprobe(group, event, addr, symbol, offset, maxactive,
+			       argc, is_return);
 	if (IS_ERR(tk)) {
 		pr_info("Failed to allocate trace_probe.(%d)\n",
 			(int)PTR_ERR(tk));
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc
new file mode 100644
index 000000000000..57abdf1caabf
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc
@@ -0,0 +1,39 @@
+#!/bin/sh
+# description: Kretprobe dynamic event with maxactive
+
+[ -f kprobe_events ] || exit_unsupported # this is configurable
+
+echo > kprobe_events
+
+# Test if we successfully reject unknown messages
+if echo 'a:myprobeaccept inet_csk_accept' > kprobe_events; then false; else true; fi
+
+# Test if we successfully reject too big maxactive
+if echo 'r1000000:myprobeaccept inet_csk_accept' > kprobe_events; then false; else true; fi
+
+# Test if we successfully reject unparsable numbers for maxactive
+if echo 'r10fuzz:myprobeaccept inet_csk_accept' > kprobe_events; then false; else true; fi
+
+# Test for kretprobe with event name without maxactive
+echo 'r:myprobeaccept inet_csk_accept' > kprobe_events
+grep myprobeaccept kprobe_events
+test -d events/kprobes/myprobeaccept
+echo '-:myprobeaccept' >> kprobe_events
+
+# Test for kretprobe with event name with a small maxactive
+echo 'r10:myprobeaccept inet_csk_accept' > kprobe_events
+grep myprobeaccept kprobe_events
+test -d events/kprobes/myprobeaccept
+echo '-:myprobeaccept' >> kprobe_events
+
+# Test for kretprobe without event name without maxactive
+echo 'r inet_csk_accept' > kprobe_events
+grep inet_csk_accept kprobe_events
+echo > kprobe_events
+
+# Test for kretprobe without event name with a small maxactive
+echo 'r10 inet_csk_accept' > kprobe_events
+grep inet_csk_accept kprobe_events
+echo > kprobe_events
+
+clear_trace
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-04-05 16:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-05 16:21 [for-next][PATCH 0/7] tracing: Updates to filter probes and early ftrace init Steven Rostedt
2017-04-05 16:21 ` [for-next][PATCH 1/7] ftrace: Clean up __seq_open_private() return check Steven Rostedt
2017-04-05 16:21 ` [for-next][PATCH 2/7] ftrace: Assign iter->hash to filter or notrace hashes on seq read Steven Rostedt
2017-04-05 16:21 ` [for-next][PATCH 3/7] ftrace: Return NULL at end of t_start() instead of calling t_hash_start() Steven Rostedt
2017-04-05 16:21 ` [for-next][PATCH 4/7] ftrace: Update func_pos in t_start() when all functions are enabled Steven Rostedt
2017-04-05 16:21 ` [for-next][PATCH 5/7] ftrace: Create separate t_func_next() to simplify the function / hash logic Steven Rostedt
2017-04-05 16:21 ` [for-next][PATCH 6/7] ftrace: Have init/main.c call ftrace directly to free init memory Steven Rostedt
2017-04-05 16:21 ` [for-next][PATCH 7/7] tracing/kprobes: expose maxactive for kretprobe in kprobe_events Steven Rostedt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.