linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/7] tracing: Creation of event probe
@ 2021-08-17  3:42 Steven Rostedt
  2021-08-17  3:42 ` [PATCH v6 1/7] tracing: Add DYNAMIC flag for dynamic events Steven Rostedt
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Steven Rostedt @ 2021-08-17  3:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

This is a patch series that adds the event probe feature and tries to
incorporate all of Masami's comments.

While updating Tzvetomir's patch, I found some other things that could be
changed as well, and some was added as extra patches.

For instance, removing the customize struct size macros from kprobe and
uprobe events and using the struct_size() macro.

To implement "REC->type" and show the event type for the trace event, the
traceprobe_set_print_fmt() needed to be updated to allow for that.

Instead of allocating a temp buffer that traceprobe_parse_probe_arg() can
manipulate, just have that function do the allocation instead of placing the
burden onto the callers.

Anyway, here's version 6!

-- Steve


Steven Rostedt (VMware) (6):
      tracing: Add DYNAMIC flag for dynamic events
      tracing: Have dynamic events have a ref counter
      tracing/probe: Have traceprobe_parse_probe_arg() take a const arg
      tracing/probes: Allow for dot delimiter as well as slash for system names
      tracing/probes: Use struct_size() instead of defining custom macros
      tracing/probe: Change traceprobe_set_print_fmt() to take a type

Tzvetomir Stoyanov (VMware) (1):
      tracing: Add a probe that attaches to trace events

----
 include/linux/trace_events.h        |  52 ++-
 kernel/trace/Makefile               |   1 +
 kernel/trace/trace.c                |   4 +-
 kernel/trace/trace.h                |  18 +
 kernel/trace/trace_dynevent.c       |  38 ++
 kernel/trace/trace_dynevent.h       |   4 +-
 kernel/trace/trace_eprobe.c         | 895 ++++++++++++++++++++++++++++++++++++
 kernel/trace/trace_event_perf.c     |   6 +-
 kernel/trace/trace_events.c         |  22 +-
 kernel/trace/trace_events_synth.c   |  21 +-
 kernel/trace/trace_events_trigger.c |  20 +-
 kernel/trace/trace_kprobe.c         |  43 +-
 kernel/trace/trace_probe.c          |  81 ++--
 kernel/trace/trace_probe.h          |  15 +-
 kernel/trace/trace_probe_tmpl.h     |   6 +-
 kernel/trace/trace_uprobe.c         |  34 +-
 16 files changed, 1156 insertions(+), 104 deletions(-)
 create mode 100644 kernel/trace/trace_eprobe.c

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 1/7] tracing: Add DYNAMIC flag for dynamic events
  2021-08-17  3:42 [PATCH v6 0/7] tracing: Creation of event probe Steven Rostedt
@ 2021-08-17  3:42 ` Steven Rostedt
  2021-08-18  8:15   ` Masami Hiramatsu
  2021-08-17  3:42 ` [PATCH v6 2/7] tracing: Have dynamic events have a ref counter Steven Rostedt
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2021-08-17  3:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

To differentiate between static and dynamic events, add a new flag
DYNAMIC to the event flags that all dynamic events have set. This will
allow to differentiate when attaching to a dynamic event from a static
event.

Static events have a mod pointer that references the module they were
created in (or NULL for core kernel). This can be incremented when the
event has something attached to it. But there exists no such mechanism for
dynamic events. This is dangerous as the dynamic events may now disappear
without the "attachment" knowing that it no longer exists.

To enforce the dynamic flag, change dyn_event_add() to pass the event that
is being created such that it can set the DYNAMIC flag of the event. This
helps make sure that no location that creates a dynamic event misses
setting this flag.

Link: https://lore.kernel.org/linux-trace-devel/20210813004448.51c7de69ce432d338f4d226b@kernel.org/

Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/trace_events.h      | 3 +++
 kernel/trace/trace_dynevent.h     | 4 +++-
 kernel/trace/trace_events_synth.c | 2 +-
 kernel/trace/trace_kprobe.c       | 4 ++--
 kernel/trace/trace_uprobe.c       | 4 ++--
 5 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index ad413b382a3c..53c9dffd87fd 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -310,6 +310,7 @@ enum {
 	TRACE_EVENT_FL_NO_SET_FILTER_BIT,
 	TRACE_EVENT_FL_IGNORE_ENABLE_BIT,
 	TRACE_EVENT_FL_TRACEPOINT_BIT,
+	TRACE_EVENT_FL_DYNAMIC_BIT,
 	TRACE_EVENT_FL_KPROBE_BIT,
 	TRACE_EVENT_FL_UPROBE_BIT,
 };
@@ -321,6 +322,7 @@ enum {
  *  NO_SET_FILTER - Set when filter has error and is to be ignored
  *  IGNORE_ENABLE - For trace internal events, do not enable with debugfs file
  *  TRACEPOINT    - Event is a tracepoint
+ *  DYNAMIC       - Event is a dynamic event (created at run time)
  *  KPROBE        - Event is a kprobe
  *  UPROBE        - Event is a uprobe
  */
@@ -330,6 +332,7 @@ enum {
 	TRACE_EVENT_FL_NO_SET_FILTER	= (1 << TRACE_EVENT_FL_NO_SET_FILTER_BIT),
 	TRACE_EVENT_FL_IGNORE_ENABLE	= (1 << TRACE_EVENT_FL_IGNORE_ENABLE_BIT),
 	TRACE_EVENT_FL_TRACEPOINT	= (1 << TRACE_EVENT_FL_TRACEPOINT_BIT),
+	TRACE_EVENT_FL_DYNAMIC		= (1 << TRACE_EVENT_FL_DYNAMIC_BIT),
 	TRACE_EVENT_FL_KPROBE		= (1 << TRACE_EVENT_FL_KPROBE_BIT),
 	TRACE_EVENT_FL_UPROBE		= (1 << TRACE_EVENT_FL_UPROBE_BIT),
 };
diff --git a/kernel/trace/trace_dynevent.h b/kernel/trace/trace_dynevent.h
index 7754936b57ee..936477a111d3 100644
--- a/kernel/trace/trace_dynevent.h
+++ b/kernel/trace/trace_dynevent.h
@@ -76,13 +76,15 @@ int dyn_event_init(struct dyn_event *ev, struct dyn_event_operations *ops)
 	return 0;
 }
 
-static inline int dyn_event_add(struct dyn_event *ev)
+static inline int dyn_event_add(struct dyn_event *ev,
+				struct trace_event_call *call)
 {
 	lockdep_assert_held(&event_mutex);
 
 	if (!ev || !ev->ops)
 		return -EINVAL;
 
+	call->flags |= TRACE_EVENT_FL_DYNAMIC;
 	list_add_tail(&ev->list, &dyn_event_list);
 	return 0;
 }
diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index 9315fc03e303..f4f5489e1e28 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -1298,7 +1298,7 @@ static int __create_synth_event(const char *name, const char *raw_fields)
 	}
 	ret = register_synth_event(event);
 	if (!ret)
-		dyn_event_add(&event->devent);
+		dyn_event_add(&event->devent, &event->call);
 	else
 		free_synth_event(event);
  out:
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index ea6178cb5e33..bfef43bfce37 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -618,7 +618,7 @@ static int append_trace_kprobe(struct trace_kprobe *tk, struct trace_kprobe *to)
 	if (ret)
 		trace_probe_unlink(&tk->tp);
 	else
-		dyn_event_add(&tk->devent);
+		dyn_event_add(&tk->devent, trace_probe_event_call(&tk->tp));
 
 	return ret;
 }
@@ -661,7 +661,7 @@ static int register_trace_kprobe(struct trace_kprobe *tk)
 	if (ret < 0)
 		unregister_kprobe_event(tk);
 	else
-		dyn_event_add(&tk->devent);
+		dyn_event_add(&tk->devent, trace_probe_event_call(&tk->tp));
 
 end:
 	mutex_unlock(&event_mutex);
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 9b50869a5ddb..50eca53b8d22 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -455,7 +455,7 @@ static int append_trace_uprobe(struct trace_uprobe *tu, struct trace_uprobe *to)
 	/* Append to existing event */
 	ret = trace_probe_append(&tu->tp, &to->tp);
 	if (!ret)
-		dyn_event_add(&tu->devent);
+		dyn_event_add(&tu->devent, trace_probe_event_call(&tu->tp));
 
 	return ret;
 }
@@ -518,7 +518,7 @@ static int register_trace_uprobe(struct trace_uprobe *tu)
 		goto end;
 	}
 
-	dyn_event_add(&tu->devent);
+	dyn_event_add(&tu->devent, trace_probe_event_call(&tu->tp));
 
 end:
 	mutex_unlock(&event_mutex);
-- 
2.30.2

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 2/7] tracing: Have dynamic events have a ref counter
  2021-08-17  3:42 [PATCH v6 0/7] tracing: Creation of event probe Steven Rostedt
  2021-08-17  3:42 ` [PATCH v6 1/7] tracing: Add DYNAMIC flag for dynamic events Steven Rostedt
@ 2021-08-17  3:42 ` Steven Rostedt
  2021-08-18 16:14   ` Masami Hiramatsu
  2021-08-17  3:42 ` [PATCH v6 3/7] tracing/probe: Have traceprobe_parse_probe_arg() take a const arg Steven Rostedt
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2021-08-17  3:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

As dynamic events are not created by modules, if something is attached to
one, calling "try_module_get()" on its "mod" field, is not going to keep
the dynamic event from going away.

Since dynamic events do not need the "mod" pointer of the event structure,
make a union out of it in order to save memory (there's one structure for
each of the thousand+ events in the kernel), and have any event with the
DYNAMIC flag set to use a ref counter instead.

Link: https://lore.kernel.org/linux-trace-devel/20210813004448.51c7de69ce432d338f4d226b@kernel.org/

Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/trace_events.h        | 45 ++++++++++++++++++++++++++++-
 kernel/trace/trace.c                |  4 +--
 kernel/trace/trace_dynevent.c       | 38 ++++++++++++++++++++++++
 kernel/trace/trace_event_perf.c     |  6 ++--
 kernel/trace/trace_events.c         | 22 +++++++++-----
 kernel/trace/trace_events_synth.c   | 19 +++++++-----
 kernel/trace/trace_events_trigger.c |  6 ++--
 kernel/trace/trace_kprobe.c         |  4 +++
 kernel/trace/trace_uprobe.c         |  4 +++
 9 files changed, 124 insertions(+), 24 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 53c9dffd87fd..858443258812 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -350,7 +350,14 @@ struct trace_event_call {
 	struct trace_event	event;
 	char			*print_fmt;
 	struct event_filter	*filter;
-	void			*mod;
+	/*
+	 * Static events can disappear with modules,
+	 * where as dynamic ones need their own ref count.
+	 */
+	union {
+		void				*module;
+		atomic_t			refcnt;
+	};
 	void			*data;
 
 	/* See the TRACE_EVENT_FL_* flags above */
@@ -366,6 +373,42 @@ struct trace_event_call {
 #endif
 };
 
+#ifdef CONFIG_DYNAMIC_EVENTS
+bool trace_event_dyn_try_get_ref(struct trace_event_call *call);
+void trace_event_dyn_put_ref(struct trace_event_call *call);
+bool trace_event_dyn_busy(struct trace_event_call *call);
+#else
+static inline bool trace_event_dyn_try_get_ref(struct trace_event_call *call)
+{
+	/* Without DYNAMIC_EVENTS configured, nothing should be calling this */
+	return false;
+}
+static inline void trace_event_dyn_put_ref(struct trace_event_call *call)
+{
+}
+static inline bool trace_event_dyn_busy(struct trace_event_call *call)
+{
+	/* Nothing should call this without DYNAIMIC_EVENTS configured. */
+	return true;
+}
+#endif
+
+static inline bool trace_event_try_get_ref(struct trace_event_call *call)
+{
+	if (call->flags & TRACE_EVENT_FL_DYNAMIC)
+		return trace_event_dyn_try_get_ref(call);
+	else
+		return try_module_get(call->module);
+}
+
+static inline void trace_event_put_ref(struct trace_event_call *call)
+{
+	if (call->flags & TRACE_EVENT_FL_DYNAMIC)
+		return trace_event_dyn_put_ref(call);
+	else
+		return module_put(call->module);
+}
+
 #ifdef CONFIG_PERF_EVENTS
 static inline bool bpf_prog_array_valid(struct trace_event_call *call)
 {
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index be0169594de5..8425c3d70895 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3697,11 +3697,11 @@ static bool trace_safe_str(struct trace_iterator *iter, const char *str)
 		return false;
 
 	event = container_of(trace_event, struct trace_event_call, event);
-	if (!event->mod)
+	if ((event->flags & TRACE_EVENT_FL_DYNAMIC) || !event->module)
 		return false;
 
 	/* Would rather have rodata, but this will suffice */
-	if (within_module_core(addr, event->mod))
+	if (within_module_core(addr, event->module))
 		return true;
 
 	return false;
diff --git a/kernel/trace/trace_dynevent.c b/kernel/trace/trace_dynevent.c
index e57cc0870892..1110112e55bd 100644
--- a/kernel/trace/trace_dynevent.c
+++ b/kernel/trace/trace_dynevent.c
@@ -13,11 +13,49 @@
 #include <linux/tracefs.h>
 
 #include "trace.h"
+#include "trace_output.h"	/* for trace_event_sem */
 #include "trace_dynevent.h"
 
 static DEFINE_MUTEX(dyn_event_ops_mutex);
 static LIST_HEAD(dyn_event_ops_list);
 
+bool trace_event_dyn_try_get_ref(struct trace_event_call *dyn_call)
+{
+	struct trace_event_call *call;
+	bool ret = false;
+
+	if (WARN_ON_ONCE(!(dyn_call->flags & TRACE_EVENT_FL_DYNAMIC)))
+		return false;
+
+	down_read(&trace_event_sem);
+	list_for_each_entry(call, &ftrace_events, list) {
+		if (call == dyn_call) {
+			atomic_inc(&dyn_call->refcnt);
+			ret = true;
+		}
+	}
+	up_read(&trace_event_sem);
+	return ret;
+}
+
+void trace_event_dyn_put_ref(struct trace_event_call *call)
+{
+	if (WARN_ON_ONCE(!(call->flags & TRACE_EVENT_FL_DYNAMIC)))
+		return;
+
+	if (WARN_ON_ONCE(atomic_read(&call->refcnt) <= 0)) {
+		atomic_set(&call->refcnt, 0);
+		return;
+	}
+
+	atomic_dec(&call->refcnt);
+}
+
+bool trace_event_dyn_busy(struct trace_event_call *call)
+{
+	return atomic_read(&call->refcnt) != 0;
+}
+
 int dyn_event_register(struct dyn_event_operations *ops)
 {
 	if (!ops || !ops->create || !ops->show || !ops->is_busy ||
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 03be4435d103..6aed10e2f7ce 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -177,7 +177,7 @@ static void perf_trace_event_unreg(struct perf_event *p_event)
 		}
 	}
 out:
-	module_put(tp_event->mod);
+	trace_event_put_ref(tp_event);
 }
 
 static int perf_trace_event_open(struct perf_event *p_event)
@@ -224,10 +224,10 @@ int perf_trace_init(struct perf_event *p_event)
 	list_for_each_entry(tp_event, &ftrace_events, list) {
 		if (tp_event->event.type == event_id &&
 		    tp_event->class && tp_event->class->reg &&
-		    try_module_get(tp_event->mod)) {
+		    trace_event_try_get_ref(tp_event)) {
 			ret = perf_trace_event_init(tp_event, p_event);
 			if (ret)
-				module_put(tp_event->mod);
+				trace_event_put_ref(tp_event);
 			break;
 		}
 	}
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 80e96989770e..1349b6de5eeb 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2525,7 +2525,10 @@ __register_event(struct trace_event_call *call, struct module *mod)
 		return ret;
 
 	list_add(&call->list, &ftrace_events);
-	call->mod = mod;
+	if (call->flags & TRACE_EVENT_FL_DYNAMIC)
+		atomic_set(&call->refcnt, 0);
+	else
+		call->module = mod;
 
 	return 0;
 }
@@ -2839,7 +2842,9 @@ static void trace_module_remove_events(struct module *mod)
 
 	down_write(&trace_event_sem);
 	list_for_each_entry_safe(call, p, &ftrace_events, list) {
-		if (call->mod == mod)
+		if ((call->flags & TRACE_EVENT_FL_DYNAMIC) || !call->module)
+			continue;
+		if (call->module == mod)
 			__trace_remove_event_call(call);
 	}
 	up_write(&trace_event_sem);
@@ -2982,7 +2987,7 @@ struct trace_event_file *trace_get_event_file(const char *instance,
 	}
 
 	/* Don't let event modules unload while in use */
-	ret = try_module_get(file->event_call->mod);
+	ret = trace_event_try_get_ref(file->event_call);
 	if (!ret) {
 		trace_array_put(tr);
 		ret = -EBUSY;
@@ -3012,7 +3017,7 @@ EXPORT_SYMBOL_GPL(trace_get_event_file);
 void trace_put_event_file(struct trace_event_file *file)
 {
 	mutex_lock(&event_mutex);
-	module_put(file->event_call->mod);
+	trace_event_put_ref(file->event_call);
 	mutex_unlock(&event_mutex);
 
 	trace_array_put(file->tr);
@@ -3147,7 +3152,7 @@ static int free_probe_data(void *data)
 	if (!edata->ref) {
 		/* Remove the SOFT_MODE flag */
 		__ftrace_event_enable_disable(edata->file, 0, 1);
-		module_put(edata->file->event_call->mod);
+		trace_event_put_ref(edata->file->event_call);
 		kfree(edata);
 	}
 	return 0;
@@ -3280,7 +3285,7 @@ event_enable_func(struct trace_array *tr, struct ftrace_hash *hash,
 
  out_reg:
 	/* Don't let event modules unload while probe registered */
-	ret = try_module_get(file->event_call->mod);
+	ret = trace_event_try_get_ref(file->event_call);
 	if (!ret) {
 		ret = -EBUSY;
 		goto out_free;
@@ -3310,7 +3315,7 @@ event_enable_func(struct trace_array *tr, struct ftrace_hash *hash,
  out_disable:
 	__ftrace_event_enable_disable(file, 0, 1);
  out_put:
-	module_put(file->event_call->mod);
+	trace_event_put_ref(file->event_call);
  out_free:
 	kfree(data);
 	goto out;
@@ -3376,7 +3381,8 @@ void __trace_early_add_events(struct trace_array *tr)
 
 	list_for_each_entry(call, &ftrace_events, list) {
 		/* Early boot up should not have any modules loaded */
-		if (WARN_ON_ONCE(call->mod))
+		if (!(call->flags & TRACE_EVENT_FL_DYNAMIC) &&
+		    WARN_ON_ONCE(call->module))
 			continue;
 
 		ret = __trace_early_add_new_event(call, tr);
diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index f4f5489e1e28..d54094b7a9d7 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -1369,13 +1369,15 @@ static int destroy_synth_event(struct synth_event *se)
 	int ret;
 
 	if (se->ref)
-		ret = -EBUSY;
-	else {
-		ret = unregister_synth_event(se);
-		if (!ret) {
-			dyn_event_remove(&se->devent);
-			free_synth_event(se);
-		}
+		return -EBUSY;
+
+	if (trace_event_dyn_busy(&se->call))
+		return -EBUSY;
+
+	ret = unregister_synth_event(se);
+	if (!ret) {
+		dyn_event_remove(&se->devent);
+		free_synth_event(se);
 	}
 
 	return ret;
@@ -2102,6 +2104,9 @@ static int synth_event_release(struct dyn_event *ev)
 	if (event->ref)
 		return -EBUSY;
 
+	if (trace_event_dyn_busy(&event->call))
+		return -EBUSY;
+
 	ret = unregister_synth_event(event);
 	if (ret)
 		return ret;
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index cf84d0f6583a..6b11e335a62e 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -1334,7 +1334,7 @@ void event_enable_trigger_free(struct event_trigger_ops *ops,
 	if (!data->ref) {
 		/* Remove the SOFT_MODE flag */
 		trace_event_enable_disable(enable_data->file, 0, 1);
-		module_put(enable_data->file->event_call->mod);
+		trace_event_put_ref(enable_data->file->event_call);
 		trigger_data_free(data);
 		kfree(enable_data);
 	}
@@ -1481,7 +1481,7 @@ int event_enable_trigger_func(struct event_command *cmd_ops,
 
  out_reg:
 	/* Don't let event modules unload while probe registered */
-	ret = try_module_get(event_enable_file->event_call->mod);
+	ret = trace_event_try_get_ref(event_enable_file->event_call);
 	if (!ret) {
 		ret = -EBUSY;
 		goto out_free;
@@ -1510,7 +1510,7 @@ int event_enable_trigger_func(struct event_command *cmd_ops,
  out_disable:
 	trace_event_enable_disable(event_enable_file, 0, 1);
  out_put:
-	module_put(event_enable_file->event_call->mod);
+	trace_event_put_ref(event_enable_file->event_call);
  out_free:
 	if (cmd_ops->set_filter)
 		cmd_ops->set_filter(NULL, trigger_data, NULL);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index bfef43bfce37..82c3b86013b2 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -543,6 +543,10 @@ static int unregister_trace_kprobe(struct trace_kprobe *tk)
 	if (trace_probe_is_enabled(&tk->tp))
 		return -EBUSY;
 
+	/* If there's a reference to the dynamic event */
+	if (trace_event_dyn_busy(trace_probe_event_call(&tk->tp)))
+		return -EBUSY;
+
 	/* Will fail if probe is being used by ftrace or perf */
 	if (unregister_kprobe_event(tk))
 		return -EBUSY;
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 50eca53b8d22..1e2a92e7607d 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -393,6 +393,10 @@ static int unregister_trace_uprobe(struct trace_uprobe *tu)
 	if (trace_probe_has_sibling(&tu->tp))
 		goto unreg;
 
+	/* If there's a reference to the dynamic event */
+	if (trace_event_dyn_busy(trace_probe_event_call(&tu->tp)))
+		return -EBUSY;
+
 	ret = unregister_uprobe_event(tu);
 	if (ret)
 		return ret;
-- 
2.30.2

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 3/7] tracing/probe: Have traceprobe_parse_probe_arg() take a const arg
  2021-08-17  3:42 [PATCH v6 0/7] tracing: Creation of event probe Steven Rostedt
  2021-08-17  3:42 ` [PATCH v6 1/7] tracing: Add DYNAMIC flag for dynamic events Steven Rostedt
  2021-08-17  3:42 ` [PATCH v6 2/7] tracing: Have dynamic events have a ref counter Steven Rostedt
@ 2021-08-17  3:42 ` Steven Rostedt
  2021-08-18 16:16   ` Masami Hiramatsu
  2021-08-17  3:42 ` [PATCH v6 4/7] tracing/probes: Allow for dot delimiter as well as slash for system names Steven Rostedt
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2021-08-17  3:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The two places that call traceprobe_parse_probe_arg() allocate a temporary
buffer to copy the argv[i] into, because argv[i] is constant and the
traceprobe_parse_probe_arg() will modify it to do the parsing. These two
places allocate this buffer and then free it right after calling this
function, leaving the onus of this allocation to the caller.

As there's about to be a third user of this function that will have to do
the same thing, instead of having the caller allocate the temporary
buffer, simply move that allocation into the traceprobe_parse_probe_arg()
itself, which will simplify the code of the callers.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_kprobe.c |  9 +------
 kernel/trace/trace_probe.c  | 47 ++++++++++++++++++++++---------------
 kernel/trace/trace_probe.h  |  2 +-
 kernel/trace/trace_uprobe.c |  9 +------
 4 files changed, 31 insertions(+), 36 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 82c3b86013b2..ed1e3c2087ab 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -873,15 +873,8 @@ static int __trace_kprobe_create(int argc, const char *argv[])
 
 	/* parse arguments */
 	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
-		tmp = kstrdup(argv[i], GFP_KERNEL);
-		if (!tmp) {
-			ret = -ENOMEM;
-			goto error;
-		}
-
 		trace_probe_log_set_index(i + 2);
-		ret = traceprobe_parse_probe_arg(&tk->tp, i, tmp, flags);
-		kfree(tmp);
+		ret = traceprobe_parse_probe_arg(&tk->tp, i, argv[i], flags);
 		if (ret)
 			goto error;	/* This can be -ENOMEM */
 	}
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 15413ad7cef2..ef717b373443 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -540,26 +540,34 @@ static int __parse_bitfield_probe_arg(const char *bf,
 }
 
 /* String length checking wrapper */
-static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
+static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size,
 		struct probe_arg *parg, unsigned int flags, int offset)
 {
 	struct fetch_insn *code, *scode, *tmp = NULL;
 	char *t, *t2, *t3;
+	char *arg;
 	int ret, len;
 
+	arg = kstrdup(argv, GFP_KERNEL);
+	if (!arg)
+		return -ENOMEM;
+
+	ret = -EINVAL;
 	len = strlen(arg);
 	if (len > MAX_ARGSTR_LEN) {
 		trace_probe_log_err(offset, ARG_TOO_LONG);
-		return -EINVAL;
+		goto out;
 	} else if (len == 0) {
 		trace_probe_log_err(offset, NO_ARG_BODY);
-		return -EINVAL;
+		goto out;
 	}
 
+	ret = -ENOMEM;
 	parg->comm = kstrdup(arg, GFP_KERNEL);
 	if (!parg->comm)
-		return -ENOMEM;
+		goto out;
 
+	ret = -EINVAL;
 	t = strchr(arg, ':');
 	if (t) {
 		*t = '\0';
@@ -571,22 +579,22 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 				offset += t2 + strlen(t2) - arg;
 				trace_probe_log_err(offset,
 						    ARRAY_NO_CLOSE);
-				return -EINVAL;
+				goto out;
 			} else if (t3[1] != '\0') {
 				trace_probe_log_err(offset + t3 + 1 - arg,
 						    BAD_ARRAY_SUFFIX);
-				return -EINVAL;
+				goto out;
 			}
 			*t3 = '\0';
 			if (kstrtouint(t2, 0, &parg->count) || !parg->count) {
 				trace_probe_log_err(offset + t2 - arg,
 						    BAD_ARRAY_NUM);
-				return -EINVAL;
+				goto out;
 			}
 			if (parg->count > MAX_ARRAY_LEN) {
 				trace_probe_log_err(offset + t2 - arg,
 						    ARRAY_TOO_BIG);
-				return -EINVAL;
+				goto out;
 			}
 		}
 	}
@@ -598,29 +606,30 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 	if (strcmp(arg, "$comm") == 0 || strncmp(arg, "\\\"", 2) == 0) {
 		/* The type of $comm must be "string", and not an array. */
 		if (parg->count || (t && strcmp(t, "string")))
-			return -EINVAL;
+			goto out;
 		parg->type = find_fetch_type("string");
 	} else
 		parg->type = find_fetch_type(t);
 	if (!parg->type) {
 		trace_probe_log_err(offset + (t ? (t - arg) : 0), BAD_TYPE);
-		return -EINVAL;
+		goto out;
 	}
 	parg->offset = *size;
 	*size += parg->type->size * (parg->count ?: 1);
 
+	ret = -ENOMEM;
 	if (parg->count) {
 		len = strlen(parg->type->fmttype) + 6;
 		parg->fmt = kmalloc(len, GFP_KERNEL);
 		if (!parg->fmt)
-			return -ENOMEM;
+			goto out;
 		snprintf(parg->fmt, len, "%s[%d]", parg->type->fmttype,
 			 parg->count);
 	}
 
 	code = tmp = kcalloc(FETCH_INSN_MAX, sizeof(*code), GFP_KERNEL);
 	if (!code)
-		return -ENOMEM;
+		goto out;
 	code[FETCH_INSN_MAX - 1].op = FETCH_OP_END;
 
 	ret = parse_probe_arg(arg, parg->type, &code, &code[FETCH_INSN_MAX - 1],
@@ -628,6 +637,7 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 	if (ret)
 		goto fail;
 
+	ret = -EINVAL;
 	/* Store operation */
 	if (!strcmp(parg->type->name, "string") ||
 	    !strcmp(parg->type->name, "ustring")) {
@@ -636,7 +646,6 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 		    code->op != FETCH_OP_DATA) {
 			trace_probe_log_err(offset + (t ? (t - arg) : 0),
 					    BAD_STRING);
-			ret = -EINVAL;
 			goto fail;
 		}
 		if ((code->op == FETCH_OP_IMM || code->op == FETCH_OP_COMM ||
@@ -650,7 +659,6 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 			code++;
 			if (code->op != FETCH_OP_NOP) {
 				trace_probe_log_err(offset, TOO_MANY_OPS);
-				ret = -EINVAL;
 				goto fail;
 			}
 		}
@@ -672,7 +680,6 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 		code++;
 		if (code->op != FETCH_OP_NOP) {
 			trace_probe_log_err(offset, TOO_MANY_OPS);
-			ret = -EINVAL;
 			goto fail;
 		}
 		code->op = FETCH_OP_ST_RAW;
@@ -687,6 +694,7 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 			goto fail;
 		}
 	}
+	ret = -EINVAL;
 	/* Loop(Array) operation */
 	if (parg->count) {
 		if (scode->op != FETCH_OP_ST_MEM &&
@@ -694,13 +702,11 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 		    scode->op != FETCH_OP_ST_USTRING) {
 			trace_probe_log_err(offset + (t ? (t - arg) : 0),
 					    BAD_STRING);
-			ret = -EINVAL;
 			goto fail;
 		}
 		code++;
 		if (code->op != FETCH_OP_NOP) {
 			trace_probe_log_err(offset, TOO_MANY_OPS);
-			ret = -EINVAL;
 			goto fail;
 		}
 		code->op = FETCH_OP_LP_ARRAY;
@@ -709,6 +715,7 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 	code++;
 	code->op = FETCH_OP_END;
 
+	ret = 0;
 	/* Shrink down the code buffer */
 	parg->code = kcalloc(code - tmp + 1, sizeof(*code), GFP_KERNEL);
 	if (!parg->code)
@@ -724,6 +731,8 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 				kfree(code->data);
 	}
 	kfree(tmp);
+out:
+	kfree(arg);
 
 	return ret;
 }
@@ -745,11 +754,11 @@ static int traceprobe_conflict_field_name(const char *name,
 	return 0;
 }
 
-int traceprobe_parse_probe_arg(struct trace_probe *tp, int i, char *arg,
+int traceprobe_parse_probe_arg(struct trace_probe *tp, int i, const char *arg,
 				unsigned int flags)
 {
 	struct probe_arg *parg = &tp->args[i];
-	char *body;
+	const char *body;
 
 	/* Increment count for freeing args in error case */
 	tp->nr_args++;
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 227d518e5ba5..42aa084902fa 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -354,7 +354,7 @@ int trace_probe_create(const char *raw_command, int (*createfn)(int, const char
 #define TPARG_FL_MASK	GENMASK(2, 0)
 
 extern int traceprobe_parse_probe_arg(struct trace_probe *tp, int i,
-				char *arg, unsigned int flags);
+				const char *argv, unsigned int flags);
 
 extern int traceprobe_update_arg(struct probe_arg *arg);
 extern void traceprobe_free_probe_arg(struct probe_arg *arg);
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 1e2a92e7607d..93ff96541971 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -684,16 +684,9 @@ static int __trace_uprobe_create(int argc, const char **argv)
 
 	/* parse arguments */
 	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
-		tmp = kstrdup(argv[i], GFP_KERNEL);
-		if (!tmp) {
-			ret = -ENOMEM;
-			goto error;
-		}
-
 		trace_probe_log_set_index(i + 2);
-		ret = traceprobe_parse_probe_arg(&tu->tp, i, tmp,
+		ret = traceprobe_parse_probe_arg(&tu->tp, i, argv[i],
 					is_return ? TPARG_FL_RETURN : 0);
-		kfree(tmp);
 		if (ret)
 			goto error;
 	}
-- 
2.30.2

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 4/7] tracing/probes: Allow for dot delimiter as well as slash for system names
  2021-08-17  3:42 [PATCH v6 0/7] tracing: Creation of event probe Steven Rostedt
                   ` (2 preceding siblings ...)
  2021-08-17  3:42 ` [PATCH v6 3/7] tracing/probe: Have traceprobe_parse_probe_arg() take a const arg Steven Rostedt
@ 2021-08-17  3:42 ` Steven Rostedt
  2021-08-18 10:57   ` Masami Hiramatsu
  2021-08-17  3:43 ` [PATCH v6 5/7] tracing/probes: Use struct_size() instead of defining custom macros Steven Rostedt
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2021-08-17  3:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Kprobe and uprobe events can add a "system" to the events that are created
via the kprobe_events and uprobe_events files respectively. If they do not
include a "system" in the name, then the default "kprobes" or "uprobes" is
used. The current notation to specify a system for one of these probe
events is to add a '/' delimiter in the name, where the content before the
'/' will be the system to use, and the content after will be the event
name.

 echo 'p:my_system/my_event' > kprobe_events

But this is inconsistent with the way histogram triggers separate their
system / event names. The histogram triggers use a '.' delimiter, which
can be confusing.

To allow this to be more consistent, as well as keep backward
compatibility, allow the kprobe and uprobe events to denote a system name
with either a '/' or a '.'.

That is:

  echo 'p:my_system/my_event' > kprobe_events

is equivalent to:

  echo 'p:my_system.my_event' > kprobe_events

Link: https://lore.kernel.org/linux-trace-devel/20210813004448.51c7de69ce432d338f4d226b@kernel.org/

Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_probe.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index ef717b373443..0916a9964719 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -233,6 +233,9 @@ int traceprobe_parse_event_name(const char **pevent, const char **pgroup,
 	int len;
 
 	slash = strchr(event, '/');
+	if (!slash)
+		slash = strchr(event, '.');
+
 	if (slash) {
 		if (slash == event) {
 			trace_probe_log_err(offset, NO_GROUP_NAME);
-- 
2.30.2

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 5/7] tracing/probes: Use struct_size() instead of defining custom macros
  2021-08-17  3:42 [PATCH v6 0/7] tracing: Creation of event probe Steven Rostedt
                   ` (3 preceding siblings ...)
  2021-08-17  3:42 ` [PATCH v6 4/7] tracing/probes: Allow for dot delimiter as well as slash for system names Steven Rostedt
@ 2021-08-17  3:43 ` Steven Rostedt
  2021-08-18 11:08   ` Masami Hiramatsu
  2021-08-17  3:43 ` [PATCH v6 6/7] tracing/probe: Change traceprobe_set_print_fmt() to take a type Steven Rostedt
  2021-08-17  3:43 ` [PATCH v6 7/7] tracing: Add a probe that attaches to trace events Steven Rostedt
  6 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2021-08-17  3:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Remove SIZEOF_TRACE_KPROBE() and SIZEOF_TRACE_UPROBE() and use
struct_size() as that's what it is made for. No need to have custom
macros. Especially since struct_size() has some extra memory checks for
correctness.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_kprobe.c | 6 +-----
 kernel/trace/trace_uprobe.c | 6 +-----
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index ed1e3c2087ab..ca726c9d0859 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -80,10 +80,6 @@ static struct trace_kprobe *to_trace_kprobe(struct dyn_event *ev)
 	for_each_dyn_event(dpos)		\
 		if (is_trace_kprobe(dpos) && (pos = to_trace_kprobe(dpos)))
 
-#define SIZEOF_TRACE_KPROBE(n)				\
-	(offsetof(struct trace_kprobe, tp.args) +	\
-	(sizeof(struct probe_arg) * (n)))
-
 static nokprobe_inline bool trace_kprobe_is_return(struct trace_kprobe *tk)
 {
 	return tk->rp.handler != NULL;
@@ -265,7 +261,7 @@ static struct trace_kprobe *alloc_trace_kprobe(const char *group,
 	struct trace_kprobe *tk;
 	int ret = -ENOMEM;
 
-	tk = kzalloc(SIZEOF_TRACE_KPROBE(nargs), GFP_KERNEL);
+	tk = kzalloc(struct_size(tk, tp.args, nargs), GFP_KERNEL);
 	if (!tk)
 		return ERR_PTR(ret);
 
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 93ff96541971..590bb9a02f8d 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -83,10 +83,6 @@ static struct trace_uprobe *to_trace_uprobe(struct dyn_event *ev)
 	for_each_dyn_event(dpos)		\
 		if (is_trace_uprobe(dpos) && (pos = to_trace_uprobe(dpos)))
 
-#define SIZEOF_TRACE_UPROBE(n)				\
-	(offsetof(struct trace_uprobe, tp.args) +	\
-	(sizeof(struct probe_arg) * (n)))
-
 static int register_uprobe_event(struct trace_uprobe *tu);
 static int unregister_uprobe_event(struct trace_uprobe *tu);
 
@@ -340,7 +336,7 @@ alloc_trace_uprobe(const char *group, const char *event, int nargs, bool is_ret)
 	struct trace_uprobe *tu;
 	int ret;
 
-	tu = kzalloc(SIZEOF_TRACE_UPROBE(nargs), GFP_KERNEL);
+	tu = kzalloc(struct_size(tu, tp.args, nargs), GFP_KERNEL);
 	if (!tu)
 		return ERR_PTR(-ENOMEM);
 
-- 
2.30.2

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 6/7] tracing/probe: Change traceprobe_set_print_fmt() to take a type
  2021-08-17  3:42 [PATCH v6 0/7] tracing: Creation of event probe Steven Rostedt
                   ` (4 preceding siblings ...)
  2021-08-17  3:43 ` [PATCH v6 5/7] tracing/probes: Use struct_size() instead of defining custom macros Steven Rostedt
@ 2021-08-17  3:43 ` Steven Rostedt
  2021-08-18 16:19   ` Masami Hiramatsu
  2021-08-17  3:43 ` [PATCH v6 7/7] tracing: Add a probe that attaches to trace events Steven Rostedt
  6 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2021-08-17  3:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Instead of a boolean "is_return" have traceprobe_set_print_fmt() take a
type (currently just PROBE_PRINT_NORMAL and PROBE_PRINT_RETURN). This will
simplify adding different types. For example, the development of the
event_probe, will need its own type as it prints an event, and not an IP.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_kprobe.c | 9 +++++++--
 kernel/trace/trace_probe.c  | 3 ++-
 kernel/trace/trace_probe.h  | 7 ++++++-
 kernel/trace/trace_uprobe.c | 8 ++++++--
 4 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index ca726c9d0859..c6fe7a6e3f35 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -742,6 +742,7 @@ static int __trace_kprobe_create(int argc, const char *argv[])
 	bool is_return = false;
 	char *symbol = NULL, *tmp = NULL;
 	const char *event = NULL, *group = KPROBE_EVENT_SYSTEM;
+	enum probe_print_type ptype;
 	int maxactive = 0;
 	long offset = 0;
 	void *addr = NULL;
@@ -875,7 +876,8 @@ static int __trace_kprobe_create(int argc, const char *argv[])
 			goto error;	/* This can be -ENOMEM */
 	}
 
-	ret = traceprobe_set_print_fmt(&tk->tp, is_return);
+	ptype = is_return ? PROBE_PRINT_RETURN : PROBE_PRINT_NORMAL;
+	ret = traceprobe_set_print_fmt(&tk->tp, ptype);
 	if (ret < 0)
 		goto error;
 
@@ -1799,6 +1801,7 @@ struct trace_event_call *
 create_local_trace_kprobe(char *func, void *addr, unsigned long offs,
 			  bool is_return)
 {
+	enum probe_print_type ptype;
 	struct trace_kprobe *tk;
 	int ret;
 	char *event;
@@ -1822,7 +1825,9 @@ create_local_trace_kprobe(char *func, void *addr, unsigned long offs,
 
 	init_trace_event_call(tk);
 
-	if (traceprobe_set_print_fmt(&tk->tp, trace_kprobe_is_return(tk)) < 0) {
+	ptype = trace_kprobe_is_return(tk) ?
+		PROBE_PRINT_RETURN : PROBE_PRINT_NORMAL;
+	if (traceprobe_set_print_fmt(&tk->tp, ptype) < 0) {
 		ret = -ENOMEM;
 		goto error;
 	}
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 0916a9964719..a8dcadeaae95 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -912,9 +912,10 @@ static int __set_print_fmt(struct trace_probe *tp, char *buf, int len,
 }
 #undef LEN_OR_ZERO
 
-int traceprobe_set_print_fmt(struct trace_probe *tp, bool is_return)
+int traceprobe_set_print_fmt(struct trace_probe *tp, enum probe_print_type ptype)
 {
 	struct trace_event_call *call = trace_probe_event_call(tp);
+	bool is_return = ptype == PROBE_PRINT_RETURN;
 	int len;
 	char *print_fmt;
 
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 42aa084902fa..8adf5f3542a6 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -363,7 +363,12 @@ extern int traceprobe_split_symbol_offset(char *symbol, long *offset);
 int traceprobe_parse_event_name(const char **pevent, const char **pgroup,
 				char *buf, int offset);
 
-extern int traceprobe_set_print_fmt(struct trace_probe *tp, bool is_return);
+enum probe_print_type {
+	PROBE_PRINT_NORMAL,
+	PROBE_PRINT_RETURN,
+};
+
+extern int traceprobe_set_print_fmt(struct trace_probe *tp, enum probe_print_type ptype);
 
 #ifdef CONFIG_PERF_EVENTS
 extern struct trace_event_call *
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 590bb9a02f8d..09f8ca7f7ba0 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -536,6 +536,7 @@ static int __trace_uprobe_create(int argc, const char **argv)
 	const char *event = NULL, *group = UPROBE_EVENT_SYSTEM;
 	char *arg, *filename, *rctr, *rctr_end, *tmp;
 	char buf[MAX_EVENT_NAME_LEN];
+	enum probe_print_type ptype;
 	struct path path;
 	unsigned long offset, ref_ctr_offset;
 	bool is_return = false;
@@ -687,7 +688,8 @@ static int __trace_uprobe_create(int argc, const char **argv)
 			goto error;
 	}
 
-	ret = traceprobe_set_print_fmt(&tu->tp, is_ret_probe(tu));
+	ptype = is_ret_probe(tu) ? PROBE_PRINT_RETURN : PROBE_PRINT_NORMAL;
+	ret = traceprobe_set_print_fmt(&tu->tp, ptype);
 	if (ret < 0)
 		goto error;
 
@@ -1578,6 +1580,7 @@ struct trace_event_call *
 create_local_trace_uprobe(char *name, unsigned long offs,
 			  unsigned long ref_ctr_offset, bool is_return)
 {
+	enum probe_print_type ptype;
 	struct trace_uprobe *tu;
 	struct path path;
 	int ret;
@@ -1612,7 +1615,8 @@ create_local_trace_uprobe(char *name, unsigned long offs,
 	tu->filename = kstrdup(name, GFP_KERNEL);
 	init_trace_event_call(tu);
 
-	if (traceprobe_set_print_fmt(&tu->tp, is_ret_probe(tu)) < 0) {
+	ptype = is_ret_probe(tu) ? PROBE_PRINT_RETURN : PROBE_PRINT_NORMAL;
+	if (traceprobe_set_print_fmt(&tu->tp, ptype) < 0) {
 		ret = -ENOMEM;
 		goto error;
 	}
-- 
2.30.2

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 7/7] tracing: Add a probe that attaches to trace events
  2021-08-17  3:42 [PATCH v6 0/7] tracing: Creation of event probe Steven Rostedt
                   ` (5 preceding siblings ...)
  2021-08-17  3:43 ` [PATCH v6 6/7] tracing/probe: Change traceprobe_set_print_fmt() to take a type Steven Rostedt
@ 2021-08-17  3:43 ` Steven Rostedt
  2021-08-19  2:29   ` Masami Hiramatsu
  6 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2021-08-17  3:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

From: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>

A new dynamic event is introduced: event probe. The event is attached
to an existing tracepoint and uses its fields as arguments. The user
can specify custom format string of the new event, select what tracepoint
arguments will be printed and how to print them.
An event probe is created by writing configuration string in
'dynamic_events' ftrace file:
 e:[SNAME/]ENAME SYSTEM/EVENT [FETCHARGS]	- Set an event probe
 -:SNAME/ENAME					- Delete an event probe

Where:
 SNAME	- System name, if omitted 'eprobes' is used.
 ENAME	- Name of the new event in SNAME, mandatory.
 SYSTEM	- Name of the system, where the tracepoint is defined, mandatory.
 EVENT	- Name of the tracepoint event in SYSTEM, mandatory.
 FETCHARGS - Arguments:
  <name>=$<field>[:TYPE] - Fetch given filed of the tracepoint and print
			   it as given TYPE with given name. Supported
			   types are:
	                    (u8/u16/u32/u64/s8/s16/s32/s64), basic type
        	            (x8/x16/x32/x64), hexadecimal types
			    "string", "ustring" and bitfield.

Example, attach an event probe on openat system call and print name of the
file that will be opened:
 echo "e:esys/eopen syscalls/sys_enter_openat file=\$filename:string" >> dynamic_events
A new dynamic event is created in events/esys/eopen/ directory. It
can be deleted with:
 echo "-:esys/eopen" >> dynamic_events

Filters, triggers and histograms can be attached to the new event, it can
be matched in synthetic events. There is one limitation - an event probe
can not be attached to kprobe, uprobe or another event probe.

Link: https://lkml.kernel.org/r/20210812145805.2292326-1-tz.stoyanov@gmail.com

Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/trace_events.h        |   4 +
 kernel/trace/Makefile               |   1 +
 kernel/trace/trace.h                |  18 +
 kernel/trace/trace_eprobe.c         | 895 ++++++++++++++++++++++++++++
 kernel/trace/trace_events_trigger.c |  14 +-
 kernel/trace/trace_kprobe.c         |  11 +-
 kernel/trace/trace_probe.c          |  30 +-
 kernel/trace/trace_probe.h          |   6 +-
 kernel/trace/trace_probe_tmpl.h     |   6 +-
 kernel/trace/trace_uprobe.c         |   3 +-
 10 files changed, 965 insertions(+), 23 deletions(-)
 create mode 100644 kernel/trace/trace_eprobe.c

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 858443258812..4e24275398c3 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -313,6 +313,7 @@ enum {
 	TRACE_EVENT_FL_DYNAMIC_BIT,
 	TRACE_EVENT_FL_KPROBE_BIT,
 	TRACE_EVENT_FL_UPROBE_BIT,
+	TRACE_EVENT_FL_EPROBE_BIT,
 };
 
 /*
@@ -325,6 +326,7 @@ enum {
  *  DYNAMIC       - Event is a dynamic event (created at run time)
  *  KPROBE        - Event is a kprobe
  *  UPROBE        - Event is a uprobe
+ *  EPROBE        - Event is an event probe
  */
 enum {
 	TRACE_EVENT_FL_FILTERED		= (1 << TRACE_EVENT_FL_FILTERED_BIT),
@@ -335,6 +337,7 @@ enum {
 	TRACE_EVENT_FL_DYNAMIC		= (1 << TRACE_EVENT_FL_DYNAMIC_BIT),
 	TRACE_EVENT_FL_KPROBE		= (1 << TRACE_EVENT_FL_KPROBE_BIT),
 	TRACE_EVENT_FL_UPROBE		= (1 << TRACE_EVENT_FL_UPROBE_BIT),
+	TRACE_EVENT_FL_EPROBE		= (1 << TRACE_EVENT_FL_EPROBE_BIT),
 };
 
 #define TRACE_EVENT_FL_UKPROBE (TRACE_EVENT_FL_KPROBE | TRACE_EVENT_FL_UPROBE)
@@ -680,6 +683,7 @@ enum event_trigger_type {
 	ETT_EVENT_ENABLE	= (1 << 3),
 	ETT_EVENT_HIST		= (1 << 4),
 	ETT_HIST_ENABLE		= (1 << 5),
+	ETT_EVENT_EPROBE	= (1 << 6),
 };
 
 extern int filter_match_preds(struct event_filter *filter, void *rec);
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index b1c47ccf4f73..6de5d4d63165 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -77,6 +77,7 @@ obj-$(CONFIG_EVENT_TRACING) += trace_event_perf.o
 endif
 obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
 obj-$(CONFIG_EVENT_TRACING) += trace_events_trigger.o
+obj-$(CONFIG_PROBE_EVENTS) += trace_eprobe.o
 obj-$(CONFIG_TRACE_EVENT_INJECT) += trace_events_inject.o
 obj-$(CONFIG_SYNTH_EVENTS) += trace_events_synth.o
 obj-$(CONFIG_HIST_TRIGGERS) += trace_events_hist.o
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 4a0e693000c6..b7c0f8e160fb 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -126,6 +126,11 @@ struct kprobe_trace_entry_head {
 	unsigned long		ip;
 };
 
+struct eprobe_trace_entry_head {
+	struct trace_entry	ent;
+	unsigned int		type;
+};
+
 struct kretprobe_trace_entry_head {
 	struct trace_entry	ent;
 	unsigned long		func;
@@ -1508,9 +1513,14 @@ static inline int register_trigger_hist_enable_disable_cmds(void) { return 0; }
 extern int register_trigger_cmds(void);
 extern void clear_event_triggers(struct trace_array *tr);
 
+enum {
+	EVENT_TRIGGER_FL_PROBE		= BIT(0),
+};
+
 struct event_trigger_data {
 	unsigned long			count;
 	int				ref;
+	int				flags;
 	struct event_trigger_ops	*ops;
 	struct event_command		*cmd_ops;
 	struct event_filter __rcu	*filter;
@@ -1918,6 +1928,14 @@ static inline bool is_good_name(const char *name)
 	return true;
 }
 
+/* Convert certain expected symbols into '_' when generating event names */
+static inline void sanitize_event_name(char *name)
+{
+	while (*name++ != '\0')
+		if (*name == ':' || *name == '.')
+			*name = '_';
+}
+
 /*
  * This is a generic way to read and write a u64 value from a file in tracefs.
  *
diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c
new file mode 100644
index 000000000000..3c76b63b278e
--- /dev/null
+++ b/kernel/trace/trace_eprobe.c
@@ -0,0 +1,895 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * event probes
+ *
+ * Part of this code was copied from kernel/trace/trace_kprobe.c written by
+ * Masami Hiramatsu <mhiramat@kernel.org>
+ *
+ * Copyright (C) 2021, VMware Inc, Steven Rostedt <rostedt@goodmis.org>
+ * Copyright (C) 2021, VMware Inc, Tzvetomir Stoyanov tz.stoyanov@gmail.com>
+ *
+ */
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/ftrace.h>
+
+#include "trace_dynevent.h"
+#include "trace_probe.h"
+#include "trace_probe_tmpl.h"
+
+#define EPROBE_EVENT_SYSTEM "eprobes"
+
+struct trace_eprobe {
+	/* tracepoint system */
+	const char *event_system;
+
+	/* tracepoint event */
+	const char *event_name;
+
+	struct trace_event_call *event;
+
+	struct dyn_event	devent;
+	struct trace_probe	tp;
+};
+
+struct eprobe_data {
+	struct trace_event_file	*file;
+	struct trace_eprobe	*ep;
+};
+
+static int __trace_eprobe_create(int argc, const char *argv[]);
+
+static void trace_event_probe_cleanup(struct trace_eprobe *ep)
+{
+	trace_probe_cleanup(&ep->tp);
+	kfree(ep->event_name);
+	kfree(ep->event_system);
+	if (ep->event)
+		trace_event_put_ref(ep->event);
+	kfree(ep);
+}
+
+static struct trace_eprobe *to_trace_eprobe(struct dyn_event *ev)
+{
+	return container_of(ev, struct trace_eprobe, devent);
+}
+
+static int eprobe_dyn_event_create(const char *raw_command)
+{
+	return trace_probe_create(raw_command, __trace_eprobe_create);
+}
+
+static int eprobe_dyn_event_show(struct seq_file *m, struct dyn_event *ev)
+{
+	struct trace_eprobe *ep = to_trace_eprobe(ev);
+	int i;
+
+	seq_printf(m, "e:%s/%s", trace_probe_group_name(&ep->tp),
+				trace_probe_name(&ep->tp));
+	seq_printf(m, " %s.%s", ep->event_system, ep->event_name);
+
+	for (i = 0; i < ep->tp.nr_args; i++)
+		seq_printf(m, " %s=%s", ep->tp.args[i].name, ep->tp.args[i].comm);
+	seq_putc(m, '\n');
+
+	return 0;
+}
+
+static int unregister_trace_eprobe(struct trace_eprobe *ep)
+{
+	/* If other probes are on the event, just unregister eprobe */
+	if (trace_probe_has_sibling(&ep->tp))
+		goto unreg;
+
+	/* Enabled event can not be unregistered */
+	if (trace_probe_is_enabled(&ep->tp))
+		return -EBUSY;
+
+	/* Will fail if probe is being used by ftrace or perf */
+	if (trace_probe_unregister_event_call(&ep->tp))
+		return -EBUSY;
+
+unreg:
+	dyn_event_remove(&ep->devent);
+	trace_probe_unlink(&ep->tp);
+
+	return 0;
+}
+
+static int eprobe_dyn_event_release(struct dyn_event *ev)
+{
+	struct trace_eprobe *ep = to_trace_eprobe(ev);
+	int ret = unregister_trace_eprobe(ep);
+
+	if (!ret)
+		trace_event_probe_cleanup(ep);
+	return ret;
+}
+
+static bool eprobe_dyn_event_is_busy(struct dyn_event *ev)
+{
+	struct trace_eprobe *ep = to_trace_eprobe(ev);
+
+	return trace_probe_is_enabled(&ep->tp);
+}
+
+static bool eprobe_dyn_event_match(const char *system, const char *event,
+			int argc, const char **argv, struct dyn_event *ev)
+{
+	struct trace_eprobe *ep = to_trace_eprobe(ev);
+
+	return strcmp(trace_probe_name(&ep->tp), event) == 0 &&
+	    (!system || strcmp(trace_probe_group_name(&ep->tp), system) == 0) &&
+	    trace_probe_match_command_args(&ep->tp, argc, argv);
+}
+
+static struct dyn_event_operations eprobe_dyn_event_ops = {
+	.create = eprobe_dyn_event_create,
+	.show = eprobe_dyn_event_show,
+	.is_busy = eprobe_dyn_event_is_busy,
+	.free = eprobe_dyn_event_release,
+	.match = eprobe_dyn_event_match,
+};
+
+static struct trace_eprobe *alloc_event_probe(const char *group,
+					      const char *this_event,
+					      struct trace_event_call *event,
+					      int nargs)
+{
+	struct trace_eprobe *ep;
+	const char *event_name;
+	const char *sys_name;
+	int ret = -ENOMEM;
+
+	if (!event)
+		return ERR_PTR(-ENODEV);
+
+	sys_name = event->class->system;
+	event_name = trace_event_name(event);
+
+	ep = kzalloc(struct_size(ep, tp.args, nargs), GFP_KERNEL);
+	if (!ep) {
+		trace_event_put_ref(ep->event);
+		goto error;
+	}
+	ep->event = event;
+	ep->event_name = kstrdup(event_name, GFP_KERNEL);
+	if (!ep->event_name)
+		goto error;
+	ep->event_system = kstrdup(sys_name, GFP_KERNEL);
+	if (!ep->event_system)
+		goto error;
+
+	ret = trace_probe_init(&ep->tp, this_event, group, false);
+	if (ret < 0)
+		goto error;
+
+	dyn_event_init(&ep->devent, &eprobe_dyn_event_ops);
+	return ep;
+error:
+	trace_event_probe_cleanup(ep);
+	return ERR_PTR(ret);
+}
+
+static int trace_eprobe_tp_arg_update(struct trace_eprobe *ep, int i)
+{
+	struct probe_arg *parg = &ep->tp.args[i];
+	struct ftrace_event_field *field;
+	struct list_head *head;
+
+	head = trace_get_fields(ep->event);
+	list_for_each_entry(field, head, link) {
+		if (!strcmp(parg->code->data, field->name)) {
+			kfree(parg->code->data);
+			parg->code->data = field;
+			return 0;
+		}
+	}
+	kfree(parg->code->data);
+	parg->code->data = NULL;
+	return -ENOENT;
+}
+
+static int eprobe_event_define_fields(struct trace_event_call *event_call)
+{
+	int ret;
+	struct eprobe_trace_entry_head field;
+	struct trace_probe *tp;
+
+	tp = trace_probe_primary_from_call(event_call);
+	if (WARN_ON_ONCE(!tp))
+		return -ENOENT;
+
+	DEFINE_FIELD(unsigned int, type, FIELD_STRING_TYPE, 0);
+
+	return traceprobe_define_arg_fields(event_call, sizeof(field), tp);
+}
+
+static struct trace_event_fields eprobe_fields_array[] = {
+	{ .type = TRACE_FUNCTION_TYPE,
+	  .define_fields = eprobe_event_define_fields },
+	{}
+};
+
+/* Event entry printers */
+static enum print_line_t
+print_eprobe_event(struct trace_iterator *iter, int flags,
+		   struct trace_event *event)
+{
+	struct eprobe_trace_entry_head *field;
+	struct trace_event_call *pevent;
+	struct trace_event *probed_event;
+	struct trace_seq *s = &iter->seq;
+	struct trace_probe *tp;
+
+	field = (struct eprobe_trace_entry_head *)iter->ent;
+	tp = trace_probe_primary_from_call(
+		container_of(event, struct trace_event_call, event));
+	if (WARN_ON_ONCE(!tp))
+		goto out;
+
+	trace_seq_printf(s, "%s: (", trace_probe_name(tp));
+
+	probed_event = ftrace_find_event(field->type);
+	if (probed_event) {
+		pevent = container_of(probed_event, struct trace_event_call, event);
+		trace_seq_printf(s, "%s.%s", pevent->class->system,
+				 trace_event_name(pevent));
+	} else {
+		trace_seq_printf(s, "%u", field->type);
+	}
+
+	trace_seq_putc(s, ')');
+
+	if (print_probe_args(s, tp->args, tp->nr_args,
+			     (u8 *)&field[1], field) < 0)
+		goto out;
+
+	trace_seq_putc(s, '\n');
+ out:
+	return trace_handle_return(s);
+}
+
+static unsigned long get_event_field(struct fetch_insn *code, void *rec)
+{
+	struct ftrace_event_field *field = code->data;
+	unsigned long val;
+	void *addr;
+
+	addr = rec + field->offset;
+
+	switch (field->size) {
+	case 1:
+		if (field->is_signed)
+			val = *(char *)addr;
+		else
+			val = *(unsigned char *)addr;
+		break;
+	case 2:
+		if (field->is_signed)
+			val = *(short *)addr;
+		else
+			val = *(unsigned short *)addr;
+		break;
+	case 4:
+		if (field->is_signed)
+			val = *(int *)addr;
+		else
+			val = *(unsigned int *)addr;
+		break;
+	default:
+		if (field->is_signed)
+			val = *(long *)addr;
+		else
+			val = *(unsigned long *)addr;
+		break;
+	}
+	return val;
+}
+
+static int get_eprobe_size(struct trace_probe *tp, void *rec)
+{
+	struct probe_arg *arg;
+	int i, len, ret = 0;
+
+	for (i = 0; i < tp->nr_args; i++) {
+		arg = tp->args + i;
+		if (unlikely(arg->dynamic)) {
+			unsigned long val;
+
+			val = get_event_field(arg->code, rec);
+			len = process_fetch_insn_bottom(arg->code + 1, val, NULL, NULL);
+			if (len > 0)
+				ret += len;
+		}
+	}
+
+	return ret;
+}
+
+/* Kprobe specific fetch functions */
+
+/* Note that we don't verify it, since the code does not come from user space */
+static int
+process_fetch_insn(struct fetch_insn *code, void *rec, void *dest,
+		   void *base)
+{
+	unsigned long val;
+
+	val = get_event_field(code, rec);
+	return process_fetch_insn_bottom(code + 1, val, dest, base);
+}
+NOKPROBE_SYMBOL(process_fetch_insn)
+
+/* Return the length of string -- including null terminal byte */
+static nokprobe_inline int
+fetch_store_strlen_user(unsigned long addr)
+{
+	const void __user *uaddr =  (__force const void __user *)addr;
+
+	return strnlen_user_nofault(uaddr, MAX_STRING_SIZE);
+}
+
+/* Return the length of string -- including null terminal byte */
+static nokprobe_inline int
+fetch_store_strlen(unsigned long addr)
+{
+	int ret, len = 0;
+	u8 c;
+
+#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+	if (addr < TASK_SIZE)
+		return fetch_store_strlen_user(addr);
+#endif
+
+	do {
+		ret = copy_from_kernel_nofault(&c, (u8 *)addr + len, 1);
+		len++;
+	} while (c && ret == 0 && len < MAX_STRING_SIZE);
+
+	return (ret < 0) ? ret : len;
+}
+
+/*
+ * Fetch a null-terminated string from user. Caller MUST set *(u32 *)buf
+ * with max length and relative data location.
+ */
+static nokprobe_inline int
+fetch_store_string_user(unsigned long addr, void *dest, void *base)
+{
+	const void __user *uaddr =  (__force const void __user *)addr;
+	int maxlen = get_loc_len(*(u32 *)dest);
+	void *__dest;
+	long ret;
+
+	if (unlikely(!maxlen))
+		return -ENOMEM;
+
+	__dest = get_loc_data(dest, base);
+
+	ret = strncpy_from_user_nofault(__dest, uaddr, maxlen);
+	if (ret >= 0)
+		*(u32 *)dest = make_data_loc(ret, __dest - base);
+
+	return ret;
+}
+
+/*
+ * Fetch a null-terminated string. Caller MUST set *(u32 *)buf with max
+ * length and relative data location.
+ */
+static nokprobe_inline int
+fetch_store_string(unsigned long addr, void *dest, void *base)
+{
+	int maxlen = get_loc_len(*(u32 *)dest);
+	void *__dest;
+	long ret;
+
+#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+	if ((unsigned long)addr < TASK_SIZE)
+		return fetch_store_string_user(addr, dest, base);
+#endif
+
+	if (unlikely(!maxlen))
+		return -ENOMEM;
+
+	__dest = get_loc_data(dest, base);
+
+	/*
+	 * Try to get string again, since the string can be changed while
+	 * probing.
+	 */
+	ret = strncpy_from_kernel_nofault(__dest, (void *)addr, maxlen);
+	if (ret >= 0)
+		*(u32 *)dest = make_data_loc(ret, __dest - base);
+
+	return ret;
+}
+
+static nokprobe_inline int
+probe_mem_read_user(void *dest, void *src, size_t size)
+{
+	const void __user *uaddr =  (__force const void __user *)src;
+
+	return copy_from_user_nofault(dest, uaddr, size);
+}
+
+static nokprobe_inline int
+probe_mem_read(void *dest, void *src, size_t size)
+{
+#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+	if ((unsigned long)src < TASK_SIZE)
+		return probe_mem_read_user(dest, src, size);
+#endif
+	return copy_from_kernel_nofault(dest, src, size);
+}
+
+/* eprobe handler */
+static inline void
+__eprobe_trace_func(struct eprobe_data *edata, void *rec)
+{
+	struct eprobe_trace_entry_head *entry;
+	struct trace_event_call *call = trace_probe_event_call(&edata->ep->tp);
+	struct trace_event_buffer fbuffer;
+	int dsize;
+
+	if (WARN_ON_ONCE(call != edata->file->event_call))
+		return;
+
+	if (trace_trigger_soft_disabled(edata->file))
+		return;
+
+	fbuffer.trace_ctx = tracing_gen_ctx();
+	fbuffer.trace_file = edata->file;
+
+	dsize = get_eprobe_size(&edata->ep->tp, rec);
+	fbuffer.regs = NULL;
+
+	fbuffer.event =
+		trace_event_buffer_lock_reserve(&fbuffer.buffer, edata->file,
+					call->event.type,
+					sizeof(*entry) + edata->ep->tp.size + dsize,
+					fbuffer.trace_ctx);
+	if (!fbuffer.event)
+		return;
+
+	entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event);
+	if (edata->ep->event)
+		entry->type = edata->ep->event->event.type;
+	else
+		entry->type = 0;
+	store_trace_args(&entry[1], &edata->ep->tp, rec, sizeof(*entry), dsize);
+
+	trace_event_buffer_commit(&fbuffer);
+}
+
+/*
+ * The event probe implementation uses event triggers to get access to
+ * the event it is attached to, but is not an actual trigger. The below
+ * functions are just stubs to fulfill what is needed to use the trigger
+ * infrastructure.
+ */
+static int eprobe_trigger_init(struct event_trigger_ops *ops,
+			       struct event_trigger_data *data)
+{
+	return 0;
+}
+
+static void eprobe_trigger_free(struct event_trigger_ops *ops,
+				struct event_trigger_data *data)
+{
+
+}
+
+static int eprobe_trigger_print(struct seq_file *m,
+				struct event_trigger_ops *ops,
+				struct event_trigger_data *data)
+{
+	/* Do not print eprobe event triggers */
+	return 0;
+}
+
+static void eprobe_trigger_func(struct event_trigger_data *data,
+				struct trace_buffer *buffer, void *rec,
+				struct ring_buffer_event *rbe)
+{
+	struct eprobe_data *edata = data->private_data;
+
+	__eprobe_trace_func(edata, rec);
+}
+
+static struct event_trigger_ops eprobe_trigger_ops = {
+	.func			= eprobe_trigger_func,
+	.print			= eprobe_trigger_print,
+	.init			= eprobe_trigger_init,
+	.free			= eprobe_trigger_free,
+};
+
+static int eprobe_trigger_cmd_func(struct event_command *cmd_ops,
+				   struct trace_event_file *file,
+				   char *glob, char *cmd, char *param)
+{
+	return -1;
+}
+
+static int eprobe_trigger_reg_func(char *glob, struct event_trigger_ops *ops,
+				 struct event_trigger_data *data,
+				 struct trace_event_file *file)
+{
+	return -1;
+}
+
+static void eprobe_trigger_unreg_func(char *glob, struct event_trigger_ops *ops,
+				    struct event_trigger_data *data,
+				    struct trace_event_file *file)
+{
+
+}
+
+static struct event_trigger_ops *eprobe_trigger_get_ops(char *cmd,
+							char *param)
+{
+	return &eprobe_trigger_ops;
+}
+
+static struct event_command event_trigger_cmd = {
+	.name			= "eprobe",
+	.trigger_type		= ETT_EVENT_EPROBE,
+	.flags			= EVENT_CMD_FL_NEEDS_REC,
+	.func			= eprobe_trigger_cmd_func,
+	.reg			= eprobe_trigger_reg_func,
+	.unreg			= eprobe_trigger_unreg_func,
+	.unreg_all		= NULL,
+	.get_trigger_ops	= eprobe_trigger_get_ops,
+	.set_filter		= NULL,
+};
+
+static struct event_trigger_data *
+new_eprobe_trigger(struct trace_eprobe *ep, struct trace_event_file *file)
+{
+	struct event_trigger_data *trigger;
+	struct eprobe_data *edata;
+
+	edata = kzalloc(sizeof(*edata), GFP_KERNEL);
+	trigger = kzalloc(sizeof(*trigger), GFP_KERNEL);
+	if (!trigger || !edata) {
+		kfree(edata);
+		kfree(trigger);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	trigger->flags = EVENT_TRIGGER_FL_PROBE;
+	trigger->count = -1;
+	trigger->ops = &eprobe_trigger_ops;
+
+	/*
+	 * EVENT PROBE triggers are not registered as commands with
+	 * register_event_command(), as they are not controlled by the user
+	 * from the trigger file
+	 */
+	trigger->cmd_ops = &event_trigger_cmd;
+
+	INIT_LIST_HEAD(&trigger->list);
+	RCU_INIT_POINTER(trigger->filter, NULL);
+
+	edata->file = file;
+	edata->ep = ep;
+	trigger->private_data = edata;
+
+	return trigger;
+}
+
+static int enable_eprobe(struct trace_eprobe *ep,
+			 struct trace_event_file *eprobe_file)
+{
+	struct event_trigger_data *trigger;
+	struct trace_event_file *file;
+	struct trace_array *tr = eprobe_file->tr;
+
+	file = find_event_file(tr, ep->event_system, ep->event_name);
+	if (!file)
+		return -ENOENT;
+	trigger = new_eprobe_trigger(ep, eprobe_file);
+	if (IS_ERR(trigger))
+		return PTR_ERR(trigger);
+
+	list_add_tail_rcu(&trigger->list, &file->triggers);
+
+	trace_event_trigger_enable_disable(file, 1);
+	update_cond_flag(file);
+
+	return 0;
+}
+
+static struct trace_event_functions eprobe_funcs = {
+	.trace		= print_eprobe_event
+};
+
+static int disable_eprobe(struct trace_eprobe *ep,
+			  struct trace_array *tr)
+{
+	struct event_trigger_data *trigger;
+	struct trace_event_file *file;
+	struct eprobe_data *edata;
+
+	file = find_event_file(tr, ep->event_system, ep->event_name);
+	if (!file)
+		return -ENOENT;
+
+	list_for_each_entry(trigger, &file->triggers, list) {
+		if (!(trigger->flags & EVENT_TRIGGER_FL_PROBE))
+			continue;
+		edata = trigger->private_data;
+		if (edata->ep == ep)
+			break;
+	}
+	if (list_entry_is_head(trigger, &file->triggers, list))
+		return -ENODEV;
+
+	list_del_rcu(&trigger->list);
+
+	trace_event_trigger_enable_disable(file, 0);
+	update_cond_flag(file);
+	return 0;
+}
+
+static int enable_trace_eprobe(struct trace_event_call *call,
+			       struct trace_event_file *file)
+{
+	struct trace_probe *pos, *tp;
+	struct trace_eprobe *ep;
+	bool enabled;
+	int ret = 0;
+
+	tp = trace_probe_primary_from_call(call);
+	if (WARN_ON_ONCE(!tp))
+		return -ENODEV;
+	enabled = trace_probe_is_enabled(tp);
+
+	/* This also changes "enabled" state */
+	if (file) {
+		ret = trace_probe_add_file(tp, file);
+		if (ret)
+			return ret;
+	} else
+		trace_probe_set_flag(tp, TP_FLAG_PROFILE);
+
+	if (enabled)
+		return 0;
+
+	list_for_each_entry(pos, trace_probe_probe_list(tp), list) {
+		ep = container_of(pos, struct trace_eprobe, tp);
+		ret = enable_eprobe(ep, file);
+		if (ret)
+			break;
+		enabled = true;
+	}
+
+	if (ret) {
+		/* Failed to enable one of them. Roll back all */
+		if (enabled)
+			disable_eprobe(ep, file->tr);
+		if (file)
+			trace_probe_remove_file(tp, file);
+		else
+			trace_probe_clear_flag(tp, TP_FLAG_PROFILE);
+	}
+
+	return ret;
+}
+
+static int disable_trace_eprobe(struct trace_event_call *call,
+				struct trace_event_file *file)
+{
+	struct trace_probe *pos, *tp;
+	struct trace_eprobe *ep;
+
+	tp = trace_probe_primary_from_call(call);
+	if (WARN_ON_ONCE(!tp))
+		return -ENODEV;
+
+	if (file) {
+		if (!trace_probe_get_file_link(tp, file))
+			return -ENOENT;
+		if (!trace_probe_has_single_file(tp))
+			goto out;
+		trace_probe_clear_flag(tp, TP_FLAG_TRACE);
+	} else
+		trace_probe_clear_flag(tp, TP_FLAG_PROFILE);
+
+	if (!trace_probe_is_enabled(tp)) {
+		list_for_each_entry(pos, trace_probe_probe_list(tp), list) {
+			ep = container_of(pos, struct trace_eprobe, tp);
+			disable_eprobe(ep, file->tr);
+		}
+	}
+
+ out:
+	if (file)
+		/*
+		 * Synchronization is done in below function. For perf event,
+		 * file == NULL and perf_trace_event_unreg() calls
+		 * tracepoint_synchronize_unregister() to ensure synchronize
+		 * event. We don't need to care about it.
+		 */
+		trace_probe_remove_file(tp, file);
+
+	return 0;
+}
+
+static int eprobe_register(struct trace_event_call *event,
+			   enum trace_reg type, void *data)
+{
+	struct trace_event_file *file = data;
+
+	switch (type) {
+	case TRACE_REG_REGISTER:
+		return enable_trace_eprobe(event, file);
+	case TRACE_REG_UNREGISTER:
+		return disable_trace_eprobe(event, file);
+#ifdef CONFIG_PERF_EVENTS
+	case TRACE_REG_PERF_REGISTER:
+	case TRACE_REG_PERF_UNREGISTER:
+	case TRACE_REG_PERF_OPEN:
+	case TRACE_REG_PERF_CLOSE:
+	case TRACE_REG_PERF_ADD:
+	case TRACE_REG_PERF_DEL:
+		return 0;
+#endif
+	}
+	return 0;
+}
+
+static inline void init_trace_eprobe_call(struct trace_eprobe *ep)
+{
+	struct trace_event_call *call = trace_probe_event_call(&ep->tp);
+
+	call->flags = TRACE_EVENT_FL_EPROBE;
+	call->event.funcs = &eprobe_funcs;
+	call->class->fields_array = eprobe_fields_array;
+	call->class->reg = eprobe_register;
+}
+
+static struct trace_event_call *
+find_and_get_event(const char *system, const char *event_name)
+{
+	struct trace_event_call *tp_event;
+	const char *name;
+
+	list_for_each_entry(tp_event, &ftrace_events, list) {
+		/* Skip other probes and ftrace events */
+		if ((tp_event->flags & TRACE_EVENT_FL_IGNORE_ENABLE) ||
+		    (tp_event->flags & TRACE_EVENT_FL_KPROBE) ||
+		    (tp_event->flags & TRACE_EVENT_FL_UPROBE) ||
+		    (tp_event->flags & TRACE_EVENT_FL_EPROBE))
+			continue;
+		if (!tp_event->class->system ||
+		    strcmp(system, tp_event->class->system))
+			continue;
+		name = trace_event_name(tp_event);
+		if (!name || strcmp(event_name, name))
+			continue;
+		if (!trace_event_try_get_ref(tp_event)) {
+			return NULL;
+			break;
+		}
+		return tp_event;
+		break;
+	}
+	return NULL;
+}
+
+static int trace_eprobe_tp_update_arg(struct trace_eprobe *ep, const char *argv[], int i)
+{
+	unsigned int flags = TPARG_FL_KERNEL | TPARG_FL_TPOINT;
+	int ret;
+
+	ret = traceprobe_parse_probe_arg(&ep->tp, i, argv[i], flags);
+	if (ret)
+		return ret;
+
+	if (ep->tp.args[i].code->op == FETCH_OP_TP_ARG)
+		ret = trace_eprobe_tp_arg_update(ep, i);
+
+	return ret;
+}
+
+static int __trace_eprobe_create(int argc, const char *argv[])
+{
+	/*
+	 * Argument syntax:
+	 *      e[:[GRP/]ENAME] SYSTEM.EVENT [FETCHARGS]
+	 * Fetch args:
+	 *  <name>=$<field>[:TYPE]
+	 */
+	const char *event = NULL, *group = EPROBE_EVENT_SYSTEM;
+	const char *sys_event = NULL, *sys_name = NULL;
+	struct trace_event_call *event_call;
+	struct trace_eprobe *ep = NULL;
+	char buf1[MAX_EVENT_NAME_LEN];
+	char buf2[MAX_EVENT_NAME_LEN];
+	int ret = 0;
+	int i;
+
+	if (argc < 2)
+		return -ECANCELED;
+
+	trace_probe_log_init("event_probe", argc, argv);
+
+	event = strchr(&argv[0][1], ':');
+	if (event) {
+		event++;
+		ret = traceprobe_parse_event_name(&event, &group, buf1,
+						  event - argv[0]);
+		if (ret)
+			goto parse_error;
+	} else {
+		strscpy(buf1, argv[1], MAX_EVENT_NAME_LEN);
+		sanitize_event_name(buf1);
+		event = buf1;
+	}
+	if (!is_good_name(event) || !is_good_name(group))
+		goto parse_error;
+
+	sys_event = argv[1];
+	ret = traceprobe_parse_event_name(&sys_event, &sys_name, buf2,
+					  sys_event - argv[1]);
+	if (ret || !sys_name)
+		goto parse_error;
+	if (!is_good_name(sys_event) || !is_good_name(sys_name))
+		goto parse_error;
+
+	mutex_lock(&event_mutex);
+	event_call = find_and_get_event(sys_name, sys_event);
+	ep = alloc_event_probe(group, event, event_call, argc - 2);
+	mutex_unlock(&event_mutex);
+
+	if (IS_ERR(ep)) {
+		ret = PTR_ERR(ep);
+		/* This must return -ENOMEM, else there is a bug */
+		WARN_ON_ONCE(ret != -ENOMEM);
+		goto error;	/* We know ep is not allocated */
+	}
+
+	argc -= 2; argv += 2;
+	/* parse arguments */
+	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
+		ret = trace_eprobe_tp_update_arg(ep, argv, i);
+		if (ret)
+			goto error;
+	}
+	ret = traceprobe_set_print_fmt(&ep->tp, PROBE_PRINT_EVENT);
+	if (ret < 0)
+		goto error;
+	init_trace_eprobe_call(ep);
+	mutex_lock(&event_mutex);
+	ret = trace_probe_register_event_call(&ep->tp);
+	if (ret) {
+		mutex_unlock(&event_mutex);
+		goto error;
+	}
+	ret = dyn_event_add(&ep->devent, &ep->tp.event->call);
+	mutex_unlock(&event_mutex);
+	return ret;
+parse_error:
+	ret = -EINVAL;
+error:
+	trace_event_probe_cleanup(ep);
+	return ret;
+}
+
+/*
+ * Register dynevent at core_initcall. This allows kernel to setup eprobe
+ * events in postcore_initcall without tracefs.
+ */
+static __init int trace_events_eprobe_init_early(void)
+{
+	int err = 0;
+
+	err = dyn_event_register(&eprobe_dyn_event_ops);
+	if (err)
+		pr_warn("Could not register eprobe_dyn_event_ops\n");
+
+	return err;
+}
+core_initcall(trace_events_eprobe_init_early);
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 6b11e335a62e..3d5c07239a2a 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -124,6 +124,18 @@ static void *trigger_next(struct seq_file *m, void *t, loff_t *pos)
 	return seq_list_next(t, &event_file->triggers, pos);
 }
 
+static bool check_user_trigger(struct trace_event_file *file)
+{
+	struct event_trigger_data *data;
+
+	list_for_each_entry_rcu(data, &file->triggers, list) {
+		if (data->flags & EVENT_TRIGGER_FL_PROBE)
+			continue;
+		return true;
+	}
+	return false;
+}
+
 static void *trigger_start(struct seq_file *m, loff_t *pos)
 {
 	struct trace_event_file *event_file;
@@ -134,7 +146,7 @@ static void *trigger_start(struct seq_file *m, loff_t *pos)
 	if (unlikely(!event_file))
 		return ERR_PTR(-ENODEV);
 
-	if (list_empty(&event_file->triggers))
+	if (list_empty(&event_file->triggers) || !check_user_trigger(event_file))
 		return *pos == 0 ? SHOW_AVAILABLE_TRIGGERS : NULL;
 
 	return seq_list_start(&event_file->triggers, *pos);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index c6fe7a6e3f35..360b27a81459 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -703,14 +703,6 @@ static struct notifier_block trace_kprobe_module_nb = {
 	.priority = 1	/* Invoked after kprobe module callback */
 };
 
-/* Convert certain expected symbols into '_' when generating event names */
-static inline void sanitize_event_name(char *name)
-{
-	while (*name++ != '\0')
-		if (*name == ':' || *name == '.')
-			*name = '_';
-}
-
 static int __trace_kprobe_create(int argc, const char *argv[])
 {
 	/*
@@ -1325,9 +1317,10 @@ probe_mem_read(void *dest, void *src, size_t size)
 
 /* Note that we don't verify it, since the code does not come from user space */
 static int
-process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
+process_fetch_insn(struct fetch_insn *code, void *rec, void *dest,
 		   void *base)
 {
+	struct pt_regs *regs = rec;
 	unsigned long val;
 
 retry:
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index a8dcadeaae95..663ce2164406 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -319,6 +319,13 @@ static int parse_probe_vars(char *arg, const struct fetch_type *t,
 		code->op = FETCH_OP_ARG;
 		code->param = (unsigned int)param - 1;
 #endif
+	} else if (flags & TPARG_FL_TPOINT) {
+		if (code->data)
+			return -EFAULT;
+		code->data = kstrdup(arg, GFP_KERNEL);
+		if (!code->data)
+			return -ENOMEM;
+		code->op = FETCH_OP_TP_ARG;
 	} else
 		goto inval_var;
 
@@ -646,13 +653,14 @@ static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size,
 	    !strcmp(parg->type->name, "ustring")) {
 		if (code->op != FETCH_OP_DEREF && code->op != FETCH_OP_UDEREF &&
 		    code->op != FETCH_OP_IMM && code->op != FETCH_OP_COMM &&
-		    code->op != FETCH_OP_DATA) {
+		    code->op != FETCH_OP_DATA && code->op != FETCH_OP_TP_ARG) {
 			trace_probe_log_err(offset + (t ? (t - arg) : 0),
 					    BAD_STRING);
 			goto fail;
 		}
 		if ((code->op == FETCH_OP_IMM || code->op == FETCH_OP_COMM ||
-		     code->op == FETCH_OP_DATA) || parg->count) {
+		     code->op == FETCH_OP_DATA) || code->op == FETCH_OP_TP_ARG ||
+		     parg->count) {
 			/*
 			 * IMM, DATA and COMM is pointing actual address, those
 			 * must be kept, and if parg->count != 0, this is an
@@ -851,19 +859,26 @@ int traceprobe_update_arg(struct probe_arg *arg)
 /* When len=0, we just calculate the needed length */
 #define LEN_OR_ZERO (len ? len - pos : 0)
 static int __set_print_fmt(struct trace_probe *tp, char *buf, int len,
-			   bool is_return)
+			   enum probe_print_type ptype)
 {
 	struct probe_arg *parg;
 	int i, j;
 	int pos = 0;
 	const char *fmt, *arg;
 
-	if (!is_return) {
+	switch (ptype) {
+	case PROBE_PRINT_NORMAL:
 		fmt = "(%lx)";
 		arg = "REC->" FIELD_STRING_IP;
-	} else {
+		break;
+	case PROBE_PRINT_RETURN:
 		fmt = "(%lx <- %lx)";
 		arg = "REC->" FIELD_STRING_FUNC ", REC->" FIELD_STRING_RETIP;
+		break;
+	case PROBE_PRINT_EVENT:
+		fmt = "(%u)";
+		arg = "REC->" FIELD_STRING_TYPE;
+
 	}
 
 	pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);
@@ -915,18 +930,17 @@ static int __set_print_fmt(struct trace_probe *tp, char *buf, int len,
 int traceprobe_set_print_fmt(struct trace_probe *tp, enum probe_print_type ptype)
 {
 	struct trace_event_call *call = trace_probe_event_call(tp);
-	bool is_return = ptype == PROBE_PRINT_RETURN;
 	int len;
 	char *print_fmt;
 
 	/* First: called with 0 length to calculate the needed length */
-	len = __set_print_fmt(tp, NULL, 0, is_return);
+	len = __set_print_fmt(tp, NULL, 0, ptype);
 	print_fmt = kmalloc(len + 1, GFP_KERNEL);
 	if (!print_fmt)
 		return -ENOMEM;
 
 	/* Second: actually write the @print_fmt */
-	__set_print_fmt(tp, print_fmt, len + 1, is_return);
+	__set_print_fmt(tp, print_fmt, len + 1, ptype);
 	call->print_fmt = print_fmt;
 
 	return 0;
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 8adf5f3542a6..a25eb235a5e3 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -38,6 +38,7 @@
 #define FIELD_STRING_IP		"__probe_ip"
 #define FIELD_STRING_RETIP	"__probe_ret_ip"
 #define FIELD_STRING_FUNC	"__probe_func"
+#define FIELD_STRING_TYPE	"__probe_type"
 
 #undef DEFINE_FIELD
 #define DEFINE_FIELD(type, item, name, is_signed)			\
@@ -102,6 +103,7 @@ enum fetch_op {
 	FETCH_OP_MOD_BF,	/* Bitfield: .basesize, .lshift, .rshift */
 	// Stage 5 (loop) op
 	FETCH_OP_LP_ARRAY,	/* Array: .param = loop count */
+	FETCH_OP_TP_ARG,	/* Trace Point argument */
 	FETCH_OP_END,
 	FETCH_NOP_SYMBOL,	/* Unresolved Symbol holder */
 };
@@ -351,7 +353,8 @@ int trace_probe_create(const char *raw_command, int (*createfn)(int, const char
 #define TPARG_FL_RETURN BIT(0)
 #define TPARG_FL_KERNEL BIT(1)
 #define TPARG_FL_FENTRY BIT(2)
-#define TPARG_FL_MASK	GENMASK(2, 0)
+#define TPARG_FL_TPOINT BIT(3)
+#define TPARG_FL_MASK	GENMASK(3, 0)
 
 extern int traceprobe_parse_probe_arg(struct trace_probe *tp, int i,
 				const char *argv, unsigned int flags);
@@ -366,6 +369,7 @@ int traceprobe_parse_event_name(const char **pevent, const char **pgroup,
 enum probe_print_type {
 	PROBE_PRINT_NORMAL,
 	PROBE_PRINT_RETURN,
+	PROBE_PRINT_EVENT,
 };
 
 extern int traceprobe_set_print_fmt(struct trace_probe *tp, enum probe_print_type ptype);
diff --git a/kernel/trace/trace_probe_tmpl.h b/kernel/trace/trace_probe_tmpl.h
index f003c5d02a3a..b3bdb8ddb862 100644
--- a/kernel/trace/trace_probe_tmpl.h
+++ b/kernel/trace/trace_probe_tmpl.h
@@ -54,7 +54,7 @@ fetch_apply_bitfield(struct fetch_insn *code, void *buf)
  * If dest is NULL, don't store result and return required dynamic data size.
  */
 static int
-process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs,
+process_fetch_insn(struct fetch_insn *code, void *rec,
 		   void *dest, void *base);
 static nokprobe_inline int fetch_store_strlen(unsigned long addr);
 static nokprobe_inline int
@@ -188,7 +188,7 @@ __get_data_size(struct trace_probe *tp, struct pt_regs *regs)
 
 /* Store the value of each argument */
 static nokprobe_inline void
-store_trace_args(void *data, struct trace_probe *tp, struct pt_regs *regs,
+store_trace_args(void *data, struct trace_probe *tp, void *rec,
 		 int header_size, int maxlen)
 {
 	struct probe_arg *arg;
@@ -203,7 +203,7 @@ store_trace_args(void *data, struct trace_probe *tp, struct pt_regs *regs,
 		/* Point the dynamic data area if needed */
 		if (unlikely(arg->dynamic))
 			*dl = make_data_loc(maxlen, dyndata - base);
-		ret = process_fetch_insn(arg->code, regs, dl, base);
+		ret = process_fetch_insn(arg->code, rec, dl, base);
 		if (unlikely(ret < 0 && arg->dynamic)) {
 			*dl = make_data_loc(0, dyndata - base);
 		} else {
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 09f8ca7f7ba0..d219ba50efbd 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -213,9 +213,10 @@ static unsigned long translate_user_vaddr(unsigned long file_offset)
 
 /* Note that we don't verify it, since the code does not come from user space */
 static int
-process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
+process_fetch_insn(struct fetch_insn *code, void *rec, void *dest,
 		   void *base)
 {
+	struct pt_regs *regs = rec;
 	unsigned long val;
 
 	/* 1st stage: get value from context */
-- 
2.30.2

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 1/7] tracing: Add DYNAMIC flag for dynamic events
  2021-08-17  3:42 ` [PATCH v6 1/7] tracing: Add DYNAMIC flag for dynamic events Steven Rostedt
@ 2021-08-18  8:15   ` Masami Hiramatsu
  0 siblings, 0 replies; 20+ messages in thread
From: Masami Hiramatsu @ 2021-08-18  8:15 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Tzvetomir Stoyanov, Tom Zanussi, linux-trace-devel

On Mon, 16 Aug 2021 23:42:56 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> To differentiate between static and dynamic events, add a new flag
> DYNAMIC to the event flags that all dynamic events have set. This will
> allow to differentiate when attaching to a dynamic event from a static
> event.
> 
> Static events have a mod pointer that references the module they were
> created in (or NULL for core kernel). This can be incremented when the
> event has something attached to it. But there exists no such mechanism for
> dynamic events. This is dangerous as the dynamic events may now disappear
> without the "attachment" knowing that it no longer exists.
> 
> To enforce the dynamic flag, change dyn_event_add() to pass the event that
> is being created such that it can set the DYNAMIC flag of the event. This
> helps make sure that no location that creates a dynamic event misses
> setting this flag.
> 
> Link: https://lore.kernel.org/linux-trace-devel/20210813004448.51c7de69ce432d338f4d226b@kernel.org/
> 

This looks good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you,

> Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  include/linux/trace_events.h      | 3 +++
>  kernel/trace/trace_dynevent.h     | 4 +++-
>  kernel/trace/trace_events_synth.c | 2 +-
>  kernel/trace/trace_kprobe.c       | 4 ++--
>  kernel/trace/trace_uprobe.c       | 4 ++--
>  5 files changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
> index ad413b382a3c..53c9dffd87fd 100644
> --- a/include/linux/trace_events.h
> +++ b/include/linux/trace_events.h
> @@ -310,6 +310,7 @@ enum {
>  	TRACE_EVENT_FL_NO_SET_FILTER_BIT,
>  	TRACE_EVENT_FL_IGNORE_ENABLE_BIT,
>  	TRACE_EVENT_FL_TRACEPOINT_BIT,
> +	TRACE_EVENT_FL_DYNAMIC_BIT,
>  	TRACE_EVENT_FL_KPROBE_BIT,
>  	TRACE_EVENT_FL_UPROBE_BIT,
>  };
> @@ -321,6 +322,7 @@ enum {
>   *  NO_SET_FILTER - Set when filter has error and is to be ignored
>   *  IGNORE_ENABLE - For trace internal events, do not enable with debugfs file
>   *  TRACEPOINT    - Event is a tracepoint
> + *  DYNAMIC       - Event is a dynamic event (created at run time)
>   *  KPROBE        - Event is a kprobe
>   *  UPROBE        - Event is a uprobe
>   */
> @@ -330,6 +332,7 @@ enum {
>  	TRACE_EVENT_FL_NO_SET_FILTER	= (1 << TRACE_EVENT_FL_NO_SET_FILTER_BIT),
>  	TRACE_EVENT_FL_IGNORE_ENABLE	= (1 << TRACE_EVENT_FL_IGNORE_ENABLE_BIT),
>  	TRACE_EVENT_FL_TRACEPOINT	= (1 << TRACE_EVENT_FL_TRACEPOINT_BIT),
> +	TRACE_EVENT_FL_DYNAMIC		= (1 << TRACE_EVENT_FL_DYNAMIC_BIT),
>  	TRACE_EVENT_FL_KPROBE		= (1 << TRACE_EVENT_FL_KPROBE_BIT),
>  	TRACE_EVENT_FL_UPROBE		= (1 << TRACE_EVENT_FL_UPROBE_BIT),
>  };
> diff --git a/kernel/trace/trace_dynevent.h b/kernel/trace/trace_dynevent.h
> index 7754936b57ee..936477a111d3 100644
> --- a/kernel/trace/trace_dynevent.h
> +++ b/kernel/trace/trace_dynevent.h
> @@ -76,13 +76,15 @@ int dyn_event_init(struct dyn_event *ev, struct dyn_event_operations *ops)
>  	return 0;
>  }
>  
> -static inline int dyn_event_add(struct dyn_event *ev)
> +static inline int dyn_event_add(struct dyn_event *ev,
> +				struct trace_event_call *call)
>  {
>  	lockdep_assert_held(&event_mutex);
>  
>  	if (!ev || !ev->ops)
>  		return -EINVAL;
>  
> +	call->flags |= TRACE_EVENT_FL_DYNAMIC;
>  	list_add_tail(&ev->list, &dyn_event_list);
>  	return 0;
>  }
> diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
> index 9315fc03e303..f4f5489e1e28 100644
> --- a/kernel/trace/trace_events_synth.c
> +++ b/kernel/trace/trace_events_synth.c
> @@ -1298,7 +1298,7 @@ static int __create_synth_event(const char *name, const char *raw_fields)
>  	}
>  	ret = register_synth_event(event);
>  	if (!ret)
> -		dyn_event_add(&event->devent);
> +		dyn_event_add(&event->devent, &event->call);
>  	else
>  		free_synth_event(event);
>   out:
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index ea6178cb5e33..bfef43bfce37 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -618,7 +618,7 @@ static int append_trace_kprobe(struct trace_kprobe *tk, struct trace_kprobe *to)
>  	if (ret)
>  		trace_probe_unlink(&tk->tp);
>  	else
> -		dyn_event_add(&tk->devent);
> +		dyn_event_add(&tk->devent, trace_probe_event_call(&tk->tp));
>  
>  	return ret;
>  }
> @@ -661,7 +661,7 @@ static int register_trace_kprobe(struct trace_kprobe *tk)
>  	if (ret < 0)
>  		unregister_kprobe_event(tk);
>  	else
> -		dyn_event_add(&tk->devent);
> +		dyn_event_add(&tk->devent, trace_probe_event_call(&tk->tp));
>  
>  end:
>  	mutex_unlock(&event_mutex);
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index 9b50869a5ddb..50eca53b8d22 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -455,7 +455,7 @@ static int append_trace_uprobe(struct trace_uprobe *tu, struct trace_uprobe *to)
>  	/* Append to existing event */
>  	ret = trace_probe_append(&tu->tp, &to->tp);
>  	if (!ret)
> -		dyn_event_add(&tu->devent);
> +		dyn_event_add(&tu->devent, trace_probe_event_call(&tu->tp));
>  
>  	return ret;
>  }
> @@ -518,7 +518,7 @@ static int register_trace_uprobe(struct trace_uprobe *tu)
>  		goto end;
>  	}
>  
> -	dyn_event_add(&tu->devent);
> +	dyn_event_add(&tu->devent, trace_probe_event_call(&tu->tp));
>  
>  end:
>  	mutex_unlock(&event_mutex);
> -- 
> 2.30.2


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 4/7] tracing/probes: Allow for dot delimiter as well as slash for system names
  2021-08-17  3:42 ` [PATCH v6 4/7] tracing/probes: Allow for dot delimiter as well as slash for system names Steven Rostedt
@ 2021-08-18 10:57   ` Masami Hiramatsu
  0 siblings, 0 replies; 20+ messages in thread
From: Masami Hiramatsu @ 2021-08-18 10:57 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Tzvetomir Stoyanov, Tom Zanussi, linux-trace-devel

On Mon, 16 Aug 2021 23:42:59 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Kprobe and uprobe events can add a "system" to the events that are created
> via the kprobe_events and uprobe_events files respectively. If they do not
> include a "system" in the name, then the default "kprobes" or "uprobes" is
> used. The current notation to specify a system for one of these probe
> events is to add a '/' delimiter in the name, where the content before the
> '/' will be the system to use, and the content after will be the event
> name.
> 
>  echo 'p:my_system/my_event' > kprobe_events
> 
> But this is inconsistent with the way histogram triggers separate their
> system / event names. The histogram triggers use a '.' delimiter, which
> can be confusing.
> 
> To allow this to be more consistent, as well as keep backward
> compatibility, allow the kprobe and uprobe events to denote a system name
> with either a '/' or a '.'.
> 
> That is:
> 
>   echo 'p:my_system/my_event' > kprobe_events
> 
> is equivalent to:
> 
>   echo 'p:my_system.my_event' > kprobe_events
> 
> Link: https://lore.kernel.org/linux-trace-devel/20210813004448.51c7de69ce432d338f4d226b@kernel.org/
> 

Yes, this is what I suggested :)

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thanks!


> Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/trace/trace_probe.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index ef717b373443..0916a9964719 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -233,6 +233,9 @@ int traceprobe_parse_event_name(const char **pevent, const char **pgroup,
>  	int len;
>  
>  	slash = strchr(event, '/');
> +	if (!slash)
> +		slash = strchr(event, '.');
> +
>  	if (slash) {
>  		if (slash == event) {
>  			trace_probe_log_err(offset, NO_GROUP_NAME);
> -- 
> 2.30.2


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 5/7] tracing/probes: Use struct_size() instead of defining custom macros
  2021-08-17  3:43 ` [PATCH v6 5/7] tracing/probes: Use struct_size() instead of defining custom macros Steven Rostedt
@ 2021-08-18 11:08   ` Masami Hiramatsu
  0 siblings, 0 replies; 20+ messages in thread
From: Masami Hiramatsu @ 2021-08-18 11:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Tzvetomir Stoyanov, Tom Zanussi, linux-trace-devel

On Mon, 16 Aug 2021 23:43:00 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Remove SIZEOF_TRACE_KPROBE() and SIZEOF_TRACE_UPROBE() and use
> struct_size() as that's what it is made for. No need to have custom
> macros. Especially since struct_size() has some extra memory checks for
> correctness.
> 

Good catch!

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you!

> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/trace/trace_kprobe.c | 6 +-----
>  kernel/trace/trace_uprobe.c | 6 +-----
>  2 files changed, 2 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index ed1e3c2087ab..ca726c9d0859 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -80,10 +80,6 @@ static struct trace_kprobe *to_trace_kprobe(struct dyn_event *ev)
>  	for_each_dyn_event(dpos)		\
>  		if (is_trace_kprobe(dpos) && (pos = to_trace_kprobe(dpos)))
>  
> -#define SIZEOF_TRACE_KPROBE(n)				\
> -	(offsetof(struct trace_kprobe, tp.args) +	\
> -	(sizeof(struct probe_arg) * (n)))
> -
>  static nokprobe_inline bool trace_kprobe_is_return(struct trace_kprobe *tk)
>  {
>  	return tk->rp.handler != NULL;
> @@ -265,7 +261,7 @@ static struct trace_kprobe *alloc_trace_kprobe(const char *group,
>  	struct trace_kprobe *tk;
>  	int ret = -ENOMEM;
>  
> -	tk = kzalloc(SIZEOF_TRACE_KPROBE(nargs), GFP_KERNEL);
> +	tk = kzalloc(struct_size(tk, tp.args, nargs), GFP_KERNEL);
>  	if (!tk)
>  		return ERR_PTR(ret);
>  
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index 93ff96541971..590bb9a02f8d 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -83,10 +83,6 @@ static struct trace_uprobe *to_trace_uprobe(struct dyn_event *ev)
>  	for_each_dyn_event(dpos)		\
>  		if (is_trace_uprobe(dpos) && (pos = to_trace_uprobe(dpos)))
>  
> -#define SIZEOF_TRACE_UPROBE(n)				\
> -	(offsetof(struct trace_uprobe, tp.args) +	\
> -	(sizeof(struct probe_arg) * (n)))
> -
>  static int register_uprobe_event(struct trace_uprobe *tu);
>  static int unregister_uprobe_event(struct trace_uprobe *tu);
>  
> @@ -340,7 +336,7 @@ alloc_trace_uprobe(const char *group, const char *event, int nargs, bool is_ret)
>  	struct trace_uprobe *tu;
>  	int ret;
>  
> -	tu = kzalloc(SIZEOF_TRACE_UPROBE(nargs), GFP_KERNEL);
> +	tu = kzalloc(struct_size(tu, tp.args, nargs), GFP_KERNEL);
>  	if (!tu)
>  		return ERR_PTR(-ENOMEM);
>  
> -- 
> 2.30.2


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 2/7] tracing: Have dynamic events have a ref counter
  2021-08-17  3:42 ` [PATCH v6 2/7] tracing: Have dynamic events have a ref counter Steven Rostedt
@ 2021-08-18 16:14   ` Masami Hiramatsu
  0 siblings, 0 replies; 20+ messages in thread
From: Masami Hiramatsu @ 2021-08-18 16:14 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Tzvetomir Stoyanov, Tom Zanussi, linux-trace-devel

On Mon, 16 Aug 2021 23:42:57 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> As dynamic events are not created by modules, if something is attached to
> one, calling "try_module_get()" on its "mod" field, is not going to keep
> the dynamic event from going away.
> 
> Since dynamic events do not need the "mod" pointer of the event structure,
> make a union out of it in order to save memory (there's one structure for
> each of the thousand+ events in the kernel), and have any event with the
> DYNAMIC flag set to use a ref counter instead.
> 
> Link: https://lore.kernel.org/linux-trace-devel/20210813004448.51c7de69ce432d338f4d226b@kernel.org/
> 

This looks good to me, just one nitpick.

[..]
> +
> +static inline void trace_event_put_ref(struct trace_event_call *call)
> +{
> +	if (call->flags & TRACE_EVENT_FL_DYNAMIC)
> +		return trace_event_dyn_put_ref(call);
> +	else
> +		return module_put(call->module);

You don't need to return for void function.

Except for this,
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you!



-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 3/7] tracing/probe: Have traceprobe_parse_probe_arg() take a const arg
  2021-08-17  3:42 ` [PATCH v6 3/7] tracing/probe: Have traceprobe_parse_probe_arg() take a const arg Steven Rostedt
@ 2021-08-18 16:16   ` Masami Hiramatsu
  0 siblings, 0 replies; 20+ messages in thread
From: Masami Hiramatsu @ 2021-08-18 16:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Tzvetomir Stoyanov, Tom Zanussi, linux-trace-devel

On Mon, 16 Aug 2021 23:42:58 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> The two places that call traceprobe_parse_probe_arg() allocate a temporary
> buffer to copy the argv[i] into, because argv[i] is constant and the
> traceprobe_parse_probe_arg() will modify it to do the parsing. These two
> places allocate this buffer and then free it right after calling this
> function, leaving the onus of this allocation to the caller.
> 
> As there's about to be a third user of this function that will have to do
> the same thing, instead of having the caller allocate the temporary
> buffer, simply move that allocation into the traceprobe_parse_probe_arg()
> itself, which will simplify the code of the callers.
> 

Thanks, this looks good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you,

> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/trace/trace_kprobe.c |  9 +------
>  kernel/trace/trace_probe.c  | 47 ++++++++++++++++++++++---------------
>  kernel/trace/trace_probe.h  |  2 +-
>  kernel/trace/trace_uprobe.c |  9 +------
>  4 files changed, 31 insertions(+), 36 deletions(-)
> 
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 82c3b86013b2..ed1e3c2087ab 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -873,15 +873,8 @@ static int __trace_kprobe_create(int argc, const char *argv[])
>  
>  	/* parse arguments */
>  	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
> -		tmp = kstrdup(argv[i], GFP_KERNEL);
> -		if (!tmp) {
> -			ret = -ENOMEM;
> -			goto error;
> -		}
> -
>  		trace_probe_log_set_index(i + 2);
> -		ret = traceprobe_parse_probe_arg(&tk->tp, i, tmp, flags);
> -		kfree(tmp);
> +		ret = traceprobe_parse_probe_arg(&tk->tp, i, argv[i], flags);
>  		if (ret)
>  			goto error;	/* This can be -ENOMEM */
>  	}
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 15413ad7cef2..ef717b373443 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -540,26 +540,34 @@ static int __parse_bitfield_probe_arg(const char *bf,
>  }
>  
>  /* String length checking wrapper */
> -static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
> +static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size,
>  		struct probe_arg *parg, unsigned int flags, int offset)
>  {
>  	struct fetch_insn *code, *scode, *tmp = NULL;
>  	char *t, *t2, *t3;
> +	char *arg;
>  	int ret, len;
>  
> +	arg = kstrdup(argv, GFP_KERNEL);
> +	if (!arg)
> +		return -ENOMEM;
> +
> +	ret = -EINVAL;
>  	len = strlen(arg);
>  	if (len > MAX_ARGSTR_LEN) {
>  		trace_probe_log_err(offset, ARG_TOO_LONG);
> -		return -EINVAL;
> +		goto out;
>  	} else if (len == 0) {
>  		trace_probe_log_err(offset, NO_ARG_BODY);
> -		return -EINVAL;
> +		goto out;
>  	}
>  
> +	ret = -ENOMEM;
>  	parg->comm = kstrdup(arg, GFP_KERNEL);
>  	if (!parg->comm)
> -		return -ENOMEM;
> +		goto out;
>  
> +	ret = -EINVAL;
>  	t = strchr(arg, ':');
>  	if (t) {
>  		*t = '\0';
> @@ -571,22 +579,22 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
>  				offset += t2 + strlen(t2) - arg;
>  				trace_probe_log_err(offset,
>  						    ARRAY_NO_CLOSE);
> -				return -EINVAL;
> +				goto out;
>  			} else if (t3[1] != '\0') {
>  				trace_probe_log_err(offset + t3 + 1 - arg,
>  						    BAD_ARRAY_SUFFIX);
> -				return -EINVAL;
> +				goto out;
>  			}
>  			*t3 = '\0';
>  			if (kstrtouint(t2, 0, &parg->count) || !parg->count) {
>  				trace_probe_log_err(offset + t2 - arg,
>  						    BAD_ARRAY_NUM);
> -				return -EINVAL;
> +				goto out;
>  			}
>  			if (parg->count > MAX_ARRAY_LEN) {
>  				trace_probe_log_err(offset + t2 - arg,
>  						    ARRAY_TOO_BIG);
> -				return -EINVAL;
> +				goto out;
>  			}
>  		}
>  	}
> @@ -598,29 +606,30 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
>  	if (strcmp(arg, "$comm") == 0 || strncmp(arg, "\\\"", 2) == 0) {
>  		/* The type of $comm must be "string", and not an array. */
>  		if (parg->count || (t && strcmp(t, "string")))
> -			return -EINVAL;
> +			goto out;
>  		parg->type = find_fetch_type("string");
>  	} else
>  		parg->type = find_fetch_type(t);
>  	if (!parg->type) {
>  		trace_probe_log_err(offset + (t ? (t - arg) : 0), BAD_TYPE);
> -		return -EINVAL;
> +		goto out;
>  	}
>  	parg->offset = *size;
>  	*size += parg->type->size * (parg->count ?: 1);
>  
> +	ret = -ENOMEM;
>  	if (parg->count) {
>  		len = strlen(parg->type->fmttype) + 6;
>  		parg->fmt = kmalloc(len, GFP_KERNEL);
>  		if (!parg->fmt)
> -			return -ENOMEM;
> +			goto out;
>  		snprintf(parg->fmt, len, "%s[%d]", parg->type->fmttype,
>  			 parg->count);
>  	}
>  
>  	code = tmp = kcalloc(FETCH_INSN_MAX, sizeof(*code), GFP_KERNEL);
>  	if (!code)
> -		return -ENOMEM;
> +		goto out;
>  	code[FETCH_INSN_MAX - 1].op = FETCH_OP_END;
>  
>  	ret = parse_probe_arg(arg, parg->type, &code, &code[FETCH_INSN_MAX - 1],
> @@ -628,6 +637,7 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
>  	if (ret)
>  		goto fail;
>  
> +	ret = -EINVAL;
>  	/* Store operation */
>  	if (!strcmp(parg->type->name, "string") ||
>  	    !strcmp(parg->type->name, "ustring")) {
> @@ -636,7 +646,6 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
>  		    code->op != FETCH_OP_DATA) {
>  			trace_probe_log_err(offset + (t ? (t - arg) : 0),
>  					    BAD_STRING);
> -			ret = -EINVAL;
>  			goto fail;
>  		}
>  		if ((code->op == FETCH_OP_IMM || code->op == FETCH_OP_COMM ||
> @@ -650,7 +659,6 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
>  			code++;
>  			if (code->op != FETCH_OP_NOP) {
>  				trace_probe_log_err(offset, TOO_MANY_OPS);
> -				ret = -EINVAL;
>  				goto fail;
>  			}
>  		}
> @@ -672,7 +680,6 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
>  		code++;
>  		if (code->op != FETCH_OP_NOP) {
>  			trace_probe_log_err(offset, TOO_MANY_OPS);
> -			ret = -EINVAL;
>  			goto fail;
>  		}
>  		code->op = FETCH_OP_ST_RAW;
> @@ -687,6 +694,7 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
>  			goto fail;
>  		}
>  	}
> +	ret = -EINVAL;
>  	/* Loop(Array) operation */
>  	if (parg->count) {
>  		if (scode->op != FETCH_OP_ST_MEM &&
> @@ -694,13 +702,11 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
>  		    scode->op != FETCH_OP_ST_USTRING) {
>  			trace_probe_log_err(offset + (t ? (t - arg) : 0),
>  					    BAD_STRING);
> -			ret = -EINVAL;
>  			goto fail;
>  		}
>  		code++;
>  		if (code->op != FETCH_OP_NOP) {
>  			trace_probe_log_err(offset, TOO_MANY_OPS);
> -			ret = -EINVAL;
>  			goto fail;
>  		}
>  		code->op = FETCH_OP_LP_ARRAY;
> @@ -709,6 +715,7 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
>  	code++;
>  	code->op = FETCH_OP_END;
>  
> +	ret = 0;
>  	/* Shrink down the code buffer */
>  	parg->code = kcalloc(code - tmp + 1, sizeof(*code), GFP_KERNEL);
>  	if (!parg->code)
> @@ -724,6 +731,8 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
>  				kfree(code->data);
>  	}
>  	kfree(tmp);
> +out:
> +	kfree(arg);
>  
>  	return ret;
>  }
> @@ -745,11 +754,11 @@ static int traceprobe_conflict_field_name(const char *name,
>  	return 0;
>  }
>  
> -int traceprobe_parse_probe_arg(struct trace_probe *tp, int i, char *arg,
> +int traceprobe_parse_probe_arg(struct trace_probe *tp, int i, const char *arg,
>  				unsigned int flags)
>  {
>  	struct probe_arg *parg = &tp->args[i];
> -	char *body;
> +	const char *body;
>  
>  	/* Increment count for freeing args in error case */
>  	tp->nr_args++;
> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> index 227d518e5ba5..42aa084902fa 100644
> --- a/kernel/trace/trace_probe.h
> +++ b/kernel/trace/trace_probe.h
> @@ -354,7 +354,7 @@ int trace_probe_create(const char *raw_command, int (*createfn)(int, const char
>  #define TPARG_FL_MASK	GENMASK(2, 0)
>  
>  extern int traceprobe_parse_probe_arg(struct trace_probe *tp, int i,
> -				char *arg, unsigned int flags);
> +				const char *argv, unsigned int flags);
>  
>  extern int traceprobe_update_arg(struct probe_arg *arg);
>  extern void traceprobe_free_probe_arg(struct probe_arg *arg);
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index 1e2a92e7607d..93ff96541971 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -684,16 +684,9 @@ static int __trace_uprobe_create(int argc, const char **argv)
>  
>  	/* parse arguments */
>  	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
> -		tmp = kstrdup(argv[i], GFP_KERNEL);
> -		if (!tmp) {
> -			ret = -ENOMEM;
> -			goto error;
> -		}
> -
>  		trace_probe_log_set_index(i + 2);
> -		ret = traceprobe_parse_probe_arg(&tu->tp, i, tmp,
> +		ret = traceprobe_parse_probe_arg(&tu->tp, i, argv[i],
>  					is_return ? TPARG_FL_RETURN : 0);
> -		kfree(tmp);
>  		if (ret)
>  			goto error;
>  	}
> -- 
> 2.30.2


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 6/7] tracing/probe: Change traceprobe_set_print_fmt() to take a type
  2021-08-17  3:43 ` [PATCH v6 6/7] tracing/probe: Change traceprobe_set_print_fmt() to take a type Steven Rostedt
@ 2021-08-18 16:19   ` Masami Hiramatsu
  0 siblings, 0 replies; 20+ messages in thread
From: Masami Hiramatsu @ 2021-08-18 16:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Tzvetomir Stoyanov, Tom Zanussi, linux-trace-devel

On Mon, 16 Aug 2021 23:43:01 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Instead of a boolean "is_return" have traceprobe_set_print_fmt() take a
> type (currently just PROBE_PRINT_NORMAL and PROBE_PRINT_RETURN). This will
> simplify adding different types. For example, the development of the
> event_probe, will need its own type as it prints an event, and not an IP.

This looks good and reasonable to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you!

> 
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/trace/trace_kprobe.c | 9 +++++++--
>  kernel/trace/trace_probe.c  | 3 ++-
>  kernel/trace/trace_probe.h  | 7 ++++++-
>  kernel/trace/trace_uprobe.c | 8 ++++++--
>  4 files changed, 21 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index ca726c9d0859..c6fe7a6e3f35 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -742,6 +742,7 @@ static int __trace_kprobe_create(int argc, const char *argv[])
>  	bool is_return = false;
>  	char *symbol = NULL, *tmp = NULL;
>  	const char *event = NULL, *group = KPROBE_EVENT_SYSTEM;
> +	enum probe_print_type ptype;
>  	int maxactive = 0;
>  	long offset = 0;
>  	void *addr = NULL;
> @@ -875,7 +876,8 @@ static int __trace_kprobe_create(int argc, const char *argv[])
>  			goto error;	/* This can be -ENOMEM */
>  	}
>  
> -	ret = traceprobe_set_print_fmt(&tk->tp, is_return);
> +	ptype = is_return ? PROBE_PRINT_RETURN : PROBE_PRINT_NORMAL;
> +	ret = traceprobe_set_print_fmt(&tk->tp, ptype);
>  	if (ret < 0)
>  		goto error;
>  
> @@ -1799,6 +1801,7 @@ struct trace_event_call *
>  create_local_trace_kprobe(char *func, void *addr, unsigned long offs,
>  			  bool is_return)
>  {
> +	enum probe_print_type ptype;
>  	struct trace_kprobe *tk;
>  	int ret;
>  	char *event;
> @@ -1822,7 +1825,9 @@ create_local_trace_kprobe(char *func, void *addr, unsigned long offs,
>  
>  	init_trace_event_call(tk);
>  
> -	if (traceprobe_set_print_fmt(&tk->tp, trace_kprobe_is_return(tk)) < 0) {
> +	ptype = trace_kprobe_is_return(tk) ?
> +		PROBE_PRINT_RETURN : PROBE_PRINT_NORMAL;
> +	if (traceprobe_set_print_fmt(&tk->tp, ptype) < 0) {
>  		ret = -ENOMEM;
>  		goto error;
>  	}
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 0916a9964719..a8dcadeaae95 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -912,9 +912,10 @@ static int __set_print_fmt(struct trace_probe *tp, char *buf, int len,
>  }
>  #undef LEN_OR_ZERO
>  
> -int traceprobe_set_print_fmt(struct trace_probe *tp, bool is_return)
> +int traceprobe_set_print_fmt(struct trace_probe *tp, enum probe_print_type ptype)
>  {
>  	struct trace_event_call *call = trace_probe_event_call(tp);
> +	bool is_return = ptype == PROBE_PRINT_RETURN;
>  	int len;
>  	char *print_fmt;
>  
> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> index 42aa084902fa..8adf5f3542a6 100644
> --- a/kernel/trace/trace_probe.h
> +++ b/kernel/trace/trace_probe.h
> @@ -363,7 +363,12 @@ extern int traceprobe_split_symbol_offset(char *symbol, long *offset);
>  int traceprobe_parse_event_name(const char **pevent, const char **pgroup,
>  				char *buf, int offset);
>  
> -extern int traceprobe_set_print_fmt(struct trace_probe *tp, bool is_return);
> +enum probe_print_type {
> +	PROBE_PRINT_NORMAL,
> +	PROBE_PRINT_RETURN,
> +};
> +
> +extern int traceprobe_set_print_fmt(struct trace_probe *tp, enum probe_print_type ptype);
>  
>  #ifdef CONFIG_PERF_EVENTS
>  extern struct trace_event_call *
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index 590bb9a02f8d..09f8ca7f7ba0 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -536,6 +536,7 @@ static int __trace_uprobe_create(int argc, const char **argv)
>  	const char *event = NULL, *group = UPROBE_EVENT_SYSTEM;
>  	char *arg, *filename, *rctr, *rctr_end, *tmp;
>  	char buf[MAX_EVENT_NAME_LEN];
> +	enum probe_print_type ptype;
>  	struct path path;
>  	unsigned long offset, ref_ctr_offset;
>  	bool is_return = false;
> @@ -687,7 +688,8 @@ static int __trace_uprobe_create(int argc, const char **argv)
>  			goto error;
>  	}
>  
> -	ret = traceprobe_set_print_fmt(&tu->tp, is_ret_probe(tu));
> +	ptype = is_ret_probe(tu) ? PROBE_PRINT_RETURN : PROBE_PRINT_NORMAL;
> +	ret = traceprobe_set_print_fmt(&tu->tp, ptype);
>  	if (ret < 0)
>  		goto error;
>  
> @@ -1578,6 +1580,7 @@ struct trace_event_call *
>  create_local_trace_uprobe(char *name, unsigned long offs,
>  			  unsigned long ref_ctr_offset, bool is_return)
>  {
> +	enum probe_print_type ptype;
>  	struct trace_uprobe *tu;
>  	struct path path;
>  	int ret;
> @@ -1612,7 +1615,8 @@ create_local_trace_uprobe(char *name, unsigned long offs,
>  	tu->filename = kstrdup(name, GFP_KERNEL);
>  	init_trace_event_call(tu);
>  
> -	if (traceprobe_set_print_fmt(&tu->tp, is_ret_probe(tu)) < 0) {
> +	ptype = is_ret_probe(tu) ? PROBE_PRINT_RETURN : PROBE_PRINT_NORMAL;
> +	if (traceprobe_set_print_fmt(&tu->tp, ptype) < 0) {
>  		ret = -ENOMEM;
>  		goto error;
>  	}
> -- 
> 2.30.2


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 7/7] tracing: Add a probe that attaches to trace events
  2021-08-17  3:43 ` [PATCH v6 7/7] tracing: Add a probe that attaches to trace events Steven Rostedt
@ 2021-08-19  2:29   ` Masami Hiramatsu
  2021-08-19  3:37     ` Steven Rostedt
  0 siblings, 1 reply; 20+ messages in thread
From: Masami Hiramatsu @ 2021-08-19  2:29 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Tzvetomir Stoyanov, Tom Zanussi, linux-trace-devel

On Mon, 16 Aug 2021 23:43:02 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
> 
> A new dynamic event is introduced: event probe. The event is attached
> to an existing tracepoint and uses its fields as arguments. The user
> can specify custom format string of the new event, select what tracepoint
> arguments will be printed and how to print them.
> An event probe is created by writing configuration string in
> 'dynamic_events' ftrace file:
>  e:[SNAME/]ENAME SYSTEM/EVENT [FETCHARGS]	- Set an event probe

Just a small comment. This ENAME is actually optional.

>  -:SNAME/ENAME					- Delete an event probe
> 
> Where:
>  SNAME	- System name, if omitted 'eprobes' is used.
>  ENAME	- Name of the new event in SNAME, mandatory.

Ditto.
See __trace_eprobe_create(), it generates the event name from
the "SYSTEM_EVENT". (BTW, what happen if we have 2 different event
probe on the same event with same name?)

>  SYSTEM	- Name of the system, where the tracepoint is defined, mandatory.
>  EVENT	- Name of the tracepoint event in SYSTEM, mandatory.
>  FETCHARGS - Arguments:
>   <name>=$<field>[:TYPE] - Fetch given filed of the tracepoint and print
> 			   it as given TYPE with given name. Supported
> 			   types are:
> 	                    (u8/u16/u32/u64/s8/s16/s32/s64), basic type
>         	            (x8/x16/x32/x64), hexadecimal types
> 			    "string", "ustring" and bitfield.
> 
> Example, attach an event probe on openat system call and print name of the
> file that will be opened:
>  echo "e:esys/eopen syscalls/sys_enter_openat file=\$filename:string" >> dynamic_events
> A new dynamic event is created in events/esys/eopen/ directory. It
> can be deleted with:
>  echo "-:esys/eopen" >> dynamic_events
> 
> Filters, triggers and histograms can be attached to the new event, it can
> be matched in synthetic events. There is one limitation - an event probe
> can not be attached to kprobe, uprobe or another event probe.
> 
> Link: https://lkml.kernel.org/r/20210812145805.2292326-1-tz.stoyanov@gmail.com
> 
> Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
[...]
> +static struct trace_event_call *
> +find_and_get_event(const char *system, const char *event_name)
> +{
> +	struct trace_event_call *tp_event;
> +	const char *name;
> +
> +	list_for_each_entry(tp_event, &ftrace_events, list) {
> +		/* Skip other probes and ftrace events */
> +		if ((tp_event->flags & TRACE_EVENT_FL_IGNORE_ENABLE) ||
> +		    (tp_event->flags & TRACE_EVENT_FL_KPROBE) ||
> +		    (tp_event->flags & TRACE_EVENT_FL_UPROBE) ||
> +		    (tp_event->flags & TRACE_EVENT_FL_EPROBE))

Maybe it is better to define a bitmask for it instead of testing
different bit many times.

[...]
> +static int __trace_eprobe_create(int argc, const char *argv[])
> +{
> +	/*
> +	 * Argument syntax:
> +	 *      e[:[GRP/]ENAME] SYSTEM.EVENT [FETCHARGS]
> +	 * Fetch args:
> +	 *  <name>=$<field>[:TYPE]
> +	 */
> +	const char *event = NULL, *group = EPROBE_EVENT_SYSTEM;
> +	const char *sys_event = NULL, *sys_name = NULL;
> +	struct trace_event_call *event_call;
> +	struct trace_eprobe *ep = NULL;
> +	char buf1[MAX_EVENT_NAME_LEN];
> +	char buf2[MAX_EVENT_NAME_LEN];
> +	int ret = 0;
> +	int i;
> +
> +	if (argc < 2)
> +		return -ECANCELED;
> +
> +	trace_probe_log_init("event_probe", argc, argv);
> +
> +	event = strchr(&argv[0][1], ':');
> +	if (event) {
> +		event++;
> +		ret = traceprobe_parse_event_name(&event, &group, buf1,
> +						  event - argv[0]);
> +		if (ret)
> +			goto parse_error;
> +	} else {
> +		strscpy(buf1, argv[1], MAX_EVENT_NAME_LEN);
> +		sanitize_event_name(buf1);
> +		event = buf1;
> +	}
> +	if (!is_good_name(event) || !is_good_name(group))
> +		goto parse_error;
> +
> +	sys_event = argv[1];
> +	ret = traceprobe_parse_event_name(&sys_event, &sys_name, buf2,
> +					  sys_event - argv[1]);
> +	if (ret || !sys_name)
> +		goto parse_error;
> +	if (!is_good_name(sys_event) || !is_good_name(sys_name))
> +		goto parse_error;
> +
> +	mutex_lock(&event_mutex);
> +	event_call = find_and_get_event(sys_name, sys_event);
> +	ep = alloc_event_probe(group, event, event_call, argc - 2);
> +	mutex_unlock(&event_mutex);
> +
> +	if (IS_ERR(ep)) {
> +		ret = PTR_ERR(ep);
> +		/* This must return -ENOMEM, else there is a bug */
> +		WARN_ON_ONCE(ret != -ENOMEM);
> +		goto error;	/* We know ep is not allocated */
> +	}
> +
> +	argc -= 2; argv += 2;
> +	/* parse arguments */
> +	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
> +		ret = trace_eprobe_tp_update_arg(ep, argv, i);
> +		if (ret)
> +			goto error;
> +	}
> +	ret = traceprobe_set_print_fmt(&ep->tp, PROBE_PRINT_EVENT);
> +	if (ret < 0)
> +		goto error;
> +	init_trace_eprobe_call(ep);
> +	mutex_lock(&event_mutex);

Here, you don't check the event name collision. Since the kprobe event
supports multiprobe event, it checks the collision by itself.
See register_trace_kprobe().
BTW, I found another issue on the name collision. Let me fix it.

> +	ret = trace_probe_register_event_call(&ep->tp);
> +	if (ret) {
> +		mutex_unlock(&event_mutex);
> +		goto error;
> +	}
> +	ret = dyn_event_add(&ep->devent, &ep->tp.event->call);
> +	mutex_unlock(&event_mutex);
> +	return ret;
> +parse_error:
> +	ret = -EINVAL;
> +error:
> +	trace_event_probe_cleanup(ep);
> +	return ret;
> +}

Thank you,
-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 7/7] tracing: Add a probe that attaches to trace events
  2021-08-19  2:29   ` Masami Hiramatsu
@ 2021-08-19  3:37     ` Steven Rostedt
  2021-08-19  3:56       ` Masami Hiramatsu
  0 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2021-08-19  3:37 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

On Thu, 19 Aug 2021 11:29:20 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> On Mon, 16 Aug 2021 23:43:02 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > From: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
> > 
> > A new dynamic event is introduced: event probe. The event is attached
> > to an existing tracepoint and uses its fields as arguments. The user
> > can specify custom format string of the new event, select what tracepoint
> > arguments will be printed and how to print them.
> > An event probe is created by writing configuration string in
> > 'dynamic_events' ftrace file:
> >  e:[SNAME/]ENAME SYSTEM/EVENT [FETCHARGS]	- Set an event probe  
> 
> Just a small comment. This ENAME is actually optional.
> 
> >  -:SNAME/ENAME					- Delete an event probe
> > 
> > Where:
> >  SNAME	- System name, if omitted 'eprobes' is used.
> >  ENAME	- Name of the new event in SNAME, mandatory.  
> 
> Ditto.

Will updated.

> See __trace_eprobe_create(), it generates the event name from
> the "SYSTEM_EVENT". (BTW, what happen if we have 2 different event
> probe on the same event with same name?)

I'll have to check. We should try to avoid that.

> 
> >  SYSTEM	- Name of the system, where the tracepoint is defined, mandatory.
> >  EVENT	- Name of the tracepoint event in SYSTEM, mandatory.
> >  FETCHARGS - Arguments:
> >   <name>=$<field>[:TYPE] - Fetch given filed of the tracepoint and print
> > 			   it as given TYPE with given name. Supported
> > 			   types are:
> > 	                    (u8/u16/u32/u64/s8/s16/s32/s64), basic type
> >         	            (x8/x16/x32/x64), hexadecimal types
> > 			    "string", "ustring" and bitfield.
> > 
> > Example, attach an event probe on openat system call and print name of the
> > file that will be opened:
> >  echo "e:esys/eopen syscalls/sys_enter_openat file=\$filename:string" >> dynamic_events
> > A new dynamic event is created in events/esys/eopen/ directory. It
> > can be deleted with:
> >  echo "-:esys/eopen" >> dynamic_events
> > 
> > Filters, triggers and histograms can be attached to the new event, it can
> > be matched in synthetic events. There is one limitation - an event probe
> > can not be attached to kprobe, uprobe or another event probe.
> > 
> > Link: https://lkml.kernel.org/r/20210812145805.2292326-1-tz.stoyanov@gmail.com
> > 
> > Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> > Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> > ---  
> [...]
> > +static struct trace_event_call *
> > +find_and_get_event(const char *system, const char *event_name)
> > +{
> > +	struct trace_event_call *tp_event;
> > +	const char *name;
> > +
> > +	list_for_each_entry(tp_event, &ftrace_events, list) {
> > +		/* Skip other probes and ftrace events */
> > +		if ((tp_event->flags & TRACE_EVENT_FL_IGNORE_ENABLE) ||
> > +		    (tp_event->flags & TRACE_EVENT_FL_KPROBE) ||
> > +		    (tp_event->flags & TRACE_EVENT_FL_UPROBE) ||
> > +		    (tp_event->flags & TRACE_EVENT_FL_EPROBE))  
> 
> Maybe it is better to define a bitmask for it instead of testing
> different bit many times.

Doh! Of course!

> 
> [...]
> > +static int __trace_eprobe_create(int argc, const char *argv[])
> > +{
> > +	/*
> > +	 * Argument syntax:
> > +	 *      e[:[GRP/]ENAME] SYSTEM.EVENT [FETCHARGS]
> > +	 * Fetch args:
> > +	 *  <name>=$<field>[:TYPE]
> > +	 */
> > +	const char *event = NULL, *group = EPROBE_EVENT_SYSTEM;
> > +	const char *sys_event = NULL, *sys_name = NULL;
> > +	struct trace_event_call *event_call;
> > +	struct trace_eprobe *ep = NULL;
> > +	char buf1[MAX_EVENT_NAME_LEN];
> > +	char buf2[MAX_EVENT_NAME_LEN];
> > +	int ret = 0;
> > +	int i;
> > +
> > +	if (argc < 2)
> > +		return -ECANCELED;
> > +
> > +	trace_probe_log_init("event_probe", argc, argv);
> > +
> > +	event = strchr(&argv[0][1], ':');
> > +	if (event) {
> > +		event++;
> > +		ret = traceprobe_parse_event_name(&event, &group, buf1,
> > +						  event - argv[0]);
> > +		if (ret)
> > +			goto parse_error;
> > +	} else {
> > +		strscpy(buf1, argv[1], MAX_EVENT_NAME_LEN);
> > +		sanitize_event_name(buf1);
> > +		event = buf1;
> > +	}
> > +	if (!is_good_name(event) || !is_good_name(group))
> > +		goto parse_error;
> > +
> > +	sys_event = argv[1];
> > +	ret = traceprobe_parse_event_name(&sys_event, &sys_name, buf2,
> > +					  sys_event - argv[1]);
> > +	if (ret || !sys_name)
> > +		goto parse_error;
> > +	if (!is_good_name(sys_event) || !is_good_name(sys_name))
> > +		goto parse_error;
> > +
> > +	mutex_lock(&event_mutex);
> > +	event_call = find_and_get_event(sys_name, sys_event);
> > +	ep = alloc_event_probe(group, event, event_call, argc - 2);
> > +	mutex_unlock(&event_mutex);
> > +
> > +	if (IS_ERR(ep)) {
> > +		ret = PTR_ERR(ep);
> > +		/* This must return -ENOMEM, else there is a bug */
> > +		WARN_ON_ONCE(ret != -ENOMEM);
> > +		goto error;	/* We know ep is not allocated */
> > +	}
> > +
> > +	argc -= 2; argv += 2;
> > +	/* parse arguments */
> > +	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
> > +		ret = trace_eprobe_tp_update_arg(ep, argv, i);
> > +		if (ret)
> > +			goto error;
> > +	}
> > +	ret = traceprobe_set_print_fmt(&ep->tp, PROBE_PRINT_EVENT);
> > +	if (ret < 0)
> > +		goto error;
> > +	init_trace_eprobe_call(ep);
> > +	mutex_lock(&event_mutex);  
> 
> Here, you don't check the event name collision. Since the kprobe event
> supports multiprobe event, it checks the collision by itself.
> See register_trace_kprobe().
> BTW, I found another issue on the name collision. Let me fix it.

I'm thinking of adding this above before taking the event_mutex:

	/* Check if the name already exists */
	if (find_event_probe(group, event))
		return -EEXIST;

Where I have:

static bool find_event_probe(const char *group, const char *event)
{
	struct dyn_event *ev;
	struct trace_eprobe *ep;

	for_each_dyn_event(ev) {
		if (ev->ops != &eprobe_dyn_event_ops)
			continue;
		
		ep = to_trace_eprobe(ev);
		if (strcmp(ep->tp.event->class.system, group) == 0 &&
		    strcmp(ep->tp.event->call.name, event) == 0)
			return true;
	}
	return false;
}

Cheers,

-- Steve


> 
> > +	ret = trace_probe_register_event_call(&ep->tp);
> > +	if (ret) {
> > +		mutex_unlock(&event_mutex);
> > +		goto error;
> > +	}
> > +	ret = dyn_event_add(&ep->devent, &ep->tp.event->call);
> > +	mutex_unlock(&event_mutex);
> > +	return ret;
> > +parse_error:
> > +	ret = -EINVAL;
> > +error:
> > +	trace_event_probe_cleanup(ep);
> > +	return ret;
> > +}  
> 
> Thank you,


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 7/7] tracing: Add a probe that attaches to trace events
  2021-08-19  3:37     ` Steven Rostedt
@ 2021-08-19  3:56       ` Masami Hiramatsu
  2021-08-19  4:03         ` Steven Rostedt
  0 siblings, 1 reply; 20+ messages in thread
From: Masami Hiramatsu @ 2021-08-19  3:56 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

On Wed, 18 Aug 2021 23:37:57 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:


> > Here, you don't check the event name collision. Since the kprobe event
> > supports multiprobe event, it checks the collision by itself.
> > See register_trace_kprobe().
> > BTW, I found another issue on the name collision. Let me fix it.
> 
> I'm thinking of adding this above before taking the event_mutex:
> 
> 	/* Check if the name already exists */
> 	if (find_event_probe(group, event))
> 		return -EEXIST;
> 
> Where I have:
> 
> static bool find_event_probe(const char *group, const char *event)
> {
> 	struct dyn_event *ev;
> 	struct trace_eprobe *ep;
> 
> 	for_each_dyn_event(ev) {
> 		if (ev->ops != &eprobe_dyn_event_ops)
> 			continue;
> 		
> 		ep = to_trace_eprobe(ev);
> 		if (strcmp(ep->tp.event->class.system, group) == 0 &&
> 		    strcmp(ep->tp.event->call.name, event) == 0)
> 			return true;
> 	}
> 	return false;
> }

Yeah, but I think this should be done with event_mutex, shouldn't it?

Thank you,


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 7/7] tracing: Add a probe that attaches to trace events
  2021-08-19  3:56       ` Masami Hiramatsu
@ 2021-08-19  4:03         ` Steven Rostedt
  2021-08-19  4:10           ` Steven Rostedt
  2021-08-19  4:28           ` Masami Hiramatsu
  0 siblings, 2 replies; 20+ messages in thread
From: Steven Rostedt @ 2021-08-19  4:03 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

On Thu, 19 Aug 2021 12:56:52 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> > static bool find_event_probe(const char *group, const char *event)
> > {
> > 	struct dyn_event *ev;
> > 	struct trace_eprobe *ep;
> > 
> > 	for_each_dyn_event(ev) {
> > 		if (ev->ops != &eprobe_dyn_event_ops)
> > 			continue;
> > 		
> > 		ep = to_trace_eprobe(ev);
> > 		if (strcmp(ep->tp.event->class.system, group) == 0 &&
> > 		    strcmp(ep->tp.event->call.name, event) == 0)
> > 			return true;
> > 	}
> > 	return false;
> > }  
> 
> Yeah, but I think this should be done with event_mutex, shouldn't it?

Probably. I noticed that it was updated under the dyn_event_ops_mutex,
and thought that was enough protection. But I now see the lockdep
assert on the event_mutex in the other functions.

Is there ever a case where this list is updated without
dyn_event_ops_mutex held?

-- Steve

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 7/7] tracing: Add a probe that attaches to trace events
  2021-08-19  4:03         ` Steven Rostedt
@ 2021-08-19  4:10           ` Steven Rostedt
  2021-08-19  4:28           ` Masami Hiramatsu
  1 sibling, 0 replies; 20+ messages in thread
From: Steven Rostedt @ 2021-08-19  4:10 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

On Thu, 19 Aug 2021 00:03:42 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> Probably. I noticed that it was updated under the dyn_event_ops_mutex,
> and thought that was enough protection. But I now see the lockdep
> assert on the event_mutex in the other functions.

Anyway, here's the new version:

static bool find_event_probe(const char *group, const char *event)
{
	struct trace_eprobe *ep;
	struct dyn_event *ev;
	bool ret = false;

	/*
	 * Must grab the event_mutex to prevent the list from being modified
	 * by other probes. But the event_probe being only created via the
	 * dynamic_events file, is only added under the dyn_event_ops_mutex,
	 * which is currently held. There is no race between this check and
	 * adding the new probe.
	 */
	mutex_lock(&event_mutex);
	for_each_dyn_event(ev) {
		if (ev->ops != &eprobe_dyn_event_ops)
			continue;
		ep = to_trace_eprobe(ev);
		if (strcmp(ep->tp.event->class.system, group) == 0 &&
		    strcmp(ep->tp.event->call.name, event) == 0) {
			ret = true;
			break;
		}
	}
	mutex_lock(&event_mutex);

	return ret;
}


-- Steve

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 7/7] tracing: Add a probe that attaches to trace events
  2021-08-19  4:03         ` Steven Rostedt
  2021-08-19  4:10           ` Steven Rostedt
@ 2021-08-19  4:28           ` Masami Hiramatsu
  1 sibling, 0 replies; 20+ messages in thread
From: Masami Hiramatsu @ 2021-08-19  4:28 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Tzvetomir Stoyanov,
	Tom Zanussi, linux-trace-devel

On Thu, 19 Aug 2021 00:03:42 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Thu, 19 Aug 2021 12:56:52 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> > > static bool find_event_probe(const char *group, const char *event)
> > > {
> > > 	struct dyn_event *ev;
> > > 	struct trace_eprobe *ep;
> > > 
> > > 	for_each_dyn_event(ev) {
> > > 		if (ev->ops != &eprobe_dyn_event_ops)
> > > 			continue;
> > > 		
> > > 		ep = to_trace_eprobe(ev);
> > > 		if (strcmp(ep->tp.event->class.system, group) == 0 &&
> > > 		    strcmp(ep->tp.event->call.name, event) == 0)
> > > 			return true;
> > > 	}
> > > 	return false;
> > > }  
> > 
> > Yeah, but I think this should be done with event_mutex, shouldn't it?
> 
> Probably. I noticed that it was updated under the dyn_event_ops_mutex,
> and thought that was enough protection. But I now see the lockdep
> assert on the event_mutex in the other functions.
> 
> Is there ever a case where this list is updated without
> dyn_event_ops_mutex held?

dyn_event_ops_mutex is for the "dyn_event_ops_list" which manages
the list of "struct dyn_event_operations" (e.g. kprobe, uprobe, synthetic).

In kernel/trace/trace_dynevent.c, you can see,

/* Protected by event_mutex */
LIST_HEAD(dyn_event_list);

:)

Thank you,


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-08-19  4:28 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-17  3:42 [PATCH v6 0/7] tracing: Creation of event probe Steven Rostedt
2021-08-17  3:42 ` [PATCH v6 1/7] tracing: Add DYNAMIC flag for dynamic events Steven Rostedt
2021-08-18  8:15   ` Masami Hiramatsu
2021-08-17  3:42 ` [PATCH v6 2/7] tracing: Have dynamic events have a ref counter Steven Rostedt
2021-08-18 16:14   ` Masami Hiramatsu
2021-08-17  3:42 ` [PATCH v6 3/7] tracing/probe: Have traceprobe_parse_probe_arg() take a const arg Steven Rostedt
2021-08-18 16:16   ` Masami Hiramatsu
2021-08-17  3:42 ` [PATCH v6 4/7] tracing/probes: Allow for dot delimiter as well as slash for system names Steven Rostedt
2021-08-18 10:57   ` Masami Hiramatsu
2021-08-17  3:43 ` [PATCH v6 5/7] tracing/probes: Use struct_size() instead of defining custom macros Steven Rostedt
2021-08-18 11:08   ` Masami Hiramatsu
2021-08-17  3:43 ` [PATCH v6 6/7] tracing/probe: Change traceprobe_set_print_fmt() to take a type Steven Rostedt
2021-08-18 16:19   ` Masami Hiramatsu
2021-08-17  3:43 ` [PATCH v6 7/7] tracing: Add a probe that attaches to trace events Steven Rostedt
2021-08-19  2:29   ` Masami Hiramatsu
2021-08-19  3:37     ` Steven Rostedt
2021-08-19  3:56       ` Masami Hiramatsu
2021-08-19  4:03         ` Steven Rostedt
2021-08-19  4:10           ` Steven Rostedt
2021-08-19  4:28           ` Masami Hiramatsu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).