linux-kernel.vger.kernel.org archive mirror
* [for-next][PATCH 00/12] tracing: Updates for v5.18
@ 2022-03-12 23:25 Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 01/12] tracing: Fix allocation of last_cmd in last_cmd_set() Steven Rostedt
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
for-next

Head SHA1: bc47ee4844d6b7d7351536cd99d35848c4449689


Beau Belgrave (2):
      user_events: Fix potential uninitialized pointer while parsing field
      user_events: Prevent dyn_event delete racing with ioctl add/delete

Steven Rostedt (Google) (9):
      tracing: Fix allocation of last_cmd in last_cmd_set()
      tracing: Fix last_cmd_set() string management in histogram code
      tracing: Allow custom events to be added to the tracefs directory
      tracing: Add sample code for custom trace events
      tracing: Move the defines to create TRACE_EVENTS into their own files
      tracing: Add TRACE_CUSTOM_EVENT() macro
      tracing: Have TRACE_DEFINE_ENUM affect trace event types as well
      tracing: Add snapshot at end of kernel boot up
      tracing/user_events: Use alloc_pages instead of kzalloc() for register pages

Tom Zanussi (1):
      tracing: Fix strncpy warning in trace_events_synth.c

----
 Documentation/admin-guide/kernel-parameters.txt |   8 +
 include/linux/ftrace.h                          |  11 +-
 include/linux/trace_events.h                    |  24 +-
 include/trace/define_custom_trace.h             |  77 ++++
 include/trace/stages/init.h                     |  37 ++
 include/trace/stages/stage1_defines.h           |  45 +++
 include/trace/stages/stage2_defines.h           |  48 +++
 include/trace/stages/stage3_defines.h           | 129 ++++++
 include/trace/stages/stage4_defines.h           |  57 +++
 include/trace/stages/stage5_defines.h           |  83 ++++
 include/trace/stages/stage6_defines.h           |  86 ++++
 include/trace/stages/stage7_defines.h           |  34 ++
 include/trace/trace_custom_events.h             | 221 +++++++++++
 include/trace/trace_events.h                    | 499 +-----------------------
 kernel/trace/ftrace.c                           |   2 +
 kernel/trace/trace.c                            |  18 +
 kernel/trace/trace_events.c                     |  30 ++
 kernel/trace/trace_events_hist.c                |   9 +-
 kernel/trace/trace_events_synth.c               |   5 +-
 kernel/trace/trace_events_user.c                |  64 ++-
 samples/Kconfig                                 |   8 +-
 samples/Makefile                                |   1 +
 samples/trace_events/Makefile                   |   2 +
 samples/trace_events/trace_custom_sched.c       |  60 +++
 samples/trace_events/trace_custom_sched.h       |  95 +++++
 25 files changed, 1139 insertions(+), 514 deletions(-)
 create mode 100644 include/trace/define_custom_trace.h
 create mode 100644 include/trace/stages/init.h
 create mode 100644 include/trace/stages/stage1_defines.h
 create mode 100644 include/trace/stages/stage2_defines.h
 create mode 100644 include/trace/stages/stage3_defines.h
 create mode 100644 include/trace/stages/stage4_defines.h
 create mode 100644 include/trace/stages/stage5_defines.h
 create mode 100644 include/trace/stages/stage6_defines.h
 create mode 100644 include/trace/stages/stage7_defines.h
 create mode 100644 include/trace/trace_custom_events.h
 create mode 100644 samples/trace_events/trace_custom_sched.c
 create mode 100644 samples/trace_events/trace_custom_sched.h


* [for-next][PATCH 01/12] tracing: Fix allocation of last_cmd in last_cmd_set()
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 02/12] user_events: Fix potential uninitialized pointer while parsing field Steven Rostedt
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, kernel test robot

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

The length passed to strncat() in last_cmd_set() includes the nul byte of
the string being copied in, when it should only count the bytes of the
string being copied (not the nul byte). Subtract both the size of the
prefix and the nul byte from the allocated length, and pass the result
into strncat().

Also, assign "len" directly instead of initializing it to zero and then
having its first update be a "+=".
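
As a userspace illustration of the bound discussed above (a sketch, not the
kernel code): strncat(dest, src, n) appends at most n bytes from src and then
always writes a terminating nul, so the bound must leave one byte spare. The
helper name append_bounded() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append src to the string already in dest without
 * writing past dest_size bytes. The result is always nul terminated.
 */
static void append_bounded(char *dest, size_t dest_size, const char *src)
{
	size_t used = strlen(dest);	/* bytes in use, nul not counted */

	assert(used < dest_size);
	/* Reserve one byte for the nul that strncat() always adds. */
	strncat(dest, src, dest_size - used - 1);
}
```

With an 8-byte buffer already holding "hist:", only two more characters fit,
so appending "keys=pid" leaves "hist:ke" followed by the nul byte.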

Link: https://lore.kernel.org/all/202202140628.fj6e4w4v-lkp@intel.com/

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_hist.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 5e8970624bce..78788049f3d3 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -744,19 +744,20 @@ static void last_cmd_set(struct trace_event_file *file, char *str)
 {
 	const char *system = NULL, *name = NULL;
 	struct trace_event_call *call;
-	int len = 0;
+	int len;
 
 	if (!str)
 		return;
 
-	len += sizeof(HIST_PREFIX) + strlen(str) + 1;
+	len = sizeof(HIST_PREFIX) + strlen(str) + 1;
 	kfree(last_cmd);
 	last_cmd = kzalloc(len, GFP_KERNEL);
 	if (!last_cmd)
 		return;
 
 	strcpy(last_cmd, HIST_PREFIX);
-	strncat(last_cmd, str, len - sizeof(HIST_PREFIX));
+	len -= sizeof(HIST_PREFIX) + 1;
+	strncat(last_cmd, str, len);
 
 	if (file) {
 		call = file->event_call;
-- 
2.35.1


* [for-next][PATCH 02/12] user_events: Fix potential uninitialized pointer while parsing field
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 01/12] tracing: Fix allocation of last_cmd in last_cmd_set() Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 03/12] tracing: Fix last_cmd_set() string management in histogram code Steven Rostedt
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Beau Belgrave, Dan Carpenter

From: Beau Belgrave <beaub@linux.microsoft.com>

Initialize name to NULL by default so that no edge case can leave it
uninitialized. Add an explicit NULL check on name so that a field that
ends up without a name is rejected with -EINVAL.
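
The defensive pattern can be sketched in userspace as follows. This is a
simplified, hypothetical parser (parse_field() and its "type name" input
format are assumptions for illustration), not user_event_parse_field()
itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified field parser: splits "type name" in place.
 * It only demonstrates "default to NULL, then check explicitly".
 */
static int parse_field(char *field, const char **type, const char **name)
{
	char *sep;

	assert(field && type && name);

	/* Start from a known state so no path leaves these unset. */
	*type = NULL;
	*name = NULL;

	if (*field == '\0')
		return -EINVAL;

	*type = field;
	sep = strchr(field, ' ');
	if (sep) {
		*sep++ = '\0';
		if (*sep != '\0')
			*name = sep;
	}

	/* Explicit check: a type without a name is rejected. */
	if (!*name)
		return -EINVAL;

	return 0;
}
```

Because name is reset before parsing, the final check behaves the same no
matter which branch was taken while splitting the input.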

Link: https://lore.kernel.org/bpf/20220224105334.GA2248@kili/
Link: https://lore.kernel.org/linux-trace-devel/20220224181637.2129-1-beaub@linux.microsoft.com

Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_user.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index 2b5e9fdb63a0..9a6191a6a786 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -362,6 +362,8 @@ static int user_event_parse_field(char *field, struct user_event *user,
 	*field++ = '\0';
 	depth++;
 parse:
+	name = NULL;
+
 	while ((part = strsep(&field, " ")) != NULL) {
 		switch (depth++) {
 		case FIELD_DEPTH_TYPE:
@@ -382,7 +384,7 @@ static int user_event_parse_field(char *field, struct user_event *user,
 		}
 	}
 
-	if (depth < FIELD_DEPTH_SIZE)
+	if (depth < FIELD_DEPTH_SIZE || !name)
 		return -EINVAL;
 
 	if (depth == FIELD_DEPTH_SIZE)
-- 
2.35.1


* [for-next][PATCH 03/12] tracing: Fix last_cmd_set() string management in histogram code
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 01/12] tracing: Fix allocation of last_cmd in last_cmd_set() Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 02/12] user_events: Fix potential uninitialized pointer while parsing field Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 04/12] tracing: Allow custom events to be added to the tracefs directory Steven Rostedt
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, kernel test robot

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Using strncat(dest, str, n) is confusing, as the size of dest must be at
least strlen(dest) + n + 1. Even more confusing, using sizeof(string
constant) gives you strlen(string constant) + 1 and not just strlen().

These two together made the calculations for strncat() with a constant
string a bit off, as we have:

	len = sizeof(HIST_PREFIX) + strlen(str) + 1;
	kfree(last_cmd);
	last_cmd = kzalloc(len, GFP_KERNEL);
	strcpy(last_cmd, HIST_PREFIX);
	len -= sizeof(HIST_PREFIX) + 1;
	strncat(last_cmd, str, len);

The above works if we s/sizeof/strlen/ with HIST_PREFIX (which is defined
as "hist:"), but because sizeof(HIST_PREFIX) is equal to
strlen(HIST_PREFIX) + 1, we can keep sizeof() and drop the +1 in the code
instead. But at least add comments saying that we are doing so.
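
The arithmetic can be checked with a small userspace stand-in. This is a
sketch only: build_last_cmd() is a hypothetical name, and calloc() stands in
for kzalloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HIST_PREFIX "hist:"

/* sizeof() on a string literal counts the terminating nul byte. */
static_assert(sizeof(HIST_PREFIX) == 6, "5 characters plus the nul byte");

/* Userspace stand-in for the allocation and copy in last_cmd_set(). */
static char *build_last_cmd(const char *str)
{
	/* sizeof() already supplies the one nul byte the result needs. */
	size_t len = sizeof(HIST_PREFIX) + strlen(str);
	char *cmd = calloc(1, len);

	if (!cmd)
		return NULL;

	strcpy(cmd, HIST_PREFIX);
	/* What remains is exactly the number of bytes strncat() may append. */
	len -= sizeof(HIST_PREFIX);
	assert(len == strlen(str));
	strncat(cmd, str, len);
	return cmd;
}
```

For "keys=pid" this allocates 6 + 8 = 14 bytes, which holds the 13
characters of "hist:keys=pid" plus one nul byte, with nothing left over.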

Link: https://lore.kernel.org/all/202203082112.Iu7tvFl4-lkp@intel.com/

Fixes: 9f8e5aee93ed2 ("tracing: Fix allocation of last_cmd in last_cmd_set()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_hist.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 78788049f3d3..954b19e2f196 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -749,14 +749,16 @@ static void last_cmd_set(struct trace_event_file *file, char *str)
 	if (!str)
 		return;
 
-	len = sizeof(HIST_PREFIX) + strlen(str) + 1;
+	/* sizeof() contains the nul byte */
+	len = sizeof(HIST_PREFIX) + strlen(str);
 	kfree(last_cmd);
 	last_cmd = kzalloc(len, GFP_KERNEL);
 	if (!last_cmd)
 		return;
 
 	strcpy(last_cmd, HIST_PREFIX);
-	len -= sizeof(HIST_PREFIX) + 1;
+	/* Again, sizeof() contains the nul byte */
+	len -= sizeof(HIST_PREFIX);
 	strncat(last_cmd, str, len);
 
 	if (file) {
-- 
2.35.1


* [for-next][PATCH 04/12] tracing: Allow custom events to be added to the tracefs directory
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
                   ` (2 preceding siblings ...)
  2022-03-12 23:25 ` [for-next][PATCH 03/12] tracing: Fix last_cmd_set() string management in histogram code Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 05/12] tracing: Add sample code for custom trace events Steven Rostedt
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Joel Fernandes, Peter Zijlstra,
	Masami Hiramatsu, Tom Zanussi

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Allow custom events to be added to the events directory in the tracefs
file system. For example, a module could be installed that attaches to an
event and wants to be enabled and disabled via the tracefs file system. It
would use trace_add_event_call() to add the event to the tracefs
directory, and trace_remove_event_call() to remove it.

Make both those functions EXPORT_SYMBOL_GPL().

Link: https://lkml.kernel.org/r/20220303220625.186988045@goodmis.org

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 3147614c1812..38afd66d80e3 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2758,6 +2758,7 @@ int trace_add_event_call(struct trace_event_call *call)
 	mutex_unlock(&trace_types_lock);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(trace_add_event_call);
 
 /*
  * Must be called under locking of trace_types_lock, event_mutex and
@@ -2819,6 +2820,7 @@ int trace_remove_event_call(struct trace_event_call *call)
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(trace_remove_event_call);
 
 #define for_each_event(event, start, end)			\
 	for (event = start;					\
-- 
2.35.1


* [for-next][PATCH 05/12] tracing: Add sample code for custom trace events
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
                   ` (3 preceding siblings ...)
  2022-03-12 23:25 ` [for-next][PATCH 04/12] tracing: Allow custom events to be added to the tracefs directory Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 06/12] tracing: Move the defines to create TRACE_EVENTS into their own files Steven Rostedt
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Masami Hiramatsu,
	Tom Zanussi, Joel Fernandes

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Add sample code to show how to create custom trace events in the tracefs
directory that can be enabled and modified like any event in tracefs
(including triggers, histograms, synthetic events and event probes).

The example creates a custom sched_switch and a custom sched_waking event
that limit what is recorded:

If the custom sched_switch only records the prev_prio, next_prio and
next_pid, it can bring the size from 64 bytes per event, down to just 16
bytes!

If sched_waking only records the prio and pid of the woken task, it will
bring the size down from 36 bytes to 12 bytes per event.

This will allow for a much smaller footprint in the ring buffer and keep
more events from being dropped.
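
The 16-byte figure for the custom sched_switch record can be checked with a
userspace mock-up. The layout of struct mock_trace_entry below is an
assumption standing in for the kernel's common struct trace_entry header
(taken to be 8 bytes), and the sizes hold on common ABIs where short is
2 bytes and int is 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for the kernel's struct trace_entry. The member list is an
 * assumption for illustration and is NOT the kernel definition.
 */
struct mock_trace_entry {
	unsigned short	type;
	unsigned char	flags;
	unsigned char	preempt_count;
	int		pid;
};

static_assert(sizeof(struct mock_trace_entry) == 8,
	      "sketch assumes an 8-byte common header");

/* Same payload members as struct sched_event in the sample module. */
struct mock_sched_event {
	struct mock_trace_entry	ent;
	unsigned short		prev_prio;
	unsigned short		next_prio;
	unsigned int		next_pid;
};

/* Bytes the custom event adds on top of the common header. */
static size_t sched_event_payload_size(void)
{
	return sizeof(struct mock_sched_event) -
	       sizeof(struct mock_trace_entry);
}
```

Under those assumptions the record is 8 bytes of header plus 8 bytes of
payload, with no padding between the members.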

Link: https://lkml.kernel.org/r/20220303220625.369226746@goodmis.org

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Suggested-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 samples/Kconfig                           |   8 +-
 samples/Makefile                          |   1 +
 samples/trace_events/Makefile             |   2 +
 samples/trace_events/trace_custom_sched.c | 271 ++++++++++++++++++++++
 4 files changed, 281 insertions(+), 1 deletion(-)
 create mode 100644 samples/trace_events/trace_custom_sched.c

diff --git a/samples/Kconfig b/samples/Kconfig
index 22cc921ae291..10e021c72282 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -14,7 +14,13 @@ config SAMPLE_TRACE_EVENTS
 	tristate "Build trace_events examples -- loadable modules only"
 	depends on EVENT_TRACING && m
 	help
-	  This build trace event example modules.
+	  This builds the trace event example module.
+
+config SAMPLE_TRACE_CUSTOM_EVENTS
+	tristate "Build custom trace event example -- loadable modules only"
+	depends on EVENT_TRACING && m
+	help
+	  This builds the custom trace event example module.
 
 config SAMPLE_TRACE_PRINTK
         tristate "Build trace_printk module - tests various trace_printk formats"
diff --git a/samples/Makefile b/samples/Makefile
index 1ae4de99c983..448343e8faeb 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_SAMPLE_RPMSG_CLIENT)	+= rpmsg/
 subdir-$(CONFIG_SAMPLE_SECCOMP)		+= seccomp
 subdir-$(CONFIG_SAMPLE_TIMER)		+= timers
 obj-$(CONFIG_SAMPLE_TRACE_EVENTS)	+= trace_events/
+obj-$(CONFIG_SAMPLE_TRACE_CUSTOM_EVENTS) += trace_events/
 obj-$(CONFIG_SAMPLE_TRACE_PRINTK)	+= trace_printk/
 obj-$(CONFIG_SAMPLE_FTRACE_DIRECT)	+= ftrace/
 obj-$(CONFIG_SAMPLE_FTRACE_DIRECT_MULTI) += ftrace/
diff --git a/samples/trace_events/Makefile b/samples/trace_events/Makefile
index b78344e7bbed..e98afc447fe1 100644
--- a/samples/trace_events/Makefile
+++ b/samples/trace_events/Makefile
@@ -13,3 +13,5 @@
 CFLAGS_trace-events-sample.o := -I$(src)
 
 obj-$(CONFIG_SAMPLE_TRACE_EVENTS) += trace-events-sample.o
+
+obj-$(CONFIG_SAMPLE_TRACE_CUSTOM_EVENTS) += trace_custom_sched.o
diff --git a/samples/trace_events/trace_custom_sched.c b/samples/trace_events/trace_custom_sched.c
new file mode 100644
index 000000000000..70a12c32ff99
--- /dev/null
+++ b/samples/trace_events/trace_custom_sched.c
@@ -0,0 +1,271 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * event tracer
+ *
+ * Copyright (C) 2022 Google Inc, Steven Rostedt <rostedt@goodmis.org>
+ */
+
+#define pr_fmt(fmt) fmt
+
+#include <linux/trace_events.h>
+#include <linux/version.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <trace/events/sched.h>
+
+#define THIS_SYSTEM "custom_sched"
+
+#define SCHED_PRINT_FMT							\
+	C("prev_prio=%d next_pid=%d next_prio=%d", REC->prev_prio, REC->next_pid, \
+	  REC->next_prio)
+
+#define SCHED_WAKING_FMT				\
+	C("pid=%d prio=%d", REC->pid, REC->prio)
+
+#undef C
+#define C(a, b...) a, b
+
+static struct trace_event_fields sched_switch_fields[] = {
+	{
+		.type = "unsigned short",
+		.name = "prev_prio",
+		.size = sizeof(short),
+		.align = __alignof__(short),
+		.is_signed = 0,
+		.filter_type = FILTER_OTHER,
+	},
+	{
+		.type = "unsigned short",
+		.name = "next_prio",
+		.size = sizeof(short),
+		.align = __alignof__(short),
+		.is_signed = 0,
+		.filter_type = FILTER_OTHER,
+	},
+	{
+		.type = "unsigned int",
+		.name = "next_pid",
+		.size = sizeof(int),
+		.align = __alignof__(int),
+		.is_signed = 0,
+		.filter_type = FILTER_OTHER,
+	},
+	{}
+};
+
+struct sched_event {
+	struct trace_entry	ent;
+	unsigned short		prev_prio;
+	unsigned short		next_prio;
+	unsigned int		next_pid;
+};
+
+static struct trace_event_fields sched_waking_fields[] = {
+	{
+		.type = "unsigned int",
+		.name = "pid",
+		.size = sizeof(int),
+		.align = __alignof__(int),
+		.is_signed = 0,
+		.filter_type = FILTER_OTHER,
+	},
+	{
+		.type = "unsigned short",
+		.name = "prio",
+		.size = sizeof(short),
+		.align = __alignof__(short),
+		.is_signed = 0,
+		.filter_type = FILTER_OTHER,
+	},
+	{}
+};
+
+struct wake_event {
+	struct trace_entry	ent;
+	unsigned int		pid;
+	unsigned short		prio;
+};
+
+static void sched_switch_probe(void *data, bool preempt, struct task_struct *prev,
+			       struct task_struct *next)
+{
+	struct trace_event_file *trace_file = data;
+	struct trace_event_buffer fbuffer;
+	struct sched_event *entry;
+
+	if (trace_trigger_soft_disabled(trace_file))
+		return;
+
+	entry = trace_event_buffer_reserve(&fbuffer, trace_file,
+					   sizeof(*entry));
+
+	if (!entry)
+		return;
+
+	entry->prev_prio = prev->prio;
+	entry->next_prio = next->prio;
+	entry->next_pid = next->pid;
+
+	trace_event_buffer_commit(&fbuffer);
+}
+
+static struct trace_event_class sched_switch_class = {
+	.system			= THIS_SYSTEM,
+	.reg			= trace_event_reg,
+	.fields_array		= sched_switch_fields,
+	.fields			= LIST_HEAD_INIT(sched_switch_class.fields),
+	.probe			= sched_switch_probe,
+};
+
+static void sched_waking_probe(void *data, struct task_struct *t)
+{
+	struct trace_event_file *trace_file = data;
+	struct trace_event_buffer fbuffer;
+	struct wake_event *entry;
+
+	if (trace_trigger_soft_disabled(trace_file))
+		return;
+
+	entry = trace_event_buffer_reserve(&fbuffer, trace_file,
+					   sizeof(*entry));
+
+	if (!entry)
+		return;
+
+	entry->prio = t->prio;
+	entry->pid = t->pid;
+
+	trace_event_buffer_commit(&fbuffer);
+}
+
+static struct trace_event_class sched_waking_class = {
+	.system			= THIS_SYSTEM,
+	.reg			= trace_event_reg,
+	.fields_array		= sched_waking_fields,
+	.fields			= LIST_HEAD_INIT(sched_waking_class.fields),
+	.probe			= sched_waking_probe,
+};
+
+static enum print_line_t sched_switch_output(struct trace_iterator *iter, int flags,
+					     struct trace_event *trace_event)
+{
+	struct trace_seq *s = &iter->seq;
+	struct sched_event *REC = (struct sched_event *)iter->ent;
+	int ret;
+
+	ret = trace_raw_output_prep(iter, trace_event);
+	if (ret != TRACE_TYPE_HANDLED)
+		return ret;
+
+	trace_seq_printf(s, SCHED_PRINT_FMT);
+	trace_seq_putc(s, '\n');
+
+	return trace_handle_return(s);
+}
+
+static struct trace_event_functions sched_switch_funcs = {
+	.trace			= sched_switch_output,
+};
+
+static enum print_line_t sched_waking_output(struct trace_iterator *iter, int flags,
+					     struct trace_event *trace_event)
+{
+	struct trace_seq *s = &iter->seq;
+	struct wake_event *REC = (struct wake_event *)iter->ent;
+	int ret;
+
+	ret = trace_raw_output_prep(iter, trace_event);
+	if (ret != TRACE_TYPE_HANDLED)
+		return ret;
+
+	trace_seq_printf(s, SCHED_WAKING_FMT);
+	trace_seq_putc(s, '\n');
+
+	return trace_handle_return(s);
+}
+
+static struct trace_event_functions sched_waking_funcs = {
+	.trace			= sched_waking_output,
+};
+
+#undef C
+#define C(a, b...) #a "," __stringify(b)
+
+static struct trace_event_call sched_switch_call = {
+	.class			= &sched_switch_class,
+	.event			= {
+		.funcs			= &sched_switch_funcs,
+	},
+	.print_fmt		= SCHED_PRINT_FMT,
+	.module			= THIS_MODULE,
+	.flags			= TRACE_EVENT_FL_TRACEPOINT,
+};
+
+static struct trace_event_call sched_waking_call = {
+	.class			= &sched_waking_class,
+	.event			= {
+		.funcs			= &sched_waking_funcs,
+	},
+	.print_fmt		= SCHED_WAKING_FMT,
+	.module			= THIS_MODULE,
+	.flags			= TRACE_EVENT_FL_TRACEPOINT,
+};
+
+static void fct(struct tracepoint *tp, void *priv)
+{
+	if (tp->name && strcmp(tp->name, "sched_switch") == 0)
+		sched_switch_call.tp = tp;
+	else if (tp->name && strcmp(tp->name, "sched_waking") == 0)
+		sched_waking_call.tp = tp;
+}
+
+static int add_event(struct trace_event_call *call)
+{
+	int ret;
+
+	ret = register_trace_event(&call->event);
+	if (WARN_ON(!ret))
+		return -ENODEV;
+
+	ret = trace_add_event_call(call);
+	if (WARN_ON(ret))
+		unregister_trace_event(&call->event);
+
+	return ret;
+}
+
+static int __init trace_sched_init(void)
+{
+	int ret;
+
+	check_trace_callback_type_sched_switch(sched_switch_probe);
+	check_trace_callback_type_sched_waking(sched_waking_probe);
+
+	for_each_kernel_tracepoint(fct, NULL);
+
+	ret = add_event(&sched_switch_call);
+	if (ret)
+		return ret;
+
+	ret = add_event(&sched_waking_call);
+	if (ret)
+		trace_remove_event_call(&sched_switch_call);
+
+	return ret;
+}
+
+static void __exit trace_sched_exit(void)
+{
+	trace_set_clr_event(THIS_SYSTEM, "sched_switch", 0);
+	trace_set_clr_event(THIS_SYSTEM, "sched_waking", 0);
+
+	trace_remove_event_call(&sched_switch_call);
+	trace_remove_event_call(&sched_waking_call);
+}
+
+module_init(trace_sched_init);
+module_exit(trace_sched_exit);
+
+MODULE_AUTHOR("Steven Rostedt");
+MODULE_DESCRIPTION("Custom scheduling events");
+MODULE_LICENSE("GPL");
-- 
2.35.1


* [for-next][PATCH 06/12] tracing: Move the defines to create TRACE_EVENTS into their own files
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
                   ` (4 preceding siblings ...)
  2022-03-12 23:25 ` [for-next][PATCH 05/12] tracing: Add sample code for custom trace events Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 07/12] tracing: Add TRACE_CUSTOM_EVENT() macro Steven Rostedt
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Joel Fernandes, Peter Zijlstra,
	Masami Hiramatsu, Tom Zanussi

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

In an effort to add custom event macros that can be used to create your
own custom events based on existing tracepoints, move the defines of the
special macros used in TRACE_EVENT() into their own files such that they
can be reused for TRACE_CUSTOM_EVENT().
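
The staging technique these headers rely on, redefining the same helper
macro between expansions of one field list, can be sketched in plain C. All
names below (DEMO_EVENT_FIELDS, DEMO_FIELD, struct demo_entry,
demo_find_field()) are hypothetical and much simpler than the real
TRACE_EVENT() machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event description, written once. */
#define DEMO_EVENT_FIELDS			\
	DEMO_FIELD(unsigned short, prio)	\
	DEMO_FIELD(unsigned int, pid)

/* Stage 1: expand each field into a structure member. */
#undef DEMO_FIELD
#define DEMO_FIELD(type, item)	type item;
struct demo_entry {
	DEMO_EVENT_FIELDS
};

/* Stage 2: expand the same list into a table describing each member. */
struct demo_field_desc {
	const char	*type;
	const char	*name;
	size_t		size;
	size_t		offset;
};

#undef DEMO_FIELD
#define DEMO_FIELD(type, item)					\
	{ #type, #item, sizeof(type), offsetof(struct demo_entry, item) },
static const struct demo_field_desc demo_fields[] = {
	DEMO_EVENT_FIELDS
};
#undef DEMO_FIELD

/* Look up a field description by name; returns NULL when absent. */
static const struct demo_field_desc *demo_find_field(const char *name)
{
	size_t i;

	assert(name);
	for (i = 0; i < sizeof(demo_fields) / sizeof(demo_fields[0]); i++) {
		if (strcmp(demo_fields[i].name, name) == 0)
			return &demo_fields[i];
	}
	return NULL;
}
```

Keeping each stage's definitions in its own header, as this patch does,
lets a second front end include the same stages without duplicating them.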

Link: https://lkml.kernel.org/r/20220303220625.553406495@goodmis.org

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 include/trace/stages/init.h           |  37 ++
 include/trace/stages/stage1_defines.h |  45 +++
 include/trace/stages/stage2_defines.h |  48 +++
 include/trace/stages/stage3_defines.h | 129 +++++++
 include/trace/stages/stage4_defines.h |  57 +++
 include/trace/stages/stage5_defines.h |  83 +++++
 include/trace/stages/stage6_defines.h |  86 +++++
 include/trace/stages/stage7_defines.h |  34 ++
 include/trace/trace_events.h          | 499 +-------------------------
 9 files changed, 527 insertions(+), 491 deletions(-)
 create mode 100644 include/trace/stages/init.h
 create mode 100644 include/trace/stages/stage1_defines.h
 create mode 100644 include/trace/stages/stage2_defines.h
 create mode 100644 include/trace/stages/stage3_defines.h
 create mode 100644 include/trace/stages/stage4_defines.h
 create mode 100644 include/trace/stages/stage5_defines.h
 create mode 100644 include/trace/stages/stage6_defines.h
 create mode 100644 include/trace/stages/stage7_defines.h

diff --git a/include/trace/stages/init.h b/include/trace/stages/init.h
new file mode 100644
index 000000000000..000bcfc8dd2e
--- /dev/null
+++ b/include/trace/stages/init.h
@@ -0,0 +1,37 @@
+
+#define __app__(x, y) str__##x##y
+#define __app(x, y) __app__(x, y)
+
+#define TRACE_SYSTEM_STRING __app(TRACE_SYSTEM_VAR,__trace_system_name)
+
+#define TRACE_MAKE_SYSTEM_STR()				\
+	static const char TRACE_SYSTEM_STRING[] =	\
+		__stringify(TRACE_SYSTEM)
+
+TRACE_MAKE_SYSTEM_STR();
+
+#undef TRACE_DEFINE_ENUM
+#define TRACE_DEFINE_ENUM(a)				\
+	static struct trace_eval_map __used __initdata	\
+	__##TRACE_SYSTEM##_##a =			\
+	{						\
+		.system = TRACE_SYSTEM_STRING,		\
+		.eval_string = #a,			\
+		.eval_value = a				\
+	};						\
+	static struct trace_eval_map __used		\
+	__section("_ftrace_eval_map")			\
+	*TRACE_SYSTEM##_##a = &__##TRACE_SYSTEM##_##a
+
+#undef TRACE_DEFINE_SIZEOF
+#define TRACE_DEFINE_SIZEOF(a)				\
+	static struct trace_eval_map __used __initdata	\
+	__##TRACE_SYSTEM##_##a =			\
+	{						\
+		.system = TRACE_SYSTEM_STRING,		\
+		.eval_string = "sizeof(" #a ")",	\
+		.eval_value = sizeof(a)			\
+	};						\
+	static struct trace_eval_map __used		\
+	__section("_ftrace_eval_map")			\
+	*TRACE_SYSTEM##_##a = &__##TRACE_SYSTEM##_##a
diff --git a/include/trace/stages/stage1_defines.h b/include/trace/stages/stage1_defines.h
new file mode 100644
index 000000000000..8ab88c766d2b
--- /dev/null
+++ b/include/trace/stages/stage1_defines.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Stage 1 definitions for creating trace events */
+
+#undef __field
+#define __field(type, item)		type	item;
+
+#undef __field_ext
+#define __field_ext(type, item, filter_type)	type	item;
+
+#undef __field_struct
+#define __field_struct(type, item)	type	item;
+
+#undef __field_struct_ext
+#define __field_struct_ext(type, item, filter_type)	type	item;
+
+#undef __array
+#define __array(type, item, len)	type	item[len];
+
+#undef __dynamic_array
+#define __dynamic_array(type, item, len) u32 __data_loc_##item;
+
+#undef __string
+#define __string(item, src) __dynamic_array(char, item, -1)
+
+#undef __string_len
+#define __string_len(item, src, len) __dynamic_array(char, item, -1)
+
+#undef __bitmask
+#define __bitmask(item, nr_bits) __dynamic_array(char, item, -1)
+
+#undef __rel_dynamic_array
+#define __rel_dynamic_array(type, item, len) u32 __rel_loc_##item;
+
+#undef __rel_string
+#define __rel_string(item, src) __rel_dynamic_array(char, item, -1)
+
+#undef __rel_string_len
+#define __rel_string_len(item, src, len) __rel_dynamic_array(char, item, -1)
+
+#undef __rel_bitmask
+#define __rel_bitmask(item, nr_bits) __rel_dynamic_array(char, item, -1)
+
+#undef TP_STRUCT__entry
+#define TP_STRUCT__entry(args...) args
diff --git a/include/trace/stages/stage2_defines.h b/include/trace/stages/stage2_defines.h
new file mode 100644
index 000000000000..9f2341df40da
--- /dev/null
+++ b/include/trace/stages/stage2_defines.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Stage 2 definitions for creating trace events */
+
+#undef TRACE_DEFINE_ENUM
+#define TRACE_DEFINE_ENUM(a)
+
+#undef TRACE_DEFINE_SIZEOF
+#define TRACE_DEFINE_SIZEOF(a)
+
+#undef __field
+#define __field(type, item)
+
+#undef __field_ext
+#define __field_ext(type, item, filter_type)
+
+#undef __field_struct
+#define __field_struct(type, item)
+
+#undef __field_struct_ext
+#define __field_struct_ext(type, item, filter_type)
+
+#undef __array
+#define __array(type, item, len)
+
+#undef __dynamic_array
+#define __dynamic_array(type, item, len)	u32 item;
+
+#undef __string
+#define __string(item, src) __dynamic_array(char, item, -1)
+
+#undef __bitmask
+#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
+
+#undef __string_len
+#define __string_len(item, src, len) __dynamic_array(char, item, -1)
+
+#undef __rel_dynamic_array
+#define __rel_dynamic_array(type, item, len)	u32 item;
+
+#undef __rel_string
+#define __rel_string(item, src) __rel_dynamic_array(char, item, -1)
+
+#undef __rel_string_len
+#define __rel_string_len(item, src, len) __rel_dynamic_array(char, item, -1)
+
+#undef __rel_bitmask
+#define __rel_bitmask(item, nr_bits) __rel_dynamic_array(unsigned long, item, -1)
diff --git a/include/trace/stages/stage3_defines.h b/include/trace/stages/stage3_defines.h
new file mode 100644
index 000000000000..0bc131993b7a
--- /dev/null
+++ b/include/trace/stages/stage3_defines.h
@@ -0,0 +1,129 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Stage 3 definitions for creating trace events */
+
+#undef __entry
+#define __entry field
+
+#undef TP_printk
+#define TP_printk(fmt, args...) fmt "\n", args
+
+#undef __get_dynamic_array
+#define __get_dynamic_array(field)	\
+		((void *)__entry + (__entry->__data_loc_##field & 0xffff))
+
+#undef __get_dynamic_array_len
+#define __get_dynamic_array_len(field)	\
+		((__entry->__data_loc_##field >> 16) & 0xffff)
+
+#undef __get_str
+#define __get_str(field) ((char *)__get_dynamic_array(field))
+
+#undef __get_rel_dynamic_array
+#define __get_rel_dynamic_array(field)					\
+		((void *)__entry + 					\
+		 offsetof(typeof(*__entry), __rel_loc_##field) +	\
+		 sizeof(__entry->__rel_loc_##field) +			\
+		 (__entry->__rel_loc_##field & 0xffff))
+
+#undef __get_rel_dynamic_array_len
+#define __get_rel_dynamic_array_len(field)	\
+		((__entry->__rel_loc_##field >> 16) & 0xffff)
+
+#undef __get_rel_str
+#define __get_rel_str(field) ((char *)__get_rel_dynamic_array(field))
+
+#undef __get_bitmask
+#define __get_bitmask(field)						\
+	({								\
+		void *__bitmask = __get_dynamic_array(field);		\
+		unsigned int __bitmask_size;				\
+		__bitmask_size = __get_dynamic_array_len(field);	\
+		trace_print_bitmask_seq(p, __bitmask, __bitmask_size);	\
+	})
+
+#undef __get_rel_bitmask
+#define __get_rel_bitmask(field)						\
+	({								\
+		void *__bitmask = __get_rel_dynamic_array(field);		\
+		unsigned int __bitmask_size;				\
+		__bitmask_size = __get_rel_dynamic_array_len(field);	\
+		trace_print_bitmask_seq(p, __bitmask, __bitmask_size);	\
+	})
+
+#undef __print_flags
+#define __print_flags(flag, delim, flag_array...)			\
+	({								\
+		static const struct trace_print_flags __flags[] =	\
+			{ flag_array, { -1, NULL }};			\
+		trace_print_flags_seq(p, delim, flag, __flags);	\
+	})
+
+#undef __print_symbolic
+#define __print_symbolic(value, symbol_array...)			\
+	({								\
+		static const struct trace_print_flags symbols[] =	\
+			{ symbol_array, { -1, NULL }};			\
+		trace_print_symbols_seq(p, value, symbols);		\
+	})
+
+#undef __print_flags_u64
+#undef __print_symbolic_u64
+#if BITS_PER_LONG == 32
+#define __print_flags_u64(flag, delim, flag_array...)			\
+	({								\
+		static const struct trace_print_flags_u64 __flags[] =	\
+			{ flag_array, { -1, NULL } };			\
+		trace_print_flags_seq_u64(p, delim, flag, __flags);	\
+	})
+
+#define __print_symbolic_u64(value, symbol_array...)			\
+	({								\
+		static const struct trace_print_flags_u64 symbols[] =	\
+			{ symbol_array, { -1, NULL } };			\
+		trace_print_symbols_seq_u64(p, value, symbols);	\
+	})
+#else
+#define __print_flags_u64(flag, delim, flag_array...)			\
+			__print_flags(flag, delim, flag_array)
+
+#define __print_symbolic_u64(value, symbol_array...)			\
+			__print_symbolic(value, symbol_array)
+#endif
+
+#undef __print_hex
+#define __print_hex(buf, buf_len)					\
+	trace_print_hex_seq(p, buf, buf_len, false)
+
+#undef __print_hex_str
+#define __print_hex_str(buf, buf_len)					\
+	trace_print_hex_seq(p, buf, buf_len, true)
+
+#undef __print_array
+#define __print_array(array, count, el_size)				\
+	({								\
+		BUILD_BUG_ON(el_size != 1 && el_size != 2 &&		\
+			     el_size != 4 && el_size != 8);		\
+		trace_print_array_seq(p, array, count, el_size);	\
+	})
+
+#undef __print_hex_dump
+#define __print_hex_dump(prefix_str, prefix_type,			\
+			 rowsize, groupsize, buf, len, ascii)		\
+	trace_print_hex_dump_seq(p, prefix_str, prefix_type,		\
+				 rowsize, groupsize, buf, len, ascii)
+
+#undef __print_ns_to_secs
+#define __print_ns_to_secs(value)			\
+	({						\
+		u64 ____val = (u64)(value);		\
+		do_div(____val, NSEC_PER_SEC);		\
+		____val;				\
+	})
+
+#undef __print_ns_without_secs
+#define __print_ns_without_secs(value)			\
+	({						\
+		u64 ____val = (u64)(value);		\
+		(u32) do_div(____val, NSEC_PER_SEC);	\
+	})
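
[Illustration, not part of the patch] The stage 3 accessors above decode a packed 32-bit location word: the low 16 bits hold an offset and the high 16 bits hold a length. For `__data_loc` the offset is counted from the start of the record; for `__rel_loc` it is counted from the end of the `__rel_loc_<item>` field itself. The following is a minimal userspace sketch of that encoding, assuming a made-up record layout. `struct demo_entry` and every `demo_*` helper are hypothetical names, not kernel API; the packing in `demo_fill()` loosely follows what the stage 5 defines in this patch compute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record; the kernel derives the real one from TP_STRUCT__entry(). */
struct demo_entry {
	uint32_t pid;
	uint32_t __data_loc_name;	/* offset from the start of the record */
	uint32_t __rel_loc_comm;	/* offset from the end of this field */
	char __data[32];		/* dynamic payload area */
};

/* Pack offset (low 16 bits) and length (high 16 bits) into one word. */
static uint32_t demo_pack_loc(size_t offset, size_t len)
{
	assert(offset <= 0xffff && len <= 0xffff);
	return (uint32_t)offset | ((uint32_t)len << 16);
}

/* Userspace equivalent of __get_dynamic_array(name). */
static void *demo_get_dynamic_array(struct demo_entry *e)
{
	return (char *)e + (e->__data_loc_name & 0xffff);
}

/* Userspace equivalent of __get_dynamic_array_len(name). */
static unsigned int demo_get_dynamic_array_len(const struct demo_entry *e)
{
	return (e->__data_loc_name >> 16) & 0xffff;
}

/* Userspace equivalent of __get_rel_dynamic_array(comm). */
static void *demo_get_rel_dynamic_array(struct demo_entry *e)
{
	return (char *)e + offsetof(struct demo_entry, __rel_loc_comm) +
	       sizeof(e->__rel_loc_comm) + (e->__rel_loc_comm & 0xffff);
}

/* Store two strings back to back in __data and record where they live. */
static void demo_fill(struct demo_entry *e, const char *name, const char *comm)
{
	size_t name_len = strlen(name) + 1;
	size_t comm_len = strlen(comm) + 1;
	size_t data_off = offsetof(struct demo_entry, __data);
	size_t rel_base = offsetof(struct demo_entry, __rel_loc_comm) +
			  sizeof(uint32_t);

	assert(name_len + comm_len <= sizeof(e->__data));
	memset(e, 0, sizeof(*e));
	e->__data_loc_name = demo_pack_loc(data_off, name_len);
	e->__rel_loc_comm = demo_pack_loc(data_off + name_len - rel_base,
					  comm_len);
	memcpy(demo_get_dynamic_array(e), name, name_len);
	memcpy(demo_get_rel_dynamic_array(e), comm, comm_len);
}
```

Calling `demo_fill(&e, "eth0", "bash")` and then reading back through the two accessors should return the original strings, with the second string starting right after the first one's terminating NUL inside `__data`.
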
diff --git a/include/trace/stages/stage4_defines.h b/include/trace/stages/stage4_defines.h
new file mode 100644
index 000000000000..780a10fa5279
--- /dev/null
+++ b/include/trace/stages/stage4_defines.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Stage 4 definitions for creating trace events */
+
+#undef __field_ext
+#define __field_ext(_type, _item, _filter_type) {			\
+	.type = #_type, .name = #_item,					\
+	.size = sizeof(_type), .align = __alignof__(_type),		\
+	.is_signed = is_signed_type(_type), .filter_type = _filter_type },
+
+#undef __field_struct_ext
+#define __field_struct_ext(_type, _item, _filter_type) {		\
+	.type = #_type, .name = #_item,					\
+	.size = sizeof(_type), .align = __alignof__(_type),		\
+	0, .filter_type = _filter_type },
+
+#undef __field
+#define __field(type, item)	__field_ext(type, item, FILTER_OTHER)
+
+#undef __field_struct
+#define __field_struct(type, item) __field_struct_ext(type, item, FILTER_OTHER)
+
+#undef __array
+#define __array(_type, _item, _len) {					\
+	.type = #_type"["__stringify(_len)"]", .name = #_item,		\
+	.size = sizeof(_type[_len]), .align = __alignof__(_type),	\
+	.is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER },
+
+#undef __dynamic_array
+#define __dynamic_array(_type, _item, _len) {				\
+	.type = "__data_loc " #_type "[]", .name = #_item,		\
+	.size = 4, .align = 4,						\
+	.is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER },
+
+#undef __string
+#define __string(item, src) __dynamic_array(char, item, -1)
+
+#undef __string_len
+#define __string_len(item, src, len) __dynamic_array(char, item, -1)
+
+#undef __bitmask
+#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
+
+#undef __rel_dynamic_array
+#define __rel_dynamic_array(_type, _item, _len) {			\
+	.type = "__rel_loc " #_type "[]", .name = #_item,		\
+	.size = 4, .align = 4,						\
+	.is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER },
+
+#undef __rel_string
+#define __rel_string(item, src) __rel_dynamic_array(char, item, -1)
+
+#undef __rel_string_len
+#define __rel_string_len(item, src, len) __rel_dynamic_array(char, item, -1)
+
+#undef __rel_bitmask
+#define __rel_bitmask(item, nr_bits) __rel_dynamic_array(unsigned long, item, -1)
diff --git a/include/trace/stages/stage5_defines.h b/include/trace/stages/stage5_defines.h
new file mode 100644
index 000000000000..fb15394aae31
--- /dev/null
+++ b/include/trace/stages/stage5_defines.h
@@ -0,0 +1,83 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Stage 5 definitions for creating trace events */
+
+/*
+ * remember the offset of each array from the beginning of the event.
+ */
+
+#undef __entry
+#define __entry entry
+
+#undef __field
+#define __field(type, item)
+
+#undef __field_ext
+#define __field_ext(type, item, filter_type)
+
+#undef __field_struct
+#define __field_struct(type, item)
+
+#undef __field_struct_ext
+#define __field_struct_ext(type, item, filter_type)
+
+#undef __array
+#define __array(type, item, len)
+
+#undef __dynamic_array
+#define __dynamic_array(type, item, len)				\
+	__item_length = (len) * sizeof(type);				\
+	__data_offsets->item = __data_size +				\
+			       offsetof(typeof(*entry), __data);	\
+	__data_offsets->item |= __item_length << 16;			\
+	__data_size += __item_length;
+
+#undef __string
+#define __string(item, src) __dynamic_array(char, item,			\
+		    strlen((src) ? (const char *)(src) : "(null)") + 1)
+
+#undef __string_len
+#define __string_len(item, src, len) __dynamic_array(char, item, (len) + 1)
+
+#undef __rel_dynamic_array
+#define __rel_dynamic_array(type, item, len)				\
+	__item_length = (len) * sizeof(type);				\
+	__data_offsets->item = __data_size +				\
+			       offsetof(typeof(*entry), __data) -	\
+			       offsetof(typeof(*entry), __rel_loc_##item) -	\
+			       sizeof(u32);				\
+	__data_offsets->item |= __item_length << 16;			\
+	__data_size += __item_length;
+
+#undef __rel_string
+#define __rel_string(item, src) __rel_dynamic_array(char, item,			\
+		    strlen((src) ? (const char *)(src) : "(null)") + 1)
+
+#undef __rel_string_len
+#define __rel_string_len(item, src, len) __rel_dynamic_array(char, item, (len) + 1)
+/*
+ * __bitmask_size_in_bytes_raw is the number of bytes needed to hold
+ * num_possible_cpus().
+ */
+#define __bitmask_size_in_bytes_raw(nr_bits)	\
+	(((nr_bits) + 7) / 8)
+
+#define __bitmask_size_in_longs(nr_bits)			\
+	((__bitmask_size_in_bytes_raw(nr_bits) +		\
+	  ((BITS_PER_LONG / 8) - 1)) / (BITS_PER_LONG / 8))
+
+/*
+ * __bitmask_size_in_bytes is the number of bytes needed to hold
+ * num_possible_cpus() padded out to the nearest long. This is what
+ * is saved in the buffer, just to be consistent.
+ */
+#define __bitmask_size_in_bytes(nr_bits)				\
+	(__bitmask_size_in_longs(nr_bits) * (BITS_PER_LONG / 8))
+
+#undef __bitmask
+#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item,	\
+					 __bitmask_size_in_longs(nr_bits))
+
+#undef __rel_bitmask
+#define __rel_bitmask(item, nr_bits) __rel_dynamic_array(unsigned long, item,	\
+					 __bitmask_size_in_longs(nr_bits))
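
[Illustration, not part of the patch] The three `__bitmask_size_in_*` macros above round a bit count up to whole bytes, then up to whole longs, so that the stored bitmask is always a multiple of `sizeof(long)`. A small userspace sketch of the same arithmetic follows. The `demo_*` names and `DEMO_BITS_PER_LONG` are assumptions standing in for the kernel macros and `BITS_PER_LONG`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stand-in for the kernel's BITS_PER_LONG. */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed to hold nr_bits, rounded up to a whole byte. */
static size_t demo_bitmask_size_in_bytes_raw(size_t nr_bits)
{
	return (nr_bits + 7) / 8;
}

/* Longs needed to hold nr_bits, rounded up to a whole long. */
static size_t demo_bitmask_size_in_longs(size_t nr_bits)
{
	size_t bytes_per_long = DEMO_BITS_PER_LONG / 8;

	return (demo_bitmask_size_in_bytes_raw(nr_bits) +
		bytes_per_long - 1) / bytes_per_long;
}

/* Bytes actually stored: the long count converted back to bytes. */
static size_t demo_bitmask_size_in_bytes(size_t nr_bits)
{
	size_t bytes = demo_bitmask_size_in_longs(nr_bits) *
		       (DEMO_BITS_PER_LONG / 8);

	/* Padding can only grow the size, never shrink it. */
	assert(bytes >= demo_bitmask_size_in_bytes_raw(nr_bits));
	return bytes;
}
```

For example, 65 bits need 9 raw bytes; with 8-byte longs that rounds up to 2 longs, so 16 bytes are reserved in the event.
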
diff --git a/include/trace/stages/stage6_defines.h b/include/trace/stages/stage6_defines.h
new file mode 100644
index 000000000000..b3a1f26026be
--- /dev/null
+++ b/include/trace/stages/stage6_defines.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Stage 6 definitions for creating trace events */
+
+#undef __entry
+#define __entry entry
+
+#undef __field
+#define __field(type, item)
+
+#undef __field_struct
+#define __field_struct(type, item)
+
+#undef __array
+#define __array(type, item, len)
+
+#undef __dynamic_array
+#define __dynamic_array(type, item, len)				\
+	__entry->__data_loc_##item = __data_offsets.item;
+
+#undef __string
+#define __string(item, src) __dynamic_array(char, item, -1)
+
+#undef __string_len
+#define __string_len(item, src, len) __dynamic_array(char, item, -1)
+
+#undef __assign_str
+#define __assign_str(dst, src)						\
+	strcpy(__get_str(dst), (src) ? (const char *)(src) : "(null)");
+
+#undef __assign_str_len
+#define __assign_str_len(dst, src, len)					\
+	do {								\
+		memcpy(__get_str(dst), (src), (len));			\
+		__get_str(dst)[len] = '\0';				\
+	} while(0)
+
+#undef __bitmask
+#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
+
+#undef __get_bitmask
+#define __get_bitmask(field) (char *)__get_dynamic_array(field)
+
+#undef __assign_bitmask
+#define __assign_bitmask(dst, src, nr_bits)					\
+	memcpy(__get_bitmask(dst), (src), __bitmask_size_in_bytes(nr_bits))
+
+#undef __rel_dynamic_array
+#define __rel_dynamic_array(type, item, len)				\
+	__entry->__rel_loc_##item = __data_offsets.item;
+
+#undef __rel_string
+#define __rel_string(item, src) __rel_dynamic_array(char, item, -1)
+
+#undef __rel_string_len
+#define __rel_string_len(item, src, len) __rel_dynamic_array(char, item, -1)
+
+#undef __assign_rel_str
+#define __assign_rel_str(dst, src)					\
+	strcpy(__get_rel_str(dst), (src) ? (const char *)(src) : "(null)");
+
+#undef __assign_rel_str_len
+#define __assign_rel_str_len(dst, src, len)				\
+	do {								\
+		memcpy(__get_rel_str(dst), (src), (len));		\
+		__get_rel_str(dst)[len] = '\0';				\
+	} while (0)
+
+#undef __rel_bitmask
+#define __rel_bitmask(item, nr_bits) __rel_dynamic_array(unsigned long, item, -1)
+
+#undef __get_rel_bitmask
+#define __get_rel_bitmask(field) (char *)__get_rel_dynamic_array(field)
+
+#undef __assign_rel_bitmask
+#define __assign_rel_bitmask(dst, src, nr_bits)					\
+	memcpy(__get_rel_bitmask(dst), (src), __bitmask_size_in_bytes(nr_bits))
+
+#undef TP_fast_assign
+#define TP_fast_assign(args...) args
+
+#undef __perf_count
+#define __perf_count(c)	(c)
+
+#undef __perf_task
+#define __perf_task(t)	(t)
diff --git a/include/trace/stages/stage7_defines.h b/include/trace/stages/stage7_defines.h
new file mode 100644
index 000000000000..d65445328f18
--- /dev/null
+++ b/include/trace/stages/stage7_defines.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Stage 7 definitions for creating trace events */
+
+#undef __entry
+#define __entry REC
+
+#undef __print_flags
+#undef __print_symbolic
+#undef __print_hex
+#undef __print_hex_str
+#undef __get_dynamic_array
+#undef __get_dynamic_array_len
+#undef __get_str
+#undef __get_bitmask
+#undef __get_rel_dynamic_array
+#undef __get_rel_dynamic_array_len
+#undef __get_rel_str
+#undef __get_rel_bitmask
+#undef __print_array
+#undef __print_hex_dump
+
+/*
+ * The below is not executed in the kernel. It is only what is
+ * displayed in the print format for userspace to parse.
+ */
+#undef __print_ns_to_secs
+#define __print_ns_to_secs(val) (val) / 1000000000UL
+
+#undef __print_ns_without_secs
+#define __print_ns_without_secs(val) (val) % 1000000000UL
+
+#undef TP_printk
+#define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)
diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
index 3d29919045af..8a8cd66cc6d5 100644
--- a/include/trace/trace_events.h
+++ b/include/trace/trace_events.h
@@ -24,42 +24,7 @@
 #define TRACE_SYSTEM_VAR TRACE_SYSTEM
 #endif
 
-#define __app__(x, y) str__##x##y
-#define __app(x, y) __app__(x, y)
-
-#define TRACE_SYSTEM_STRING __app(TRACE_SYSTEM_VAR,__trace_system_name)
-
-#define TRACE_MAKE_SYSTEM_STR()				\
-	static const char TRACE_SYSTEM_STRING[] =	\
-		__stringify(TRACE_SYSTEM)
-
-TRACE_MAKE_SYSTEM_STR();
-
-#undef TRACE_DEFINE_ENUM
-#define TRACE_DEFINE_ENUM(a)				\
-	static struct trace_eval_map __used __initdata	\
-	__##TRACE_SYSTEM##_##a =			\
-	{						\
-		.system = TRACE_SYSTEM_STRING,		\
-		.eval_string = #a,			\
-		.eval_value = a				\
-	};						\
-	static struct trace_eval_map __used		\
-	__section("_ftrace_eval_map")			\
-	*TRACE_SYSTEM##_##a = &__##TRACE_SYSTEM##_##a
-
-#undef TRACE_DEFINE_SIZEOF
-#define TRACE_DEFINE_SIZEOF(a)				\
-	static struct trace_eval_map __used __initdata	\
-	__##TRACE_SYSTEM##_##a =			\
-	{						\
-		.system = TRACE_SYSTEM_STRING,		\
-		.eval_string = "sizeof(" #a ")",	\
-		.eval_value = sizeof(a)			\
-	};						\
-	static struct trace_eval_map __used		\
-	__section("_ftrace_eval_map")			\
-	*TRACE_SYSTEM##_##a = &__##TRACE_SYSTEM##_##a
+#include "stages/init.h"
 
 /*
  * DECLARE_EVENT_CLASS can be used to add a generic function
@@ -80,48 +45,7 @@ TRACE_MAKE_SYSTEM_STR();
 			     PARAMS(print));		       \
 	DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
 
-
-#undef __field
-#define __field(type, item)		type	item;
-
-#undef __field_ext
-#define __field_ext(type, item, filter_type)	type	item;
-
-#undef __field_struct
-#define __field_struct(type, item)	type	item;
-
-#undef __field_struct_ext
-#define __field_struct_ext(type, item, filter_type)	type	item;
-
-#undef __array
-#define __array(type, item, len)	type	item[len];
-
-#undef __dynamic_array
-#define __dynamic_array(type, item, len) u32 __data_loc_##item;
-
-#undef __string
-#define __string(item, src) __dynamic_array(char, item, -1)
-
-#undef __string_len
-#define __string_len(item, src, len) __dynamic_array(char, item, -1)
-
-#undef __bitmask
-#define __bitmask(item, nr_bits) __dynamic_array(char, item, -1)
-
-#undef __rel_dynamic_array
-#define __rel_dynamic_array(type, item, len) u32 __rel_loc_##item;
-
-#undef __rel_string
-#define __rel_string(item, src) __rel_dynamic_array(char, item, -1)
-
-#undef __rel_string_len
-#define __rel_string_len(item, src, len) __rel_dynamic_array(char, item, -1)
-
-#undef __rel_bitmask
-#define __rel_bitmask(item, nr_bits) __rel_dynamic_array(char, item, -1)
-
-#undef TP_STRUCT__entry
-#define TP_STRUCT__entry(args...) args
+#include "stages/stage1_defines.h"
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(name, proto, args, tstruct, assign, print)	\
@@ -185,50 +109,7 @@ TRACE_MAKE_SYSTEM_STR();
  * The size of an array is also encoded, in the higher 16 bits of <item>.
  */
 
-#undef TRACE_DEFINE_ENUM
-#define TRACE_DEFINE_ENUM(a)
-
-#undef TRACE_DEFINE_SIZEOF
-#define TRACE_DEFINE_SIZEOF(a)
-
-#undef __field
-#define __field(type, item)
-
-#undef __field_ext
-#define __field_ext(type, item, filter_type)
-
-#undef __field_struct
-#define __field_struct(type, item)
-
-#undef __field_struct_ext
-#define __field_struct_ext(type, item, filter_type)
-
-#undef __array
-#define __array(type, item, len)
-
-#undef __dynamic_array
-#define __dynamic_array(type, item, len)	u32 item;
-
-#undef __string
-#define __string(item, src) __dynamic_array(char, item, -1)
-
-#undef __bitmask
-#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
-
-#undef __string_len
-#define __string_len(item, src, len) __dynamic_array(char, item, -1)
-
-#undef __rel_dynamic_array
-#define __rel_dynamic_array(type, item, len)	u32 item;
-
-#undef __rel_string
-#define __rel_string(item, src) __rel_dynamic_array(char, item, -1)
-
-#undef __rel_string_len
-#define __rel_string_len(item, src, len) __rel_dynamic_array(char, item, -1)
-
-#undef __rel_bitmask
-#define __rel_bitmask(item, nr_bits) __rel_dynamic_array(unsigned long, item, -1)
+#include "stages/stage2_defines.h"
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
@@ -300,131 +181,7 @@ TRACE_MAKE_SYSTEM_STR();
  * in binary.
  */
 
-#undef __entry
-#define __entry field
-
-#undef TP_printk
-#define TP_printk(fmt, args...) fmt "\n", args
-
-#undef __get_dynamic_array
-#define __get_dynamic_array(field)	\
-		((void *)__entry + (__entry->__data_loc_##field & 0xffff))
-
-#undef __get_dynamic_array_len
-#define __get_dynamic_array_len(field)	\
-		((__entry->__data_loc_##field >> 16) & 0xffff)
-
-#undef __get_str
-#define __get_str(field) ((char *)__get_dynamic_array(field))
-
-#undef __get_rel_dynamic_array
-#define __get_rel_dynamic_array(field)					\
-		((void *)__entry + 					\
-		 offsetof(typeof(*__entry), __rel_loc_##field) +	\
-		 sizeof(__entry->__rel_loc_##field) +			\
-		 (__entry->__rel_loc_##field & 0xffff))
-
-#undef __get_rel_dynamic_array_len
-#define __get_rel_dynamic_array_len(field)	\
-		((__entry->__rel_loc_##field >> 16) & 0xffff)
-
-#undef __get_rel_str
-#define __get_rel_str(field) ((char *)__get_rel_dynamic_array(field))
-
-#undef __get_bitmask
-#define __get_bitmask(field)						\
-	({								\
-		void *__bitmask = __get_dynamic_array(field);		\
-		unsigned int __bitmask_size;				\
-		__bitmask_size = __get_dynamic_array_len(field);	\
-		trace_print_bitmask_seq(p, __bitmask, __bitmask_size);	\
-	})
-
-#undef __get_rel_bitmask
-#define __get_rel_bitmask(field)						\
-	({								\
-		void *__bitmask = __get_rel_dynamic_array(field);		\
-		unsigned int __bitmask_size;				\
-		__bitmask_size = __get_rel_dynamic_array_len(field);	\
-		trace_print_bitmask_seq(p, __bitmask, __bitmask_size);	\
-	})
-
-#undef __print_flags
-#define __print_flags(flag, delim, flag_array...)			\
-	({								\
-		static const struct trace_print_flags __flags[] =	\
-			{ flag_array, { -1, NULL }};			\
-		trace_print_flags_seq(p, delim, flag, __flags);	\
-	})
-
-#undef __print_symbolic
-#define __print_symbolic(value, symbol_array...)			\
-	({								\
-		static const struct trace_print_flags symbols[] =	\
-			{ symbol_array, { -1, NULL }};			\
-		trace_print_symbols_seq(p, value, symbols);		\
-	})
-
-#undef __print_flags_u64
-#undef __print_symbolic_u64
-#if BITS_PER_LONG == 32
-#define __print_flags_u64(flag, delim, flag_array...)			\
-	({								\
-		static const struct trace_print_flags_u64 __flags[] =	\
-			{ flag_array, { -1, NULL } };			\
-		trace_print_flags_seq_u64(p, delim, flag, __flags);	\
-	})
-
-#define __print_symbolic_u64(value, symbol_array...)			\
-	({								\
-		static const struct trace_print_flags_u64 symbols[] =	\
-			{ symbol_array, { -1, NULL } };			\
-		trace_print_symbols_seq_u64(p, value, symbols);	\
-	})
-#else
-#define __print_flags_u64(flag, delim, flag_array...)			\
-			__print_flags(flag, delim, flag_array)
-
-#define __print_symbolic_u64(value, symbol_array...)			\
-			__print_symbolic(value, symbol_array)
-#endif
-
-#undef __print_hex
-#define __print_hex(buf, buf_len)					\
-	trace_print_hex_seq(p, buf, buf_len, false)
-
-#undef __print_hex_str
-#define __print_hex_str(buf, buf_len)					\
-	trace_print_hex_seq(p, buf, buf_len, true)
-
-#undef __print_array
-#define __print_array(array, count, el_size)				\
-	({								\
-		BUILD_BUG_ON(el_size != 1 && el_size != 2 &&		\
-			     el_size != 4 && el_size != 8);		\
-		trace_print_array_seq(p, array, count, el_size);	\
-	})
-
-#undef __print_hex_dump
-#define __print_hex_dump(prefix_str, prefix_type,			\
-			 rowsize, groupsize, buf, len, ascii)		\
-	trace_print_hex_dump_seq(p, prefix_str, prefix_type,		\
-				 rowsize, groupsize, buf, len, ascii)
-
-#undef __print_ns_to_secs
-#define __print_ns_to_secs(value)			\
-	({						\
-		u64 ____val = (u64)(value);		\
-		do_div(____val, NSEC_PER_SEC);		\
-		____val;				\
-	})
-
-#undef __print_ns_without_secs
-#define __print_ns_without_secs(value)			\
-	({						\
-		u64 ____val = (u64)(value);		\
-		(u32) do_div(____val, NSEC_PER_SEC);	\
-	})
+#include "stages/stage3_defines.h"
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
@@ -479,59 +236,7 @@ static struct trace_event_functions trace_event_type_funcs_##call = {	\
 
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
-#undef __field_ext
-#define __field_ext(_type, _item, _filter_type) {			\
-	.type = #_type, .name = #_item,					\
-	.size = sizeof(_type), .align = __alignof__(_type),		\
-	.is_signed = is_signed_type(_type), .filter_type = _filter_type },
-
-#undef __field_struct_ext
-#define __field_struct_ext(_type, _item, _filter_type) {		\
-	.type = #_type, .name = #_item,					\
-	.size = sizeof(_type), .align = __alignof__(_type),		\
-	0, .filter_type = _filter_type },
-
-#undef __field
-#define __field(type, item)	__field_ext(type, item, FILTER_OTHER)
-
-#undef __field_struct
-#define __field_struct(type, item) __field_struct_ext(type, item, FILTER_OTHER)
-
-#undef __array
-#define __array(_type, _item, _len) {					\
-	.type = #_type"["__stringify(_len)"]", .name = #_item,		\
-	.size = sizeof(_type[_len]), .align = __alignof__(_type),	\
-	.is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER },
-
-#undef __dynamic_array
-#define __dynamic_array(_type, _item, _len) {				\
-	.type = "__data_loc " #_type "[]", .name = #_item,		\
-	.size = 4, .align = 4,						\
-	.is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER },
-
-#undef __string
-#define __string(item, src) __dynamic_array(char, item, -1)
-
-#undef __string_len
-#define __string_len(item, src, len) __dynamic_array(char, item, -1)
-
-#undef __bitmask
-#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
-
-#undef __rel_dynamic_array
-#define __rel_dynamic_array(_type, _item, _len) {			\
-	.type = "__rel_loc " #_type "[]", .name = #_item,		\
-	.size = 4, .align = 4,						\
-	.is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER },
-
-#undef __rel_string
-#define __rel_string(item, src) __rel_dynamic_array(char, item, -1)
-
-#undef __rel_string_len
-#define __rel_string_len(item, src, len) __rel_dynamic_array(char, item, -1)
-
-#undef __rel_bitmask
-#define __rel_bitmask(item, nr_bits) __rel_dynamic_array(unsigned long, item, -1)
+#include "stages/stage4_defines.h"
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, func, print)	\
@@ -544,85 +249,7 @@ static struct trace_event_fields trace_event_fields_##call[] = {	\
 
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
-/*
- * remember the offset of each array from the beginning of the event.
- */
-
-#undef __entry
-#define __entry entry
-
-#undef __field
-#define __field(type, item)
-
-#undef __field_ext
-#define __field_ext(type, item, filter_type)
-
-#undef __field_struct
-#define __field_struct(type, item)
-
-#undef __field_struct_ext
-#define __field_struct_ext(type, item, filter_type)
-
-#undef __array
-#define __array(type, item, len)
-
-#undef __dynamic_array
-#define __dynamic_array(type, item, len)				\
-	__item_length = (len) * sizeof(type);				\
-	__data_offsets->item = __data_size +				\
-			       offsetof(typeof(*entry), __data);	\
-	__data_offsets->item |= __item_length << 16;			\
-	__data_size += __item_length;
-
-#undef __string
-#define __string(item, src) __dynamic_array(char, item,			\
-		    strlen((src) ? (const char *)(src) : "(null)") + 1)
-
-#undef __string_len
-#define __string_len(item, src, len) __dynamic_array(char, item, (len) + 1)
-
-#undef __rel_dynamic_array
-#define __rel_dynamic_array(type, item, len)				\
-	__item_length = (len) * sizeof(type);				\
-	__data_offsets->item = __data_size +				\
-			       offsetof(typeof(*entry), __data) -	\
-			       offsetof(typeof(*entry), __rel_loc_##item) -	\
-			       sizeof(u32);				\
-	__data_offsets->item |= __item_length << 16;			\
-	__data_size += __item_length;
-
-#undef __rel_string
-#define __rel_string(item, src) __rel_dynamic_array(char, item,			\
-		    strlen((src) ? (const char *)(src) : "(null)") + 1)
-
-#undef __rel_string_len
-#define __rel_string_len(item, src, len) __rel_dynamic_array(char, item, (len) + 1)
-/*
- * __bitmask_size_in_bytes_raw is the number of bytes needed to hold
- * num_possible_cpus().
- */
-#define __bitmask_size_in_bytes_raw(nr_bits)	\
-	(((nr_bits) + 7) / 8)
-
-#define __bitmask_size_in_longs(nr_bits)			\
-	((__bitmask_size_in_bytes_raw(nr_bits) +		\
-	  ((BITS_PER_LONG / 8) - 1)) / (BITS_PER_LONG / 8))
-
-/*
- * __bitmask_size_in_bytes is the number of bytes needed to hold
- * num_possible_cpus() padded out to the nearest long. This is what
- * is saved in the buffer, just to be consistent.
- */
-#define __bitmask_size_in_bytes(nr_bits)				\
-	(__bitmask_size_in_longs(nr_bits) * (BITS_PER_LONG / 8))
-
-#undef __bitmask
-#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item,	\
-					 __bitmask_size_in_longs(nr_bits))
-
-#undef __rel_bitmask
-#define __rel_bitmask(item, nr_bits) __rel_dynamic_array(unsigned long, item,	\
-					 __bitmask_size_in_longs(nr_bits))
+#include "stages/stage5_defines.h"
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
@@ -745,88 +372,7 @@ static inline notrace int trace_event_get_offsets_##call(		\
 #define _TRACE_PERF_INIT(call)
 #endif /* CONFIG_PERF_EVENTS */
 
-#undef __entry
-#define __entry entry
-
-#undef __field
-#define __field(type, item)
-
-#undef __field_struct
-#define __field_struct(type, item)
-
-#undef __array
-#define __array(type, item, len)
-
-#undef __dynamic_array
-#define __dynamic_array(type, item, len)				\
-	__entry->__data_loc_##item = __data_offsets.item;
-
-#undef __string
-#define __string(item, src) __dynamic_array(char, item, -1)
-
-#undef __string_len
-#define __string_len(item, src, len) __dynamic_array(char, item, -1)
-
-#undef __assign_str
-#define __assign_str(dst, src)						\
-	strcpy(__get_str(dst), (src) ? (const char *)(src) : "(null)");
-
-#undef __assign_str_len
-#define __assign_str_len(dst, src, len)					\
-	do {								\
-		memcpy(__get_str(dst), (src), (len));			\
-		__get_str(dst)[len] = '\0';				\
-	} while(0)
-
-#undef __bitmask
-#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
-
-#undef __get_bitmask
-#define __get_bitmask(field) (char *)__get_dynamic_array(field)
-
-#undef __assign_bitmask
-#define __assign_bitmask(dst, src, nr_bits)					\
-	memcpy(__get_bitmask(dst), (src), __bitmask_size_in_bytes(nr_bits))
-
-#undef __rel_dynamic_array
-#define __rel_dynamic_array(type, item, len)				\
-	__entry->__rel_loc_##item = __data_offsets.item;
-
-#undef __rel_string
-#define __rel_string(item, src) __rel_dynamic_array(char, item, -1)
-
-#undef __rel_string_len
-#define __rel_string_len(item, src, len) __rel_dynamic_array(char, item, -1)
-
-#undef __assign_rel_str
-#define __assign_rel_str(dst, src)					\
-	strcpy(__get_rel_str(dst), (src) ? (const char *)(src) : "(null)");
-
-#undef __assign_rel_str_len
-#define __assign_rel_str_len(dst, src, len)				\
-	do {								\
-		memcpy(__get_rel_str(dst), (src), (len));		\
-		__get_rel_str(dst)[len] = '\0';				\
-	} while (0)
-
-#undef __rel_bitmask
-#define __rel_bitmask(item, nr_bits) __rel_dynamic_array(unsigned long, item, -1)
-
-#undef __get_rel_bitmask
-#define __get_rel_bitmask(field) (char *)__get_rel_dynamic_array(field)
-
-#undef __assign_rel_bitmask
-#define __assign_rel_bitmask(dst, src, nr_bits)					\
-	memcpy(__get_rel_bitmask(dst), (src), __bitmask_size_in_bytes(nr_bits))
-
-#undef TP_fast_assign
-#define TP_fast_assign(args...) args
-
-#undef __perf_count
-#define __perf_count(c)	(c)
-
-#undef __perf_task
-#define __perf_task(t)	(t)
+#include "stages/stage6_defines.h"
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
@@ -872,36 +418,7 @@ static inline void ftrace_test_probe_##call(void)			\
 
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
-#undef __entry
-#define __entry REC
-
-#undef __print_flags
-#undef __print_symbolic
-#undef __print_hex
-#undef __print_hex_str
-#undef __get_dynamic_array
-#undef __get_dynamic_array_len
-#undef __get_str
-#undef __get_bitmask
-#undef __get_rel_dynamic_array
-#undef __get_rel_dynamic_array_len
-#undef __get_rel_str
-#undef __get_rel_bitmask
-#undef __print_array
-#undef __print_hex_dump
-
-/*
- * The below is not executed in the kernel. It is only what is
- * displayed in the print format for userspace to parse.
- */
-#undef __print_ns_to_secs
-#define __print_ns_to_secs(val) (val) / 1000000000UL
-
-#undef __print_ns_without_secs
-#define __print_ns_without_secs(val) (val) % 1000000000UL
-
-#undef TP_printk
-#define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)
+#include "stages/stage7_defines.h"
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
-- 
2.35.1
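
[Illustration, not part of the patch] The stage files split out above all rely on one technique: the same list of `__field()`-style invocations is expanded several times, and each stage redefines the helper macros so that the list produces something different. The sketch below shows the idea in plain userspace C with two passes, loosely mirroring stage 1 (struct members) and stage 4 (field description table). `DEMO_FIELD`, `DEMO_EVENT_FIELDS`, `struct demo_event` and the other `demo_*` names are hypothetical; the kernel re-includes the trace header instead of using a list macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event description: one DEMO_FIELD() per field. */
#define DEMO_EVENT_FIELDS		\
	DEMO_FIELD(int, pid)		\
	DEMO_FIELD(unsigned long, addr)

/* Pass 1: every field becomes a struct member. */
#undef DEMO_FIELD
#define DEMO_FIELD(type, item) type item;
struct demo_event {
	DEMO_EVENT_FIELDS
};

/* Pass 2: every field becomes an entry in a description table. */
struct demo_field_desc {
	const char *type;
	const char *name;
	size_t size;
};

#undef DEMO_FIELD
#define DEMO_FIELD(type, item) { #type, #item, sizeof(type) },
static const struct demo_field_desc demo_event_fields[] = {
	DEMO_EVENT_FIELDS
};

#define DEMO_NR_FIELDS \
	(sizeof(demo_event_fields) / sizeof(demo_event_fields[0]))

/* Look up a field description by name; returns NULL if not found. */
static const struct demo_field_desc *demo_find_field(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < DEMO_NR_FIELDS; i++)
		if (strcmp(demo_event_fields[i].name, name) == 0)
			return &demo_event_fields[i];
	return NULL;
}
```

Because the list is only expanded where it is used, redefining `DEMO_FIELD` between the two uses is enough to get a struct from the first pass and a table from the second. Keeping each stage's redefinitions in a separate header, as this patch does, lets a second generator reuse them.
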


* [for-next][PATCH 07/12] tracing: Add TRACE_CUSTOM_EVENT() macro
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
                   ` (5 preceding siblings ...)
  2022-03-12 23:25 ` [for-next][PATCH 06/12] tracing: Move the defines to create TRACE_EVENTS into their own files Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 08/12] user_events: Prevent dyn_event delete racing with ioctl add/delete Steven Rostedt
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Joel Fernandes, Peter Zijlstra,
	Masami Hiramatsu, Tom Zanussi

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

To make it really easy to add custom events from modules, add a
TRACE_CUSTOM_EVENT() macro that acts just like the TRACE_EVENT() macro,
but attaches a custom event to an already existing tracepoint.

The trace_custom_sched.[ch] sample files have been updated to use this
new macro to show how simple it is.

Link: https://lkml.kernel.org/r/20220303220625.738622494@goodmis.org

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 include/linux/trace_events.h              |  24 +-
 include/trace/define_custom_trace.h       |  77 +++++++
 include/trace/trace_custom_events.h       | 221 ++++++++++++++++++
 samples/trace_events/Makefile             |   2 +-
 samples/trace_events/trace_custom_sched.c | 259 ++--------------------
 samples/trace_events/trace_custom_sched.h |  95 ++++++++
 6 files changed, 441 insertions(+), 237 deletions(-)
 create mode 100644 include/trace/define_custom_trace.h
 create mode 100644 include/trace/trace_custom_events.h
 create mode 100644 samples/trace_events/trace_custom_sched.h

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 70c069aef02c..9b09fd633d48 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -315,6 +315,7 @@ enum {
 	TRACE_EVENT_FL_KPROBE_BIT,
 	TRACE_EVENT_FL_UPROBE_BIT,
 	TRACE_EVENT_FL_EPROBE_BIT,
+	TRACE_EVENT_FL_CUSTOM_BIT,
 };
 
 /*
@@ -328,6 +329,9 @@ enum {
  *  KPROBE        - Event is a kprobe
  *  UPROBE        - Event is a uprobe
  *  EPROBE        - Event is an event probe
+ *  CUSTOM        - Event is a custom event (to be attached to an existing tracepoint)
+ *                   This is set when the custom event has not been attached
+ *                   to a tracepoint yet, then it is cleared when it is.
  */
 enum {
 	TRACE_EVENT_FL_FILTERED		= (1 << TRACE_EVENT_FL_FILTERED_BIT),
@@ -339,6 +343,7 @@ enum {
 	TRACE_EVENT_FL_KPROBE		= (1 << TRACE_EVENT_FL_KPROBE_BIT),
 	TRACE_EVENT_FL_UPROBE		= (1 << TRACE_EVENT_FL_UPROBE_BIT),
 	TRACE_EVENT_FL_EPROBE		= (1 << TRACE_EVENT_FL_EPROBE_BIT),
+	TRACE_EVENT_FL_CUSTOM		= (1 << TRACE_EVENT_FL_CUSTOM_BIT),
 };
 
 #define TRACE_EVENT_FL_UKPROBE (TRACE_EVENT_FL_KPROBE | TRACE_EVENT_FL_UPROBE)
@@ -440,7 +445,9 @@ static inline bool bpf_prog_array_valid(struct trace_event_call *call)
 static inline const char *
 trace_event_name(struct trace_event_call *call)
 {
-	if (call->flags & TRACE_EVENT_FL_TRACEPOINT)
+	if (call->flags & TRACE_EVENT_FL_CUSTOM)
+		return call->name;
+	else if (call->flags & TRACE_EVENT_FL_TRACEPOINT)
 		return call->tp ? call->tp->name : NULL;
 	else
 		return call->name;
@@ -901,3 +908,18 @@ perf_trace_buf_submit(void *raw_data, int size, int rctx, u16 type,
 #endif
 
 #endif /* _LINUX_TRACE_EVENT_H */
+
+/*
+ * Note: we keep the TRACE_CUSTOM_EVENT outside the include file ifdef protection.
+ *  This is due to the way trace custom events work. If a file includes two
+ *  trace event headers under one "CREATE_CUSTOM_TRACE_EVENTS" the first include
+ *  will override the TRACE_CUSTOM_EVENT and break the second include.
+ */
+
+#ifndef TRACE_CUSTOM_EVENT
+
+#define DECLARE_CUSTOM_EVENT_CLASS(name, proto, args, tstruct, assign, print)
+#define DEFINE_CUSTOM_EVENT(template, name, proto, args)
+#define TRACE_CUSTOM_EVENT(name, proto, args, struct, assign, print)
+
+#endif /* ifdef TRACE_CUSTOM_EVENT (see note above) */
diff --git a/include/trace/define_custom_trace.h b/include/trace/define_custom_trace.h
new file mode 100644
index 000000000000..5827a4c92c74
--- /dev/null
+++ b/include/trace/define_custom_trace.h
@@ -0,0 +1,77 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Trace files that want to automate creation of all tracepoints defined
+ * in their file should include this file. The following are macros that the
+ * trace file may define:
+ *
+ * TRACE_SYSTEM defines the system the tracepoint is for
+ *
+ * TRACE_INCLUDE_FILE if the file name is something other than TRACE_SYSTEM.h
+ *     This macro may be defined to tell define_trace.h what file to include.
+ *     Note, leave off the ".h".
+ *
+ * TRACE_INCLUDE_PATH if the path is something other than core kernel include/trace
+ *     then this macro can define the path to use. Note, the path is relative to
+ *     define_trace.h, not the file including it. Full path names for out of tree
+ *     modules must be used.
+ */
+
+#ifdef CREATE_CUSTOM_TRACE_EVENTS
+
+/* Prevent recursion */
+#undef CREATE_CUSTOM_TRACE_EVENTS
+
+#include <linux/stringify.h>
+
+#undef TRACE_CUSTOM_EVENT
+#define TRACE_CUSTOM_EVENT(name, proto, args, tstruct, assign, print)
+
+#undef DEFINE_CUSTOM_EVENT
+#define DEFINE_CUSTOM_EVENT(template, name, proto, args)
+
+#undef TRACE_INCLUDE
+#undef __TRACE_INCLUDE
+
+#ifndef TRACE_INCLUDE_FILE
+# define TRACE_INCLUDE_FILE TRACE_SYSTEM
+# define UNDEF_TRACE_INCLUDE_FILE
+#endif
+
+#ifndef TRACE_INCLUDE_PATH
+# define __TRACE_INCLUDE(system) <trace/events/system.h>
+# define UNDEF_TRACE_INCLUDE_PATH
+#else
+# define __TRACE_INCLUDE(system) __stringify(TRACE_INCLUDE_PATH/system.h)
+#endif
+
+# define TRACE_INCLUDE(system) __TRACE_INCLUDE(system)
+
+/* Let the trace headers be reread */
+#define TRACE_CUSTOM_MULTI_READ
+
+#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
+
+#ifdef TRACEPOINTS_ENABLED
+#include <trace/trace_custom_events.h>
+#endif
+
+#undef TRACE_CUSTOM_EVENT
+#undef DECLARE_CUSTOM_EVENT_CLASS
+#undef DEFINE_CUSTOM_EVENT
+#undef TRACE_CUSTOM_MULTI_READ
+
+/* Only undef what we defined in this file */
+#ifdef UNDEF_TRACE_INCLUDE_FILE
+# undef TRACE_INCLUDE_FILE
+# undef UNDEF_TRACE_INCLUDE_FILE
+#endif
+
+#ifdef UNDEF_TRACE_INCLUDE_PATH
+# undef TRACE_INCLUDE_PATH
+# undef UNDEF_TRACE_INCLUDE_PATH
+#endif
+
+/* We may be processing more files */
+#define CREATE_CUSTOM_TRACE_EVENTS
+
+#endif /* CREATE_CUSTOM_TRACE_EVENTS */
diff --git a/include/trace/trace_custom_events.h b/include/trace/trace_custom_events.h
new file mode 100644
index 000000000000..b567c7202339
--- /dev/null
+++ b/include/trace/trace_custom_events.h
@@ -0,0 +1,221 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * This is similar to the trace_events.h file, but is to only
+ * create custom trace events to be attached to existing tracepoints.
+ * Whereas the TRACE_EVENT() macro (from trace_events.h) will create
+ * both the trace event and the tracepoint it will attach the event to,
+ * TRACE_CUSTOM_EVENT() is to create only a custom version of an existing
+ * trace event (created by TRACE_EVENT() or DEFINE_EVENT()), and will
+ * be placed in the "custom" system.
+ */
+
+#include <linux/trace_events.h>
+
+/* All custom events are placed in the custom group */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM custom
+
+#ifndef TRACE_SYSTEM_VAR
+#define TRACE_SYSTEM_VAR TRACE_SYSTEM
+#endif
+
+/* The init stage creates the system string and enum mappings */
+
+#include "stages/init.h"
+
+#undef TRACE_CUSTOM_EVENT
+#define TRACE_CUSTOM_EVENT(name, proto, args, tstruct, assign, print) \
+	DECLARE_CUSTOM_EVENT_CLASS(name,			      \
+			     PARAMS(proto),		       \
+			     PARAMS(args),		       \
+			     PARAMS(tstruct),		       \
+			     PARAMS(assign),		       \
+			     PARAMS(print));		       \
+	DEFINE_CUSTOM_EVENT(name, name, PARAMS(proto), PARAMS(args));
+
+/* Stage 1 creates the structure of the recorded event layout */
+
+#include "stages/stage1_defines.h"
+
+#undef DECLARE_CUSTOM_EVENT_CLASS
+#define DECLARE_CUSTOM_EVENT_CLASS(name, proto, args, tstruct, assign, print) \
+	struct trace_custom_event_raw_##name {				\
+		struct trace_entry	ent;				\
+		tstruct							\
+		char			__data[];			\
+	};								\
+									\
+	static struct trace_event_class custom_event_class_##name;
+
+#undef DEFINE_CUSTOM_EVENT
+#define DEFINE_CUSTOM_EVENT(template, name, proto, args)	\
+	static struct trace_event_call	__used			\
+	__attribute__((__aligned__(4))) custom_event_##name
+
+#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
+
+/* Stage 2 creates the custom class */
+
+#include "stages/stage2_defines.h"
+
+#undef DECLARE_CUSTOM_EVENT_CLASS
+#define DECLARE_CUSTOM_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
+	struct trace_custom_event_data_offsets_##call {			\
+		tstruct;						\
+	};
+
+#undef DEFINE_CUSTOM_EVENT
+#define DEFINE_CUSTOM_EVENT(template, name, proto, args)
+
+#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
+
+/* Stage 3 creates the way to print the custom event */
+
+#include "stages/stage3_defines.h"
+
+#undef DECLARE_CUSTOM_EVENT_CLASS
+#define DECLARE_CUSTOM_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
+static notrace enum print_line_t					\
+trace_custom_raw_output_##call(struct trace_iterator *iter, int flags,	\
+			struct trace_event *trace_event)		\
+{									\
+	struct trace_seq *s = &iter->seq;				\
+	struct trace_seq __maybe_unused *p = &iter->tmp_seq;		\
+	struct trace_custom_event_raw_##call *field;			\
+	int ret;							\
+									\
+	field = (typeof(field))iter->ent;				\
+									\
+	ret = trace_raw_output_prep(iter, trace_event);			\
+	if (ret != TRACE_TYPE_HANDLED)					\
+		return ret;						\
+									\
+	trace_event_printf(iter, print);				\
+									\
+	return trace_handle_return(s);					\
+}									\
+static struct trace_event_functions trace_custom_event_type_funcs_##call = { \
+	.trace			= trace_custom_raw_output_##call,	\
+};
+
+#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
+
+/* Stage 4 creates the offset layout for the fields */
+
+#include "stages/stage4_defines.h"
+
+#undef DECLARE_CUSTOM_EVENT_CLASS
+#define DECLARE_CUSTOM_EVENT_CLASS(call, proto, args, tstruct, func, print)	\
+static struct trace_event_fields trace_custom_event_fields_##call[] = {	\
+	tstruct								\
+	{} };
+
+#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
+
+/* Stage 5 creates the helper function for dynamic fields */
+
+#include "stages/stage5_defines.h"
+
+#undef DECLARE_CUSTOM_EVENT_CLASS
+#define DECLARE_CUSTOM_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
+static inline notrace int trace_custom_event_get_offsets_##call(	\
+	struct trace_custom_event_data_offsets_##call *__data_offsets, proto) \
+{									\
+	int __data_size = 0;						\
+	int __maybe_unused __item_length;				\
+	struct trace_custom_event_raw_##call __maybe_unused *entry;	\
+									\
+	tstruct;							\
+									\
+	return __data_size;						\
+}
+
+#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
+
+/* Stage 6 creates the probe function that records the event */
+
+#include "stages/stage6_defines.h"
+
+#undef DECLARE_CUSTOM_EVENT_CLASS
+#define DECLARE_CUSTOM_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
+									\
+static notrace void							\
+trace_custom_event_raw_event_##call(void *__data, proto)		\
+{									\
+	struct trace_event_file *trace_file = __data;			\
+	struct trace_custom_event_data_offsets_##call __maybe_unused __data_offsets; \
+	struct trace_event_buffer fbuffer;				\
+	struct trace_custom_event_raw_##call *entry;			\
+	int __data_size;						\
+									\
+	if (trace_trigger_soft_disabled(trace_file))			\
+		return;							\
+									\
+	__data_size = trace_custom_event_get_offsets_##call(&__data_offsets, args); \
+									\
+	entry = trace_event_buffer_reserve(&fbuffer, trace_file,	\
+				 sizeof(*entry) + __data_size);		\
+									\
+	if (!entry)							\
+		return;							\
+									\
+	tstruct								\
+									\
+	{ assign; }							\
+									\
+	trace_event_buffer_commit(&fbuffer);				\
+}
+/*
+ * The ftrace_test_custom_probe is compiled out, it is only here as a build time check
+ * to make sure that if the tracepoint handling changes, the ftrace probe will
+ * fail to compile unless it too is updated.
+ */
+
+#undef DEFINE_CUSTOM_EVENT
+#define DEFINE_CUSTOM_EVENT(template, call, proto, args)		\
+static inline void ftrace_test_custom_probe_##call(void)		\
+{									\
+	check_trace_callback_type_##call(trace_custom_event_raw_event_##template); \
+}
+
+#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
+
+/* Stage 7 creates the actual class and event structure for the custom event */
+
+#include "stages/stage7_defines.h"
+
+#undef DECLARE_CUSTOM_EVENT_CLASS
+#define DECLARE_CUSTOM_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
+static char custom_print_fmt_##call[] = print;					\
+static struct trace_event_class __used __refdata custom_event_class_##call = { \
+	.system			= TRACE_SYSTEM_STRING,			\
+	.fields_array		= trace_custom_event_fields_##call,		\
+	.fields			= LIST_HEAD_INIT(custom_event_class_##call.fields),\
+	.raw_init		= trace_event_raw_init,			\
+	.probe			= trace_custom_event_raw_event_##call,	\
+	.reg			= trace_event_reg,			\
+};
+
+#undef DEFINE_CUSTOM_EVENT
+#define DEFINE_CUSTOM_EVENT(template, call, proto, args)		\
+									\
+static struct trace_event_call __used custom_event_##call = {		\
+	.name			= #call,				\
+	.class			= &custom_event_class_##template,	\
+	.event.funcs		= &trace_custom_event_type_funcs_##template, \
+	.print_fmt		= custom_print_fmt_##template,		\
+	.flags			= TRACE_EVENT_FL_CUSTOM,		\
+};									\
+static inline int trace_custom_event_##call##_update(struct tracepoint *tp) \
+{									\
+	if (tp->name && strcmp(tp->name, #call) == 0) {			\
+		custom_event_##call.tp = tp;				\
+		custom_event_##call.flags = TRACE_EVENT_FL_TRACEPOINT;	\
+		return 1;						\
+	}								\
+	return 0;							\
+}									\
+static struct trace_event_call __used					\
+__section("_ftrace_events") *__custom_event_##call = &custom_event_##call
+
+#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
diff --git a/samples/trace_events/Makefile b/samples/trace_events/Makefile
index e98afc447fe1..b3808bb4cf8b 100644
--- a/samples/trace_events/Makefile
+++ b/samples/trace_events/Makefile
@@ -11,7 +11,7 @@
 # Here trace-events-sample.c does the CREATE_TRACE_POINTS.
 #
 CFLAGS_trace-events-sample.o := -I$(src)
+CFLAGS_trace_custom_sched.o := -I$(src)
 
 obj-$(CONFIG_SAMPLE_TRACE_EVENTS) += trace-events-sample.o
-
 obj-$(CONFIG_SAMPLE_TRACE_CUSTOM_EVENTS) += trace_custom_sched.o
diff --git a/samples/trace_events/trace_custom_sched.c b/samples/trace_events/trace_custom_sched.c
index 70a12c32ff99..b99d9ab7db85 100644
--- a/samples/trace_events/trace_custom_sched.c
+++ b/samples/trace_events/trace_custom_sched.c
@@ -11,256 +11,45 @@
 #include <linux/version.h>
 #include <linux/module.h>
 #include <linux/sched.h>
-#include <trace/events/sched.h>
-
-#define THIS_SYSTEM "custom_sched"
-
-#define SCHED_PRINT_FMT							\
-	C("prev_prio=%d next_pid=%d next_prio=%d", REC->prev_prio, REC->next_pid, \
-	  REC->next_prio)
-
-#define SCHED_WAKING_FMT				\
-	C("pid=%d prio=%d", REC->pid, REC->prio)
-
-#undef C
-#define C(a, b...) a, b
-
-static struct trace_event_fields sched_switch_fields[] = {
-	{
-		.type = "unsigned short",
-		.name = "prev_prio",
-		.size = sizeof(short),
-		.align = __alignof__(short),
-		.is_signed = 0,
-		.filter_type = FILTER_OTHER,
-	},
-	{
-		.type = "unsigned short",
-		.name = "next_prio",
-		.size = sizeof(short),
-		.align = __alignof__(short),
-		.is_signed = 0,
-		.filter_type = FILTER_OTHER,
-	},
-	{
-		.type = "unsigned int",
-		.name = "next_prio",
-		.size = sizeof(int),
-		.align = __alignof__(int),
-		.is_signed = 0,
-		.filter_type = FILTER_OTHER,
-	},
-	{}
-};
-
-struct sched_event {
-	struct trace_entry	ent;
-	unsigned short		prev_prio;
-	unsigned short		next_prio;
-	unsigned int		next_pid;
-};
-
-static struct trace_event_fields sched_waking_fields[] = {
-	{
-		.type = "unsigned int",
-		.name = "pid",
-		.size = sizeof(int),
-		.align = __alignof__(int),
-		.is_signed = 0,
-		.filter_type = FILTER_OTHER,
-	},
-	{
-		.type = "unsigned short",
-		.name = "prio",
-		.size = sizeof(short),
-		.align = __alignof__(short),
-		.is_signed = 0,
-		.filter_type = FILTER_OTHER,
-	},
-	{}
-};
-
-struct wake_event {
-	struct trace_entry	ent;
-	unsigned int		pid;
-	unsigned short		prio;
-};
-
-static void sched_switch_probe(void *data, bool preempt, struct task_struct *prev,
-			       struct task_struct *next)
-{
-	struct trace_event_file *trace_file = data;
-	struct trace_event_buffer fbuffer;
-	struct sched_event *entry;
-
-	if (trace_trigger_soft_disabled(trace_file))
-		return;
-
-	entry = trace_event_buffer_reserve(&fbuffer, trace_file,
-					   sizeof(*entry));
-
-	if (!entry)
-		return;
-
-	entry->prev_prio = prev->prio;
-	entry->next_prio = next->prio;
-	entry->next_pid = next->pid;
-
-	trace_event_buffer_commit(&fbuffer);
-}
-
-static struct trace_event_class sched_switch_class = {
-	.system			= THIS_SYSTEM,
-	.reg			= trace_event_reg,
-	.fields_array		= sched_switch_fields,
-	.fields			= LIST_HEAD_INIT(sched_switch_class.fields),
-	.probe			= sched_switch_probe,
-};
-
-static void sched_waking_probe(void *data, struct task_struct *t)
-{
-	struct trace_event_file *trace_file = data;
-	struct trace_event_buffer fbuffer;
-	struct wake_event *entry;
-
-	if (trace_trigger_soft_disabled(trace_file))
-		return;
-
-	entry = trace_event_buffer_reserve(&fbuffer, trace_file,
-					   sizeof(*entry));
-
-	if (!entry)
-		return;
-
-	entry->prio = t->prio;
-	entry->pid = t->pid;
-
-	trace_event_buffer_commit(&fbuffer);
-}
-
-static struct trace_event_class sched_waking_class = {
-	.system			= THIS_SYSTEM,
-	.reg			= trace_event_reg,
-	.fields_array		= sched_waking_fields,
-	.fields			= LIST_HEAD_INIT(sched_waking_class.fields),
-	.probe			= sched_waking_probe,
-};
-
-static enum print_line_t sched_switch_output(struct trace_iterator *iter, int flags,
-					     struct trace_event *trace_event)
-{
-	struct trace_seq *s = &iter->seq;
-	struct sched_event *REC = (struct sched_event *)iter->ent;
-	int ret;
-
-	ret = trace_raw_output_prep(iter, trace_event);
-	if (ret != TRACE_TYPE_HANDLED)
-		return ret;
-
-	trace_seq_printf(s, SCHED_PRINT_FMT);
-	trace_seq_putc(s, '\n');
 
-	return trace_handle_return(s);
-}
-
-static struct trace_event_functions sched_switch_funcs = {
-	.trace			= sched_switch_output,
-};
-
-static enum print_line_t sched_waking_output(struct trace_iterator *iter, int flags,
-					     struct trace_event *trace_event)
-{
-	struct trace_seq *s = &iter->seq;
-	struct wake_event *REC = (struct wake_event *)iter->ent;
-	int ret;
-
-	ret = trace_raw_output_prep(iter, trace_event);
-	if (ret != TRACE_TYPE_HANDLED)
-		return ret;
-
-	trace_seq_printf(s, SCHED_WAKING_FMT);
-	trace_seq_putc(s, '\n');
-
-	return trace_handle_return(s);
-}
-
-static struct trace_event_functions sched_waking_funcs = {
-	.trace			= sched_waking_output,
-};
-
-#undef C
-#define C(a, b...) #a "," __stringify(b)
+/*
+ * Must include the event header that the custom event will attach to,
+ * from the C file, and not in the custom header file.
+ */
+#include <trace/events/sched.h>
 
-static struct trace_event_call sched_switch_call = {
-	.class			= &sched_switch_class,
-	.event			= {
-		.funcs			= &sched_switch_funcs,
-	},
-	.print_fmt		= SCHED_PRINT_FMT,
-	.module			= THIS_MODULE,
-	.flags			= TRACE_EVENT_FL_TRACEPOINT,
-};
+/* Declare CREATE_CUSTOM_TRACE_EVENTS before including custom header */
+#define CREATE_CUSTOM_TRACE_EVENTS
 
-static struct trace_event_call sched_waking_call = {
-	.class			= &sched_waking_class,
-	.event			= {
-		.funcs			= &sched_waking_funcs,
-	},
-	.print_fmt		= SCHED_WAKING_FMT,
-	.module			= THIS_MODULE,
-	.flags			= TRACE_EVENT_FL_TRACEPOINT,
-};
+#include "trace_custom_sched.h"
 
+/*
+ * As the trace events are not exported to modules, the use of
+ * for_each_kernel_tracepoint() is needed to find the trace event
+ * to attach to. The fct() function below, is a callback that
+ * will be called for every event.
+ *
+ * Helper functions created by the TRACE_CUSTOM_EVENT() macro
+ * update the event. Those are of the form:
+ *
+ *    trace_custom_event_<event>_update()
+ *
+ * Where <event> is the event to attach.
+ */
 static void fct(struct tracepoint *tp, void *priv)
 {
-	if (tp->name && strcmp(tp->name, "sched_switch") == 0)
-		sched_switch_call.tp = tp;
-	else if (tp->name && strcmp(tp->name, "sched_waking") == 0)
-		sched_waking_call.tp = tp;
-}
-
-static int add_event(struct trace_event_call *call)
-{
-	int ret;
-
-	ret = register_trace_event(&call->event);
-	if (WARN_ON(!ret))
-		return -ENODEV;
-
-	ret = trace_add_event_call(call);
-	if (WARN_ON(ret))
-		unregister_trace_event(&call->event);
-
-	return ret;
+	trace_custom_event_sched_switch_update(tp);
+	trace_custom_event_sched_waking_update(tp);
 }
 
 static int __init trace_sched_init(void)
 {
-	int ret;
-
-	check_trace_callback_type_sched_switch(sched_switch_probe);
-	check_trace_callback_type_sched_waking(sched_waking_probe);
-
 	for_each_kernel_tracepoint(fct, NULL);
-
-	ret = add_event(&sched_switch_call);
-	if (ret)
-		return ret;
-
-	ret = add_event(&sched_waking_call);
-	if (ret)
-		trace_remove_event_call(&sched_switch_call);
-
-	return ret;
+	return 0;
 }
 
 static void __exit trace_sched_exit(void)
 {
-	trace_set_clr_event(THIS_SYSTEM, "sched_switch", 0);
-	trace_set_clr_event(THIS_SYSTEM, "sched_waking", 0);
-
-	trace_remove_event_call(&sched_switch_call);
-	trace_remove_event_call(&sched_waking_call);
 }
 
 module_init(trace_sched_init);
diff --git a/samples/trace_events/trace_custom_sched.h b/samples/trace_events/trace_custom_sched.h
new file mode 100644
index 000000000000..a3d14de6a2e5
--- /dev/null
+++ b/samples/trace_events/trace_custom_sched.h
@@ -0,0 +1,95 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Like the headers that use TRACE_EVENT(), the TRACE_CUSTOM_EVENT()
+ * needs a header that allows for multiple inclusions.
+ *
+ * Test for a unique name (here we have _TRACE_CUSTOM_SCHED_H),
+ * also allowing to continue if TRACE_CUSTOM_MULTI_READ is defined.
+ */
+#if !defined(_TRACE_CUSTOM_SCHED_H) || defined(TRACE_CUSTOM_MULTI_READ)
+#define _TRACE_CUSTOM_SCHED_H
+
+/* Include linux/trace_events.h for initial defines of TRACE_CUSTOM_EVENT() */
+#include <linux/trace_events.h>
+
+/*
+ * TRACE_CUSTOM_EVENT() is just like TRACE_EVENT(). The first parameter
+ * is the event name of an existing event where the TRACE_EVENT has been included
+ * in the C file before including this file.
+ */
+TRACE_CUSTOM_EVENT(sched_switch,
+
+	/*
+	 * The TP_PROTO() and TP_ARGS must match the trace event
+	 * that the custom event is using.
+	 */
+	TP_PROTO(bool preempt,
+		 struct task_struct *prev,
+		 struct task_struct *next),
+
+	TP_ARGS(preempt, prev, next),
+
+	/*
+	 * The next fields are where the customization happens.
+	 * The TP_STRUCT__entry() defines what will be recorded
+	 * in the ring buffer when the custom event triggers.
+	 *
+	 * The rest is just like the TRACE_EVENT() macro except that
+	 * it uses the custom entry.
+	 */
+	TP_STRUCT__entry(
+		__field(	unsigned short,		prev_prio	)
+		__field(	unsigned short,		next_prio	)
+		__field(	pid_t,	next_pid			)
+	),
+
+	TP_fast_assign(
+		__entry->prev_prio	= prev->prio;
+		__entry->next_pid	= next->pid;
+		__entry->next_prio	= next->prio;
+	),
+
+	TP_printk("prev_prio=%d next_pid=%d next_prio=%d",
+		  __entry->prev_prio, __entry->next_pid, __entry->next_prio)
+)
+
+
+TRACE_CUSTOM_EVENT(sched_waking,
+
+	TP_PROTO(struct task_struct *p),
+
+	TP_ARGS(p),
+
+	TP_STRUCT__entry(
+		__field(	pid_t,			pid	)
+		__field(	unsigned short,		prio	)
+	),
+
+	TP_fast_assign(
+		__entry->pid	= p->pid;
+		__entry->prio	= p->prio;
+	),
+
+	TP_printk("pid=%d prio=%d", __entry->pid, __entry->prio)
+)
+#endif
+/*
+ * Just like the headers that create TRACE_EVENTs, the below must
+ * be outside the protection of the above #if block.
+ */
+
+/*
+ * It is required that the Makefile includes:
+ *    CFLAGS_<c_file>.o := -I$(src)
+ */
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_PATH .
+
+/*
+ * It is required that the TRACE_INCLUDE_FILE be the same
+ * as this file without the ".h".
+ */
+#define TRACE_INCLUDE_FILE trace_custom_sched
+#include <trace/define_custom_trace.h>
-- 
2.35.1


* [for-next][PATCH 08/12] user_events: Prevent dyn_event delete racing with ioctl add/delete
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
                   ` (6 preceding siblings ...)
  2022-03-12 23:25 ` [for-next][PATCH 07/12] tracing: Add TRACE_CUSTOM_EVENT() macro Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 09/12] tracing: Fix strncpy warning in trace_events_synth.c Steven Rostedt
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Beau Belgrave

From: Beau Belgrave <beaub@linux.microsoft.com>

Find user_events always while under the event_mutex and before leaving
the lock, add a ref count to the user_event. This ensures that all paths
under the event_mutex that check the ref counts will be synchronized.

The ioctl add/delete paths are protected by the reg_mutex. However,
dyn_event is only protected by the event_mutex. The dyn_event delete
path cannot acquire reg_mutex, since that could deadlock against the
ioctl delete path, which acquires event_mutex after acquiring reg_mutex.
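
As a rough user-space illustration of the ref count scheme described above
(not part of the patch): the lookup hands back an extra reference, and delete
only proceeds when that lookup reference is the sole one. The names
demo_event, demo_table, demo_find(), demo_put() and demo_delete() are
hypothetical stand-ins, and the event_mutex that the kernel holds around these
steps is omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct user_event; not the kernel definition. */
struct demo_event {
	const char *name;
	atomic_int refcnt;
	int registered;
};

/* Hypothetical fixed table standing in for the kernel's register_table. */
static struct demo_event demo_table[] = {
	{ .name = "alpha", .registered = 1 },
	{ .name = "beta",  .registered = 1 },
};

/*
 * Look up an event by name. On success the caller owns one extra reference,
 * mirroring the atomic_inc() added to find_user_event(). The kernel does
 * this with event_mutex held; locking is left out of this sketch.
 */
static struct demo_event *demo_find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
		struct demo_event *ev = &demo_table[i];

		if (ev->registered && strcmp(ev->name, name) == 0) {
			atomic_fetch_add(&ev->refcnt, 1);
			return ev;
		}
	}
	return NULL;
}

/* Drop a reference previously taken by demo_find(). */
static void demo_put(struct demo_event *ev)
{
	atomic_fetch_sub(&ev->refcnt, 1);
}

/*
 * Delete succeeds only when the lookup reference is the last one, that is
 * when the count is exactly 1. Otherwise the lookup reference is dropped
 * and -EBUSY is returned.
 */
static int demo_delete(const char *name)
{
	struct demo_event *ev = demo_find(name);

	if (!ev)
		return -ENOENT;

	if (atomic_load(&ev->refcnt) != 1) {
		demo_put(ev);
		return -EBUSY;
	}

	ev->registered = 0;		/* stands in for destroy_user_event() */
	atomic_store(&ev->refcnt, 0);
	return 0;
}
```

With this shape, a caller that still holds a reference from demo_find() makes
demo_delete() fail with -EBUSY, and the delete goes through once that
reference has been put.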

Link: https://lkml.kernel.org/r/20220310001141.1660-1-beaub@linux.microsoft.com

Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_user.c | 46 +++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index 9a6191a6a786..4febc1d6ae72 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -135,6 +135,8 @@ static struct list_head *user_event_get_fields(struct trace_event_call *call)
  * NOTE: Offsets are from the user data perspective, they are not from the
  * trace_entry/buffer perspective. We automatically add the common properties
  * sizes to the offset for the user.
+ *
+ * Upon success user_event has its ref count increased by 1.
  */
 static int user_event_parse_cmd(char *raw_command, struct user_event **newuser)
 {
@@ -593,8 +595,10 @@ static struct user_event *find_user_event(char *name, u32 *outkey)
 	*outkey = key;
 
 	hash_for_each_possible(register_table, user, node, key)
-		if (!strcmp(EVENT_NAME(user), name))
+		if (!strcmp(EVENT_NAME(user), name)) {
+			atomic_inc(&user->refcnt);
 			return user;
+		}
 
 	return NULL;
 }
@@ -883,7 +887,12 @@ static int user_event_create(const char *raw_command)
 		return -ENOMEM;
 
 	mutex_lock(&reg_mutex);
+
 	ret = user_event_parse_cmd(name, &user);
+
+	if (!ret)
+		atomic_dec(&user->refcnt);
+
 	mutex_unlock(&reg_mutex);
 
 	if (ret)
@@ -1050,6 +1059,7 @@ static int user_event_trace_register(struct user_event *user)
 /*
  * Parses the event name, arguments and flags then registers if successful.
  * The name buffer lifetime is owned by this method for success cases only.
+ * Upon success the returned user_event has its ref count increased by 1.
  */
 static int user_event_parse(char *name, char *args, char *flags,
 			    struct user_event **newuser)
@@ -1057,7 +1067,12 @@ static int user_event_parse(char *name, char *args, char *flags,
 	int ret;
 	int index;
 	u32 key;
-	struct user_event *user = find_user_event(name, &key);
+	struct user_event *user;
+
+	/* Prevent dyn_event from racing */
+	mutex_lock(&event_mutex);
+	user = find_user_event(name, &key);
+	mutex_unlock(&event_mutex);
 
 	if (user) {
 		*newuser = user;
@@ -1121,6 +1136,10 @@ static int user_event_parse(char *name, char *args, char *flags,
 		goto put_user;
 
 	user->index = index;
+
+	/* Ensure we track ref */
+	atomic_inc(&user->refcnt);
+
 	dyn_event_init(&user->devent, &user_event_dops);
 	dyn_event_add(&user->devent, &user->call);
 	set_bit(user->index, page_bitmap);
@@ -1147,12 +1166,21 @@ static int delete_user_event(char *name)
 	if (!user)
 		return -ENOENT;
 
-	if (atomic_read(&user->refcnt) != 0)
-		return -EBUSY;
+	/* Ensure we are the last ref */
+	if (atomic_read(&user->refcnt) != 1) {
+		ret = -EBUSY;
+		goto put_ref;
+	}
 
-	mutex_lock(&event_mutex);
 	ret = destroy_user_event(user);
-	mutex_unlock(&event_mutex);
+
+	if (ret)
+		goto put_ref;
+
+	return ret;
+put_ref:
+	/* No longer have this ref */
+	atomic_dec(&user->refcnt);
 
 	return ret;
 }
@@ -1340,6 +1368,9 @@ static long user_events_ioctl_reg(struct file *file, unsigned long uarg)
 
 	ret = user_events_ref_add(file, user);
 
+	/* No longer need parse ref, ref_add either worked or not */
+	atomic_dec(&user->refcnt);
+
 	/* Positive number is index and valid */
 	if (ret < 0)
 		return ret;
@@ -1364,7 +1395,10 @@ static long user_events_ioctl_del(struct file *file, unsigned long uarg)
 	if (IS_ERR(name))
 		return PTR_ERR(name);
 
+	/* event_mutex prevents dyn_event from racing */
+	mutex_lock(&event_mutex);
 	ret = delete_user_event(name);
+	mutex_unlock(&event_mutex);
 
 	kfree(name);
 
-- 
2.35.1


* [for-next][PATCH 09/12] tracing: Fix strncpy warning in trace_events_synth.c
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
                   ` (7 preceding siblings ...)
  2022-03-12 23:25 ` [for-next][PATCH 08/12] user_events: Prevent dyn_event delete racing with ioctl add/delete Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 10/12] tracing: Have TRACE_DEFINE_ENUM affect trace event types as well Steven Rostedt
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, kernel test robot, Tom Zanussi

From: Tom Zanussi <zanussi@kernel.org>

0-day reported the strncpy error below:

../kernel/trace/trace_events_synth.c: In function 'last_cmd_set':
../kernel/trace/trace_events_synth.c:65:9: warning: 'strncpy' specified bound depends on the length of the source argument [-Wstringop-truncation]
   65 |         strncpy(last_cmd, str, strlen(str) + 1);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../kernel/trace/trace_events_synth.c:65:32: note: length computed here
   65 |         strncpy(last_cmd, str, strlen(str) + 1);
      |                                ^~~~~~~~~~~

There's no reason to use strncpy here, in fact there's no reason to do
anything but a simple kstrdup() (note we don't even need to check for
failure since last_cmd is expected to be either the last cmd string
or NULL, and the containing function is a void return).
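
A minimal user-space sketch of the same idea, assuming strdup() and free() as
stand-ins for kstrdup() and kfree(); demo_last_cmd and demo_last_cmd_set() are
hypothetical names, not the kernel symbols:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the file-scope last_cmd pointer. */
static char *demo_last_cmd;

/*
 * Remember a copy of the most recent command string.
 * strdup() sizes and copies in one step, so there is no bound argument
 * to get wrong. If the allocation fails the pointer is simply NULL,
 * which callers already have to tolerate.
 */
static void demo_last_cmd_set(const char *str)
{
	if (!str)
		return;

	free(demo_last_cmd);
	demo_last_cmd = strdup(str);
}
```

A bound of strlen(str) + 1 only restates the source length, so it protects
nothing on the destination side; duplicating the string drops the bound
altogether.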

Link: https://lkml.kernel.org/r/77deca8cbfd226981b3f1eab203967381e9b5bd9.camel@kernel.org

Fixes: 27c888da9867 ("tracing: Remove size restriction on synthetic event cmd error logging")

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_synth.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index fdd79e07e2fc..5e8c07aef071 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -58,11 +58,8 @@ static void last_cmd_set(const char *str)
 		return;
 
 	kfree(last_cmd);
-	last_cmd = kzalloc(strlen(str) + 1, GFP_KERNEL);
-	if (!last_cmd)
-		return;
 
-	strncpy(last_cmd, str, strlen(str) + 1);
+	last_cmd = kstrdup(str, GFP_KERNEL);
 }
 
 static void synth_err(u8 err_type, u16 err_pos)
-- 
2.35.1


* [for-next][PATCH 10/12] tracing: Have TRACE_DEFINE_ENUM affect trace event types as well
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
                   ` (8 preceding siblings ...)
  2022-03-12 23:25 ` [for-next][PATCH 09/12] tracing: Fix strncpy warning in trace_events_synth.c Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 11/12] tracing: Add snapshot at end of kernel boot up Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 12/12] tracing/user_events: Use alloc_pages instead of kzalloc() for register pages Steven Rostedt
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Ritesh Harjani

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

The macro TRACE_DEFINE_ENUM is used to convert enums in the kernel to
their actual value when they are exported to user space via the trace
event format file.

Currently only the enums in the "print fmt" (TP_printk in the TRACE_EVENT
macro) have the enums converted. But the enums can be used to denote array
size:

        field:unsigned int fc_ineligible_rc[EXT4_FC_REASON_MAX]; offset:12;      size:36;        signed:0;

The EXT4_FC_REASON_MAX has no meaning to user space, but user space needs
its value to know how to parse the array.

Have the array indexes be parsed as well.
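
As a rough user-space sketch of that conversion (not the kernel code):
demo_eval_map and demo_update_field_type() are hypothetical names, the value 9
is inferred from size:36 with a 4-byte unsigned int, and the check for the
closing ']' is an extra strictness of this sketch rather than something taken
from the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct trace_eval_map. */
struct demo_eval_map {
	const char *eval_string;
	unsigned long eval_value;
};

/*
 * Rewrite an enum name used as an array size in place, for example
 * "[NAME]" becomes "[9]".
 * Returns 1 if replaced, 0 if the type does not reference the enum,
 * -1 if the number would need more room than the name occupies.
 */
static int demo_update_field_type(char *type, const struct demo_eval_map *map)
{
	size_t len = strlen(map->eval_string);
	char num[32];
	char *ptr;
	int n;

	ptr = strchr(type, '[');
	if (!ptr)
		return 0;
	ptr++;

	/* A numeric size such as "[16]" needs no conversion. */
	if (!isalpha((unsigned char)*ptr) && *ptr != '_')
		return 0;

	if (strncmp(map->eval_string, ptr, len) != 0)
		return 0;

	/* Sketch-only assumption: the name must be the whole array size. */
	if (ptr[len] != ']')
		return 0;

	n = snprintf(num, sizeof(num), "%lu", map->eval_value);
	if (n < 0 || (size_t)n > len)
		return -1;

	/* Write the digits, then pull the rest of the string up behind them. */
	memcpy(ptr, num, (size_t)n);
	memmove(ptr + n, ptr + len, strlen(ptr + len) + 1);
	return 1;
}
```

Applied to "unsigned int fc_ineligible_rc[EXT4_FC_REASON_MAX]" with the
mapping EXT4_FC_REASON_MAX = 9, the type string becomes
"unsigned int fc_ineligible_rc[9]", which a user-space parser can use
directly.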

Link: https://lore.kernel.org/all/cover.1646922487.git.riteshh@linux.ibm.com/

Reported-by: Ritesh Harjani <riteshh@linux.ibm.com>
Tested-by: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 38afd66d80e3..ae9a3b8481f5 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2633,6 +2633,33 @@ static void update_event_printk(struct trace_event_call *call,
 	}
 }
 
+static void update_event_fields(struct trace_event_call *call,
+				struct trace_eval_map *map)
+{
+	struct ftrace_event_field *field;
+	struct list_head *head;
+	char *ptr;
+	int len = strlen(map->eval_string);
+
+	head = trace_get_fields(call);
+	list_for_each_entry(field, head, link) {
+		ptr = strchr(field->type, '[');
+		if (!ptr)
+			continue;
+		ptr++;
+
+		if (!isalpha(*ptr) && *ptr != '_')
+			continue;
+
+		if (strncmp(map->eval_string, ptr, len) != 0)
+			continue;
+
+		ptr = eval_replace(ptr, map, len);
+		/* enum/sizeof string smaller than value */
+		WARN_ON_ONCE(!ptr);
+	}
+}
+
 void trace_event_eval_update(struct trace_eval_map **map, int len)
 {
 	struct trace_event_call *call, *p;
@@ -2668,6 +2695,7 @@ void trace_event_eval_update(struct trace_eval_map **map, int len)
 					first = false;
 				}
 				update_event_printk(call, map[i]);
+				update_event_fields(call, map[i]);
 			}
 		}
 	}
-- 
2.35.1

* [for-next][PATCH 11/12] tracing: Add snapshot at end of kernel boot up
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
                   ` (9 preceding siblings ...)
  2022-03-12 23:25 ` [for-next][PATCH 10/12] tracing: Have TRACE_DEFINE_ENUM affect trace event types as well Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  2022-03-12 23:25 ` [for-next][PATCH 12/12] tracing/user_events: Use alloc_pages instead of kzalloc() for register pages Steven Rostedt
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Add ftrace_boot_snapshot kernel parameter that will take a snapshot at the
end of boot up just before switching over to user space (it happens during
the kernel freeing of init memory).

This is useful when there's interesting data that can be collected from
kernel start up, but gets overwritten by user space start up code. With
this option, the ring buffer content from the boot up traces gets saved in
the snapshot at the end of boot up. This trace can be read from:

 /sys/kernel/tracing/snapshot
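
For illustration, a minimal user space sketch of reading that file is shown
below. dump_trace_file() is a hypothetical helper, not part of this patch,
and it assumes the kernel was booted with ftrace_boot_snapshot on the
command line and that tracefs is mounted at /sys/kernel/tracing.

```c
#include <stdio.h>

/*
 * Copy the file at 'path' to 'out', one fgets() buffer at a time.
 * Returns the number of buffers copied (one per line for lines shorter
 * than the buffer), or -1 if the file cannot be opened.
 */
static long dump_trace_file(const char *path, FILE *out)
{
	char line[4096];
	long count = 0;
	FILE *in = fopen(path, "r");

	if (!in)
		return -1;

	while (fgets(line, sizeof(line), in)) {
		fputs(line, out);
		count++;
	}
	fclose(in);
	return count;
}
```

A caller would use dump_trace_file("/sys/kernel/tracing/snapshot", stdout).
Reading tracefs files usually requires root.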

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 .../admin-guide/kernel-parameters.txt          |  8 ++++++++
 include/linux/ftrace.h                         | 11 ++++++++++-
 kernel/trace/ftrace.c                          |  2 ++
 kernel/trace/trace.c                           | 18 ++++++++++++++++++
 4 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f5a27f067db9..f6b7ee64ace8 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1435,6 +1435,14 @@
 			as early as possible in order to facilitate early
 			boot debugging.
 
+	ftrace_boot_snapshot
+			[FTRACE] On boot up, a snapshot will be taken of the
+			ftrace ring buffer that can be read at:
+			/sys/kernel/tracing/snapshot.
+			This is useful if you need tracing information from kernel
+			boot up that is likely to be overridden by user space
+			start up functionality.
+
 	ftrace_dump_on_oops[=orig_cpu]
 			[FTRACE] will dump the trace buffers on oops.
 			If no parameter is passed, ftrace will dump
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 9999e29187de..37b619185ec9 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -30,6 +30,12 @@
 #define ARCH_SUPPORTS_FTRACE_OPS 0
 #endif
 
+#ifdef CONFIG_TRACING
+extern void ftrace_boot_snapshot(void);
+#else
+static inline void ftrace_boot_snapshot(void) { }
+#endif
+
 #ifdef CONFIG_FUNCTION_TRACER
 struct ftrace_ops;
 struct ftrace_regs;
@@ -215,7 +221,10 @@ struct ftrace_ops_hash {
 void ftrace_free_init_mem(void);
 void ftrace_free_mem(struct module *mod, void *start, void *end);
 #else
-static inline void ftrace_free_init_mem(void) { }
+static inline void ftrace_free_init_mem(void)
+{
+	ftrace_boot_snapshot();
+}
 static inline void ftrace_free_mem(struct module *mod, void *start, void *end) { }
 #endif
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index f9feb197b2da..4e29bd1cf151 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -7096,6 +7096,8 @@ void __init ftrace_free_init_mem(void)
 	void *start = (void *)(&__init_begin);
 	void *end = (void *)(&__init_end);
 
+	ftrace_boot_snapshot();
+
 	ftrace_free_mem(NULL, start, end);
 }
 
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 7c85ce9ffdc3..eaf7d30ca6f1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -185,6 +185,7 @@ static char bootup_tracer_buf[MAX_TRACER_SIZE] __initdata;
 static char *default_bootup_tracer;
 
 static bool allocate_snapshot;
+static bool snapshot_at_boot;
 
 static int __init set_cmdline_ftrace(char *str)
 {
@@ -230,6 +231,15 @@ static int __init boot_alloc_snapshot(char *str)
 __setup("alloc_snapshot", boot_alloc_snapshot);
 
 
+static int __init boot_snapshot(char *str)
+{
+	snapshot_at_boot = true;
+	boot_alloc_snapshot(str);
+	return 1;
+}
+__setup("ftrace_boot_snapshot", boot_snapshot);
+
+
 static char trace_boot_options_buf[MAX_TRACER_SIZE] __initdata;
 
 static int __init set_trace_boot_options(char *str)
@@ -10149,6 +10159,14 @@ __init static int tracer_alloc_buffers(void)
 	return ret;
 }
 
+void __init ftrace_boot_snapshot(void)
+{
+	if (snapshot_at_boot) {
+		tracing_snapshot();
+		internal_trace_puts("** Boot snapshot taken **\n");
+	}
+}
+
 void __init early_trace_init(void)
 {
 	if (tracepoint_printk) {
-- 
2.35.1

* [for-next][PATCH 12/12] tracing/user_events: Use alloc_pages instead of kzalloc() for register pages
  2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
                   ` (10 preceding siblings ...)
  2022-03-12 23:25 ` [for-next][PATCH 11/12] tracing: Add snapshot at end of kernel boot up Steven Rostedt
@ 2022-03-12 23:25 ` Steven Rostedt
  11 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2022-03-12 23:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Beau Belgrave, Anders Roxell

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

kzalloc() virtual addresses do not work with SetPageReserved(); use the
actual page virtual addresses via alloc_pages() instead.

The issue was reported when booting with user_events and
DEBUG_VM_PGFLAGS=y.

Also base the number of events on the page allocation order.
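
The arithmetic behind that can be sketched as follows. The helper names are
hypothetical stand-ins for the MAX_PAGES and MAX_EVENTS macros, and the
page size is passed in as a parameter because it depends on the
architecture and configuration.

```c
#include <stddef.h>

/* Hypothetical stand-in for MAX_PAGES: an order-N allocation is 2^N pages. */
static size_t pages_for_order(unsigned int order)
{
	return (size_t)1 << order;
}

/*
 * Hypothetical stand-in for MAX_EVENTS: the register page data is sized
 * at MAX_EVENTS bytes, so each event takes one byte of it.
 */
static size_t max_events(unsigned int order, size_t page_size)
{
	return pages_for_order(order) * page_size;
}
```

Assuming 4096-byte pages, order 0 gives 4096 events, and raising the order
to 2 would give 16384.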

Link: https://lore.kernel.org/all/CADYN=9+xY5Vku3Ws5E9S60SM5dCFfeGeRBkmDFbcxX0ZMoFing@mail.gmail.com/
Link: https://lore.kernel.org/all/20220311223028.1865-1-beaub@linux.microsoft.com/

Cc: Beau Belgrave <beaub@linux.microsoft.com>
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_user.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index 4febc1d6ae72..e10ad057e797 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -30,9 +30,10 @@
 
 /*
  * Limits how many trace_event calls user processes can create:
- * Must be multiple of PAGE_SIZE.
+ * Must be a power of two of PAGE_SIZE.
  */
-#define MAX_PAGES 1
+#define MAX_PAGE_ORDER 0
+#define MAX_PAGES (1 << MAX_PAGE_ORDER)
 #define MAX_EVENTS (MAX_PAGES * PAGE_SIZE)
 
 /* Limit how long of an event name plus args within the subsystem. */
@@ -1622,16 +1623,17 @@ static void set_page_reservations(bool set)
 
 static int __init trace_events_user_init(void)
 {
+	struct page *pages;
 	int ret;
 
 	/* Zero all bits beside 0 (which is reserved for failures) */
 	bitmap_zero(page_bitmap, MAX_EVENTS);
 	set_bit(0, page_bitmap);
 
-	register_page_data = kzalloc(MAX_EVENTS, GFP_KERNEL);
-
-	if (!register_page_data)
+	pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, MAX_PAGE_ORDER);
+	if (!pages)
 		return -ENOMEM;
+	register_page_data = page_address(pages);
 
 	set_page_reservations(true);
 
@@ -1640,7 +1642,7 @@ static int __init trace_events_user_init(void)
 	if (ret) {
 		pr_warn("user_events could not register with tracefs\n");
 		set_page_reservations(false);
-		kfree(register_page_data);
+		__free_pages(pages, MAX_PAGE_ORDER);
 		return ret;
 	}
 
-- 
2.35.1

Thread overview: 13+ messages
2022-03-12 23:25 [for-next][PATCH 00/12] tracing: Updates for v5.18 Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 01/12] tracing: Fix allocation of last_cmd in last_cmd_set() Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 02/12] user_events: Fix potential uninitialized pointer while parsing field Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 03/12] tracing: Fix last_cmd_set() string management in histogram code Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 04/12] tracing: Allow custom events to be added to the tracefs directory Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 05/12] tracing: Add sample code for custom trace events Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 06/12] tracing: Move the defines to create TRACE_EVENTS into their own files Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 07/12] tracing: Add TRACE_CUSTOM_EVENT() macro Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 08/12] user_events: Prevent dyn_event delete racing with ioctl add/delete Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 09/12] tracing: Fix strncpy warning in trace_events_synth.c Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 10/12] tracing: Have TRACE_DEFINE_ENUM affect trace event types as well Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 11/12] tracing: Add snapshot at end of kernel boot up Steven Rostedt
2022-03-12 23:25 ` [for-next][PATCH 12/12] tracing/user_events: Use alloc_pages instead of kzalloc() for register pages Steven Rostedt
