linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them)
@ 2020-06-01 12:27 Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 01/12] ftrace: show debugging information when panic_on_warn set Steven Rostedt
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
for-next

Head SHA1: c200784a08d4ea82f82a30678955b7f2c7550af4


Cheng Jian (1):
      ftrace: show debugging information when panic_on_warn set

Julia Lawall (1):
      tracing/probe: reverse arguments to list_add

Masami Hiramatsu (1):
      tools/bootconfig: Add a summary of test cases and return error

Peter Zijlstra (1):
      ftrace,bug: Improve traceoff_on_warn

Steven Rostedt (VMware) (2):
      tracing/doc: Fix typos in histogram-design.rst
      tracing: Add a trace print when traceoff_on_warning is triggered

Tom Zanussi (6):
      tracing: Check state.disabled in synth event trace functions
      tracing: Add histogram-design document
      tracing: Add hist_debug trace event files for histogram debugging
      tracing: Fix events.rst section numbering
      tracing: Move synthetic events to a separate file
      selftests/ftrace: Distinguish between hist and synthetic event checks

----
 Documentation/trace/events.rst                     |   28 +-
 Documentation/trace/histogram-design.rst           | 2115 +++++++++++++++++++
 kernel/trace/Kconfig                               |   43 +-
 kernel/trace/Makefile                              |    1 +
 kernel/trace/ftrace.c                              |    8 +-
 kernel/trace/trace.c                               |    5 +-
 kernel/trace/trace.h                               |    1 +
 kernel/trace/trace_events.c                        |    4 +
 kernel/trace/trace_events_hist.c                   | 2193 +++-----------------
 kernel/trace/trace_events_synth.c                  | 1789 ++++++++++++++++
 kernel/trace/trace_probe.c                         |    2 +-
 kernel/trace/trace_synth.h                         |   36 +
 lib/bug.c                                          |    3 +
 tools/bootconfig/test-bootconfig.sh                |    7 +
 .../inter-event/trigger-field-variable-support.tc  |    5 +
 .../trigger-inter-event-combined-hist.tc           |    5 +
 .../inter-event/trigger-multi-actions-accept.tc    |    5 +
 .../inter-event/trigger-onmatch-action-hist.tc     |    5 +
 .../trigger-onmatch-onmax-action-hist.tc           |    5 +
 .../inter-event/trigger-onmax-action-hist.tc       |    5 +
 .../inter-event/trigger-snapshot-action-hist.tc    |    5 +
 .../inter-event/trigger-trace-action-hist.tc       |    5 +
 22 files changed, 4396 insertions(+), 1879 deletions(-)
 create mode 100644 Documentation/trace/histogram-design.rst
 create mode 100644 kernel/trace/trace_events_synth.c
 create mode 100644 kernel/trace/trace_synth.h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [for-next][PATCH 01/12] ftrace: show debugging information when panic_on_warn set
  2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
@ 2020-06-01 12:27 ` Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 02/12] tools/bootconfig: Add a summary of test cases and return error Steven Rostedt
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Cheng Jian

From: Cheng Jian <cj.chengjian@huawei.com>

When an anomaly is detected in the function call modification
code, ftrace_bug() is called to disable function tracing as well
as give some warn and information that may help debug the problem.

But currently, we call FTRACE_WARN_ON_ONCE() first in ftrace_bug(),
so when panic_on_warn is set, we can't see the debugging information
here. Call FTRACE_WARN_ON_ONCE() at the end of ftrace_bug() to ensure
that the debugging information is displayed first.

after this patch, the dmesg looks like:

	------------[ ftrace bug ]------------
	ftrace failed to modify
	[<ffff800010081004>] bcm2835_handle_irq+0x4/0x58
	 actual:   1f:20:03:d5
	Setting ftrace call site to call ftrace function
	ftrace record flags: 80000001
	 (1)
	 expected tramp: ffff80001009d6f0
	------------[ cut here ]------------
	WARNING: CPU: 2 PID: 1635 at kernel/trace/ftrace.c:2078 ftrace_bug+0x204/0x238
	Kernel panic - not syncing: panic_on_warn set ...
	CPU: 2 PID: 1635 Comm: sh Not tainted 5.7.0-rc5-00033-gb922183867f5 #14
	Hardware name: linux,dummy-virt (DT)
	Call trace:
	 dump_backtrace+0x0/0x1b0
	 show_stack+0x20/0x30
	 dump_stack+0xc0/0x10c
	 panic+0x16c/0x368
	 __warn+0x120/0x160
	 report_bug+0xc8/0x160
	 bug_handler+0x28/0x98
	 brk_handler+0x70/0xd0
	 do_debug_exception+0xcc/0x1ac
	 el1_sync_handler+0xe4/0x120
	 el1_sync+0x7c/0x100
	 ftrace_bug+0x204/0x238

Link: https://lkml.kernel.org/r/20200515100828.7091-1-cj.chengjian@huawei.com

Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index bd030b1b9514..cd39cbf3631a 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2027,14 +2027,14 @@ void ftrace_bug(int failed, struct dyn_ftrace *rec)
 {
 	unsigned long ip = rec ? rec->ip : 0;
 
+	pr_info("------------[ ftrace bug ]------------\n");
+
 	switch (failed) {
 	case -EFAULT:
-		FTRACE_WARN_ON_ONCE(1);
 		pr_info("ftrace faulted on modifying ");
 		print_ip_sym(ip);
 		break;
 	case -EINVAL:
-		FTRACE_WARN_ON_ONCE(1);
 		pr_info("ftrace failed to modify ");
 		print_ip_sym(ip);
 		print_ip_ins(" actual:   ", (unsigned char *)ip);
@@ -2045,12 +2045,10 @@ void ftrace_bug(int failed, struct dyn_ftrace *rec)
 		}
 		break;
 	case -EPERM:
-		FTRACE_WARN_ON_ONCE(1);
 		pr_info("ftrace faulted on writing ");
 		print_ip_sym(ip);
 		break;
 	default:
-		FTRACE_WARN_ON_ONCE(1);
 		pr_info("ftrace faulted on unknown error ");
 		print_ip_sym(ip);
 	}
@@ -2077,6 +2075,8 @@ void ftrace_bug(int failed, struct dyn_ftrace *rec)
 		ip = ftrace_get_addr_curr(rec);
 		pr_cont("\n expected tramp: %lx\n", ip);
 	}
+
+	FTRACE_WARN_ON_ONCE(1);
 }
 
 static int ftrace_check_record(struct dyn_ftrace *rec, bool enable, bool update)
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [for-next][PATCH 02/12] tools/bootconfig: Add a summary of test cases and return error
  2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 01/12] ftrace: show debugging information when panic_on_warn set Steven Rostedt
@ 2020-06-01 12:27 ` Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 03/12] tracing/probe: reverse arguments to list_add Steven Rostedt
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu

From: Masami Hiramatsu <mhiramat@kernel.org>

Add summary lines of test cases and return an error
code if any test case fails so that tester don't have
to monitor the output.

Link: https://lkml.kernel.org/r/158898590533.22749.10269622752797822320.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 tools/bootconfig/test-bootconfig.sh | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/bootconfig/test-bootconfig.sh b/tools/bootconfig/test-bootconfig.sh
index 81b350ffd03f..eff16b77d5eb 100755
--- a/tools/bootconfig/test-bootconfig.sh
+++ b/tools/bootconfig/test-bootconfig.sh
@@ -124,9 +124,16 @@ for i in samples/good-* ; do
   xpass $BOOTCONF -a $i $INITRD
 done
 
+
+echo
+echo "=== Summary ==="
+echo "# of Passed: $(expr $NO - $NG - 1)"
+echo "# of Failed: $NG"
+
 echo
 if [ $NG -eq 0 ]; then
 	echo "All tests passed"
 else
 	echo "$NG tests failed"
+	exit 1
 fi
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [for-next][PATCH 03/12] tracing/probe: reverse arguments to list_add
  2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 01/12] ftrace: show debugging information when panic_on_warn set Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 02/12] tools/bootconfig: Add a summary of test cases and return error Steven Rostedt
@ 2020-06-01 12:27 ` Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 04/12] tracing: Check state.disabled in synth event trace functions Steven Rostedt
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Julia Lawall

From: Julia Lawall <Julia.Lawall@inria.fr>

Elsewhere in the file, the function trace_kprobe_has_same_kprobe uses
a trace_probe_event.probes object as the second argument of
list_for_each_entry, ie as a list head, while the list_for_each_entry
iterates over the list fields of the trace_probe structures, making
them the list elements.  So, exchange the arguments on the list_add
call to put the list head in the second argument.

Since both list_head structures were just initialized, this problem
did not cause any loss of information.

Link: https://lkml.kernel.org/r/1588879808-24488-1-git-send-email-Julia.Lawall@inria.fr

Fixes: 60d53e2c3b75 ("tracing/probe: Split trace_event related data from trace_probe")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_probe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index ab8b6436d53f..b8a928e925c7 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -1006,7 +1006,7 @@ int trace_probe_init(struct trace_probe *tp, const char *event,
 	INIT_LIST_HEAD(&tp->event->class.fields);
 	INIT_LIST_HEAD(&tp->event->probes);
 	INIT_LIST_HEAD(&tp->list);
-	list_add(&tp->event->probes, &tp->list);
+	list_add(&tp->list, &tp->event->probes);
 
 	call = trace_probe_event_call(tp);
 	call->class = &tp->event->class;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [for-next][PATCH 04/12] tracing: Check state.disabled in synth event trace functions
  2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
                   ` (2 preceding siblings ...)
  2020-06-01 12:27 ` [for-next][PATCH 03/12] tracing/probe: reverse arguments to list_add Steven Rostedt
@ 2020-06-01 12:27 ` Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 05/12] tracing: Add histogram-design document Steven Rostedt
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Tom Zanussi

From: Tom Zanussi <zanussi@kernel.org>

Since trace_state.disabled is set in __synth_event_trace_start() at
the same time -ENOENT is returned, don't bother returning -ENOENT -
just have callers check trace_state.disabled instead, and avoid the
extra return val munging.

Link: http://lkml.kernel.org/r/87315c3889af870e8370e82b76cf48b426d70130.1585941485.git.zanussi@kernel.org

Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@godmis.org>
---
 kernel/trace/trace_events_hist.c | 19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index fcab11cc6833..5cfe8f998b3e 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -1845,7 +1845,6 @@ __synth_event_trace_start(struct trace_event_file *file,
 	if (!(file->flags & EVENT_FILE_FL_ENABLED) ||
 	    trace_trigger_soft_disabled(file)) {
 		trace_state->disabled = true;
-		ret = -ENOENT;
 		goto out;
 	}
 
@@ -1907,11 +1906,8 @@ int synth_event_trace(struct trace_event_file *file, unsigned int n_vals, ...)
 	int ret;
 
 	ret = __synth_event_trace_start(file, &state);
-	if (ret) {
-		if (ret == -ENOENT)
-			ret = 0; /* just disabled, not really an error */
+	if (ret || state.disabled)
 		return ret;
-	}
 
 	if (n_vals != state.event->n_fields) {
 		ret = -EINVAL;
@@ -1987,11 +1983,8 @@ int synth_event_trace_array(struct trace_event_file *file, u64 *vals,
 	int ret;
 
 	ret = __synth_event_trace_start(file, &state);
-	if (ret) {
-		if (ret == -ENOENT)
-			ret = 0; /* just disabled, not really an error */
+	if (ret || state.disabled)
 		return ret;
-	}
 
 	if (n_vals != state.event->n_fields) {
 		ret = -EINVAL;
@@ -2067,16 +2060,10 @@ EXPORT_SYMBOL_GPL(synth_event_trace_array);
 int synth_event_trace_start(struct trace_event_file *file,
 			    struct synth_event_trace_state *trace_state)
 {
-	int ret;
-
 	if (!trace_state)
 		return -EINVAL;
 
-	ret = __synth_event_trace_start(file, trace_state);
-	if (ret == -ENOENT)
-		ret = 0; /* just disabled, not really an error */
-
-	return ret;
+	return __synth_event_trace_start(file, trace_state);
 }
 EXPORT_SYMBOL_GPL(synth_event_trace_start);
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [for-next][PATCH 05/12] tracing: Add histogram-design document
  2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
                   ` (3 preceding siblings ...)
  2020-06-01 12:27 ` [for-next][PATCH 04/12] tracing: Check state.disabled in synth event trace functions Steven Rostedt
@ 2020-06-01 12:27 ` Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 06/12] tracing: Add hist_debug trace event files for histogram debugging Steven Rostedt
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Tom Zanussi

From: Tom Zanussi <zanussi@kernel.org>

Add a new Documentation/trace/histogram-design.rst file describing the
ftrace histogram low-level design, meant to help developers trying to
understand the internals when extending or making use of the hist
triggers for higher-level tools.

This documentation refers to the hist_debug files implemented by
'tracing: Add hist_debug trace event files for histogram debugging' so
users wishing to try out the test examples here should make sure
CONFIG_HIST_TRIGGERS_DEBUG is enabled.

Link: http://lkml.kernel.org/r/256b29c3274bb89a10157c4a8d1a8bce7e74849e.1585941485.git.zanussi@kernel.org

Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 Documentation/trace/histogram-design.rst | 2115 ++++++++++++++++++++++
 1 file changed, 2115 insertions(+)
 create mode 100644 Documentation/trace/histogram-design.rst

diff --git a/Documentation/trace/histogram-design.rst b/Documentation/trace/histogram-design.rst
new file mode 100644
index 000000000000..c246753f0ffc
--- /dev/null
+++ b/Documentation/trace/histogram-design.rst
@@ -0,0 +1,2115 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================
+Histogram Design Notes
+======================
+
+:Author: Tom Zanussi <zanussi@kernel.org>
+
+This document attempts to provide a description of how the ftrace
+histograms work and how the individual pieces map to the data
+structures used to implement them in trace_events_hist.c and
+tracing_map.c.
+
+Note: All the ftrace histogram command examples assume the working
+      directory is the ftrace /tracing directory. For example::
+
+  # cd /sys/kernel/debug/tracing
+
+Also, the histogram output displayed for those commands will be
+generally be truncated - only enough to make the point is displayed.
+
+'hist_debug' trace event files
+==============================
+
+If the kernel is compiled with CONFIG_HIST_TRIGGERS_DEBUG set, an
+event file named 'hist_debug' will appear in each event's
+subdirectory.  This file can be read at any time and will display some
+of the hist trigger internals described in this document. Specific
+examples and output will be described in test cases below.
+
+Basic histograms
+================
+
+First, basic histograms.  Below is pretty much the simplest thing you
+can do with histograms - create one with a single key on a single
+event and cat the output::
+
+  # echo 'hist:keys=pid' >> events/sched/sched_waking/trigger
+
+  # cat events/sched/sched_waking/hist
+
+  { pid:      18249 } hitcount:          1
+  { pid:      13399 } hitcount:          1
+  { pid:      17973 } hitcount:          1
+  { pid:      12572 } hitcount:          1
+  ...
+  { pid:         10 } hitcount:        921
+  { pid:      18255 } hitcount:       1444
+  { pid:      25526 } hitcount:       2055
+  { pid:       5257 } hitcount:       2055
+  { pid:      27367 } hitcount:       2055
+  { pid:       1728 } hitcount:       2161
+
+  Totals:
+    Hits: 21305
+    Entries: 183
+    Dropped: 0
+
+What this does is create a histogram on the sched_waking event using
+pid as a key and with a single value, hitcount, which even if not
+explicitly specified, exists for every histogram regardless.
+
+The hitcount value is a per-bucket value that's automatically
+incremented on every hit for the given key, which in this case is the
+pid.
+
+So in this histogram, there's a separate bucket for each pid, and each
+bucket contains a value for that bucket, counting the number of times
+sched_waking was called for that pid.
+
+Each histogram is represented by a hist_data struct.
+
+To keep track of each key and value field in the histogram, hist_data
+keeps an array of these fields named fields[].  The fields[] array is
+an array containing struct hist_field representations of each
+histogram val and key in the histogram (variables are also included
+here, but are discussed later). So for the above histogram we have one
+key and one value; in this case the one value is the hitcount value,
+which all histograms have, regardless of whether they define that
+value or not, which the above histogram does not.
+
+Each struct hist_field contains a pointer to the ftrace_event_field
+from the event's trace_event_file along with various bits related to
+that such as the size, offset, type, and a hist_field_fn_t function,
+which is used to grab the field's data from the ftrace event buffer
+(in most cases - some hist_fields such as hitcount don't directly map
+to an event field in the trace buffer - in these cases the function
+implementation gets its value from somewhere else).  The flags field
+indicates which type of field it is - key, value, variable, variable
+reference, etc., with value being the default.
+
+The other important hist_data data structure in addition to the
+fields[] array is the tracing_map instance created for the histogram,
+which is held in the .map member.  The tracing_map implements the
+lock-free hash table used to implement histograms (see
+kernel/trace/tracing_map.h for much more discussion about the
+low-level data structures implementing the tracing_map).  For the
+purposes of this discussion, the tracing_map contains a number of
+buckets, each bucket corresponding to a particular tracing_map_elt
+object hashed by a given histogram key.
+
+Below is a diagram the first part of which describes the hist_data and
+associated key and value fields for the histogram described above.  As
+you can see, there are two fields in the fields array, one val field
+for the hitcount and one key field for the pid key.
+
+Below that is a diagram of a run-time snapshot of what the tracing_map
+might look like for a given run.  It attempts to show the
+relationships between the hist_data fields and the tracing_map
+elements for a couple hypothetical keys and values.
+
+  +------------------+
+  | hist_data        |
+  +------------------+     +----------------+
+    | .fields[]      |---->| val = hitcount |----------------------------+
+    +----------------+     +----------------+                            |
+    | .map           |       | .size        |                            |
+    +----------------+       +--------------+                            |
+                             | .offset      |                            |
+                             +--------------+                            |
+                             | .fn()        |                            |
+                             +--------------+                            |
+                                   .                                     |
+                                   .                                     |
+                                   .                                     |
+                           +----------------+ <--- n_vals                |
+                           | key = pid      |----------------------------|--+
+                           +----------------+                            |  |
+                             | .size        |                            |  |
+                             +--------------+                            |  |
+                             | .offset      |                            |  |
+                             +--------------+                            |  |
+                             | .fn()        |                            |  |
+                           +----------------+ <--- n_fields              |  |
+                           | unused         |                            |  |
+                           +----------------+                            |  |
+                             |              |                            |  |
+                             +--------------+                            |  |
+                             |              |                            |  |
+                             +--------------+                            |  |
+                             |              |                            |  |
+                             +--------------+                            |  |
+                                            n_keys = n_fields - n_vals   |  |
+		                                                         |  |
+The hist_data n_vals and n_fields delineate the extent of the fields[]   |  |
+array and separate keys from values for the rest of the code.            |  |
+		                                                         |  |
+Below is a run-time representation of the tracing_map part of the        |  |
+histogram, with pointers from various parts of the fields[] array        |  |
+to corresponding parts of the tracing_map.                               |  |
+		                                                         |  |
+The tracing_map consists of an array of tracing_map_entrys and a set     |  |
+of preallocated tracing_map_elts (abbreviated below as map_entry and     |  |
+map_elt).  The total number of map_entrys in the hist_data.map array =   |  |
+map->max_elts (actually map->map_size but only max_elts of those are     |  |
+used.  This is a property required by the map_insert() algorithm).       |  |
+		                                                         |  |
+If a map_entry is unused, meaning no key has yet hashed into it, its     |  |
+.key value is 0 and its .val pointer is NULL.  Once a map_entry has      |  |
+been claimed, the .key value contains the key's hash value and the       |  |
+.val member points to a map_elt containing the full key and an entry     |  |
+for each key or value in the map_elt.fields[] array.  There is an        |  |
+entry in the map_elt.fields[] array corresponding to each hist_field     |  |
+in the histogram, and this is where the continually aggregated sums      |  |
+corresponding to each histogram value are kept.                          |  |
+		                                                         |  |
+The diagram attempts to show the relationship between the                |  |
+hist_data.fields[] and the map_elt.fields[] with the links drawn         |  |
+between diagrams::                                                       |  |
+		                                                         |  |
+  +-----------+		                                                 |  |
+  | hist_data |		                                                 |  |
+  +-----------+		                                                 |  |
+    | .fields |		                                                 |  |
+    +---------+     +-----------+		                         |  |
+    | .map    |---->| map_entry |		                         |  |
+    +---------+     +-----------+		                         |  |
+                      | .key    |---> 0		                         |  |
+                      +---------+		                         |  |
+                      | .val    |---> NULL		                 |  |
+                    +-----------+                                        |  |
+                    | map_entry |                                        |  |
+                    +-----------+                                        |  |
+                      | .key    |---> pid = 999                          |  |
+                      +---------+    +-----------+                       |  |
+                      | .val    |--->| map_elt   |                       |  |
+                      +---------+    +-----------+                       |  |
+                           .           | .key    |---> full key *        |  |
+                           .           +---------+    +---------------+  |  |
+			   .           | .fields |--->| .sum (val)    |<-+  |
+                    +-----------+      +---------+    | 2345          |  |  |
+                    | map_entry |                     +---------------+  |  |
+                    +-----------+                     | .offset (key) |<----+
+                      | .key    |---> 0               | 0             |  |  |
+                      +---------+                     +---------------+  |  |
+                      | .val    |---> NULL                    .          |  |
+                    +-----------+                             .          |  |
+                    | map_entry |                             .          |  |
+                    +-----------+                     +---------------+  |  |
+                      | .key    |                     | .sum (val) or |  |  |
+                      +---------+    +---------+      | .offset (key) |  |  |
+                      | .val    |--->| map_elt |      +---------------+  |  |
+                    +-----------+    +---------+      | .sum (val) or |  |  |
+                    | map_entry |                     | .offset (key) |  |  |
+                    +-----------+                     +---------------+  |  |
+                      | .key    |---> pid = 4444                         |  |
+                      +---------+    +-----------+                       |  |
+                      | .val    |    | map_elt   |                       |  |
+                      +---------+    +-----------+                       |  |
+                                       | .key    |---> full key *        |  |
+                                       +---------+    +---------------+  |  |
+			               | .fields |--->| .sum (val)    |<-+  |
+                                       +---------+    | 65523         |     |
+                                                      +---------------+     |
+                                                      | .offset (key) |<----+
+                                                      | 0             |
+                                                      +---------------+
+                                                              .
+                                                              .
+                                                              .
+                                                      +---------------+
+                                                      | .sum (val) or |
+                                                      | .offset (key) |
+                                                      +---------------+
+                                                      | .sum (val) or |
+                                                      | .offset (key) |
+                                                      +---------------+
+
+Abbreviations used in the diagrams::
+
+  hist_data = struct hist_trigger_data
+  hist_data.fields = struct hist_field
+  fn = hist_field_fn_t
+  map_entry = struct tracing_map_entry
+  map_elt = struct tracing_map_elt
+  map_elt.fields = struct tracing_map_field
+
+Whenever a new event occurs and it has a hist trigger associated with
+it, event_hist_trigger() is called.  event_hist_trigger() first deals
+with the key: for each subkey in the key (in the above example, there
+is just one subkey corresponding to pid), the hist_field that
+represents that subkey is retrieved from hist_data.fields[] and the
+hist_field_fn_t fn() associated with that field, along with the
+field's size and offset, is used to grab that subkey's data from the
+current trace record.
+
+Once the complete key has been retrieved, it's used to look that key
+up in the tracing_map.  If there's no tracing_map_elt associated with
+that key, an empty one is claimed and inserted in the map for the new
+key.  In either case, the tracing_map_elt associated with that key is
+returned.
+
+Once a tracing_map_elt available, hist_trigger_elt_update() is called.
+As the name implies, this updates the element, which basically means
+updating the element's fields.  There's a tracing_map_field associated
+with each key and value in the histogram, and each of these correspond
+to the key and value hist_fields created when the histogram was
+created.  hist_trigger_elt_update() goes through each value hist_field
+and, as for the keys, uses the hist_field's fn() and size and offset
+to grab the field's value from the current trace record.  Once it has
+that value, it simply adds that value to that field's
+continually-updated tracing_map_field.sum member.  Some hist_field
+fn()s, such as for the hitcount, don't actually grab anything from the
+trace record (the hitcount fn() just increments the counter sum by 1),
+but the idea is the same.
+
+Once all the values have been updated, hist_trigger_elt_update() is
+done and returns.  Note that there are also tracing_map_fields for
+each subkey in the key, but hist_trigger_elt_update() doesn't look at
+them or update anything - those exist only for sorting, which can
+happen later.
+
+Basic histogram test
+--------------------
+
+This is a good example to try.  It produces 3 value fields and 2 key
+fields in the output::
+
+  # echo 'hist:keys=common_pid,call_site.sym:values=bytes_req,bytes_alloc,hitcount' >> events/kmem/kmalloc/trigger
+
+To see the debug data, cat the kmem/kmalloc's 'hist_debug' file. It
+will show the trigger info of the histogram it corresponds to, along
+with the address of the hist_data associated with the histogram, which
+will become useful in later examples.  It then displays the number of
+total hist_fields associated with the histogram along with a count of
+how many of those correspond to keys and how many correspond to values.
+
+It then goes on to display details for each field, including the
+field's flags and the position of each field in the hist_data's
+fields[] array, which is useful information for verifying that things
+internally appear correct or not, and which again will become even
+more useful in further examples::
+
+  # cat events/kmem/kmalloc/hist_debug
+
+  # event histogram
+  #
+  # trigger info: hist:keys=common_pid,call_site.sym:vals=hitcount,bytes_req,bytes_alloc:sort=hitcount:size=2048 [active]
+  #
+
+  hist_data: 000000005e48c9a5
+
+  n_vals: 3
+  n_keys: 2
+  n_fields: 5
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        VAL: normal u64 value
+      ftrace_event_field name: bytes_req
+      type: size_t
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[2]:
+      flags:
+        VAL: normal u64 value
+      ftrace_event_field name: bytes_alloc
+      type: size_t
+      size: 8
+      is_signed: 0
+
+  key fields:
+
+    hist_data->fields[3]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: common_pid
+      type: int
+      size: 8
+      is_signed: 1
+
+    hist_data->fields[4]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: call_site
+      type: unsigned long
+      size: 8
+      is_signed: 0
+
+The commands below can be used to clean things up for the next test::
+
+  # echo '!hist:keys=common_pid,call_site.sym:values=bytes_req,bytes_alloc,hitcount' >> events/kmem/kmalloc/trigger
+
+Variables
+=========
+
+Variables allow data from one hist trigger to be saved by one hist
+trigger and retrieved by another hist trigger.  For example, a trigger
+on the sched_waking event can capture a timestamp for a particular
+pid, and later a sched_switch event that switches to that pid event
+can grab the timestamp and use it to calculate a time delta between
+the two events::
+
+  # echo 'hist:keys=pid:ts0=common_timestamp.usecs' >>
+          events/sched/sched_waking/trigger
+
+  # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0' >>
+          events/sched/sched_switch/trigger
+
+In terms of the histogram data structures, variables are implemented
+as another type of hist_field and for a given hist trigger are added
+to the hist_data.fields[] array just after all the val fields.  To
+distinguish them from the existing key and val fields, they're given a
+new flag type, HIST_FIELD_FL_VAR (abbreviated FL_VAR) and they also
+make use of a new .var.idx field member in struct hist_field, which
+maps them to an index in a new map_elt.vars[] array added to the
+map_elt specifically designed to store and retrieve variable values.
+The diagram below shows those new elements and adds a new variable
+entry, ts0, corresponding to the ts0 variable in the sched_waking
+trigger above.
+
+sched_waking histogram
+----------------------
+
+  +------------------+
+  | hist_data        |<-------------------------------------------------------+
+  +------------------+   +-------------------+                                |
+    | .fields[]      |-->| val = hitcount    |                                |
+    +----------------+   +-------------------+                                |
+    | .map           |     | .size           |                                |
+    +----------------+     +-----------------+                                |
+                           | .offset         |                                |
+                           +-----------------+                                |
+                           | .fn()           |                                |
+                           +-----------------+                                |
+                           | .flags          |                                |
+                           +-----------------+                                |
+                           | .var.idx        |                                |
+                         +-------------------+                                |
+                         | var = ts0         |                                |
+                         +-------------------+                                |
+                           | .size           |                                |
+                           +-----------------+                                |
+                           | .offset         |                                |
+                           +-----------------+                                |
+                           | .fn()           |                                |
+                           +-----------------+                                |
+                           | .flags & FL_VAR |                                |
+                           +-----------------+                                |
+                           | .var.idx        |----------------------------+-+ |
+                           +-----------------+                            | | |
+			            .                                     | | |
+				    .                                     | | |
+                                    .                                     | | |
+                         +-------------------+ <--- n_vals                | | |
+                         | key = pid         |                            | | |
+                         +-------------------+                            | | |
+                           | .size           |                            | | |
+                           +-----------------+                            | | |
+                           | .offset         |                            | | |
+                           +-----------------+                            | | |
+                           | .fn()           |                            | | |
+                           +-----------------+                            | | |
+                           | .flags & FL_KEY |                            | | |
+                           +-----------------+                            | | |
+                           | .var.idx        |                            | | |
+                         +-------------------+ <--- n_fields              | | |
+                         | unused            |                            | | |
+                         +-------------------+                            | | |
+                           |                 |                            | | |
+                           +-----------------+                            | | |
+                           |                 |                            | | |
+                           +-----------------+                            | | |
+                           |                 |                            | | |
+                           +-----------------+                            | | |
+                           |                 |                            | | |
+                           +-----------------+                            | | |
+                           |                 |                            | | |
+                           +-----------------+                            | | |
+                                             n_keys = n_fields - n_vals   | | |
+                                                                          | | |
+                                                                          | | |
+This is very similar to the basic case.  In the above diagram, we can     | | |
+see a new .flags member has been added to the struct hist_field           | | |
+struct, and a new entry added to hist_data.fields representing the ts0    | | |
+variable.  For a normal val hist_field, .flags is just 0 (modulo          | | |
+modifier flags), but if the value is defined as a variable, the .flags    | | |
+contains a set FL_VAR bit.                                                | | |
+                                                                          | | |
+As you can see, the ts0 entry's .var.idx member contains the index        | | |
+into the tracing_map_elts' .vars[] array containing variable values.      | | |
+This idx is used whenever the value of the variable is set or read.       | | |
+The map_elt.vars idx assigned to the given variable is assigned and       | | |
+saved in .var.idx by create_tracing_map_fields() after it calls           | | |
+tracing_map_add_var().                                                    | | |
+                                                                          | | |
+Below is a representation of the histogram at run-time, which             | | |
+populates the map, along with correspondence to the above hist_data and   | | |
+hist_field data structures.                                               | | |
+                                                                          | | |
+The diagram attempts to show the relationship between the                 | | |
+hist_data.fields[] and the map_elt.fields[] and map_elt.vars[] with       | | |
+the links drawn between diagrams.  For each of the map_elts, you can      | | |
+see that the .fields[] members point to the .sum or .offset of a key      | | |
+or val and the .vars[] members point to the value of a variable.  The     | | |
+arrows between the two diagrams show the linkages between those           | | |
+tracing_map members and the field definitions in the corresponding        | | |
+hist_data fields[] members.                                               | | |
+                                                                          | | |
+  +-----------+		                                                  | | |
+  | hist_data |		                                                  | | |
+  +-----------+		                                                  | | |
+    | .fields |		                                                  | | |
+    +---------+     +-----------+		                          | | |
+    | .map    |---->| map_entry |		                          | | |
+    +---------+     +-----------+		                          | | |
+                      | .key    |---> 0		                          | | |
+                      +---------+		                          | | |
+                      | .val    |---> NULL		                  | | |
+                    +-----------+                                         | | |
+                    | map_entry |                                         | | |
+                    +-----------+                                         | | |
+                      | .key    |---> pid = 999                           | | |
+                      +---------+    +-----------+                        | | |
+                      | .val    |--->| map_elt   |                        | | |
+                      +---------+    +-----------+                        | | |
+                           .           | .key    |---> full key *         | | |
+                           .           +---------+    +---------------+   | | |
+			   .           | .fields |--->| .sum (val)    |   | | |
+                           .           +---------+    | 2345          |   | | |
+                           .        +--| .vars   |    +---------------+   | | |
+                           .        |  +---------+    | .offset (key) |   | | |
+                           .        |                 | 0             |   | | |
+                           .        |                 +---------------+   | | |
+                           .        |                         .           | | |
+                           .        |                         .           | | |
+                           .        |                         .           | | |
+                           .        |                 +---------------+   | | |
+                           .        |                 | .sum (val) or |   | | |
+                           .        |                 | .offset (key) |   | | |
+                           .        |                 +---------------+   | | |
+                           .        |                 | .sum (val) or |   | | |
+                           .        |                 | .offset (key) |   | | |
+                           .        |                 +---------------+   | | |
+                           .        |                                     | | |
+                           .        +---------------->+---------------+   | | |
+			   .                          | ts0           |<--+ | |
+                           .                          | 113345679876  |   | | |
+                           .                          +---------------+   | | |
+                           .                          | unused        |   | | |
+                           .                          |               |   | | |
+                           .                          +---------------+   | | |
+                           .                                  .           | | |
+                           .                                  .           | | |
+                           .                                  .           | | |
+                           .                          +---------------+   | | |
+                           .                          | unused        |   | | |
+                           .                          |               |   | | |
+                           .                          +---------------+   | | |
+                           .                          | unused        |   | | |
+                           .                          |               |   | | |
+                           .                          +---------------+   | | |
+                           .                                              | | |
+                    +-----------+                                         | | |
+                    | map_entry |                                         | | |
+                    +-----------+                                         | | |
+                      | .key    |---> pid = 4444                          | | |
+                      +---------+    +-----------+                        | | |
+                      | .val    |--->| map_elt   |                        | | |
+                      +---------+    +-----------+                        | | |
+                           .           | .key    |---> full key *         | | |
+                           .           +---------+    +---------------+   | | |
+			   .           | .fields |--->| .sum (val)    |   | | |
+                                       +---------+    | 2345          |   | | |
+                                    +--| .vars   |    +---------------+   | | |
+                                    |  +---------+    | .offset (key) |   | | |
+                                    |                 | 0             |   | | |
+                                    |                 +---------------+   | | |
+                                    |                         .           | | |
+                                    |                         .           | | |
+                                    |                         .           | | |
+                                    |                 +---------------+   | | |
+                                    |                 | .sum (val) or |   | | |
+                                    |                 | .offset (key) |   | | |
+                                    |                 +---------------+   | | |
+                                    |                 | .sum (val) or |   | | |
+                                    |                 | .offset (key) |   | | |
+                                    |                 +---------------+   | | |
+                                    |                                     | | |
+                                    |                 +---------------+   | | |
+			            +---------------->| ts0           |<--+ | |
+                                                      | 213499240729  |     | |
+                                                      +---------------+     | |
+                                                      | unused        |     | |
+                                                      |               |     | |
+                                                      +---------------+     | |
+                                                              .             | |
+                                                              .             | |
+                                                              .             | |
+                                                      +---------------+     | |
+                                                      | unused        |     | |
+                                                      |               |     | |
+                                                      +---------------+     | |
+                                                      | unused        |     | |
+                                                      |               |     | |
+                                                      +---------------+     | |
+                                                                            | |
+For each used map entry, there's a map_elt pointing to an array of          | |
+.vars containing the current value of the variables associated with         | |
+that histogram entry.  So in the above, the timestamp associated with       | |
+pid 999 is 113345679876, and the timestamp variable in the same             | |
+.var.idx for pid 4444 is 213499240729.                                      | |
+                                                                            | |
+sched_switch histogram                                                      | |
+----------------------                                                      | |
+                                                                            | |
+The sched_switch histogram paired with the above sched_waking               | |
+histogram is shown below.  The most important aspect of the                 | |
+sched_switch histogram is that it references a variable on the              | |
+sched_waking histogram above.                                               | |
+                                                                            | |
+The histogram diagram is very similar to the others so far displayed,       | |
+but it adds variable references.  You can see the normal hitcount and       | |
+key fields along with a new wakeup_lat variable implemented in the          | |
+same way as the sched_waking ts0 variable, but in addition there's an       | |
+entry with the new FL_VAR_REF (short for HIST_FIELD_FL_VAR_REF) flag.       | |
+                                                                            | |
+Associated with the new var ref field are a couple of new hist_field        | |
+members, var.hist_data and var_ref_idx.  For a variable reference, the      | |
+var.hist_data goes with the var.idx, which together uniquely identify       | |
+a particular variable on a particular histogram.  The var_ref_idx is        | |
+just the index into the var_ref_vals[] array that caches the values of      | |
+each variable whenever a hist trigger is updated.  Those resulting          | |
+values are then finally accessed by other code such as trace action         | |
+code that uses the var_ref_idx values to assign param values.               | |
+                                                                            | |
+The diagram below describes the situation for the sched_switch              | |
+histogram referred to before:                                               | |
+                                                                            | |
+  # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0' >>     | |
+          events/sched/sched_switch/trigger                                 | |
+                                                                            | |
+  +------------------+                                                      | |
+  | hist_data        |                                                      | |
+  +------------------+   +-----------------------+                          | |
+    | .fields[]      |-->| val = hitcount        |                          | |
+    +----------------+   +-----------------------+                          | |
+    | .map           |     | .size               |                          | |
+    +----------------+     +---------------------+                          | |
+ +--| .var_refs[]    |     | .offset             |                          | |
+ |  +----------------+     +---------------------+                          | |
+ |                         | .fn()               |                          | |
+ |   var_ref_vals[]        +---------------------+                          | |
+ |  +-------------+        | .flags              |                          | |
+ |  | $ts0        |<---+   +---------------------+                          | |
+ |  +-------------+    |   | .var.idx            |                          | |
+ |  |             |    |   +---------------------+                          | |
+ |  +-------------+    |   | .var.hist_data      |                          | |
+ |  |             |    |   +---------------------+                          | |
+ |  +-------------+    |   | .var_ref_idx        |                          | |
+ |  |             |    | +-----------------------+                          | |
+ |  +-------------+    | | var = wakeup_lat      |                          | |
+ |         .           | +-----------------------+                          | |
+ |         .           |   | .size               |                          | |
+ |         .           |   +---------------------+                          | |
+ |  +-------------+    |   | .offset             |                          | |
+ |  |             |    |   +---------------------+                          | |
+ |  +-------------+    |   | .fn()               |                          | |
+ |  |             |    |   +---------------------+                          | |
+ |  +-------------+    |   | .flags & FL_VAR     |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |   | .var.idx            |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |   | .var.hist_data      |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |   | .var_ref_idx        |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |             .                                      | |
+ |                     |             .                                      | |
+ |                     |             .                                      | |
+ |                     | +-----------------------+ <--- n_vals              | |
+ |                     | | key = pid             |                          | |
+ |                     | +-----------------------+                          | |
+ |                     |   | .size               |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |   | .offset             |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |   | .fn()               |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |   | .flags              |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |   | .var.idx            |                          | |
+ |                     | +-----------------------+ <--- n_fields            | |
+ |                     | | unused                |                          | |
+ |                     | +-----------------------+                          | |
+ |                     |   |                     |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |   |                     |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |   |                     |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |   |                     |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |   |                     |                          | |
+ |                     |   +---------------------+                          | |
+ |                     |                         n_keys = n_fields - n_vals | |
+ |                     |                                                    | |
+ |                     |						    | |
+ |                     | +-----------------------+                          | |
+ +---------------------->| var_ref = $ts0        |                          | |
+                       | +-----------------------+                          | |
+                       |   | .size               |                          | |
+                       |   +---------------------+                          | |
+                       |   | .offset             |                          | |
+                       |   +---------------------+                          | |
+                       |   | .fn()               |                          | |
+                       |   +---------------------+                          | |
+                       |   | .flags & FL_VAR_REF |                          | |
+                       |   +---------------------+                          | |
+                       |   | .var.idx            |--------------------------+ |
+                       |   +---------------------+                            |
+                       |   | .var.hist_data      |----------------------------+
+                       |   +---------------------+
+                       +---| .var_ref_idx        |
+                           +---------------------+
+
+Abbreviations used in the diagrams::
+
+  hist_data = struct hist_trigger_data
+  hist_data.fields = struct hist_field
+  fn = hist_field_fn_t
+  FL_KEY = HIST_FIELD_FL_KEY
+  FL_VAR = HIST_FIELD_FL_VAR
+  FL_VAR_REF = HIST_FIELD_FL_VAR_REF
+
+When a hist trigger makes use of a variable, a new hist_field is
+created with flag HIST_FIELD_FL_VAR_REF.  For a VAR_REF field, the
+var.idx and var.hist_data take the same values as the referenced
+variable, as well as the referenced variable's size, type, and
+is_signed values.  The VAR_REF field's .name is set to the name of the
+variable it references.  If a variable reference was created using the
+explicit system.event.$var_ref notation, the hist_field's system and
+event_name variabls are also set.
+
+So, in order to handle an event for the sched_switch histogram,
+because we have a reference to a variable on another histogram, we
+need to resolve all variable references first.  This is done via the
+resolve_var_refs() calls made from event_hist_trigger().  What this
+does is grabs the var_refs[] array from the hist_data representing the
+sched_switch histogram.  For each one of those, the referenced
+variable's var.hist_data along with the current key is used to look up
+the corresponding tracing_map_elt in that histogram.  Once found, the
+referenced variable's var.idx is used to look up the variable's value
+using tracing_map_read_var(elt, var.idx), which yields the value of
+the variable for that element, ts0 in the case above.  Note that both
+the hist_fields representing both the variable and the variable
+reference have the same var.idx, so this is straightforward.
+
+Variable and variable reference test
+------------------------------------
+
+This example creates a variable on the sched_waking event, ts0, and
+uses it in the sched_switch trigger.  The sched_switch trigger also
+creates its own variable, wakeup_lat, but nothing yet uses it::
+
+  # echo 'hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger
+
+  # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0' >> events/sched/sched_switch/trigger
+
+Looking at the sched_waking 'hist_debug' output, in addition to the
+normal key and value hist_fields, in the val fields section we see a
+field with the HIST_FIELD_FL_VAR flag, which indicates that that field
+represents a variable.  Note that in addition to the variable name,
+contained in the var.name field, it includes the var.idx, which is the
+index into the tracing_map_elt.vars[] array of the actual variable
+location.  Note also that the output shows that variables live in the
+same part of the hist_data->fields[] array as normal values::
+
+  # cat events/sched/sched_waking/hist_debug
+
+  # event histogram
+  #
+  # trigger info: hist:keys=pid:vals=hitcount:ts0=common_timestamp.usecs:sort=hitcount:size=2048:clock=global [active]
+  #
+
+  hist_data: 000000009536f554
+
+  n_vals: 2
+  n_keys: 1
+  n_fields: 3
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: ts0
+      var.idx (into tracing_map_elt.vars[]): 0
+      type: u64
+      size: 8
+      is_signed: 0
+
+  key fields:
+
+    hist_data->fields[2]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: pid
+      type: pid_t
+      size: 8
+      is_signed: 1
+
+Moving on to the sched_switch trigger hist_debug output, in addition
+to the unused wakeup_lat variable, we see a new section displaying
+variable references.  Variable references are displayed in a separate
+section because in addition to to being logically separate from
+variables and values, they actually live in a separate hist_data
+array, var_refs[].
+
+In this example, the sched_switch trigger has a reference to a
+variable on the sched_waking trigger, $ts0.  Looking at the details,
+we can see that the var.hist_data value of the referenced variable
+matches the previously displayed sched_waking trigger, and the var.idx
+value matches the previously displayed var.idx value for that
+variable.  Also displayed is the var_ref_idx value for that variable
+reference, which is where the value for that variable is cached for
+use when the trigger is invoked::
+
+  # cat events/sched/sched_switch/hist_debug
+
+  # event histogram
+  #
+  # trigger info: hist:keys=next_pid:vals=hitcount:wakeup_lat=common_timestamp.usecs-$ts0:sort=hitcount:size=2048:clock=global [active]
+  #
+
+  hist_data: 00000000f4ee8006
+
+  n_vals: 2
+  n_keys: 1
+  n_fields: 3
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: wakeup_lat
+      var.idx (into tracing_map_elt.vars[]): 0
+      type: u64
+      size: 0
+      is_signed: 0
+
+  key fields:
+
+    hist_data->fields[2]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: next_pid
+      type: pid_t
+      size: 8
+      is_signed: 1
+
+  variable reference fields:
+
+    hist_data->var_refs[0]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: ts0
+      var.idx (into tracing_map_elt.vars[]): 0
+      var.hist_data: 000000009536f554
+      var_ref_idx (into hist_data->var_refs[]): 0
+      type: u64
+      size: 8
+      is_signed: 0
+
+The commands below can be used to clean things up for the next test::
+
+  # echo '!hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0' >> events/sched/sched_switch/trigger
+
+  # echo '!hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger
+
+Actions and Handlers
+====================
+
+Adding onto the previous example, we will now do something with that
+wakeup_lat variable, namely send it and another field as a synthetic
+event.
+
+The onmatch() action below basically says that whenever we have a
+sched_switch event, if we have a matching sched_waking event, in this
+case if we have a pid in the sched_waking histogram that matches the
+the next_pid field on this sched_switch event, we retrieve the
+variables specified in the wakeup_latency() trace action, and use
+them to generate a new wakeup_latency event into the trace stream.
+
+Note that the way the trace handlers such as wakeup_latency() (which
+could equivalently be written trace(wakeup_latency,$wakeup_lat,next_pid)
+are implemented, the parameters specified to the trace handler must be
+variables.  In this case, $wakeup_lat is obviously a variable, but
+next_pid isn't, since it's just naming a field in the sched_switch
+trace event.  Since this is something that almost every trace() and
+save() action does, a special shortcut is implemented to allow field
+names to be used directly in those cases.  How it works is that under
+the covers, a temporary variable is created for the named field, and
+this variable is what is actually passed to the trace handler.  In the
+code and documentation, this type of variable is called a 'field
+variable'.
+
+Fields on other trace event's histograms can be used as well.  In that
+case we have to generate a new histogram and an unfortunately named
+'synthetic_field' (the use of synthetic here has nothing to do with
+synthetic events) and use that special histogram field as a variable.
+
+The diagram below illustrates the new elements described above in the
+context of the sched_switch histogram using the onmatch() handler and
+the trace() action.
+
+First, we define the wakeup_latency synthetic event::
+
+  # echo 'wakeup_latency u64 lat; pid_t pid' >> synthetic_events
+
+Next, the sched_waking hist trigger as before::
+
+  # echo 'hist:keys=pid:ts0=common_timestamp.usecs' >>
+          events/sched/sched_waking/trigger
+
+Finally, we create a hist trigger on the sched_switch event that
+generates a wakeup_latency() trace event.  In this case we pass
+next_pid into the wakeup_latency synthetic event invocation, which
+means it will be automatically converted into a field variable::
+
+  # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \
+          onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid)' >>
+	  /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
+
+The diagram for the sched_switch event is similar to previous examples
+but shows the additional field_vars[] array for hist_data and shows
+the linkages between the field_vars and the variables and references
+created to implement the field variables.  The details are discussed
+below::
+
+    +------------------+
+    | hist_data        |
+    +------------------+   +-----------------------+
+      | .fields[]      |-->| val = hitcount        |
+      +----------------+   +-----------------------+
+      | .map           |     | .size               |
+      +----------------+     +---------------------+
+  +---| .field_vars[]  |     | .offset             |
+  |   +----------------+     +---------------------+
+  |+--| .var_refs[]    |     | .offset             |
+  ||  +----------------+     +---------------------+
+  ||                         | .fn()               |
+  ||   var_ref_vals[]        +---------------------+
+  ||  +-------------+        | .flags              |
+  ||  | $ts0        |<---+   +---------------------+
+  ||  +-------------+    |   | .var.idx            |
+  ||  | $next_pid   |<-+ |   +---------------------+
+  ||  +-------------+  | |   | .var.hist_data      |
+  ||+>| $wakeup_lat |  | |   +---------------------+
+  ||| +-------------+  | |   | .var_ref_idx        |
+  ||| |             |  | | +-----------------------+
+  ||| +-------------+  | | | var = wakeup_lat      |
+  |||        .         | | +-----------------------+
+  |||        .         | |   | .size               |
+  |||        .         | |   +---------------------+
+  ||| +-------------+  | |   | .offset             |
+  ||| |             |  | |   +---------------------+
+  ||| +-------------+  | |   | .fn()               |
+  ||| |             |  | |   +---------------------+
+  ||| +-------------+  | |   | .flags & FL_VAR     |
+  |||                  | |   +---------------------+
+  |||                  | |   | .var.idx            |
+  |||                  | |   +---------------------+
+  |||                  | |   | .var.hist_data      |
+  |||                  | |   +---------------------+
+  |||                  | |   | .var_ref_idx        |
+  |||                  | |   +---------------------+
+  |||                  | |              .
+  |||                  | |              .
+  |||                  | |              .
+  |||                  | |              .
+  ||| +--------------+ | |              .
+  +-->| field_var    | | |              .
+   || +--------------+ | |              .
+   ||   | var        | | |              .
+   ||   +------------+ | |              .
+   ||   | val        | | |              .
+   || +--------------+ | |              .
+   || | field_var    | | |              .
+   || +--------------+ | |              .
+   ||   | var        | | |              .
+   ||   +------------+ | |              .
+   ||   | val        | | |              .
+   ||   +------------+ | |              .
+   ||         .        | |              .
+   ||         .        | |              .
+   ||         .        | | +-----------------------+ <--- n_vals
+   || +--------------+ | | | key = pid             |
+   || | field_var    | | | +-----------------------+
+   || +--------------+ | |   | .size               |
+   ||   | var        |--+|   +---------------------+
+   ||   +------------+ |||   | .offset             |
+   ||   | val        |-+||   +---------------------+
+   ||   +------------+ |||   | .fn()               |
+   ||                  |||   +---------------------+
+   ||                  |||   | .flags              |
+   ||                  |||   +---------------------+
+   ||                  |||   | .var.idx            |
+   ||                  |||   +---------------------+ <--- n_fields
+   ||                  |||
+   ||                  |||                           n_keys = n_fields - n_vals
+   ||                  ||| +-----------------------+
+   ||                  |+->| var = next_pid        |
+   ||                  | | +-----------------------+
+   ||                  | |   | .size               |
+   ||                  | |   +---------------------+
+   ||                  | |   | .offset             |
+   ||                  | |   +---------------------+
+   ||                  | |   | .flags & FL_VAR     |
+   ||                  | |   +---------------------+
+   ||                  | |   | .var.idx            |
+   ||                  | |   +---------------------+
+   ||                  | |   | .var.hist_data      |
+   ||                  | | +-----------------------+
+   ||                  +-->| val for next_pid      |
+   ||                  | | +-----------------------+
+   ||                  | |   | .size               |
+   ||                  | |   +---------------------+
+   ||                  | |   | .offset             |
+   ||                  | |   +---------------------+
+   ||                  | |   | .fn()               |
+   ||                  | |   +---------------------+
+   ||                  | |   | .flags              |
+   ||                  | |   +---------------------+
+   ||                  | |   |                     |
+   ||                  | |   +---------------------+
+   ||                  | |
+   ||                  | |
+   ||                  | | +-----------------------+
+   +|------------------|-|>| var_ref = $ts0        |
+    |                  | | +-----------------------+
+    |                  | |   | .size               |
+    |                  | |   +---------------------+
+    |                  | |   | .offset             |
+    |                  | |   +---------------------+
+    |                  | |   | .fn()               |
+    |                  | |   +---------------------+
+    |                  | |   | .flags & FL_VAR_REF |
+    |                  | |   +---------------------+
+    |                  | +---| .var_ref_idx        |
+    |                  |   +-----------------------+
+    |                  |   | var_ref = $next_pid   |
+    |                  |   +-----------------------+
+    |                  |     | .size               |
+    |                  |     +---------------------+
+    |                  |     | .offset             |
+    |                  |     +---------------------+
+    |                  |     | .fn()               |
+    |                  |     +---------------------+
+    |                  |     | .flags & FL_VAR_REF |
+    |                  |     +---------------------+
+    |                  +-----| .var_ref_idx        |
+    |                      +-----------------------+
+    |                      | var_ref = $wakeup_lat |
+    |                      +-----------------------+
+    |                        | .size               |
+    |                        +---------------------+
+    |                        | .offset             |
+    |                        +---------------------+
+    |                        | .fn()               |
+    |                        +---------------------+
+    |                        | .flags & FL_VAR_REF |
+    |                        +---------------------+
+    +------------------------| .var_ref_idx        |
+                             +---------------------+
+
+As you can see, for a field variable, two hist_fields are created: one
+representing the variable, in this case next_pid, and one to actually
+get the value of the field from the trace stream, like a normal val
+field does.  These are created separately from normal variable
+creation and are saved in the hist_data->field_vars[] array.  See
+below for how these are used.  In addition, a reference hist_field is
+also created, which is needed to reference the field variables such as
+$next_pid variable in the trace() action.
+
+Note that $wakeup_lat is also a variable reference, referencing the
+value of the expression common_timestamp-$ts0, and so also needs to
+have a hist field entry representing that reference created.
+
+When hist_trigger_elt_update() is called to get the normal key and
+value fields, it also calls update_field_vars(), which goes through
+each field_var created for the histogram, and available from
+hist_data->field_vars and calls val->fn() to get the data from the
+current trace record, and then uses the var's var.idx to set the
+variable at the var.idx offset in the appropriate tracing_map_elt's
+variable at elt->vars[var.idx].
+
+Once all the variables have been updated, resolve_var_refs() can be
+called from event_hist_trigger(), and not only can our $ts0 and
+$next_pid references be resolved but the $wakeup_lat reference as
+well.  At this point, the trace() action can simply access the values
+assembled in the var_ref_vals[] array and generate the trace event.
+
+The same process occurs for the field variables associated with the
+save() action.
+
+Abbreviations used in the diagram::
+
+  hist_data = struct hist_trigger_data
+  hist_data.fields = struct hist_field
+  field_var = struct field_var
+  fn = hist_field_fn_t
+  FL_KEY = HIST_FIELD_FL_KEY
+  FL_VAR = HIST_FIELD_FL_VAR
+  FL_VAR_REF = HIST_FIELD_FL_VAR_REF
+
+trace() action field variable test
+----------------------------------
+
+This example adds to the previous test example by finally making use
+of the wakeup_lat variable, but in addition also creates a couple of
+field variables that then are all passed to the wakeup_latency() trace
+action via the onmatch() handler.
+
+First, we create the wakeup_latency synthetic event::
+
+  # echo 'wakeup_latency u64 lat; pid_t pid; char comm[16]' >> synthetic_events
+
+Next, the sched_waking trigger from previous examples::
+
+  # echo 'hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger
+
+Finally, as in the previous test example, we calculate and assign the
+wakeup latency using the $ts0 reference from the sched_waking trigger
+to the wakeup_lat variable, and finally use it along with a couple
+sched_switch event fields, next_pid and next_comm, to generate a
+wakeup_latency trace event.  The next_pid and next_comm event fields
+are automatically converted into field variables for this purpose::
+
+  # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,next_comm)' >> /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
+
+The sched_waking hist_debug output shows the same data as in the
+previous test example::
+
+  # cat events/sched/sched_waking/hist_debug
+
+  # event histogram
+  #
+  # trigger info: hist:keys=pid:vals=hitcount:ts0=common_timestamp.usecs:sort=hitcount:size=2048:clock=global [active]
+  #
+
+  hist_data: 00000000d60ff61f
+
+  n_vals: 2
+  n_keys: 1
+  n_fields: 3
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: ts0
+      var.idx (into tracing_map_elt.vars[]): 0
+      type: u64
+      size: 8
+      is_signed: 0
+
+  key fields:
+
+    hist_data->fields[2]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: pid
+      type: pid_t
+      size: 8
+      is_signed: 1
+
+The sched_switch hist_debug output shows the same key and value fields
+as in the previous test example - note that wakeup_lat is still in the
+val fields section, but that the new field variables are not there -
+although the field variables are variables, they're held separately in
+the hist_data's field_vars[] array.  Although the field variables and
+the normal variables are located in separate places, you can see that
+the actual variable locations for those variables in the
+tracing_map_elt.vars[] do have increasing indices as expected:
+wakeup_lat takes the var.idx = 0 slot, while the field variables for
+next_pid and next_comm have values var.idx = 1, and var.idx = 2.  Note
+also that those are the same values displayed for the variable
+references corresponding to those variables in the variable reference
+fields section.  Since there are two triggers and thus two hist_data
+addresses, those addresses also need to be accounted for when doing
+the matching - you can see that the first variable refers to the 0
+var.idx on the previous hist trigger (see the hist_data address
+associated with that trigger), while the second variable refers to the
+0 var.idx on the sched_switch hist trigger, as do all the remaining
+variable references.
+
+Finally, the action tracking variables section just shows the system
+and event name for the onmatch() handler::
+
+  # cat events/sched/sched_switch/hist_debug
+
+  # event histogram
+  #
+  # trigger info: hist:keys=next_pid:vals=hitcount:wakeup_lat=common_timestamp.usecs-$ts0:sort=hitcount:size=2048:clock=global:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,next_comm) [active]
+  #
+
+  hist_data: 0000000008f551b7
+
+  n_vals: 2
+  n_keys: 1
+  n_fields: 3
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: wakeup_lat
+      var.idx (into tracing_map_elt.vars[]): 0
+      type: u64
+      size: 0
+      is_signed: 0
+
+  key fields:
+
+    hist_data->fields[2]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: next_pid
+      type: pid_t
+      size: 8
+      is_signed: 1
+
+  variable reference fields:
+
+    hist_data->var_refs[0]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: ts0
+      var.idx (into tracing_map_elt.vars[]): 0
+      var.hist_data: 00000000d60ff61f
+      var_ref_idx (into hist_data->var_refs[]): 0
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->var_refs[1]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: wakeup_lat
+      var.idx (into tracing_map_elt.vars[]): 0
+      var.hist_data: 0000000008f551b7
+      var_ref_idx (into hist_data->var_refs[]): 1
+      type: u64
+      size: 0
+      is_signed: 0
+
+    hist_data->var_refs[2]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: next_pid
+      var.idx (into tracing_map_elt.vars[]): 1
+      var.hist_data: 0000000008f551b7
+      var_ref_idx (into hist_data->var_refs[]): 2
+      type: pid_t
+      size: 4
+      is_signed: 0
+
+    hist_data->var_refs[3]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: next_comm
+      var.idx (into tracing_map_elt.vars[]): 2
+      var.hist_data: 0000000008f551b7
+      var_ref_idx (into hist_data->var_refs[]): 3
+      type: char[16]
+      size: 256
+      is_signed: 0
+
+  field variables:
+
+    hist_data->field_vars[0]:
+
+      field_vars[0].var:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: next_pid
+      var.idx (into tracing_map_elt.vars[]): 1
+
+      field_vars[0].val:
+      ftrace_event_field name: next_pid
+      type: pid_t
+      size: 4
+      is_signed: 1
+
+    hist_data->field_vars[1]:
+
+      field_vars[1].var:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: next_comm
+      var.idx (into tracing_map_elt.vars[]): 2
+
+      field_vars[1].val:
+      ftrace_event_field name: next_comm
+      type: char[16]
+      size: 256
+      is_signed: 0
+
+  action tracking variables (for onmax()/onchange()/onmatch()):
+
+    hist_data->actions[0].match_data.event_system: sched
+    hist_data->actions[0].match_data.event: sched_waking
+
+The commands below can be used to clean things up for the next test::
+
+  # echo '!hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,next_comm)' >> /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
+
+  # echo '!hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger
+
+  # echo '!wakeup_latency u64 lat; pid_t pid; char comm[16]' >> synthetic_events
+
+action_data and the trace() action
+----------------------------------
+
+As mentioned above, when the trace() action generates a synthetic
+event, all the parameters to the synthetic event either already are
+variables or are converted into variables (via field variables), and
+finally all those variable values are collected via references to them
+into a var_ref_vals[] array.
+
+The values in the var_ref_vals[] array, however, don't necessarily
+follow the same ordering as the synthetic event params.  To address
+that, struct action_data contains another array, var_ref_idx[] that
+maps the trace action params to the var_ref_vals[] values.  Below is a
+diagram illustrating that for the wakeup_latency() synthetic event::
+
+  +------------------+     wakeup_latency()
+  | action_data      |       event params               var_ref_vals[]
+  +------------------+    +-----------------+        +-----------------+
+    | .var_ref_idx[] |--->| $wakeup_lat idx |---+    |                 |
+    +----------------+    +-----------------+   |    +-----------------+
+    | .synth_event   |    | $next_pid idx   |---|-+  | $wakeup_lat val |
+    +----------------+    +-----------------+   | |  +-----------------+
+                                   .            | +->| $next_pid val   |
+                                   .            |    +-----------------+
+                                   .            |           .
+                          +-----------------+   |           .
+			  |                 |   |           .
+			  +-----------------+   |    +-----------------+
+                                                +--->| $wakeup_lat val |
+                                                     +-----------------+
+
+Basically, how this ends up getting used in the synthetic event probe
+function, trace_event_raw_event_synth(), is as follows::
+
+  for each field i in .synth_event
+    val_idx = .var_ref_idx[i]
+    val = var_ref_vals[val_idx]
+
+action_data and the onXXX() handlers
+------------------------------------
+
+The hist trigger onXXX() actions other than onmatch(), such as onmax()
+and onchange(), also make use of and internally create hidden
+variables.  This information is contained in the
+action_data.track_data struct, and is also visible in the hist_debug
+output as will be described in the example below.
+
+Typically, the onmax() or onchange() handlers are used in conjunction
+with the save() and snapshot() actions.  For example::
+
+  # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \
+          onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm)' >>
+          /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
+
+or::
+
+  # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \
+          onmax($wakeup_lat).snapshot()' >>
+          /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
+
+save() action field variable test
+---------------------------------
+
+For this example, instead of generating a synthetic event, the save()
+action is used to save field values whenever an onmax() handler
+detects that a new max latency has been hit.  As in the previous
+example, the values being saved are also field values, but in this
+case, are kept in a separate hist_data array named save_vars[].
+
+As in previous test examples, we set up the sched_waking trigger::
+
+  # echo 'hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger
+
+In this case, however, we set up the sched_switch trigger to save some
+sched_switch field values whenever we hit a new maximum latency.  For
+both the onmax() handler and save() action, variables will be created,
+which we can use the hist_debug files to examine::
+
+  # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm)' >> events/sched/sched_switch/trigger
+
+The sched_waking hist_debug output shows the same data as in the
+previous test examples::
+
+  # cat events/sched/sched_waking/hist_debug
+
+  #
+  # trigger info: hist:keys=pid:vals=hitcount:ts0=common_timestamp.usecs:sort=hitcount:size=2048:clock=global [active]
+  #
+
+  hist_data: 00000000e6290f48
+
+  n_vals: 2
+  n_keys: 1
+  n_fields: 3
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: ts0
+      var.idx (into tracing_map_elt.vars[]): 0
+      type: u64
+      size: 8
+      is_signed: 0
+
+  key fields:
+
+    hist_data->fields[2]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: pid
+      type: pid_t
+      size: 8
+      is_signed: 1
+
+The output of the sched_switch trigger shows the same val and key
+values as before, but also shows a couple new sections.
+
+First, the action tracking variables section now shows the
+actions[].track_data information describing the special tracking
+variables and references used to track, in this case, the running
+maximum value.  The actions[].track_data.var_ref member contains the
+reference to the variable being tracked, in this case the $wakeup_lat
+variable.  In order to perform the onmax() handler function, there
+also needs to be a variable that tracks the current maximum by getting
+updated whenever a new maximum is hit.  In this case, we can see that
+an autogenerated veriable named ' __max' has been created and is
+visible in the actions[].track_data.track_var variable.
+
+Finally, in the new 'save action variables' section, we can see that
+the 4 params to the save() function have resulted in 4 field variables
+being created for the purposes of saving the values of the named
+fields when the max is hit.  These variables are kept in a separate
+save_vars[] array off of hist_data, so are displayed in a separate
+section::
+
+  # cat events/sched/sched_switch/hist_debug
+
+  # event histogram
+  #
+  # trigger info: hist:keys=next_pid:vals=hitcount:wakeup_lat=common_timestamp.usecs-$ts0:sort=hitcount:size=2048:clock=global:onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm) [active]
+  #
+
+  hist_data: 0000000057bcd28d
+
+  n_vals: 2
+  n_keys: 1
+  n_fields: 3
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: wakeup_lat
+      var.idx (into tracing_map_elt.vars[]): 0
+      type: u64
+      size: 0
+      is_signed: 0
+
+  key fields:
+
+    hist_data->fields[2]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: next_pid
+      type: pid_t
+      size: 8
+      is_signed: 1
+
+  variable reference fields:
+
+    hist_data->var_refs[0]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: ts0
+      var.idx (into tracing_map_elt.vars[]): 0
+      var.hist_data: 00000000e6290f48
+      var_ref_idx (into hist_data->var_refs[]): 0
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->var_refs[1]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: wakeup_lat
+      var.idx (into tracing_map_elt.vars[]): 0
+      var.hist_data: 0000000057bcd28d
+      var_ref_idx (into hist_data->var_refs[]): 1
+      type: u64
+      size: 0
+      is_signed: 0
+
+  action tracking variables (for onmax()/onchange()/onmatch()):
+
+    hist_data->actions[0].track_data.var_ref:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: wakeup_lat
+      var.idx (into tracing_map_elt.vars[]): 0
+      var.hist_data: 0000000057bcd28d
+      var_ref_idx (into hist_data->var_refs[]): 1
+      type: u64
+      size: 0
+      is_signed: 0
+
+    hist_data->actions[0].track_data.track_var:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: __max
+      var.idx (into tracing_map_elt.vars[]): 1
+      type: u64
+      size: 8
+      is_signed: 0
+
+  save action variables (save() params):
+
+    hist_data->save_vars[0]:
+
+      save_vars[0].var:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: next_comm
+      var.idx (into tracing_map_elt.vars[]): 2
+
+      save_vars[0].val:
+      ftrace_event_field name: next_comm
+      type: char[16]
+      size: 256
+      is_signed: 0
+
+    hist_data->save_vars[1]:
+
+      save_vars[1].var:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: prev_pid
+      var.idx (into tracing_map_elt.vars[]): 3
+
+      save_vars[1].val:
+      ftrace_event_field name: prev_pid
+      type: pid_t
+      size: 4
+      is_signed: 1
+
+    hist_data->save_vars[2]:
+
+      save_vars[2].var:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: prev_prio
+      var.idx (into tracing_map_elt.vars[]): 4
+
+      save_vars[2].val:
+      ftrace_event_field name: prev_prio
+      type: int
+      size: 4
+      is_signed: 1
+
+    hist_data->save_vars[3]:
+
+      save_vars[3].var:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: prev_comm
+      var.idx (into tracing_map_elt.vars[]): 5
+
+      save_vars[3].val:
+      ftrace_event_field name: prev_comm
+      type: char[16]
+      size: 256
+      is_signed: 0
+
+The commands below can be used to clean things up for the next test::
+
+  # echo '!hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm)' >> events/sched/sched_switch/trigger
+
+  # echo '!hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger
+
+A couple special cases
+======================
+
+While the above covers the basics of the histogram internals, there
+are a couple of special cases that should be discussed, since they
+tend to creae even more confusion.  Those are field variables on other
+histograms, and aliases, both described below through example tests
+using the hist_debug files.
+
+Test of field variables on other histograms
+-------------------------------------------
+
+This example is similar to the previous examples, but in this case,
+the sched_switch trigger references a hist trigger field on another
+event, namely the sched_waking event.  In order to accomplish this, a
+field variable is created for the other event, but since an existing
+histogram can't be used, as existing histograms are immutable, a new
+histogram with a matching variable is created and used, and we'll see
+that reflected in the hist_debug output shown below.
+
+First, we create the wakeup_latency synthetic event.  Note the
+addition of the prio field::
+
+  # echo 'wakeup_latency u64 lat; pid_t pid; int prio' >> synthetic_events
+
+As in previous test examples, we set up the sched_waking trigger::
+
+  # echo 'hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger
+
+Here we set up a hist trigger on sched_switch to send a wakeup_latency
+event using an onmatch handler naming the sched_waking event.  Note
+that the third param being passed to the wakeup_latency() is prio,
+which is a field name that needs to have a field variable created for
+it.  There isn't however any prio field on the sched_switch event so
+it would seem that it wouldn't be possible to create a field variable
+for it.  The matching sched_waking event does have a prio field, so it
+should be possible to make use of it for this purpose.  The problem
+with that is that it's not currently possible to define a new variable
+on an existing histogram, so it's not possible to add a new prio field
+variable to the existing sched_waking histogram.  It is however
+possible to create an additional new 'matching' sched_waking histogram
+for the same event, meaning that it uses the same key and filters, and
+define the new prio field variable on that.
+
+Here's the sched_switch trigger::
+
+  # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,prio)' >> events/sched/sched_switch/trigger
+
+And here's the output of the hist_debug information for the
+sched_waking hist trigger.  Note that there are two histograms
+displayed in the output: the first is the normal sched_waking
+histogram we've seen in the previous examples, and the second is the
+special histogram we created to provide the prio field variable.
+
+Looking at the second histogram below, we see a variable with the name
+synthetic_prio.  This is the field variable created for the prio field
+on that sched_waking histogram::
+
+  # cat events/sched/sched_waking/hist_debug
+
+  # event histogram
+  #
+  # trigger info: hist:keys=pid:vals=hitcount:ts0=common_timestamp.usecs:sort=hitcount:size=2048:clock=global [active]
+  #
+
+  hist_data: 00000000349570e4
+
+  n_vals: 2
+  n_keys: 1
+  n_fields: 3
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: ts0
+      var.idx (into tracing_map_elt.vars[]): 0
+      type: u64
+      size: 8
+      is_signed: 0
+
+  key fields:
+
+    hist_data->fields[2]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: pid
+      type: pid_t
+      size: 8
+      is_signed: 1
+
+
+  # event histogram
+  #
+  # trigger info: hist:keys=pid:vals=hitcount:synthetic_prio=prio:sort=hitcount:size=2048 [active]
+  #
+
+  hist_data: 000000006920cf38
+
+  n_vals: 2
+  n_keys: 1
+  n_fields: 3
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        HIST_FIELD_FL_VAR
+      ftrace_event_field name: prio
+      var.name: synthetic_prio
+      var.idx (into tracing_map_elt.vars[]): 0
+      type: int
+      size: 4
+      is_signed: 1
+
+  key fields:
+
+    hist_data->fields[2]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: pid
+      type: pid_t
+      size: 8
+      is_signed: 1
+
+Looking at the sched_switch histogram below, we can see a reference to
+the synthetic_prio variable on sched_waking, and looking at the
+associated hist_data address we see that it is indeed associated with
+the new histogram.  Note also that the other references are to a
+normal variable, wakeup_lat, and to a normal field variable, next_pid,
+the details of which are in the field variables section::
+
+  # cat events/sched/sched_switch/hist_debug
+
+  # event histogram
+  #
+  # trigger info: hist:keys=next_pid:vals=hitcount:wakeup_lat=common_timestamp.usecs-$ts0:sort=hitcount:size=2048:clock=global:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,prio) [active]
+  #
+
+  hist_data: 00000000a73b67df
+
+  n_vals: 2
+  n_keys: 1
+  n_fields: 3
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: wakeup_lat
+      var.idx (into tracing_map_elt.vars[]): 0
+      type: u64
+      size: 0
+      is_signed: 0
+
+  key fields:
+
+    hist_data->fields[2]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: next_pid
+      type: pid_t
+      size: 8
+      is_signed: 1
+
+  variable reference fields:
+
+    hist_data->var_refs[0]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: ts0
+      var.idx (into tracing_map_elt.vars[]): 0
+      var.hist_data: 00000000349570e4
+      var_ref_idx (into hist_data->var_refs[]): 0
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->var_refs[1]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: wakeup_lat
+      var.idx (into tracing_map_elt.vars[]): 0
+      var.hist_data: 00000000a73b67df
+      var_ref_idx (into hist_data->var_refs[]): 1
+      type: u64
+      size: 0
+      is_signed: 0
+
+    hist_data->var_refs[2]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: next_pid
+      var.idx (into tracing_map_elt.vars[]): 1
+      var.hist_data: 00000000a73b67df
+      var_ref_idx (into hist_data->var_refs[]): 2
+      type: pid_t
+      size: 4
+      is_signed: 0
+
+    hist_data->var_refs[3]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: synthetic_prio
+      var.idx (into tracing_map_elt.vars[]): 0
+      var.hist_data: 000000006920cf38
+      var_ref_idx (into hist_data->var_refs[]): 3
+      type: int
+      size: 4
+      is_signed: 1
+
+  field variables:
+
+    hist_data->field_vars[0]:
+
+      field_vars[0].var:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: next_pid
+      var.idx (into tracing_map_elt.vars[]): 1
+
+      field_vars[0].val:
+      ftrace_event_field name: next_pid
+      type: pid_t
+      size: 4
+      is_signed: 1
+
+  action tracking variables (for onmax()/onchange()/onmatch()):
+
+    hist_data->actions[0].match_data.event_system: sched
+    hist_data->actions[0].match_data.event: sched_waking
+
+The commands below can be used to clean things up for the next test::
+
+  # echo '!hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,prio)' >> events/sched/sched_switch/trigger
+
+  # echo '!hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger
+
+  # echo '!wakeup_latency u64 lat; pid_t pid; int prio' >> synthetic_events
+
+Alias test
+----------
+
+This example is very similar to previous examples, but demonstrates
+the alias flag.
+
+First, we create the wakeup_latency synthetic event::
+
+  # echo 'wakeup_latency u64 lat; pid_t pid; char comm[16]' >> synthetic_events
+
+Next, we create a sched_waking trigger similar to previous examples,
+but in this case we save the pid in the waking_pid variable::
+
+  # echo 'hist:keys=pid:waking_pid=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger
+
+For the sched_switch trigger, instead of using $waking_pid directly in
+the wakeup_latency synthetic event invocation, we create an alias of
+$waking_pid named $woken_pid, and use that in the synthetic event
+invocation instead::
+
+  # echo 'hist:keys=next_pid:woken_pid=$waking_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,$woken_pid,next_comm)' >> events/sched/sched_switch/trigger
+
+Looking at the sched_waking hist_debug output, in addition to the
+normal fields, we can see the waking_pid variable::
+
+  # cat events/sched/sched_waking/hist_debug
+
+  # event histogram
+  #
+  # trigger info: hist:keys=pid:vals=hitcount:waking_pid=pid,ts0=common_timestamp.usecs:sort=hitcount:size=2048:clock=global [active]
+  #
+
+  hist_data: 00000000a250528c
+
+  n_vals: 3
+  n_keys: 1
+  n_fields: 4
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        HIST_FIELD_FL_VAR
+      ftrace_event_field name: pid
+      var.name: waking_pid
+      var.idx (into tracing_map_elt.vars[]): 0
+      type: pid_t
+      size: 4
+      is_signed: 1
+
+    hist_data->fields[2]:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: ts0
+      var.idx (into tracing_map_elt.vars[]): 1
+      type: u64
+      size: 8
+      is_signed: 0
+
+  key fields:
+
+    hist_data->fields[3]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: pid
+      type: pid_t
+      size: 8
+      is_signed: 1
+
+The sched_switch hist_debug output shows that a variable named
+woken_pid has been created but that it also has the
+HIST_FIELD_FL_ALIAS flag set.  It also has the HIST_FIELD_FL_VAR flag
+set, which is why it appears in the val field section.
+
+Despite that implementation detail, an alias variable is actually more
+like a variable reference; in fact it can be thought of as a reference
+to a reference.  The implementation copies the var_ref->fn() from the
+variable reference being referenced, in this case, the waking_pid
+fn(), which is hist_field_var_ref() and makes that the fn() of the
+alias.  The hist_field_var_ref() fn() requires the var_ref_idx of the
+variable reference it's using, so waking_pid's var_ref_idx is also
+copied to the alias.  The end result is that when the value of alias
+is retrieved, in the end it just does the same thing the original
+reference would have done and retrieves the same value from the
+var_ref_vals[] array.  You can verify this in the output by noting
+that the var_ref_idx of the alias, in this case woken_pid, is the same
+as the var_ref_idx of the reference, waking_pid, in the variable
+reference fields section.
+
+Additionally, once it gets that value, since it is also a variable, it
+then saves that value into its var.idx.  So the var.idx of the
+woken_pid alias is 0, which it fills with the value from var_ref_idx 0
+when its fn() is called to update itself.  You'll also notice that
+there's a woken_pid var_ref in the variable refs section.  That is the
+reference to the woken_pid alias variable, and you can see that it
+retrieves the value from the same var.idx as the woken_pid alias, 0,
+and then in turn saves that value in its own var_ref_idx slot, 3, and
+the value at this position is finally what gets assigned to the
+$woken_pid slot in the trace event invocation::
+
+  # cat events/sched/sched_switch/hist_debug
+
+  # event histogram
+  #
+  # trigger info: hist:keys=next_pid:vals=hitcount:woken_pid=$waking_pid,wakeup_lat=common_timestamp.usecs-$ts0:sort=hitcount:size=2048:clock=global:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,$woken_pid,next_comm) [active]
+  #
+
+  hist_data: 0000000055d65ed0
+
+  n_vals: 3
+  n_keys: 1
+  n_fields: 4
+
+  val fields:
+
+    hist_data->fields[0]:
+      flags:
+        VAL: HIST_FIELD_FL_HITCOUNT
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->fields[1]:
+      flags:
+        HIST_FIELD_FL_VAR
+        HIST_FIELD_FL_ALIAS
+      var.name: woken_pid
+      var.idx (into tracing_map_elt.vars[]): 0
+      var_ref_idx (into hist_data->var_refs[]): 0
+      type: pid_t
+      size: 4
+      is_signed: 1
+
+    hist_data->fields[2]:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: wakeup_lat
+      var.idx (into tracing_map_elt.vars[]): 1
+      type: u64
+      size: 0
+      is_signed: 0
+
+  key fields:
+
+    hist_data->fields[3]:
+      flags:
+        HIST_FIELD_FL_KEY
+      ftrace_event_field name: next_pid
+      type: pid_t
+      size: 8
+      is_signed: 1
+
+  variable reference fields:
+
+    hist_data->var_refs[0]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: waking_pid
+      var.idx (into tracing_map_elt.vars[]): 0
+      var.hist_data: 00000000a250528c
+      var_ref_idx (into hist_data->var_refs[]): 0
+      type: pid_t
+      size: 4
+      is_signed: 1
+
+    hist_data->var_refs[1]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: ts0
+      var.idx (into tracing_map_elt.vars[]): 1
+      var.hist_data: 00000000a250528c
+      var_ref_idx (into hist_data->var_refs[]): 1
+      type: u64
+      size: 8
+      is_signed: 0
+
+    hist_data->var_refs[2]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: wakeup_lat
+      var.idx (into tracing_map_elt.vars[]): 1
+      var.hist_data: 0000000055d65ed0
+      var_ref_idx (into hist_data->var_refs[]): 2
+      type: u64
+      size: 0
+      is_signed: 0
+
+    hist_data->var_refs[3]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: woken_pid
+      var.idx (into tracing_map_elt.vars[]): 0
+      var.hist_data: 0000000055d65ed0
+      var_ref_idx (into hist_data->var_refs[]): 3
+      type: pid_t
+      size: 4
+      is_signed: 1
+
+    hist_data->var_refs[4]:
+      flags:
+        HIST_FIELD_FL_VAR_REF
+      name: next_comm
+      var.idx (into tracing_map_elt.vars[]): 2
+      var.hist_data: 0000000055d65ed0
+      var_ref_idx (into hist_data->var_refs[]): 4
+      type: char[16]
+      size: 256
+      is_signed: 0
+
+  field variables:
+
+    hist_data->field_vars[0]:
+
+      field_vars[0].var:
+      flags:
+        HIST_FIELD_FL_VAR
+      var.name: next_comm
+      var.idx (into tracing_map_elt.vars[]): 2
+
+      field_vars[0].val:
+      ftrace_event_field name: next_comm
+      type: char[16]
+      size: 256
+      is_signed: 0
+
+  action tracking variables (for onmax()/onchange()/onmatch()):
+
+    hist_data->actions[0].match_data.event_system: sched
+    hist_data->actions[0].match_data.event: sched_waking
+
+The commands below can be used to clean things up for the next test::
+
+  # echo '!hist:keys=next_pid:woken_pid=$waking_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,$woken_pid,next_comm)' >> events/sched/sched_switch/trigger
+
+  # echo '!hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger
+
+  # echo '!wakeup_latency u64 lat; pid_t pid; char comm[16]' >> synthetic_events
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [for-next][PATCH 06/12] tracing: Add hist_debug trace event files for histogram debugging
  2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
                   ` (4 preceding siblings ...)
  2020-06-01 12:27 ` [for-next][PATCH 05/12] tracing: Add histogram-design document Steven Rostedt
@ 2020-06-01 12:27 ` Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 07/12] tracing/doc: Fix typos in histogram-design.rst Steven Rostedt
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Tom Zanussi

From: Tom Zanussi <zanussi@kernel.org>

Add a new "hist_debug" file for each trace event, which when read will
dump out a bunch of internal details about the hist triggers defined
on that event.

This is normally off but can be enabled by saying 'y' to the new
CONFIG_HIST_TRIGGERS_DEBUG config option.

This is in support of the new Documentation file describing histogram
internals, Documentation/trace/histogram-design.rst, which was
requested by developers trying to understand the internals when
extending or making use of the hist triggers for higher-level tools.

The histogram-design.rst documentation refers to the hist_debug files
and demonstrates their use with output in the test examples.

Link: http://lkml.kernel.org/r/77914c22b0ba493d9783c53bbfbc6087d6a7e1b1.1585941485.git.zanussi@kernel.org

Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/Kconfig             |  23 +++
 kernel/trace/trace.h             |   1 +
 kernel/trace/trace_events.c      |   4 +
 kernel/trace/trace_events_hist.c | 273 +++++++++++++++++++++++++++++++
 4 files changed, 301 insertions(+)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 743647005f64..2aec9376825f 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -847,6 +847,29 @@ config KPROBE_EVENT_GEN_TEST
 
 	  If unsure, say N.
 
+config HIST_TRIGGERS_DEBUG
+	bool "Hist trigger debug support"
+	depends on HIST_TRIGGERS
+	help
+          Add "hist_debug" file for each event, which when read will
+          dump out a bunch of internal details about the hist triggers
+          defined on that event.
+
+          The hist_debug file serves a couple of purposes:
+
+            - Helps developers verify that nothing is broken.
+
+            - Provides educational information to support the details
+              of the hist trigger internals as described by
+              Documentation/trace/histogram-design.rst.
+
+          The hist_debug output only covers the data structures
+          related to the histogram definitions themselves and doesn't
+          display the internals of map buckets or variable values of
+          running histograms.
+
+          If unsure, say N.
+
 endif # FTRACE
 
 endif # TRACING_SUPPORT
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 4eb1d004d5f2..def769df5bf1 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1661,6 +1661,7 @@ extern struct list_head ftrace_events;
 
 extern const struct file_operations event_trigger_fops;
 extern const struct file_operations event_hist_fops;
+extern const struct file_operations event_hist_debug_fops;
 extern const struct file_operations event_inject_fops;
 
 #ifdef CONFIG_HIST_TRIGGERS
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 242f59e7f17d..f6f55682d3e2 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2208,6 +2208,10 @@ event_create_dir(struct dentry *parent, struct trace_event_file *file)
 #ifdef CONFIG_HIST_TRIGGERS
 	trace_create_file("hist", 0444, file->dir, file,
 			  &event_hist_fops);
+#endif
+#ifdef CONFIG_HIST_TRIGGERS_DEBUG
+	trace_create_file("hist_debug", 0444, file->dir, file,
+			  &event_hist_debug_fops);
 #endif
 	trace_create_file("format", 0444, file->dir, call,
 			  &ftrace_event_format_fops);
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 5cfe8f998b3e..313227da4925 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -6478,6 +6478,279 @@ const struct file_operations event_hist_fops = {
 	.release = single_release,
 };
 
+#ifdef CONFIG_HIST_TRIGGERS_DEBUG
+static void hist_field_debug_show_flags(struct seq_file *m,
+					unsigned long flags)
+{
+	seq_puts(m, "      flags:\n");
+
+	if (flags & HIST_FIELD_FL_KEY)
+		seq_puts(m, "        HIST_FIELD_FL_KEY\n");
+	else if (flags & HIST_FIELD_FL_HITCOUNT)
+		seq_puts(m, "        VAL: HIST_FIELD_FL_HITCOUNT\n");
+	else if (flags & HIST_FIELD_FL_VAR)
+		seq_puts(m, "        HIST_FIELD_FL_VAR\n");
+	else if (flags & HIST_FIELD_FL_VAR_REF)
+		seq_puts(m, "        HIST_FIELD_FL_VAR_REF\n");
+	else
+		seq_puts(m, "        VAL: normal u64 value\n");
+
+	if (flags & HIST_FIELD_FL_ALIAS)
+		seq_puts(m, "        HIST_FIELD_FL_ALIAS\n");
+}
+
+static int hist_field_debug_show(struct seq_file *m,
+				 struct hist_field *field, unsigned long flags)
+{
+	if ((field->flags & flags) != flags) {
+		seq_printf(m, "ERROR: bad flags - %lx\n", flags);
+		return -EINVAL;
+	}
+
+	hist_field_debug_show_flags(m, field->flags);
+	if (field->field)
+		seq_printf(m, "      ftrace_event_field name: %s\n",
+			   field->field->name);
+
+	if (field->flags & HIST_FIELD_FL_VAR) {
+		seq_printf(m, "      var.name: %s\n", field->var.name);
+		seq_printf(m, "      var.idx (into tracing_map_elt.vars[]): %u\n",
+			   field->var.idx);
+	}
+
+	if (field->flags & HIST_FIELD_FL_ALIAS)
+		seq_printf(m, "      var_ref_idx (into hist_data->var_refs[]): %u\n",
+			   field->var_ref_idx);
+
+	if (field->flags & HIST_FIELD_FL_VAR_REF) {
+		seq_printf(m, "      name: %s\n", field->name);
+		seq_printf(m, "      var.idx (into tracing_map_elt.vars[]): %u\n",
+			   field->var.idx);
+		seq_printf(m, "      var.hist_data: %p\n", field->var.hist_data);
+		seq_printf(m, "      var_ref_idx (into hist_data->var_refs[]): %u\n",
+			   field->var_ref_idx);
+		if (field->system)
+			seq_printf(m, "      system: %s\n", field->system);
+		if (field->event_name)
+			seq_printf(m, "      event_name: %s\n", field->event_name);
+	}
+
+	seq_printf(m, "      type: %s\n", field->type);
+	seq_printf(m, "      size: %u\n", field->size);
+	seq_printf(m, "      is_signed: %u\n", field->is_signed);
+
+	return 0;
+}
+
+static int field_var_debug_show(struct seq_file *m,
+				struct field_var *field_var, unsigned int i,
+				bool save_vars)
+{
+	const char *vars_name = save_vars ? "save_vars" : "field_vars";
+	struct hist_field *field;
+	int ret = 0;
+
+	seq_printf(m, "\n    hist_data->%s[%d]:\n", vars_name, i);
+
+	field = field_var->var;
+
+	seq_printf(m, "\n      %s[%d].var:\n", vars_name, i);
+
+	hist_field_debug_show_flags(m, field->flags);
+	seq_printf(m, "      var.name: %s\n", field->var.name);
+	seq_printf(m, "      var.idx (into tracing_map_elt.vars[]): %u\n",
+		   field->var.idx);
+
+	field = field_var->val;
+
+	seq_printf(m, "\n      %s[%d].val:\n", vars_name, i);
+	if (field->field)
+		seq_printf(m, "      ftrace_event_field name: %s\n",
+			   field->field->name);
+	else {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	seq_printf(m, "      type: %s\n", field->type);
+	seq_printf(m, "      size: %u\n", field->size);
+	seq_printf(m, "      is_signed: %u\n", field->is_signed);
+out:
+	return ret;
+}
+
+static int hist_action_debug_show(struct seq_file *m,
+				  struct action_data *data, int i)
+{
+	int ret = 0;
+
+	if (data->handler == HANDLER_ONMAX ||
+	    data->handler == HANDLER_ONCHANGE) {
+		seq_printf(m, "\n    hist_data->actions[%d].track_data.var_ref:\n", i);
+		ret = hist_field_debug_show(m, data->track_data.var_ref,
+					    HIST_FIELD_FL_VAR_REF);
+		if (ret)
+			goto out;
+
+		seq_printf(m, "\n    hist_data->actions[%d].track_data.track_var:\n", i);
+		ret = hist_field_debug_show(m, data->track_data.track_var,
+					    HIST_FIELD_FL_VAR);
+		if (ret)
+			goto out;
+	}
+
+	if (data->handler == HANDLER_ONMATCH) {
+		seq_printf(m, "\n    hist_data->actions[%d].match_data.event_system: %s\n",
+			   i, data->match_data.event_system);
+		seq_printf(m, "    hist_data->actions[%d].match_data.event: %s\n",
+			   i, data->match_data.event);
+	}
+out:
+	return ret;
+}
+
+static int hist_actions_debug_show(struct seq_file *m,
+				   struct hist_trigger_data *hist_data)
+{
+	int i, ret = 0;
+
+	if (hist_data->n_actions)
+		seq_puts(m, "\n  action tracking variables (for onmax()/onchange()/onmatch()):\n");
+
+	for (i = 0; i < hist_data->n_actions; i++) {
+		struct action_data *action = hist_data->actions[i];
+
+		ret = hist_action_debug_show(m, action, i);
+		if (ret)
+			goto out;
+	}
+
+	if (hist_data->n_save_vars)
+		seq_puts(m, "\n  save action variables (save() params):\n");
+
+	for (i = 0; i < hist_data->n_save_vars; i++) {
+		ret = field_var_debug_show(m, hist_data->save_vars[i], i, true);
+		if (ret)
+			goto out;
+	}
+out:
+	return ret;
+}
+
+static void hist_trigger_debug_show(struct seq_file *m,
+				    struct event_trigger_data *data, int n)
+{
+	struct hist_trigger_data *hist_data;
+	int i, ret;
+
+	if (n > 0)
+		seq_puts(m, "\n\n");
+
+	seq_puts(m, "# event histogram\n#\n# trigger info: ");
+	data->ops->print(m, data->ops, data);
+	seq_puts(m, "#\n\n");
+
+	hist_data = data->private_data;
+
+	seq_printf(m, "hist_data: %p\n\n", hist_data);
+	seq_printf(m, "  n_vals: %u\n", hist_data->n_vals);
+	seq_printf(m, "  n_keys: %u\n", hist_data->n_keys);
+	seq_printf(m, "  n_fields: %u\n", hist_data->n_fields);
+
+	seq_puts(m, "\n  val fields:\n\n");
+
+	seq_puts(m, "    hist_data->fields[0]:\n");
+	ret = hist_field_debug_show(m, hist_data->fields[0],
+				    HIST_FIELD_FL_HITCOUNT);
+	if (ret)
+		return;
+
+	for (i = 1; i < hist_data->n_vals; i++) {
+		seq_printf(m, "\n    hist_data->fields[%d]:\n", i);
+		ret = hist_field_debug_show(m, hist_data->fields[i], 0);
+		if (ret)
+			return;
+	}
+
+	seq_puts(m, "\n  key fields:\n");
+
+	for (i = hist_data->n_vals; i < hist_data->n_fields; i++) {
+		seq_printf(m, "\n    hist_data->fields[%d]:\n", i);
+		ret = hist_field_debug_show(m, hist_data->fields[i],
+					    HIST_FIELD_FL_KEY);
+		if (ret)
+			return;
+	}
+
+	if (hist_data->n_var_refs)
+		seq_puts(m, "\n  variable reference fields:\n");
+
+	for (i = 0; i < hist_data->n_var_refs; i++) {
+		seq_printf(m, "\n    hist_data->var_refs[%d]:\n", i);
+		ret = hist_field_debug_show(m, hist_data->var_refs[i],
+					    HIST_FIELD_FL_VAR_REF);
+		if (ret)
+			return;
+	}
+
+	if (hist_data->n_field_vars)
+		seq_puts(m, "\n  field variables:\n");
+
+	for (i = 0; i < hist_data->n_field_vars; i++) {
+		ret = field_var_debug_show(m, hist_data->field_vars[i], i, false);
+		if (ret)
+			return;
+	}
+
+	ret = hist_actions_debug_show(m, hist_data);
+	if (ret)
+		return;
+}
+
+static int hist_debug_show(struct seq_file *m, void *v)
+{
+	struct event_trigger_data *data;
+	struct trace_event_file *event_file;
+	int n = 0, ret = 0;
+
+	mutex_lock(&event_mutex);
+
+	event_file = event_file_data(m->private);
+	if (unlikely(!event_file)) {
+		ret = -ENODEV;
+		goto out_unlock;
+	}
+
+	list_for_each_entry(data, &event_file->triggers, list) {
+		if (data->cmd_ops->trigger_type == ETT_EVENT_HIST)
+			hist_trigger_debug_show(m, data, n++);
+	}
+
+ out_unlock:
+	mutex_unlock(&event_mutex);
+
+	return ret;
+}
+
+static int event_hist_debug_open(struct inode *inode, struct file *file)
+{
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
+	return single_open(file, hist_debug_show, file);
+}
+
+const struct file_operations event_hist_debug_fops = {
+	.open = event_hist_debug_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+#endif
+
 static void hist_field_print(struct seq_file *m, struct hist_field *hist_field)
 {
 	const char *field_name = hist_field_name(hist_field, 0);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [for-next][PATCH 07/12] tracing/doc: Fix typos in histogram-design.rst
  2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
                   ` (5 preceding siblings ...)
  2020-06-01 12:27 ` [for-next][PATCH 06/12] tracing: Add hist_debug trace event files for histogram debugging Steven Rostedt
@ 2020-06-01 12:27 ` Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 09/12] tracing: Move synthetic events to a separate file Steven Rostedt
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Tom Zanussi, Randy Dunlap

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

There's a few typos in the histogram-design.rst document that need need to
be fixed.

Cc: Tom Zanussi <zanussi@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 Documentation/trace/histogram-design.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/trace/histogram-design.rst b/Documentation/trace/histogram-design.rst
index c246753f0ffc..06f5c7e5f2ee 100644
--- a/Documentation/trace/histogram-design.rst
+++ b/Documentation/trace/histogram-design.rst
@@ -700,7 +700,7 @@ variable, as well as the referenced variable's size, type, and
 is_signed values.  The VAR_REF field's .name is set to the name of the
 variable it references.  If a variable reference was created using the
 explicit system.event.$var_ref notation, the hist_field's system and
-event_name variabls are also set.
+event_name variables are also set.
 
 So, in order to handle an event for the sched_switch histogram,
 because we have a reference to a variable on another histogram, we
@@ -1445,7 +1445,7 @@ reference to the variable being tracked, in this case the $wakeup_lat
 variable.  In order to perform the onmax() handler function, there
 also needs to be a variable that tracks the current maximum by getting
 updated whenever a new maximum is hit.  In this case, we can see that
-an autogenerated veriable named ' __max' has been created and is
+an auto-generated variable named ' __max' has been created and is
 visible in the actions[].track_data.track_var variable.
 
 Finally, in the new 'save action variables' section, we can see that
@@ -1611,7 +1611,7 @@ A couple special cases
 
 While the above covers the basics of the histogram internals, there
 are a couple of special cases that should be discussed, since they
-tend to creae even more confusion.  Those are field variables on other
+tend to create even more confusion.  Those are field variables on other
 histograms, and aliases, both described below through example tests
 using the hist_debug files.
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [for-next][PATCH 09/12] tracing: Move synthetic events to a separate file
  2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
                   ` (6 preceding siblings ...)
  2020-06-01 12:27 ` [for-next][PATCH 07/12] tracing/doc: Fix typos in histogram-design.rst Steven Rostedt
@ 2020-06-01 12:27 ` Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 10/12] selftests/ftrace: Distinguish between hist and synthetic event checks Steven Rostedt
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Tom Zanussi

From: Tom Zanussi <zanussi@kernel.org>

With the addition of the in-kernel synthetic event API, synthetic
events are no longer specifically tied to the histogram triggers.

The synthetic event code is also making trace_event_hist.c very
bloated, so for those reasons, move it to a separate file,
trace_events_synth.c, along with a new trace_synth.h header file.

Because synthetic events are now independent from hist triggers, add a
new CONFIG_SYNTH_EVENTS config option, and have CONFIG_HIST_TRIGGERS
select it, and have CONFIG_SYNTH_EVENT_GEN_TEST depend on it.

Link: http://lkml.kernel.org/r/4d1fa1f85ed5982706ac44844ac92451dcb04715.1590693308.git.zanussi@kernel.org

Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/Kconfig              |   20 +-
 kernel/trace/Makefile             |    1 +
 kernel/trace/trace_events_hist.c  | 2071 ++---------------------------
 kernel/trace/trace_events_synth.c | 1789 +++++++++++++++++++++++++
 kernel/trace/trace_synth.h        |   36 +
 5 files changed, 1989 insertions(+), 1928 deletions(-)
 create mode 100644 kernel/trace/trace_events_synth.c
 create mode 100644 kernel/trace/trace_synth.h

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 2aec9376825f..75407d5dc83a 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -623,12 +623,30 @@ config TRACING_MAP
 	  generally used outside of that context, and is normally
 	  selected by tracers that use it.
 
+config SYNTH_EVENTS
+	bool "Synthetic trace events"
+	select TRACING
+	select DYNAMIC_EVENTS
+	default n
+	help
+	  Synthetic events are user-defined trace events that can be
+	  used to combine data from other trace events or in fact any
+	  data source.  Synthetic events can be generated indirectly
+	  via the trace() action of histogram triggers or directly
+	  by way of an in-kernel API.
+
+	  See Documentation/trace/events.rst or
+	  Documentation/trace/histogram.rst for details and examples.
+
+	  If in doubt, say N.
+
 config HIST_TRIGGERS
 	bool "Histogram triggers"
 	depends on ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select TRACING_MAP
 	select TRACING
 	select DYNAMIC_EVENTS
+	select SYNTH_EVENTS
 	default n
 	help
 	  Hist triggers allow one or more arbitrary trace event fields
@@ -824,7 +842,7 @@ config PREEMPTIRQ_DELAY_TEST
 
 config SYNTH_EVENT_GEN_TEST
 	tristate "Test module for in-kernel synthetic event generation"
-	depends on HIST_TRIGGERS
+	depends on SYNTH_EVENTS
 	help
           This option creates a test module to check the base
           functionality of in-kernel synthetic event definition and
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index f9dcd19165fa..1d8aaa546412 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -72,6 +72,7 @@ endif
 obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
 obj-$(CONFIG_EVENT_TRACING) += trace_events_trigger.o
 obj-$(CONFIG_TRACE_EVENT_INJECT) += trace_events_inject.o
+obj-$(CONFIG_SYNTH_EVENTS) += trace_events_synth.o
 obj-$(CONFIG_HIST_TRIGGERS) += trace_events_hist.o
 obj-$(CONFIG_BPF_EVENTS) += bpf_trace.o
 obj-$(CONFIG_KPROBE_EVENTS) += trace_kprobe.o
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 313227da4925..0b933546142e 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -19,13 +19,7 @@
 #include <trace/events/mmflags.h>
 
 #include "tracing_map.h"
-#include "trace.h"
-#include "trace_dynevent.h"
-
-#define SYNTH_SYSTEM		"synthetic"
-#define SYNTH_FIELDS_MAX	32
-
-#define STR_VAR_LEN_MAX		32 /* must be multiple of sizeof(u64) */
+#include "trace_synth.h"
 
 #define ERRORS								\
 	C(NONE,			"No error"),				\
@@ -380,69 +374,6 @@ struct hist_trigger_data {
 	unsigned int			n_save_var_str;
 };
 
-static int create_synth_event(int argc, const char **argv);
-static int synth_event_show(struct seq_file *m, struct dyn_event *ev);
-static int synth_event_release(struct dyn_event *ev);
-static bool synth_event_is_busy(struct dyn_event *ev);
-static bool synth_event_match(const char *system, const char *event,
-			int argc, const char **argv, struct dyn_event *ev);
-
-static struct dyn_event_operations synth_event_ops = {
-	.create = create_synth_event,
-	.show = synth_event_show,
-	.is_busy = synth_event_is_busy,
-	.free = synth_event_release,
-	.match = synth_event_match,
-};
-
-struct synth_field {
-	char *type;
-	char *name;
-	size_t size;
-	unsigned int offset;
-	bool is_signed;
-	bool is_string;
-};
-
-struct synth_event {
-	struct dyn_event			devent;
-	int					ref;
-	char					*name;
-	struct synth_field			**fields;
-	unsigned int				n_fields;
-	unsigned int				n_u64;
-	struct trace_event_class		class;
-	struct trace_event_call			call;
-	struct tracepoint			*tp;
-	struct module				*mod;
-};
-
-static bool is_synth_event(struct dyn_event *ev)
-{
-	return ev->ops == &synth_event_ops;
-}
-
-static struct synth_event *to_synth_event(struct dyn_event *ev)
-{
-	return container_of(ev, struct synth_event, devent);
-}
-
-static bool synth_event_is_busy(struct dyn_event *ev)
-{
-	struct synth_event *event = to_synth_event(ev);
-
-	return event->ref != 0;
-}
-
-static bool synth_event_match(const char *system, const char *event,
-			int argc, const char **argv, struct dyn_event *ev)
-{
-	struct synth_event *sev = to_synth_event(ev);
-
-	return strcmp(sev->name, event) == 0 &&
-		(!system || strcmp(system, SYNTH_SYSTEM) == 0);
-}
-
 struct action_data;
 
 typedef void (*action_fn_t) (struct hist_trigger_data *hist_data,
@@ -490,1882 +421,202 @@ struct action_data {
 	union {
 		struct {
 			char			*event;
-			char			*event_system;
-		} match_data;
-
-		struct {
-			/*
-			 * var_str contains the $-unstripped variable
-			 * name referenced by var_ref, and used when
-			 * printing the action.  Because var_ref
-			 * creation is deferred to create_actions(),
-			 * we need a per-action way to save it until
-			 * then, thus var_str.
-			 */
-			char			*var_str;
-
-			/*
-			 * var_ref refers to the variable being
-			 * tracked e.g onmax($var).
-			 */
-			struct hist_field	*var_ref;
-
-			/*
-			 * track_var contains the 'invisible' tracking
-			 * variable created to keep the current
-			 * e.g. max value.
-			 */
-			struct hist_field	*track_var;
-
-			check_track_val_fn_t	check_val;
-			action_fn_t		save_data;
-		} track_data;
-	};
-};
-
-struct track_data {
-	u64				track_val;
-	bool				updated;
-
-	unsigned int			key_len;
-	void				*key;
-	struct tracing_map_elt		elt;
-
-	struct action_data		*action_data;
-	struct hist_trigger_data	*hist_data;
-};
-
-struct hist_elt_data {
-	char *comm;
-	u64 *var_ref_vals;
-	char *field_var_str[SYNTH_FIELDS_MAX];
-};
-
-struct snapshot_context {
-	struct tracing_map_elt	*elt;
-	void			*key;
-};
-
-static void track_data_free(struct track_data *track_data)
-{
-	struct hist_elt_data *elt_data;
-
-	if (!track_data)
-		return;
-
-	kfree(track_data->key);
-
-	elt_data = track_data->elt.private_data;
-	if (elt_data) {
-		kfree(elt_data->comm);
-		kfree(elt_data);
-	}
-
-	kfree(track_data);
-}
-
-static struct track_data *track_data_alloc(unsigned int key_len,
-					   struct action_data *action_data,
-					   struct hist_trigger_data *hist_data)
-{
-	struct track_data *data = kzalloc(sizeof(*data), GFP_KERNEL);
-	struct hist_elt_data *elt_data;
-
-	if (!data)
-		return ERR_PTR(-ENOMEM);
-
-	data->key = kzalloc(key_len, GFP_KERNEL);
-	if (!data->key) {
-		track_data_free(data);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	data->key_len = key_len;
-	data->action_data = action_data;
-	data->hist_data = hist_data;
-
-	elt_data = kzalloc(sizeof(*elt_data), GFP_KERNEL);
-	if (!elt_data) {
-		track_data_free(data);
-		return ERR_PTR(-ENOMEM);
-	}
-	data->elt.private_data = elt_data;
-
-	elt_data->comm = kzalloc(TASK_COMM_LEN, GFP_KERNEL);
-	if (!elt_data->comm) {
-		track_data_free(data);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	return data;
-}
-
-static char last_cmd[MAX_FILTER_STR_VAL];
-static char last_cmd_loc[MAX_FILTER_STR_VAL];
-
-static int errpos(char *str)
-{
-	return err_pos(last_cmd, str);
-}
-
-static void last_cmd_set(struct trace_event_file *file, char *str)
-{
-	const char *system = NULL, *name = NULL;
-	struct trace_event_call *call;
-
-	if (!str)
-		return;
-
-	strcpy(last_cmd, "hist:");
-	strncat(last_cmd, str, MAX_FILTER_STR_VAL - 1 - sizeof("hist:"));
-
-	if (file) {
-		call = file->event_call;
-
-		system = call->class->system;
-		if (system) {
-			name = trace_event_name(call);
-			if (!name)
-				system = NULL;
-		}
-	}
-
-	if (system)
-		snprintf(last_cmd_loc, MAX_FILTER_STR_VAL, "hist:%s:%s", system, name);
-}
-
-static void hist_err(struct trace_array *tr, u8 err_type, u8 err_pos)
-{
-	tracing_log_err(tr, last_cmd_loc, last_cmd, err_text,
-			err_type, err_pos);
-}
-
-static void hist_err_clear(void)
-{
-	last_cmd[0] = '\0';
-	last_cmd_loc[0] = '\0';
-}
-
-struct synth_trace_event {
-	struct trace_entry	ent;
-	u64			fields[];
-};
-
-static int synth_event_define_fields(struct trace_event_call *call)
-{
-	struct synth_trace_event trace;
-	int offset = offsetof(typeof(trace), fields);
-	struct synth_event *event = call->data;
-	unsigned int i, size, n_u64;
-	char *name, *type;
-	bool is_signed;
-	int ret = 0;
-
-	for (i = 0, n_u64 = 0; i < event->n_fields; i++) {
-		size = event->fields[i]->size;
-		is_signed = event->fields[i]->is_signed;
-		type = event->fields[i]->type;
-		name = event->fields[i]->name;
-		ret = trace_define_field(call, type, name, offset, size,
-					 is_signed, FILTER_OTHER);
-		if (ret)
-			break;
-
-		event->fields[i]->offset = n_u64;
-
-		if (event->fields[i]->is_string) {
-			offset += STR_VAR_LEN_MAX;
-			n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
-		} else {
-			offset += sizeof(u64);
-			n_u64++;
-		}
-	}
-
-	event->n_u64 = n_u64;
-
-	return ret;
-}
-
-static bool synth_field_signed(char *type)
-{
-	if (str_has_prefix(type, "u"))
-		return false;
-	if (strcmp(type, "gfp_t") == 0)
-		return false;
-
-	return true;
-}
-
-static int synth_field_is_string(char *type)
-{
-	if (strstr(type, "char[") != NULL)
-		return true;
-
-	return false;
-}
-
-static int synth_field_string_size(char *type)
-{
-	char buf[4], *end, *start;
-	unsigned int len;
-	int size, err;
-
-	start = strstr(type, "char[");
-	if (start == NULL)
-		return -EINVAL;
-	start += sizeof("char[") - 1;
-
-	end = strchr(type, ']');
-	if (!end || end < start)
-		return -EINVAL;
-
-	len = end - start;
-	if (len > 3)
-		return -EINVAL;
-
-	strncpy(buf, start, len);
-	buf[len] = '\0';
-
-	err = kstrtouint(buf, 0, &size);
-	if (err)
-		return err;
-
-	if (size > STR_VAR_LEN_MAX)
-		return -EINVAL;
-
-	return size;
-}
-
-static int synth_field_size(char *type)
-{
-	int size = 0;
-
-	if (strcmp(type, "s64") == 0)
-		size = sizeof(s64);
-	else if (strcmp(type, "u64") == 0)
-		size = sizeof(u64);
-	else if (strcmp(type, "s32") == 0)
-		size = sizeof(s32);
-	else if (strcmp(type, "u32") == 0)
-		size = sizeof(u32);
-	else if (strcmp(type, "s16") == 0)
-		size = sizeof(s16);
-	else if (strcmp(type, "u16") == 0)
-		size = sizeof(u16);
-	else if (strcmp(type, "s8") == 0)
-		size = sizeof(s8);
-	else if (strcmp(type, "u8") == 0)
-		size = sizeof(u8);
-	else if (strcmp(type, "char") == 0)
-		size = sizeof(char);
-	else if (strcmp(type, "unsigned char") == 0)
-		size = sizeof(unsigned char);
-	else if (strcmp(type, "int") == 0)
-		size = sizeof(int);
-	else if (strcmp(type, "unsigned int") == 0)
-		size = sizeof(unsigned int);
-	else if (strcmp(type, "long") == 0)
-		size = sizeof(long);
-	else if (strcmp(type, "unsigned long") == 0)
-		size = sizeof(unsigned long);
-	else if (strcmp(type, "pid_t") == 0)
-		size = sizeof(pid_t);
-	else if (strcmp(type, "gfp_t") == 0)
-		size = sizeof(gfp_t);
-	else if (synth_field_is_string(type))
-		size = synth_field_string_size(type);
-
-	return size;
-}
-
-static const char *synth_field_fmt(char *type)
-{
-	const char *fmt = "%llu";
-
-	if (strcmp(type, "s64") == 0)
-		fmt = "%lld";
-	else if (strcmp(type, "u64") == 0)
-		fmt = "%llu";
-	else if (strcmp(type, "s32") == 0)
-		fmt = "%d";
-	else if (strcmp(type, "u32") == 0)
-		fmt = "%u";
-	else if (strcmp(type, "s16") == 0)
-		fmt = "%d";
-	else if (strcmp(type, "u16") == 0)
-		fmt = "%u";
-	else if (strcmp(type, "s8") == 0)
-		fmt = "%d";
-	else if (strcmp(type, "u8") == 0)
-		fmt = "%u";
-	else if (strcmp(type, "char") == 0)
-		fmt = "%d";
-	else if (strcmp(type, "unsigned char") == 0)
-		fmt = "%u";
-	else if (strcmp(type, "int") == 0)
-		fmt = "%d";
-	else if (strcmp(type, "unsigned int") == 0)
-		fmt = "%u";
-	else if (strcmp(type, "long") == 0)
-		fmt = "%ld";
-	else if (strcmp(type, "unsigned long") == 0)
-		fmt = "%lu";
-	else if (strcmp(type, "pid_t") == 0)
-		fmt = "%d";
-	else if (strcmp(type, "gfp_t") == 0)
-		fmt = "%x";
-	else if (synth_field_is_string(type))
-		fmt = "%s";
-
-	return fmt;
-}
-
-static void print_synth_event_num_val(struct trace_seq *s,
-				      char *print_fmt, char *name,
-				      int size, u64 val, char *space)
-{
-	switch (size) {
-	case 1:
-		trace_seq_printf(s, print_fmt, name, (u8)val, space);
-		break;
-
-	case 2:
-		trace_seq_printf(s, print_fmt, name, (u16)val, space);
-		break;
-
-	case 4:
-		trace_seq_printf(s, print_fmt, name, (u32)val, space);
-		break;
-
-	default:
-		trace_seq_printf(s, print_fmt, name, val, space);
-		break;
-	}
-}
-
-static enum print_line_t print_synth_event(struct trace_iterator *iter,
-					   int flags,
-					   struct trace_event *event)
-{
-	struct trace_array *tr = iter->tr;
-	struct trace_seq *s = &iter->seq;
-	struct synth_trace_event *entry;
-	struct synth_event *se;
-	unsigned int i, n_u64;
-	char print_fmt[32];
-	const char *fmt;
-
-	entry = (struct synth_trace_event *)iter->ent;
-	se = container_of(event, struct synth_event, call.event);
-
-	trace_seq_printf(s, "%s: ", se->name);
-
-	for (i = 0, n_u64 = 0; i < se->n_fields; i++) {
-		if (trace_seq_has_overflowed(s))
-			goto end;
-
-		fmt = synth_field_fmt(se->fields[i]->type);
-
-		/* parameter types */
-		if (tr && tr->trace_flags & TRACE_ITER_VERBOSE)
-			trace_seq_printf(s, "%s ", fmt);
-
-		snprintf(print_fmt, sizeof(print_fmt), "%%s=%s%%s", fmt);
-
-		/* parameter values */
-		if (se->fields[i]->is_string) {
-			trace_seq_printf(s, print_fmt, se->fields[i]->name,
-					 (char *)&entry->fields[n_u64],
-					 i == se->n_fields - 1 ? "" : " ");
-			n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
-		} else {
-			struct trace_print_flags __flags[] = {
-			    __def_gfpflag_names, {-1, NULL} };
-			char *space = (i == se->n_fields - 1 ? "" : " ");
-
-			print_synth_event_num_val(s, print_fmt,
-						  se->fields[i]->name,
-						  se->fields[i]->size,
-						  entry->fields[n_u64],
-						  space);
-
-			if (strcmp(se->fields[i]->type, "gfp_t") == 0) {
-				trace_seq_puts(s, " (");
-				trace_print_flags_seq(s, "|",
-						      entry->fields[n_u64],
-						      __flags);
-				trace_seq_putc(s, ')');
-			}
-			n_u64++;
-		}
-	}
-end:
-	trace_seq_putc(s, '\n');
-
-	return trace_handle_return(s);
-}
-
-static struct trace_event_functions synth_event_funcs = {
-	.trace		= print_synth_event
-};
-
-static notrace void trace_event_raw_event_synth(void *__data,
-						u64 *var_ref_vals,
-						unsigned int *var_ref_idx)
-{
-	struct trace_event_file *trace_file = __data;
-	struct synth_trace_event *entry;
-	struct trace_event_buffer fbuffer;
-	struct trace_buffer *buffer;
-	struct synth_event *event;
-	unsigned int i, n_u64, val_idx;
-	int fields_size = 0;
-
-	event = trace_file->event_call->data;
-
-	if (trace_trigger_soft_disabled(trace_file))
-		return;
-
-	fields_size = event->n_u64 * sizeof(u64);
-
-	/*
-	 * Avoid ring buffer recursion detection, as this event
-	 * is being performed within another event.
-	 */
-	buffer = trace_file->tr->array_buffer.buffer;
-	ring_buffer_nest_start(buffer);
-
-	entry = trace_event_buffer_reserve(&fbuffer, trace_file,
-					   sizeof(*entry) + fields_size);
-	if (!entry)
-		goto out;
-
-	for (i = 0, n_u64 = 0; i < event->n_fields; i++) {
-		val_idx = var_ref_idx[i];
-		if (event->fields[i]->is_string) {
-			char *str_val = (char *)(long)var_ref_vals[val_idx];
-			char *str_field = (char *)&entry->fields[n_u64];
-
-			strscpy(str_field, str_val, STR_VAR_LEN_MAX);
-			n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
-		} else {
-			struct synth_field *field = event->fields[i];
-			u64 val = var_ref_vals[val_idx];
-
-			switch (field->size) {
-			case 1:
-				*(u8 *)&entry->fields[n_u64] = (u8)val;
-				break;
-
-			case 2:
-				*(u16 *)&entry->fields[n_u64] = (u16)val;
-				break;
-
-			case 4:
-				*(u32 *)&entry->fields[n_u64] = (u32)val;
-				break;
-
-			default:
-				entry->fields[n_u64] = val;
-				break;
-			}
-			n_u64++;
-		}
-	}
-
-	trace_event_buffer_commit(&fbuffer);
-out:
-	ring_buffer_nest_end(buffer);
-}
-
-static void free_synth_event_print_fmt(struct trace_event_call *call)
-{
-	if (call) {
-		kfree(call->print_fmt);
-		call->print_fmt = NULL;
-	}
-}
-
-static int __set_synth_event_print_fmt(struct synth_event *event,
-				       char *buf, int len)
-{
-	const char *fmt;
-	int pos = 0;
-	int i;
-
-	/* When len=0, we just calculate the needed length */
-#define LEN_OR_ZERO (len ? len - pos : 0)
-
-	pos += snprintf(buf + pos, LEN_OR_ZERO, "\"");
-	for (i = 0; i < event->n_fields; i++) {
-		fmt = synth_field_fmt(event->fields[i]->type);
-		pos += snprintf(buf + pos, LEN_OR_ZERO, "%s=%s%s",
-				event->fields[i]->name, fmt,
-				i == event->n_fields - 1 ? "" : ", ");
-	}
-	pos += snprintf(buf + pos, LEN_OR_ZERO, "\"");
-
-	for (i = 0; i < event->n_fields; i++) {
-		pos += snprintf(buf + pos, LEN_OR_ZERO,
-				", REC->%s", event->fields[i]->name);
-	}
-
-#undef LEN_OR_ZERO
-
-	/* return the length of print_fmt */
-	return pos;
-}
-
-static int set_synth_event_print_fmt(struct trace_event_call *call)
-{
-	struct synth_event *event = call->data;
-	char *print_fmt;
-	int len;
-
-	/* First: called with 0 length to calculate the needed length */
-	len = __set_synth_event_print_fmt(event, NULL, 0);
-
-	print_fmt = kmalloc(len + 1, GFP_KERNEL);
-	if (!print_fmt)
-		return -ENOMEM;
-
-	/* Second: actually write the @print_fmt */
-	__set_synth_event_print_fmt(event, print_fmt, len + 1);
-	call->print_fmt = print_fmt;
-
-	return 0;
-}
-
-static void free_synth_field(struct synth_field *field)
-{
-	kfree(field->type);
-	kfree(field->name);
-	kfree(field);
-}
-
-static struct synth_field *parse_synth_field(int argc, const char **argv,
-					     int *consumed)
-{
-	struct synth_field *field;
-	const char *prefix = NULL, *field_type = argv[0], *field_name, *array;
-	int len, ret = 0;
-
-	if (field_type[0] == ';')
-		field_type++;
-
-	if (!strcmp(field_type, "unsigned")) {
-		if (argc < 3)
-			return ERR_PTR(-EINVAL);
-		prefix = "unsigned ";
-		field_type = argv[1];
-		field_name = argv[2];
-		*consumed = 3;
-	} else {
-		field_name = argv[1];
-		*consumed = 2;
-	}
-
-	field = kzalloc(sizeof(*field), GFP_KERNEL);
-	if (!field)
-		return ERR_PTR(-ENOMEM);
-
-	len = strlen(field_name);
-	array = strchr(field_name, '[');
-	if (array)
-		len -= strlen(array);
-	else if (field_name[len - 1] == ';')
-		len--;
-
-	field->name = kmemdup_nul(field_name, len, GFP_KERNEL);
-	if (!field->name) {
-		ret = -ENOMEM;
-		goto free;
-	}
-
-	if (field_type[0] == ';')
-		field_type++;
-	len = strlen(field_type) + 1;
-	if (array)
-		len += strlen(array);
-	if (prefix)
-		len += strlen(prefix);
-
-	field->type = kzalloc(len, GFP_KERNEL);
-	if (!field->type) {
-		ret = -ENOMEM;
-		goto free;
-	}
-	if (prefix)
-		strcat(field->type, prefix);
-	strcat(field->type, field_type);
-	if (array) {
-		strcat(field->type, array);
-		if (field->type[len - 1] == ';')
-			field->type[len - 1] = '\0';
-	}
-
-	field->size = synth_field_size(field->type);
-	if (!field->size) {
-		ret = -EINVAL;
-		goto free;
-	}
-
-	if (synth_field_is_string(field->type))
-		field->is_string = true;
-
-	field->is_signed = synth_field_signed(field->type);
-
- out:
-	return field;
- free:
-	free_synth_field(field);
-	field = ERR_PTR(ret);
-	goto out;
-}
-
-static void free_synth_tracepoint(struct tracepoint *tp)
-{
-	if (!tp)
-		return;
-
-	kfree(tp->name);
-	kfree(tp);
-}
-
-static struct tracepoint *alloc_synth_tracepoint(char *name)
-{
-	struct tracepoint *tp;
-
-	tp = kzalloc(sizeof(*tp), GFP_KERNEL);
-	if (!tp)
-		return ERR_PTR(-ENOMEM);
-
-	tp->name = kstrdup(name, GFP_KERNEL);
-	if (!tp->name) {
-		kfree(tp);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	return tp;
-}
-
-typedef void (*synth_probe_func_t) (void *__data, u64 *var_ref_vals,
-				    unsigned int *var_ref_idx);
-
-static inline void trace_synth(struct synth_event *event, u64 *var_ref_vals,
-			       unsigned int *var_ref_idx)
-{
-	struct tracepoint *tp = event->tp;
-
-	if (unlikely(atomic_read(&tp->key.enabled) > 0)) {
-		struct tracepoint_func *probe_func_ptr;
-		synth_probe_func_t probe_func;
-		void *__data;
-
-		if (!(cpu_online(raw_smp_processor_id())))
-			return;
-
-		probe_func_ptr = rcu_dereference_sched((tp)->funcs);
-		if (probe_func_ptr) {
-			do {
-				probe_func = probe_func_ptr->func;
-				__data = probe_func_ptr->data;
-				probe_func(__data, var_ref_vals, var_ref_idx);
-			} while ((++probe_func_ptr)->func);
-		}
-	}
-}
-
-static struct synth_event *find_synth_event(const char *name)
-{
-	struct dyn_event *pos;
-	struct synth_event *event;
-
-	for_each_dyn_event(pos) {
-		if (!is_synth_event(pos))
-			continue;
-		event = to_synth_event(pos);
-		if (strcmp(event->name, name) == 0)
-			return event;
-	}
-
-	return NULL;
-}
-
-static struct trace_event_fields synth_event_fields_array[] = {
-	{ .type = TRACE_FUNCTION_TYPE,
-	  .define_fields = synth_event_define_fields },
-	{}
-};
-
-static int register_synth_event(struct synth_event *event)
-{
-	struct trace_event_call *call = &event->call;
-	int ret = 0;
-
-	event->call.class = &event->class;
-	event->class.system = kstrdup(SYNTH_SYSTEM, GFP_KERNEL);
-	if (!event->class.system) {
-		ret = -ENOMEM;
-		goto out;
-	}
-
-	event->tp = alloc_synth_tracepoint(event->name);
-	if (IS_ERR(event->tp)) {
-		ret = PTR_ERR(event->tp);
-		event->tp = NULL;
-		goto out;
-	}
-
-	INIT_LIST_HEAD(&call->class->fields);
-	call->event.funcs = &synth_event_funcs;
-	call->class->fields_array = synth_event_fields_array;
-
-	ret = register_trace_event(&call->event);
-	if (!ret) {
-		ret = -ENODEV;
-		goto out;
-	}
-	call->flags = TRACE_EVENT_FL_TRACEPOINT;
-	call->class->reg = trace_event_reg;
-	call->class->probe = trace_event_raw_event_synth;
-	call->data = event;
-	call->tp = event->tp;
-
-	ret = trace_add_event_call(call);
-	if (ret) {
-		pr_warn("Failed to register synthetic event: %s\n",
-			trace_event_name(call));
-		goto err;
-	}
-
-	ret = set_synth_event_print_fmt(call);
-	if (ret < 0) {
-		trace_remove_event_call(call);
-		goto err;
-	}
- out:
-	return ret;
- err:
-	unregister_trace_event(&call->event);
-	goto out;
-}
-
-static int unregister_synth_event(struct synth_event *event)
-{
-	struct trace_event_call *call = &event->call;
-	int ret;
-
-	ret = trace_remove_event_call(call);
-
-	return ret;
-}
-
-static void free_synth_event(struct synth_event *event)
-{
-	unsigned int i;
-
-	if (!event)
-		return;
-
-	for (i = 0; i < event->n_fields; i++)
-		free_synth_field(event->fields[i]);
-
-	kfree(event->fields);
-	kfree(event->name);
-	kfree(event->class.system);
-	free_synth_tracepoint(event->tp);
-	free_synth_event_print_fmt(&event->call);
-	kfree(event);
-}
-
-static struct synth_event *alloc_synth_event(const char *name, int n_fields,
-					     struct synth_field **fields)
-{
-	struct synth_event *event;
-	unsigned int i;
-
-	event = kzalloc(sizeof(*event), GFP_KERNEL);
-	if (!event) {
-		event = ERR_PTR(-ENOMEM);
-		goto out;
-	}
-
-	event->name = kstrdup(name, GFP_KERNEL);
-	if (!event->name) {
-		kfree(event);
-		event = ERR_PTR(-ENOMEM);
-		goto out;
-	}
-
-	event->fields = kcalloc(n_fields, sizeof(*event->fields), GFP_KERNEL);
-	if (!event->fields) {
-		free_synth_event(event);
-		event = ERR_PTR(-ENOMEM);
-		goto out;
-	}
-
-	dyn_event_init(&event->devent, &synth_event_ops);
-
-	for (i = 0; i < n_fields; i++)
-		event->fields[i] = fields[i];
-
-	event->n_fields = n_fields;
- out:
-	return event;
-}
-
-static void action_trace(struct hist_trigger_data *hist_data,
-			 struct tracing_map_elt *elt, void *rec,
-			 struct ring_buffer_event *rbe, void *key,
-			 struct action_data *data, u64 *var_ref_vals)
-{
-	struct synth_event *event = data->synth_event;
-
-	trace_synth(event, var_ref_vals, data->var_ref_idx);
-}
-
-struct hist_var_data {
-	struct list_head list;
-	struct hist_trigger_data *hist_data;
-};
-
-static int synth_event_check_arg_fn(void *data)
-{
-	struct dynevent_arg_pair *arg_pair = data;
-	int size;
-
-	size = synth_field_size((char *)arg_pair->lhs);
-
-	return size ? 0 : -EINVAL;
-}
-
-/**
- * synth_event_add_field - Add a new field to a synthetic event cmd
- * @cmd: A pointer to the dynevent_cmd struct representing the new event
- * @type: The type of the new field to add
- * @name: The name of the new field to add
- *
- * Add a new field to a synthetic event cmd object.  Field ordering is in
- * the same order the fields are added.
- *
- * See synth_field_size() for available types. If field_name contains
- * [n] the field is considered to be an array.
- *
- * Return: 0 if successful, error otherwise.
- */
-int synth_event_add_field(struct dynevent_cmd *cmd, const char *type,
-			  const char *name)
-{
-	struct dynevent_arg_pair arg_pair;
-	int ret;
-
-	if (cmd->type != DYNEVENT_TYPE_SYNTH)
-		return -EINVAL;
-
-	if (!type || !name)
-		return -EINVAL;
-
-	dynevent_arg_pair_init(&arg_pair, 0, ';');
-
-	arg_pair.lhs = type;
-	arg_pair.rhs = name;
-
-	ret = dynevent_arg_pair_add(cmd, &arg_pair, synth_event_check_arg_fn);
-	if (ret)
-		return ret;
-
-	if (++cmd->n_fields > SYNTH_FIELDS_MAX)
-		ret = -EINVAL;
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(synth_event_add_field);
-
-/**
- * synth_event_add_field_str - Add a new field to a synthetic event cmd
- * @cmd: A pointer to the dynevent_cmd struct representing the new event
- * @type_name: The type and name of the new field to add, as a single string
- *
- * Add a new field to a synthetic event cmd object, as a single
- * string.  The @type_name string is expected to be of the form 'type
- * name', which will be appended by ';'.  No sanity checking is done -
- * what's passed in is assumed to already be well-formed.  Field
- * ordering is in the same order the fields are added.
- *
- * See synth_field_size() for available types. If field_name contains
- * [n] the field is considered to be an array.
- *
- * Return: 0 if successful, error otherwise.
- */
-int synth_event_add_field_str(struct dynevent_cmd *cmd, const char *type_name)
-{
-	struct dynevent_arg arg;
-	int ret;
-
-	if (cmd->type != DYNEVENT_TYPE_SYNTH)
-		return -EINVAL;
-
-	if (!type_name)
-		return -EINVAL;
-
-	dynevent_arg_init(&arg, ';');
-
-	arg.str = type_name;
-
-	ret = dynevent_arg_add(cmd, &arg, NULL);
-	if (ret)
-		return ret;
-
-	if (++cmd->n_fields > SYNTH_FIELDS_MAX)
-		ret = -EINVAL;
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(synth_event_add_field_str);
-
-/**
- * synth_event_add_fields - Add multiple fields to a synthetic event cmd
- * @cmd: A pointer to the dynevent_cmd struct representing the new event
- * @fields: An array of type/name field descriptions
- * @n_fields: The number of field descriptions contained in the fields array
- *
- * Add a new set of fields to a synthetic event cmd object.  The event
- * fields that will be defined for the event should be passed in as an
- * array of struct synth_field_desc, and the number of elements in the
- * array passed in as n_fields.  Field ordering will retain the
- * ordering given in the fields array.
- *
- * See synth_field_size() for available types. If field_name contains
- * [n] the field is considered to be an array.
- *
- * Return: 0 if successful, error otherwise.
- */
-int synth_event_add_fields(struct dynevent_cmd *cmd,
-			   struct synth_field_desc *fields,
-			   unsigned int n_fields)
-{
-	unsigned int i;
-	int ret = 0;
-
-	for (i = 0; i < n_fields; i++) {
-		if (fields[i].type == NULL || fields[i].name == NULL) {
-			ret = -EINVAL;
-			break;
-		}
-
-		ret = synth_event_add_field(cmd, fields[i].type, fields[i].name);
-		if (ret)
-			break;
-	}
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(synth_event_add_fields);
-
-/**
- * __synth_event_gen_cmd_start - Start a synthetic event command from arg list
- * @cmd: A pointer to the dynevent_cmd struct representing the new event
- * @name: The name of the synthetic event
- * @mod: The module creating the event, NULL if not created from a module
- * @args: Variable number of arg (pairs), one pair for each field
- *
- * NOTE: Users normally won't want to call this function directly, but
- * rather use the synth_event_gen_cmd_start() wrapper, which
- * automatically adds a NULL to the end of the arg list.  If this
- * function is used directly, make sure the last arg in the variable
- * arg list is NULL.
- *
- * Generate a synthetic event command to be executed by
- * synth_event_gen_cmd_end().  This function can be used to generate
- * the complete command or only the first part of it; in the latter
- * case, synth_event_add_field(), synth_event_add_field_str(), or
- * synth_event_add_fields() can be used to add more fields following
- * this.
- *
- * There should be an even number variable args, each pair consisting
- * of a type followed by a field name.
- *
- * See synth_field_size() for available types. If field_name contains
- * [n] the field is considered to be an array.
- *
- * Return: 0 if successful, error otherwise.
- */
-int __synth_event_gen_cmd_start(struct dynevent_cmd *cmd, const char *name,
-				struct module *mod, ...)
-{
-	struct dynevent_arg arg;
-	va_list args;
-	int ret;
-
-	cmd->event_name = name;
-	cmd->private_data = mod;
-
-	if (cmd->type != DYNEVENT_TYPE_SYNTH)
-		return -EINVAL;
-
-	dynevent_arg_init(&arg, 0);
-	arg.str = name;
-	ret = dynevent_arg_add(cmd, &arg, NULL);
-	if (ret)
-		return ret;
-
-	va_start(args, mod);
-	for (;;) {
-		const char *type, *name;
-
-		type = va_arg(args, const char *);
-		if (!type)
-			break;
-		name = va_arg(args, const char *);
-		if (!name)
-			break;
-
-		if (++cmd->n_fields > SYNTH_FIELDS_MAX) {
-			ret = -EINVAL;
-			break;
-		}
-
-		ret = synth_event_add_field(cmd, type, name);
-		if (ret)
-			break;
-	}
-	va_end(args);
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(__synth_event_gen_cmd_start);
-
-/**
- * synth_event_gen_cmd_array_start - Start synthetic event command from an array
- * @cmd: A pointer to the dynevent_cmd struct representing the new event
- * @name: The name of the synthetic event
- * @fields: An array of type/name field descriptions
- * @n_fields: The number of field descriptions contained in the fields array
- *
- * Generate a synthetic event command to be executed by
- * synth_event_gen_cmd_end().  This function can be used to generate
- * the complete command or only the first part of it; in the latter
- * case, synth_event_add_field(), synth_event_add_field_str(), or
- * synth_event_add_fields() can be used to add more fields following
- * this.
- *
- * The event fields that will be defined for the event should be
- * passed in as an array of struct synth_field_desc, and the number of
- * elements in the array passed in as n_fields.  Field ordering will
- * retain the ordering given in the fields array.
- *
- * See synth_field_size() for available types. If field_name contains
- * [n] the field is considered to be an array.
- *
- * Return: 0 if successful, error otherwise.
- */
-int synth_event_gen_cmd_array_start(struct dynevent_cmd *cmd, const char *name,
-				    struct module *mod,
-				    struct synth_field_desc *fields,
-				    unsigned int n_fields)
-{
-	struct dynevent_arg arg;
-	unsigned int i;
-	int ret = 0;
-
-	cmd->event_name = name;
-	cmd->private_data = mod;
-
-	if (cmd->type != DYNEVENT_TYPE_SYNTH)
-		return -EINVAL;
-
-	if (n_fields > SYNTH_FIELDS_MAX)
-		return -EINVAL;
-
-	dynevent_arg_init(&arg, 0);
-	arg.str = name;
-	ret = dynevent_arg_add(cmd, &arg, NULL);
-	if (ret)
-		return ret;
-
-	for (i = 0; i < n_fields; i++) {
-		if (fields[i].type == NULL || fields[i].name == NULL)
-			return -EINVAL;
-
-		ret = synth_event_add_field(cmd, fields[i].type, fields[i].name);
-		if (ret)
-			break;
-	}
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(synth_event_gen_cmd_array_start);
-
-static int __create_synth_event(int argc, const char *name, const char **argv)
-{
-	struct synth_field *field, *fields[SYNTH_FIELDS_MAX];
-	struct synth_event *event = NULL;
-	int i, consumed = 0, n_fields = 0, ret = 0;
-
-	/*
-	 * Argument syntax:
-	 *  - Add synthetic event: <event_name> field[;field] ...
-	 *  - Remove synthetic event: !<event_name> field[;field] ...
-	 *      where 'field' = type field_name
-	 */
-
-	if (name[0] == '\0' || argc < 1)
-		return -EINVAL;
-
-	mutex_lock(&event_mutex);
-
-	event = find_synth_event(name);
-	if (event) {
-		ret = -EEXIST;
-		goto out;
-	}
-
-	for (i = 0; i < argc - 1; i++) {
-		if (strcmp(argv[i], ";") == 0)
-			continue;
-		if (n_fields == SYNTH_FIELDS_MAX) {
-			ret = -EINVAL;
-			goto err;
-		}
-
-		field = parse_synth_field(argc - i, &argv[i], &consumed);
-		if (IS_ERR(field)) {
-			ret = PTR_ERR(field);
-			goto err;
-		}
-		fields[n_fields++] = field;
-		i += consumed - 1;
-	}
-
-	if (i < argc && strcmp(argv[i], ";") != 0) {
-		ret = -EINVAL;
-		goto err;
-	}
-
-	event = alloc_synth_event(name, n_fields, fields);
-	if (IS_ERR(event)) {
-		ret = PTR_ERR(event);
-		event = NULL;
-		goto err;
-	}
-	ret = register_synth_event(event);
-	if (!ret)
-		dyn_event_add(&event->devent);
-	else
-		free_synth_event(event);
- out:
-	mutex_unlock(&event_mutex);
-
-	return ret;
- err:
-	for (i = 0; i < n_fields; i++)
-		free_synth_field(fields[i]);
-
-	goto out;
-}
-
-/**
- * synth_event_create - Create a new synthetic event
- * @name: The name of the new sythetic event
- * @fields: An array of type/name field descriptions
- * @n_fields: The number of field descriptions contained in the fields array
- * @mod: The module creating the event, NULL if not created from a module
- *
- * Create a new synthetic event with the given name under the
- * trace/events/synthetic/ directory.  The event fields that will be
- * defined for the event should be passed in as an array of struct
- * synth_field_desc, and the number elements in the array passed in as
- * n_fields. Field ordering will retain the ordering given in the
- * fields array.
- *
- * If the new synthetic event is being created from a module, the mod
- * param must be non-NULL.  This will ensure that the trace buffer
- * won't contain unreadable events.
- *
- * The new synth event should be deleted using synth_event_delete()
- * function.  The new synthetic event can be generated from modules or
- * other kernel code using trace_synth_event() and related functions.
- *
- * Return: 0 if successful, error otherwise.
- */
-int synth_event_create(const char *name, struct synth_field_desc *fields,
-		       unsigned int n_fields, struct module *mod)
-{
-	struct dynevent_cmd cmd;
-	char *buf;
-	int ret;
-
-	buf = kzalloc(MAX_DYNEVENT_CMD_LEN, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
-
-	synth_event_cmd_init(&cmd, buf, MAX_DYNEVENT_CMD_LEN);
-
-	ret = synth_event_gen_cmd_array_start(&cmd, name, mod,
-					      fields, n_fields);
-	if (ret)
-		goto out;
-
-	ret = synth_event_gen_cmd_end(&cmd);
- out:
-	kfree(buf);
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(synth_event_create);
-
-static int destroy_synth_event(struct synth_event *se)
-{
-	int ret;
-
-	if (se->ref)
-		ret = -EBUSY;
-	else {
-		ret = unregister_synth_event(se);
-		if (!ret) {
-			dyn_event_remove(&se->devent);
-			free_synth_event(se);
-		}
-	}
-
-	return ret;
-}
-
-/**
- * synth_event_delete - Delete a synthetic event
- * @event_name: The name of the new sythetic event
- *
- * Delete a synthetic event that was created with synth_event_create().
- *
- * Return: 0 if successful, error otherwise.
- */
-int synth_event_delete(const char *event_name)
-{
-	struct synth_event *se = NULL;
-	struct module *mod = NULL;
-	int ret = -ENOENT;
-
-	mutex_lock(&event_mutex);
-	se = find_synth_event(event_name);
-	if (se) {
-		mod = se->mod;
-		ret = destroy_synth_event(se);
-	}
-	mutex_unlock(&event_mutex);
-
-	if (mod) {
-		mutex_lock(&trace_types_lock);
-		/*
-		 * It is safest to reset the ring buffer if the module
-		 * being unloaded registered any events that were
-		 * used. The only worry is if a new module gets
-		 * loaded, and takes on the same id as the events of
-		 * this module. When printing out the buffer, traced
-		 * events left over from this module may be passed to
-		 * the new module events and unexpected results may
-		 * occur.
-		 */
-		tracing_reset_all_online_cpus();
-		mutex_unlock(&trace_types_lock);
-	}
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(synth_event_delete);
-
-static int create_or_delete_synth_event(int argc, char **argv)
-{
-	const char *name = argv[0];
-	int ret;
-
-	/* trace_run_command() ensures argc != 0 */
-	if (name[0] == '!') {
-		ret = synth_event_delete(name + 1);
-		return ret;
-	}
-
-	ret = __create_synth_event(argc - 1, name, (const char **)argv + 1);
-	return ret == -ECANCELED ? -EINVAL : ret;
-}
-
-static int synth_event_run_command(struct dynevent_cmd *cmd)
-{
-	struct synth_event *se;
-	int ret;
-
-	ret = trace_run_command(cmd->seq.buffer, create_or_delete_synth_event);
-	if (ret)
-		return ret;
-
-	se = find_synth_event(cmd->event_name);
-	if (WARN_ON(!se))
-		return -ENOENT;
-
-	se->mod = cmd->private_data;
-
-	return ret;
-}
-
-/**
- * synth_event_cmd_init - Initialize a synthetic event command object
- * @cmd: A pointer to the dynevent_cmd struct representing the new event
- * @buf: A pointer to the buffer used to build the command
- * @maxlen: The length of the buffer passed in @buf
- *
- * Initialize a synthetic event command object.  Use this before
- * calling any of the other dyenvent_cmd functions.
- */
-void synth_event_cmd_init(struct dynevent_cmd *cmd, char *buf, int maxlen)
-{
-	dynevent_cmd_init(cmd, buf, maxlen, DYNEVENT_TYPE_SYNTH,
-			  synth_event_run_command);
-}
-EXPORT_SYMBOL_GPL(synth_event_cmd_init);
-
-static inline int
-__synth_event_trace_start(struct trace_event_file *file,
-			  struct synth_event_trace_state *trace_state)
-{
-	int entry_size, fields_size = 0;
-	int ret = 0;
-
-	memset(trace_state, '\0', sizeof(*trace_state));
-
-	/*
-	 * Normal event tracing doesn't get called at all unless the
-	 * ENABLED bit is set (which attaches the probe thus allowing
-	 * this code to be called, etc).  Because this is called
-	 * directly by the user, we don't have that but we still need
-	 * to honor not logging when disabled.  For the the iterated
-	 * trace case, we save the enabed state upon start and just
-	 * ignore the following data calls.
-	 */
-	if (!(file->flags & EVENT_FILE_FL_ENABLED) ||
-	    trace_trigger_soft_disabled(file)) {
-		trace_state->disabled = true;
-		goto out;
-	}
-
-	trace_state->event = file->event_call->data;
-
-	fields_size = trace_state->event->n_u64 * sizeof(u64);
-
-	/*
-	 * Avoid ring buffer recursion detection, as this event
-	 * is being performed within another event.
-	 */
-	trace_state->buffer = file->tr->array_buffer.buffer;
-	ring_buffer_nest_start(trace_state->buffer);
-
-	entry_size = sizeof(*trace_state->entry) + fields_size;
-	trace_state->entry = trace_event_buffer_reserve(&trace_state->fbuffer,
-							file,
-							entry_size);
-	if (!trace_state->entry) {
-		ring_buffer_nest_end(trace_state->buffer);
-		ret = -EINVAL;
-	}
-out:
-	return ret;
-}
-
-static inline void
-__synth_event_trace_end(struct synth_event_trace_state *trace_state)
-{
-	trace_event_buffer_commit(&trace_state->fbuffer);
-
-	ring_buffer_nest_end(trace_state->buffer);
-}
-
-/**
- * synth_event_trace - Trace a synthetic event
- * @file: The trace_event_file representing the synthetic event
- * @n_vals: The number of values in vals
- * @args: Variable number of args containing the event values
- *
- * Trace a synthetic event using the values passed in the variable
- * argument list.
- *
- * The argument list should be a list 'n_vals' u64 values.  The number
- * of vals must match the number of field in the synthetic event, and
- * must be in the same order as the synthetic event fields.
- *
- * All vals should be cast to u64, and string vals are just pointers
- * to strings, cast to u64.  Strings will be copied into space
- * reserved in the event for the string, using these pointers.
- *
- * Return: 0 on success, err otherwise.
- */
-int synth_event_trace(struct trace_event_file *file, unsigned int n_vals, ...)
-{
-	struct synth_event_trace_state state;
-	unsigned int i, n_u64;
-	va_list args;
-	int ret;
-
-	ret = __synth_event_trace_start(file, &state);
-	if (ret || state.disabled)
-		return ret;
-
-	if (n_vals != state.event->n_fields) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	va_start(args, n_vals);
-	for (i = 0, n_u64 = 0; i < state.event->n_fields; i++) {
-		u64 val;
-
-		val = va_arg(args, u64);
-
-		if (state.event->fields[i]->is_string) {
-			char *str_val = (char *)(long)val;
-			char *str_field = (char *)&state.entry->fields[n_u64];
-
-			strscpy(str_field, str_val, STR_VAR_LEN_MAX);
-			n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
-		} else {
-			struct synth_field *field = state.event->fields[i];
-
-			switch (field->size) {
-			case 1:
-				*(u8 *)&state.entry->fields[n_u64] = (u8)val;
-				break;
-
-			case 2:
-				*(u16 *)&state.entry->fields[n_u64] = (u16)val;
-				break;
-
-			case 4:
-				*(u32 *)&state.entry->fields[n_u64] = (u32)val;
-				break;
-
-			default:
-				state.entry->fields[n_u64] = val;
-				break;
-			}
-			n_u64++;
-		}
-	}
-	va_end(args);
-out:
-	__synth_event_trace_end(&state);
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(synth_event_trace);
-
-/**
- * synth_event_trace_array - Trace a synthetic event from an array
- * @file: The trace_event_file representing the synthetic event
- * @vals: Array of values
- * @n_vals: The number of values in vals
- *
- * Trace a synthetic event using the values passed in as 'vals'.
- *
- * The 'vals' array is just an array of 'n_vals' u64.  The number of
- * vals must match the number of field in the synthetic event, and
- * must be in the same order as the synthetic event fields.
- *
- * All vals should be cast to u64, and string vals are just pointers
- * to strings, cast to u64.  Strings will be copied into space
- * reserved in the event for the string, using these pointers.
- *
- * Return: 0 on success, err otherwise.
- */
-int synth_event_trace_array(struct trace_event_file *file, u64 *vals,
-			    unsigned int n_vals)
-{
-	struct synth_event_trace_state state;
-	unsigned int i, n_u64;
-	int ret;
-
-	ret = __synth_event_trace_start(file, &state);
-	if (ret || state.disabled)
-		return ret;
-
-	if (n_vals != state.event->n_fields) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	for (i = 0, n_u64 = 0; i < state.event->n_fields; i++) {
-		if (state.event->fields[i]->is_string) {
-			char *str_val = (char *)(long)vals[i];
-			char *str_field = (char *)&state.entry->fields[n_u64];
-
-			strscpy(str_field, str_val, STR_VAR_LEN_MAX);
-			n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
-		} else {
-			struct synth_field *field = state.event->fields[i];
-			u64 val = vals[i];
-
-			switch (field->size) {
-			case 1:
-				*(u8 *)&state.entry->fields[n_u64] = (u8)val;
-				break;
-
-			case 2:
-				*(u16 *)&state.entry->fields[n_u64] = (u16)val;
-				break;
-
-			case 4:
-				*(u32 *)&state.entry->fields[n_u64] = (u32)val;
-				break;
-
-			default:
-				state.entry->fields[n_u64] = val;
-				break;
-			}
-			n_u64++;
-		}
-	}
-out:
-	__synth_event_trace_end(&state);
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(synth_event_trace_array);
-
-/**
- * synth_event_trace_start - Start piecewise synthetic event trace
- * @file: The trace_event_file representing the synthetic event
- * @trace_state: A pointer to object tracking the piecewise trace state
- *
- * Start the trace of a synthetic event field-by-field rather than all
- * at once.
- *
- * This function 'opens' an event trace, which means space is reserved
- * for the event in the trace buffer, after which the event's
- * individual field values can be set through either
- * synth_event_add_next_val() or synth_event_add_val().
- *
- * A pointer to a trace_state object is passed in, which will keep
- * track of the current event trace state until the event trace is
- * closed (and the event finally traced) using
- * synth_event_trace_end().
- *
- * Note that synth_event_trace_end() must be called after all values
- * have been added for each event trace, regardless of whether adding
- * all field values succeeded or not.
- *
- * Note also that for a given event trace, all fields must be added
- * using either synth_event_add_next_val() or synth_event_add_val()
- * but not both together or interleaved.
- *
- * Return: 0 on success, err otherwise.
- */
-int synth_event_trace_start(struct trace_event_file *file,
-			    struct synth_event_trace_state *trace_state)
-{
-	if (!trace_state)
-		return -EINVAL;
+			char			*event_system;
+		} match_data;
 
-	return __synth_event_trace_start(file, trace_state);
-}
-EXPORT_SYMBOL_GPL(synth_event_trace_start);
+		struct {
+			/*
+			 * var_str contains the $-unstripped variable
+			 * name referenced by var_ref, and used when
+			 * printing the action.  Because var_ref
+			 * creation is deferred to create_actions(),
+			 * we need a per-action way to save it until
+			 * then, thus var_str.
+			 */
+			char			*var_str;
 
-static int __synth_event_add_val(const char *field_name, u64 val,
-				 struct synth_event_trace_state *trace_state)
-{
-	struct synth_field *field = NULL;
-	struct synth_trace_event *entry;
-	struct synth_event *event;
-	int i, ret = 0;
+			/*
+			 * var_ref refers to the variable being
+			 * tracked e.g onmax($var).
+			 */
+			struct hist_field	*var_ref;
 
-	if (!trace_state) {
-		ret = -EINVAL;
-		goto out;
-	}
+			/*
+			 * track_var contains the 'invisible' tracking
+			 * variable created to keep the current
+			 * e.g. max value.
+			 */
+			struct hist_field	*track_var;
 
-	/* can't mix add_next_synth_val() with add_synth_val() */
-	if (field_name) {
-		if (trace_state->add_next) {
-			ret = -EINVAL;
-			goto out;
-		}
-		trace_state->add_name = true;
-	} else {
-		if (trace_state->add_name) {
-			ret = -EINVAL;
-			goto out;
-		}
-		trace_state->add_next = true;
-	}
+			check_track_val_fn_t	check_val;
+			action_fn_t		save_data;
+		} track_data;
+	};
+};
 
-	if (trace_state->disabled)
-		goto out;
+struct track_data {
+	u64				track_val;
+	bool				updated;
 
-	event = trace_state->event;
-	if (trace_state->add_name) {
-		for (i = 0; i < event->n_fields; i++) {
-			field = event->fields[i];
-			if (strcmp(field->name, field_name) == 0)
-				break;
-		}
-		if (!field) {
-			ret = -EINVAL;
-			goto out;
-		}
-	} else {
-		if (trace_state->cur_field >= event->n_fields) {
-			ret = -EINVAL;
-			goto out;
-		}
-		field = event->fields[trace_state->cur_field++];
-	}
+	unsigned int			key_len;
+	void				*key;
+	struct tracing_map_elt		elt;
 
-	entry = trace_state->entry;
-	if (field->is_string) {
-		char *str_val = (char *)(long)val;
-		char *str_field;
+	struct action_data		*action_data;
+	struct hist_trigger_data	*hist_data;
+};
 
-		if (!str_val) {
-			ret = -EINVAL;
-			goto out;
-		}
+struct hist_elt_data {
+	char *comm;
+	u64 *var_ref_vals;
+	char *field_var_str[SYNTH_FIELDS_MAX];
+};
 
-		str_field = (char *)&entry->fields[field->offset];
-		strscpy(str_field, str_val, STR_VAR_LEN_MAX);
-	} else {
-		switch (field->size) {
-		case 1:
-			*(u8 *)&trace_state->entry->fields[field->offset] = (u8)val;
-			break;
+struct snapshot_context {
+	struct tracing_map_elt	*elt;
+	void			*key;
+};
 
-		case 2:
-			*(u16 *)&trace_state->entry->fields[field->offset] = (u16)val;
-			break;
+static void track_data_free(struct track_data *track_data)
+{
+	struct hist_elt_data *elt_data;
 
-		case 4:
-			*(u32 *)&trace_state->entry->fields[field->offset] = (u32)val;
-			break;
+	if (!track_data)
+		return;
 
-		default:
-			trace_state->entry->fields[field->offset] = val;
-			break;
-		}
+	kfree(track_data->key);
+
+	elt_data = track_data->elt.private_data;
+	if (elt_data) {
+		kfree(elt_data->comm);
+		kfree(elt_data);
 	}
- out:
-	return ret;
-}
 
-/**
- * synth_event_add_next_val - Add the next field's value to an open synth trace
- * @val: The value to set the next field to
- * @trace_state: A pointer to object tracking the piecewise trace state
- *
- * Set the value of the next field in an event that's been opened by
- * synth_event_trace_start().
- *
- * The val param should be the value cast to u64.  If the value points
- * to a string, the val param should be a char * cast to u64.
- *
- * This function assumes all the fields in an event are to be set one
- * after another - successive calls to this function are made, one for
- * each field, in the order of the fields in the event, until all
- * fields have been set.  If you'd rather set each field individually
- * without regard to ordering, synth_event_add_val() can be used
- * instead.
- *
- * Note however that synth_event_add_next_val() and
- * synth_event_add_val() can't be intermixed for a given event trace -
- * one or the other but not both can be used at the same time.
- *
- * Note also that synth_event_trace_end() must be called after all
- * values have been added for each event trace, regardless of whether
- * adding all field values succeeded or not.
- *
- * Return: 0 on success, err otherwise.
- */
-int synth_event_add_next_val(u64 val,
-			     struct synth_event_trace_state *trace_state)
-{
-	return __synth_event_add_val(NULL, val, trace_state);
+	kfree(track_data);
 }
-EXPORT_SYMBOL_GPL(synth_event_add_next_val);
 
-/**
- * synth_event_add_val - Add a named field's value to an open synth trace
- * @field_name: The name of the synthetic event field value to set
- * @val: The value to set the next field to
- * @trace_state: A pointer to object tracking the piecewise trace state
- *
- * Set the value of the named field in an event that's been opened by
- * synth_event_trace_start().
- *
- * The val param should be the value cast to u64.  If the value points
- * to a string, the val param should be a char * cast to u64.
- *
- * This function looks up the field name, and if found, sets the field
- * to the specified value.  This lookup makes this function more
- * expensive than synth_event_add_next_val(), so use that or the
- * none-piecewise synth_event_trace() instead if efficiency is more
- * important.
- *
- * Note however that synth_event_add_next_val() and
- * synth_event_add_val() can't be intermixed for a given event trace -
- * one or the other but not both can be used at the same time.
- *
- * Note also that synth_event_trace_end() must be called after all
- * values have been added for each event trace, regardless of whether
- * adding all field values succeeded or not.
- *
- * Return: 0 on success, err otherwise.
- */
-int synth_event_add_val(const char *field_name, u64 val,
-			struct synth_event_trace_state *trace_state)
+static struct track_data *track_data_alloc(unsigned int key_len,
+					   struct action_data *action_data,
+					   struct hist_trigger_data *hist_data)
 {
-	return __synth_event_add_val(field_name, val, trace_state);
-}
-EXPORT_SYMBOL_GPL(synth_event_add_val);
+	struct track_data *data = kzalloc(sizeof(*data), GFP_KERNEL);
+	struct hist_elt_data *elt_data;
 
-/**
- * synth_event_trace_end - End piecewise synthetic event trace
- * @trace_state: A pointer to object tracking the piecewise trace state
- *
- * End the trace of a synthetic event opened by
- * synth_event_trace__start().
- *
- * This function 'closes' an event trace, which basically means that
- * it commits the reserved event and cleans up other loose ends.
- *
- * A pointer to a trace_state object is passed in, which will keep
- * track of the current event trace state opened with
- * synth_event_trace_start().
- *
- * Note that this function must be called after all values have been
- * added for each event trace, regardless of whether adding all field
- * values succeeded or not.
- *
- * Return: 0 on success, err otherwise.
- */
-int synth_event_trace_end(struct synth_event_trace_state *trace_state)
-{
-	if (!trace_state)
-		return -EINVAL;
+	if (!data)
+		return ERR_PTR(-ENOMEM);
 
-	__synth_event_trace_end(trace_state);
+	data->key = kzalloc(key_len, GFP_KERNEL);
+	if (!data->key) {
+		track_data_free(data);
+		return ERR_PTR(-ENOMEM);
+	}
 
-	return 0;
-}
-EXPORT_SYMBOL_GPL(synth_event_trace_end);
+	data->key_len = key_len;
+	data->action_data = action_data;
+	data->hist_data = hist_data;
 
-static int create_synth_event(int argc, const char **argv)
-{
-	const char *name = argv[0];
-	int len;
+	elt_data = kzalloc(sizeof(*elt_data), GFP_KERNEL);
+	if (!elt_data) {
+		track_data_free(data);
+		return ERR_PTR(-ENOMEM);
+	}
 
-	if (name[0] != 's' || name[1] != ':')
-		return -ECANCELED;
-	name += 2;
+	data->elt.private_data = elt_data;
 
-	/* This interface accepts group name prefix */
-	if (strchr(name, '/')) {
-		len = str_has_prefix(name, SYNTH_SYSTEM "/");
-		if (len == 0)
-			return -EINVAL;
-		name += len;
+	elt_data->comm = kzalloc(TASK_COMM_LEN, GFP_KERNEL);
+	if (!elt_data->comm) {
+		track_data_free(data);
+		return ERR_PTR(-ENOMEM);
 	}
-	return __create_synth_event(argc - 1, name, argv + 1);
-}
-
-static int synth_event_release(struct dyn_event *ev)
-{
-	struct synth_event *event = to_synth_event(ev);
-	int ret;
 
-	if (event->ref)
-		return -EBUSY;
+	return data;
+}
 
-	ret = unregister_synth_event(event);
-	if (ret)
-		return ret;
+static char last_cmd[MAX_FILTER_STR_VAL];
+static char last_cmd_loc[MAX_FILTER_STR_VAL];
 
-	dyn_event_remove(ev);
-	free_synth_event(event);
-	return 0;
+static int errpos(char *str)
+{
+	return err_pos(last_cmd, str);
 }
 
-static int __synth_event_show(struct seq_file *m, struct synth_event *event)
+static void last_cmd_set(struct trace_event_file *file, char *str)
 {
-	struct synth_field *field;
-	unsigned int i;
+	const char *system = NULL, *name = NULL;
+	struct trace_event_call *call;
 
-	seq_printf(m, "%s\t", event->name);
+	if (!str)
+		return;
 
-	for (i = 0; i < event->n_fields; i++) {
-		field = event->fields[i];
+	strcpy(last_cmd, "hist:");
+	strncat(last_cmd, str, MAX_FILTER_STR_VAL - 1 - sizeof("hist:"));
 
-		/* parameter values */
-		seq_printf(m, "%s %s%s", field->type, field->name,
-			   i == event->n_fields - 1 ? "" : "; ");
+	if (file) {
+		call = file->event_call;
+		system = call->class->system;
+		if (system) {
+			name = trace_event_name(call);
+			if (!name)
+				system = NULL;
+		}
 	}
 
-	seq_putc(m, '\n');
-
-	return 0;
+	if (system)
+		snprintf(last_cmd_loc, MAX_FILTER_STR_VAL, "hist:%s:%s", system, name);
 }
 
-static int synth_event_show(struct seq_file *m, struct dyn_event *ev)
+static void hist_err(struct trace_array *tr, u8 err_type, u8 err_pos)
 {
-	struct synth_event *event = to_synth_event(ev);
-
-	seq_printf(m, "s:%s/", event->class.system);
-
-	return __synth_event_show(m, event);
+	tracing_log_err(tr, last_cmd_loc, last_cmd, err_text,
+			err_type, err_pos);
 }
 
-static int synth_events_seq_show(struct seq_file *m, void *v)
+static void hist_err_clear(void)
 {
-	struct dyn_event *ev = v;
-
-	if (!is_synth_event(ev))
-		return 0;
-
-	return __synth_event_show(m, to_synth_event(ev));
+	last_cmd[0] = '\0';
+	last_cmd_loc[0] = '\0';
 }
 
-static const struct seq_operations synth_events_seq_op = {
-	.start	= dyn_event_seq_start,
-	.next	= dyn_event_seq_next,
-	.stop	= dyn_event_seq_stop,
-	.show	= synth_events_seq_show,
-};
+typedef void (*synth_probe_func_t) (void *__data, u64 *var_ref_vals,
+				    unsigned int *var_ref_idx);
 
-static int synth_events_open(struct inode *inode, struct file *file)
+static inline void trace_synth(struct synth_event *event, u64 *var_ref_vals,
+			       unsigned int *var_ref_idx)
 {
-	int ret;
+	struct tracepoint *tp = event->tp;
 
-	ret = security_locked_down(LOCKDOWN_TRACEFS);
-	if (ret)
-		return ret;
+	if (unlikely(atomic_read(&tp->key.enabled) > 0)) {
+		struct tracepoint_func *probe_func_ptr;
+		synth_probe_func_t probe_func;
+		void *__data;
 
-	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
-		ret = dyn_events_release_all(&synth_event_ops);
-		if (ret < 0)
-			return ret;
-	}
+		if (!(cpu_online(raw_smp_processor_id())))
+			return;
 
-	return seq_open(file, &synth_events_seq_op);
+		probe_func_ptr = rcu_dereference_sched((tp)->funcs);
+		if (probe_func_ptr) {
+			do {
+				probe_func = probe_func_ptr->func;
+				__data = probe_func_ptr->data;
+				probe_func(__data, var_ref_vals, var_ref_idx);
+			} while ((++probe_func_ptr)->func);
+		}
+	}
 }
 
-static ssize_t synth_events_write(struct file *file,
-				  const char __user *buffer,
-				  size_t count, loff_t *ppos)
+static void action_trace(struct hist_trigger_data *hist_data,
+			 struct tracing_map_elt *elt, void *rec,
+			 struct ring_buffer_event *rbe, void *key,
+			 struct action_data *data, u64 *var_ref_vals)
 {
-	return trace_parse_run_command(file, buffer, count, ppos,
-				       create_or_delete_synth_event);
+	struct synth_event *event = data->synth_event;
+
+	trace_synth(event, var_ref_vals, data->var_ref_idx);
 }
 
-static const struct file_operations synth_events_fops = {
-	.open           = synth_events_open,
-	.write		= synth_events_write,
-	.read           = seq_read,
-	.llseek         = seq_lseek,
-	.release        = seq_release,
+struct hist_var_data {
+	struct list_head list;
+	struct hist_trigger_data *hist_data;
 };
 
 static u64 hist_field_timestamp(struct hist_field *hist_field,
@@ -7653,37 +5904,3 @@ __init int register_trigger_hist_enable_disable_cmds(void)
 
 	return ret;
 }
-
-static __init int trace_events_hist_init(void)
-{
-	struct dentry *entry = NULL;
-	struct dentry *d_tracer;
-	int err = 0;
-
-	err = dyn_event_register(&synth_event_ops);
-	if (err) {
-		pr_warn("Could not register synth_event_ops\n");
-		return err;
-	}
-
-	d_tracer = tracing_init_dentry();
-	if (IS_ERR(d_tracer)) {
-		err = PTR_ERR(d_tracer);
-		goto err;
-	}
-
-	entry = tracefs_create_file("synthetic_events", 0644, d_tracer,
-				    NULL, &synth_events_fops);
-	if (!entry) {
-		err = -ENODEV;
-		goto err;
-	}
-
-	return err;
- err:
-	pr_warn("Could not create tracefs 'synthetic_events' entry\n");
-
-	return err;
-}
-
-fs_initcall(trace_events_hist_init);
diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
new file mode 100644
index 000000000000..c6cca0d1d584
--- /dev/null
+++ b/kernel/trace/trace_events_synth.c
@@ -0,0 +1,1789 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * trace_events_synth - synthetic trace events
+ *
+ * Copyright (C) 2015, 2020 Tom Zanussi <tom.zanussi@linux.intel.com>
+ */
+
+#include <linux/module.h>
+#include <linux/kallsyms.h>
+#include <linux/security.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+#include <linux/stacktrace.h>
+#include <linux/rculist.h>
+#include <linux/tracefs.h>
+
+/* for gfp flag names */
+#include <linux/trace_events.h>
+#include <trace/events/mmflags.h>
+
+#include "trace_synth.h"
+
+static int create_synth_event(int argc, const char **argv);
+static int synth_event_show(struct seq_file *m, struct dyn_event *ev);
+static int synth_event_release(struct dyn_event *ev);
+static bool synth_event_is_busy(struct dyn_event *ev);
+static bool synth_event_match(const char *system, const char *event,
+			int argc, const char **argv, struct dyn_event *ev);
+
+static struct dyn_event_operations synth_event_ops = {
+	.create = create_synth_event,
+	.show = synth_event_show,
+	.is_busy = synth_event_is_busy,
+	.free = synth_event_release,
+	.match = synth_event_match,
+};
+
+static bool is_synth_event(struct dyn_event *ev)
+{
+	return ev->ops == &synth_event_ops;
+}
+
+static struct synth_event *to_synth_event(struct dyn_event *ev)
+{
+	return container_of(ev, struct synth_event, devent);
+}
+
+static bool synth_event_is_busy(struct dyn_event *ev)
+{
+	struct synth_event *event = to_synth_event(ev);
+
+	return event->ref != 0;
+}
+
+static bool synth_event_match(const char *system, const char *event,
+			int argc, const char **argv, struct dyn_event *ev)
+{
+	struct synth_event *sev = to_synth_event(ev);
+
+	return strcmp(sev->name, event) == 0 &&
+		(!system || strcmp(system, SYNTH_SYSTEM) == 0);
+}
+
+struct synth_trace_event {
+	struct trace_entry	ent;
+	u64			fields[];
+};
+
+static int synth_event_define_fields(struct trace_event_call *call)
+{
+	struct synth_trace_event trace;
+	int offset = offsetof(typeof(trace), fields);
+	struct synth_event *event = call->data;
+	unsigned int i, size, n_u64;
+	char *name, *type;
+	bool is_signed;
+	int ret = 0;
+
+	for (i = 0, n_u64 = 0; i < event->n_fields; i++) {
+		size = event->fields[i]->size;
+		is_signed = event->fields[i]->is_signed;
+		type = event->fields[i]->type;
+		name = event->fields[i]->name;
+		ret = trace_define_field(call, type, name, offset, size,
+					 is_signed, FILTER_OTHER);
+		if (ret)
+			break;
+
+		event->fields[i]->offset = n_u64;
+
+		if (event->fields[i]->is_string) {
+			offset += STR_VAR_LEN_MAX;
+			n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
+		} else {
+			offset += sizeof(u64);
+			n_u64++;
+		}
+	}
+
+	event->n_u64 = n_u64;
+
+	return ret;
+}
+
+static bool synth_field_signed(char *type)
+{
+	if (str_has_prefix(type, "u"))
+		return false;
+	if (strcmp(type, "gfp_t") == 0)
+		return false;
+
+	return true;
+}
+
+static int synth_field_is_string(char *type)
+{
+	if (strstr(type, "char[") != NULL)
+		return true;
+
+	return false;
+}
+
+static int synth_field_string_size(char *type)
+{
+	char buf[4], *end, *start;
+	unsigned int len;
+	int size, err;
+
+	start = strstr(type, "char[");
+	if (start == NULL)
+		return -EINVAL;
+	start += sizeof("char[") - 1;
+
+	end = strchr(type, ']');
+	if (!end || end < start)
+		return -EINVAL;
+
+	len = end - start;
+	if (len > 3)
+		return -EINVAL;
+
+	strncpy(buf, start, len);
+	buf[len] = '\0';
+
+	err = kstrtouint(buf, 0, &size);
+	if (err)
+		return err;
+
+	if (size > STR_VAR_LEN_MAX)
+		return -EINVAL;
+
+	return size;
+}
+
+static int synth_field_size(char *type)
+{
+	int size = 0;
+
+	if (strcmp(type, "s64") == 0)
+		size = sizeof(s64);
+	else if (strcmp(type, "u64") == 0)
+		size = sizeof(u64);
+	else if (strcmp(type, "s32") == 0)
+		size = sizeof(s32);
+	else if (strcmp(type, "u32") == 0)
+		size = sizeof(u32);
+	else if (strcmp(type, "s16") == 0)
+		size = sizeof(s16);
+	else if (strcmp(type, "u16") == 0)
+		size = sizeof(u16);
+	else if (strcmp(type, "s8") == 0)
+		size = sizeof(s8);
+	else if (strcmp(type, "u8") == 0)
+		size = sizeof(u8);
+	else if (strcmp(type, "char") == 0)
+		size = sizeof(char);
+	else if (strcmp(type, "unsigned char") == 0)
+		size = sizeof(unsigned char);
+	else if (strcmp(type, "int") == 0)
+		size = sizeof(int);
+	else if (strcmp(type, "unsigned int") == 0)
+		size = sizeof(unsigned int);
+	else if (strcmp(type, "long") == 0)
+		size = sizeof(long);
+	else if (strcmp(type, "unsigned long") == 0)
+		size = sizeof(unsigned long);
+	else if (strcmp(type, "pid_t") == 0)
+		size = sizeof(pid_t);
+	else if (strcmp(type, "gfp_t") == 0)
+		size = sizeof(gfp_t);
+	else if (synth_field_is_string(type))
+		size = synth_field_string_size(type);
+
+	return size;
+}
+
+static const char *synth_field_fmt(char *type)
+{
+	const char *fmt = "%llu";
+
+	if (strcmp(type, "s64") == 0)
+		fmt = "%lld";
+	else if (strcmp(type, "u64") == 0)
+		fmt = "%llu";
+	else if (strcmp(type, "s32") == 0)
+		fmt = "%d";
+	else if (strcmp(type, "u32") == 0)
+		fmt = "%u";
+	else if (strcmp(type, "s16") == 0)
+		fmt = "%d";
+	else if (strcmp(type, "u16") == 0)
+		fmt = "%u";
+	else if (strcmp(type, "s8") == 0)
+		fmt = "%d";
+	else if (strcmp(type, "u8") == 0)
+		fmt = "%u";
+	else if (strcmp(type, "char") == 0)
+		fmt = "%d";
+	else if (strcmp(type, "unsigned char") == 0)
+		fmt = "%u";
+	else if (strcmp(type, "int") == 0)
+		fmt = "%d";
+	else if (strcmp(type, "unsigned int") == 0)
+		fmt = "%u";
+	else if (strcmp(type, "long") == 0)
+		fmt = "%ld";
+	else if (strcmp(type, "unsigned long") == 0)
+		fmt = "%lu";
+	else if (strcmp(type, "pid_t") == 0)
+		fmt = "%d";
+	else if (strcmp(type, "gfp_t") == 0)
+		fmt = "%x";
+	else if (synth_field_is_string(type))
+		fmt = "%s";
+
+	return fmt;
+}
+
+static void print_synth_event_num_val(struct trace_seq *s,
+				      char *print_fmt, char *name,
+				      int size, u64 val, char *space)
+{
+	switch (size) {
+	case 1:
+		trace_seq_printf(s, print_fmt, name, (u8)val, space);
+		break;
+
+	case 2:
+		trace_seq_printf(s, print_fmt, name, (u16)val, space);
+		break;
+
+	case 4:
+		trace_seq_printf(s, print_fmt, name, (u32)val, space);
+		break;
+
+	default:
+		trace_seq_printf(s, print_fmt, name, val, space);
+		break;
+	}
+}
+
+static enum print_line_t print_synth_event(struct trace_iterator *iter,
+					   int flags,
+					   struct trace_event *event)
+{
+	struct trace_array *tr = iter->tr;
+	struct trace_seq *s = &iter->seq;
+	struct synth_trace_event *entry;
+	struct synth_event *se;
+	unsigned int i, n_u64;
+	char print_fmt[32];
+	const char *fmt;
+
+	entry = (struct synth_trace_event *)iter->ent;
+	se = container_of(event, struct synth_event, call.event);
+
+	trace_seq_printf(s, "%s: ", se->name);
+
+	for (i = 0, n_u64 = 0; i < se->n_fields; i++) {
+		if (trace_seq_has_overflowed(s))
+			goto end;
+
+		fmt = synth_field_fmt(se->fields[i]->type);
+
+		/* parameter types */
+		if (tr && tr->trace_flags & TRACE_ITER_VERBOSE)
+			trace_seq_printf(s, "%s ", fmt);
+
+		snprintf(print_fmt, sizeof(print_fmt), "%%s=%s%%s", fmt);
+
+		/* parameter values */
+		if (se->fields[i]->is_string) {
+			trace_seq_printf(s, print_fmt, se->fields[i]->name,
+					 (char *)&entry->fields[n_u64],
+					 i == se->n_fields - 1 ? "" : " ");
+			n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
+		} else {
+			struct trace_print_flags __flags[] = {
+			    __def_gfpflag_names, {-1, NULL} };
+			char *space = (i == se->n_fields - 1 ? "" : " ");
+
+			print_synth_event_num_val(s, print_fmt,
+						  se->fields[i]->name,
+						  se->fields[i]->size,
+						  entry->fields[n_u64],
+						  space);
+
+			if (strcmp(se->fields[i]->type, "gfp_t") == 0) {
+				trace_seq_puts(s, " (");
+				trace_print_flags_seq(s, "|",
+						      entry->fields[n_u64],
+						      __flags);
+				trace_seq_putc(s, ')');
+			}
+			n_u64++;
+		}
+	}
+end:
+	trace_seq_putc(s, '\n');
+
+	return trace_handle_return(s);
+}
+
+static struct trace_event_functions synth_event_funcs = {
+	.trace		= print_synth_event
+};
+
+static notrace void trace_event_raw_event_synth(void *__data,
+						u64 *var_ref_vals,
+						unsigned int *var_ref_idx)
+{
+	struct trace_event_file *trace_file = __data;
+	struct synth_trace_event *entry;
+	struct trace_event_buffer fbuffer;
+	struct trace_buffer *buffer;
+	struct synth_event *event;
+	unsigned int i, n_u64, val_idx;
+	int fields_size = 0;
+
+	event = trace_file->event_call->data;
+
+	if (trace_trigger_soft_disabled(trace_file))
+		return;
+
+	fields_size = event->n_u64 * sizeof(u64);
+
+	/*
+	 * Avoid ring buffer recursion detection, as this event
+	 * is being performed within another event.
+	 */
+	buffer = trace_file->tr->array_buffer.buffer;
+	ring_buffer_nest_start(buffer);
+
+	entry = trace_event_buffer_reserve(&fbuffer, trace_file,
+					   sizeof(*entry) + fields_size);
+	if (!entry)
+		goto out;
+
+	for (i = 0, n_u64 = 0; i < event->n_fields; i++) {
+		val_idx = var_ref_idx[i];
+		if (event->fields[i]->is_string) {
+			char *str_val = (char *)(long)var_ref_vals[val_idx];
+			char *str_field = (char *)&entry->fields[n_u64];
+
+			strscpy(str_field, str_val, STR_VAR_LEN_MAX);
+			n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
+		} else {
+			struct synth_field *field = event->fields[i];
+			u64 val = var_ref_vals[val_idx];
+
+			switch (field->size) {
+			case 1:
+				*(u8 *)&entry->fields[n_u64] = (u8)val;
+				break;
+
+			case 2:
+				*(u16 *)&entry->fields[n_u64] = (u16)val;
+				break;
+
+			case 4:
+				*(u32 *)&entry->fields[n_u64] = (u32)val;
+				break;
+
+			default:
+				entry->fields[n_u64] = val;
+				break;
+			}
+			n_u64++;
+		}
+	}
+
+	trace_event_buffer_commit(&fbuffer);
+out:
+	ring_buffer_nest_end(buffer);
+}
+
+static void free_synth_event_print_fmt(struct trace_event_call *call)
+{
+	if (call) {
+		kfree(call->print_fmt);
+		call->print_fmt = NULL;
+	}
+}
+
+static int __set_synth_event_print_fmt(struct synth_event *event,
+				       char *buf, int len)
+{
+	const char *fmt;
+	int pos = 0;
+	int i;
+
+	/* When len=0, we just calculate the needed length */
+#define LEN_OR_ZERO (len ? len - pos : 0)
+
+	pos += snprintf(buf + pos, LEN_OR_ZERO, "\"");
+	for (i = 0; i < event->n_fields; i++) {
+		fmt = synth_field_fmt(event->fields[i]->type);
+		pos += snprintf(buf + pos, LEN_OR_ZERO, "%s=%s%s",
+				event->fields[i]->name, fmt,
+				i == event->n_fields - 1 ? "" : ", ");
+	}
+	pos += snprintf(buf + pos, LEN_OR_ZERO, "\"");
+
+	for (i = 0; i < event->n_fields; i++) {
+		pos += snprintf(buf + pos, LEN_OR_ZERO,
+				", REC->%s", event->fields[i]->name);
+	}
+
+#undef LEN_OR_ZERO
+
+	/* return the length of print_fmt */
+	return pos;
+}
+
+static int set_synth_event_print_fmt(struct trace_event_call *call)
+{
+	struct synth_event *event = call->data;
+	char *print_fmt;
+	int len;
+
+	/* First: called with 0 length to calculate the needed length */
+	len = __set_synth_event_print_fmt(event, NULL, 0);
+
+	print_fmt = kmalloc(len + 1, GFP_KERNEL);
+	if (!print_fmt)
+		return -ENOMEM;
+
+	/* Second: actually write the @print_fmt */
+	__set_synth_event_print_fmt(event, print_fmt, len + 1);
+	call->print_fmt = print_fmt;
+
+	return 0;
+}
+
+static void free_synth_field(struct synth_field *field)
+{
+	kfree(field->type);
+	kfree(field->name);
+	kfree(field);
+}
+
+static struct synth_field *parse_synth_field(int argc, const char **argv,
+					     int *consumed)
+{
+	struct synth_field *field;
+	const char *prefix = NULL, *field_type = argv[0], *field_name, *array;
+	int len, ret = 0;
+
+	if (field_type[0] == ';')
+		field_type++;
+
+	if (!strcmp(field_type, "unsigned")) {
+		if (argc < 3)
+			return ERR_PTR(-EINVAL);
+		prefix = "unsigned ";
+		field_type = argv[1];
+		field_name = argv[2];
+		*consumed = 3;
+	} else {
+		field_name = argv[1];
+		*consumed = 2;
+	}
+
+	field = kzalloc(sizeof(*field), GFP_KERNEL);
+	if (!field)
+		return ERR_PTR(-ENOMEM);
+
+	len = strlen(field_name);
+	array = strchr(field_name, '[');
+	if (array)
+		len -= strlen(array);
+	else if (field_name[len - 1] == ';')
+		len--;
+
+	field->name = kmemdup_nul(field_name, len, GFP_KERNEL);
+	if (!field->name) {
+		ret = -ENOMEM;
+		goto free;
+	}
+
+	if (field_type[0] == ';')
+		field_type++;
+	len = strlen(field_type) + 1;
+	if (array)
+		len += strlen(array);
+	if (prefix)
+		len += strlen(prefix);
+
+	field->type = kzalloc(len, GFP_KERNEL);
+	if (!field->type) {
+		ret = -ENOMEM;
+		goto free;
+	}
+	if (prefix)
+		strcat(field->type, prefix);
+	strcat(field->type, field_type);
+	if (array) {
+		strcat(field->type, array);
+		if (field->type[len - 1] == ';')
+			field->type[len - 1] = '\0';
+	}
+
+	field->size = synth_field_size(field->type);
+	if (!field->size) {
+		ret = -EINVAL;
+		goto free;
+	}
+
+	if (synth_field_is_string(field->type))
+		field->is_string = true;
+
+	field->is_signed = synth_field_signed(field->type);
+
+ out:
+	return field;
+ free:
+	free_synth_field(field);
+	field = ERR_PTR(ret);
+	goto out;
+}
+
+static void free_synth_tracepoint(struct tracepoint *tp)
+{
+	if (!tp)
+		return;
+
+	kfree(tp->name);
+	kfree(tp);
+}
+
+static struct tracepoint *alloc_synth_tracepoint(char *name)
+{
+	struct tracepoint *tp;
+
+	tp = kzalloc(sizeof(*tp), GFP_KERNEL);
+	if (!tp)
+		return ERR_PTR(-ENOMEM);
+
+	tp->name = kstrdup(name, GFP_KERNEL);
+	if (!tp->name) {
+		kfree(tp);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	return tp;
+}
+
+struct synth_event *find_synth_event(const char *name)
+{
+	struct dyn_event *pos;
+	struct synth_event *event;
+
+	for_each_dyn_event(pos) {
+		if (!is_synth_event(pos))
+			continue;
+		event = to_synth_event(pos);
+		if (strcmp(event->name, name) == 0)
+			return event;
+	}
+
+	return NULL;
+}
+
+static struct trace_event_fields synth_event_fields_array[] = {
+	{ .type = TRACE_FUNCTION_TYPE,
+	  .define_fields = synth_event_define_fields },
+	{}
+};
+
+static int register_synth_event(struct synth_event *event)
+{
+	struct trace_event_call *call = &event->call;
+	int ret = 0;
+
+	event->call.class = &event->class;
+	event->class.system = kstrdup(SYNTH_SYSTEM, GFP_KERNEL);
+	if (!event->class.system) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	event->tp = alloc_synth_tracepoint(event->name);
+	if (IS_ERR(event->tp)) {
+		ret = PTR_ERR(event->tp);
+		event->tp = NULL;
+		goto out;
+	}
+
+	INIT_LIST_HEAD(&call->class->fields);
+	call->event.funcs = &synth_event_funcs;
+	call->class->fields_array = synth_event_fields_array;
+
+	ret = register_trace_event(&call->event);
+	if (!ret) {
+		ret = -ENODEV;
+		goto out;
+	}
+	call->flags = TRACE_EVENT_FL_TRACEPOINT;
+	call->class->reg = trace_event_reg;
+	call->class->probe = trace_event_raw_event_synth;
+	call->data = event;
+	call->tp = event->tp;
+
+	ret = trace_add_event_call(call);
+	if (ret) {
+		pr_warn("Failed to register synthetic event: %s\n",
+			trace_event_name(call));
+		goto err;
+	}
+
+	ret = set_synth_event_print_fmt(call);
+	if (ret < 0) {
+		trace_remove_event_call(call);
+		goto err;
+	}
+ out:
+	return ret;
+ err:
+	unregister_trace_event(&call->event);
+	goto out;
+}
+
+static int unregister_synth_event(struct synth_event *event)
+{
+	struct trace_event_call *call = &event->call;
+	int ret;
+
+	ret = trace_remove_event_call(call);
+
+	return ret;
+}
+
+static void free_synth_event(struct synth_event *event)
+{
+	unsigned int i;
+
+	if (!event)
+		return;
+
+	for (i = 0; i < event->n_fields; i++)
+		free_synth_field(event->fields[i]);
+
+	kfree(event->fields);
+	kfree(event->name);
+	kfree(event->class.system);
+	free_synth_tracepoint(event->tp);
+	free_synth_event_print_fmt(&event->call);
+	kfree(event);
+}
+
+static struct synth_event *alloc_synth_event(const char *name, int n_fields,
+					     struct synth_field **fields)
+{
+	struct synth_event *event;
+	unsigned int i;
+
+	event = kzalloc(sizeof(*event), GFP_KERNEL);
+	if (!event) {
+		event = ERR_PTR(-ENOMEM);
+		goto out;
+	}
+
+	event->name = kstrdup(name, GFP_KERNEL);
+	if (!event->name) {
+		kfree(event);
+		event = ERR_PTR(-ENOMEM);
+		goto out;
+	}
+
+	event->fields = kcalloc(n_fields, sizeof(*event->fields), GFP_KERNEL);
+	if (!event->fields) {
+		free_synth_event(event);
+		event = ERR_PTR(-ENOMEM);
+		goto out;
+	}
+
+	dyn_event_init(&event->devent, &synth_event_ops);
+
+	for (i = 0; i < n_fields; i++)
+		event->fields[i] = fields[i];
+
+	event->n_fields = n_fields;
+ out:
+	return event;
+}
+
+static int synth_event_check_arg_fn(void *data)
+{
+	struct dynevent_arg_pair *arg_pair = data;
+	int size;
+
+	size = synth_field_size((char *)arg_pair->lhs);
+
+	return size ? 0 : -EINVAL;
+}
+
+/**
+ * synth_event_add_field - Add a new field to a synthetic event cmd
+ * @cmd: A pointer to the dynevent_cmd struct representing the new event
+ * @type: The type of the new field to add
+ * @name: The name of the new field to add
+ *
+ * Add a new field to a synthetic event cmd object.  Field ordering is in
+ * the same order the fields are added.
+ *
+ * See synth_field_size() for available types. If field_name contains
+ * [n] the field is considered to be an array.
+ *
+ * Return: 0 if successful, error otherwise.
+ */
+int synth_event_add_field(struct dynevent_cmd *cmd, const char *type,
+			  const char *name)
+{
+	struct dynevent_arg_pair arg_pair;
+	int ret;
+
+	if (cmd->type != DYNEVENT_TYPE_SYNTH)
+		return -EINVAL;
+
+	if (!type || !name)
+		return -EINVAL;
+
+	dynevent_arg_pair_init(&arg_pair, 0, ';');
+
+	arg_pair.lhs = type;
+	arg_pair.rhs = name;
+
+	ret = dynevent_arg_pair_add(cmd, &arg_pair, synth_event_check_arg_fn);
+	if (ret)
+		return ret;
+
+	if (++cmd->n_fields > SYNTH_FIELDS_MAX)
+		ret = -EINVAL;
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(synth_event_add_field);
+
+/**
+ * synth_event_add_field_str - Add a new field to a synthetic event cmd
+ * @cmd: A pointer to the dynevent_cmd struct representing the new event
+ * @type_name: The type and name of the new field to add, as a single string
+ *
+ * Add a new field to a synthetic event cmd object, as a single
+ * string.  The @type_name string is expected to be of the form 'type
+ * name', which will be appended by ';'.  No sanity checking is done -
+ * what's passed in is assumed to already be well-formed.  Field
+ * ordering is in the same order the fields are added.
+ *
+ * See synth_field_size() for available types. If field_name contains
+ * [n] the field is considered to be an array.
+ *
+ * Return: 0 if successful, error otherwise.
+ */
+int synth_event_add_field_str(struct dynevent_cmd *cmd, const char *type_name)
+{
+	struct dynevent_arg arg;
+	int ret;
+
+	if (cmd->type != DYNEVENT_TYPE_SYNTH)
+		return -EINVAL;
+
+	if (!type_name)
+		return -EINVAL;
+
+	dynevent_arg_init(&arg, ';');
+
+	arg.str = type_name;
+
+	ret = dynevent_arg_add(cmd, &arg, NULL);
+	if (ret)
+		return ret;
+
+	if (++cmd->n_fields > SYNTH_FIELDS_MAX)
+		ret = -EINVAL;
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(synth_event_add_field_str);
+
+/**
+ * synth_event_add_fields - Add multiple fields to a synthetic event cmd
+ * @cmd: A pointer to the dynevent_cmd struct representing the new event
+ * @fields: An array of type/name field descriptions
+ * @n_fields: The number of field descriptions contained in the fields array
+ *
+ * Add a new set of fields to a synthetic event cmd object.  The event
+ * fields that will be defined for the event should be passed in as an
+ * array of struct synth_field_desc, and the number of elements in the
+ * array passed in as n_fields.  Field ordering will retain the
+ * ordering given in the fields array.
+ *
+ * See synth_field_size() for available types. If field_name contains
+ * [n] the field is considered to be an array.
+ *
+ * Return: 0 if successful, error otherwise.
+ */
+int synth_event_add_fields(struct dynevent_cmd *cmd,
+			   struct synth_field_desc *fields,
+			   unsigned int n_fields)
+{
+	unsigned int i;
+	int ret = 0;
+
+	for (i = 0; i < n_fields; i++) {
+		if (fields[i].type == NULL || fields[i].name == NULL) {
+			ret = -EINVAL;
+			break;
+		}
+
+		ret = synth_event_add_field(cmd, fields[i].type, fields[i].name);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(synth_event_add_fields);
+
+/**
+ * __synth_event_gen_cmd_start - Start a synthetic event command from arg list
+ * @cmd: A pointer to the dynevent_cmd struct representing the new event
+ * @name: The name of the synthetic event
+ * @mod: The module creating the event, NULL if not created from a module
+ * @args: Variable number of arg (pairs), one pair for each field
+ *
+ * NOTE: Users normally won't want to call this function directly, but
+ * rather use the synth_event_gen_cmd_start() wrapper, which
+ * automatically adds a NULL to the end of the arg list.  If this
+ * function is used directly, make sure the last arg in the variable
+ * arg list is NULL.
+ *
+ * Generate a synthetic event command to be executed by
+ * synth_event_gen_cmd_end().  This function can be used to generate
+ * the complete command or only the first part of it; in the latter
+ * case, synth_event_add_field(), synth_event_add_field_str(), or
+ * synth_event_add_fields() can be used to add more fields following
+ * this.
+ *
+ * There should be an even number variable args, each pair consisting
+ * of a type followed by a field name.
+ *
+ * See synth_field_size() for available types. If field_name contains
+ * [n] the field is considered to be an array.
+ *
+ * Return: 0 if successful, error otherwise.
+ */
+int __synth_event_gen_cmd_start(struct dynevent_cmd *cmd, const char *name,
+				struct module *mod, ...)
+{
+	struct dynevent_arg arg;
+	va_list args;
+	int ret;
+
+	cmd->event_name = name;
+	cmd->private_data = mod;
+
+	if (cmd->type != DYNEVENT_TYPE_SYNTH)
+		return -EINVAL;
+
+	dynevent_arg_init(&arg, 0);
+	arg.str = name;
+	ret = dynevent_arg_add(cmd, &arg, NULL);
+	if (ret)
+		return ret;
+
+	va_start(args, mod);
+	for (;;) {
+		const char *type, *name;
+
+		type = va_arg(args, const char *);
+		if (!type)
+			break;
+		name = va_arg(args, const char *);
+		if (!name)
+			break;
+
+		if (++cmd->n_fields > SYNTH_FIELDS_MAX) {
+			ret = -EINVAL;
+			break;
+		}
+
+		ret = synth_event_add_field(cmd, type, name);
+		if (ret)
+			break;
+	}
+	va_end(args);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(__synth_event_gen_cmd_start);
+
+/**
+ * synth_event_gen_cmd_array_start - Start synthetic event command from an array
+ * @cmd: A pointer to the dynevent_cmd struct representing the new event
+ * @name: The name of the synthetic event
+ * @fields: An array of type/name field descriptions
+ * @n_fields: The number of field descriptions contained in the fields array
+ *
+ * Generate a synthetic event command to be executed by
+ * synth_event_gen_cmd_end().  This function can be used to generate
+ * the complete command or only the first part of it; in the latter
+ * case, synth_event_add_field(), synth_event_add_field_str(), or
+ * synth_event_add_fields() can be used to add more fields following
+ * this.
+ *
+ * The event fields that will be defined for the event should be
+ * passed in as an array of struct synth_field_desc, and the number of
+ * elements in the array passed in as n_fields.  Field ordering will
+ * retain the ordering given in the fields array.
+ *
+ * See synth_field_size() for available types. If field_name contains
+ * [n] the field is considered to be an array.
+ *
+ * Return: 0 if successful, error otherwise.
+ */
+int synth_event_gen_cmd_array_start(struct dynevent_cmd *cmd, const char *name,
+				    struct module *mod,
+				    struct synth_field_desc *fields,
+				    unsigned int n_fields)
+{
+	struct dynevent_arg arg;
+	unsigned int i;
+	int ret = 0;
+
+	cmd->event_name = name;
+	cmd->private_data = mod;
+
+	if (cmd->type != DYNEVENT_TYPE_SYNTH)
+		return -EINVAL;
+
+	if (n_fields > SYNTH_FIELDS_MAX)
+		return -EINVAL;
+
+	dynevent_arg_init(&arg, 0);
+	arg.str = name;
+	ret = dynevent_arg_add(cmd, &arg, NULL);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < n_fields; i++) {
+		if (fields[i].type == NULL || fields[i].name == NULL)
+			return -EINVAL;
+
+		ret = synth_event_add_field(cmd, fields[i].type, fields[i].name);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(synth_event_gen_cmd_array_start);
+
+static int __create_synth_event(int argc, const char *name, const char **argv)
+{
+	struct synth_field *field, *fields[SYNTH_FIELDS_MAX];
+	struct synth_event *event = NULL;
+	int i, consumed = 0, n_fields = 0, ret = 0;
+
+	/*
+	 * Argument syntax:
+	 *  - Add synthetic event: <event_name> field[;field] ...
+	 *  - Remove synthetic event: !<event_name> field[;field] ...
+	 *      where 'field' = type field_name
+	 */
+
+	if (name[0] == '\0' || argc < 1)
+		return -EINVAL;
+
+	mutex_lock(&event_mutex);
+
+	event = find_synth_event(name);
+	if (event) {
+		ret = -EEXIST;
+		goto out;
+	}
+
+	for (i = 0; i < argc - 1; i++) {
+		if (strcmp(argv[i], ";") == 0)
+			continue;
+		if (n_fields == SYNTH_FIELDS_MAX) {
+			ret = -EINVAL;
+			goto err;
+		}
+
+		field = parse_synth_field(argc - i, &argv[i], &consumed);
+		if (IS_ERR(field)) {
+			ret = PTR_ERR(field);
+			goto err;
+		}
+		fields[n_fields++] = field;
+		i += consumed - 1;
+	}
+
+	if (i < argc && strcmp(argv[i], ";") != 0) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	event = alloc_synth_event(name, n_fields, fields);
+	if (IS_ERR(event)) {
+		ret = PTR_ERR(event);
+		event = NULL;
+		goto err;
+	}
+	ret = register_synth_event(event);
+	if (!ret)
+		dyn_event_add(&event->devent);
+	else
+		free_synth_event(event);
+ out:
+	mutex_unlock(&event_mutex);
+
+	return ret;
+ err:
+	for (i = 0; i < n_fields; i++)
+		free_synth_field(fields[i]);
+
+	goto out;
+}
+
+/**
+ * synth_event_create - Create a new synthetic event
+ * @name: The name of the new sythetic event
+ * @fields: An array of type/name field descriptions
+ * @n_fields: The number of field descriptions contained in the fields array
+ * @mod: The module creating the event, NULL if not created from a module
+ *
+ * Create a new synthetic event with the given name under the
+ * trace/events/synthetic/ directory.  The event fields that will be
+ * defined for the event should be passed in as an array of struct
+ * synth_field_desc, and the number elements in the array passed in as
+ * n_fields. Field ordering will retain the ordering given in the
+ * fields array.
+ *
+ * If the new synthetic event is being created from a module, the mod
+ * param must be non-NULL.  This will ensure that the trace buffer
+ * won't contain unreadable events.
+ *
+ * The new synth event should be deleted using synth_event_delete()
+ * function.  The new synthetic event can be generated from modules or
+ * other kernel code using trace_synth_event() and related functions.
+ *
+ * Return: 0 if successful, error otherwise.
+ */
+int synth_event_create(const char *name, struct synth_field_desc *fields,
+		       unsigned int n_fields, struct module *mod)
+{
+	struct dynevent_cmd cmd;
+	char *buf;
+	int ret;
+
+	buf = kzalloc(MAX_DYNEVENT_CMD_LEN, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	synth_event_cmd_init(&cmd, buf, MAX_DYNEVENT_CMD_LEN);
+
+	ret = synth_event_gen_cmd_array_start(&cmd, name, mod,
+					      fields, n_fields);
+	if (ret)
+		goto out;
+
+	ret = synth_event_gen_cmd_end(&cmd);
+ out:
+	kfree(buf);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(synth_event_create);
+
+static int destroy_synth_event(struct synth_event *se)
+{
+	int ret;
+
+	if (se->ref)
+		ret = -EBUSY;
+	else {
+		ret = unregister_synth_event(se);
+		if (!ret) {
+			dyn_event_remove(&se->devent);
+			free_synth_event(se);
+		}
+	}
+
+	return ret;
+}
+
+/**
+ * synth_event_delete - Delete a synthetic event
+ * @event_name: The name of the new sythetic event
+ *
+ * Delete a synthetic event that was created with synth_event_create().
+ *
+ * Return: 0 if successful, error otherwise.
+ */
+int synth_event_delete(const char *event_name)
+{
+	struct synth_event *se = NULL;
+	struct module *mod = NULL;
+	int ret = -ENOENT;
+
+	mutex_lock(&event_mutex);
+	se = find_synth_event(event_name);
+	if (se) {
+		mod = se->mod;
+		ret = destroy_synth_event(se);
+	}
+	mutex_unlock(&event_mutex);
+
+	if (mod) {
+		mutex_lock(&trace_types_lock);
+		/*
+		 * It is safest to reset the ring buffer if the module
+		 * being unloaded registered any events that were
+		 * used. The only worry is if a new module gets
+		 * loaded, and takes on the same id as the events of
+		 * this module. When printing out the buffer, traced
+		 * events left over from this module may be passed to
+		 * the new module events and unexpected results may
+		 * occur.
+		 */
+		tracing_reset_all_online_cpus();
+		mutex_unlock(&trace_types_lock);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(synth_event_delete);
+
+static int create_or_delete_synth_event(int argc, char **argv)
+{
+	const char *name = argv[0];
+	int ret;
+
+	/* trace_run_command() ensures argc != 0 */
+	if (name[0] == '!') {
+		ret = synth_event_delete(name + 1);
+		return ret;
+	}
+
+	ret = __create_synth_event(argc - 1, name, (const char **)argv + 1);
+	return ret == -ECANCELED ? -EINVAL : ret;
+}
+
+static int synth_event_run_command(struct dynevent_cmd *cmd)
+{
+	struct synth_event *se;
+	int ret;
+
+	ret = trace_run_command(cmd->seq.buffer, create_or_delete_synth_event);
+	if (ret)
+		return ret;
+
+	se = find_synth_event(cmd->event_name);
+	if (WARN_ON(!se))
+		return -ENOENT;
+
+	se->mod = cmd->private_data;
+
+	return ret;
+}
+
+/**
+ * synth_event_cmd_init - Initialize a synthetic event command object
+ * @cmd: A pointer to the dynevent_cmd struct representing the new event
+ * @buf: A pointer to the buffer used to build the command
+ * @maxlen: The length of the buffer passed in @buf
+ *
+ * Initialize a synthetic event command object.  Use this before
+ * calling any of the other dyenvent_cmd functions.
+ */
+void synth_event_cmd_init(struct dynevent_cmd *cmd, char *buf, int maxlen)
+{
+	dynevent_cmd_init(cmd, buf, maxlen, DYNEVENT_TYPE_SYNTH,
+			  synth_event_run_command);
+}
+EXPORT_SYMBOL_GPL(synth_event_cmd_init);
+
+static inline int
+__synth_event_trace_start(struct trace_event_file *file,
+			  struct synth_event_trace_state *trace_state)
+{
+	int entry_size, fields_size = 0;
+	int ret = 0;
+
+	memset(trace_state, '\0', sizeof(*trace_state));
+
+	/*
+	 * Normal event tracing doesn't get called at all unless the
+	 * ENABLED bit is set (which attaches the probe thus allowing
+	 * this code to be called, etc).  Because this is called
+	 * directly by the user, we don't have that but we still need
+	 * to honor not logging when disabled.  For the the iterated
+	 * trace case, we save the enabed state upon start and just
+	 * ignore the following data calls.
+	 */
+	if (!(file->flags & EVENT_FILE_FL_ENABLED) ||
+	    trace_trigger_soft_disabled(file)) {
+		trace_state->disabled = true;
+		ret = -ENOENT;
+		goto out;
+	}
+
+	trace_state->event = file->event_call->data;
+
+	fields_size = trace_state->event->n_u64 * sizeof(u64);
+
+	/*
+	 * Avoid ring buffer recursion detection, as this event
+	 * is being performed within another event.
+	 */
+	trace_state->buffer = file->tr->array_buffer.buffer;
+	ring_buffer_nest_start(trace_state->buffer);
+
+	entry_size = sizeof(*trace_state->entry) + fields_size;
+	trace_state->entry = trace_event_buffer_reserve(&trace_state->fbuffer,
+							file,
+							entry_size);
+	if (!trace_state->entry) {
+		ring_buffer_nest_end(trace_state->buffer);
+		ret = -EINVAL;
+	}
+out:
+	return ret;
+}
+
+static inline void
+__synth_event_trace_end(struct synth_event_trace_state *trace_state)
+{
+	trace_event_buffer_commit(&trace_state->fbuffer);
+
+	ring_buffer_nest_end(trace_state->buffer);
+}
+
+/**
+ * synth_event_trace - Trace a synthetic event
+ * @file: The trace_event_file representing the synthetic event
+ * @n_vals: The number of values in vals
+ * @args: Variable number of args containing the event values
+ *
+ * Trace a synthetic event using the values passed in the variable
+ * argument list.
+ *
+ * The argument list should be a list 'n_vals' u64 values.  The number
+ * of vals must match the number of field in the synthetic event, and
+ * must be in the same order as the synthetic event fields.
+ *
+ * All vals should be cast to u64, and string vals are just pointers
+ * to strings, cast to u64.  Strings will be copied into space
+ * reserved in the event for the string, using these pointers.
+ *
+ * Return: 0 on success, err otherwise.
+ */
+int synth_event_trace(struct trace_event_file *file, unsigned int n_vals, ...)
+{
+	struct synth_event_trace_state state;
+	unsigned int i, n_u64;
+	va_list args;
+	int ret;
+
+	ret = __synth_event_trace_start(file, &state);
+	if (ret) {
+		if (ret == -ENOENT)
+			ret = 0; /* just disabled, not really an error */
+		return ret;
+	}
+
+	if (n_vals != state.event->n_fields) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	va_start(args, n_vals);
+	for (i = 0, n_u64 = 0; i < state.event->n_fields; i++) {
+		u64 val;
+
+		val = va_arg(args, u64);
+
+		if (state.event->fields[i]->is_string) {
+			char *str_val = (char *)(long)val;
+			char *str_field = (char *)&state.entry->fields[n_u64];
+
+			strscpy(str_field, str_val, STR_VAR_LEN_MAX);
+			n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
+		} else {
+			struct synth_field *field = state.event->fields[i];
+
+			switch (field->size) {
+			case 1:
+				*(u8 *)&state.entry->fields[n_u64] = (u8)val;
+				break;
+
+			case 2:
+				*(u16 *)&state.entry->fields[n_u64] = (u16)val;
+				break;
+
+			case 4:
+				*(u32 *)&state.entry->fields[n_u64] = (u32)val;
+				break;
+
+			default:
+				state.entry->fields[n_u64] = val;
+				break;
+			}
+			n_u64++;
+		}
+	}
+	va_end(args);
+out:
+	__synth_event_trace_end(&state);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(synth_event_trace);
+
+/**
+ * synth_event_trace_array - Trace a synthetic event from an array
+ * @file: The trace_event_file representing the synthetic event
+ * @vals: Array of values
+ * @n_vals: The number of values in vals
+ *
+ * Trace a synthetic event using the values passed in as 'vals'.
+ *
+ * The 'vals' array is just an array of 'n_vals' u64.  The number of
+ * vals must match the number of field in the synthetic event, and
+ * must be in the same order as the synthetic event fields.
+ *
+ * All vals should be cast to u64, and string vals are just pointers
+ * to strings, cast to u64.  Strings will be copied into space
+ * reserved in the event for the string, using these pointers.
+ *
+ * Return: 0 on success, err otherwise.
+ */
+int synth_event_trace_array(struct trace_event_file *file, u64 *vals,
+			    unsigned int n_vals)
+{
+	struct synth_event_trace_state state;
+	unsigned int i, n_u64;
+	int ret;
+
+	ret = __synth_event_trace_start(file, &state);
+	if (ret) {
+		if (ret == -ENOENT)
+			ret = 0; /* just disabled, not really an error */
+		return ret;
+	}
+
+	if (n_vals != state.event->n_fields) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	for (i = 0, n_u64 = 0; i < state.event->n_fields; i++) {
+		if (state.event->fields[i]->is_string) {
+			char *str_val = (char *)(long)vals[i];
+			char *str_field = (char *)&state.entry->fields[n_u64];
+
+			strscpy(str_field, str_val, STR_VAR_LEN_MAX);
+			n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
+		} else {
+			struct synth_field *field = state.event->fields[i];
+			u64 val = vals[i];
+
+			switch (field->size) {
+			case 1:
+				*(u8 *)&state.entry->fields[n_u64] = (u8)val;
+				break;
+
+			case 2:
+				*(u16 *)&state.entry->fields[n_u64] = (u16)val;
+				break;
+
+			case 4:
+				*(u32 *)&state.entry->fields[n_u64] = (u32)val;
+				break;
+
+			default:
+				state.entry->fields[n_u64] = val;
+				break;
+			}
+			n_u64++;
+		}
+	}
+out:
+	__synth_event_trace_end(&state);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(synth_event_trace_array);
+
+/**
+ * synth_event_trace_start - Start piecewise synthetic event trace
+ * @file: The trace_event_file representing the synthetic event
+ * @trace_state: A pointer to object tracking the piecewise trace state
+ *
+ * Start the trace of a synthetic event field-by-field rather than all
+ * at once.
+ *
+ * This function 'opens' an event trace, which means space is reserved
+ * for the event in the trace buffer, after which the event's
+ * individual field values can be set through either
+ * synth_event_add_next_val() or synth_event_add_val().
+ *
+ * A pointer to a trace_state object is passed in, which will keep
+ * track of the current event trace state until the event trace is
+ * closed (and the event finally traced) using
+ * synth_event_trace_end().
+ *
+ * Note that synth_event_trace_end() must be called after all values
+ * have been added for each event trace, regardless of whether adding
+ * all field values succeeded or not.
+ *
+ * Note also that for a given event trace, all fields must be added
+ * using either synth_event_add_next_val() or synth_event_add_val()
+ * but not both together or interleaved.
+ *
+ * Return: 0 on success, err otherwise.
+ */
+int synth_event_trace_start(struct trace_event_file *file,
+			    struct synth_event_trace_state *trace_state)
+{
+	int ret;
+
+	if (!trace_state)
+		return -EINVAL;
+
+	ret = __synth_event_trace_start(file, trace_state);
+	if (ret == -ENOENT)
+		ret = 0; /* just disabled, not really an error */
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(synth_event_trace_start);
+
+static int __synth_event_add_val(const char *field_name, u64 val,
+				 struct synth_event_trace_state *trace_state)
+{
+	struct synth_field *field = NULL;
+	struct synth_trace_event *entry;
+	struct synth_event *event;
+	int i, ret = 0;
+
+	if (!trace_state) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/* can't mix add_next_synth_val() with add_synth_val() */
+	if (field_name) {
+		if (trace_state->add_next) {
+			ret = -EINVAL;
+			goto out;
+		}
+		trace_state->add_name = true;
+	} else {
+		if (trace_state->add_name) {
+			ret = -EINVAL;
+			goto out;
+		}
+		trace_state->add_next = true;
+	}
+
+	if (trace_state->disabled)
+		goto out;
+
+	event = trace_state->event;
+	if (trace_state->add_name) {
+		for (i = 0; i < event->n_fields; i++) {
+			field = event->fields[i];
+			if (strcmp(field->name, field_name) == 0)
+				break;
+		}
+		if (!field) {
+			ret = -EINVAL;
+			goto out;
+		}
+	} else {
+		if (trace_state->cur_field >= event->n_fields) {
+			ret = -EINVAL;
+			goto out;
+		}
+		field = event->fields[trace_state->cur_field++];
+	}
+
+	entry = trace_state->entry;
+	if (field->is_string) {
+		char *str_val = (char *)(long)val;
+		char *str_field;
+
+		if (!str_val) {
+			ret = -EINVAL;
+			goto out;
+		}
+
+		str_field = (char *)&entry->fields[field->offset];
+		strscpy(str_field, str_val, STR_VAR_LEN_MAX);
+	} else {
+		switch (field->size) {
+		case 1:
+			*(u8 *)&trace_state->entry->fields[field->offset] = (u8)val;
+			break;
+
+		case 2:
+			*(u16 *)&trace_state->entry->fields[field->offset] = (u16)val;
+			break;
+
+		case 4:
+			*(u32 *)&trace_state->entry->fields[field->offset] = (u32)val;
+			break;
+
+		default:
+			trace_state->entry->fields[field->offset] = val;
+			break;
+		}
+	}
+ out:
+	return ret;
+}
+
+/**
+ * synth_event_add_next_val - Add the next field's value to an open synth trace
+ * @val: The value to set the next field to
+ * @trace_state: A pointer to object tracking the piecewise trace state
+ *
+ * Set the value of the next field in an event that's been opened by
+ * synth_event_trace_start().
+ *
+ * The val param should be the value cast to u64.  If the value points
+ * to a string, the val param should be a char * cast to u64.
+ *
+ * This function assumes all the fields in an event are to be set one
+ * after another - successive calls to this function are made, one for
+ * each field, in the order of the fields in the event, until all
+ * fields have been set.  If you'd rather set each field individually
+ * without regard to ordering, synth_event_add_val() can be used
+ * instead.
+ *
+ * Note however that synth_event_add_next_val() and
+ * synth_event_add_val() can't be intermixed for a given event trace -
+ * one or the other but not both can be used at the same time.
+ *
+ * Note also that synth_event_trace_end() must be called after all
+ * values have been added for each event trace, regardless of whether
+ * adding all field values succeeded or not.
+ *
+ * Return: 0 on success, err otherwise.
+ */
+int synth_event_add_next_val(u64 val,
+			     struct synth_event_trace_state *trace_state)
+{
+	return __synth_event_add_val(NULL, val, trace_state);
+}
+EXPORT_SYMBOL_GPL(synth_event_add_next_val);
+
+/**
+ * synth_event_add_val - Add a named field's value to an open synth trace
+ * @field_name: The name of the synthetic event field value to set
+ * @val: The value to set the next field to
+ * @trace_state: A pointer to object tracking the piecewise trace state
+ *
+ * Set the value of the named field in an event that's been opened by
+ * synth_event_trace_start().
+ *
+ * The val param should be the value cast to u64.  If the value points
+ * to a string, the val param should be a char * cast to u64.
+ *
+ * This function looks up the field name, and if found, sets the field
+ * to the specified value.  This lookup makes this function more
+ * expensive than synth_event_add_next_val(), so use that or the
+ * none-piecewise synth_event_trace() instead if efficiency is more
+ * important.
+ *
+ * Note however that synth_event_add_next_val() and
+ * synth_event_add_val() can't be intermixed for a given event trace -
+ * one or the other but not both can be used at the same time.
+ *
+ * Note also that synth_event_trace_end() must be called after all
+ * values have been added for each event trace, regardless of whether
+ * adding all field values succeeded or not.
+ *
+ * Return: 0 on success, err otherwise.
+ */
+int synth_event_add_val(const char *field_name, u64 val,
+			struct synth_event_trace_state *trace_state)
+{
+	return __synth_event_add_val(field_name, val, trace_state);
+}
+EXPORT_SYMBOL_GPL(synth_event_add_val);
+
+/**
+ * synth_event_trace_end - End piecewise synthetic event trace
+ * @trace_state: A pointer to object tracking the piecewise trace state
+ *
+ * End the trace of a synthetic event opened by
+ * synth_event_trace__start().
+ *
+ * This function 'closes' an event trace, which basically means that
+ * it commits the reserved event and cleans up other loose ends.
+ *
+ * A pointer to a trace_state object is passed in, which will keep
+ * track of the current event trace state opened with
+ * synth_event_trace_start().
+ *
+ * Note that this function must be called after all values have been
+ * added for each event trace, regardless of whether adding all field
+ * values succeeded or not.
+ *
+ * Return: 0 on success, err otherwise.
+ */
+int synth_event_trace_end(struct synth_event_trace_state *trace_state)
+{
+	if (!trace_state)
+		return -EINVAL;
+
+	__synth_event_trace_end(trace_state);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(synth_event_trace_end);
+
+static int create_synth_event(int argc, const char **argv)
+{
+	const char *name = argv[0];
+	int len;
+
+	if (name[0] != 's' || name[1] != ':')
+		return -ECANCELED;
+	name += 2;
+
+	/* This interface accepts group name prefix */
+	if (strchr(name, '/')) {
+		len = str_has_prefix(name, SYNTH_SYSTEM "/");
+		if (len == 0)
+			return -EINVAL;
+		name += len;
+	}
+	return __create_synth_event(argc - 1, name, argv + 1);
+}
+
+static int synth_event_release(struct dyn_event *ev)
+{
+	struct synth_event *event = to_synth_event(ev);
+	int ret;
+
+	if (event->ref)
+		return -EBUSY;
+
+	ret = unregister_synth_event(event);
+	if (ret)
+		return ret;
+
+	dyn_event_remove(ev);
+	free_synth_event(event);
+	return 0;
+}
+
+static int __synth_event_show(struct seq_file *m, struct synth_event *event)
+{
+	struct synth_field *field;
+	unsigned int i;
+
+	seq_printf(m, "%s\t", event->name);
+
+	for (i = 0; i < event->n_fields; i++) {
+		field = event->fields[i];
+
+		/* parameter values */
+		seq_printf(m, "%s %s%s", field->type, field->name,
+			   i == event->n_fields - 1 ? "" : "; ");
+	}
+
+	seq_putc(m, '\n');
+
+	return 0;
+}
+
+static int synth_event_show(struct seq_file *m, struct dyn_event *ev)
+{
+	struct synth_event *event = to_synth_event(ev);
+
+	seq_printf(m, "s:%s/", event->class.system);
+
+	return __synth_event_show(m, event);
+}
+
+static int synth_events_seq_show(struct seq_file *m, void *v)
+{
+	struct dyn_event *ev = v;
+
+	if (!is_synth_event(ev))
+		return 0;
+
+	return __synth_event_show(m, to_synth_event(ev));
+}
+
+static const struct seq_operations synth_events_seq_op = {
+	.start	= dyn_event_seq_start,
+	.next	= dyn_event_seq_next,
+	.stop	= dyn_event_seq_stop,
+	.show	= synth_events_seq_show,
+};
+
+static int synth_events_open(struct inode *inode, struct file *file)
+{
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
+	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
+		ret = dyn_events_release_all(&synth_event_ops);
+		if (ret < 0)
+			return ret;
+	}
+
+	return seq_open(file, &synth_events_seq_op);
+}
+
+static ssize_t synth_events_write(struct file *file,
+				  const char __user *buffer,
+				  size_t count, loff_t *ppos)
+{
+	return trace_parse_run_command(file, buffer, count, ppos,
+				       create_or_delete_synth_event);
+}
+
+static const struct file_operations synth_events_fops = {
+	.open           = synth_events_open,
+	.write		= synth_events_write,
+	.read           = seq_read,
+	.llseek         = seq_lseek,
+	.release        = seq_release,
+};
+
+static __init int trace_events_synth_init(void)
+{
+	struct dentry *entry = NULL;
+	struct dentry *d_tracer;
+	int err = 0;
+
+	err = dyn_event_register(&synth_event_ops);
+	if (err) {
+		pr_warn("Could not register synth_event_ops\n");
+		return err;
+	}
+
+	d_tracer = tracing_init_dentry();
+	if (IS_ERR(d_tracer)) {
+		err = PTR_ERR(d_tracer);
+		goto err;
+	}
+
+	entry = tracefs_create_file("synthetic_events", 0644, d_tracer,
+				    NULL, &synth_events_fops);
+	if (!entry) {
+		err = -ENODEV;
+		goto err;
+	}
+
+	return err;
+ err:
+	pr_warn("Could not create tracefs 'synthetic_events' entry\n");
+
+	return err;
+}
+
+fs_initcall(trace_events_synth_init);
diff --git a/kernel/trace/trace_synth.h b/kernel/trace/trace_synth.h
new file mode 100644
index 000000000000..ac35c45207c4
--- /dev/null
+++ b/kernel/trace/trace_synth.h
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef __TRACE_SYNTH_H
+#define __TRACE_SYNTH_H
+
+#include "trace_dynevent.h"
+
+#define SYNTH_SYSTEM		"synthetic"
+#define SYNTH_FIELDS_MAX	32
+
+#define STR_VAR_LEN_MAX		32 /* must be multiple of sizeof(u64) */
+
+struct synth_field {
+	char *type;
+	char *name;
+	size_t size;
+	unsigned int offset;
+	bool is_signed;
+	bool is_string;
+};
+
+struct synth_event {
+	struct dyn_event			devent;
+	int					ref;
+	char					*name;
+	struct synth_field			**fields;
+	unsigned int				n_fields;
+	unsigned int				n_u64;
+	struct trace_event_class		class;
+	struct trace_event_call			call;
+	struct tracepoint			*tp;
+	struct module				*mod;
+};
+
+extern struct synth_event *find_synth_event(const char *name);
+
+#endif /* __TRACE_SYNTH_H */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [for-next][PATCH 10/12] selftests/ftrace: Distinguish between hist and synthetic event checks
  2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
                   ` (7 preceding siblings ...)
  2020-06-01 12:27 ` [for-next][PATCH 09/12] tracing: Move synthetic events to a separate file Steven Rostedt
@ 2020-06-01 12:27 ` Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 11/12] ftrace,bug: Improve traceoff_on_warn Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 12/12] tracing: Add a trace print when traceoff_on_warning is triggered Steven Rostedt
  10 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Tom Zanussi

From: Tom Zanussi <zanussi@kernel.org>

With synthetic events now a separate config item as a result of
'tracing: Move synthetic events to a separate file', tests that use
both need to explicitly check for hist trigger support rather than
relying on hist triggers to pull in synthetic events.

Add an additional hist trigger check to all the trigger tests that now
require it, otherwise they'll fail if synthetic events but not hist
triggers are enabled.

Link: http://lkml.kernel.org/r/af36c539006ef2768114b4ed38e6b054f7c7a3bd.1590693308.git.zanussi@kernel.org

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 .../trigger/inter-event/trigger-field-variable-support.tc    | 5 +++++
 .../trigger/inter-event/trigger-inter-event-combined-hist.tc | 5 +++++
 .../trigger/inter-event/trigger-multi-actions-accept.tc      | 5 +++++
 .../trigger/inter-event/trigger-onmatch-action-hist.tc       | 5 +++++
 .../trigger/inter-event/trigger-onmatch-onmax-action-hist.tc | 5 +++++
 .../test.d/trigger/inter-event/trigger-onmax-action-hist.tc  | 5 +++++
 .../trigger/inter-event/trigger-snapshot-action-hist.tc      | 5 +++++
 .../test.d/trigger/inter-event/trigger-trace-action-hist.tc  | 5 +++++
 8 files changed, 40 insertions(+)

diff --git a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc
index 77be6e1f6e7b..e232059a8ab2 100644
--- a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc
+++ b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc
@@ -17,6 +17,11 @@ if [ ! -f synthetic_events ]; then
     exit_unsupported
 fi
 
+if [ ! -f events/sched/sched_process_fork/hist ]; then
+    echo "hist trigger is not supported"
+    exit_unsupported
+fi
+
 echo "Test field variable support"
 
 echo 'wakeup_latency u64 lat; pid_t pid; int prio; char comm[16]' > synthetic_events
diff --git a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-inter-event-combined-hist.tc b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-inter-event-combined-hist.tc
index f3eb8aacec0e..07cfcb8157b6 100644
--- a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-inter-event-combined-hist.tc
+++ b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-inter-event-combined-hist.tc
@@ -17,6 +17,11 @@ if [ ! -f synthetic_events ]; then
     exit_unsupported
 fi
 
+if [ ! -f events/sched/sched_process_fork/hist ]; then
+    echo "hist trigger is not supported"
+    exit_unsupported
+fi
+
 echo "Test create synthetic event"
 
 echo 'waking_latency  u64 lat pid_t pid' > synthetic_events
diff --git a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-multi-actions-accept.tc b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-multi-actions-accept.tc
index d281f056f980..73e413c2ca26 100644
--- a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-multi-actions-accept.tc
+++ b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-multi-actions-accept.tc
@@ -17,6 +17,11 @@ if [ ! -f synthetic_events ]; then
     exit_unsupported
 fi
 
+if [ ! -f events/sched/sched_process_fork/hist ]; then
+    echo "hist trigger is not supported"
+    exit_unsupported
+fi
+
 echo "Test multiple actions on hist trigger"
 echo 'wakeup_latency u64 lat; pid_t pid' >> synthetic_events
 TRIGGER1=events/sched/sched_wakeup/trigger
diff --git a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmatch-action-hist.tc b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmatch-action-hist.tc
index a708f0e7858a..ebe0ad827f9f 100644
--- a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmatch-action-hist.tc
+++ b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmatch-action-hist.tc
@@ -17,6 +17,11 @@ if [ ! -f synthetic_events ]; then
     exit_unsupported
 fi
 
+if [ ! -f events/sched/sched_process_fork/hist ]; then
+    echo "hist trigger is not supported"
+    exit_unsupported
+fi
+
 echo "Test create synthetic event"
 
 echo 'wakeup_latency  u64 lat pid_t pid char comm[16]' > synthetic_events
diff --git a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmatch-onmax-action-hist.tc b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmatch-onmax-action-hist.tc
index dfce6932d8be..2a2ef767249e 100644
--- a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmatch-onmax-action-hist.tc
+++ b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmatch-onmax-action-hist.tc
@@ -17,6 +17,11 @@ if [ ! -f synthetic_events ]; then
     exit_unsupported
 fi
 
+if [ ! -f events/sched/sched_process_fork/hist ]; then
+    echo "hist trigger is not supported"
+    exit_unsupported
+fi
+
 echo "Test create synthetic event"
 
 echo 'wakeup_latency  u64 lat pid_t pid char comm[16]' > synthetic_events
diff --git a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmax-action-hist.tc b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmax-action-hist.tc
index 0035995c2194..98d73bfb0296 100644
--- a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmax-action-hist.tc
+++ b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-onmax-action-hist.tc
@@ -17,6 +17,11 @@ if [ ! -f synthetic_events ]; then
     exit_unsupported
 fi
 
+if [ ! -f events/sched/sched_process_fork/hist ]; then
+    echo "hist trigger is not supported"
+    exit_unsupported
+fi
+
 echo "Test create synthetic event"
 
 echo 'wakeup_latency  u64 lat pid_t pid char comm[16]' > synthetic_events
diff --git a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-snapshot-action-hist.tc b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-snapshot-action-hist.tc
index f546c1b66a9b..01b01b9c4e07 100644
--- a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-snapshot-action-hist.tc
+++ b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-snapshot-action-hist.tc
@@ -12,6 +12,11 @@ if [ ! -f set_event ]; then
     exit_unsupported
 fi
 
+if [ ! -f events/sched/sched_process_fork/hist ]; then
+    echo "hist trigger is not supported"
+    exit_unsupported
+fi
+
 if [ ! -f snapshot ]; then
     echo "snapshot is not supported"
     exit_unsupported
diff --git a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-trace-action-hist.tc b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-trace-action-hist.tc
index 8021d60aafec..c3baa486aeb4 100644
--- a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-trace-action-hist.tc
+++ b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-trace-action-hist.tc
@@ -17,6 +17,11 @@ if [ ! -f synthetic_events ]; then
     exit_unsupported
 fi
 
+if [ ! -f events/sched/sched_process_fork/hist ]; then
+    echo "hist trigger is not supported"
+    exit_unsupported
+fi
+
 grep -q "trace(<synthetic_event>" README || exit_unsupported # version issue
 
 echo "Test create synthetic event"
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [for-next][PATCH 11/12] ftrace,bug: Improve traceoff_on_warn
  2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
                   ` (8 preceding siblings ...)
  2020-06-01 12:27 ` [for-next][PATCH 10/12] selftests/ftrace: Distinguish between hist and synthetic event checks Steven Rostedt
@ 2020-06-01 12:27 ` Steven Rostedt
  2020-06-01 12:27 ` [for-next][PATCH 12/12] tracing: Add a trace print when traceoff_on_warning is triggered Steven Rostedt
  10 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra (Intel)

From: Peter Zijlstra <peterz@infradead.org>

While doing some tracing, I found a huge portion of the per-cpu buffer
was taken by printk/serial output because we're disabling the trace far
too late (after printing the CUT string).

Improve matters for architectures that have GENERIC_BUG + _BUG_FLAGS by
killing the tracer in the exception handler before printing anything
much.

Link: https://lkml.kernel.org/r/20200528145240.GF706495@hirez.programming.kicks-ass.net

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 lib/bug.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/bug.c b/lib/bug.c
index 8c98af0bf585..7103440c0ee1 100644
--- a/lib/bug.c
+++ b/lib/bug.c
@@ -47,6 +47,7 @@
 #include <linux/bug.h>
 #include <linux/sched.h>
 #include <linux/rculist.h>
+#include <linux/ftrace.h>
 
 extern struct bug_entry __start___bug_table[], __stop___bug_table[];
 
@@ -153,6 +154,8 @@ enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs)
 	if (!bug)
 		return BUG_TRAP_TYPE_NONE;
 
+	disable_trace_on_warning();
+
 	file = NULL;
 	line = 0;
 	warning = 0;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [for-next][PATCH 12/12] tracing: Add a trace print when traceoff_on_warning is triggered
  2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
                   ` (9 preceding siblings ...)
  2020-06-01 12:27 ` [for-next][PATCH 11/12] ftrace,bug: Improve traceoff_on_warn Steven Rostedt
@ 2020-06-01 12:27 ` Steven Rostedt
  10 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2020-06-01 12:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

When "traceoff_on_warning" is enabled and a warning happens, there can still
be many trace events happening on other CPUs between the time the warning
occurred and the last trace event on that same CPU. This can cause confusion
in examining the trace, as it may not be obvious where the warning happened.
By adding a trace print into the trace just before disabling tracing, it
makes it obvious where the warning occurred, and the developer doesn't have
to look at other means to see what CPU it occurred on.

Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 29615f15a820..760fd102dbe2 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1299,8 +1299,11 @@ EXPORT_SYMBOL_GPL(tracing_off);
 
 void disable_trace_on_warning(void)
 {
-	if (__disable_trace_on_warning)
+	if (__disable_trace_on_warning) {
+		trace_array_printk_buf(global_trace.array_buffer.buffer, _THIS_IP_,
+			"Disabling tracing due to warning\n");
 		tracing_off();
+	}
 }
 
 /**
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-06-01 12:29 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-01 12:27 [for-next][PATCH 00/12] tracing: Updates for 5.8 (finally finished testing them) Steven Rostedt
2020-06-01 12:27 ` [for-next][PATCH 01/12] ftrace: show debugging information when panic_on_warn set Steven Rostedt
2020-06-01 12:27 ` [for-next][PATCH 02/12] tools/bootconfig: Add a summary of test cases and return error Steven Rostedt
2020-06-01 12:27 ` [for-next][PATCH 03/12] tracing/probe: reverse arguments to list_add Steven Rostedt
2020-06-01 12:27 ` [for-next][PATCH 04/12] tracing: Check state.disabled in synth event trace functions Steven Rostedt
2020-06-01 12:27 ` [for-next][PATCH 05/12] tracing: Add histogram-design document Steven Rostedt
2020-06-01 12:27 ` [for-next][PATCH 06/12] tracing: Add hist_debug trace event files for histogram debugging Steven Rostedt
2020-06-01 12:27 ` [for-next][PATCH 07/12] tracing/doc: Fix typos in histogram-design.rst Steven Rostedt
2020-06-01 12:27 ` [for-next][PATCH 09/12] tracing: Move synthetic events to a separate file Steven Rostedt
2020-06-01 12:27 ` [for-next][PATCH 10/12] selftests/ftrace: Distinguish between hist and synthetic event checks Steven Rostedt
2020-06-01 12:27 ` [for-next][PATCH 11/12] ftrace,bug: Improve traceoff_on_warn Steven Rostedt
2020-06-01 12:27 ` [for-next][PATCH 12/12] tracing: Add a trace print when traceoff_on_warning is triggered Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).