All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tom Zanussi <tom.zanussi@linux.intel.com>
To: rostedt@goodmis.org
Cc: tglx@linutronix.de, mhiramat@kernel.org, namhyung@kernel.org,
	vedang.patel@intel.com, linux-kernel@vger.kernel.org,
	linux-rt-users@vger.kernel.org,
	Tom Zanussi <tom.zanussi@linux.intel.com>
Subject: [PATCH 30/32] tracing: Add inter-event hist trigger Documentation
Date: Mon, 26 Jun 2017 17:49:31 -0500	[thread overview]
Message-ID: <88231902dad76e5de9c49a50058f0fa1003c5b09.1498510759.git.tom.zanussi@linux.intel.com> (raw)
In-Reply-To: <cover.1498510759.git.tom.zanussi@linux.intel.com>
In-Reply-To: <cover.1498510759.git.tom.zanussi@linux.intel.com>

Add background and details on inter-event hist triggers, including
hist variables, synthetic events, and actions.

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
---
 Documentation/trace/events.txt | 376 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 376 insertions(+)

diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt
index f271d87..888907c 100644
--- a/Documentation/trace/events.txt
+++ b/Documentation/trace/events.txt
@@ -571,6 +571,7 @@ The following commands are supported:
 	.sym-offset display an address as a symbol and offset
 	.syscall    display a syscall id as a system call name
 	.execname   display a common_pid as a program name
+	.usecs      display a $common_timestamp in microseconds
 
   Note that in general the semantics of a given field aren't
   interpreted when applying a modifier to it, but there are some
@@ -2101,3 +2102,378 @@ The following commands are supported:
         Hits: 489
         Entries: 7
         Dropped: 0
+
+6.3 Inter-event hist triggers
+-----------------------------
+
+Inter-event hist triggers are hist triggers that combine values from
+one or more other events and create a histogram using that data.  Data
+from an inter-event histogram can in turn become the source for
+further combined histograms, thus providing a chain of related
+histograms, which is important for some applications.
+
+The most important example of an inter-event quantity that can be used
+in this manner is latency, which is simply a difference in timestamps
+between two events (although trace events don't have an externally
+visible timestamp field, the inter-event hist trigger support adds a
+pseudo-field to all events named '$common_timestamp' which can be used
+as if it were an actual event field).  Although latency is the most
+important inter-event quantity, note that because the support is
+completely general across the trace event subsystem, any event field
+can be used in an inter-event quantity.
+
+An example of a histogram that combines data from other histograms
+into a useful chain would be a 'wakeupswitch latency' histogram that
+combines a 'wakeup latency' histogram and a 'switch latency'
+histogram.
+
+Normally, a hist trigger specification consists of a (possibly
+compound) key along with one or more numeric values, which are
+continually updated sums associated with that key.  A histogram
+specification in this case consists of individual key and value
+specifications that refer to trace event fields associated with a
+single event type.
+
+The inter-event hist trigger extension allows fields from multiple
+events to be referenced and combined into a multi-event histogram
+specification.  In support of this overall goal, a few enabling
+features have been added to the hist trigger support:
+
+  - In order to compute an inter-event quantity, a value from one
+    event needs to saved and then referenced from another event.  This
+    requires the introduction of support for histogram 'variables'.
+
+  - The computation of inter-event quantities and their combination
+    require some minimal amount of support for applying simple
+    expressions to variables (+ and -).
+
+  - A histogram consisting of inter-event quantities isn't logically a
+    histogram on either event (so having the 'hist' file for either
+    event host the histogram output doesn't really make sense).  To
+    address the idea that the histogram is associated with a
+    combination of events, support is added allowing the creation of
+    'synthetic' events that are events derived from other events.
+    These synthetic events are full-fledged events just like any other
+    and can be used as such, as for instance to create the
+    'combination' histograms mentioned previously.
+
+  - A set of 'actions' can be associated with histogram entries -
+    these can be used to generate the previously mentioned synthetic
+    events, but can also be used for other purposes, such as for
+    example saving context when a 'max' latency has been hit.
+
+  - Trace events don't have a 'timestamp' associated with them, but
+    there is an implicit timestamp saved along with an event in the
+    underlying ftrace ring buffer.  This timestamp is now exposed as a
+    a synthetic field named '$common_timestamp' which can be used in
+    histograms as if it were any other event field.  Note that it has
+    a '$' prefixed to it - this is meant to indicate that it isn't an
+    actual field in the trace format but rather is a synthesized value
+    that nonetheless can be used as if it were an actual field.  By
+    default it is in units of nanoseconds; appending '.usecs' to a
+    common_timestamp field changes the units to microseconds.
+
+These features are decribed in more detail in the following sections.
+
+6.3.1 Histogram Variables
+-------------------------
+
+Variables are simply named locations used for saving and retrieving
+values between matching events.  A 'matching' event is defined as an
+event that has a matching key - if a variable is saved for a histogram
+entry corresponding to that key, any subsequent event with a matching
+key can access that variable.
+
+A variable's value is normally available to any subsequent event until
+it is set to something else by a subsequent event.  The one exception
+to that rule is that any variable used in an expression is essentially
+'read-once' - once it's used by an expression in a subsequent event,
+it's reset to its 'unset' state, which means it can't be used again
+unless it's set again.  This ensures not only that an event doesn't
+use an uninitialized variable in a calculation, but that that variable
+is used only once and not for any unrelated subsequent match.
+
+The basic syntax for saving a variable is to simply prefix a unique
+variable name not corresponding to any keyword along with an '=' sign
+to any event field.
+
+Either keys or values can be saved and retrieved in this way.  This
+creates a variable named 'ts0' for a histogram entry with the key
+'next_pid':
+
+  # echo 'hist:keys=next_pid:vals=ts0=$common_timestamp ... >> event/trigger
+
+The ts0 variable can be accessed by any subsequent event having the
+same pid as 'next_pid'.
+
+Variable references are formed by prepending the variable name with
+the '$' sign.  Thus for example, the ts0 variable above would be
+referenced as '$ts0' in subsequent expressions.
+
+Because 'vals=' is used, the $common_timestamp variable value above
+will also be summed as a normal histogram value would (though for a
+timestamp it makes little sense).
+
+The below shows that a key value can also be saved in the same way:
+
+  # echo 'hist:key=timer_pid=common_pid ...' >> event/trigger
+
+If a variable isn't a key variable or prefixed with 'vals=', the
+associated event field will be saved in a variable but won't be summed
+as a value:
+
+  # echo 'hist:keys=next_pid:ts1=$common_timestamp ... >> event/trigger
+
+Multiple variables can be assigned at the same time.  The below would
+result in both ts0 and b being created as variables, with both
+common_timestamp and field1 additionally being summed as values:
+
+  # echo 'hist:keys=pid:vals=ts0=$common_timestamp,b=field1 ... >> event/trigger
+
+Any number of variables not bound to a 'vals=' prefix can also be
+assigned by simply separating them with colons.  Below is the same
+thing but without the values being summed in the histogram:
+
+  # echo 'hist:keys=pid:ts0=$common_timestamp:b=field1 ... >> event/trigger
+
+Variables set as above can be referenced and used in expressions on
+another event.
+
+For example, here's how a latency can be calculated:
+
+  # echo 'hist:keys=pid,prio:ts0=$common_timestamp ... >> event1/trigger
+  # echo 'hist:keys=next_pid:wakeup_lat=$common_timestamp-$ts0 ... >> event2/trigger
+
+In the first line above, the event's timetamp is saved into the
+variable ts0.  In the next line, ts0 is subtracted from the second
+event's timestamp to produce the latency, which is then assigned into
+yet another variable, 'wakeup_lat'.  The hist trigger below in turn
+makes use of the wakeup_lat variable to compute a combined latency
+using the same key and variable from yet another event:
+
+  # echo 'hist:key=pid:wakeupswitch_lat=$wakeup_lat+$switchtime_lat ... >> event3/trigger
+
+6.3.2 Synthetic Events
+----------------------
+
+Synthetic events are user-defined events generated from hist trigger
+variables or fields associated with one or more other events.  Their
+purpose is to provide a mechanism for displaying data spanning
+multiple events consistent with the existing and already familiar
+usage for normal events.
+
+To define a synthetic event, the user writes a simple specification
+consisting of the name of the new event along with one or more
+variables and their types, which can be any valid field type,
+separated by semicolons, to the tracing/synthetic_events file.
+
+For instance, the following creates a new event named 'wakeup_latency'
+with 3 fields: lat, pid, and prio.  Each of those fields is simply a
+variable reference to a variable on another event:
+
+  # echo 'wakeup_latency \
+          u64 lat; \
+          pid_t pid; \
+	  int prio' >> \
+	  /sys/kernel/debug/tracing/synthetic_events
+
+Reading the tracing/synthetic_events file lists all the currently
+defined synthetic events, in this case the event defined above:
+
+  # cat /sys/kernel/debug/tracing/synthetic_events
+    wakeup_latency u64 lat; pid_t pid; int prio
+
+An existing synthetic event definition can be removed by prepending
+the command that defined it with a '!':
+
+  # echo '!wakeup_latency u64 lat pid_t pid int prio' >> \
+    /sys/kernel/debug/tracing/synthetic_events
+
+At this point, there isn't yet an actual 'wakeup_latency' event
+instantiated in the event subsytem - for this to happen, a 'hist
+trigger action' needs to be instantiated and bound to actual fields
+and variables defined on other events (see Section 6.3.3 below).
+
+Once that is done, an event instance is created, and a histogram can
+be defined using it:
+
+  # echo 'hist:keys=pid,prio,lat.log2:sort=pid,lat' >> \
+        /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger
+
+The new event is created under the tracing/events/synthetic/ directory
+and looks and behaves just like any other event:
+
+  # ls /sys/kernel/debug/tracing/events/synthetic/wakeup_latency
+        enable  filter  format  hist  id  trigger
+
+Like any other event, once a histogram is enabled for the event, the
+output can be displayed by reading the event's 'hist' file.
+
+6.3.3 Hist trigger 'actions'
+----------------------------
+
+A hist trigger 'action' is a function that's executed whenever a
+histogram entry is added or updated.
+
+The default 'action' if no special function is explicity specified is
+as it always has been, to simply update the set of values associated
+with an entry.  Some applications, however, may want to perform
+additional actions at that point, such as generate another event, or
+compare and save a maximum.
+
+The following additional actions are available.  To specify an action
+for a given event, simply specify the action between colons in the
+hist trigger specification.
+
+  - onmatch(matching.event).<synthetic_event_name>(param list)
+
+    The 'onmatch(matching.event).<synthetic_event_name>(params)' hist
+    trigger action is invoked whenever an event matches and the
+    histogram entry would be added or updated.  It causes the named
+    synthetic event to be generated with the values given in the
+    'param list'.  The result is the generation of a synthetic event
+    that consists of the values contained in those variables at the
+    time the invoking event was hit.
+
+    The 'param list' consists of one or more parameters which may be
+    either variables or fields defined on either the 'matching.event'
+    or the target event.  The variables or fields specified in the
+    param list may be either fully-qualified or unqualified.  If a
+    variable is specified as unqualified, it must be unique between
+    the two events.  A field name used as a param can be unqualified
+    if it refers to the target event, but must be fully qualified if
+    it refers to the matching event.  A fully-qualified name is of the
+    form 'system.event_name.$var_name' or 'system.event_name.field'.
+
+    The 'matching.event' specification is simply the fully qualified
+    event name of the event that matches the target event for the
+    onmatch() functionality, in the form 'system.event_name'.
+
+    Finally, the number and type of variables/fields in the 'param
+    list' must match the number and types of the fields in the
+    synthetic event being generated.
+
+    As an example the below defines a simple synthetic event and uses
+    a variable defined on the sched_wakeup_new event as a parameter
+    when invoking the synthetic event.  Here we define the synthetic
+    event:
+
+    # echo 'wakeup_new_test pid_t pid' >> \
+           /sys/kernel/debug/tracing/synthetic_events
+
+    # cat /sys/kernel/debug/tracing/synthetic_events
+          wakeup_new_test pid_t pid
+
+    The following hist trigger both defines the missing testpid
+    variable and specifies an onmatch() action that generates a
+    wakeup_new_test synthetic event whenever a sched_wakeup_new event
+    occurs, which because of the 'if comm == "cyclictest"' filter only
+    happens when the executable is cyclictest:
+
+    # echo 'hist:keys=testpid=pid:onmatch(sched.sched_wakeup_new).\
+            wakeup_new_test($testpid) if comm=="cyclictest"' >> \
+            /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger
+
+    Creating and displaying a histogram based on those events is now
+    just a matter of using the fields and new synthetic event in the
+    tracing/events/synthetic directory, as usual:
+
+    # echo 'hist:keys=pid:sort=pid' >> \
+           /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/trigger
+
+    Running 'cyclictest' should cause wakeup_new events to generate
+    wakeup_new_test synthetic events which should result in histogram
+    output in the wakeup_new_test event's hist file:
+
+    # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/hist
+
+    A more typical usage would be to use two events to calculate a
+    latency.  The following example uses a set of hist triggers to
+    produce a 'wakeup_latency' histogram:
+
+    First, we define a 'wakeup_latency' synthetic event:
+
+    # echo 'wakeup_latency u64 lat; pid_t pid; int prio' >> \
+            /sys/kernel/debug/tracing/synthetic_events
+
+    Next, we specify that whenever we see a sched_wakeup event for a
+    cyclictest thread, save the timestamp in a 'ts0' variable:
+
+    # echo 'hist:keys=saved_pid=pid:ts0=$common_timestamp.usecs \
+            if comm=="cyclictest"' >> \
+	    /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger
+
+    Then, when the corresponding thread is actually scheduled onto the
+    CPU by a sched_switch event, calculate the latency and use that
+    along with another variable and an event field to generate a
+    wakeup_latency synthetic event:
+
+    # echo 'hist:keys=next_pid:wakeup_lat=$common_timestamp.usecs-$ts0:\
+            onmatch(sched.sched_wakeup).wakeup_latency($wakeup_lat,\
+	            $saved_pid,next_prio) if next_comm=="cyclictest"' >> \
+	    /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
+
+    We also need to create a histogram on the wakeup_latency synthetic
+    event in order to aggregate the generated synthetic event data:
+
+    # echo 'hist:keys=pid,prio,lat:sort=pid,lat' >> \
+            /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger
+
+    Finally, once we've run cyclictest to actually generate some
+    events, we can see the output by looking at the wakeup_latency
+    synthetic event's hist file:
+
+    # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/hist
+
+  - onmax(var).save(field,...)
+
+    The 'onmax(var).save(field,...)' hist trigger action is invoked
+    whenever the value of 'var' associated with a histogram entry
+    exceeds the current maximum contained in that variable.
+
+    The end result is that the trace event fields specified as the
+    onmax.save() params will be saved if 'var' exceeds the current
+    maximum for that hist trigger entry.  This allows context from the
+    event that exhibited the new maximum to be saved for later
+    reference.  When the histogram is displayed, additional fields
+    displaying the saved values will be printed.
+
+    As an example the below defines a couple of hist triggers, one for
+    sched_wakeup and another for sched_switch, keyed on pid.  Whenever
+    a sched_wakeup occurs, the timestamp is saved in the entry
+    corresponding to the current pid, and when the scheduler switches
+    back to that pid, the timestamp difference is calculated.  If the
+    resulting latency, stored in wakeup_lat, exceeds the current
+    maximum latency, the values specified in the save() fields are
+    recoreded:
+
+    # echo 'hist:keys=pid:ts0=$common_timestamp.usecs \
+            if comm=="cyclictest"' >> \
+            /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger
+
+    # echo 'hist:keys=next_pid:\
+            wakeup_lat=$common_timestamp.usecs-$ts0:\
+            onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm) \
+            if next_comm=="cyclictest"' >> \
+            /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
+
+    When the histogram is displayed, the max value and the saved
+    values corresponding to the max are displayed following the rest
+    of the fields:
+
+    # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist
+      { next_pid:       2255 } hitcount:        239
+        common_timestamp-ts0:          0
+        max:         27
+	next_comm: cyclictest
+        prev_pid:          0  prev_prio:        120  prev_comm: swapper/1
+
+      { next_pid:       2256 } hitcount:       2355
+        common_timestamp-ts0: 0
+        max:         49  next_comm: cyclictest
+        prev_pid:          0  prev_prio:        120  prev_comm: swapper/0
+
+      Totals:
+          Hits: 12970
+          Entries: 2
+          Dropped: 0
-- 
1.9.3

  parent reply	other threads:[~2017-06-26 22:53 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-26 22:49 [PATCH 00/32] tracing: Inter-event (e.g. latency) support Tom Zanussi
2017-06-26 22:49 ` [PATCH 01/32] tracing: Add hist_field_name() accessor Tom Zanussi
2017-07-13  6:49   ` Piotr Gregor
2017-07-13 14:41     ` Tom Zanussi
2017-07-14  2:19   ` Namhyung Kim
2017-06-26 22:49 ` [PATCH 02/32] tracing: Reimplement log2 Tom Zanussi
2017-07-14  2:33   ` Namhyung Kim
2017-07-14 16:13     ` Tom Zanussi
2017-06-26 22:49 ` [PATCH 03/32] ring-buffer: Add interface for setting absolute time stamps Tom Zanussi
2017-07-14  5:25   ` Namhyung Kim
2017-07-14 16:14     ` Tom Zanussi
2017-06-26 22:49 ` [PATCH 04/32] ring-buffer: Redefine the unimplemented RINGBUF_TIME_TIME_STAMP Tom Zanussi
2017-07-02  8:51   ` Joel Fernandes (Google)
2017-06-26 22:49 ` [PATCH 05/32] tracing: Give event triggers access to ring_buffer_event Tom Zanussi
2017-06-26 22:49 ` [PATCH 06/32] tracing: Add ring buffer event param to hist field functions Tom Zanussi
2017-06-26 22:49 ` [PATCH 07/32] tracing: Increase tracing map KEYS_MAX size Tom Zanussi
2017-06-26 22:49 ` [PATCH 08/32] tracing: Break out hist trigger assignment parsing Tom Zanussi
2017-06-26 22:49 ` [PATCH 09/32] tracing: Make traceprobe parsing code reusable Tom Zanussi
2017-06-26 22:49 ` [PATCH 10/32] tracing: Add NO_DISCARD event file flag Tom Zanussi
2017-06-26 22:49 ` [PATCH 11/32] tracing: Add post-trigger flag to hist trigger command Tom Zanussi
2017-06-26 22:49 ` [PATCH 12/32] tracing: Add hist trigger timestamp support Tom Zanussi
2017-07-17  6:00   ` Namhyung Kim
2017-07-17 16:57     ` Tom Zanussi
2017-06-26 22:49 ` [PATCH 13/32] tracing: Add per-element variable support to tracing_map Tom Zanussi
2017-06-26 22:49 ` [PATCH 14/32] tracing: Add hist_data member to hist_field Tom Zanussi
2017-06-26 22:49 ` [PATCH 15/32] tracing: Add usecs modifier for hist trigger timestamps Tom Zanussi
2017-06-26 22:49 ` [PATCH 16/32] tracing: Add variable support to hist triggers Tom Zanussi
2017-07-19  1:07   ` Namhyung Kim
2017-07-19 15:40     ` Tom Zanussi
2017-06-26 22:49 ` [PATCH 17/32] tracing: Account for variables in named trigger compatibility Tom Zanussi
2017-06-26 22:49 ` [PATCH 18/32] tracing: Add simple expression support to hist triggers Tom Zanussi
2017-07-21  2:02   ` Namhyung Kim
2017-07-21 16:29     ` Tom Zanussi
2017-06-26 22:49 ` [PATCH 19/32] tracing: Add variable reference handling " Tom Zanussi
2017-07-21  3:31   ` Namhyung Kim
2017-07-21 16:41     ` Tom Zanussi
2017-06-26 22:49 ` [PATCH 20/32] tracing: Add support for dynamic tracepoints Tom Zanussi
2017-06-26 22:49 ` [PATCH 21/32] tracing: Add hist trigger action hook Tom Zanussi
2017-06-26 22:49 ` [PATCH 22/32] tracing: Add support for 'synthetic' events Tom Zanussi
2017-07-23 12:00   ` Namhyung Kim
2017-07-24 16:11     ` Tom Zanussi
2017-06-26 22:49 ` [PATCH 23/32] tracing: Add 'onmatch' hist trigger action support Tom Zanussi
2017-07-23 15:12   ` Namhyung Kim
2017-07-24 16:17     ` Tom Zanussi
2017-07-26  2:40   ` Namhyung Kim
2017-07-26 16:38     ` Tom Zanussi
2017-06-26 22:49 ` [PATCH 24/32] tracing: Add 'onmax' " Tom Zanussi
2017-07-26  3:04   ` Namhyung Kim
2017-07-26 16:45     ` Tom Zanussi
2017-06-26 22:49 ` [PATCH 25/32] tracing: Allow whitespace to surround hist trigger filter Tom Zanussi
2017-06-26 22:49 ` [PATCH 26/32] tracing: Make duplicate count from tracing_map available Tom Zanussi
2017-06-26 22:49 ` [PATCH 27/32] tracing: Add cpu field for hist triggers Tom Zanussi
2017-06-26 22:49 ` [PATCH 28/32] tracing: Add hist trigger support for variable reference aliases Tom Zanussi
2017-06-26 22:49 ` [PATCH 29/32] tracing: Add 'last error' error facility for hist triggers Tom Zanussi
2017-07-26  4:39   ` Namhyung Kim
2017-07-26 16:47     ` Tom Zanussi
2017-06-26 22:49 ` Tom Zanussi [this message]
2017-06-26 22:49 ` [PATCH 31/32] tracing: Make tracing_set_clock() non-static Tom Zanussi
2017-06-26 22:49 ` [PATCH 32/32] tracing: Add a clock attribute for hist triggers Tom Zanussi
2017-06-27 14:58 ` [PATCH 00/32] tracing: Inter-event (e.g. latency) support Steven Rostedt
2017-06-28 14:21 ` Masami Hiramatsu
2017-06-28 19:09   ` Tom Zanussi
2017-07-01  7:01 ` Joel Fernandes (Google)
2017-07-12 19:17   ` Tom Zanussi
2017-07-04 16:12 ` Sebastian Andrzej Siewior
2017-07-12 19:20   ` Tom Zanussi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=88231902dad76e5de9c49a50058f0fa1003c5b09.1498510759.git.tom.zanussi@linux.intel.com \
    --to=tom.zanussi@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=vedang.patel@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.