linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [for-next][PATCH 0/7] tracing: More changes queued up for 3.16
@ 2014-05-31  1:38 Steven Rostedt
  2014-05-31  1:38 ` [for-next][PATCH 1/7] tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines Steven Rostedt
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Steven Rostedt @ 2014-05-31  1:38 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
for-next

Head SHA1: 676412a37f248205ecadfaddb50e150a44ef2b0c


Robert Elliott (2)
      tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines
      tracing: Add funcgraph_tail option to print function name after closing braces

Steven Rostedt (1)
      tracing: Print nasty banner when trace_printk() is in use

Steven Rostedt (Red Hat) (3)
      tracing: Add tracepoint benchmark tracepoint
      tracing: Try again for saved cmdline if failed due to locking
      tracing: Move locking of trace_cmdline_lock into start/stop seq calls

Yoshihiro YUNOMAE (1)
      tracing: Have saved_cmdlines use the seq_read infrastructure

----
 Documentation/trace/ftrace.txt       |   26 +++++
 kernel/trace/Kconfig                 |   30 +++++
 kernel/trace/Makefile                |    3 
 kernel/trace/trace.c                 |  137 ++++++++++++++++++---------
 kernel/trace/trace.h                 |    2 
 kernel/trace/trace_benchmark.c       |  176 +++++++++++++++++++++++++++++++++++
 kernel/trace/trace_benchmark.h       |   41 ++++++++
 kernel/trace/trace_functions_graph.c |   18 +--
 8 files changed, 375 insertions(+), 58 deletions(-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [for-next][PATCH 1/7] tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines
  2014-05-31  1:38 [for-next][PATCH 0/7] tracing: More changes queued up for 3.16 Steven Rostedt
@ 2014-05-31  1:38 ` Steven Rostedt
  2014-05-31  1:38 ` [for-next][PATCH 2/7] tracing: Add funcgraph_tail option to print function name after closing braces Steven Rostedt
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2014-05-31  1:38 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Robert Elliott

[-- Attachment #1: 0000-tracing-Eliminate-duplicate-TRACE_GRAPH_PRINT_xx-def.patch --]
[-- Type: text/plain, Size: 1738 bytes --]

From: Robert Elliott <elliott@hp.com>

Eliminate duplicate TRACE_GRAPH_PRINT_xx defines
in trace_functions_graph.c that are already in
trace.h.

Add TRACE_GRAPH_PRINT_IRQS to trace.h, which is
the only one that is missing.

Link: http://lkml.kernel.org/p/20140520221031.8359.24733.stgit@beardog.cce.hp.com

Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace.h                 | 1 +
 kernel/trace/trace_functions_graph.c | 9 ---------
 2 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 3b3e09e61f33..68050633255e 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -724,6 +724,7 @@ extern unsigned long trace_flags;
 #define TRACE_GRAPH_PRINT_PROC          0x8
 #define TRACE_GRAPH_PRINT_DURATION      0x10
 #define TRACE_GRAPH_PRINT_ABS_TIME      0x20
+#define TRACE_GRAPH_PRINT_IRQS          0x40
 #define TRACE_GRAPH_PRINT_FILL_SHIFT	28
 #define TRACE_GRAPH_PRINT_FILL_MASK	(0x3 << TRACE_GRAPH_PRINT_FILL_SHIFT)
 
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index b86dd4d8c6a6..af08dd531cb8 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -38,15 +38,6 @@ struct fgraph_data {
 
 #define TRACE_GRAPH_INDENT	2
 
-/* Flag options */
-#define TRACE_GRAPH_PRINT_OVERRUN	0x1
-#define TRACE_GRAPH_PRINT_CPU		0x2
-#define TRACE_GRAPH_PRINT_OVERHEAD	0x4
-#define TRACE_GRAPH_PRINT_PROC		0x8
-#define TRACE_GRAPH_PRINT_DURATION	0x10
-#define TRACE_GRAPH_PRINT_ABS_TIME	0x20
-#define TRACE_GRAPH_PRINT_IRQS		0x40
-
 static unsigned int max_depth;
 
 static struct tracer_opt trace_opts[] = {
-- 
2.0.0.rc2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 2/7] tracing: Add funcgraph_tail option to print function name after closing braces
  2014-05-31  1:38 [for-next][PATCH 0/7] tracing: More changes queued up for 3.16 Steven Rostedt
  2014-05-31  1:38 ` [for-next][PATCH 1/7] tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines Steven Rostedt
@ 2014-05-31  1:38 ` Steven Rostedt
  2014-05-31  1:39 ` [for-next][PATCH 3/7] tracing: Print nasty banner when trace_printk() is in use Steven Rostedt
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2014-05-31  1:38 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Robert Elliott

[-- Attachment #1: 0001-tracing-Add-funcgraph_tail-option-to-print-function-.patch --]
[-- Type: text/plain, Size: 4306 bytes --]

From: Robert Elliott <elliott@hp.com>

In the function-graph tracer, add a funcgraph_tail option
to print the function name on all } lines, not just
functions whose first line is no longer in the trace
buffer.

If a function calls other traced functions, its total
time appears on its } line.  This change allows grep
to be used to determine the function for which the
line corresponds.

Update Documentation/trace/ftrace.txt to describe
this new option.

Link: http://lkml.kernel.org/p/20140520221041.8359.6782.stgit@beardog.cce.hp.com

Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 Documentation/trace/ftrace.txt       | 26 ++++++++++++++++++++++++++
 kernel/trace/trace.h                 |  1 +
 kernel/trace/trace_functions_graph.c |  9 ++++++---
 3 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
index bd365988e8d8..2479b2a0c77c 100644
--- a/Documentation/trace/ftrace.txt
+++ b/Documentation/trace/ftrace.txt
@@ -2003,6 +2003,32 @@ want, depending on your needs.
   360.774530 |   1)   0.594 us    |                                          __phys_addr();
 
 
+The function name is always displayed after the closing bracket
+for a function if the start of that function is not in the
+trace buffer.
+
+Display of the function name after the closing bracket may be
+enabled for functions whose start is in the trace buffer,
+allowing easier searching with grep for function durations.
+It is default disabled.
+
+	hide: echo nofuncgraph-tail > trace_options
+	show: echo funcgraph-tail > trace_options
+
+  Example with nofuncgraph-tail (default):
+  0)               |      putname() {
+  0)               |        kmem_cache_free() {
+  0)   0.518 us    |          __phys_addr();
+  0)   1.757 us    |        }
+  0)   2.861 us    |      }
+
+  Example with funcgraph-tail:
+  0)               |      putname() {
+  0)               |        kmem_cache_free() {
+  0)   0.518 us    |          __phys_addr();
+  0)   1.757 us    |        } /* kmem_cache_free() */
+  0)   2.861 us    |      } /* putname() */
+
 You can put some comments on specific functions by using
 trace_printk() For example, if you want to put a comment inside
 the __might_sleep() function, you just have to include
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 68050633255e..217207ad60b3 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -725,6 +725,7 @@ extern unsigned long trace_flags;
 #define TRACE_GRAPH_PRINT_DURATION      0x10
 #define TRACE_GRAPH_PRINT_ABS_TIME      0x20
 #define TRACE_GRAPH_PRINT_IRQS          0x40
+#define TRACE_GRAPH_PRINT_TAIL          0x80
 #define TRACE_GRAPH_PRINT_FILL_SHIFT	28
 #define TRACE_GRAPH_PRINT_FILL_MASK	(0x3 << TRACE_GRAPH_PRINT_FILL_SHIFT)
 
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index af08dd531cb8..4de3e57f723c 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -55,11 +55,13 @@ static struct tracer_opt trace_opts[] = {
 	{ TRACER_OPT(funcgraph-abstime, TRACE_GRAPH_PRINT_ABS_TIME) },
 	/* Display interrupts */
 	{ TRACER_OPT(funcgraph-irqs, TRACE_GRAPH_PRINT_IRQS) },
+	/* Display function name after trailing } */
+	{ TRACER_OPT(funcgraph-tail, TRACE_GRAPH_PRINT_TAIL) },
 	{ } /* Empty entry */
 };
 
 static struct tracer_flags tracer_flags = {
-	/* Don't display overruns and proc by default */
+	/* Don't display overruns, proc, or tail by default */
 	.val = TRACE_GRAPH_PRINT_CPU | TRACE_GRAPH_PRINT_OVERHEAD |
 	       TRACE_GRAPH_PRINT_DURATION | TRACE_GRAPH_PRINT_IRQS,
 	.opts = trace_opts
@@ -1167,9 +1169,10 @@ print_graph_return(struct ftrace_graph_ret *trace, struct trace_seq *s,
 	 * If the return function does not have a matching entry,
 	 * then the entry was lost. Instead of just printing
 	 * the '}' and letting the user guess what function this
-	 * belongs to, write out the function name.
+	 * belongs to, write out the function name. Always do
+	 * that if the funcgraph-tail option is enabled.
 	 */
-	if (func_match) {
+	if (func_match && !(flags & TRACE_GRAPH_PRINT_TAIL)) {
 		ret = trace_seq_puts(s, "}\n");
 		if (!ret)
 			return TRACE_TYPE_PARTIAL_LINE;
-- 
2.0.0.rc2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 3/7] tracing: Print nasty banner when trace_printk() is in use
  2014-05-31  1:38 [for-next][PATCH 0/7] tracing: More changes queued up for 3.16 Steven Rostedt
  2014-05-31  1:38 ` [for-next][PATCH 1/7] tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines Steven Rostedt
  2014-05-31  1:38 ` [for-next][PATCH 2/7] tracing: Add funcgraph_tail option to print function name after closing braces Steven Rostedt
@ 2014-05-31  1:39 ` Steven Rostedt
  2014-05-31  1:39 ` [for-next][PATCH 4/7] tracing: Add tracepoint benchmark tracepoint Steven Rostedt
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2014-05-31  1:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

[-- Attachment #1: 0002-tracing-Print-nasty-banner-when-trace_printk-is-in-u.patch --]
[-- Type: text/plain, Size: 3591 bytes --]

From: Steven Rostedt <rostedt@goodmis.org>

trace_printk() is used to debug fast paths within the kernel. Places
that gets called in any context (interrupt or NMI) or thousands of
times a second. Something you do not want to do with a printk().

In order to make it completely lockless as it needs a temporary buffer
to handle some of the string formatting, a page is created per cpu for
every context (four per cpu; normal, softirq, irq, NMI).

Since trace_printk() should only be used for debugging purposes,
there's no reason to waste memory on these buffers on a production
system. That means, trace_printk() should never be used unless a
developer is debugging their kernel. There's macro magic to allocate
the buffers if trace_printk() is used anywhere in the kernel.

To help enforce that trace_printk() isn't used outside of development,
when it is used, a nasty banner is displayed on bootup (or when a module
is loaded that uses trace_printk() and the kernel core does not).

Here's the banner:

 **********************************************************
 **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
 **                                                      **
 ** trace_printk() being used. Allocating extra memory.  **
 **                                                      **
 ** This means that this is a DEBUG kernel and it is     **
 ** unsafe for produciton use.                           **
 **                                                      **
 ** If you see this message and you are not debugging    **
 ** the kernel, report this immediately to your vendor!  **
 **                                                      **
 **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
 **********************************************************

That should hopefully keep developers from trying to sneak in a
trace_printk() or two.

Link: http://lkml.kernel.org/p/20140528131440.2283213c@gandalf.local.home

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 05431696b10c..eb228b9de170 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1975,7 +1975,21 @@ void trace_printk_init_buffers(void)
 	if (alloc_percpu_trace_buffer())
 		return;
 
-	pr_info("ftrace: Allocated trace_printk buffers\n");
+	/* trace_printk() is for debug use only. Don't use it in production. */
+
+	pr_warning("\n**********************************************************\n");
+	pr_warning("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
+	pr_warning("**                                                      **\n");
+	pr_warning("** trace_printk() being used. Allocating extra memory.  **\n");
+	pr_warning("**                                                      **\n");
+	pr_warning("** This means that this is a DEBUG kernel and it is     **\n");
+	pr_warning("** unsafe for produciton use.                           **\n");
+	pr_warning("**                                                      **\n");
+	pr_warning("** If you see this message and you are not debugging    **\n");
+	pr_warning("** the kernel, report this immediately to your vendor!  **\n");
+	pr_warning("**                                                      **\n");
+	pr_warning("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
+	pr_warning("**********************************************************\n");
 
 	/* Expand the buffers to set size */
 	tracing_update_buffers();
-- 
2.0.0.rc2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 4/7] tracing: Add tracepoint benchmark tracepoint
  2014-05-31  1:38 [for-next][PATCH 0/7] tracing: More changes queued up for 3.16 Steven Rostedt
                   ` (2 preceding siblings ...)
  2014-05-31  1:39 ` [for-next][PATCH 3/7] tracing: Print nasty banner when trace_printk() is in use Steven Rostedt
@ 2014-05-31  1:39 ` Steven Rostedt
  2014-05-31  1:39 ` [for-next][PATCH 5/7] tracing: Have saved_cmdlines use the seq_read infrastructure Steven Rostedt
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2014-05-31  1:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

[-- Attachment #1: 0003-tracing-Add-tracepoint-benchmark-tracepoint.patch --]
[-- Type: text/plain, Size: 9707 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

In order to help benchmark the time tracepoints take, a new config
option is added called CONFIG_TRACEPOINT_BENCHMARK. When this option
is set a tracepoint is created called "benchmark:benchmark_event".
When the tracepoint is enabled, it kicks off a kernel thread that
goes into an infinite loop (calling cond_sched() to let other tasks
run), and calls the tracepoint. Each iteration will record the time
it took to write to the tracepoint and the next iteration that
data will be passed to the tracepoint itself. That is, the tracepoint
will report the time it took to do the previous tracepoint.
The string written to the tracepoint is a static string of 128 bytes
to keep the time the same. The initial string is simply a write of
"START". The second string records the cold cache time of the first
write which is not added to the rest of the calculations.

As it is a tight loop, it benchmarks as hot cache. That's fine because
we care most about hot paths that are probably in cache already.

An example of the output:

     START
     first=3672 [COLD CACHED]
     last=632 first=3672 max=632 min=632 avg=316 std=446 std^2=199712
     last=278 first=3672 max=632 min=278 avg=303 std=316 std^2=100337
     last=277 first=3672 max=632 min=277 avg=296 std=258 std^2=67064
     last=273 first=3672 max=632 min=273 avg=292 std=224 std^2=50411
     last=273 first=3672 max=632 min=273 avg=288 std=200 std^2=40389
     last=281 first=3672 max=632 min=273 avg=287 std=183 std^2=33666

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/Kconfig           |  30 +++++++
 kernel/trace/Makefile          |   3 +
 kernel/trace/trace_benchmark.c | 176 +++++++++++++++++++++++++++++++++++++++++
 kernel/trace/trace_benchmark.h |  41 ++++++++++
 4 files changed, 250 insertions(+)
 create mode 100644 kernel/trace/trace_benchmark.c
 create mode 100644 kernel/trace/trace_benchmark.h

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 8639819f6cef..d4409356f40d 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -535,6 +535,36 @@ config MMIOTRACE_TEST
 
 	  Say N, unless you absolutely know what you are doing.
 
+config TRACEPOINT_BENCHMARK
+        bool "Add tracepoint that benchmarks tracepoints"
+	help
+	 This option creates the tracepoint "benchmark:benchmark_event".
+	 When the tracepoint is enabled, it kicks off a kernel thread that
+	 goes into an infinite loop (calling cond_sched() to let other tasks
+	 run), and calls the tracepoint. Each iteration will record the time
+	 it took to write to the tracepoint and the next iteration that
+	 data will be passed to the tracepoint itself. That is, the tracepoint
+	 will report the time it took to do the previous tracepoint.
+	 The string written to the tracepoint is a static string of 128 bytes
+	 to keep the time the same. The initial string is simply a write of
+	 "START". The second string records the cold cache time of the first
+	 write which is not added to the rest of the calculations.
+
+	 As it is a tight loop, it benchmarks as hot cache. That's fine because
+	 we care most about hot paths that are probably in cache already.
+
+	 An example of the output:
+
+	      START
+	      first=3672 [COLD CACHED]
+	      last=632 first=3672 max=632 min=632 avg=316 std=446 std^2=199712
+	      last=278 first=3672 max=632 min=278 avg=303 std=316 std^2=100337
+	      last=277 first=3672 max=632 min=277 avg=296 std=258 std^2=67064
+	      last=273 first=3672 max=632 min=273 avg=292 std=224 std^2=50411
+	      last=273 first=3672 max=632 min=273 avg=288 std=200 std^2=40389
+	      last=281 first=3672 max=632 min=273 avg=287 std=183 std^2=33666
+
+
 config RING_BUFFER_BENCHMARK
 	tristate "Ring buffer benchmark stress tester"
 	depends on RING_BUFFER
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 1378e84fbe39..2611613f14f1 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -17,6 +17,7 @@ ifdef CONFIG_TRACING_BRANCHES
 KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
 endif
 
+CFLAGS_trace_benchmark.o := -I$(src)
 CFLAGS_trace_events_filter.o := -I$(src)
 
 obj-$(CONFIG_TRACE_CLOCK) += trace_clock.o
@@ -62,4 +63,6 @@ endif
 obj-$(CONFIG_PROBE_EVENTS) += trace_probe.o
 obj-$(CONFIG_UPROBE_EVENT) += trace_uprobe.o
 
+obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o
+
 libftrace-y := ftrace.o
diff --git a/kernel/trace/trace_benchmark.c b/kernel/trace/trace_benchmark.c
new file mode 100644
index 000000000000..7dc1c42dfee2
--- /dev/null
+++ b/kernel/trace/trace_benchmark.c
@@ -0,0 +1,176 @@
+#include <linux/delay.h>
+#include <linux/module.h>
+#include <linux/kthread.h>
+#include <linux/trace_clock.h>
+
+#define CREATE_TRACE_POINTS
+#include "trace_benchmark.h"
+
+static struct task_struct *bm_event_thread;
+
+static char bm_str[BENCHMARK_EVENT_STRLEN] = "START";
+
+static u64 bm_total;
+static u64 bm_totalsq;
+static u64 bm_last;
+static u64 bm_max;
+static u64 bm_min;
+static u64 bm_first;
+static s64 bm_cnt;
+
+/*
+ * This gets called in a loop recording the time it took to write
+ * the tracepoint. What it writes is the time statistics of the last
+ * tracepoint write. As there is nothing to write the first time
+ * it simply writes "START". As the first write is cold cache and
+ * the rest is hot, we save off that time in bm_first and it is
+ * reported as "first", which is shown in the second write to the
+ * tracepoint. The "first" field is writen within the statics from
+ * then on but never changes.
+ */
+static void trace_do_benchmark(void)
+{
+	u64 start;
+	u64 stop;
+	u64 delta;
+	s64 stddev;
+	u64 seed;
+	u64 seedsq;
+	u64 last_seed;
+	unsigned int avg;
+	unsigned int std = 0;
+
+	/* Only run if the tracepoint is actually active */
+	if (!trace_benchmark_event_enabled())
+		return;
+
+	local_irq_disable();
+	start = trace_clock_local();
+	trace_benchmark_event(bm_str);
+	stop = trace_clock_local();
+	local_irq_enable();
+
+	bm_cnt++;
+
+	delta = stop - start;
+
+	/*
+	 * The first read is cold cached, keep it separate from the
+	 * other calculations.
+	 */
+	if (bm_cnt == 1) {
+		bm_first = delta;
+		scnprintf(bm_str, BENCHMARK_EVENT_STRLEN,
+			  "first=%llu [COLD CACHED]", bm_first);
+		return;
+	}
+
+	bm_last = delta;
+
+	bm_total += delta;
+	bm_totalsq += delta * delta;
+
+	if (delta > bm_max)
+		bm_max = delta;
+	if (!bm_min || delta < bm_min)
+		bm_min = delta;
+
+	if (bm_cnt > 1) {
+		/*
+		 * Apply Welford's method to calculate standard deviation:
+		 * s^2 = 1 / (n * (n-1)) * (n * \Sum (x_i)^2 - (\Sum x_i)^2)
+		 */
+		stddev = (u64)bm_cnt * bm_totalsq - bm_total * bm_total;
+		do_div(stddev, bm_cnt);
+		do_div(stddev, bm_cnt - 1);
+	} else
+		stddev = 0;
+
+	delta = bm_total;
+	do_div(delta, bm_cnt);
+	avg = delta;
+
+	if (stddev > 0) {
+		int i = 0;
+		/*
+		 * stddev is the square of standard deviation but
+		 * we want the actualy number. Use the average
+		 * as our seed to find the std.
+		 *
+		 * The next try is:
+		 *  x = (x + N/x) / 2
+		 *
+		 * Where N is the squared number to find the square
+		 * root of.
+		 */
+		seed = avg;
+		do {
+			last_seed = seed;
+			seed = stddev;
+			if (!last_seed)
+				break;
+			do_div(seed, last_seed);
+			seed += last_seed;
+			do_div(seed, 2);
+		} while (i++ < 10 && last_seed != seed);
+
+		std = seed;
+	}
+
+	scnprintf(bm_str, BENCHMARK_EVENT_STRLEN,
+		  "last=%llu first=%llu max=%llu min=%llu avg=%u std=%d std^2=%lld",
+		  bm_last, bm_first, bm_max, bm_min, avg, std, stddev);
+}
+
+static int benchmark_event_kthread(void *arg)
+{
+	/* sleep a bit to make sure the tracepoint gets activated */
+	msleep(100);
+
+	while (!kthread_should_stop()) {
+
+		trace_do_benchmark();
+
+		/*
+		 * We don't go to sleep, but let others
+		 * run as well.
+		 */
+		cond_resched();
+	}
+
+	return 0;
+}
+
+/*
+ * When the benchmark tracepoint is enabled, it calls this
+ * function and the thread that calls the tracepoint is created.
+ */
+void trace_benchmark_reg(void)
+{
+	bm_event_thread = kthread_run(benchmark_event_kthread,
+				      NULL, "event_benchmark");
+	WARN_ON(!bm_event_thread);
+}
+
+/*
+ * When the benchmark tracepoint is disabled, it calls this
+ * function and the thread that calls the tracepoint is deleted
+ * and all the numbers are reset.
+ */
+void trace_benchmark_unreg(void)
+{
+	if (!bm_event_thread)
+		return;
+
+	kthread_stop(bm_event_thread);
+
+	strcpy(bm_str, "START");
+	bm_total = 0;
+	bm_totalsq = 0;
+	bm_last = 0;
+	bm_max = 0;
+	bm_min = 0;
+	bm_cnt = 0;
+	/* bm_first doesn't need to be reset but reset it anyway */
+	bm_first = 0;
+}
diff --git a/kernel/trace/trace_benchmark.h b/kernel/trace/trace_benchmark.h
new file mode 100644
index 000000000000..3c1df1df4e29
--- /dev/null
+++ b/kernel/trace/trace_benchmark.h
@@ -0,0 +1,41 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM benchmark
+
+#if !defined(_TRACE_BENCHMARK_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_BENCHMARK_H
+
+#include <linux/tracepoint.h>
+
+extern void trace_benchmark_reg(void);
+extern void trace_benchmark_unreg(void);
+
+#define BENCHMARK_EVENT_STRLEN		128
+
+TRACE_EVENT_FN(benchmark_event,
+
+	TP_PROTO(const char *str),
+
+	TP_ARGS(str),
+
+	TP_STRUCT__entry(
+		__array(	char,	str,	BENCHMARK_EVENT_STRLEN	)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->str, str, BENCHMARK_EVENT_STRLEN);
+	),
+
+	TP_printk("%s", __entry->str),
+
+	trace_benchmark_reg, trace_benchmark_unreg
+);
+
+#endif /* _TRACE_BENCHMARK_H */
+
+#undef TRACE_INCLUDE_FILE
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace_benchmark
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
-- 
2.0.0.rc2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 5/7] tracing: Have saved_cmdlines use the seq_read infrastructure
  2014-05-31  1:38 [for-next][PATCH 0/7] tracing: More changes queued up for 3.16 Steven Rostedt
                   ` (3 preceding siblings ...)
  2014-05-31  1:39 ` [for-next][PATCH 4/7] tracing: Add tracepoint benchmark tracepoint Steven Rostedt
@ 2014-05-31  1:39 ` Steven Rostedt
  2014-05-31  1:39 ` [for-next][PATCH 6/7] tracing: Try again for saved cmdline if failed due to locking Steven Rostedt
  2014-05-31  1:39 ` [for-next][PATCH 7/7] tracing: Move locking of trace_cmdline_lock into start/stop seq calls Steven Rostedt
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2014-05-31  1:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Hidehiro Kawai, Namhyung Kim,
	Masami Hiramatsu, Yoshihiro YUNOMAE, Frederic Weisbecker,
	Ingo Molnar

[-- Attachment #1: 0004-tracing-Have-saved_cmdlines-use-the-seq_read-infrast.patch --]
[-- Type: text/plain, Size: 3816 bytes --]

From: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>

Current tracing_saved_cmdlines_read() implementation is naive; It allocates
a large buffer, constructs output data to that buffer for each read
operation, and then copies a portion of the buffer to the user space
buffer. This has several issues such as slow memory allocation, high
CPU usage, and even corruption of the output data.

The seq_read infrastructure is made to handle this type of work.
By converting it to use seq_read() the code becomes smaller, simplified,
as well as correct.

Link: http://lkml.kernel.org/p/20140220084431.3839.51793.stgit@yunodevel

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 89 +++++++++++++++++++++++++++++++---------------------
 1 file changed, 54 insertions(+), 35 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index eb228b9de170..1855dae73f34 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3699,55 +3699,74 @@ static const struct file_operations tracing_readme_fops = {
 	.llseek		= generic_file_llseek,
 };
 
-static ssize_t
-tracing_saved_cmdlines_read(struct file *file, char __user *ubuf,
-				size_t cnt, loff_t *ppos)
+static void *saved_cmdlines_next(struct seq_file *m, void *v, loff_t *pos)
 {
-	char *buf_comm;
-	char *file_buf;
-	char *buf;
-	int len = 0;
-	int pid;
-	int i;
+	unsigned int *ptr = v;
 
-	file_buf = kmalloc(SAVED_CMDLINES*(16+TASK_COMM_LEN), GFP_KERNEL);
-	if (!file_buf)
-		return -ENOMEM;
+	if (*pos || m->count)
+		ptr++;
 
-	buf_comm = kmalloc(TASK_COMM_LEN, GFP_KERNEL);
-	if (!buf_comm) {
-		kfree(file_buf);
-		return -ENOMEM;
-	}
+	(*pos)++;
+
+	for (; ptr < &map_cmdline_to_pid[SAVED_CMDLINES]; ptr++) {
+		if (*ptr == -1 || *ptr == NO_CMDLINE_MAP)
+			continue;
 
-	buf = file_buf;
+		return ptr;
+	}
 
-	for (i = 0; i < SAVED_CMDLINES; i++) {
-		int r;
+	return NULL;
+}
 
-		pid = map_cmdline_to_pid[i];
-		if (pid == -1 || pid == NO_CMDLINE_MAP)
-			continue;
+static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
+{
+	void *v;
+	loff_t l = 0;
 
-		trace_find_cmdline(pid, buf_comm);
-		r = sprintf(buf, "%d %s\n", pid, buf_comm);
-		buf += r;
-		len += r;
+	v = &map_cmdline_to_pid[0];
+	while (l <= *pos) {
+		v = saved_cmdlines_next(m, v, &l);
+		if (!v)
+			return NULL;
 	}
 
-	len = simple_read_from_buffer(ubuf, cnt, ppos,
-				      file_buf, len);
+	return v;
+}
 
-	kfree(file_buf);
-	kfree(buf_comm);
+static void saved_cmdlines_stop(struct seq_file *m, void *v)
+{
+}
 
-	return len;
+static int saved_cmdlines_show(struct seq_file *m, void *v)
+{
+	char buf[TASK_COMM_LEN];
+	unsigned int *pid = v;
+
+	trace_find_cmdline(*pid, buf);
+	seq_printf(m, "%d %s\n", *pid, buf);
+	return 0;
+}
+
+static const struct seq_operations tracing_saved_cmdlines_seq_ops = {
+	.start		= saved_cmdlines_start,
+	.next		= saved_cmdlines_next,
+	.stop		= saved_cmdlines_stop,
+	.show		= saved_cmdlines_show,
+};
+
+static int tracing_saved_cmdlines_open(struct inode *inode, struct file *filp)
+{
+	if (tracing_disabled)
+		return -ENODEV;
+
+	return seq_open(filp, &tracing_saved_cmdlines_seq_ops);
 }
 
 static const struct file_operations tracing_saved_cmdlines_fops = {
-    .open       = tracing_open_generic,
-    .read       = tracing_saved_cmdlines_read,
-    .llseek	= generic_file_llseek,
+	.open		= tracing_saved_cmdlines_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release,
 };
 
 static ssize_t
-- 
2.0.0.rc2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 6/7] tracing: Try again for saved cmdline if failed due to locking
  2014-05-31  1:38 [for-next][PATCH 0/7] tracing: More changes queued up for 3.16 Steven Rostedt
                   ` (4 preceding siblings ...)
  2014-05-31  1:39 ` [for-next][PATCH 5/7] tracing: Have saved_cmdlines use the seq_read infrastructure Steven Rostedt
@ 2014-05-31  1:39 ` Steven Rostedt
  2014-05-31  1:39 ` [for-next][PATCH 7/7] tracing: Move locking of trace_cmdline_lock into start/stop seq calls Steven Rostedt
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2014-05-31  1:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

[-- Attachment #1: 0005-tracing-Try-again-for-saved-cmdline-if-failed-due-to.patch --]
[-- Type: text/plain, Size: 2311 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

In order to prevent the saved cmdline cache from being filled when
tracing is not active, the comms are only recorded after a trace event
is recorded.

The problem is, a comm can fail to be recorded if the trace_cmdline_lock
is held. That lock is taken via a trylock to allow it to happen from
any context (including NMI). If the lock fails to be taken, the comm
is skipped. No big deal, as we will try again later.

But! Because of the code that was added to only record after an event,
we may not try again later as the recording is made as a oneshot per
event per CPU.

Only disable the recording of the comm if the comm is actually recorded.

Fixes: 7ffbd48d5cab "tracing: Cache comms only after an event occurred"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 1855dae73f34..22a902e2ded9 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1441,12 +1441,12 @@ static void tracing_stop_tr(struct trace_array *tr)
 
 void trace_stop_cmdline_recording(void);
 
-static void trace_save_cmdline(struct task_struct *tsk)
+static int trace_save_cmdline(struct task_struct *tsk)
 {
 	unsigned pid, idx;
 
 	if (!tsk->pid || unlikely(tsk->pid > PID_MAX_DEFAULT))
-		return;
+		return 0;
 
 	/*
 	 * It's not the end of the world if we don't get
@@ -1455,7 +1455,7 @@ static void trace_save_cmdline(struct task_struct *tsk)
 	 * so if we miss here, then better luck next time.
 	 */
 	if (!arch_spin_trylock(&trace_cmdline_lock))
-		return;
+		return 0;
 
 	idx = map_pid_to_cmdline[tsk->pid];
 	if (idx == NO_CMDLINE_MAP) {
@@ -1480,6 +1480,8 @@ static void trace_save_cmdline(struct task_struct *tsk)
 	memcpy(&saved_cmdlines[idx], tsk->comm, TASK_COMM_LEN);
 
 	arch_spin_unlock(&trace_cmdline_lock);
+
+	return 1;
 }
 
 void trace_find_cmdline(int pid, char comm[])
@@ -1521,9 +1523,8 @@ void tracing_record_cmdline(struct task_struct *tsk)
 	if (!__this_cpu_read(trace_cmdline_save))
 		return;
 
-	__this_cpu_write(trace_cmdline_save, false);
-
-	trace_save_cmdline(tsk);
+	if (trace_save_cmdline(tsk))
+		__this_cpu_write(trace_cmdline_save, false);
 }
 
 void
-- 
2.0.0.rc2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next][PATCH 7/7] tracing: Move locking of trace_cmdline_lock into start/stop seq calls
  2014-05-31  1:38 [for-next][PATCH 0/7] tracing: More changes queued up for 3.16 Steven Rostedt
                   ` (5 preceding siblings ...)
  2014-05-31  1:39 ` [for-next][PATCH 6/7] tracing: Try again for saved cmdline if failed due to locking Steven Rostedt
@ 2014-05-31  1:39 ` Steven Rostedt
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2014-05-31  1:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

[-- Attachment #1: 0006-tracing-Move-locking-of-trace_cmdline_lock-into-star.patch --]
[-- Type: text/plain, Size: 2332 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

With the conversion of the saved_cmdlines output to use seq_read, there
is now a race between accessing the values of the saved_cmdlines and
the writing to them. The trace_cmdline_lock needs to be taken at
the start and stop of the seq calls.

A new __trace_find_cmdline() call is created to allow for the look up
to happen without taking the lock.

Fixes: 42584c81c5ad tracing: Have saved_cmdlines use the seq_read infrastructure
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 22a902e2ded9..626dbfde5d56 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1484,7 +1484,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
 	return 1;
 }
 
-void trace_find_cmdline(int pid, char comm[])
+static void __trace_find_cmdline(int pid, char comm[])
 {
 	unsigned map;
 
@@ -1503,13 +1503,19 @@ void trace_find_cmdline(int pid, char comm[])
 		return;
 	}
 
-	preempt_disable();
-	arch_spin_lock(&trace_cmdline_lock);
 	map = map_pid_to_cmdline[pid];
 	if (map != NO_CMDLINE_MAP)
 		strcpy(comm, saved_cmdlines[map]);
 	else
 		strcpy(comm, "<...>");
+}
+
+void trace_find_cmdline(int pid, char comm[])
+{
+	preempt_disable();
+	arch_spin_lock(&trace_cmdline_lock);
+
+	__trace_find_cmdline(pid, comm);
 
 	arch_spin_unlock(&trace_cmdline_lock);
 	preempt_enable();
@@ -3724,6 +3730,9 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
 	void *v;
 	loff_t l = 0;
 
+	preempt_disable();
+	arch_spin_lock(&trace_cmdline_lock);
+
 	v = &map_cmdline_to_pid[0];
 	while (l <= *pos) {
 		v = saved_cmdlines_next(m, v, &l);
@@ -3736,6 +3745,8 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
 
 static void saved_cmdlines_stop(struct seq_file *m, void *v)
 {
+	arch_spin_unlock(&trace_cmdline_lock);
+	preempt_enable();
 }
 
 static int saved_cmdlines_show(struct seq_file *m, void *v)
@@ -3743,7 +3754,7 @@ static int saved_cmdlines_show(struct seq_file *m, void *v)
 	char buf[TASK_COMM_LEN];
 	unsigned int *pid = v;
 
-	trace_find_cmdline(*pid, buf);
+	__trace_find_cmdline(*pid, buf);
 	seq_printf(m, "%d %s\n", *pid, buf);
 	return 0;
 }
-- 
2.0.0.rc2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-05-31  1:41 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-31  1:38 [for-next][PATCH 0/7] tracing: More changes queued up for 3.16 Steven Rostedt
2014-05-31  1:38 ` [for-next][PATCH 1/7] tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines Steven Rostedt
2014-05-31  1:38 ` [for-next][PATCH 2/7] tracing: Add funcgraph_tail option to print function name after closing braces Steven Rostedt
2014-05-31  1:39 ` [for-next][PATCH 3/7] tracing: Print nasty banner when trace_printk() is in use Steven Rostedt
2014-05-31  1:39 ` [for-next][PATCH 4/7] tracing: Add tracepoint benchmark tracepoint Steven Rostedt
2014-05-31  1:39 ` [for-next][PATCH 5/7] tracing: Have saved_cmdlines use the seq_read infrastructure Steven Rostedt
2014-05-31  1:39 ` [for-next][PATCH 6/7] tracing: Try again for saved cmdline if failed due to locking Steven Rostedt
2014-05-31  1:39 ` [for-next][PATCH 7/7] tracing: Move locking of trace_cmdline_lock into start/stop seq calls Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).