linux-kernel.vger.kernel.org archive mirror
* [Patch 00/15] Reduce tracing payload size.
@ 2010-12-04  0:13 David Sharp
  2010-12-04  0:13 ` [PATCH 01/15] tracing: Add a 'buffer_overwrite' debugfs file David Sharp
                   ` (15 more replies)
  0 siblings, 16 replies; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin

This is a series of patches for tracing aiming to reduce the size of events in
the trace buffer. For some tracing tasks, the machine is so heavily loaded that
that events are produced much faster than they can be written to disk or
network. In this situation, the sizes of the trace events and the ring buffer
are the primary factors limiting the length of time that can be traced.  Even
with ktrace (which has significantly smaller event sizes), a fairly minimal set
of enabled events, and a 16MB buffer per cpu, we are finding we can only get
about 20 seconds of tracing. This is not enough when the problem being
diagnosed only occurs at greater timescales and cannot be easily detected at
runtime.

These patches are just a start to shrinking the size of trace events. I am
planning to also make small versions of trace events that are used when
CONFIG_SMALL_TRACES is enabled. I'm also open to discussion of other ways to
shrink event sizes.


The patches can be divided into three major groups.

The first group is self-explanatory patches, not actually related to shrinking
events:

	tracing: Add a 'buffer_overwrite' debugfs file
	ring_buffer.c: Remove unused #include <linux/trace_irq.h>
	ring_buffer: Align buffer_page struct allocations only to fit the flags.

The second group is patches to adjust packing and alignment of tracing event
structures:

	ftrace: pack event structures.
	ftrace: fix event alignment: ftrace:context_switch and ftrace:wakeup
	ftrace: fix event alignment: module:module_request
	ftrace: fix event alignment: kvm:kvm_hv_hypercall
	ftrace: fix event alignment: mce:mce_record
	ftrace: fix event alignment: skb:kfree_skb
	ftrace: fix event alignment: jbd2:*
	ftrace: fix event alignment: ext4:*

The third group are patches to remove 8 bytes from struct trace_entry:

	trace_output.c: adjust conditional expression formatting.
	small_traces: Add config option to shrink trace events.
	small_traces: Remove trace output of large fields.
	small_traces: Remove 8 bytes from trace_entry.


Thanks,
d#

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 01/15] tracing: Add a 'buffer_overwrite' debugfs file
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2010-12-04  1:57   ` Steven Rostedt
  2010-12-04  0:13 ` [PATCH 02/15] ring_buffer.c: Remove unused #include <linux/trace_irq.h> David Sharp
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, Jiaying Zhang, David Sharp

From: Jiaying Zhang <jiayingz@google.com>

Add a "buffer_overwrite" debugfs file for ftrace to control whether the buffer
is overwritten on overflow. The default remains to overwrite the oldest events
when the buffer is full. This patch adds the option to instead discard the
newest events when the buffer is full. This is useful for taking a snapshot of
the trace just after tracing is enabled. Dropping the current event is also a
simpler code path.

Signed-off-by: David Sharp <dhsharp@google.com>
---
 Documentation/trace/ftrace.txt |    6 ++++
 include/linux/ring_buffer.h    |    2 +
 kernel/trace/ring_buffer.c     |   11 +++++++
 kernel/trace/trace.c           |   59 +++++++++++++++++++++++++++++++++++++---
 4 files changed, 74 insertions(+), 4 deletions(-)

diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
index 557c1ed..9237da3 100644
--- a/Documentation/trace/ftrace.txt
+++ b/Documentation/trace/ftrace.txt
@@ -138,6 +138,12 @@ of ftrace. Here is a list of some of the key files:
 	This can only be updated when the current_tracer
 	is set to "nop".
 
+  buffer_overwrite:
+
+	This controls what happens when the trace buffer is full.
+	If "1" (default), the oldest events are discarded and
+	overwritten. If "0", then the newest events are discarded.
+
   tracing_cpumask:
 
 	This is a mask that lets the user only trace
diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 8d3a248..ab38ac8 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -100,6 +100,8 @@ void ring_buffer_free(struct ring_buffer *buffer);
 
 int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size);
 
+void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val);
+
 struct ring_buffer_event *ring_buffer_lock_reserve(struct ring_buffer *buffer,
 						   unsigned long length);
 int ring_buffer_unlock_commit(struct ring_buffer *buffer,
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 9ed509a..3207147 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1429,6 +1429,17 @@ int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size)
 }
 EXPORT_SYMBOL_GPL(ring_buffer_resize);
 
+void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val)
+{
+	mutex_lock(&buffer->mutex);
+	if (val)
+		buffer->flags |= RB_FL_OVERWRITE;
+	else
+		buffer->flags &= ~RB_FL_OVERWRITE;
+	mutex_unlock(&buffer->mutex);
+}
+EXPORT_SYMBOL_GPL(ring_buffer_change_overwrite);
+
 static inline void *
 __rb_data_page_index(struct buffer_data_page *bpage, unsigned index)
 {
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index c380612..ed5c14f 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -41,8 +41,6 @@
 #include "trace.h"
 #include "trace_output.h"
 
-#define TRACE_BUFFER_FLAGS	(RB_FL_OVERWRITE)
-
 /*
  * On boot up, the ring buffer is set to the minimum size, so that
  * we do not waste memory on systems that are not using tracing.
@@ -241,6 +239,9 @@ int tracing_is_enabled(void)
 
 static unsigned long		trace_buf_size = TRACE_BUF_SIZE_DEFAULT;
 
+/* whether the trace buffer should be overwritten on overflow or not. */
+static enum ring_buffer_flags  trace_buffer_flags = RB_FL_OVERWRITE;
+
 /* trace_types holds a link list of available tracers. */
 static struct tracer		*trace_types __read_mostly;
 
@@ -3466,6 +3467,47 @@ tracing_entries_write(struct file *filp, const char __user *ubuf,
 	return cnt;
 }
 
+static ssize_t
+tracing_overwrite_read(struct file *filp, char __user *ubuf,
+		     size_t cnt, loff_t *ppos)
+{
+	char buf[64];
+	int r;
+	r = snprintf(buf, 64, "%u\n", trace_buffer_flags);
+	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
+}
+
+static ssize_t
+tracing_overwrite_write(struct file *filp, const char __user *ubuf,
+		      size_t cnt, loff_t *ppos)
+{
+	unsigned long val;
+	char buf[64];
+	int ret;
+
+	if (cnt >= sizeof(buf))
+		return -EINVAL;
+
+	if (copy_from_user(&buf, ubuf, cnt))
+		return -EFAULT;
+
+	buf[cnt] = 0;
+
+	ret = strict_strtoul(buf, 10, &val);
+	if (ret < 0)
+		return ret;
+	*ppos += cnt;
+
+	if (val != 0 && val != 1)
+		return -EINVAL;
+
+	if (trace_buffer_flags != val) {
+		trace_buffer_flags = val;
+		ring_buffer_change_overwrite(global_trace.buffer, val);
+	}
+	return cnt;
+}
+
 static int mark_printk(const char *fmt, ...)
 {
 	int ret;
@@ -3611,6 +3653,12 @@ static const struct file_operations tracing_entries_fops = {
 	.llseek		= generic_file_llseek,
 };
 
+static const struct file_operations tracing_overwrite_fops = {
+	.open		= tracing_open_generic,
+	.read		= tracing_overwrite_read,
+	.write		= tracing_overwrite_write,
+};
+
 static const struct file_operations tracing_mark_fops = {
 	.open		= tracing_open_generic,
 	.write		= tracing_mark_write,
@@ -4336,6 +4384,9 @@ static __init int tracer_init_debugfs(void)
 	trace_create_file("buffer_size_kb", 0644, d_tracer,
 			&global_trace, &tracing_entries_fops);
 
+	trace_create_file("buffer_overwrite", 0644, d_tracer,
+			&global_trace, &tracing_overwrite_fops);
+
 	trace_create_file("trace_marker", 0220, d_tracer,
 			NULL, &tracing_mark_fops);
 
@@ -4565,7 +4616,7 @@ __init static int tracer_alloc_buffers(void)
 
 	/* TODO: make the number of buffers hot pluggable with CPUS */
 	global_trace.buffer = ring_buffer_alloc(ring_buf_size,
-						   TRACE_BUFFER_FLAGS);
+						   trace_buffer_flags);
 	if (!global_trace.buffer) {
 		printk(KERN_ERR "tracer: failed to allocate ring buffer!\n");
 		WARN_ON(1);
@@ -4575,7 +4626,7 @@ __init static int tracer_alloc_buffers(void)
 
 
 #ifdef CONFIG_TRACER_MAX_TRACE
-	max_tr.buffer = ring_buffer_alloc(1, TRACE_BUFFER_FLAGS);
+	max_tr.buffer = ring_buffer_alloc(1, trace_buffer_flags);
 	if (!max_tr.buffer) {
 		printk(KERN_ERR "tracer: failed to allocate max ring buffer!\n");
 		WARN_ON(1);
-- 
1.7.3.1



* [PATCH 02/15] ring_buffer.c: Remove unused #include <linux/trace_irq.h>
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
  2010-12-04  0:13 ` [PATCH 01/15] tracing: Add a 'buffer_overwrite' debugfs file David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2011-03-11  9:46   ` [tip:perf/core] ring-buffer: " tip-bot for David Sharp
  2010-12-04  0:13 ` [PATCH 03/15] ring_buffer: Align buffer_page struct allocations only to fit the flags David Sharp
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp


Signed-off-by: David Sharp <dhsharp@google.com>
---
 kernel/trace/ring_buffer.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 3207147..8ef7cc4 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -5,7 +5,6 @@
  */
 #include <linux/ring_buffer.h>
 #include <linux/trace_clock.h>
-#include <linux/ftrace_irq.h>
 #include <linux/spinlock.h>
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
-- 
1.7.3.1



* [PATCH 03/15] ring_buffer: Align buffer_page struct allocations only to fit the flags.
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
  2010-12-04  0:13 ` [PATCH 01/15] tracing: Add a 'buffer_overwrite' debugfs file David Sharp
  2010-12-04  0:13 ` [PATCH 02/15] ring_buffer.c: Remove unused #include <linux/trace_irq.h> David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2010-12-04  1:43   ` Steven Rostedt
  2010-12-04  0:13 ` [PATCH 04/15] ftrace: pack event structures David Sharp
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp

buffer_page structs need to be aligned to 4 byte boundaries because the page
flags are stored in the two least-significant bits of the pointers in the page
list. Aligning to cache lines satisfies this requirement, but is stricter than
necessary. Reducing the alignment to only 4 bytes may improve cache efficiency.

Testing with Autotest's tracing_microbenchmark, there was no significant change
in overhead with this change.

Signed-off-by: David Sharp <dhsharp@google.com>
---
 kernel/trace/ring_buffer.c |   20 ++++++++++++--------
 1 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 8ef7cc4..957a8b8 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -644,8 +644,14 @@ EXPORT_SYMBOL_GPL(ring_buffer_normalize_time_stamp);
 #define RB_PAGE_HEAD		1UL
 #define RB_PAGE_UPDATE		2UL
 
-
 #define RB_FLAG_MASK		3UL
+#define RB_PAGE_ALIGNMENT	(RB_FLAG_MASK+1)
+
+/* Ensure alignment of struct buffer_page */
+static __attribute__((unused)) void check_buffer_page_alignment(void)
+{
+        BUILD_BUG_ON(__alignof__(struct buffer_page) % RB_PAGE_ALIGNMENT != 0);
+}
 
 /* PAGE_MOVED is not part of the mask */
 #define RB_PAGE_MOVED		4UL
@@ -1004,8 +1010,8 @@ static int rb_allocate_pages(struct ring_buffer_per_cpu *cpu_buffer,
 	WARN_ON(!nr_pages);
 
 	for (i = 0; i < nr_pages; i++) {
-		bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()),
-				    GFP_KERNEL, cpu_to_node(cpu_buffer->cpu));
+		bpage = kzalloc_node(sizeof(*bpage), GFP_KERNEL,
+				     cpu_to_node(cpu_buffer->cpu));
 		if (!bpage)
 			goto free_pages;
 
@@ -1059,8 +1065,7 @@ rb_allocate_cpu_buffer(struct ring_buffer *buffer, int cpu)
 	lockdep_set_class(&cpu_buffer->reader_lock, buffer->reader_lock_key);
 	cpu_buffer->lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 
-	bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()),
-			    GFP_KERNEL, cpu_to_node(cpu));
+	bpage = kzalloc_node(sizeof(*bpage), GFP_KERNEL, cpu_to_node(cpu));
 	if (!bpage)
 		goto fail_free_buffer;
 
@@ -1375,9 +1380,8 @@ int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size)
 
 	for_each_buffer_cpu(buffer, cpu) {
 		for (i = 0; i < new_pages; i++) {
-			bpage = kzalloc_node(ALIGN(sizeof(*bpage),
-						  cache_line_size()),
-					    GFP_KERNEL, cpu_to_node(cpu));
+			bpage = kzalloc_node(sizeof(*bpage), GFP_KERNEL,
+					     cpu_to_node(cpu));
 			if (!bpage)
 				goto free_pages;
 			list_add(&bpage->list, &pages);
-- 
1.7.3.1



* [PATCH 04/15] ftrace: pack event structures.
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (2 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 03/15] ring_buffer: Align buffer_page struct allocations only to fit the flags David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2011-03-08 23:30   ` Steven Rostedt
  2011-03-09 13:01   ` Steven Rostedt
  2010-12-04  0:13 ` [PATCH 05/15] ftrace: fix event alignment: ftrace:context_switch and ftrace:wakeup David Sharp
                   ` (11 subsequent siblings)
  15 siblings, 2 replies; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp

Ftrace event structures have a 12-byte struct trace_entry at the beginning.
If the structure is aligned, this means that if the first field is 64-bits,
there will be 4 bytes of padding. Ironically, due to the 4-byte ringbuffer
header, this will make 64-bit writes unaligned, if the ring buffer position
is currently 64-bit aligned:
4(rb)+12(ftrace)+4(pad) = 20;  20%8 = 4

Adding __attribute__((packed)) to the event structures removes the extra
space from the trace events, and actually improves alignment of trace
events with a first field that is 64-bits.

About 65 tracepoints have a 4-byte pad at offset 12:
 # find events -name format | xargs -n1 awk '
   $1=="name:" {name=$2}
   $1=="format:"{FS="\t"}
   $3=="offset:12;" && $4=="size:4;"{okay=1}
   $3=="offset:16;" && !okay {print name}' | wc -l
65

With all 'syscalls' and 'timer' events enabled, this results in a 5%
improvement in a simple 512MB read benchmark with warm caches.

Tested:

setup:
 # echo 1 >events/syscalls/enable
 # echo 1 >events/timer/enable
 # echo 0 > tracing_enabled
off:
 # for n in $(seq 10) ; do \
   time dd if=/dev/hda3 of=/dev/null bs=1K count=512K ; \
   done
on:
 # for n in $(seq 10) ; do \
   echo 1 >tracing_enabled; \
   time dd if=/dev/hda3 of=/dev/null bs=1K count=512K ; \
   echo 0 >tracing_enabled; \
   echo > trace; \
   done

real time      mean/median/stdev
w/o patch off: 1.1679/1.164/0.0169
w/o patch on : 1.9432/1.936/0.0274
w/  patch off: 1.1715/1.159/0.0431
w/  patch on : 1.8425/1.836/0.0138
"on" delta: -0.1007 --> -5.2%

Google-Bug-Id: 2895627

Signed-off-by: David Sharp <dhsharp@google.com>
---
 include/trace/ftrace.h |    5 +++--
 kernel/trace/trace.h   |    2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index a9377c0..51d1f52 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -48,7 +48,8 @@
 #define __array(type, item, len)	type	item[len];
 
 #undef __dynamic_array
-#define __dynamic_array(type, item, len) u32 __data_loc_##item;
+#define __dynamic_array(type, item, len) \
+    u32 __data_loc_##item __attribute__((aligned(4)));
 
 #undef __string
 #define __string(item, src) __dynamic_array(char, item, -1)
@@ -62,7 +63,7 @@
 		struct trace_entry	ent;				\
 		tstruct							\
 		char			__data[0];			\
-	};								\
+	} __attribute__((packed));					\
 									\
 	static struct ftrace_event_class event_class_##name;
 
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 9021f8c..2e80433 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -60,7 +60,7 @@ enum trace_type {
 	struct struct_name {					\
 		struct trace_entry	ent;			\
 		tstruct						\
-	}
+	} __attribute__((packed))
 
 #undef TP_ARGS
 #define TP_ARGS(args...)	args
-- 
1.7.3.1



* [PATCH 05/15] ftrace: fix event alignment: ftrace:context_switch and ftrace:wakeup
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (3 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 04/15] ftrace: pack event structures David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2011-03-11  9:48   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
  2010-12-04  0:13 ` [PATCH 06/15] ftrace: fix event alignment: module:module_request David Sharp
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp


Signed-off-by: David Sharp <dhsharp@google.com>
---
 kernel/trace/trace_entries.h |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_entries.h b/kernel/trace/trace_entries.h
index e3dfeca..81ab9c3 100644
--- a/kernel/trace/trace_entries.h
+++ b/kernel/trace/trace_entries.h
@@ -109,12 +109,12 @@ FTRACE_ENTRY(funcgraph_exit, ftrace_graph_ret_entry,
  */
 #define FTRACE_CTX_FIELDS					\
 	__field(	unsigned int,	prev_pid	)	\
+	__field(	unsigned int,	next_pid	)	\
+	__field(	unsigned int,	next_cpu	)       \
 	__field(	unsigned char,	prev_prio	)	\
 	__field(	unsigned char,	prev_state	)	\
-	__field(	unsigned int,	next_pid	)	\
 	__field(	unsigned char,	next_prio	)	\
-	__field(	unsigned char,	next_state	)	\
-	__field(	unsigned int,	next_cpu	)
+	__field(	unsigned char,	next_state	)
 
 FTRACE_ENTRY(context_switch, ctx_switch_entry,
 
-- 
1.7.3.1



* [PATCH 06/15] ftrace: fix event alignment: module:module_request
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (4 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 05/15] ftrace: fix event alignment: ftrace:context_switch and ftrace:wakeup David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2010-12-04  1:47   ` Steven Rostedt
  2011-03-11  9:48   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
  2010-12-04  0:13 ` [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall David Sharp
                   ` (9 subsequent siblings)
  15 siblings, 2 replies; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp


Signed-off-by: David Sharp <dhsharp@google.com>
---
 include/trace/events/module.h |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/trace/events/module.h b/include/trace/events/module.h
index c7bb2f0..8154f5f 100644
--- a/include/trace/events/module.h
+++ b/include/trace/events/module.h
@@ -98,14 +98,14 @@ TRACE_EVENT(module_request,
 	TP_ARGS(name, wait, ip),
 
 	TP_STRUCT__entry(
-		__field(	bool,		wait		)
 		__field(	unsigned long,	ip		)
+		__field(	bool,		wait		)
 		__string(	name,		name		)
 	),
 
 	TP_fast_assign(
-		__entry->wait	= wait;
 		__entry->ip	= ip;
+		__entry->wait	= wait;
 		__assign_str(name, name);
 	),
 
@@ -119,4 +119,3 @@ TRACE_EVENT(module_request,
 
 /* This part must be outside protection */
 #include <trace/define_trace.h>
-
-- 
1.7.3.1



* [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (5 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 06/15] ftrace: fix event alignment: module:module_request David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2010-12-04  1:49   ` Steven Rostedt
                     ` (2 more replies)
  2010-12-04  0:13 ` [PATCH 08/15] ftrace: fix event alignment: mce:mce_record David Sharp
                   ` (8 subsequent siblings)
  15 siblings, 3 replies; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp


Signed-off-by: David Sharp <dhsharp@google.com>
---
 arch/x86/kvm/trace.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index a6544b8..ab41fb0 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -62,21 +62,21 @@ TRACE_EVENT(kvm_hv_hypercall,
 	TP_ARGS(code, fast, rep_cnt, rep_idx, ingpa, outgpa),
 
 	TP_STRUCT__entry(
-		__field(	__u16, 		code		)
-		__field(	bool,		fast		)
 		__field(	__u16,		rep_cnt		)
 		__field(	__u16,		rep_idx		)
 		__field(	__u64,		ingpa		)
 		__field(	__u64,		outgpa		)
+		__field(	__u16, 		code		)
+		__field(	bool,		fast		)
 	),
 
 	TP_fast_assign(
-		__entry->code		= code;
-		__entry->fast		= fast;
 		__entry->rep_cnt	= rep_cnt;
 		__entry->rep_idx	= rep_idx;
 		__entry->ingpa		= ingpa;
 		__entry->outgpa		= outgpa;
+		__entry->code		= code;
+		__entry->fast		= fast;
 	),
 
 	TP_printk("code 0x%x %s cnt 0x%x idx 0x%x in 0x%llx out 0x%llx",
-- 
1.7.3.1



* [PATCH 08/15] ftrace: fix event alignment: mce:mce_record
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (6 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2010-12-04  1:50   ` Steven Rostedt
  2011-03-11  9:49   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
  2010-12-04  0:13 ` [PATCH 09/15] ftrace: fix event alignment: skb:kfree_skb David Sharp
                   ` (7 subsequent siblings)
  15 siblings, 2 replies; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp


Signed-off-by: David Sharp <dhsharp@google.com>
---
 include/trace/events/mce.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
index 7eee778..4cbbcef 100644
--- a/include/trace/events/mce.h
+++ b/include/trace/events/mce.h
@@ -17,36 +17,36 @@ TRACE_EVENT(mce_record,
 	TP_STRUCT__entry(
 		__field(	u64,		mcgcap		)
 		__field(	u64,		mcgstatus	)
-		__field(	u8,		bank		)
 		__field(	u64,		status		)
 		__field(	u64,		addr		)
 		__field(	u64,		misc		)
 		__field(	u64,		ip		)
-		__field(	u8,		cs		)
 		__field(	u64,		tsc		)
 		__field(	u64,		walltime	)
 		__field(	u32,		cpu		)
 		__field(	u32,		cpuid		)
 		__field(	u32,		apicid		)
 		__field(	u32,		socketid	)
+		__field(	u8,		cs		)
+		__field(	u8,		bank		)
 		__field(	u8,		cpuvendor	)
 	),
 
 	TP_fast_assign(
 		__entry->mcgcap		= m->mcgcap;
 		__entry->mcgstatus	= m->mcgstatus;
-		__entry->bank		= m->bank;
 		__entry->status		= m->status;
 		__entry->addr		= m->addr;
 		__entry->misc		= m->misc;
 		__entry->ip		= m->ip;
-		__entry->cs		= m->cs;
 		__entry->tsc		= m->tsc;
 		__entry->walltime	= m->time;
 		__entry->cpu		= m->extcpu;
 		__entry->cpuid		= m->cpuid;
 		__entry->apicid		= m->apicid;
 		__entry->socketid	= m->socketid;
+		__entry->cs		= m->cs;
+		__entry->bank		= m->bank;
 		__entry->cpuvendor	= m->cpuvendor;
 	),
 
-- 
1.7.3.1



* [PATCH 09/15] ftrace: fix event alignment: skb:kfree_skb
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (7 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 08/15] ftrace: fix event alignment: mce:mce_record David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2010-12-04  1:52   ` Steven Rostedt
  2011-03-11  9:50   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
  2010-12-04  0:13 ` [PATCH 10/15] ftrace: fix event alignment: jbd2:* David Sharp
                   ` (6 subsequent siblings)
  15 siblings, 2 replies; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp


Signed-off-by: David Sharp <dhsharp@google.com>
---
 include/trace/events/skb.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index 75ce9d5..aa4a56a 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -19,16 +19,16 @@ TRACE_EVENT(kfree_skb,
 
 	TP_STRUCT__entry(
 		__field(	void *,		skbaddr		)
-		__field(	unsigned short,	protocol	)
 		__field(	void *,		location	)
+		__field(	unsigned short,	protocol	)
 	),
 
 	TP_fast_assign(
 		__entry->skbaddr = skb;
+		__entry->location = location;
 		if (skb) {
 			__entry->protocol = ntohs(skb->protocol);
 		}
-		__entry->location = location;
 	),
 
 	TP_printk("skbaddr=%p protocol=%u location=%p",
-- 
1.7.3.1



* [PATCH 10/15] ftrace: fix event alignment: jbd2:*
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (8 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 09/15] ftrace: fix event alignment: skb:kfree_skb David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2010-12-04  1:52   ` Steven Rostedt
  2010-12-04  0:13 ` [PATCH 11/15] ftrace: fix event alignment: ext4:* David Sharp
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp


Signed-off-by: David Sharp <dhsharp@google.com>
---
 include/trace/events/jbd2.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
index 7447ea9..2f9f362 100644
--- a/include/trace/events/jbd2.h
+++ b/include/trace/events/jbd2.h
@@ -41,15 +41,15 @@ DECLARE_EVENT_CLASS(jbd2_commit,
 	TP_STRUCT__entry(
 		__field(	int,   dev_major                )
 		__field(	int,   dev_minor                )
-		__field(	char,	sync_commit		  )
 		__field(	int,	transaction		  )
+		__field(	char,	sync_commit		  )
 	),
 
 	TP_fast_assign(
 		__entry->dev_major	= MAJOR(journal->j_fs_dev->bd_dev);
 		__entry->dev_minor	= MINOR(journal->j_fs_dev->bd_dev);
-		__entry->sync_commit = commit_transaction->t_synchronous_commit;
 		__entry->transaction	= commit_transaction->t_tid;
+		__entry->sync_commit = commit_transaction->t_synchronous_commit;
 	),
 
 	TP_printk("dev %d,%d transaction %d sync %d",
@@ -93,17 +93,17 @@ TRACE_EVENT(jbd2_end_commit,
 	TP_STRUCT__entry(
 		__field(	int,   dev_major                )
 		__field(	int,   dev_minor                )
-		__field(	char,	sync_commit		  )
 		__field(	int,	transaction		  )
 		__field(	int,	head		  	  )
+		__field(	char,	sync_commit		  )
 	),
 
 	TP_fast_assign(
 		__entry->dev_major	= MAJOR(journal->j_fs_dev->bd_dev);
 		__entry->dev_minor	= MINOR(journal->j_fs_dev->bd_dev);
-		__entry->sync_commit = commit_transaction->t_synchronous_commit;
 		__entry->transaction	= commit_transaction->t_tid;
 		__entry->head		= journal->j_tail_sequence;
+		__entry->sync_commit = commit_transaction->t_synchronous_commit;
 	),
 
 	TP_printk("dev %d,%d transaction %d sync %d head %d",
-- 
1.7.3.1



* [PATCH 11/15] ftrace: fix event alignment: ext4:*
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (9 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 10/15] ftrace: fix event alignment: jbd2:* David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2010-12-04  1:53   ` Steven Rostedt
  2010-12-04  0:13 ` [PATCH 12/15] trace_output.c: adjust conditional expression formatting David Sharp
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp


Signed-off-by: David Sharp <dhsharp@google.com>
---
 include/trace/events/ext4.h |   70 +++++++++++++++++++++---------------------
 1 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index e5e345f..9887cdb 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -23,21 +23,21 @@ TRACE_EVENT(ext4_free_inode,
 	TP_STRUCT__entry(
 		__field(	int,   dev_major                )
 		__field(	int,   dev_minor                )
+		__field(	blkcnt_t, blocks		)
 		__field(	ino_t,	ino			)
-		__field(	umode_t, mode			)
 		__field(	uid_t,	uid			)
 		__field(	gid_t,	gid			)
-		__field(	blkcnt_t, blocks		)
+		__field(	umode_t, mode			)
 	),
 
 	TP_fast_assign(
 		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
 		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
+		__entry->blocks	= inode->i_blocks;
 		__entry->ino	= inode->i_ino;
-		__entry->mode	= inode->i_mode;
 		__entry->uid	= inode->i_uid;
 		__entry->gid	= inode->i_gid;
-		__entry->blocks	= inode->i_blocks;
+		__entry->mode	= inode->i_mode;
 	),
 
 	TP_printk("dev %d,%d ino %lu mode 0%o uid %u gid %u blocks %llu",
@@ -342,10 +342,10 @@ TRACE_EVENT(ext4_da_writepages,
 		__field(	long,	pages_skipped		)
 		__field(	loff_t,	range_start		)
 		__field(	loff_t,	range_end		)
+		__field(       pgoff_t,	writeback_index		)
 		__field(	char,	for_kupdate		)
 		__field(	char,	for_reclaim		)
 		__field(	char,	range_cyclic		)
-		__field(       pgoff_t,	writeback_index		)
 	),
 
 	TP_fast_assign(
@@ -356,10 +356,10 @@ TRACE_EVENT(ext4_da_writepages,
 		__entry->pages_skipped	= wbc->pages_skipped;
 		__entry->range_start	= wbc->range_start;
 		__entry->range_end	= wbc->range_end;
+		__entry->writeback_index = inode->i_mapping->writeback_index;
 		__entry->for_kupdate	= wbc->for_kupdate;
 		__entry->for_reclaim	= wbc->for_reclaim;
 		__entry->range_cyclic	= wbc->range_cyclic;
-		__entry->writeback_index = inode->i_mapping->writeback_index;
 	),
 
 	TP_printk("dev %d,%d ino %lu nr_to_write %ld pages_skipped %ld "
@@ -425,8 +425,8 @@ TRACE_EVENT(ext4_da_writepages_result,
 		__field(	int,	ret			)
 		__field(	int,	pages_written		)
 		__field(	long,	pages_skipped		)
-		__field(	char,	more_io			)	
 		__field(       pgoff_t,	writeback_index		)
+		__field(	char,	more_io			)	
 	),
 
 	TP_fast_assign(
@@ -436,8 +436,8 @@ TRACE_EVENT(ext4_da_writepages_result,
 		__entry->ret		= ret;
 		__entry->pages_written	= pages_written;
 		__entry->pages_skipped	= wbc->pages_skipped;
-		__entry->more_io	= wbc->more_io;
 		__entry->writeback_index = inode->i_mapping->writeback_index;
+		__entry->more_io	= wbc->more_io;
 	),
 
 	TP_printk("dev %d,%d ino %lu ret %d pages_written %d pages_skipped %ld more_io %d writeback_index %lu",
@@ -485,8 +485,8 @@ DECLARE_EVENT_CLASS(ext4__mb_new_pa,
 		__field(	int,   dev_minor                )
 		__field(	ino_t,	ino			)
 		__field(	__u64,	pa_pstart		)
-		__field(	__u32,	pa_len			)
 		__field(	__u64,	pa_lstart		)
+		__field(	__u32,	pa_len			)
 
 	),
 
@@ -495,8 +495,8 @@ DECLARE_EVENT_CLASS(ext4__mb_new_pa,
 		__entry->dev_minor	= MINOR(ac->ac_sb->s_dev);
 		__entry->ino		= ac->ac_inode->i_ino;
 		__entry->pa_pstart	= pa->pa_pstart;
-		__entry->pa_len		= pa->pa_len;
 		__entry->pa_lstart	= pa->pa_lstart;
+		__entry->pa_len		= pa->pa_len;
 	),
 
 	TP_printk("dev %d,%d ino %lu pstart %llu len %u lstart %llu",
@@ -724,20 +724,20 @@ TRACE_EVENT(ext4_free_blocks,
 		__field(	int,   dev_major                )
 		__field(	int,   dev_minor                )
 		__field(	ino_t,	ino			)
-		__field(      umode_t, mode			)
 		__field(	__u64,	block			)
 		__field(	unsigned long,	count		)
 		__field(	 int,	flags			)
+		__field(      umode_t, mode			)
 	),
 
 	TP_fast_assign(
 		__entry->dev_major	= MAJOR(inode->i_sb->s_dev);
 		__entry->dev_minor	= MINOR(inode->i_sb->s_dev);
 		__entry->ino		= inode->i_ino;
-		__entry->mode		= inode->i_mode;
 		__entry->block		= block;
 		__entry->count		= count;
 		__entry->flags		= flags;
+		__entry->mode		= inode->i_mode;
 	),
 
 	TP_printk("dev %d,%d ino %lu mode 0%o block %llu count %lu flags %d",
@@ -766,8 +766,8 @@ TRACE_EVENT(ext4_sync_file,
 		__entry->dev_major	= MAJOR(dentry->d_inode->i_sb->s_dev);
 		__entry->dev_minor	= MINOR(dentry->d_inode->i_sb->s_dev);
 		__entry->ino		= dentry->d_inode->i_ino;
-		__entry->datasync	= datasync;
 		__entry->parent		= dentry->d_parent->d_inode->i_ino;
+		__entry->datasync	= datasync;
 	),
 
 	TP_printk("dev %d,%d ino %ld parent %ld datasync %d ",
@@ -834,12 +834,6 @@ TRACE_EVENT(ext4_mballoc_alloc,
 		__field(	int,   dev_major                )
 		__field(	int,   dev_minor                )
 		__field(	ino_t,	ino			)
-		__field(	__u16,	found			)
-		__field(	__u16,	groups			)
-		__field(	__u16,	buddy			)
-		__field(	__u16,	flags			)
-		__field(	__u16,	tail			)
-		__field(	__u8,	cr			)
 		__field(	__u32, 	orig_logical		)
 		__field(	  int,	orig_start		)
 		__field(	__u32, 	orig_group		)
@@ -852,18 +846,18 @@ TRACE_EVENT(ext4_mballoc_alloc,
 		__field(	  int,	result_start		)
 		__field(	__u32, 	result_group		)
 		__field(	  int,	result_len		)
+		__field(	__u16,	found			)
+		__field(	__u16,	groups			)
+		__field(	__u16,	buddy			)
+		__field(	__u16,	flags			)
+		__field(	__u16,	tail			)
+		__field(	__u8,	cr			)
 	),
 
 	TP_fast_assign(
 		__entry->dev_major	= MAJOR(ac->ac_inode->i_sb->s_dev);
 		__entry->dev_minor	= MINOR(ac->ac_inode->i_sb->s_dev);
 		__entry->ino		= ac->ac_inode->i_ino;
-		__entry->found		= ac->ac_found;
-		__entry->flags		= ac->ac_flags;
-		__entry->groups		= ac->ac_groups_scanned;
-		__entry->buddy		= ac->ac_buddy;
-		__entry->tail		= ac->ac_tail;
-		__entry->cr		= ac->ac_criteria;
 		__entry->orig_logical	= ac->ac_o_ex.fe_logical;
 		__entry->orig_start	= ac->ac_o_ex.fe_start;
 		__entry->orig_group	= ac->ac_o_ex.fe_group;
@@ -876,6 +870,12 @@ TRACE_EVENT(ext4_mballoc_alloc,
 		__entry->result_start	= ac->ac_f_ex.fe_start;
 		__entry->result_group	= ac->ac_f_ex.fe_group;
 		__entry->result_len	= ac->ac_f_ex.fe_len;
+		__entry->found		= ac->ac_found;
+		__entry->groups		= ac->ac_groups_scanned;
+		__entry->buddy		= ac->ac_buddy;
+		__entry->flags		= ac->ac_flags;
+		__entry->tail		= ac->ac_tail;
+		__entry->cr		= ac->ac_criteria;
 	),
 
 	TP_printk("dev %d,%d inode %lu orig %u/%d/%u@%u goal %u/%d/%u@%u "
@@ -1001,18 +1001,18 @@ TRACE_EVENT(ext4_forget,
 		__field(	int,   dev_major                )
 		__field(	int,   dev_minor                )
 		__field(	ino_t,	ino			)
-		__field(	umode_t, mode			)
-		__field(	int,	is_metadata		)
 		__field(	__u64,	block			)
+		__field(	int,	is_metadata		)
+		__field(	umode_t, mode			)
 	),
 
 	TP_fast_assign(
 		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
 		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
 		__entry->ino	= inode->i_ino;
-		__entry->mode	= inode->i_mode;
-		__entry->is_metadata = is_metadata;
 		__entry->block	= block;
+		__entry->is_metadata = is_metadata;
+		__entry->mode	= inode->i_mode;
 	),
 
 	TP_printk("dev %d,%d ino %lu mode 0%o is_metadata %d block %llu",
@@ -1030,24 +1030,24 @@ TRACE_EVENT(ext4_da_update_reserve_space,
 		__field(	int,   dev_major                )
 		__field(	int,   dev_minor                )
 		__field(	ino_t,	ino			)
-		__field(	umode_t, mode			)
 		__field(	__u64,	i_blocks		)
 		__field(	int,	used_blocks		)
 		__field(	int,	reserved_data_blocks	)
 		__field(	int,	reserved_meta_blocks	)
 		__field(	int,	allocated_meta_blocks	)
+		__field(	umode_t, mode			)
 	),
 
 	TP_fast_assign(
 		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
 		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
 		__entry->ino	= inode->i_ino;
-		__entry->mode	= inode->i_mode;
 		__entry->i_blocks = inode->i_blocks;
 		__entry->used_blocks = used_blocks;
 		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
 		__entry->reserved_meta_blocks = EXT4_I(inode)->i_reserved_meta_blocks;
 		__entry->allocated_meta_blocks = EXT4_I(inode)->i_allocated_meta_blocks;
+		__entry->mode	= inode->i_mode;
 	),
 
 	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu used_blocks %d reserved_data_blocks %d reserved_meta_blocks %d allocated_meta_blocks %d",
@@ -1067,22 +1067,22 @@ TRACE_EVENT(ext4_da_reserve_space,
 		__field(	int,   dev_major                )
 		__field(	int,   dev_minor                )
 		__field(	ino_t,	ino			)
-		__field(	umode_t, mode			)
 		__field(	__u64,	i_blocks		)
 		__field(	int,	md_needed		)
 		__field(	int,	reserved_data_blocks	)
 		__field(	int,	reserved_meta_blocks	)
+		__field(	umode_t, mode			)
 	),
 
 	TP_fast_assign(
 		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
 		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
 		__entry->ino	= inode->i_ino;
-		__entry->mode	= inode->i_mode;
 		__entry->i_blocks = inode->i_blocks;
 		__entry->md_needed = md_needed;
 		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
 		__entry->reserved_meta_blocks = EXT4_I(inode)->i_reserved_meta_blocks;
+		__entry->mode	= inode->i_mode;
 	),
 
 	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu md_needed %d reserved_data_blocks %d reserved_meta_blocks %d",
@@ -1102,24 +1102,24 @@ TRACE_EVENT(ext4_da_release_space,
 		__field(	int,   dev_major                )
 		__field(	int,   dev_minor                )
 		__field(	ino_t,	ino			)
-		__field(	umode_t, mode			)
 		__field(	__u64,	i_blocks		)
 		__field(	int,	freed_blocks		)
 		__field(	int,	reserved_data_blocks	)
 		__field(	int,	reserved_meta_blocks	)
 		__field(	int,	allocated_meta_blocks	)
+		__field(	umode_t, mode			)
 	),
 
 	TP_fast_assign(
 		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
 		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
 		__entry->ino	= inode->i_ino;
-		__entry->mode	= inode->i_mode;
 		__entry->i_blocks = inode->i_blocks;
 		__entry->freed_blocks = freed_blocks;
 		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
 		__entry->reserved_meta_blocks = EXT4_I(inode)->i_reserved_meta_blocks;
 		__entry->allocated_meta_blocks = EXT4_I(inode)->i_allocated_meta_blocks;
+		__entry->mode	= inode->i_mode;
 	),
 
 	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu freed_blocks %d reserved_data_blocks %d reserved_meta_blocks %d allocated_meta_blocks %d",
-- 
1.7.3.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 12/15] trace_output.c: adjust conditional expression formatting.
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (10 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 11/15] ftrace: fix event alignment: ext4:* David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2011-03-11  9:50   ` [tip:perf/core] tracing: Adjust conditional expression latency formatting tip-bot for David Sharp
  2010-12-04  0:13 ` [PATCH 13/15] small_traces: Add config option to shrink trace events David Sharp
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp

Formatting change only, to improve readability. No functional changes; the
only code change is the introduction of intermediate variables.

Signed-off-by: David Sharp <dhsharp@google.com>
---
 kernel/trace/trace_output.c |   26 +++++++++++++++-----------
 1 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 02272ba..9e33b1f 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -533,20 +533,24 @@ seq_print_ip_sym(struct trace_seq *s, unsigned long ip, unsigned long sym_flags)
  */
 int trace_print_lat_fmt(struct trace_seq *s, struct trace_entry *entry)
 {
-	int hardirq, softirq;
 	int ret;
-
-	hardirq = entry->flags & TRACE_FLAG_HARDIRQ;
-	softirq = entry->flags & TRACE_FLAG_SOFTIRQ;
+	int hardirq = entry->flags & TRACE_FLAG_HARDIRQ;
+	int softirq = entry->flags & TRACE_FLAG_SOFTIRQ;
+
+	const char irqs_off =
+		(entry->flags & TRACE_FLAG_IRQS_OFF) ? 'd' :
+		(entry->flags & TRACE_FLAG_IRQS_NOSUPPORT) ? 'X' :
+		'.';
+	const char need_resched =
+		(entry->flags & TRACE_FLAG_NEED_RESCHED) ? 'N' : '.';
+	const char hardsoft_irq =
+		(hardirq && softirq) ? 'H' :
+		hardirq ? 'h' :
+		softirq ? 's' :
+		'.';
 
 	if (!trace_seq_printf(s, "%c%c%c",
-			      (entry->flags & TRACE_FLAG_IRQS_OFF) ? 'd' :
-				(entry->flags & TRACE_FLAG_IRQS_NOSUPPORT) ?
-				  'X' : '.',
-			      (entry->flags & TRACE_FLAG_NEED_RESCHED) ?
-				'N' : '.',
-			      (hardirq && softirq) ? 'H' :
-				hardirq ? 'h' : softirq ? 's' : '.'))
+			      irqs_off, need_resched, hardsoft_irq))
 		return 0;
 
 	if (entry->preempt_count)
-- 
1.7.3.1



* [PATCH 13/15] small_traces: Add config option to shrink trace events.
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (11 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 12/15] trace_output.c: adjust conditional expression formatting David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2010-12-04  1:56   ` Steven Rostedt
  2011-03-09  0:23   ` Steven Rostedt
  2010-12-04  0:13 ` [PATCH 14/15] small_traces: Remove trace output of large fields David Sharp
                   ` (2 subsequent siblings)
  15 siblings, 2 replies; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp

This option will remove several fields from trace events to reduce the
payload size. Under heavy load, events can be produced faster than they
can possibly be written to disk or sent over the network. Shrinking the
payload size allows more events to fit in the ring buffer, which in turn
enables tracing for longer periods, at the cost of losing some
relatively unimportant details.

Google-Bug-Id: 3224547

Signed-off-by: David Sharp <dhsharp@google.com>
---
 kernel/trace/Kconfig |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index ea37e2f..9c24fe5 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -403,6 +403,18 @@ config FUNCTION_PROFILER
 
 	  If in doubt, say N.
 
+config SMALL_TRACES
+	bool "Shrink trace events"
+	default n
+	help
+	  Minimize the size of trace events. Omits fields from the trace
+	  event header, such as preempt_count and lock_depth. Chooses
+	  smaller alternative event structures. Shrinking the payload size
+	  allows more events to fit in the ring buffer, which enables
+	  tracing for longer periods when the buffer cannot be read as fast
+	  as it is filled, at the cost of losing some details.
+
+
 config FTRACE_MCOUNT_RECORD
 	def_bool y
 	depends on DYNAMIC_FTRACE
-- 
1.7.3.1



* [PATCH 14/15] small_traces: Remove trace output of large fields.
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (12 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 13/15] small_traces: Add config option to shrink trace events David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2010-12-04  0:13 ` [PATCH 15/15] small_traces: Remove 8 bytes from trace_entry David Sharp
  2010-12-06 13:22 ` [Patch 00/15] Reduce tracing payload size Andi Kleen
  15 siblings, 0 replies; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp

Google-Bug-Id: 3224547

Signed-off-by: David Sharp <dhsharp@google.com>
---
 kernel/trace/trace.c        |    6 ++++++
 kernel/trace/trace_output.c |   18 ++++++++++++++++++
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index ed5c14f..177d4e5 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1747,6 +1747,7 @@ static void s_stop(struct seq_file *m, void *p)
 
 static void print_lat_help_header(struct seq_file *m)
 {
+#ifndef CONFIG_SMALL_TRACES
 	seq_puts(m, "#                  _------=> CPU#            \n");
 	seq_puts(m, "#                 / _-----=> irqs-off        \n");
 	seq_puts(m, "#                | / _----=> need-resched    \n");
@@ -1756,6 +1757,11 @@ static void print_lat_help_header(struct seq_file *m)
 	seq_puts(m, "#                |||||/     delay             \n");
 	seq_puts(m, "#  cmd     pid   |||||| time  |   caller      \n");
 	seq_puts(m, "#     \\   /      ||||||   \\   |   /           \n");
+#else
+	seq_puts(m, "#               CPU#   delay             \n");
+	seq_puts(m, "#  cmd     pid   | time  |   caller      \n");
+	seq_puts(m, "#     \\   /      |   \\   |   /           \n");
+#endif
 }
 
 static void print_func_help_header(struct seq_file *m)
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 9e33b1f..d5d7575 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -531,6 +531,7 @@ seq_print_ip_sym(struct trace_seq *s, unsigned long ip, unsigned long sym_flags)
  * Prints the generic fields of irqs off, in hard or softirq, preempt
  * count and lock depth.
  */
+#ifndef CONFIG_SMALL_TRACES
 int trace_print_lat_fmt(struct trace_seq *s, struct trace_entry *entry)
 {
 	int ret;
@@ -566,6 +567,12 @@ int trace_print_lat_fmt(struct trace_seq *s, struct trace_entry *entry)
 
 	return trace_seq_printf(s, "%d", entry->lock_depth);
 }
+#else
+int trace_print_lat_fmt(struct trace_seq *s, struct trace_entry *entry)
+{
+	return 1;
+}
+#endif
 
 static int
 lat_print_generic(struct trace_seq *s, struct trace_entry *entry, int cpu)
@@ -628,6 +635,7 @@ int trace_print_lat_context(struct trace_iterator *iter)
 
 		trace_find_cmdline(entry->pid, comm);
 
+#ifndef CONFIG_SMALL_TRACES
 		ret = trace_seq_printf(s, "%16s %5d %3d %d %08x %08lx [%08llx]"
 				       " %ld.%03ldms (+%ld.%03ldms): ", comm,
 				       entry->pid, iter->cpu, entry->flags,
@@ -637,6 +645,16 @@ int trace_print_lat_context(struct trace_iterator *iter)
 				       abs_usecs % USEC_PER_MSEC,
 				       rel_usecs / USEC_PER_MSEC,
 				       rel_usecs % USEC_PER_MSEC);
+#else
+		ret = trace_seq_printf(s, "%16s %5d %3d %08lx [%08llx]"
+				       " %ld.%03ldms (+%ld.%03ldms): ", comm,
+				       entry->pid, iter->cpu, iter->idx,
+				       ns2usecs(iter->ts),
+				       abs_usecs / USEC_PER_MSEC,
+				       abs_usecs % USEC_PER_MSEC,
+				       rel_usecs / USEC_PER_MSEC,
+				       rel_usecs % USEC_PER_MSEC);
+#endif
 	} else {
 		ret = lat_print_generic(s, entry, iter->cpu);
 		if (ret)
-- 
1.7.3.1



* [PATCH 15/15] small_traces: Remove 8 bytes from trace_entry.
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (13 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 14/15] small_traces: Remove trace output of large fields David Sharp
@ 2010-12-04  0:13 ` David Sharp
  2010-12-06 13:22 ` [Patch 00/15] Reduce tracing payload size Andi Kleen
  15 siblings, 0 replies; 74+ messages in thread
From: David Sharp @ 2010-12-04  0:13 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: mrubin, David Sharp

Remove flags, preempt_count, and lock_depth from trace_entry. These fields
were used only by the latency tracer output and are relatively unimportant,
while incurring high payload overhead.

Shrink pid to 16 bits. pid_max is usually 32768, so the potential for loss
of fidelity is minimal, and fidelity can be guaranteed by capping pid_max
at 32768.

Google-Bug-Id: 3224547

Signed-off-by: David Sharp <dhsharp@google.com>
---
 include/linux/ftrace_event.h |    4 ++++
 kernel/trace/trace.c         |    7 ++++++-
 kernel/trace/trace_events.c  |    4 ++++
 3 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 8beabb9..3a0bdd1 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -34,10 +34,14 @@ const char *ftrace_print_hex_seq(struct trace_seq *p,
  */
 struct trace_entry {
 	unsigned short		type;
+#ifdef CONFIG_SMALL_TRACES
+	short		pid;
+#else
 	unsigned char		flags;
 	unsigned char		preempt_count;
 	int			pid;
 	int			lock_depth;
+#endif
 };
 
 #define FTRACE_MAX_EVENT						\
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 177d4e5..73b6482 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1101,8 +1101,11 @@ tracing_generic_entry_update(struct trace_entry *entry, unsigned long flags,
 {
 	struct task_struct *tsk = current;
 
-	entry->preempt_count		= pc & 0xff;
+#ifdef CONFIG_SMALL_TRACES
+	entry->pid			= (tsk) ? (typeof(entry->pid))tsk->pid : 0;
+#else
 	entry->pid			= (tsk) ? tsk->pid : 0;
+	entry->preempt_count		= pc & 0xff;
 	entry->lock_depth		= (tsk) ? tsk->lock_depth : 0;
 	entry->flags =
 #ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT
@@ -1113,9 +1116,11 @@ tracing_generic_entry_update(struct trace_entry *entry, unsigned long flags,
 		((pc & HARDIRQ_MASK) ? TRACE_FLAG_HARDIRQ : 0) |
 		((pc & SOFTIRQ_MASK) ? TRACE_FLAG_SOFTIRQ : 0) |
 		(need_resched() ? TRACE_FLAG_NEED_RESCHED : 0);
+#endif
 }
 EXPORT_SYMBOL_GPL(tracing_generic_entry_update);
 
+
 struct ring_buffer_event *
 trace_buffer_lock_reserve(struct ring_buffer *buffer,
 			  int type,
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 0725eea..3684a3e 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -107,10 +107,14 @@ static int trace_define_common_fields(void)
 	struct trace_entry ent;
 
 	__common_field(unsigned short, type);
+#ifdef CONFIG_SMALL_TRACES
+	__common_field(short, pid);
+#else
 	__common_field(unsigned char, flags);
 	__common_field(unsigned char, preempt_count);
 	__common_field(int, pid);
 	__common_field(int, lock_depth);
+#endif
 
 	return ret;
 }
-- 
1.7.3.1



* Re: [PATCH 03/15] ring_buffer: Align buffer_page struct allocations only to fit the flags.
  2010-12-04  0:13 ` [PATCH 03/15] ring_buffer: Align buffer_page struct allocations only to fit the flags David Sharp
@ 2010-12-04  1:43   ` Steven Rostedt
  2010-12-07 22:44     ` David Sharp
  0 siblings, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2010-12-04  1:43 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin

On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> buffer_page structs need to be aligned to 4 byte boundaries because the page
> flags are stored in the two least-significant bits of the pointers in the page
> list. Aligning to cache lines is sufficient, but doesn't seem to be necessary.
> Reducing the alignment to only 4 bytes may improve cache efficiency.

The reason for the cache line boundaries is that the bpages are per cpu,
and if they are allocated next to some variable that gets changed
globally, or even next to another bpage that is used for another cpu's
buffer, then I would expect we would get some cache line bouncing.

Perhaps we should make bpage have its own slab allocation
(kmem_cache_alloc). And perhaps even try to keep all bpages that share
the same cache line on the same cpu buffer.

-- Steve




* Re: [PATCH 06/15] ftrace: fix event alignment: module:module_request
  2010-12-04  0:13 ` [PATCH 06/15] ftrace: fix event alignment: module:module_request David Sharp
@ 2010-12-04  1:47   ` Steven Rostedt
  2010-12-06  1:28     ` Li Zefan
  2011-03-11  9:48   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
  1 sibling, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2010-12-04  1:47 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Li Zefan

[ Ccing Li for an Acked-by. He wrote this event ]

-- Steve


On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> Signed-off-by: David Sharp <dhsharp@google.com>
> ---
>  include/trace/events/module.h |    5 ++---
>  1 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/include/trace/events/module.h b/include/trace/events/module.h
> index c7bb2f0..8154f5f 100644
> --- a/include/trace/events/module.h
> +++ b/include/trace/events/module.h
> @@ -98,14 +98,14 @@ TRACE_EVENT(module_request,
>  	TP_ARGS(name, wait, ip),
>  
>  	TP_STRUCT__entry(
> -		__field(	bool,		wait		)
>  		__field(	unsigned long,	ip		)
> +		__field(	bool,		wait		)
>  		__string(	name,		name		)
>  	),
>  
>  	TP_fast_assign(
> -		__entry->wait	= wait;
>  		__entry->ip	= ip;
> +		__entry->wait	= wait;
>  		__assign_str(name, name);
>  	),
>  
> @@ -119,4 +119,3 @@ TRACE_EVENT(module_request,
>  
>  /* This part must be outside protection */
>  #include <trace/define_trace.h>
> -




* Re: [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall
  2010-12-04  0:13 ` [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall David Sharp
@ 2010-12-04  1:49   ` Steven Rostedt
  2010-12-04  8:11   ` Avi Kivity
  2011-03-11  9:49   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
  2 siblings, 0 replies; 74+ messages in thread
From: Steven Rostedt @ 2010-12-04  1:49 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Marcelo Tosatti, Avi Kivity

[ Ccing Marcelo and Avi for Acked-by's ]

-- Steve


On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> Signed-off-by: David Sharp <dhsharp@google.com>
> ---
>  arch/x86/kvm/trace.h |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> index a6544b8..ab41fb0 100644
> --- a/arch/x86/kvm/trace.h
> +++ b/arch/x86/kvm/trace.h
> @@ -62,21 +62,21 @@ TRACE_EVENT(kvm_hv_hypercall,
>  	TP_ARGS(code, fast, rep_cnt, rep_idx, ingpa, outgpa),
>  
>  	TP_STRUCT__entry(
> -		__field(	__u16, 		code		)
> -		__field(	bool,		fast		)
>  		__field(	__u16,		rep_cnt		)
>  		__field(	__u16,		rep_idx		)
>  		__field(	__u64,		ingpa		)
>  		__field(	__u64,		outgpa		)
> +		__field(	__u16, 		code		)
> +		__field(	bool,		fast		)
>  	),
>  
>  	TP_fast_assign(
> -		__entry->code		= code;
> -		__entry->fast		= fast;
>  		__entry->rep_cnt	= rep_cnt;
>  		__entry->rep_idx	= rep_idx;
>  		__entry->ingpa		= ingpa;
>  		__entry->outgpa		= outgpa;
> +		__entry->code		= code;
> +		__entry->fast		= fast;
>  	),
>  
>  	TP_printk("code 0x%x %s cnt 0x%x idx 0x%x in 0x%llx out 0x%llx",




* Re: [PATCH 08/15] ftrace: fix event alignment: mce:mce_record
  2010-12-04  0:13 ` [PATCH 08/15] ftrace: fix event alignment: mce:mce_record David Sharp
@ 2010-12-04  1:50   ` Steven Rostedt
  2010-12-09 13:33     ` Frederic Weisbecker
  2011-03-11  9:49   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
  1 sibling, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2010-12-04  1:50 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Hidetoshi Seto, Frederic Weisbecker

[ Ccing Hidetoshi and Frederic for Acked-by's ]

-- Steve


On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> Signed-off-by: David Sharp <dhsharp@google.com>
> ---
>  include/trace/events/mce.h |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
> index 7eee778..4cbbcef 100644
> --- a/include/trace/events/mce.h
> +++ b/include/trace/events/mce.h
> @@ -17,36 +17,36 @@ TRACE_EVENT(mce_record,
>  	TP_STRUCT__entry(
>  		__field(	u64,		mcgcap		)
>  		__field(	u64,		mcgstatus	)
> -		__field(	u8,		bank		)
>  		__field(	u64,		status		)
>  		__field(	u64,		addr		)
>  		__field(	u64,		misc		)
>  		__field(	u64,		ip		)
> -		__field(	u8,		cs		)
>  		__field(	u64,		tsc		)
>  		__field(	u64,		walltime	)
>  		__field(	u32,		cpu		)
>  		__field(	u32,		cpuid		)
>  		__field(	u32,		apicid		)
>  		__field(	u32,		socketid	)
> +		__field(	u8,		cs		)
> +		__field(	u8,		bank		)
>  		__field(	u8,		cpuvendor	)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->mcgcap		= m->mcgcap;
>  		__entry->mcgstatus	= m->mcgstatus;
> -		__entry->bank		= m->bank;
>  		__entry->status		= m->status;
>  		__entry->addr		= m->addr;
>  		__entry->misc		= m->misc;
>  		__entry->ip		= m->ip;
> -		__entry->cs		= m->cs;
>  		__entry->tsc		= m->tsc;
>  		__entry->walltime	= m->time;
>  		__entry->cpu		= m->extcpu;
>  		__entry->cpuid		= m->cpuid;
>  		__entry->apicid		= m->apicid;
>  		__entry->socketid	= m->socketid;
> +		__entry->cs		= m->cs;
> +		__entry->bank		= m->bank;
>  		__entry->cpuvendor	= m->cpuvendor;
>  	),
>  




* Re: [PATCH 09/15] ftrace: fix event alignment: skb:kfree_skb
  2010-12-04  0:13 ` [PATCH 09/15] ftrace: fix event alignment: skb:kfree_skb David Sharp
@ 2010-12-04  1:52   ` Steven Rostedt
  2010-12-04 13:38     ` Neil Horman
  2011-03-11  9:50   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
  1 sibling, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2010-12-04  1:52 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Neil Horman

[ Ccing Neil for Acked-by ]

-- Steve


On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> Signed-off-by: David Sharp <dhsharp@google.com>
> ---
>  include/trace/events/skb.h |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
> index 75ce9d5..aa4a56a 100644
> --- a/include/trace/events/skb.h
> +++ b/include/trace/events/skb.h
> @@ -19,16 +19,16 @@ TRACE_EVENT(kfree_skb,
>  
>  	TP_STRUCT__entry(
>  		__field(	void *,		skbaddr		)
> -		__field(	unsigned short,	protocol	)
>  		__field(	void *,		location	)
> +		__field(	unsigned short,	protocol	)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->skbaddr = skb;
> +		__entry->location = location;
>  		if (skb) {
>  			__entry->protocol = ntohs(skb->protocol);
>  		}
> -		__entry->location = location;
>  	),
>  
>  	TP_printk("skbaddr=%p protocol=%u location=%p",




* Re: [PATCH 10/15] ftrace: fix event alignment: jbd2:*
  2010-12-04  0:13 ` [PATCH 10/15] ftrace: fix event alignment: jbd2:* David Sharp
@ 2010-12-04  1:52   ` Steven Rostedt
  2011-03-09  0:03     ` Steven Rostedt
  0 siblings, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2010-12-04  1:52 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Theodore Ts'o

[ Ccing Ted for Acked-by ]

-- Steve


On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> Signed-off-by: David Sharp <dhsharp@google.com>
> ---
>  include/trace/events/jbd2.h |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
> index 7447ea9..2f9f362 100644
> --- a/include/trace/events/jbd2.h
> +++ b/include/trace/events/jbd2.h
> @@ -41,15 +41,15 @@ DECLARE_EVENT_CLASS(jbd2_commit,
>  	TP_STRUCT__entry(
>  		__field(	int,   dev_major                )
>  		__field(	int,   dev_minor                )
> -		__field(	char,	sync_commit		  )
>  		__field(	int,	transaction		  )
> +		__field(	char,	sync_commit		  )
>  	),
>  
>  	TP_fast_assign(
>  		__entry->dev_major	= MAJOR(journal->j_fs_dev->bd_dev);
>  		__entry->dev_minor	= MINOR(journal->j_fs_dev->bd_dev);
> -		__entry->sync_commit = commit_transaction->t_synchronous_commit;
>  		__entry->transaction	= commit_transaction->t_tid;
> +		__entry->sync_commit = commit_transaction->t_synchronous_commit;
>  	),
>  
>  	TP_printk("dev %d,%d transaction %d sync %d",
> @@ -93,17 +93,17 @@ TRACE_EVENT(jbd2_end_commit,
>  	TP_STRUCT__entry(
>  		__field(	int,   dev_major                )
>  		__field(	int,   dev_minor                )
> -		__field(	char,	sync_commit		  )
>  		__field(	int,	transaction		  )
>  		__field(	int,	head		  	  )
> +		__field(	char,	sync_commit		  )
>  	),
>  
>  	TP_fast_assign(
>  		__entry->dev_major	= MAJOR(journal->j_fs_dev->bd_dev);
>  		__entry->dev_minor	= MINOR(journal->j_fs_dev->bd_dev);
> -		__entry->sync_commit = commit_transaction->t_synchronous_commit;
>  		__entry->transaction	= commit_transaction->t_tid;
>  		__entry->head		= journal->j_tail_sequence;
> +		__entry->sync_commit = commit_transaction->t_synchronous_commit;
>  	),
>  
>  	TP_printk("dev %d,%d transaction %d sync %d head %d",




* Re: [PATCH 11/15] ftrace: fix event alignment: ext4:*
  2010-12-04  0:13 ` [PATCH 11/15] ftrace: fix event alignment: ext4:* David Sharp
@ 2010-12-04  1:53   ` Steven Rostedt
  2011-03-09  0:04     ` Steven Rostedt
  0 siblings, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2010-12-04  1:53 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Theodore Ts'o

[ Ccing Ted for Acked-by ]

-- Steve


On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> Signed-off-by: David Sharp <dhsharp@google.com>
> ---
>  include/trace/events/ext4.h |   70 +++++++++++++++++++++---------------------
>  1 files changed, 35 insertions(+), 35 deletions(-)
> 
> diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> index e5e345f..9887cdb 100644
> --- a/include/trace/events/ext4.h
> +++ b/include/trace/events/ext4.h
> @@ -23,21 +23,21 @@ TRACE_EVENT(ext4_free_inode,
>  	TP_STRUCT__entry(
>  		__field(	int,   dev_major                )
>  		__field(	int,   dev_minor                )
> +		__field(	blkcnt_t, blocks		)
>  		__field(	ino_t,	ino			)
> -		__field(	umode_t, mode			)
>  		__field(	uid_t,	uid			)
>  		__field(	gid_t,	gid			)
> -		__field(	blkcnt_t, blocks		)
> +		__field(	umode_t, mode			)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
>  		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
> +		__entry->blocks	= inode->i_blocks;
>  		__entry->ino	= inode->i_ino;
> -		__entry->mode	= inode->i_mode;
>  		__entry->uid	= inode->i_uid;
>  		__entry->gid	= inode->i_gid;
> -		__entry->blocks	= inode->i_blocks;
> +		__entry->mode	= inode->i_mode;
>  	),
>  
>  	TP_printk("dev %d,%d ino %lu mode 0%o uid %u gid %u blocks %llu",
> @@ -342,10 +342,10 @@ TRACE_EVENT(ext4_da_writepages,
>  		__field(	long,	pages_skipped		)
>  		__field(	loff_t,	range_start		)
>  		__field(	loff_t,	range_end		)
> +		__field(       pgoff_t,	writeback_index		)
>  		__field(	char,	for_kupdate		)
>  		__field(	char,	for_reclaim		)
>  		__field(	char,	range_cyclic		)
> -		__field(       pgoff_t,	writeback_index		)
>  	),
>  
>  	TP_fast_assign(
> @@ -356,10 +356,10 @@ TRACE_EVENT(ext4_da_writepages,
>  		__entry->pages_skipped	= wbc->pages_skipped;
>  		__entry->range_start	= wbc->range_start;
>  		__entry->range_end	= wbc->range_end;
> +		__entry->writeback_index = inode->i_mapping->writeback_index;
>  		__entry->for_kupdate	= wbc->for_kupdate;
>  		__entry->for_reclaim	= wbc->for_reclaim;
>  		__entry->range_cyclic	= wbc->range_cyclic;
> -		__entry->writeback_index = inode->i_mapping->writeback_index;
>  	),
>  
>  	TP_printk("dev %d,%d ino %lu nr_to_write %ld pages_skipped %ld "
> @@ -425,8 +425,8 @@ TRACE_EVENT(ext4_da_writepages_result,
>  		__field(	int,	ret			)
>  		__field(	int,	pages_written		)
>  		__field(	long,	pages_skipped		)
> -		__field(	char,	more_io			)	
>  		__field(       pgoff_t,	writeback_index		)
> +		__field(	char,	more_io			)	
>  	),
>  
>  	TP_fast_assign(
> @@ -436,8 +436,8 @@ TRACE_EVENT(ext4_da_writepages_result,
>  		__entry->ret		= ret;
>  		__entry->pages_written	= pages_written;
>  		__entry->pages_skipped	= wbc->pages_skipped;
> -		__entry->more_io	= wbc->more_io;
>  		__entry->writeback_index = inode->i_mapping->writeback_index;
> +		__entry->more_io	= wbc->more_io;
>  	),
>  
>  	TP_printk("dev %d,%d ino %lu ret %d pages_written %d pages_skipped %ld more_io %d writeback_index %lu",
> @@ -485,8 +485,8 @@ DECLARE_EVENT_CLASS(ext4__mb_new_pa,
>  		__field(	int,   dev_minor                )
>  		__field(	ino_t,	ino			)
>  		__field(	__u64,	pa_pstart		)
> -		__field(	__u32,	pa_len			)
>  		__field(	__u64,	pa_lstart		)
> +		__field(	__u32,	pa_len			)
>  
>  	),
>  
> @@ -495,8 +495,8 @@ DECLARE_EVENT_CLASS(ext4__mb_new_pa,
>  		__entry->dev_minor	= MINOR(ac->ac_sb->s_dev);
>  		__entry->ino		= ac->ac_inode->i_ino;
>  		__entry->pa_pstart	= pa->pa_pstart;
> -		__entry->pa_len		= pa->pa_len;
>  		__entry->pa_lstart	= pa->pa_lstart;
> +		__entry->pa_len		= pa->pa_len;
>  	),
>  
>  	TP_printk("dev %d,%d ino %lu pstart %llu len %u lstart %llu",
> @@ -724,20 +724,20 @@ TRACE_EVENT(ext4_free_blocks,
>  		__field(	int,   dev_major                )
>  		__field(	int,   dev_minor                )
>  		__field(	ino_t,	ino			)
> -		__field(      umode_t, mode			)
>  		__field(	__u64,	block			)
>  		__field(	unsigned long,	count		)
>  		__field(	 int,	flags			)
> +		__field(      umode_t, mode			)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->dev_major	= MAJOR(inode->i_sb->s_dev);
>  		__entry->dev_minor	= MINOR(inode->i_sb->s_dev);
>  		__entry->ino		= inode->i_ino;
> -		__entry->mode		= inode->i_mode;
>  		__entry->block		= block;
>  		__entry->count		= count;
>  		__entry->flags		= flags;
> +		__entry->mode		= inode->i_mode;
>  	),
>  
>  	TP_printk("dev %d,%d ino %lu mode 0%o block %llu count %lu flags %d",
> @@ -766,8 +766,8 @@ TRACE_EVENT(ext4_sync_file,
>  		__entry->dev_major	= MAJOR(dentry->d_inode->i_sb->s_dev);
>  		__entry->dev_minor	= MINOR(dentry->d_inode->i_sb->s_dev);
>  		__entry->ino		= dentry->d_inode->i_ino;
> -		__entry->datasync	= datasync;
>  		__entry->parent		= dentry->d_parent->d_inode->i_ino;
> +		__entry->datasync	= datasync;
>  	),
>  
>  	TP_printk("dev %d,%d ino %ld parent %ld datasync %d ",
> @@ -834,12 +834,6 @@ TRACE_EVENT(ext4_mballoc_alloc,
>  		__field(	int,   dev_major                )
>  		__field(	int,   dev_minor                )
>  		__field(	ino_t,	ino			)
> -		__field(	__u16,	found			)
> -		__field(	__u16,	groups			)
> -		__field(	__u16,	buddy			)
> -		__field(	__u16,	flags			)
> -		__field(	__u16,	tail			)
> -		__field(	__u8,	cr			)
>  		__field(	__u32, 	orig_logical		)
>  		__field(	  int,	orig_start		)
>  		__field(	__u32, 	orig_group		)
> @@ -852,18 +846,18 @@ TRACE_EVENT(ext4_mballoc_alloc,
>  		__field(	  int,	result_start		)
>  		__field(	__u32, 	result_group		)
>  		__field(	  int,	result_len		)
> +		__field(	__u16,	found			)
> +		__field(	__u16,	groups			)
> +		__field(	__u16,	buddy			)
> +		__field(	__u16,	flags			)
> +		__field(	__u16,	tail			)
> +		__field(	__u8,	cr			)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->dev_major	= MAJOR(ac->ac_inode->i_sb->s_dev);
>  		__entry->dev_minor	= MINOR(ac->ac_inode->i_sb->s_dev);
>  		__entry->ino		= ac->ac_inode->i_ino;
> -		__entry->found		= ac->ac_found;
> -		__entry->flags		= ac->ac_flags;
> -		__entry->groups		= ac->ac_groups_scanned;
> -		__entry->buddy		= ac->ac_buddy;
> -		__entry->tail		= ac->ac_tail;
> -		__entry->cr		= ac->ac_criteria;
>  		__entry->orig_logical	= ac->ac_o_ex.fe_logical;
>  		__entry->orig_start	= ac->ac_o_ex.fe_start;
>  		__entry->orig_group	= ac->ac_o_ex.fe_group;
> @@ -876,6 +870,12 @@ TRACE_EVENT(ext4_mballoc_alloc,
>  		__entry->result_start	= ac->ac_f_ex.fe_start;
>  		__entry->result_group	= ac->ac_f_ex.fe_group;
>  		__entry->result_len	= ac->ac_f_ex.fe_len;
> +		__entry->found		= ac->ac_found;
> +		__entry->groups		= ac->ac_groups_scanned;
> +		__entry->buddy		= ac->ac_buddy;
> +		__entry->flags		= ac->ac_flags;
> +		__entry->tail		= ac->ac_tail;
> +		__entry->cr		= ac->ac_criteria;
>  	),
>  
>  	TP_printk("dev %d,%d inode %lu orig %u/%d/%u@%u goal %u/%d/%u@%u "
> @@ -1001,18 +1001,18 @@ TRACE_EVENT(ext4_forget,
>  		__field(	int,   dev_major                )
>  		__field(	int,   dev_minor                )
>  		__field(	ino_t,	ino			)
> -		__field(	umode_t, mode			)
> -		__field(	int,	is_metadata		)
>  		__field(	__u64,	block			)
> +		__field(	int,	is_metadata		)
> +		__field(	umode_t, mode			)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
>  		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
>  		__entry->ino	= inode->i_ino;
> -		__entry->mode	= inode->i_mode;
> -		__entry->is_metadata = is_metadata;
>  		__entry->block	= block;
> +		__entry->is_metadata = is_metadata;
> +		__entry->mode	= inode->i_mode;
>  	),
>  
>  	TP_printk("dev %d,%d ino %lu mode 0%o is_metadata %d block %llu",
> @@ -1030,24 +1030,24 @@ TRACE_EVENT(ext4_da_update_reserve_space,
>  		__field(	int,   dev_major                )
>  		__field(	int,   dev_minor                )
>  		__field(	ino_t,	ino			)
> -		__field(	umode_t, mode			)
>  		__field(	__u64,	i_blocks		)
>  		__field(	int,	used_blocks		)
>  		__field(	int,	reserved_data_blocks	)
>  		__field(	int,	reserved_meta_blocks	)
>  		__field(	int,	allocated_meta_blocks	)
> +		__field(	umode_t, mode			)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
>  		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
>  		__entry->ino	= inode->i_ino;
> -		__entry->mode	= inode->i_mode;
>  		__entry->i_blocks = inode->i_blocks;
>  		__entry->used_blocks = used_blocks;
>  		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
>  		__entry->reserved_meta_blocks = EXT4_I(inode)->i_reserved_meta_blocks;
>  		__entry->allocated_meta_blocks = EXT4_I(inode)->i_allocated_meta_blocks;
> +		__entry->mode	= inode->i_mode;
>  	),
>  
>  	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu used_blocks %d reserved_data_blocks %d reserved_meta_blocks %d allocated_meta_blocks %d",
> @@ -1067,22 +1067,22 @@ TRACE_EVENT(ext4_da_reserve_space,
>  		__field(	int,   dev_major                )
>  		__field(	int,   dev_minor                )
>  		__field(	ino_t,	ino			)
> -		__field(	umode_t, mode			)
>  		__field(	__u64,	i_blocks		)
>  		__field(	int,	md_needed		)
>  		__field(	int,	reserved_data_blocks	)
>  		__field(	int,	reserved_meta_blocks	)
> +		__field(	umode_t, mode			)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
>  		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
>  		__entry->ino	= inode->i_ino;
> -		__entry->mode	= inode->i_mode;
>  		__entry->i_blocks = inode->i_blocks;
>  		__entry->md_needed = md_needed;
>  		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
>  		__entry->reserved_meta_blocks = EXT4_I(inode)->i_reserved_meta_blocks;
> +		__entry->mode	= inode->i_mode;
>  	),
>  
>  	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu md_needed %d reserved_data_blocks %d reserved_meta_blocks %d",
> @@ -1102,24 +1102,24 @@ TRACE_EVENT(ext4_da_release_space,
>  		__field(	int,   dev_major                )
>  		__field(	int,   dev_minor                )
>  		__field(	ino_t,	ino			)
> -		__field(	umode_t, mode			)
>  		__field(	__u64,	i_blocks		)
>  		__field(	int,	freed_blocks		)
>  		__field(	int,	reserved_data_blocks	)
>  		__field(	int,	reserved_meta_blocks	)
>  		__field(	int,	allocated_meta_blocks	)
> +		__field(	umode_t, mode			)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
>  		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
>  		__entry->ino	= inode->i_ino;
> -		__entry->mode	= inode->i_mode;
>  		__entry->i_blocks = inode->i_blocks;
>  		__entry->freed_blocks = freed_blocks;
>  		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
>  		__entry->reserved_meta_blocks = EXT4_I(inode)->i_reserved_meta_blocks;
>  		__entry->allocated_meta_blocks = EXT4_I(inode)->i_allocated_meta_blocks;
> +		__entry->mode	= inode->i_mode;
>  	),
>  
>  	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu freed_blocks %d reserved_data_blocks %d reserved_meta_blocks %d allocated_meta_blocks %d",



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 13/15] small_traces: Add config option to shrink trace events.
  2010-12-04  0:13 ` [PATCH 13/15] small_traces: Add config option to shrink trace events David Sharp
@ 2010-12-04  1:56   ` Steven Rostedt
  2010-12-04  2:33     ` David Sharp
  2011-03-09  0:23   ` Steven Rostedt
  1 sibling, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2010-12-04  1:56 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Frederic Weisbecker

[ Added Frederic on Cc ]

I'd rather have this as a run-time option than a compile-time one.

You could use two different trace event structures. The normal one and a
"compact" one. Add to the "trace_options" and when it is set the traces
are compact.

-- Steve


On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> This option will remove several fields from trace events to reduce the
> payload size. Under heavy load, events can be produced faster than they
> can possibly be written to disk or sent on the network. Shrinking the
> payload size will enable fitting more traces in the ring buffer, which
> therefore enables tracing for longer periods, at the cost of losing some
> relatively unimportant details.
> 
> Google-Bug-Id: 3224547
> 
> Signed-off-by: David Sharp <dhsharp@google.com>
> ---
>  kernel/trace/Kconfig |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index ea37e2f..9c24fe5 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -403,6 +403,18 @@ config FUNCTION_PROFILER
>  
>  	  If in doubt, say N.
>  
> +config SMALL_TRACES
> +	bool "Shrink trace events"
> +	default n
> +	help
> +	  Minimize the size of trace events. Omits fields from trace event
> +	  header like preempt_count, and lock_depth. Chooses smaller
> +	  alternative event structures. Shrinking the payload size enables
> +	  fitting more traces in the ring buffer, which enables tracing for
> +	  longer periods when the buffer can not be read as fast as it is
> +	  filled, at the cost of losing some details.
> +
> +
>  config FTRACE_MCOUNT_RECORD
>  	def_bool y
>  	depends on DYNAMIC_FTRACE




* Re: [PATCH 01/15] tracing: Add a 'buffer_overwrite' debugfs file
  2010-12-04  0:13 ` [PATCH 01/15] tracing: Add a 'buffer_overwrite' debugfs file David Sharp
@ 2010-12-04  1:57   ` Steven Rostedt
  2010-12-08 20:15     ` David Sharp
  0 siblings, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2010-12-04  1:57 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Jiaying Zhang

On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> From: Jiaying Zhang <jiayingz@google.com>
> 
> Add a "buffer_overwrite" debugfs file for ftrace to control whether the buffer

Instead of adding a new file, make this another "trace_option".

Thanks,

-- Steve


> should be overwritten on overflow or not. The default remains to overwrite old
> events when the buffer is full. This patch adds the option to instead discard
> newest events when the buffer is full. This is useful to get a snapshot of
> traces just after enabling traces. Dropping the current event is also a simpler
> code path.
> 
> Signed-off-by: David Sharp <dhsharp@google.com>
> ---
>  Documentation/trace/ftrace.txt |    6 ++++
>  include/linux/ring_buffer.h    |    2 +
>  kernel/trace/ring_buffer.c     |   11 +++++++
>  kernel/trace/trace.c           |   59 +++++++++++++++++++++++++++++++++++++---
>  4 files changed, 74 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
> index 557c1ed..9237da3 100644
> --- a/Documentation/trace/ftrace.txt
> +++ b/Documentation/trace/ftrace.txt
> @@ -138,6 +138,12 @@ of ftrace. Here is a list of some of the key files:
>  	This can only be updated when the current_tracer
>  	is set to "nop".
>  
> +  buffer_overwrite:
> +
> +	This controls what happens when the trace buffer is full.
> +	If "1" (default), the oldest events are discarded and 
> +	overwritten. If "0", then the newest events are discarded.
> +
>    tracing_cpumask:
>  
>  	This is a mask that lets the user only trace
> diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
> index 8d3a248..ab38ac8 100644
> --- a/include/linux/ring_buffer.h
> +++ b/include/linux/ring_buffer.h
> @@ -100,6 +100,8 @@ void ring_buffer_free(struct ring_buffer *buffer);
>  
>  int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size);
>  
> +void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val);
> +
>  struct ring_buffer_event *ring_buffer_lock_reserve(struct ring_buffer *buffer,
>  						   unsigned long length);
>  int ring_buffer_unlock_commit(struct ring_buffer *buffer,
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 9ed509a..3207147 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -1429,6 +1429,17 @@ int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size)
>  }
>  EXPORT_SYMBOL_GPL(ring_buffer_resize);
>  
> +void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val)
> +{
> +	mutex_lock(&buffer->mutex);
> +	if (val)
> +		buffer->flags |= RB_FL_OVERWRITE;
> +	else
> +		buffer->flags &= ~RB_FL_OVERWRITE;
> +	mutex_unlock(&buffer->mutex);
> +}
> +EXPORT_SYMBOL_GPL(ring_buffer_change_overwrite);
> +
>  static inline void *
>  __rb_data_page_index(struct buffer_data_page *bpage, unsigned index)
>  {
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index c380612..ed5c14f 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -41,8 +41,6 @@
>  #include "trace.h"
>  #include "trace_output.h"
>  
> -#define TRACE_BUFFER_FLAGS	(RB_FL_OVERWRITE)
> -
>  /*
>   * On boot up, the ring buffer is set to the minimum size, so that
>   * we do not waste memory on systems that are not using tracing.
> @@ -241,6 +239,9 @@ int tracing_is_enabled(void)
>  
>  static unsigned long		trace_buf_size = TRACE_BUF_SIZE_DEFAULT;
>  
> +/* whether the trace buffer should be overwritten on overflow or not. */
> +static enum ring_buffer_flags  trace_buffer_flags = RB_FL_OVERWRITE;
> +
>  /* trace_types holds a link list of available tracers. */
>  static struct tracer		*trace_types __read_mostly;
>  
> @@ -3466,6 +3467,47 @@ tracing_entries_write(struct file *filp, const char __user *ubuf,
>  	return cnt;
>  }
>  
> +static ssize_t
> +tracing_overwrite_read(struct file *filp, char __user *ubuf,
> +		     size_t cnt, loff_t *ppos)
> +{
> +	char buf[64];
> +	int r;
> +	r = snprintf(buf, 64, "%u\n", trace_buffer_flags);
> +	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
> +}
> +
> +static ssize_t
> +tracing_overwrite_write(struct file *filp, const char __user *ubuf,
> +		      size_t cnt, loff_t *ppos)
> +{
> +	unsigned long val;
> +	char buf[64];
> +	int ret;
> +
> +	if (cnt >= sizeof(buf))
> +		return -EINVAL;
> +
> +	if (copy_from_user(&buf, ubuf, cnt))
> +		return -EFAULT;
> +
> +	buf[cnt] = 0;
> +
> +	ret = strict_strtoul(buf, 10, &val);
> +	if (ret < 0)
> +		return ret;
> +	*ppos += cnt;
> +
> +	if (val != 0 && val != 1)
> +		return -EINVAL;
> +
> +	if (trace_buffer_flags != val) {
> +		trace_buffer_flags = val;
> +		ring_buffer_change_overwrite(global_trace.buffer, val);
> +	}
> +	return cnt;
> +}
> +
>  static int mark_printk(const char *fmt, ...)
>  {
>  	int ret;
> @@ -3611,6 +3653,12 @@ static const struct file_operations tracing_entries_fops = {
>  	.llseek		= generic_file_llseek,
>  };
>  
> +static const struct file_operations tracing_overwrite_fops = {
> +	.open		= tracing_open_generic,
> +	.read		= tracing_overwrite_read,
> +	.write		= tracing_overwrite_write,
> +};
> +
>  static const struct file_operations tracing_mark_fops = {
>  	.open		= tracing_open_generic,
>  	.write		= tracing_mark_write,
> @@ -4336,6 +4384,9 @@ static __init int tracer_init_debugfs(void)
>  	trace_create_file("buffer_size_kb", 0644, d_tracer,
>  			&global_trace, &tracing_entries_fops);
>  
> +	trace_create_file("buffer_overwrite", 0644, d_tracer,
> +			&global_trace, &tracing_overwrite_fops);
> +
>  	trace_create_file("trace_marker", 0220, d_tracer,
>  			NULL, &tracing_mark_fops);
>  
> @@ -4565,7 +4616,7 @@ __init static int tracer_alloc_buffers(void)
>  
>  	/* TODO: make the number of buffers hot pluggable with CPUS */
>  	global_trace.buffer = ring_buffer_alloc(ring_buf_size,
> -						   TRACE_BUFFER_FLAGS);
> +						   trace_buffer_flags);
>  	if (!global_trace.buffer) {
>  		printk(KERN_ERR "tracer: failed to allocate ring buffer!\n");
>  		WARN_ON(1);
> @@ -4575,7 +4626,7 @@ __init static int tracer_alloc_buffers(void)
>  
> 
>  #ifdef CONFIG_TRACER_MAX_TRACE
> -	max_tr.buffer = ring_buffer_alloc(1, TRACE_BUFFER_FLAGS);
> +	max_tr.buffer = ring_buffer_alloc(1, trace_buffer_flags);
>  	if (!max_tr.buffer) {
>  		printk(KERN_ERR "tracer: failed to allocate max ring buffer!\n");
>  		WARN_ON(1);




* Re: [PATCH 13/15] small_traces: Add config option to shrink trace events.
  2010-12-04  1:56   ` Steven Rostedt
@ 2010-12-04  2:33     ` David Sharp
  2010-12-04  2:54       ` Steven Rostedt
  0 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2010-12-04  2:33 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, mrubin, Frederic Weisbecker

I considered that, and I generally think it's a good idea. However, I
also want to use this switch to shrink individual tracepoint event
structures.

eg: sched switch is a high frequency event and it is 68 bytes (60
after these patches)

Can you suggest a syntax for TRACE_EVENT, DECLARE_EVENT_CLASS, etc,
that could express the two versions and produce the right code?

I'm worried about adding even further complexity to the TRACE_EVENT
macros. I could add TRACE_EVENT_SMALL that takes two versions of
TP_STRUCT__entry, TP_fast_assign, and TP_printk each, but then this
will need to be permuted with your TP_CONDITIONAL patches as well.

Thanks,
d#

On Fri, Dec 3, 2010 at 5:56 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> [ Added Frederic on Cc ]
>
> I'd rather have this as a run-time option than a compile-time one.
>
> You could use two different trace event structures. The normal one and a
> "compact" one. Add to the "trace_options" and when it is set the traces
> are compact.
>
> -- Steve
>
>
> On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
>> This option will remove several fields from trace events to reduce the
>> payload size. Under heavy load, events can be produced faster than they
>> can possibly be written to disk or sent on the network. Shrinking the
>> payload size will enable fitting more traces in the ring buffer, which
>> therefore enables tracing for longer periods, at the cost of losing some
>> relatively unimportant details.
>>
>> Google-Bug-Id: 3224547
>>
>> Signed-off-by: David Sharp <dhsharp@google.com>
>> ---
>>  kernel/trace/Kconfig |   12 ++++++++++++
>>  1 files changed, 12 insertions(+), 0 deletions(-)
>>
>> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
>> index ea37e2f..9c24fe5 100644
>> --- a/kernel/trace/Kconfig
>> +++ b/kernel/trace/Kconfig
>> @@ -403,6 +403,18 @@ config FUNCTION_PROFILER
>>
>>         If in doubt, say N.
>>
>> +config SMALL_TRACES
>> +     bool "Shrink trace events"
>> +     default n
>> +     help
>> +       Minimize the size of trace events. Omits fields from trace event
>> +       header like preempt_count, and lock_depth. Chooses smaller
>> +       alternative event structures. Shrinking the payload size enables
>> +       fitting more traces in the ring buffer, which enables tracing for
>> +       longer periods when the buffer can not be read as fast as it is
>> +       filled, at the cost of losing some details.
>> +
>> +
>>  config FTRACE_MCOUNT_RECORD
>>       def_bool y
>>       depends on DYNAMIC_FTRACE
>
>
>


* Re: [PATCH 13/15] small_traces: Add config option to shrink trace events.
  2010-12-04  2:33     ` David Sharp
@ 2010-12-04  2:54       ` Steven Rostedt
  2010-12-09 14:55         ` Frederic Weisbecker
  0 siblings, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2010-12-04  2:54 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Frederic Weisbecker

On Fri, 2010-12-03 at 18:33 -0800, David Sharp wrote:
> I considered that, and I generally think it's a good idea. However, I
> also want to use this switch to shrink individual tracepoint event
> structures.
> 
> eg: sched switch is a high frequency event and it is 68 bytes (60
> after these patches)
> 
> Can you suggest a syntax for TRACE_EVENT, DECLARE_EVENT_CLASS, etc,
> that could express the two versions and produce the right code?
> 
> I'm worried about adding even further complexity to the TRACE_EVENT
> macros. I could add TRACE_EVENT_SMALL that takes two versions of
> TP_STRUCT__entry, TP_fast_assign, and TP_printk each, but then this
> will need to be permuted with your TP_CONDITIONAL patches as well.

I would not touch the TRACE_EVENT() structures. They are there as is and
I would not think about changing them. Something like that would never
make it into mainline.

Now what you can do is make your own events based off the same
tracepoints. For example, the tracepoint behind TRACE_EVENT(sched_switch...)
is called in sched.c as:

	trace_sched_switch(prev, next);


You could even write a module that does something like this:

	register_trace_sched_switch(probe_sched_switch, mydata);



void probe_sched_switch(void *mydata,
		struct task_struct *prev,
		struct task_struct *next)
{
	struct ring_buffer *buffer;
	struct ring_buffer_event *event;
	struct myentry *entry;

	event = trace_current_buffer_lock_reserve(buffer,
			mytype, sizeof(*entry),
			0, 0);

	if (!event)
		return;

	entry = ring_buffer_event_data(event);

	entry->myfield = prev->x;
	...

	trace_nowake_buffer_unlock_commit(buffer, event,
					0, 0);
}

You will need to do a register_ftrace_event() to register that 'mytype'
and how to output it. Otherwise it would just be ignored in the "trace"
file.

All of the above would work fine as a loadable module that you could
easily maintain out of tree, and still uses the internals of the system.

-- Steve




* Re: [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall
  2010-12-04  0:13 ` [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall David Sharp
  2010-12-04  1:49   ` Steven Rostedt
@ 2010-12-04  8:11   ` Avi Kivity
  2010-12-06 20:38     ` David Sharp
  2011-03-11  9:49   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
  2 siblings, 1 reply; 74+ messages in thread
From: Avi Kivity @ 2010-12-04  8:11 UTC (permalink / raw)
  To: David Sharp; +Cc: rostedt, linux-kernel, mrubin, kvm-devel

On 12/04/2010 02:13 AM, David Sharp wrote:
> Signed-off-by: David Sharp<dhsharp@google.com>
> ---
>   arch/x86/kvm/trace.h |    8 ++++----
>   1 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> index a6544b8..ab41fb0 100644
> --- a/arch/x86/kvm/trace.h
> +++ b/arch/x86/kvm/trace.h
> @@ -62,21 +62,21 @@ TRACE_EVENT(kvm_hv_hypercall,
>   	TP_ARGS(code, fast, rep_cnt, rep_idx, ingpa, outgpa),
>
>   	TP_STRUCT__entry(
> -		__field(	__u16, 		code		)
> -		__field(	bool,		fast		)
>   		__field(	__u16,		rep_cnt		)
>   		__field(	__u16,		rep_idx		)
>   		__field(	__u64,		ingpa		)
>   		__field(	__u64,		outgpa		)
> +		__field(	__u16, 		code		)
> +		__field(	bool,		fast		)
>   	),
>

Looks like a pessimisation.

Before: 24 bytes
After: 32 bytes

(on a 64-bit machine, assuming no packing)

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



* Re: [PATCH 09/15] ftrace: fix event alignment: skb:kfree_skb
  2010-12-04  1:52   ` Steven Rostedt
@ 2010-12-04 13:38     ` Neil Horman
  0 siblings, 0 replies; 74+ messages in thread
From: Neil Horman @ 2010-12-04 13:38 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: David Sharp, linux-kernel, mrubin

On Fri, Dec 03, 2010 at 08:52:05PM -0500, Steven Rostedt wrote:
> [ Ccing Neil for Acked-by ]
> 
> -- Steve
> 
> 
Looks good, thanks!
Acked-by: Neil Horman <nhorman@tuxdriver.com>


> On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> > Signed-off-by: David Sharp <dhsharp@google.com>
> > ---
> >  include/trace/events/skb.h |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
> > index 75ce9d5..aa4a56a 100644
> > --- a/include/trace/events/skb.h
> > +++ b/include/trace/events/skb.h
> > @@ -19,16 +19,16 @@ TRACE_EVENT(kfree_skb,
> >  
> >  	TP_STRUCT__entry(
> >  		__field(	void *,		skbaddr		)
> > -		__field(	unsigned short,	protocol	)
> >  		__field(	void *,		location	)
> > +		__field(	unsigned short,	protocol	)
> >  	),
> >  
> >  	TP_fast_assign(
> >  		__entry->skbaddr = skb;
> > +		__entry->location = location;
> >  		if (skb) {
> >  			__entry->protocol = ntohs(skb->protocol);
> >  		}
> > -		__entry->location = location;
> >  	),
> >  
> >  	TP_printk("skbaddr=%p protocol=%u location=%p",
> 
> 
> 


* Re: [PATCH 06/15] ftrace: fix event alignment: module:module_request
  2010-12-04  1:47   ` Steven Rostedt
@ 2010-12-06  1:28     ` Li Zefan
  0 siblings, 0 replies; 74+ messages in thread
From: Li Zefan @ 2010-12-06  1:28 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: David Sharp, linux-kernel, mrubin

Steven Rostedt wrote:
> [ Ccing Li for an Acked-by. He wrote this event ]
> 
> -- Steve
> 
> 
> On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
>> Signed-off-by: David Sharp <dhsharp@google.com>

Acked-by: Li Zefan <lizf@cn.fujitsu.com>

>> ---
>>  include/trace/events/module.h |    5 ++---
>>  1 files changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/trace/events/module.h b/include/trace/events/module.h
>> index c7bb2f0..8154f5f 100644
>> --- a/include/trace/events/module.h
>> +++ b/include/trace/events/module.h
>> @@ -98,14 +98,14 @@ TRACE_EVENT(module_request,
>>  	TP_ARGS(name, wait, ip),
>>  
>>  	TP_STRUCT__entry(
>> -		__field(	bool,		wait		)
>>  		__field(	unsigned long,	ip		)
>> +		__field(	bool,		wait		)
>>  		__string(	name,		name		)
>>  	),
>>  
>>  	TP_fast_assign(
>> -		__entry->wait	= wait;
>>  		__entry->ip	= ip;
>> +		__entry->wait	= wait;
>>  		__assign_str(name, name);
>>  	),
>>  
>> @@ -119,4 +119,3 @@ TRACE_EVENT(module_request,
>>  
>>  /* This part must be outside protection */
>>  #include <trace/define_trace.h>
>> -
> 
> 
> 


* Re: [Patch 00/15] Reduce tracing payload size.
  2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
                   ` (14 preceding siblings ...)
  2010-12-04  0:13 ` [PATCH 15/15] small_traces: Remove 8 bytes from trace_entry David Sharp
@ 2010-12-06 13:22 ` Andi Kleen
  2010-12-06 13:56   ` Ted Ts'o
  15 siblings, 1 reply; 74+ messages in thread
From: Andi Kleen @ 2010-12-06 13:22 UTC (permalink / raw)
  To: David Sharp; +Cc: rostedt, linux-kernel, mrubin

David Sharp <dhsharp@google.com> writes:
>
> These patches are just a start to shrinking the size of trace events. I am
> planning to also make small versions of trace events that are used when
> CONFIG_SMALL_TRACES is enabled. I'm also open to discussion of other ways to
> shrink event sizes.

Maybe the simplest would be to lzo them as they get logged?
I assume you already considered that?

-Andi



* Re: [Patch 00/15] Reduce tracing payload size.
  2010-12-06 13:22 ` [Patch 00/15] Reduce tracing payload size Andi Kleen
@ 2010-12-06 13:56   ` Ted Ts'o
  2010-12-06 14:58     ` Andi Kleen
  0 siblings, 1 reply; 74+ messages in thread
From: Ted Ts'o @ 2010-12-06 13:56 UTC (permalink / raw)
  To: Andi Kleen; +Cc: David Sharp, rostedt, linux-kernel, mrubin

On Mon, Dec 06, 2010 at 02:22:00PM +0100, Andi Kleen wrote:
> David Sharp <dhsharp@google.com> writes:
> >
> > These patches are just a start to shrinking the size of trace events. I am
> > planning to also make small versions of trace events that are used when
> > CONFIG_SMALL_TRACES is enabled. I'm also open to discussion of other ways to
> > shrink event sizes.
> 
> Maybe the simplest would be to lzo them as they get logged?
> I assume you already considered that?

The tracing subsystem is supposed to be extremely low-overhead.
Compressing the event log would add considerable CPU overhead.  If we
can shrink the trace events by being more careful about unnecessary
padding, and not logging fields that aren't needed, it's all win....

The problem of course with the latter is that different people will
have disagreements about what is needed or not.  And even adding
conditionals will slow down the tracing hotpath.

						- Ted


* Re: [Patch 00/15] Reduce tracing payload size.
  2010-12-06 13:56   ` Ted Ts'o
@ 2010-12-06 14:58     ` Andi Kleen
  2010-12-06 16:17       ` Steven Rostedt
  2010-12-06 16:31       ` Miguel Ojeda
  0 siblings, 2 replies; 74+ messages in thread
From: Andi Kleen @ 2010-12-06 14:58 UTC (permalink / raw)
  To: Ted Ts'o, Andi Kleen, David Sharp, rostedt, linux-kernel, mrubin

On Mon, Dec 06, 2010 at 08:56:37AM -0500, Ted Ts'o wrote:
> On Mon, Dec 06, 2010 at 02:22:00PM +0100, Andi Kleen wrote:
> > David Sharp <dhsharp@google.com> writes:
> > >
> > > These patches are just a start to shrinking the size of trace events. I am
> > > planning to also make small versions of trace events that are used when
> > > CONFIG_SMALL_TRACES is enabled. I'm also open to discussion of other ways to
> > > shrink event sizes.
> > 
> > Maybe the simplest would be to lzo them as they get logged?
> > I assume you already considered that?
> 
> The tracing subsystem is supposed to be extremely low-overhead.
> Compressiong the event log would add considerable CPU overhead.  If we

lzo as a rule of thumb is about as expensive as 3x memcpy().
That's not really expensive.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [Patch 00/15] Reduce tracing payload size.
  2010-12-06 14:58     ` Andi Kleen
@ 2010-12-06 16:17       ` Steven Rostedt
  2010-12-06 16:31       ` Miguel Ojeda
  1 sibling, 0 replies; 74+ messages in thread
From: Steven Rostedt @ 2010-12-06 16:17 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ted Ts'o, David Sharp, linux-kernel, mrubin

On Mon, 2010-12-06 at 15:58 +0100, Andi Kleen wrote:
> On Mon, Dec 06, 2010 at 08:56:37AM -0500, Ted Ts'o wrote:
> > On Mon, Dec 06, 2010 at 02:22:00PM +0100, Andi Kleen wrote:
> > > David Sharp <dhsharp@google.com> writes:
> > > >
> > > > These patches are just a start to shrinking the size of trace events. I am
> > > > planning to also make small versions of trace events that are used when
> > > > CONFIG_SMALL_TRACES is enabled. I'm also open to discussion of other ways to
> > > > shrink event sizes.
> > > 
> > > Maybe the simplest would be to lzo them as they get logged?
> > > I assume you already considered that?
> > 
> > The tracing subsystem is supposed to be extremely low-overhead.
> > Compressing the event log would add considerable CPU overhead.  If we
> 
> lzo as a rule of thumb is about as expensive as 3x memcpy()
> That's not really expensive.

Usually the most expensive thing in the trace is the memcpy(). Which is
why we try to do it only once.

-- Steve




* Re: [Patch 00/15] Reduce tracing payload size.
  2010-12-06 14:58     ` Andi Kleen
  2010-12-06 16:17       ` Steven Rostedt
@ 2010-12-06 16:31       ` Miguel Ojeda
  2010-12-06 16:41         ` Andi Kleen
  1 sibling, 1 reply; 74+ messages in thread
From: Miguel Ojeda @ 2010-12-06 16:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ted Ts'o, David Sharp, rostedt, linux-kernel, mrubin

On Mon, Dec 6, 2010 at 3:58 PM, Andi Kleen <andi@firstfloor.org> wrote:
> On Mon, Dec 06, 2010 at 08:56:37AM -0500, Ted Ts'o wrote:
>> On Mon, Dec 06, 2010 at 02:22:00PM +0100, Andi Kleen wrote:
>> > David Sharp <dhsharp@google.com> writes:
>> > >
>> > > These patches are just a start to shrinking the size of trace events. I am
>> > > planning to also make small versions of trace events that are used when
>> > > CONFIG_SMALL_TRACES is enabled. I'm also open to discussion of other ways to
>> > > shrink event sizes.
>> >
>> > Maybe the simplest would be to lzo them as they get logged?
>> > I assume you already considered that?
>>
>> The tracing subsystem is supposed to be extremely low-overhead.
>> Compressing the event log would add considerable CPU overhead.  If we
>
> lzo as a rule of thumb is about as expensive as 3x memcpy()
> That's not really expensive.

That is true for the decompression step but not for the compression
one, which takes more than 10 memcpys().

>
> -Andi
>
> --
> ak@linux.intel.com -- Speaking for myself only.


* Re: [Patch 00/15] Reduce tracing payload size.
  2010-12-06 16:31       ` Miguel Ojeda
@ 2010-12-06 16:41         ` Andi Kleen
  0 siblings, 0 replies; 74+ messages in thread
From: Andi Kleen @ 2010-12-06 16:41 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Andi Kleen, Ted Ts'o, David Sharp, rostedt, linux-kernel, mrubin

> That is true for the decompression step but not for the compression
> one, which takes more than 10 memcpys().

That's a good point. At 10x it's definitely too slow. Too bad, would have been too easy.

I guess one could still consider some very cheap, data-specific delta
compression (I assume there's a lot of redundancy in a tracing stream,
with similar events occurring in a row or nearby). I used this technique
successfully with large log files in the past; overall, moving less data
around ends up being cheaper.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall
  2010-12-04  8:11   ` Avi Kivity
@ 2010-12-06 20:38     ` David Sharp
  2010-12-07  9:22       ` Avi Kivity
  0 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2010-12-06 20:38 UTC (permalink / raw)
  To: Avi Kivity; +Cc: rostedt, linux-kernel, mrubin, kvm-devel


On Sat, Dec 4, 2010 at 12:11 AM, Avi Kivity <avi@redhat.com> wrote:
> On 12/04/2010 02:13 AM, David Sharp wrote:
>>
>> Signed-off-by: David Sharp<dhsharp@google.com>
>> ---
>>  arch/x86/kvm/trace.h |    8 ++++----
>>  1 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
>> index a6544b8..ab41fb0 100644
>> --- a/arch/x86/kvm/trace.h
>> +++ b/arch/x86/kvm/trace.h
>> @@ -62,21 +62,21 @@ TRACE_EVENT(kvm_hv_hypercall,
>>        TP_ARGS(code, fast, rep_cnt, rep_idx, ingpa, outgpa),
>>
>>        TP_STRUCT__entry(
>> -               __field(        __u16,          code            )
>> -               __field(        bool,           fast            )
>>                __field(        __u16,          rep_cnt         )
>>                __field(        __u16,          rep_idx         )
>>                __field(        __u64,          ingpa           )
>>                __field(        __u64,          outgpa          )
>> +               __field(        __u16,          code            )
>> +               __field(        bool,           fast            )
>>        ),
>>
>
> Looks like a pessimisation.
>
> Before: 24 bytes
> After: 32 bytes
>
> (on a 64-bit machine, assuming no packing)

This patch is predicated on packing the event structures. And since
the ring buffer is 32-bit addressable, I don't attempt to improve
alignment beyond 32-bit boundaries.

>
> --
> I have a truly marvellous patch that fixes the bug which this
> signature is too narrow to contain.
>
>


* Re: [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall
  2010-12-06 20:38     ` David Sharp
@ 2010-12-07  9:22       ` Avi Kivity
  2010-12-07 21:16         ` David Sharp
  0 siblings, 1 reply; 74+ messages in thread
From: Avi Kivity @ 2010-12-07  9:22 UTC (permalink / raw)
  To: David Sharp; +Cc: rostedt, linux-kernel, mrubin, kvm-devel

On 12/06/2010 10:38 PM, David Sharp wrote:
> On Sat, Dec 4, 2010 at 12:11 AM, Avi Kivity<avi@redhat.com>  wrote:
> >  On 12/04/2010 02:13 AM, David Sharp wrote:
> >>
> >>  Signed-off-by: David Sharp<dhsharp@google.com>
> >>  ---
> >>    arch/x86/kvm/trace.h |    8 ++++----
> >>    1 files changed, 4 insertions(+), 4 deletions(-)
> >>
> >>  diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> >>  index a6544b8..ab41fb0 100644
> >>  --- a/arch/x86/kvm/trace.h
> >>  +++ b/arch/x86/kvm/trace.h
> >>  @@ -62,21 +62,21 @@ TRACE_EVENT(kvm_hv_hypercall,
> >>          TP_ARGS(code, fast, rep_cnt, rep_idx, ingpa, outgpa),
> >>
> >>          TP_STRUCT__entry(
> >>  -               __field(        __u16,          code            )
> >>  -               __field(        bool,           fast            )
> >>                  __field(        __u16,          rep_cnt         )
> >>                  __field(        __u16,          rep_idx         )
> >>                  __field(        __u64,          ingpa           )
> >>                  __field(        __u64,          outgpa          )
> >>  +               __field(        __u16,          code            )
> >>  +               __field(        bool,           fast            )
> >>          ),
> >>
> >
> >  Looks like a pessimisation.
> >
> >  Before: 24 bytes
> >  After: 32 bytes
> >
> >  (on a 64-bit machine, assuming no packing)
>
> This patch is predicated on packing the event structures. And since
> the ring buffer is 32-bit addressable, I don't attempt to improve
> alignment beyond 32-bit boundaries.

I don't understand this.  Can you elaborate?  What does "32-bit 
addressable" mean?  And "predicated on packing the event structures"?  
Is the structure __attribute__((packed)), or is it not?

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall
  2010-12-07  9:22       ` Avi Kivity
@ 2010-12-07 21:16         ` David Sharp
  2010-12-08  9:18           ` Avi Kivity
  0 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2010-12-07 21:16 UTC (permalink / raw)
  To: Avi Kivity; +Cc: rostedt, linux-kernel, mrubin, kvm-devel

On Tue, Dec 7, 2010 at 1:22 AM, Avi Kivity <avi@redhat.com> wrote:
> On 12/06/2010 10:38 PM, David Sharp wrote:
>> On Sat, Dec 4, 2010 at 12:11 AM, Avi Kivity<avi@redhat.com>  wrote:
>> >  On 12/04/2010 02:13 AM, David Sharp wrote:
>> >>
>> >>  Signed-off-by: David Sharp<dhsharp@google.com>
>> >>  ---
>> >>    arch/x86/kvm/trace.h |    8 ++++----
>> >>    1 files changed, 4 insertions(+), 4 deletions(-)
>> >>
>> >>  diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
>> >>  index a6544b8..ab41fb0 100644
>> >>  --- a/arch/x86/kvm/trace.h
>> >>  +++ b/arch/x86/kvm/trace.h
>> >>  @@ -62,21 +62,21 @@ TRACE_EVENT(kvm_hv_hypercall,
>> >>          TP_ARGS(code, fast, rep_cnt, rep_idx, ingpa, outgpa),
>> >>
>> >>          TP_STRUCT__entry(
>> >>  -               __field(        __u16,          code            )
>> >>  -               __field(        bool,           fast            )
>> >>                  __field(        __u16,          rep_cnt         )
>> >>                  __field(        __u16,          rep_idx         )
>> >>                  __field(        __u64,          ingpa           )
>> >>                  __field(        __u64,          outgpa          )
>> >>  +               __field(        __u16,          code            )
>> >>  +               __field(        bool,           fast            )
>> >>          ),
>> >>
>> >
>> >  Looks like a pessimisation.
>> >
>> >  Before: 24 bytes
>> >  After: 32 bytes
>> >
>> >  (on a 64-bit machine, assuming no packing)
>>
>> This patch is predicated on packing the event structures. And since
>> the ring buffer is 32-bit addressable, I don't attempt to improve
>> alignment beyond 32-bit boundaries.
>
> I don't understand this.  Can you elaborate?  What does "32-bit addressable"
> mean?

The ring buffer gives you space that is a multiple of 4 bytes in
length, and 32-bit aligned. Therefore it is useless to attempt to
align the structure beyond 32-bit boundaries, e.g. to a 64-bit
boundary, because it is unpredictable whether the memory the
structure will be written to starts at a 64-bit boundary (addr % 8
could be 0 or 4).

> And "predicated on packing the event structures"?  Is the structure
> __attribute__((packed)), or is it not?

It is not packed in Linus' tree, but one of the patches before this
patch in this patch series adds __attribute__((packed)). This patch
assumes that the event packing patch has been applied. This patch
should not be applied if the packing patch is not (hence,
"predicated").

>
> --
> error compiling committee.c: too many arguments to function
>
>


* Re: [PATCH 03/15] ring_buffer: Align buffer_page struct allocations only to fit the flags.
  2010-12-04  1:43   ` Steven Rostedt
@ 2010-12-07 22:44     ` David Sharp
  0 siblings, 0 replies; 74+ messages in thread
From: David Sharp @ 2010-12-07 22:44 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, mrubin

On Fri, Dec 3, 2010 at 5:43 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
>> buffer_page structs need to be aligned to 4 byte boundaries because the page
>> flags are stored in the two least-significant bits of the pointers in the page
>> list. Aligning to cache lines is sufficient, but doesn't seem to be necessary.
>> Reducing the alignment to only 4 bytes may improve cache efficiency.
>
> The reason for the cache line boundaries is that the bpages are per cpu,
> and if they are allocated next to some variable that gets changed
> globally, or even next to another bpage that is used for another cpu's
> buffer, then I would expect we would get some cache line bouncing.

I see. That makes sense. I'm not attached to this patch, we can drop it.

> Perhaps we should make bpage have its own slab allocation
> (kmem_cache_alloc). And perhaps even try to keep all bpages that share
> the same cache line on the same cpu buffer.

Maybe. Although, I was thinking of some patches to keep pages in a
free-pool, so that when a cpu buffer needs a new page, it takes from
the pool, and when a reader is done with a page, it returns to the
free-pool. (This helps with the situation where more events occur on
one CPU than others: pages are not locked to a CPU that isn't
producing events.) In that case, it would be impossible to predict
what cpu a bpage belongs to. But since this is theoretical, maybe it's
not relevant.


* Re: [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall
  2010-12-07 21:16         ` David Sharp
@ 2010-12-08  9:18           ` Avi Kivity
  2011-03-08 23:55             ` Steven Rostedt
  0 siblings, 1 reply; 74+ messages in thread
From: Avi Kivity @ 2010-12-08  9:18 UTC (permalink / raw)
  To: David Sharp; +Cc: rostedt, linux-kernel, mrubin, kvm-devel

On 12/07/2010 11:16 PM, David Sharp wrote:
> >
> >  I don't understand this.  Can you elaborate?  What does "32-bit addressable"
> >  mean?
>
> The ring buffer gives you space that is a multiple of 4 bytes in
> length, and 32-bit aligned. Therefore it is useless to attempt to
> align the structure beyond 32-bit boundaries, e.g. to a 64-bit
> boundary, because it is unpredictable whether the memory the
> structure will be written to starts at a 64-bit boundary (addr % 8
> could be 0 or 4).
>
> >  And "predicated on packing the event structures"?  Is the structure
> >  __attribute__((packed)), or is it not?
>
> It is not packed in Linus' tree, but one of the patches before this
> patch in this patch series adds __attribute__((packed)). This patch
> assumes that the event packing patch has been applied. This patch
> should not be applied if the packing patch is not (hence,
> "predicated").

Thanks for the explanations, it makes sense now.

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 01/15] tracing: Add a 'buffer_overwrite' debugfs file
  2010-12-04  1:57   ` Steven Rostedt
@ 2010-12-08 20:15     ` David Sharp
  2010-12-08 21:46       ` [PATCH] tracing: Add an 'overwrite' trace_option David Sharp
  0 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2010-12-08 20:15 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, mrubin, Jiaying Zhang

On Fri, Dec 3, 2010 at 5:57 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
>> From: Jiaying Zhang <jiayingz@google.com>
>>
>> Add a "buffer_overwrite" debugfs file for ftrace to control whether the buffer
>
> Instead of adding a new file, make this another "trace_option".

To do this, I had to add a special case to set_tracer_flags() to call
ring_buffer_change_overwrite. It also means adding something not
iterator-related to trace_iterator_flags. Although, maybe it's just
the name of the enum and value prefix that should change.

I'll send the new patch to you separately today.

>
> Thanks,
>
> -- Steve
>
>
>> should be overwritten on overflow or not. The default remains to overwrite old
>> events when the buffer is full. This patch adds the option to instead discard
>> newest events when the buffer is full. This is useful to get a snapshot of
>> traces just after enabling traces. Dropping the current event is also a simpler
>> code path.
>>
>> Signed-off-by: David Sharp <dhsharp@google.com>
>> ---
>>  Documentation/trace/ftrace.txt |    6 ++++
>>  include/linux/ring_buffer.h    |    2 +
>>  kernel/trace/ring_buffer.c     |   11 +++++++
>>  kernel/trace/trace.c           |   59 +++++++++++++++++++++++++++++++++++++---
>>  4 files changed, 74 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
>> index 557c1ed..9237da3 100644
>> --- a/Documentation/trace/ftrace.txt
>> +++ b/Documentation/trace/ftrace.txt
>> @@ -138,6 +138,12 @@ of ftrace. Here is a list of some of the key files:
>>       This can only be updated when the current_tracer
>>       is set to "nop".
>>
>> +  buffer_overwrite:
>> +
>> +     This controls what happens when the trace buffer is full.
>> +     If "1" (default), the oldest events are discarded and
>> +     overwritten. If "0", then the newest events are discarded.
>> +
>>    tracing_cpumask:
>>
>>       This is a mask that lets the user only trace
>> diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
>> index 8d3a248..ab38ac8 100644
>> --- a/include/linux/ring_buffer.h
>> +++ b/include/linux/ring_buffer.h
>> @@ -100,6 +100,8 @@ void ring_buffer_free(struct ring_buffer *buffer);
>>
>>  int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size);
>>
>> +void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val);
>> +
>>  struct ring_buffer_event *ring_buffer_lock_reserve(struct ring_buffer *buffer,
>>                                                  unsigned long length);
>>  int ring_buffer_unlock_commit(struct ring_buffer *buffer,
>> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
>> index 9ed509a..3207147 100644
>> --- a/kernel/trace/ring_buffer.c
>> +++ b/kernel/trace/ring_buffer.c
>> @@ -1429,6 +1429,17 @@ int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size)
>>  }
>>  EXPORT_SYMBOL_GPL(ring_buffer_resize);
>>
>> +void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val)
>> +{
>> +     mutex_lock(&buffer->mutex);
>> +     if (val)
>> +             buffer->flags |= RB_FL_OVERWRITE;
>> +     else
>> +             buffer->flags &= ~RB_FL_OVERWRITE;
>> +     mutex_unlock(&buffer->mutex);
>> +}
>> +EXPORT_SYMBOL_GPL(ring_buffer_change_overwrite);
>> +
>>  static inline void *
>>  __rb_data_page_index(struct buffer_data_page *bpage, unsigned index)
>>  {
>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>> index c380612..ed5c14f 100644
>> --- a/kernel/trace/trace.c
>> +++ b/kernel/trace/trace.c
>> @@ -41,8 +41,6 @@
>>  #include "trace.h"
>>  #include "trace_output.h"
>>
>> -#define TRACE_BUFFER_FLAGS   (RB_FL_OVERWRITE)
>> -
>>  /*
>>   * On boot up, the ring buffer is set to the minimum size, so that
>>   * we do not waste memory on systems that are not using tracing.
>> @@ -241,6 +239,9 @@ int tracing_is_enabled(void)
>>
>>  static unsigned long         trace_buf_size = TRACE_BUF_SIZE_DEFAULT;
>>
>> +/* whether the trace buffer should be overwritten on overflow or not. */
>> +static enum ring_buffer_flags  trace_buffer_flags = RB_FL_OVERWRITE;
>> +
>>  /* trace_types holds a link list of available tracers. */
>>  static struct tracer         *trace_types __read_mostly;
>>
>> @@ -3466,6 +3467,47 @@ tracing_entries_write(struct file *filp, const char __user *ubuf,
>>       return cnt;
>>  }
>>
>> +static ssize_t
>> +tracing_overwrite_read(struct file *filp, char __user *ubuf,
>> +                  size_t cnt, loff_t *ppos)
>> +{
>> +     char buf[64];
>> +     int r;
>> +     r = snprintf(buf, 64, "%u\n", trace_buffer_flags);
>> +     return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
>> +}
>> +
>> +static ssize_t
>> +tracing_overwrite_write(struct file *filp, const char __user *ubuf,
>> +                   size_t cnt, loff_t *ppos)
>> +{
>> +     unsigned long val;
>> +     char buf[64];
>> +     int ret;
>> +
>> +     if (cnt >= sizeof(buf))
>> +             return -EINVAL;
>> +
>> +     if (copy_from_user(&buf, ubuf, cnt))
>> +             return -EFAULT;
>> +
>> +     buf[cnt] = 0;
>> +
>> +     ret = strict_strtoul(buf, 10, &val);
>> +     if (ret < 0)
>> +             return ret;
>> +     *ppos += cnt;
>> +
>> +     if (val != 0 && val != 1)
>> +             return -EINVAL;
>> +
>> +     if (trace_buffer_flags != val) {
>> +             trace_buffer_flags = val;
>> +             ring_buffer_change_overwrite(global_trace.buffer, val);
>> +     }
>> +     return cnt;
>> +}
>> +
>>  static int mark_printk(const char *fmt, ...)
>>  {
>>       int ret;
>> @@ -3611,6 +3653,12 @@ static const struct file_operations tracing_entries_fops = {
>>       .llseek         = generic_file_llseek,
>>  };
>>
>> +static const struct file_operations tracing_overwrite_fops = {
>> +     .open           = tracing_open_generic,
>> +     .read           = tracing_overwrite_read,
>> +     .write          = tracing_overwrite_write,
>> +};
>> +
>>  static const struct file_operations tracing_mark_fops = {
>>       .open           = tracing_open_generic,
>>       .write          = tracing_mark_write,
>> @@ -4336,6 +4384,9 @@ static __init int tracer_init_debugfs(void)
>>       trace_create_file("buffer_size_kb", 0644, d_tracer,
>>                       &global_trace, &tracing_entries_fops);
>>
>> +     trace_create_file("buffer_overwrite", 0644, d_tracer,
>> +                     &global_trace, &tracing_overwrite_fops);
>> +
>>       trace_create_file("trace_marker", 0220, d_tracer,
>>                       NULL, &tracing_mark_fops);
>>
>> @@ -4565,7 +4616,7 @@ __init static int tracer_alloc_buffers(void)
>>
>>       /* TODO: make the number of buffers hot pluggable with CPUS */
>>       global_trace.buffer = ring_buffer_alloc(ring_buf_size,
>> -                                                TRACE_BUFFER_FLAGS);
>> +                                                trace_buffer_flags);
>>       if (!global_trace.buffer) {
>>               printk(KERN_ERR "tracer: failed to allocate ring buffer!\n");
>>               WARN_ON(1);
>> @@ -4575,7 +4626,7 @@ __init static int tracer_alloc_buffers(void)
>>
>>
>>  #ifdef CONFIG_TRACER_MAX_TRACE
>> -     max_tr.buffer = ring_buffer_alloc(1, TRACE_BUFFER_FLAGS);
>> +     max_tr.buffer = ring_buffer_alloc(1, trace_buffer_flags);
>>       if (!max_tr.buffer) {
>>               printk(KERN_ERR "tracer: failed to allocate max ring buffer!\n");
>>               WARN_ON(1);
>
>
>


* [PATCH] tracing: Add an 'overwrite' trace_option.
  2010-12-08 20:15     ` David Sharp
@ 2010-12-08 21:46       ` David Sharp
  2010-12-14  0:39         ` David Sharp
                           ` (2 more replies)
  0 siblings, 3 replies; 74+ messages in thread
From: David Sharp @ 2010-12-08 21:46 UTC (permalink / raw)
  To: rostedt; +Cc: linux-kernel, mrubin, jiayingz, David Sharp

Add an "overwrite" trace_option for ftrace to control whether the buffer should
be overwritten on overflow or not. The default remains to overwrite old events
when the buffer is full. This patch adds the option to instead discard newest
events when the buffer is full. This is useful to get a snapshot of traces just
after enabling traces. Dropping the current event is also a simpler code path.
---
 Documentation/trace/ftrace.txt |    5 +++++
 include/linux/ring_buffer.h    |    2 ++
 kernel/trace/ring_buffer.c     |   11 +++++++++++
 kernel/trace/trace.c           |   17 +++++++++++------
 kernel/trace/trace.h           |    1 +
 5 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
index 557c1ed..2842ea7 100644
--- a/Documentation/trace/ftrace.txt
+++ b/Documentation/trace/ftrace.txt
@@ -491,6 +491,11 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
                    latencies, as described in "Latency
                    trace format".
 
+  overwrite - This controls what happens when the trace buffer is
+              full. If "1" (default), the oldest events are
+              discarded and overwritten. If "0", then the newest
+              events are discarded.
+
 sched_switch
 ------------
 
diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 8d3a248..ab38ac8 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -100,6 +100,8 @@ void ring_buffer_free(struct ring_buffer *buffer);
 
 int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size);
 
+void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val);
+
 struct ring_buffer_event *ring_buffer_lock_reserve(struct ring_buffer *buffer,
 						   unsigned long length);
 int ring_buffer_unlock_commit(struct ring_buffer *buffer,
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 9ed509a..3207147 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1429,6 +1429,17 @@ int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size)
 }
 EXPORT_SYMBOL_GPL(ring_buffer_resize);
 
+void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val)
+{
+	mutex_lock(&buffer->mutex);
+	if (val)
+		buffer->flags |= RB_FL_OVERWRITE;
+	else
+		buffer->flags &= ~RB_FL_OVERWRITE;
+	mutex_unlock(&buffer->mutex);
+}
+EXPORT_SYMBOL_GPL(ring_buffer_change_overwrite);
+
 static inline void *
 __rb_data_page_index(struct buffer_data_page *bpage, unsigned index)
 {
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index c380612..6f7494c 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -41,8 +41,6 @@
 #include "trace.h"
 #include "trace_output.h"
 
-#define TRACE_BUFFER_FLAGS	(RB_FL_OVERWRITE)
-
 /*
  * On boot up, the ring buffer is set to the minimum size, so that
  * we do not waste memory on systems that are not using tracing.
@@ -340,7 +338,7 @@ static DECLARE_WAIT_QUEUE_HEAD(trace_wait);
 /* trace_flags holds trace_options default values */
 unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK |
 	TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME |
-	TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD;
+	TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD | TRACE_ITER_OVERWRITE;
 
 static int trace_stop_count;
 static DEFINE_SPINLOCK(tracing_start_lock);
@@ -425,6 +423,7 @@ static const char *trace_options[] = {
 	"sleep-time",
 	"graph-time",
 	"record-cmd",
+	"overwrite",
 	NULL
 };
 
@@ -2523,6 +2522,9 @@ static void set_tracer_flags(unsigned int mask, int enabled)
 
 	if (mask == TRACE_ITER_RECORD_CMD)
 		trace_event_enable_cmd_record(enabled);
+
+	if (mask == TRACE_ITER_OVERWRITE)
+		ring_buffer_change_overwrite(global_trace.buffer, enabled);
 }
 
 static ssize_t
@@ -4545,9 +4547,11 @@ void ftrace_dump(enum ftrace_dump_mode oops_dump_mode)
 __init static int tracer_alloc_buffers(void)
 {
 	int ring_buf_size;
+	enum ring_buffer_flags rb_flags;
 	int i;
 	int ret = -ENOMEM;
 
+
 	if (!alloc_cpumask_var(&tracing_buffer_mask, GFP_KERNEL))
 		goto out;
 
@@ -4560,12 +4564,13 @@ __init static int tracer_alloc_buffers(void)
 	else
 		ring_buf_size = 1;
 
+	rb_flags = trace_flags & TRACE_ITER_OVERWRITE ? RB_FL_OVERWRITE : 0;
+
 	cpumask_copy(tracing_buffer_mask, cpu_possible_mask);
 	cpumask_copy(tracing_cpumask, cpu_all_mask);
 
 	/* TODO: make the number of buffers hot pluggable with CPUS */
-	global_trace.buffer = ring_buffer_alloc(ring_buf_size,
-						   TRACE_BUFFER_FLAGS);
+	global_trace.buffer = ring_buffer_alloc(ring_buf_size, rb_flags);
 	if (!global_trace.buffer) {
 		printk(KERN_ERR "tracer: failed to allocate ring buffer!\n");
 		WARN_ON(1);
@@ -4575,7 +4580,7 @@ __init static int tracer_alloc_buffers(void)
 
 
 #ifdef CONFIG_TRACER_MAX_TRACE
-	max_tr.buffer = ring_buffer_alloc(1, TRACE_BUFFER_FLAGS);
+	max_tr.buffer = ring_buffer_alloc(1, rb_flags);
 	if (!max_tr.buffer) {
 		printk(KERN_ERR "tracer: failed to allocate max ring buffer!\n");
 		WARN_ON(1);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 9021f8c..9cd025a 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -606,6 +606,7 @@ enum trace_iterator_flags {
 	TRACE_ITER_SLEEP_TIME		= 0x40000,
 	TRACE_ITER_GRAPH_TIME		= 0x80000,
 	TRACE_ITER_RECORD_CMD		= 0x100000,
+	TRACE_ITER_OVERWRITE		= 0x200000,
 };
 
 /*
-- 
1.7.3.1



* Re: [PATCH 08/15] ftrace: fix event alignment: mce:mce_record
  2010-12-04  1:50   ` Steven Rostedt
@ 2010-12-09 13:33     ` Frederic Weisbecker
  0 siblings, 0 replies; 74+ messages in thread
From: Frederic Weisbecker @ 2010-12-09 13:33 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: David Sharp, linux-kernel, mrubin, Hidetoshi Seto

On Fri, Dec 03, 2010 at 08:50:22PM -0500, Steven Rostedt wrote:
> [ Ccing Hidetoshi and Frederic for Acked-by's ]

No problem for me.

Thanks.

> -- Steve
> 
> 
> On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> > Signed-off-by: David Sharp <dhsharp@google.com>
> > ---
> >  include/trace/events/mce.h |    8 ++++----
> >  1 files changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
> > index 7eee778..4cbbcef 100644
> > --- a/include/trace/events/mce.h
> > +++ b/include/trace/events/mce.h
> > @@ -17,36 +17,36 @@ TRACE_EVENT(mce_record,
> >  	TP_STRUCT__entry(
> >  		__field(	u64,		mcgcap		)
> >  		__field(	u64,		mcgstatus	)
> > -		__field(	u8,		bank		)
> >  		__field(	u64,		status		)
> >  		__field(	u64,		addr		)
> >  		__field(	u64,		misc		)
> >  		__field(	u64,		ip		)
> > -		__field(	u8,		cs		)
> >  		__field(	u64,		tsc		)
> >  		__field(	u64,		walltime	)
> >  		__field(	u32,		cpu		)
> >  		__field(	u32,		cpuid		)
> >  		__field(	u32,		apicid		)
> >  		__field(	u32,		socketid	)
> > +		__field(	u8,		cs		)
> > +		__field(	u8,		bank		)
> >  		__field(	u8,		cpuvendor	)
> >  	),
> >  
> >  	TP_fast_assign(
> >  		__entry->mcgcap		= m->mcgcap;
> >  		__entry->mcgstatus	= m->mcgstatus;
> > -		__entry->bank		= m->bank;
> >  		__entry->status		= m->status;
> >  		__entry->addr		= m->addr;
> >  		__entry->misc		= m->misc;
> >  		__entry->ip		= m->ip;
> > -		__entry->cs		= m->cs;
> >  		__entry->tsc		= m->tsc;
> >  		__entry->walltime	= m->time;
> >  		__entry->cpu		= m->extcpu;
> >  		__entry->cpuid		= m->cpuid;
> >  		__entry->apicid		= m->apicid;
> >  		__entry->socketid	= m->socketid;
> > +		__entry->cs		= m->cs;
> > +		__entry->bank		= m->bank;
> >  		__entry->cpuvendor	= m->cpuvendor;
> >  	),
> >  
> 
> 


* Re: [PATCH 13/15] small_traces: Add config option to shrink trace events.
  2010-12-04  2:54       ` Steven Rostedt
@ 2010-12-09 14:55         ` Frederic Weisbecker
  2010-12-09 15:08           ` Steven Rostedt
  0 siblings, 1 reply; 74+ messages in thread
From: Frederic Weisbecker @ 2010-12-09 14:55 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: David Sharp, linux-kernel, mrubin

On Fri, Dec 03, 2010 at 09:54:12PM -0500, Steven Rostedt wrote:
> On Fri, 2010-12-03 at 18:33 -0800, David Sharp wrote:
> > I considered that, and I generally think it's a good idea. However, I
> > also want to use this switch to shrink individual tracepoint event
> > structures.
> > 
> > eg: sched switch is a high frequency event and it is 68 bytes (60
> > after these patches)
> > 
> > Can you suggest a syntax for TRACE_EVENT, DECLARE_EVENT_CLASS, etc,
> > that could express the two versions and produce the right code?
> > 
> > I'm worried about adding even further complexity to the TRACE_EVENT
> > macros. I could add TRACE_EVENT_SMALL that takes two versions of
> > TP_STRUCT__entry, TP_fast_assign, and TP_printk each, but then this
> > will need to be permuted with your TP_CONDITIONAL patches as well.
> 
> I would not touch the TRACE_EVENT() structures. They are there as is and
> I would not think about changing them. Something like that would never
> make it into mainline.
> 
> Now what you can do, is to make your own events based off of the same
> tracepoints. For example, the TRACE_EVENT(sched_switch...) has in
> sched.c:
> 
> 	trace_sched_switch(prev, next);
> 
> 
> You could even write a module that does something like this:
> 
> 	register_trace_sched_switch(probe_sched_switch, mydata);
> 
> 
> 
> void probe_sched_switch(void *mydata,
> 		struct task_struct *prev,
> 		struct task_struct *next)
> {
> 	struct ring_buffer *buffer;
> 	struct ring_buffer_event *event;
> 	struct myentry *entry;
> 
> 	event = trace_current_buffer_lock_reserve(buffer,
> 			mytype, sizeof(*entry),
> 			0, 0);
> 
> 	if (!event)
> 		return;
> 
> 	entry = ring_buffer_event_data(event);
> 
> 	entry->myfield = prev->x;
> 	...
> 
> 	trace_nowake_buffer_unlock_commit(buffer, event,
> 					0, 0);
> }
> 
> You will need to do a register_ftrace_event() to register that 'mytype'
> and how to output it. Otherwise it would just be ignored in the "trace"
> file.
> 
> All of the above would work fine as a loadable module that you could
> easily maintain out of tree, and still uses the internals of the system.
> 
> -- Steve
> 
> 


But this would improve only Google's tracing, while this is a general
mainline tracing problem.

The first thing is that we need to get rid of the lock_depth field; the BKL
is dying.

For the rest, what about having a bitmap of the fields we want to ignore,
which can be set up from a trace file for ftrace and as an ioctl for perf?

So this bitmap is easy to implement on the common fields.

For the rest, one could choose between using TP_fast_assign()
and TP_cond_assign().

TP_fast_assign() stays as is and doesn't implement bitmap field
ignoring. Those who want conditional record will need
TP_cond_assign().
Well, unfortunately this probably requires us to play
the same trickery as SYSCALL_DEFINE() in that we'll probably
need TP_cond_assign1(), TP_cond_assign2(), TP_cond_assign3(), etc...

#define TP_cond_assignx(nr, assign)	\
	if (call->bitmask & (1 << nr)) {	\
		assign;			\
	}

#define TP_cond_assign2(nr, assign, ...)	\
	TP_cond_assignx(nr, assign)		\
	TP_cond_assign1(nr + 1, __VA_ARGS__)

#define TP_cond_assign3(nr, assign, ...)        \
        TP_cond_assignx(nr, assign)             \
        TP_cond_assign2(nr + 1, __VA_ARGS__)

That will also require a bit more trickery to dynamically
pre-compute the size of the trace entry.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 13/15] small_traces: Add config option to shrink trace events.
  2010-12-09 14:55         ` Frederic Weisbecker
@ 2010-12-09 15:08           ` Steven Rostedt
  2010-12-09 15:28             ` Frederic Weisbecker
  0 siblings, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2010-12-09 15:08 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: David Sharp, linux-kernel, mrubin, Mathieu Desnoyers

On Thu, 2010-12-09 at 15:55 +0100, Frederic Weisbecker wrote:
> On Fri, Dec 03, 2010 at 09:54:12PM -0500, Steven Rostedt wrote:
> > On Fri, 2010-12-03 at 18:33 -0800, David Sharp wrote:
> > > I considered that, and I generally think it's a good idea. However, I
> > > also want to use this switch to shrink individual tracepoint event
> > > structures.
> > > 
> > > eg: sched switch is a high frequency event and it is 68 bytes (60
> > > after these patches)
> > > 
> > > Can you suggest a syntax for TRACE_EVENT, DECLARE_EVENT_CLASS, etc,
> > > that could express the two versions and produce the right code?
> > > 
> > > I'm worried about adding even further complexity to the TRACE_EVENT
> > > macros. I could add TRACE_EVENT_SMALL that takes two versions of
> > > TP_STRUCT__entry, TP_fast_assign, and TP_printk each, but then this
> > > will need to be permuted with your TP_CONDITIONAL patches as well.
> > 
> > I would not touch the TRACE_EVENT() structures. They are there as is and
> > I would not think about changing them. Something like that would never
> > make it into mainline.
> > 
> > Now what you can do, is to make your own events based off of the same
> > tracepoints. For example, the TRACE_EVENT(sched_switch...) has in
> > sched.c:
> > 
> > 	trace_sched_switch(prev, next);
> > 
> > 
> > You could even write a module that does something like this:
> > 
> > 	register_trace_sched_switch(probe_sched_switch, mydata);
> > 
> > 
> > 
> > void probe_sched_switch(void *mydata,
> > 		struct task_struct *prev,
> > 		struct task_struct *next)
> > {
> > 	struct ring_buffer *buffer;
> > 	struct ring_buffer_event *event;
> > 	struct myentry *entry;
> > 
> > 	event = trace_current_buffer_lock_reserve(buffer,
> > 			mytype, sizeof(*entry),
> > 			0, 0);
> > 
> > 	if (!event)
> > 		return;
> > 
> > 	entry = ring_buffer_event_data(event);
> > 
> > 	entry->myfield = prev->x;
> > 	...
> > 
> > 	trace_nowake_buffer_unlock_commit(buffer, event,
> > 					0, 0);
> > }
> > 
> > You will need to do a register_ftrace_event() to register that 'mytype'
> > and how to output it. Otherwise it would just be ignored in the "trace"
> > file.
> > 
> > All of the above would work fine as a loadable module that you could
> > easily maintain out of tree, and still uses the internals of the system.
> > 
> > -- Steve
> > 
> > 
> 
> 
> But this would improve only google's tracing while this is a general
> mainline tracing problem.
> 
> The first thing is that we need to get rid of the lock_depth field, the bkl
> is dying.

Yeah that needs to go :-)

> 
> For the rest what about having a bitmap of the fields we want to ignore,
> which can be setup from a trace file for ftrace and as an ioctl for perf.
> 
> So this bitmap is easy to implement on the common fields.
> 
> For the rest, one could choose between using TP_fast_assign()
> and TP_cond_assign().
> 
> TP_fast_assign() stays as is and doesn't implement bitmap field
> ignoring. Those who want conditional record will need
> TP_cond_assign().
> Well, unfortunately this probably requires us to play
> the same trickery as SYSCALL_DEFINE() in that we'll probably
> need TP_cond_assign1(), TP_cond_assign2(), TP_cond_assign3(), etc...
> 
> #define TP_cond_assignx(nr, assign)	\
> 	if (call->bitmask & nr)	{	\
> 		assign
> 	}
> 
> #define TP_cond_assign2(nr, assign, ...)	\
> 	TP_cond_assignx(nr, assign)		\
> 	TP_cond_assign1(nr + 1, __VA_ARGS__)
> 
> #define TP_cond_assign3(nr, assign, ...)        \
>         TP_cond_assignx(nr, assign)             \
>         TP_cond_assign2(nr + 1, __VA_ARGS__)
> 
> That will also require a bit more trickery to dynamically
> pre-compute the size of the trace entry.

Mathieu is working on encapsulating the assignments in their own macros.

Instead of doing:

	__entry->foo = bar;

We will have:

	tp_assign(foo, bar);

This way we could probably use this to dynamically figure out what to
assign.

-- Steve



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 13/15] small_traces: Add config option to shrink trace events.
  2010-12-09 15:08           ` Steven Rostedt
@ 2010-12-09 15:28             ` Frederic Weisbecker
  2010-12-09 16:16               ` Mathieu Desnoyers
  0 siblings, 1 reply; 74+ messages in thread
From: Frederic Weisbecker @ 2010-12-09 15:28 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: David Sharp, linux-kernel, mrubin, Mathieu Desnoyers

On Thu, Dec 09, 2010 at 10:08:17AM -0500, Steven Rostedt wrote:
> On Thu, 2010-12-09 at 15:55 +0100, Frederic Weisbecker wrote:
> > On Fri, Dec 03, 2010 at 09:54:12PM -0500, Steven Rostedt wrote:
> > > On Fri, 2010-12-03 at 18:33 -0800, David Sharp wrote:
> > > > I considered that, and I generally think it's a good idea. However, I
> > > > also want to use this switch to shrink individual tracepoint event
> > > > structures.
> > > > 
> > > > eg: sched switch is a high frequency event and it is 68 bytes (60
> > > > after these patches)
> > > > 
> > > > Can you suggest a syntax for TRACE_EVENT, DECLARE_EVENT_CLASS, etc,
> > > > that could express the two versions and produce the right code?
> > > > 
> > > > I'm worried about adding even further complexity to the TRACE_EVENT
> > > > macros. I could add TRACE_EVENT_SMALL that takes two versions of
> > > > TP_STRUCT__entry, TP_fast_assign, and TP_printk each, but then this
> > > > will need to be permuted with your TP_CONDITIONAL patches as well.
> > > 
> > > I would not touch the TRACE_EVENT() structures. They are there as is and
> > > I would not think about changing them. Something like that would never
> > > make it into mainline.
> > > 
> > > Now what you can do, is to make your own events based off of the same
> > > tracepoints. For example, the TRACE_EVENT(sched_switch...) has in
> > > sched.c:
> > > 
> > > 	trace_sched_switch(prev, next);
> > > 
> > > 
> > > You could even write a module that does something like this:
> > > 
> > > 	register_trace_sched_switch(probe_sched_switch, mydata);
> > > 
> > > 
> > > 
> > > void probe_sched_switch(void *mydata,
> > > 		struct task_struct *prev,
> > > 		struct task_struct *next)
> > > {
> > > 	struct ring_buffer *buffer;
> > > 	struct ring_buffer_event *event;
> > > 	struct myentry *entry;
> > > 
> > > 	event = trace_current_buffer_lock_reserve(buffer,
> > > 			mytype, sizeof(*entry),
> > > 			0, 0);
> > > 
> > > 	if (!event)
> > > 		return;
> > > 
> > > 	entry = ring_buffer_event_data(event);
> > > 
> > > 	entry->myfield = prev->x;
> > > 	...
> > > 
> > > 	trace_nowake_buffer_unlock_commit(buffer, event,
> > > 					0, 0);
> > > }
> > > 
> > > You will need to do a register_ftrace_event() to register that 'mytype'
> > > and how to output it. Otherwise it would just be ignored in the "trace"
> > > file.
> > > 
> > > All of the above would work fine as a loadable module that you could
> > > easily maintain out of tree, and still uses the internals of the system.
> > > 
> > > -- Steve
> > > 
> > > 
> > 
> > 
> > But this would improve only google's tracing while this is a general
> > mainline tracing problem.
> > 
> > The first thing is that we need to get rid of the lock_depth field, the bkl
> > is dying.
> 
> Yeah that needs to go :-)
> 
> > 
> > For the rest what about having a bitmap of the fields we want to ignore,
> > which can be setup from a trace file for ftrace and as an ioctl for perf.
> > 
> > So this bitmap is easy to implement on the common fields.
> > 
> > For the rest, one could choose between using TP_fast_assign()
> > and TP_cond_assign().
> > 
> > TP_fast_assign() stays as is and doesn't implement bitmap field
> > ignoring. Those who want conditional record will need
> > TP_cond_assign().
> > Well, unfortunately this probably requires us to play
> > the same trickery as SYSCALL_DEFINE() in that we'll probably
> > need TP_cond_assign1(), TP_cond_assign2(), TP_cond_assign3(), etc...
> > 
> > #define TP_cond_assignx(nr, assign)	\
> > 	if (call->bitmask & nr)	{	\
> > 		assign
> > 	}
> > 
> > #define TP_cond_assign2(nr, assign, ...)	\
> > 	TP_cond_assignx(nr, assign)		\
> > 	TP_cond_assign1(nr + 1, __VA_ARGS__)
> > 
> > #define TP_cond_assign3(nr, assign, ...)        \
> >         TP_cond_assignx(nr, assign)             \
> >         TP_cond_assign2(nr + 1, __VA_ARGS__)
> > 
> > That will also require a bit more trickery to dynamically
> > pre-compute the size of the trace entry.
> 
> Mathieu is working on encapsulating the assignments in their own macros.
> 
> Instead of doing:
> 
> 	__entry->foo = bar;
> 
> We will have:
> 
> 	tp_assign(foo, bar);
> 
> This way we could probably use this to dynamically figure out what to
> assign.
> 
> -- Steve
> 
> 


Yep, it should also work that way.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 13/15] small_traces: Add config option to shrink trace events.
  2010-12-09 15:28             ` Frederic Weisbecker
@ 2010-12-09 16:16               ` Mathieu Desnoyers
  0 siblings, 0 replies; 74+ messages in thread
From: Mathieu Desnoyers @ 2010-12-09 16:16 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Steven Rostedt, David Sharp, linux-kernel, mrubin

* Frederic Weisbecker (fweisbec@gmail.com) wrote:
> On Thu, Dec 09, 2010 at 10:08:17AM -0500, Steven Rostedt wrote:
> > On Thu, 2010-12-09 at 15:55 +0100, Frederic Weisbecker wrote:
> > > On Fri, Dec 03, 2010 at 09:54:12PM -0500, Steven Rostedt wrote:
> > > > On Fri, 2010-12-03 at 18:33 -0800, David Sharp wrote:
> > > > > I considered that, and I generally think it's a good idea. However, I
> > > > > also want to use this switch to shrink individual tracepoint event
> > > > > structures.
> > > > > 
> > > > > eg: sched switch is a high frequency event and it is 68 bytes (60
> > > > > after these patches)
> > > > > 
> > > > > Can you suggest a syntax for TRACE_EVENT, DECLARE_EVENT_CLASS, etc,
> > > > > that could express the two versions and produce the right code?
> > > > > 
> > > > > I'm worried about adding even further complexity to the TRACE_EVENT
> > > > > macros. I could add TRACE_EVENT_SMALL that takes two versions of
> > > > > TP_STRUCT__entry, TP_fast_assign, and TP_printk each, but then this
> > > > > will need to be permuted with your TP_CONDITIONAL patches as well.
> > > > 
> > > > I would not touch the TRACE_EVENT() structures. They are there as is and
> > > > I would not think about changing them. Something like that would never
> > > > make it into mainline.
> > > > 
> > > > Now what you can do, is to make your own events based off of the same
> > > > tracepoints. For example, the TRACE_EVENT(sched_switch...) has in
> > > > sched.c:
> > > > 
> > > > 	trace_sched_switch(prev, next);
> > > > 
> > > > 
> > > > You could even write a module that does something like this:
> > > > 
> > > > 	register_trace_sched_switch(probe_sched_switch, mydata);
> > > > 
> > > > 
> > > > 
> > > > void probe_sched_switch(void *mydata,
> > > > 		struct task_struct *prev,
> > > > 		struct task_struct *next)
> > > > {
> > > > 	struct ring_buffer *buffer;
> > > > 	struct ring_buffer_event *event;
> > > > 	struct myentry *entry;
> > > > 
> > > > 	event = trace_current_buffer_lock_reserve(buffer,
> > > > 			mytype, sizeof(*entry),
> > > > 			0, 0);
> > > > 
> > > > 	if (!event)
> > > > 		return;
> > > > 
> > > > 	entry = ring_buffer_event_data(event);
> > > > 
> > > > 	entry->myfield = prev->x;
> > > > 	...
> > > > 
> > > > 	trace_nowake_buffer_unlock_commit(buffer, event,
> > > > 					0, 0);
> > > > }
> > > > 
> > > > You will need to do a register_ftrace_event() to register that 'mytype'
> > > > and how to output it. Otherwise it would just be ignored in the "trace"
> > > > file.
> > > > 
> > > > All of the above would work fine as a loadable module that you could
> > > > easily maintain out of tree, and still uses the internals of the system.
> > > > 
> > > > -- Steve
> > > > 
> > > > 
> > > 
> > > 
> > > But this would improve only google's tracing while this is a general
> > > mainline tracing problem.
> > > 
> > > The first thing is that we need to get rid of the lock_depth field, the bkl
> > > is dying.
> > 
> > Yeah that needs to go :-)
> > 
> > > 
> > > For the rest what about having a bitmap of the fields we want to ignore,
> > > which can be setup from a trace file for ftrace and as an ioctl for perf.
> > > 
> > > So this bitmap is easy to implement on the common fields.
> > > 
> > > For the rest, one could choose between using TP_fast_assign()
> > > and TP_cond_assign().
> > > 
> > > TP_fast_assign() stays as is and doesn't implement bitmap field
> > > ignoring. Those who want conditional record will need
> > > TP_cond_assign().
> > > Well, unfortunately this probably requires us to play
> > > the same trickery as SYSCALL_DEFINE() in that we'll probably
> > > need TP_cond_assign1(), TP_cond_assign2(), TP_cond_assign3(), etc...
> > > 
> > > #define TP_cond_assignx(nr, assign)	\
> > > 	if (call->bitmask & nr)	{	\
> > > 		assign
> > > 	}
> > > 
> > > #define TP_cond_assign2(nr, assign, ...)	\
> > > 	TP_cond_assignx(nr, assign)		\
> > > 	TP_cond_assign1(nr + 1, __VA_ARGS__)
> > > 
> > > #define TP_cond_assign3(nr, assign, ...)        \
> > >         TP_cond_assignx(nr, assign)             \
> > >         TP_cond_assign2(nr + 1, __VA_ARGS__)
> > > 
> > > That will also require a bit more trickery to dynamically
> > > pre-compute the size of the trace entry.
> > 
> > Mathieu is working on encapsulating the assignments in their own macros.
> > 
> > Instead of doing:
> > 
> > 	__entry->foo = bar;
> > 
> > We will have:
> > 
> > 	tp_assign(foo, bar);
> > 
> > This way we could probably use this to dynamically figure out what to
> > assign.
> > 
> > -- Steve
> > 
> > 
> 
> Yep, it should also work that way.

In an incredible exercise of good timing, I'm planning to post them just now :)

Comments will be very welcome,

Thanks!

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH] tracing: Add an 'overwrite' trace_option.
  2010-12-08 21:46       ` [PATCH] tracing: Add an 'overwrite' trace_option David Sharp
@ 2010-12-14  0:39         ` David Sharp
  2011-03-09  0:45         ` David Sharp
  2011-03-11  9:45         ` [tip:perf/core] " tip-bot for David Sharp
  2 siblings, 0 replies; 74+ messages in thread
From: David Sharp @ 2010-12-14  0:39 UTC (permalink / raw)
  To: rostedt; +Cc: linux-kernel, mrubin, jiayingz, David Sharp

Steve, any feedback on this patch?

On Wed, Dec 8, 2010 at 1:46 PM, David Sharp <dhsharp@google.com> wrote:
> Add an "overwrite" trace_option for ftrace to control whether the buffer should
> be overwritten on overflow or not. The default remains to overwrite old events
> when the buffer is full. This patch adds the option to instead discard newest
> events when the buffer is full. This is useful to get a snapshot of traces just
> after enabling traces. Dropping the current event is also a simpler code path.
> ---
>  Documentation/trace/ftrace.txt |    5 +++++
>  include/linux/ring_buffer.h    |    2 ++
>  kernel/trace/ring_buffer.c     |   11 +++++++++++
>  kernel/trace/trace.c           |   17 +++++++++++------
>  kernel/trace/trace.h           |    1 +
>  5 files changed, 30 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
> index 557c1ed..2842ea7 100644
> --- a/Documentation/trace/ftrace.txt
> +++ b/Documentation/trace/ftrace.txt
> @@ -491,6 +491,11 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
>                    latencies, as described in "Latency
>                    trace format".
>
> +  overwrite - This controls what happens when the trace buffer is
> +              full. If "1" (default), the oldest events are
> +              discarded and overwritten. If "0", then the newest
> +              events are discarded.
> +
>  sched_switch
>  ------------
>
> diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
> index 8d3a248..ab38ac8 100644
> --- a/include/linux/ring_buffer.h
> +++ b/include/linux/ring_buffer.h
> @@ -100,6 +100,8 @@ void ring_buffer_free(struct ring_buffer *buffer);
>
>  int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size);
>
> +void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val);
> +
>  struct ring_buffer_event *ring_buffer_lock_reserve(struct ring_buffer *buffer,
>                                                   unsigned long length);
>  int ring_buffer_unlock_commit(struct ring_buffer *buffer,
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 9ed509a..3207147 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -1429,6 +1429,17 @@ int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size)
>  }
>  EXPORT_SYMBOL_GPL(ring_buffer_resize);
>
> +void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val)
> +{
> +       mutex_lock(&buffer->mutex);
> +       if (val)
> +               buffer->flags |= RB_FL_OVERWRITE;
> +       else
> +               buffer->flags &= ~RB_FL_OVERWRITE;
> +       mutex_unlock(&buffer->mutex);
> +}
> +EXPORT_SYMBOL_GPL(ring_buffer_change_overwrite);
> +
>  static inline void *
>  __rb_data_page_index(struct buffer_data_page *bpage, unsigned index)
>  {
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index c380612..6f7494c 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -41,8 +41,6 @@
>  #include "trace.h"
>  #include "trace_output.h"
>
> -#define TRACE_BUFFER_FLAGS     (RB_FL_OVERWRITE)
> -
>  /*
>  * On boot up, the ring buffer is set to the minimum size, so that
>  * we do not waste memory on systems that are not using tracing.
> @@ -340,7 +338,7 @@ static DECLARE_WAIT_QUEUE_HEAD(trace_wait);
>  /* trace_flags holds trace_options default values */
>  unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK |
>        TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME |
> -       TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD;
> +       TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD | TRACE_ITER_OVERWRITE;
>
>  static int trace_stop_count;
>  static DEFINE_SPINLOCK(tracing_start_lock);
> @@ -425,6 +423,7 @@ static const char *trace_options[] = {
>        "sleep-time",
>        "graph-time",
>        "record-cmd",
> +       "overwrite",
>        NULL
>  };
>
> @@ -2523,6 +2522,9 @@ static void set_tracer_flags(unsigned int mask, int enabled)
>
>        if (mask == TRACE_ITER_RECORD_CMD)
>                trace_event_enable_cmd_record(enabled);
> +
> +       if (mask == TRACE_ITER_OVERWRITE)
> +               ring_buffer_change_overwrite(global_trace.buffer, enabled);
>  }
>
>  static ssize_t
> @@ -4545,9 +4547,11 @@ void ftrace_dump(enum ftrace_dump_mode oops_dump_mode)
>  __init static int tracer_alloc_buffers(void)
>  {
>        int ring_buf_size;
> +       enum ring_buffer_flags rb_flags;
>        int i;
>        int ret = -ENOMEM;
>
> +
>        if (!alloc_cpumask_var(&tracing_buffer_mask, GFP_KERNEL))
>                goto out;
>
> @@ -4560,12 +4564,13 @@ __init static int tracer_alloc_buffers(void)
>        else
>                ring_buf_size = 1;
>
> +       rb_flags = trace_flags & TRACE_ITER_OVERWRITE ? RB_FL_OVERWRITE : 0;
> +
>        cpumask_copy(tracing_buffer_mask, cpu_possible_mask);
>        cpumask_copy(tracing_cpumask, cpu_all_mask);
>
>        /* TODO: make the number of buffers hot pluggable with CPUS */
> -       global_trace.buffer = ring_buffer_alloc(ring_buf_size,
> -                                                  TRACE_BUFFER_FLAGS);
> +       global_trace.buffer = ring_buffer_alloc(ring_buf_size, rb_flags);
>        if (!global_trace.buffer) {
>                printk(KERN_ERR "tracer: failed to allocate ring buffer!\n");
>                WARN_ON(1);
> @@ -4575,7 +4580,7 @@ __init static int tracer_alloc_buffers(void)
>
>
>  #ifdef CONFIG_TRACER_MAX_TRACE
> -       max_tr.buffer = ring_buffer_alloc(1, TRACE_BUFFER_FLAGS);
> +       max_tr.buffer = ring_buffer_alloc(1, rb_flags);
>        if (!max_tr.buffer) {
>                printk(KERN_ERR "tracer: failed to allocate max ring buffer!\n");
>                WARN_ON(1);
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 9021f8c..9cd025a 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -606,6 +606,7 @@ enum trace_iterator_flags {
>        TRACE_ITER_SLEEP_TIME           = 0x40000,
>        TRACE_ITER_GRAPH_TIME           = 0x80000,
>        TRACE_ITER_RECORD_CMD           = 0x100000,
> +       TRACE_ITER_OVERWRITE            = 0x200000,
>  };
>
>  /*
> --
> 1.7.3.1
>
>
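With the patch applied, the snapshot workflow the changelog describes would
look roughly like this (paths assume debugfs is mounted at
/sys/kernel/debug, on a kernel of this era that still has tracing_enabled):

```shell
cd /sys/kernel/debug/tracing

# Stop recording when the buffer fills, instead of overwriting old events
echo 0 > options/overwrite

echo 1 > events/sched/enable
echo 1 > tracing_enabled
# ... run the workload; once the per-cpu buffers fill, newest events are dropped
echo 0 > tracing_enabled

cat trace > /tmp/trace-snapshot.txt
```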

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 04/15] ftrace: pack event structures.
  2010-12-04  0:13 ` [PATCH 04/15] ftrace: pack event structures David Sharp
@ 2011-03-08 23:30   ` Steven Rostedt
  2011-03-09  1:09     ` Steven Rostedt
  2011-03-09  6:39     ` David Miller
  2011-03-09 13:01   ` Steven Rostedt
  1 sibling, 2 replies; 74+ messages in thread
From: Steven Rostedt @ 2011-03-08 23:30 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, mrubin, David Sharp

David (Miller),

I'm going back through this patch set and looking to incorporate it into
2.6.39. But I know how touchy sparc is with "packed" data structures. Do
you see this patch as hurting sparc?

I wonder if we should change this to something like:

#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
# define do_event_packed __attribute__((packed))
#else
# define do_event_packed
#endif

and use "do_event_packed" instead?


On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> Ftrace event structures have a 12-byte struct trace_entry at the beginning.
> If the structure is aligned, this means that if the first field is 64-bits,
> there will be 4 bytes of padding. Ironically, due to the 4-byte ringbuffer
> header, this will make 64-bit writes unaligned, if the ring buffer position
> is currently 64-bit aligned:
> 4(rb)+12(ftrace)+4(pad) = 20;  20%8 = 4

Note, on 64bit archs without CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, the
ring buffer forces the 8 byte header, to keep the data portion of the
event load 8byte aligned.

-- Steve

> 
> Adding __attribute__((packed)) to the event structures removes the extra
> space from the trace events, and actually improves alignment of trace
> events with a first field that is 64-bits.
> 
> About 65 tracepoints have a 4-byte pad at offset 12:
>  # find events -name format | xargs -n1 awk '
>    $1=="name:" {name=$2}
>    $1=="format:"{FS="\t"}
>    $3=="offset:12;" && $4=="size:4;"{okay=1}
>    $3=="offset:16;" && !okay {print name}' | wc -l
> 65
> 
> With all 'syscalls' and 'timer' events enabled, this results in a 5%
> improvement in a simple 512MB read benchmark with warm caches.
> 
> Tested:
> 
> setup:
>  # echo 1 >events/syscalls/enable
>  # echo 1 >events/timer/enable
>  # echo 0 > tracing_enabled
> off:
>  # for n in $(seq 10) ; do \
>    time dd if=/dev/hda3 of=/dev/null bs=1K count=512K ; \
>    done
> on:
>  # for n in $(seq 10) ; do \
>    echo 1 >tracing_enabled; \
>    time dd if=/dev/hda3 of=/dev/null bs=1K count=512K ; \
>    echo 0 >tracing_enabled; \
>    echo > trace; \
>    done
> 
> real time      mean/median/stdev
> w/o patch off: 1.1679/1.164/0.0169
> w/o patch on : 1.9432/1.936/0.0274
> w/  patch off: 1.1715/1.159/0.0431
> w/  patch on : 1.8425/1.836/0.0138
> "on" delta: -0.1007 --> -5.2%
> 
> Google-Bug-Id: 2895627
> 
> Signed-off-by: David Sharp <dhsharp@google.com>
> ---
>  include/trace/ftrace.h |    5 +++--
>  kernel/trace/trace.h   |    2 +-
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index a9377c0..51d1f52 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -48,7 +48,8 @@
>  #define __array(type, item, len)	type	item[len];
>  
>  #undef __dynamic_array
> -#define __dynamic_array(type, item, len) u32 __data_loc_##item;
> +#define __dynamic_array(type, item, len) \
> +    u32 __data_loc_##item __attribute__((aligned(4)));
>  
>  #undef __string
>  #define __string(item, src) __dynamic_array(char, item, -1)
> @@ -62,7 +63,7 @@
>  		struct trace_entry	ent;				\
>  		tstruct							\
>  		char			__data[0];			\
> -	};								\
> +	} __attribute__((packed));					\
>  									\
>  	static struct ftrace_event_class event_class_##name;
>  
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 9021f8c..2e80433 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -60,7 +60,7 @@ enum trace_type {
>  	struct struct_name {					\
>  		struct trace_entry	ent;			\
>  		tstruct						\
> -	}
> +	} __attribute__((packed))
>  
>  #undef TP_ARGS
>  #define TP_ARGS(args...)	args



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall
  2010-12-08  9:18           ` Avi Kivity
@ 2011-03-08 23:55             ` Steven Rostedt
  2011-03-09  8:50               ` Avi Kivity
  0 siblings, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2011-03-08 23:55 UTC (permalink / raw)
  To: Avi Kivity; +Cc: David Sharp, linux-kernel, mrubin, kvm-devel

Hi Avi,

This patch set got lost in the shuffle, and I'm now looking to include
it.

On Wed, 2010-12-08 at 11:18 +0200, Avi Kivity wrote:
> On 12/07/2010 11:16 PM, David Sharp wrote:
> > >
> > >  I don't understand this.  Can you elaborate?  What does "32-bit addressable"
> > >  mean?
> >
> > The ring buffer gives you space that is a multiple of 4 bytes in
> > length, and 32-bit aligned. Therefore it is useless to attempt to
> > align the structure beyond 32-bit boundaries, eg, a 64-bit boundary,
> > because it is unpredictable if the memory the structure will be
> > written to is at a 64-bit boundary (addr % 8 could be 0 or 4).
> >
> > >  And "predicated on packing the event structures"?  Is the structure
> > >  __attribute__((packed)), or is it not?
> >
> > It is not packed in Linus' tree, but one of the patches before this
> > patch in this patch series adds __attribute__((packed)). This patch
> > assumes that the event packing patch has been applied. This patch
> > should not be applied if the packing patch is not (hence,
> > "predicated").
> 
> Thanks for the explanations, it makes sense now.
> 

Does this mean I can add your "Acked-by"?

-- Steve



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 10/15] ftrace: fix event alignment: jbd2:*
  2010-12-04  1:52   ` Steven Rostedt
@ 2011-03-09  0:03     ` Steven Rostedt
  2011-03-09  0:31       ` Ted Ts'o
  0 siblings, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2011-03-09  0:03 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Theodore Ts'o

Now I know why I was so late at adding this. I was waiting for
Acked-bys ;)

On Fri, 2010-12-03 at 20:52 -0500, Steven Rostedt wrote:
> [ Ccing Ted for Acked-by ]

Ted, Ping?

-- Steve

> 
> -- Steve
> 
> 
> On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> > Signed-off-by: David Sharp <dhsharp@google.com>
> > ---
> >  include/trace/events/jbd2.h |    8 ++++----
> >  1 files changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
> > index 7447ea9..2f9f362 100644
> > --- a/include/trace/events/jbd2.h
> > +++ b/include/trace/events/jbd2.h
> > @@ -41,15 +41,15 @@ DECLARE_EVENT_CLASS(jbd2_commit,
> >  	TP_STRUCT__entry(
> >  		__field(	int,   dev_major                )
> >  		__field(	int,   dev_minor                )
> > -		__field(	char,	sync_commit		  )
> >  		__field(	int,	transaction		  )
> > +		__field(	char,	sync_commit		  )
> >  	),
> >  
> >  	TP_fast_assign(
> >  		__entry->dev_major	= MAJOR(journal->j_fs_dev->bd_dev);
> >  		__entry->dev_minor	= MINOR(journal->j_fs_dev->bd_dev);
> > -		__entry->sync_commit = commit_transaction->t_synchronous_commit;
> >  		__entry->transaction	= commit_transaction->t_tid;
> > +		__entry->sync_commit = commit_transaction->t_synchronous_commit;
> >  	),
> >  
> >  	TP_printk("dev %d,%d transaction %d sync %d",
> > @@ -93,17 +93,17 @@ TRACE_EVENT(jbd2_end_commit,
> >  	TP_STRUCT__entry(
> >  		__field(	int,   dev_major                )
> >  		__field(	int,   dev_minor                )
> > -		__field(	char,	sync_commit		  )
> >  		__field(	int,	transaction		  )
> >  		__field(	int,	head		  	  )
> > +		__field(	char,	sync_commit		  )
> >  	),
> >  
> >  	TP_fast_assign(
> >  		__entry->dev_major	= MAJOR(journal->j_fs_dev->bd_dev);
> >  		__entry->dev_minor	= MINOR(journal->j_fs_dev->bd_dev);
> > -		__entry->sync_commit = commit_transaction->t_synchronous_commit;
> >  		__entry->transaction	= commit_transaction->t_tid;
> >  		__entry->head		= journal->j_tail_sequence;
> > +		__entry->sync_commit = commit_transaction->t_synchronous_commit;
> >  	),
> >  
> >  	TP_printk("dev %d,%d transaction %d sync %d head %d",
> 



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 11/15] ftrace: fix event alignment: ext4:*
  2010-12-04  1:53   ` Steven Rostedt
@ 2011-03-09  0:04     ` Steven Rostedt
  0 siblings, 0 replies; 74+ messages in thread
From: Steven Rostedt @ 2011-03-09  0:04 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Theodore Ts'o

On Fri, 2010-12-03 at 20:53 -0500, Steven Rostedt wrote:
> [ Ccing Ted for Acked-by ]

Ted, Ping on this too.

-- Steve

> 
> -- Steve
> 
> 
> On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> > Signed-off-by: David Sharp <dhsharp@google.com>
> > ---
> >  include/trace/events/ext4.h |   70 +++++++++++++++++++++---------------------
> >  1 files changed, 35 insertions(+), 35 deletions(-)
> > 
> > diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> > index e5e345f..9887cdb 100644
> > --- a/include/trace/events/ext4.h
> > +++ b/include/trace/events/ext4.h
> > @@ -23,21 +23,21 @@ TRACE_EVENT(ext4_free_inode,
> >  	TP_STRUCT__entry(
> >  		__field(	int,   dev_major                )
> >  		__field(	int,   dev_minor                )
> > +		__field(	blkcnt_t, blocks		)
> >  		__field(	ino_t,	ino			)
> > -		__field(	umode_t, mode			)
> >  		__field(	uid_t,	uid			)
> >  		__field(	gid_t,	gid			)
> > -		__field(	blkcnt_t, blocks		)
> > +		__field(	umode_t, mode			)
> >  	),
> >  
> >  	TP_fast_assign(
> >  		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
> >  		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
> > +		__entry->blocks	= inode->i_blocks;
> >  		__entry->ino	= inode->i_ino;
> > -		__entry->mode	= inode->i_mode;
> >  		__entry->uid	= inode->i_uid;
> >  		__entry->gid	= inode->i_gid;
> > -		__entry->blocks	= inode->i_blocks;
> > +		__entry->mode	= inode->i_mode;
> >  	),
> >  
> >  	TP_printk("dev %d,%d ino %lu mode 0%o uid %u gid %u blocks %llu",
> > @@ -342,10 +342,10 @@ TRACE_EVENT(ext4_da_writepages,
> >  		__field(	long,	pages_skipped		)
> >  		__field(	loff_t,	range_start		)
> >  		__field(	loff_t,	range_end		)
> > +		__field(       pgoff_t,	writeback_index		)
> >  		__field(	char,	for_kupdate		)
> >  		__field(	char,	for_reclaim		)
> >  		__field(	char,	range_cyclic		)
> > -		__field(       pgoff_t,	writeback_index		)
> >  	),
> >  
> >  	TP_fast_assign(
> > @@ -356,10 +356,10 @@ TRACE_EVENT(ext4_da_writepages,
> >  		__entry->pages_skipped	= wbc->pages_skipped;
> >  		__entry->range_start	= wbc->range_start;
> >  		__entry->range_end	= wbc->range_end;
> > +		__entry->writeback_index = inode->i_mapping->writeback_index;
> >  		__entry->for_kupdate	= wbc->for_kupdate;
> >  		__entry->for_reclaim	= wbc->for_reclaim;
> >  		__entry->range_cyclic	= wbc->range_cyclic;
> > -		__entry->writeback_index = inode->i_mapping->writeback_index;
> >  	),
> >  
> >  	TP_printk("dev %d,%d ino %lu nr_to_write %ld pages_skipped %ld "
> > @@ -425,8 +425,8 @@ TRACE_EVENT(ext4_da_writepages_result,
> >  		__field(	int,	ret			)
> >  		__field(	int,	pages_written		)
> >  		__field(	long,	pages_skipped		)
> > -		__field(	char,	more_io			)	
> >  		__field(       pgoff_t,	writeback_index		)
> > +		__field(	char,	more_io			)	
> >  	),
> >  
> >  	TP_fast_assign(
> > @@ -436,8 +436,8 @@ TRACE_EVENT(ext4_da_writepages_result,
> >  		__entry->ret		= ret;
> >  		__entry->pages_written	= pages_written;
> >  		__entry->pages_skipped	= wbc->pages_skipped;
> > -		__entry->more_io	= wbc->more_io;
> >  		__entry->writeback_index = inode->i_mapping->writeback_index;
> > +		__entry->more_io	= wbc->more_io;
> >  	),
> >  
> >  	TP_printk("dev %d,%d ino %lu ret %d pages_written %d pages_skipped %ld more_io %d writeback_index %lu",
> > @@ -485,8 +485,8 @@ DECLARE_EVENT_CLASS(ext4__mb_new_pa,
> >  		__field(	int,   dev_minor                )
> >  		__field(	ino_t,	ino			)
> >  		__field(	__u64,	pa_pstart		)
> > -		__field(	__u32,	pa_len			)
> >  		__field(	__u64,	pa_lstart		)
> > +		__field(	__u32,	pa_len			)
> >  
> >  	),
> >  
> > @@ -495,8 +495,8 @@ DECLARE_EVENT_CLASS(ext4__mb_new_pa,
> >  		__entry->dev_minor	= MINOR(ac->ac_sb->s_dev);
> >  		__entry->ino		= ac->ac_inode->i_ino;
> >  		__entry->pa_pstart	= pa->pa_pstart;
> > -		__entry->pa_len		= pa->pa_len;
> >  		__entry->pa_lstart	= pa->pa_lstart;
> > +		__entry->pa_len		= pa->pa_len;
> >  	),
> >  
> >  	TP_printk("dev %d,%d ino %lu pstart %llu len %u lstart %llu",
> > @@ -724,20 +724,20 @@ TRACE_EVENT(ext4_free_blocks,
> >  		__field(	int,   dev_major                )
> >  		__field(	int,   dev_minor                )
> >  		__field(	ino_t,	ino			)
> > -		__field(      umode_t, mode			)
> >  		__field(	__u64,	block			)
> >  		__field(	unsigned long,	count		)
> >  		__field(	 int,	flags			)
> > +		__field(      umode_t, mode			)
> >  	),
> >  
> >  	TP_fast_assign(
> >  		__entry->dev_major	= MAJOR(inode->i_sb->s_dev);
> >  		__entry->dev_minor	= MINOR(inode->i_sb->s_dev);
> >  		__entry->ino		= inode->i_ino;
> > -		__entry->mode		= inode->i_mode;
> >  		__entry->block		= block;
> >  		__entry->count		= count;
> >  		__entry->flags		= flags;
> > +		__entry->mode		= inode->i_mode;
> >  	),
> >  
> >  	TP_printk("dev %d,%d ino %lu mode 0%o block %llu count %lu flags %d",
> > @@ -766,8 +766,8 @@ TRACE_EVENT(ext4_sync_file,
> >  		__entry->dev_major	= MAJOR(dentry->d_inode->i_sb->s_dev);
> >  		__entry->dev_minor	= MINOR(dentry->d_inode->i_sb->s_dev);
> >  		__entry->ino		= dentry->d_inode->i_ino;
> > -		__entry->datasync	= datasync;
> >  		__entry->parent		= dentry->d_parent->d_inode->i_ino;
> > +		__entry->datasync	= datasync;
> >  	),
> >  
> >  	TP_printk("dev %d,%d ino %ld parent %ld datasync %d ",
> > @@ -834,12 +834,6 @@ TRACE_EVENT(ext4_mballoc_alloc,
> >  		__field(	int,   dev_major                )
> >  		__field(	int,   dev_minor                )
> >  		__field(	ino_t,	ino			)
> > -		__field(	__u16,	found			)
> > -		__field(	__u16,	groups			)
> > -		__field(	__u16,	buddy			)
> > -		__field(	__u16,	flags			)
> > -		__field(	__u16,	tail			)
> > -		__field(	__u8,	cr			)
> >  		__field(	__u32, 	orig_logical		)
> >  		__field(	  int,	orig_start		)
> >  		__field(	__u32, 	orig_group		)
> > @@ -852,18 +846,18 @@ TRACE_EVENT(ext4_mballoc_alloc,
> >  		__field(	  int,	result_start		)
> >  		__field(	__u32, 	result_group		)
> >  		__field(	  int,	result_len		)
> > +		__field(	__u16,	found			)
> > +		__field(	__u16,	groups			)
> > +		__field(	__u16,	buddy			)
> > +		__field(	__u16,	flags			)
> > +		__field(	__u16,	tail			)
> > +		__field(	__u8,	cr			)
> >  	),
> >  
> >  	TP_fast_assign(
> >  		__entry->dev_major	= MAJOR(ac->ac_inode->i_sb->s_dev);
> >  		__entry->dev_minor	= MINOR(ac->ac_inode->i_sb->s_dev);
> >  		__entry->ino		= ac->ac_inode->i_ino;
> > -		__entry->found		= ac->ac_found;
> > -		__entry->flags		= ac->ac_flags;
> > -		__entry->groups		= ac->ac_groups_scanned;
> > -		__entry->buddy		= ac->ac_buddy;
> > -		__entry->tail		= ac->ac_tail;
> > -		__entry->cr		= ac->ac_criteria;
> >  		__entry->orig_logical	= ac->ac_o_ex.fe_logical;
> >  		__entry->orig_start	= ac->ac_o_ex.fe_start;
> >  		__entry->orig_group	= ac->ac_o_ex.fe_group;
> > @@ -876,6 +870,12 @@ TRACE_EVENT(ext4_mballoc_alloc,
> >  		__entry->result_start	= ac->ac_f_ex.fe_start;
> >  		__entry->result_group	= ac->ac_f_ex.fe_group;
> >  		__entry->result_len	= ac->ac_f_ex.fe_len;
> > +		__entry->found		= ac->ac_found;
> > +		__entry->groups		= ac->ac_groups_scanned;
> > +		__entry->buddy		= ac->ac_buddy;
> > +		__entry->flags		= ac->ac_flags;
> > +		__entry->tail		= ac->ac_tail;
> > +		__entry->cr		= ac->ac_criteria;
> >  	),
> >  
> >  	TP_printk("dev %d,%d inode %lu orig %u/%d/%u@%u goal %u/%d/%u@%u "
> > @@ -1001,18 +1001,18 @@ TRACE_EVENT(ext4_forget,
> >  		__field(	int,   dev_major                )
> >  		__field(	int,   dev_minor                )
> >  		__field(	ino_t,	ino			)
> > -		__field(	umode_t, mode			)
> > -		__field(	int,	is_metadata		)
> >  		__field(	__u64,	block			)
> > +		__field(	int,	is_metadata		)
> > +		__field(	umode_t, mode			)
> >  	),
> >  
> >  	TP_fast_assign(
> >  		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
> >  		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
> >  		__entry->ino	= inode->i_ino;
> > -		__entry->mode	= inode->i_mode;
> > -		__entry->is_metadata = is_metadata;
> >  		__entry->block	= block;
> > +		__entry->is_metadata = is_metadata;
> > +		__entry->mode	= inode->i_mode;
> >  	),
> >  
> >  	TP_printk("dev %d,%d ino %lu mode 0%o is_metadata %d block %llu",
> > @@ -1030,24 +1030,24 @@ TRACE_EVENT(ext4_da_update_reserve_space,
> >  		__field(	int,   dev_major                )
> >  		__field(	int,   dev_minor                )
> >  		__field(	ino_t,	ino			)
> > -		__field(	umode_t, mode			)
> >  		__field(	__u64,	i_blocks		)
> >  		__field(	int,	used_blocks		)
> >  		__field(	int,	reserved_data_blocks	)
> >  		__field(	int,	reserved_meta_blocks	)
> >  		__field(	int,	allocated_meta_blocks	)
> > +		__field(	umode_t, mode			)
> >  	),
> >  
> >  	TP_fast_assign(
> >  		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
> >  		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
> >  		__entry->ino	= inode->i_ino;
> > -		__entry->mode	= inode->i_mode;
> >  		__entry->i_blocks = inode->i_blocks;
> >  		__entry->used_blocks = used_blocks;
> >  		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
> >  		__entry->reserved_meta_blocks = EXT4_I(inode)->i_reserved_meta_blocks;
> >  		__entry->allocated_meta_blocks = EXT4_I(inode)->i_allocated_meta_blocks;
> > +		__entry->mode	= inode->i_mode;
> >  	),
> >  
> >  	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu used_blocks %d reserved_data_blocks %d reserved_meta_blocks %d allocated_meta_blocks %d",
> > @@ -1067,22 +1067,22 @@ TRACE_EVENT(ext4_da_reserve_space,
> >  		__field(	int,   dev_major                )
> >  		__field(	int,   dev_minor                )
> >  		__field(	ino_t,	ino			)
> > -		__field(	umode_t, mode			)
> >  		__field(	__u64,	i_blocks		)
> >  		__field(	int,	md_needed		)
> >  		__field(	int,	reserved_data_blocks	)
> >  		__field(	int,	reserved_meta_blocks	)
> > +		__field(	umode_t, mode			)
> >  	),
> >  
> >  	TP_fast_assign(
> >  		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
> >  		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
> >  		__entry->ino	= inode->i_ino;
> > -		__entry->mode	= inode->i_mode;
> >  		__entry->i_blocks = inode->i_blocks;
> >  		__entry->md_needed = md_needed;
> >  		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
> >  		__entry->reserved_meta_blocks = EXT4_I(inode)->i_reserved_meta_blocks;
> > +		__entry->mode	= inode->i_mode;
> >  	),
> >  
> >  	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu md_needed %d reserved_data_blocks %d reserved_meta_blocks %d",
> > @@ -1102,24 +1102,24 @@ TRACE_EVENT(ext4_da_release_space,
> >  		__field(	int,   dev_major                )
> >  		__field(	int,   dev_minor                )
> >  		__field(	ino_t,	ino			)
> > -		__field(	umode_t, mode			)
> >  		__field(	__u64,	i_blocks		)
> >  		__field(	int,	freed_blocks		)
> >  		__field(	int,	reserved_data_blocks	)
> >  		__field(	int,	reserved_meta_blocks	)
> >  		__field(	int,	allocated_meta_blocks	)
> > +		__field(	umode_t, mode			)
> >  	),
> >  
> >  	TP_fast_assign(
> >  		__entry->dev_major = MAJOR(inode->i_sb->s_dev);
> >  		__entry->dev_minor = MINOR(inode->i_sb->s_dev);
> >  		__entry->ino	= inode->i_ino;
> > -		__entry->mode	= inode->i_mode;
> >  		__entry->i_blocks = inode->i_blocks;
> >  		__entry->freed_blocks = freed_blocks;
> >  		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
> >  		__entry->reserved_meta_blocks = EXT4_I(inode)->i_reserved_meta_blocks;
> >  		__entry->allocated_meta_blocks = EXT4_I(inode)->i_allocated_meta_blocks;
> > +		__entry->mode	= inode->i_mode;
> >  	),
> >  
> >  	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu freed_blocks %d reserved_data_blocks %d reserved_meta_blocks %d allocated_meta_blocks %d",
> 




* Re: [PATCH 13/15] small_traces: Add config option to shrink trace events.
  2010-12-04  0:13 ` [PATCH 13/15] small_traces: Add config option to shrink trace events David Sharp
  2010-12-04  1:56   ` Steven Rostedt
@ 2011-03-09  0:23   ` Steven Rostedt
  1 sibling, 0 replies; 74+ messages in thread
From: Steven Rostedt @ 2011-03-09  0:23 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, Frederic Weisbecker

On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> This option will remove several fields from trace events to reduce the
> payload size. Under heavy load, events can be produced faster than they
> can possibly be written to disk or sent on the network. Shrinking the
> payload size will enable fitting more traces in the ring buffer, which
> therefore enables tracing for longer periods, at the cost of losing some
> relatively unimportant details.
> 

OK, I'm trying to get all your patches in for 2.6.39 except the last 3,
as those may break user interfaces. I would like a way to dynamically
shrink the payload, but we need to figure out how to do that.

For example, currently trace-cmd looks at the event files to know what data
to record. Sure we could force it with a config option, but that is not
a good solution for mainline. What we would want is something more
dynamic. But then tools like trace-cmd, and I suspect perf, will need to
have a way to know how to parse the data. Especially perf, which is user
specific (whereas trace-cmd is global). The change in events needs to
be atomic per user, so one user changing what gets recorded in an event
does not affect another user.

Perhaps in the future we can figure something out. But for now, patches
1-12 look good (with the exception of patch 3).

-- Steve


> Google-Bug-Id: 3224547
> 
> Signed-off-by: David Sharp <dhsharp@google.com>
> ---
>  kernel/trace/Kconfig |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index ea37e2f..9c24fe5 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -403,6 +403,18 @@ config FUNCTION_PROFILER
>  
>  	  If in doubt, say N.
>  
> +config SMALL_TRACES
> +	bool "Shrink trace events"
> +	default n
> +	help
> +	  Minimize the size of trace events. Omits fields from trace event
> +	  header like preempt_count, and lock_depth. Chooses smaller
> +	  alternative event structures. Shrinking the payload size enables
> +	  fitting more traces in the ring buffer, which enables tracing for
> +	  longer periods when the buffer can not be read as fast as it is
> +	  filled, at the cost of losing some details.
> +
> +
>  config FTRACE_MCOUNT_RECORD
>  	def_bool y
>  	depends on DYNAMIC_FTRACE




* Re: [PATCH 10/15] ftrace: fix event alignment: jbd2:*
  2011-03-09  0:03     ` Steven Rostedt
@ 2011-03-09  0:31       ` Ted Ts'o
  2011-03-09  0:41         ` Steven Rostedt
  0 siblings, 1 reply; 74+ messages in thread
From: Ted Ts'o @ 2011-03-09  0:31 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: David Sharp, linux-kernel, mrubin, Jiaying Zhang

On Tue, Mar 08, 2011 at 07:03:25PM -0500, Steven Rostedt wrote:
> Now I know why I was so late at adding this. I was waiting for
> Acked-bys ;)
> 
> On Fri, 2010-12-03 at 20:52 -0500, Steven Rostedt wrote:
> > [ Ccing Ted for Acked-by ]
> 
> Ted, Ping?

A month or two ago I did an analysis using the pahole binary (part of
the 'dwarves' package in Debian/Ubuntu) and noticed many more
structures that were flagged as having 'holes' where there were
optimization opportunities.

In addition, (in an interesting example of synchronicity) just this
afternoon, I had talked to Jiaying about a further optimization of the
ext4 trace header files, since she pointed out to me that the 'perf'
binary can actually handle the MAJOR() and MINOR() macros.  So we can
replace 8 bytes of "int major, minor", with a 4 byte dev_t, and save
another 4 bytes (assuming we can move the structure elements around so
we can save a full 8 bytes to avoid structure padding).

Jiaying was also going to send some patches to add some additional
tracepoints to the upstream kernel.

So probably the best thing to do at this point is to drop the ext4 and
jbd2 patches, since Jiaying is about to send me patches that handle
these optimizations plus some further ones --- and add some additional
tracepoints.  Getting one patch from her that I'll feed in via the
ext4 tree probably will be much simpler than coordinating patches to
the same file from two different trees.

						- Ted


* Re: [PATCH 10/15] ftrace: fix event alignment: jbd2:*
  2011-03-09  0:31       ` Ted Ts'o
@ 2011-03-09  0:41         ` Steven Rostedt
  0 siblings, 0 replies; 74+ messages in thread
From: Steven Rostedt @ 2011-03-09  0:41 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: David Sharp, linux-kernel, mrubin, Jiaying Zhang

On Tue, 2011-03-08 at 19:31 -0500, Ted Ts'o wrote:
> On Tue, Mar 08, 2011 at 07:03:25PM -0500, Steven Rostedt wrote:
> > Now I know why I was so late at adding this. I was waiting for
> > Acked-bys ;)
> > 
> > On Fri, 2010-12-03 at 20:52 -0500, Steven Rostedt wrote:
> > > [ Ccing Ted for Acked-by ]
> > 
> > Ted, Ping?
> 
> A month or two ago I did an analysis using the pahole binary (part of
> the 'dwarves' package in Debian/Ubuntu) and noticed many more
> structures that were flagged as having 'holes' where there were
> optimization opportunities.
> 
> In addition, (in an interesting example of synchronicity) just this
> afternoon, I had talked to Jiaying about a further optimization of the
> ext4 trace header files, since she pointed out to me that the 'perf'
> binary can actually handle the MAJOR() and MINOR() macros.  So we can
> replace 8 bytes of "int major, minor", with a 4 byte dev_t, and save
> another 4 bytes (assuming we can move the structure elements around so
> we can save a full 8 bytes to avoid structure padding).
> 
> Jiaying was also going to send some patches to add some additional
> tracepoints to the upstream kernel.
> 
> So probably the best thing to do at this point is to drop the ext4 and
> jbd2 patches, since Jiaying is about to send me patches that handle
> these optimizations plus some further ones --- and add some additional
> tracepoints.  Getting one patch from her that I'll feed in via the
> ext4 tree probably will be much simpler than coordinating patches to
> the same file from two different trees.

OK, thanks for the explanation. I'll go ahead and drop those two patches
from my queue.

-- Steve




* Re: [PATCH] tracing: Add an 'overwrite' trace_option.
  2010-12-08 21:46       ` [PATCH] tracing: Add an 'overwrite' trace_option David Sharp
  2010-12-14  0:39         ` David Sharp
@ 2011-03-09  0:45         ` David Sharp
  2011-03-11  9:45         ` [tip:perf/core] " tip-bot for David Sharp
  2 siblings, 0 replies; 74+ messages in thread
From: David Sharp @ 2011-03-09  0:45 UTC (permalink / raw)
  To: rostedt; +Cc: linux-kernel, mrubin, jiayingz, David Sharp

On Wed, Dec 8, 2010 at 1:46 PM, David Sharp <dhsharp@google.com> wrote:
> Add an "overwrite" trace_option for ftrace to control whether the buffer should
> be overwritten on overflow or not. The default remains to overwrite old events
> when the buffer is full. This patch adds the option to instead discard newest
> events when the buffer is full. This is useful to get a snapshot of traces just
> after enabling traces. Dropping the current event is also a simpler code path.

Signed-off-by: David Sharp <dhsharp@google.com>

> ---
>  Documentation/trace/ftrace.txt |    5 +++++
>  include/linux/ring_buffer.h    |    2 ++
>  kernel/trace/ring_buffer.c     |   11 +++++++++++
>  kernel/trace/trace.c           |   17 +++++++++++------
>  kernel/trace/trace.h           |    1 +
>  5 files changed, 30 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
> index 557c1ed..2842ea7 100644
> --- a/Documentation/trace/ftrace.txt
> +++ b/Documentation/trace/ftrace.txt
> @@ -491,6 +491,11 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
>                    latencies, as described in "Latency
>                    trace format".
>
> +  overwrite - This controls what happens when the trace buffer is
> +              full. If "1" (default), the oldest events are
> +              discarded and overwritten. If "0", then the newest
> +              events are discarded.
> +
>  sched_switch
>  ------------
>
> diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
> index 8d3a248..ab38ac8 100644
> --- a/include/linux/ring_buffer.h
> +++ b/include/linux/ring_buffer.h
> @@ -100,6 +100,8 @@ void ring_buffer_free(struct ring_buffer *buffer);
>
>  int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size);
>
> +void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val);
> +
>  struct ring_buffer_event *ring_buffer_lock_reserve(struct ring_buffer *buffer,
>                                                   unsigned long length);
>  int ring_buffer_unlock_commit(struct ring_buffer *buffer,
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 9ed509a..3207147 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -1429,6 +1429,17 @@ int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size)
>  }
>  EXPORT_SYMBOL_GPL(ring_buffer_resize);
>
> +void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val)
> +{
> +       mutex_lock(&buffer->mutex);
> +       if (val)
> +               buffer->flags |= RB_FL_OVERWRITE;
> +       else
> +               buffer->flags &= ~RB_FL_OVERWRITE;
> +       mutex_unlock(&buffer->mutex);
> +}
> +EXPORT_SYMBOL_GPL(ring_buffer_change_overwrite);
> +
>  static inline void *
>  __rb_data_page_index(struct buffer_data_page *bpage, unsigned index)
>  {
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index c380612..6f7494c 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -41,8 +41,6 @@
>  #include "trace.h"
>  #include "trace_output.h"
>
> -#define TRACE_BUFFER_FLAGS     (RB_FL_OVERWRITE)
> -
>  /*
>  * On boot up, the ring buffer is set to the minimum size, so that
>  * we do not waste memory on systems that are not using tracing.
> @@ -340,7 +338,7 @@ static DECLARE_WAIT_QUEUE_HEAD(trace_wait);
>  /* trace_flags holds trace_options default values */
>  unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK |
>        TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME |
> -       TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD;
> +       TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD | TRACE_ITER_OVERWRITE;
>
>  static int trace_stop_count;
>  static DEFINE_SPINLOCK(tracing_start_lock);
> @@ -425,6 +423,7 @@ static const char *trace_options[] = {
>        "sleep-time",
>        "graph-time",
>        "record-cmd",
> +       "overwrite",
>        NULL
>  };
>
> @@ -2523,6 +2522,9 @@ static void set_tracer_flags(unsigned int mask, int enabled)
>
>        if (mask == TRACE_ITER_RECORD_CMD)
>                trace_event_enable_cmd_record(enabled);
> +
> +       if (mask == TRACE_ITER_OVERWRITE)
> +               ring_buffer_change_overwrite(global_trace.buffer, enabled);
>  }
>
>  static ssize_t
> @@ -4545,9 +4547,11 @@ void ftrace_dump(enum ftrace_dump_mode oops_dump_mode)
>  __init static int tracer_alloc_buffers(void)
>  {
>        int ring_buf_size;
> +       enum ring_buffer_flags rb_flags;
>        int i;
>        int ret = -ENOMEM;
>
> +
>        if (!alloc_cpumask_var(&tracing_buffer_mask, GFP_KERNEL))
>                goto out;
>
> @@ -4560,12 +4564,13 @@ __init static int tracer_alloc_buffers(void)
>        else
>                ring_buf_size = 1;
>
> +       rb_flags = trace_flags & TRACE_ITER_OVERWRITE ? RB_FL_OVERWRITE : 0;
> +
>        cpumask_copy(tracing_buffer_mask, cpu_possible_mask);
>        cpumask_copy(tracing_cpumask, cpu_all_mask);
>
>        /* TODO: make the number of buffers hot pluggable with CPUS */
> -       global_trace.buffer = ring_buffer_alloc(ring_buf_size,
> -                                                  TRACE_BUFFER_FLAGS);
> +       global_trace.buffer = ring_buffer_alloc(ring_buf_size, rb_flags);
>        if (!global_trace.buffer) {
>                printk(KERN_ERR "tracer: failed to allocate ring buffer!\n");
>                WARN_ON(1);
> @@ -4575,7 +4580,7 @@ __init static int tracer_alloc_buffers(void)
>
>
>  #ifdef CONFIG_TRACER_MAX_TRACE
> -       max_tr.buffer = ring_buffer_alloc(1, TRACE_BUFFER_FLAGS);
> +       max_tr.buffer = ring_buffer_alloc(1, rb_flags);
>        if (!max_tr.buffer) {
>                printk(KERN_ERR "tracer: failed to allocate max ring buffer!\n");
>                WARN_ON(1);
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 9021f8c..9cd025a 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -606,6 +606,7 @@ enum trace_iterator_flags {
>        TRACE_ITER_SLEEP_TIME           = 0x40000,
>        TRACE_ITER_GRAPH_TIME           = 0x80000,
>        TRACE_ITER_RECORD_CMD           = 0x100000,
> +       TRACE_ITER_OVERWRITE            = 0x200000,
>  };
>
>  /*
> --
> 1.7.3.1
>
>


* Re: [PATCH 04/15] ftrace: pack event structures.
  2011-03-08 23:30   ` Steven Rostedt
@ 2011-03-09  1:09     ` Steven Rostedt
  2011-03-09  6:39     ` David Miller
  1 sibling, 0 replies; 74+ messages in thread
From: Steven Rostedt @ 2011-03-09  1:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, mrubin, David Sharp

On Tue, 2011-03-08 at 18:30 -0500, Steven Rostedt wrote:
> David (Miller),
> 
> I'm going back through this patch set and looking to incorporate it into
> 2.6.39. But I know how touchy sparc is with "packed" data structures. Do
> you see this patch as hurting sparc?
> 
> I wonder if we should change this to something like:
> 
> #ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> # define do_event_packed __attribute__((packed))
> #else
> # define do_event_packed
> #endif
> 
> and use "do_event_packed" instead?
> 

Here's the modified patch, if you think the current one is not OK:

-- Steve

commit c49fd97767b112a9a7a50c184caf5059f798d5a0
Author: David Sharp <dhsharp@google.com>
Date:   Fri Dec 3 16:13:18 2010 -0800

    tracing: Pack event structures.
    
    Ftrace event structures have a 12-byte struct trace_entry at the beginning.
    If the structure is aligned, this means that if the first field is 64-bits,
    there will be 4 bytes of padding. Ironically, due to the 4-byte ringbuffer
    header, this will make 64-bit writes unaligned, if the ring buffer position
    is currently 64-bit aligned:
    4(rb)+12(ftrace)+4(pad) = 20;  20%8 = 4
    
    Adding __attribute__((packed)) to the event structures removes the extra
    space from the trace events, and actually improves alignment of trace
    events with a first field that is 64-bits.
    
    About 65 tracepoints have a 4-byte pad at offset 12:
     # find events -name format | xargs -n1 awk '
       $1=="name:" {name=$2}
       $1=="format:"{FS="\t"}
       $3=="offset:12;" && $4=="size:4;"{okay=1}
       $3=="offset:16;" && !okay {print name}' | wc -l
    65
    
    With all 'syscalls' and 'timer' events enabled, this results in a 5%
    improvement in a simple 512MB read benchmark with warm caches.
    
    Tested:
    
    setup:
     # echo 1 >events/syscalls/enable
     # echo 1 >events/timer/enable
     # echo 0 > tracing_enabled
    off:
     # for n in $(seq 10) ; do \
       time dd if=/dev/hda3 of=/dev/null bs=1K count=512K ; \
       done
    on:
     # for n in $(seq 10) ; do \
       echo 1 >tracing_enabled; \
       time dd if=/dev/hda3 of=/dev/null bs=1K count=512K ; \
       echo 0 >tracing_enabled; \
       echo > trace; \
       done
    
    real time      mean/median/stdev
    w/o patch off: 1.1679/1.164/0.0169
    w/o patch on : 1.9432/1.936/0.0274
    w/  patch off: 1.1715/1.159/0.0431
    w/  patch on : 1.8425/1.836/0.0138
    "on" delta: -0.1007 --> -5.2%
    
    Google-Bug-Id: 2895627
    
    Cc: David S. Miller <davem@davemloft.net>
    Signed-off-by: David Sharp <dhsharp@google.com>
    LKML-Reference: <1291421609-14665-5-git-send-email-dhsharp@google.com>
    
    [
      Replaced "__attribute__((packed))" with "ftrace_do_packed" which
      is a macro that is defined as the packed attribute only if the
      arch has efficient unaligned access.
    ]
    
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 47e3997..c920ab5 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -282,4 +282,16 @@ perf_trace_buf_submit(void *raw_data, int size, int rctx, u64 addr,
 }
 #endif
 
+/*
+ * On systems that can access unaligned data efficiently we
+ * pack the data structures. This actually gives a boost
+ * in tracing speed. But leave the structs alone for archs
+ * that do not do as well with packed data structures.
+ */
+#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+# define ftrace_do_packed __attribute__((packed))
+#else
+# define ftrace_do_packed
+#endif
+
 #endif /* _LINUX_FTRACE_EVENT_H */
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 3e68366..6b8f366 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -48,7 +48,8 @@
 #define __array(type, item, len)	type	item[len];
 
 #undef __dynamic_array
-#define __dynamic_array(type, item, len) u32 __data_loc_##item;
+#define __dynamic_array(type, item, len) \
+    u32 __data_loc_##item __attribute__((aligned(4)));
 
 #undef __string
 #define __string(item, src) __dynamic_array(char, item, -1)
@@ -62,7 +63,7 @@
 		struct trace_entry	ent;				\
 		tstruct							\
 		char			__data[0];			\
-	};								\
+	} ftrace_do_packed;						\
 									\
 	static struct ftrace_event_class event_class_##name;
 
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 9cd025a..ce185fe 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -60,7 +60,7 @@ enum trace_type {
 	struct struct_name {					\
 		struct trace_entry	ent;			\
 		tstruct						\
-	}
+	} ftrace_do_packed
 
 #undef TP_ARGS
 #define TP_ARGS(args...)	args




* Re: [PATCH 04/15] ftrace: pack event structures.
  2011-03-08 23:30   ` Steven Rostedt
  2011-03-09  1:09     ` Steven Rostedt
@ 2011-03-09  6:39     ` David Miller
  2011-03-09 15:18       ` Steven Rostedt
  1 sibling, 1 reply; 74+ messages in thread
From: David Miller @ 2011-03-09  6:39 UTC (permalink / raw)
  To: rostedt; +Cc: linux-kernel, mrubin, dhsharp

From: Steven Rostedt <rostedt@goodmis.org>
Date: Tue, 08 Mar 2011 18:30:03 -0500

> I'm going back through this patch set and looking to incorporate it into
> 2.6.39. But I know how touchy sparc is with "packed" data structures. Do
> you see this patch as hurting sparc?
> 
> I wonder if we should change this to something like:
> 
> #ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> # define do_event_packed __attribute__((packed))
> #else
> # define do_event_packed
> #endif
> 
> and use "do_event_packed" instead?

I think you should elide packed at all costs, and instead tell the compiler
what your intentions are by exposing the real types using unions or
similar instead of hiding it in an opaque way behind final char[] arrays.



* Re: [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall
  2011-03-08 23:55             ` Steven Rostedt
@ 2011-03-09  8:50               ` Avi Kivity
  2011-03-09 12:54                 ` Steven Rostedt
  0 siblings, 1 reply; 74+ messages in thread
From: Avi Kivity @ 2011-03-09  8:50 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: David Sharp, linux-kernel, mrubin, kvm-devel

On 03/09/2011 01:55 AM, Steven Rostedt wrote:
> Hi Avi,
>
> This patch set got lost in the shuffle, and I'm not looking to include
> it.

Anything I need to do?

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall
  2011-03-09  8:50               ` Avi Kivity
@ 2011-03-09 12:54                 ` Steven Rostedt
  2011-03-09 13:01                   ` Avi Kivity
  0 siblings, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2011-03-09 12:54 UTC (permalink / raw)
  To: Avi Kivity; +Cc: David Sharp, linux-kernel, mrubin, kvm-devel

On Wed, 2011-03-09 at 10:50 +0200, Avi Kivity wrote:
> On 03/09/2011 01:55 AM, Steven Rostedt wrote:
> > Hi Avi,
> >
> > This patch set got lost in the shuffle, and I'm not looking to include

Nasty typo:  s/not/now/

> > it.
> 
> Anything I need to do?
> 

Yeah, give me your Acked-by ;)

-- Steve




^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall
  2011-03-09 12:54                 ` Steven Rostedt
@ 2011-03-09 13:01                   ` Avi Kivity
  0 siblings, 0 replies; 74+ messages in thread
From: Avi Kivity @ 2011-03-09 13:01 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: David Sharp, linux-kernel, mrubin, kvm-devel

On 03/09/2011 02:54 PM, Steven Rostedt wrote:
> On Wed, 2011-03-09 at 10:50 +0200, Avi Kivity wrote:
> >  On 03/09/2011 01:55 AM, Steven Rostedt wrote:
> >  >  Hi Avi,
> >  >
> >  >  This patch set got lost in the shuffle, and I'm not looking to include
>
> Nasty typo:  s/not/now/

Makes sense now.

> >  >  it.
> >
> >  Anything I need to do?
> >
>
> Yeah, give me your Acked-by ;)

Acked-by: $me.

Sorry, your previous mail got lost.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 04/15] ftrace: pack event structures.
  2010-12-04  0:13 ` [PATCH 04/15] ftrace: pack event structures David Sharp
  2011-03-08 23:30   ` Steven Rostedt
@ 2011-03-09 13:01   ` Steven Rostedt
  1 sibling, 0 replies; 74+ messages in thread
From: Steven Rostedt @ 2011-03-09 13:01 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, mrubin, David S. Miller, Frederic Weisbecker

On Fri, 2010-12-03 at 16:13 -0800, David Sharp wrote:
> Ftrace event structures have a 12-byte struct trace_entry at the beginning.
> If the structure is aligned, this means that if the first field is 64-bits,
> there will be 4 bytes of padding. Ironically, due to the 4-byte ringbuffer
> header, this will make 64-bit writes unaligned, if the ring buffer position
> is currently 64-bit aligned:
> 4(rb)+12(ftrace)+4(pad) = 20;  20%8 = 4

Actually the better answer is: it's time to nuke the lock_depth field.
That was added temporarily to help remove the BKL, and now that the BKL
is pretty much gone, I say we nuke the field too. That will remove 4
bytes from the 12-byte ftrace header, giving us an 8-byte header. Much
nicer :)

I'll do that today. Thanks!

-- Steve

> 
> Adding __attribute__((packed)) to the event structures removes the extra
> space from the trace events, and actually improves alignment of trace
> events with a first field that is 64-bits.
> 
> About 65 tracepoints have a 4-byte pad at offset 12:
>  # find events -name format | xargs -n1 awk '
>    $1=="name:" {name=$2}
>    $1=="format:"{FS="\t"}
>    $3=="offset:12;" && $4=="size:4;"{okay=1}
>    $3=="offset:16;" && !okay {print name}' | wc -l
> 65
> 
> With all 'syscalls' and 'timer' events enabled, this results in a 5%
> improvement in a simple 512MB read benchmark with warm caches.



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 04/15] ftrace: pack event structures.
  2011-03-09  6:39     ` David Miller
@ 2011-03-09 15:18       ` Steven Rostedt
  2011-03-10 23:21         ` David Sharp
  0 siblings, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2011-03-09 15:18 UTC (permalink / raw)
  To: David Miller; +Cc: linux-kernel, mrubin, dhsharp

On Tue, 2011-03-08 at 22:39 -0800, David Miller wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
> Date: Tue, 08 Mar 2011 18:30:03 -0500
> 
> > I'm going back through this patch set and looking to incorporate it into
> > 2.6.39. But I know how touchy sparc is with "packed" data structures. Do
> > you see this patch as hurting sparc?
> > 
> > I wonder if we should change this to something like:
> > 
> > #ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> > # define do_event_packed __attribute__((packed))
> > #else
> > # define do_event_packed
> > #endif
> > 
> > and use "do_event_packed" instead?
> 
> I think you should elide packed at all costs, and instead tell the compiler
> what your intentions are by exposing the real types using unions or
> similar instead of hiding it in an opaque way behind final char[] arrays.

The problem here is not the issue of char[], but that ftrace's header
was 12 bytes, which caused everything that had an 8-byte word to use
another 4 bytes to align it. As I stated in another email, I think
removing the "int lock_depth" from the ftrace header should solve this.

David (Sharp), the reason for the "packed" was because of that extra 4
bytes in the ftrace header, right? So removing the lock_depth should fix
the issues that you've seen?

-- Steve




^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 04/15] ftrace: pack event structures.
  2011-03-09 15:18       ` Steven Rostedt
@ 2011-03-10 23:21         ` David Sharp
  2011-03-11  3:37           ` Steven Rostedt
  0 siblings, 1 reply; 74+ messages in thread
From: David Sharp @ 2011-03-10 23:21 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: David Miller, linux-kernel, mrubin

On Wed, Mar 9, 2011 at 7:18 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 2011-03-08 at 22:39 -0800, David Miller wrote:
>> From: Steven Rostedt <rostedt@goodmis.org>
>> Date: Tue, 08 Mar 2011 18:30:03 -0500
>>
>> > I'm going back through this patch set and looking to incorporate it into
>> > 2.6.39. But I know how touchy sparc is with "packed" data structures. Do
>> > you see this patch as hurting sparc?
>> >
>> > I wonder if we should change this to something like:
>> >
>> > #ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>> > # define do_event_packed __attribute__((packed))
>> > #else
>> > # define do_event_packed
>> > #endif
>> >
>> > and use "do_event_packed" instead?
>>
>> I think you should elide packed at all costs, and instead tell the compiler
>> what your intentions are by exposing the real types using unions or
>> similar instead of hiding it in an opaque way behind final char[] arrays.
>
> The problem here is not the issue of char[], but that ftrace's header
> was 12 bytes, which caused everything that had an 8-byte word to use
> another 4 bytes to align it.

Not only that, but those extra "4 bytes to align it" have a good
chance of in fact making the 8-byte word *un*aligned, so it's really
just wasted space.

> As I stated in another email, I think
> removing the "int lock_depth" from the ftrace header should solve this.
>
> David (Sharp), the reason for the "packed" was because of that extra 4
> bytes in the ftrace header, right? So removing the lock_depth should fix
> the issues that you've seen?

Yes, I think that should have the same effect. Of course the next time
we come along and remove other useless fields like 'flags' and
'preempt_count', a similar condition could arise.

(Note I don't necessarily think that 'flags' and 'preempt_count' are
useless; although I don't know what they're used for, or why they need
to be in every entry. fwiw, we've taken trace_entry down to 4 bytes of
{unsigned short type; unsigned short pid;}, and we even have plans to
completely remove pid.)

>
> -- Steve
>
>
>
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 04/15] ftrace: pack event structures.
  2011-03-10 23:21         ` David Sharp
@ 2011-03-11  3:37           ` Steven Rostedt
  0 siblings, 0 replies; 74+ messages in thread
From: Steven Rostedt @ 2011-03-11  3:37 UTC (permalink / raw)
  To: David Sharp; +Cc: David Miller, linux-kernel, mrubin

On Thu, 2011-03-10 at 15:21 -0800, David Sharp wrote:

> Yes, I think that should have the same effect. Of course the next time
> we come along and remove other useless fields like 'flags' and
> 'preempt_count', a similar condition could arise.
> 
> (Note I don't necessarily think that 'flags' and 'preempt_count' are
> useless; although I don't know what they're used for, or why they need
> to be in every entry. fwiw, we've taken trace_entry down to 4 bytes of
> {unsigned short type; unsigned short pid;}, and we even have plans to
> completely remove pid.)

The flags and preempt count are legacy from the latency tracer, but they
do come in quite handy when debugging -rt code. I've gotten trace data
from customers where knowing whether preemption was enabled or disabled,
and whether or not something happened in interrupt context, was invaluable.

That said, I would love to figure out a nice way to make these optional.

-- Steve



^ permalink raw reply	[flat|nested] 74+ messages in thread

* [tip:perf/core] tracing: Add an 'overwrite' trace_option.
  2010-12-08 21:46       ` [PATCH] tracing: Add an 'overwrite' trace_option David Sharp
  2010-12-14  0:39         ` David Sharp
  2011-03-09  0:45         ` David Sharp
@ 2011-03-11  9:45         ` tip-bot for David Sharp
  2 siblings, 0 replies; 74+ messages in thread
From: tip-bot for David Sharp @ 2011-03-11  9:45 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, dhsharp, tglx

Commit-ID:  750912fa366312e9c5bc83eab352898a26750401
Gitweb:     http://git.kernel.org/tip/750912fa366312e9c5bc83eab352898a26750401
Author:     David Sharp <dhsharp@google.com>
AuthorDate: Wed, 8 Dec 2010 13:46:47 -0800
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Wed, 9 Mar 2011 13:52:27 -0500

tracing: Add an 'overwrite' trace_option.

Add an "overwrite" trace_option for ftrace to control whether the buffer should
be overwritten on overflow or not. The default remains to overwrite old events
when the buffer is full. This patch adds the option to instead discard newest
events when the buffer is full. This is useful to get a snapshot of the trace
just after enabling tracing. Dropping the current event is also a simpler code
path.

Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1291844807-15481-1-git-send-email-dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 Documentation/trace/ftrace.txt |    5 +++++
 include/linux/ring_buffer.h    |    2 ++
 kernel/trace/ring_buffer.c     |   11 +++++++++++
 kernel/trace/trace.c           |   17 +++++++++++------
 kernel/trace/trace.h           |    1 +
 5 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
index 67f1cc4..1ebc24c 100644
--- a/Documentation/trace/ftrace.txt
+++ b/Documentation/trace/ftrace.txt
@@ -454,6 +454,11 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
                    latencies, as described in "Latency
                    trace format".
 
+  overwrite - This controls what happens when the trace buffer is
+              full. If "1" (default), the oldest events are
+              discarded and overwritten. If "0", then the newest
+              events are discarded.
+
 ftrace_enabled
 --------------
 
diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 8d3a248..ab38ac8 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -100,6 +100,8 @@ void ring_buffer_free(struct ring_buffer *buffer);
 
 int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size);
 
+void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val);
+
 struct ring_buffer_event *ring_buffer_lock_reserve(struct ring_buffer *buffer,
 						   unsigned long length);
 int ring_buffer_unlock_commit(struct ring_buffer *buffer,
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index bd1c35a..269db80 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1429,6 +1429,17 @@ int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size)
 }
 EXPORT_SYMBOL_GPL(ring_buffer_resize);
 
+void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val)
+{
+	mutex_lock(&buffer->mutex);
+	if (val)
+		buffer->flags |= RB_FL_OVERWRITE;
+	else
+		buffer->flags &= ~RB_FL_OVERWRITE;
+	mutex_unlock(&buffer->mutex);
+}
+EXPORT_SYMBOL_GPL(ring_buffer_change_overwrite);
+
 static inline void *
 __rb_data_page_index(struct buffer_data_page *bpage, unsigned index)
 {
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8dc8da6..85e3ee1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -41,8 +41,6 @@
 #include "trace.h"
 #include "trace_output.h"
 
-#define TRACE_BUFFER_FLAGS	(RB_FL_OVERWRITE)
-
 /*
  * On boot up, the ring buffer is set to the minimum size, so that
  * we do not waste memory on systems that are not using tracing.
@@ -340,7 +338,7 @@ static DECLARE_WAIT_QUEUE_HEAD(trace_wait);
 /* trace_flags holds trace_options default values */
 unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK |
 	TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME |
-	TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD;
+	TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD | TRACE_ITER_OVERWRITE;
 
 static int trace_stop_count;
 static DEFINE_SPINLOCK(tracing_start_lock);
@@ -425,6 +423,7 @@ static const char *trace_options[] = {
 	"sleep-time",
 	"graph-time",
 	"record-cmd",
+	"overwrite",
 	NULL
 };
 
@@ -2529,6 +2528,9 @@ static void set_tracer_flags(unsigned int mask, int enabled)
 
 	if (mask == TRACE_ITER_RECORD_CMD)
 		trace_event_enable_cmd_record(enabled);
+
+	if (mask == TRACE_ITER_OVERWRITE)
+		ring_buffer_change_overwrite(global_trace.buffer, enabled);
 }
 
 static ssize_t
@@ -4555,9 +4557,11 @@ void ftrace_dump(enum ftrace_dump_mode oops_dump_mode)
 __init static int tracer_alloc_buffers(void)
 {
 	int ring_buf_size;
+	enum ring_buffer_flags rb_flags;
 	int i;
 	int ret = -ENOMEM;
 
+
 	if (!alloc_cpumask_var(&tracing_buffer_mask, GFP_KERNEL))
 		goto out;
 
@@ -4570,12 +4574,13 @@ __init static int tracer_alloc_buffers(void)
 	else
 		ring_buf_size = 1;
 
+	rb_flags = trace_flags & TRACE_ITER_OVERWRITE ? RB_FL_OVERWRITE : 0;
+
 	cpumask_copy(tracing_buffer_mask, cpu_possible_mask);
 	cpumask_copy(tracing_cpumask, cpu_all_mask);
 
 	/* TODO: make the number of buffers hot pluggable with CPUS */
-	global_trace.buffer = ring_buffer_alloc(ring_buf_size,
-						   TRACE_BUFFER_FLAGS);
+	global_trace.buffer = ring_buffer_alloc(ring_buf_size, rb_flags);
 	if (!global_trace.buffer) {
 		printk(KERN_ERR "tracer: failed to allocate ring buffer!\n");
 		WARN_ON(1);
@@ -4585,7 +4590,7 @@ __init static int tracer_alloc_buffers(void)
 
 
 #ifdef CONFIG_TRACER_MAX_TRACE
-	max_tr.buffer = ring_buffer_alloc(1, TRACE_BUFFER_FLAGS);
+	max_tr.buffer = ring_buffer_alloc(1, rb_flags);
 	if (!max_tr.buffer) {
 		printk(KERN_ERR "tracer: failed to allocate max ring buffer!\n");
 		WARN_ON(1);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 856e73c..951d0b7 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -606,6 +606,7 @@ enum trace_iterator_flags {
 	TRACE_ITER_SLEEP_TIME		= 0x40000,
 	TRACE_ITER_GRAPH_TIME		= 0x80000,
 	TRACE_ITER_RECORD_CMD		= 0x100000,
+	TRACE_ITER_OVERWRITE		= 0x200000,
 };
 
 /*

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:perf/core] ring-buffer: Remove unused #include <linux/trace_irq.h>
  2010-12-04  0:13 ` [PATCH 02/15] ring_buffer.c: Remove unused #include <linux/trace_irq.h> David Sharp
@ 2011-03-11  9:46   ` tip-bot for David Sharp
  0 siblings, 0 replies; 74+ messages in thread
From: tip-bot for David Sharp @ 2011-03-11  9:46 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, dhsharp, tglx

Commit-ID:  de29be5e712dc8b7eef2bef9417af3bb6a88e47a
Gitweb:     http://git.kernel.org/tip/de29be5e712dc8b7eef2bef9417af3bb6a88e47a
Author:     David Sharp <dhsharp@google.com>
AuthorDate: Fri, 3 Dec 2010 16:13:16 -0800
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Wed, 9 Mar 2011 13:52:28 -0500

ring-buffer: Remove unused #include <linux/trace_irq.h>

Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1291421609-14665-3-git-send-email-dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ring_buffer.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 269db80..3237d96 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -5,7 +5,6 @@
  */
 #include <linux/ring_buffer.h>
 #include <linux/trace_clock.h>
-#include <linux/ftrace_irq.h>
 #include <linux/spinlock.h>
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:perf/core] tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup
  2010-12-04  0:13 ` [PATCH 05/15] ftrace: fix event alignment: ftrace:context_switch and ftrace:wakeup David Sharp
@ 2011-03-11  9:48   ` tip-bot for David Sharp
  0 siblings, 0 replies; 74+ messages in thread
From: tip-bot for David Sharp @ 2011-03-11  9:48 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, dhsharp, tglx

Commit-ID:  140e4f2d1cd816aed196705c036763313c0e4bd3
Gitweb:     http://git.kernel.org/tip/140e4f2d1cd816aed196705c036763313c0e4bd3
Author:     David Sharp <dhsharp@google.com>
AuthorDate: Fri, 3 Dec 2010 16:13:19 -0800
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Thu, 10 Mar 2011 10:34:16 -0500

tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup

Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1291421609-14665-6-git-send-email-dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_entries.h |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_entries.h b/kernel/trace/trace_entries.h
index 6cf2237..1516cb3 100644
--- a/kernel/trace/trace_entries.h
+++ b/kernel/trace/trace_entries.h
@@ -109,12 +109,12 @@ FTRACE_ENTRY(funcgraph_exit, ftrace_graph_ret_entry,
  */
 #define FTRACE_CTX_FIELDS					\
 	__field(	unsigned int,	prev_pid	)	\
+	__field(	unsigned int,	next_pid	)	\
+	__field(	unsigned int,	next_cpu	)       \
 	__field(	unsigned char,	prev_prio	)	\
 	__field(	unsigned char,	prev_state	)	\
-	__field(	unsigned int,	next_pid	)	\
 	__field(	unsigned char,	next_prio	)	\
-	__field(	unsigned char,	next_state	)	\
-	__field(	unsigned int,	next_cpu	)
+	__field(	unsigned char,	next_state	)
 
 FTRACE_ENTRY(context_switch, ctx_switch_entry,
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:perf/core] tracing: Fix event alignment: module:module_request
  2010-12-04  0:13 ` [PATCH 06/15] ftrace: fix event alignment: module:module_request David Sharp
  2010-12-04  1:47   ` Steven Rostedt
@ 2011-03-11  9:48   ` tip-bot for David Sharp
  1 sibling, 0 replies; 74+ messages in thread
From: tip-bot for David Sharp @ 2011-03-11  9:48 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, lizf, dhsharp, tglx

Commit-ID:  b5e3008e489f5a00c6d5db914a4c4338c9ef5e8b
Gitweb:     http://git.kernel.org/tip/b5e3008e489f5a00c6d5db914a4c4338c9ef5e8b
Author:     David Sharp <dhsharp@google.com>
AuthorDate: Fri, 3 Dec 2010 16:13:20 -0800
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Thu, 10 Mar 2011 10:34:20 -0500

tracing: Fix event alignment: module:module_request

Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1291421609-14665-7-git-send-email-dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/trace/events/module.h |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/trace/events/module.h b/include/trace/events/module.h
index c6bae36..21a546d 100644
--- a/include/trace/events/module.h
+++ b/include/trace/events/module.h
@@ -108,14 +108,14 @@ TRACE_EVENT(module_request,
 	TP_ARGS(name, wait, ip),
 
 	TP_STRUCT__entry(
-		__field(	bool,		wait		)
 		__field(	unsigned long,	ip		)
+		__field(	bool,		wait		)
 		__string(	name,		name		)
 	),
 
 	TP_fast_assign(
-		__entry->wait	= wait;
 		__entry->ip	= ip;
+		__entry->wait	= wait;
 		__assign_str(name, name);
 	),
 
@@ -129,4 +129,3 @@ TRACE_EVENT(module_request,
 
 /* This part must be outside protection */
 #include <trace/define_trace.h>
-

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:perf/core] tracing: Fix event alignment: kvm:kvm_hv_hypercall
  2010-12-04  0:13 ` [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall David Sharp
  2010-12-04  1:49   ` Steven Rostedt
  2010-12-04  8:11   ` Avi Kivity
@ 2011-03-11  9:49   ` tip-bot for David Sharp
  2 siblings, 0 replies; 74+ messages in thread
From: tip-bot for David Sharp @ 2011-03-11  9:49 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, dhsharp, tglx, avi

Commit-ID:  d5bf2ff07230a4a1b73ecb22363f77c02e1d85ab
Gitweb:     http://git.kernel.org/tip/d5bf2ff07230a4a1b73ecb22363f77c02e1d85ab
Author:     David Sharp <dhsharp@google.com>
AuthorDate: Fri, 3 Dec 2010 16:13:21 -0800
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Thu, 10 Mar 2011 10:34:24 -0500

tracing: Fix event alignment: kvm:kvm_hv_hypercall

Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1291421609-14665-8-git-send-email-dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kvm/trace.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 1357d7c..db93276 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -62,21 +62,21 @@ TRACE_EVENT(kvm_hv_hypercall,
 	TP_ARGS(code, fast, rep_cnt, rep_idx, ingpa, outgpa),
 
 	TP_STRUCT__entry(
-		__field(	__u16, 		code		)
-		__field(	bool,		fast		)
 		__field(	__u16,		rep_cnt		)
 		__field(	__u16,		rep_idx		)
 		__field(	__u64,		ingpa		)
 		__field(	__u64,		outgpa		)
+		__field(	__u16, 		code		)
+		__field(	bool,		fast		)
 	),
 
 	TP_fast_assign(
-		__entry->code		= code;
-		__entry->fast		= fast;
 		__entry->rep_cnt	= rep_cnt;
 		__entry->rep_idx	= rep_idx;
 		__entry->ingpa		= ingpa;
 		__entry->outgpa		= outgpa;
+		__entry->code		= code;
+		__entry->fast		= fast;
 	),
 
 	TP_printk("code 0x%x %s cnt 0x%x idx 0x%x in 0x%llx out 0x%llx",

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:perf/core] tracing: Fix event alignment: mce:mce_record
  2010-12-04  0:13 ` [PATCH 08/15] ftrace: fix event alignment: mce:mce_record David Sharp
  2010-12-04  1:50   ` Steven Rostedt
@ 2011-03-11  9:49   ` tip-bot for David Sharp
  1 sibling, 0 replies; 74+ messages in thread
From: tip-bot for David Sharp @ 2011-03-11  9:49 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, dhsharp, tglx

Commit-ID:  ad440ad66f1617194738bf674dfe2d38978ac54d
Gitweb:     http://git.kernel.org/tip/ad440ad66f1617194738bf674dfe2d38978ac54d
Author:     David Sharp <dhsharp@google.com>
AuthorDate: Fri, 3 Dec 2010 16:13:22 -0800
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Thu, 10 Mar 2011 10:34:28 -0500

tracing: Fix event alignment: mce:mce_record

Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1291421609-14665-9-git-send-email-dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/trace/events/mce.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
index 7eee778..4cbbcef 100644
--- a/include/trace/events/mce.h
+++ b/include/trace/events/mce.h
@@ -17,36 +17,36 @@ TRACE_EVENT(mce_record,
 	TP_STRUCT__entry(
 		__field(	u64,		mcgcap		)
 		__field(	u64,		mcgstatus	)
-		__field(	u8,		bank		)
 		__field(	u64,		status		)
 		__field(	u64,		addr		)
 		__field(	u64,		misc		)
 		__field(	u64,		ip		)
-		__field(	u8,		cs		)
 		__field(	u64,		tsc		)
 		__field(	u64,		walltime	)
 		__field(	u32,		cpu		)
 		__field(	u32,		cpuid		)
 		__field(	u32,		apicid		)
 		__field(	u32,		socketid	)
+		__field(	u8,		cs		)
+		__field(	u8,		bank		)
 		__field(	u8,		cpuvendor	)
 	),
 
 	TP_fast_assign(
 		__entry->mcgcap		= m->mcgcap;
 		__entry->mcgstatus	= m->mcgstatus;
-		__entry->bank		= m->bank;
 		__entry->status		= m->status;
 		__entry->addr		= m->addr;
 		__entry->misc		= m->misc;
 		__entry->ip		= m->ip;
-		__entry->cs		= m->cs;
 		__entry->tsc		= m->tsc;
 		__entry->walltime	= m->time;
 		__entry->cpu		= m->extcpu;
 		__entry->cpuid		= m->cpuid;
 		__entry->apicid		= m->apicid;
 		__entry->socketid	= m->socketid;
+		__entry->cs		= m->cs;
+		__entry->bank		= m->bank;
 		__entry->cpuvendor	= m->cpuvendor;
 	),
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:perf/core] tracing: Fix event alignment: skb:kfree_skb
  2010-12-04  0:13 ` [PATCH 09/15] ftrace: fix event alignment: skb:kfree_skb David Sharp
  2010-12-04  1:52   ` Steven Rostedt
@ 2011-03-11  9:50   ` tip-bot for David Sharp
  1 sibling, 0 replies; 74+ messages in thread
From: tip-bot for David Sharp @ 2011-03-11  9:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, rostedt, nhorman, dhsharp, tglx

Commit-ID:  ca9da2dd63b0b32de1b693953dff66cadeb6400b
Gitweb:     http://git.kernel.org/tip/ca9da2dd63b0b32de1b693953dff66cadeb6400b
Author:     David Sharp <dhsharp@google.com>
AuthorDate: Fri, 3 Dec 2010 16:13:23 -0800
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Thu, 10 Mar 2011 10:34:31 -0500

tracing: Fix event alignment: skb:kfree_skb

Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1291421609-14665-10-git-send-email-dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/trace/events/skb.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index f10293c..0c68ae22 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -19,14 +19,14 @@ TRACE_EVENT(kfree_skb,
 
 	TP_STRUCT__entry(
 		__field(	void *,		skbaddr		)
-		__field(	unsigned short,	protocol	)
 		__field(	void *,		location	)
+		__field(	unsigned short,	protocol	)
 	),
 
 	TP_fast_assign(
 		__entry->skbaddr = skb;
-		__entry->protocol = ntohs(skb->protocol);
 		__entry->location = location;
+		__entry->protocol = ntohs(skb->protocol);
 	),
 
 	TP_printk("skbaddr=%p protocol=%u location=%p",

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:perf/core] tracing: Adjust conditional expression latency formatting.
  2010-12-04  0:13 ` [PATCH 12/15] trace_output.c: adjust conditional expression formatting David Sharp
@ 2011-03-11  9:50   ` tip-bot for David Sharp
  0 siblings, 0 replies; 74+ messages in thread
From: tip-bot for David Sharp @ 2011-03-11  9:50 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, dhsharp, tglx

Commit-ID:  10da37a645b5e915d8572cc2b1f5eb11ada3ea4f
Gitweb:     http://git.kernel.org/tip/10da37a645b5e915d8572cc2b1f5eb11ada3ea4f
Author:     David Sharp <dhsharp@google.com>
AuthorDate: Fri, 3 Dec 2010 16:13:26 -0800
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Thu, 10 Mar 2011 10:34:35 -0500

tracing: Adjust conditional expression latency formatting.

Formatting change only to improve code readability. No code changes except to
introduce intermediate variables.

Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1291421609-14665-13-git-send-email-dhsharp@google.com>

[ Keep variable declarations and assignment separate ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_output.c |   26 ++++++++++++++++++--------
 1 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 151f32e..456be90 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -533,20 +533,30 @@ seq_print_ip_sym(struct trace_seq *s, unsigned long ip, unsigned long sym_flags)
  */
 int trace_print_lat_fmt(struct trace_seq *s, struct trace_entry *entry)
 {
-	int hardirq, softirq;
+	char hardsoft_irq;
+	char need_resched;
+	char irqs_off;
+	int hardirq;
+	int softirq;
 	int ret;
 
 	hardirq = entry->flags & TRACE_FLAG_HARDIRQ;
 	softirq = entry->flags & TRACE_FLAG_SOFTIRQ;
 
+	irqs_off =
+		(entry->flags & TRACE_FLAG_IRQS_OFF) ? 'd' :
+		(entry->flags & TRACE_FLAG_IRQS_NOSUPPORT) ? 'X' :
+		'.';
+	need_resched =
+		(entry->flags & TRACE_FLAG_NEED_RESCHED) ? 'N' : '.';
+	hardsoft_irq =
+		(hardirq && softirq) ? 'H' :
+		hardirq ? 'h' :
+		softirq ? 's' :
+		'.';
+
 	if (!trace_seq_printf(s, "%c%c%c",
-			      (entry->flags & TRACE_FLAG_IRQS_OFF) ? 'd' :
-				(entry->flags & TRACE_FLAG_IRQS_NOSUPPORT) ?
-				  'X' : '.',
-			      (entry->flags & TRACE_FLAG_NEED_RESCHED) ?
-				'N' : '.',
-			      (hardirq && softirq) ? 'H' :
-				hardirq ? 'h' : softirq ? 's' : '.'))
+			      irqs_off, need_resched, hardsoft_irq))
 		return 0;
 
 	if (entry->preempt_count)

^ permalink raw reply related	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2011-03-11  9:50 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-04  0:13 [Patch 00/15] Reduce tracing payload size David Sharp
2010-12-04  0:13 ` [PATCH 01/15] tracing: Add a 'buffer_overwrite' debugfs file David Sharp
2010-12-04  1:57   ` Steven Rostedt
2010-12-08 20:15     ` David Sharp
2010-12-08 21:46       ` [PATCH] tracing: Add an 'overwrite' trace_option David Sharp
2010-12-14  0:39         ` David Sharp
2011-03-09  0:45         ` David Sharp
2011-03-11  9:45         ` [tip:perf/core] " tip-bot for David Sharp
2010-12-04  0:13 ` [PATCH 02/15] ring_buffer.c: Remove unused #include <linux/trace_irq.h> David Sharp
2011-03-11  9:46   ` [tip:perf/core] ring-buffer: " tip-bot for David Sharp
2010-12-04  0:13 ` [PATCH 03/15] ring_buffer: Align buffer_page struct allocations only to fit the flags David Sharp
2010-12-04  1:43   ` Steven Rostedt
2010-12-07 22:44     ` David Sharp
2010-12-04  0:13 ` [PATCH 04/15] ftrace: pack event structures David Sharp
2011-03-08 23:30   ` Steven Rostedt
2011-03-09  1:09     ` Steven Rostedt
2011-03-09  6:39     ` David Miller
2011-03-09 15:18       ` Steven Rostedt
2011-03-10 23:21         ` David Sharp
2011-03-11  3:37           ` Steven Rostedt
2011-03-09 13:01   ` Steven Rostedt
2010-12-04  0:13 ` [PATCH 05/15] ftrace: fix event alignment: ftrace:context_switch and ftrace:wakeup David Sharp
2011-03-11  9:48   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
2010-12-04  0:13 ` [PATCH 06/15] ftrace: fix event alignment: module:module_request David Sharp
2010-12-04  1:47   ` Steven Rostedt
2010-12-06  1:28     ` Li Zefan
2011-03-11  9:48   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
2010-12-04  0:13 ` [PATCH 07/15] ftrace: fix event alignment: kvm:kvm_hv_hypercall David Sharp
2010-12-04  1:49   ` Steven Rostedt
2010-12-04  8:11   ` Avi Kivity
2010-12-06 20:38     ` David Sharp
2010-12-07  9:22       ` Avi Kivity
2010-12-07 21:16         ` David Sharp
2010-12-08  9:18           ` Avi Kivity
2011-03-08 23:55             ` Steven Rostedt
2011-03-09  8:50               ` Avi Kivity
2011-03-09 12:54                 ` Steven Rostedt
2011-03-09 13:01                   ` Avi Kivity
2011-03-11  9:49   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
2010-12-04  0:13 ` [PATCH 08/15] ftrace: fix event alignment: mce:mce_record David Sharp
2010-12-04  1:50   ` Steven Rostedt
2010-12-09 13:33     ` Frederic Weisbecker
2011-03-11  9:49   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
2010-12-04  0:13 ` [PATCH 09/15] ftrace: fix event alignment: skb:kfree_skb David Sharp
2010-12-04  1:52   ` Steven Rostedt
2010-12-04 13:38     ` Neil Horman
2011-03-11  9:50   ` [tip:perf/core] tracing: Fix " tip-bot for David Sharp
2010-12-04  0:13 ` [PATCH 10/15] ftrace: fix event alignment: jbd2:* David Sharp
2010-12-04  1:52   ` Steven Rostedt
2011-03-09  0:03     ` Steven Rostedt
2011-03-09  0:31       ` Ted Ts'o
2011-03-09  0:41         ` Steven Rostedt
2010-12-04  0:13 ` [PATCH 11/15] ftrace: fix event alignment: ext4:* David Sharp
2010-12-04  1:53   ` Steven Rostedt
2011-03-09  0:04     ` Steven Rostedt
2010-12-04  0:13 ` [PATCH 12/15] trace_output.c: adjust conditional expression formatting David Sharp
2011-03-11  9:50   ` [tip:perf/core] tracing: Adjust conditional expression latency formatting tip-bot for David Sharp
2010-12-04  0:13 ` [PATCH 13/15] small_traces: Add config option to shrink trace events David Sharp
2010-12-04  1:56   ` Steven Rostedt
2010-12-04  2:33     ` David Sharp
2010-12-04  2:54       ` Steven Rostedt
2010-12-09 14:55         ` Frederic Weisbecker
2010-12-09 15:08           ` Steven Rostedt
2010-12-09 15:28             ` Frederic Weisbecker
2010-12-09 16:16               ` Mathieu Desnoyers
2011-03-09  0:23   ` Steven Rostedt
2010-12-04  0:13 ` [PATCH 14/15] small_traces: Remove trace output of large fields David Sharp
2010-12-04  0:13 ` [PATCH 15/15] small_traces: Remove 8 bytes from trace_entry David Sharp
2010-12-06 13:22 ` [Patch 00/15] Reduce tracing payload size Andi Kleen
2010-12-06 13:56   ` Ted Ts'o
2010-12-06 14:58     ` Andi Kleen
2010-12-06 16:17       ` Steven Rostedt
2010-12-06 16:31       ` Miguel Ojeda
2010-12-06 16:41         ` Andi Kleen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).