From: Arnaldo Carvalho de Melo <acme@redhat.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	prasad@linux.vnet.ibm.com,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mathieu Desnoyers <compudj@krystal.dyndns.org>,
	"Frank Ch. Eigler" <fche@redhat.com>,
	David Wilder <dwilder@us.ibm.com>,
	hch@lst.de, Martin Bligh <mbligh@google.com>,
	Christoph Hellwig <hch@infradead.org>,
	Steven Rostedt <srostedt@redhat.com>
Subject: Re: [PATCH v5] Unified trace buffer
Date: Fri, 26 Sep 2008 14:31:30 -0300
Message-ID: <20080926173130.GE15446@ghostprotocols.net>
In-Reply-To: <alpine.DEB.1.10.0809261245420.21618@gandalf.stny.rr.com>

Em Fri, Sep 26, 2008 at 01:11:57PM -0400, Steven Rostedt escreveu:
> 
> [
>   Note the removal of the RFC in the subject.
>   I am happy with this version. It handles everything I need
>   for ftrace.
> 
>   New since last version:
> 
>    - Fixed timing bug. I did not add the deltas properly when
>      reading the buffer.
> 
>    - Removed "-1" time stamp normalize test. This made the
>      clock go backwards!
> 
>    - Removed page pointer array and replaced it with the ftrace
>      page struct link list trick. Since this is my second time
>      writing this code (first with ftrace), it is actually much
>      cleaner than the ftrace code.
> 
>    - Implemented buffer resizing. By using the page link list trick,
>      this became much simpler.
> 
>    Note, the GTOD part is still not implemented, but can be done
>    later without affecting this interface.
> 
> ]
> 
> This is a unified tracing buffer that implements a ring buffer that
> hopefully everyone will eventually be able to use.
> 
> The events recorded into the buffer have the following structure:
> 
> struct ring_buffer_event {
> 	u32 type:2, len:3, time_delta:27;
> 	u32 array[];
> };
> 
> The minimum size of an event is 8 bytes. All events are 4 byte
> aligned inside the buffer.
> 
> There are 4 types (all internal use for the ring buffer, only
> the data type is exported to the interface users).
> 
> RB_TYPE_PADDING: this type is used to note extra space at the end
> 	of a buffer page.
> 
> RB_TYPE_TIME_EXTENT: This type is used when the time between events
> 	is greater than the 27 bit delta can hold. We add another
> 	32 bits, and record that in its own event (8 byte size).
> 
> RB_TYPE_TIME_STAMP: (Not implemented yet). This will hold data to
> 	help keep the buffer timestamps in sync.
> 
> RB_TYPE_DATA: The event actually holds user data.
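
The time extent case can be sketched in C as below. This is only an
illustration under my own assumptions: the helper names are hypothetical,
and placing the low 27 bits in the header with the next 32 bits in
array[0] is one plausible reading of "we add another 32 bits", not
something the description above spells out.

```c
#include <assert.h>
#include <stdint.h>

#define TS_SHIFT    27                               /* width of time_delta in the header */
#define TS_MASK     ((UINT64_C(1) << TS_SHIFT) - 1)  /* largest delta the header can hold */
#define TS_EXT_BITS 32                               /* extra bits carried by the extent event */

/* Hypothetical helper: nonzero if the delta does not fit in 27 bits. */
static int delta_needs_extent(uint64_t delta)
{
	return delta > TS_MASK;
}

/*
 * Hypothetical split (an assumption, see above): low 27 bits for the
 * header, next 32 bits for array[0]. The delta must fit in 59 bits.
 */
static void split_delta(uint64_t delta, uint32_t *low27, uint32_t *high32)
{
	assert((delta >> (TS_SHIFT + TS_EXT_BITS)) == 0);
	*low27 = (uint32_t)(delta & TS_MASK);
	*high32 = (uint32_t)(delta >> TS_SHIFT);
}

/* Inverse of split_delta(): rebuild the full delta when reading. */
static uint64_t join_delta(uint32_t low27, uint32_t high32)
{
	return ((uint64_t)high32 << TS_SHIFT) | low27;
}
```

Under that assumption a reader rebuilds the delta with join_delta()
before adding it to the running time stamp.
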
> 
> The "len" field is only three bits. Since the data must be
> 4 byte aligned, this field is shifted left by 2, giving a
> max length of 28 bytes. If the data load is greater than 28
> bytes, the first array field holds the full length of the
> data load and the len field is set to zero.
> 
> Example, data size of 7 bytes:
> 
> 	type = RB_TYPE_DATA
> 	len = 2
> 	time_delta: <time-stamp> - <prev_event-time-stamp>
> 	array[0..1]: <7 bytes of data> <1 byte empty>
> 
> This event is saved in 12 bytes of the buffer.
> 
> An event with 82 bytes of data:
> 
> 	type = RB_TYPE_DATA
> 	len = 0
> 	time_delta: <time-stamp> - <prev_event-time-stamp>
> 	array[0]: 84 (Note the alignment)
> 	array[1..21]: <82 bytes of data> <2 bytes empty>
> 
> The above event is saved in 92 bytes (if my math is correct).
> 82 bytes of data, 2 bytes empty, 4 byte header, 4 byte length.
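
The arithmetic in the two examples above can be checked with a small
sketch. The function and macro names here are hypothetical, not taken
from the patch, and the sketch assumes at least one byte of data so that
len == 0 always means "length is in array[0]".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RB_ALIGNMENT 4u   /* events are 4 byte aligned */
#define RB_MAX_SMALL 28u  /* largest payload the 3-bit len field covers: 7 << 2 */
#define RB_HDR_SIZE  4u   /* the type/len/time_delta word */

/* Round the payload up to the next multiple of 4 bytes. */
static size_t rb_padded(size_t data_len)
{
	return (data_len + RB_ALIGNMENT - 1) & ~(size_t)(RB_ALIGNMENT - 1);
}

/* Value of the 3-bit len field: payload words, or 0 if array[0] holds the length. */
static unsigned rb_len_field(size_t data_len)
{
	size_t padded = rb_padded(data_len);

	assert(data_len > 0);
	return padded > RB_MAX_SMALL ? 0 : (unsigned)(padded >> 2);
}

/* Total number of bytes the event occupies in the buffer. */
static size_t rb_event_size(size_t data_len)
{
	size_t padded = rb_padded(data_len);
	size_t size = RB_HDR_SIZE + padded;

	assert(data_len > 0);
	if (padded > RB_MAX_SMALL)
		size += sizeof(uint32_t);	/* array[0] stores the length */
	return size;
}
```

With these definitions, 7 bytes of data gives len = 2 and 12 bytes in
the buffer, and 82 bytes of data gives len = 0 and 92 bytes, matching
the numbers quoted above.
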
> 
> Do not reference the above event struct directly. Use the following
> functions to gain access to the event table, since the
> ring_buffer_event structure may change in the future.
> 
> ring_buffer_event_length(event): get the length of the event.
> 	This is the size of the memory used to record this
> 	event, and not the size of the data payload.
> 
> ring_buffer_time_delta(event): get the time delta of the event
> 	This returns the delta time stamp since the last event.
> 	Note: Even though this is in the header, there should
> 		be no reason to access this directly, except
> 		for debugging.
> 
> ring_buffer_event_data(event): get the data from the event
> 	This is the function to use to get the actual data
> 	from the event. Note, it is only a pointer to the
> 	data inside the buffer. This data must be copied to
> 	another location otherwise you risk it being written
> 	over in the buffer.
> 
> ring_buffer_lock: A way to lock the entire buffer.
> ring_buffer_unlock: unlock the buffer.
> 
> ring_buffer_alloc: create a new ring buffer. Can choose between
> 	overwrite or consumer/producer mode. Overwrite will
> 	overwrite old data, whereas consumer/producer will
> 	throw away new data if the consumer catches up with the
> 	producer.  The consumer/producer is the default.
> 
> ring_buffer_free: free the ring buffer.
> 
> ring_buffer_resize: resize the buffer. Changes the size of each cpu
> 	buffer. Note, it is up to the caller to ensure that
> 	the buffer is not being used while this is happening.
> 	This requirement may go away but do not count on it.
> 
> ring_buffer_lock_reserve: locks the ring buffer and allocates an
> 	entry on the buffer to write to.
> ring_buffer_unlock_commit: unlocks the ring buffer and commits it to
> 	the buffer.
> 
> ring_buffer_write: writes some data into the ring buffer.
> 
> ring_buffer_peek: Look at the next item in the cpu buffer.
> ring_buffer_consume: get the next item in the cpu buffer and
> 	consume it. That is, this function increments the head
> 	pointer.
> 
> ring_buffer_read_start: Start an iterator of a cpu buffer.
> 	For now, this disables the cpu buffer, until you issue
> 	a finish. This is just because we do not want the iterator
> 	to be overwritten. This restriction may change in the future.
> 	But note, this is used for static reading of a buffer which
> 	is usually done "after" a trace. Live readings would want
> 	to use the ring_buffer_consume above, which will not
> 	disable the ring buffer.
> 
> ring_buffer_read_finish: Finishes the read iterator and reenables
> 	the ring buffer.
> 
> ring_buffer_iter_peek: Look at the next item in the cpu iterator.
> ring_buffer_read: Read the iterator and increment it.
> ring_buffer_iter_reset: Reset the iterator to point to the beginning
> 	of the cpu buffer.
> ring_buffer_iter_empty: Returns true if the iterator is at the end
> 	of the cpu buffer.
> 
> ring_buffer_size: returns the size in bytes of each cpu buffer.
> 	Note, the real size is this times the number of CPUs.
> 
> ring_buffer_reset_cpu: Sets the cpu buffer to empty
> ring_buffer_reset: sets all cpu buffers to empty
> 
> ring_buffer_swap_cpu: swaps a cpu buffer from one buffer with a
> 	cpu buffer of another buffer. This is handy when you
> 	want to take a snapshot of a running trace on just one
> 	cpu. Having a backup buffer, to swap with facilitates this.
> 	Ftrace max latencies use this.
> 
> ring_buffer_empty: Returns true if the ring buffer is empty.
> ring_buffer_empty_cpu: Returns true if the cpu buffer is empty.
> 
> ring_buffer_record_disable: disable all cpu buffers (read only)
> ring_buffer_record_disable_cpu: disable a single cpu buffer (read only)
> ring_buffer_record_enable: enable all cpu buffers.
> ring_buffer_record_enable_cpu: enable a single cpu buffer.
> 
> ring_buffer_entries: The number of entries in a ring buffer.
> ring_buffer_overruns: The number of entries removed due to writing wrap.
> 
> ring_buffer_time_stamp: Get the time stamp used by the ring buffer
> ring_buffer_normalize_time_stamp: normalize the ring buffer time stamp
> 	into nanosecs.
> 
> I still need to implement the GTOD feature. But we need support from
> the cpu frequency infrastructure.  But this can be done at a later
> time without affecting the ring buffer interface.
> 
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
>  include/linux/ring_buffer.h |  178 +++++
>  kernel/trace/Kconfig        |    4 
>  kernel/trace/Makefile       |    1 
>  kernel/trace/ring_buffer.c  | 1491 ++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 1674 insertions(+)
> 
> Index: linux-trace.git/include/linux/ring_buffer.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-trace.git/include/linux/ring_buffer.h	2008-09-25 21:29:16.000000000 -0400
> @@ -0,0 +1,178 @@
> +#ifndef _LINUX_RING_BUFFER_H
> +#define _LINUX_RING_BUFFER_H
> +
> +#include <linux/mm.h>
> +#include <linux/seq_file.h>
> +
> +struct ring_buffer;
> +struct ring_buffer_iter;
> +
> +/*
> + * Don't reference this struct directly, use the inline items below.
> + */
> +struct ring_buffer_event {
> +	u32		type:2, len:3, time_delta:27;
> +	u32		array[];
> +} __attribute__((__packed__));

Why do you need __packed__ here? With or without it the layout is the
same:

[acme@doppio examples]$ pahole packed
struct ring_buffer_event {
	u32 type:2;               /* 0:30  4 */
	u32 len:3;                /* 0:27  4 */
	u32 time_delta:27;        /* 0: 0  4 */
	u32 array[0];             /* 4     0 */

	/* size: 4, cachelines: 1, members: 4 */
	/* last cacheline: 4 bytes */
};
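
The same comparison can be made at compile time without pahole. The
sketch below is only a check under assumptions: the struct names
event_plain and event_packed are made up for the test, u32 is typedef'ed
to uint32_t for userspace, and it assumes GCC or a compatible compiler
on a typical 32-bit or 64-bit target.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;	/* userspace stand-in for the kernel type */

/* The event header as declared in the patch, without the attribute. */
struct event_plain {
	u32 type:2, len:3, time_delta:27;
	u32 array[];
};

/* The same members with __packed__, as in the patch. */
struct event_packed {
	u32 type:2, len:3, time_delta:27;
	u32 array[];
} __attribute__((__packed__));

/* The build fails if the attribute changes the size or the array offset. */
_Static_assert(sizeof(struct event_plain) == sizeof(struct event_packed),
	       "packed changes the struct size");
_Static_assert(offsetof(struct event_plain, array) ==
	       offsetof(struct event_packed, array),
	       "packed changes the offset of array[]");
```

If this compiles, both variants have the same size and the same offset
for array[] on that compiler and target.
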

- Arnaldo
