From: Arnaldo Carvalho de Melo <acme@redhat.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>,
LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
prasad@linux.vnet.ibm.com,
Linus Torvalds <torvalds@linux-foundation.org>,
Mathieu Desnoyers <compudj@krystal.dyndns.org>,
"Frank Ch. Eigler" <fche@redhat.com>,
David Wilder <dwilder@us.ibm.com>,
hch@lst.de, Martin Bligh <mbligh@google.com>,
Christoph Hellwig <hch@infradead.org>,
Steven Rostedt <srostedt@redhat.com>
Subject: Re: [PATCH v5] Unified trace buffer
Date: Fri, 26 Sep 2008 14:31:30 -0300
Message-ID: <20080926173130.GE15446@ghostprotocols.net>
In-Reply-To: <alpine.DEB.1.10.0809261245420.21618@gandalf.stny.rr.com>
On Fri, Sep 26, 2008 at 01:11:57PM -0400, Steven Rostedt wrote:
>
> [
> Note the removal of the RFC in the subject.
> I am happy with this version. It handles everything I need
> for ftrace.
>
> New since last version:
>
> - Fixed timing bug. I did not add the deltas properly when
> reading the buffer.
>
> - Removed "-1" time stamp normalize test. This made the
> clock go backwards!
>
> - Removed page pointer array and replaced it with the ftrace
> page struct linked list trick. Since this is my second time
> writing this code (first with ftrace), it is actually much
> cleaner than the ftrace code.
>
> - Implemented buffer resizing. By using the page linked list trick,
> this became much simpler.
>
> Note, the GTOD part is still not implemented, but can be done
> later without affecting this interface.
>
> ]
>
> This is a unified tracing buffer that implements a ring buffer that
> hopefully everyone will eventually be able to use.
>
> The events recorded into the buffer have the following structure:
>
> struct ring_buffer_event {
> 	u32 type:2, len:3, time_delta:27;
> 	u32 array[];
> };
>
> The minimum size of an event is 8 bytes. All events are 4 byte
> aligned inside the buffer.
>
> There are 4 types (all for internal use by the ring buffer; only
> the data type is exposed to the interface users).
>
> RB_TYPE_PADDING: this type is used to note extra space at the end
> of a buffer page.
>
> RB_TYPE_TIME_EXTENT: This type is used when the time between events
> is greater than the 27 bit delta can hold. We add another
> 32 bits, and record that in its own event (8 byte size).
>
> RB_TYPE_TIME_STAMP: (Not implemented yet). This will hold data to
> help keep the buffer timestamps in sync.
>
> RB_TYPE_DATA: The event actually holds user data.
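A minimal sketch of the time-extend idea described above, assuming the
lower 27 bits of the delta stay in the time_delta field and the next 32
bits go into the extra word. That split, and the names split_delta,
split_time_delta and join_time_delta, are illustrative assumptions, not
taken from the patch:

```c
#include <stdint.h>

#define TS_SHIFT 27                       /* width of the time_delta bit field */
#define TS_MASK  ((1u << TS_SHIFT) - 1)   /* low 27 bits */

/* Hypothetical helper type: a delta split into header part and extension. */
struct split_delta {
	uint32_t low27;     /* would live in the event header's time_delta */
	uint32_t high32;    /* would live in the extra 32-bit word */
	int needs_extend;   /* nonzero if the delta does not fit in 27 bits */
};

/* Split a delta; anything above 27 + 32 = 59 bits is silently dropped here. */
static struct split_delta split_time_delta(uint64_t delta)
{
	struct split_delta s;

	s.low27 = (uint32_t)(delta & TS_MASK);
	s.high32 = (uint32_t)(delta >> TS_SHIFT);
	s.needs_extend = s.high32 != 0;
	return s;
}

/* Rebuild the full delta from the two parts. */
static uint64_t join_time_delta(struct split_delta s)
{
	return ((uint64_t)s.high32 << TS_SHIFT) | s.low27;
}
```

With this split, a delta of 100 fits in the header alone, while a delta
of 1 << 27 needs the extra word.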
>
> The "len" field is only three bits. Since the data must be
> 4 byte aligned, this field is shifted left by 2, giving a
> max length of 28 bytes. If the data load is greater than 28
> bytes, the first array field holds the full length of the
> data load and the len field is set to zero.
>
> Example, data size of 7 bytes:
>
> type = RB_TYPE_DATA
> len = 2
> time_delta: <time-stamp> - <prev_event-time-stamp>
> array[0..1]: <7 bytes of data> <1 byte empty>
>
> This event is saved in 12 bytes of the buffer.
>
> An event with 82 bytes of data:
>
> type = RB_TYPE_DATA
> len = 0
> time_delta: <time-stamp> - <prev_event-time-stamp>
> array[0]: 84 (Note the alignment)
> array[1..21]: <82 bytes of data> <2 bytes empty>
>
> The above event is saved in 92 bytes (if my math is correct).
> 82 bytes of data, 2 bytes empty, 4 byte header, 4 byte length.
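The arithmetic in the two examples above can be checked with a small
userspace sketch. The helper names (rb_align, rb_len_field,
rb_event_size) and the constants are assumptions made for illustration;
the sketch also assumes a data length of at least 1 byte:

```c
#include <stdint.h>

#define RB_ALIGNMENT      4u   /* events are 4 byte aligned */
#define RB_HDR_SIZE       4u   /* the 32-bit type/len/time_delta word */
#define RB_MAX_SMALL_DATA 28u  /* largest payload the 3-bit len can describe */

/* Round a byte count up to the next multiple of 4. */
static uint32_t rb_align(uint32_t n)
{
	return (n + RB_ALIGNMENT - 1) & ~(RB_ALIGNMENT - 1);
}

/* Value stored in the 3-bit len field; 0 means "length is in array[0]". */
static uint32_t rb_len_field(uint32_t data_len)
{
	uint32_t padded = rb_align(data_len);

	return padded <= RB_MAX_SMALL_DATA ? padded >> 2 : 0;
}

/* Total bytes the event occupies in the buffer. */
static uint32_t rb_event_size(uint32_t data_len)
{
	uint32_t padded = rb_align(data_len);
	uint32_t size = RB_HDR_SIZE + padded;

	if (padded > RB_MAX_SMALL_DATA)
		size += sizeof(uint32_t);   /* extra word holding the length */
	return size;
}
```

For 7 bytes of data this gives len = 2 and 12 bytes in the buffer; for
82 bytes it gives len = 0 and 92 bytes, matching the numbers above.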
>
> Do not reference the above event struct directly. Use the following
> functions to gain access to the event table, since the
> ring_buffer_event structure may change in the future.
>
> ring_buffer_event_length(event): get the length of the event.
> This is the size of the memory used to record this
> event, and not the size of the data payload.
>
> ring_buffer_time_delta(event): get the time delta of the event
> This returns the delta time stamp since the last event.
> Note: Even though this is in the header, there should
> be no reason to access this directly, except
> for debugging.
>
> ring_buffer_event_data(event): get the data from the event
> This is the function to use to get the actual data
> from the event. Note, it is only a pointer to the
> data inside the buffer. This data must be copied to
> another location otherwise you risk it being written
> over in the buffer.
>
> ring_buffer_lock: A way to lock the entire buffer.
> ring_buffer_unlock: unlock the buffer.
>
> ring_buffer_alloc: create a new ring buffer. Can choose between
> overwrite or consumer/producer mode. Overwrite will
> overwrite old data, whereas consumer/producer will
> throw away new data if the producer catches up with the
> consumer (that is, the buffer is full). The consumer/producer
> mode is the default.
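To make the difference between the two modes concrete, here is a toy
userspace ring of ints. It is only a sketch of the policy, not of the
patch's page-based implementation; toy_ring and its functions are
made-up names:

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 4

struct toy_ring {
	int slot[TOY_SLOTS];
	size_t head;             /* index of the oldest unread entry */
	size_t count;            /* number of unread entries */
	bool overwrite;          /* true: overwrite mode, false: consumer/producer */
	unsigned long overruns;  /* old entries discarded by the writer */
};

static void toy_init(struct toy_ring *r, bool overwrite)
{
	r->head = 0;
	r->count = 0;
	r->overwrite = overwrite;
	r->overruns = 0;
}

/* Returns false if the entry was thrown away (full, consumer/producer mode). */
static bool toy_write(struct toy_ring *r, int v)
{
	if (r->count == TOY_SLOTS) {
		if (!r->overwrite)
			return false;                    /* drop the new data */
		r->head = (r->head + 1) % TOY_SLOTS; /* discard the oldest entry */
		r->count--;
		r->overruns++;
	}
	r->slot[(r->head + r->count) % TOY_SLOTS] = v;
	r->count++;
	return true;
}

/* Read the oldest entry and advance the head; false if the ring is empty. */
static bool toy_consume(struct toy_ring *r, int *out)
{
	if (r->count == 0)
		return false;
	*out = r->slot[r->head];
	r->head = (r->head + 1) % TOY_SLOTS;
	r->count--;
	return true;
}
```

Writing 1..6 into a 4-slot ring leaves 3..6 in overwrite mode (two
overruns) and 1..4 in consumer/producer mode (two writes rejected).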
>
> ring_buffer_free: free the ring buffer.
>
> ring_buffer_resize: resize the buffer. Changes the size of each cpu
> buffer. Note, it is up to the caller to ensure that
> the buffer is not being used while this is happening.
> This requirement may go away but do not count on it.
>
> ring_buffer_lock_reserve: locks the ring buffer and allocates an
> entry on the buffer to write to.
> ring_buffer_unlock_commit: unlocks the ring buffer and commits it to
> the buffer.
>
> ring_buffer_write: writes some data into the ring buffer.
>
> ring_buffer_peek: Look at the next item in the cpu buffer.
> ring_buffer_consume: get the next item in the cpu buffer and
> consume it. That is, this function increments the head
> pointer.
>
> ring_buffer_read_start: Start an iterator of a cpu buffer.
> For now, this disables the cpu buffer, until you issue
> a finish. This is just because we do not want the iterator
> to be overwritten. This restriction may change in the future.
> But note, this is used for static reading of a buffer which
> is usually done "after" a trace. Live readings would want
> to use the ring_buffer_consume above, which will not
> disable the ring buffer.
>
> ring_buffer_read_finish: Finishes the read iterator and reenables
> the ring buffer.
>
> ring_buffer_iter_peek: Look at the next item in the cpu iterator.
> ring_buffer_read: Read the iterator and increment it.
> ring_buffer_iter_reset: Reset the iterator to point to the beginning
> of the cpu buffer.
> ring_buffer_iter_empty: Returns true if the iterator is at the end
> of the cpu buffer.
>
> ring_buffer_size: returns the size in bytes of each cpu buffer.
> Note, the real size is this times the number of CPUs.
>
> ring_buffer_reset_cpu: Sets the cpu buffer to empty
> ring_buffer_reset: sets all cpu buffers to empty
>
> ring_buffer_swap_cpu: swaps a cpu buffer from one buffer with a
> cpu buffer of another buffer. This is handy when you
> want to take a snapshot of a running trace on just one
> cpu. Having a backup buffer to swap with facilitates this.
> Ftrace max latencies use this.
>
> ring_buffer_empty: Returns true if the ring buffer is empty.
> ring_buffer_empty_cpu: Returns true if the cpu buffer is empty.
>
> ring_buffer_record_disable: disable all cpu buffers (read only)
> ring_buffer_record_disable_cpu: disable a single cpu buffer (read only)
> ring_buffer_record_enable: enable all cpu buffers.
> ring_buffer_record_enable_cpu: enable a single cpu buffer.
>
> ring_buffer_entries: The number of entries in a ring buffer.
> ring_buffer_overruns: The number of entries removed due to writing wrap.
>
> ring_buffer_time_stamp: Get the time stamp used by the ring buffer
> ring_buffer_normalize_time_stamp: normalize the ring buffer time stamp
> into nanosecs.
>
> I still need to implement the GTOD feature, which needs support from
> the cpu frequency infrastructure. This can be done at a later
> time without affecting the ring buffer interface.
>
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
> include/linux/ring_buffer.h | 178 +++++
> kernel/trace/Kconfig | 4
> kernel/trace/Makefile | 1
> kernel/trace/ring_buffer.c | 1491 ++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 1674 insertions(+)
>
> Index: linux-trace.git/include/linux/ring_buffer.h
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-trace.git/include/linux/ring_buffer.h 2008-09-25 21:29:16.000000000 -0400
> @@ -0,0 +1,178 @@
> +#ifndef _LINUX_RING_BUFFER_H
> +#define _LINUX_RING_BUFFER_H
> +
> +#include <linux/mm.h>
> +#include <linux/seq_file.h>
> +
> +struct ring_buffer;
> +struct ring_buffer_iter;
> +
> +/*
> + * Don't reference this struct directly, use the inline items below.
> + */
> +struct ring_buffer_event {
> +	u32 type:2, len:3, time_delta:27;
> +	u32 array[];
> +} __attribute__((__packed__));
Why do you need __packed__ here? With or without it the layout is the
same:
[acme@doppio examples]$ pahole packed
struct ring_buffer_event {
	u32                        type:2;               /*     0:30  4 */
	u32                        len:3;                /*     0:27  4 */
	u32                        time_delta:27;        /*     0: 0  4 */
	u32                        array[0];             /*     4     0 */

	/* size: 4, cachelines: 1, members: 4 */
	/* last cacheline: 4 bytes */
};
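The same comparison can be made at compile time in userspace. This is a
sketch assuming GCC or Clang, with uint32_t standing in for the kernel's
u32; event_plain and event_packed are names made up for the test:

```c
#include <stddef.h>
#include <stdint.h>

/* Same members as ring_buffer_event, without the attribute. */
struct event_plain {
	uint32_t type:2, len:3, time_delta:27;
	uint32_t array[];
};

/* Same members, with the attribute from the patch. */
struct event_packed {
	uint32_t type:2, len:3, time_delta:27;
	uint32_t array[];
} __attribute__((__packed__));

/* The three bit fields fill exactly one 32-bit word in both variants. */
_Static_assert(sizeof(struct event_plain) == 4, "plain header is 4 bytes");
_Static_assert(sizeof(struct event_packed) == 4, "packed header is 4 bytes");

/* The flexible array starts right after the header in both variants. */
_Static_assert(offsetof(struct event_plain, array) == 4, "plain array at 4");
_Static_assert(offsetof(struct event_packed, array) == 4, "packed array at 4");
```

Size and member offsets come out identical. What this check does not
compare is the alignment of the struct: as far as I know, GCC lowers the
alignment of a packed struct to 1, which can change the code generated
for accesses on some architectures even though the layout is the same.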
- Arnaldo