All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] [GIT PULL] (second try) ring-buffer: fixes for discard
@ 2009-06-03 14:16 Steven Rostedt
  2009-06-03 14:16 ` [PATCH 1/3] ring-buffer: fix bug in ring_buffer_discard_commit Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Steven Rostedt @ 2009-06-03 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Tim Bird


Ingo,

Please pull the latest tip/tracing/ftrace-3 tree, which can be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
tip/tracing/ftrace-3


Steven Rostedt (2):
      ring-buffer: try to discard unneeded timestamps
      ring-buffer: discard timestamps that are at the start of the buffer

Tim Bird (1):
      ring-buffer: fix bug in ring_buffer_discard_commit

----
 kernel/trace/ring_buffer.c |  104 ++++++++++++++++++++++++++------------------
 1 files changed, 62 insertions(+), 42 deletions(-)
-- 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] ring-buffer: fix bug in ring_buffer_discard_commit
  2009-06-03 14:16 [PATCH 0/3] [GIT PULL] (second try) ring-buffer: fixes for discard Steven Rostedt
@ 2009-06-03 14:16 ` Steven Rostedt
  2009-06-03 14:16 ` [PATCH 2/3] ring-buffer: try to discard unneeded timestamps Steven Rostedt
  2009-06-03 14:16 ` [PATCH 3/3] ring-buffer: discard timestamps that are at the start of the buffer Steven Rostedt
  2 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2009-06-03 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Tim Bird

[-- Attachment #1: 0001-ring-buffer-fix-bug-in-ring_buffer_discard_commit.patch --]
[-- Type: text/plain, Size: 2252 bytes --]

From: Tim Bird <tim.bird@am.sony.com>

There's a bug in ring_buffer_discard_commit.  The wrong
pointer is being compared in order to check if the event
can be freed from the buffer rather than discarded
(i.e. marked as PAD).

I noticed this when I was working on duration filtering.
The bug is not deadly - it just results in lots of wasted
space in the buffer.  All filtered events are left in
the buffer and marked as discarded, rather than being
removed from the buffer to make space for other events.

Unfortunately, when I fixed this bug, I got errors doing a
filtered function trace.  Multiple TIME_EXTEND
events pile up in the buffer, and trigger the
following loop overage warning in rb_iter_peek():

again:
	...
	if (RB_WARN_ON(cpu_buffer, ++nr_loops > 10))
		return NULL;

I'm not sure what the best way is to fix this. I don't
know if I should extend the loop threshhold, or if I should
make the test more complex (ignore TIME_EXTEND
events), or just get rid of this loop check completely.

Note that if I implement a workaround for this, then I
see another problem from rb_advance_iter().  I haven't
tracked that one down yet.

In general, it seems like the case of removing filtered
events has not been working properly, and so some assumptions
about buffer invariant conditions need to be revisited.

Here's the patch for the simple fix:

Compare correct pointer for checking if an event can be
freed rather than left as discarded in the buffer.

Signed-off-by: Tim Bird <tim.bird@am.sony.com>
LKML-Reference: <4A25BE9E.5090909@am.sony.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ring_buffer.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 16b24d4..9453023 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1708,7 +1708,7 @@ void ring_buffer_discard_commit(struct ring_buffer *buffer,
 
 	bpage = cpu_buffer->tail_page;
 
-	if (bpage == (void *)addr && rb_page_write(bpage) == old_index) {
+	if (bpage->page == (void *)addr && rb_page_write(bpage) == old_index) {
 		/*
 		 * This is on the tail page. It is possible that
 		 * a write could come in and move the tail page
-- 
1.6.3.1

-- 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] ring-buffer: try to discard unneeded timestamps
  2009-06-03 14:16 [PATCH 0/3] [GIT PULL] (second try) ring-buffer: fixes for discard Steven Rostedt
  2009-06-03 14:16 ` [PATCH 1/3] ring-buffer: fix bug in ring_buffer_discard_commit Steven Rostedt
@ 2009-06-03 14:16 ` Steven Rostedt
  2009-06-03 18:54   ` Tim Bird
  2009-06-03 20:55   ` Tim Bird
  2009-06-03 14:16 ` [PATCH 3/3] ring-buffer: discard timestamps that are at the start of the buffer Steven Rostedt
  2 siblings, 2 replies; 8+ messages in thread
From: Steven Rostedt @ 2009-06-03 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Tim Bird

[-- Attachment #1: 0002-ring-buffer-try-to-discard-unneeded-timestamps.patch --]
[-- Type: text/plain, Size: 3406 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

There are times that a race may happen that we add a timestamp in a
nested write. This timestamp would just contain a zero delta and serves
no purpose.

Now that we have a way to discard events, this patch will try to discard
the timestamp instead of just wasting the space in the ring buffer.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ring_buffer.c |   67 +++++++++++++++++++++++++++-----------------
 1 files changed, 41 insertions(+), 26 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 9453023..5092660 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1335,6 +1335,38 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
 	return event;
 }
 
+static inline int
+rb_try_to_discard(struct ring_buffer_per_cpu *cpu_buffer,
+		  struct ring_buffer_event *event)
+{
+	unsigned long new_index, old_index;
+	struct buffer_page *bpage;
+	unsigned long index;
+	unsigned long addr;
+
+	new_index = rb_event_index(event);
+	old_index = new_index + rb_event_length(event);
+	addr = (unsigned long)event;
+	addr &= PAGE_MASK;
+
+	bpage = cpu_buffer->tail_page;
+
+	if (bpage->page == (void *)addr && rb_page_write(bpage) == old_index) {
+		/*
+		 * This is on the tail page. It is possible that
+		 * a write could come in and move the tail page
+		 * and write to the next page. That is fine
+		 * because we just shorten what is on this page.
+		 */
+		index = local_cmpxchg(&bpage->write, old_index, new_index);
+		if (index == old_index)
+			return 1;
+	}
+
+	/* could not discard */
+	return 0;
+}
+
 static int
 rb_add_time_stamp(struct ring_buffer_per_cpu *cpu_buffer,
 		  u64 *ts, u64 *delta)
@@ -1384,10 +1416,13 @@ rb_add_time_stamp(struct ring_buffer_per_cpu *cpu_buffer,
 		/* let the caller know this was the commit */
 		ret = 1;
 	} else {
-		/* Darn, this is just wasted space */
-		event->time_delta = 0;
-		event->array[0] = 0;
-		ret = 0;
+		/* Try to discard the event */
+		if (!rb_try_to_discard(cpu_buffer, event)) {
+			/* Darn, this is just wasted space */
+			event->time_delta = 0;
+			event->array[0] = 0;
+			ret = 0;
+		}
 	}
 
 	*delta = 0;
@@ -1682,10 +1717,6 @@ void ring_buffer_discard_commit(struct ring_buffer *buffer,
 				struct ring_buffer_event *event)
 {
 	struct ring_buffer_per_cpu *cpu_buffer;
-	unsigned long new_index, old_index;
-	struct buffer_page *bpage;
-	unsigned long index;
-	unsigned long addr;
 	int cpu;
 
 	/* The event is discarded regardless */
@@ -1701,24 +1732,8 @@ void ring_buffer_discard_commit(struct ring_buffer *buffer,
 	cpu = smp_processor_id();
 	cpu_buffer = buffer->buffers[cpu];
 
-	new_index = rb_event_index(event);
-	old_index = new_index + rb_event_length(event);
-	addr = (unsigned long)event;
-	addr &= PAGE_MASK;
-
-	bpage = cpu_buffer->tail_page;
-
-	if (bpage->page == (void *)addr && rb_page_write(bpage) == old_index) {
-		/*
-		 * This is on the tail page. It is possible that
-		 * a write could come in and move the tail page
-		 * and write to the next page. That is fine
-		 * because we just shorten what is on this page.
-		 */
-		index = local_cmpxchg(&bpage->write, old_index, new_index);
-		if (index == old_index)
-			goto out;
-	}
+	if (!rb_try_to_discard(cpu_buffer, event))
+		goto out;
 
 	/*
 	 * The commit is still visible by the reader, so we
-- 
1.6.3.1

-- 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] ring-buffer: discard timestamps that are at the start of the buffer
  2009-06-03 14:16 [PATCH 0/3] [GIT PULL] (second try) ring-buffer: fixes for discard Steven Rostedt
  2009-06-03 14:16 ` [PATCH 1/3] ring-buffer: fix bug in ring_buffer_discard_commit Steven Rostedt
  2009-06-03 14:16 ` [PATCH 2/3] ring-buffer: try to discard unneeded timestamps Steven Rostedt
@ 2009-06-03 14:16 ` Steven Rostedt
  2 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2009-06-03 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Tim Bird

[-- Attachment #1: 0003-ring-buffer-discard-timestamps-that-are-at-the-start.patch --]
[-- Type: text/plain, Size: 4374 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Every buffer page in the ring buffer includes its own time stamp.
When an event is recorded to the ring buffer with a delta time greater
than what can be held in the event header, a time stamp event is created.

If the the create timestamp falls over to the next buffer page, it is
redundant because the buffer page holds a full time stamp. This patch
will try to discard the time stamp when it falls to the start of the
next page.

This change also fixes a issues with disarding events. If most events are
discarded, timestamps will start to creep into the ring buffer. If we
do not discard the timestamps then they can fill up the ring buffer over
time and waste space.

This change will keep time stamps from filling up over another page. If
something is recorded in the buffer page, and the rest is filtered, then
the time stamps can only fill up to the end of the page.

[ Impact: prevent time stamps from filling ring buffer ]

Reported-by: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ring_buffer.c |   37 +++++++++++++++++++++----------------
 1 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 5092660..7102d7a 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -370,6 +370,9 @@ static inline int test_time_stamp(u64 delta)
 /* Max payload is BUF_PAGE_SIZE - header (8bytes) */
 #define BUF_MAX_DATA_SIZE (BUF_PAGE_SIZE - (sizeof(u32) * 2))
 
+/* Max number of timestamps that can fit on a page */
+#define RB_TIMESTAMPS_PER_PAGE	(BUF_PAGE_SIZE / RB_LEN_TIME_STAMP)
+
 int ring_buffer_print_page_header(struct trace_seq *s)
 {
 	struct buffer_data_page field;
@@ -1409,8 +1412,12 @@ rb_add_time_stamp(struct ring_buffer_per_cpu *cpu_buffer,
 			event->array[0] = *delta >> TS_SHIFT;
 		} else {
 			cpu_buffer->commit_page->page->time_stamp = *ts;
-			event->time_delta = 0;
-			event->array[0] = 0;
+			/* try to discard, since we do not need this */
+			if (!rb_try_to_discard(cpu_buffer, event)) {
+				/* nope, just zero it */
+				event->time_delta = 0;
+				event->array[0] = 0;
+			}
 		}
 		cpu_buffer->write_stamp = *ts;
 		/* let the caller know this was the commit */
@@ -2268,8 +2275,8 @@ static void rb_advance_iter(struct ring_buffer_iter *iter)
 	 * Check if we are at the end of the buffer.
 	 */
 	if (iter->head >= rb_page_size(iter->head_page)) {
-		if (RB_WARN_ON(buffer,
-			       iter->head_page == cpu_buffer->commit_page))
+		/* discarded commits can make the page empty */
+		if (iter->head_page == cpu_buffer->commit_page)
 			return;
 		rb_inc_iter(iter);
 		return;
@@ -2312,12 +2319,10 @@ rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 	/*
 	 * We repeat when a timestamp is encountered. It is possible
 	 * to get multiple timestamps from an interrupt entering just
-	 * as one timestamp is about to be written. The max times
-	 * that this can happen is the number of nested interrupts we
-	 * can have.  Nesting 10 deep of interrupts is clearly
-	 * an anomaly.
+	 * as one timestamp is about to be written, or from discarded
+	 * commits. The most that we can have is the number on a single page.
 	 */
-	if (RB_WARN_ON(cpu_buffer, ++nr_loops > 10))
+	if (RB_WARN_ON(cpu_buffer, ++nr_loops > RB_TIMESTAMPS_PER_PAGE))
 		return NULL;
 
 	reader = rb_get_reader_page(cpu_buffer);
@@ -2383,14 +2388,14 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
 
  again:
 	/*
-	 * We repeat when a timestamp is encountered. It is possible
-	 * to get multiple timestamps from an interrupt entering just
-	 * as one timestamp is about to be written. The max times
-	 * that this can happen is the number of nested interrupts we
-	 * can have. Nesting 10 deep of interrupts is clearly
-	 * an anomaly.
+	 * We repeat when a timestamp is encountered.
+	 * We can get multiple timestamps by nested interrupts or also
+	 * if filtering is on (discarding commits). Since discarding
+	 * commits can be frequent we can get a lot of timestamps.
+	 * But we limit them by not adding timestamps if they begin
+	 * at the start of a page.
 	 */
-	if (RB_WARN_ON(cpu_buffer, ++nr_loops > 10))
+	if (RB_WARN_ON(cpu_buffer, ++nr_loops > RB_TIMESTAMPS_PER_PAGE))
 		return NULL;
 
 	if (rb_per_cpu_empty(cpu_buffer))
-- 
1.6.3.1

-- 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] ring-buffer: try to discard unneeded timestamps
  2009-06-03 14:16 ` [PATCH 2/3] ring-buffer: try to discard unneeded timestamps Steven Rostedt
@ 2009-06-03 18:54   ` Tim Bird
  2009-06-03 19:14     ` Steven Rostedt
  2009-06-03 20:55   ` Tim Bird
  1 sibling, 1 reply; 8+ messages in thread
From: Tim Bird @ 2009-06-03 18:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker

Steven Rostedt wrote:
> There are times that a race may happen that we add a timestamp in a
> nested write. This timestamp would just contain a zero delta and serves
> no purpose.
> 
> Now that we have a way to discard events, this patch will try to discard
> the timestamp instead of just wasting the space in the ring buffer.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  kernel/trace/ring_buffer.c |   67 +++++++++++++++++++++++++++-----------------
>  1 files changed, 41 insertions(+), 26 deletions(-)
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 9453023..5092660 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -1335,6 +1335,38 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
>  	return event;
>  }
>  
> +static inline int
> +rb_try_to_discard(struct ring_buffer_per_cpu *cpu_buffer,
> +		  struct ring_buffer_event *event)
> +{
> +	unsigned long new_index, old_index;
> +	struct buffer_page *bpage;
> +	unsigned long index;
> +	unsigned long addr;
> +
> +	new_index = rb_event_index(event);
> +	old_index = new_index + rb_event_length(event);
> +	addr = (unsigned long)event;
> +	addr &= PAGE_MASK;
> +
> +	bpage = cpu_buffer->tail_page;
> +
> +	if (bpage->page == (void *)addr && rb_page_write(bpage) == old_index) {
> +		/*
> +		 * This is on the tail page. It is possible that
> +		 * a write could come in and move the tail page
> +		 * and write to the next page. That is fine
> +		 * because we just shorten what is on this page.
> +		 */
> +		index = local_cmpxchg(&bpage->write, old_index, new_index);
> +		if (index == old_index)
> +			return 1;
> +	}
> +
> +	/* could not discard */
> +	return 0;
> +}
> +

Is this new routine only for discarding uncommitted events,
or can it be used on committed events?

I assume the former, since I see nothing about adjusting the
commit position.

In the ring_buffer API I see that there's a function for
discarding events (committed ones), but not for free-ing them.
In function duration filtering, it is desirable to free the
last committed event, which for a function exit of short
duration will be it's entry event 99% of the time.
 -- Tim

P.S. I'm very sorry about the missing '>' on the Signed-off-by line.
I ran checkpatch and got a passing score, but missed this.

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=============================


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] ring-buffer: try to discard unneeded timestamps
  2009-06-03 18:54   ` Tim Bird
@ 2009-06-03 19:14     ` Steven Rostedt
  2009-06-03 19:36       ` Tim Bird
  0 siblings, 1 reply; 8+ messages in thread
From: Steven Rostedt @ 2009-06-03 19:14 UTC (permalink / raw)
  To: Tim Bird; +Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker


On Wed, 3 Jun 2009, Tim Bird wrote:
> 
> Is this new routine only for discarding uncommitted events,
> or can it be used on committed events?
> 
> I assume the former, since I see nothing about adjusting the
> commit position.

Only uncommitted events.

> 
> In the ring_buffer API I see that there's a function for
> discarding events (committed ones), but not for free-ing them.
> In function duration filtering, it is desirable to free the
> last committed event, which for a function exit of short
> duration will be it's entry event 99% of the time.

In filtering we deside before commiting if we want to discard or not. 
(Note, this is only in tip right now.)  Once we commit it, there is no way 
to safely remove it from the ring buffer. Additions of items are not under 
a lock (only the moving from page to page is).

For the event tracer we check if we want to disard it or not before we 
commit.

> 
> P.S. I'm very sorry about the missing '>' on the Signed-off-by line.
> I ran checkpatch and got a passing score, but missed this.

No prob, you only made me spam LKML (and others) with about 10 gargage 
emails ;-)

-- Steve


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] ring-buffer: try to discard unneeded timestamps
  2009-06-03 19:14     ` Steven Rostedt
@ 2009-06-03 19:36       ` Tim Bird
  0 siblings, 0 replies; 8+ messages in thread
From: Tim Bird @ 2009-06-03 19:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker

Steven Rostedt wrote:
> On Wed, 3 Jun 2009, Tim Bird wrote:
>> Is this new routine only for discarding uncommitted events,
>> or can it be used on committed events?
>>
>> I assume the former, since I see nothing about adjusting the
>> commit position.
> 
> Only uncommitted events.
OK.

>> In the ring_buffer API I see that there's a function for
>> discarding events (committed ones), but not for free-ing them.
>> In function duration filtering, it is desirable to free the
>> last committed event, which for a function exit of short
>> duration will be it's entry event 99% of the time.
> 
> In filtering we deside before commiting if we want to discard or not. 
> (Note, this is only in tip right now.)  Once we commit it, there is no way 
> to safely remove it from the ring buffer. Additions of items are not under 
> a lock (only the moving from page to page is).
> 
> For the event tracer we check if we want to disard it or not before we 
> commit.

Yeah - that's what I thought.  I have duration filtering working
(well, the user interface is not done yet), but with the above
limitations, I can only free the exit from the trace, and mark
the entry event as discarded.  It would save a whole lot more
space to free the entry event as well.

I'm experimenting with free-ing only the last committed event,
when no other write has occurred in the buffer.  But I'm still
not sure I can make it safe.  Under normal conditions this
would be sufficient to catch 99% of the cases.  I did this
in KFT, but under locks, and I know you want to be lockless with
ftrace.

Are writes the only issue, or is it a problem with readers?
I was thinking of experimenting with allowing it when no readers
were active (or were on a different page).

>> P.S. I'm very sorry about the missing '>' on the Signed-off-by line.
>> I ran checkpatch and got a passing score, but missed this.
> 
> No prob, you only made me spam LKML (and others) with about 10 garbage 
> emails ;-)
LOL.  Sorry again. ;-)

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=============================


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] ring-buffer: try to discard unneeded timestamps
  2009-06-03 14:16 ` [PATCH 2/3] ring-buffer: try to discard unneeded timestamps Steven Rostedt
  2009-06-03 18:54   ` Tim Bird
@ 2009-06-03 20:55   ` Tim Bird
  1 sibling, 0 replies; 8+ messages in thread
From: Tim Bird @ 2009-06-03 20:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker

Steve,

See my last comment for an important fix.

Steven Rostedt wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> There are times that a race may happen that we add a timestamp in a
> nested write. This timestamp would just contain a zero delta and serves
> no purpose.
> 
> Now that we have a way to discard events, this patch will try to discard
> the timestamp instead of just wasting the space in the ring buffer.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  kernel/trace/ring_buffer.c |   67 +++++++++++++++++++++++++++-----------------
>  1 files changed, 41 insertions(+), 26 deletions(-)
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 9453023..5092660 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -1335,6 +1335,38 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
>  	return event;
>  }
>  
> +static inline int
> +rb_try_to_discard(struct ring_buffer_per_cpu *cpu_buffer,
> +		  struct ring_buffer_event *event)

It might be worth using the terminology 'rb_try_to_free' (or something else)
to distinguish the case of backing up the write pointer from the
case of marking an event as discarded (pad) in the log.

> +{
> +	unsigned long new_index, old_index;
> +	struct buffer_page *bpage;
> +	unsigned long index;
> +	unsigned long addr;
> +
> +	new_index = rb_event_index(event);
> +	old_index = new_index + rb_event_length(event);
> +	addr = (unsigned long)event;
> +	addr &= PAGE_MASK;
> +
> +	bpage = cpu_buffer->tail_page;
> +
> +	if (bpage->page == (void *)addr && rb_page_write(bpage) == old_index) {
> +		/*
> +		 * This is on the tail page. It is possible that
> +		 * a write could come in and move the tail page
> +		 * and write to the next page. That is fine
> +		 * because we just shorten what is on this page.
> +		 */
> +		index = local_cmpxchg(&bpage->write, old_index, new_index);
> +		if (index == old_index)
> +			return 1;
> +	}
> +
> +	/* could not discard */
> +	return 0;
> +}
> +
>  static int
>  rb_add_time_stamp(struct ring_buffer_per_cpu *cpu_buffer,
>  		  u64 *ts, u64 *delta)
> @@ -1384,10 +1416,13 @@ rb_add_time_stamp(struct ring_buffer_per_cpu *cpu_buffer,
>  		/* let the caller know this was the commit */
>  		ret = 1;
>  	} else {
> -		/* Darn, this is just wasted space */
> -		event->time_delta = 0;
> -		event->array[0] = 0;
> -		ret = 0;
> +		/* Try to discard the event */
> +		if (!rb_try_to_discard(cpu_buffer, event)) {
> +			/* Darn, this is just wasted space */
> +			event->time_delta = 0;
> +			event->array[0] = 0;
> +			ret = 0;
> +		}
>  	}
>  
>  	*delta = 0;
> @@ -1682,10 +1717,6 @@ void ring_buffer_discard_commit(struct ring_buffer *buffer,
>  				struct ring_buffer_event *event)
>  {
>  	struct ring_buffer_per_cpu *cpu_buffer;
> -	unsigned long new_index, old_index;
> -	struct buffer_page *bpage;
> -	unsigned long index;
> -	unsigned long addr;
>  	int cpu;
>  
>  	/* The event is discarded regardless */
> @@ -1701,24 +1732,8 @@ void ring_buffer_discard_commit(struct ring_buffer *buffer,
>  	cpu = smp_processor_id();
>  	cpu_buffer = buffer->buffers[cpu];
>  
> -	new_index = rb_event_index(event);
> -	old_index = new_index + rb_event_length(event);
> -	addr = (unsigned long)event;
> -	addr &= PAGE_MASK;
> -
> -	bpage = cpu_buffer->tail_page;
> -
> -	if (bpage->page == (void *)addr && rb_page_write(bpage) == old_index) {
> -		/*
> -		 * This is on the tail page. It is possible that
> -		 * a write could come in and move the tail page
> -		 * and write to the next page. That is fine
> -		 * because we just shorten what is on this page.
> -		 */
> -		index = local_cmpxchg(&bpage->write, old_index, new_index);
> -		if (index == old_index)
> -			goto out;
> -	}
> +	if (!rb_try_to_discard(cpu_buffer, event))
> +		goto out;

The sense of this test is wrong.  It should be:

	if (rb_try_to_discard(cpu_buffer, event))
		goto out;

You want to go to 'out' (skipping the increment of entries)
if you successfully freed the event.

In my testing, rb_try_to_discard was almost alway successful.
Since only one statement is jumped over by the goto, the
following might be better:

	if (unlikely(!rb_try_to_discard(cpu_buffer, event)))
		/*
		 * The commit is still visible by the reader, so we
		 * must increment entries.
		 */
		local_inc(&cpu_buffer->entries);
	

>  	/*
>  	 * The commit is still visible by the reader, so we
> -- 1.6.3.1
> -- 


-- 
=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=============================


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-06-03 20:56 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-03 14:16 [PATCH 0/3] [GIT PULL] (second try) ring-buffer: fixes for discard Steven Rostedt
2009-06-03 14:16 ` [PATCH 1/3] ring-buffer: fix bug in ring_buffer_discard_commit Steven Rostedt
2009-06-03 14:16 ` [PATCH 2/3] ring-buffer: try to discard unneeded timestamps Steven Rostedt
2009-06-03 18:54   ` Tim Bird
2009-06-03 19:14     ` Steven Rostedt
2009-06-03 19:36       ` Tim Bird
2009-06-03 20:55   ` Tim Bird
2009-06-03 14:16 ` [PATCH 3/3] ring-buffer: discard timestamps that are at the start of the buffer Steven Rostedt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.