All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ring-buffer: fix uninitialized read_stamp
@ 2012-06-07 23:27 David Sharp
  2012-06-07 23:34 ` David Sharp
  0 siblings, 1 reply; 15+ messages in thread
From: David Sharp @ 2012-06-07 23:27 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: vnagarnaik, David Sharp

This fixes a scenario in which trace_pipe_raw will return events with invalid
timestamps when events are copied out one at a time (instead of swapping out
the reader page with a spare page). In this scenario, ring_buffer_read_page()
uses cpu_buffer->read_stamp to keep track of the time of the earliest event.
However, cpu_buffer->read_stamp was not always updated. The only function that
sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
only by rb_get_reader_page(). rb_reset_reader_page() was not called when there
is data immediately available on the page to read (read < rb_page_size()). This
is the bug.

Setting the read_stamp on the first read from a page fixes the bug.

Tested: On certain lightly-loaded machines, repetitive reads from
trace_pipe_raw without using splice() would sometimes result in invalid
timestamps. Poisoning read_stamp in rb_init_page() with a negative value makes
the problem more visible. After this fix, the invalid timstamps disappear.

Google-Bug-Id: 6410455
Signed-off-by: David Sharp <dhsharp@google.com>
---
 kernel/trace/ring_buffer.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 1d0f6a8..ad0239b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3226,8 +3226,15 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 	reader = cpu_buffer->reader_page;
 
 	/* If there's more to read, return this page */
-	if (cpu_buffer->reader_page->read < rb_page_size(reader))
+	if (cpu_buffer->reader_page->read < rb_page_size(reader)) {
+		/*
+		 * If this is the first read from this page,
+		 * initialize the read timestamp.
+		 */
+		if (cpu_buffer->reader_page->read == 0)
+			cpu_buffer->read_stamp = reader->page->time_stamp;
 		goto out;
+	}
 
 	/* Never should we have an index greater than the size */
 	if (RB_WARN_ON(cpu_buffer,
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-07 23:27 [PATCH] ring-buffer: fix uninitialized read_stamp David Sharp
@ 2012-06-07 23:34 ` David Sharp
  2012-06-18 23:02   ` David Sharp
  0 siblings, 1 reply; 15+ messages in thread
From: David Sharp @ 2012-06-07 23:34 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: vnagarnaik, David Sharp

On Thu, Jun 7, 2012 at 4:27 PM, David Sharp <dhsharp@google.com> wrote:
> This fixes a scenario in which trace_pipe_raw will return events with invalid
> timestamps when events are copied out one at a time (instead of swapping out
> the reader page with a spare page). In this scenario, ring_buffer_read_page()
> uses cpu_buffer->read_stamp to keep track of the time of the earliest event.
> However, cpu_buffer->read_stamp was not always updated. The only function that
> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
> only by rb_get_reader_page(). rb_reset_reader_page() was not called when there
> is data immediately available on the page to read (read < rb_page_size()). This
> is the bug.
>
> Setting the read_stamp on the first read from a page fixes the bug.
>
> Tested: On certain lightly-loaded machines, repetitive reads from
> trace_pipe_raw without using splice() would sometimes result in invalid
> timestamps. Poisoning read_stamp in rb_init_page() with a negative value makes
> the problem more visible. After this fix, the invalid timstamps disappear.

err, that's poisoning the bpage->time_stamp, not read_stamp.

>
> Google-Bug-Id: 6410455
> Signed-off-by: David Sharp <dhsharp@google.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-07 23:34 ` David Sharp
@ 2012-06-18 23:02   ` David Sharp
  2012-06-21 14:52     ` Steven Rostedt
                       ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: David Sharp @ 2012-06-18 23:02 UTC (permalink / raw)
  To: rostedt, linux-kernel; +Cc: vnagarnaik, David Sharp

This fixes a scenario in which trace_pipe_raw will return events with invalid
timestamps when events are copied out one at a time (instead of swapping out
the reader page with a spare page). In this scenario, ring_buffer_read_page()
uses cpu_buffer->read_stamp to keep track of the time of the earliest event.
However, cpu_buffer->read_stamp was not always updated. The only function that
sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
only by rb_get_reader_page(). rb_reset_reader_page() was not called when there
is data immediately available on the page to read (read < rb_page_size()). This
is the bug.

Setting the read_stamp on the first read from a page fixes the bug.

Tested: On certain lightly-loaded machines, repetitive reads from
trace_pipe_raw without using splice() would sometimes result in invalid
timestamps. Poisoning bpage->timestamp in rb_init_page() with a negative value
makes the problem more visible. After this fix, the invalid timstamps
disappear.

Google-Bug-Id: 6410455
Signed-off-by: David Sharp <dhsharp@google.com>
---

Steve, did you see this? It fixes a data-accuracy bug in ftrace. Also taking
this reminder as an opportunity to correct that typo in the description.

 kernel/trace/ring_buffer.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 1d0f6a8..ad0239b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3226,8 +3226,15 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 	reader = cpu_buffer->reader_page;
 
 	/* If there's more to read, return this page */
-	if (cpu_buffer->reader_page->read < rb_page_size(reader))
+	if (cpu_buffer->reader_page->read < rb_page_size(reader)) {
+		/*
+		 * If this is the first read from this page,
+		 * initialize the read timestamp.
+		 */
+		if (cpu_buffer->reader_page->read == 0)
+			cpu_buffer->read_stamp = reader->page->time_stamp;
 		goto out;
+	}
 
 	/* Never should we have an index greater than the size */
 	if (RB_WARN_ON(cpu_buffer,
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-18 23:02   ` David Sharp
@ 2012-06-21 14:52     ` Steven Rostedt
  2012-06-21 15:46     ` Steven Rostedt
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2012-06-21 14:52 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, vnagarnaik

On Mon, 2012-06-18 at 16:02 -0700, David Sharp wrote:
> This fixes a scenario in which trace_pipe_raw will return events with invalid
> timestamps when events are copied out one at a time (instead of swapping out
> the reader page with a spare page). In this scenario, ring_buffer_read_page()
> uses cpu_buffer->read_stamp to keep track of the time of the earliest event.
> However, cpu_buffer->read_stamp was not always updated. The only function that
> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
> only by rb_get_reader_page(). rb_reset_reader_page() was not called when there
> is data immediately available on the page to read (read < rb_page_size()). This
> is the bug.

Hi David,

Thanks for the reminder, I'll be looking at it today.

Do you believe that this is an urgent fix and should be marked for
stable, or do you think it can wait till 3.6? If you think it should be
marked for stable, then it should be pushed for 3.5.

-- Steve



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-18 23:02   ` David Sharp
  2012-06-21 14:52     ` Steven Rostedt
@ 2012-06-21 15:46     ` Steven Rostedt
  2012-06-21 15:56       ` Steven Rostedt
  2012-06-22 20:50       ` David Sharp
  2012-07-06 10:49     ` [tip:perf/core] ring-buffer: Fix " tip-bot for Steven Rostedt
  2012-07-06 11:30     ` [tip:perf/urgent] " tip-bot for Steven Rostedt
  3 siblings, 2 replies; 15+ messages in thread
From: Steven Rostedt @ 2012-06-21 15:46 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, vnagarnaik

OK, I did look at this more.


On Mon, 2012-06-18 at 16:02 -0700, David Sharp wrote:
> This fixes a scenario in which trace_pipe_raw will return events with invalid
> timestamps when events are copied out one at a time (instead of swapping out
> the reader page with a spare page). In this scenario, ring_buffer_read_page()
> uses cpu_buffer->read_stamp to keep track of the time of the earliest event.
> However, cpu_buffer->read_stamp was not always updated.



>  The only function that
> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
> only by rb_get_reader_page().

Correct, which is the only way to get the reader page no matter how you
read the buffer.

>  rb_reset_reader_page() was not called when there
> is data immediately available on the page to read (read < rb_page_size()). This
> is the bug.

It is not called? but how did you get the reader_page without the swap
in the first place?

When the ring buffer is first allocated the reader page is set to a
zero'd page, making the "read" and "commit" both zero.

The first time rb_get_reader_page() is called, the compare of read <
rb_page_size() (which is the commit field) fails (0 < 0 is false).

A swap from the writable ring buffer takes place and the
cpu_buffer->read_stamp is updated.

Ah! I think this is where the bug you see happens. But your analysis is
flawed.

If the ring buffer is currently empty, we swap an empty page for an
empty page. But the writer ends up on the reader page preventing new
swaps from taking place.

    commit_page == reader_page

If the writer starts writing, it will update the time stamp of the page.
If your next read happens now, and it's just a single read, then you are
correct that it will not update the read_stamp.

I'm wondering if it would be better to just not do the swap, and return
NULL when it is empty. This would also fix the problem, as the
read_stamp will only be set when you actually have data.

Or it may just be simpler to take your patch.

-- Steve




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-21 15:46     ` Steven Rostedt
@ 2012-06-21 15:56       ` Steven Rostedt
  2012-06-22 20:50       ` David Sharp
  1 sibling, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2012-06-21 15:56 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, vnagarnaik

On Thu, 2012-06-21 at 11:46 -0400, Steven Rostedt wrote:

> I'm wondering if it would be better to just not do the swap, and return
> NULL when it is empty. This would also fix the problem, as the
> read_stamp will only be set when you actually have data.

Does something like this work for you. Note, this is totally untested!

-- Steve

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index ad0239b..5943044 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 	if (cpu_buffer->commit_page == cpu_buffer->reader_page)
 		goto out;
 
+	/* Don't bother swapping if the ring buffer is empty */
+	if (rb_num_of_entries(cpu_buffer) == 0)
+		goto out;
+
 	/*
 	 * Reset the reader page to size zero.
 	 */



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-21 15:46     ` Steven Rostedt
  2012-06-21 15:56       ` Steven Rostedt
@ 2012-06-22 20:50       ` David Sharp
  2012-06-22 23:31         ` Steven Rostedt
  1 sibling, 1 reply; 15+ messages in thread
From: David Sharp @ 2012-06-22 20:50 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, vnagarnaik

On Thu, Jun 21, 2012 at 7:52 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> Do you believe that this is an urgent fix and should be marked for
> stable, or do you think it can wait till 3.6? If you think it should be
> marked for stable, then it should be pushed for 3.5.

I think it should be considered for 3.5, since it affects the accuracy
of the timestamps in very roughly 20% of traces read with read() on
trace_pipe_raw. otoh, it only affects the first page or so on lightly
loaded CPUs. Personally it doesn't make much difference to us what
release it gets into.

On Thu, Jun 21, 2012 at 8:46 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> OK, I did look at this more.

>>  The only function that
>> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
>> only by rb_get_reader_page().
>
> Correct, which is the only way to get the reader page no matter how you
> read the buffer.
>
>>  rb_reset_reader_page() was not called when there
>> is data immediately available on the page to read (read < rb_page_size()). This
>> is the bug.
>
> It is not called?

Correct. I was able to add WARN_ONs that showed that it reached "out:"
without reaching rb_reset_reader_page() first while still having a
poison value in cpu_buffer->read_stamp.

> but how did you get the reader_page without the swap
> in the first place?
>
> When the ring buffer is first allocated the reader page is set to a
> zero'd page, making the "read" and "commit" both zero.
>
> The first time rb_get_reader_page() is called, the compare of read <
> rb_page_size() (which is the commit field) fails (0 < 0 is false).
>
> A swap from the writable ring buffer takes place and the
> cpu_buffer->read_stamp is updated.
>
> Ah! I think this is where the bug you see happens. But your analysis is
> flawed.

I admit I didn't look into it that closely, so flaws are quite
possible (also, I'm human).

>
> If the ring buffer is currently empty, we swap an empty page for an
> empty page. But the writer ends up on the reader page preventing new
> swaps from taking place.
>
>    commit_page == reader_page
>
> If the writer starts writing, it will update the time stamp of the page.
> If your next read happens now, and it's just a single read, then you are
> correct that it will not update the read_stamp.
>
> I'm wondering if it would be better to just not do the swap, and return
> NULL when it is empty. This would also fix the problem, as the
> read_stamp will only be set when you actually have data.

But we do have data... That's how we know we're getting invalid timestamps.

I don't quite understand what you're describing. Here's what I think
is happening though:

When the ring buffer is reset, commit_page == tail_page == head_page.
rb_get_reader_page() will pick up the head page. Then a few (less than
1 page worth) writes happen, on the tail_page which is currently (or
soon to be) also the reader_page. Now read == 0, but
rb_page_size(reader) is however many bytes have been written to the
page.

So we have valid data on the reader page, but read_stamp has not been set yet.

>
> Or it may just be simpler to take your patch.

Please? :)

On Thu, Jun 21, 2012 at 8:56 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> Does something like this work for you. Note, this is totally untested!
>
> -- Steve
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index ad0239b..5943044 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
>        if (cpu_buffer->commit_page == cpu_buffer->reader_page)
>                goto out;
>
> +       /* Don't bother swapping if the ring buffer is empty */

This doesn't make sense, since the ring buffer isn't empty in this
scenario. (I didn't bother testing it.)

> +       if (rb_num_of_entries(cpu_buffer) == 0)

Did you mean rb_page_entries(something?)

> +               goto out;
> +
>        /*
>         * Reset the reader page to size zero.
>         */
>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-22 20:50       ` David Sharp
@ 2012-06-22 23:31         ` Steven Rostedt
  2012-06-23  1:27           ` David Sharp
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2012-06-22 23:31 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, vnagarnaik

On Fri, 2012-06-22 at 13:50 -0700, David Sharp wrote:
> On Thu, Jun 21, 2012 at 7:52 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > Do you believe that this is an urgent fix and should be marked for
> > stable, or do you think it can wait till 3.6? If you think it should be
> > marked for stable, then it should be pushed for 3.5.
> 
> I think it should be considered for 3.5, since it affects the accuracy
> of the timestamps in very roughly 20% of traces read with read() on
> trace_pipe_raw. otoh, it only affects the first page or so on lightly
> loaded CPUs. Personally it doesn't make much difference to us what
> release it gets into.

If there's a simple fix then I can try to get this into 3.5.

> 
> On Thu, Jun 21, 2012 at 8:46 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > OK, I did look at this more.
> 
> >>  The only function that
> >> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
> >> only by rb_get_reader_page().
> >
> > Correct, which is the only way to get the reader page no matter how you
> > read the buffer.
> >
> >>  rb_reset_reader_page() was not called when there
> >> is data immediately available on the page to read (read < rb_page_size()). This
> >> is the bug.
> >
> > It is not called?
> 
> Correct. I was able to add WARN_ONs that showed that it reached "out:"
> without reaching rb_reset_reader_page() first while still having a
> poison value in cpu_buffer->read_stamp.

NO! You didn't understand what I wrote. The writer can not get onto the
reader page without a swap! The reader page is not part of the main ring
buffer.

> 
> > but how did you get the reader_page without the swap
> > in the first place?
> >
> > When the ring buffer is first allocated the reader page is set to a
> > zero'd page, making the "read" and "commit" both zero.
> >
> > The first time rb_get_reader_page() is called, the compare of read <
> > rb_page_size() (which is the commit field) fails (0 < 0 is false).
> >
> > A swap from the writable ring buffer takes place and the
> > cpu_buffer->read_stamp is updated.
> >
> > Ah! I think this is where the bug you see happens. But your analysis is
> > flawed.
> 
> I admit I didn't look into it that closely, so flaws are quite
> possible (also, I'm human).

I didn't mean to offend, as this is not something that can be easily
picked up.

> 
> >
> > If the ring buffer is currently empty, we swap an empty page for an
> > empty page. But the writer ends up on the reader page preventing new
> > swaps from taking place.
> >
> >    commit_page == reader_page
> >
> > If the writer starts writing, it will update the time stamp of the page.
> > If your next read happens now, and it's just a single read, then you are
> > correct that it will not update the read_stamp.
> >
> > I'm wondering if it would be better to just not do the swap, and return
> > NULL when it is empty. This would also fix the problem, as the
> > read_stamp will only be set when you actually have data.
> 
> But we do have data... That's how we know we're getting invalid timestamps.

I bet this never happens if you add a little data and then do a read. I
only happens when you read from an empty ring buffer, then do a small
write, and then read again. That's where it will get you.

> 
> I don't quite understand what you're describing. Here's what I think
> is happening though:
> 
> When the ring buffer is reset, commit_page == tail_page == head_page.

And at this moment, the writer is off the reader page and will not be on
the reader page. The only way to get date at this point is if you do a
swap.

> rb_get_reader_page() will pick up the head page.

Right. But this requires a read to happen. If you never do a read, then
rb_get_reader_page will not pick up the head page. The problem is when
you read from an empty ring buffer.


>  Then a few (less than
> 1 page worth) writes happen, on the tail_page which is currently (or
> soon to be) also the reader_page. Now read == 0, but
> rb_page_size(reader) is however many bytes have been written to the
> page.

Correct.

> 
> So we have valid data on the reader page, but read_stamp has not been set yet.

Yep

> 
> >
> > Or it may just be simpler to take your patch.
> 
> Please? :)

Now I think you may understand my patch.

> 
> On Thu, Jun 21, 2012 at 8:56 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > Does something like this work for you. Note, this is totally untested!
> >
> > -- Steve
> >
> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> > index ad0239b..5943044 100644
> > --- a/kernel/trace/ring_buffer.c
> > +++ b/kernel/trace/ring_buffer.c
> > @@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
> >        if (cpu_buffer->commit_page == cpu_buffer->reader_page)
> >                goto out;
> >
> > +       /* Don't bother swapping if the ring buffer is empty */
> 
> This doesn't make sense, since the ring buffer isn't empty in this
> scenario. (I didn't bother testing it.)

Yes it is! If you never did a read on the empty buffer, the *only* way
to get what was written is to do the swap.

The problem you are seeing is that you did a read on an empty buffer
before writing. In this case the swap happens but all we have is an
empty page with the write pointer (commit and tail) on it. When a write
happens, and you do another read, *then* we do not do the swap and you
get the stale time stamp.

Thus, if you never let the first swap happen, you wont get into this
scenario that you read and do not update the read_stamp.

> 
> > +       if (rb_num_of_entries(cpu_buffer) == 0)
> 
> Did you mean rb_page_entries(something?)

Nope, I meant the actual ring buffer. If the ring buffer is empty, do
not perform the swap. Because we only update the read stamp when we
perform a swap, and it takes a write to happen before the page's
timestamp is updated. If we do the swap before any writes happen, we get
an invalid timestamp from the page.

Now do you understand?

-- Steve

> 
> > +               goto out;
> > +
> >        /*
> >         * Reset the reader page to size zero.
> >         */
> >
> >



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-22 23:31         ` Steven Rostedt
@ 2012-06-23  1:27           ` David Sharp
  2012-06-27  0:35             ` David Sharp
  0 siblings, 1 reply; 15+ messages in thread
From: David Sharp @ 2012-06-23  1:27 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, vnagarnaik

On Fri, Jun 22, 2012 at 4:31 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 2012-06-22 at 13:50 -0700, David Sharp wrote:

>> On Thu, Jun 21, 2012 at 8:46 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
>> > OK, I did look at this more.
>>
>> >>  The only function that
>> >> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
>> >> only by rb_get_reader_page().
>> >
>> > Correct, which is the only way to get the reader page no matter how you
>> > read the buffer.
>> >
>> >>  rb_reset_reader_page() was not called when there
>> >> is data immediately available on the page to read (read < rb_page_size()). This
>> >> is the bug.
>> >
>> > It is not called?
>>
>> Correct. I was able to add WARN_ONs that showed that it reached "out:"
>> without reaching rb_reset_reader_page() first while still having a
>> poison value in cpu_buffer->read_stamp.
>
> NO! You didn't understand what I wrote. The writer can not get onto the
> reader page without a swap! The reader page is not part of the main ring
> buffer.

I understand that. I just thought the writer was getting onto the
reader page during a swap in the first read, the same read in which we
were getting invalid timestamps. But that doesn't make sense.

>> > Ah! I think this is where the bug you see happens. But your analysis is
>> > flawed.
>>
>> I admit I didn't look into it that closely, so flaws are quite
>> possible (also, I'm human).
>
> I didn't mean to offend, as this is not something that can be easily
> picked up.

None taken. I only meant I wasn't as rigorous as I could have been.

>> > If the ring buffer is currently empty, we swap an empty page for an
>> > empty page. But the writer ends up on the reader page preventing new
>> > swaps from taking place.
>> >
>> >    commit_page == reader_page
>> >
>> > If the writer starts writing, it will update the time stamp of the page.
>> > If your next read happens now, and it's just a single read, then you are
>> > correct that it will not update the read_stamp.
>> >
>> > I'm wondering if it would be better to just not do the swap, and return
>> > NULL when it is empty. This would also fix the problem, as the
>> > read_stamp will only be set when you actually have data.
>>
>> But we do have data... That's how we know we're getting invalid timestamps.
>
> I bet this never happens if you add a little data and then do a read. I
> only happens when you read from an empty ring buffer, then do a small
> write, and then read again. That's where it will get you.

Ah, see what you mean now. That seems possible.

>
>>
>> I don't quite understand what you're describing. Here's what I think
>> is happening though:
>>
>> When the ring buffer is reset, commit_page == tail_page == head_page.
>
> And at this moment, the writer is off the reader page and will not be on
> the reader page. The only way to get date at this point is if you do a
> swap.
>
>> rb_get_reader_page() will pick up the head page.
>
> Right. But this requires a read to happen. If you never do a read, then
> rb_get_reader_page will not pick up the head page. The problem is when
> you read from an empty ring buffer.
>
>
>>  Then a few (less than
>> 1 page worth) writes happen, on the tail_page which is currently (or
>> soon to be) also the reader_page. Now read == 0, but
>> rb_page_size(reader) is however many bytes have been written to the
>> page.
>
> Correct.
>
>>
>> So we have valid data on the reader page, but read_stamp has not been set yet.
>
> Yep
>
>>
>> >
>> > Or it may just be simpler to take your patch.
>>
>> Please? :)
>
> Now I think you may understand my patch.

Yeah, mostly. At least enough that I think it's worth testing. But Monday.

>> On Thu, Jun 21, 2012 at 8:56 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
>> > Does something like this work for you. Note, this is totally untested!
>> >
>> > -- Steve
>> >
>> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
>> > index ad0239b..5943044 100644
>> > --- a/kernel/trace/ring_buffer.c
>> > +++ b/kernel/trace/ring_buffer.c
>> > @@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
>> >        if (cpu_buffer->commit_page == cpu_buffer->reader_page)
>> >                goto out;
>> >
>> > +       /* Don't bother swapping if the ring buffer is empty */
>>
>> This doesn't make sense, since the ring buffer isn't empty in this
>> scenario. (I didn't bother testing it.)
>
> Yes it is! If you never did a read on the empty buffer, the *only* way
> to get what was written is to do the swap.
>
> The problem you are seeing is that you did a read on an empty buffer
> before writing. In this case the swap happens but all we have is an
> empty page with the write pointer (commit and tail) on it. When a write
> happens, and you do another read, *then* we do not do the swap and you
> get the stale time stamp.
>
> Thus, if you never let the first swap happen, you wont get into this
> scenario that you read and do not update the read_stamp.
>
>>
>> > +       if (rb_num_of_entries(cpu_buffer) == 0)
>>
>> Did you mean rb_page_entries(something?)
>
> Nope, I meant the actual ring buffer. If the ring buffer is empty, do
> not perform the swap. Because we only update the read stamp when we
> perform a swap, and it takes a write to happen before the page's
> timestamp is updated. If we do the swap before any writes happen, we get
> an invalid timestamp from the page.

Oops, I only asked because I'm blind, or can't type, or something,
because I couldn't find rb_num_of_entries().

>
> Now do you understand?
>
> -- Steve
>
>>
>> > +               goto out;
>> > +
>> >        /*
>> >         * Reset the reader page to size zero.
>> >         */
>> >
>> >
>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-23  1:27           ` David Sharp
@ 2012-06-27  0:35             ` David Sharp
  2012-06-27 12:46               ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: David Sharp @ 2012-06-27  0:35 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, vnagarnaik

On Fri, Jun 22, 2012 at 6:27 PM, David Sharp <dhsharp@google.com> wrote:
> On Fri, Jun 22, 2012 at 4:31 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>>
>> Now I think you may understand my patch.
>
> Yeah, mostly. At least enough that I think it's worth testing. But Monday.

I got around to testing your patch today, and it fixes the issue. No
bad-looking timestamps in 40 runs, whereas could reproduce within 3
runs before. Do you want me to send you a fresh patch, or just use the
one you have?

>>> On Thu, Jun 21, 2012 at 8:56 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
>>> > Does something like this work for you. Note, this is totally untested!
>>> >
>>> > -- Steve
>>> >
>>> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
>>> > index ad0239b..5943044 100644
>>> > --- a/kernel/trace/ring_buffer.c
>>> > +++ b/kernel/trace/ring_buffer.c
>>> > @@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
>>> >        if (cpu_buffer->commit_page == cpu_buffer->reader_page)
>>> >                goto out;
>>> >
>>> > +       /* Don't bother swapping if the ring buffer is empty */
>>> > +       if (rb_num_of_entries(cpu_buffer) == 0)
>>> > +               goto out;
>>> > +
>>> >        /*
>>> >         * Reset the reader page to size zero.
>>> >         */

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-27  0:35             ` David Sharp
@ 2012-06-27 12:46               ` Steven Rostedt
  2012-06-27 18:00                 ` David Sharp
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2012-06-27 12:46 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, vnagarnaik

On Tue, 2012-06-26 at 17:35 -0700, David Sharp wrote:
> On Fri, Jun 22, 2012 at 6:27 PM, David Sharp <dhsharp@google.com> wrote:
> > On Fri, Jun 22, 2012 at 4:31 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> >>
> >> Now I think you may understand my patch.
> >
> > Yeah, mostly. At least enough that I think it's worth testing. But Monday.
> 
> I got around to testing your patch today, and it fixes the issue. No
> bad-looking timestamps in 40 runs, whereas could reproduce within 3
> runs before. Do you want me to send you a fresh patch, or just use the
> one you have?

I rather use this one for two reasons.

1) it limits the places where read_stamp is updated. And I rather not
add an update because "it fixes an anomaly".

2) I think it is wrong to force the writer on the reader page when no
write has been made. There's some side effects that this causes. One is
that if you do a read with no write, and then do nothing, it forces the
writer on that page. Now if a lot of writes happen (function tracing),
the writes that were on the reader page are never overwritten when the
buffer is full. Then you get a page of very old data, followed by a
buffer full of new data.

-- Steve

> 
> >>> On Thu, Jun 21, 2012 at 8:56 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> >>> > Does something like this work for you. Note, this is totally untested!
> >>> >
> >>> > -- Steve
> >>> >
> >>> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> >>> > index ad0239b..5943044 100644
> >>> > --- a/kernel/trace/ring_buffer.c
> >>> > +++ b/kernel/trace/ring_buffer.c
> >>> > @@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
> >>> >        if (cpu_buffer->commit_page == cpu_buffer->reader_page)
> >>> >                goto out;
> >>> >
> >>> > +       /* Don't bother swapping if the ring buffer is empty */
> >>> > +       if (rb_num_of_entries(cpu_buffer) == 0)
> >>> > +               goto out;
> >>> > +
> >>> >        /*
> >>> >         * Reset the reader page to size zero.
> >>> >         */



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-27 12:46               ` Steven Rostedt
@ 2012-06-27 18:00                 ` David Sharp
  2012-06-27 18:05                   ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: David Sharp @ 2012-06-27 18:00 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, vnagarnaik

On Wed, Jun 27, 2012 at 5:46 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 2012-06-26 at 17:35 -0700, David Sharp wrote:
> > On Fri, Jun 22, 2012 at 6:27 PM, David Sharp <dhsharp@google.com> wrote:
> > > On Fri, Jun 22, 2012 at 4:31 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > >>
> > >> Now I think you may understand my patch.
> > >
> > > Yeah, mostly. At least enough that I think it's worth testing. But Monday.
> >
> > I got around to testing your patch today, and it fixes the issue. No
> > bad-looking timestamps in 40 runs, whereas could reproduce within 3
> > runs before. Do you want me to send you a fresh patch, or just use the
> > one you have?
>
> I rather use this one for two reasons.

I just meant, do you want me to send you a version of your patch with
my description (I'll update it, obviously), or do it yourself.

>
> 1) it limits the places where read_stamp is updated. And I rather not
> add an update because "it fixes an anomaly".
>
> 2) I think it is wrong to force the writer on the reader page when no
> write has been made. There's some side effects that this causes. One is
> that if you do a read with no write, and then do nothing, it forces the
> writer on that page. Now if a lot of writes happen (function tracing),
> the writes that were on the reader page are never overwritten when the
> buffer is full. Then you get a page of very old data, followed by a
> buffer full of new data.

I have always disliked that page of very old data, so I'm really happy
that this will get rid of it. As I recall, you once claimed this was a
"feature". :)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ring-buffer: fix uninitialized read_stamp
  2012-06-27 18:00                 ` David Sharp
@ 2012-06-27 18:05                   ` Steven Rostedt
  0 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2012-06-27 18:05 UTC (permalink / raw)
  To: David Sharp; +Cc: linux-kernel, vnagarnaik

On Wed, 2012-06-27 at 11:00 -0700, David Sharp wrote:

> I just meant, do you want me to send you a version of your patch with
> my description (I'll update it, obviously), or do it yourself.

I'll do it. Although, I think this may just go for 3.6 and not 3.5, as
the bug is really minor.

> I have always disliked that page of very old data, so I'm really happy
> that this will get rid of it. As I recall, you once claimed this was a
> "feature". :)

Of course it was a feature. But we'll just act like Gnome and remove
it ;-)

-- Steve



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [tip:perf/core] ring-buffer: Fix uninitialized read_stamp
  2012-06-18 23:02   ` David Sharp
  2012-06-21 14:52     ` Steven Rostedt
  2012-06-21 15:46     ` Steven Rostedt
@ 2012-07-06 10:49     ` tip-bot for Steven Rostedt
  2012-07-06 11:30     ` [tip:perf/urgent] " tip-bot for Steven Rostedt
  3 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Steven Rostedt @ 2012-07-06 10:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, rostedt, dhsharp, srostedt, tglx

Commit-ID:  a5fb833172eca69136e9ee1ada778e404086ab8a
Gitweb:     http://git.kernel.org/tip/a5fb833172eca69136e9ee1ada778e404086ab8a
Author:     Steven Rostedt <srostedt@redhat.com>
AuthorDate: Thu, 28 Jun 2012 13:35:04 -0400
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Thu, 28 Jun 2012 13:52:15 -0400

ring-buffer: Fix uninitialized read_stamp

The ring buffer reader page is used to swap a page from the writable
ring buffer. If the writer happens to be on that page, it ends up on the
reader page, but will simply move off of it, back into the writable ring
buffer as writes are added.

The time stamp passed back to the readers is stored in the cpu_buffer per
CPU descriptor. This stamp is updated when a swap of the reader page takes
place, and it reads the current stamp from the page taken from the writable
ring buffer. Everytime a writer goes to a new page, it updates the time stamp
of that page.

The problem happens if a reader reads a page from an empty per CPU ring buffer.
If the buffer is empty, the swap still takes place, placing the writer at the
start of the reader page. If at a later time, a write happens, it updates the
page's time stamp and continues. But the problem is that the read_stamp does
not get updated, because the page was already swapped.

The solution to this was to not swap the page if the ring buffer happens to
be empty. This also removes the side effect that the writes on the reader
page will not get updated because the writer never gets back on the reader
page without a swap. That is, if a read happens on an empty buffer, but then
no reads happen for a while. If a swap took place, and the writer were to start
writing a lot of data (function tracer), it will start overflowing the ring buffer
and overwrite the older data. But because the writer never goes back onto the
reader page, the data left on the reader page never gets overwritten. This
causes the reader to see really old data, followed by a jump to newer data.

Link: http://lkml.kernel.org/r/1340060577-9112-1-git-send-email-dhsharp@google.com
Google-Bug-Id: 6410455
Reported-by: David Sharp <dhsharp@google.com>
tested-by: David Sharp <dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ring_buffer.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 1d0f6a8..82a3e0c 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3239,6 +3239,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 	if (cpu_buffer->commit_page == cpu_buffer->reader_page)
 		goto out;
 
+	/* Don't bother swapping if the ring buffer is empty */
+	if (rb_num_of_entries(cpu_buffer) == 0)
+		goto out;
+
 	/*
 	 * Reset the reader page to size zero.
 	 */

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [tip:perf/urgent] ring-buffer: Fix uninitialized read_stamp
  2012-06-18 23:02   ` David Sharp
                       ` (2 preceding siblings ...)
  2012-07-06 10:49     ` [tip:perf/core] ring-buffer: Fix " tip-bot for Steven Rostedt
@ 2012-07-06 11:30     ` tip-bot for Steven Rostedt
  3 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Steven Rostedt @ 2012-07-06 11:30 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, rostedt, dhsharp, srostedt, tglx

Commit-ID:  01c4359c155e2613cb29363865804c2b11092d6f
Gitweb:     http://git.kernel.org/tip/01c4359c155e2613cb29363865804c2b11092d6f
Author:     Steven Rostedt <srostedt@redhat.com>
AuthorDate: Thu, 28 Jun 2012 13:35:04 -0400
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 6 Jul 2012 11:13:22 +0200

ring-buffer: Fix uninitialized read_stamp

The ring buffer reader page is used to swap a page from the writable
ring buffer. If the writer happens to be on that page, it ends up on the
reader page, but will simply move off of it, back into the writable ring
buffer as writes are added.

The time stamp passed back to the readers is stored in the cpu_buffer per
CPU descriptor. This stamp is updated when a swap of the reader page takes
place, and it reads the current stamp from the page taken from the writable
ring buffer. Everytime a writer goes to a new page, it updates the time stamp
of that page.

The problem happens if a reader reads a page from an empty per CPU ring buffer.
If the buffer is empty, the swap still takes place, placing the writer at the
start of the reader page. If at a later time, a write happens, it updates the
page's time stamp and continues. But the problem is that the read_stamp does
not get updated, because the page was already swapped.

The solution to this was to not swap the page if the ring buffer happens to
be empty. This also removes the side effect that the writes on the reader
page will not get updated because the writer never gets back on the reader
page without a swap. That is, if a read happens on an empty buffer, but then
no reads happen for a while. If a swap took place, and the writer were to start
writing a lot of data (function tracer), it will start overflowing the ring buffer
and overwrite the older data. But because the writer never goes back onto the
reader page, the data left on the reader page never gets overwritten. This
causes the reader to see really old data, followed by a jump to newer data.

Link: http://lkml.kernel.org/r/1340060577-9112-1-git-send-email-dhsharp@google.com
Google-Bug-Id: 6410455
Reported-by: David Sharp <dhsharp@google.com>
tested-by: David Sharp <dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/trace/ring_buffer.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 1d0f6a8..82a3e0c 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3239,6 +3239,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 	if (cpu_buffer->commit_page == cpu_buffer->reader_page)
 		goto out;
 
+	/* Don't bother swapping if the ring buffer is empty */
+	if (rb_num_of_entries(cpu_buffer) == 0)
+		goto out;
+
 	/*
 	 * Reset the reader page to size zero.
 	 */

^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-07-06 11:31 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-07 23:27 [PATCH] ring-buffer: fix uninitialized read_stamp David Sharp
2012-06-07 23:34 ` David Sharp
2012-06-18 23:02   ` David Sharp
2012-06-21 14:52     ` Steven Rostedt
2012-06-21 15:46     ` Steven Rostedt
2012-06-21 15:56       ` Steven Rostedt
2012-06-22 20:50       ` David Sharp
2012-06-22 23:31         ` Steven Rostedt
2012-06-23  1:27           ` David Sharp
2012-06-27  0:35             ` David Sharp
2012-06-27 12:46               ` Steven Rostedt
2012-06-27 18:00                 ` David Sharp
2012-06-27 18:05                   ` Steven Rostedt
2012-07-06 10:49     ` [tip:perf/core] ring-buffer: Fix " tip-bot for Steven Rostedt
2012-07-06 11:30     ` [tip:perf/urgent] " tip-bot for Steven Rostedt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.