linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Fix race in ring_buffer_consume()
@ 2008-12-29  3:59 Indan Zupancic
  2008-12-29  7:34 ` [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event Indan Zupancic
  2008-12-29 14:47 ` [PATCH] Fix race in ring_buffer_consume() Steven Rostedt
  0 siblings, 2 replies; 11+ messages in thread
From: Indan Zupancic @ 2008-12-29  3:59 UTC (permalink / raw)
  To: Steven Rostedt, Ingo Molnar; +Cc: linux-kernel

Hello,

While doing a grep -r as root somewhere in /proc, grep founds its way
into /mnt/debug/tracing and caused the below snippet:

------------[ cut here ]------------
kernel BUG at /home/indan/src/linux-2.6/kernel/trace/ring_buffer.c:1676!
invalid opcode: 0000 [#1] PREEMPT
last sysfs file: /sys/devices/virtual/backlight/thinkpad_screen/brightness
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in: i915 drm pl2303 usbserial usb_storage uhci_hcd ehci_hcd
usbcore

Pid: 10660, comm: grep Not tainted (2.6.28phc #12) 2371GHG
EIP: 0060:[<c0241464>] EFLAGS: 00010246 CPU: 0
EIP is at rb_advance_reader+0xe/0xd7
EAX: 00000000 EBX: f703c000 ECX: f717ea80 EDX: 00000003
ESI: f70081c0 EDI: f7008240 EBP: 00000fff ESP: f6ac2f3c
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process grep (pid: 10660, ti=f6ac2000 task=f689b900 task.ti=f6ac2000)
Stack:
 f703c000 f70081c0 00000000 c024171e f2de2000 f2de2000 f2de3038 c0242fc8
 00000a6a c0245105 089adff4 f2dca400 f2de2010 00005000 f2dca400 c0244f8f
 089adff4 c0263633 f6ac2fa0 f2dca400 fffffff7 00001000 f6ac2000 c02636f9
Call Trace:
 [<c024171e>] ring_buffer_consume+0x2b/0x31
 [<c0242fc8>] trace_consume+0x1d/0x23
 [<c0245105>] tracing_read_pipe+0x176/0x1dc
 [<c0244f8f>] tracing_read_pipe+0x0/0x1dc
 [<c0263633>] vfs_read+0x73/0xa1
 [<c02636f9>] sys_read+0x3c/0x63
 [<c0202ef5>] sysenter_do_call+0x12/0x2a
Code: 00 25 00 f0 ff ff e8 f9 fb ff ff 85 c0 74 05 e8 8c 61 1c 00
89 f0 5e 5f 5b 5e 5f 5d c3 57 89 c7 56 53 e8 c2 fe ff ff 85 c0 75
04 <0f> 0b eb fe 8b 47 1c 8b 70 1c 03 70 10 8a 06 24 03 3c 03 75 03
EIP: [<c0241464>] rb_advance_reader+0xe/0xd7 SS:ESP 0068:f6ac2f3c
---[ end trace 77bf081d52bacbc7 ]---
note: grep[10660] exited with preempt_count 1

I think preemptirqsoff tracing is enabled, can't check because doing
cat on most files in debug/tracing hangs. Kernel was booted with
initcall_debug. Kernel is 2.6.28.

$ zcat /proc/config.gz |grep TRACER=
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_IRQSOFF_TRACER=y
CONFIG_PREEMPT_TRACER=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_BOOT_TRACER=y

ring_buffer.c:1676:

static void rb_advance_reader(struct ring_buffer_per_cpu *cpu_buffer)
{
	struct ring_buffer_event *event;
	struct buffer_page *reader;
	unsigned length;

	reader = rb_get_reader_page(cpu_buffer);

	/* This function should not be called when buffer is empty */
	BUG_ON(!reader);

Called by ring_buffer_consume():

ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
{
	struct ring_buffer_per_cpu *cpu_buffer;
	struct ring_buffer_event *event;

	if (!cpu_isset(cpu, buffer->cpumask))
		return NULL;

	event = ring_buffer_peek(buffer, cpu, ts);
	if (!event)
		return NULL;

	cpu_buffer = buffer->buffers[cpu];
	rb_advance_reader(cpu_buffer);

ring_buffer_peek() is complicated, but boils down to:

ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
{
[...]
	cpu_buffer = buffer->buffers[cpu];
[...]
	reader = rb_get_reader_page(cpu_buffer);
	if (!reader)
		return NULL;

	event = rb_reader_event(cpu_buffer);
[...]
	return event or return NULL;
}

So between the rb_get_reader_page() call in ring_buffer_peek() and
the one in rb_advance_reader() something happened causing the event
to disappear. Also note: grep[10660] exited with preempt_count 1.
There's no locking between the ring_buffer_peek and rb_advance_reader
calls in ring_buffer_consume, so a tentative guess is that there should
be some.

Then again, the double tracing_read_pipe() in the backtrace seems weird,
so maybe something else is going on.

Easiest fix seems to merge ring_buffer_peek and ring_buffer_consume into
ring_buffer_get, which is the same as the current peek, but with an extra
parameter telling it to consume the buffer or not.

The same might be needed for ring_buffer_iter_peek() and ring_buffer_read().

Patch doing this:

commit 648cf957210619595856b78f12bd11752ae22aa5
Author: Indan Zupancic <indan@nul.nu>
Date:   Mon Dec 29 14:27:18 2008 +1100

    Fix race in ring_buffer_consume(): Replace ring_buffer_consume and
ring_buffer_peek with ring_buffer_get_event

    Signed-off-by: Indan Zupancic <indan@nul.nu>

 include/linux/ring_buffer.h   |    4 +---
 kernel/trace/ring_buffer.c    |   39 ++++++++-------------------------------
 kernel/trace/trace.c          |   15 ++++++++-------
 kernel/trace/trace_selftest.c |    2 +-
 4 files changed, 18 insertions(+), 42 deletions(-)

diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index e097c2e..d9320bd 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -83,9 +83,7 @@ int ring_buffer_write(struct ring_buffer *buffer,
 		      unsigned long length, void *data);

 struct ring_buffer_event *
-ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts);
-struct ring_buffer_event *
-ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts);
+ring_buffer_get_event(struct ring_buffer *buffer, int cpu, u64 *ts, int
consume);

 struct ring_buffer_iter *
 ring_buffer_read_start(struct ring_buffer *buffer, int cpu);
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 668bbb5..d4e5dbc 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1727,16 +1727,18 @@ static void rb_advance_iter(struct ring_buffer_iter
*iter)
 }

 /**
- * ring_buffer_peek - peek at the next event to be read
+ * ring_buffer_get_event - get the next event to be read
  * @buffer: The ring buffer to read
  * @cpu: The cpu to peak at
  * @ts: The timestamp counter of this event.
+ * @consume: Whether the event should be consumed.
+ * If true, sequential reads will keep returning a different event,
+ * and eventually empty the ring buffer if the producer is slower.
  *
- * This will return the event that will be read next, but does
- * not consume the data.
+ * This will return the event that will be read next.
  */
 struct ring_buffer_event *
-ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
+ring_buffer_get_event(struct ring_buffer *buffer, int cpu, u64 *ts, int consume)
 {
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_event *event;
@@ -1789,6 +1791,8 @@ ring_buffer_peek(struct ring_buffer *buffer, int cpu,
u64 *ts)
 			*ts = cpu_buffer->read_stamp + event->time_delta;
 			ring_buffer_normalize_time_stamp(cpu_buffer->cpu, ts);
 		}
+		if (consume)
+			rb_advance_reader(cpu_buffer);
 		return event;

 	default:
@@ -1869,33 +1873,6 @@ ring_buffer_iter_peek(struct ring_buffer_iter *iter,
u64 *ts)
 }

 /**
- * ring_buffer_consume - return an event and consume it
- * @buffer: The ring buffer to get the next event from
- *
- * Returns the next event in the ring buffer, and that event is consumed.
- * Meaning, that sequential reads will keep returning a different event,
- * and eventually empty the ring buffer if the producer is slower.
- */
-struct ring_buffer_event *
-ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
-{
-	struct ring_buffer_per_cpu *cpu_buffer;
-	struct ring_buffer_event *event;
-
-	if (!cpu_isset(cpu, buffer->cpumask))
-		return NULL;
-
-	event = ring_buffer_peek(buffer, cpu, ts);
-	if (!event)
-		return NULL;
-
-	cpu_buffer = buffer->buffers[cpu];
-	rb_advance_reader(cpu_buffer);
-
-	return event;
-}
-
-/**
  * ring_buffer_read_start - start a non consuming read of the buffer
  * @buffer: The ring buffer to read from
  * @cpu: The cpu buffer to iterate over
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d86e325..f1329ed 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -938,7 +938,7 @@ peek_next_entry(struct trace_iterator *iter, int cpu, u64
*ts)
 	if (buf_iter)
 		event = ring_buffer_iter_peek(buf_iter, ts);
 	else
-		event = ring_buffer_peek(iter->tr->buffer, cpu, ts);
+		event = ring_buffer_get_event(iter->tr->buffer, cpu, ts, 0);

 	ftrace_enable_cpu();

@@ -1002,7 +1002,7 @@ static void trace_consume(struct trace_iterator *iter)
 {
 	/* Don't allow ftrace to trace into the ring buffers */
 	ftrace_disable_cpu();
-	ring_buffer_consume(iter->tr->buffer, iter->cpu, &iter->ts);
+	ring_buffer_get_event(iter->tr->buffer, iter->cpu, &iter->ts, 1);
 	ftrace_enable_cpu();
 }

@@ -1310,8 +1310,9 @@ void trace_seq_print_cont(struct trace_seq *s, struct
trace_iterator *iter)
 	struct trace_entry *ent;
 	struct trace_field_cont *cont;
 	bool ok = true;
+	int cpu = iter->cpu;

-	ent = peek_next_entry(iter, iter->cpu, NULL);
+	ent = peek_next_entry(iter, cpu, NULL);
 	if (!ent || ent->type != TRACE_CONT) {
 		trace_seq_putc(s, '\n');
 		return;
@@ -1324,14 +1325,14 @@ void trace_seq_print_cont(struct trace_seq *s, struct
trace_iterator *iter)

 		ftrace_disable_cpu();

-		if (iter->buffer_iter[iter->cpu])
-			ring_buffer_read(iter->buffer_iter[iter->cpu], NULL);
+		if (iter->buffer_iter[cpu])
+			ring_buffer_read(iter->buffer_iter[cpu], NULL);
 		else
-			ring_buffer_consume(iter->tr->buffer, iter->cpu, NULL);
+			ring_buffer_get_event(iter->tr->buffer, cpu, NULL, 1);

 		ftrace_enable_cpu();

-		ent = peek_next_entry(iter, iter->cpu, NULL);
+		ent = peek_next_entry(iter, cpu, NULL);
 	} while (ent && ent->type == TRACE_CONT);

 	if (!ok)
diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 90bc752..3894989 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -23,7 +23,7 @@ static int trace_test_buffer_cpu(struct trace_array *tr, int
cpu)
 	struct ring_buffer_event *event;
 	struct trace_entry *entry;

-	while ((event = ring_buffer_consume(tr->buffer, cpu, NULL))) {
+	while ((event = ring_buffer_get_event(tr->buffer, cpu, NULL, 1))) {
 		entry = ring_buffer_event_data(event);

 		if (!trace_valid_entry(entry)) {



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event
  2008-12-29  3:59 [PATCH] Fix race in ring_buffer_consume() Indan Zupancic
@ 2008-12-29  7:34 ` Indan Zupancic
  2008-12-29 12:24   ` Ingo Molnar
  2009-01-05 15:09   ` Steven Rostedt
  2008-12-29 14:47 ` [PATCH] Fix race in ring_buffer_consume() Steven Rostedt
  1 sibling, 2 replies; 11+ messages in thread
From: Indan Zupancic @ 2008-12-29  7:34 UTC (permalink / raw)
  To: Steven Rostedt, Ingo Molnar; +Cc: linux-kernel, Indan Zupancic

Original mail was mangled, patch resent via git.

Signed-off-by: Indan Zupancic <indan@nul.nu>
---
 include/linux/ring_buffer.h   |    4 +---
 kernel/trace/ring_buffer.c    |   39 ++++++++-------------------------------
 kernel/trace/trace.c          |   15 ++++++++-------
 kernel/trace/trace_selftest.c |    2 +-
 4 files changed, 18 insertions(+), 42 deletions(-)

diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index e097c2e..d9320bd 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -83,9 +83,7 @@ int ring_buffer_write(struct ring_buffer *buffer,
 		      unsigned long length, void *data);
 
 struct ring_buffer_event *
-ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts);
-struct ring_buffer_event *
-ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts);
+ring_buffer_get_event(struct ring_buffer *buffer, int cpu, u64 *ts, int consume);
 
 struct ring_buffer_iter *
 ring_buffer_read_start(struct ring_buffer *buffer, int cpu);
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 668bbb5..d4e5dbc 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1727,16 +1727,18 @@ static void rb_advance_iter(struct ring_buffer_iter *iter)
 }
 
 /**
- * ring_buffer_peek - peek at the next event to be read
+ * ring_buffer_get_event - get the next event to be read
  * @buffer: The ring buffer to read
  * @cpu: The cpu to peak at
  * @ts: The timestamp counter of this event.
+ * @consume: Whether the event should be consumed.
+ * If true, sequential reads will keep returning a different event,
+ * and eventually empty the ring buffer if the producer is slower.
  *
- * This will return the event that will be read next, but does
- * not consume the data.
+ * This will return the event that will be read next.
  */
 struct ring_buffer_event *
-ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
+ring_buffer_get_event(struct ring_buffer *buffer, int cpu, u64 *ts, int consume)
 {
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_event *event;
@@ -1789,6 +1791,8 @@ ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 			*ts = cpu_buffer->read_stamp + event->time_delta;
 			ring_buffer_normalize_time_stamp(cpu_buffer->cpu, ts);
 		}
+		if (consume)
+			rb_advance_reader(cpu_buffer);
 		return event;
 
 	default:
@@ -1869,33 +1873,6 @@ ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
 }
 
 /**
- * ring_buffer_consume - return an event and consume it
- * @buffer: The ring buffer to get the next event from
- *
- * Returns the next event in the ring buffer, and that event is consumed.
- * Meaning, that sequential reads will keep returning a different event,
- * and eventually empty the ring buffer if the producer is slower.
- */
-struct ring_buffer_event *
-ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
-{
-	struct ring_buffer_per_cpu *cpu_buffer;
-	struct ring_buffer_event *event;
-
-	if (!cpu_isset(cpu, buffer->cpumask))
-		return NULL;
-
-	event = ring_buffer_peek(buffer, cpu, ts);
-	if (!event)
-		return NULL;
-
-	cpu_buffer = buffer->buffers[cpu];
-	rb_advance_reader(cpu_buffer);
-
-	return event;
-}
-
-/**
  * ring_buffer_read_start - start a non consuming read of the buffer
  * @buffer: The ring buffer to read from
  * @cpu: The cpu buffer to iterate over
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d86e325..f1329ed 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -938,7 +938,7 @@ peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts)
 	if (buf_iter)
 		event = ring_buffer_iter_peek(buf_iter, ts);
 	else
-		event = ring_buffer_peek(iter->tr->buffer, cpu, ts);
+		event = ring_buffer_get_event(iter->tr->buffer, cpu, ts, 0);
 
 	ftrace_enable_cpu();
 
@@ -1002,7 +1002,7 @@ static void trace_consume(struct trace_iterator *iter)
 {
 	/* Don't allow ftrace to trace into the ring buffers */
 	ftrace_disable_cpu();
-	ring_buffer_consume(iter->tr->buffer, iter->cpu, &iter->ts);
+	ring_buffer_get_event(iter->tr->buffer, iter->cpu, &iter->ts, 1);
 	ftrace_enable_cpu();
 }
 
@@ -1310,8 +1310,9 @@ void trace_seq_print_cont(struct trace_seq *s, struct trace_iterator *iter)
 	struct trace_entry *ent;
 	struct trace_field_cont *cont;
 	bool ok = true;
+	int cpu = iter->cpu;
 
-	ent = peek_next_entry(iter, iter->cpu, NULL);
+	ent = peek_next_entry(iter, cpu, NULL);
 	if (!ent || ent->type != TRACE_CONT) {
 		trace_seq_putc(s, '\n');
 		return;
@@ -1324,14 +1325,14 @@ void trace_seq_print_cont(struct trace_seq *s, struct trace_iterator *iter)
 
 		ftrace_disable_cpu();
 
-		if (iter->buffer_iter[iter->cpu])
-			ring_buffer_read(iter->buffer_iter[iter->cpu], NULL);
+		if (iter->buffer_iter[cpu])
+			ring_buffer_read(iter->buffer_iter[cpu], NULL);
 		else
-			ring_buffer_consume(iter->tr->buffer, iter->cpu, NULL);
+			ring_buffer_get_event(iter->tr->buffer, cpu, NULL, 1);
 
 		ftrace_enable_cpu();
 
-		ent = peek_next_entry(iter, iter->cpu, NULL);
+		ent = peek_next_entry(iter, cpu, NULL);
 	} while (ent && ent->type == TRACE_CONT);
 
 	if (!ok)
diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 90bc752..3894989 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -23,7 +23,7 @@ static int trace_test_buffer_cpu(struct trace_array *tr, int cpu)
 	struct ring_buffer_event *event;
 	struct trace_entry *entry;
 
-	while ((event = ring_buffer_consume(tr->buffer, cpu, NULL))) {
+	while ((event = ring_buffer_get_event(tr->buffer, cpu, NULL, 1))) {
 		entry = ring_buffer_event_data(event);
 
 		if (!trace_valid_entry(entry)) {
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event
  2008-12-29  7:34 ` [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event Indan Zupancic
@ 2008-12-29 12:24   ` Ingo Molnar
  2009-01-01  5:10     ` Indan Zupancic
  2009-01-05 15:09   ` Steven Rostedt
  1 sibling, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2008-12-29 12:24 UTC (permalink / raw)
  To: Indan Zupancic; +Cc: Steven Rostedt, Ingo Molnar, linux-kernel


* Indan Zupancic <indan@nul.nu> wrote:

> Original mail was mangled, patch resent via git.
> 
> Signed-off-by: Indan Zupancic <indan@nul.nu>
> ---
>  include/linux/ring_buffer.h   |    4 +---
>  kernel/trace/ring_buffer.c    |   39 ++++++++-------------------------------
>  kernel/trace/trace.c          |   15 ++++++++-------
>  kernel/trace/trace_selftest.c |    2 +-
>  4 files changed, 18 insertions(+), 42 deletions(-)

there's been a number of updates here - could you please do a patch 
against tip/master:

  http://people.redhat.com/mingo/tip.git/README

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix race in ring_buffer_consume()
  2008-12-29  3:59 [PATCH] Fix race in ring_buffer_consume() Indan Zupancic
  2008-12-29  7:34 ` [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event Indan Zupancic
@ 2008-12-29 14:47 ` Steven Rostedt
  1 sibling, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2008-12-29 14:47 UTC (permalink / raw)
  To: Indan Zupancic; +Cc: Ingo Molnar, linux-kernel


On Mon, 2008-12-29 at 04:59 +0100, Indan Zupancic wrote:
> Hello,
> 
> While doing a grep -r as root somewhere in /proc, grep founds its way
> into /mnt/debug/tracing and caused the below snippet:
> 
> ------------[ cut here ]------------
> kernel BUG at /home/indan/src/linux-2.6/kernel/trace/ring_buffer.c:1676!
> invalid opcode: 0000 [#1] PREEMPT
> last sysfs file: /sys/devices/virtual/backlight/thinkpad_screen/brightness
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in: i915 drm pl2303 usbserial usb_storage uhci_hcd ehci_hcd
> usbcore
> 
> Pid: 10660, comm: grep Not tainted (2.6.28phc #12) 2371GHG
> EIP: 0060:[<c0241464>] EFLAGS: 00010246 CPU: 0
> EIP is at rb_advance_reader+0xe/0xd7
> EAX: 00000000 EBX: f703c000 ECX: f717ea80 EDX: 00000003
> ESI: f70081c0 EDI: f7008240 EBP: 00000fff ESP: f6ac2f3c
>  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> Process grep (pid: 10660, ti=f6ac2000 task=f689b900 task.ti=f6ac2000)
> Stack:
>  f703c000 f70081c0 00000000 c024171e f2de2000 f2de2000 f2de3038 c0242fc8
>  00000a6a c0245105 089adff4 f2dca400 f2de2010 00005000 f2dca400 c0244f8f
>  089adff4 c0263633 f6ac2fa0 f2dca400 fffffff7 00001000 f6ac2000 c02636f9
> Call Trace:
>  [<c024171e>] ring_buffer_consume+0x2b/0x31
>  [<c0242fc8>] trace_consume+0x1d/0x23
>  [<c0245105>] tracing_read_pipe+0x176/0x1dc
>  [<c0244f8f>] tracing_read_pipe+0x0/0x1dc
>  [<c0263633>] vfs_read+0x73/0xa1
>  [<c02636f9>] sys_read+0x3c/0x63
>  [<c0202ef5>] sysenter_do_call+0x12/0x2a
> Code: 00 25 00 f0 ff ff e8 f9 fb ff ff 85 c0 74 05 e8 8c 61 1c 00
> 89 f0 5e 5f 5b 5e 5f 5d c3 57 89 c7 56 53 e8 c2 fe ff ff 85 c0 75
> 04 <0f> 0b eb fe 8b 47 1c 8b 70 1c 03 70 10 8a 06 24 03 3c 03 75 03
> EIP: [<c0241464>] rb_advance_reader+0xe/0xd7 SS:ESP 0068:f6ac2f3c
> ---[ end trace 77bf081d52bacbc7 ]---
> note: grep[10660] exited with preempt_count 1
> 
> I think preemptirqsoff tracing is enabled, can't check because doing
> cat on most files in debug/tracing hangs. Kernel was booted with
> initcall_debug. Kernel is 2.6.28.
> 
> $ zcat /proc/config.gz |grep TRACER=
> CONFIG_NOP_TRACER=y
> CONFIG_HAVE_FUNCTION_TRACER=y
> CONFIG_IRQSOFF_TRACER=y
> CONFIG_PREEMPT_TRACER=y
> CONFIG_CONTEXT_SWITCH_TRACER=y
> CONFIG_BOOT_TRACER=y
> 
> ring_buffer.c:1676:
> 
> static void rb_advance_reader(struct ring_buffer_per_cpu *cpu_buffer)
> {
> 	struct ring_buffer_event *event;
> 	struct buffer_page *reader;
> 	unsigned length;
> 
> 	reader = rb_get_reader_page(cpu_buffer);
> 
> 	/* This function should not be called when buffer is empty */
> 	BUG_ON(!reader);
> 
> Called by ring_buffer_consume():
> 
> ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
> {
> 	struct ring_buffer_per_cpu *cpu_buffer;
> 	struct ring_buffer_event *event;
> 
> 	if (!cpu_isset(cpu, buffer->cpumask))
> 		return NULL;
> 
> 	event = ring_buffer_peek(buffer, cpu, ts);
> 	if (!event)
> 		return NULL;
> 
> 	cpu_buffer = buffer->buffers[cpu];
> 	rb_advance_reader(cpu_buffer);
> 
> ring_buffer_peek() is complicated, but boils down to:
> 
> ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
> {
> [...]
> 	cpu_buffer = buffer->buffers[cpu];
> [...]
> 	reader = rb_get_reader_page(cpu_buffer);
> 	if (!reader)
> 		return NULL;
> 
> 	event = rb_reader_event(cpu_buffer);
> [...]
> 	return event or return NULL;
> }
> 
> So between the rb_get_reader_page() call in ring_buffer_peek() and
> the one in rb_advance_reader() something happened causing the event
> to disappear. Also note: grep[10660] exited with preempt_count 1.
> There's no locking between the ring_buffer_peek and rb_advance_reader
> calls in ring_buffer_consume, so a tentative guess is that there should
> be some.
> 
> Then again, the double tracing_read_pipe() in the backtrace seems weird,
> so maybe something else is going on.
> 
> Easiest fix seems to merge ring_buffer_peek and ring_buffer_consume into
> ring_buffer_get, which is the same as the current peek, but with an extra
> parameter telling it to consume the buffer or not.
> 
> The same might be needed for ring_buffer_iter_peek() and ring_buffer_read().
> 
> Patch doing this:
> 
> commit 648cf957210619595856b78f12bd11752ae22aa5
> Author: Indan Zupancic <indan@nul.nu>
> Date:   Mon Dec 29 14:27:18 2008 +1100
> 
>     Fix race in ring_buffer_consume(): Replace ring_buffer_consume and
> ring_buffer_peek with ring_buffer_get_event
> 
>     Signed-off-by: Indan Zupancic <indan@nul.nu>
> 
>  include/linux/ring_buffer.h   |    4 +---
>  kernel/trace/ring_buffer.c    |   39 ++++++++-------------------------------
>  kernel/trace/trace.c          |   15 ++++++++-------
>  kernel/trace/trace_selftest.c |    2 +-
>  4 files changed, 18 insertions(+), 42 deletions(-)

Hi Indan,

I really would like to analyze this more, but I'm on leave till
January 5th. I'll definitely take a good look at what is happening then.

Thanks,

-- Steve



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix race in ring_buffer_consume(): Replace  ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event
  2008-12-29 12:24   ` Ingo Molnar
@ 2009-01-01  5:10     ` Indan Zupancic
  2009-01-02 22:08       ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: Indan Zupancic @ 2009-01-01  5:10 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Steven Rostedt, Ingo Molnar, linux-kernel

On Mon, December 29, 2008 13:24, Ingo Molnar wrote:
>
> * Indan Zupancic <indan@nul.nu> wrote:
>
>> Original mail was mangled, patch resent via git.
>>
>> Signed-off-by: Indan Zupancic <indan@nul.nu>
>> ---
>>  include/linux/ring_buffer.h   |    4 +---
>>  kernel/trace/ring_buffer.c    |   39
>> ++++++++-------------------------------
>>  kernel/trace/trace.c          |   15 ++++++++-------
>>  kernel/trace/trace_selftest.c |    2 +-
>>  4 files changed, 18 insertions(+), 42 deletions(-)
>
> there's been a number of updates here - could you please do a patch
> against tip/master:
>
>   http://people.redhat.com/mingo/tip.git/README

Thanks for that readme, now I discovered git remote, which is not
mentioned often enough for some reason.

A lot changed indeed, most importantly the race that I hit is
fixed in tip by improved locking. Replace ring_buffer_consume()
and ring_buffer_peek() with ring_buffer_get_event() or not is
just a matter of taste now. Are you still interested in such a
patch?

(34 EXPORT_SYMBOL_GPLs in ring_buffer.c seems a bit excessive though.)

Greetings,

Indan



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event
  2009-01-01  5:10     ` Indan Zupancic
@ 2009-01-02 22:08       ` Ingo Molnar
  2009-01-03 23:55         ` Steven Rostedt
  0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2009-01-02 22:08 UTC (permalink / raw)
  To: Indan Zupancic; +Cc: Steven Rostedt, Ingo Molnar, linux-kernel


* Indan Zupancic <indan@nul.nu> wrote:

> On Mon, December 29, 2008 13:24, Ingo Molnar wrote:
> >
> > * Indan Zupancic <indan@nul.nu> wrote:
> >
> >> Original mail was mangled, patch resent via git.
> >>
> >> Signed-off-by: Indan Zupancic <indan@nul.nu>
> >> ---
> >>  include/linux/ring_buffer.h   |    4 +---
> >>  kernel/trace/ring_buffer.c    |   39
> >> ++++++++-------------------------------
> >>  kernel/trace/trace.c          |   15 ++++++++-------
> >>  kernel/trace/trace_selftest.c |    2 +-
> >>  4 files changed, 18 insertions(+), 42 deletions(-)
> >
> > there's been a number of updates here - could you please do a patch
> > against tip/master:
> >
> >   http://people.redhat.com/mingo/tip.git/README
> 
> Thanks for that readme, now I discovered git remote, which is not
> mentioned often enough for some reason.
> 
> A lot changed indeed, most importantly the race that I hit is
> fixed in tip by improved locking. Replace ring_buffer_consume()
> and ring_buffer_peek() with ring_buffer_get_event() or not is
> just a matter of taste now. Are you still interested in such a
> patch?

the code gets simpler and more readable, so why not? Steve, any 
objections?

> (34 EXPORT_SYMBOL_GPLs in ring_buffer.c seems a bit excessive though.)

agreed.

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event
  2009-01-02 22:08       ` Ingo Molnar
@ 2009-01-03 23:55         ` Steven Rostedt
  0 siblings, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2009-01-03 23:55 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Indan Zupancic, Ingo Molnar, linux-kernel


On Fri, 2009-01-02 at 23:08 +0100, Ingo Molnar wrote:
> * Indan Zupancic <indan@nul.nu> wrote:
> 
> > On Mon, December 29, 2008 13:24, Ingo Molnar wrote:
> > >
> > > * Indan Zupancic <indan@nul.nu> wrote:
> > >
> > >> Original mail was mangled, patch resent via git.
> > >>
> > >> Signed-off-by: Indan Zupancic <indan@nul.nu>
> > >> ---
> > >>  include/linux/ring_buffer.h   |    4 +---
> > >>  kernel/trace/ring_buffer.c    |   39
> > >> ++++++++-------------------------------
> > >>  kernel/trace/trace.c          |   15 ++++++++-------
> > >>  kernel/trace/trace_selftest.c |    2 +-
> > >>  4 files changed, 18 insertions(+), 42 deletions(-)
> > >
> > > there's been a number of updates here - could you please do a patch
> > > against tip/master:
> > >
> > >   http://people.redhat.com/mingo/tip.git/README
> > 
> > Thanks for that readme, now I discovered git remote, which is not
> > mentioned often enough for some reason.
> > 
> > A lot changed indeed, most importantly the race that I hit is
> > fixed in tip by improved locking. Replace ring_buffer_consume()
> > and ring_buffer_peek() with ring_buffer_get_event() or not is
> > just a matter of taste now. Are you still interested in such a
> > patch?
> 
> the code gets simpler and more readable, so why not? Steve, any 
> objections?

I'd like to look closer at this code on Monday.

Thanks,

-- Steve



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event
  2008-12-29  7:34 ` [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event Indan Zupancic
  2008-12-29 12:24   ` Ingo Molnar
@ 2009-01-05 15:09   ` Steven Rostedt
  2009-01-11  8:35     ` [PATCH] Minor cleanups in ring_buffer.c Indan Zupancic
                       ` (2 more replies)
  1 sibling, 3 replies; 11+ messages in thread
From: Steven Rostedt @ 2009-01-05 15:09 UTC (permalink / raw)
  To: Indan Zupancic; +Cc: Ingo Molnar, linux-kernel


On Mon, 2008-12-29 at 18:34 +1100, Indan Zupancic wrote:
> Original mail was mangled, patch resent via git.
> 

Indan,

I like the patch except for one thing. Could you make the
ring_buffer_get_event into a static function called rb_get_event. And
then you could have:

struct ring_buffer_event *
ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
{
	rb_get_event(buffer, cpu, ts, 0);
}

struct ring_buffer_event *
ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
{
	rb_get_event(buffer, cpu, ts, 1);
}

I just hate public functions that have flags that can get confusing.
Looking at code where I see:

	ring_buffer_get_event(buffer, cpu, ts, 0);

I find it hard to know if the ",0" is correct or should be a ",1"
instead.

This would also make the patch only need to touch ring_buffer.c and not
all the places that use it.

Thanks,

-- Steve


> Signed-off-by: Indan Zupancic <indan@nul.nu>
> ---
>  include/linux/ring_buffer.h   |    4 +---
>  kernel/trace/ring_buffer.c    |   39 ++++++++-------------------------------
>  kernel/trace/trace.c          |   15 ++++++++-------
>  kernel/trace/trace_selftest.c |    2 +-
>  4 files changed, 18 insertions(+), 42 deletions(-)
> 
> diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
> index e097c2e..d9320bd 100644
> --- a/include/linux/ring_buffer.h
> +++ b/include/linux/ring_buffer.h
> @@ -83,9 +83,7 @@ int ring_buffer_write(struct ring_buffer *buffer,
>  		      unsigned long length, void *data);
>  
>  struct ring_buffer_event *
> -ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts);
> -struct ring_buffer_event *
> -ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts);
> +ring_buffer_get_event(struct ring_buffer *buffer, int cpu, u64 *ts, int consume);
>  
>  struct ring_buffer_iter *
>  ring_buffer_read_start(struct ring_buffer *buffer, int cpu);
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 668bbb5..d4e5dbc 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -1727,16 +1727,18 @@ static void rb_advance_iter(struct ring_buffer_iter *iter)
>  }
>  
>  /**
> - * ring_buffer_peek - peek at the next event to be read
> + * ring_buffer_get_event - get the next event to be read
>   * @buffer: The ring buffer to read
>   * @cpu: The cpu to peak at
>   * @ts: The timestamp counter of this event.
> + * @consume: Whether the event should be consumed.
> + * If true, sequential reads will keep returning a different event,
> + * and eventually empty the ring buffer if the producer is slower.
>   *
> - * This will return the event that will be read next, but does
> - * not consume the data.
> + * This will return the event that will be read next.
>   */
>  struct ring_buffer_event *
> -ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
> +ring_buffer_get_event(struct ring_buffer *buffer, int cpu, u64 *ts, int consume)
>  {
>  	struct ring_buffer_per_cpu *cpu_buffer;
>  	struct ring_buffer_event *event;
> @@ -1789,6 +1791,8 @@ ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
>  			*ts = cpu_buffer->read_stamp + event->time_delta;
>  			ring_buffer_normalize_time_stamp(cpu_buffer->cpu, ts);
>  		}
> +		if (consume)
> +			rb_advance_reader(cpu_buffer);
>  		return event;
>  
>  	default:
> @@ -1869,33 +1873,6 @@ ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
>  }
>  
>  /**
> - * ring_buffer_consume - return an event and consume it
> - * @buffer: The ring buffer to get the next event from
> - *
> - * Returns the next event in the ring buffer, and that event is consumed.
> - * Meaning, that sequential reads will keep returning a different event,
> - * and eventually empty the ring buffer if the producer is slower.
> - */
> -struct ring_buffer_event *
> -ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
> -{
> -	struct ring_buffer_per_cpu *cpu_buffer;
> -	struct ring_buffer_event *event;
> -
> -	if (!cpu_isset(cpu, buffer->cpumask))
> -		return NULL;
> -
> -	event = ring_buffer_peek(buffer, cpu, ts);
> -	if (!event)
> -		return NULL;
> -
> -	cpu_buffer = buffer->buffers[cpu];
> -	rb_advance_reader(cpu_buffer);
> -
> -	return event;
> -}
> -
> -/**
>   * ring_buffer_read_start - start a non consuming read of the buffer
>   * @buffer: The ring buffer to read from
>   * @cpu: The cpu buffer to iterate over
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index d86e325..f1329ed 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -938,7 +938,7 @@ peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts)
>  	if (buf_iter)
>  		event = ring_buffer_iter_peek(buf_iter, ts);
>  	else
> -		event = ring_buffer_peek(iter->tr->buffer, cpu, ts);
> +		event = ring_buffer_get_event(iter->tr->buffer, cpu, ts, 0);
>  
>  	ftrace_enable_cpu();
>  
> @@ -1002,7 +1002,7 @@ static void trace_consume(struct trace_iterator *iter)
>  {
>  	/* Don't allow ftrace to trace into the ring buffers */
>  	ftrace_disable_cpu();
> -	ring_buffer_consume(iter->tr->buffer, iter->cpu, &iter->ts);
> +	ring_buffer_get_event(iter->tr->buffer, iter->cpu, &iter->ts, 1);
>  	ftrace_enable_cpu();
>  }
>  
> @@ -1310,8 +1310,9 @@ void trace_seq_print_cont(struct trace_seq *s, struct trace_iterator *iter)
>  	struct trace_entry *ent;
>  	struct trace_field_cont *cont;
>  	bool ok = true;
> +	int cpu = iter->cpu;
>  
> -	ent = peek_next_entry(iter, iter->cpu, NULL);
> +	ent = peek_next_entry(iter, cpu, NULL);
>  	if (!ent || ent->type != TRACE_CONT) {
>  		trace_seq_putc(s, '\n');
>  		return;
> @@ -1324,14 +1325,14 @@ void trace_seq_print_cont(struct trace_seq *s, struct trace_iterator *iter)
>  
>  		ftrace_disable_cpu();
>  
> -		if (iter->buffer_iter[iter->cpu])
> -			ring_buffer_read(iter->buffer_iter[iter->cpu], NULL);
> +		if (iter->buffer_iter[cpu])
> +			ring_buffer_read(iter->buffer_iter[cpu], NULL);
>  		else
> -			ring_buffer_consume(iter->tr->buffer, iter->cpu, NULL);
> +			ring_buffer_get_event(iter->tr->buffer, cpu, NULL, 1);
>  
>  		ftrace_enable_cpu();
>  
> -		ent = peek_next_entry(iter, iter->cpu, NULL);
> +		ent = peek_next_entry(iter, cpu, NULL);
>  	} while (ent && ent->type == TRACE_CONT);
>  
>  	if (!ok)
> diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
> index 90bc752..3894989 100644
> --- a/kernel/trace/trace_selftest.c
> +++ b/kernel/trace/trace_selftest.c
> @@ -23,7 +23,7 @@ static int trace_test_buffer_cpu(struct trace_array *tr, int cpu)
>  	struct ring_buffer_event *event;
>  	struct trace_entry *entry;
>  
> -	while ((event = ring_buffer_consume(tr->buffer, cpu, NULL))) {
> +	while ((event = ring_buffer_get_event(tr->buffer, cpu, NULL, 1))) {
>  		entry = ring_buffer_event_data(event);
>  
>  		if (!trace_valid_entry(entry)) {


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] Minor cleanups in ring_buffer.c
  2009-01-05 15:09   ` Steven Rostedt
@ 2009-01-11  8:35     ` Indan Zupancic
  2009-01-11  8:46     ` [PATCH] Minor cleanups in ring_buffer.c, try 2 Indan Zupancic
  2009-01-11  9:49     ` [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event Indan Zupancic
  2 siblings, 0 replies; 11+ messages in thread
From: Indan Zupancic @ 2009-01-11  8:35 UTC (permalink / raw)
  To: srostedt; +Cc: mingo, linux-kernel, Indan Zupancic

Signed-off-by: Indan Zupancic <indan@nul.nu>
---
 kernel/trace/ring_buffer.c |   37 +++++++++++++------------------------
 1 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 4832ffa..11560b3 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1849,7 +1849,7 @@ static void rb_advance_iter(struct ring_buffer_iter *iter)
 }
 
 static struct ring_buffer_event *
-rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
+rb_get_event(struct ring_buffer *buffer, int cpu, u64 *ts, int consume)
 {
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_event *event;
@@ -1900,6 +1900,8 @@ rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 			*ts = cpu_buffer->read_stamp + event->time_delta;
 			ring_buffer_normalize_time_stamp(cpu_buffer->cpu, ts);
 		}
+		if (consume)
+			rb_advance_reader(cpu_buffer);
 		return event;
 
 	default:
@@ -1908,10 +1910,9 @@ rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 
 	return NULL;
 }
-EXPORT_SYMBOL_GPL(ring_buffer_peek);
 
 static struct ring_buffer_event *
-rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
+rb_iter_get(struct ring_buffer_iter *iter, u64 *ts, int consume)
 {
 	struct ring_buffer *buffer;
 	struct ring_buffer_per_cpu *cpu_buffer;
@@ -1961,6 +1962,8 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
 			*ts = iter->read_stamp + event->time_delta;
 			ring_buffer_normalize_time_stamp(cpu_buffer->cpu, ts);
 		}
+		if (consume)
+			rb_advance_iter(iter);
 		return event;
 
 	default:
@@ -1988,11 +1991,12 @@ ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 	unsigned long flags;
 
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-	event = rb_buffer_peek(buffer, cpu, ts);
+	event = rb_get_event(buffer, cpu, ts, 0);
 	spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
 
 	return event;
 }
+EXPORT_SYMBOL_GPL(ring_buffer_peek);
 
 /**
  * ring_buffer_iter_peek - peek at the next event to be read
@@ -2010,7 +2014,7 @@ ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
 	unsigned long flags;
 
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-	event = rb_iter_peek(iter, ts);
+	event = rb_iter_peek(iter, ts, 0);
 	spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
 
 	return event;
@@ -2031,18 +2035,8 @@ ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
 	struct ring_buffer_event *event;
 	unsigned long flags;
 
-	if (!cpumask_test_cpu(cpu, buffer->cpumask))
-		return NULL;
-
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-
-	event = rb_buffer_peek(buffer, cpu, ts);
-	if (!event)
-		goto out;
-
-	rb_advance_reader(cpu_buffer);
-
- out:
+	event = rb_get_event(buffer, cpu, ts, 1);
 	spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
 
 	return event;
@@ -2124,12 +2118,7 @@ ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts)
 	unsigned long flags;
 
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-	event = rb_iter_peek(iter, ts);
-	if (!event)
-		goto out;
-
-	rb_advance_iter(iter);
- out:
+	event = rb_iter_peek(iter, ts, 1);
 	spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
 
 	return event;
@@ -2404,10 +2393,10 @@ int ring_buffer_read_page(struct ring_buffer *buffer,
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
 
 	/*
-	 * rb_buffer_peek will get the next ring buffer if
+	 * rb_get_event will get the next ring buffer if
 	 * the current reader page is empty.
 	 */
-	event = rb_buffer_peek(buffer, cpu, NULL);
+	event = rb_get_event(buffer, cpu, NULL, 0);
 	if (!event)
 		goto out;
 
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH] Minor cleanups in ring_buffer.c, try 2.
  2009-01-05 15:09   ` Steven Rostedt
  2009-01-11  8:35     ` [PATCH] Minor cleanups in ring_buffer.c Indan Zupancic
@ 2009-01-11  8:46     ` Indan Zupancic
  2009-01-11  9:49     ` [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event Indan Zupancic
  2 siblings, 0 replies; 11+ messages in thread
From: Indan Zupancic @ 2009-01-11  8:46 UTC (permalink / raw)
  To: srostedt; +Cc: mingo, linux-kernel, Indan Zupancic

Signed-off-by: Indan Zupancic <indan@nul.nu>
---
 kernel/trace/ring_buffer.c |   37 +++++++++++++------------------------
 1 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 4832ffa..875fdce 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1849,7 +1849,7 @@ static void rb_advance_iter(struct ring_buffer_iter *iter)
 }
 
 static struct ring_buffer_event *
-rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
+rb_get_event(struct ring_buffer *buffer, int cpu, u64 *ts, int consume)
 {
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_event *event;
@@ -1900,6 +1900,8 @@ rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 			*ts = cpu_buffer->read_stamp + event->time_delta;
 			ring_buffer_normalize_time_stamp(cpu_buffer->cpu, ts);
 		}
+		if (consume)
+			rb_advance_reader(cpu_buffer);
 		return event;
 
 	default:
@@ -1908,10 +1910,9 @@ rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 
 	return NULL;
 }
-EXPORT_SYMBOL_GPL(ring_buffer_peek);
 
 static struct ring_buffer_event *
-rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
+rb_iter_get(struct ring_buffer_iter *iter, u64 *ts, int consume)
 {
 	struct ring_buffer *buffer;
 	struct ring_buffer_per_cpu *cpu_buffer;
@@ -1961,6 +1962,8 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
 			*ts = iter->read_stamp + event->time_delta;
 			ring_buffer_normalize_time_stamp(cpu_buffer->cpu, ts);
 		}
+		if (consume)
+			rb_advance_iter(iter);
 		return event;
 
 	default:
@@ -1988,11 +1991,12 @@ ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 	unsigned long flags;
 
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-	event = rb_buffer_peek(buffer, cpu, ts);
+	event = rb_get_event(buffer, cpu, ts, 0);
 	spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
 
 	return event;
 }
+EXPORT_SYMBOL_GPL(ring_buffer_peek);
 
 /**
  * ring_buffer_iter_peek - peek at the next event to be read
@@ -2010,7 +2014,7 @@ ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
 	unsigned long flags;
 
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-	event = rb_iter_peek(iter, ts);
+	event = rb_iter_get(iter, ts, 0);
 	spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
 
 	return event;
@@ -2031,18 +2035,8 @@ ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
 	struct ring_buffer_event *event;
 	unsigned long flags;
 
-	if (!cpumask_test_cpu(cpu, buffer->cpumask))
-		return NULL;
-
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-
-	event = rb_buffer_peek(buffer, cpu, ts);
-	if (!event)
-		goto out;
-
-	rb_advance_reader(cpu_buffer);
-
- out:
+	event = rb_get_event(buffer, cpu, ts, 1);
 	spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
 
 	return event;
@@ -2124,12 +2118,7 @@ ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts)
 	unsigned long flags;
 
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-	event = rb_iter_peek(iter, ts);
-	if (!event)
-		goto out;
-
-	rb_advance_iter(iter);
- out:
+	event = rb_iter_get(iter, ts, 1);
 	spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
 
 	return event;
@@ -2404,10 +2393,10 @@ int ring_buffer_read_page(struct ring_buffer *buffer,
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
 
 	/*
-	 * rb_buffer_peek will get the next ring buffer if
+	 * rb_get_event will get the next ring buffer if
 	 * the current reader page is empty.
 	 */
-	event = rb_buffer_peek(buffer, cpu, NULL);
+	event = rb_get_event(buffer, cpu, NULL, 0);
 	if (!event)
 		goto out;
 
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix race in ring_buffer_consume(): Replace  ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event
  2009-01-05 15:09   ` Steven Rostedt
  2009-01-11  8:35     ` [PATCH] Minor cleanups in ring_buffer.c Indan Zupancic
  2009-01-11  8:46     ` [PATCH] Minor cleanups in ring_buffer.c, try 2 Indan Zupancic
@ 2009-01-11  9:49     ` Indan Zupancic
  2 siblings, 0 replies; 11+ messages in thread
From: Indan Zupancic @ 2009-01-11  9:49 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, linux-kernel

On Mon, January 5, 2009 16:09, Steven Rostedt wrote:
> I just hate public functions that have flags that can get confusing.
> Looking at code where I see:
>
> 	ring_buffer_get_event(buffer, cpu, ts, 0);
>
> I find it hard to know if the ",0" is correct or should be a ",1"
> instead.

That's what I didn't like about the original patch either.
Sent a new patch against tip doing the above, as well as doing
the same for rb_iter_peek (ignore the first one, see try 2).

Slightly different topic, there are quite a few unused functions,
e.g. ring_buffer_read_page(). Should those be removed until they
are actually used, or are there things in the pipeline that will
use them?

I'm not sure if rb_event_length() is safe as it is, returning -1
for RINGBUF_TYPE_PADDING, for an unsigned return value. That
easily overflows into a low value in e.g. rb_advance_iter().
Wouldn't it be better to return the maximum possible size?

rb_advance_reader() calls rb_get_reader_page() and rb_reader_event(),
but rb_get_event(), which is the only caller of rb_advance_reader,
does that as well, which seems a bit more of the same.

Further, rb_get_reader_page() takes cpu_buffer->lock, but all callers
of rb_get_event() take the cpu_buffer->reader_lock. As rb_get_event()
always calls rb_get_reader_page(), both locks will be taken for each
of those calls. Doesn't that make the cpu_buffer->lock redundant?
Getting rid of it and moving the locking into rb_get_event() would
simplify things (except that ring_buffer_read_page() is a bit funny).

Greetings,

Indan



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2009-01-11  9:49 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-12-29  3:59 [PATCH] Fix race in ring_buffer_consume() Indan Zupancic
2008-12-29  7:34 ` [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event Indan Zupancic
2008-12-29 12:24   ` Ingo Molnar
2009-01-01  5:10     ` Indan Zupancic
2009-01-02 22:08       ` Ingo Molnar
2009-01-03 23:55         ` Steven Rostedt
2009-01-05 15:09   ` Steven Rostedt
2009-01-11  8:35     ` [PATCH] Minor cleanups in ring_buffer.c Indan Zupancic
2009-01-11  8:46     ` [PATCH] Minor cleanups in ring_buffer.c, try 2 Indan Zupancic
2009-01-11  9:49     ` [PATCH] Fix race in ring_buffer_consume(): Replace ring_buffer_consume and ring_buffer_peek with ring_buffer_get_event Indan Zupancic
2008-12-29 14:47 ` [PATCH] Fix race in ring_buffer_consume() Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).