All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>
Subject: [PATCH 3.10 38/71] ring-buffer: Always reset iterator to reader page
Date: Mon, 15 Sep 2014 12:26:36 -0700	[thread overview]
Message-ID: <20140915192639.964965676@linuxfoundation.org> (raw)
In-Reply-To: <20140915192638.702282534@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

commit 651e22f2701b4113989237c3048d17337dd2185c upstream.

When performing a consuming read, the ring buffer swaps out a
page from the ring buffer with a empty page and this page that
was swapped out becomes the new reader page. The reader page
is owned by the reader and since it was swapped out of the ring
buffer, writers do not have access to it (there's an exception
to that rule, but it's out of scope for this commit).

When reading the "trace" file, it is a non consuming read, which
means that the data in the ring buffer will not be modified.
When the trace file is opened, a ring buffer iterator is allocated
and writes to the ring buffer are disabled, such that the iterator
will not have issues iterating over the data.

Although the ring buffer disabled writes, it does not disable other
reads, or even consuming reads. If a consuming read happens, then
the iterator is reset and starts reading from the beginning again.

My tests would sometimes trigger this bug on my i386 box:

WARNING: CPU: 0 PID: 5175 at kernel/trace/trace.c:1527 __trace_find_cmdline+0x66/0xaa()
Modules linked in:
CPU: 0 PID: 5175 Comm: grep Not tainted 3.16.0-rc3-test+ #8
Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
 00000000 00000000 f09c9e1c c18796b3 c1b5d74c f09c9e4c c103a0e3 c1b5154b
 f09c9e78 00001437 c1b5d74c 000005f7 c10bd85a c10bd85a c1cac57c f09c9eb0
 ed0e0000 f09c9e64 c103a185 00000009 f09c9e5c c1b5154b f09c9e78 f09c9e80^M
Call Trace:
 [<c18796b3>] dump_stack+0x4b/0x75
 [<c103a0e3>] warn_slowpath_common+0x7e/0x95
 [<c10bd85a>] ? __trace_find_cmdline+0x66/0xaa
 [<c10bd85a>] ? __trace_find_cmdline+0x66/0xaa
 [<c103a185>] warn_slowpath_fmt+0x33/0x35
 [<c10bd85a>] __trace_find_cmdline+0x66/0xaa^M
 [<c10bed04>] trace_find_cmdline+0x40/0x64
 [<c10c3c16>] trace_print_context+0x27/0xec
 [<c10c4360>] ? trace_seq_printf+0x37/0x5b
 [<c10c0b15>] print_trace_line+0x319/0x39b
 [<c10ba3fb>] ? ring_buffer_read+0x47/0x50
 [<c10c13b1>] s_show+0x192/0x1ab
 [<c10bfd9a>] ? s_next+0x5a/0x7c
 [<c112e76e>] seq_read+0x267/0x34c
 [<c1115a25>] vfs_read+0x8c/0xef
 [<c112e507>] ? seq_lseek+0x154/0x154
 [<c1115ba2>] SyS_read+0x54/0x7f
 [<c188488e>] syscall_call+0x7/0xb
---[ end trace 3f507febd6b4cc83 ]---
>>>> ##### CPU 1 buffer started ####

Which was the __trace_find_cmdline() function complaining about the pid
in the event record being negative.

After adding more test cases, this would trigger more often. Strangely
enough, it would never trigger on a single test, but instead would trigger
only when running all the tests. I believe that was the case because it
required one of the tests to be shutting down via delayed instances while
a new test started up.

After spending several days debugging this, I found that it was caused by
the iterator becoming corrupted. Debugging further, I found out why
the iterator became corrupted. It happened with the rb_iter_reset().

As consuming reads may not read the full reader page, and only part
of it, there's a "read" field to know where the last read took place.
The iterator, must also start at the read position. In the rb_iter_reset()
code, if the reader page was disconnected from the ring buffer, the iterator
would start at the head page within the ring buffer (where writes still
happen). But the mistake there was that it still used the "read" field
to start the iterator on the head page, where it should always start
at zero because readers never read from within the ring buffer where
writes occur.

I originally wrote a patch to have it set the iter->head to 0 instead
of iter->head_page->read, but then I questioned why it wasn't always
setting the iter to point to the reader page, as the reader page is
still valid.  The list_empty(reader_page->list) just means that it was
successful in swapping out. But the reader_page may still have data.

There was a bug report a long time ago that was not reproducible that
had something about trace_pipe (consuming read) not matching trace
(iterator read). This may explain why that happened.

Anyway, the correct answer to this bug is to always use the reader page
an not reset the iterator to inside the writable ring buffer.

Fixes: d769041f8653 "ring_buffer: implement new locking"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/trace/ring_buffer.c |   17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3353,21 +3353,16 @@ static void rb_iter_reset(struct ring_bu
 	struct ring_buffer_per_cpu *cpu_buffer = iter->cpu_buffer;
 
 	/* Iterator usage is expected to have record disabled */
-	if (list_empty(&cpu_buffer->reader_page->list)) {
-		iter->head_page = rb_set_head_page(cpu_buffer);
-		if (unlikely(!iter->head_page))
-			return;
-		iter->head = iter->head_page->read;
-	} else {
-		iter->head_page = cpu_buffer->reader_page;
-		iter->head = cpu_buffer->reader_page->read;
-	}
+	iter->head_page = cpu_buffer->reader_page;
+	iter->head = cpu_buffer->reader_page->read;
+
+	iter->cache_reader_page = iter->head_page;
+	iter->cache_read = iter->head;
+
 	if (iter->head)
 		iter->read_stamp = cpu_buffer->read_stamp;
 	else
 		iter->read_stamp = iter->head_page->page->time_stamp;
-	iter->cache_reader_page = cpu_buffer->reader_page;
-	iter->cache_read = cpu_buffer->read;
 }
 
 /**



  parent reply	other threads:[~2014-09-15 20:03 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-15 19:25 [PATCH 3.10 00/71] 3.10.55-stable review Greg Kroah-Hartman
2014-09-15 19:25 ` [PATCH 3.10 01/71] media: xc5000: Fix get_frequency() Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 02/71] media: xc4000: " Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 03/71] media: au0828: Only alt setting logic when needed Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 05/71] iommu/amd: Fix cleanup_domain for mass device removal Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 06/71] spi: orion: fix incorrect handling of cell-index DT property Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 07/71] spi: omap2-mcspi: Configure hardware when slave driver changes mode Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 08/71] firmware: Do not use WARN_ON(!spin_is_locked()) Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 09/71] tpm: missing tpm_chip_put in tpm_get_random() Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 10/71] CAPABILITIES: remove undefined caps from all processes Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 11/71] kernel/smp.c:on_each_cpu_cond(): fix warning in fallback path Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 12/71] mfd: omap-usb-host: Fix improper mask use Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 13/71] regulator: arizona-ldo1: remove bypass functionality Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 14/71] powerpc/mm/numa: Fix break placement Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 15/71] powerpc/mm: Use read barrier when creating real_pte Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 16/71] powerpc/pseries: Failure on removing device node Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 17/71] Drivers: scsi: storvsc: Implement a eh_timed_out handler Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 18/71] drivers: scsi: storvsc: Correctly handle TEST_UNIT_READY failure Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 19/71] MIPS: GIC: Prevent array overrun Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 20/71] MIPS: Prevent user from setting FCSR cause bits Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 21/71] MIPS: tlbex: Fix a missing statement for HUGETLB Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 22/71] MIPS: Remove BUG_ON(!is_fpu_owner()) in do_ade() Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 23/71] MIPS: asm/reg.h: Make 32- and 64-bit definitions available at the same time Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 24/71] MIPS: Cleanup flags in syscall flags handlers Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 25/71] MIPS: asm: thread_info: Add _TIF_SECCOMP flag Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 26/71] MIPS: OCTEON: make get_system_type() thread-safe Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 27/71] MIPS: Fix accessing to per-cpu data when flushing the cache Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 28/71] openrisc: Rework signal handling Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 29/71] ASoC: pcm: fix dpcm_path_put in dpcm runtime update Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 30/71] ASoC: wm_adsp: Add missing MODULE_LICENSE Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 31/71] ASoC: samsung: Correct I2S DAI suspend/resume ops Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 32/71] ASoC: max98090: Fix missing free_irq Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 33/71] ASoC: pxa-ssp: drop SNDRV_PCM_FMTBIT_S24_LE Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 34/71] bfa: Fix undefined bit shift on big-endian architectures with 32-bit DMA address Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 35/71] ACPICA: Utilities: Fix memory leak in acpi_ut_copy_iobject_to_iobject Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 36/71] ACPI: Run fixed event device notifications in process context Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 37/71] ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock Greg Kroah-Hartman
2014-09-15 19:26 ` Greg Kroah-Hartman [this message]
2014-09-15 19:26 ` [PATCH 3.10 39/71] ring-buffer: Up rb_iter_peek() loop count to 3 Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 40/71] mnt: Only change user settable mount flags in remount Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 41/71] mnt: Move the test for MNT_LOCK_READONLY from change_mount_flags into do_remount Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 42/71] mnt: Correct permission checks in do_remount Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 43/71] mnt: Change the default remount atime from relatime to the existing value Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 44/71] mnt: Add tests for unprivileged remount cases that have found to be faulty Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 45/71] Bluetooth: never linger on process exit Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 46/71] Bluetooth: Avoid use of session socket after the session gets freed Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 47/71] md/raid6: avoid data corruption during recovery of double-degraded RAID6 Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 48/71] md/raid10: fix memory leak when reshaping a RAID10 Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 49/71] md/raid10: Fix memory leak when raid10 reshape completes Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 50/71] RDMA/iwcm: Use a default listen backlog if needed Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 51/71] xfs: quotacheck leaves dquot buffers without verifiers Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 52/71] xfs: dont dirty buffers beyond EOF Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 53/71] xfs: dont zero partial page cache pages during O_DIRECT writes Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 54/71] xfs: dont zero partial page cache pages during O_DIRECT write Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 55/71] md/raid1,raid10: always abort recover on write error Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 56/71] libceph: set last_piece in ceph_msg_data_pages_cursor_init() correctly Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 57/71] libceph: add process_one_ticket() helper Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 58/71] libceph: do not hard code max auth ticket len Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 59/71] CIFS: Fix STATUS_CANNOT_DELETE error mapping for SMB2 Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 60/71] CIFS: Fix async reading on reconnects Greg Kroah-Hartman
2014-09-15 19:26 ` [PATCH 3.10 61/71] CIFS: Possible null ptr deref in SMB2_tcon Greg Kroah-Hartman
2014-09-15 19:27 ` [PATCH 3.10 62/71] CIFS: Fix wrong directory attributes after rename Greg Kroah-Hartman
2014-09-15 19:27 ` [PATCH 3.10 63/71] CIFS: Fix wrong filename length for SMB2 Greg Kroah-Hartman
2014-09-15 19:27 ` [PATCH 3.10 64/71] CIFS: Fix wrong restart readdir for SMB1 Greg Kroah-Hartman
2014-09-15 19:27 ` [PATCH 3.10 65/71] mtd/ftl: fix the double free of the buffers allocated in build_maps() Greg Kroah-Hartman
2014-09-15 19:27 ` [PATCH 3.10 66/71] mtd: nand: omap: Fix 1-bit Hamming code scheme, omap_calculate_ecc() Greg Kroah-Hartman
2014-09-15 19:27 ` [PATCH 3.10 67/71] blkcg: dont call into policy draining if root_blkg is already gone Greg Kroah-Hartman
2014-09-15 19:27 ` [PATCH 3.10 68/71] IB/srp: Fix deadlock between host removal and multipathd Greg Kroah-Hartman
2014-09-15 19:27 ` [PATCH 3.10 69/71] dcache.c: get rid of pointless macros Greg Kroah-Hartman
2014-09-15 19:27 ` [PATCH 3.10 70/71] vfs: fix bad hashing of dentries Greg Kroah-Hartman
2014-09-15 19:27 ` [PATCH 3.10 71/71] tpm: Provide a generic means to override the chip returned timeouts Greg Kroah-Hartman
2014-09-16  1:53 ` [PATCH 3.10 00/71] 3.10.55-stable review Guenter Roeck
2014-09-16 18:04   ` Greg Kroah-Hartman
2014-09-16 18:42 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140915192639.964965676@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.