linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vaibhav Nagarnaik <vnagarnaik@google.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@redhat.com>, Michael Rubin <mrubin@google.com>,
	David Sharp <dhsharp@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/5] trace: Make removal of ring buffer pages atomic
Date: Fri, 29 Jul 2011 16:30:20 -0700	[thread overview]
Message-ID: <CAL26m8K8_55r_KZ9hc1ZPxvFTnnRzmjB0RH62OUB=Be0FjpcjQ@mail.gmail.com> (raw)
In-Reply-To: <1311974602.21143.92.camel@gandalf.stny.rr.com>

On Fri, Jul 29, 2011 at 2:23 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 2011-07-26 at 15:59 -0700, Vaibhav Nagarnaik wrote:
>
>> +
>> +             } else if (((unsigned long)to_remove & ~RB_PAGE_HEAD) ==
>> +                                     (unsigned long)to_remove) {
>> +
>> +                     /* not a head page, just update the next pointer */
>> +                     ret = cmpxchg(&tail_page->next, to_remove, next_page);
>
> This is not, it wont work.
>
> You can *only* remove the HEAD from the ring buffer without causing
> issues.
>
> As you probably know, the trick is done with the list pointers. We or
> the pointer with 1 for head, and the writer will or it with 2 when it
> updates the page.
>
> This only works if we have a 1 or 2 here. Now if we try to do what you
> suggest, by starting with a 0, and ending with 0, we may fail. Between
> the  to_remove = tail_page->next and the cmpxchg(), the writer could
> easily move to the tail page, and you would never know it.
>
> Now we just removed the tail page with no idea that the write is on it.
> The writer could have also moved on to the next page, and we just
> removed the most recently recorded data.
>
> The only way to really make this work is to always get it from the HEAD
> page. If there's data there, we could just store it separately, so that
> the read_page can read from it first. We will still need to be careful
> with the writer on the page. But I think this is doable.
>
> That is, read the pages from head, if there's no data on it, simply
> remove the pages. If there is data, we store it off later. If the writer
> happens to be on the page, we will can check that. We could even
> continue to get pages, because we will be moving the header page with
> the cmpxchg, and the writer does that too. It will take some serious
> thought, but it is possible to do this.
>
> -- Steve

There should only be IRQs and NMIs that preempt this operation since
the removal operation of a cpu ring buffer is scheduled on keventd of
the same CPU. But you're right there is a race between reading the
to_remove pointer and cmpxchg() operation.

While we are trying to remove the head page, the writer could move to
the head page. Additionally, we will be adding complexity to manage data
from all the removed pages for read_page.

I discussed with David and here are some ways we thought to address
this:
1. After the cmpxchg(), if we see that the tail page has moved to
   to_remove page, then revert the cmpxchg() operation and try with the
   next page. This might add some more complexity and doesn't work with
   an interrupt storm coming in.
2. Disable/enable IRQs while removing pages. This won't stop traced NMIs
   though and we are now affecting the system behavior.
3. David didn't like this, but we could increment
   cpu_buffer->record_disabled to prevent writer from moving any pages
   for the duration of this process. If we combine this with disabling
   preemption, we would be losing traces from an IRQ/NMI context, but we
   would be safe from races while this operation is going on.

The reason we want to remove the pages after tail is to give priority to
empty pages first before touching any data pages. Also according to your
suggestion, I am not sure how to manage the data pages once they are
removed, since they cannot be freed and the reader might not be present
which will make the pages stay resident, a form of memory leak.



Vaibhav Nagarnaik

  reply	other threads:[~2011-07-29 23:30 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-26 22:59 [PATCH 0/5] Add dynamic updates to trace ring buffer Vaibhav Nagarnaik
2011-07-26 22:59 ` [PATCH 1/5] trace: Add a new readonly entry to report total buffer size Vaibhav Nagarnaik
2011-07-29 18:01   ` Steven Rostedt
2011-07-29 19:09     ` Vaibhav Nagarnaik
2011-07-26 22:59 ` [PATCH 2/5] trace: Add ring buffer stats to measure rate of events Vaibhav Nagarnaik
2011-07-29 18:10   ` Steven Rostedt
2011-07-29 19:10     ` Vaibhav Nagarnaik
2011-07-26 22:59 ` [PATCH 3/5] trace: Add per_cpu ring buffer control files Vaibhav Nagarnaik
2011-07-29 18:14   ` Steven Rostedt
2011-07-29 19:13     ` Vaibhav Nagarnaik
2011-07-29 21:25       ` Steven Rostedt
2011-07-26 22:59 ` [PATCH 4/5] trace: Make removal of ring buffer pages atomic Vaibhav Nagarnaik
2011-07-29 21:23   ` Steven Rostedt
2011-07-29 23:30     ` Vaibhav Nagarnaik [this message]
2011-07-30  1:12       ` Steven Rostedt
2011-07-30  1:50         ` David Sharp
2011-07-30  2:43           ` Steven Rostedt
2011-07-30  3:44             ` David Sharp
2011-07-26 22:59 ` [PATCH 5/5] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 0/5] Add dynamic updates to trace ring buffer Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 1/5] trace: Add a new readonly entry to report total buffer size Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 2/5] trace: Add ring buffer stats to measure rate of events Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 3/5] trace: Add per_cpu ring buffer control files Vaibhav Nagarnaik
2011-08-22 20:29   ` Steven Rostedt
2011-08-22 20:36     ` Vaibhav Nagarnaik
2011-08-22 22:09   ` [PATCH v3] " Vaibhav Nagarnaik
2011-08-23  0:49     ` Steven Rostedt
2011-08-23  1:16       ` Vaibhav Nagarnaik
2011-08-23  1:17   ` Vaibhav Nagarnaik
2011-09-03  2:45     ` Steven Rostedt
2011-09-06 18:56       ` Vaibhav Nagarnaik
2011-09-07 17:13         ` Steven Rostedt
2011-10-12  1:20     ` [PATCH v4 1/4] " Vaibhav Nagarnaik
2012-01-31 23:53       ` Vaibhav Nagarnaik
2012-02-02  2:42         ` Steven Rostedt
2012-02-02 19:20           ` Vaibhav Nagarnaik
2012-02-02 20:00       ` [PATCH v5 " Vaibhav Nagarnaik
2012-02-02 20:00         ` [PATCH v5 2/4] trace: Make removal of ring buffer pages atomic Vaibhav Nagarnaik
2012-04-21  4:27           ` Steven Rostedt
2012-04-23 17:31             ` Vaibhav Nagarnaik
2012-04-25 21:18           ` [PATCH v6 1/3] " Vaibhav Nagarnaik
2012-04-25 21:18             ` [PATCH v6 2/3] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2012-04-25 21:18             ` [PATCH v6 3/3] trace: change CPU ring buffer state from tracing_cpumask Vaibhav Nagarnaik
2012-05-03  1:55             ` [PATCH v6 1/3] trace: Make removal of ring buffer pages atomic Steven Rostedt
2012-05-03  6:40               ` Vaibhav Nagarnaik
2012-05-03 12:57                 ` Steven Rostedt
2012-05-03 14:12                   ` Steven Rostedt
2012-05-03 18:43                     ` Vaibhav Nagarnaik
2012-05-03 18:54                       ` Steven Rostedt
2012-05-03 18:54                         ` Vaibhav Nagarnaik
2012-05-04  1:59             ` [PATCH v7 " Vaibhav Nagarnaik
2012-05-04  1:59               ` [PATCH v7 2/3] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2012-05-19 10:18                 ` [tip:perf/core] ring-buffer: " tip-bot for Vaibhav Nagarnaik
2012-05-04  1:59               ` [PATCH v7 3/3] trace: change CPU ring buffer state from tracing_cpumask Vaibhav Nagarnaik
2012-05-19 10:21                 ` [tip:perf/core] tracing: " tip-bot for Vaibhav Nagarnaik
2012-05-07 20:22               ` [PATCH v7 1/3] trace: Make removal of ring buffer pages atomic Steven Rostedt
2012-05-07 21:48                 ` Vaibhav Nagarnaik
2012-05-08  0:14                   ` Steven Rostedt
2012-05-09  3:38               ` Steven Rostedt
2012-05-09  5:00                 ` Vaibhav Nagarnaik
2012-05-09 14:29                   ` Steven Rostedt
2012-05-09 17:46                     ` Vaibhav Nagarnaik
2012-05-09 17:54                       ` Steven Rostedt
2012-05-19 10:17               ` [tip:perf/core] ring-buffer: " tip-bot for Vaibhav Nagarnaik
2012-02-02 20:00         ` [PATCH v5 3/4] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2012-02-02 20:00         ` [PATCH v5 4/4] trace: change CPU ring buffer state from tracing_cpumask Vaibhav Nagarnaik
2012-03-08 23:51         ` [PATCH v5 1/4] trace: Add per_cpu ring buffer control files Vaibhav Nagarnaik
2012-05-02 21:03         ` [tip:perf/core] ring-buffer: " tip-bot for Vaibhav Nagarnaik
2011-10-12  1:20     ` [PATCH v4 2/4] trace: Make removal of ring buffer pages atomic Vaibhav Nagarnaik
2011-10-12  1:20     ` [PATCH v4 3/4] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2011-10-12  1:20     ` [PATCH v4 4/4] trace: change CPU ring buffer state from tracing_cpumask Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 4/5] trace: Make removal of ring buffer pages atomic Vaibhav Nagarnaik
2011-08-23  3:27   ` Steven Rostedt
2011-08-23 18:55     ` Vaibhav Nagarnaik
2011-08-23 18:55   ` [PATCH v3 " Vaibhav Nagarnaik
2011-08-23 19:16     ` David Sharp
2011-08-23 19:20       ` Vaibhav Nagarnaik
2011-08-23 19:24       ` Steven Rostedt
2011-08-23 18:55   ` [PATCH v3 5/5] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 " Vaibhav Nagarnaik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAL26m8K8_55r_KZ9hc1ZPxvFTnnRzmjB0RH62OUB=Be0FjpcjQ@mail.gmail.com' \
    --to=vnagarnaik@google.com \
    --cc=dhsharp@google.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mrubin@google.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).