linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Sharp <dhsharp@google.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@redhat.com>, Michael Rubin <mrubin@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/5] trace: Make removal of ring buffer pages atomic
Date: Fri, 29 Jul 2011 18:50:15 -0700	[thread overview]
Message-ID: <CAJL_ektBTtVp==CbjLMaeO3Tz24=KmE50Qy+St0yc9dvYzuQFw@mail.gmail.com> (raw)
In-Reply-To: <1311988337.21143.107.camel@gandalf.stny.rr.com>

On Fri, Jul 29, 2011 at 6:12 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 2011-07-29 at 16:30 -0700, Vaibhav Nagarnaik wrote:
>> On Fri, Jul 29, 2011 at 2:23 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>
>> There should only be IRQs and NMIs that preempt this operation since
>> the removal operation of a cpu ring buffer is scheduled on keventd of
>> the same CPU. But you're right there is a race between reading the
>> to_remove pointer and cmpxchg() operation.
>
> Bah, this is what I get for reviewing patches and doing other work at
> the same time. I saw the work/completion set up, but it didn't register
> to me that this was calling schedule_work_on(cpu..).
>
> But that said, I'm not sure I really like that. This still seems a bit
> too complex.

What is it that you don't like? the work/completion, the reliance on
running on the same cpu, or just the complexity of procedure?

>> While we are trying to remove the head page, the writer could move to
>> the head page. Additionally, we will be adding complexity to manage data
>> from all the removed pages for read_page.
>>
>> I discussed with David and here are some ways we thought to address
>> this:
>> 1. After the cmpxchg(), if we see that the tail page has moved to
>>    to_remove page, then revert the cmpxchg() operation and try with the
>>    next page. This might add some more complexity and doesn't work with
>>    an interrupt storm coming in.
>
> Egad no. That will just make things more complex, and harder to verify
> is correct.
>
>> 2. Disable/enable IRQs while removing pages. This won't stop traced NMIs
>>    though and we are now affecting the system behavior.
>> 3. David didn't like this, but we could increment
>>    cpu_buffer->record_disabled to prevent writer from moving any pages
>>    for the duration of this process. If we combine this with disabling
>>    preemption, we would be losing traces from an IRQ/NMI context, but we
>>    would be safe from races while this operation is going on.
>>
>> The reason we want to remove the pages after tail is to give priority to
>> empty pages first before touching any data pages. Also according to your
>> suggestion, I am not sure how to manage the data pages once they are
>> removed, since they cannot be freed and the reader might not be present
>> which will make the pages stay resident, a form of memory leak.
>
> They will be freed when they are eventually read. Right, if there's no
> reader, then they will not be freed, but that isn't really a true memory
> leak. It is basically just like we didn't remove the pages, but I do not
> consider this a memory leak. The pages are just waiting to be reclaimed,
> and will be freed on any reset of the ring buffer.
>
> Anyway, the choices are:
>
> * Remove from the HEAD and use the existing algorithm that we've been
> using since 2008. This requires a bit of accounting on the reader side,
> but nothing too complex.
>
> Pros: Should not have any major race conditions. Requires no
> schedule_work_on() calls. Uses existing algorithm
>
> Cons: Can keep pages around if no reader is present, and ring buffer is
> not reset.

Con: by definition, removes valid trace data from the ring buffer,
even if it is not full. I think that's a pretty big con for the
usability of the feature.

>
> * Read from tail. Modify the already complex but tried and true lockless
> algorithm.
>
> Pros: Removes empty pages first.
>
> Cons: Adds a lot more complexity to a complex system that has been
> working since 2008.
>
>
> The above makes me lean towards just taking from HEAD.
>
> If you are worried about leaked pages, we could even have a debugfs file
> that lets us monitor the pages that are pending read, and have the user
> (or application) be able to flush them if they see the ring buffer is
> full anyway.

The reason we want per-cpu dynamic resizing is to increase memory
utilization, so leaking pages would make me sad.

Let us mull it over this weekend... maybe we'll come up with something
that works more simply.

>
> -- Steve
>
>
>
>

  reply	other threads:[~2011-07-30  2:24 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-26 22:59 [PATCH 0/5] Add dynamic updates to trace ring buffer Vaibhav Nagarnaik
2011-07-26 22:59 ` [PATCH 1/5] trace: Add a new readonly entry to report total buffer size Vaibhav Nagarnaik
2011-07-29 18:01   ` Steven Rostedt
2011-07-29 19:09     ` Vaibhav Nagarnaik
2011-07-26 22:59 ` [PATCH 2/5] trace: Add ring buffer stats to measure rate of events Vaibhav Nagarnaik
2011-07-29 18:10   ` Steven Rostedt
2011-07-29 19:10     ` Vaibhav Nagarnaik
2011-07-26 22:59 ` [PATCH 3/5] trace: Add per_cpu ring buffer control files Vaibhav Nagarnaik
2011-07-29 18:14   ` Steven Rostedt
2011-07-29 19:13     ` Vaibhav Nagarnaik
2011-07-29 21:25       ` Steven Rostedt
2011-07-26 22:59 ` [PATCH 4/5] trace: Make removal of ring buffer pages atomic Vaibhav Nagarnaik
2011-07-29 21:23   ` Steven Rostedt
2011-07-29 23:30     ` Vaibhav Nagarnaik
2011-07-30  1:12       ` Steven Rostedt
2011-07-30  1:50         ` David Sharp [this message]
2011-07-30  2:43           ` Steven Rostedt
2011-07-30  3:44             ` David Sharp
2011-07-26 22:59 ` [PATCH 5/5] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 0/5] Add dynamic updates to trace ring buffer Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 1/5] trace: Add a new readonly entry to report total buffer size Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 2/5] trace: Add ring buffer stats to measure rate of events Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 3/5] trace: Add per_cpu ring buffer control files Vaibhav Nagarnaik
2011-08-22 20:29   ` Steven Rostedt
2011-08-22 20:36     ` Vaibhav Nagarnaik
2011-08-22 22:09   ` [PATCH v3] " Vaibhav Nagarnaik
2011-08-23  0:49     ` Steven Rostedt
2011-08-23  1:16       ` Vaibhav Nagarnaik
2011-08-23  1:17   ` Vaibhav Nagarnaik
2011-09-03  2:45     ` Steven Rostedt
2011-09-06 18:56       ` Vaibhav Nagarnaik
2011-09-07 17:13         ` Steven Rostedt
2011-10-12  1:20     ` [PATCH v4 1/4] " Vaibhav Nagarnaik
2012-01-31 23:53       ` Vaibhav Nagarnaik
2012-02-02  2:42         ` Steven Rostedt
2012-02-02 19:20           ` Vaibhav Nagarnaik
2012-02-02 20:00       ` [PATCH v5 " Vaibhav Nagarnaik
2012-02-02 20:00         ` [PATCH v5 2/4] trace: Make removal of ring buffer pages atomic Vaibhav Nagarnaik
2012-04-21  4:27           ` Steven Rostedt
2012-04-23 17:31             ` Vaibhav Nagarnaik
2012-04-25 21:18           ` [PATCH v6 1/3] " Vaibhav Nagarnaik
2012-04-25 21:18             ` [PATCH v6 2/3] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2012-04-25 21:18             ` [PATCH v6 3/3] trace: change CPU ring buffer state from tracing_cpumask Vaibhav Nagarnaik
2012-05-03  1:55             ` [PATCH v6 1/3] trace: Make removal of ring buffer pages atomic Steven Rostedt
2012-05-03  6:40               ` Vaibhav Nagarnaik
2012-05-03 12:57                 ` Steven Rostedt
2012-05-03 14:12                   ` Steven Rostedt
2012-05-03 18:43                     ` Vaibhav Nagarnaik
2012-05-03 18:54                       ` Steven Rostedt
2012-05-03 18:54                         ` Vaibhav Nagarnaik
2012-05-04  1:59             ` [PATCH v7 " Vaibhav Nagarnaik
2012-05-04  1:59               ` [PATCH v7 2/3] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2012-05-19 10:18                 ` [tip:perf/core] ring-buffer: " tip-bot for Vaibhav Nagarnaik
2012-05-04  1:59               ` [PATCH v7 3/3] trace: change CPU ring buffer state from tracing_cpumask Vaibhav Nagarnaik
2012-05-19 10:21                 ` [tip:perf/core] tracing: " tip-bot for Vaibhav Nagarnaik
2012-05-07 20:22               ` [PATCH v7 1/3] trace: Make removal of ring buffer pages atomic Steven Rostedt
2012-05-07 21:48                 ` Vaibhav Nagarnaik
2012-05-08  0:14                   ` Steven Rostedt
2012-05-09  3:38               ` Steven Rostedt
2012-05-09  5:00                 ` Vaibhav Nagarnaik
2012-05-09 14:29                   ` Steven Rostedt
2012-05-09 17:46                     ` Vaibhav Nagarnaik
2012-05-09 17:54                       ` Steven Rostedt
2012-05-19 10:17               ` [tip:perf/core] ring-buffer: " tip-bot for Vaibhav Nagarnaik
2012-02-02 20:00         ` [PATCH v5 3/4] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2012-02-02 20:00         ` [PATCH v5 4/4] trace: change CPU ring buffer state from tracing_cpumask Vaibhav Nagarnaik
2012-03-08 23:51         ` [PATCH v5 1/4] trace: Add per_cpu ring buffer control files Vaibhav Nagarnaik
2012-05-02 21:03         ` [tip:perf/core] ring-buffer: " tip-bot for Vaibhav Nagarnaik
2011-10-12  1:20     ` [PATCH v4 2/4] trace: Make removal of ring buffer pages atomic Vaibhav Nagarnaik
2011-10-12  1:20     ` [PATCH v4 3/4] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2011-10-12  1:20     ` [PATCH v4 4/4] trace: change CPU ring buffer state from tracing_cpumask Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 4/5] trace: Make removal of ring buffer pages atomic Vaibhav Nagarnaik
2011-08-23  3:27   ` Steven Rostedt
2011-08-23 18:55     ` Vaibhav Nagarnaik
2011-08-23 18:55   ` [PATCH v3 " Vaibhav Nagarnaik
2011-08-23 19:16     ` David Sharp
2011-08-23 19:20       ` Vaibhav Nagarnaik
2011-08-23 19:24       ` Steven Rostedt
2011-08-23 18:55   ` [PATCH v3 5/5] trace: Make addition of pages in ring buffer atomic Vaibhav Nagarnaik
2011-08-16 21:46 ` [PATCH v2 " Vaibhav Nagarnaik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJL_ektBTtVp==CbjLMaeO3Tz24=KmE50Qy+St0yc9dvYzuQFw@mail.gmail.com' \
    --to=dhsharp@google.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mrubin@google.com \
    --cc=rostedt@goodmis.org \
    --cc=vnagarnaik@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).