linux-kernel.vger.kernel.org archive mirror
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: David Miller <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org, Wang Nan <wangnan0@huawei.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>
Subject: Re: A concern about overflow ring buffer mode
Date: Fri, 26 Oct 2018 15:38:05 -0300
Message-ID: <20181026183805.GD3353@kernel.org>
In-Reply-To: <20181026.104513.2239058788450235574.davem@davemloft.net>

Adding a few folks to the CC list; Wang implemented the backward ring
buffer code.

On Fri, Oct 26, 2018 at 10:45:13AM -0700, David Miller wrote:
> Since the last time I looked deeply into perf I notice that
> perf top now uses a new ring buffer mode by default.
> 
> Basically, events are written in reverse order, and when fetching
> events the tool uses an ioctl to "pause" the ring buffer.
> 
> I understand some of the reasons for pursuing this kind of scheme but I
> think there may be a huge downside to this design.
> 
> Yes, if the tool can't keep up with the kernel, we'd rather see newer
> rather than older events.
> 
> However, pausing the ring buffer during the fetch is going to
> virtually guarantee that we lose critical events that impact
> interpretation of future events in a non-recoverable way.
> 
> The thing is, the new scheme causes events to be lost even if the tool
> can keep up with the kernel.
> 
> Any event that happens while the tool is fetching the ring entries
> will be lost forever.  The kernel simply skips queuing up the event
> and increments a lost counter.  During a kernel build, I typically see
> 9 or so events lost each fetch.
> 
> Ok, if this is just a SAMPLE then fine, it's not a big deal.
> 
> But what if the lost event is a FORK or an EXEC or the worst one to
> lose, an MMAP?

Right, we can't lose those, so to use this we need something like what
the intel_pt tooling code does, i.e. add an extra event to the mix, a
software event, "dummy", that is used to track just the
PERF_RECORD_!SAMPLE metadata events, and that one never gets paused.

The intel_pt motivation is different, but the technique may allow
using the backward code without losing metadata events.
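
To make the idea concrete, here is a minimal sketch of how the two
events could be described. It is not taken from the perf sources: the
helper names init_dummy_tracking_attr() and init_sampling_attr() are
hypothetical, the sample period is an arbitrary placeholder, and only
the struct perf_event_attr fields and PERF_* constants come from
<linux/perf_event.h>. The sketch only fills in the attributes; opening
the events with perf_event_open(2) and mapping their ring buffers is
left out.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper: attributes for a software "dummy" event that
 * carries only the non-SAMPLE (metadata) records.
 */
static void init_dummy_tracking_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size   = sizeof(*attr);
	attr->type   = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_DUMMY;	/* counts nothing itself */

	/* Request the records we cannot afford to lose. */
	attr->mmap  = 1;	/* PERF_RECORD_MMAP */
	attr->mmap2 = 1;	/* PERF_RECORD_MMAP2 */
	attr->comm  = 1;	/* PERF_RECORD_COMM */
	attr->task  = 1;	/* PERF_RECORD_FORK and PERF_RECORD_EXIT */

	/* Regular forward ring buffer; the tool would never pause it. */
	attr->write_backward = 0;
}

/*
 * Hypothetical helper: attributes for the sampling event that keeps
 * using the backward ring buffer and requests no metadata records.
 */
static void init_sampling_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size          = sizeof(*attr);
	attr->type          = PERF_TYPE_HARDWARE;
	attr->config        = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = 100000;	/* arbitrary placeholder value */
	attr->sample_type   = PERF_SAMPLE_IP | PERF_SAMPLE_TID;

	/* Records are written backward; this buffer may be paused. */
	attr->write_backward = 1;
}
```

With a split like this, a SAMPLE dropped while the backward buffer is
paused would be lost, but MMAP/COMM/FORK records would still arrive
through the dummy event's separate, unpaused buffer.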

wdyt? Wang?

- Arnaldo
 
> Now we can't even match up events properly and we get tons of those
> dreaded "Unknown" symbols and DSOs.  The output looks terrible and the
> tool becomes useless.
> 
> And yes this happens frequently.
> 
> I think the overwrite ring buffer mode should be seriously
> reconsidered.  The "I'd rather see new than old events" part is fine,
> but the "pause" part is not.  You can't turn event recording off on
> the kernel side while you fetch some events, because it means that
> critical events that allow us to properly interpret future events will
> be lost.

Thread overview: 28+ messages
2018-10-26 17:45 A concern about overflow ring buffer mode David Miller
2018-10-26 18:38 ` Arnaldo Carvalho de Melo [this message]
2018-10-26 18:42   ` Arnaldo Carvalho de Melo
2018-10-26 19:02     ` Arnaldo Carvalho de Melo
2018-10-26 19:07       ` Liang, Kan
2018-10-26 19:12         ` Arnaldo Carvalho de Melo
2018-10-26 19:16           ` Liang, Kan
2018-10-26 19:24             ` Arnaldo Carvalho de Melo
2018-10-26 20:11               ` Liang, Kan
2018-10-26 20:43                 ` Arnaldo Carvalho de Melo
2018-10-29 13:03                 ` [PATCHES/RFC] " Arnaldo Carvalho de Melo
2018-10-29 14:33                   ` Liang, Kan
2018-10-29 14:35                     ` Arnaldo Carvalho de Melo
2018-10-29 15:11                       ` Liang, Kan
2018-10-29 17:43                         ` David Miller
2018-10-29 17:56                           ` Arnaldo Carvalho de Melo
2018-10-29 17:40                     ` David Miller
2018-10-29 17:42                       ` Liang, Kan
2018-10-29 17:48                         ` David Miller
2018-10-29 18:20                           ` Liang, Kan
2018-10-29 18:32                             ` Arnaldo Carvalho de Melo
2018-10-29 22:32                               ` Liang, Kan
2018-10-29 22:42                                 ` David Miller
2018-10-30  1:54                                   ` Liang, Kan
2018-10-29 21:16                             ` David Miller
2018-10-29 17:55                       ` Arnaldo Carvalho de Melo
2018-10-30 19:05                     ` David Miller
2018-10-31 22:03                 ` [tip:perf/urgent] perf top: Do not use overwrite mode by default tip-bot for Arnaldo Carvalho de Melo