From: Peter Zijlstra <peterz@infradead.org>
To: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H . Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Mathieu Poirier <mathieu.poirier@linaro.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Jiri Olsa <jolsa@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC 1/6] perf/x86: Add perf text poke event
Date: Wed, 30 Oct 2019 17:23:25 +0100
Message-ID: <20191030162325.GT4114@hirez.programming.kicks-ass.net>
In-Reply-To: <20191030141950.GB21153@leoy-ThinkPad-X240s>

On Wed, Oct 30, 2019 at 10:19:50PM +0800, Leo Yan wrote:
> On Wed, Oct 30, 2019 at 01:46:59PM +0100, Peter Zijlstra wrote:
> > On Wed, Oct 30, 2019 at 06:47:47PM +0800, Leo Yan wrote:

> > Anyway, the below argument doesn't care much, it works for NOP/JMP just
> > fine.
> 
> We can support the NOP/JMP case as a first step, but later we should
> be able to extend it to support other transitions.

All of these instructions (NOP, JMP, CALL; with the possible exception
of RET) are unconditional, so it makes no real difference to the
argument below.

( I'm thinking RET might be special in that it reads the return address
from the stack and therefore must emit the whole IP into the stream, as
we cannot know the stack state )

> > > we need to update dso cache for the
> > > 'PERF_TEXT_POKE_UPDATE_PREV' event; if detect the instruction is
> > > changed from branch to nop, we need to update dso cache for
> > > 'PERF_TEXT_POKE_UPDATE_POST' event.  The main idea is to ensure the
> > > branch instructions can be safely contained in the dso file and any
> > > branch samples can read out correct branch instruction.
> > > 
> > > Could you confirm this is the same with your understanding?  Or I miss
> > > anything?  I personally even think the pair events can be used for
> > > different arches (e.g. the solution can be reused on Arm64/x86, etc).
> > 
> > So the problem we have with PT is that it is a bit-stream of
> > branch taken/not-taken decisions. In order to decode that we need to
> > have an accurate view of the unconditional code flow.
> > 
> > Both NOP/JMP are unconditional and we need to exactly know which of the
> > two was encountered.
> 
> If I understand correctly, PT decoder needs to read out instructions
> from dso and decide the instruction type (NOP or JMP), and finally
> generate the accurate code flow.
> 
> So the PT decoder relies on the (cached) DSO for decoding.  As far as
> I know, this might be different from Arm CS, since the Arm CS decoder
> merely generates packets and doesn't need to rely on the DSO for decoding.

Given a start point (from a start or sync packet) we scan the
instruction stream forward until the first conditional branch
instruction. Then we consume the next available branch decision bit to
know where to continue.

So yes, we need to have a correct text image available for this to work.
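
To make the decode step above concrete, here is a minimal sketch in
Python. It is a toy model, not the perf or Intel PT decoder: the
instruction set ("nop", "jmp", "jcc", "end"), the one-slot-per-address
layout and the names decode() and IMAGE are all assumptions made up for
illustration. The only point is that unconditional instructions consume
nothing from the trace, while each conditional branch consumes exactly
one taken/not-taken bit.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Toy instruction: (mnemonic, branch target or None). Every instruction is
# assumed to occupy exactly one address slot, which is not true of real x86.
Insn = Tuple[str, Optional[int]]
TextImage = Dict[int, Insn]


def decode(image: TextImage, start: int, taken_bits: Iterator[bool]) -> List[int]:
    """Return the addresses visited when walking `image` from `start`."""
    path: List[int] = []
    ip = start
    while True:
        op, target = image[ip]
        path.append(ip)
        if op == "end":            # hypothetical marker that stops the walk
            return path
        if op == "jmp":            # unconditional: no bit in the trace
            ip = target
        elif op == "jcc":          # conditional: consume one decision bit
            ip = target if next(taken_bits) else ip + 1
        else:                      # "nop" falls through to the next slot
            ip += 1


# Hypothetical text image: address 0 is a patchable site, currently a NOP.
IMAGE: TextImage = {
    0: ("nop", None),
    1: ("jcc", 5),
    2: ("end", None),
    5: ("end", None),
}

taken_path = decode(IMAGE, 0, iter([True]))       # visits 0, 1, 5
not_taken_path = decode(IMAGE, 0, iter([False]))  # visits 0, 1, 2
```

Note that the walk reads image[ip] at every step, so a stale entry for
any address silently sends the decoder down the wrong path.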

> > With your scheme, I don't see how we can ever actually know that. When
> > we get the PRE event, all we really know is that we're going to change
> > a specific instruction into another. And at the POST event we know it
> > has been done. But in between these two events, we have no clue which of
> > the two instructions is live on which CPU (two CPUs might in fact have a
> > different live instruction at the same time).
> >
> > This means we _cannot_ unambiguously decode a taken/not-taken decision
> > stream.
> > 
> > Does CS have this same problem, and how would the PRE/POST events help
> > with that?
> 
> My purpose is to use PRE event and POST event to update cached DSO,
> thus perf tool can read out 'correct' instructions and fill them into
> instruction/branch samples.

The thing is, as I argued, the instruction state between PRE and POST is
ambiguous. This makes it impossible to decode the branch decision
stream.

Suppose CPU0 emits the PRE event at T1 and the POST event at T5, but we
have CPU1 covering the instruction at T3.

How do you decide where CPU1 goes and what the next conditional branch
is?
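
The T1/T3/T5 case can be sketched with the same kind of toy model, again
in Python and again purely illustrative: the images OLD and NEW, the
helpers walk() and candidate_images(), and the integer timestamps are
assumptions, not anything from the patch set. The sketch assumes the
site at address 0 is patched from NOP to JMP, and shows that a single
decision bit decodes to two different paths depending on which
instruction CPU1 actually saw.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Insn = Tuple[str, Optional[int]]   # (mnemonic, branch target or None)
TextImage = Dict[int, Insn]


def walk(image: TextImage, start: int, bits: Sequence[bool]) -> List[int]:
    """Follow the toy image from `start`, one bit per conditional branch."""
    remaining = iter(bits)
    path: List[int] = []
    ip = start
    while True:
        op, target = image[ip]
        path.append(ip)
        if op == "end":
            return path
        if op == "jmp":
            ip = target
        elif op == "jcc":
            ip = target if next(remaining) else ip + 1
        else:
            ip += 1


# Everything except the patched site at address 0 is identical.
BASE: TextImage = {
    1: ("jcc", 5), 2: ("end", None),
    3: ("jcc", 6), 4: ("end", None),
    5: ("end", None), 6: ("end", None),
}
OLD: TextImage = {**BASE, 0: ("nop", None)}   # before the poke
NEW: TextImage = {**BASE, 0: ("jmp", 3)}      # after the poke


def candidate_images(t: int, t_pre: int, t_post: int) -> List[TextImage]:
    """Images a decoder must consider for a trace chunk stamped `t`."""
    if t < t_pre:
        return [OLD]
    if t > t_post:
        return [NEW]
    return [OLD, NEW]   # between PRE and POST either may be live


T1, T3, T5 = 1, 3, 5   # PRE at T1, CPU1 hits the site at T3, POST at T5
paths = [walk(img, 0, [True]) for img in candidate_images(T3, T1, T5)]
# paths holds two different decodes of the same single "taken" bit.
```

With the NOP the bit belongs to the branch at address 1; with the JMP it
belongs to the branch at address 3. Nothing in the bit stream itself
tells the decoder which of the two it is looking at.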

