linux-kernel.vger.kernel.org archive mirror
From: Stephane Eranian <eranian@google.com>
To: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, acme@redhat.com, jolsa@redhat.com,
	kim.phillips@amd.com, namhyung@kernel.org, irogers@google.com,
	atrajeev@linux.vnet.ibm.com,
	Arnaldo Carvalho de Melo <acme@kernel.org>
Subject: Re: [PATCH v1 01/13] perf/core: add union to struct perf_branch_entry
Date: Thu, 16 Sep 2021 23:48:31 -0700
Message-ID: <CABPqkBQvvNQa=hb4OnYqH-f=DJiRWE+bTmv4i+gNvEdoSEHM4w@mail.gmail.com>
In-Reply-To: <b21bf42e-377d-36d0-49c3-af1e4edf5496@linux.ibm.com>

Hi,


Thanks for fixing this in the perf tool. But what about the struct
branch_entry in the header?


On Thu, Sep 16, 2021 at 11:38 PM Madhavan Srinivasan
<maddy@linux.ibm.com> wrote:
>
>
> On 9/15/21 11:33 AM, Stephane Eranian wrote:
> > Michael,
> >
> >
> > On Fri, Sep 10, 2021 at 7:16 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
> >> Michael Ellerman <mpe@ellerman.id.au> writes:
> >>> Peter Zijlstra <peterz@infradead.org> writes:
> >>>> On Thu, Sep 09, 2021 at 12:56:48AM -0700, Stephane Eranian wrote:
> >>>>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> >>>>> index f92880a15645..eb11f383f4be 100644
> >>>>> --- a/include/uapi/linux/perf_event.h
> >>>>> +++ b/include/uapi/linux/perf_event.h
> >>>>> @@ -1329,13 +1329,18 @@ union perf_mem_data_src {
> >>>>>   struct perf_branch_entry {
> >>>>>      __u64   from;
> >>>>>      __u64   to;
> >>>>> -   __u64   mispred:1,  /* target mispredicted */
> >>>>> -           predicted:1,/* target predicted */
> >>>>> -           in_tx:1,    /* in transaction */
> >>>>> -           abort:1,    /* transaction abort */
> >>>>> -           cycles:16,  /* cycle count to last branch */
> >>>>> -           type:4,     /* branch type */
> >>>>> -           reserved:40;
> >>>>> +   union {
> >>>>> +           __u64   val;        /* to make it easier to clear all fields */
> >>>>> +           struct {
> >>>>> +                   __u64   mispred:1,  /* target mispredicted */
> >>>>> +                           predicted:1,/* target predicted */
> >>>>> +                           in_tx:1,    /* in transaction */
> >>>>> +                           abort:1,    /* transaction abort */
> >>>>> +                           cycles:16,  /* cycle count to last branch */
> >>>>> +                           type:4,     /* branch type */
> >>>>> +                           reserved:40;
> >>>>> +           };
> >>>>> +   };
> >>>>>   };
> >>>>
> >>>> Hurpmh... all other bitfields have ENDIAN_BITFIELD things except this
> >>>> one. Power folks, could you please have a look?
> >>> The bit number of each field changes between big and little endian, but
> >>> as long as kernel and userspace are the same endian, and both only
> >>> access values via the bitfields then it works.
> >> ...
> >>> It does look like we have a bug in perf tool though, if I take a
> >>> perf.data from a big endian system to a little endian one I don't see
> >>> any of the branch flags decoded. eg:
> >>>
> >>> BE:
> >>>
> >>> 2413132652524 0x1db8 [0x2d0]: PERF_RECORD_SAMPLE(IP, 0x1): 5279/5279: 0xc00000000045c028 period: 923003 addr: 0
> >>> ... branch stack: nr:28
> >>> .....  0: c00000000045c028 -> c00000000dce7604 0 cycles  P   0
> >>>
> >>> LE:
> >>>
> >>> 2413132652524 0x1db8 [0x2d0]: PERF_RECORD_SAMPLE(IP, 0x1): 5279/5279: 0xc00000000045c028 period: 923003 addr: 0
> >>> ... branch stack: nr:28
> >>> .....  0: c00000000045c028 -> c00000000dce7604 0 cycles      0
> >>>                                                           ^
> >>>                                                           missing P
> >>>
> >>> I guess we're missing a byte swap somewhere.
> >> Ugh. We _do_ have a byte swap, but we also need a bit swap.
> >>
> >> That works for the single bit fields, not sure if it will for the
> >> multi-bit fields.
> >>
> >> So that's a bit of a mess :/
> >>
> > Based on what I see in perf_event.h for other structures, I think I
> > can work out what you would need for struct branch_entry. But it would
> > be easier if you could send me a patch that you have verified on
> > your systems.
> > Thanks.
> The attached patch fixes the issue. I have tested it in both the BE and LE cases.
>
> Maddy
>
>  From f816ba2e6ef8d5975f78442d7ecb50d66c3c4326 Mon Sep 17 00:00:00 2001
> From: Madhavan Srinivasan <maddy@linux.ibm.com>
> Date: Wed, 15 Sep 2021 22:29:09 +0530
> Subject: [RFC PATCH] tools/perf: Add reverse_64b macro
>
> The branch_stack struct defines its flags as bit fields, whose bit
> ordering differs between big and little endian. As a result, when a
> branch_stack sample collected on a BE system is viewed/reported on
> an LE system, the bit fields of the branch stack are not decoded
> properly. To address this issue, a reverse_64b macro is defined
> and used in evsel__parse_sample.
>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> ---
>   tools/perf/util/evsel.c | 35 +++++++++++++++++++++++++++++++++--
>   1 file changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index dbfeceb2546c..3151606e516e 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -2221,6 +2221,9 @@ void __weak arch_perf_parse_sample_weight(struct perf_sample *data,
>       data->weight = *array;
>   }
>
> +#define reverse_64b(src, pos, size)    \
> +    ((((src) >> (pos)) & ((1ull << (size)) - 1)) << (63 - ((pos) + (size) - 1)))
> +
>   int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
>               struct perf_sample *data)
>   {
> @@ -2408,6 +2411,8 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
>       if (type & PERF_SAMPLE_BRANCH_STACK) {
>           const u64 max_branch_nr = UINT64_MAX /
>                         sizeof(struct branch_entry);
> +        struct branch_entry *e;
> +        unsigned i;
>
>           OVERFLOW_CHECK_u64(array);
>           data->branch_stack = (struct branch_stack *)array++;
> @@ -2416,10 +2421,36 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
>               return -EFAULT;
>
>           sz = data->branch_stack->nr * sizeof(struct branch_entry);
> -        if (evsel__has_branch_hw_idx(evsel))
> +        if (evsel__has_branch_hw_idx(evsel)) {
>               sz += sizeof(u64);
> -        else
> +            e = &data->branch_stack->entries[0];
> +        } else {
>               data->no_hw_idx = true;
> +            e = (struct branch_entry *)&data->branch_stack->hw_idx;
> +        }
> +
> +        if (swapped) {
> +            for (i = 0; i < data->branch_stack->nr; i++, e++) {
> +                u64 new_val = 0;
> +
> +                /* mispred:1  target mispredicted */
> +                new_val = reverse_64b(e->flags.value, 0, 1);
> +                /* predicted:1  target predicted */
> +                new_val |= reverse_64b(e->flags.value, 1, 1);
> +                /* in_tx:1  in transaction */
> +                new_val |= reverse_64b(e->flags.value, 2, 1);
> +                /* abort:1  transaction abort */
> +                new_val |= reverse_64b(e->flags.value, 3, 1);
> +                /* cycles:16  cycle count to last branch */
> +                new_val |= reverse_64b(e->flags.value, 4, 16);
> +                /* type:4  branch type */
> +                new_val |= reverse_64b(e->flags.value, 20, 4);
> +                /* reserved:40 */
> +                new_val |= reverse_64b(e->flags.value, 24, 40);
> +                e->flags.value = new_val;
> +            }
> +        }
> +
>           OVERFLOW_CHECK(array, sz, max_size);
>           array = (void *)array + sz;
>       }
> --
> 2.31.1
>
>
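
For readers following the bit-ordering discussion quoted above, here is a
minimal standalone sketch of the idea. It is not the patch under review.
It assumes the common compiler behavior where bit fields are allocated
from the least significant bit on little endian and from the most
significant bit on big endian, and that the 64-bit flags word has already
been byte-swapped so its numeric value matches what the producer wrote.
The names mirror_field, lsb_first_to_msb_first, msb_first_to_lsb_first and
check_example are hypothetical. Each field is moved to its mirrored
position as a unit, so the bit order inside multi-bit fields such as
cycles and type is preserved.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: take the `size`-bit field starting at bit `pos`
 * and place it at the mirrored position in a 64-bit word. Bits inside
 * the field keep their relative order.
 */
uint64_t mirror_field(uint64_t src, unsigned pos, unsigned size)
{
	uint64_t mask = (size == 64) ? ~0ULL : ((1ULL << size) - 1);

	return ((src >> pos) & mask) << (64 - pos - size);
}

/* Widths in declaration order:
 * mispred, predicted, in_tx, abort, cycles, type, reserved */
static const unsigned widths[] = { 1, 1, 1, 1, 16, 4, 40 };
#define NR_FIELDS (sizeof(widths) / sizeof(widths[0]))

/* Assumed LE-style layout (first field at bit 0) to BE-style layout. */
uint64_t lsb_first_to_msb_first(uint64_t v)
{
	uint64_t out = 0;
	unsigned pos = 0, i;

	for (i = 0; i < NR_FIELDS; i++) {
		out |= mirror_field(v, pos, widths[i]);
		pos += widths[i];
	}
	return out;
}

/* Assumed BE-style layout (first field at bit 63) to LE-style layout. */
uint64_t msb_first_to_lsb_first(uint64_t v)
{
	uint64_t out = 0;
	unsigned pos = 64, i;

	for (i = 0; i < NR_FIELDS; i++) {
		pos -= widths[i];
		out |= mirror_field(v, pos, widths[i]);
	}
	return out;
}

/* predicted=1, cycles=0x1234, type=5 in the LSB-first layout. */
void check_example(void)
{
	uint64_t le = (1ULL << 1) | (0x1234ULL << 4) | (5ULL << 20);
	uint64_t be = (1ULL << 62) | (0x1234ULL << 44) | (5ULL << 40);

	assert(lsb_first_to_msb_first(le) == be);
	assert(msb_first_to_lsb_first(be) == le);
}
```

Note that the two directions walk the fields from opposite ends of the
word, so a conversion routine built this way needs to know which layout
its input is in.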
