From: Mike Leach <mike.leach@linaro.org>
To: Leo Yan <leo.yan@linaro.org>
Cc: James Clark <james.clark@arm.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mathieu Poirier <mathieu.poirier@linaro.org>,
	Coresight ML <coresight@lists.linaro.org>,
	Al Grant <al.grant@arm.com>,
	"Suzuki K. Poulose" <suzuki.poulose@arm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	John Garry <john.garry@huawei.com>, Will Deacon <will@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 2/6] perf cs-etm: Initialise architecture based on TRCIDR1
Date: Mon, 2 Aug 2021 16:43:59 +0100
Message-ID: <CAJ9a7VhBU4QYWYbzJs2Z91k=NC+xYnmKJ-HH9CKiNdpDxsY1SA@mail.gmail.com>
In-Reply-To: <20210802150358.GA148327@leoy-ThinkPad-X240s>

Hi Leo,

On Mon, 2 Aug 2021 at 16:04, Leo Yan <leo.yan@linaro.org> wrote:
>
> Hi Mike,
>
> On Mon, Aug 02, 2021 at 03:04:14PM +0100, Mike Leach wrote:
>
> [...]
>
> > > > > +#define TRCIDR1_TRCARCHMIN_SHIFT 4
> > > > > +#define TRCIDR1_TRCARCHMIN_MASK  GENMASK(7, 4)
> > > > > +#define TRCIDR1_TRCARCHMIN(x)    (((x) & TRCIDR1_TRCARCHMIN_MASK) >> TRCIDR1_TRCARCHMIN_SHIFT)
> > > > > +static enum _ocsd_arch_version cs_etm_decoder__get_arch_ver(u32 reg_idr1)
> > > > > +{
> > > > > +       /*
> > > > > +        * If the ETM trace minor version is 4 or more then we can assume
> > > > > +        * the architecture is ARCH_AA64 rather than just V8
> > > > > +        */
> > > > > +       return TRCIDR1_TRCARCHMIN(reg_idr1) >= 4 ? ARCH_AA64 : ARCH_V8;
> > > > > +}
> > > >
> > > > This is true for ETM4.x & ETE 1.x (arch 5.x) but not ETM 3.x
> > > > Probably need to beef up this comment or the function name to emphasise this.
> > >
> > > Yeah, I think it's good to change the function name.  Eventually, this
> > > function should only be used for ETM4.x and ETE.
> > >
> > > Another minor comment is: can we refine the arch version number, e.g.
> > > change the OpenCSD's macro "ARCH_AA64" to "ARCH_V8R4", (or
> > > "ARCH_V8R3_AA64"), this can give more clear clue what's the ETM version.
> > >
> >
> > The purpose of these macros is to inform the decoder of the
> > architecture of the PE - not the version of the ETM.
> >
> > These OpenCSD macros are defined by the library headers
> > (ocsd_if_types.h) and not the perf headers.
> > These have been published as the API / ABI for OpenCSD and as such
> > changing them affects all OpenCSD clients, not just perf.
>
> I understand these macros are defined in OpenCSD lib as APIs, since I
> saw these macros have not been widely used in perf tool (e.g.
> ARCH_AA64), so this is why I think it's good to take chance to refine
> the naming conventions.
>

The macros are used in other tools - so changing now affects those
too. Not something I am prepared to do without good reason.

> > This PE architecture version is used along with the core profile to
> > determine which instructions are valid waypoint instructions to
> > associate with atom elements when walking the program image during
> > trace decode.
> >
> > From v8.3  onwards we moved away from filtering on specific
> > architecture versions. This was due to two factors:-
> > 1. The architectural rules now allow architectural features for one
> > increment, e.g. Arch 8.4, to be backported into the previous increment,
> > e.g. 8.3, which made this filtering more difficult to track.
> > 2. After discussion with the PE architects it was clear that
> > instructions in a later architecture version would not re-use older
> > opcodes from a previous one and would be nop / invalid in the earlier
> > architectures. (certainly in the scope of AA64). Therefore
> > the policy in the decoder is to check for all the instructions we know
> > about for the latest version of architecture, even if we could be
> > decoding an earlier architecture version. This means we may check for
> > a few more opcodes than necessary for earlier version of the
> > architecture, but the overall decode is more robust and easier to
> > maintain.
> >
> > Therefore for any AA64 core beyond v8.3 - it is safe to use the
> > ARCH_AA64 PE architecture version and the decoder will handle it.
>
> I have no objection for current approach; but two things can cause
> confusions and it might be difficult for maintenance:
>
> - The first thing is now we base on the bit fields TRCIDR1::TRCARCHMIN
>   to decide the PE architecture version.  In the ETMv4 spec,
>   TRCIDR1::TRCARCHMIN is defined as the trace unit minor version,
>   so essentially it's a minor version number for tracer (ETM) but not
>   the PE architecture number.  But now we are using it to decide the
>   PE architecture number (8.3, 8.4, etc...).
>

This is a slight weakness in the implementation of perf. Ideally one
does need to establish the architecture version of the PE - but perf /
cs-etm is using an assumption regarding the profile and version of
the core, according to the ETM / ETE version.
That said - the ETM / ETE version numbers do have a strong
relationship with PE architecture version numbers, so this assumption
holds for the current supported devices.
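
To make that assumption concrete, here is a minimal standalone sketch of
the mapping under discussion. It is not the perf or OpenCSD code: the
sketch_* names are hypothetical stand-ins for the OpenCSD values in
ocsd_if_types.h, and the mask is open-coded instead of using GENMASK so
that it builds outside the kernel tree. It only applies to ETM4.x / ETE,
not ETM 3.x.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for the OpenCSD ARCH_V8 / ARCH_AA64 values.
 * The real definitions live in the OpenCSD header ocsd_if_types.h.
 */
enum sketch_arch_version {
	SKETCH_ARCH_V8,
	SKETCH_ARCH_AA64,
};

/* TRCIDR1.TRCARCHMIN occupies bits [7:4], as in the quoted patch. */
#define SKETCH_TRCARCHMIN_SHIFT	4
#define SKETCH_TRCARCHMIN_MASK	(0xFu << SKETCH_TRCARCHMIN_SHIFT)
#define SKETCH_TRCARCHMIN(x) \
	(((x) & SKETCH_TRCARCHMIN_MASK) >> SKETCH_TRCARCHMIN_SHIFT)

/*
 * Assumption (ETM4.x / ETE only): a trace unit minor version of 4 or
 * more implies a PE for which the generic AA64 setting is safe.
 * This infers the PE architecture from the trace unit version; it
 * does not read the PE architecture version directly.
 */
static enum sketch_arch_version sketch_etm4_arch_from_idr1(uint32_t reg_idr1)
{
	return SKETCH_TRCARCHMIN(reg_idr1) >= 4 ? SKETCH_ARCH_AA64
						: SKETCH_ARCH_V8;
}
```

With this sketch, a TRCIDR1 value whose minor version field reads 4 or
more selects the generic AA64 setting, and anything lower falls back to
plain V8.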

> - The second thing is the macros' naming convention.
>   E.g. "AA64" gives me an impression it is a general naming "Arm Arch 64"
>   for all Arm 64-bit CPUs, it's something like an abbreviation for
>   "aarch64"; so seems to me it doesn't show any meaningful info for PE's
>   architecture version number.  This is why I proposed to use more
>   explicit macro definitions for architectures (e.g. ARCH_V8R3, ARCH_V8R4,
>   ARCH_V9R0, etc).
>

For modern cores it is sufficient for the decoder to know the profile
and that it is AArch64 - so yes, the macro is simply saying this is a
general AA64 core.
The macros for earlier versions are a little more specific as certain
filtering is used according to the version of the PE.

ARCH_V8R4,  ARCH_V9R0 etc would have no significance to the decoder
and would not be useful. If we get to the stage where we need more
specific PE architecture versions - then these can be added as
required.
Using the ARCH_AA64 macro means that we do not have to update the API
for every version update of the architecture, and there are no changes
required to the perf / cs-etm handling.

> If we really want to use ARCH_AA64, it's better to give some comments in
> the code.
>

There are comments in the OpenCSD headers, though additional ones in
the perf / cs-etm handling source code could be added.

Regards

Mike


> Thanks a lot for shared the background info.
>
> Leo



-- 
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
