From: Marc Zyngier <maz@kernel.org>
To: Dougall <dougallj@gmail.com>
Cc: Alyssa Rosenzweig <alyssa@rosenzweig.io>,
	linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, Mark Rutland <mark.rutland@arm.com>,
	Will Deacon <will@kernel.org>, Hector Martin <marcan@marcan.st>,
	Sven Peter <sven@svenpeter.dev>, Rob Herring <robh+dt@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 8/8] drivers/perf: Add Apple icestorm/firestorm CPU PMU driver
Date: Mon, 15 Nov 2021 10:51:36 +0000
Message-ID: <87czn18zev.wl-maz@kernel.org>
In-Reply-To: <CAHT7dO9G6tRfQCC_62OwgkoHQeUbz4AQwseX=t2S3YVYjoHCXA@mail.gmail.com>

On Sun, 14 Nov 2021 02:43:14 +0000,
Dougall <dougallj@gmail.com> wrote:
> 
> Apple distributes names (and descriptions and affinity masks) for 55
> of the events with macOS in the file /usr/share/kpep/a14.plist
> (exposed to users in Instruments.app). Many of those 55 events were
> added in macOS 12, so it's good to check the latest version. I use
> the command "plutil -convert json -o - /usr/share/kpep/a14.plist" to
> get these as JSON.

As it turns out, the perf tool can ingest event descriptions from a
JSON file, so none of this has to be in the kernel itself.

So if someone were to provide a tool to convert the macOS file into
something that perf can understand, that would be great, and it
wouldn't require any distribution of otherwise tainted material
(distribute the tool, and not the data).
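
A minimal sketch of what such a converter could look like, in Python.
The layout of the plist is an assumption here (events stored under
system -> cpu -> events, keyed by name, each with a "number" and an
optional "description") and has to be checked against the real file.
kpep_to_perf_events is a hypothetical helper name. The output keys
(EventCode, EventName, BriefDescription) are the ones used by perf's
JSON event files.

```python
import json
import plistlib


def kpep_to_perf_events(plist_bytes):
    """Turn a kpep-style plist into a list of perf-style event dicts.

    ASSUMPTION: the plist nests events as system -> cpu -> events,
    keyed by event name, with an integer "number" per event.
    """
    db = plistlib.loads(plist_bytes)
    events = db["system"]["cpu"]["events"]

    out = []
    for name, info in sorted(events.items()):
        if "number" not in info:
            # Skip entries that carry no event number.
            continue
        out.append({
            "EventCode": f"0x{info['number']:02X}",
            "EventName": name,
            "BriefDescription": info.get("description", ""),
        })
    return out


def to_json(events):
    """Serialise the converted events as indented JSON text."""
    return json.dumps(events, indent=4)
```

Reading /usr/share/kpep/a14.plist as bytes on a macOS machine and
passing the result through to_json() would then produce the file
locally, so only the tool needs to be distributed.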

> 
> There are many more events that I have discovered experimentally,
> but this work is unusually hard to verify, so I'd be inclined to
> stick with what's documented.
>
> However, I have observed a few oddities that might be of interest.
> 
> The counter 0x9B (INST_LDST) works on PMCs 5, 6 and 7, but gives
> different results for paired AMX instructions on PMC 7 (7 counts
> instructions, while 5 and 6 count pairs as one). Apple addresses
> this by restricting the affinity mask to PMC 7. This is also seen
> on undocumented counter 0x96, which counts integer stores. (For
> context, microarchitecturally non-load-store AMX operations appear
> as stores, as they just need to be posted to the AMX coprocessor on
> commit. Consecutive non-load-store AMX operations can be paired
> (fused), such that they issue as one uop, which is where this
> anomaly can be seen.)

Interesting. I guess we're unlikely to see any AMX support anytime
soon on Linux, unless we can make it fit the architected SME model
(and even that would be pretty controversial).
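
The affinity-mask restriction described above for 0x9B can be
illustrated with a short sketch. It assumes that bit n of a mask
stands for PMC n, which is an illustrative convention and not taken
from the plist or the driver. All function names are hypothetical.

```python
def pmc_mask(*pmcs):
    """Build a bitmask with one bit set per listed PMC index."""
    mask = 0
    for n in pmcs:
        mask |= 1 << n
    return mask


def allowed_pmcs(mask):
    """List the PMC indices whose bits are set in the mask."""
    return [n for n in range(mask.bit_length()) if mask & (1 << n)]


def pick_counter(affinity, in_use):
    """Return the lowest free PMC permitted by the affinity mask, or None."""
    free = allowed_pmcs(affinity & ~in_use)
    return free[0] if free else None


# Event 0x9B works on PMCs 5, 6 and 7 ...
WORKS_ON = pmc_mask(5, 6, 7)
# ... but restricting it to PMC 7 avoids the pair-counting discrepancy.
RESTRICTED = pmc_mask(7)
```

With the restricted mask, pick_counter() returns 7 while PMC 7 is
free and None once it is taken, instead of falling back to PMC 5 or 6
where paired AMX instructions would be counted differently.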

> Undocumented counters 0xF1 through 0xFF appear to be operation
> counters, meaning their result depends on events selected on other
> counters. There are three threshold registers (PMTRHLD2, PMTRHLD4,
> PMTRHLD6) which can specify a threshold (in number of cycles) for
> the operation counter on the PMC with the same number. There is also
> a mapping register (PMMAP), which contains a 3-bit field for each
> counter from PMC2 to PMC7, each specifying a PMC index which can be
> used as an input to the operation. Binary operations only use
> PMC2/4/6 and use PMC(n+1) as their other input. These operation
> counters may also behave differently depending on the value
> currently in the corresponding PMC (specifically counters F9/FA
> which implement shortest/longest run of non-zero counts).

Weeee... I'm sure there are super interesting uses for this, but I'd
rather have something simple for a start. Thanks for the heads up
though, this is extremely interesting!
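
For the record, the PMMAP description above (one 3-bit field per
counter from PMC2 to PMC7) can be modelled as below. The bit
positions are a guess made purely for illustration: the field for
PMC n is assumed to sit at bit 3 * (n - 2). The real register layout
is not given in this thread, and the helper names are hypothetical.

```python
FIELD_BITS = 3                  # each field selects a PMC index 0..7
FIELD_MASK = (1 << FIELD_BITS) - 1
FIRST_PMC, LAST_PMC = 2, 7      # only PMC2..PMC7 have a field


def _shift(pmc):
    """Bit offset of the field for a PMC (ASSUMED layout, see above)."""
    if not FIRST_PMC <= pmc <= LAST_PMC:
        raise ValueError(f"PMC{pmc} has no PMMAP field")
    return (pmc - FIRST_PMC) * FIELD_BITS


def pmmap_set(value, pmc, source):
    """Return value with the field for pmc set to the input PMC index."""
    if not 0 <= source <= FIELD_MASK:
        raise ValueError("source PMC index must fit in 3 bits")
    shift = _shift(pmc)
    return (value & ~(FIELD_MASK << shift)) | (source << shift)


def pmmap_get(value, pmc):
    """Read back the input PMC index selected for pmc."""
    return (value >> _shift(pmc)) & FIELD_MASK
```

For example, pmmap_set(0, 4, 5) selects PMC5 as the input for an
operation counter on PMC4 under this assumed layout.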

> This is complicated, and it's not exposed to the user by macOS, so I
> wouldn't worry about supporting it for now.

We're in strong agreement here.

> Despite all this, the
> events and features on the P and E cores seem to be the same, so I
> don't expect a need to distinguish between them in the future.

That'd be the first big.LITTLE implementation to have consistent
events across the board. Amazing! :D

> (I've been meaning to write all this up properly, but haven't got
> around to it, sorry!)

No worries, and thanks for taking the time to write this email!

	M.

-- 
Without deviation from the norm, progress is not possible.

