* [PATCH v3 1/2] perf/docs: Add info on AMD raw event encoding
@ 2021-11-23 8:46 Sandipan Das
2021-11-23 8:46 ` [PATCH v3 2/2] perf/docs: Update link to AMD documentation Sandipan Das
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Sandipan Das @ 2021-11-23 8:46 UTC (permalink / raw)
To: acme
Cc: santosh.shukla, ravi.bangoria, ananth.narayan, kim.phillips,
rrichter, linux-perf-users, jolsa, kjain
AMD processors have events with event select codes and unit
masks larger than a byte. The core PMU, for example, uses
12-bit event select codes split between bits 0-7 and 32-35
of the PERF_CTL MSRs as can be seen from
/sys/bus/event_sources/devices/cpu/format/*.
The Processor Programming Reference (PPR) lists the event
codes as unified 12-bit hexadecimal values instead and the
split between the bits is not apparent to someone who is
not aware of the layout of the PERF_CTL MSRs.
8-bit event select codes continue to work as the layout
matches that of the PERF_CTL MSRs i.e. bits 0-7 for event
select and 8-15 for unit mask.
This adds more details in the perf man pages about using
/sys/bus/event_sources/devices/*/format/* for determining
the correct raw event encoding scheme.
E.g. the "op_cache_hit_miss.op_cache_hit" event with code
0x28f and umask 0x03 can be programmed using its symbolic
name as:
$ sudo perf --debug perf-event-open stat -e op_cache_hit_miss.op_cache_hit sleep 1
------------------------------------------------------------
perf_event_attr:
type 4
size 128
config 0x20000038f
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
[...]
One might use a simple eventsel+umask combination based on
what the current man pages say and incorrectly program the
event as:
$ sudo perf --debug perf-event-open stat -e r0328f sleep 1
------------------------------------------------------------
perf_event_attr:
type 4
size 128
config 0x328f
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
[...]
When it should have been based on the format from sysfs:
$ cat /sys/bus/event_source/devices/cpu/format/event
config:0-7,32-35
$ sudo perf --debug perf-event-open stat -e r20000038f sleep 1
------------------------------------------------------------
perf_event_attr:
type 4
size 128
config 0x20000038f
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
[...]
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---
v1: https://lore.kernel.org/linux-perf-users/20211119111234.170726-1-sandipan.das@amd.com/
v2: https://lore.kernel.org/linux-perf-users/20211123065104.236717-1-sandipan.das@amd.com/
v3:
- Mention why simple eventsel+umask combinations will not
work in the commit message.
---
tools/perf/Documentation/perf-list.txt | 34 +++++++++++++++++++++++-
tools/perf/Documentation/perf-record.txt | 6 +++--
tools/perf/Documentation/perf-stat.txt | 6 +++--
tools/perf/Documentation/perf-top.txt | 7 ++---
4 files changed, 45 insertions(+), 8 deletions(-)
diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 4dc8d0af19df..a922a95289a9 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -94,7 +94,7 @@ RAW HARDWARE EVENT DESCRIPTOR
Even when an event is not available in a symbolic form within perf right now,
it can be encoded in a per processor specific way.
-For instance For x86 CPUs NNN represents the raw register encoding with the
+For instance on x86 CPUs, N is a hexadecimal value that represents the raw register encoding with the
layout of IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide] Figure 30-1 Layout
of IA32_PERFEVTSELx MSRs) or AMD's PerfEvtSeln (see [AMD64 Architecture Programmer’s Manual Volume 2: System Programming], Page 344,
Figure 13-7 Performance Event-Select Register (PerfEvtSeln)).
@@ -126,6 +126,38 @@ It's also possible to use pmu syntax:
perf record -e cpu/r1a8/ ...
perf record -e cpu/r0x1a8/ ...
+Some processors, like those from AMD, support event codes and unit masks
+larger than a byte. In such cases, the bits corresponding to the event
+configuration parameters can be seen with:
+
+ cat /sys/bus/event_source/devices/<pmu>/format/<config>
+
+Example:
+
+If the AMD docs for an EPYC 7713 processor describe an event as:
+
+ Event Umask Event Mask
+ Num. Value Mnemonic Description
+
+ 28FH 03H op_cache_hit_miss.op_cache_hit Counts Op Cache micro-tag
+ hit events.
+
+raw encoding of 0x0328F cannot be used since the upper nibble of the
+EventSelect bits have to be specified via bits 32-35 as can be seen with:
+
+ cat /sys/bus/event_source/devices/cpu/format/event
+
+raw encoding of 0x20000038F should be used instead:
+
+ perf stat -e r20000038f -a sleep 1
+ perf record -e r20000038f ...
+
+It's also possible to use pmu syntax:
+
+ perf record -e r20000038f -a sleep 1
+ perf record -e cpu/r20000038f/ ...
+ perf record -e cpu/r0x20000038f/ ...
+
You should refer to the processor specific documentation for getting these
details. Some of them are referenced in the SEE ALSO section below.
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 3cf7bac67239..55df7b073a55 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -30,8 +30,10 @@ OPTIONS
- a symbolic event name (use 'perf list' to list all events)
- - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
- hexadecimal event descriptor.
+ - a raw PMU event in the form of rN where N is a hexadecimal value
+ that represents the raw register encoding with the layout of the
+ event control registers as described by entries in
+ /sys/bus/event_sources/devices/cpu/format/*.
- a symbolic or raw PMU event followed by an optional colon
and a list of event modifiers, e.g., cpu-cycles:p. See the
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 7e6fb7cbc0f4..604e6f2301ea 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -36,8 +36,10 @@ report::
- a symbolic event name (use 'perf list' to list all events)
- - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
- hexadecimal event descriptor.
+ - a raw PMU event in the form of rN where N is a hexadecimal value
+ that represents the raw register encoding with the layout of the
+ event control registers as described by entries in
+ /sys/bus/event_sources/devices/cpu/format/*.
- a symbolic or raw PMU event followed by an optional colon
and a list of event modifiers, e.g., cpu-cycles:p. See the
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 9898a32b8d9c..cac3dfbee7d8 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -38,9 +38,10 @@ Default is to monitor all CPUS.
-e <event>::
--event=<event>::
Select the PMU event. Selection can be a symbolic event name
- (use 'perf list' to list all events) or a raw PMU
- event (eventsel+umask) in the form of rNNN where NNN is a
- hexadecimal event descriptor.
+ (use 'perf list' to list all events) or a raw PMU event in the form
+ of rN where N is a hexadecimal value that represents the raw register
+ encoding with the layout of the event control registers as described
+ by entries in /sys/bus/event_sources/devices/cpu/format/*.
-E <entries>::
--entries=<entries>::
--
2.30.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v3 2/2] perf/docs: Update link to AMD documentation
2021-11-23 8:46 [PATCH v3 1/2] perf/docs: Add info on AMD raw event encoding Sandipan Das
@ 2021-11-23 8:46 ` Sandipan Das
2021-11-23 8:50 ` [PATCH v3 1/2] perf/docs: Add info on AMD raw event encoding kajoljain
2021-11-28 16:38 ` Jiri Olsa
2 siblings, 0 replies; 5+ messages in thread
From: Sandipan Das @ 2021-11-23 8:46 UTC (permalink / raw)
To: acme
Cc: santosh.shukla, ravi.bangoria, ananth.narayan, kim.phillips,
rrichter, linux-perf-users, jolsa, kjain
This updates the link to documentation on AMD processors.
The new link points to a page where users can find the
Processor Programming Reference (PPR) documents for the
family and model codes corresponding to processors they
are using.
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---
v1: https://lore.kernel.org/linux-perf-users/20211119111234.170726-2-sandipan.das@amd.com/
v2: https://lore.kernel.org/linux-perf-users/20211123065104.236717-2-sandipan.das@amd.com/
v3:
- No changes.
---
tools/perf/Documentation/perf-list.txt | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index a922a95289a9..57384a97c04f 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -81,7 +81,11 @@ On AMD systems it is implemented using IBS (up to precise-level 2).
The precise modifier works with event types 0x76 (cpu-cycles, CPU
clocks not halted) and 0xC1 (micro-ops retired). Both events map to
IBS execution sampling (IBS op) with the IBS Op Counter Control bit
-(IbsOpCntCtl) set respectively (see AMD64 Architecture Programmer’s
+(IbsOpCntCtl) set respectively (see the
+Core Complex (CCX) -> Processor x86 Core -> Instruction Based Sampling (IBS)
+section of the [AMD Processor Programming Reference (PPR)] relevant to the
+family, model and stepping of the processor being used).
+
Manual Volume 2: System Programming, 13.3 Instruction-Based
Sampling). Examples to use IBS:
@@ -96,8 +100,10 @@ it can be encoded in a per processor specific way.
For instance on x86 CPUs, N is a hexadecimal value that represents the raw register encoding with the
layout of IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide] Figure 30-1 Layout
-of IA32_PERFEVTSELx MSRs) or AMD's PerfEvtSeln (see [AMD64 Architecture Programmer’s Manual Volume 2: System Programming], Page 344,
-Figure 13-7 Performance Event-Select Register (PerfEvtSeln)).
+of IA32_PERFEVTSELx MSRs) or AMD's PERF_CTL MSRs (see the
+Core Complex (CCX) -> Processor x86 Core -> MSR Registers section of the
+[AMD Processor Programming Reference (PPR)] relevant to the family, model
+and stepping of the processor being used).
Note: Only the following bit fields can be set in x86 counter
registers: event, umask, edge, inv, cmask. Esp. guest/host only and
@@ -348,4 +354,4 @@ SEE ALSO
linkperf:perf-stat[1], linkperf:perf-top[1],
linkperf:perf-record[1],
http://www.intel.com/sdm/[Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide],
-http://support.amd.com/us/Processor_TechDocs/24593_APM_v2.pdf[AMD64 Architecture Programmer’s Manual Volume 2: System Programming]
+https://bugzilla.kernel.org/show_bug.cgi?id=206537[AMD Processor Programming Reference (PPR)]
--
2.30.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v3 1/2] perf/docs: Add info on AMD raw event encoding
2021-11-23 8:46 [PATCH v3 1/2] perf/docs: Add info on AMD raw event encoding Sandipan Das
2021-11-23 8:46 ` [PATCH v3 2/2] perf/docs: Update link to AMD documentation Sandipan Das
@ 2021-11-23 8:50 ` kajoljain
2021-11-28 16:38 ` Jiri Olsa
2 siblings, 0 replies; 5+ messages in thread
From: kajoljain @ 2021-11-23 8:50 UTC (permalink / raw)
To: Sandipan Das, acme
Cc: santosh.shukla, ravi.bangoria, ananth.narayan, kim.phillips,
rrichter, linux-perf-users, jolsa
On 11/23/21 2:16 PM, Sandipan Das wrote:
> AMD processors have events with event select codes and unit
> masks larger than a byte. The core PMU, for example, uses
> 12-bit event select codes split between bits 0-7 and 32-35
> of the PERF_CTL MSRs as can be seen from
> /sys/bus/event_sources/devices/cpu/format/*.
Patch looks good to me.
Reviewed-by: Kajol Jain<kjain@linux.ibm.com>
Thanks,
Kajol Jain
>
> The Processor Programming Reference (PPR) lists the event
> codes as unified 12-bit hexadecimal values instead and the
> split between the bits is not apparent to someone who is
> not aware of the layout of the PERF_CTL MSRs.
>
> 8-bit event select codes continue to work as the layout
> matches that of the PERF_CTL MSRs i.e. bits 0-7 for event
> select and 8-15 for unit mask.
>
> This adds more details in the perf man pages about using
> /sys/bus/event_sources/devices/*/format/* for determining
> the correct raw event encoding scheme.
>
> E.g. the "op_cache_hit_miss.op_cache_hit" event with code
> 0x28f and umask 0x03 can be programmed using its symbolic
> name as:
>
> $ sudo perf --debug perf-event-open stat -e op_cache_hit_miss.op_cache_hit sleep 1
> ------------------------------------------------------------
> perf_event_attr:
> type 4
> size 128
> config 0x20000038f
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> [...]
>
> One might use a simple eventsel+umask combination based on
> what the current man pages say and incorrectly program the
> event as:
>
> $ sudo perf --debug perf-event-open stat -e r0328f sleep 1
> ------------------------------------------------------------
> perf_event_attr:
> type 4
> size 128
> config 0x328f
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> [...]
>
> When it should have been based on the format from sysfs:
>
> $ cat /sys/bus/event_source/devices/cpu/format/event
> config:0-7,32-35
>
> $ sudo perf --debug perf-event-open stat -e r20000038f sleep 1
> ------------------------------------------------------------
> perf_event_attr:
> type 4
> size 128
> config 0x20000038f
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> [...]
>
> Signed-off-by: Sandipan Das <sandipan.das@amd.com>
> ---
> v1: https://lore.kernel.org/linux-perf-users/20211119111234.170726-1-sandipan.das@amd.com/
> v2: https://lore.kernel.org/linux-perf-users/20211123065104.236717-1-sandipan.das@amd.com/
>
> v3:
> - Mention why simple eventsel+umask combinations will not
> work in the commit message.
>
> ---
> tools/perf/Documentation/perf-list.txt | 34 +++++++++++++++++++++++-
> tools/perf/Documentation/perf-record.txt | 6 +++--
> tools/perf/Documentation/perf-stat.txt | 6 +++--
> tools/perf/Documentation/perf-top.txt | 7 ++---
> 4 files changed, 45 insertions(+), 8 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
> index 4dc8d0af19df..a922a95289a9 100644
> --- a/tools/perf/Documentation/perf-list.txt
> +++ b/tools/perf/Documentation/perf-list.txt
> @@ -94,7 +94,7 @@ RAW HARDWARE EVENT DESCRIPTOR
> Even when an event is not available in a symbolic form within perf right now,
> it can be encoded in a per processor specific way.
>
> -For instance For x86 CPUs NNN represents the raw register encoding with the
> +For instance on x86 CPUs, N is a hexadecimal value that represents the raw register encoding with the
> layout of IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide] Figure 30-1 Layout
> of IA32_PERFEVTSELx MSRs) or AMD's PerfEvtSeln (see [AMD64 Architecture Programmer’s Manual Volume 2: System Programming], Page 344,
> Figure 13-7 Performance Event-Select Register (PerfEvtSeln)).
> @@ -126,6 +126,38 @@ It's also possible to use pmu syntax:
> perf record -e cpu/r1a8/ ...
> perf record -e cpu/r0x1a8/ ...
>
> +Some processors, like those from AMD, support event codes and unit masks
> +larger than a byte. In such cases, the bits corresponding to the event
> +configuration parameters can be seen with:
> +
> + cat /sys/bus/event_source/devices/<pmu>/format/<config>
> +
> +Example:
> +
> +If the AMD docs for an EPYC 7713 processor describe an event as:
> +
> + Event Umask Event Mask
> + Num. Value Mnemonic Description
> +
> + 28FH 03H op_cache_hit_miss.op_cache_hit Counts Op Cache micro-tag
> + hit events.
> +
> +raw encoding of 0x0328F cannot be used since the upper nibble of the
> +EventSelect bits have to be specified via bits 32-35 as can be seen with:
> +
> + cat /sys/bus/event_source/devices/cpu/format/event
> +
> +raw encoding of 0x20000038F should be used instead:
> +
> + perf stat -e r20000038f -a sleep 1
> + perf record -e r20000038f ...
> +
> +It's also possible to use pmu syntax:
> +
> + perf record -e r20000038f -a sleep 1
> + perf record -e cpu/r20000038f/ ...
> + perf record -e cpu/r0x20000038f/ ...
> +
> You should refer to the processor specific documentation for getting these
> details. Some of them are referenced in the SEE ALSO section below.
>
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index 3cf7bac67239..55df7b073a55 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -30,8 +30,10 @@ OPTIONS
>
> - a symbolic event name (use 'perf list' to list all events)
>
> - - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
> - hexadecimal event descriptor.
> + - a raw PMU event in the form of rN where N is a hexadecimal value
> + that represents the raw register encoding with the layout of the
> + event control registers as described by entries in
> + /sys/bus/event_sources/devices/cpu/format/*.
>
> - a symbolic or raw PMU event followed by an optional colon
> and a list of event modifiers, e.g., cpu-cycles:p. See the
> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> index 7e6fb7cbc0f4..604e6f2301ea 100644
> --- a/tools/perf/Documentation/perf-stat.txt
> +++ b/tools/perf/Documentation/perf-stat.txt
> @@ -36,8 +36,10 @@ report::
>
> - a symbolic event name (use 'perf list' to list all events)
>
> - - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
> - hexadecimal event descriptor.
> + - a raw PMU event in the form of rN where N is a hexadecimal value
> + that represents the raw register encoding with the layout of the
> + event control registers as described by entries in
> + /sys/bus/event_sources/devices/cpu/format/*.
>
> - a symbolic or raw PMU event followed by an optional colon
> and a list of event modifiers, e.g., cpu-cycles:p. See the
> diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
> index 9898a32b8d9c..cac3dfbee7d8 100644
> --- a/tools/perf/Documentation/perf-top.txt
> +++ b/tools/perf/Documentation/perf-top.txt
> @@ -38,9 +38,10 @@ Default is to monitor all CPUS.
> -e <event>::
> --event=<event>::
> Select the PMU event. Selection can be a symbolic event name
> - (use 'perf list' to list all events) or a raw PMU
> - event (eventsel+umask) in the form of rNNN where NNN is a
> - hexadecimal event descriptor.
> + (use 'perf list' to list all events) or a raw PMU event in the form
> + of rN where N is a hexadecimal value that represents the raw register
> + encoding with the layout of the event control registers as described
> + by entries in /sys/bus/event_sources/devices/cpu/format/*.
>
> -E <entries>::
> --entries=<entries>::
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3 1/2] perf/docs: Add info on AMD raw event encoding
2021-11-23 8:46 [PATCH v3 1/2] perf/docs: Add info on AMD raw event encoding Sandipan Das
2021-11-23 8:46 ` [PATCH v3 2/2] perf/docs: Update link to AMD documentation Sandipan Das
2021-11-23 8:50 ` [PATCH v3 1/2] perf/docs: Add info on AMD raw event encoding kajoljain
@ 2021-11-28 16:38 ` Jiri Olsa
2021-11-30 15:08 ` Arnaldo Carvalho de Melo
2 siblings, 1 reply; 5+ messages in thread
From: Jiri Olsa @ 2021-11-28 16:38 UTC (permalink / raw)
To: Sandipan Das
Cc: acme, santosh.shukla, ravi.bangoria, ananth.narayan,
kim.phillips, rrichter, linux-perf-users, kjain
On Tue, Nov 23, 2021 at 02:16:12PM +0530, Sandipan Das wrote:
> AMD processors have events with event select codes and unit
> masks larger than a byte. The core PMU, for example, uses
> 12-bit event select codes split between bits 0-7 and 32-35
> of the PERF_CTL MSRs as can be seen from
> /sys/bus/event_sources/devices/cpu/format/*.
>
> The Processor Programming Reference (PPR) lists the event
> codes as unified 12-bit hexadecimal values instead and the
> split between the bits is not apparent to someone who is
> not aware of the layout of the PERF_CTL MSRs.
>
> 8-bit event select codes continue to work as the layout
> matches that of the PERF_CTL MSRs i.e. bits 0-7 for event
> select and 8-15 for unit mask.
>
> This adds more details in the perf man pages about using
> /sys/bus/event_sources/devices/*/format/* for determining
> the correct raw event encoding scheme.
>
> E.g. the "op_cache_hit_miss.op_cache_hit" event with code
> 0x28f and umask 0x03 can be programmed using its symbolic
> name as:
>
> $ sudo perf --debug perf-event-open stat -e op_cache_hit_miss.op_cache_hit sleep 1
> ------------------------------------------------------------
> perf_event_attr:
> type 4
> size 128
> config 0x20000038f
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> [...]
>
> One might use a simple eventsel+umask combination based on
> what the current man pages say and incorrectly program the
> event as:
>
> $ sudo perf --debug perf-event-open stat -e r0328f sleep 1
> ------------------------------------------------------------
> perf_event_attr:
> type 4
> size 128
> config 0x328f
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> [...]
>
> When it should have been based on the format from sysfs:
>
> $ cat /sys/bus/event_source/devices/cpu/format/event
> config:0-7,32-35
>
> $ sudo perf --debug perf-event-open stat -e r20000038f sleep 1
> ------------------------------------------------------------
> perf_event_attr:
> type 4
> size 128
> config 0x20000038f
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> [...]
>
> Signed-off-by: Sandipan Das <sandipan.das@amd.com>
> ---
> v1: https://lore.kernel.org/linux-perf-users/20211119111234.170726-1-sandipan.das@amd.com/
> v2: https://lore.kernel.org/linux-perf-users/20211123065104.236717-1-sandipan.das@amd.com/
>
> v3:
> - Mention why simple eventsel+umask combinations will not
> work in the commit message.
for both patches:
Acked-by: Jiri Olsa <jolsa@redhat.com>
thanks,
jirka
>
> ---
> tools/perf/Documentation/perf-list.txt | 34 +++++++++++++++++++++++-
> tools/perf/Documentation/perf-record.txt | 6 +++--
> tools/perf/Documentation/perf-stat.txt | 6 +++--
> tools/perf/Documentation/perf-top.txt | 7 ++---
> 4 files changed, 45 insertions(+), 8 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
> index 4dc8d0af19df..a922a95289a9 100644
> --- a/tools/perf/Documentation/perf-list.txt
> +++ b/tools/perf/Documentation/perf-list.txt
> @@ -94,7 +94,7 @@ RAW HARDWARE EVENT DESCRIPTOR
> Even when an event is not available in a symbolic form within perf right now,
> it can be encoded in a per processor specific way.
>
> -For instance For x86 CPUs NNN represents the raw register encoding with the
> +For instance on x86 CPUs, N is a hexadecimal value that represents the raw register encoding with the
> layout of IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide] Figure 30-1 Layout
> of IA32_PERFEVTSELx MSRs) or AMD's PerfEvtSeln (see [AMD64 Architecture Programmer’s Manual Volume 2: System Programming], Page 344,
> Figure 13-7 Performance Event-Select Register (PerfEvtSeln)).
> @@ -126,6 +126,38 @@ It's also possible to use pmu syntax:
> perf record -e cpu/r1a8/ ...
> perf record -e cpu/r0x1a8/ ...
>
> +Some processors, like those from AMD, support event codes and unit masks
> +larger than a byte. In such cases, the bits corresponding to the event
> +configuration parameters can be seen with:
> +
> + cat /sys/bus/event_source/devices/<pmu>/format/<config>
> +
> +Example:
> +
> +If the AMD docs for an EPYC 7713 processor describe an event as:
> +
> + Event Umask Event Mask
> + Num. Value Mnemonic Description
> +
> + 28FH 03H op_cache_hit_miss.op_cache_hit Counts Op Cache micro-tag
> + hit events.
> +
> +raw encoding of 0x0328F cannot be used since the upper nibble of the
> +EventSelect bits have to be specified via bits 32-35 as can be seen with:
> +
> + cat /sys/bus/event_source/devices/cpu/format/event
> +
> +raw encoding of 0x20000038F should be used instead:
> +
> + perf stat -e r20000038f -a sleep 1
> + perf record -e r20000038f ...
> +
> +It's also possible to use pmu syntax:
> +
> + perf record -e r20000038f -a sleep 1
> + perf record -e cpu/r20000038f/ ...
> + perf record -e cpu/r0x20000038f/ ...
> +
> You should refer to the processor specific documentation for getting these
> details. Some of them are referenced in the SEE ALSO section below.
>
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index 3cf7bac67239..55df7b073a55 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -30,8 +30,10 @@ OPTIONS
>
> - a symbolic event name (use 'perf list' to list all events)
>
> - - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
> - hexadecimal event descriptor.
> + - a raw PMU event in the form of rN where N is a hexadecimal value
> + that represents the raw register encoding with the layout of the
> + event control registers as described by entries in
> + /sys/bus/event_sources/devices/cpu/format/*.
>
> - a symbolic or raw PMU event followed by an optional colon
> and a list of event modifiers, e.g., cpu-cycles:p. See the
> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> index 7e6fb7cbc0f4..604e6f2301ea 100644
> --- a/tools/perf/Documentation/perf-stat.txt
> +++ b/tools/perf/Documentation/perf-stat.txt
> @@ -36,8 +36,10 @@ report::
>
> - a symbolic event name (use 'perf list' to list all events)
>
> - - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
> - hexadecimal event descriptor.
> + - a raw PMU event in the form of rN where N is a hexadecimal value
> + that represents the raw register encoding with the layout of the
> + event control registers as described by entries in
> + /sys/bus/event_sources/devices/cpu/format/*.
>
> - a symbolic or raw PMU event followed by an optional colon
> and a list of event modifiers, e.g., cpu-cycles:p. See the
> diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
> index 9898a32b8d9c..cac3dfbee7d8 100644
> --- a/tools/perf/Documentation/perf-top.txt
> +++ b/tools/perf/Documentation/perf-top.txt
> @@ -38,9 +38,10 @@ Default is to monitor all CPUS.
> -e <event>::
> --event=<event>::
> Select the PMU event. Selection can be a symbolic event name
> - (use 'perf list' to list all events) or a raw PMU
> - event (eventsel+umask) in the form of rNNN where NNN is a
> - hexadecimal event descriptor.
> + (use 'perf list' to list all events) or a raw PMU event in the form
> + of rN where N is a hexadecimal value that represents the raw register
> + encoding with the layout of the event control registers as described
> + by entries in /sys/bus/event_sources/devices/cpu/format/*.
>
> -E <entries>::
> --entries=<entries>::
> --
> 2.30.2
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3 1/2] perf/docs: Add info on AMD raw event encoding
2021-11-28 16:38 ` Jiri Olsa
@ 2021-11-30 15:08 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-11-30 15:08 UTC (permalink / raw)
To: Jiri Olsa
Cc: Sandipan Das, santosh.shukla, ravi.bangoria, ananth.narayan,
kim.phillips, rrichter, linux-perf-users, kjain
Em Sun, Nov 28, 2021 at 05:38:17PM +0100, Jiri Olsa escreveu:
> On Tue, Nov 23, 2021 at 02:16:12PM +0530, Sandipan Das wrote:
> > AMD processors have events with event select codes and unit
> > masks larger than a byte. The core PMU, for example, uses
> > 12-bit event select codes split between bits 0-7 and 32-35
> > of the PERF_CTL MSRs as can be seen from
> > /sys/bus/event_sources/devices/cpu/format/*.
> >
> > The Processor Programming Reference (PPR) lists the event
> > codes as unified 12-bit hexadecimal values instead and the
> > split between the bits is not apparent to someone who is
> > not aware of the layout of the PERF_CTL MSRs.
> >
> > 8-bit event select codes continue to work as the layout
> > matches that of the PERF_CTL MSRs i.e. bits 0-7 for event
> > select and 8-15 for unit mask.
> >
> > This adds more details in the perf man pages about using
> > /sys/bus/event_sources/devices/*/format/* for determining
> > the correct raw event encoding scheme.
> >
> > E.g. the "op_cache_hit_miss.op_cache_hit" event with code
> > 0x28f and umask 0x03 can be programmed using its symbolic
> > name as:
> >
> > $ sudo perf --debug perf-event-open stat -e op_cache_hit_miss.op_cache_hit sleep 1
> > ------------------------------------------------------------
> > perf_event_attr:
> > type 4
> > size 128
> > config 0x20000038f
> > sample_type IDENTIFIER
> > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > disabled 1
> > inherit 1
> > enable_on_exec 1
> > exclude_guest 1
> > ------------------------------------------------------------
> > [...]
> >
> > One might use a simple eventsel+umask combination based on
> > what the current man pages say and incorrectly program the
> > event as:
> >
> > $ sudo perf --debug perf-event-open stat -e r0328f sleep 1
> > ------------------------------------------------------------
> > perf_event_attr:
> > type 4
> > size 128
> > config 0x328f
> > sample_type IDENTIFIER
> > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > disabled 1
> > inherit 1
> > enable_on_exec 1
> > exclude_guest 1
> > ------------------------------------------------------------
> > [...]
> >
> > When it should have been based on the format from sysfs:
> >
> > $ cat /sys/bus/event_source/devices/cpu/format/event
> > config:0-7,32-35
> >
> > $ sudo perf --debug perf-event-open stat -e r20000038f sleep 1
> > ------------------------------------------------------------
> > perf_event_attr:
> > type 4
> > size 128
> > config 0x20000038f
> > sample_type IDENTIFIER
> > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > disabled 1
> > inherit 1
> > enable_on_exec 1
> > exclude_guest 1
> > ------------------------------------------------------------
> > [...]
> >
> > Signed-off-by: Sandipan Das <sandipan.das@amd.com>
> > ---
> > v1: https://lore.kernel.org/linux-perf-users/20211119111234.170726-1-sandipan.das@amd.com/
> > v2: https://lore.kernel.org/linux-perf-users/20211123065104.236717-1-sandipan.das@amd.com/
> >
> > v3:
> > - Mention why simple eventsel+umask combinations will not
> > work in the commit message.
>
> for both patches:
>
> Acked-by: Jiri Olsa <jolsa@redhat.com>
Thanks, applied.
- Arnaldo
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-11-30 15:10 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-23 8:46 [PATCH v3 1/2] perf/docs: Add info on AMD raw event encoding Sandipan Das
2021-11-23 8:46 ` [PATCH v3 2/2] perf/docs: Update link to AMD documentation Sandipan Das
2021-11-23 8:50 ` [PATCH v3 1/2] perf/docs: Add info on AMD raw event encoding kajoljain
2021-11-28 16:38 ` Jiri Olsa
2021-11-30 15:08 ` Arnaldo Carvalho de Melo
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.