linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 04/10] Documentation: perf: hisi: Documentation for HiP05/06/07 PMU event counting.
@ 2017-01-02  6:49 Anurup M
  2017-01-10 17:55 ` Mark Rutland
  0 siblings, 1 reply; 3+ messages in thread
From: Anurup M @ 2017-01-02  6:49 UTC (permalink / raw)
  To: corbet, mark.rutland, will.deacon
  Cc: linux-kernel, linux-doc, linux-arm-kernel, anurup.m,
	zhangshaokun, tanxiaojun, xuwei5, sanil.kumar, john.garry,
	gabriele.paoloni, shiju.jose, linuxarm, shyju.pv, anurupvasu

Documentation for perf usage and Hisilicon SoC PMU uncore events.
The Hisilicon SOC has event counters for hardware modules like
L3 cache, Miscellaneous node etc. These events are all uncore.

Signed-off-by: Anurup M <anurup.m@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
---
 Documentation/perf/hisi-pmu.txt | 75 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)
 create mode 100644 Documentation/perf/hisi-pmu.txt

diff --git a/Documentation/perf/hisi-pmu.txt b/Documentation/perf/hisi-pmu.txt
new file mode 100644
index 0000000..16f77ab
--- /dev/null
+++ b/Documentation/perf/hisi-pmu.txt
@@ -0,0 +1,75 @@
+Hisilicon SoC PMU (Performance Monitoring Unit)
+================================================
+The Hisilicon SoC HiP05/06/07 chips consist of various independent system
+device PMU's such as L3 cache(L3C) and Miscellaneous Nodes(MN).
+These PMU devices are independent and have hardware logic to gather
+statistics and performance information.
+
+HiP0x chips are encapsulated by multiple CPU and IO die's. The CPU die is
+called as Super CPU cluster (SCCL) which includes 16 cpu-cores. Every SCCL
+is further grouped as CPU clusters (CCL) which includes 4 cpu-cores each.
+Each SCCL has 1 L3 cache and 1 MN units.
+
+The L3 cache is shared by all CPU cores in a CPU die. The L3C has four banks
+(or instances). Each bank or instance of L3C has Eight 32-bit counter
+registers and also event control registers. The HiP05/06 chip L3 cache has
+22 statistics events. The HiP07 chip has 66 statistics events. These events
+are very useful for debugging.
+
+The MN module is also shared by all CPU cores in a CPU die. It receives
+barriers and DVM(Distributed Virtual Memory) messages from cpu or smmu, and
+perform the required actions and return response messages. These events are
+very useful for debugging. The MN has total 9 statistics events and support
+four 32-bit counter registers in hip05/06/07 chips.
+
+There is no memory mapping for L3 cache and MN registers. It can be accessed
+by using the Hisilicon djtag interface. The Djtag in a SCCL is an independent
+module which connects with some modules in the SoC by Debug Bus.
+
+Hisilicon SoC (HiP05/06/07) PMU driver
+--------------------------------------
+The HiP0x PMU driver shall register perf PMU drivers like L3 cache, MN, etc.
+The available events and configuration options shall be described in the sysfs.
+The "perf list" shall list the available events from sysfs.
+
+The L3 cache in a SCCL is divided as 4 banks. Each L3 cache bank have separate
+PMU registers for event counting and control. So each L3 cache bank is
+registered with perf as a separate PMU.
+The PMU name will appear in event listing as hisi_l3c<bank-id>_<scl-id>.
+where "bank-id" is the bank index (0 to 3) and "scl-id" is the SCCL identifier
+e.g. hisi_l3c0_2/read_hit is READ_HIT event of L3 cache bank #0 SCCL ID #2.
+
+The MN in a SCCL is registered as a separate PMU with perf.
+The PMU name will appear in event listing as hisi_mn_<scl-id>.
+e.g. hisi_mn_2/read_req. READ_REQUEST event of MN of Super CPU cluster #2.
+
+The event code is represented by 12 bits.
+	i) event 0-11
+		The event code will be represented using the LSB 12 bits.
+
+The driver also provides a "cpumask" sysfs attribute, which shows the CPU core
+ID used to count the uncore PMU event.
+
+Example usage of perf:
+$# perf list
+hisi_l3c0_2/read_hit/ [kernel PMU event]
+------------------------------------------
+hisi_l3c1_2/write_hit/ [kernel PMU event]
+------------------------------------------
+hisi_l3c0_1/read_hit/ [kernel PMU event]
+------------------------------------------
+hisi_l3c0_1/write_hit/ [kernel PMU event]
+------------------------------------------
+hisi_mn_2/read_req/ [kernel PMU event]
+hisi_mn_2/write_req/ [kernel PMU event]
+------------------------------------------
+
+$# perf stat -a -e "hisi_l3c0_2/read_allocate/" sleep 5
+
+$# perf stat -A -C 0 -e "hisi_l3c0_2/read_allocate/" sleep 5
+
+The current driver doesnot support sampling. so "perf record" is unsupported.
+Also attach to a task is unsupported as the events are all uncore.
+
+Note: Please contact the maintainer for a complete list of events supported for
+the PMU devices in the SoC and its information if needed.
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v3 04/10] Documentation: perf: hisi: Documentation for HiP05/06/07 PMU event counting.
  2017-01-02  6:49 [PATCH v3 04/10] Documentation: perf: hisi: Documentation for HiP05/06/07 PMU event counting Anurup M
@ 2017-01-10 17:55 ` Mark Rutland
  2017-01-12  5:47   ` Anurup M
  0 siblings, 1 reply; 3+ messages in thread
From: Mark Rutland @ 2017-01-10 17:55 UTC (permalink / raw)
  To: Anurup M
  Cc: corbet, will.deacon, linux-kernel, linux-doc, linux-arm-kernel,
	anurup.m, zhangshaokun, tanxiaojun, xuwei5, sanil.kumar,
	john.garry, gabriele.paoloni, shiju.jose, linuxarm, shyju.pv

On Mon, Jan 02, 2017 at 01:49:37AM -0500, Anurup M wrote:
> +The Hisilicon SoC HiP05/06/07 chips consist of various independent system
> +device PMU's such as L3 cache(L3C) and Miscellaneous Nodes(MN).
> +These PMU devices are independent and have hardware logic to gather
> +statistics and performance information.
> +
> +HiP0x chips are encapsulated by multiple CPU and IO die's. The CPU die is
> +called as Super CPU cluster (SCCL) which includes 16 cpu-cores. Every SCCL
> +is further grouped as CPU clusters (CCL) which includes 4 cpu-cores each.
> +Each SCCL has 1 L3 cache and 1 MN units.

Are there systems with multiple SCCLs? Or is there only one SCCL per
system?

> +The L3 cache is shared by all CPU cores in a CPU die. The L3C has four banks
> +(or instances). Each bank or instance of L3C has Eight 32-bit counter
> +registers and also event control registers. The HiP05/06 chip L3 cache has
> +22 statistics events. The HiP07 chip has 66 statistics events. These events
> +are very useful for debugging.

Is an L3C associated with a subset of physical memory (as with the ARM
CCN's L3C), or is it associated with a set of CPUs (e.g.  only those in
a single SCCL) covering all physical memory (as with each CPU's L1 &
L2)?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v3 04/10] Documentation: perf: hisi: Documentation for HiP05/06/07 PMU event counting.
  2017-01-10 17:55 ` Mark Rutland
@ 2017-01-12  5:47   ` Anurup M
  0 siblings, 0 replies; 3+ messages in thread
From: Anurup M @ 2017-01-12  5:47 UTC (permalink / raw)
  To: Mark Rutland
  Cc: corbet, will.deacon, linux-kernel, linux-doc, linux-arm-kernel,
	anurup.m, zhangshaokun, tanxiaojun, xuwei5, sanil.kumar,
	john.garry, gabriele.paoloni, shiju.jose, linuxarm, shyju.pv,
	dikshit.n



On Tuesday 10 January 2017 11:25 PM, Mark Rutland wrote:
> On Mon, Jan 02, 2017 at 01:49:37AM -0500, Anurup M wrote:
>> +The Hisilicon SoC HiP05/06/07 chips consist of various independent system
>> +device PMU's such as L3 cache(L3C) and Miscellaneous Nodes(MN).
>> +These PMU devices are independent and have hardware logic to gather
>> +statistics and performance information.
>> +
>> +HiP0x chips are encapsulated by multiple CPU and IO die's. The CPU die is
>> +called as Super CPU cluster (SCCL) which includes 16 cpu-cores. Every SCCL
>> +is further grouped as CPU clusters (CCL) which includes 4 cpu-cores each.
>> +Each SCCL has 1 L3 cache and 1 MN units.
> Are there systems with multiple SCCLs? Or is there only one SCCL per
> system?

The HiP0x are encapsulated by multiple SCCL (CPU die) and SICL (IO die).
The HiP06 and HiP07 have two SCCLs.

>> +The L3 cache is shared by all CPU cores in a CPU die. The L3C has four banks
>> +(or instances). Each bank or instance of L3C has Eight 32-bit counter
>> +registers and also event control registers. The HiP05/06 chip L3 cache has
>> +22 statistics events. The HiP07 chip has 66 statistics events. These events
>> +are very useful for debugging.
> Is an L3C associated with a subset of physical memory (as with the ARM
> CCN's L3C), or is it associated with a set of CPUs (e.g.  only those in
> a single SCCL) covering all physical memory (as with each CPU's L1 &
> L2)?

Yes the L3C is associated with the set of CPUs in a single SCCL covering 
all physical memory.
The L3 cache in all  the SCCLs share the complete physical memory.


Thanks,
Anurup

> Thanks,
> Mark.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-01-12  5:47 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-02  6:49 [PATCH v3 04/10] Documentation: perf: hisi: Documentation for HiP05/06/07 PMU event counting Anurup M
2017-01-10 17:55 ` Mark Rutland
2017-01-12  5:47   ` Anurup M

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).